Recent Jiamin's Blog
Implementing Custom Storage Formats in Apache Hive
Background
In certain business scenarios, downstream systems need to process data files directly. Although Hive officially supports formats such as text, ORC, and Parquet, knowing how to develop a custom storage format makes it possible to cover a wider range of business scenarios. Hive currently offers the ROW FORMAT SERDE mechanism for this purpose.
ROW FORMAT SERDE
ROW FORMAT SERDE is a key data-formatting concept in Hive: it defines how the data stored in a Hive table is parsed and mapped to columns.
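Conceptually, a SerDe pairs a deserializer (a record in a file becomes typed columns) with a serializer (columns become a record again). Hive's real SerDe interface is written in Java, so the toy Python sketch below is only an analogy and not Hive's API. `Row`, `ToyPipeSerDe`, and the pipe-delimited layout are all hypothetical names and choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Row:
    """Hypothetical two-column table row: (user_id INT, name STRING)."""
    user_id: int
    name: str


class ToyPipeSerDe:
    """Toy analogue of a SerDe for pipe-delimited text records (not Hive's API)."""

    delimiter = "|"

    def deserialize(self, line: str) -> Row:
        # Parse one stored record and map its fields onto typed columns.
        user_id, name = line.rstrip("\n").split(self.delimiter, 1)
        return Row(user_id=int(user_id), name=name)

    def serialize(self, row: Row) -> str:
        # Map the columns back to the on-disk record layout.
        return f"{row.user_id}{self.delimiter}{row.name}"
```

Reading a record and writing it back should round-trip: `ToyPipeSerDe().serialize(ToyPipeSerDe().deserialize("7|alice"))` yields the original string.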
Harnessing the Power of OpenAI's Latest Innovations
Introduction: Embracing the Future with OpenAI’s Updates
In the fast-moving field of artificial intelligence, keeping up with the latest releases is not just a matter of curiosity but a practical necessity for anyone building projects on AI. On November 6, 2023, OpenAI introduced a set of new features along with a significant update to its Python SDK, now at version 1.0.0. In this post, we’ll go through these updates and look at how they change the way we interact with AI.
Langchain LLM Streaming
Langchain can process the tokens generated by an LLM in real time through a callback mechanism.
```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import (
    HumanMessage,
)
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

# streaming=True makes the model emit tokens as they are generated;
# the callback handler receives each token and prints it to stdout.
chat = ChatOpenAI(streaming=True, callbacks=[StreamingStdOutCallbackHandler()], temperature=0)
resp = chat([HumanMessage(content="Write me a song about sparkling water.")])
```

Langchain supports both synchronous and asynchronous IO for token output, through StreamingStdOutCallbackHandler and AsyncIteratorCallbackHandler, respectively.
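To make the asynchronous side concrete, here is a minimal, library-free sketch of the idea behind an async iterator handler: the model side pushes tokens into a queue, and the consumer iterates over them as they arrive. `QueueTokenHandler`, `fake_llm_async`, and `collect` are hypothetical stand-ins, and the `None` end-of-stream sentinel is an assumption of this sketch. Langchain's actual AsyncIteratorCallbackHandler differs in detail.

```python
import asyncio
from typing import Any, AsyncIterator, List


class QueueTokenHandler:
    """Stand-in for an async streaming handler (not Langchain's class)."""

    def __init__(self) -> None:
        self.queue: asyncio.Queue = asyncio.Queue()

    async def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # Called once per generated token.
        await self.queue.put(token)

    async def on_llm_end(self, **kwargs: Any) -> None:
        # Assumption: None marks the end of the stream in this sketch.
        await self.queue.put(None)

    async def aiter(self) -> AsyncIterator[str]:
        # Yield tokens as they arrive until the sentinel is seen.
        while True:
            token = await self.queue.get()
            if token is None:
                break
            yield token


async def fake_llm_async(tokens: List[str], handler: QueueTokenHandler) -> None:
    """Hypothetical model: emits the given tokens one at a time."""
    for token in tokens:
        await asyncio.sleep(0)  # yield control, as a real network call would
        await handler.on_llm_new_token(token)
    await handler.on_llm_end()


async def collect(tokens: List[str]) -> List[str]:
    handler = QueueTokenHandler()
    producer = asyncio.create_task(fake_llm_async(tokens, handler))
    received = [token async for token in handler.aiter()]
    await producer
    return received
```

Running `asyncio.run(collect(["Spark", "ling", " water"]))` drives the producer and consumer concurrently and returns the tokens in the order they were emitted.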
StreamingStdOutCallbackHandler
First, let’s take a look at Langchain’s official implementation of StreamingStdOutCallbackHandler, which prints LLM-generated tokens to the terminal in real time.
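The core of such a handler is small: each time a new token arrives, write it to standard output and flush so it appears immediately. The sketch below illustrates that pattern without depending on Langchain. `StdOutTokenHandler` and `fake_llm` are hypothetical stand-ins, not the library's code (the real handler subclasses Langchain's callback base class).

```python
import sys
from typing import Any, Iterable, List


class StdOutTokenHandler:
    """Stand-in for a stdout streaming handler (not Langchain's class)."""

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        # Print the token immediately, without waiting for a newline.
        sys.stdout.write(token)
        sys.stdout.flush()


def fake_llm(tokens: Iterable[str], callbacks: List[StdOutTokenHandler]) -> str:
    """Hypothetical model: notifies every callback for each token it emits."""
    pieces = []
    for token in tokens:
        for callback in callbacks:
            callback.on_llm_new_token(token)
        pieces.append(token)
    return "".join(pieces)
```

Calling `fake_llm(["Hello", ", ", "world"], [StdOutTokenHandler()])` prints the tokens one by one as they are produced and returns the full text once generation finishes.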