Below you will find pages that utilize the taxonomy term “Vector Database”
mggg's Blog
Integrating Real-Time Domain Knowledge into LLM with LangChain Chroma Vector Database
Enhancing LLM Responsiveness with the LangChain Chroma Vector Database
ChatGPT's training data only goes up to September 2021, so it may not have information on events or updates that occurred after that time, and for post-September 2021 topics it often cannot provide the latest answers. To overcome this limitation and enrich LLM models like ChatGPT with current domain knowledge, integrating LangChain with Chroma, a robust vector database, is pivotal.
Strengthening an LLM's Real-Time Domain Knowledge with the LangChain Vector Database
Enhancing an LLM's Ability to Handle Up-to-Date Knowledge with the LangChain Vector Database
When we use ChatGPT, it often reminds us that its knowledge only extends to September 2021. To let an LLM work with the latest information, integrating real-time knowledge into the model is therefore essential.
It typically replies along these lines: "My training data goes up to September 2021, so I do not have information about events that occurred after that point. For questions about post-September 2021 topics, I may not be able to provide the latest information."
Implementation Steps
1. Build a local vector database. We first create a Chroma vector database locally and embed the document content into it.
2. Query knowledge related to the user's prompt. Next, based on the user's prompt, we query the vector database and retrieve the relevant domain knowledge.
3. Integrate the domain knowledge into the user's prompt. Finally, we merge this domain knowledge into the user's prompt for the LLM to use (sketched after the example code below).
LangChain + Chroma in Practice
Below is an example application of LangChain and Chroma: text is parsed into vectors with add_text_embedding, and domain knowledge is retrieved with query.
```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma


class EmbeddingLocalBackend(object):
    def __init__(self, path='db'):
        self.path = path
        # Persistent local Chroma store, vectorized with OpenAI embeddings.
        self.vectordb = Chroma(
            persist_directory=self.path,
            embedding_function=OpenAIEmbeddings(max_retries=9999999999),
        )

    def add_text_embedding(self, data, auto_commit=True):
        # Split raw text into overlapping chunks before embedding.
        text_splitter = CharacterTextSplitter(
            separator="\n",
            chunk_size=1000,
            chunk_overlap=200,
            length_function=len,
            is_separator_regex=False,
        )
        documents = text_splitter.create_documents(data)
        self.vectordb.add_documents(documents)
        if auto_commit:
            self._commit()

    def _commit(self):
        self.vectordb.persist()

    def query(self, query):
        # Embed the query and return the most similar stored documents.
        # (The original snippet is truncated here; this is the standard pattern.)
        embedding_vector = OpenAIEmbeddings().embed_query(query)
        return self.vectordb.similarity_search_by_vector(embedding_vector)
```
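The class above covers steps 1 and 2. As a rough sketch of step 3 (not from the original post), the retrieved documents can be folded into the user's prompt before calling the chat model; the sample document, prompt wording, and model name below are placeholder assumptions, and the call uses the pre-1.0 openai package.

```python
# Hypothetical sketch of step 3: retrieve domain knowledge and fold it into
# the user's prompt before calling the chat model. Sample text, prompt wording,
# and model name are placeholders; assumes the pre-1.0 `openai` package with
# openai.api_key already configured.
import openai

backend = EmbeddingLocalBackend(path='db')
backend.add_text_embedding([
    "2023-10-01: Product pricing moved to a usage-based model.",  # placeholder document
])

user_prompt = "How is the product currently priced?"
related_docs = backend.query(user_prompt)  # step 2: retrieve relevant chunks
context = "\n".join(doc.page_content for doc in related_docs)

# Step 3: merge the retrieved domain knowledge into the prompt.
augmented_prompt = (
    "Answer the question using the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {user_prompt}"
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": augmented_prompt}],
)
print(response["choices"][0]["message"]["content"])
```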
Vector Database: Weaviate
Weaviate is an innovative vector database known for efficient storage and retrieval of data. It indexes data objects by their semantic properties using vectors, which gives it a distinctive approach to data handling. It supports a variety of vectorization modules, including text2vec and OpenAI embeddings, providing flexibility in how data is vectorized.
Getting Started with Weaviate
Deploying Weaviate is straightforward with docker-compose. The OpenAI module transforms text into embeddings, enhancing semantic search capabilities.
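As an illustration (not from the original post), here is a minimal sketch using the v3 weaviate Python client against a locally running docker-compose instance with the text2vec-openai module enabled; the Article class and its content property are hypothetical.

```python
# Hypothetical sketch: semantic search against a local Weaviate instance
# (e.g. started with docker-compose) using the text2vec-openai module.
# The "Article" class and its "content" property are made up for illustration;
# assumes the v3 weaviate-client package.
import weaviate

client = weaviate.Client("http://localhost:8080")

# A class whose objects are vectorized by the OpenAI module on ingestion.
client.schema.create_class({
    "class": "Article",
    "vectorizer": "text2vec-openai",
    "properties": [{"name": "content", "dataType": ["text"]}],
})

# Store an object; Weaviate computes and indexes its embedding.
client.data_object.create(
    {"content": "Weaviate indexes data objects by semantic similarity."},
    "Article",
)

# nearText embeds the query with the same module and returns the closest objects.
result = (
    client.query.get("Article", ["content"])
    .with_near_text({"concepts": ["vector database"]})
    .with_limit(3)
    .do()
)
print(result)
```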