Vector Database: Weaviate
By mggg
299 words
Vector Database: Weaviate
Weaviate is an innovative vector database known for its efficiency in storing and retrieving data. Utilizing vectors, Weaviate indexes data objects based on their semantic properties, offering a unique approach to data handling. It supports a variety of modules, including text2vec
and OpenAI embeddings
, providing flexibility in data vectorization.
Getting Started with Weaviate
Deploying Weaviate is straightforward with docker-compose
. The OpenAI module transforms text into embeddings, enhancing semantic search capabilities.
---
version: '3.4'
services:
weaviate:
image: semitechnologies/weaviate:1.20.0
restart: on-failure:0
ports:
- "8080:8080"
environment:
QUERY_DEFAULTS_LIMIT: 20
AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
PERSISTENCE_DATA_PATH: "./data"
DEFAULT_VECTORIZER_MODULE: text2vec-openai
ENABLE_MODULES: text2vec-openai
OPENAI_APIKEY: sk-xxx # replace with your OpenAI key
CLUSTER_HOSTNAME: 'node1'
Integrating Weaviate with OpenAI
1. Setting Up the Connection
First, let’s establish a connection to the Weaviate vector database using the weaviate-client
Python library.
pip install weaviate-client
import weaviate
client = weaviate.Client(
url = "http://localhost:8080"
)
client.is_live()
2. Inserting Data into Weaviate
With the connection established, inserting data into the Weaviate vector database is simple. Here, we demonstrate adding a data object.
uuid = client.data_object.create({
'question': 'This vector DB is OSS & supports automatic property type inference on import',
'somePropNotInTheSchema': 123, # automatically added as a numeric property
}, 'JeopardyQuestion')
print(uuid)
3. Retrieving Data
Retrieving data from Weaviate is just as straightforward, thanks to its efficient vector search capabilities.
import json
data_object = client.data_object.get_by_id(
'88e6d5e4-f31c-47f1-a7d9-cf0260e6a75e',
class_name='JeopardyQuestion',
)
print(json.dumps(data_object, indent=2))
4. Executing Basic Searches
Performing basic search operations in Weaviate is efficient, making it ideal for applications like mlb the show 22 database
and other complex datasets.
response = (
client.query
.get("JeopardyQuestion", ["question"])
.do()
)
print(response)
Conclusion
Weaviate’s harmonious integration with OpenAI, paired with its rapid vectorization capabilities, establishes it as an exceptional choice for semantic search operations across diverse projects, from a simple python vector database to the multifaceted MLB The Show DB.