Astra DB
DataStax Astra DB is a serverless vector-capable database built on Cassandra and made conveniently available through an easy-to-use JSON API.
AstraDBStore and AstraDBByteStore need the astrapy package to be
installed:
%pip install --upgrade --quiet astrapy
Both stores take the following parameters:
api_endpoint: Astra DB API endpoint. Looks like https://01234567-89ab-cdef-0123-456789abcdef-us-east1.apps.astra.datastax.com
token: Astra DB token. Looks like AstraCS:6gBhNmsk135....
collection_name: Astra DB collection name
namespace: (Optional) Astra DB namespace
AstraDBStore
The AstraDBStore is an implementation of BaseStore that stores
everything in your DataStax Astra DB instance. The store keys must be
strings and will be mapped to the _id field of the Astra DB document.
The store values can be any object that can be serialized by
json.dumps. In the database, entries will have the form:
{
"_id": "<key>",
"value": <value>
}
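As a rough illustration of this mapping (a sketch only, not the library's actual implementation), the snippet below builds the document for one key/value pair using just the standard library. The helper name to_document is hypothetical and is not part of AstraDBStore.

```python
import json


def to_document(key: str, value) -> dict:
    """Build the document shape described above for one store entry.

    Hypothetical helper for illustration; not part of AstraDBStore.
    """
    if not isinstance(key, str):
        raise TypeError("store keys must be strings")
    # json.dumps raises TypeError when the value is not JSON-serializable
    # (for example, raw bytes), which mirrors the constraint stated above.
    json.dumps(value)
    return {"_id": key, "value": value}


print(to_document("k2", [0.1, 0.2, 0.3]))
```

Values such as bytes fail the json.dumps check, which is the case the AstraDBByteStore is meant to cover.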
from langchain_community.storage import AstraDBStore
from getpass import getpass
ASTRA_DB_API_ENDPOINT = input("ASTRA_DB_API_ENDPOINT = ")
ASTRA_DB_APPLICATION_TOKEN = getpass("ASTRA_DB_APPLICATION_TOKEN = ")
store = AstraDBStore(
api_endpoint=ASTRA_DB_API_ENDPOINT,
token=ASTRA_DB_APPLICATION_TOKEN,
collection_name="my_store",
)
store.mset([("k1", "v1"), ("k2", [0.1, 0.2, 0.3])])
print(store.mget(["k1", "k2"]))
['v1', [0.1, 0.2, 0.3]]
Usage with CacheBackedEmbeddings
You may use the AstraDBStore in conjunction with a
CacheBackedEmbeddings
to cache the results of embedding computations. Note that AstraDBStore
stores the embeddings as a list of floats without first converting them
to bytes, so we don't use from_bytes_store here.
from langchain.embeddings import CacheBackedEmbeddings, OpenAIEmbeddings
embeddings = CacheBackedEmbeddings(
underlying_embeddings=OpenAIEmbeddings(), document_embedding_store=store
)
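To make the caching idea concrete, here is a minimal sketch of the lookup-then-compute pattern, written against a hypothetical in-memory stand-in for the store. The names InMemoryStore, embed_with_cache, and fake_embed are assumptions made for illustration, and the sketch uses the raw text as the cache key, which is a simplification and may differ from how CacheBackedEmbeddings derives its keys.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Vector = List[float]


class InMemoryStore:
    """Hypothetical stand-in exposing the same mget/mset shape as the store."""

    def __init__(self) -> None:
        self._data: Dict[str, Vector] = {}

    def mget(self, keys: Sequence[str]) -> List[Optional[Vector]]:
        # Missing keys come back as None, in the same order as the request.
        return [self._data.get(key) for key in keys]

    def mset(self, pairs: Sequence[Tuple[str, Vector]]) -> None:
        self._data.update(pairs)


def embed_with_cache(
    texts: Sequence[str],
    embed: Callable[[List[str]], List[Vector]],
    store: InMemoryStore,
) -> List[Optional[Vector]]:
    """Return one vector per text, computing only the ones not yet cached."""
    cached = store.mget(texts)
    # Keep each missing text once, preserving first-seen order.
    missing = list(dict.fromkeys(t for t, v in zip(texts, cached) if v is None))
    if missing:
        store.mset(list(zip(missing, embed(missing))))
    return store.mget(texts)


calls: List[List[str]] = []


def fake_embed(texts: List[str]) -> List[Vector]:
    """Toy embedding (text length) that records each batch it is asked for."""
    calls.append(list(texts))
    return [[float(len(t))] for t in texts]


cache = InMemoryStore()
first = embed_with_cache(["hello", "hi"], fake_embed, cache)
second = embed_with_cache(["hello", "hey"], fake_embed, cache)
print(first, second, calls)
```

On the second call only "hey" reaches the embedding function, because "hello" is already in the store. The vectors are stored as plain lists of floats, matching the note above.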
AstraDBByteStore
The AstraDBByteStore is an implementation of ByteStore that stores
everything in your DataStax Astra DB instance. The store keys must be
strings and will be mapped to the _id field of the Astra DB document.
The bytes values are converted to base64 strings before being stored
in Astra DB. In the database, entries will have the form:
{
"_id": "<key>",
"value": "bytes encoded in base 64"
}
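The base64 round trip can be sketched with the standard library alone. This is an illustration of the encoding step described above, not the store's actual code. The helper names encode_value and decode_value are hypothetical.

```python
import base64


def encode_value(value: bytes) -> str:
    """Encode raw bytes as a base64 string suitable for a JSON document."""
    return base64.b64encode(value).decode("ascii")


def decode_value(stored: str) -> bytes:
    """Recover the original bytes from the stored base64 string."""
    return base64.b64decode(stored)


# Document shape for the entry ("k1", b"v1"), following the form above.
doc = {"_id": "k1", "value": encode_value(b"v1")}
print(doc)
print(decode_value(doc["value"]))
```

Because base64 output is plain ASCII text, arbitrary binary values can be placed in a JSON document and decoded back without loss.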
from langchain_community.storage import AstraDBByteStore
from getpass import getpass
ASTRA_DB_API_ENDPOINT = input("ASTRA_DB_API_ENDPOINT = ")
ASTRA_DB_APPLICATION_TOKEN = getpass("ASTRA_DB_APPLICATION_TOKEN = ")
store = AstraDBByteStore(
api_endpoint=ASTRA_DB_API_ENDPOINT,
token=ASTRA_DB_APPLICATION_TOKEN,
collection_name="my_store",
)
store.mset([("k1", b"v1"), ("k2", b"v2")])
print(store.mget(["k1", "k2"]))
[b'v1', b'v2']