Now store your own vector database with fairOS
FaVe (FairOS Vector store) is a truly decentralised, open-source vector database built with Fair Data Principles in mind on top of FairOS.
IMPORTANT: FaVe is under heavy development and in an early BETA stage. Abnormal behaviour and data loss may occur. We do not recommend using the same account in parallel from multiple installations; doing so might corrupt your data.
You can get Docker from here.
You will need a running Bee node with a valid stamp ID.
We recommend Swarm Desktop for setting up your Bee node. Here is a guide for it.
You will need an FDP/Fairdrive account to use FaVe. You can create one from here.
Export the following from your terminal
export VERBOSE=
export BEE_API=
export RPC_API=
export STAMP_ID=
export VECTORIZER_URL=
export USER=
export PASSWORD=
export POD=
Note: VECTORIZER_URL is optional; it can be left unset if we want to provide embeddings generated from other sources.
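Before starting the server, it can help to confirm that the variables are actually set. The following is a minimal sketch in Python, not part of FaVe. It assumes that every variable in the list above except VECTORIZER_URL is required, which the note suggests but does not state outright. The helper name `missing_fave_env` is hypothetical.

```python
import os

# Variables from the export list above. Treating everything except
# VECTORIZER_URL as required is an assumption; adjust to your setup.
REQUIRED_VARS = ["VERBOSE", "BEE_API", "RPC_API", "STAMP_ID", "USER", "PASSWORD", "POD"]
OPTIONAL_VARS = ["VECTORIZER_URL"]


def missing_fave_env(env=None):
    """Return the required variable names that are unset or empty (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_fave_env()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running the script in the same terminal session as the exports lists any variable that still needs a value.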
Then run the following command
go run cmd/fave-server/main.go --port 1234 --keep-alive 6000m --write-timeout 6000m --read-timeout 6000m
Alternatively, run the prebuilt Docker image:
docker run -d \
-e VERBOSE=true \
-e BEE_API=<BEE_API> \
-e RPC_API=<RPC_ENDPOINT_FOR_ENS_AUTH> \
-e STAMP_ID=<STAMP_ID> \
-e USER=<FAIROS_USERNAME> \
-e PASSWORD=<FAIROS_PASSWORD> \
-e POD=<POD_FOR_STORING_DB> \
-e VECTORIZER_URL=<API_ENDPOINT_FOR_VECTORIZER> \
-p 1234:1234 \
fairdatasociety/fave:latest --port 1234 --host 0.0.0.0 --keep-alive 6000m --write-timeout 6000m --read-timeout 6000m
Or you can build the Docker image yourself:
# build
docker build -t fds/fave .
# run
docker run -d \
-e VERBOSE=true \
-e BEE_API=<BEE_API> \
-e RPC_API=<RPC_ENDPOINT_FOR_ENS_AUTH> \
-e STAMP_ID=<STAMP_ID> \
-e USER=<FAIROS_USERNAME> \
-e PASSWORD=<FAIROS_PASSWORD> \
-e POD=<POD_FOR_STORING_DB> \
-e VECTORIZER_URL=<API_ENDPOINT_FOR_VECTORIZER> \
-p 1234:1234 \
fds/fave --port 1234 --host 0.0.0.0 --keep-alive 6000m --write-timeout 6000m --read-timeout 6000m
FaVe currently supports only text vectorization.
The system first produces vector representations or embeddings from a chosen vectorizer. Following that, it determines the nearest neighbors based on these embeddings. Once this is done, the content gets uploaded, and subsequently, the information about the nearest neighbors is also uploaded.
When conducting a search for a particular term, it computes the distance from a designated starting point and then searches for a match within the precomputed nearest neighbors.
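The two phases described above can be sketched as follows. This is an illustration of the general idea only, not FaVe's implementation: the cosine distance metric, the greedy walk, and the function names `build_neighbors` and `greedy_search` are all assumptions made for the example, and the vectors are hand-written stand-ins for real embeddings.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def build_neighbors(vectors, k):
    """Index time: precompute, for every id, the ids of its k nearest neighbors."""
    neighbors = {}
    for doc_id, vec in vectors.items():
        others = [
            (cosine_distance(vec, other_vec), other_id)
            for other_id, other_vec in vectors.items()
            if other_id != doc_id
        ]
        neighbors[doc_id] = [other_id for _, other_id in sorted(others)[:k]]
    return neighbors


def greedy_search(query, vectors, neighbors, start):
    """Search time: walk from `start` towards the query via precomputed neighbors."""
    current = start
    current_dist = cosine_distance(query, vectors[current])
    while True:
        best, best_dist = current, current_dist
        for candidate in neighbors[current]:
            dist = cosine_distance(query, vectors[candidate])
            if dist < best_dist:
                best, best_dist = candidate, dist
        if best == current:
            # No neighbor is closer to the query than the current document.
            return current, current_dist
        current, current_dist = best, best_dist


# Toy embeddings standing in for vectorizer output.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
neighbors = build_neighbors(vectors, k=2)
result = greedy_search([0.0, 1.0], vectors, neighbors, start="a")
print(result)  # the walk goes a -> d -> c and stops at document "c"
```

A greedy walk like this can stop at a local minimum on larger data sets, so real indexes usually keep several candidates per step; the sketch keeps one for brevity.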
FaVe utilizes fairOS internally, meaning it’s embedded directly rather than through REST APIs. FaVe itself offers a set of REST APIs for various functions.
Before we go any further we need these concepts cleared up:
Collection: This term refers to a namespace that points to a specific fairOS document and key-value store. From the user's point of view, a collection is a place to store documents.
Documents: These are individual records placed within a collection.
Properties: These represent features of a document that are stored. FaVe vectorizes a set of properties that can be used for search.
Vectorizer: It’s worth noting that the vectorizer, responsible for creating vector representations of document properties, is intended to be a separate service from FaVe.
FaVe provides a set of REST APIs for creating collections, adding documents, and retrieving nearest documents.
Before uploading data as documents, some preprocessing is required.
Here are the steps:
We have to prepare the documents in a specific format before uploading them via the REST API:
{
"name": "collection1",
"propertiesToIndex": ["property1"],
"documents": [
{
"id": "721dfcef-5b95-4eeb-99fc-841784a397df",
"properties": {
"property1": "foo1",
"property2": "bar1"
}
},
{
"id": "721dfcef-5b95-4eeb-99fc-841784a397dg",
"properties": {
"property1": "foo2",
"property2": "bar2"
}
}
]
}
This is an example of the add documents request body.
We have to provide the name of the collection. The propertiesToIndex is an array of properties that we want to index/vectorize in the vector database. We are only indexing property1 in this example.
The documents array contains the documents that we want to upload. Each document has a unique id and properties. Properties are the features of the document and should contain key-value pairs. All the documents should have the same properties.
Once we have the data in the correct format, we can upload it to FaVe.
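As a sketch of the preprocessing step, the following Python helper turns a list of plain records into a request body in the format shown above. The helper name `build_add_documents_body` is hypothetical, generating ids with `uuid.uuid4()` is an assumption (the format only asks for a unique id), and the check that indexed properties exist in the documents is our own addition. The endpoint for the upload is not listed here, so the example stops at producing the JSON.

```python
import json
import uuid


def build_add_documents_body(collection, properties_to_index, records):
    """Build an add-documents body in the format shown above (hypothetical helper)."""
    if not records:
        raise ValueError("at least one record is required")
    expected_keys = set(records[0])
    # Assumption: every property we ask to index must exist in the documents.
    missing = [p for p in properties_to_index if p not in expected_keys]
    if missing:
        raise ValueError(f"properties to index not present in documents: {missing}")
    documents = []
    for record in records:
        # All documents should have the same properties.
        if set(record) != expected_keys:
            raise ValueError("all documents must have the same properties")
        # Assumption: a random UUID is an acceptable unique id.
        documents.append({"id": str(uuid.uuid4()), "properties": dict(record)})
    return {
        "name": collection,
        "propertiesToIndex": list(properties_to_index),
        "documents": documents,
    }


body = build_add_documents_body(
    "collection1",
    ["property1"],
    [
        {"property1": "foo1", "property2": "bar1"},
        {"property1": "foo2", "property2": "bar2"},
    ],
)
print(json.dumps(body, indent=2))
```

The printed JSON can then be sent as the body of the add documents request.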
FaVe provides a REST API for retrieving the nearest documents from a collection, given a query and a maximum distance.
The response contains the nearest documents along with their properties and their distances from the query.
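To illustrate what a maximum distance means for the result, here is a small local sketch in Python. It does not call FaVe and does not reproduce its response schema: the field names `id`, `properties`, and `distance`, the function name `nearest_documents`, and the use of Euclidean distance (chosen only for brevity) are assumptions made for the example.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_documents(query_vector, indexed, max_distance):
    """Return documents within max_distance of the query, closest first."""
    hits = []
    for doc in indexed:
        dist = euclidean_distance(query_vector, doc["vector"])
        if dist <= max_distance:
            hits.append({"id": doc["id"], "properties": doc["properties"], "distance": dist})
    return sorted(hits, key=lambda hit: hit["distance"])


# Toy index: each entry pairs a document with a made-up embedding.
indexed = [
    {"id": "1", "properties": {"property1": "foo1"}, "vector": [0.0, 0.0]},
    {"id": "2", "properties": {"property1": "foo2"}, "vector": [3.0, 4.0]},
    {"id": "3", "properties": {"property1": "foo3"}, "vector": [1.0, 0.0]},
]
hits = nearest_documents([0.0, 0.0], indexed, max_distance=2.0)
for hit in hits:
    print(hit["id"], hit["properties"], hit["distance"])
```

Document "2" lies at distance 5.0 from the query and is therefore left out when the maximum distance is 2.0.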