NOTICE
Hey there
Here's the roadmap for AquilaDB refactoring:
- AquilaDB v1.0
- Technical Specifications - finalization after review
- Technical Specifications - public draft for review
- [Update - Jul 31 2020] White paper published after review
- [Update - Jul 25 2020] White paper public draft is now available for review.
The AquilaDB core team has temporarily stopped development for the upcoming 2 months. We've decided to take a step back and face the whiteboard again. This is required to ensure AquilaDB's sustainability as an Open Source project and to reduce stress on the limited resources we have in the development process. We will see you soon for sure.
We know that some of you have reached here as part of your time-critical projects. We're sorry for the inconvenience. Don't worry, we can direct you to some alternatives that we know:
The examples available in our documentation will work on all these platforms with small API changes.
If you're learning Machine Learning techniques and interested in Similarity Search, play around and bear with us.
If you want to lend a hand and help the community, please check the issues section. We're happy to merge your pull requests. Any new crazy addition is also encouraged. Please also extend your help to our Discord community support.
And finally, Stay Home
regards,
a_mma team
Do you like this project? We love getting a star
AquilaDB
AquilaDB is a Decentralized Vector Database to store Feature Vectors along with JSON Document Metadata. Do k-NN retrieval from anywhere, even from the darkest rifts of Aquila (in progress). It is dead simple to set up, language agnostic, and a drop-in addition for your Machine Learning applications. With its current features, AquilaDB is a ready solution for Machine Learning engineers and Data Scientists to build Neural Information Retrieval applications out of the box with minimal dependencies (visit the wiki page for use case examples).
The AquilaDB 1.0 release is still a distant goal. Visit the Contribute section below to see the detailed development plan and milestones.
We make sure that each release and the AquilaDB master branch are stable, with all planned features up to date. All new pull requests are made to the develop branch, so develop is the default, bleeding-edge branch with all the latest updates.
Github, Docker Hub, Documentation (dedicated Wiki page)
Who is this for
- If you are working on a data science project and need to store a hell of a lot of data and retrieve similar data based on some feature vector, this will be a useful tool for you, with the extra benefits a real-world web application needs.
- Are you dealing with a lot of images and related metadata? Want to find the similar ones? You are at the right place.
- If you are looking for a document database, this is not the right place for you.
Technology
AquilaDB is not built from scratch. Thanks to the OSS community, it is based on a couple of cool open source projects out there. We took a couch and added some wheels and jetpacks to make it a super cool butt rest for Data Scientists and ML Engineers. While CouchDB provides us with networking and scalability benefits, FAISS and Annoy provide superfast similarity search. Along with our peer management service, AquilaDB provides a unique solution.
Prerequisites
You need Docker installed.
Usage
AquilaDB is quick to set up and run as a Docker container. All you need to do is either build it from source or pull it from Docker Hub.
Option 1: build from source
- clone this repository
- build image:
docker build -f <Dockerfile name> -t ammaorg/aquiladb:latest .
Option 2: pull from dockerhub
- pull image:
docker pull ammaorg/aquiladb:latest
Finally, deploy
- deploy:
docker run -d -i -p 50051:50051 -v "<local data persist directory>:/data" -t ammaorg/aquiladb:latest
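Once the container is started, it can be handy to confirm that the published port is reachable before pointing a client at it. The following is a minimal sketch using only the Python standard library. The helper name `wait_for_port` and its parameters are hypothetical and not part of AquilaDB, and an open TCP port only shows that something is listening, not that the gRPC API is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout.

    Hypothetical helper for illustration; it is not part of AquilaDB.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect proves the port is open, nothing more.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With the `docker run` command above, calling `wait_for_port("localhost", 50051)` should return `True` once the container is accepting connections on the mapped port.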
AquilaDB as a kubernetes service
Run the following kubectl command to expose AquilaDB as a service in a k8s cluster
- deploy:
kubectl apply -f https://raw.githubusercontent.com/a-mma/AquilaDB/<Github branch>/kubernetes/aquiladb.yml
Client SDKs
We currently have multiple client libraries in progress to abstract the communication between deployed AquilaDB and your applications.
AquilaDB exposes gRPC APIs for clients, which means you can communicate directly with AquilaDB from your favourite language (API reference). The client libraries make use of this to abstract the communication details from the end user. If you are familiar with gRPC and would like to contribute a new client library in any other language, please let us know. See the Protocol Buffers API reference and the example usage of the APIs in Node.js.
Benchmark
For benchmark results, visit https://aquiladb.xyz/docs/adb-benchmarks
Progress
This project is still under active development (pre-release). It can be used as a standalone database now. The peer manager is a work in progress, so networking capabilities are not available yet. With release v1.0 we will release a pre-optimized version of AquilaDB.
Contribute
We have prepared a document to help anyone interested in contributing get started with AquilaDB right away.
Here is our high level release roadmap.
Learn
We have started meeting developers and giving short talks on AquilaDB. Here are the slides that we use on those occasions: http://bit.ly/AquilaDB-slides
As of current AquilaDB release features, you can build Neural Information Retrieval applications out of the box without any external dependencies. Here are some useful links to learn more about it and start building:
- These use case examples will give you an understanding of what is possible and what is not: https://github.com/a-mma/AquilaDB/wiki
- Microsoft published a paper and a YouTube video on this to onboard anyone interested:
- Embeddings for Everything: Search in the Neural Network Era: https://www.youtube.com/watch?v=JGHVJXP9NHw
- Autoencoders are one family of deep learning algorithms that will help you build semantic vectors, the foundation for Neural Information Retrieval. Here are some links to autoencoder-based IR:
- Note that the idea of information retrieval applies not only to text data but to any data. All you need to do is encode any source datatype into a dense vector with deep neural networks.
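To make the retrieval idea above concrete, here is a minimal sketch of k-NN search over dense vectors stored alongside JSON-like metadata. It is not AquilaDB's API: the function names and the `vector` / `metadata` field names are assumptions chosen for illustration. It also uses exact brute-force cosine similarity, whereas libraries such as FAISS and Annoy rely on specialised index structures to scale to large collections.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple

Document = Dict[str, Any]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn(query: Sequence[float], documents: List[Document], k: int) -> List[Tuple[float, Document]]:
    """Return the k documents most similar to the query, best match first."""
    scored = [(cosine_similarity(query, doc["vector"]), doc) for doc in documents]
    # Sort on the score only, so the document dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical documents: in practice the vectors would come from a neural encoder.
DOCUMENTS: List[Document] = [
    {"vector": [1.0, 0.0, 0.0], "metadata": {"name": "cat.jpg"}},
    {"vector": [0.9, 0.1, 0.0], "metadata": {"name": "kitten.jpg"}},
    {"vector": [0.0, 0.0, 1.0], "metadata": {"name": "truck.jpg"}},
]
```

Calling `knn([1.0, 0.05, 0.0], DOCUMENTS, 2)` ranks `cat.jpg` first and `kitten.jpg` second, since their vectors point in nearly the same direction as the query while the `truck.jpg` vector is orthogonal to it.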
Our Sponsors
LOVE
to sponsor this project contact@aquiladb.xyz
Citing AquilaDB
If you use AquilaDB in an academic paper, we would love to be cited:
\footnote{https://github.com/a-mma/AquilaDB}
@misc{a_മ്മ2019AquilaDB,
title={AquilaDB: Neural Information Retrieval Solution},
author={Jubin Jose},
howpublished={\url{https://github.com/a-mma/AquilaDB}},
year={2019}
}
License
Apache License 2.0 (see the license file)


