Real-time AI company DataStax joins the OpenSearch Software Foundation

DataStax, the real-time AI company, announced that it has joined the OpenSearch Software Foundation and will be working alongside like-minded organisations to drive open innovation in data search and analytics.

The OpenSearch Software Foundation, which was launched by the Linux Foundation last year, is a community-driven initiative to support OpenSearch, an open-source, enterprise-grade search and observability suite that brings order to unstructured data at scale. The foundation’s stated aim, to provide resources that enable the long-term sustainability of the open-source project and its ecosystem, aligns closely with DataStax’s history of supporting open-source database software.

DataStax and the OpenSearch Project are announcing a series of integration efforts to support generative AI developers. Retrieval-augmented generation (RAG) is a key design pattern in generative AI. RAG applications work by assembling context from a variety of sources, which is then processed by a large language model (LLM) to provide an intelligent and relevant response. Serving these applications requires a mix of data retrieval and storage capabilities, and OpenSearch and DataStax are committed to working together to serve the broad needs of generative AI developers.
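To make the RAG pattern concrete, here is a minimal Python sketch of the context-assembly step. It is an illustration only: the retriever functions, their names, and the prompt wording are all assumptions made for this example, and none of them represents an OpenSearch or DataStax API. In a real application the retrievers would query a search engine or database, and the prompt would be sent to an LLM.

```python
from typing import Callable, Dict, List

# Hypothetical retriever type: maps a question to a list of text snippets.
Retriever = Callable[[str], List[str]]


def assemble_context(question: str,
                     retrievers: Dict[str, Retriever],
                     max_snippets: int = 4) -> List[str]:
    """Collect snippets from every source, skipping exact duplicates."""
    seen = set()
    context = []
    for name, retrieve in retrievers.items():
        for snippet in retrieve(question):
            if snippet not in seen:
                seen.add(snippet)
                # Tag each snippet with the source that returned it first.
                context.append(f"[{name}] {snippet}")
    return context[:max_snippets]


def build_prompt(question: str, context: List[str]) -> str:
    """Combine the retrieved context and the question into one LLM prompt."""
    joined = "\n".join(context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


# Toy stand-ins for real data sources (assumptions for illustration).
def keyword_source(question: str) -> List[str]:
    return ["JVector is an embedded vector search engine."]


def vector_source(question: str) -> List[str]:
    return [
        "JVector is an embedded vector search engine.",
        "OpenSearch provides full-text search.",
    ]


retrievers = {"keyword": keyword_source, "vector": vector_source}
context = assemble_context("What is JVector?", retrievers)
prompt = build_prompt("What is JVector?", context)
print(prompt)
```

The duplicate snippet returned by both sources appears only once in the assembled context, so the prompt stays compact while still drawing on more than one source.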

DataStax and OpenSearch: A powerful combination  

DataStax has been working with OpenSearch for some time. Last year, they announced an OpenSearch integration with JVector, an open-source, embedded vector search engine developed by DataStax Co-Founder Jonathan Ellis.

The combination provides developers with extremely flexible information retrieval, using applications that many enterprises are already familiar with. It bridges the gap between single-document Q&A and open-domain Q&A, providing the ability to reason across diverse documents and texts by combining OpenSearch’s keyword search with the dense vector search of JVector, which is the same indexing library used in Astra DB, DataStax Hyper-Converged Database (HCD), and Apache Cassandra.
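As a rough illustration of how keyword and dense vector results can be combined, the sketch below fuses two ranked lists with reciprocal rank fusion (RRF). RRF is one common fusion technique chosen here for simplicity; this is not a description of how the OpenSearch and JVector integration merges results. The documents, the two-dimensional embeddings, and the function names are all invented for the example.

```python
import math
from collections import defaultdict

# Toy corpus and hand-made 2-D embeddings (assumptions for illustration).
DOCS = {
    "d1": "opensearch keyword search for enterprise logs",
    "d2": "vector search finds semantically similar text",
    "d3": "hybrid search combines keyword and vector search",
}
EMBEDDINGS = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.7, 0.7]}


def keyword_rank(query):
    """Rank documents by the number of distinct query terms they contain."""
    terms = set(query.lower().split())
    scores = {d: len(terms & set(text.split())) for d, text in DOCS.items()}
    matched = [d for d in scores if scores[d] > 0]
    return sorted(matched, key=lambda d: (-scores[d], d))


def cosine(a, b):
    """Cosine similarity between two non-zero 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def vector_rank(query_vec):
    """Rank documents by cosine similarity to the query embedding."""
    return sorted(EMBEDDINGS,
                  key=lambda d: (-cosine(query_vec, EMBEDDINGS[d]), d))


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each document scores the sum of 1 / (k + rank)."""
    fused = defaultdict(float)
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + position)
    return sorted(fused, key=lambda d: (-fused[d], d))


keyword_hits = keyword_rank("hybrid keyword search")
vector_hits = vector_rank([0.6, 0.8])
results = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(results)
```

A document that ranks well in both lists, here `d3`, rises to the top of the fused ranking, which is the intuition behind pairing keyword search with vector search.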

Hybrid Search with JVector and OpenSearch

Vector search in HCD empowers developers to harness proprietary data stored in Cassandra/DSE databases for LLMs, AI assistants, and real-time GenAI projects without compromising data security. With HCD’s innovative JVector technology, users get a 10x enhancement in vector search performance, which surpasses the capabilities of traditional Lucene-based search.

OpenSearch offers a comprehensive suite of features tailored for enterprise search, encompassing full-text search capabilities, advanced analytics, comprehensive monitoring tools, and robust security functionalities.

The future of enterprise search  

DataStax considers OpenSearch the future of enterprise search, particularly as the company continues to expand its self-managed search offerings. DataStax is excited to see where the integrations with OpenSearch lead, and to support OpenSearch users in getting the most out of their enterprise data estates.

Moving Forward

DataStax will maintain a JVector integration for OpenSearch and offer OpenSearch as part of its self-managed platform, HCDP (Hyper Converged Data Platform), and as an integration for its cloud service, Astra.

Enterprises have spent years investing in search infrastructure. With the inclusion of OpenSearch, DataStax can provide developers the most flexible information retrieval possible using applications already familiar to many enterprises. OpenSearch bridges the gap between single-document Q&A and open-domain Q&A, essentially providing the ability to reason across multiple diverse documents and texts by combining keyword search in OpenSearch with the dense vector search of JVector in Astra and HCDP.

For generative AI, relevance is critical. This partnership aims to ensure that your enterprise data estate can act as context for RAG and Gen AI workflows, making as much relevant data as possible available to the model.