Introduction

The way organizations work with data is evolving quickly as the connections between different types of information become increasingly important. Graph databases have emerged as a powerful way to store and query connected data, because for relationship-heavy workloads they typically outperform traditional SQL and NoSQL databases. In this blog, we evaluate 15 leading graph database providers based on their track record, capabilities, and innovation to identify the most promising options for developers and enterprises to watch in 2023.
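
To make the point about relationship queries concrete, here is a minimal Python sketch. It assumes a small, hypothetical "follows" graph held in memory as an adjacency list and is not taken from any product covered below. It only illustrates that, in a graph model, each extra hop is a direct neighbor lookup, whereas a relational schema would typically need one more self-join per hop.

```python
from collections import deque

# Hypothetical sample data: each person maps to the people they follow.
FOLLOWS = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave", "erin"],
    "dave": [],
    "erin": ["alice"],
}


def within_hops(graph, start, max_hops):
    """Return everyone reachable from `start` in at most `max_hops` hops."""
    seen = {start}
    reached = set()
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(person, []):
            if neighbour not in seen:
                seen.add(neighbour)
                reached.add(neighbour)
                queue.append((neighbour, depth + 1))
    return reached


if __name__ == "__main__":
    # People alice can reach in one or two hops.
    print(sorted(within_hops(FOLLOWS, "alice", 2)))
```

Raising `max_hops` changes nothing in the code structure, which is the property that makes multi-hop questions comfortable to express in graph query languages.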

Methods of Evaluation

To rank the graph databases, we considered several key factors: popularity and community size (based on GitHub stars and Stack Overflow discussions), performance benchmarks, supported queries and features, scalability, pricing models, and ongoing development activity. We also looked at trends in job postings, conference presence, clientele, and the number of integrations and third-party solutions to gauge the mindshare and momentum of each vendor. Traditional criteria such as ease of use, documentation, and support availability were part of the evaluation as well, along with emerging metrics like backlinks, traffic, and keyword trends that hint at a vendor’s growth potential.

1. Datumaro

Datumaro is an open-source framework for machine learning model annotation and dataset management. It provides tools to create, annotate, augment, validate and explore different types of datasets, including graph datasets. Datumaro was created by Intel to help with computer vision and natural language processing projects that require high-quality labeled data.

Pros: Some key advantages of using Datumaro include:

– Framework for ML model annotation and dataset management
– Works with graph datasets from Neo4j and others
– Helpful for computer vision and NLP projects

Cons: One potential disadvantage is that, as an open-source project, it may not have the same commercial support and resources as paid alternatives.

Pricing: Datumaro is open-source and free to use. There is no paid version or pricing.

Some key facts about Datumaro include:

– Open-source and free to use
– Supports all major machine learning frameworks including TensorFlow, PyTorch and others
– Can work with graph datasets from Neo4j and other graph databases


2. Elasticsearch

Elasticsearch is a distributed search and analytics engine that is free to use in its Basic tier. Its flexibility and scalability let it store, search, and analyze large volumes of data in near real time.

Pros: Some key advantages of Elasticsearch include:
– Document database with graph capabilities.
– Good for associative and property graphs.
– Native spatial, machine learning, and SQL capabilities.
– Large community and ecosystem.

Cons: One potential disadvantage of Elasticsearch is that it requires more resources and IT skills to manage and maintain compared to some other options.

Pricing: Elasticsearch has both open source and commercial licensing options. The open source Basic packages are free to use, while Gold, Platinum, and Enterprise subscription packages are paid with support and additional features.

Some key stats about Elasticsearch include:
– It powers search experiences on over 1 billion websites and devices worldwide.
– Over 500 million downloads of Elasticsearch alone and over 1 billion downloads of the Elastic Stack.
– Used by companies like NYTimes, Stack Overflow, and Cisco.


3. Gemini

Gemini is a cryptocurrency exchange and custodian that allows users to buy, sell and store bitcoin and other cryptocurrencies. Founded in 2014 by Cameron and Tyler Winklevoss, Gemini is one of the most well-established exchanges in the cryptocurrency space.

Pros: Some key advantages of using Gemini include:

– Intuitive and secure trading platform
– Institutional grade security and regulatory compliance
– Insured hot and cold storage for deposited funds
– Competitive trading fees and 24/7 customer support

Cons: One potential disadvantage is that Gemini has fewer cryptocurrencies supported for trading compared to some other top exchanges. However, they focus on adding only widely used and established coins.

Pricing: Gemini has a tiered maker-taker fee structure for trading ranging from 0.25% to 0% depending on your 30-day trading volume. Withdrawal fees vary by cryptocurrency but are generally low.

Some key stats about Gemini include:

– Over 70 cryptocurrencies supported for trading including Bitcoin, Ethereum, Litecoin and more
– Regulated as a licensed NY trust by the New York State Department of Financial Services
– Over $10 billion in crypto deposited and traded since inception
– Multi-signature crypto storage with insurance for funds held


4. Neo4j

Neo4j is a native graph database developed by Neo4j, Inc. Founded in 2007, Neo4j is the leading graph database platform that applies a graph-first approach to data modeling and relationships. Neo4j’s developer-friendly platform enables data experts, developers and applications to explore data and reveal hidden insights at amazing speed and scale.

Pros: Some key advantages of Neo4j include:

– Most popular and mature graph database with largest community
– Very fast for complex graph queries
– Native graph storage and processing provides great performance
– Rich set of relationship types and properties enable flexible data modeling

Cons: One potential disadvantage of Neo4j is that it is built around graph queries, which require a different way of thinking about data than traditional table-based SQL queries.

Pricing: Neo4j offers a variety of pricing plans, including an open-core Community Edition that is free to use under the GPLv3 license. Commercial plans start from $200/month for small companies and go up in price and features for larger deployments.

Some key stats about Neo4j include:

– Over 550,000 community and commercial deployments in more than 250,000 companies globally
– Supports billions of relationships and hundreds of millions of nodes
– Adopted by large enterprises across industries, including Walmart, Boeing, and eBay
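
The property graph model mentioned above (typed relationships, with properties on both nodes and relationships) can be sketched in a few lines of plain Python. This is an illustrative in-memory model, not Neo4j's storage engine or driver API; the class names, the `match` helper, and the sample data are all hypothetical. The `match` method mimics what a Cypher pattern such as `MATCH (p:Person)-[r:WORKS_AT]->(c:Company) RETURN p.name, r.role` expresses.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    label: str
    props: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int
    end: int
    rel_type: str
    props: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy property graph: labeled nodes, typed relationships, free-form properties."""

    def __init__(self):
        self.nodes = {}
        self.rels = []

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = Node(node_id, label, props)

    def add_rel(self, start, end, rel_type, **props):
        self.rels.append(Relationship(start, end, rel_type, props))

    def match(self, start_label, rel_type, end_label):
        """Yield (start, rel, end) for each relationship that fits the pattern."""
        for rel in self.rels:
            a, b = self.nodes[rel.start], self.nodes[rel.end]
            if (rel.rel_type == rel_type
                    and a.label == start_label
                    and b.label == end_label):
                yield a, rel, b


def build_sample():
    """Build a small hypothetical graph of people and one company."""
    g = PropertyGraph()
    g.add_node(1, "Person", name="Ann")
    g.add_node(2, "Person", name="Raj")
    g.add_node(3, "Company", name="Acme")
    g.add_rel(1, 2, "KNOWS", since=2019)
    g.add_rel(1, 3, "WORKS_AT", role="engineer")
    g.add_rel(2, 3, "WORKS_AT", role="analyst")
    return g


if __name__ == "__main__":
    for person, rel, company in build_sample().match("Person", "WORKS_AT", "Company"):
        print(person.props["name"], rel.props["role"], company.props["name"])
```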


5. DataStax Graph

DataStax Graph is a distributed graph database that is built on Apache Cassandra. It provides a massively scalable, highly available database with tunable consistency (an AP system in CAP terms), designed for complex relationships and connections. DataStax Graph makes it easy to model, store and query graph-shaped data at scale.

Pros: Some key advantages of DataStax Graph include:
– Distributed graph database on Apache Cassandra for unlimited horizontal scalability
– Integrates with DataStax Enterprise for built-in search and analytics capabilities
– Gremlin query engine supports the full Gremlin traversal language for graph exploration and manipulation
– Compatible with all major graph formats and libraries like Giraph, Spark GraphX, and TensorFlow
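
For readers new to Gremlin, the sketch below imitates its step-chaining style in plain Python. It is a toy written for this article, not the Apache TinkerPop or DataStax API: the `Traversal` class, its three steps, and the sample data are hypothetical. The chain at the bottom corresponds roughly to the Gremlin traversal `g.V().has('name', 'alice').out('knows').values('name')`.

```python
class Traversal:
    """Toy step-chaining traversal over an in-memory graph."""

    def __init__(self, vertices, edges, current=None):
        self._vertices = vertices  # vertex id -> property dict
        self._edges = edges        # list of (from_id, label, to_id)
        # A fresh traversal starts from every vertex, like g.V().
        self._current = list(vertices) if current is None else current

    def _next(self, current):
        return Traversal(self._vertices, self._edges, current)

    def has(self, key, value):
        """Keep only vertices whose property `key` equals `value`."""
        return self._next(
            [v for v in self._current if self._vertices[v].get(key) == value]
        )

    def out(self, label):
        """Move along outgoing edges that carry the given label."""
        result = []
        for v in self._current:
            result.extend(
                dst for src, lbl, dst in self._edges if src == v and lbl == label
            )
        return self._next(result)

    def values(self, key):
        """End the traversal and return one property per remaining vertex."""
        return [self._vertices[v][key] for v in self._current]


# Hypothetical sample data.
VERTICES = {1: {"name": "alice"}, 2: {"name": "bob"}, 3: {"name": "carol"}}
EDGES = [(1, "knows", 2), (2, "knows", 3), (1, "knows", 3)]

if __name__ == "__main__":
    g = Traversal(VERTICES, EDGES)
    print(g.has("name", "alice").out("knows").values("name"))
```

Each step returns a new traversal, so hops compose by chaining further `out` calls instead of nesting queries.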

Cons: One potential disadvantage is that as a distributed database, DataStax Graph may have higher operational overhead compared to standalone graph databases for small datasets and workloads.

Pricing: DataStax Graph has the following pricing models:
– Monthly subscription based on data volume and server performance
– Bring your own license option for enterprise agreements
– Free 30-day trial available to test the database in a development/test environment

Some key stats about DataStax Graph include:
– Supported by over 250 customers globally across various industries
– Ability to scale out to over 1,000 servers for datasets with billions of vertices and relationships
– Achieves sub-second latency at large scale for queries involving millions of vertices


6. RedisGraph

RedisGraph is a graph database module for Redis. It allows developers to store, access and query property graph data structures using the Redis database. RedisGraph uses a graph data model with nodes, edges and properties to store and query data. At the core is the Cypher query language for querying graph data efficiently.

Pros: Some key advantages of RedisGraph include:
– Leverages Redis’ speed and scalability for graph workloads
– Property graph data model with nodes, edges and properties
– Cypher query language for flexible graph traversals and analytics
– Integrates Redis features like caching, pub/sub and persistence

Cons: The main disadvantage is that, as a relatively new product, it has less ecosystem support and fewer integrations than established graph databases like Neo4j. Adoption may also be limited by its reliance on Redis.

Pricing: RedisGraph is open source and free to use. For commercial use and additional features, Redis Enterprise is required which starts at $5,000/year for developer licenses.

Some key stats about RedisGraph:
– Part of Redis Enterprise, the real-time data platform
– Built-in module for Redis that can scale to hundreds of terabytes of data
– Supports the Cypher query language for graph queries
– Seamlessly integrates with Redis for caching, streaming and persistence


7. MarkLogic

MarkLogic is a multi-model database with powerful capabilities for both graph and document data. Founded in 2001, MarkLogic has been helping organizations manage and analyze complex, distributed content with its robust platform.

Pros: Some key advantages of MarkLogic include:

– Native support for JSON, XML, and relational data models allows flexible data ingestion
– Built-in graph capabilities allow knowledge graphs and complex relationship queries
– Sophisticated full-text and semantic search capabilities
– Scale-out architecture allows limitless growth potential
– Robust API ecosystem for application integration and development

Cons: One potential disadvantage is the high cost of ownership compared to some other database options. However, this is offset by MarkLogic’s powerful feature set for complex enterprise workloads.

Pricing: MarkLogic pricing is based on a per-CPU socket license model. Exact costs will vary based on configuration and needs, but generally start at around $10,000-$15,000 per CPU socket for basic support.

Some key stats about MarkLogic include:

– Used by over 1,500 customers including government agencies, financial services firms, and manufacturers
– Supports over 16 billion documents
– Over 20 years of experience building enterprise database technology
– Support for JSON, XML, and row-based data in a single platform


8. RedisGraph

RedisGraph is an open source graph database module for Redis developed by Redis Labs. RedisGraph allows users to represent and query graph-structured data using a graph query language called Cypher. It extends the Redis in-memory data structure store allowing natural and efficient storage and retrieval of graph-shaped datasets within Redis.

Pros: RedisGraph has several advantages over typical OLTP and OLAP databases for graph use cases:

– As an embedded module, it answers graph queries faster than a traditional RDBMS, which needs joins to follow relationships
– Performance scales linearly as the graph and query complexity increases
– Leverages Redis for strong data modeling, secondary indexing, caching and atomic operations

Cons: As an open source project, RedisGraph has fewer enterprise-level features than proprietary graph databases:

– Lacks some advanced analytics capabilities of specialized graph databases
– Support and commercial services are limited compared to paid solutions

Pricing: RedisGraph is open source and free to use. Redis Labs also offers commercial support and services for RedisGraph as part of Redis Enterprise, their fully managed Redis platform.

Some key features of RedisGraph include:

– Supports property graphs model where nodes and relationships can have key-value attributes
– Built-in support for common graph queries like matching, pathfinding and subgraph matching using Cypher
– Leverages Redis capabilities like caching, clustering and high availability
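
As a rough illustration of the pathfinding queries listed above, the following Python sketch finds a shortest path by hop count with breadth-first search. It is a generic textbook technique, not RedisGraph's implementation or API; the function name and the `ROUTES` sample edges are hypothetical.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Return the shortest directed path (fewest hops) as a list, or None."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    previous = {source: None}  # node -> node we arrived from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical sample data: directed edges between six nodes.
ROUTES = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E"), ("E", "C"), ("C", "F")]

if __name__ == "__main__":
    print(shortest_path(ROUTES, "A", "F"))
```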


9. AllegroGraph

AllegroGraph is a graph database developed by Franz Inc. It is an RDF triplestore that provides flexible graph storage and powerful SPARQL querying. AllegroGraph can manage billions of triples while providing sub-second query response times.

Pros: Some key advantages of AllegroGraph include:
– RDF triplestore with graph and SPARQL support
– Very fast bulk loading and querying of graphs
– Can federate multiple triplestores as virtual graph
– Supports reasoning, inference and semantic technologies

Cons: One potential disadvantage is that AllegroGraph is a proprietary commercial product, so may not be suitable for all open source projects or budgets.

Pricing: AllegroGraph pricing is based on the volume of triples and features required. There are on-premise licenses available as well as monthly cloud-based subscriptions on AWS and Microsoft Azure.

Some key stats about AllegroGraph include:
– Can store billions of triples and relationships
– Provides sub-second querying of large graphs
– Extremely fast bulk loading of large datasets
– Can integrate data from multiple sources through federation
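
To show what a triplestore with inference does conceptually, here is a small Python sketch. It assumes a hypothetical set of subject-predicate-object triples and a single forward-chaining rule over `rdfs:subClassOf`; it is not AllegroGraph's engine or client API. The wildcard `match` helper plays the role of a basic SPARQL triple pattern such as `SELECT ?x WHERE { ?x rdf:type ex:Animal }`.

```python
def match(triples, s=None, p=None, o=None):
    """Return triples fitting the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def infer_types(triples):
    """Apply one rule until nothing changes:
    (x rdf:type C) and (C rdfs:subClassOf D)  =>  (x rdf:type D)
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # match() returns a list copy, so adding to the set here is safe.
        for x, _, cls in match(inferred, p="rdf:type"):
            for _, _, parent in match(inferred, s=cls, p="rdfs:subClassOf"):
                new_triple = (x, "rdf:type", parent)
                if new_triple not in inferred:
                    inferred.add(new_triple)
                    changed = True
    return inferred


# Hypothetical sample data.
TRIPLES = {
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
}

if __name__ == "__main__":
    closed = infer_types(TRIPLES)
    print(sorted(t[0] for t in match(closed, p="rdf:type", o="ex:Animal")))
```

The stored data never says that `ex:rex` is an animal; the fact only appears after the rule has been applied, which is the kind of answer a reasoning-enabled store can return.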


10. ArangoDB

ArangoDB is a native multi-model database that combines graph, document, and key-value data models with flexible queries and real-time analytics. It allows applications to work with connected data as graphs, documents, or key-values to speed innovation and lower operational costs.

Pros: Some key advantages of ArangoDB include:
– Multi-model database supports graph, document and key-value storage
– Strong ACID transactions for consistency
– Powerful query language AQL for graphs and documents
– Scales horizontally with sharding and replication
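
Horizontal scaling through sharding and replication can be pictured with a short Python sketch. It is a generic hash-partitioning illustration under simplifying assumptions, not ArangoDB's actual placement logic: the function names, the server names, and the "next servers in the list hold the replicas" rule are all hypothetical.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a document key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def placement(key, num_shards, servers, replication_factor):
    """Return the servers assumed to hold copies of the shard owning `key`.

    Simplification: the shard's leader is chosen by shard number, and the
    replicas are the next servers in the list.
    """
    shard = shard_for(key, num_shards)
    return [servers[(shard + i) % len(servers)] for i in range(replication_factor)]


# Hypothetical cluster.
SERVERS = ["db-1", "db-2", "db-3"]

if __name__ == "__main__":
    for key in ("users/1001", "users/1002", "orders/77"):
        print(key, shard_for(key, 6), placement(key, 6, SERVERS, 2))
```

Because the hash is stable, every client computes the same location for a key without consulting a central lookup table, and a replication factor of 2 keeps each shard readable if one server fails.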

Cons: As with any database, one disadvantage of ArangoDB is the learning curve required to work with its query language AQL which has some differences from standard SQL.

Pricing: ArangoDB has both open source and commercial licenses. The open source version is free to use for any workload. Commercial licenses include support, maintenance, and enterprise features starting at $2,500 per year for 25GB of storage.

With more than 8 million data models deployed, ArangoDB is trusted by global enterprises to power mission critical applications. It can process over 1 million relationships per second and scales linearly with no degradation. Built on proven open source components, ArangoDB delivers high performance, availability, and reliability even for the most demanding workloads.


11. TigerGraph

TigerGraph is a native, parallel, distributed graph database for analytics and machine learning on graphs. Founded in 2012 and headquartered in Redwood City, California, TigerGraph provides a solution for link analytics at scale. Its patented GraphPlus analytics platform combines three parts: the distributed TigerGraph database for storing, sharing, and analyzing highly connected datasets; GSQL, a SQL-like query language for graph data; and integrations with multiple languages and tools.

Pros: Key advantages of TigerGraph include:

– Dedicated graph database with native graph storage and indexing
– Very fast traversal and analytics on huge graphs with billions of nodes
– Powerful graph query language GSQL optimized for graph operations
– Scales horizontally on commodity servers and cloud

Cons: One potential disadvantage is that, as a relatively new database, TigerGraph may not have as large an ecosystem of third-party tools, libraries, and community support as more established relational or NoSQL databases.

Pricing: TigerGraph offers both free and commercial licensing options. The TigerGraph Community Edition is free to use for non-commercial purposes. Commercial licenses start at $2,500/month and scale with the size and usage of the graph database deployment.

Some key stats about TigerGraph include:

– Supports graphs with billions of nodes and trillions of edges
– Fully native graph storage and indexing with the unique GraphPlus analytics platform
– Can power analytics across huge datasets up to 100x faster than other databases
– Used by customers across industries like banking, government, telco and retail


12. Dgraph

Dgraph is an open source, distributed graph database that allows building and querying graphs with high performance. Dgraph is architected ground up to be multi-tenant, auto-sharding, and horizontally scalable without compromising on ACID transactions. It powers mission critical applications for several large enterprises.

Pros: Some key advantages of Dgraph include:

– Distributed, horizontally scalable architecture allowing graphs with billions of nodes and edges
– Built-in GraphQL support, with the GraphQL API served over HTTP
– ACID transactions for consistency across distributed servers
– Native graph query language with predicates and filters for complex queries

Cons: Potential disadvantages of Dgraph include:

– Open source, but commercial support requires a paid plan
– Learning curve for the graph data model and query language

Pricing: Dgraph has the following pricing plans:

– Free plan for unlimited development and testing
– Paid plans starting at $5,000/month for commercial use, with support and product improvements

Some key stats about Dgraph include:

– Supports over 150,000 reads/writes per second
– Linear scalability up to hundreds of machines
– Stores and queries billions of entities and relationships
– Automated load balancing and partitioning across data centers


13. DBeaver Graph

DBeaver Graph is a universal database tool for working with graph databases. It serves as both an IDE and a data viewer for graph databases, enabling visual graph exploration, interactive querying, and data import/export capabilities. DBeaver Graph supports major graph databases like Neo4j, OrientDB, and Apache Giraph.

Pros: Some key advantages of DBeaver Graph include:
– Universal tool that works across different graph databases
– Provides both an IDE interface and graphical data viewer
– Allows visual exploration and querying of graph structures
– Includes features like import/export for data transfer

Cons: One potential disadvantage is that as a universal tool, DBeaver Graph may not provide database-specific advanced features compared to standalone clients of individual graph databases.

Pricing: DBeaver Graph has both open source and commercial editions. The open source version is free to use under the GPL license, while commercial subscriptions start at $99 per year and include extras like priority support.

Some key features of DBeaver Graph include:
– Graph database IDE and viewer with import/export tools
– Supports major graph databases like Neo4j, OrientDB, Apache Giraph
– Visual graph exploration, editing and interactive querying


14. OrientDB

OrientDB is a multi-model open source NoSQL database developed by OrientDB LTD. It is a document, graph and object-oriented database management system with built-in full text, geo-spatial and tag searching capabilities. OrientDB is a distributed ACID database with horizontal scalability and fault tolerance capabilities.

Pros: Some key advantages of OrientDB include:

– Multi-model database that allows storage and querying of documents, graphs and other data models in the same database
– Distributed and fault-tolerant architecture provides high-availability and horizontal scalability
– Powerful graph and document query languages (Gremlin and OSQL) for flexible data access and analysis
– ACID compliant transactions for data integrity
– Large ecosystem with official drivers and connectors for major programming languages

Cons: One potential disadvantage is that as a less commonly used open source database compared to competing products like Neo4j and MongoDB, OrientDB may have less community support and online documentation available.

Pricing: OrientDB has three main pricing options:

– Open Core – Free and open source under the Apache 2.0 license
– Enterprise – Priced subscriptions with support
– Server – Deployment of prebuilt Docker images or packages for on-premise or cloud use

Some key stats about OrientDB include:

– Over 10 years in development since its first release in 2011
– Used by over 250,000 companies worldwide including FINRA, Mazda, and Cerner
– Supports multiple programming languages and connectors including Java, Python, .NET and Node.js
– Can scale to petabytes of data on a distributed cluster


15. OrientDB

OrientDB is an open-source multi-model graph and document database created by OrientDB LTD. OrientDB combines document, graph, and object-oriented APIs into a flexible platform that handles complex and interconnected data.

Pros: Some key advantages of OrientDB include:
– Support for multiple data models like documents, graphs, and objects in a single database.
– Integrated full-text search functionality and indexes.
– Horizontal scalability to distribute load across servers.
– Built-in replication for high availability.
– Support for SQL, Gremlin, and other query languages.

Cons: One potential disadvantage is that, as an open-source platform, it may not have commercial support options as extensive as those of some proprietary database systems.

Pricing: OrientDB has both open-source and commercial licenses. The open-source version is free to use under the Apache 2.0 license. Commercial licenses start at $4,000/year for basic support and additional features.

Some key stats about OrientDB:
– OrientDB supports multi-model databases and allows storing data in documents, graphs, or object formats.
– It has integrated full-text search capabilities and can build full-text indexes on any property of a document or edge.
– OrientDB is horizontally scalable and allows distributing data across servers to handle large volumes of data and traffic.
– OrientDB databases can be queried using a mix of SQL, Gremlin traversal queries, and more.


Conclusion

While graph databases are still a niche compared to SQL and NoSQL databases, their adoption is growing rapidly, driven by use cases in fraud detection, recommendation engines, knowledge graphs, social networking and many other domains. The technology is also evolving quickly, with new query languages, specialized indexing, machine learning integrations and distributed architectures. The options evaluated here represent some of the most full-featured, performant and innovative graph databases on the market. By keeping an eye on their development, enterprises and developers will be well positioned to use the modeling capabilities of graph databases to unlock deeper insights from their complex, connected data.
