Will AWS Neptune become the leader in graph databases?

Amazon Neptune, a new fully managed cloud graph database, is generally available after its limited preview launch last year. With Amazon Neptune, customers can manage data in a graph model: a semantic structure made up of nodes, edges, and properties.

AWS manages the operational aspects of the Amazon Neptune graph database within its cloud platform, so customers do not have to handle maintenance, patching, backups, or restores. The service is also highly available, with read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones (AZs), and it supports encryption of data at rest and in transit.

With Amazon Neptune, AWS customers can create and query graph databases for various use cases, including fraud detection, social networking, and recommendations. One reference customer is Blackfynn, a life sciences software startup that is looking for ways to change how epilepsy, Alzheimer’s disease, Parkinson’s disease, ALS, and other neurological disorders are treated. Its SVP of engineering, Chris Baglieri, said in the same Business Wire article:

“We look forward to using Amazon Neptune as an integral part of our data platform. Neptune will allow us to connect the dots between genomics, pathology, neurochemistry, device and patient clinical data, efficiently and at scale, helping us drive breakthrough discoveries.”

Is AWS Neptune right for me?

When considering AWS Neptune, first make sure a graph database is the best storage structure for your dataset. A graph DB works best for highly connected datasets, where many of the data points connect to many others through multiple kinds of relations. The easiest way to decide between a graph DB and a relational DB is to try modeling the dataset and its connections with relational schemas. If you end up with many tables representing the different objects and even more tables representing the connections between those objects, a graph database may be the better way to represent the dataset. It may also help you discover connections you would never have seen using a relational DB.
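To make the modeling test concrete, here is a minimal, purely illustrative sketch in plain Python (no Neptune involved). It stores the same tiny "knows" dataset twice: once as a relational-style join table, and once as an adjacency structure, which is roughly how a graph model treats connections. The names `people`, `knows_rows`, and both functions are made up for this example.

```python
from collections import defaultdict

# Relational style: one table of people plus a join table of connections.
people = {1: "Ann", 2: "Bob", 3: "Cid", 4: "Dee"}
knows_rows = [(1, 2), (2, 3), (2, 4)]  # (from_id, to_id)


def friends_of_friends_relational(person_id):
    """Scan the join table twice, mimicking a self-join."""
    first_hop = {dst for src, dst in knows_rows if src == person_id}
    second_hop = {dst for src, dst in knows_rows if src in first_hop}
    return sorted(people[p] for p in second_hop - {person_id})


# Graph style: each node keeps its outgoing edges, so a hop is a lookup.
adjacency = defaultdict(set)
for src, dst in knows_rows:
    adjacency[src].add(dst)


def friends_of_friends_graph(person_id):
    """Follow edges node by node instead of scanning a table."""
    second_hop = set()
    for friend in adjacency[person_id]:
        second_hop |= adjacency[friend]
    return sorted(people[p] for p in second_hop - {person_id})


print(friends_of_friends_relational(1))
print(friends_of_friends_graph(1))
```

Both functions return the same answer. The difference is that the relational version needs another pass over the join table for every extra hop, while the graph version just follows edges. If most of your queries look like the second function, that is a hint the dataset is graph-shaped.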

Graph databases use one of two models for storing graph data:

  1. RDF
  2. Property Graph

Each has a popular query language associated with it: Gremlin for property graphs and SPARQL for RDF.

Most graph databases today support only one of these storage models. Even products that accept both query languages typically do so by translating one language into the one native to their storage engine, which incurs a performance penalty. Neptune is optimized for both query languages, so you can choose the one you prefer and, more importantly, switch query languages without moving to a different graph database product or suffering performance degradation.
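As a rough illustration of what the two languages look like, the sketch below expresses the same question ("who does Ann know?") in Gremlin and in SPARQL, and wraps each in an HTTP request for Neptune's `/gremlin` and `/sparql` endpoints. The host name, the `person`/`knows` labels, and the `ex:` vocabulary are assumptions made up for this example, and each query assumes the data was loaded in the matching model. The requests are only built, not sent.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical cluster endpoint; replace with your own.
NEPTUNE = "https://my-neptune.example.com:8182"

# Property graph version: start at Ann, follow outgoing "knows" edges.
GREMLIN_QUERY = "g.V().has('person', 'name', 'Ann').out('knows').values('name')"

# RDF version: match triples that link Ann to the people she knows.
SPARQL_QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?friendName WHERE {
  ?ann ex:name "Ann" .
  ?ann ex:knows ?friend .
  ?friend ex:name ?friendName .
}
"""


def gremlin_request(traversal):
    """Build a POST request with a JSON body for the /gremlin endpoint."""
    body = json.dumps({"gremlin": traversal}).encode("utf-8")
    return urllib.request.Request(
        NEPTUNE + "/gremlin",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def sparql_request(query):
    """Build a form-encoded POST request for the /sparql endpoint."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        NEPTUNE + "/sparql",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


for req in (gremlin_request(GREMLIN_QUERY), sparql_request(SPARQL_QUERY)):
    print(req.get_method(), req.full_url)
    # To actually run the query: urllib.request.urlopen(req).read()
```

Gremlin reads as a step-by-step walk through the graph, while SPARQL describes a pattern of triples to match. Which one feels more natural usually depends on whether you think of your data as objects with relationships or as statements about resources.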

  1. Neptune boasts millisecond-level response times on graphs with billions of connections, and as a managed cloud service, it is both automated and highly scalable.
  2. You can insert data into Neptune from code by connecting to the DB and running multiple Gremlin addV and addE steps, or you can use the Loader to load data from S3. The Loader supports both Gremlin (property graph) and RDF data formats.
  3. You can easily query Neptune from the AWS Console, using the AWS Cloud9 IDE.
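The two ways of inserting data mentioned in the list can be sketched as follows. This is a minimal example under stated assumptions: the bucket name, IAM role ARN, region, and the `person`/`knows` data are all hypothetical, and nothing is sent to a real cluster. It prepares the same two-person graph as a single Gremlin traversal (using the `addV` and `addE` steps) and as CSV files in the Loader's Gremlin format, together with the JSON body you would POST to the cluster's `/loader` endpoint.

```python
import csv
import io
import json


def to_csv(header, rows):
    """Render a header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Option 1: insert from code with a Gremlin traversal (sent as a string).
insert_traversal = (
    "g.addV('person').property(id, 'p1').property('name', 'Ann').as('a')"
    ".addV('person').property(id, 'p2').property('name', 'Bob').as('b')"
    ".addE('knows').from('a').to('b')"
)

# Option 2: bulk load. The Gremlin CSV format uses ~-prefixed system columns.
vertices_csv = to_csv(
    ["~id", "~label", "name:String"],
    [["p1", "person", "Ann"], ["p2", "person", "Bob"]],
)
edges_csv = to_csv(
    ["~id", "~from", "~to", "~label"],
    [["e1", "p1", "p2", "knows"]],
)

# After uploading both files to S3, POST this body to https://<endpoint>:8182/loader.
loader_request = json.dumps({
    "source": "s3://my-bucket/graph-data/",  # hypothetical bucket and prefix
    "format": "csv",                         # Gremlin CSV; RDF formats also exist
    "iamRoleArn": "arn:aws:iam::123456789012:role/NeptuneLoadFromS3",  # hypothetical
    "region": "us-east-1",                   # assumed region
    "failOnError": "FALSE",
}, indent=2)

print(vertices_csv)
print(edges_csv)
print(loader_request)
```

For a handful of vertices, the traversal is simpler. For large datasets, writing files to S3 and letting the Loader ingest them avoids sending one request per vertex or edge.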