Pairing Neo4j ElasticSearch: The Basics

February 24, 2016

There are a number of ways to integrate Neo4j with ElasticSearch. One common way was the Rivers plugin mechanism, but rivers were deprecated in ElasticSearch 1.5 and removed in ElasticSearch 2.0. Going forward, any integration needs a more deliberate mechanism to index the desired nodes and relationships from Neo4j into ElasticSearch.

For those who don’t know, ElasticSearch is an open source search server based on Lucene. It provides a distributed full-text search engine that works with JSON documents through a RESTful API.

Benefits of Neo4j ElasticSearch Pairing

ElasticSearch provides language analyzers, aggregations and other features right out of the box. These are some of the reasons it’s an ideal search solution to pair with Neo4j, as opposed to trying to recreate all of that text search capability within Neo4j. Some of the key advantages of the Neo4j ElasticSearch pairing include:

  • Swift search against large data volumes
    Large, complex graph traversal queries spanning tens to hundreds of thousands of nodes can take many seconds in the graph. When the result of such a query is stored as a single indexed document, ElasticSearch can return it in milliseconds. ElasticSearch’s document model is also leaner and a lot simpler than a database built from tables, rows, columns and schemas. That makes it practical to index many concise result documents as a kind of cache, as long as the number of query variations does not explode the combinations that need to be stored.
  • Document indexing to repository
    ElasticSearch can easily convert raw data (message files or log files) into internal documents and store them in a basic data structure. Pushing documents from Neo4j to ElasticSearch is straightforward to automate reliably.
  • Quick data access via de-normalized storage
    ElasticSearch typically stores a complete, de-normalized document in each index where it is needed. Full text searches are swift because documents are stored close to their corresponding metadata within the index. The aggregations and language analyzers can then be used to build search queries that go from text entry to a starting set of nodes, which the Neo4j graph query uses to complete the process of returning a result.
  • Scalable and distributable
    ElasticSearch is capable of scaling to thousands of servers while accommodating petabytes of data. This capacity results directly from its highly distributed architecture. That scalability makes it a good front for the query result documents, taking complex and potentially long-running query load off Neo4j.
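
To make the "query result stored in a single document" idea concrete, here is a minimal Python sketch of flattening a graph query result into one de-normalized, JSON-ready document. The function name, the field names and the sample data are all hypothetical and chosen only for illustration; a real integration would map whatever the graph query actually returns.

```python
import json


def build_result_document(root, related):
    """Flatten a root node and its related nodes into one JSON-ready dict.

    All field names here are illustrative assumptions, not a fixed schema.
    """
    return {
        "id": root["id"],
        "title": root["title"],
        # Related node names are copied into the document (de-normalized),
        # so a search hit needs no graph traversal to display them.
        "related_names": sorted(node["name"] for node in related),
        "related_count": len(related),
    }


# Hypothetical result of a graph query: one movie and its cast.
movie = {"id": "m1", "title": "Example Movie"}
cast = [{"id": "p2", "name": "Bo"}, {"id": "p1", "name": "Al"}]

document = build_result_document(movie, cast)
print(json.dumps(document, sort_keys=True))
```

The trade-off is the usual one for de-normalized storage: reads become a single document lookup, but the document has to be rebuilt whenever any of the contributing nodes or relationships change.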

Integration of Neo4j with ElasticSearch

On the Neo4j side of the integration, one way is to register a Neo4j TransactionEventHandler that pushes graph changes over to ElasticSearch. Another approach, external to the core Neo4j transactional commit lifecycle, is to push the node and relationship changes that affect the desired query result documents to ElasticSearch after a successful commit.
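
As a rough sketch of the second approach, the Python below turns a list of committed changes into a request body for the ElasticSearch bulk API, which takes newline-delimited JSON: an action line, followed by a document line for index operations. The shape of the change records, the function name and the "nodes" index name are assumptions made for this example. ElasticSearch releases from the era of this post also expected a `_type` in the action metadata, which is left out here for brevity.

```python
import json


def to_bulk_body(changes, index="nodes"):
    """Build a newline-delimited bulk request body from committed changes.

    Each change is a hypothetical dict of the form
    {"op": "upsert" | "delete", "id": ..., "properties": {...}}.
    """
    lines = []
    for change in changes:
        meta = {"_index": index, "_id": str(change["id"])}
        if change["op"] == "delete":
            # A delete action has no document line after it.
            lines.append(json.dumps({"delete": meta}))
        else:
            # An index action is followed by the document source.
            lines.append(json.dumps({"index": meta}))
            lines.append(json.dumps(change["properties"]))
    # The bulk format requires every line, including the last, to end in "\n".
    return "".join(line + "\n" for line in lines)


# Hypothetical changes collected after one successful commit.
committed = [
    {"op": "upsert", "id": 1, "properties": {"name": "Alice"}},
    {"op": "delete", "id": 2},
]
bulk_body = to_bulk_body(committed)
print(bulk_body, end="")
```

The resulting string would be sent with a POST to the `_bulk` endpoint. Batching the changes per commit keeps the number of requests low, and building the body only after the commit succeeds avoids indexing data from transactions that were rolled back.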

GraphGrid ElasticGraph

GraphGrid makes it possible to pair Neo4j and ElasticSearch in a way that lets Neo4j house all nodes and relationships, while ElasticSearch provides text search by indexing certain nodes, relationships and query results. An example of this type of Neo4j ElasticSearch integration may be seen on find.media: whenever you type a query in the search field, ElasticSearch goes to work surfacing search options. Once a result is selected, a request is made to Neo4j, which can leverage the relationship connectedness specified in the graph to offer comprehensive results.
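
The two-step flow described above (text search first, graph expansion second) can be sketched in Python as follows. This is not GraphGrid’s implementation. The field name `name`, the node property `uuid`, the Cypher statement and the canned response are assumptions for illustration, and no network calls are made. The `$ids` parameter syntax is the one used by current Neo4j releases; older releases wrote parameters as `{ids}`.

```python
def build_search_body(text, field="name", size=5):
    """Step 1: an ElasticSearch match query for the text the user typed."""
    return {"size": size, "query": {"match": {field: text}}}


def extract_ids(search_response):
    """Collect the document ids from an ElasticSearch search response."""
    hits = search_response.get("hits", {}).get("hits", [])
    return [hit["_id"] for hit in hits]


# Step 2: a hypothetical Cypher statement that expands from the matched nodes.
EXPAND_QUERY = "MATCH (n)-[r]-(m) WHERE n.uuid IN $ids RETURN n, r, m"


def build_graph_request(ids):
    """Pair the Cypher statement with its parameters for whichever client is used."""
    return {"statement": EXPAND_QUERY, "parameters": {"ids": ids}}


# A canned response standing in for what ElasticSearch might return.
fake_response = {"hits": {"hits": [{"_id": "a1"}, {"_id": "b2"}]}}

search_body = build_search_body("matrix")
graph_request = build_graph_request(extract_ids(fake_response))
print(search_body)
print(graph_request)
```

Passing the ids as a query parameter rather than splicing them into the Cypher string keeps the statement constant, so Neo4j can reuse its query plan, and it avoids injection problems with user-influenced values.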

Surfacing relevant search results isn’t easy. But when you combine the connectedness of your data in Neo4j with the aggregation and language capabilities provided by ElasticSearch, a powerful pairing emerges, which many refer to as “graph-aided search”. At GraphGrid, we see it as using each technology for what it does best, and we have been enabling businesses to leverage this auto-indexing connection since ElasticSearch 0.90 and Neo4j 1.9. This deep expertise is built into the GraphGrid Data Platform as a core integration, available for all to use as the ElasticGraph service.