Relationship Direction in Cypher is Important

March 02, 2016

The direction of a relationship between two nodes is required when creating it with a Cypher CREATE statement, but can be ignored in Cypher read queries. Cypher is a declarative, textual query language for graph databases created by Neo4j, and it is supported by a larger group as openCypher in an effort to make Cypher the SQL for graph databases. Cypher looks a bit like ASCII art in its representation of graph traversal patterns, which makes it quite intuitive and fun to use when querying graphs.

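Direction is specified in Cypher by using '<' and '>' as part of a relationship pattern. For example, (a)-[:KNOWS]->(b) describes a KNOWS relationship pointing from a to b, and (a)<-[:KNOWS]-(b) describes one pointing from b to a.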

Relationship Direction for Writes and Reads

When a relationship is created in Neo4j 2.3.x using the Cypher query language, it always ends up with a direction, because Neo4j stores every relationship with a direction. A CREATE statement therefore requires the direction to be specified in the pattern. MERGE is more lenient: it accepts a pattern without a direction, but when it has to create the relationship it picks a direction for you, so specifying it explicitly is still the safer habit. Being explicit is also beneficial because it ensures a clean and consistent data layout in the graph for your read queries.

Querying data in the graph is more flexible, since Cypher doesn’t require a direction in a MATCH pattern, but omit it only after careful consideration of the density of OUTGOING and INCOMING relationships on the nodes involved. When you don’t specify a direction for the relationship in the pattern, Neo4j has to look at the relationships around the node in both directions.

Relationship Direction Example

Both of these examples are invalid CREATE statements:
CREATE (jack:Person {name:"Jack"})<-[:KNOWS]->(jill:Person {name:"Jill"});
CREATE (jack:Person {name:"Jack"})-[:KNOWS]-(jill:Person {name:"Jill"});

Both of these examples are valid CREATE statements:
CREATE (jack:Person {name:"Jack"})-[:KNOWS]->(jill:Person {name:"Jill"});
CREATE (jack:Person {name:"Jack"})<-[:KNOWS]-(jill:Person {name:"Jill"});

Let’s assume we created the first version, where the direction is OUTGOING from Jack and INCOMING to Jill. If we write a read query from Jill’s perspective and specify the wrong direction for the relationship, it returns no results; the same applies to a query written from Jack’s perspective.
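To make that concrete, here is a small sketch of the two directed read queries from Jill’s perspective. It assumes that only the first valid CREATE statement above has been run on an otherwise empty graph, so the only relationship is the one going from Jack to Jill.

```cypher
// Assumption: the graph contains only (jack)-[:KNOWS]->(jill).

// Returns no rows: Jill has no OUTGOING KNOWS relationship.
MATCH (jill:Person {name:"Jill"})-[:KNOWS]->(jack:Person) RETURN jill, jack;

// Returns one row: the KNOWS relationship is INCOMING to Jill.
MATCH (jill:Person {name:"Jill"})<-[:KNOWS]-(jack:Person) RETURN jill, jack;
```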

The flexible read query that will detect that Jack and Jill know each other is one where no direction is specified.
MATCH (jill:Person {name:"Jill"})-[:KNOWS]-(jack:Person) RETURN jill, jack;

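Important: if you actually want to try both valid CREATE statements and not end up with duplicate Jack and Jill nodes in your graph, you need to write them as a single statement like this, which creates one KNOWS relationship in each direction: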

CREATE (jack:Person {name:"Jack"})-[:KNOWS]->(jill:Person {name:"Jill"})
WITH jack, jill
CREATE (jack)<-[:KNOWS]-(jill);

Dense Node Relationship Direction Consideration

Consider a FOLLOWS relationship. Suppose an individual follows only 200 people but is followed by 20 million, and you want to see the people this person is following. If you don’t specify the FOLLOWS direction as OUTGOING, Neo4j will have to scan through all 20 million INCOMING FOLLOWS relationships, even though your interest is only in the 200 OUTGOING ones.
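The difference might look like the sketch below. The :Person label and the name "Celebrity" are made up for illustration and are not part of the example data above.

```cypher
// Hypothetical data: a :Person node named "Celebrity" with FOLLOWS relationships.

// Direction specified: only the OUTGOING FOLLOWS relationships are relevant.
MATCH (p:Person {name:"Celebrity"})-[:FOLLOWS]->(followed:Person)
RETURN followed.name;

// No direction: matches the followers as well as the people being followed.
MATCH (p:Person {name:"Celebrity"})-[:FOLLOWS]-(other:Person)
RETURN other.name;
```

Note that the second query is not just slower on a dense node; it also answers a different question, because it returns followers too. If you want to compare the work done by each form, prefixing a query with PROFILE should show the difference in the execution plan.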

Bi-Directional Relationships

Certain relationships are implicitly bi-directional, such as the KNOWS relationship in the example above, where the single-statement version creates two relationships between Jack and Jill, one in each direction. Storing both can provide read query flexibility, but it adds write load and requires more maintenance to keep the pair in sync. Much of this strategy will come down to your use case. When possible, I’ve found it easier to create only one relationship and always MATCH without a direction to detect its existence. This works for relationships that are implicitly bi-directional, where INCOMING and OUTGOING mean the same thing.
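One way to keep a single relationship per pair is to use MERGE with an undirected pattern, sketched below. This assumes the Jack and Jill nodes already exist. As I understand MERGE, it matches an existing KNOWS relationship in either direction and only creates a new one, with a direction of its choosing, when none is found.

```cypher
// Assumption: both Person nodes already exist in the graph.
// Find the two nodes first, then MERGE the relationship without a direction.
MATCH (jack:Person {name:"Jack"}), (jill:Person {name:"Jill"})
MERGE (jack)-[:KNOWS]-(jill);
```

Running this statement more than once should leave a single KNOWS relationship between the two nodes, which fits the approach of always matching without a direction.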