Finding your way around an unknown graph can seem a bit ambiguous at first because Neo4j is schema-free, especially if you’re newer to graph databases and used to a relational database where you would simply open the ERD and look through the tables. But just because Neo4j is schema-free doesn’t mean that schema-like elements are not present. The schema elements of a Neo4j graph database are Label Names, Relationship Types, and Indexes and Constraints on Property Keys. Let’s look at some techniques for getting acquainted with an unknown graph.
- To observe the graph schema, the easiest area to look into is the browser panel in Neo4j. From there, you’ll be able to observe Label Names, Relationship Types and Property Keys. Each one can be clicked and will immediately load a maximum of 25 associated results. These results can provide a basic starting point to help you navigate your way through the graph.
- To understand the Indexes and Constraints applied to the graph database, which begin to shape some of the business rules around the data in the graph, run the “:schema” command in the browser. It lists all the index and constraint rules for the Labels and Relationships with their respective Property Keys.
- To gain a better understanding of the number of nodes in the graph as a whole or within a certain Label, so you know the data size you’re handling, run MATCH (n) RETURN COUNT(n); or MATCH (n:LabelNameHere) RETURN COUNT(n); respectively.
- labels() takes a node reference as the argument and returns the labels that are on that node.
MATCH (n) RETURN DISTINCT LABELS(n);
IMPORTANT: Unless your graph is very small, it is better to use labels() in a more focused way, for example on a single label or a limited sample of nodes, instead of across the entire graph.
- type() takes a relationship reference as an argument and returns the relationship type of a relationship connecting two nodes.
MATCH (n:Person)-[r]-() RETURN DISTINCT TYPE(r);
- keys() takes a node reference as the argument and returns the property keys for the properties that are on that node.
MATCH (n:Person) RETURN DISTINCT keys(n);
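As a rough sketch of the more focused approach, the queries below sample a limited number of nodes instead of scanning the whole graph, and then list schema elements through the built-in db.* procedures. The Person label and the sample size of 1000 are illustrative assumptions, and the procedures are only usable if your Neo4j version ships them, so treat this as a starting point rather than a definitive recipe.

```cypher
// Sample up to 1000 nodes (arbitrary sample size) and list the label combinations seen.
MATCH (n)
WITH n LIMIT 1000
RETURN DISTINCT labels(n) AS labels;

// Same idea for property keys, restricted to one label (Person is an assumed label).
MATCH (n:Person)
WITH n LIMIT 1000
RETURN DISTINCT keys(n) AS propertyKeys;

// Built-in procedures that list schema elements without matching any nodes.
CALL db.labels();
CALL db.relationshipTypes();
CALL db.propertyKeys();
```

Run each statement on its own. Because the sampled queries only look at a subset of nodes, rarely used labels or property keys can be missed, so raise the limit or narrow the pattern if the results look incomplete.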
The labels(), type() and keys() functions can also be combined in a single query:
MATCH (p:Person)-[r]-(x) RETURN p.name, COLLECT(DISTINCT type(r)) AS relationships, ID(x) AS id, LABELS(x) AS labels, KEYS(x) AS properties;
In the example above, we assume we have a Person that we’ve figured out has a name property, and we return all the Relationship Types to any of its immediate connections. For each of those immediate connections we also return the id, so that each result row is anchored to one specific pair of p and x. Finally, we include the Labels and Property Keys that are on each of the immediate connections.
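One possible follow-up, sketched here under the same assumption that a Person label exists, is to aggregate instead of listing every neighbor. The query below counts how often each Relationship Type connects a Person to each combination of neighbor Labels, which gives a compact picture of the structure around that Label. The aliases relationship, neighborLabels and occurrences are names chosen for illustration.

```cypher
// Count how often each relationship type links a Person to each neighbor label set.
MATCH (p:Person)-[r]-(x)
RETURN type(r) AS relationship,
       labels(x) AS neighborLabels,
       count(*) AS occurrences
ORDER BY occurrences DESC;
```

Like the combined query above, this touches every relationship of every Person node, so on a large graph it may be worth adding a LIMIT on p first.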
In addition to understanding how to explore an unknown graph, the main takeaway is that “schema-free” doesn’t mean that no schema exists in the Neo4j graph database. Rather, it means that instead of defining a complete and strictly enforced schema at the outset for all data in the graph, only Indexes and Constraints are defined to provide some basic rules about how the data should be written, and the rest of the graph schema evolves over time as data is written to the graph database.