Data validation gives you insight into the quality of your data assets. It involves grading your organization’s data consistently so you can monitor your progress. When testing data, it’s essential to set metrics, along with follow-up steps and goals that drive improvements. Data testing is even more crucial when loading data into a schema-free graph database like Neo4j. So how do we do it efficiently and continuously?
The Neo4j graph database features a REST API that can be used to query the graph. With Postman, you can create a collection of REST requests that query the graph using Cypher, each one asking a data validation question such as, “Does every Actor have an ACTED_IN relationship to a Movie?” In Cypher, a query that counts the Actors breaking that rule would appear as:
MATCH (a:Actor) WHERE NOT (a)-[:ACTED_IN]->() RETURN COUNT(DISTINCT a) AS count;
A Postman test on the response then checks that the count comes back as 0. If there’s a rule saying, “Every Actor must have an ACTED_IN relationship to be a valid Actor,” you now have a test that verifies it.
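To make the check concrete, here is a minimal Python sketch of the same logic a Postman test would apply. It is an illustration, not the article's actual Postman script (those are written in JavaScript inside Postman). It assumes Neo4j's transactional HTTP endpoint, which takes a JSON body with a "statements" list and returns "results" and "errors"; the exact endpoint path depends on your Neo4j version, so no URL is hard-coded. The function names are hypothetical.

```python
import json

# The rule from the text: every Actor must have an ACTED_IN relationship.
ORPHAN_ACTOR_QUERY = (
    "MATCH (a:Actor) WHERE NOT (a)-[:ACTED_IN]->() "
    "RETURN COUNT(DISTINCT a) AS count"
)


def build_payload(cypher: str) -> str:
    """Wrap a Cypher statement in the JSON body the transactional endpoint expects."""
    return json.dumps({"statements": [{"statement": cypher}]})


def extract_count(response_body: str) -> int:
    """Read the single `count` value from a transactional endpoint response."""
    body = json.loads(response_body)
    if body.get("errors"):
        raise RuntimeError(f"Neo4j reported errors: {body['errors']}")
    result = body["results"][0]
    row = result["data"][0]["row"]
    # Look the column up by name rather than assuming its position.
    return int(row[result["columns"].index("count")])


def rule_holds(response_body: str) -> bool:
    """The validation passes only when no Actor violates the rule."""
    return extract_count(response_body) == 0
```

You would POST build_payload(ORPHAN_ACTOR_QUERY) to your server with any HTTP client and pass the response text to rule_holds. Keeping the payload and the assertion as pure functions means the rule itself can be unit tested without a running database.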
Newman is a command-line collection runner for Postman. It lets you run and test a Postman collection straight from the command line. It’s built so you can integrate it with your continuous integration servers and build systems. Newman maintains feature parity with Postman and lets you run collections the same way they’re executed in Postman’s collection runner.
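As a rough sketch of how this could plug into a build step, the Python wrapper below assembles a "newman run" command and treats a non-zero exit code as a failed validation, which is how Newman reports failing tests by default. The file names in the usage note are hypothetical, and the wrapper assumes Newman is already installed and on the PATH.

```python
import subprocess
from typing import List, Optional


def newman_command(collection: str,
                   environment: Optional[str] = None,
                   junit_report: Optional[str] = None) -> List[str]:
    """Build the argument list for `newman run`."""
    cmd = ["newman", "run", collection]
    if environment:
        # Environment file holding values such as the Neo4j host and credentials.
        cmd += ["--environment", environment]
    if junit_report:
        # Keep console output and also write a JUnit XML file for the CI server.
        cmd += ["--reporters", "cli,junit",
                "--reporter-junit-export", junit_report]
    return cmd


def run_validation(collection: str, **options: Optional[str]) -> bool:
    """Run the collection; True means every test in it passed."""
    completed = subprocess.run(newman_command(collection, **options))
    return completed.returncode == 0
```

For example, run_validation("neo4j-validation.json", environment="staging.json", junit_report="newman-report.xml") would run the data validation collection against a staging database and leave a report the build system can pick up.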
Data validation is an important topic when it comes to databases. Since data is frequently updated, queried, deleted, or passed around, having valid data is critical. By enforcing data validation and testing, databases become more consistent and reliable, and they offer more value to the user.