Advanced Cypher Queries
The examples in the preceding section are pretty simple traversals from one starting node to a set of results. In this section, let's create a more robust example with multiple starting nodes and complex conditions. For this example, we'll create a simple movie catalog that has the following types of nodes:
- Movies: Movies include The Avengers, The Avengers: Age of Ultron, Iron Man, and The Wolverine.
- Characters: Super heroes that appear in our movies include Iron Man, The Hulk, Captain America, Thor, and Wolverine.
- Tags: Tags are genres that can be associated with a movie; they include Action, Super Hero, Marvel, X-Men, and New Release.
Figure 1 shows this graph.
Figure 1 Movie catalog graph.
Essentially we're keeping track of super hero movies, tags for those movies, and the characters that appear in them.
The following snippet creates these nodes and their relationships:
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Label;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;

// The enums and main() below belong inside your application class.
public enum Labels implements Label {
    MOVIE, TAG, CHARACTER;
}

public enum RelationshipTypes implements RelationshipType {
    HAS_TAG, PERFORMS_IN;
}

public static void main( String[] args )
{
    GraphDatabaseService graphDB = new GraphDatabaseFactory().newEmbeddedDatabase("data");

    // Setup a movie graph
    try( Transaction txn = graphDB.beginTx() )
    {
        // Create characters
        Node ironMan = graphDB.createNode( Labels.CHARACTER );
        ironMan.setProperty( "name", "Iron Man" );
        Node captainAmerica = graphDB.createNode( Labels.CHARACTER );
        captainAmerica.setProperty( "name", "Captain America" );
        Node hulk = graphDB.createNode( Labels.CHARACTER );
        hulk.setProperty( "name", "The Hulk" );
        Node thor = graphDB.createNode( Labels.CHARACTER );
        thor.setProperty( "name", "Thor" );
        Node wolverine = graphDB.createNode( Labels.CHARACTER );
        wolverine.setProperty( "name", "Wolverine" );

        // Create movie tags
        Node actionTag = graphDB.createNode( Labels.TAG );
        actionTag.setProperty( "name", "Action" );
        Node marvelTag = graphDB.createNode( Labels.TAG );
        marvelTag.setProperty( "name", "Marvel" );
        Node superHeroTag = graphDB.createNode( Labels.TAG );
        superHeroTag.setProperty( "name", "Super Hero" );
        Node xmenTag = graphDB.createNode( Labels.TAG );
        xmenTag.setProperty( "name", "X-Men" );
        Node newReleaseTag = graphDB.createNode( Labels.TAG );
        newReleaseTag.setProperty( "name", "New Release" );

        // Create movies
        Node avengers = graphDB.createNode( Labels.MOVIE );
        avengers.setProperty( "name", "The Avengers" );
        avengers.createRelationshipTo( marvelTag, RelationshipTypes.HAS_TAG );
        avengers.createRelationshipTo( actionTag, RelationshipTypes.HAS_TAG );
        avengers.createRelationshipTo( superHeroTag, RelationshipTypes.HAS_TAG );
        ironMan.createRelationshipTo( avengers, RelationshipTypes.PERFORMS_IN );
        captainAmerica.createRelationshipTo( avengers, RelationshipTypes.PERFORMS_IN );
        hulk.createRelationshipTo( avengers, RelationshipTypes.PERFORMS_IN );
        thor.createRelationshipTo( avengers, RelationshipTypes.PERFORMS_IN );

        Node avengers2 = graphDB.createNode( Labels.MOVIE );
        avengers2.setProperty( "name", "The Avengers: Age of Ultron" );
        avengers2.createRelationshipTo( marvelTag, RelationshipTypes.HAS_TAG );
        avengers2.createRelationshipTo( actionTag, RelationshipTypes.HAS_TAG );
        avengers2.createRelationshipTo( superHeroTag, RelationshipTypes.HAS_TAG );
        avengers2.createRelationshipTo( newReleaseTag, RelationshipTypes.HAS_TAG );
        ironMan.createRelationshipTo( avengers2, RelationshipTypes.PERFORMS_IN );
        captainAmerica.createRelationshipTo( avengers2, RelationshipTypes.PERFORMS_IN );
        //hulk.createRelationshipTo( avengers2, RelationshipTypes.PERFORMS_IN );
        thor.createRelationshipTo( avengers2, RelationshipTypes.PERFORMS_IN );

        Node ironManMovie = graphDB.createNode( Labels.MOVIE );
        ironManMovie.setProperty( "name", "Iron Man" );
        ironManMovie.createRelationshipTo( marvelTag, RelationshipTypes.HAS_TAG );
        ironManMovie.createRelationshipTo( actionTag, RelationshipTypes.HAS_TAG );
        ironManMovie.createRelationshipTo( superHeroTag, RelationshipTypes.HAS_TAG );
        ironMan.createRelationshipTo( ironManMovie, RelationshipTypes.PERFORMS_IN );

        Node theWolverine = graphDB.createNode( Labels.MOVIE );
        theWolverine.setProperty( "name", "The Wolverine" );
        theWolverine.createRelationshipTo( xmenTag, RelationshipTypes.HAS_TAG );
        theWolverine.createRelationshipTo( actionTag, RelationshipTypes.HAS_TAG );
        theWolverine.createRelationshipTo( superHeroTag, RelationshipTypes.HAS_TAG );
        wolverine.createRelationshipTo( theWolverine, RelationshipTypes.PERFORMS_IN );

        // Success!
        txn.success();
    }
}
Let's start simple by finding all the movies:
match( movie :MOVIE ) return movie.name
The full code snippet follows:
System.out.println( "Movies" );
try( Transaction txn = graphDB.beginTx();
     Result results = graphDB.execute( "match( movie :MOVIE ) return movie.name" ) )
{
    while( results.hasNext() )
    {
        Map<String,Object> result = results.next();
        System.out.println( "\t" + result.get( "movie.name" ) );
    }
    txn.success();
}
We match all nodes with the MOVIE label and return each node's name. The output is the following:
Movies
    The Avengers
    The Avengers: Age of Ultron
    Iron Man
    The Wolverine
Now let's find all Marvel movies. To do this, we need to find the Marvel tag node and traverse to all movies that have a HAS_TAG relationship to it. This query looks like the following:
MATCH(marvelMovies:TAG {name:'Marvel'})<-[:HAS_TAG]-(movie) RETURN movie.name
We find each node that has the TAG label and a name property value of “Marvel” and follow that tag's inbound HAS_TAG relationships to the movies with that tag. Figure 2 shows this logic.
Figure 2 Find Marvel movies.
The output from this query is the following:
Marvel Movies
    Iron Man
    The Avengers: Age of Ultron
    The Avengers
Now let's find all newly released Marvel movies. This query is a little more complex because we need to find the Marvel tag node and the New Release tag node, and then find all nodes that have a HAS_TAG relationship with both of them. Figure 3 shows this logic.
Figure 3 Finding new release Marvel movies.
The query looks like the following:
MATCH (newRelease:TAG {name:'New Release'}),
      (marvelMovies:TAG {name:'Marvel'})-[:HAS_TAG]-(movie)
WHERE (newRelease)-[:HAS_TAG]-(movie)
RETURN movie.name
We define two variables in our MATCH clause: newRelease and marvelMovies. We follow the marvelMovies tag to all movies that have HAS_TAG relationships with marvelMovies, and then we add a WHERE condition that verifies a path from the newRelease variable (which matches the “New Release” tag) to that movie. The pattern omits the arrowhead, so the relationship is matched in either direction.
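The "tagged with both" condition is a set intersection: take the movies reachable from one tag and keep only those also reachable from the other. The following plain-Java sketch illustrates that idea without a database. The class and method names (TagIntersection, moviesByTag, moviesWithAll) are hypothetical, and the data is copied by hand from the setup snippet rather than read from Neo4j.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the HAS_TAG relationships in the sample graph.
final class TagIntersection {

    // tag name -> names of the movies that have a HAS_TAG relationship to that tag
    static Map<String, Set<String>> moviesByTag() {
        Map<String, Set<String>> index = new HashMap<>();
        index.put("Marvel", new TreeSet<>(Arrays.asList(
                "The Avengers", "The Avengers: Age of Ultron", "Iron Man")));
        index.put("X-Men", new TreeSet<>(Arrays.asList("The Wolverine")));
        index.put("New Release", new TreeSet<>(Arrays.asList("The Avengers: Age of Ultron")));
        index.put("Action", new TreeSet<>(Arrays.asList(
                "The Avengers", "The Avengers: Age of Ultron", "Iron Man", "The Wolverine")));
        index.put("Super Hero", new TreeSet<>(Arrays.asList(
                "The Avengers", "The Avengers: Age of Ultron", "Iron Man", "The Wolverine")));
        return index;
    }

    // Start from the first tag's movies, then keep only those shared with every other tag.
    static Set<String> moviesWithAll(Map<String, Set<String>> index, String first, String... rest) {
        Set<String> result = new TreeSet<>(index.getOrDefault(first, Collections.<String>emptySet()));
        for (String tag : rest) {
            result.retainAll(index.getOrDefault(tag, Collections.<String>emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [The Avengers: Age of Ultron]
        System.out.println(moviesWithAll(moviesByTag(), "Marvel", "New Release"));
    }
}
```

In the Cypher query, the MATCH pattern plays the role of the starting set and each WHERE condition plays the role of a retainAll call.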
To take this example one step crazier, let's find all new release Marvel movies about super heroes. You can add more movies to your example to validate that the query works; for an easier test, comment out the superHeroTag relationship for The Avengers: Age of Ultron (the only new release in the catalog) and confirm that the movie drops out of the results. The following query shows how to match three tags:
MATCH (superHeroes:TAG {name:'Super Hero'}),
      (newRelease:TAG {name:'New Release'}),
      (marvelMovies:TAG {name:'Marvel'})-[:HAS_TAG]-(movie)
WHERE (newRelease)-[:HAS_TAG]-(movie) AND (superHeroes)-[:HAS_TAG]-(movie)
RETURN movie.name
We define three variables: superHeroes, newRelease, and marvelMovies, each bound to its respective tag node. We start from the marvelMovies node, traverse all of its HAS_TAG relationships, and then filter the results with two conditions in the WHERE clause, joined with the AND keyword. In other words, find all Marvel movies that have a path from the “New Release” tag to that movie and a path from the “Super Hero” tag to that movie. As expected, this query outputs the following:
New Release Marvel Super Hero Movies
    The Avengers: Age of Ultron
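Because each extra tag only adds one node pattern and one WHERE condition, the query text can be generated for any number of tags. The helper below is a minimal sketch of that idea, not part of the original example: the class name TagQueryBuilder and the generated variable names (t0, t1, ...) are hypothetical. It assumes that tag names contain no single quotes; for untrusted input, Cypher query parameters would be a safer choice than string concatenation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a Cypher query that returns the movies carrying every given tag.
final class TagQueryBuilder {

    static String build(List<String> tagNames) {
        if (tagNames.isEmpty()) {
            throw new IllegalArgumentException("at least one tag is required");
        }
        List<String> patterns = new ArrayList<>();
        List<String> conditions = new ArrayList<>();
        for (int i = 0; i < tagNames.size(); i++) {
            // Assumption: tag names contain no single quotes.
            String node = "(t" + i + ":TAG {name:'" + tagNames.get(i) + "'})";
            if (i == 0) {
                // The first tag anchors the traversal to the movies.
                patterns.add(node + "-[:HAS_TAG]-(movie)");
            } else {
                // Every other tag is matched on its own and then checked in WHERE.
                patterns.add(node);
                conditions.add("(t" + i + ")-[:HAS_TAG]-(movie)");
            }
        }
        StringBuilder query = new StringBuilder("MATCH ").append(String.join(", ", patterns));
        if (!conditions.isEmpty()) {
            query.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        return query.append(" RETURN movie.name").toString();
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("Marvel", "New Release", "Super Hero")));
    }
}
```

The resulting string has the same shape as the three-tag query above and could be passed to graphDB.execute in the same way as the earlier snippets.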
That covers multiple starting tags, but let's bring in a character too. The following query finds all Marvel movies that have Thor as a character:
MATCH(marvelMovies:TAG {name:'Marvel'})-[:HAS_TAG]-(movie)-[:PERFORMS_IN]-(thor:CHARACTER {name:'Thor'}) RETURN movie.name
We start by finding the Marvel tag and following all of its HAS_TAG relationships to movies, and then we continue from each of those movies along a PERFORMS_IN relationship to the Thor node. Only movies connected to both nodes match. This query outputs the following results:
Marvel Movies with Thor
    The Avengers: Age of Ultron
    The Avengers
Next, we can combine the two strategies to find all new release Marvel movies that have Thor:
MATCH(marvelMovies:TAG {name:'Marvel'}),(newRelease:TAG {name:'New Release'})-[:HAS_TAG]-(movie)-[:PERFORMS_IN]-(thor:CHARACTER {name:'Thor'})
WHERE (marvelMovies)-[:HAS_TAG]-(movie) RETURN movie.name
We define a variable named marvelMovies that references the Marvel tag, follow the HAS_TAG relationships from the New Release tag to movies that are also connected to the Thor node via a PERFORMS_IN relationship, and then use the WHERE clause to keep only the movies that have a HAS_TAG relationship with the Marvel tag.
Likewise, we can find all new release Marvel movies featuring both Thor and The Hulk:
MATCH (marvelMovies:TAG {name:'Marvel'}),
      (newRelease:TAG {name:'New Release'})-[:HAS_TAG]-(movie)-[:PERFORMS_IN]-(thor:CHARACTER {name:'Thor'}),
      (hulk:CHARACTER {name:'The Hulk'})
WHERE (marvelMovies)-[:HAS_TAG]-(movie) AND (movie)-[:PERFORMS_IN]-(hulk)
RETURN movie.name
Note that with the setup code exactly as listed, where the Hulk's PERFORMS_IN relationship to The Avengers: Age of Ultron is commented out, this query returns no rows; uncomment that line to see the movie in the results. There are multiple ways to write queries; these are just meant to encourage you to think in terms of nodes and relationships. The query I've just described could be written as follows:
MATCH (thor:CHARACTER {name:'Thor'}),
      (hulk:CHARACTER {name:'The Hulk'}),
      (marvelMovies:TAG {name:'Marvel'}),
      (newRelease:TAG {name:'New Release'})-[:HAS_TAG]-(movie)
WHERE (marvelMovies)-[:HAS_TAG]-(movie)
  AND (movie)-[:PERFORMS_IN]-(hulk)
  AND (movie)-[:PERFORMS_IN]-(thor)
RETURN movie.name
In this case, we define all of our node references at the beginning, select a starting node (New Release), find the movies that have that tag, and then define filter conditions to refine the results. I find this last form easier to read; but, as you can see, there are many alternative ways of writing Cypher queries.
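When experimenting with alternative formulations, it helps to know in advance what the result should be. The following plain-Java sketch computes the expected answers for the last few queries directly from the sample data, so any Cypher variant can be checked against it. Everything here (CatalogCheck, Movie, sampleCatalog, find) is hypothetical scaffolding, and the data is transcribed by hand from the setup snippet, including the commented-out Hulk relationship.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory copy of the sample graph, used only to predict query results.
final class CatalogCheck {

    static final class Movie {
        final String name;
        final Set<String> tags;        // targets of the movie's HAS_TAG relationships
        final Set<String> characters;  // sources of PERFORMS_IN relationships to the movie

        Movie(String name, List<String> tags, List<String> characters) {
            this.name = name;
            this.tags = new HashSet<>(tags);
            this.characters = new HashSet<>(characters);
        }
    }

    static List<Movie> sampleCatalog() {
        List<Movie> movies = new ArrayList<>();
        movies.add(new Movie("The Avengers",
                Arrays.asList("Marvel", "Action", "Super Hero"),
                Arrays.asList("Iron Man", "Captain America", "The Hulk", "Thor")));
        // The Hulk is absent here because that relationship is commented out in the setup code.
        movies.add(new Movie("The Avengers: Age of Ultron",
                Arrays.asList("Marvel", "Action", "Super Hero", "New Release"),
                Arrays.asList("Iron Man", "Captain America", "Thor")));
        movies.add(new Movie("Iron Man",
                Arrays.asList("Marvel", "Action", "Super Hero"),
                Arrays.asList("Iron Man")));
        movies.add(new Movie("The Wolverine",
                Arrays.asList("X-Men", "Action", "Super Hero"),
                Arrays.asList("Wolverine")));
        return movies;
    }

    // Returns the names of the movies that have every required tag and every required character.
    static Set<String> find(List<Movie> movies, List<String> tags, List<String> characters) {
        Set<String> names = new TreeSet<>();
        for (Movie movie : movies) {
            if (movie.tags.containsAll(tags) && movie.characters.containsAll(characters)) {
                names.add(movie.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Movie> movies = sampleCatalog();
        List<String> tags = Arrays.asList("Marvel", "New Release");
        // prints [The Avengers: Age of Ultron]
        System.out.println(find(movies, tags, Arrays.asList("Thor")));
        // prints []
        System.out.println(find(movies, tags, Arrays.asList("Thor", "The Hulk")));
    }
}
```

The order of the conditions does not matter to the check, which mirrors the point above: differently arranged Cypher queries that express the same conditions should return the same movies.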