Recently I have been using the graph database neo4j to visualise a dataset. If you have used the HTML-based graph visualisation engine inside the neo4j browser, you will quickly have realised that it is not a good fit for anything other than visualising a minimal set of nodes and relationships.
Previously I have used Gephi for visualisations, and for several reasons it is normally my weapon of choice when it comes to desktop-based processing of graphs.
So the question obviously was: how to get the nodes from neo4j to Gephi?
I actually thought this topic was well described online before I started. Turns out it wasn't. The closest I came to a working tutorial was this post by Graph people. That, and the APOC user tutorial, which was a bit daunting at first glance. Long story short, APOC is the one plugin you want.
With that in mind, I ended up with the following architecture.
APOC, or "Awesome Procedures On Cypher" (I chose that variant), serves as a streaming client for a streaming server running inside Gephi. This means that when a query runs with a call to the APOC procedure for Gephi, as shown in the figure above, the objects stream to Gephi's HTTP JSON server.
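To make the streaming part a bit more concrete, here is a minimal Python sketch of the kind of JSON events that travel to Gephi's streaming server. It assumes the Graph Streaming plugin's event format ("an" for add node, "ae" for add edge) and an endpoint of the form http://localhost:8080/workspace1?operation=updateGraph. The helper names, the sample property values and the one-event-per-line encoding are my own illustration, not necessarily what APOC does internally.

```python
import json
import urllib.request

# Assumed endpoint: Gephi streaming server on port 8080, workspace "workspace1".
GEPHI_URL = "http://localhost:8080/workspace1?operation=updateGraph"


def add_node_event(node_id, **attributes):
    """Build an "add node" (an) event for a single node."""
    return {"an": {str(node_id): attributes}}


def add_edge_event(edge_id, source, target, directed=True, **attributes):
    """Build an "add edge" (ae) event connecting two node ids."""
    body = {"source": str(source), "target": str(target), "directed": directed}
    body.update(attributes)
    return {"ae": {str(edge_id): body}}


def encode_events(events):
    """Serialise events as one JSON object per line (assumed separator)."""
    return "".join(json.dumps(event) + "\r\n" for event in events)


def post_events(events, url=GEPHI_URL):
    """POST the events to a running Gephi streaming server."""
    data = encode_events(events).encode("utf-8")
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Two made-up nodes and one edge, just to show the shape of the payload.
events = [
    add_node_event("n1", label="first", address="example-address-1"),
    add_node_event("n2", label="second", address="example-address-2"),
    add_edge_event("e1", "n1", "n2", weight=1.0),
]
payload = encode_events(events)
```

Calling post_events(events) only makes sense once the streaming server in Gephi is up; building and inspecting payload works without it.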
Setting Up neo4j
There are a couple of steps that you should take at this point.
I'll make this brief. First off, set up neo4j like this:
    cd /usr/local/src/
    mkdir /usr/local/src/neo4j-community-3.4.6/
    # Quote the URL so the shell does not treat "&" as a background operator
    wget 'https://go.neo4j.com/download-thanks.html?edition=community&release=3.4.6&flavour=unix'
    tar -xvf neo4j-community-3.4.6.tar
    cd neo4j-community-3.4.6/plugins
    wget https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/220.127.116.11/apoc-18.104.22.168-all.jar
    cd ../ && ./bin/neo4j start
You have now installed the neo4j community server, installed the APOC plugin and
started neo4j locally. I suggest you go to the "neo4j Browser" at
http://localhost:7474/browser/ to get to know the workbench a little better.
Setting Up Gephi
For this step you will need Gephi (the desktop application), which you can get from here.
Next: go to the Tools menu, select Plugins and install the Graph Streaming plugin:
After restarting Gephi, you can create and start the streaming server at port 8080. The default URL and port are configured on the APOC side, so you will need to change that in neo4j if you'd like to use another URL or port.
At this point Gephi is listening on port 8080 for objects streamed from neo4j APOC.
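If you want to double-check that something is actually listening before you start streaming, a small Python sketch like the one below can help. The host and port are the ones used in this post; the function name is hypothetical, and the check only tells you that a TCP port is open, not that it is Gephi answering.

```python
import socket


def is_listening(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unknown host and similar errors.
        return False


if __name__ == "__main__":
    state = "up" if is_listening() else "not reachable"
    print("Gephi streaming server on localhost:8080 is " + state)
```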
Streaming from neo4j Browser
To get vertices (nodes) and edges (relationships) from neo4j to Gephi, you will need to stream the objects. Go to the "neo4j Browser" and execute something like the following query. Keep in mind that this may export many objects.
    MATCH path = (:some_label)-[:some_relation]->(:some_label)
    CALL apoc.gephi.add(null, 'workspace1', path, 'weight', ['address', 'desc', 'comment'])
    YIELD nodes
    RETURN *
So what happened above? The APOC docs on Gephi streaming describe the procedure as follows:
apoc.gephi.add(url-or-key, workspace, data, weightproperty, ['exportproperty']) | streams passed in data to Gephi
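To see how the five arguments line up with the query from the browser, here is a rough Python sketch that passes them as Cypher parameters. The parameter names and the helper functions are hypothetical, and the labels and relationship type are the same placeholders as before. Running it assumes the official neo4j Python driver and a Bolt endpoint at bolt://localhost:7687, neither of which is part of the setup described in this post.

```python
# Same query as in the browser, with the procedure arguments as parameters.
GEPHI_ADD_QUERY = """
MATCH path = (:some_label)-[:some_relation]->(:some_label)
CALL apoc.gephi.add($url_or_key, $workspace, path, $weight_property, $export_properties)
YIELD nodes
RETURN *
"""


def gephi_add_params(workspace, weight_property, export_properties, url_or_key=None):
    """Map the positional arguments of apoc.gephi.add to named parameters."""
    return {
        "url_or_key": url_or_key,  # None becomes null: use APOC's default Gephi URL
        "workspace": workspace,  # name of the Gephi workspace
        "weight_property": weight_property,  # relationship property used as edge weight
        "export_properties": list(export_properties),  # extra properties to send along
    }


def run_export(params, uri="bolt://localhost:7687", user="neo4j", password="neo4j"):
    """Run the export through the neo4j Python driver (assumed to be installed)."""
    from neo4j import GraphDatabase  # imported lazily so the sketch loads without it

    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            session.run(GEPHI_ADD_QUERY, params).consume()
    finally:
        driver.close()


params = gephi_add_params("workspace1", "weight", ["address", "desc", "comment"])
```

Passing null as the first argument, as in the browser query, leaves the choice of Gephi URL to the APOC configuration.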
The command was confusing to me at first, especially the part passed to Gephi. The exportproperty values, defined beforehand as neo4j graph properties (here address, desc and comment), are the ones that will show up if you have a look at the columns in the Gephi "Data Laboratory". These will be custom to your graph, and if you use e.g. the Python neo4jrestclient, they are the ones you set as node properties when creating the nodes.
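A minimal sketch of what setting those properties could look like with neo4jrestclient follows. The helper names, the server URL, the credentials and the property values are assumptions for illustration; only the property keys (address, desc, comment) come from the query above.

```python
def connect(url="http://localhost:7474/db/data/", username="neo4j", password="neo4j"):
    """Open a neo4jrestclient connection (URL and credentials are assumptions)."""
    from neo4jrestclient.client import GraphDatabase  # imported lazily

    return GraphDatabase(url, username=username, password=password)


def create_address_node(gdb, label, address, desc, comment):
    """Create a node carrying the properties that are later exported to Gephi."""
    node = gdb.nodes.create(address=address, desc=desc, comment=comment)
    node.labels.add(label)
    return node


if __name__ == "__main__":
    gdb = connect()
    # Made-up values, only to show where the exported properties come from.
    create_address_node(
        gdb,
        "some_label",
        address="example-address",
        desc="example description",
        comment="example comment",
    )
```

Any property you want to see as a column in Gephi has to exist on the nodes like this and be listed in the exportproperty argument.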
Once again, these will show up as (I added address_type after the data export, so that doesn't show in the screenshot):