Neo4j and Cypher using Py2Neo
Importing and Authenticating
from py2neo import authenticate, Graph, Node, Relationship
authenticate("localhost:7474", "neo4j", "<pass>")
graph = Graph()
Make sure a Neo4j database is running at localhost:7474 and accepts these credentials.
The graph object is your interface to the Neo4j instance in the rest of your Python code. Rather than making it a global variable, you should keep it in a class’s __init__ method.
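As a minimal sketch of that advice, the class below keeps the graph handle on the instance instead of in a module-level global. The names NewsGraph, graph_factory and run are hypothetical and not part of py2neo. The factory is passed in so the class can be exercised without a running database.

```python
class NewsGraph(object):
    """Hypothetical wrapper that owns the graph handle."""

    def __init__(self, graph_factory):
        # graph_factory is any zero-argument callable that returns a graph
        # object. With py2neo you would call authenticate(...) first and
        # then pass the Graph class itself: NewsGraph(Graph).
        self.graph = graph_factory()

    def run(self, query):
        # Every query goes through the instance attribute, never a global.
        return self.graph.cypher.execute(query)
```

Methods that add nodes or run queries can then live on the same class and reach the database through self.graph.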
Adding Nodes to Neo4j Graph
results = News.objects.todays_news()
for r in results:
    article = graph.merge_one("NewsArticle", "news_id", r)
    article.properties["title"] = results[r]['news_title']
    article.properties["timestamp"] = results[r]['news_timestamp']
    article.push()
[...]
Adding nodes to the graph is pretty simple. graph.merge_one is important as it prevents duplicate items. (If you run the script twice, the second run updates the title instead of creating new nodes for the same articles.)
timestamp should be an integer and not a date string, as Neo4j doesn't really have a date datatype. Storing a date as '05-06-1989' causes sorting issues.
article.push() is the call that actually commits the operation to Neo4j. Don't forget this step.
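To illustrate the sorting problem with date strings, here is a small sketch that converts a date string into an integer before it is stored. The helper name to_epoch is hypothetical, and the sketch assumes that '05-06-1989' means day-month-year and that dates are interpreted as UTC.

```python
import calendar
from datetime import datetime


def to_epoch(date_string, fmt="%d-%m-%Y"):
    """Convert a date string to whole seconds since 1970-01-01 UTC."""
    parsed = datetime.strptime(date_string, fmt)
    # timegm treats the time tuple as UTC, so the result does not depend
    # on the local timezone of the machine running the import.
    return calendar.timegm(parsed.timetuple())


dates = ["05-06-1989", "01-01-1990"]
print(sorted(dates))                # string order puts 1990 before 1989
print(sorted(dates, key=to_epoch))  # chronological order
```

The integer returned by to_epoch is what you would assign to article.properties["timestamp"].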
Adding Relationships to Neo4j Graph
results = News.objects.todays_news()
for r in results:
    article = graph.merge_one("NewsArticle", "news_id", r)
    if 'LOCATION' in results[r]:
        for loc_name in results[r]['LOCATION']:
            # Reuse the Location node if it already exists.
            loc = graph.merge_one("Location", "name", loc_name)
            try:
                rel = graph.create_unique(Relationship(article, "about_place", loc))
            except Exception as e:
                print(e)
create_unique is important for avoiding duplicates; otherwise it's a pretty straightforward operation.
The relationship name also matters, since you will use it in more advanced queries.
Query 1 : Autocomplete on News Titles
def get_autocomplete(text):
    # Case-insensitive prefix match on the name property of any node.
    query = """
    start n = node(*) where n.name =~ '(?i)%s.*' return n.name,labels(n) limit 10;
    """
    query = query % (text)
    obj = []
    for res in graph.cypher.execute(query):
        # print res[0],res[1]
        obj.append({'name':res[0],'entity_type':res[1]})
    return obj
This is a sample Cypher query that returns up to 10 nodes whose name property starts with the argument text, ignoring case.
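Formatting text straight into the query breaks on input that contains a quote or a regex metacharacter. One hedged alternative is to build the regex in Python and hand it to Cypher as a parameter. The helper name build_autocomplete_query is hypothetical. The {pattern} placeholder assumes the curly-brace parameter syntax of older Neo4j servers (newer ones use $pattern), and passing a parameter dictionary as the second argument of graph.cypher.execute is an assumption to check against your py2neo version.

```python
import re


def build_autocomplete_query(text, limit=10):
    """Return (query, params) for a case-insensitive prefix search on name."""
    # The limit is forced to an integer, so formatting it in is safe.
    query = (
        "MATCH (n) WHERE n.name =~ {pattern} "
        "RETURN n.name, labels(n) LIMIT %d" % int(limit)
    )
    # re.escape neutralises regex metacharacters typed by the user.
    params = {"pattern": "(?i)" + re.escape(text) + ".*"}
    return query, params


query, params = build_autocomplete_query("don")
print(query)
print(params["pattern"])
# Assumed usage: graph.cypher.execute(query, params)
```

Because the user's text never becomes part of the query string, a stray quote in the search box cannot change the query's structure.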
Query 2 : Get News Articles by Location on a particular date
def search_news_by_entity(location,timestamp):
    # timestamp is stored as an integer, so it is compared unquoted.
    query = """
    MATCH (n)-[]->(l)
    where l.name='%s' and n.timestamp=%d
    RETURN n.news_id limit 10
    """
    query = query % (location, int(timestamp))
    news_ids = []
    for res in graph.cypher.execute(query):
        news_ids.append(str(res[0]))
    return news_ids
You can use this query to find up to 10 news articles (n) with the given timestamp that are connected to a location (l) by any relationship.
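The equality test on n.timestamp only matches articles stamped with exactly that value. If news_timestamp holds seconds since the epoch (an assumption; the post does not say), a search for a particular date needs a half-open range that covers the whole day. The sketch below computes such bounds. day_bounds is a hypothetical helper, and the day-month-year format and UTC interpretation are assumed.

```python
import calendar
from datetime import datetime, timedelta


def day_bounds(date_string, fmt="%d-%m-%Y"):
    """Return (start, end) epoch seconds for one UTC day, end exclusive."""
    day = datetime.strptime(date_string, fmt)
    start = calendar.timegm(day.timetuple())
    end = calendar.timegm((day + timedelta(days=1)).timetuple())
    return start, end


start, end = day_bounds("05-06-1989")
# Both bounds are integers, so formatting them into the query is safe.
where_clause = "n.timestamp >= %d and n.timestamp < %d" % (start, end)
print(where_clause)
```

The resulting clause would take the place of the timestamp comparison in the where line of the query.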
Cypher Query Samples
Count articles connected to a particular person over time
MATCH (n)-[]->(l)
where l.name='Donald Trump'
RETURN n.date,count(*) order by n.date
Search for other People / Locations connected to the same news articles as Trump, keeping only those that share more than 5 articles with him.
MATCH (n:NewsArticle)-[]->(l)
where l.name='Donald Trump'
MATCH (n:NewsArticle)-[]->(m)
with m,count(n) as num where num>5
return labels(m)[0],(m.name), num order by num desc limit 10