# Introduction to Property Graphs Using Python With Neo4j

This blog post is a result of our collaboration with the University of Padua, wherein the student Jean Xavier Marie Lechevalier played a significant role in selecting the topic and contributing a major portion of the content.

This blog post will delve into the fascinating world of property graphs, providing readers with a good understanding and practical knowledge to effectively use them in data modelling using Python.

## What are Property Graphs and why are they useful?

A property graph is a data model that represents a graph structure with nodes (also called vertices) and directed relationships (also called edges), where each node and relationship can have a set of properties (also called key-value pairs) associated with it that provides additional information.
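To make the definition concrete, here is a minimal sketch of the data model in plain Python. The class names, the `outgoing` helper, and the `since` property are illustrative assumptions, not part of any library used later in this post.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Node:
    """A vertex with an identifier and arbitrary key-value properties."""
    node_id: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge from `source` to `target` with its own properties."""
    source: str
    target: str
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def add_node(self, node_id: str, **properties: Any) -> None:
        self.nodes[node_id] = Node(node_id, properties)

    def add_relationship(self, source: str, target: str, rel_type: str,
                         **properties: Any) -> None:
        # Both endpoints must already exist in the graph
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        self.relationships.append(Relationship(source, target, rel_type, properties))

    def outgoing(self, node_id: str, rel_type: str) -> List[str]:
        """Return the targets of `rel_type` relationships that start at `node_id`."""
        return [r.target for r in self.relationships
                if r.source == node_id and r.rel_type == rel_type]


pg = PropertyGraph()
pg.add_node("Max", age=20, gender="male")
pg.add_node("Alice", age=22, gender="female")
# `since` is a hypothetical relationship property, used only for illustration
pg.add_relationship("Max", "Alice", "knows", since=2020)

print(pg.outgoing("Max", "knows"))    # ['Alice']
print(pg.outgoing("Alice", "knows"))  # []
```

Because relationships are directed, Max knowing Alice does not imply that Alice knows Max in this sketch.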

Property graphs are useful because they can represent complex relationships between entities in an intuitive way. They are often used in applications that need to model and analyze connections between entities, such as social networks, recommendation systems, and supply chain networks.

Additionally, they can be used to represent many-to-many relationships, which can be difficult to model using traditional relational databases. As they are often easy to understand and work with, they are a popular choice for data modelling.

### How to create your first Property Graph

This task can be achieved with either NetworkX or Pyprograph, both of which are libraries for manipulating property graphs in Python.

NetworkX is a general-purpose graph library that can also handle property graphs, whereas Pyprograph was designed specifically for them. In this post we will use NetworkX, but keep in mind that both libraries work very similarly.

Let’s create a simple property graph of three people connected by “knows” relationships. This can be achieved in 3 simple steps:

• Creating an empty graph
• Adding nodes (and their properties) to the graph
• Adding relationships (and their properties) to the graph
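```python
import networkx as nx

# Create a Graph object (nx.Graph is undirected; nx.DiGraph would keep edge direction)
G = nx.Graph()

# Add the nodes to the graph, with properties
G.add_node("Max", age=20, gender="male")
G.add_node("Alice", age=22, gender="female")
G.add_node("Bob", age=21, gender="male")

# Add the edges to the graph
G.add_edge("Max", "Alice", label="knows")
G.add_edge("Alice", "Bob", label="knows")
```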

The Python code snippet above uses the NetworkX library to create a graph object, called `G`.
It adds nodes to the graph representing individuals named “Max,” “Alice,” and “Bob,” with associated properties such as age and gender. The `add_edge` function is then used to establish connections between the nodes, representing relationships such as “knows” between Max and Alice, and between Alice and Bob.

Now that our property graph is created, we could do some interesting manipulations like finding the neighbour(s) of a specific node. This can be achieved in the following way:

```python
# Find Alice's neighbors
neighbors = G.neighbors("Alice")

# Print the names of Alice's neighbors
for neighbor in neighbors:
    print(neighbor)  # Outputs Max and Bob
```
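Note that `nx.Graph` is undirected, whereas relationships in a property graph are directed. As a minimal sketch, the same example can be rebuilt with `nx.DiGraph`, which keeps the direction of each edge and lets us distinguish whom Alice knows from who knows Alice. The variable names (`D`, `knows`, `known_by`, `over_20`) are chosen here for illustration only.

```python
import networkx as nx

# Directed variant of the example graph: edge direction is preserved
D = nx.DiGraph()
D.add_node("Max", age=20, gender="male")
D.add_node("Alice", age=22, gender="female")
D.add_node("Bob", age=21, gender="male")
D.add_edge("Max", "Alice", label="knows")
D.add_edge("Alice", "Bob", label="knows")

# People Alice knows (outgoing edges) versus people who know Alice (incoming edges)
knows = list(D.successors("Alice"))
known_by = list(D.predecessors("Alice"))

# Select nodes by the value of one of their properties
over_20 = [name for name, data in D.nodes(data=True) if data["age"] > 20]

print(knows)     # ['Bob']
print(known_by)  # ['Max']
print(over_20)   # ['Alice', 'Bob']
```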

### How to connect to a Neo4j Database and Create Property Graph

Although Python libraries like NetworkX offer many useful methods for querying a graph, they work on in-memory data and have no declarative query language. For more demanding workloads, a graph database management system such as Neo4j, with its query language Cypher, is a common choice.
Using such tools will allow us to run more complex queries on our Python property graph.

To connect to a local Neo4j database from Python, you can use the `py2neo` library.
Once connected, you will be able to run Cypher queries directly from your Python code.
(Note that we assume here that you have already created an empty database in Neo4j.)

```python
from py2neo import Graph

# You should replace "bolt://localhost:7687" with the correct Bolt URI for your database (DB)
# You should replace "neo4j" and "password" with the correct credentials for your DB
graph = Graph("bolt://localhost:7687", auth=("neo4j", "password"))
```

At this point, we have our graph stored in our Python file and have established the connection to Neo4j.
What we need now is to create our graph nodes and relationships in the Neo4j database. This task can be done using the py2neo library.

We can access the node’s name and properties in this way:

```python
for node_id, node_data in G.nodes(data=True):
    print(node_id, node_data)

# Will output:
# Max {'age': 20, 'gender': 'male'}
# Alice {'age': 22, 'gender': 'female'}
# Bob {'age': 21, 'gender': 'male'}
```

Similarly for edges:

```python
for source, target, edge_data in G.edges(data=True):
    print(source, target, edge_data)

# Will output:
# Max Alice {'label': 'knows'}
# Alice Bob {'label': 'knows'}
```

We are now ready to create the nodes and edges in the Neo4j database.
First, we create nodes, then the edges using the py2neo matching function. Note that the `first()` function evaluates the match and returns the first node:

```python
from py2neo import Node, Relationship

for node_id, node_data in G.nodes(data=True):
    # Create a Neo4j node object
    node = Node("Person", name=node_id, **node_data)

    # Create the node in the Neo4j db
    graph.create(node)

for source, target, edge_data in G.edges(data=True):
    # Look up the nodes in the Neo4j db
    a = graph.nodes.match("Person", name=source).first()
    b = graph.nodes.match("Person", name=target).first()

    # Create a relationship between the nodes
    rel = Relationship(a, "knows", b, **edge_data)

    # Create the relationship in the Neo4j database
    graph.create(rel)
```

This will create a property graph in the Neo4j database with three nodes (Max, Alice, Bob) and two relationships (Max-knows-Alice and Alice-knows-Bob), one per edge in `G`. Each node has properties (name, age, gender) that can be used to store additional data about the node. Indeed, executing `MATCH (n) RETURN n` (the Cypher equivalent of SQL `SELECT *`) returns the whole graph.
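One caveat about the loading step: `graph.create` adds new nodes every time it is called, so running the script twice would duplicate the data. A possible alternative, sketched here under assumptions, is to generate parameterized Cypher `MERGE` statements, which only create what does not exist yet. The helper `build_statements` and the constants below are hypothetical names introduced for this example; the input tuples have the same shape as the items yielded by `G.nodes(data=True)` and `G.edges(data=True)`.

```python
from typing import Any, Dict, Iterable, List, Tuple

Statement = Tuple[str, Dict[str, Any]]

# MERGE matches an existing node or relationship, or creates it if missing
MERGE_NODE = "MERGE (p:Person {name: $name}) SET p += $props"
MERGE_REL = (
    "MATCH (a:Person {name: $source}), (b:Person {name: $target}) "
    "MERGE (a)-[r:knows]->(b) SET r += $props"
)


def build_statements(
    nodes: Iterable[Tuple[str, Dict[str, Any]]],
    edges: Iterable[Tuple[str, str, Dict[str, Any]]],
) -> List[Statement]:
    """Turn node and edge tuples into (cypher, parameters) pairs."""
    statements: List[Statement] = []
    for name, props in nodes:
        statements.append((MERGE_NODE, {"name": name, "props": dict(props)}))
    for source, target, props in edges:
        statements.append(
            (MERGE_REL, {"source": source, "target": target, "props": dict(props)})
        )
    return statements


nodes = [
    ("Max", {"age": 20, "gender": "male"}),
    ("Alice", {"age": 22, "gender": "female"}),
    ("Bob", {"age": 21, "gender": "male"}),
]
edges = [
    ("Max", "Alice", {"label": "knows"}),
    ("Alice", "Bob", {"label": "knows"}),
]
statements = build_statements(nodes, edges)
print(len(statements))  # 5

# With a live py2neo connection, each statement could then be executed with:
# for cypher, params in statements:
#     graph.run(cypher, params)
```

Passing values as parameters rather than formatting them into the query string also avoids quoting problems with names that contain special characters.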

### Query Neo4j Property Graphs from a Python Script

We can then execute Cypher queries to retrieve a specific set of data from the graph.
As an example, we could run the following query: get the people in the DB who know exactly one person (i.e. people who are the starting node of exactly one “knows” relationship):

```python
query = """
MATCH (p:Person)-[:knows]->(q:Person)
WITH p, COUNT(q) AS num_friends
WHERE num_friends = 1
RETURN p.name
"""
results = graph.run(query)

# Print the names of the people who know exactly one person
for record in results:
    print(record["p.name"])  # e.g. Max and Alice for the two edges loaded from G
```

This code snippet demonstrates how to perform a simple graph query using Cypher within a Python script, allowing you to interact with the graph database and retrieve specific information based on your requirements.
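The threshold in such a query does not have to be hard-coded. As a hedged sketch, the same query can take a Cypher parameter (`$n`) that is passed as the second argument to `run`. The function `people_who_know_exactly` and the `FakeGraph` class are hypothetical: `FakeGraph` only imitates the rows a database would return, so that the example runs without a Neo4j server, and it does not interpret Cypher.

```python
from typing import Any, Dict, List

QUERY = """
MATCH (p:Person)-[:knows]->(q:Person)
WITH p, COUNT(q) AS num_friends
WHERE num_friends = $n
RETURN p.name AS name
ORDER BY name
"""


def people_who_know_exactly(graph: Any, n: int) -> List[str]:
    """Run QUERY on any object exposing run(cypher, parameters), e.g. a py2neo Graph."""
    return [record["name"] for record in graph.run(QUERY, {"n": n})]


class FakeGraph:
    """Hypothetical offline stand-in for a database connection."""

    def __init__(self, out_degree: Dict[str, int]) -> None:
        # Maps each person to the number of people they know
        self.out_degree = out_degree

    def run(self, cypher: str, parameters: Dict[str, Any]) -> List[Dict[str, str]]:
        # Imitates the result rows of QUERY; the Cypher text itself is ignored
        n = parameters["n"]
        return [{"name": name}
                for name, degree in sorted(self.out_degree.items())
                if degree == n]


fake = FakeGraph({"Max": 1, "Alice": 1, "Bob": 0})
names = people_who_know_exactly(fake, 1)
print(names)  # ['Alice', 'Max']
```

With a real connection, the call would simply be `people_who_know_exactly(graph, 1)`.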

With the help of this simple example, you can now have fun creating more complex networks and executing more complex Cypher queries directly from your Python script!

## Summary

As we have seen, property graphs are a powerful tool for storing and manipulating graph data, and the Python libraries covered in this post make it simple to create, store, and query them. This post is above all an introduction; if you wish to explore the topic further, I strongly recommend the websites listed in the references.

// references