
Data Structure
Networking
RDBMS
Operating System
Java
MS Excel
iOS
HTML
CSS
Android
Python
C Programming
C++
C#
MongoDB
MySQL
Javascript
PHP
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Link Prediction: Predict Edges in a Network using NetworkX
Link prediction is a key idea in the field of network analysis. It involves foreseeing the potential for a link to establish between network nodes. A potent tool for network analysis, including link prediction tasks, is the NetworkX module for Python. This thorough tutorial will walk you through using NetworkX to anticipate links, replete with succinct and understandable examples.
Introduction to Link Prediction
Nodes in a network or graph represent entities, while edges or linkages between these nodes reflect their relationships. Link prediction uses the network's current topology to forecast potential links between nodes. Link prediction has various uses, such as predicting social relationships in social networks and researcher collaboration in networks of co-authorship.
Getting Started with NetworkX
Make sure the NetworkX library is installed before moving on to link prediction. If not, use pip to install it ?
pip install networkx
We also need to install the numpy and pandas libraries in order to do link prediction:
pip install numpy pandas
Basics of Network Creation in NetworkX
Let's begin by establishing a fundamental network in NetworkX ?
import networkx as nx # Create an empty graph G = nx.Graph() # Add nodes G.add_node(1) G.add_node(2) G.add_node(3) # Add edges G.add_edge(1, 2) G.add_edge(1, 3) # Draw the graph nx.draw(G, with_labels=True)
There are only three nodes and two edges in this straightforward network.
Link Prediction in NetworkX
For doing link prediction, NetworkX offers a number of functions. They are founded on various methods and theories.
Example 1: Common Neighbors
Using common neighbours is a simple link prediction method. It implies that if two nodes have a lot of shared neighbours, they are more likely to create a link.
# Create a graph G = nx.complete_graph(5) G.remove_edge(1, 3) # Perform link prediction preds = nx.common_neighbors(G, 1, 3) print(len(list(preds))) # Output: 3
Here, we add five nodes and eliminate one edge to construct a full graph (a network in which every pair of nodes is connected by a direct edge). On the basis of the quantity of shared neighbours, we then forecast this missing link.
Example 2: Jaccard Coefficient
The number of shared neighbours is divided by the total number of neighbours in order to calculate the Jaccard coefficient, which assesses the likelihood of an edge.
# Create a graph G = nx.complete_graph(5) G.remove_edge(1, 3) # Perform link prediction preds = nx.jaccard_coefficient(G, [(1, 3)]) for u, v, p in preds: print(f'({u}, {v}) -> {p}') # Output: (1, 3) -> 0.6
In this illustration, the deleted edge's Jaccard coefficient is computed.
Example 3: Preferential Attachment
According to the theory of preferential attachment, nodes with a high degree (more connections) are more likely to connect in the future.
# Create a graph G = nx.complete_graph(5) G.remove_edge(1, 3) # Perform link prediction preds = nx.preferential_attachment(G, [(1, 3)]) for u, v, p in preds: print(f'({u}, {v}) -> {p}') # Output: (1, 3) -> 12
The preferential attachment score for the deleted edge is calculated in this example.
Example 4: Adamic/Adar Index
Similar to common neighbours, the Adamic/Adar index places less emphasis on nodes with a high degree.
# Create a graph G = nx.complete_graph(5) G.remove_edge(1, 3) # Perform link prediction preds = nx.adamic_adar_index(G, [(1, 3)]) for u, v, p in preds: print(f'({u}, {v}) -> {p}') # Output: (1, 3) -> 1.8204784532536746
In this instance, a decimal value known as the Adamic/Adar index?which measures the eliminated edge?is calculated.
Example 5: Resource Allocation Index
Another metric that creates a score based on the shared neighbours of two nodes is the resource allocation index.
# Create a graph G = nx.complete_graph(5) G.remove_edge(1, 3) # Perform link prediction preds = nx.resource_allocation_index(G, [(1, 3)]) for u, v, p in preds: print(f'({u}, {v}) -> {p}') # Output: (1, 3) -> 0.6666666666666666
Here, the removed edge's resource allocation index is calculated.
Conclusion
Link prediction is an interesting subject with many of real-world uses. The NetworkX package in Python provides a number of link prediction techniques, each with advantages and disadvantages. As always, it's essential to comprehend the underlying ideas and techniques, try out many approaches, and select the one that works best for your unique use case.
We've taken a hands-on approach to comprehending link prediction in NetworkX in this guide. The fundamental ideas have been discussed, and a variety of link prediction techniques have been illustrated with simple instances. But this is only the very tip of the iceberg. The possibilities for network research and connection prediction open up as you delve deeper into NetworkX.