UC Davis DataLab Network Toolkit

Overview

This page will provide you with a few basic concepts and tools for using network data. Network science is a discipline of its own, so if you plan to use these methods of your own research, please consider this a starting point only. That being said, this page should give you the language to start researching the concepts that are of interest to you and your project.

First we will cover some system agnostic network concepts, such as defining a network and its component parts. We will then move on to common tools for network analysis, and further readings if you wish to learn more.

Network Concepts

Before using any network tools, you will need to have a relational data set. Rather than looking only at attributes of specific data points, relational data also looks at the connections between data points. In network analysis, data points are called nodes or vertices, and the connections between them are called edges or ties. Vertices can be anything—people, places, word, concepts—they are usually mapped into rows in a data frame. Edges contain any information on how these things connect or are related to each other. These components create a network or graph.

Below is an example network. In this network, people are nodes and their relations to each other are edges. Each node also has node attributes, which commonly aligns with a row in a dataframe for that node, like traditional tabular data. You can see these attributes in the table below, or by hovering over the people.

Table Example

Person Name Age Widgits
J 30 1
Y 21 3
G 32 4
Z 48 8

Network Example

Tools for Non-Coders

Gephi

Gephi is free and open-source software that allows you to visualize and explore networks. You can input network data (edgelists or matrices) and create customizable visualizations for networks, complex systems, and dynamic and hierarchical graphs. It’s quick to get started with, but more difficult to master as it involves learning Gephi’s user interface and file formats. Tutorials are available to learn the user interface.

NodeXL

The free version of NodeXL allows users to create network graphs from edgelists within Excel, and the Pro version includes more advanced network metrics, text and sentiment analysis, and a report generation tool. Neither version has been updated since 2015 but they are useful for users who don’t want to learn a programming language or external UI – all NodeXL tools operate within Excel.

Netlytic

Netlytic is a tool to gather and analyze online conversations from social network sites such as Twitter, Instagram, YouTube, Facebook, or your own dataset. Datasets can be explored and analyzed in Netlytic with a variety of text analysis, category analysis, and network analysis tools (and visualizations), and then exported to other network programs such as Pajek and UCINET, or a CSV format. The program is free for up to five datasets with up to 10,000 records in each. For more and bigger datasets, the site offers several tiers of pricing. The site includes tutorials on using the UI, and there are also tutorials to integrate Netlytic with R.

InfraNodus

InfraNodus is an interactive tool that can be used for free online or downloaded from Github to create a network from text data. Network graphs show main topics, can compare different texts, visualize connections between citations and notes or estimate the bias/opinion of a discourse.

R Packages

Fundamental

Statnet

Statnet is a family of R packages built for modeling and simulating exponential random graph models. These models allow you to keep some aspects of your network constant, and generate artificial networks withing those constraints. You can then compare your network to the artificial simulations to check for significance against random chance. Some plotting and descriptive tools are also provided. The family includes:

  • ergm – Analyze and simulate networks using exponential-family random graph models.
  • tergm – Analyze dynamic or evolving network data using temporal exponential random graph models.
  • latentnet – Fit and simulate latent space models of networks.
  • EpiModel – Tools for simulating mathematical models of infectious disease dynamics.

igraph

igraph Offers a host of visualization and analysis tools for networks. If you’re interested in calculating things like degree distribution and centrality measures this is a good place to start. It has an R and python version and can also be used with Mathematica and C/C++.

  • netCoin - A package for analyzing coincidence networks (particularly relevant to ecologists), equipped with broadly useful functions to build interactive and easily share-able network visualizations.

intergraph

intergraph provides tools for converting between networks created using the igraph and statnet packages in R.

Visualization

visNetwork

visNetwork is a great package for turning network plots into interactive html objects. Networks that are not small can be difficult to visualize, and it’s often useful to create interactive plots to expand the amount of data you show. It works well with networks created in igraph.

ggraph

ggraph is used for network plotting in R‘s ggplot/tidyverse framework. This package has supplanted the ggnetwork package.

Python Modules

Fundamental

NetworkX

NetworkX is by far the most common network analysis tool for python. It the backbone of many other network packages with more specific uses.

igraph

igraph from R also has a python module.

Further Reading

  • Networks by Mark Newman is the definitive textbook for networks and is quite accessible.
  • Social Network Analysis: Methods and Applications by Stanley Wasserman and Katherine Faust is foundational to social network analysis, and is often called the “SNA Bible”.
  • Understanding Social Networks: Theories, Concepts, and Findings by Charles Kadushin is a great introduction to social network analysis and networks in general.

Contributions

This research toolkit is maintained by the UC Davis DataLab, and is open for contribution. See how you can contribute on the Github repo.

This toolkit has been made possible thanks to contributions by: