Why we graph
Network visualisation is important and non-trivial. As Tufte (1983: 9) said:
“At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers.”
All of this is crucial with networks. As a first step, network visualisation – or graphing – offers us a way to vet our data for anything strange that might be going on, both revealing and informing our assumptions and intuitions.
It is also crucial for communicating to others the lessons that we have learned through investigation. While there may be many dead-ends and time-sinks to visualisation, it is worth taking the time to make sure that your main points are easy to appreciate.
Brandes et al (1999) argue that visualising networks demands thinking about:
- substance: a concise and precise delivery of insights to the researcher and/or readers
- design: the ergonomics of function are 98% of the purpose of good design, aesthetics only 2%
- algorithm: the features of, e.g., the layout algorithm
Different approaches
There are several main packages for plotting in R, as well as several for plotting networks in R. Plotting in R is typically based around two main approaches:
- the ‘base’ approach in R by default, and
- the ‘grid’ approach made popular by the famous and very flexible {ggplot2} package.¹
Approaches to plotting graphs or networks in R can be similarly divided:
- two classic packages, {igraph} and {sna}, both build upon the base R graphics engine,
- newer packages {ggnetwork} and {ggraph} build upon a grid approach.²
While the coercion routines available in {manynet} make it easy to use any of these packages’ graphing capabilities, {manynet} itself builds upon the ggplot2/ggraph engine to help quickly and informatively graph networks.
Dimensions of visualisation
On her excellent and helpful website, Katya Ognyanova outlines some key dimensions of control that network researchers have to play with:
- vertex position (layout)
- vertex shape (e.g. circles, squares)
- vertex color
- vertex size
- vertex labels
- edge shape (e.g. straight, bends)
- edge type (e.g. solid, dashed)
- edge color
- edge size
- edge arrows
In the following sections, we will learn how {manynet} provides sensible defaults on many of these elements, and offers ways to extend or customise them.
¹ ‘gg’ stands for the Grammar of Graphics.
² Others include: ‘Networkly’ for creating 2-D and 3-D interactive networks that can be rendered with plotly and can be easily integrated into shiny apps or markdown documents; ‘visNetwork’ interacts with javascript (vis.js) to make interactive networks (http://datastorm-open.github.io/visNetwork/); and ‘networkD3’ interacts with javascript (D3) to make interactive networks (https://www.r-bloggers.com/2016/10/network-visualization-part-6-d3-and-r-networkd3/).
Graphing
graphr
For a basic visualisation of a network before adding further specifications, the graphr() function in {manynet} offers a quick and easy first look that is clear enough for preliminary investigation. Let’s quickly visualise one of the ison_ datasets included in the package.
graphr(ison_lotr)
We can also specify the colours, groups, shapes, and sizes of nodes and edges in the graphr() function using the following parameters:
- node_colour
- node_shape
- node_size
- node_group
- edge_color
- edge_size
graphr(ison_lotr, node_color = "Race")
graphs
graphr() is not the only graphing function included in {manynet}. To graph a set of networks, graphs() plots two or more networks together. This might be a set of ego networks, subgraphs, or waves of a longitudinal network.
graphs(to_subgraphs(ison_lotr, "Race"),
       waves = c(1,2,3,4))
grapht
grapht() is another option, rendering network changes as a gif.
ison_lotr %>%
  mutate_ties(year = sample(1:12, 66, replace = TRUE)) %>%
  to_waves(attribute = "year", cumulative = TRUE) %>%
  grapht()
Titles and subtitles
Append ggtitle() to add a title. {manynet} works well with both {ggplot2} and {ggraph} functions that can be appended to create more tailored visualisations of the network.
graphr(ison_adolescents) +
  ggtitle("Visualisation")
Arrangements
{manynet} also uses the {patchwork} package for arranging graphs together, e.g. side-by-side or above one another. The syntax is quite straightforward and is used throughout these vignettes.
graphr(ison_adolescents) + graphr(ison_algebra)
graphr(ison_adolescents) / graphr(ison_algebra)
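As a rough sketch of how these operators compose, the example below uses two plain {ggplot2} scatterplots as stand-ins for graphr() output (an assumption made here so that it runs without any network data). In {patchwork}, `+` and `|` place plots next to each other, `/` stacks them, and plot_annotation() adds a shared title.

```r
library(ggplot2)
library(patchwork)

# Two placeholder plots standing in for graphr() output (illustrative only)
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + ggtitle("Left")
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point() + ggtitle("Right")

side_by_side <- p1 | p2   # one row, two columns
stacked      <- p1 / p2   # two rows, one column

# Parentheses control nesting: two plots on top, one wide plot underneath
combined <- (p1 | p2) / p1 +
  plot_annotation(title = "Three panels, one figure")

combined
```

Because the result is itself a plot object, it can be nested again or saved like any other gg-based plot.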
Legends
While {manynet} attempts to provide legends where necessary, in some cases the legends offer insufficient detail, such as in the following figure, or are absent.
ison_lotr %>%
  mutate(maxbet = node_is_max(node_betweenness(ison_lotr))) %>%
  graphr(node_color = "maxbet")
{manynet} supports the {ggplot2} way of adding legends after the main plot has been constructed, using guides() to add in the legends, and labs() for giving those legends particular titles. Note that we can use "\n" within the legend title to make the title span multiple lines.
ison_lotr %>%
  mutate(maxbet = node_is_max(node_betweenness(ison_lotr))) %>%
  graphr(node_color = "maxbet") +
  guides(color = "legend") +
  labs(color = "Maximum\nBetweenness")
An alternative to colors and legends is to use the ‘node_group’ argument to highlight groups in a network. This works best for quite clustered distributions of attributes.
graphr(ison_lotr, node_group = "Race")
Layouts
The aim of graph layouts is to position nodes in a (usually) two-dimensional space to maximise some analytic and aesthetically pleasing function. Quality measures might include:
- minimising the crossing number of edges/ties in the graph (planar graphs require no crossings)
- minimising the slope number of distinct edge slopes in the graph (where vertices are represented as points on a Euclidean plane)
- minimising the bend number in all edges in the graph (every graph has a right angle crossing (RAC) drawing with three bends per edge)
- minimising the total edge length
- minimising the maximum edge length
- minimising the edge length variance
- maximising the angular resolution or sharpest angle of edges meeting at a common vertex
- minimising the bounding box of the plot
- evening the aspect ratio of the plot
- displaying symmetry groups (subgraph automorphisms)
A range of graph layouts are available across the {igraph}, {graphlayouts}, and {manynet} packages that can be used together with graphr().
Force-directed layouts
Force-directed layouts update some initial placement of vertices through the operation of some system of metaphorically-physical forces. These might include attractive and repulsive forces.
(graphr(ison_southern_women, layout = "kk") + ggtitle("Kamada-Kawai") |
graphr(ison_southern_women, layout = "fr") + ggtitle("Fruchterman-Reingold") |
graphr(ison_southern_women, layout = "stress") + ggtitle("Stress Minimisation"))
The Kamada-Kawai method inserts a spring between all pairs of vertices that is the length of the graph distance between them. This means that edges with a large weight will be longer. KK offers a good layout for lattice-like networks, because it will try to space the network out evenly.
The Fruchterman-Reingold method uses an attractive force between directly connected vertices, and a repulsive force between all vertex pairs. The attractive force is proportional to the edge’s weight, thus edges with a large weight will be shorter. FR offers a good baseline for most types of networks.
The Stress Minimisation method is related to the KK algorithm, but offers better runtime, quality, and stability and so is generally preferred. Indeed, {manynet} uses it as the default for most networks. It has the advantage of returning the same layout each time it is run on the same network.
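To see what these algorithms actually return, the sketch below calls the underlying layout functions directly rather than going through graphr(). It assumes {igraph} is installed and uses its bundled Zachary karate club network as a stand-in; the stress layout is taken from {graphlayouts} and is guarded in case that package is missing.

```r
library(igraph)

# Zachary's karate club, a small network bundled with igraph
g <- make_graph("Zachary")

# Force-directed layouts return one (x, y) row per vertex
xy_kk <- layout_with_kk(g)   # spring lengths follow graph distances
set.seed(42)                 # FR starts from random positions
xy_fr <- layout_with_fr(g)   # attraction along ties, repulsion between all pairs

dim(xy_kk)
dim(xy_fr)

# Stress minimisation lives in {graphlayouts}; guarded in case it is not installed
if (requireNamespace("graphlayouts", quietly = TRUE)) {
  xy_s1 <- graphlayouts::layout_with_stress(g)
  xy_s2 <- graphlayouts::layout_with_stress(g)
  # Compare two runs to check the repeatability described above
  print(isTRUE(all.equal(xy_s1, xy_s2)))
}
```

Setting a seed before a randomly initialised layout such as FR is a common way to keep figures repeatable between runs.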
Other force-directed layouts available include:
- Simulated annealing (Davidson and Harel 1993): "dh"
- Graph embedder (Frick et al. 1995): "gem"
- Graphopt (Schmuhl): "graphopt"
- Distributed recursive graph layout (Martin et al. 2008): "drl"
Layered layouts
Layered layouts arrange nodes into horizontal (or vertical) layers, positioning them so that they reduce crossings. These layouts are best suited for directed acyclic graphs or similar.
(graphr(ison_southern_women, layout = "bipartite") + ggtitle("Bipartite")) /
(graphr(ison_southern_women, layout = "hierarchy") + ggtitle("Hierarchy")) /
(graphr(ison_southern_women, layout = "railway") + ggtitle("Railway"))
Note that "hierarchy" and "railway" use a different algorithm to {igraph}’s "bipartite", and generally perform better, especially where there are multiple layers. Whereas "hierarchy" tries to position nodes to minimise overlaps, "railway" sequences the nodes in each layer to a grid so that nodes are matched as far as possible. If you want to flip the horizontal and vertical, you could flip the coordinates, or use something like the following layout.
graphr(ison_southern_women, layout = "alluvial") + ggtitle("Alluvial")
Other layered layouts include:
- Tree: "tree"
- Dominance layouts
Circular layouts
Circular layouts arrange nodes around (potentially concentric) circles, such that crossings are minimised and adjacent nodes are located close together. In some cases, location or layer can be specified by attribute or mode.
graphr(ison_southern_women, layout = "concentric") + ggtitle("Concentric")
Other such layouts include:
- circular: "circle"
- sphere: "sphere"
- star: "star"
- arc or linear layouts: "linear"
Spectral layouts
Spectral layouts arrange nodes according to the eigenvectors of a matrix derived from the graph, typically its Laplacian, using the eigenvectors associated with selected eigenvalues as coordinates. These layouts tend to exaggerate clustering of like-nodes and the separation of less similar nodes in two-dimensional space.
graphr(ison_southern_women, layout = "eigen") + ggtitle("Eigenvector")
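To make the spectral idea concrete, here is a minimal base R sketch on a made-up toy graph: two triangles joined by a single bridge. It builds the Laplacian by hand and uses eigenvectors as coordinates. This illustrates the general technique only and is not necessarily how the "eigen" layout is implemented.

```r
# Two triangles (nodes 1-3 and 4-6) joined by a single bridge between 3 and 4
edges <- rbind(c(1, 2), c(1, 3), c(2, 3),
               c(4, 5), c(4, 6), c(5, 6),
               c(3, 4))
n <- 6
A <- matrix(0, n, n)
A[edges] <- 1
A[edges[, 2:1]] <- 1            # make the adjacency matrix symmetric

L <- diag(rowSums(A)) - A       # graph Laplacian: degree matrix minus adjacency
e <- eigen(L, symmetric = TRUE) # eigenvalues come back in decreasing order

# Skip the trivial eigenvector (eigenvalue 0, last column) and use the
# next two as x and y coordinates
coords  <- e$vectors[, c(n - 1, n - 2)]
fiedler <- coords[, 1]

round(e$values, 3)
split(1:n, sign(fiedler))       # the two triangles fall on opposite sides
```

The first coordinate alone already separates the two clusters, which is the exaggeration of clustering described above.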
Somewhat similar are multidimensional scaling techniques, which visualise the similarity between nodes in terms of their proximity in a two-dimensional (or higher-dimensional) space.
graphr(ison_southern_women, layout = "mds") + ggtitle("Multidimensional Scaling")
Other such layouts include:
- Pivot multidimensional scaling: "pmds"
Grid layouts
Grid layouts arrange nodes based on some Cartesian coordinates. These can be useful for making sure all nodes’ labels are visible, but horizontal and vertical lines can overlap, making it difficult to distinguish whether some nodes are tied or not.
graphr(ison_southern_women, layout = "grid") + ggtitle("Grid")
Other grid layouts include:
- orthogonal layouts for e.g. printed circuit boards
- grid snapping for other layouts
Colors
Who’s hue?
By default, graphr() will use a color palette that offers fairly good contrast and, since v1.0.0 of {manynet}, better accessibility. However, a different hue might offer a better aesthetic or identifiability for some nodes. Because the graphr() function is based on the grammar of graphics, it’s easy to extend or alter aesthetic aspects. Here let’s try and change the colors assigned to the different races in the ison_lotr dataset.
graphr(ison_lotr,
       node_color = "Race")
graphr(ison_lotr,
       node_color = "Race") +
  ggplot2::scale_colour_hue()
Grayscale
Other times color may not be desired. Some publications require grayscale images. To use a grayscale color palette, replace _hue from above with _grey:
graphr(ison_lotr,
       node_color = "Race") +
  ggplot2::scale_colour_grey()
Manual override
Or we may want to choose particular colors for each category. This is pretty straightforward to do with ggplot2::scale_colour_manual(). Some common color names are available, but otherwise hex color codes can be used for more specific colors. Unspecified categories are coloured (dark) grey.
graphr(ison_lotr,
       node_color = "Race") +
  ggplot2::scale_colour_manual(
    values = c("Dwarf" = "red",
               "Hobbit" = "orange",
               "Maiar" = "#DEC20B",
               "Human" = "lightblue",
               "Elf" = "lightgreen",
               "Ent" = "darkgreen")) +
  labs(color = "Color")
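The fallback for unspecified categories can be checked on a plain {ggplot2} plot. The sketch below uses a small made-up data frame (the `race` column and its values are illustrative, not taken from ison_lotr) and deliberately leaves one category out of `values`. layer_data() then shows which colour each point actually received.

```r
library(ggplot2)

# Hypothetical data: three categories, only two of which get a manual colour
nodes <- data.frame(x = 1:3, y = 1:3,
                    race = c("Elf", "Hobbit", "Ent"))

p <- ggplot(nodes, aes(x, y, colour = race)) +
  geom_point(size = 4) +
  scale_colour_manual(values = c("Elf" = "lightgreen",
                                 "Hobbit" = "orange"))

# Colours actually assigned to each point after the scale is applied
cols <- layer_data(p)$colour
cols
```

If a different fallback is wanted, scale_colour_manual() also takes an `na.value` argument.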
Theming
Perhaps you are preparing a presentation, representing your institution, department, or research centre at home or abroad. In this case, you may wish to theme the whole network with institutional colors and fonts. Here we demonstrate one of the color scales available in {manynet}, the colors of the Sustainable Development Goals:
graphr(ison_lotr, node_color = "Race") +
  scale_color_sdgs()
More institutional scales and themes are available, and more can be implemented upon pull request.
Further flexibility
For more flexibility with visualizations, {manynet} users are encouraged to use the excellent {ggraph} package. {ggraph} is built upon the venerable {ggplot2} package and works with tbl_graph and igraph objects. As with {ggplot2}, {ggraph} users are expected to build a particular plot from the ground up, adding explicit layers to visualise the nodes and edges.
library(ggraph)
ggraph(ison_greys, layout = "fr") +
  geom_edge_link(edge_colour = "dark grey",
                 arrow = arrow(angle = 45,
                               length = unit(2, "mm"),
                               type = "closed"),
                 end_cap = circle(3, "mm")) +
  geom_node_point(size = 2.5, shape = 19, colour = "blue") +
  geom_node_text(aes(label=name), family = "serif", size = 2.5) +
  scale_edge_width(range = c(0.3,1.5)) +
  theme_graph() +
  theme(legend.position = "none")
As we can see in the code above, we can specify various aspects of the plot to tailor it to our network.
First, we can alter the layout of the network using the layout = argument to create a clearer visualisation of the ties between nodes. This is especially important for larger networks, where nodes and ties are more easily obscured or misrepresented. In {ggraph}, the default layout is the “stress” layout. The “stress” layout is a safe choice because it is deterministic and fits well with almost any graph, but it is also a good idea to explore and try out other layouts on your data. More layouts can be found in the {graphlayouts} and {igraph} R packages. To use a layout from the {igraph} package, enter only the last part of the layout algorithm name (e.g. layout = "mds" for “layout_with_mds”).
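One way to explore layouts before committing to a plot is to compute them separately with create_layout() from {ggraph}. The sketch below assumes {igraph} and {ggraph} are installed and uses igraph's bundled Zachary network as a stand-in for your own data.

```r
library(igraph)
library(ggraph)

g <- make_graph("Zachary")

# create_layout() computes positions without drawing anything
lay_mds <- create_layout(g, layout = "mds")     # igraph's layout_with_mds
lay_str <- create_layout(g, layout = "stress")  # from {graphlayouts}

# Each layout is a data frame with one row per node and x/y columns
head(lay_mds[, c("x", "y")])

# The layout object can be passed straight to ggraph()
ggraph(lay_str) +
  geom_edge_link(edge_colour = "grey60") +
  geom_node_point() +
  theme_graph()
```

Keeping the layout as an object also makes it possible to reuse exactly the same node positions across several plots.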
Second, using geom_node_point(), which draws the nodes as geometric shapes (circles, squares, or triangles), we can specify the presentation of nodes in the network in terms of their shape (shape=, choose from 0 to 25), size (size=), or colour (colour=). We can also use aes() to map these to node attributes. To add labels, use geom_node_text() or geom_node_label() (draws labels within a box). The font (family=), font size (size=), and colour (colour=) of the labels can be specified.
Third, we can also specify the presentation of edges in the network. To draw edges, we use geom_edge_link0() or geom_edge_link(). Using the latter function makes it possible to draw a straight line with a gradient. The following features can be tailored either globally or matched to specific edge attributes using aes():
- colour: edge_colour=
- width: edge_width=
- linetype: edge_linetype=
- opacity: edge_alpha=
For directed graphs, arrows can be drawn using the arrow= argument and the arrow() function from {ggplot2}. The angle, length, arrowhead type, and padding between the arrowhead and the node can also be specified.
To change the position of the legend, add the theme() function from {ggplot2}. The legend can be positioned at the top, bottom, left, or right, or removed using “none”.
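The difference between setting an aesthetic globally and mapping it to an attribute with aes() can be sketched as follows. The network is igraph's bundled Zachary graph, and the `weight` and `group` attributes are made up here purely for illustration.

```r
library(igraph)
library(ggraph)

set.seed(1)
g <- make_graph("Zachary")
E(g)$weight <- sample(1:5, ecount(g), replace = TRUE)  # made-up tie weights
V(g)$group  <- ifelse(degree(g) > 5, "hub", "other")    # made-up node attribute

p <- ggraph(g, layout = "stress") +
  geom_edge_link(aes(edge_width = weight),   # mapped to an edge attribute
                 edge_colour = "grey60",     # fixed for all edges
                 edge_alpha = 0.7) +
  geom_node_point(aes(colour = group), size = 3) +
  scale_edge_width(range = c(0.3, 1.5)) +
  theme_graph() +
  theme(legend.position = "bottom")          # or "top", "left", "right", "none"

p
```

Only the mapped aesthetics (edge width and node colour here) produce legend entries; the fixed ones do not.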
For more see David Schoch’s excellent resources on this.
Exporting plots to PDF
We can print the plots we have made to PDF by point-and-click by selecting ‘Save as PDF…’ from under the ‘Export’ dropdown menu in the plots panel tab of RStudio.
If you want to do this programmatically, say because you want to record how you have saved it so that you can e.g. make some changes to the parameters at some point, this is also not too difficult.
After running the (gg-based) plot you want to save, use the command ggsave("my_filename.pdf") to save your plot as a PDF to your working directory. If you want to save it somewhere else, you will need to specify the file path (or change the working directory, but that might be more cumbersome). If you want to save it as a different filetype, replace .pdf with e.g. .png or .jpeg. See ?ggsave for more.
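A slightly fuller sketch of a programmatic export is given below. It passes the plot explicitly rather than relying on the last plot drawn, and fixes the dimensions so that the output is reproducible. The scatterplot is a placeholder for any gg-based plot, and the temporary folder is a hypothetical destination chosen so that the example does not write into your working directory.

```r
library(ggplot2)

# A placeholder plot; any gg-based plot can be saved the same way
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Hypothetical output location: a temporary folder rather than the working directory
out <- file.path(tempdir(), "my_filename.pdf")

# Naming the plot and fixing its size records exactly what was saved
ggsave(out, plot = p, width = 16, height = 12, units = "cm")

file.exists(out)
```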