Missing

  • Triad census
  • Overview centrality measures

Network analysis looks at a set of actors and the relationships between them. The goal can be either to describe the network (e.g. how relationships are structured) or to find explanations based on the network structure (e.g. what causes the relationships between actors or what effects these relationships have).

Examples of networks

  • The members of a company forming an informal (friendships, advice-seeking) and a formal (official hierarchy) network
  • Nation states forming a network of trade or conflict relations

Basic terms and concepts

| Term | Explanation |
| --- | --- |
| Nodes | The actors in a network (e.g. individuals, states, etc.) |
| Ties | The relationships between the actors. Depending on the type of relationship, ties are also called line, link, edge or arc. |
| Directed vs. undirected relationships | Some relationships between actors have a direction while others do not (e.g. foreign aid flows that only go in one direction vs. mutual defence alliances like NATO where each member has to help every other member). Undirected ties are called edges, directed ties are called arcs. |
| Binary vs. valued relationships | Some relationships are binary (yes / no) while others have a certain value or strength (think EU membership vs. trade volume between countries) |
| Dyad / Triad | A set of two / three nodes and their relationships (regardless of whether there is a tie between them or not) |

Adjacency Matrices

An adjacency matrix contains the information about the relationships (links, ties) between actors. The number of rows and columns equals the number of actors (or nodes) in the network.

Depending on the type of relationships there are different types of matrices:

| Matrix | Type of relationship |
| --- | --- |
| Binary vs. valued adjacency matrix | Binary vs. valued relationships → cells take either values of 0 / 1 or some other numerical value |
| Symmetrical vs. asymmetrical adjacency matrix | Undirected vs. directed relationships → the values in the matrix are symmetrical along the diagonal ($x_{12} = x_{21}$, etc.) or not |

For undirected relationships the cell values are symmetric along the main diagonal of the matrix.

For directed relationships the rows represent the senders and the columns the receivers: the cell $x_{ij}$ in row $i$ and column $j$ holds the tie sent from actor $i$ to actor $j$.

Warning

Actors are not allowed to have loops (a relationship with themselves), therefore the diagonal values $x_{11}$, $x_{22}$, etc. are always 0.

Values within the adjacency matrix

| Value | Meaning |
| --- | --- |
| 0 | no tie between actors |
| 1 | tie between actors (actors are adjacent) |
| any other number | strength of the tie in a valued matrix (some numerical value, e.g. the value of traded goods between countries in USD) |

Example

The matrix shows a network with four actors.

To check whether it is directed or undirected we have to look at the (a)symmetry of the matrix. The relationship from actor 1 to actor 2 (seen in the second column of the first row) is 1. The reverse relationship from actor 2 to actor 1 (seen in the first column of the second row) is 0. This means that the matrix is asymmetrical and that the network is directed.
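The same check can be sketched in R. The matrix below is a hypothetical stand-in (the values are assumptions chosen so that the tie from actor 1 to actor 2 exists while the reverse tie does not), and the object names are illustrative only.

```r
# Hypothetical 4-actor adjacency matrix: rows are senders, columns are receivers.
actors <- paste0("actor", 1:4)
adj <- matrix(
  c(0, 1, 0, 0,
    0, 0, 1, 0,
    1, 0, 0, 1,
    0, 0, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(actors, actors)
)

# No loops: every value on the main diagonal has to be 0.
no_loops <- all(diag(adj) == 0)

# The network is undirected only if the matrix equals its transpose.
is_undirected <- all(adj == t(adj))

adj["actor1", "actor2"]  # tie from actor 1 to actor 2
adj["actor2", "actor1"]  # reverse tie from actor 2 to actor 1
is_undirected            # FALSE here, so the network is directed
```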

Visualised, the network looks like this:

Density

The density of a network is the share of existing ties among all potential ties in the network. So we need to find out how many ties there actually are ($L$) and how many there could be. The calculation is slightly different between directed and undirected networks.

Undirected networks

The maximum number of ties in a network with $g$ nodes is:

$$L_{max} = \frac{g(g-1)}{2}$$

Because a node can't be connected to itself, one is subtracted. Because the network is undirected, a connection between two nodes counts as one tie regardless of its direction, so the number of ties is divided by two.

The density of an undirected network is then the number of ties $L$ that exist in the network in proportion to the maximum number of ties:

$$D = \frac{L}{g(g-1)/2} = \frac{2L}{g(g-1)}$$

The density is 0 when L = 0 and 1 when all possible connections in the network are realised.

Directed networks

For directed networks, the principle is the same, but because a tie between two nodes can go either way, the division by two is not necessary.

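The number of possible ties in a directed network with $g$ nodes then is:

$$L_{max} = g(g-1)$$

And the density, with $L$ as the number of existing ties, is:

$$D = \frac{L}{g(g-1)}$$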

Once again, the values of the density can range from 0 to 1, representing everything from no connections at all to all possible ties existing and being reciprocal.
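Both density calculations can be sketched in base R. The helper name `network_density` and the two small matrices are assumptions made for illustration, not part of the course materials.

```r
# Density of a binary adjacency matrix without loops.
# `network_density` is a hypothetical helper, not a package function.
network_density <- function(adj, directed = TRUE) {
  g <- nrow(adj)                     # number of nodes
  if (directed) {
    L <- sum(adj)                    # every 1 is one arc
    max_ties <- g * (g - 1)
  } else {
    L <- sum(adj[upper.tri(adj)])    # count each edge only once
    max_ties <- g * (g - 1) / 2
  }
  L / max_ties
}

# Illustrative directed network: 4 nodes, 4 arcs.
directed_adj <- matrix(
  c(0, 0, 0, 1,
    1, 0, 0, 0,
    1, 0, 0, 1,
    0, 0, 0, 0),
  nrow = 4, byrow = TRUE
)

# Illustrative undirected network: 4 nodes, 3 edges (a path 1-2-3-4).
undirected_adj <- matrix(
  c(0, 1, 0, 0,
    1, 0, 1, 0,
    0, 1, 0, 1,
    0, 0, 1, 0),
  nrow = 4, byrow = TRUE
)

density_directed   <- network_density(directed_adj, directed = TRUE)     # 4 / 12
density_undirected <- network_density(undirected_adj, directed = FALSE)  # 3 / 6
```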

Dyad / triad census

Categorises all the dyads (or triads) in a network by the state of their ties. Each dyad is in one of three different states; triad labels add a fourth element (X):

  • M = mutual
  • A = asymmetric
  • N = null
  • X (only in triads) → look at asymmetric edges
    • D = down (edges flow in a hierarchical or linear direction)
    • U = up (edges flow opposite to some focal node)
    • C = cyclical (edges form a closed loop among three nodes)
    • T = transitive
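A dyad census can be sketched in base R by looking at every unordered pair of nodes once. The function name `dyad_census_manual` and the example matrix are assumptions for illustration; the sketch covers dyads only, not the triad letters.

```r
# Count mutual (M), asymmetric (A) and null (N) dyads in a binary,
# directed adjacency matrix. Hypothetical helper, not a package function.
dyad_census_manual <- function(adj) {
  g <- nrow(adj)
  counts <- c(M = 0, A = 0, N = 0)
  for (i in seq_len(g - 1)) {
    for (j in (i + 1):g) {
      ties <- adj[i, j] + adj[j, i]        # 0, 1 or 2 arcs in this dyad
      state <- c("N", "A", "M")[ties + 1]
      counts[state] <- counts[state] + 1
    }
  }
  counts
}

# Illustrative network: 1 and 2 choose each other, 2 also chooses 3,
# node 4 is isolated.
adj <- matrix(
  c(0, 1, 0, 0,
    1, 0, 1, 0,
    0, 0, 0, 0,
    0, 0, 0, 0),
  nrow = 4, byrow = TRUE
)

census <- dyad_census_manual(adj)
census  # the three counts add up to choose(4, 2) = 6 dyads
```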

Centrality Measures

Overview:

Centrality Measure
Degree Centrality
Betweenness Centrality
Closeness Centrality

1. Degree Centrality

The degree of a node describes the number of connections it has to other nodes. Indegree and outdegree only matter for directed networks.

Hint

  • Degree refers to the raw count of connections
  • Degree centrality usually refers to the normalised measure, which allows for comparison across networks (but this wasn’t made really explicit in the materials)

| Measure | Definition | Calculation from matrix | Example |
| --- | --- | --- | --- |
| Degree | Number of direct connections a node has | Sum across row or down column | In a network of bilateral trade agreements, if Germany has trade agreements with France, Italy, and Poland, Germany’s degree = 3. |
| Outdegree | Number of ties sent by a node (outgoing connections) | Sum across row | If the USA sends foreign aid to 15 countries, USA’s outdegree = 15. |
| Indegree | Number of ties received by a node (incoming connections) | Sum down column | If Kenya receives foreign aid from 8 donor countries, Kenya’s indegree = 8. |

More formally, the calculation of the degrees looks like this:

| Type of centrality | Description | Formula |
| --- | --- | --- |
| Degree Centrality | The number of ties going from $i$ to $j$ or the other way around. | $C_D(n_i) = \sum_{j=1}^{g} x_{ij} = \sum_{j=1}^{g} x_{ji}$ |
| Indegree Centrality | The number of incoming connections (going from $j$ to $i$). | $C_I(n_i) = \sum_{j=1}^{g} x_{ji}$ |
| Outdegree Centrality | The number of outgoing connections (going from $i$ to $j$). | $C_O(n_i) = \sum_{j=1}^{g} x_{ij}$ |

Explanation:

  • $C$ stands for centrality
  • $n_i$ stands for a node in the network called $i$
  • $x_{ij}$ is the cell of the adjacency matrix for the nodes $i$ and $j$. The sum sign indicates that all cases where a connection between $i$ and $j$ exists (marked by a 1 in the matrix) are summed up.
  • The $g$ above the sum sign means that the step is repeated for all nodes $j$ from 1 to $g$, the number of nodes in the network. It sets the range for the summation so that all possible connections are checked.

Normalisation of Centrality Measures

To make the degree of a node comparable between different networks, it has to be normalised. This is done by adjusting for the number of nodes $g$. Again, one is subtracted, because a node is not allowed to have a connection to itself:

$$C'_D(n_i) = \frac{C_D(n_i)}{g - 1}$$

The normalised value can be interpreted as the share of all possible connections that the node actually has (multiplied by 100, as a percentage).

Example

Let’s take a directed network with four nodes that have the following connections:

Remember that in the adjacency matrix, the rows show the outgoing relationships from the source node to the target nodes. The columns show the incoming connections a node receives from other nodes.

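|   | A | B | C | D |
| --- | --- | --- | --- | --- |
| A | 0 | 0 | 0 | 1 |
| B | 1 | 0 | 0 | 0 |
| C | 1 | 0 | 0 | 1 |
| D | 0 | 0 | 0 | 0 |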

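To calculate the indegree centrality for node A we sum the values in column A, because they represent the incoming connections:

$$C_I(A) = 0 + 1 + 1 + 0 = 2$$

To calculate the outdegree centrality for node B we sum the values in row B:

$$C_O(B) = 1 + 0 + 0 + 0 = 1$$

To normalise the values, we divide by the number of nodes minus one, here $4 - 1 = 3$:

$$C'_I(A) = \frac{2}{3} \approx 0.67 \qquad C'_O(B) = \frac{1}{3} \approx 0.33$$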
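The example can be reproduced in base R with the adjacency matrix from above. This is a minimal sketch; the object names are chosen for illustration.

```r
# Adjacency matrix from the example: rows are senders, columns are receivers.
nodes <- c("A", "B", "C", "D")
adj <- matrix(
  c(0, 0, 0, 1,
    1, 0, 0, 0,
    1, 0, 0, 1,
    0, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(nodes, nodes)
)

g <- nrow(adj)               # number of nodes

indegree  <- colSums(adj)    # incoming ties: sum down each column
outdegree <- rowSums(adj)    # outgoing ties: sum across each row

# Normalise by the number of other nodes (g - 1).
indegree_norm  <- indegree / (g - 1)
outdegree_norm <- outdegree / (g - 1)

indegree["A"]         # 2
outdegree["B"]        # 1
indegree_norm["A"]    # 2 / 3
outdegree_norm["B"]   # 1 / 3
```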

2. Betweenness Centrality

“Betweenness centrality […] looks at how often an actor rests between two other actors. More specifically, betweenness centrality calculates how many times an actor sits on the geodesic (i.e. the shortest path) linking two actors together” (Prell 2012, p. 104)

Steps to calculate the betweenness centrality

  • Identify the geodesic(s) for each pair of actors. A geodesic is the shortest path between two nodes in a network.
  • How often is the node part of such a geodesic?

Formula:

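The betweenness centrality for actor $i$ is:

$$C_B(n_i) = \sum_{j < k} \frac{g_{jk}(n_i)}{g_{jk}}$$

  • $g_{jk}(n_i)$: the number of geodesics from $j$ to $k$ where $i$ is included, divided by
  • $g_{jk}$: the number of all geodesics from $j$ to $k$
  • summed up over all pairs of other nodes $j$ and $k$ (neither of which is $i$)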

Example

In this network there are two geodesics for the pair of actors and , both with a length of three.

So to calculate the betweenness centrality of a node that lies on the first geodesic but not on the second, we would count only the first geodesic in the numerator of the fraction above and both geodesics in the denominator, which gives 1/2 for this pair. To continue, we would have to do the same for all other pairs of nodes and then sum up the results.

Normalised betweenness centrality

To normalise the betweenness centrality of an actor $i$, it is divided by the maximum possible betweenness in a network of size $g$. This corresponds to the number of node pairs that don’t involve $i$, which in an undirected network is $(g-1)(g-2)/2$ (in a directed network the division by two is dropped):

$$C'_B(n_i) = \frac{C_B(n_i)}{(g-1)(g-2)/2}$$

Interpretation: A normalised betweenness centrality of 0 would mean that the actor is never part of any geodesic between other pairs. A value of 1 would mean that it is part of all geodesics (e.g. the center point in an undirected star network).
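The two steps (finding the geodesics, then counting how often a node sits on them) can be sketched in base R for small, unweighted networks. All function names and both example networks are assumptions for illustration. The sketch uses the fact that in an unweighted network the number of geodesics between two nodes equals the number of walks whose length is exactly their distance.

```r
# Geodesic distances via Floyd-Warshall (fine for small networks).
geodesic_distances <- function(adj) {
  g <- nrow(adj)
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(g)) for (i in seq_len(g)) for (j in seq_len(g)) {
    d[i, j] <- min(d[i, j], d[i, k] + d[k, j])
  }
  d
}

# Number of geodesics between every pair of nodes.
count_geodesics <- function(adj, d) {
  g <- nrow(adj)
  sigma <- diag(g)
  walks <- diag(g)
  for (len in seq_len(g - 1)) {
    walks <- walks %*% adj               # walks of length `len`
    sigma[d == len] <- walks[d == len]
  }
  sigma
}

betweenness_manual <- function(adj, directed = FALSE) {
  g <- nrow(adj)
  d <- geodesic_distances(adj)
  sigma <- count_geodesics(adj, d)
  cb <- numeric(g)
  for (i in seq_len(g)) for (j in seq_len(g)) for (k in seq_len(g)) {
    if (i == j || i == k || j == k) next
    # i lies on a geodesic from j to k if the detour via i is not longer
    if (is.finite(d[j, k]) && d[j, i] + d[i, k] == d[j, k]) {
      cb[i] <- cb[i] + sigma[j, i] * sigma[i, k] / sigma[j, k]
    }
  }
  if (!directed) cb <- cb / 2            # each unordered pair was visited twice
  cb
}

# Undirected star with node 1 at the center.
star <- matrix(0, 4, 4)
star[1, 2:4] <- 1
star[2:4, 1] <- 1

# Undirected square 1-2-3-4-1: opposite corners have two geodesics.
square <- matrix(
  c(0, 1, 0, 1,
    1, 0, 1, 0,
    0, 1, 0, 1,
    1, 0, 1, 0),
  nrow = 4, byrow = TRUE
)

g <- 4
star_betweenness      <- betweenness_manual(star)
star_betweenness_norm <- star_betweenness / ((g - 1) * (g - 2) / 2)
square_betweenness    <- betweenness_manual(square)
```

In the star, the center sits on the single geodesic of all three pairs of outer nodes, so its normalised value is 1. In the square, every node sits on one of the two geodesics between its neighbours, which contributes 1/2.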

3. Closeness Centrality

Closeness is measured “as the distance between actors, where actors who have the shortest distance to other actors are seen as having the most closeness centrality” (Prell 2012, p. 107)

The closeness centrality of an actor $i$ is 1 divided by the sum of the shortest path lengths $d(n_i, n_j)$ to each of the other nodes:

$$C_C(n_i) = \frac{1}{\sum_{j=1}^{g} d(n_i, n_j)}$$

The denominator by itself indicates farness, which would be less intuitive because higher values would mean less centrality.

If you are, however, interested in the farness, you can just keep the part in the denominator:

$$\sum_{j=1}^{g} d(n_i, n_j)$$

To normalise the closeness centrality, one adjusts for the maximum value of closeness centrality, $1 / (g - 1)$, which is determined by the number of actors $g$:

$$C'_C(n_i) = \frac{g - 1}{\sum_{j=1}^{g} d(n_i, n_j)}$$

Warning

Determining the closeness centrality only works for connected graphs: if some of the nodes are not connected to the rest of the network, the distance to them is infinite, so the sum of distances becomes infinite and the closeness drops to zero.
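A minimal base R sketch of closeness, including the problem with disconnected networks. The helper names and both example networks are assumptions for illustration.

```r
# Shortest path lengths via Floyd-Warshall (fine for small networks).
shortest_paths <- function(adj) {
  g <- nrow(adj)
  d <- ifelse(adj > 0, 1, Inf)
  diag(d) <- 0
  for (k in seq_len(g)) for (i in seq_len(g)) for (j in seq_len(g)) {
    d[i, j] <- min(d[i, j], d[i, k] + d[k, j])
  }
  d
}

closeness_manual <- function(adj, normalised = FALSE) {
  g <- nrow(adj)
  farness <- rowSums(shortest_paths(adj))   # sum of distances to all others
  closeness <- 1 / farness                  # infinite farness gives 0
  if (normalised) closeness <- closeness * (g - 1)
  closeness
}

# Illustrative undirected path 1-2-3-4.
path <- matrix(
  c(0, 1, 0, 0,
    1, 0, 1, 0,
    0, 1, 0, 1,
    0, 0, 1, 0),
  nrow = 4, byrow = TRUE
)

closeness_raw  <- closeness_manual(path)                     # 1/6, 1/4, 1/4, 1/6
closeness_norm <- closeness_manual(path, normalised = TRUE)  # 0.5, 0.75, 0.75, 0.5

# The same path plus an isolated fifth node: every farness is infinite.
disconnected <- rbind(cbind(path, 0), 0)
closeness_disconnected <- closeness_manual(disconnected)     # all 0
```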

Example

Imagine a network that looks like this:

To calculate the closeness centrality of we need to sum the (shortest) distances to all of the other nodes. These are:

Based on this the (normalised) closeness centrality is:

Application in R

Incomplete

1. Creating a star network and calculating indegrees

Using the package igraph, the make_star function creates a star network. The resulting graph object is not a data frame itself, but its vertices and edges can be extracted as data frames (e.g. with igraph::as_data_frame).

  • mode indicates the type of connections
  • center indicates which node is at the center of the star network
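A call along these lines would create the `star` object. The number of nodes and the choice of `mode = "in"` are assumptions made for illustration, since the original call is not part of these notes.

```r
# Assumed parameters: 5 nodes, all ties pointing towards node 1.
star <- igraph::make_star(n = 5, mode = "in", center = 1)

igraph::vcount(star)   # number of nodes
igraph::ecount(star)   # number of ties

# Edge list as a data frame: one row per tie with sender and receiver.
star_edges <- igraph::as_data_frame(star, what = "edges")
star_edges
```

With `mode = "in"` every outer node sends a tie to the center, so only the center has a non-zero indegree.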

The indegree can be calculated by using the function igraph::degree:

V(star)$indegree <- igraph::degree(star, mode = "in")
V(star)$indegree_norm <- igraph::degree(star, mode = "in", normalized = TRUE)

This also stores the calculated degrees as vertex attributes of the star object.

Using the package intergraph, we can transform the graph object created before into a network object. This allows us to receive more information.

star_network <- intergraph::asNetwork(star)
summary(star_network)

2. Calculating indegrees, outdegrees, and density

Using the dataset transgov

In preparation, the rows and columns are named after the countries (so that all cells in the matrix contain only data) and the object is transformed into a matrix (this doesn’t change how it looks, but allows for other operations).

To calculate indegrees or outdegrees we can simply calculate the column or row sums of the matrix

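# Indegrees are the column sums (ties received), outdegrees the row sums (ties sent)
indegrees <- colSums(matrix)
# The trailing NA pads the vector by one entry, presumably so that it still lines up after the indegrees have been added as an extra row
outdegrees <- c(rowSums(matrix), NA)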

Optionally, both vectors can be bound to the matrix object using the rbind and cbind functions.

For the normalisation, g is defined as the number of nodes and each degree is then divided by g - 1.

Calculating the density can be either done manually or by transforming the matrix into an igraph object. The density can be calculated using the function edge_density.

Calculating the density by hand:
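A sketch of both routes, using a small hypothetical matrix `mat` as a stand-in for the transgov data (which is not reproduced here) and assuming the network is directed:

```r
# Hypothetical stand-in for the prepared transgov matrix.
countries <- c("A", "B", "C", "D")
mat <- matrix(
  c(0, 0, 0, 1,
    1, 0, 0, 0,
    1, 0, 0, 1,
    0, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(countries, countries)
)

# By hand: existing ties divided by possible ties in a directed network.
g <- nrow(mat)
density_by_hand <- sum(mat) / (g * (g - 1))

# Via igraph: convert the matrix to a graph object first.
graph <- igraph::graph_from_adjacency_matrix(mat, mode = "directed")
density_igraph <- igraph::edge_density(graph)

all.equal(density_by_hand, density_igraph)
```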

3. Calculating centrality

4. Plotting networks

5. Symmetrising networks