Analysis

Once you load or create a network in SocNetV, you may use the options in the Analysis menu to analyse it.

The first option in the Analysis menu is Symmetry Test. It reports whether the network is symmetric or not. A network is called "symmetric" if, for every edge (i,j) in the set E of the corresponding graph G(V,E), the 'opposite' edge (j,i) also exists in E. In other words, a network is symmetric when its adjacency matrix is symmetric.
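As a minimal sketch of the test described above (not SocNetV's own code), the following checks whether an adjacency matrix equals its transpose. The function name is_symmetric and the two small example matrices are hypothetical.

```python
def is_symmetric(matrix):
    """Return True if the square adjacency matrix equals its transpose."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(i + 1, n))

# Hypothetical 3-node examples (not taken from SocNetV).
undirected = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
directed = [[0, 1, 0],
            [0, 0, 1],
            [0, 0, 0]]

print(is_symmetric(undirected))  # True
print(is_symmetric(directed))    # False
```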

Distances & Diameter

The next options in the Analysis menu (Distance, Average Distance, Distances Matrix, Diameter, etc.) focus on basic network/graph measures, such as the geodesic distance between nodes, the mean distance between all nodes, the diameter of the graph, the number of geodesics between nodes and the eccentricity of each node. Each option is explained below.

Distance

In graph theory, the shortest path between two vertices of the graph is called "geodesic".
The distance (or geodesic distance) of two nodes in a social network is the length of the shortest path between the corresponding vertices in the graph G(V,E).
By clicking on the "Distance" option (or pressing Ctrl+G) you will be asked for source and target nodes. Then their distance will be calculated and displayed.

Average Distance

The average distance in a social network is the average length of all shortest paths (geodesics) between the connected pairs of vertices in the corresponding graph.

Distances Matrix

The 'Distances Matrix' option calculates and displays a matrix of geodesic distances between all possible pairs of nodes in the social network. A distances matrix is an n x n square matrix, in which the (i,j) element is the distance from node i to node j.
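The distance measures above can be illustrated with a breadth-first search, which finds geodesic distances when every edge counts as one step. This is only a sketch under that assumption; the function names, the nested-dict layout and the choice to leave unreachable pairs out of the result are ours, not SocNetV's.

```python
from collections import deque

def bfs_distances(adj, source):
    """Geodesic distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def distance_matrix(adj):
    """Nested dict d[i][j]; unreachable pairs are simply absent (assumption)."""
    return {i: bfs_distances(adj, i) for i in adj}

def average_distance(adj):
    """Mean geodesic length over all connected ordered pairs i != j."""
    d = distance_matrix(adj)
    lengths = [d[i][j] for i in d for j in d[i] if i != j]
    return sum(lengths) / len(lengths) if lengths else 0.0

# Hypothetical undirected path graph 1 - 2 - 3 - 4.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(distance_matrix(path)[1][4])  # 3
print(average_distance(path))       # 20 / 12, about 1.667
```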

Geodesics Matrix

This option calculates and displays an n x n square matrix, where the (i,j) element is the number of geodesics between node i and node j. The produced matrix, called the sigma matrix, is used in the calculation of Centralities (see below).
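One way to obtain a row of such a sigma matrix is to count shortest paths during a breadth-first search. The sketch below assumes unit edge weights; geodesic_counts and the example graph are hypothetical names, and SocNetV may compute the matrix differently.

```python
from collections import deque

def geodesic_counts(adj, source):
    """Return (dist, sigma): distances and number of geodesics from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma

# Hypothetical 4-cycle a - b - c - d - a: two geodesics join opposite corners.
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
dist, sigma = geodesic_counts(cycle, "a")
print(dist["c"], sigma["c"])  # 2 2
```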

Eccentricity

The eccentricity or association number of each node i is the largest geodesic distance between node i and any other node in the graph. Therefore, in social network analysis, the eccentricity reflects how far, at most, each node is from every other node.
This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs. It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1.

Diameter

The diameter of a social network is the maximum eccentricity of any vertex in the corresponding graph G(V,E), that is the maximum distance between any two connected nodes.
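The relation between eccentricity and diameter can be sketched as follows, assuming unit edge weights and considering only nodes reachable from the starting node. The function names and the example graph are hypothetical.

```python
from collections import deque

def eccentricity(adj, v):
    """Largest geodesic distance from v to any node reachable from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

def diameter(adj):
    """Maximum eccentricity over all nodes."""
    return max(eccentricity(adj, v) for v in adj)

# Hypothetical undirected path graph 1 - 2 - 3 - 4.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
print([eccentricity(path, v) for v in path])  # [3, 2, 2, 3]
print(diameter(path))                         # 3
```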

Walks & Reachability

Connectedness

Checks whether the network is a connected graph, a weakly connected digraph or a disconnected graph/digraph.
In graph theory, a graph is connected if there is a path between every pair of nodes.
A digraph is strongly connected if there is a path from i to j and from j to i for all pairs of nodes (i,j).
A digraph is weakly connected if every pair of nodes is joined by a semipath.
A digraph or a graph is disconnected if at least one pair of nodes is not joined by any path or semipath, for example when a node is isolated.
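A rough sketch of such a check, assuming the standard graph-theory definitions: a digraph is strongly connected when every node reaches every other along directed paths, and weakly connected when the same holds after arc directions are ignored. The function names and the dict-of-successors input format are hypothetical.

```python
def reachable(adj, source):
    """Set of nodes reachable from source along the arcs in adj."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def connectedness(adj):
    """Classify a digraph given as {node: iterable of successors}."""
    nodes = set(adj)
    if all(reachable(adj, v) == nodes for v in nodes):
        return "strongly connected"
    # Ignore arc direction: every arc becomes a two-way link (semipaths).
    undirected = {v: set(adj[v]) for v in nodes}
    for v in nodes:
        for w in adj[v]:
            undirected[w].add(v)
    start = next(iter(nodes))
    if reachable(undirected, start) == nodes:
        return "weakly connected"
    return "disconnected"

print(connectedness({1: [2], 2: [3], 3: [1]}))  # strongly connected
print(connectedness({1: [2], 2: [3], 3: []}))   # weakly connected
print(connectedness({1: [2], 2: [], 3: []}))    # disconnected
```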

Walks of a given length

Clicking this option asks for a desired walk length (max: n-1). Then SocNetV calculates and displays a square matrix where each element (i,j) is the number of walks of the given length between the corresponding pair of nodes i and j.
A walk is a sequence of alternating vertices and edges such as v_0, e_1, v_1, e_2, v_2, ..., e_k, v_k, where each edge e_i is defined as e_i = {v_(i-1), v_i}.
This function calculates the number of walks of the given length between each pair of nodes, by studying the powers of the sociomatrix.
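To illustrate the matrix-power idea, the sketch below multiplies the sociomatrix by itself the requested number of times; element (i,j) of the result counts the walks of that length from i to j. The helper names and the triangle example are hypothetical, and plain Python lists are used instead of a matrix library.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def walks_of_length(adjacency, length):
    """Element (i, j) of adjacency**length counts walks of that length."""
    n = len(adjacency)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(length):
        result = mat_mul(result, adjacency)
    return result

# Hypothetical undirected triangle on nodes 0, 1, 2.
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
print(walks_of_length(triangle, 2))  # [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
```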

Total Walks

Calculates and displays a (n x n) square matrix whose elements denote the number of walks of any length between each pair of nodes. The algorithm is based on the powers of the sociomatrix.
Please note that this function is VERY SLOW on large networks (n > 50), since it will calculate all powers of the sociomatrix up to (n-1) in order to find out all possible walks.
If you only need a simple reachability test, we advise you to use the Reachability Matrix function instead.

Reachability Matrix

Calculates the reachability matrix X^R of the graph, where each (i,j) element is 1 if node j is reachable from node i, and 0 otherwise.
This function is based on the Distances Matrix; it checks whether the corresponding element of the Distances matrix is not zero. If it is not zero, then the nodes (i,j) are reachable and the X^R element is 1.
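The rule above can be sketched by computing distances with a breadth-first search and marking every pair with a non-zero distance. This is an illustration only; the function name and the example digraph are hypothetical, and under this rule the diagonal stays 0 because the distance from a node to itself is zero.

```python
from collections import deque

def reachability_matrix(adjacency):
    """X^R[i][j] = 1 if a path of length >= 1 leads from i to j, else 0."""
    n = len(adjacency)
    xr = [[0] * n for _ in range(n)]
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in range(n):
                if adjacency[u][w] and w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        for t, d in dist.items():
            if d != 0:          # non-zero distance: t is reachable from s
                xr[s][t] = 1
    return xr

# Hypothetical digraph 0 -> 1 -> 2 with an isolated node 3.
arcs = [[0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
print(reachability_matrix(arcs))
# [[0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```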

Clustering Coefficient

The Clustering Coefficient of a node quantifies how close the node and its neighbors are to being a clique. This is used to determine whether a network is a small-world or not.
This option calculates and displays the clustering coefficients of all nodes.
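As an illustration, the sketch below uses the common definition for undirected graphs: the number of links among a node's neighbours divided by the number of possible links among them. Whether SocNetV uses exactly this variant is an assumption; the function name and example graph are hypothetical.

```python
from itertools import combinations

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbours that are themselves linked."""
    neighbours = set(adj[v]) - {v}
    k = len(neighbours)
    if k < 2:
        return 0.0   # assumption: nodes with fewer than 2 neighbours score 0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

# Hypothetical graph: triangle 1-2-3 plus a pendant node 4 attached to 3.
g = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
for v in g:
    print(v, clustering_coefficient(g, v))
```

In this example nodes 1 and 2 sit inside a closed triangle, node 3 has only one of its three neighbour pairs linked, and node 4 has a single neighbour.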

Tip: All the basic network statistics, such as nodes, edges and density are displayed and automatically updated in the Analysis tab of the left dock in SocNetV main window.

Triad Census

By clicking the "Triad Census" menu option, SocNetV will examine each of the triads present in the current network, and count how many of these belong to a certain triad type.
Some background:

In any network of N actors, there are C(N,3) triads.
For instance, in a network of 6 actors there are C(6,3)=20 triads, whereas in a network of 10 actors there are C(10,3)=120 triads.

In any case, though, there can be only sixteen different triad types (isomorphism classes).
Every one of the C(N,3) triads of a network must belong (be isomorphic) to one of these sixteen types.
A Triad Census is a method which counts all the different types (classes) of observed triads within a network.
The triad types are coded and labeled according to their number of mutual, asymmetric and non-existent (null) dyads.
SocNetV follows the M-A-N labeling scheme, as described by Holland, Leinhardt and Davis in their studies.
In the M-A-N scheme, each triad type has a label with four characters:

The first character is the number of mutual (M) dyads in the triad. Possible values: 0, 1, 2, 3.
The second character is the number of asymmetric (A) dyads in the triad. Possible values: 0, 1, 2, 3.
The third character is the number of null (N) dyads in the triad. Possible values: 0, 1, 2, 3.
The fourth character is inferred from features or the nature of the triad, i.e. presence of cycle or transitivity. Possible values: none, D ("Down"), U ("Up"), C ("Cyclic"), T ("Transitive")
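The first three characters of a label can be derived by classifying the three dyads of a triad, as in the sketch below. It does not determine the fourth character (D, U, C, T), so it is not a full triad census. The function name and the small example digraph are hypothetical.

```python
from itertools import combinations
from math import comb

def man_counts(arcs, triad):
    """Count mutual, asymmetric and null dyads among three nodes."""
    m = a = n = 0
    for i, j in combinations(triad, 2):
        forward, backward = (i, j) in arcs, (j, i) in arcs
        if forward and backward:
            m += 1
        elif forward or backward:
            a += 1
        else:
            n += 1
    return m, a, n

# Hypothetical digraph on 4 actors, given as a set of arcs.
arcs = {(1, 2), (2, 1), (2, 3), (3, 4)}
for triad in combinations([1, 2, 3, 4], 3):
    print(triad, "%d%d%d" % man_counts(arcs, triad))
print(comb(4, 3))  # 4 triads in total
```

For this example the four triads carry the M-A-N prefixes 111, 102, 012 and 021.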

In the seven rows below, you can see all the sixteen triad types (classes).
Within each row, all the triad types have the same number of arcs present:

003
012
102     021D    021U    021C
111D    111U    030T    030C
201     120D    120U    120C
210
300

So, when you click on Triad Census menu option, SocNetV calculates and displays a vector T of length 16.
Each vector element (Tu) is the frequency of one triad type inside the active network, e.g. T003 = 3.
Furthermore, the order of the elements of vector T is the same as the aforementioned ordering of the triad types:

T = [ T003, T012, T102, T021D, T021U, T021C, T111D, T111U, T030T, T030C, T201, T120D, T120U, T120C, T210, T300 ]

By construction, the sum of all these frequencies Tu is C(N,3).

Centralities and Prestige

The last option in the Analysis menu opens the "Centrality and Prestige" sub-menu.

Social network analysts use various metrics (measures or indices) to calculate how 'prominent' or important each actor (node) is inside a network (graph). For instance, we might want to know how important a person is inside her friendship network or how critical a power station is inside the power company grid.

Although there are various metrics, focusing on different graph notions and applying to different graph types, they are usually referred to collectively as 'centralities'.

SocNetV follows the conceptualization of prominence that Wasserman and Faust as well as Knoke and Burt use in their essays. To our understanding, all indices attempt to measure the visibility, the importance or the "prominence" of each node. But we distinguish two types of prominence: Centrality and Prestige.

Centrality metrics attempt to quantify how central each node is inside the network and usually examine the ties attached to that node and its geodesic distances (shortest path lengths) to other nodes. Most Centrality indices were designed for undirected (symmetric) graphs, where the relations are non-directional. For instance, SocNetV can calculate Betweenness, Closeness, Degree, Stress, Graph and Eccentricity centrality indices.

For digraphs, where the relations are directional, most centrality indices can also be calculated by focusing on "choices made" (or outLinks). But due to the nature of directional relations, the social network researcher usually needs to measure the "prestige" (also known as status, rank or popularity) of each node, focusing on "choices received" from other nodes rather than "choices made" by that node. Prestige indices focus exactly on the "choices received" by a node. These indices measure the nominations or ties to each node from all others (or inLinks). Thus, Prestige indices can only be calculated on directed graphs.

Centrality indices are calculated for each node (node Centrality) and for the whole network (group Centrality). Thus, when you click on a centrality option, SocNetV will calculate the corresponding index of every node and of the whole network and display them in a new window (a small text editor). From there you can save the analysis into a text file of your choice. By default, analysis files are saved in the bin/ subfolder.

Degree Centrality (DC)

In undirected graphs, the DC index of each node v is the number of edges attached to it. In digraphs, the DC is the total number of arcs (outLinks) starting from v (outDegree).
This is often considered a measure of actor activity.

This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs. It can also be calculated in weighted graphs. In weighted relations, DC is the sum of weights of all edges/outLinks attached to v.
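In matrix terms, the degree index described above is a row sum of the (weighted) adjacency matrix. A minimal sketch, with a hypothetical function name and example matrix:

```python
def degree_centrality(weights):
    """DC of each node: sum of the weights on its outLinks (row sums)."""
    return [sum(row) for row in weights]

# Hypothetical weighted digraph: weights[i][j] is the weight of arc i -> j.
weights = [[0, 2, 1],
           [0, 0, 3],
           [1, 0, 0]]
print(degree_centrality(weights))  # [3, 3, 1]
```

For an unweighted network every weight is 1, so the same row sum is simply the number of edges or outLinks.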

Closeness Centrality (CC)

For each node v, CC is the inverse of the sum of the geodesic distances from that node to every other node. CC is interpreted as the ability to access information through the "grapevine" of network members. Nodes with high closeness centrality are those who can reach many other nodes in few steps.

This index can be calculated in graphs and strongly connected digraphs (that is, if there is a directed path from v to u for all nodes v and u in the graph). It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1.
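The CC definition can be sketched with a breadth-first search, treating every edge weight as 1 as stated above. The function name is hypothetical, and raising an error when some node cannot be reached is our choice for this sketch, not necessarily SocNetV's behaviour.

```python
from collections import deque

def closeness(adj, v):
    """CC(v) = 1 / (sum of geodesic distances from v to all other nodes)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(dist) < len(adj):
        raise ValueError("CC is undefined: some node is unreachable from v")
    total = sum(dist.values())
    return 1 / total if total else 0.0

# Hypothetical undirected star: one hub adjacent to three leaves.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
n = len(star)
print(closeness(star, "hub"))            # 1 / (N - 1), about 0.333
print((n - 1) * closeness(star, "hub"))  # about 1.0 after scaling by (N - 1)
print(closeness(star, "a"))              # 0.2
```

The hub is adjacent to all other nodes, so it attains the maximum value 1/(N-1), and multiplying by (N-1) scales it to 1.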

Undirected graphs

The maximum value of CC is 1/(N-1), when the node is adjacent to all others. Thus the standardized CC index (CC') is calculated as (N-1) * CC.

Group CC is calculated using Freeman's general formula:

GCC = ( Sum ( maxCC' - CC' ) ) / ( (N-1) * (N-2) / (2 * N - 3) )

Directed graphs (digraphs)

If the graph is directed, the standardized CC' is calculated as CC / sumCC.

Influence Range Closeness Centrality (IRCC)

For each node v, IRCC is the standardized inverse average distance between v and every other node reachable from v.
This improved CC index is optimized for graphs and directed graphs which are not strongly connected. Unlike the ordinary CC, which is the inverted sum of distances from node v to all others (thus undefined if a node is isolated or the digraph is not strongly connected), IRCC considers only distances from node v to nodes in its influence range J (nodes reachable from v).
The IRCC formula used is the ratio of the fraction of nodes reachable by v, |J|/(n-1), to the average distance of these nodes from v, sum(d(v,j))/|J|:

( |J| / (n-1) ) / ( sum( d(v,j) ) / |J| )
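The formula can be traced with a short sketch. It assumes unit edge weights and returns 0 for a node that reaches no other node; that convention, the function name and the example digraph are ours.

```python
from collections import deque

def ircc(adj, v):
    """(|J| / (n - 1)) / (sum of d(v, j) over J / |J|), J = nodes reachable from v."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    reach = len(dist) - 1            # |J|, excluding v itself
    if reach == 0:
        return 0.0                   # assumption: empty influence range scores 0
    total = sum(dist.values())       # sum of d(v, j) over J
    return (reach / (len(adj) - 1)) / (total / reach)

# Hypothetical digraph 0 -> 1 -> 2 with an isolated node 3.
g = {0: [1], 1: [2], 2: [], 3: []}
for v in g:
    print(v, ircc(g, v))
```

Node 0 reaches two of the three other nodes at an average distance of 1.5, giving (2/3)/(3/2) = 4/9; node 1 reaches one node at distance 1, giving 1/3.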

Betweenness Centrality (BC)

For each node v, BC is the ratio of all geodesics between pairs of nodes which run through v. It reflects how often a node lies on the geodesics between the other nodes of the network. It can be interpreted as a measure of control. A node which lies between many others is assumed to have a higher likelihood of being able to control information flow in the network.

Note that betweenness centrality assumes that all geodesics have equal weight or are equally likely to be chosen for the flow of information between any two nodes. This is reasonable only on "regular" networks where all nodes have similar degrees. On networks with significant degree variance you might want to try Information Centrality instead.

This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs. It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1.

Stress Centrality (SC)

For each node v, SC is the total number of geodesics between all other nodes which run through v. When one node falls on all other geodesics between all the remaining (N-1) nodes, then we have a star graph with maximum Stress Centrality.

This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs. It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1.
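Both Betweenness and Stress Centrality can be sketched from geodesic distances and geodesic counts: v lies on a geodesic between s and t exactly when d(s,v) + d(v,t) = d(s,t), and the number of such geodesics is sigma(s,v) * sigma(v,t). The code below assumes an undirected graph with unit edge weights, counts each unordered pair once and returns unnormalised values; these conventions and all names are assumptions, not SocNetV's implementation.

```python
from collections import deque
from itertools import combinations

def geodesic_counts(adj, source):
    """Return (dist, sigma): distances and number of geodesics from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness_and_stress(adj):
    """Return (BC, SC) dicts for an undirected graph {node: neighbours}."""
    info = {s: geodesic_counts(adj, s) for s in adj}
    bc = {v: 0.0 for v in adj}
    sc = {v: 0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue                      # no geodesic joins s and t
        for v in adj:
            if v in (s, t) or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            # v is on an s-t geodesic iff d(s,v) + d(v,t) = d(s,t)
            if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                through = sigma_s[v] * sigma_v[t]
                sc[v] += through                  # geodesics through v
                bc[v] += through / sigma_s[t]     # share of s-t geodesics
    return bc, sc

# Hypothetical 4-cycle a - b - c - d - a.
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
bc, sc = betweenness_and_stress(cycle)
print(bc["b"], sc["b"])  # 0.5 1
```

In the 4-cycle each node carries one of the two geodesics between its two neighbours, so its stress is 1 while its betweenness is 1/2.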

Eccentricity Centrality (EC)

For each node v, the Eccentricity Centrality is the largest geodesic distance d(v,u) from v to any other node u. Therefore, EC(v) reflects how far, at most, each node is from every other node.

This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs. It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1.

Power Centrality (PC)

The Power Centrality (PC) is a centrality measure suggested by Gil and Schmidt.

For each node v, this index sums its degree (with weight 1), the size of the 2nd-order neighbourhood (with weight 1/2) and, in general, the size of the kth-order neighbourhood (with weight 1/k).

Thus, for each node in the network the most important other nodes are its immediate neighbours and then in decreasing importance the nodes of the 2nd-order neighbourhood, 3rd-order neighbourhood etc. For each node, the sum obtained is normalised by the total numbers of nodes in the same component minus 1.

This index can be calculated in both graphs and digraphs but is usually best suited for undirected graphs. It can also be calculated in weighted graphs although the weight of each edge (v,u) in E is always considered to be 1 (therefore not considered).
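A small sketch of this index, under the assumption that the kth-order neighbourhood is weighted by 1/k so that importance decreases with distance, as in Gil and Schmidt's formulation. The function name and example graph are hypothetical, and edge weights are ignored.

```python
from collections import deque

def power_centrality(adj, v):
    """Sum of 1/d(v, u) over nodes u in v's component, divided by (size - 1)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    others = [d for u, d in dist.items() if u != v]
    if not others:
        return 0.0   # assumption: an isolated node scores 0
    return sum(1 / d for d in others) / len(others)

# Hypothetical undirected path graph 1 - 2 - 3 - 4.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
for v in path:
    print(v, round(power_centrality(path, v), 3))
```

For the end node 1 the sum is 1 + 1/2 + 1/3 over 3 other nodes, i.e. 11/18; the inner node 2 scores (1 + 1 + 1/2)/3 = 5/6.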

Information Centrality (IC)

The Information Centrality (IC) is an index suggested by Stephenson and Zelen (1989) which focuses on how information might flow through many different paths. IC counts all paths between nodes weighted by strength of tie and distance.

Since there is no known generalization of Stephenson & Zelen's theory for information centrality to directional relations, the index should be calculated only for graphs and is more meaningful in weighted graphs/networks.

Note: To compute this index, SocNetV drops all isolated nodes.

In order to calculate the IC of each actor, we create a N x N matrix A with:

Aii = 1 + (weighted degree of node i)
Aij = 1 if there is no edge between i and j
Aij = 1 - wij if the edge (i,j) has weight wij

Next, we compute the inverse matrix of A, which we call C. Note that we can always compute C since the matrix A is always a diagonally strong matrix, hence it is always invertible.

Finally, IC is computed by the formula:

IC(i) = 1 / [ Cii + (T - 2R) / N ]

where:
T is the trace of matrix C (the sum of diagonal elements) and R is the sum of the elements of any row (since all rows of C have the same sum)

IC has a minimum value but not a maximum.
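The steps above (build A, invert it, apply the IC formula) can be traced with the following sketch. It uses a plain Gauss-Jordan inversion on Python lists so that it needs no external library; the function names and the 3-node example are hypothetical, and the input is assumed to be a symmetric weight matrix of a connected graph without isolated nodes.

```python
def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (plain Python lists)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def information_centrality(weights):
    """IC(i) = 1 / (C_ii + (T - 2R) / N) for a symmetric weight matrix."""
    n = len(weights)
    # A_ii = 1 + weighted degree; A_ij = 1 - w_ij (which is 1 when no edge).
    a = [[1 + sum(weights[i]) if i == j else 1 - weights[i][j]
          for j in range(n)] for i in range(n)]
    c = invert(a)
    t = sum(c[i][i] for i in range(n))   # trace of C
    r = sum(c[0])                        # sum of one row of C
    return [1 / (c[i][i] + (t - 2 * r) / n) for i in range(n)]

# Hypothetical unweighted path graph 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
print([round(x, 3) for x in information_centrality(path)])  # [1.0, 1.5, 1.0]
```

The middle node of the path receives the highest value, as expected for the actor through which all information must pass.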

Degree Prestige (DP)

For each node v, this is the number of arcs ending at v. Nodes with higher in-degree are considered more prominent among others. In directed graphs, this index measures the prestige of each node/actor. Thus it is called Degree Prestige. Nodes who are prestigious tend to receive many nominations or choices (in-links). The larger the index is, the more prestigious the node is.

This index can be calculated only for digraphs. In weighted relations, DP is the sum of weights of all arcs/inLinks ending at node v.
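In matrix terms, Degree Prestige is a column sum of the (weighted) adjacency matrix. A minimal sketch with a hypothetical function name and example matrix:

```python
def degree_prestige(weights):
    """DP of each node: sum of the weights on its inLinks (column sums)."""
    n = len(weights)
    return [sum(weights[i][j] for i in range(n)) for j in range(n)]

# Hypothetical weighted digraph: weights[i][j] is the weight of arc i -> j.
weights = [[0, 2, 1],
           [0, 0, 3],
           [1, 0, 0]]
print(degree_prestige(weights))  # [1, 2, 4]
```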

PageRank Prestige (PRP)

An importance ranking for each node based on the link structure of the network. PageRank, developed by Page and Brin (1997), focuses on how nodes are connected to each other, treating each link from a node as a citation/backlink/vote to another. In essence, for each node PageRank counts all backlinks to it, but it does not count all links equally: each link from a node is normalized by the total number of links from that node. PageRank is calculated iteratively and it corresponds to the principal eigenvector of the normalized link matrix.

This index can be calculated in both graphs and digraphs but is usually best suited for directed graphs since it is a prestige measure. It can also be calculated in weighted graphs. In weighted relations, each backlink to a node v from another node u is considered to have weight=1 but it is normalized by the sum of outLinks weights (outDegree) of u. Therefore, nodes with high outLink weights give smaller percentage of their PR to node v.
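The iterative calculation can be sketched as a simple power iteration on an unweighted digraph. The damping factor 0.85 is the commonly quoted default and the even redistribution of rank from nodes without outLinks is one usual convention; both are assumptions here, as are the function name and the fixed number of iterations. SocNetV's actual parameters may differ.

```python
def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on {node: list of successors}."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outLinks is spread evenly (assumption).
        dangling = sum(rank[v] for v in nodes if not adj[v])
        new = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for u in nodes:
            for v in adj[u]:
                # Each outLink of u passes on an equal share of u's rank.
                new[v] += damping * rank[u] / len(adj[u])
        rank = new
    return rank

# Hypothetical digraph: a and b both point to c, c points back to a.
web = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(web)
print(max(ranks, key=ranks.get))      # c
print(round(sum(ranks.values()), 6))  # 1.0
```

Node c receives two backlinks and ends up with the highest rank, while b, which nobody links to, keeps only the baseline share.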

Proximity Prestige (PP)

This index measures how proximate a node v is to the nodes in its influence domain I (the influence domain I of a node is the number of other nodes that can reach it). In PP calculation, proximity is based on distances to rather than distances from node v. To put it simply, in PP what matters is how close all the other nodes are to node v.

The algorithm takes the average distance to node v of all nodes in its influence domain, standardizes it by multiplying with (N-1)/I and takes its reciprocal. In essence, the formula SocNetV uses to calculate PP for every node v is the ratio of the fraction of nodes that can reach node v to the average distance of those nodes to v:

PP = (I/(N-1))/(sum{d(u,v)}/I)

where the sum is over all nodes in I.
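The PP formula can be traced with a sketch that reverses every arc and then runs a breadth-first search from v, so that the search finds the nodes able to reach v together with their distances to v. Unit edge weights are assumed; returning 0 for a node with an empty influence domain is our convention, and all names are hypothetical.

```python
from collections import deque

def proximity_prestige(adj, v):
    """PP(v) = (I / (N - 1)) / (sum of d(u, v) over the influence domain / I)."""
    # Reverse every arc so that a BFS from v finds the nodes that can reach v.
    reverse = {u: [] for u in adj}
    for u in adj:
        for w in adj[u]:
            reverse[w].append(u)
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        for w in reverse[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    size = len(dist) - 1             # I, the size of the influence domain
    if size == 0:
        return 0.0                   # assumption: nobody reaches v, so PP is 0
    return (size / (len(adj) - 1)) / (sum(dist.values()) / size)

# Hypothetical digraph 0 -> 1 -> 2 with an isolated node 3.
g = {0: [1], 1: [2], 2: [], 3: []}
for v in g:
    print(v, proximity_prestige(g, v))
```

Node 2 can be reached by nodes 0 and 1 at distances 2 and 1, so PP = (2/3)/(3/2) = 4/9, whereas node 0 receives no choices at all and scores 0.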