Assigning the industrial sectors to the nodes of a network, we then join up the nodes via edges with lengths given by the correlation distance. In the referenced studies, the authors compute the minimum spanning tree (MST) of this graph. The MST is a subgraph connecting all nodes in which the total edge length is minimized. In this context, we can think of the MST as the "backbone" network encapsulating the dependencies between sectors. In MATLAB, we can use the graph and minspantree functions to compute the MST directly from sectorDist, the correlation distance matrix, and sectors, a cell array containing the sector names.

There are several options for visualizing the resulting MST. The simplest approach is to use the graph plot function directly on the tree. An alternative approach providing a 3D representation is to use nonlinear multidimensional scaling on the distance matrix via the Sammon mapping:

    coords = mdscale(sectorDist, 3, 'Criterion', 'Sammon');

To create a 3D visualization, we pass the resulting array of Euclidean coordinates coords to the graph class plot method. When combined with some simple plot settings, this approach allows us to create the visualization shown in Figure 1. We assess the quality of the graph embedding by creating a Shepard plot (Figure 5). If the scattered points do not deviate substantially from the reference line, then we have a good quality embedding.

Representing the data in graph form helps us to quantify relationships between variables. Since mathematical graphs intrinsically provide measures of node significance, we can assess sector importance by computing these quantities. We use the centrality method associated with graph objects:

    closeness = centrality(T, 'closeness', 'Cost', edgeWeights);
    betweenness = centrality(T, 'betweenness', 'Cost', edgeWeights);

The incidence of a node counts the number of edges adjoining that node, whereas the closeness is the reciprocal of the sum of the distances from the node to all other nodes. A high closeness value therefore implies that the node is central, or important. Similarly, the betweenness of a node measures how often that node appears on a shortest path between two other nodes in the graph. A high betweenness value implies that the node is important.

Computing these measures for the MST above produces the results shown in Figure 6. We see that the services sector has the highest values for all measures. The central services node is highlighted in red in Figure 1.

This result is only achieved if the mutual information content of \(X\) and \(Y\) is zero, which, in turn, only occurs if \(X\) and \(Y\) are independent. If \(X\) and \(Y\) are closely related, then their mutual information content is large, and consequently, the distance is close to zero. We can estimate the joint density of two random variables in MATLAB using the Statistics and Machine Learning Toolbox function ksdensity.
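
The MST and centrality computations described in this section can be sketched end to end as follows. This is a minimal illustration, not the original analysis. The data are synthetic, and every sector name other than the services sector is hypothetical. The correlation distance is assumed to take the common form sqrt(2(1 - rho)), which this section does not state. The 'degree' centrality call is an assumed way of computing the incidence. The sketch uses only base MATLAB functions (R2016b or later for the implicit expansion).

```matlab
% Hypothetical stand-in data: 250 observations of 5 synthetic "sector" series.
rng(0);                                    % fixed seed for reproducibility
sectors = {'Services', 'Industry', 'Construction', 'Agriculture', 'Trade'};
common  = randn(250, 1);                   % shared factor so the series are correlated
returns = 0.6 * common + randn(250, numel(sectors));

% Assumed correlation distance: d = sqrt(2 * (1 - rho)).
rho        = corrcoef(returns);
sectorDist = sqrt(2 * max(0, 1 - rho));    % guard against tiny negative round-off
sectorDist = (sectorDist + sectorDist.') / 2;      % enforce exact symmetry
sectorDist(1:numel(sectors)+1:end) = 0;            % zero self-distances (no self-loops)

% Build the weighted graph and extract its minimum spanning tree.
G = graph(sectorDist, sectors);
T = minspantree(G);
edgeWeights = T.Edges.Weight;

% Node importance measures on the tree.
incidence   = centrality(T, 'degree');     % assumed call for the "incidence" measure
closeness   = centrality(T, 'closeness', 'Cost', edgeWeights);
betweenness = centrality(T, 'betweenness', 'Cost', edgeWeights);

% Tabulate the measures by sector.
results = table(sectors(:), incidence, closeness, betweenness, ...
    'VariableNames', {'Sector', 'Incidence', 'Closeness', 'Betweenness'});
disp(results)
```

Because all off-diagonal distances are positive, the graph is complete, so the tree connects the five nodes with four edges. With real data, sectorDist and sectors would replace the synthetic inputs and the rest of the sketch would stay the same.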