6.4. Degree

Figure 6.1: PMF of degree in the Facebook dataset and in the WS model.
If the WS graph is a good model for the Facebook network, it should have the same average degree across nodes, and ideally the same variance in degree.
This function returns a list of degrees in a graph, one for each node:
def degrees(G):
    return [G.degree(u) for u in G]
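The mean degree in the model is 44, which is close to the mean in the dataset, about 43.7.
However, the standard deviation of degree in the model is about 1.5, which is far from the standard deviation in the dataset, about 52.4.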
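As a rough sketch, these summary statistics can be computed directly from the list that degrees returns. The WS parameters below (1000 nodes, 44 neighbors per node, rewiring probability 0.05, and the seed) are assumptions chosen for illustration, as is the name ws_example; they are not the graph used in the text.

```python
import statistics

import networkx as nx


def degrees(G):
    """Return a list of degrees, one for each node in G."""
    return [G.degree(u) for u in G]


# Assumed parameters, for illustration only (not taken from the text).
ws_example = nx.watts_strogatz_graph(n=1000, k=44, p=0.05, seed=17)

ds = degrees(ws_example)
mean_degree = statistics.mean(ds)
std_degree = statistics.pstdev(ds)  # population standard deviation

print(mean_degree, std_degree)
```

Rewiring moves edges without adding or removing any, so the mean degree stays at k; only the spread changes, and with a small rewiring probability it stays narrow.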
What’s the problem? To get a better view, we have to look at the distribution of degrees, not just the mean and standard deviation.
We will represent the distribution of degrees with a Pmf object, which is defined in the thinkstats2 module. Pmf stands for “probability mass function”.
Briefly, a Pmf maps from values to their probabilities. A Pmf of degrees is a mapping from each possible degree, d, to the fraction of nodes with degree d.
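To make the mapping concrete, here is a minimal sketch that builds the same kind of mapping as a plain dictionary, using only the standard library. The function name pmf_from_list and the sample list are hypothetical; this is not how thinkstats2 implements Pmf.

```python
from collections import Counter


def pmf_from_list(values):
    """Map each distinct value to the fraction of items equal to it."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in sorted(counts.items())}


# Hypothetical list of degrees, for illustration only.
example_pmf = pmf_from_list([1, 2, 2, 4])
print(example_pmf)  # {1: 0.25, 2: 0.5, 4: 0.25}
```

The fractions sum to 1, which is what lets us read them as probabilities.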
As an example, we construct a graph with nodes 1, 2, and 3 connected to a central node, 0:
G = nx.Graph()
G.add_edge(1, 0)
G.add_edge(2, 0)
G.add_edge(3, 0)
nx.draw(G)
Here’s the list of degrees in this graph:
>>> degrees(G)
[3, 1, 1, 1]
Node 0 has degree 3; the others have degree 1. Now we can make a Pmf that represents this degree distribution:
>>> from thinkstats2 import Pmf
>>> Pmf(degrees(G))
Pmf({1: 0.75, 3: 0.25})
The result is a Pmf object that maps from each degree to a fraction or probability. In this example, 75% of the nodes have degree 1 and 25% have degree 3.
Now we can make a Pmf that contains node degrees from the dataset, and compute the mean and standard deviation:
>>> from thinkstats2 import Pmf
>>> pmf_fb = Pmf(degrees(fb))
>>> pmf_fb.Mean(), pmf_fb.Std()
(43.691, 52.414)
And the same for the WS model:
>>> pmf_ws = Pmf(degrees(ws))
>>> pmf_ws.Mean(), pmf_ws.Std()
(44.000, 1.465)
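For reference, the mean and standard deviation of a PMF follow from the usual definitions: the mean is the probability-weighted sum of the values, and the variance is the probability-weighted sum of squared deviations from the mean. The sketch below applies those definitions to a plain dictionary; the helper names pmf_mean and pmf_std are hypothetical, and this is not a claim about how thinkstats2 computes its results.

```python
import math


def pmf_mean(pmf):
    """Probability-weighted mean of a {value: probability} mapping."""
    return sum(p * x for x, p in pmf.items())


def pmf_std(pmf):
    """Standard deviation computed from the same mapping."""
    mu = pmf_mean(pmf)
    var = sum(p * (x - mu) ** 2 for x, p in pmf.items())
    return math.sqrt(var)


# Degree distribution of the four-node example graph above.
star_pmf = {1: 0.75, 3: 0.25}
star_mean = pmf_mean(star_pmf)
star_std = pmf_std(star_pmf)
print(star_mean, star_std)
```

For the four-node example, the mean is 1.5 and the variance is 0.75, so the standard deviation is about 0.87.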
We can use the thinkplot module to plot the results:
thinkplot.Pdf(pmf_fb, label='Facebook')
thinkplot.Pdf(pmf_ws, label='WS graph')
Figure 6.1 shows the two distributions. They are very different.
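In the WS model, most users have about 44 friends, and there is little variation: the standard deviation is only about 1.5. In the dataset, the degrees are spread far more widely, with a standard deviation of about 52.4.
Distributions like the one in the dataset, with many small values and a few very large values, are called heavy-tailed.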
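One informal way to see this pattern in a list of degrees is to compare the median, the mean, and the maximum: a few very large values pull the mean well above the median. The sketch below uses a made-up list of degrees purely for illustration; it is not data from the Facebook dataset, and the comparison is a rough check rather than a formal test for heavy tails.

```python
import statistics

# Hypothetical degrees: mostly small values plus one very large one.
toy_degrees = [1, 1, 2, 2, 3, 3, 4, 5, 8, 250]

toy_median = statistics.median(toy_degrees)
toy_mean = statistics.mean(toy_degrees)
toy_max = max(toy_degrees)

# The single large value dominates the mean but barely moves the median.
print(toy_median, toy_mean, toy_max)
```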