Complex network growth across diverse fields of science is hypothesized to be driven in the main by a combination of preferential attachment and node fitness processes. For measuring the respective influences of these processes, previous approaches make strong and untested assumptions on the functional forms of either the preferential attachment function or fitness function or both. We introduce a Bayesian statistical method called PAFit to estimate preferential attachment and node fitness without imposing such functional constraints that works by maximizing a log-likelihood function with suitably added regularization terms. We use PAFit to investigate the interplay between preferential attachment and node fitness processes in a Facebook wall-post network. While we uncover evidence for both preferential attachment and node fitness, thus validating the hypothesis that these processes together drive complex network evolution, we also find that node fitness plays the bigger role in determining the degree of a node. This is the first validation of its kind on real-world network data. But surprisingly the rate of preferential attachment is found to deviate from the conventional log-linear form when node fitness is taken into account. The proposed method is implemented in the R package PAFit.
Preferential attachment is a stochastic process that has been proposed to explain certain topological features characteristic of complex networks from diverse domains. The systematic investigation of preferential attachment is an important area of research in network science, not only for the theoretical matter of verifying whether this hypothesized process is operative in real-world networks, but also for the practical insights that follow from knowledge of its functional form. Here we describe a maximum likelihood based estimation method for the measurement of preferential attachment in temporal complex networks. We call the method PAFit, and implement it in an R package of the same name. PAFit constitutes an advance over previous methods primarily because we based it on a nonparametric statistical framework that enables attachment kernel estimation free of any assumptions about its functional form. We show this results in PAFit outperforming the popular methods of Jeong and Newman in Monte Carlo simulations. What is more, we found that the application of PAFit to a publically available Flickr social network dataset yielded clear evidence for a deviation of the attachment kernel from the popularly assumed log-linear form. Independent of our main work, we provide a correction to a consequential error in Newman’s original method which had evidently gone unnoticed since its publication over a decade ago.