Abstract - Data Mining
Extracting insight from large networks: implications of small-scale and large-scale structure
Stanford University
Recent empirical work has demonstrated that large social and information networks often contain meaningful "small-scale" structure (e.g., clustering structure around a single individual at a size scale of roughly 100 individuals). Analogous "large-scale" structure (e.g., meaningful or statistically significant properties of tens or hundreds of thousands of individuals), however, either is lacking entirely or takes a form that is extremely difficult for traditional machine learning and data analysis tools to identify reliably. For example, there are often small clusters that provide a "bottleneck" to diffusions (e.g., diffusion-based dynamic processes of the kind of interest in viral marketing applications and tipping-point models of network dynamics). By contrast, there are typically no large clusters with analogous bottlenecks, so diffusion-based metrics (and the associated machine learning and data analysis tools) are much less meaningful (or discriminative or useful) when the goal is to analyze the network at large size scales. This empirical work will be briefly reviewed, and its implications for extracting insight from large networks with popular machine learning and data analysis tools will be discussed.