AIIS | Artificial Intelligence & Information Systems

Abstract - Data Mining
Extracting insight from large networks: implications of small-scale and large-scale structure

Michael Mahoney, Stanford University

Recent empirical work has demonstrated that, although meaningful "small-scale" structure often exists in large social and information networks (e.g., clustering structure around a single individual at the size scale of roughly 100 individuals), analogous "large-scale" structure (e.g., meaningful or statistically significant properties of tens or hundreds of thousands of individuals) either is lacking entirely or takes a form that is extremely difficult for traditional machine learning and data analysis tools to identify reliably. For example, there are often small clusters that provide a "bottleneck" to diffusions (e.g., diffusion-based dynamic processes of the kind of interest in viral marketing applications and tipping-point models of network dynamics). On the other hand, there are typically no large clusters with analogous bottlenecks, and thus diffusion-based metrics (and the associated machine learning and data analysis tools) are much less meaningful, discriminative, or useful if one is interested in analyzing the network at large size scales. This empirical work will be briefly reviewed, and its implications for extracting insight from large networks with popular machine learning and data analysis tools will be discussed.

Bio:
Michael Mahoney is at Stanford University. His research interests center around algorithms for very large-scale statistical data analysis, including both theoretical and applied aspects of problems in scientific and Internet domains. His current research interests include geometric network analysis; developing approximate computation and regularization methods for large informatics graphs; and applications to community detection, clustering, and information dynamics in large social and information networks. He has also worked on randomized matrix algorithms and their applications to genetics, medical imaging, and Internet problems. He has been a faculty member at Yale University and a researcher at Yahoo, and his PhD was in computational statistical mechanics at Yale University.