The Mimir Project Page: What Drives the Dynamics of Science?


Our multidisciplinary project draws upon sociology, computer science, and linguistics to study how ideas are created and propagate through scientific communities, how these communities form and change over time, and how multidisciplinary networks spanning these communities shape scientific innovation. We are creating sophisticated new computational models for extracting and representing ideas and measuring their impact and novelty, and for extracting and representing social relations and identifying forms of multidisciplinary collaboration. Our methods integrate the network analytic tools of social science with the language processing tools of computer science. We use network analysis to improve the ability of computational tools to identify ideas in scientific texts, and we use the tools of computational linguistics to help explain the co-evolution of scientific collaborations and innovations.

We are using our models of ideas and their diffusion to investigate hypotheses such as whether multidisciplinary research accelerates or decelerates scientific innovation, and how multidisciplinarity influences student and faculty careers. We combine a shallow, large-scale study of knowledge corpora (ISI Web of Knowledge, ProQuest dissertations, NSF/NIH grants, US Patent Office, Federal committees) with a richer organization-level study of Stanford University (its publications, grants, affiliations, advising, teaching, etc.) in order to explore and analyze the complex interrelationships of innovation and multidisciplinary collaboration.

Our research agenda is to produce new and unique data, create new computational tools, and extend theory so that scholars change their conceptions of scientific innovation, multidisciplinarity, and research communities more generally. Our integration of social network and natural language processing techniques is helping to develop a new vein of research in computational social science, simultaneously offering empirical rigor and scale to the sociology of science and extending natural language processing from its previous engineering focus toward true explanatory social science models.


More information

You can find additional and more up-to-date information at this site:


The Stanford Topic Modeling Toolbox


Conference Presentations

Testset Collection

If you have papers that are contained in the ISI Web of Knowledge, please consider providing your publication information to help us match authors to publications.