SICSS Debrief

Kivan Polimis, Mon 17 July 2017, Review


Computational approaches to health, especially approaches harnessing "big data", offer researchers emerging methods and novel data for understanding social inequalities. Because big data and computational methods are still new to social science research, the field is a Wild West frontier of ethical dilemmas and research strategies assembled from multiple disciplines (Salganik 2017). Fittingly, the recent intersection of big data methods and social science research has left most of the history of computational social science online. For instance, Google Trends data show the meteoric rise of searches for big data and related terms (machine learning and computational social science) from 2012 to now. See the interactive plot for more detail on Google searches for these terms since 2004.

Computational social science employs computational approaches, such as cryptography and machine learning algorithms like random forests, to analyze, simulate, and model behavioral phenomena. Computational methods also provide large-scale access to unstructured data types such as text, images, and audio that previously eluded social science research. As a field, computational social science balances the inherent complexity of combining disciplines that describe social phenomena with tools from the natural and artificial worlds of engineering and computer science.

Summer Institute in Computational Social Science

June 17th to July 2nd

The Russell Sage Foundation, in conjunction with Princeton University, sponsored the first-ever Summer Institute in Computational Social Science (SICSS) to expose Ph.D. students, postdoctoral researchers, and untenured faculty within 7 years of their Ph.D. to the growing toolkit of computational social science. Led by Matthew Salganik and Chris Bail, the institute focused on text as data, website scraping, digital field experiments, non-probability sampling, mass collaboration, and ethics. Guest speakers and workshop leaders came from both industry and academia and included Gary King from Harvard, Winter Mason from Facebook, and Markus Mobius from Microsoft Research. The exposure to this breadth of personal experiences, academic backgrounds, and workflows will continue to influence my work.

I particularly enjoyed the sessions on mass collaboration, through the Fragile Families Challenge, and on digital field experiments. The Fragile Families Challenge is a first-of-its-kind effort to meld social science research with the competitive innovation of data science. It allows researchers to use the longitudinal Fragile Families and Child Wellbeing Study to predict six key outcomes (GPA, "grit", material hardship, eviction, layoff, and job training) for adolescents and their households at age 15. Our group drew on lessons from another day of the institute, digital field experiments, and used the Wiki Survey platform to gather input from authors of Fragile Families studies and from Amazon Mechanical Turk workers. We used that input to prune the predictive model, reducing the candidate variables from thousands to a few dozen predictors. I appreciated being part of an institute that united a group of researchers straddling data science and social science, live-streamed lectures for the public, and provided the (personal and professional) scaffolding for future collaboration in computational social science.
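To make the pruning idea concrete, here is a minimal sketch of how survey input could narrow a long list of candidate variables. It is an illustration only, not our group's actual code. It assumes the wiki survey responses arrive as pairwise votes (one variable chosen over another), the variable names are made up, and a simple win rate stands in for whatever scoring the platform itself uses.

```python
from collections import defaultdict


def rank_by_win_rate(votes):
    """Score each variable by the share of pairwise contests it won."""
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in votes:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    scores = {name: wins[name] / appearances[name] for name in appearances}
    # Sort by score (descending), then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def shortlist(votes, k):
    """Keep the k variables that respondents preferred most often."""
    return [name for name, _ in rank_by_win_rate(votes)[:k]]


# Hypothetical votes: each tuple is (chosen variable, rejected variable).
votes = [
    ("mother_education", "household_size"),
    ("mother_education", "prior_eviction"),
    ("prior_eviction", "household_size"),
    ("household_income", "household_size"),
    ("household_income", "prior_eviction"),
    ("mother_education", "household_income"),
]

selected = shortlist(votes, k=2)
print(selected)
```

The shortlisted names would then be the only columns passed to the predictive model, which is how a space of thousands of variables could shrink to a few dozen.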

SICSS Group Picture
courtesy of Chris Bail's Twitter:


Salganik, Matthew J. 2017. Bit by Bit: Social Research in the Digital Age. Princeton University Press.