Kivan Polimis
Princeton • University of Washington • Data Science • Machine Learning • Population Research
I’m a data scientist and ML engineer. At Karna I build systems that inform public policy decisions. I consult independently through Atlas Analytics and hold a Regional Affiliate position at the University of Washington’s Center for the Study of Demography and Ecology.
My background is in demography and causal inference, and I spend most of my time building production ML systems. That combination tends to pull me toward problems where the statistical question and the engineering constraint are both hard: predicting patient risk in ways that hold up across demographic groups, building sports models that can update within hours of a roster change, classifying financial transactions where the rare categories carry the highest cost of error.

Selected Work
- NBA analytics pipeline (private): End-to-end MLOps on AWS for real-time NBA performance prediction. The problem is non-stationarity. A player injury announced at 10 PM changes every subsequent game’s probability distribution. Monthly retraining isn’t fast enough.
- forest-confidence-interval (open source): One of the original developers of scikit-learn-contrib/forest-confidence-interval and author of the JOSS paper. The package brings Stefan Wager’s infinitesimal jackknife approach for estimating the variance of random forest predictions to Python’s scikit-learn ecosystem. The method was developed in his 2014 JMLR paper with Hastie and Efron and first implemented in the randomForestCI R package.
- Paratransit routing (open source): Member of a Data Science for Social Good team (five students and two data scientists) building routing systems for riders with disabilities. The constraint set is harder than standard routing: wheelchair requirements, medical time windows, and a population that can’t easily rebook if the system gets it wrong.
- Financial transaction classification (open source): Fine-tuned BERT with a custom weighted loss function to categorize noisy financial text (truncated merchant names, payment processor prefixes). Class imbalance is the core problem: Legal Services transactions are 0.3% of volume but carry the highest cost of error.
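
To illustrate the class-imbalance point in the last item, here is a minimal sketch of a class-weighted cross-entropy in plain Python. It is not the project's actual loss function. It assumes inverse-frequency weights (a common heuristic) and a weighted-mean reduction. The "Groceries" category, the 1,000-transaction sample, and the function names are hypothetical and chosen only so that Legal Services makes up 0.3% of the labels.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class).

    probs is a list of dicts mapping class name -> predicted probability.
    The sum is normalised by the total weight, so uniform weights give
    the ordinary mean cross-entropy.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)


# Hypothetical sample: 3 of 1,000 transactions (0.3%) are Legal Services.
labels = ["Groceries"] * 997 + ["Legal Services"] * 3

# A lazy model that almost always predicts the majority class.
probs = [{"Groceries": 0.99, "Legal Services": 0.01} for _ in labels]

weights = inverse_frequency_weights(labels)
uniform = {cls: 1.0 for cls in weights}

unweighted_loss = weighted_cross_entropy(probs, labels, uniform)
weighted_loss = weighted_cross_entropy(probs, labels, weights)
print(f"unweighted: {unweighted_loss:.4f}  weighted: {weighted_loss:.4f}")
```

Under the unweighted loss the lazy model looks nearly perfect, because the three rare examples barely register. With inverse-frequency weights each class contributes equally to the total, so the same model is penalised heavily for missing the rare class. In practice the weights could also encode the business cost of each error type instead of frequency alone.
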
Writing
Technical Articles → Reproducible analyses, tutorials, and deep dives in Python and R.
Blog → Notes, reviews, and shorter pieces.
Contact: kivan.polimis@gmail.com