Kivan Polimis
Princeton • University of Washington • Population Research and Policy Review • PNAS • JOSS
I’m a data scientist and ML engineer. At Karna I build systems that inform public policy decisions. I consult independently through Atlas Analytics and hold a Regional Affiliate position at the University of Washington’s Center for the Study of Demography and Ecology.
My background is in demography and causal inference, and I spend most of my time building production ML systems. That combination tends to pull me toward problems where the statistical question and the engineering constraint are both hard: predicting patient risk in ways that hold up across demographic groups, building sports models that can update within hours of a roster change, classifying financial transactions where the rare categories carry the highest cost of error.

Selected Work
- Healthcare documentation (private): Built a multi-agent LLM framework using RAG to automate clinical documentation. Pilot reduced provider administrative load by 40%. The harder part was the privacy architecture: the system processes notes about real patients.
- NBA analytics pipeline (private): End-to-end MLOps on AWS for real-time NBA performance prediction. The problem is non-stationarity. A player injury announced at 10 PM changes every subsequent game’s probability distribution. Monthly retraining isn’t fast enough.
- forest-confidence-interval (open source): Contributor to scikit-learn-contrib/forest-confidence-interval and author of the JOSS paper. Random forests give point predictions without a variance estimate. This package adds one, so predictions can be reported with confidence intervals.
- Paratransit routing (open source): Technical lead for Data Science for Social Good, building routing systems for riders with disabilities. The constraint set is harder than standard routing: wheelchair requirements, medical time windows, and a population that can’t easily rebook if the system gets it wrong.
- Financial transaction classification: Fine-tuned BERT with a custom weighted loss function to categorize noisy financial text (truncated merchant names, payment processor prefixes). Class imbalance is the core problem. Legal Services transactions are 0.3% of volume but carry the highest cost of error.
Writing
Technical Articles → Reproducible analyses, tutorials, and deep dives in Python and R.
Blog → Notes, reviews, and shorter pieces.
Contact: kivan.polimis@gmail.com