About

Experience

Principal Data Scientist | Dun & Bradstreet | 2022 – Present

  • Build LLM-powered structured extraction pipelines for financial documents and filings
  • Led development of an internal geospatial library on PySpark using Uber H3 for scalable geographic indexing
  • Designed and deployed a multiclass text classification system for 100M+ global cargo descriptions, improving Harmonized System (HS) code accuracy by 30+ percentage points
  • Led architecture of foreign influence risk analytics across 200M+ global entities, enabling custom logic via a Python/PySpark rules engine

Senior Data Scientist | Morningstar | 2020 – 2022

  • Built a modular portfolio backtesting library (NumPy/Pandas) for evaluating investment strategies across asset classes
  • Developed a two-way fixed effects model to identify crowded equity factor trades using proprietary risk factors

Data Scientist | Morningstar | 2018 – 2020

  • Delivered a churn prediction model using random survival forests for 100K+ retirement customers
  • Engineered a limit order book reconstruction tool for high-frequency trading data with millisecond-level accuracy

Associate Data Scientist | Morningstar | 2016 – 2018

  • Led development of a stacked ensemble propensity model, A/B tested across channels, achieving 9% uplift in adoption
  • Analyzed behavioral data in Neo4j to identify high-value user engagement patterns

Education

M.S. Computer Science | University of Illinois at Urbana-Champaign | 2020

M.A. Economics | George Mason University | 2014

B.A. Economics & Spanish | Western Kentucky University | 2012

Skills

Languages: Python, SQL, PySpark

AI/LLM: LangChain, Pydantic, RAG pipelines, vector databases (Qdrant), agents, structured extraction

Infrastructure: Databricks, GCP, Spark, Docker, Git, Linux

Statistics: Panel Data, Time Series, Causal Inference, Forecasting

Machine Learning: Deep Learning, NLP, Text Classification