Machine Learning Engineer, Triplebyte (San Francisco, Ca) Apr 2019 - Present
Tech Lead of Data & Infrastructure – Migrated our search back-end to Elasticsearch, allowing for 10x faster queries and vector similarity search. Set up Redshift with Kinesis Firehose for streaming event data.
Responsiveness – Built, trained, and deployed a model to predict candidate responsiveness to recruiters’ outbound messages, doubling the candidate response rate to over 50%.
Matchmaking – Launched a recommendation engine using learning-to-rank and neural collaborative filtering to match active job seekers with the companies most likely to contact them.
Data Scientist, B23 (McLean, Va) Oct 2016 - Apr 2018
Stackspace – Full-stack developer on the company’s (mostly) cloud-agnostic product to one-click launch clusters for big data apps (e.g. Airflow, Elasticsearch, Spark). Increased the launch speed of the most popular stacks by 4x by pre-baking images.
Location Data Pipeline – Built an ETL pipeline to aggregate and resolve millions of points-of-interest from disparate sources for use in predictive foot-traffic analysis, which was then sold to hedge funds.
Cloud Migration – Automated the transfer of hundreds of terabytes of imagery data between the AWS cloud and Snowball Edge devices running GIS software for the US Navy, replacing a tedious process that involved burning thousands of DVDs.
Software Engineer, Agilex / Accenture Federal Services (Chantilly, Va) Jun 2013 - Oct 2016
Pluto – Developed a semantic search application for matching intelligence reporting with collection requirements via latent semantic indexing (LSI). Optimized the search engine around typical user behavior to increase the average query speed by 10x.
MS, Analytics; Georgetown University (Washington, D.C.) Sep 2016 - Aug 2018
BS, summa cum laude, Computer Science and Mathematics; Virginia Tech (Blacksburg, Va) Aug 2009 - May 2013
- Phi Beta Kappa
- PyTorch, Pyro PPL, Hugging Face Transformers, Pandas/NumPy/scikit-learn
- PostgreSQL, SQL, Elasticsearch, Spark, Apache NiFi, MongoDB
- AWS, GCP, Ansible, LaTeX, enough React to be dangerous
- Soccer, Twitter bots, Prediction markets, Britpop, Sports analytics, Seinfeld