My Projects

Machine Learning Competitions

  • Won Kaggle’s 2017 March Machine Learning Mania Competition – Blog post
  • Second place in 2017 Global Energy Forecasting Competition Qualifying Match – Paper
  • Led the team that won the 2013 Capital One Student Modeling Competition. The task was to build a recommender system that offers customers the most relevant coupons. Our solution extended the matrix factorization techniques used in the Netflix Prize.

Logistic PCA and Generalized PCA

My dissertation research with Prof. Yoonkyung Lee addresses dimensionality reduction of binary and count data. We propose a generalization of principal component analysis to non-Gaussian data. Our method minimizes the deviance by solving for a projection matrix that projects the natural parameters of the saturated model onto a lower-dimensional space. Two preprint articles are available here. An R package implementing this research for binary data is available on CRAN. A complementary R package for all types of data is available on GitHub. For this research, I won the department’s Whitney Award for Outstanding Thesis Researcher.
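To make the idea of projecting saturated-model natural parameters concrete, here is a minimal Python sketch for binary data. It is an illustration under stated assumptions, not the implementation in the R packages: the saturated natural parameters (which are infinite for binary data) are approximated by ±m with an assumed value m = 4, column means serve as the main effects, and the projection matrix is simply taken from ordinary PCA of those parameters rather than optimized to minimize the deviance. The function names are hypothetical.

```python
import numpy as np


def bernoulli_deviance(X, theta):
    """Bernoulli deviance of binary X given natural parameters (logits) theta.

    The saturated log-likelihood is 0 for binary data, so the deviance
    is -2 times the log-likelihood.
    """
    return 2.0 * np.sum(np.logaddexp(0.0, theta) - X * theta)


def projected_natural_parameters(X, U, m=4.0):
    """Project approximate saturated natural parameters onto the span of U.

    Assumption: the saturated logits are approximated by +m (x = 1)
    and -m (x = 0), and column means are used as main effects.
    """
    theta_sat = m * (2.0 * X - 1.0)
    mu = theta_sat.mean(axis=0)
    return mu + (theta_sat - mu) @ U @ U.T


def pca_basis(X, k, m=4.0):
    """Stand-in for the optimized projection: top-k PCA directions
    of the centered approximate saturated natural parameters."""
    theta_sat = m * (2.0 * X - 1.0)
    centered = theta_sat - theta_sat.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T  # d x k matrix with orthonormal columns


# Usage on simulated binary data (hypothetical example data).
rng = np.random.default_rng(0)
X = (rng.random((200, 6)) < 0.3).astype(float)
for k in (1, 2, 6):
    U = pca_basis(X, k)
    theta = projected_natural_parameters(X, U)
    print(k, bernoulli_deviance(X, theta))
```

A full method would update U iteratively to reduce the deviance; the PCA basis here is only a convenient starting point for seeing how the pieces fit together.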

R Packages

  • logisticPCA – Dimensionality reduction for binary data – CRAN, GitHub
  • generalizedPCA – Dimensionality reduction for binary, count, and multinomial data – GitHub
  • libFMexe – A wrapper for Rendle’s libFM software for factorization machines – GitHub
  • nameage – Infer a person’s age from their first name (assuming they were born in the USA) – GitHub

A Few Shiny Apps
