Machine Learning Competitions
- Won Kaggle’s 2017 March Machine Learning Mania Competition – Blog post
- Second place in 2017 Global Energy Forecasting Competition Qualifying Match – Paper
- Led the team that won the 2013 Capital One Student Modeling Competition. The task was to build a recommender system to offer the most relevant coupons to customers. Our solution extended the matrix factorization techniques that were used in the Netflix Prize.
Logistic PCA and Generalized PCA
My dissertation research with Prof. Yoonkyung Lee addresses dimensionality reduction for binary and count data. We propose a generalization of principal component analysis to non-Gaussian data. Our method minimizes the deviance by solving for a projection matrix that projects the natural parameters of the saturated model onto a lower-dimensional space. An R package implementing this research for binary data is available on CRAN. A complementary R package for all types of data is available on GitHub. For this research, I won the department’s Whitney Award for Outstanding Thesis Researcher.
- Dimensionality Reduction for Binary Data through the Projection of Natural Parameters - Published online August 2020 in Journal of Multivariate Analysis - arXiv preprint
- Generalized Principal Component Analysis: Projection of Saturated Model Parameters - Published online October 2019 in Technometrics - preprint
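For the binary case, the deviance minimization described above can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear on this page: x_ij are the entries of an n × d binary data matrix, μ is a d-vector of main effects, U is a d × k matrix with orthonormal columns, and m is a large positive constant standing in for the infinite natural parameters of the saturated Bernoulli model.

```latex
\begin{align*}
\tilde{\theta}_{ij} &= m\,(2x_{ij} - 1)
  && \text{(approximate saturated natural parameters)} \\
\theta_i &= \mu + U U^\top (\tilde{\theta}_i - \mu)
  && \text{(projection onto a $k$-dimensional subspace)} \\
(\hat{\mu}, \hat{U}) &= \operatorname*{arg\,min}_{\mu,\; U^\top U = I_k}
  \; -2 \sum_{i=1}^{n} \sum_{j=1}^{d}
  \left[ x_{ij}\,\theta_{ij} - \log\!\left(1 + e^{\theta_{ij}}\right) \right]
  && \text{(Bernoulli deviance)}
\end{align*}
```

Because the saturated Bernoulli log-likelihood is zero, the deviance here reduces to minus twice the log-likelihood. With k = d the projection is the identity, so the fitted parameters equal the approximate saturated ones and the deviance approaches zero as m grows.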
R Packages
- logisticPCA – Dimensionality reduction for binary data – CRAN, GitHub
- generalizedPCA – Dimensionality reduction for binary, count, and multinomial data – GitHub
- libFMexe – A wrapper for Rendle’s libFM software for factorization machines – GitHub
- nameage – Infer a person’s age from their first name (assuming they were born in the USA) – GitHub