Machine Learning Researcher | Aspiring Data Engineer
I am currently a Research Assistant and PhD Candidate in the Department of Mathematics at Penn State University. My research spans Machine Learning and Data Science. Some recent projects I've worked on involve:
When I'm not actively researching, I enjoy pursuing my passion for the practical use of data in the real world through the completion of Data Engineering and Data Science projects.
Below are some highlighted projects that I've completed on my journey from Theorist to Practitioner. For a complete list of projects and notebooks, please explore this website using the Portfolio dropdown menu at the top left of each page. Alternatively, check out my GitHub.
In this project, we use AWS Lambda, along with asynchronous Amazon Transcribe and Amazon Comprehend jobs, to create a fast, event-driven podcast transcription pipeline.
In this project, we build a cloud-native, fully Dockerized, real-time data pipeline, orchestrated with Kubernetes and powered by Spark.
This project consists of a "scikit-learn" style deployment of a novel technique for manifold learning / nonlinear dimensionality reduction, developed by my advisor and me.
In this notebook, we demonstrate a workflow using pandas and SQLite3 that scales well to large datasets.
In this notebook, we implement several join algorithms and study their time and space complexity on real-world datasets.
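To give a flavor of the comparison, here is a minimal sketch of two common join strategies, a nested-loop join and a hash join. It is an illustration only, not the notebook's code: the function names and the sample rows are hypothetical, and rows are assumed to be plain dictionaries sharing a join key.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Compare every left row with every right row.

    Time is O(len(left) * len(right)); no auxiliary structure is built.
    """
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def hash_join(left, right, key):
    """Build a hash table on one input, then probe it with the other.

    Expected time is O(len(left) + len(right) + matches), at the cost of
    O(len(right)) extra space for the hash table.
    """
    buckets = defaultdict(list)
    for r in right:
        buckets[r[key]].append(r)
    # .get avoids creating empty buckets for keys that have no match.
    return [{**l, **r} for l in left for r in buckets.get(l[key], [])]


# Hypothetical sample data, for illustration only.
users = [{"user_id": 1, "name": "Ada"}, {"user_id": 2, "name": "Grace"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 3, "item": "lamp"},
]

joined = hash_join(users, orders, "user_id")
```

Both functions return the same inner-join result; the trade-off is extra memory for the hash table in exchange for avoiding the quadratic scan.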
In this notebook, we implement a custom MapReduce framework in Python, and use it to create a document search function.
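As a rough idea of what this can look like, the sketch below runs a single-process map, shuffle, and reduce to build an inverted index, then answers queries against it. It is not the notebook's framework: `map_reduce`, `index_mapper`, `index_reducer`, `search`, and the sample documents are all hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a single-process map -> shuffle -> reduce over `records`."""
    grouped = defaultdict(list)
    for record in records:
        # Map phase: each record emits zero or more (key, value) pairs.
        for key, value in mapper(record):
            # Shuffle phase: group emitted values by key.
            grouped[key].append(value)
    # Reduce phase: collapse each key's values into one result.
    return {key: reducer(key, values) for key, values in grouped.items()}


def index_mapper(record):
    """Emit (word, doc_id) once for each distinct word in a document."""
    doc_id, text = record
    for word in set(text.lower().split()):
        yield word, doc_id


def index_reducer(word, doc_ids):
    """Produce the sorted posting list for a word."""
    return sorted(doc_ids)


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)


# Hypothetical sample documents, for illustration only.
docs = [
    ("d1", "spark runs on kubernetes"),
    ("d2", "pandas and sqlite"),
    ("d3", "spark and pandas"),
]
index = map_reduce(docs, index_mapper, index_reducer)
```

With this index, `search(index, "spark pandas")` returns only the documents that contain both words.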
In the Fall of 2022, I gave a talk on my research at the SIAM Conference on Mathematics of Data Science (MDS22). Attached are the slides I prepared for my talk.
I was an invited poster presenter at the ninth conference of the Foundations of Computational Mathematics (FoCM) Society in Paris during the Summer of 2023. Attached is a .pdf of my poster.
A key part of my Data Science journey has been solving exercises from the seminal textbook The Elements of Statistical Learning. Here are my solutions to the exercises from this excellent book.