Notebooks

Here are some shorter projects, each in the form of a Jupyter Notebook.



MapReduce Thumbnail

Document Search Using Map Reduce
In this notebook, we implement a custom MapReduce framework in Python and use it to build a document search function.
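
As a rough illustration of the idea, here is a minimal single-process sketch of a map/reduce pipeline used to build a word-to-document index. It is not the notebook's framework: the names `map_reduce`, `build_index`, and `search` are hypothetical, and documents are assumed to be a dict of filename to text.

```python
from collections import defaultdict
from functools import reduce


def map_reduce(items, mapper, reducer):
    """Run mapper over every item, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for item in items:
        # Map phase: each item may emit any number of (key, value) pairs.
        for key, value in mapper(item):
            groups[key].append(value)
    # Reduce phase: fold the values collected for each key into one result.
    return {key: reduce(reducer, values) for key, values in groups.items()}


def build_index(documents):
    """Build an inverted index (word -> set of document names).

    `documents` is assumed to be a dict mapping a name to its text.
    """
    def mapper(doc):
        name, text = doc
        for word in set(text.lower().split()):
            yield word, {name}

    return map_reduce(documents.items(), mapper, lambda a, b: a | b)


def search(index, word):
    """Return the sorted names of documents that contain `word`."""
    return sorted(index.get(word.lower(), set()))
```

A real framework would typically split the input across workers and merge their partial results; this sketch keeps everything in one process to show only the map, group, and reduce steps.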

Algorithms
Join Time Complexity Thumbnail

Implementing Efficient Joins on Mobile App Data
In this notebook, we implement a few different join algorithms and study their time and space complexity on real-world datasets.
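
To make the complexity trade-off concrete, here is a small sketch comparing two common join strategies. These are assumptions chosen for illustration, not necessarily the algorithms used in the notebook: a nested-loop join, which compares every pair of rows, and a hash join, which spends extra memory on a lookup table to avoid the pairwise scan. Rows are assumed to be dicts that share a key column.

```python
from collections import defaultdict


def nested_loop_join(left, right, key):
    """Compare every left row with every right row: O(n * m) time, O(1) extra space."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def hash_join(left, right, key):
    """Index the right rows by key first: O(n + m) time on average, O(m) extra space."""
    buckets = defaultdict(list)
    for r in right:
        buckets[r[key]].append(r)
    # Each left row now finds its matches with one dictionary lookup.
    return [{**l, **r} for l in left for r in buckets.get(l[key], [])]
```

Both functions return the same rows in the same order; only the amount of work and memory differs.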

Algorithms, Data Cleaning
Big Data Workflow Thumbnail

Big Data Workflow with Pandas and SQLite3
In this notebook, we demonstrate a workflow using pandas and SQLite3 that scales well to large datasets.
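
One way such a workflow can look, sketched under assumptions: the CSV is read in chunks with pandas, each chunk is appended to a SQLite table, and aggregation is then pushed down to SQL. The function name `load_csv_to_sqlite` and the table and column names in the usage note are hypothetical, not taken from the notebook.

```python
import io
import sqlite3

import pandas as pd


def load_csv_to_sqlite(csv_source, conn, table, chunksize=50_000):
    """Append a CSV to a SQLite table one chunk at a time; return the row count."""
    total_rows = 0
    for chunk in pd.read_csv(csv_source, chunksize=chunksize):
        # Only one chunk is held in memory at any moment.
        chunk.to_sql(table, conn, if_exists="append", index=False)
        total_rows += len(chunk)
    return total_rows
```

Once the data is loaded, a query such as `pd.read_sql_query("SELECT city, SUM(amount) AS total FROM sales GROUP BY city", conn)` returns only the aggregated result to pandas, so the full table never has to fit in memory.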

SQL, Data Processing
Crime Database Thumbnail

Creating a Crime Database Using Postgres
In this notebook, we create a Postgres database containing Boston crime data. We also explore how to use Postgres to create groups and revoke or restore permissions.

PostgreSQL
ebay used car listings thumbnail

Finding Value in Used Car Listings on eBay
In this notebook, we analyze a dataset of used-car listings from eBay Kleinanzeigen, the classifieds section of the German eBay website.

Data Analysis, Data Cleaning, Data Visualization
Building Fast Queries Thumbnail

Building Fast Queries on a CSV
In this notebook, we create a Python Inventory() class based on a laptop store's inventory. Using time and space complexity analysis, we optimize the lookups that store employees are likely to run most often.
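
The general approach can be sketched as follows: pay a one-time preprocessing cost when the file is loaded so that frequent lookups become cheap. This is an illustrative sketch, not the notebook's class. The column names `id` and `price` and the methods `get_laptop` and `count_within_budget` are assumptions.

```python
import bisect
import csv
import io


class Inventory:
    """Load laptop rows from a CSV file object and precompute lookup structures."""

    def __init__(self, csv_file):
        self.rows = list(csv.DictReader(csv_file))
        # Dictionary keyed by id: O(1) average lookup instead of an O(n) scan.
        self.by_id = {row["id"]: row for row in self.rows}
        # Sorted prices: enables O(log n) binary search for budget queries.
        self.prices = sorted(int(row["price"]) for row in self.rows)

    def get_laptop(self, laptop_id):
        """Return the row with the given id, or None if it does not exist."""
        return self.by_id.get(laptop_id)

    def count_within_budget(self, budget):
        """Count laptops whose price is less than or equal to `budget`."""
        return bisect.bisect_right(self.prices, budget)
```

The trade-off is extra memory for the dictionary and the sorted list, in exchange for avoiding a full scan of the rows on every query.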

Algorithms, OOP, Data Visualization
Traffic Indicators Thumbnail

Finding Heavy Traffic Indicators
In this notebook, we analyze a dataset containing hourly westbound traffic on I-94.

Data Analysis, Data Cleaning, Data Visualization
Star Wars Thumbnail

Exploring Star Wars Survey Results
In this notebook, we finally answer the question everyone has been asking since 1999: how much do Star Wars fans hate Jar Jar?

Data Cleaning, Data Analysis, Data Visualization
NYC Test Data Thumbnail

Analyzing NYC Test Score Data
In this notebook, we investigate standardized test performance among high schoolers across the boroughs of New York City.

Data Cleaning, Data Analysis, Data Visualization
Exit Surveys Thumbnail

Analyzing Employee Exit Surveys
In this notebook, we analyze two datasets containing exit surveys of employees from education-related government agencies.

Data Cleaning, Data Analysis, Data Visualization
chinook-schema

Answering Business Questions using SQL
In this notebook, we use SQLite3 along with the popular Chinook database to demonstrate writing queries in SQL.

SQL, Data Analysis
Chunk Processing Thumbnail

Chunk Processing in Pandas
In this notebook, we demonstrate how pandas can be used to optimize datasets that are too large to fit into memory.
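
As a minimal sketch of chunk processing, the function below counts the values in one column while reading the file a few rows at a time, then combines the per-chunk results. The function name `count_values_in_chunks` and the sample column are hypothetical, and the notebook may use different techniques, such as tuning column dtypes.

```python
import io
from collections import Counter

import pandas as pd


def count_values_in_chunks(csv_source, column, chunksize=100_000):
    """Count the values of `column` without loading the whole CSV at once."""
    totals = Counter()
    for chunk in pd.read_csv(csv_source, usecols=[column], chunksize=chunksize):
        # Aggregate each chunk on its own, then merge into the running totals.
        totals.update(chunk[column].value_counts().to_dict())
    return {key: int(count) for key, count in totals.items()}
```

Passing `usecols` keeps each chunk small by skipping columns the computation does not need, and `chunksize` bounds how many rows are in memory at any one time.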

Data Processing, Data Cleaning
SQL Basics Thumbnail

Exploring CIA Factbook Database using SQL
In this notebook, we use SQLite3 to explore the CIA World Factbook database, factbook.db.

SQL, Data Analysis