This repository houses all code, data, and files related to my work in the Springboard Data Science Immersive program. The following serves as a table of contents for the whole repository, with links to the respective work.
Key Skills
Custom Sentiment Analysis Library created to facilitate overall sentiment analysis of cryptocurrency news articles scraped from the web. Used in conjunction with historical price data, the sentiment analysis feeds a deep neural network that predicts future pricing for a crypto coin of interest.
Key Skills
Exploring different image preprocessing techniques and methods to speed up CNN training. As a positive side effect, transforming the original full-scale data reduces the memory footprint, both on disk and in RAM.
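As a rough illustration of why preprocessing shrinks the memory footprint, here is a minimal sketch that assumes two common transformations: grayscale conversion and 2x downsampling. These may not be the techniques used in the project itself. The function names and the tiny 4x4 image are hypothetical, and images are plain nested lists so the example runs without extra libraries.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples (0-255 ints) to luma using BT.601 weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row] for row in rgb]


def downsample_2x(gray):
    """Average each 2x2 block; assumes even height and width."""
    h, w = len(gray), len(gray[0])
    return [
        [
            (gray[y][x] + gray[y][x + 1] + gray[y + 1][x] + gray[y + 1][x + 1]) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def n_bytes_uint8(height, width, channels):
    """Size of an image stored with one byte per channel value."""
    return height * width * channels


# Hypothetical 4x4 all-white RGB image, for illustration only.
image = [[(255, 255, 255)] * 4 for _ in range(4)]
small = downsample_2x(to_grayscale(image))

before = n_bytes_uint8(4, 4, 3)
after = n_bytes_uint8(len(small), len(small[0]), 1)
print(before, after)  # 48 4
```

Dropping two of three channels and halving each spatial dimension gives a 12x reduction per image under these assumptions, and the smaller input also means fewer operations per convolution pass.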
Key Skills
Mini project on customer segmentation: identifying different types of customers and then finding ways to reach more individuals like them. The data comes from John Foreman's book Data Smart. The dataset contains both information on marketing newsletters/e-mail campaigns (e-mail offers sent) and transaction-level data from customers (which offers customers responded to and what they bought).
Key Skills
Several EDAs performed on varying data categories. Hospital Readmittance statistically re-examines a previously published analysis to critique its validity. Human Temperature EDA uses bootstrap statistics to estimate the true average temperature of the human body in both males and females. Racial Discrimination performs a statistical analysis of whether race has a meaningful impact on the callback rate of candidates who have submitted resumes to jobs of interest.
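The bootstrap approach mentioned for the Human Temperature EDA can be sketched as a percentile confidence interval for the mean. This is a minimal illustration, not the notebook's actual code: `bootstrap_mean_ci` is a hypothetical helper, and the temperatures below are made-up values rather than the project's dataset.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Made-up body temperatures in degrees Fahrenheit, for illustration only.
temps = [97.8, 98.2, 98.6, 98.0, 98.4, 97.9, 98.7, 98.1, 98.3, 98.5]
lo, hi = bootstrap_mean_ci(temps)
print(f"95% CI for the mean: [{lo:.2f}, {hi:.2f}]")
```

The percentile interval makes no normality assumption, which is the usual reason to prefer a bootstrap over a t-interval on small samples.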
Key Skills
Applying several machine learning algorithms in mini projects, such as: labeling an observation as either male or female based on height and weight data (Logistic Regression), estimating prices on the Boston Housing data using Linear Regression, and classifying movie reviews with Naive Bayes models.
Performing several exercises utilizing MapReduce in PySpark (RDD) with a touch of MLlib.
Key Skills
Key Skills
This is a SQL case study proposed by Mode Analytics at https://modeanalytics.com/. The Jupyter notebook in this repository is a cleaned-up version of the original case study. The original, which contains all of the original SQL queries, can be found here: https://modeanalytics.com/mooseburger/reports/14cbbb5670b8
Key Skills
An exercise of data extraction and exploration utilizing a JSON data source
Key Skills
Relax Challenge - Defining an "adopted user" as a user who has logged into a product on three separate days in at least one seven-day period, identify which factors predict future user adoption. You are given two datasets
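The "adopted user" definition above can be checked per user with a short sliding-window pass over login dates. This is a minimal sketch under one stated assumption: a seven-day period is read as seven consecutive calendar days, so the first and third login dates may be at most six days apart. `is_adopted` and the sample timestamps are hypothetical and not taken from the challenge datasets.

```python
from datetime import datetime, timedelta


def is_adopted(logins, window_days=7, min_days=3):
    """Return True if logins fall on `min_days` distinct dates within `window_days`."""
    # Collapse multiple logins on the same calendar day into one date.
    days = sorted({ts.date() for ts in logins})
    for i in range(len(days) - min_days + 1):
        # days[i] .. days[i + min_days - 1] are min_days distinct dates.
        # Assumption: they must fit inside window_days consecutive calendar days.
        if days[i + min_days - 1] - days[i] < timedelta(days=window_days):
            return True
    return False


# Hypothetical login timestamps for a single user.
logins = [
    datetime(2014, 1, 1, 9, 0),
    datetime(2014, 1, 1, 17, 30),  # same day as above, counted once
    datetime(2014, 1, 4, 8, 15),
    datetime(2014, 1, 7, 12, 0),
]
print(is_adopted(logins))  # True
```

Sorting the distinct dates first means only the dates `min_days - 1` positions apart need comparing, so the check is linear in the number of login days after the sort.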
Ultimate Challenge