The goal of this project was to demonstrate mastery of data science and machine learning concepts. The work began with selecting a problem statement and a dataset. I chose to fit a linear regression model to a vehicle fuel efficiency dataset from the Environmental Protection Agency (EPA).
The EPA and other agencies spend large amounts of money and time conducting fuel efficiency tests.
Vehicle MPG varies by make/model (alternatively, by engine displacement or transmission type), and a machine learning model can be built to predict it. Replacing costly and time-consuming tests with such a model could benefit agencies such as the EPA.
Vehicle MPG does not vary by make/model (or by engine displacement or transmission type), or a machine learning model cannot replace laboratory testing for fuel efficiency.
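To make the hypothesis concrete, here is a minimal sketch of a one-feature linear regression that predicts MPG from engine displacement. It is not the project's actual model or code: the function names (`fit_simple_ols`, `r_squared`, `predict_mpg`) are hypothetical, and the numbers are made-up illustrative values, not EPA data.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def predict_mpg(displacement, intercept, slope):
    """Predict MPG for a given engine displacement (liters)."""
    return intercept + slope * displacement


# ASSUMPTION: made-up illustrative values, NOT taken from the EPA dataset.
displacement_l = [1.5, 2.0, 2.5, 3.5, 5.0]
combined_mpg = [34.0, 30.0, 27.0, 22.0, 17.0]

intercept, slope = fit_simple_ols(displacement_l, combined_mpg)
fit_quality = r_squared(displacement_l, combined_mpg, intercept, slope)

print(f"slope: {slope:.2f} MPG per liter, intercept: {intercept:.2f}")
print(f"R^2: {fit_quality:.3f}")
print(f"predicted MPG at 3.0 L: {predict_mpg(3.0, intercept, slope):.1f}")
```

Under this framing, a slope near zero with a poor fit would favor the null hypothesis for displacement, while a clearly nonzero slope with a good fit would favor the hypothesis. A fuller model would also need to encode categorical features such as make/model and transmission type.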
Fuel Efficiency over Time
Relationship Between MPG and Displacement, Cylinders, and CO2 Emissions
Relationship Between Displacement, MPG, and CO2 Emissions
Figure 3 below shows the averages of four major variables over time. Further research into the drastic changes in these averages revealed two hidden influences: federal law and vehicle sales trends.
Averages of Major Variables over Time
Count of Vehicle Type over Time
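The per-year averages plotted in a figure like Averages of Major Variables over Time can be computed with a simple group-by-year aggregation. The sketch below is an assumption about how that might look, not the project's code: the helper `yearly_averages`, the field names (`year`, `mpg`, `displacement`, `cylinders`, `co2`), and the sample records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def yearly_averages(records, fields):
    """Return {year: {field: average}} for the given numeric fields."""
    by_year = defaultdict(list)
    for row in records:
        by_year[row["year"]].append(row)
    return {
        year: {field: mean(r[field] for r in rows) for field in fields}
        for year, rows in sorted(by_year.items())
    }


# ASSUMPTION: made-up records for illustration only, NOT EPA data.
sample_records = [
    {"year": 1990, "mpg": 20.0, "displacement": 3.0, "cylinders": 6, "co2": 450.0},
    {"year": 1990, "mpg": 24.0, "displacement": 2.0, "cylinders": 4, "co2": 370.0},
    {"year": 2020, "mpg": 28.0, "displacement": 2.0, "cylinders": 4, "co2": 320.0},
]

averages = yearly_averages(sample_records, ["mpg", "displacement", "cylinders", "co2"])
for year, values in averages.items():
    print(year, values)
```

Plotting each field's yearly average against the year then gives one line per variable, which makes abrupt shifts easy to spot and follow up on.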
Instructions for application use
Prediction results based on user input