Trends Based On The Time And Location Of The Taxi Ride
Order ID:89JHGSJE83839 Style:APA/MLA/Harvard/Chicago Pages:5-10 Instructions:
Trends Based On The Time And Location Of The Taxi Ride
Project Description:
This project uses data from several million Chicago taxi rides in 2016. In this project you will look for trends based on the time and location of the taxi ride. In this project, you will produce a semi-professional report answering some detailed questions about driving a cab in Chicago. The report can be in a Word, markdown, or other document format and should include charts and table illustrating relevant metrics. All graphs can be made in Excel and all tables can be made in Word or Excel. You should also turn in your SQL scripts so that you can receive partial credit for any mistakes.
The goal of this project is to create a document that can be used as a talking point for a job interview.
The report should contain the following sections:
- Executive Summary: A summary of the results of the analysis. This should be less than 1 page. A few sentences and some bullet point would suffice.
- Analysis and Results: This section should be broken up in a logical manner according to the questions below. Include answers to the questions in a narrative form using supporting charts and tables where needed*. Based on the analysis, what conclusions can you draw about the data?
*Do not include your code in your report. Turn that in separately through canvas for potential partial credit.
Questions to answer:
- Seasonality of cab fares:
- Create a tablethat gives the total paid for all fares in the month, average per fare, the standard deviation (std), and the number of fares by month.
- Which months have the highest total fare for all rides? Does there seem to be any seasonality?
- Which months have the highest average fare per ride? Does there appear to be any seasonality? As part of answering this question, produce a plot with error bars (+/- 1 standard deviation) of the average fare per month.
- Which months have the largest variance in cab rides?
- Fares by day of week:
- Create a tablethat gives the total paid for all fares in a given week day, average per fare, the standard deviation (std), and the number of fares by day.
- Which days have the highest total fare for all rides?
- Which days have the highest fares per ride? Do weekends differ from weekdays in a statistically significant way? As part of answering this question, produce a plot with error bars (+/- 1 standard deviation) of the average fare by day.
- Fares by time of day:
- Create a tablethat gives the total paid for all fares in a given hour day, average per fare, the standard deviation (std), and the number of fares by hour.
- Is it better to work the night shift or the day shift in terms of earned income?
- Payment types:
- Create a tablethat shows the average total fare and standard deviation by payment type.
- Which payment types are the most lucrative?
- Are disputed payments more likely to occur at one time of day than another?
- Conclusion:
- If you were to become a cab driver. When would you want to work? Why?
- Is there anything else that stands out as worth mentioning?
Extra Credit Questions (worth up to 15 points total):
For 1-4, create a table or plot that helps answer each of the following question:
- (2 pt) Do people tip better on the weekends or during the week?
- (2 pt) Do people tip better in the morning or in the evening?
- (2 pt) Do people take longer trips during the week or on the weekends?
- (2 pt) Do people take longer trips in the morning or the evening?
- (2 pt) Write a query that returns a receipt for each row with the following content. Items in bold should be dynamically populated from database columns. You only need to use the taxi_01 table. Remember CHAR(10,13) is used for carriage returns and line feeds.
Thank you for riding taxi #2776 on January 22, 2016 9:30 am.
Your ride lasted 6 minutes and covered a distance 0.7 miles.
The total for this trip is $4.45.
Please remember to tip your driver.
- (5 pt) Which cab driver (taxi_id) is the best? Explain how you reached that conclusion. This should involve multiple lines of evidence. It is not necessarily the one who makes the most money.
Grading:
The project will be worth 30 points and graded as follows:
Accuracy of Report Results (20 points)
- Are the SQL queries correct and producing the correct results?
- Are the tables and graphs correct?
- Does the narrative agree with the data shown?
Readability and Formatting of Report – (10 points)
- Is the report laid out in a logical order?
- Does the report contain all the necessary parts?
- Is it grammatically and logically correct?
- It is visually appealing?
Example Tables and Chart:
Month Total Number of Rides 1 1,705,805 2 1,751,192 3 1,975,108 4 1,952,152 5 1,943,584 6 1,934,479 7 1,745,387 8 1,523,947 9 1,294,020 10 1,499,771 11 1,295,000 12 1,245,712 Table 1: Total number of rides by month
About the Data
This data from this project comes from Kaggle. (https://www.kaggle.com/chicago/chicago-taxi-rides-2016) and has been loaded into the taxi database on 10.41.4.35. The data is divided into 12 tables, one for each month of 2016. Look at the link above for information about the data set. The column definitions are given below.
Column Datatype Description taxi_id INTEGER A unique identifier for the taxi. trip_start_timestamp TIMESTAMP When the trip started, rounded to the nearest 15 minutes. trip_end_timestamp TIMESTAMP When the trip ended, rounded to the nearest 15 minutes. trip_seconds INTEGER Time of the trip in seconds. trip_miles FLOAT Distance of the trip in miles. pickup_census_tract INTEGER The Census Tract where the trip began. For privacy, this Census Tract is not shown for some trips. dropoff_census_tract INTEGER The Census Tract where the trip ended. For privacy, this Census Tract is not shown for some trips. pickup_community_area INTEGER The Community Area where the trip began. dropoff_community_area INTEGER The Community Area where the trip ended. fare FLOAT The fare for the trip. tips FLOAT The tip for the trip. Cash tips generally will not be recorded. tolls FLOAT The tolls for the trip. extras FLOAT Extra charges for the trip. trip_total FLOAT Total cost of the trip, the total of the fare, tips, tolls, and extras. payment_type STRING Type of payment for the trip. company INTEGER The id code for the taxi company. pickup_latitude INTEGER The id code for the latitude of the center of the pickup census tract or the community area if the census tract has been hidden for privacy. pickup_longitude INTEGER The id code for the center of the pickup census tract or the community area if the census tract has been hidden for privacy. dropoff_latitude INTEGER The id code for the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. dropoff_longitude INTEGER The id code for the center of the dropoff census tract or the community area if the census tract has been hidden for privacy. Total fares by month in 2016
0.1 0.5 0.3 0.2 0.1 0.15 0.35 0.225 0.45 0.5 0.1455 0.2245 0.1 0.5 0.3 0.2 0.1 0.15 0.35 0.225 0.45 0.5 0.1455 0.2245 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10.0 11.0 12.0 1.705805 1.751192 1.975108 1.952152 1.943584 1.934479 1.745387 1.523947 1.29402 1.499771 1.295 1.245712
Month
Total Fares (in millions)
RUBRIC
Excellent Quality
95-100%
Introduction 45-41 points
The background and significance of the problem and a clear statement of the research purpose is provided. The search history is mentioned.
Literature Support
91-84 points
The background and significance of the problem and a clear statement of the research purpose is provided. The search history is mentioned.
Methodology
58-53 points
Content is well-organized with headings for each slide and bulleted lists to group related material as needed. Use of font, color, graphics, effects, etc. to enhance readability and presentation content is excellent. Length requirements of 10 slides/pages or less is met.
Average Score
50-85%
40-38 points
More depth/detail for the background and significance is needed, or the research detail is not clear. No search history information is provided.
83-76 points
Review of relevant theoretical literature is evident, but there is little integration of studies into concepts related to problem. Review is partially focused and organized. Supporting and opposing research are included. Summary of information presented is included. Conclusion may not contain a biblical integration.
52-49 points
Content is somewhat organized, but no structure is apparent. The use of font, color, graphics, effects, etc. is occasionally detracting to the presentation content. Length requirements may not be met.
Poor Quality
0-45%
37-1 points
The background and/or significance are missing. No search history information is provided.
75-1 points
Review of relevant theoretical literature is evident, but there is no integration of studies into concepts related to problem. Review is partially focused and organized. Supporting and opposing research are not included in the summary of information presented. Conclusion does not contain a biblical integration.
48-1 points
There is no clear or logical organizational structure. No logical sequence is apparent. The use of font, color, graphics, effects etc. is often detracting to the presentation content. Length requirements may not be met
You Can Also Place the Order at www.collegepaper.us/orders/ordernow or www.crucialessay.com/orders/ordernow Trends Based On The Time And Location Of The Taxi Ride