Hypothesis: Uber prices increase during peak hours. Weather conditions are another factor
that could affect the price.
In addition, I am testing whether changes in weather conditions and the type of day
(weekday or weekend) push the price of an Uber up.
Analysis -My project aims to establish how the price of an Uber depends on three variables
that I treat as causal: the time of day, the weather, and whether it is a weekday or a weekend.
-I have plotted the time of day on the x-axis and the price on the y-axis.
-For the weather, I am using a circle radius scaling function, so the radius of each point
scales with the weather.
-For the type of day, i.e. weekday or weekend, I am using different colors.
-Finally, to make my plot more informative, I have added a box plot for each
of the times on the x-axis.
-The box plot shows how the price is distributed at a given time across my dataset and
gives the viewer a summary of the data without having to look at each point.
-So far, both factors (weather and time) are reflected clearly in my visualisation: I can see
both of them affecting the price directly.
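The color and radius encodings described in the list above can be sketched in plain JavaScript. This is only a minimal illustration, not the code of the actual plot: it assumes that weather is recorded as a temperature in degrees Fahrenheit, that days are numbered 0 (Monday) to 6 (Sunday), and that the colors, domain and radii shown are placeholders. The helper names thresholdScale and sqrtRadiusScale are hypothetical; the first one imitates what I understand a D3 threshold scale to do.

```javascript
// Threshold-style scale: returns range[i], where i is the number of
// thresholds that are <= x. With one threshold it splits values in two groups.
function thresholdScale(domain, range) {
  return function (x) {
    let i = 0;
    while (i < domain.length && x >= domain[i]) i += 1;
    return range[i];
  };
}

// Radius scale: interpolates between rMin and rMax on the square root of the
// normalised value, so large values dominate less than with a linear radius.
function sqrtRadiusScale([vMin, vMax], [rMin, rMax]) {
  return function (v) {
    // Clamp to [0, 1] so values outside the domain still get a valid radius.
    const t = Math.min(1, Math.max(0, (v - vMin) / (vMax - vMin)));
    return rMin + (rMax - rMin) * Math.sqrt(t);
  };
}

// Assumed encoding: day index 5 and above (Saturday, Sunday) is a weekend.
const dayColor = thresholdScale([5], ["steelblue", "orange"]);

// Assumed weather variable: temperature from 30 to 90 °F, radius 4 to 16 px.
const radius = sqrtRadiusScale([30, 90], [4, 16]);

console.log(dayColor(2), dayColor(6)); // weekday color, weekend color
console.log(radius(30), radius(45), radius(90));
```

A single threshold is enough here because the day type has only two classes. If low temperatures should get the larger circles, the two radii can simply be swapped.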
Research process and Sources: For this visualisation, I have collected data for 24 days and
will add more in the future. I am also going to experiment with representing weather using symbols instead of
using circle radius scaling.
- I developed a good understanding of how to make a box plot from d3-graph-gallery.com(boxplot).
From this source, I learned how to build a box plot from simple shapes such as lines and rectangles. I also
considered a violin plot, but my data is not dense enough (I have only 24 data points), so I decided
to stick to the box plot.
- Since I am not from a statistics background, I got my understanding of the different
terms in a box plot, such as the quartiles and the interquartile range, from www.khanacademy.com
-I have also used a threshold scale in my assignment to represent the day type (weekday vs
weekend). For this I referred to the course material as well as Observablehq.com(threshold
scale).
While this link also covered other scales, after reading through them I found the threshold scale
sufficient for my use case.
- I also referred to material on ordinal scales (www.geeksforgeeks.org/d3-js-scaleordinal-function),
though I did not use them.
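The box plot terms mentioned in the list above (quartiles and interquartile range) can be computed as in the following minimal sketch. It assumes linear interpolation between neighbouring values for the quartiles and the common convention that whiskers reach the most extreme data points within 1.5 times the interquartile range of the box; other conventions exist, and this is not necessarily what my plot or the referenced tutorial uses. The function names and the sample prices are hypothetical.

```javascript
// Quantile of an already sorted array, using linear interpolation
// between the two nearest values.
function quantileSorted(sorted, p) {
  if (sorted.length === 0) return undefined;
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Summary statistics needed to draw one box: the box spans q1 to q3,
// a line marks the median, and the whiskers end at whiskerLow / whiskerHigh.
function boxStats(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantileSorted(sorted, 0.25);
  const median = quantileSorted(sorted, 0.5);
  const q3 = quantileSorted(sorted, 0.75);
  const iqr = q3 - q1; // interquartile range
  const lowerFence = q1 - 1.5 * iqr;
  const upperFence = q3 + 1.5 * iqr;
  // Whiskers stop at the most extreme data points inside the fences.
  const whiskerLow = sorted.find((v) => v >= lowerFence);
  const whiskerHigh = [...sorted].reverse().find((v) => v <= upperFence);
  return { q1, median, q3, iqr, whiskerLow, whiskerHigh };
}

// Made-up prices observed at one time of day (not data from my dataset).
const stats = boxStats([12, 14, 15, 15, 18, 21, 40]);
console.log(stats);
```

In this made-up sample the value 40 lies above the upper fence, so it would be drawn as a separate outlier point rather than stretching the whisker.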
I still need to improve my axis markings and scale length. I am also working on an alternative
representation that relies on different shapes. In that work, I put weather on the x-axis and use
shapes for the different times of day. I plan to include both in my final project to enable a better
understanding of the data.
Challenge: One major challenge that I encountered in this project was depicting my
overlapping data clearly. It was difficult to show the relationship of price with three other factors at once. However, by combining a box-and-whisker plot with a scatterplot, I could overcome the overlap and give my visualisation a clear form.
CONCLUSION: My visualisation supports my hypothesis in the sense that at peak times such as 9 am
(which usually falls in office hours), depicted by squares, the Uber price is higher than at other
times of day.
My visualisation also shows that when the temperature is low, that is, between 30 and 40 degrees
Fahrenheit, Uber prices go up.