Rhitika Dutta Attempt1 Final (visualisation with box plot in conjunction with scatter plot).

I am exploring the relationship between Uber price and the time of day. I also want to visualize how weather and the type of day (weekday vs. weekend) affect the Uber price.

Hypothesis: Uber prices rise at peak times of day. Weather conditions are another factor that could affect Uber prices.
In addition, I want to test whether changes in weather conditions and the type of day (weekday or weekend) increase the price of an Uber.


Analysis:
- My project aims to establish how the price of an Uber depends on three explanatory variables: the time of day, the weather, and whether it is a weekday or a weekend.
- I have plotted the time of day on the x axis and the price on the y axis.
- For the weather, I am using a circle radius scaling function, so the size of each circle depends on the weather.
- For the type of day, i.e. weekday or weekend, I am using different colors.
- Finally, to make my plot more informative, I have added a box plot for each of the different times on the x axis.
- The box plot shows how the price is distributed at a given time across my dataset, so the viewer gets an informative summary of the data without looking at each point.
- So far, both factors (weather and time) show up clearly in my visualisation, and I can see each of them affecting the price directly.
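The per-time box plots described above rest on a handful of summary statistics. Below is a minimal sketch of how they could be computed in plain JavaScript, without D3. The field names `hour` and `price` and the function names are hypothetical, and the whisker rule (the most extreme points within 1.5 times the interquartile range of the box) is one common convention, not necessarily the one used in my plot.

```javascript
// Sketch only: field names (hour, price) and function names are hypothetical.

// Linear-interpolation quantile on an already sorted, non-empty array (0 <= p <= 1).
function quantileSorted(sorted, p) {
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Statistics needed to draw one box: quartiles, median, IQR and whisker ends.
function boxStats(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantileSorted(sorted, 0.25);
  const median = quantileSorted(sorted, 0.5);
  const q3 = quantileSorted(sorted, 0.75);
  const iqr = q3 - q1;
  // Assumed convention: whiskers stop at the most extreme data points
  // that still lie within 1.5 * IQR of the box.
  const lowFence = q1 - 1.5 * iqr;
  const highFence = q3 + 1.5 * iqr;
  const whiskerLow = sorted.find((v) => v >= lowFence);
  const whiskerHigh = [...sorted].reverse().find((v) => v <= highFence);
  return { q1, median, q3, iqr, whiskerLow, whiskerHigh };
}

// Group rows by time of day and compute one set of box statistics per group.
function boxStatsByHour(rows) {
  const groups = new Map();
  for (const { hour, price } of rows) {
    if (!groups.has(hour)) groups.set(hour, []);
    groups.get(hour).push(price);
  }
  return new Map([...groups].map(([hour, prices]) => [hour, boxStats(prices)]));
}
```

Each box can then be drawn from `q1` to `q3` with a line at `median`, and the whiskers run out to `whiskerLow` and `whiskerHigh`.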

Research process and Sources: For this visualisation, I have collected data for 24 days and will add more in the future. I am also going to experiment with representing weather using symbols instead of using circle radius scaling.
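As an illustration of the circle radius scaling mentioned above, here is a minimal sketch of a square-root radius scale in plain JavaScript, similar in spirit to D3's `scaleSqrt`. It assumes the weather variable is the temperature in degrees Fahrenheit; the domain `[0, 100]`, the pixel range `[2, 20]` and the names `makeRadiusScale` and `radiusForTemperature` are hypothetical choices for this example.

```javascript
// Sketch only: domain, range and names are assumptions for illustration.

// Builds a square-root scale: the radius grows with the square root of the
// value, so the circle's area (not its radius) grows roughly with the value.
// Assumes a non-negative domain with d0 < d1.
function makeRadiusScale([d0, d1], [r0, r1]) {
  const s0 = Math.sqrt(d0);
  const s1 = Math.sqrt(d1);
  return (value) => {
    // Clamp to the domain so out-of-range readings still get a valid radius.
    const clamped = Math.min(Math.max(value, d0), d1);
    const t = (Math.sqrt(clamped) - s0) / (s1 - s0);
    return r0 + t * (r1 - r0);
  };
}

// Hypothetical mapping: 0 to 100 degrees Fahrenheit onto radii of 2 to 20 pixels.
const radiusForTemperature = makeRadiusScale([0, 100], [2, 20]);
```

A square-root scale is a common choice for circles because viewers tend to compare areas rather than radii.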
- I developed a good understanding of how to make a box plot from d3-graph-gallery.com (boxplot). From this source, I learned how to build a box plot from simple shapes such as lines and rectangles. I also considered a violin plot, but my data is not dense (I have only 24 data points), so I decided to stick with the box plot.
- Since I do not have a statistics background, I learned the terms used in a box plot, such as quartiles and the interquartile range, from www.khanacademy.com.
- I also used a threshold scale in my assignment to represent the day type (weekday vs. weekend). For this I referred to the course material as well as Observablehq.com (threshold scale). That page also covers other scales, but after reading through them I found the threshold scale sufficient for my use case.
- I also referred to material on ordinal scales (www.geeksforgeeks.org/d3-js-scaleordinal-function), though I did not use them.
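The threshold scale for the day type can be sketched as follows. This is a plain JavaScript stand-in that mimics how a threshold scale picks a range value, not the code from my assignment. The day encoding (0 = Monday to 6 = Sunday), the cut-off at 5, the two colors and the names `thresholdScale`, `mondayIndex` and `dayTypeColor` are all assumptions made for the example.

```javascript
// Sketch only: encoding, colors and names are assumptions for illustration.

// Minimal threshold scale: `domain` holds ascending cut-off values and
// `range` holds one more entry than `domain`. A value below the first
// cut-off maps to range[0]; a value at or above cut-off i maps to range[i + 1].
function thresholdScale(domain, range) {
  return (x) => {
    let i = 0;
    while (i < domain.length && x >= domain[i]) i++;
    return range[i];
  };
}

// Converts a JavaScript Date to a Monday-based index (0 = Monday, 6 = Sunday).
// Date.prototype.getDay() itself returns 0 for Sunday.
function mondayIndex(date) {
  return (date.getDay() + 6) % 7;
}

// Days 0 to 4 (Monday to Friday) get the weekday color,
// days 5 and 6 (Saturday, Sunday) get the weekend color.
const dayTypeColor = thresholdScale([5], ["steelblue", "orange"]);
```

For example, `dayTypeColor(mondayIndex(someDate))` gives the fill color for a point recorded on `someDate`. With only two categories an ordinal scale would also work; the threshold form is convenient when the day is stored as a number.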
I still need to improve my axis labelling and scale lengths. I am also working on an alternate representation that relies on different shapes: it puts weather on the x axis and uses shapes for the different times of day. I plan to include both in my final project to enable a better understanding of the data.

Challenge: One major challenge I encountered in this project was depicting my overlapping data clearly. It was difficult to show the relationship of price with three other factors at once. However, by combining a box-and-whisker plot with a scatter plot, I could overcome this difficulty and give my visualisation a clear form.
Conclusion: My visualisation supports my hypothesis: at a peak time such as 9 am (usually office hours), which is depicted by squares, the Uber price is higher than at other times of day. My visualisation also suggests that when the temperature is low, between 30 and 40 degrees Fahrenheit, Uber prices go up.