Hourly Air Temperature data for US state of New Hampshire

weather

#1

Hi
I am new to R and need some help with a data visualization. I want to graph the hourly air temperature for the state of New Hampshire for all available historical dates. I have data for visits to commercial outlets (incl. lat/long) and want to see how hourly visit numbers are affected by temperature/weather. I plan to plot the hourly air temperature on a line graph, overlay that with the hourly visits to the commercial outlet, and show whether there is a relationship. My initial research brought me to this link https://ropensci.org/blog/blog/2017/04/04/gsodr which helped me understand the approach. Please let me know if you have any insights on this query. I appreciate any help you can give.
Thanks
Brendan


#2

Thanks for your question @brendan.gallagher

I’m definitely not the best one to ask about viz questions, but do @adamhsparks or @geanders or @jonmcalder have any thoughts on this?


#3

Without actually seeing any of the data, I’d suggest a scatterplot of visits for each location with the corresponding weather data from the closest station to start and look for a pattern there, e.g. x-axis = visits, y-axis = temperature.
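As a rough sketch of that first step, the base R snippet below matches each outlet to its closest station by great-circle distance and then draws the scatterplot. Everything in it is an assumption for illustration: the outlet and station names, the coordinates, and the simulated temperature and visit values are made up, and `haversine_km` is a hypothetical helper, not a function from any package.

```r
# Hypothetical outlets and weather stations (names and coordinates are made up).
outlets <- data.frame(
  outlet = c("A", "B"),
  lat = c(43.20, 44.05),
  lon = c(-71.54, -71.13),
  stringsAsFactors = FALSE
)
stations <- data.frame(
  station = c("S1", "S2", "S3"),
  lat = c(43.19, 42.93, 44.27),
  lon = c(-71.50, -71.44, -71.30),
  stringsAsFactors = FALSE
)

# Great-circle (haversine) distance in km from one point to many points.
haversine_km <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 6371 * asin(sqrt(pmin(1, a)))
}

# For each outlet, keep the name of the station with the smallest distance.
outlets$nearest <- vapply(seq_len(nrow(outlets)), function(i) {
  d <- haversine_km(outlets$lat[i], outlets$lon[i], stations$lat, stations$lon)
  stations$station[which.min(d)]
}, character(1))

# Simulated hourly data for one outlet and its nearest station.
set.seed(1)
temp_c <- runif(48, min = -5, max = 25)
visits <- rpois(48, lambda = 20 + 0.5 * temp_c)

# Scatterplot with visits on the x-axis and temperature on the y-axis.
plot(visits, temp_c,
     xlab = "Hourly visits", ylab = "Air temperature (deg C)")
```

With real data you would replace the simulated vectors with the visit counts and the temperature readings joined on the hourly timestamp.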

If you wish to plot temperature as a line, I’d do that and plot visits as points. This would most likely mean a dual-axis plot, which may or may not be desirable depending on your point of view.
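A minimal base R sketch of the line-plus-points idea, assuming simulated data (the dates, temperatures, and visit counts below are invented). It overlays a second plot with `par(new = TRUE)` and puts the visits scale on the right-hand axis:

```r
# Simulated 48 hours of temperature and visits (not real data).
set.seed(42)
hours <- seq(as.POSIXct("2017-07-01 00:00", tz = "UTC"),
             by = "hour", length.out = 48)
hour_of_day <- as.numeric(format(hours, "%H"))
temp_c <- 18 + 6 * sin(2 * pi * (hour_of_day - 9) / 24) + rnorm(48, sd = 0.5)
visits <- rpois(48, lambda = 15)

# Widen the right margin to make room for the second axis.
op <- par(mar = c(5, 4, 2, 4) + 0.1)

# Temperature as a line against the left axis.
plot(hours, temp_c, type = "l",
     xlab = "Time", ylab = "Air temperature (deg C)")

# Visits as points on the same panel, with their own scale on the right.
par(new = TRUE)
plot(hours, visits, pch = 16, col = "steelblue",
     axes = FALSE, xlab = "", ylab = "")
axis(side = 4)
mtext("Hourly visits", side = 4, line = 3)

# Restore the previous margin settings.
par(op)
```

The two y-scales are independent, so how closely the series appear to track each other depends on the axis ranges. That is the main reason dual-axis plots are contested.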


#4

I think the suggestions from @adamhsparks are a good starting point. I had similar thoughts but it’s hard to know what to suggest without seeing the data - explore the data and try different plots to see where that takes you.

Maybe you can also look at things like minimum and peak temperatures as well as temperature ranges (over different time periods) - these might highlight different trends than what you can see in the raw data. My guess is that if you look at hourly data both temp and visits will each have some inherent periodicity which could either mask or enhance potential correlation depending on the lags involved. That might start getting you into time series territory - which is probably a good thing since time is implicitly key to the problem you’re looking at, but could get complicated.
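To make that a bit more concrete, here is a hedged sketch in base R using simulated data (two weeks of invented hourly values in which visits are built to follow temperature with a 2-hour delay). It computes daily minimum, maximum, and range, then uses `ccf()` from the `stats` package to look at the correlation at different lags:

```r
# Simulated two weeks of hourly data (not real observations).
set.seed(7)
n <- 24 * 14
hours <- seq(as.POSIXct("2017-07-01 00:00", tz = "UTC"),
             by = "hour", length.out = n)
hour_of_day <- as.numeric(format(hours, "%H"))
temp_c <- 18 + 6 * sin(2 * pi * (hour_of_day - 9) / 24) + rnorm(n, sd = 1)

# Assumption for the simulation: visits depend on temperature 2 hours earlier.
lagged_temp <- c(rep(temp_c[1], 2), head(temp_c, -2))
visits <- rpois(n, lambda = pmax(1, 10 + 0.8 * lagged_temp))

# Daily summaries: minimum, maximum, and range of temperature; total visits.
day <- as.Date(hours, tz = "UTC")
daily <- data.frame(
  day = sort(unique(day)),
  t_min = as.numeric(tapply(temp_c, day, min)),
  t_max = as.numeric(tapply(temp_c, day, max)),
  visits = as.numeric(tapply(visits, day, sum))
)
daily$t_range <- daily$t_max - daily$t_min

# Cross-correlation of the hourly series for lags of -12 to +12 hours.
# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t],
# so a peak at a negative lag suggests temperature leads visits.
cc <- ccf(temp_c, visits, lag.max = 12, plot = FALSE)
best_lag <- cc$lag[which.max(cc$acf)]
print(head(daily))
print(best_lag)
```

Because both series share a daily cycle, the peak in the cross-correlation tends to be broad, which is the periodicity issue mentioned above. Removing the daily pattern before correlating is one way to address it, but that goes beyond this sketch.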

I’m not sure if the above helps though - it’s entirely possible that you were instead looking for some practical steps on how to get going with plotting in R? If that’s the case I’d recommend something like the exploratory data analysis chapter in R for Data Science and/or the R graphics cookbook resources.