Hi all, first post here and new to rOpenSci and in love with rnoaa. I’m teaching time series and spatial stats this quarter and rnoaa has opened up an incredible wealth of new data. I maintain a much smaller package and know all too well how hard it is and how often users confound you with unorthodox applications. So a thanks to start.
Here is what I want to do: make a spatial points data frame (SPDF) of mean annual temperature (MAT) and total annual precipitation (TAP) for the last climate normal for all the NCDC stations in the Pacific Northwest. I thought I would do this via dplyr::filter through state code (WA, OR, ID) or by a bounding box in sp. The final SPDF would have a data slot with two values in it for all the points: MAT and TAP. I kind of thought I would try something like this to get the stations:
library(rnoaa)
library(dplyr)
library(sp)
sta <- ncdc_stations()
sta_sp <- sta$data
coordinates(sta_sp) <- ~longitude+latitude # x (longitude) first, then y (latitude)
plot(sta_sp) # only 25 points -- the first page of the stations
But I see that sta only contains the first page (25 points) of some 1.3E5 stations. And the call below returns some 3.2E6 records!
sta_norms <- ncdc(datasetid = "NORMAL_ANN", startdate = "2010-01-01",
                  enddate = "2010-01-01")
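One way to keep the request small might be to ask the API for one state at a time rather than for everything. The following is only a minimal sketch, not tested against the live service: it assumes that ncdc_stations() accepts a locationid of the form "FIPS:53" and a limit of up to 1000, and that a NOAA token is already configured. The helper get_state_stations and the object names are hypothetical.

```r
library(rnoaa)
library(dplyr)

# FIPS state codes: Washington = 53, Oregon = 41, Idaho = 16.
pnw_fips <- c(WA = "FIPS:53", OR = "FIPS:41", ID = "FIPS:16")

# Hypothetical helper: request the stations in one state that report
# annual normals. limit = 1000 is assumed to be the largest page the
# API accepts in a single request.
get_state_stations <- function(locationid, limit = 1000) {
  res <- ncdc_stations(datasetid = "NORMAL_ANN",
                       locationid = locationid,
                       limit = limit)
  res$data
}

# Only call the API when a token is available (rnoaa is assumed to look
# for the "noaakey" option or the NOAA_KEY environment variable).
have_token <- nzchar(getOption("noaakey", Sys.getenv("NOAA_KEY")))
if (have_token) {
  # One data frame for all three states, with a "state" column from the names.
  sta_pnw <- bind_rows(lapply(pnw_fips, get_state_stations), .id = "state")
  nrow(sta_pnw)
}
```

If a state turned out to have more stations than fit in one page, the offset argument would presumably be needed to fetch the remaining pages.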
So, I can see why the calls from rnoaa are conservative. You can’t have people querying millions of points like a dummy (looks in mirror). So clearly I need some help. My initial idea was to filter the data by state or by a bounding box but I’m unsure of how to best proceed. Any suggestions appreciated and sorry for not having more code to fuss with in my example. I’m not sure how to get started. I’ve been flamed by Ripley in my youth so I can handle being told I’m over my head.
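To make the bounding-box idea concrete, here is a minimal offline sketch of the last step: filtering a station table with dplyr::filter and promoting it to an SPDF with sp. The station ids, coordinates, MAT and TAP values, and the box limits are all made up for illustration; a real table would come from rnoaa.

```r
library(dplyr)
library(sp)

# Hypothetical stand-in for a station table; every value here is made up.
stations <- data.frame(
  id        = c("STN_A", "STN_B", "STN_C", "STN_D"),
  longitude = c(-122.3, -116.2, -123.0, -95.4),
  latitude  = c(47.6, 43.6, 44.9, 29.8),
  MAT       = c(11.3, 11.1, 11.6, 20.9),  # mean annual temperature
  TAP       = c(950, 300, 1000, 1260)     # total annual precipitation
)

# Keep only stations inside a rough Pacific Northwest box (assumed limits).
pnw <- stations %>%
  filter(longitude >= -125, longitude <= -110.5,
         latitude >= 42, latitude <= 49)

# Promote to a SpatialPointsDataFrame: x (longitude) first, then y (latitude).
pnw_sp <- pnw
coordinates(pnw_sp) <- ~longitude + latitude

# The data slot keeps the remaining columns, including MAT and TAP.
names(pnw_sp@data)
```

The fourth made-up station falls outside the box, so the resulting object holds three points.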
Thanks in advance.