Hi there,
Using the HOMR page on the NOAA website, I downloaded a list of all US weather stations. I then dropped any inactive stations and any stations whose IDs don't start with US1, and merged the result with a FIPS code dataset. My plan is to get rain data for one day for every station and then collapse it to the county level.
Here is my R code so far and the Stack Overflow link:
Grabbing the necessary column
df <- dataframe$ghcnd
This gives me an output like:
[1] "GHCND:US1AKAB0058" "GHCND:US1AKAB0015" "GHCND:US1AKAB0021" "GHCND:US1AKAB0061"
[5] "GHCND:US1AKAB0055" "GHCND:US1AKAB0038" "GHCND:US1AKAB0051" "GHCND:US1AKAB0052"
[9] "GHCND:US1AKAB0060" "GHCND:US1AKAB0065" "GHCND:US1AKAB0062" "GHCND:US1AKFN0016"
[13] "GHCND:US1AKFN0018" "GHCND:US1AKFN0015" "GHCND:US1AKFN0011" "GHCND:US1AKFN0013"
[17] "GHCND:US1AKFN0030" "GHCND:US1AKJB0011" "GHCND:US1AKJB0014" "GHCND:US1AKKP0005"
[21] "GHCND:US1AKMS0011" "GHCND:US1AKMS0019" "GHCND:US1AKMS0012" "GHCND:US1AKMS0020"
[25] "GHCND:US1AKMS0018" "GHCND:US1AKMS0014" "GHCND:US1AKPW0001" "GHCND:US1AKSH0002"
[29] "GHCND:US1AKVC0006" "GHCND:US1AKWH0012" "GHCND:US1AKWP0001" "GHCND:US1AKWP0002"
[33] "GHCND:US1ALAT0014" "GHCND:US1ALAT0013" "GHCND:US1ALBW0095" "GHCND:US1ALBW0087"
[37] "GHCND:US1ALBW0020" "GHCND:US1ALBW0066" "GHCND:US1ALBW0031" "GHCND:US1ALBW0082"
[41] "GHCND:US1ALBW0099" "GHCND:US1ALBW0040" "GHCND:US1ALBW0004" "GHCND:US1ALBW0085"
[45] "GHCND:US1ALBW0009" "GHCND:US1ALBW0001" "GHCND:US1ALBW0094" "GHCND:US1ALBW0013"
[49] "GHCND:US1ALBW0079" "GHCND:US1ALBW0060"
In reality, I have about 22,000 weather stations. This is just showing the first 50.
rnoaa code
library(rnoaa)
options("noaakey" = Sys.getenv("noaakey"))
Sys.getenv("noaakey")
weather <- ncdc(datasetid = 'GHCND', stationid = df, var = 'PRCP',
                startdate = "2020-05-30", enddate = "2020-05-30",
                add_units = TRUE)
Which produces the following error:
Error: Request-URI Too Long (HTTP 414)
When I subset df to just, say, the first 100 entries, I can't get data for more than the first 25, even though the package details say I should be able to run 10,000 queries a day.
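Two things I am considering from the rnoaa documentation: as far as I can tell, ncdc() has a limit argument that defaults to 25 (with a maximum of 1000), which might explain the 25-row cap, and the data type filter is called datatypeid rather than var. Below is a minimal sketch that splits the station IDs into small batches so the request URL stays short. The names chunk_ids and fetch_prcp_batch and the batch size of 50 are my own placeholders and guesses, not part of the package, and I have not confirmed how many IDs fit in one request.

```r
# Split a character vector of station IDs into batches of at most `size`.
chunk_ids <- function(ids, size = 50) {
  stopifnot(size >= 1)
  if (length(ids) == 0) return(list())
  unname(split(ids, ceiling(seq_along(ids) / size)))
}

# Hypothetical helper: request one day of PRCP for one batch of stations.
# Needs the rnoaa package and an API token in options("noaakey").
fetch_prcp_batch <- function(ids, date) {
  res <- rnoaa::ncdc(
    datasetid  = "GHCND",
    datatypeid = "PRCP",
    stationid  = ids,
    startdate  = date,
    enddate    = date,
    limit      = 1000,   # default is 25; 1000 is the documented maximum
    add_units  = TRUE
  )
  res$data
}

# Usage (not run): one request per batch, then stack the results.
# batches <- chunk_ids(df, size = 50)
# weather <- do.call(rbind, lapply(batches, fetch_prcp_batch, date = "2020-05-30"))
```

With about 22,000 stations and 50 per batch, this would be roughly 440 requests, which is well under the 10,000-per-day figure.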
Loop Attempt
for (i in 1:length(df)) {
  weather2 <- ncdc(datasetid = 'GHCND', stationid = df1[1], var = 'PRCP',
                   startdate = '2020-06-30', enddate = '2020-06-30',
                   add_units = TRUE)
}
But this just produces the warning "Sorry, no data found".
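Looking at the loop again, it requests df1[1] on every pass instead of df[i], and weather2 is overwritten each time, so at best only the last result would survive. Here is a sketch of the pattern I think I need, which visits each station and keeps every result. The names collect_results, fetch_one and fake_fetch are hypothetical; fetch_one stands in for whatever wraps the ncdc() call, and fake_fetch only exists so the pattern can be tried offline.

```r
# Apply `fetch_one` to every station ID and keep each result.
# Stations that raise an error or return no rows are skipped.
# Returns NULL if nothing usable comes back.
collect_results <- function(ids, fetch_one) {
  results <- lapply(ids, function(id) {
    tryCatch(fetch_one(id), error = function(e) NULL)
  })
  keep <- vapply(results,
                 function(x) is.data.frame(x) && nrow(x) > 0,
                 logical(1))
  do.call(rbind, results[keep])
}

# Stand-in for the real request (made-up values, not NOAA data).
fake_fetch <- function(id) {
  data.frame(station = id, value = 0, stringsAsFactors = FALSE)
}

out <- collect_results(c("GHCND:US1AKAB0058", "GHCND:US1AKAB0015"), fake_fetch)

# Real use (not run), one request per station:
# weather2 <- collect_results(df, function(id) {
#   rnoaa::ncdc(datasetid = "GHCND", datatypeid = "PRCP", stationid = id,
#               startdate = "2020-05-30", enddate = "2020-05-30",
#               add_units = TRUE)$data
# })
```

With about 22,000 stations, one request per station would exceed the 10,000-queries-a-day figure, so sending several stations per request is probably still necessary.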
If anyone could give advice on what to try next, that would be great.