@sckott and @hrbrmstr, thank you both for responding so quickly. Apologies I could not respond earlier.
No, sir, this is public domain data.
I have since learned from a NOAA meteorologist that both options exist and that one (FTP) may make it a bit faster to get data.
I don’t want to cache segments of the site as that would just trap any errors. Unless I’m misunderstanding you.
I considered this viable for storing the raw text and zip files. I’m unsure of the cost, and I really have no idea of ballpark metrics to even get a reasonable estimate. But I could certainly do that and CF.
The problem I have with that is, as I understand it, giving public access to buckets shows private buckets as well (names, not the data). That bugs me for some reason.
Does SQLite store zips?
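On the SQLite question: as far as I can tell, SQLite has a BLOB type that holds arbitrary bytes, so a zip file could be stored as-is. A minimal sketch in Python (used here only for brevity; the table and column names are made up for illustration, and the zip is a small in-memory stand-in for a real download):

```python
import io
import sqlite3
import zipfile

# Build a small zip in memory to stand in for a downloaded archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("advisory.txt", "placeholder advisory text")
zip_bytes = buf.getvalue()

# Hypothetical schema: one row per archived file, raw bytes in a BLOB column.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE archive (name TEXT PRIMARY KEY, payload BLOB)")
con.execute("INSERT INTO archive VALUES (?, ?)", ("example.zip", zip_bytes))

# Read the bytes back and open them as a zip again.
(stored,) = con.execute(
    "SELECT payload FROM archive WHERE name = ?", ("example.zip",)
).fetchone()
with zipfile.ZipFile(io.BytesIO(stored)) as zf:
    restored_text = zf.read("advisory.txt").decode("utf-8")
```

Whether this is a good idea for large files is a separate question; the sketch only shows that the round trip works.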
I explored the data.world route prior to rrricanesdata; I don’t believe it would help much on the current path due to size limitations.
So, the issue recently has been the GIS section of the website being changed or down. This wipes out half the functionality of rrricanes. However, I could drop dependency on the GIS to an extent and build in the functionality; a “build your own GIS”.
For example, there are two datasets that can already be built from the advisory products: spatial point and spatial line dataframes (past track and forecast). So, technically, I do not need those types of GIS datasets.
Another is the forecast cone; it is built from standard values and not based on dynamic input. So, it could be calculated and drawn on the fly (and much faster than downloading and moving to ggplot).
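A rough sketch of the cone idea, assuming (as I understand it) that the cone is formed from circles of a fixed radius per forecast hour placed along the forecast track. The radii below are placeholders, not the published NHC values, and the function names are hypothetical. Python is used only for brevity.

```python
import math

# Placeholder radii in nautical miles keyed by forecast hour.
# These are NOT the published NHC values; substitute the real table.
PLACEHOLDER_RADII_NM = {12: 30.0, 24: 50.0, 36: 70.0}


def circle_points(lat, lon, radius_nm, n=36):
    """Approximate a circle around (lat, lon) as n (lat, lon) points.

    Uses 1 degree of latitude ~= 60 nautical miles and scales longitude
    by cos(latitude); fine for a sketch, poor at high latitudes.
    """
    points = []
    for i in range(n):
        theta = 2.0 * math.pi * i / n
        dlat = (radius_nm / 60.0) * math.sin(theta)
        dlon = (radius_nm / (60.0 * math.cos(math.radians(lat)))) * math.cos(theta)
        points.append((lat + dlat, lon + dlon))
    return points


def cone_circles(forecast):
    """forecast: list of (hour, lat, lon). Returns {hour: circle points}."""
    return {
        hour: circle_points(lat, lon, PLACEHOLDER_RADII_NM[hour])
        for hour, lat, lon in forecast
    }


circles = cone_circles([(12, 25.0, -80.0), (24, 26.0, -81.0)])
```

The outline of the cone would then be the envelope around these circles; that step is left out here.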
Watches and warnings are trickier; that text exists within the advisories but is not parsed (and is inconsistent from year to year). While it could theoretically be done, I’m not sure there is an advantage one way or another.
Other GIS datasets cannot be built “on the fly”; storm surge products, wind speed probabilities (more detailed than the text product). These require data not available in rrricanes.
I feel like I’m going down a bit of a rabbit hole here. In short, my thinking is to archive the data in a more structured, consistent format. If the last advisory GIS is unavailable, you still have the previous one; as it stands now, you could have access to none of it if the NHC site breaks again.
Archive the data - but then if you’re going to do that, why not go ahead and parse the data (including GIS)? Move it into CSV or DB format. That makes accessing data much faster, which removes the need for (or modifies the purpose of) rrricanesdata.
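To make the fallback idea concrete, here is a minimal sketch of an archive where a missing GIS product for the latest advisory falls back to the most recent one that was saved. The schema, storm key, and function name are all hypothetical, and Python with SQLite is used only for brevity.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical schema: one row per storm advisory; gis is NULL when the
# GIS product could not be retrieved.
con.execute(
    "CREATE TABLE advisories ("
    " storm_key TEXT, adv INTEGER, text_product TEXT, gis BLOB,"
    " PRIMARY KEY (storm_key, adv))"
)
rows = [
    ("STORM01", 1, "advisory 1 text", b"gis-1"),
    ("STORM01", 2, "advisory 2 text", b"gis-2"),
    ("STORM01", 3, "advisory 3 text", None),  # GIS section was down
]
con.executemany("INSERT INTO advisories VALUES (?, ?, ?, ?)", rows)


def latest_gis(con, storm_key):
    """Return (adv, gis) for the newest advisory that has archived GIS data."""
    return con.execute(
        "SELECT adv, gis FROM advisories"
        " WHERE storm_key = ? AND gis IS NOT NULL"
        " ORDER BY adv DESC LIMIT 1",
        (storm_key,),
    ).fetchone()
```

With the rows above, `latest_gis(con, "STORM01")` skips advisory 3 and returns advisory 2, and an unknown storm key returns `None`.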
The same goes for recon data and forecast model data, when added. And this is ultimately where I’m becoming concerned. Bob, you’re correct;
But I feel like the two tools as they stand now, while good, are too heavily dependent on what I know is an unreliable source. And I have to overcome that.