Mosaicking many large rasters without waiting for eternity?

geospatial

#1

Hello folks,

Just looking for some general advice here.

I’m working with Landsat 5, 7 and 8 imagery doing NDVI trend analysis. Currently, I’m trying to mosaic scenes by year but running into a serious memory barrier. I have a 64-bit Windows machine but unfortunately I’m confined to 8 GB of RAM. When I try to mosaic even two scenes I run into a “Cannot allocate vector of __GB” error. I understand that I can get around this error by changing the ‘chunksize’ and ‘maxmemory’ parameters in rasterOptions(), but doing so greatly increases processing time, especially considering that some years’ mosaics will have to be built from 100+ scenes. I am also familiar with gdalUtils::mosaic_rasters, which is faster, but I am mosaicking based on the median value, which is not an option with that function (as far as I can tell).

Is anyone aware of a method for mosaicking in parallel or otherwise speeding up the process? Unfortunately getting more RAM is not an option at the moment. Happy to share more information about my workflow if it helps at all.

Thanks in advance,
Ryan


#2

Can you post some details about a couple of representative files? Presumably these scenes are all differently aligned and not part of some broader grid(?), so the mosaicking is holus-bolus, essentially defining a new common grid and warping all the inputs to that? I’d be using gdal_merge.py or gdalwarp (perhaps after exploration with gdalbuildvrt) at the command line rather than in R for this (but then I tend not to merge, so my experience is limited).
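
For anyone unsure what "defining a new common grid" means in practice, here is a minimal sketch in plain Python. It assumes every scene shares one projection and one pixel size, and it only works out the target extent and dimensions; the warping itself would still be done by a tool such as gdalwarp. The function name `common_grid` and the example extents are made up for illustration.

```python
import math

def common_grid(extents, res):
    """Union of (xmin, ymin, xmax, ymax) extents, snapped outward to multiples of res.

    Assumes all extents are in the same projected coordinate system.
    """
    xmin = math.floor(min(e[0] for e in extents) / res) * res
    ymin = math.floor(min(e[1] for e in extents) / res) * res
    xmax = math.ceil(max(e[2] for e in extents) / res) * res
    ymax = math.ceil(max(e[3] for e in extents) / res) * res
    n_cols = round((xmax - xmin) / res)
    n_rows = round((ymax - ymin) / res)
    return (xmin, ymin, xmax, ymax), n_rows, n_cols

# Hypothetical extents of two overlapping scenes whose pixel grids are offset
# from each other by half a pixel (30 m pixels, as for Landsat).
scene_extents = [
    (500010.0, 4899990.0, 500100.0, 4900080.0),
    (500055.0, 4900035.0, 500160.0, 4900125.0),
]
grid, n_rows, n_cols = common_grid(scene_extents, res=30.0)
```

Every input is then resampled onto that one grid, so that a given row and column refers to the same ground location in every scene.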


#3

@mdsumner You identified the circumstances 100% correctly. I was indeed defining a new common grid (a raster version of the area of interest with the same resolution as the scenes) and warping the scenes to it. Then, using those resampled scenes, I was trying to construct a yearly mosaic. However, given the RAM limitations of my machine, it was going to take a lifetime to mosaic a year’s worth of scenes (some years have 100+ scenes in the time period).

Then I discovered Google Earth Engine (https://earthengine.google.com/) which is capable of collecting the required scenes, applying masks and mosaicking to create composites, all in a tiny fraction of the time it would take via R or gdal. Highly suggest it to anyone doing similarly large-scale Landsat/MODIS/Sentinel analysis.


#4

Glad to hear you have found a solution.

Let me just chime in anyway and second @mdsumner on using the gdal suite. I am no expert but I did have to deal with a similar situation. I needed to clip nightlight tiles, each about 3GB and a total of 6 per scene, to country boundaries. So where countries straddled more than one tile I needed to mosaic all intersecting tiles and clip. Only I couldn’t with my 8GB (same as OP).

So I ended up switching the logic, first cropping each intersecting tile individually to the region of interest and then merging/mosaicing the results.

Finally, to mask the rasters, I applied gdalwarp to output to VRT format, followed by gdal_translate to save to TIFF. Both are available through the gdalUtils package. This, I must say, ended up being extremely fast and quite memory efficient. Not very scientific, but previously my computer would hang. I can try to dig up numbers if that would help.
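
To make the crop-first ordering concrete, here is a rough sketch in plain Python. It is not the gdalUtils workflow itself: tiles are small nested lists placed on a shared pixel grid, and the names `intersect`, `crop_tile` and `crop_then_merge` are hypothetical. It shows why the approach saves memory: only the part of each tile that overlaps the region of interest is ever kept, and the merged output is only as large as that region.

```python
def intersect(a, b):
    """Intersection of two (row0, row1, col0, col1) windows, or None if disjoint."""
    r0, r1 = max(a[0], b[0]), min(a[1], b[1])
    c0, c1 = max(a[2], b[2]), min(a[3], b[3])
    return (r0, r1, c0, c1) if r0 < r1 and c0 < c1 else None

def crop_tile(tile, roi):
    """Crop one tile to the ROI; returns (window, pixels) in shared grid coordinates."""
    rows, cols = len(tile["data"]), len(tile["data"][0])
    extent = (tile["row"], tile["row"] + rows, tile["col"], tile["col"] + cols)
    win = intersect(extent, roi)
    if win is None:
        return None
    r0, r1, c0, c1 = win
    pixels = [row[c0 - tile["col"]:c1 - tile["col"]]
              for row in tile["data"][r0 - tile["row"]:r1 - tile["row"]]]
    return win, pixels

def crop_then_merge(tiles, roi):
    """Crop each tile to the ROI first, then paste the pieces into an ROI-sized output."""
    out = [[None] * (roi[3] - roi[2]) for _ in range(roi[1] - roi[0])]
    for tile in tiles:
        cropped = crop_tile(tile, roi)
        if cropped is None:
            continue  # tile does not touch the region of interest
        (r0, r1, c0, c1), pixels = cropped
        for i, row in enumerate(pixels):
            out[r0 - roi[0] + i][c0 - roi[2]:c1 - roi[2]] = row
    return out

# Two 4x4 tiles side by side; the region of interest straddles their boundary.
left = {"row": 0, "col": 0, "data": [[1] * 4 for _ in range(4)]}
right = {"row": 0, "col": 4, "data": [[2] * 4 for _ in range(4)]}
roi = (1, 3, 2, 6)  # rows 1-2, columns 2-5 of the shared grid
merged = crop_then_merge([left, right], roi)
```

With real tiles the cropping step would read only the overlapping window from disk, which is where the saving over merging full 3GB tiles comes from.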


#5

@rywhale, your yearly aggregation (median) should be performed on tiles in order to minimize RAM consumption. The mosaicking should be the last spatial operation. Have a look at PostGIS raster for such operations. Using the query planner, you can easily avoid the RAM limitation.
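
As an illustration of the tile-first idea (not PostGIS, just a minimal sketch in plain Python), the per-pixel median can be computed one small window at a time, so that only one tile from each scene is in memory at once. The sketch assumes the scenes are already aligned to a common grid and that missing pixels (cloud, outside the footprint) are stored as None; `read_window` stands in for a windowed read from disk, and all function names are hypothetical.

```python
from statistics import median

NODATA = None  # assumption: missing pixels are stored as None

def iter_tiles(n_rows, n_cols, tile_size):
    """Yield (row0, row1, col0, col1) windows that cover the whole grid."""
    for r0 in range(0, n_rows, tile_size):
        for c0 in range(0, n_cols, tile_size):
            yield r0, min(r0 + tile_size, n_rows), c0, min(c0 + tile_size, n_cols)

def read_window(scene, r0, r1, c0, c1):
    """Stand-in for a windowed read from disk: return only the requested block."""
    return [row[c0:c1] for row in scene[r0:r1]]

def median_tile(blocks):
    """Per-pixel median across same-shaped blocks, ignoring NODATA."""
    n_rows, n_cols = len(blocks[0]), len(blocks[0][0])
    out = []
    for r in range(n_rows):
        out_row = []
        for c in range(n_cols):
            values = [b[r][c] for b in blocks if b[r][c] is not NODATA]
            out_row.append(median(values) if values else NODATA)
        out.append(out_row)
    return out

def median_composite(scenes, tile_size):
    """Build the composite tile by tile; placing tiles in the output is the last step."""
    n_rows, n_cols = len(scenes[0]), len(scenes[0][0])
    result = [[NODATA] * n_cols for _ in range(n_rows)]
    for r0, r1, c0, c1 in iter_tiles(n_rows, n_cols, tile_size):
        blocks = [read_window(s, r0, r1, c0, c1) for s in scenes]
        tile = median_tile(blocks)
        for i, row in enumerate(tile):
            result[r0 + i][c0:c1] = row
    return result

# Three tiny made-up NDVI "scenes" on the same 3x3 grid.
scenes = [
    [[0.1, 0.2, None], [0.4, None, 0.6], [None, None, None]],
    [[0.3, None, None], [0.2, 0.5, 0.8], [None, 0.9, None]],
    [[0.5, 0.6, 0.7], [None, 0.1, 0.7], [None, 0.3, None]],
]
composite = median_composite(scenes, tile_size=2)
```

Because each pixel's median depends only on that pixel's values across scenes, the result does not depend on the tile size; the tile size only controls how much has to be held in memory at any moment.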