Merging DEM rasters
Using batch files and GDAL to merge rasters for performance.
I recently got a dataset made up of 2364 images files representing 1km x 1km tiles of 5m resolution Digital Terrain Model (DTM). There are also another 2364 files of 1km x 1km 2m resolution Digital Surface Model (DSM). I want to serve these with Geoserver which prefers fewer, larger files over more smaller files.
GDAL is my tool of choice for batch processing images and in this case I want to use gdalwarp or gdal_merge to create my larger tiles. The files are named logically based on the National Grid - NO4321.tif - and I can use this to create 10km x 10km tiles from the 1km x 1km tiles. If I take the first (4) and third (2) numbers after the grid square reference (NO) I get the 10km tile reference - NO42.
So, how do I automate this process to create my tiles?
First up I need some text files, one for each 10km tile, with the list of image tiles making up the larger tile. The following batch command has a nested loop that iterates through two sets of numbers (0 to 9) and substitutes those in the DIR command to find all image filenames that match the string (the ? represents a single character in the string):
FOR /L %%G IN (0,1,9) DO ( FOR /L %%H IN (0,1,9) DO (DIR /b /s NO_tiflzw\NO%%G?%%H?.tif > NO%%G%%H.txt ))
I end up with a directory of text files like NO00.txt, NO01.txt, NO02.txt, etc. Each text file has a list of the 1km tiles that make up the 10km tile.
gdal_merge and gdalwarp can take a text file list as an input so now we can loop through the list of text files and generate a 10km tile for each one:
FOR /F %%I IN (*.txt) DO gdal_merge -o NO_tifmerged%~nI.tif -of GTiff -co TFW=YES -co TILED=YES -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 -co COMPRESS=LZW -ot Float32 -v --optfile %%I
This creates an internally tiled, LZW compressed, floating point GeoTIFF with accompanying world file. If you want more control over the output raster try gdalwarp instead:
FOR /F %%I IN (*.txt) DO gdalwarp -s_srs EPSG:27700 -t_srs EPSG:27700 -ot Float32 -r average --config GDAL_CACHEMAX 500 -wm 500 -of GTiff -co TFW=YES -co TILED=YES -co BLOCKXSIZE=512 -co BLOCKYSIZE=512 -co COMPRESS=LZW --optfile %%I NO_tifwarped%~nI.tif
After this I can generate some stats for each image:
FOR /F %%G IN ('dir /b *.tif') DO gdalinfo %%G -stats
And some overviews with gdaladdo for performance:
FOR /F %%H IN ('dir /b *.tif') DO gdaladdo -r average %%H 2 4 8 16
*Remember, if you run this from the command line replace the "%%" with "%"
Now I have 35 tiles to serve instead of 2364 which should make Geoserver faster.
If you're running on Linux here are some simple Bash shell scripts which do the same as the Command shell loops above:
for i in {0..9}
do
for j in {0..9}
do
ls -Q NO/NO$i?$j?.asc > NO$i$j.txt
done
done
exit 0
for f in .txt;
do
gdalwarp -s_srs EPSG:27700 -t_srs EPSG:27700 -ot Float32 -r average
--config GDAL_CACHEMAX 500 -wm 500 -of GTiff -co TFW=YES -co TILED=YES
-co BLOCKXSIZE=512 -co BLOCKYSIZE=512 -co COMPRESS=LZW
--optfile $f NO_10k/"${f%.}".tif
done
exit 0