Make sure your have rgdal
, raster
and igraph
installed:
library('raster')
library('rgdal')
raster
The sp
package supports raster (grid) data with the SpatialGridDataFrame
and SpatialPixelsDataFrame
classes. However, the raster
package provides more specialized functionalities to raster data. Similar to the sp
package, the raster
package provides a number of classes and a number of functionalities to operate and interact with these classes. The RasterLayer
, RasterStack
and RasterBrick
classes are the most important.
Whereas sp
mainly focuses on the classes itself, the raster
package has functions for creating, reading, manipulating, and writing raster data. The package provides, among other things, general raster data manipulation functions that can easily be used to develop more specific functions.
When required, the conversion from Spatial***
raster representations to the raster
package raster representations is just a single command raster(sp_raster_variable_name)
away.
A 2D matrix is actually also a representation of a GIS raster, as it provides the possibility to do element-wise calculations:
example_matrix <- matrix(1:6, nrow = 2, ncol = 3)
example_matrix
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
Hence, a reclassification of the raster data can be achieved as follows:
example_matrix[example_matrix >= 3] <- 3.
example_matrix[example_matrix < 3] <- 0.
example_matrix
## [,1] [,2] [,3]
## [1,] 0 3 3
## [2,] 0 3 3
However, for GIS raster representations, the classes provided by the raster
package provide these type of operations while having the spatial extent
and the projection
integrated as well.
RasterLayer
A RasterLayer
object represents single-layer (variable) raster data. A RasterLayer
object always stores a number of fundamental parameters that describe it:
extent
(cfr. bbox
)To create a RasterLayer
manually, provide the required minimal information:
# specify the RasterLayer with the following parameters:
# - minimum x coordinate (left border)
# - minimum y coordinate (bottom border)
# - maximum x coordinate (right border)
# - maximum y coordinate (top border)
# - resolution (cell size) in each dimension
r_example <- raster(xmn = 4.42, ymn = 50.9, xmx = 5.4, ymx = 51.4,
resolution = c(0.1, 0.1))
r_example
## class : RasterLayer
## dimensions : 5, 10, 50 (nrow, ncol, ncell)
## resolution : 0.1, 0.1 (x, y)
## extent : 4.42, 5.42, 50.9, 51.4 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
From the output we learn that with the default parameters, the CRS is defined here in degrees. However, when it would not make sense, the CRS is default put to NA
:
raster(ncol = 36, nrow = 18, xmn = -1000, xmx = 1000, ymn = -100, ymx = 900)
## class : RasterLayer
## dimensions : 18, 36, 648 (nrow, ncol, ncell)
## resolution : 55.55556, 55.55556 (x, y)
## extent : -1000, 1000, -100, 900 (xmin, xmax, ymin, ymax)
## coord. ref. : NA
Consider the example r_example
defined in degrees and extract some additional information:
class(r_example)
## [1] "RasterLayer"
## attr(,"package")
## [1] "raster"
Indeed, we created a single layer raster layer…
res(r_example)
## [1] 0.1 0.1
with a resolution of 0.1 degrees, which we can change:
res(r_example) <- 0.05
r_example
## class : RasterLayer
## dimensions : 10, 20, 200 (nrow, ncol, ncell)
## resolution : 0.05, 0.05 (x, y)
## extent : 4.42, 5.42, 50.9, 51.4 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
However, the data currently only defines the skeleton of a raster data set. That is, it knows about its location, resolution, etc., but there are no values associated with it:
hasValues(r_example)
## [1] FALSE
Let’s provide some random numbers from a uniform distribution:
r_example <- setValues(r_example, runif(200, min = 1., max = 10.))
r_example
## class : RasterLayer
## dimensions : 10, 20, 200 (nrow, ncol, ncell)
## resolution : 0.05, 0.05 (x, y)
## extent : 4.42, 5.42, 50.9, 51.4 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : layer
## values : 1.00307, 9.942678 (min, max)
plot(r_example)
These values
can be sliced
, just as we would do with vectors and matrices:
values(r_example)[1:10]
## [1] 8.434002 1.673186 2.903345 8.621475 8.962932 6.403370 4.129534
## [8] 8.900143 4.838132 6.870831
An important feature of the raster
package is the option to work inMemory
or not:
inMemory(r_example)
## [1] TRUE
This defines the data itself is stored in the working memory of your computer. For manually created RasterLayers
, the data will be in memory. However, when accessing data from files, raster
support data access without actually loading all the data in memory.
A RasterLayer
objects can be created from scratch, a file, an Extent object, a matrix, an ‘image’ object, or from a Raster*
, Spatial***
,… object
A useful functionality is the conversion between XYZ columns spatial data representation and a raster representation for points on a regular grid:
xyz_example <- rasterToPoints(r_example)
head(xyz_example)
## x y layer
## [1,] 4.445 51.375 8.434002
## [2,] 4.495 51.375 1.673186
## [3,] 4.545 51.375 2.903345
## [4,] 4.595 51.375 8.621475
## [5,] 4.645 51.375 8.962932
## [6,] 4.695 51.375 6.403370
plot(rasterFromXYZ(xyz_example))
However, in many occasions these XYZ combination are not provided on a regular grid. See the rasterize
function for points that are not on a regular grid. An example of points that are not on a regular grid:
longitude <- c(4.4543, 5.02789, 4.94202, 4.49238, 4.49054, 4.54044, 5.95192,
4.49496, 4.4958, 5.68327, 4.49054, 4.49054, 4.4958, 4.48938)
latitude <- c(51.33847, 51.24824, 51.24325, 51.26218, 51.25701, 51.27518,
51.30803, 51.25803, 51.26106, 51.04185, 51.25701, 51.25701,
51.26106, 51.25955)
lonlat <- cbind(longitude, latitude)
head(lonlat)
## longitude latitude
## [1,] 4.45430 51.33847
## [2,] 5.02789 51.24824
## [3,] 4.94202 51.24325
## [4,] 4.49238 51.26218
## [5,] 4.49054 51.25701
## [6,] 4.54044 51.27518
Check the documentation of the rasterize
command provided by the raster
package and convert the lonlat
set of irregular points to a regular grid with a resolution of 0.1 degrees and an appropriate spatial extent
(e.g. Belgium). Make sure that the resulting raster values represent the number of points per grid cell. Make a plot of the resulting raster.
The output should look like:
RasterStack
and RasterBrick
Single layer raster objects are very common. Still, a collection of rasters with the same spatial extent and resolution is a common case when doing spatial analysis (combining rasters with representing each a variable). In fact, a RasterStack
/RasterBrick
is a collection of RasterLayer
objects with the same spatial extent and resolution.
The main difference between RasterStack
and RasterBrick
is that a RasterStack
is a loose collection of RasterLayer
objects that can refer to different files (but must all have the same extent and resolution), whereas a RasterBrick
can only point to a single file. A typical example of a RasterBrick
object is the representation of a multi-band satellite image.
As an example, create a dummy RasterStack
with 3 alternatives of the r_example
bundled together:
s_example <- stack(r_example, r_example**2, r_example/2.)
plot(s_example)
When checking the properties
s_example
## class : RasterStack
## dimensions : 10, 20, 200, 3 (nrow, ncol, ncell, nlayers)
## resolution : 0.05, 0.05 (x, y)
## extent : 4.42, 5.42, 50.9, 51.4 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0
## names : layer.1, layer.2, layer.3
## min values : 1.0030703, 1.0061501, 0.5015352
## max values : 9.942678, 98.856836, 4.971339
Make sure you notice the additional dimension
in the description (nlayers
).
To transform a RasterLayer
to another CRS you can use the function projectRaster
. However, whereas the transformation of coordinates (vector formats) is rather trivial, the re-projection of a raster representation can result in a loss of precision and estimates for the values of new cells must be made based on the values in the old cells. If the values are class data, the nearest neighbor is commonly used. Otherwise some sort of interpolation (e.g. bilinear) is required.
An introduction to the CRS
class definition to specify the coordinate reference system is provided in the 07-gis-r-vectors.Rmd
notebook. The definition of the CRS
object is completely similar:
crs_lambert <- CRS("+init=epsg:31370")
Using the CRS, the re-projection with the projectRaster
:
r_reproject <- projectRaster(r_example, crs = crs_lambert)
r_reproject
## class : RasterLayer
## dimensions : 12, 24, 288 (nrow, ncol, ncell)
## resolution : 3500, 5560 (x, y)
## extent : 146566.1, 230566.1, 171613.3, 238333.3 (xmin, xmax, ymin, ymax)
## coord. ref. : +init=epsg:31370 +proj=lcc +lat_1=51.16666723333333 +lat_2=49.8333339 +lat_0=90 +lon_0=4.367486666666666 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-106.8686,52.2978,-103.7239,0.3366,-0.457,1.8422,-1.2747 +units=m +no_defs
## data source : in memory
## names : layer
## values : 0.9594252, 9.641255 (min, max)
However, using the default parameters, we do not have control on the spatial resolution (dimensions) of the resulting reprojected raster. Setting the resolution provides more control:
r_reproject <- projectRaster(r_example, crs = crs_lambert,
res = 5000)
r_reproject
## class : RasterLayer
## dimensions : 13, 16, 208 (nrow, ncol, ncell)
## resolution : 5000, 5000 (x, y)
## extent : 148566.1, 228566.1, 172773.3, 237773.3 (xmin, xmax, ymin, ymax)
## coord. ref. : +init=epsg:31370 +proj=lcc +lat_1=51.16666723333333 +lat_2=49.8333339 +lat_0=90 +lon_0=4.367486666666666 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-106.8686,52.2978,-103.7239,0.3366,-0.457,1.8422,-1.2747 +units=m +no_defs
## data source : in memory
## names : layer
## values : 1.346314, 9.209351 (min, max)
But is generally advised to project a raster to another raster
object. By providing an existing RasterLayer
object, your newly projected data perfectly aligns with it.
# Create a base_layer to project data to
base_raster <- raster(ncol = 20, nrow = 10,
xmn = 153566.1, xmx = 223869.7,
ymn = 176630, ymx = 232773.3)
crs(base_raster) <- crs_lambert
# reproject to the new raster
r_reproject <- projectRaster(r_example, base_raster)
plot(r_reproject)
The function projectExtent
is a great utility to only project the extent
of the data and retrieve similar boundaries in the new CRS.
rgdal
or raster
First unzip the .tif
example data in the data file to the scratch folder:
unzip('../data/NE1_50m_SR.zip', exdir = "../scratch")
Note that in most cases where real data is analyzed, these raster*
objects are created from a file. The functionality to read raster data is provided by the readGDAL
function of the rgdal
package:
r_data = readGDAL("../scratch/NE1_50M_SR/NE1_50M_SR.tif")
## ../scratch/NE1_50M_SR/NE1_50M_SR.tif has GDAL driver GTiff
## and has 5400 rows and 10800 columns
class(r_data)
## [1] "SpatialGridDataFrame"
## attr(,"package")
## [1] "sp"
which results in a SpatialGridDataFrame
.
Actually, the set of data formats you can interact with, does not depend on R, but is dependent on your GDAL installation. To check if the installation supports a specific raster data format, you can get an overview of them by the gdalDrivers()
command:
head(gdalDrivers()) # Just the first 6 records are shown here
Most formats supported for reading can also be written to. This is supported by both writeGDAL
(rgdal
package) and writeRaster
(raster
package). Furthermore, for large rasters, the writeValues
function is useful as well, as it supports writing the data in chunks.
Still, the raster
command of the raster
package can also directly read raster files in several formats, also relying on the rgdal
package (and the underlying GDAL driver) behind the scenes.
Let’s check the space this SpatialGridDataFrame
requires in memory:
print(object.size(r_data) , units = "auto")
## 667.4 Mb
and converting to a RasterLayer
with the raster
command:
r_data <- raster(r_data)
class(r_data)
## [1] "RasterLayer"
## attr(,"package")
## [1] "raster"
Let’s check again the space this requires in memory after conversion to a raster
object:
print(object.size(r_data) , units = "auto")
## 222.5 Mb
Hence, for this type of raster GIS data, the raster
representation is less memory demanding. Moreover, when reading these files directly with the raster
command, the package will not load the data values in memory, while extracting the spatial information (extent, crs, dimensions):
r_data_raster <- raster("../scratch/NE1_50M_SR/NE1_50M_SR.tif")
r_data_raster
## class : RasterLayer
## band : 1 (of 3 bands)
## dimensions : 5400, 10800, 58320000 (nrow, ncol, ncell)
## resolution : 0.03333333, 0.03333333 (x, y)
## extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## data source : C:\Users\stijn_vanhoey\Documents\GitHub\course_gis_scripting\scratch\NE1_50M_SR\NE1_50M_SR.tif
## names : NE1_50M_SR
## values : 0, 255 (min, max)
inMemory(r_data_raster)
## [1] FALSE
In the case of multi-layer files, the raster
function reads by default the first layer only. For multi-layer objects (e.g. multi-layer GeoTIFF
), the brick
or stack
functions can be used to read data from file:
landstat_example <- brick('../data/LE71700552001036SGS00_SR_Gewata_INT1U.tif')
landstat_example
## class : RasterBrick
## dimensions : 593, 653, 387229, 6 (nrow, ncol, ncell, nlayers)
## resolution : 30, 30 (x, y)
## extent : 829455, 849045, 825405, 843195 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=utm +zone=36 +ellps=WGS84 +units=m +no_defs
## data source : C:\Users\stijn_vanhoey\Documents\GitHub\course_gis_scripting\data\LE71700552001036SGS00_SR_Gewata_INT1U.tif
## names : LE7170055//ta_INT1U.1, LE7170055//ta_INT1U.2, LE7170055//ta_INT1U.3, LE7170055//ta_INT1U.4, LE7170055//ta_INT1U.5, LE7170055//ta_INT1U.6
## min values : 4, 6, 3, 18, 6, 2
## max values : 39, 56, 71, 102, 138, 408
plot(landstat_example)
(Overview about the landstat bands/layers wavelengths is given here )
For reading in raster file data, try with the raster
command first. If this is not successful, check if the readGDAL
function can read it.
In the data
folder, an Arc/Info Binary Grid
is provided, called grnt_bodem
. read in the data with the variable-name grnt_bodem
and make a plot using the bpy.colors
color scale.
(In the case your GDAL driver does not support Arc/Info Binary Grid, read in the grote_nete_bodem.tif
file in the data folder)
The output should look like:
Many generic functions that allow for simple and elegant raster algebra have been implemented for Raster*
objects, including the normal algebraic operators such as {}
, logical operators such as >
, >=
, <
, ==
, !
and functions like abs
, round
, ceiling
, floor
, trunc
, sqrt
, log
, log10
, exp
, cos
, sin
, atan
, tan
, max
, min
, range
, prod
, sum
, any
, all
. In these functions you can mix raster objects with numbers, as long as the first argument is a raster object.
plot(sqrt(r_example + 10)/2,
col = heat.colors(20))
Again, also boolean indexing
(conditional replacement) is provided, which supports the reclassification of a raster data set:
r_example[r_example >= 3] <- 3.
r_example[r_example < 3] <- 0.
plot(r_example)
The reclassify
function provides the same functionality, but as a higher level function. For larger reclassify operations (many classes), the application of reclassify
is worthwhile to use.
In the data
folder, a text file systemtable_example.txt
is available containing the information of a reclassification of the grnt_bodem
variable. Read the text-file as a data.frame
and save the content as variable class_table
. Use the class_table
to reclassify
the grnt_bodem
raster. Plot the output with any colors you would like to use.
The output could look like (depends on your color preferences):
To decide about the color you could use for maps, the colorbrewer website is a great start. You can copy/paste the color codes from the website itself, but the package scales
provides the brewer_pal
function, providing direct load of the colors as a colormap:
plot(grnt_bodem_reclassified, breaks = c(0, 1, 2, 3),
col = scales::brewer_pal(palette = "Greens")(3))
To perform calculations between the individual layers of a RasterBrick
or RasterStack
, the expression should refer to the individual layers of the object. Referring to individual layers in a RasterBrick
or RasterStack
object is done by using double square brackets [[]]
. As an example, the calculation of the NDVI:
\[NDVI = \frac{NIR - Red}{NIR + Red}\]
ndvi <- (landstat_example[[4]] - landstat_example[[3]]) / (landstat_example[[4]] + landstat_example[[3]])
plot(ndvi)
Although this is a quick way to perform the calculation, directly adding, subtracting, multiplying, etc, the layers of big raster objects is not recommended. When working with big objects, it is advisable to use the calc()
function to perform these types of calculations. The reason is that R needs to load all the data first into its internal memory before performing the calculation and then runs everything in one block. It is really easy to run out of memory when doing that. A big advantage of the calc()
function is that it has a built-in block processing option for any vectorized function, allowing such calculations to be fully “RAM friendly”. The example below illustrates how to calculate NDVI from the same date set using the calc() function.
## Define the function to calculate NDVI from
ndvi.calc <- function(x) {
ndvi <- (x[[4]] - x[[3]]) / (x[[4]] + x[[3]])
return(ndvi)
}
ndvi2 <- calc(x = landstat_example, fun = ndvi.calc)
plot(ndvi2)
Several high level functions (i.e. typical GIS software functions) have been implemented for RasterLayer
objects. Examples are contour()
, focal()
(moving window), clump()
(detect patches of connected cells), zonal()
(zonal statistics), terrain()
(slope, aspect and other terrain characteristics from dem), (dis)aggregate
(change resolution)… See the help files for more detailed descriptions of each function. (also calc
is actually a high level function).
These functions work equally well for raster data sets that cannot be loaded into memory!
As an example, converting the grnt_bodem
raster to a binary mask with 0/1 values can be achieved by the clump
function:
plot(clump(grnt_bodem), col = c("darkgreen"))
Another GIS function worthwhile to mention, is the clipping if a raster with a vector layer, which is provided by the function mask
:
library(rgdal)
deelbekkens <- readOGR("../data/deelbekkens/Deelbekken.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\stijn_vanhoey\Documents\GitHub\course_gis_scripting\data\deelbekkens\Deelbekken.shp", layer: "Deelbekken"
## with 102 features
## It has 8 fields
## Integer64 fields read as strings: UIDN OIDN
crs_wgs84 <- CRS("+init=epsg:4326")
deelbekkens_wgs84 <- spTransform(deelbekkens, crs_wgs84)
r_data_crop <- crop(r_data, deelbekkens_wgs84)
r_data_mask <- mask(r_data_crop, deelbekkens_wgs84)
plot(r_data_mask)
The gdalwarp
functionality to crop a vector file can also be executed directly with GDAL from the command line. As such, similar to the subprocess
trick in Python (05-the-power-of-gdal.ipynb
) to run a GDAL command within Python as if it would be on the command line, this can be achieved by using system()
(instead of subprocess
).
Actually, the plot
function provided by the raster
package (find specific help by executing ?raster::plot
in the console) provides a lot of functionality. Also other plot functions from the system R will work on the Raster*
objects:
hist(r_example)
For multi-band files, the plotRGB
provides the option to plot false color composites:
plotRGB(landstat_example, r = 5, g = 4, b = 3)
Another interesting package, is the rasterVis package. The package provides a set of methods for enhanced visualization and interaction. Specifically the space-time plots are well designed.