1 Setup

Make sure your have rgdal, rgeos, raster, ggmap, leaflet and sp installed:

library('rgdal')
library('sp')
library('raster')
library('rgeos')
library('leaflet')
library('ggmap')

2 Spatial (vector) data objects: sp

Package sp is the central package supporting spatial data analysis in R. sp defines a set of classes to represent spatial data. A class defines a particular data type. The data.frame is an example of a class. Any particular data.frame you create is an object (instance) of that class.

In fact, the sp package does not provide many functions to modify or analyse spatial data; but the classes it defines are used in >100 other R packages that provide specific functionalities. The classes sp defines are mostly starting with Spatial*. For vector data, the basic types are the SpatialPoints, SpatialLines, and SpatialPolygons. These classes only represent geometries (i.e. the spatial information). To link the spatial information to a data-table, classes are available with these names plus DataFrame, for example, SpatialPolygonsDataFrame and SpatialPointsDataFrame.

2.1 Geometries

Consider the following vectors with coordinate information:

longitude <- c(4.4543, 5.02789, 4.94202, 4.49238, 4.49054, 4.54044, 5.95192, 
               4.49496, 4.4958, 5.68327, 4.49054, 4.49054, 4.4958, 4.48938)
latitude <- c(51.33847, 51.24824, 51.24325, 51.26218, 51.25701, 51.27518, 
              51.30803, 51.25803, 51.26106, 51.04185, 51.25701, 51.25701, 
              51.26106, 51.25955)
lonlat <- cbind(longitude, latitude)

Besides the numbers itself, the variable lonlat is not aware of any spatial context (it is just a matrix object).

pts <- SpatialPoints(lonlat)
class(pts)
## [1] "SpatialPoints"
## attr(,"package")
## [1] "sp"

We converted the vector to an object of SpatialPoints, which contains, beside the matrix data, additional information:

showDefault(pts)
## An object of class "SpatialPoints"
## Slot "coords":
##       longitude latitude
##  [1,]   4.45430 51.33847
##  [2,]   5.02789 51.24824
##  [3,]   4.94202 51.24325
##  [4,]   4.49238 51.26218
##  [5,]   4.49054 51.25701
##  [6,]   4.54044 51.27518
##  [7,]   5.95192 51.30803
##  [8,]   4.49496 51.25803
##  [9,]   4.49580 51.26106
## [10,]   5.68327 51.04185
## [11,]   4.49054 51.25701
## [12,]   4.49054 51.25701
## [13,]   4.49580 51.26106
## [14,]   4.48938 51.25955
## 
## Slot "bbox":
##                min      max
## longitude  4.45430  5.95192
## latitude  51.04185 51.33847
## 
## Slot "proj4string":
## CRS arguments: NA

So we see that the object has the coordinates we supplied, but also a bbox. This is a ‘bounding box’, or the ‘spatial extent’ that was computed from the coordinates. There is also a proj4string. This stores the coordinate reference system (CRS). We did not provide the crs so it is unknown (NA). That is not good, so let’s recreate the object, and now provide a crs. We’ll come back to the projection definition later:

crs_wgs84 <- CRS("+init=epsg:4326")
pts <- SpatialPoints(lonlat, proj4string = crs_wgs84)
showDefault(pts)
## An object of class "SpatialPoints"
## Slot "coords":
##       longitude latitude
##  [1,]   4.45430 51.33847
##  [2,]   5.02789 51.24824
##  [3,]   4.94202 51.24325
##  [4,]   4.49238 51.26218
##  [5,]   4.49054 51.25701
##  [6,]   4.54044 51.27518
##  [7,]   5.95192 51.30803
##  [8,]   4.49496 51.25803
##  [9,]   4.49580 51.26106
## [10,]   5.68327 51.04185
## [11,]   4.49054 51.25701
## [12,]   4.49054 51.25701
## [13,]   4.49580 51.26106
## [14,]   4.48938 51.25955
## 
## Slot "bbox":
##                min      max
## longitude  4.45430  5.95192
## latitude  51.04185 51.33847
## 
## Slot "proj4string":
## CRS arguments:
##  +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84
## +towgs84=0,0,0

Hence, the projection information is provided and our geometry is aware of the spatial context.

REMEMBER:

Assigning a CRS is like labeling something. You need to provide the label that corresponds to the item. Not to what you would like it to be. The latter requires a transformation.

The definition of the geometry as an object of the SpatialPoints class, provides a number of spatial functionalities that are not directly available to default matrices. As said in the introduction, the sp class mainly focuses on the classes (objects) and not on the functionalities. However, an essential functionality provided by sp itself (apart from over and aggregate), is the transformation of the coordinate reference system (CRS):

crs_lambert <- CRS("+init=epsg:31370") 
pts_lambert <- spTransform(pts, crs_lambert)
showDefault(pts_lambert)
## An object of class "SpatialPoints"
## Slot "coords":
##       longitude latitude
##  [1,]  155961.2 225412.3
##  [2,]  196022.8 215574.7
##  [3,]  190031.5 214969.8
##  [4,]  158629.2 216928.2
##  [5,]  158501.7 216352.8
##  [6,]  161980.6 218381.2
##  [7,]  260392.8 223200.1
##  [8,]  158810.1 216466.8
##  [9,]  158868.1 216804.0
## [10,]  242186.7 193225.7
## [11,]  158501.7 216352.8
## [12,]  158501.7 216352.8
## [13,]  158868.1 216804.0
## [14,]  158420.3 216635.3
## 
## Slot "bbox":
##                min      max
## longitude 155961.2 260392.8
## latitude  193225.7 225412.3
## 
## Slot "proj4string":
## CRS arguments:
##  +init=epsg:31370 +proj=lcc +lat_1=51.16666723333333
## +lat_2=49.8333339 +lat_0=90 +lon_0=4.367486666666666
## +x_0=150000.013 +y_0=5400088.438 +ellps=intl
## +towgs84=-106.8686,52.2978,-103.7239,0.3366,-0.457,1.8422,-1.2747
## +units=m +no_defs

Other (geometric) operations are provided by (a huge amount of) additional R Packages. The rgeos package is an important package to check for these functionalities. As an example: adding a buffer to our SpatialPoints object:

# gbuffer from the rgeos package (see later)
pts_lambert_buffer <- gBuffer(pts_lambert, byid = TRUE, width = 2000) 

Hence, the result is again a Spatial* object, i.e. a SpatialPolygons object. A plot of the pts_lambert and the pts_lambert_buffered together looks as follows:

plot(pts_lambert_buffer, border = 'blue', 
     col = 'yellow', lwd = 3, axes = TRUE)
plot(pts_lambert, add = TRUE, col = 'red', las = 1)

For quick data exploration of such a small data set, an interactive visualization provides better insight. The leaflet package is well-documented and easy to use, certainly when the dplyr piping is familiar to use:

wgs_84 <- CRS("+init=epsg:4326")
leaflet() %>%  
    addTiles() %>%  # provide a default openstreetmap background layer to the image
    addCircles(data = pts)  %>% # Add the points to the map
    addPolygons(data = spTransform(pts_lambert_buffer, wgs_84)) # Add the buffer polygons to the map

Exercise:

Check the documentation of the leaflet Package and adapt the following items to the leaflet map:
(1) The coordinates should be replaced by an icon marker instead of just a circle
(2) The buffers should not be filled and have a red color.

The output should look like:

Notice:

You can add any available WMS service to the map, by using the proper links to the WMS service. Examples are provided in this tutorial.

2.2 Spatial***DataFrame

The geometry classes provided by sp contain the spatial information, but do not add data to these points (or lines/polygons). The data attribute table is typically provided as a data.frame. For this reason, sp also provides classes that link the geometries with DataFrames: SpatialPointsDataFrame, SpatialLinesDataFrame and SpatialPolygonsDataFrame.

Let’s assume that for each of the coordinates defined above, we also have information about an identifier and a scientificName:

identifiers <- c('INBO:NBN:BFN00179000029DL', 'INBO:NBN:BFN001790000A584', 
                 'INBO:NBN:BFN001790000AACI', 'INBO:NBN:BFN0017900009DOC', 
                 'INBO:NBN:BFN001790000A4J0', 'INBO:NBN:BFN0017900009DO8', 
                 'INBO:NBN:BFN001790000AACF', 'INBO:NBN:BFN001790000A4JO', 
                 'INBO:NBN:BFN001790000A4K8', 'INBO:NBN:BFN001790000A51G', 
                 'INBO:NBN:BFN001790000A4J3', 'INBO:NBN:BFN001790000A4J5', 
                 'INBO:NBN:BFN001790000A4K6', 'INBO:NBN:BFN001790000A4L5')
scientific_names <- c('Lagarosiphon major', 'Lagarosiphon major', 
                      'Muntiacus reevesii', 'Muntiacus reevesii', 
                      'Muntiacus reevesii', 'Muntiacus reevesii', 
                      'Muntiacus reevesii', 'Muntiacus reevesii', 
                      'Muntiacus reevesii', 'Muntiacus reevesii', 
                      'Muntiacus reevesii', 'Muntiacus reevesii', 
                      'Muntiacus reevesii', 'Muntiacus reevesii')
recorded <- as.data.frame(cbind(identifiers, scientific_names))
head(recorded)

Hence, the SpatialPointsDataFrame combines the spatial information (pts_lambert) with the recorded data (recorded):

pts_recorded <- SpatialPointsDataFrame(pts_lambert, data = recorded)
pts_recorded
## class       : SpatialPointsDataFrame 
## features    : 14 
## extent      : 155961.2, 260392.8, 193225.7, 225412.3  (xmin, xmax, ymin, ymax)
## coord. ref. : +init=epsg:31370 +proj=lcc +lat_1=51.16666723333333 +lat_2=49.8333339 +lat_0=90 +lon_0=4.367486666666666 +x_0=150000.013 +y_0=5400088.438 +ellps=intl +towgs84=-106.8686,52.2978,-103.7239,0.3366,-0.457,1.8422,-1.2747 +units=m +no_defs 
## variables   : 2
## names       :               identifiers,   scientific_names 
## min values  : INBO:NBN:BFN00179000029DL, Lagarosiphon major 
## max values  : INBO:NBN:BFN001790000AACI, Muntiacus reevesii

For the moment, the take home message is that Spatial***DataFrame bring together the attribute data (data.frame), the coordinates (geometry) and the coordinate reference system (crs):

head(pts_recorded@data)
head(pts_recorded@coords)
##      longitude latitude
## [1,]  155961.2 225412.3
## [2,]  196022.8 215574.7
## [3,]  190031.5 214969.8
## [4,]  158629.2 216928.2
## [5,]  158501.7 216352.8
## [6,]  161980.6 218381.2
pts_recorded@proj4string
## CRS arguments:
##  +init=epsg:31370 +proj=lcc +lat_1=51.16666723333333
## +lat_2=49.8333339 +lat_0=90 +lon_0=4.367486666666666
## +x_0=150000.013 +y_0=5400088.438 +ellps=intl
## +towgs84=-106.8686,52.2978,-103.7239,0.3366,-0.457,1.8422,-1.2747
## +units=m +no_defs

2.3 Projection

In the previous section, we already used the CRS class definition to specify the coordinate reference system. As many other systems, R uses the PROJ.4 notation to define the CRS. The number of parameters depends on the projection. However, the easiest way is (mostly) using the EPSG code, for example:

But other options are available as well:

CRS("+proj=utm +zone=32")
## CRS arguments: +proj=utm +zone=32 +ellps=WGS84

In general, the definition of the CRS (input argument) should ba a string setup up by one or more +<arg>=<value> combinations, each of them separated by a blank value: +<arg1>=<value1> +<arg2>=<value2>. The CRS-creation returns an object (with a variable name you can assign):

wgs_84 <- CRS("+init=epsg:4326")
wgs_lambert <- CRS("+init=epsg:31370")
wgs_84
## CRS arguments:
##  +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84
## +towgs84=0,0,0
wgs_lambert
## CRS arguments:
##  +init=epsg:31370 +proj=lcc +lat_1=51.16666723333333
## +lat_2=49.8333339 +lat_0=90 +lon_0=4.367486666666666
## +x_0=150000.013 +y_0=5400088.438 +ellps=intl
## +towgs84=-106.8686,52.2978,-103.7239,0.3366,-0.457,1.8422,-1.2747
## +units=m +no_defs

which can be used in other functions/methods or used to support the transformation of the coordinate reference system of spatial aware objects. Hence, transformation won’t work on a default vector or matrix object:

But after transformation of the single coordinate to a SpatialPoints object, the spTransform function can be used:

single_coordinate <- SpatialPoints(matrix(c(98710.32800000161, 
                                            162573.7030000016), 
                                          ncol = 2), 
                                   proj4string = wgs_lambert)
                                  
spTransform(single_coordinate, wgs_84)
## class       : SpatialPoints 
## features    : 1 
## extent      : 3.641625, 3.641625, 50.7714, 50.7714  (xmin, xmax, ymin, ymax)
## coord. ref. : +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0

The application is similar to other spatial objects, such as a SpatialPointsDataFrame:

pts_recorded_wgs84 <- spTransform(pts_recorded, wgs_84)
pts_recorded_wgs84
## class       : SpatialPointsDataFrame 
## features    : 14 
## extent      : 4.4543, 5.95192, 51.04185, 51.33847  (xmin, xmax, ymin, ymax)
## coord. ref. : +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 2
## names       :               identifiers,   scientific_names 
## min values  : INBO:NBN:BFN00179000029DL, Lagarosiphon major 
## max values  : INBO:NBN:BFN001790000AACI, Muntiacus reevesii

Exercise:

As your regularly encounter point coordinates that need conversion to another projection system, you decide to make a functions for future use:

For any data.frame(!) with coordinate information in a known coordinate reference system, update the columns with the x/y information to another coordinate system. The function returns the updated data.frame.

The function will probably look more or less like this:

reproject_points <- function(df, col_long, col_lat,
                             project_input, project_output){
    # ...
    
    return(df)

Following test should be able to work with the reproject_points function:

pts_ex2_test <- as.data.frame(pts_recorded)
ex2_test <- reproject_points(pts_ex2_test, "longitude", "latitude",
                             CRS("+init=epsg:31370"), 
                             CRS("+init=epsg:4326"))
head(ex2_test)

Notice:

This reproject_points function is part of the inborutil package, as the function reproject_coordinates and can be used after installing the inborutils package: devtools::install_github("inbo/inborutils").

3 Reading data: rgdal

Mostly, the starting point of a data analysis involving spatial data is a data set in any kind of GIS format. The functionality to read vector data is provide by the readOGR function or the rgdal package:

deelbekkens <- readOGR("../data/deelbekkens/Deelbekken.shp")
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\stijn_vanhoey\Documents\GitHub\course_gis_scripting\data\deelbekkens\Deelbekken.shp", layer: "Deelbekken"
## with 102 features
## It has 8 fields
## Integer64 fields read as strings:  UIDN OIDN
deelbekkens
## class       : SpatialPolygonsDataFrame 
## features    : 102 
## extent      : 22041.22, 258878.5, 153054.6, 244027.1  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=lcc +lat_1=49.8333339 +lat_2=51.16666723333333 +lat_0=90 +lon_0=4.367486666666666 +x_0=150000.01256 +y_0=5400088.4378 +ellps=intl +units=m +no_defs 
## variables   : 8
## names       : UIDN, OIDN, DEELBEKKEN, BEKNR,               BEKNAAM,        STRMGEB,   OPPERVL,    LENGTE 
## min values  : 1020,    1,      01-01,     1, Bekken Brugse polders, Brugse Polders,  32091137,  35788.31 
## max values  :  994,   99,      11-11,    11,            Netebekken,        Schelde, 389367535, 139785.14
plot(deelbekkens)

Besides shapefiles, other vector data formats can be read as well. Certainly geojson is a increasingly used data format for feature (vector) data sets:

eu_10grid <- readOGR("../data/EUgrid10.geojson")
## OGR data source with driver: GeoJSON 
## Source: "C:\Users\stijn_vanhoey\Documents\GitHub\course_gis_scripting\data\EUgrid10.geojson", layer: "EUgrid10"
## with 377 features
## It has 3 fields
eu_10grid
## class       : SpatialPolygonsDataFrame 
## features    : 377 
## extent      : 2.385175, 6.480442, 49.48023, 51.54315  (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
## variables   : 3
## names       :     CellCode, EofOrigin, NofOrigin 
## min values  : 10kmE379N312,   3790000,   2940000 
## max values  : 10kmE406N304,   4060000,   3160000

Actually, the set of data formats you can interact with, does not depend on R, but is dependent on your GDAL installation. To check if the installation supports a specific vector data format, you can get an overview of them by the ogrDrivers() command:

head(ogrDrivers()) # Just the first 6 records are shown here

For data sets with multiple layers, the reading of a specific Layer is supported as well. For example, when the driver to read ESRI FileGeoDatabases is present, following commands will be useful as well:

# Information about the data (without actually reading the data itself)
ogrInfo(dsn = "name_filegeodatabase.gdb") 
# List the names of the layers in the dataset
ogrListLayers(dsn = "name_filegeodatabase.gdb")
# Reading a specific layer
readOGR(dsn = "name_filegeodatabase.gdb", layer = "layer_of_interest"

4 Vector data manipulation

4.1 DataFrame alike manipulation

Vector data represented as Spatial***DataFrame can be sub selected as a DataFrame. For example, selecting only the deelbekkens that are in the Netebekken (Boolean indexing):

nete <- deelbekkens[deelbekkens$BEKNAAM == "Netebekken", ]
plot(nete)

The result of the selection is again a SpatialPolygonsDataFrame. When only interested in the data itself, the selection of the attribute data alone can be done by converting the data to a DataFrame.

head(as.data.frame(nete))

Hence, the geometry and projection data is no longer contained, only the data itself.

REMEMBER:

The sp data types do NOT support the dplyr package. If you want to use the %>% operator and the verbs such as filter, mutate,… as used in the dplyr package, you could start using the sf package, the newest package setup with the tidyverse environment in mind. This introduction and the documentation of of sf is good to start.

4.2 sp overlay and aggregations

The sp package provides mostly the data types. However, the spTransform function is an essential function provided by the package itself. Two other functions of the sp package are worth mentioning:

  • disaggregate: groups of Line or Polygon (multi) are disaggregated to one Line/Polygon for each feature. When applied to the deelbekkens data set:
length(deelbekkens)
## [1] 102
deelbekkens_disaggr <- disaggregate(deelbekkens)
length(deelbekkens_disaggr)
## [1] 150

The number of features increases, as each polygon in the data set is now defined as an individual feature of the data set.

  • over(x, y) is used for spatial JOINs
ids <- over(spTransform(eu_10grid, wgs_lambert), pts_recorded)
ids <- na.omit(ids)
ids

The result provides the indices of the SpatialPolygonDataFrame together with the SpatialPointDataFrame data entries. Lines, and Polygon-Polygon overlays require rgeos.

Notice:

Check the vignette (vignette('over') in console) to get more details about all the different spatial JOIN options with the sp package. On the other hand, the sf package interface for spatial joins is more intuitive. This point-in-polygon example compares the sp and the sf implementation.

4.3 rgeos geometry operations

The rgeos package provide a number of geometric operations, such as gArea(), gLength(), gDistance(), gBuffer(),… (single geometry) or gIntersection(), gUnion(), gContains,… (combining geometries).

Notice:

As most commands of rgeos start with g***, use the power of the TAB-button to explore the available functions of the rgeos package: rgeos::g + TAB

These functions operate both on Spatial* objects (see above) as well as Spatial***DataFrame objects. Most of these functions speak for themselves (or read the manual). An important argument of these functions is byid. The byid argument determines if the function should be applied across sub geometries (TRUE) or the entire object (FALSE). Default is FALSE. For example, for the area calculation:

gArea(nete) # default is FALSE -> entire object
## [1] 1675885320

versus

gArea(nete, byid = TRUE)
##        10        12        14        41        69        70        80 
## 122717152 122716487 119654760  92940654 132679768 174241355  87032046 
##        89        90        91        92        93       101 
## 128287765  69738882  93940592 145759867 226004155 160171837

Comparison of geometries involves two geometries:

vertical <- eu_10grid[grep('E400', eu_10grid$CellCode), ]
horizontal <- eu_10grid[grep('N311', eu_10grid$CellCode), ]

comparison <- gContains(vertical, horizontal, 
                        byid = TRUE, checkValidity = TRUE, 
                        returnDense = TRUE) # Test also with gIntersects
true_ids <- as.data.frame(which(comparison, arr.ind = TRUE)) # Find the TRUE elements
true_ids
# adjusting the colors transparency
blue <- adjustcolor("blue", alpha.f = 0.3) 
red <- adjustcolor("darkred", alpha.f = 0.3) 

# plot the cells selected by the geometry comparison:
plot(vertical)
plot(vertical[true_ids$col,], add = TRUE, col = blue)
plot(horizontal, add = TRUE)
plot(horizontal[true_ids$row,], add = TRUE, col = red)

Exercise:

Make a leaflet plot (with open street map background) of the nete sub catchments, together with the Centroid for each of the sub geometries.

Tip 1: Looking for a command that matches the action centroid in R, is done by using ??centroid in the Console.
Tip 2: Remember that leaflet always expects WGS84 as CRS to make a plot(!)

The output should look like:

Notice:

It is not always needed to download both data sets and do the spatial operation on your own machine. Some WFS Webservices (providing maps online) support spatial operators such as within, intersects, contains,… directly. As such, the point in polygon problem can be tackled directly as explained in the tutorial ‘download feature attributes from WFS’.

5 Mapping vector data

5.1 ggplot

When working with DataFrame data, the application of ggplot2 is typical. Hence, it would be convenient to use ggplot2 (and the syntax) to plot Spatial***DataFrame as well. As ggplot2 only works on data.frame objects, a translation need to be made between the Spatial***DataFrame and a default data.frame.

For SpatialPointsDataFrame this is not really an issue, as the longitude and latitude columns provide the geometry information and can be used as such in ggplot by directly converting the data to a data.fram

pts_recorded_df <- as.data.frame(pts_recorded)
pts_recorded_df
ggplot(pts_recorded_df, aes(x = longitude, y = latitude)) +
    geom_point() +
    geom_text(size = 4, aes(label = scientific_names), 
              check_overlap = TRUE, hjust = -0.1) +
    xlim(150000, 300000)

For Lines and Polygons, the translation requires an entire interpretation of the configuration towards a data.frame. This functionality is provided by fortify, which translates the spatial information:

nete_df <- fortify(nete)
head(nete_df)
class(nete_df)
## [1] "data.frame"

The individual columns of the resulting data.frame describe the following elements:

  • long and lat: x and y coordinates
  • order: sequence of the polygones or lines
  • hole: identifies whether the feature the coordinates are part of define a hole
  • piece: collects all the points that make a single feature (polygon/line), i.e. 1 attributes table line • group: defines the subpolygons or sublines (in case of multipolygon/multiline features) • id: row identifiers of the attribute table (data)

Still, the added value of using ggplot2 is to use the attribute data to define colors,… As the attribute data table is available and we have an identifier id, we can JOIN the attribute data to the spatial information after retrieving the attribute data with the as.data.frame function:

nete_df <- merge(nete_df, as.data.frame(nete))
head(nete_df)

REMEMBER:

In order to plot SpatialLinesDataFrame or SpatialPolygonsDataFrame data, the spatial data (with fortify()) need to be merged with the attribute data (with as.data.frame) to have a ggplot2 compatible data.frame!

This seems to be a useful small tooling for a custom R function…

Exercise:

Write a function called prepare_spatial_for_ggplot that takes a SpatialLinesDataFrame or SpatialPolygonsDataFrame as input and execute the necessary steps (described above: fortify/as.data.frame/merge) to enable the plotting with ggplot2. The output of the function is a data.frame.

Following test should be able to work with the prepare_spatial_for_ggplot function:

ex4_test <- prepare_spatial_for_ggplot(nete)
head(ex4_test)

For plotting Lines, use the ggplot function geom_path() (and NOT geom_line()). For polygons, you can choose: * geom_path(): only plot the borders * geom_polygon(): usa a fill color for the polygons

ggplot(nete_df, aes(x = long, y = lat)) +
    geom_polygon(aes(group = group, fill = id)) +
    coord_fixed()

REMEMBER:

Adding group = group (grouping according to the by fortify derived groups) is required to plot the individual polygons separately(!)

Notice:

Starting from ggplot2 version 2.2.1, the plotting of spatial objects defined in the sf package will no longer need this transformation, but will directly provide the geom_sf function to plot vector data in ggplot2, see the documentation!

5.2 ggmap

The ggmap package is the spatial plotting extension to ggplot2. It provides the ability to integrate the graphs of ggplot2 with spatial backgrounds coming from online resources such as Google maps, stamen maps and OpenStreetMap. The most convenient way to setup these background maps is to use the ggmap::get_*** functions. For example, using the toner-background of stamen:

extent <- c(left = 2.54, bottom = 49.46, right = 6.4, top = 51.51)
map <- get_stamenmap(extent, zoom = 8, maptype = "toner-background")
belgium <- ggmap(map)
belgium

Just as with the OpenStreetMap integration of the leaflet package, these spatial background coming from online resources use WGS84 (decimal degrees). Make sure to convert the vector data to the WGS84 CRS:

nete_wgs84 <- spTransform(nete, wgs_84)
nete_df_wgs84 <- fortify(nete_wgs84)
nete_df_wgs84 <- merge(nete_df_wgs84, as.data.frame(nete_wgs84))
    
belgium +
    geom_path(aes(x = long, y = lat, group = group), 
          data = nete_df_wgs84, colour = "red")

The bbox can be defined more specifically to improve the boundaries of the background image. Focusing on the Nete catchment:

extent <- c(left = 4.42, bottom = 50.9, right = 5.4, top = 51.4)
map <- get_stamenmap(bbox = extent, zoom = 10, maptype = "toner-hybrid")
ggmap(map) + 
    geom_path(aes(x = long, y = lat, group = group), 
              data = nete_df_wgs84, colour = "red")

Notice:

Each of these online providers have different maptype options. Check the help of these functions to see the options (e.g. ?get_stamenmap).

The usage of Google maps enables the option to provide a google maps search as center of the image

map <- get_googlemap(center = "Geel Belgium",  scale = 2, 
                     maptype = "terrain", zoom = 9)
ggmap(map) +
    geom_polygon(aes(x = long, y = lat, group = group, fill = id, alpha = 0.5), 
          data = nete_df_wgs84, colour = "white")

The Github repository and the paper provide more introductionary material to get you started with ggmap. Another potentially useful plotting package is the tmap package.

Exercise:

Pick any spatial vector data set (shapefile, geojson,…) currently on your computer:

  1. read the file into R as a Spatial***DataFrame
  2. make a subselection of the dataset using one of the data columns and a condition (e.g. length > 10000)
  3. create a ggplot with the color of each spatial entity defined by a chosen data columns