library(sf)4 Coordinate reference systems
4.1 Introduction
The majority of normal people will get scared if there’s some problem to solve involving coordinate reference systems (also called spatial reference systems) or projections. That’s why I will keep this section as short as possible and provide only the information you’ll need to perform standard GIS analyses with R. If you want to read more about this topic, I invite you to read the following book chapters: https://r.geocompx.org/spatial-class.html#crs-intro, https://r.geocompx.org/reproj-geo-data and https://tmap.geocompx.org/map-projections.
There is a famous expression saying “spatial is special”… One of the main reasons is that spatial data will have associated locations and you thus need a reference frame to describe these locations. This reference frame is called a coordinate reference system (CRS) in the GIS world. Coordinate reference systems can be either geographic or projected.
When you’re working with GIS data, you should always know the CRS you’re using. Otherwise coordinates are just numbers without a reference frame. When you share GIS data, make sure the CRS is always defined in the data set or documented in some other way.
4.2 Geographic CRSs
A geographic CRS will identify locations using two values: longitude and latitude. However, these two values alone are insufficient to determine a precise location; they must be linked to a model that defines the shape of the Earth. The Earth’s actual shape is called a geoid, but it is almost impossible to perform computations using such a complex solid, and thus one has to use approximations. A perfect sphere is a crude approximation (which can still be useful), while an ellipsoid (also called spheroid) is a better approximation accounting for the fact that our planet is a bit compressed (flatter at the North and South Poles). Once we have chosen a shape, we need to fit it to the Earth. Since the Earth is not a perfect sphere, there’s an infinity of possible fits. An ellipsoid (or a sphere) fitted to the Earth is called a datum. Depending on the area of application of a CRS, we can use a global (also called geocentric) datum which provides an appropriate fit for the whole globe (such as WGS84), or a local datum which closely fits the Earth’s surface in a particular area.
Coordinates in degrees (longitude and latitude) only have a precise meaning when the datum is provided… But most of the time you can assume that the associated datum is WGS84. Actually, WGS84 is a datum ensemble containing successive realizations of the reference system. Fortunately this can be ignored most of time, except if we need high positional accuracy.
Remember that geographic coordinate systems are not using any projection, the measurements are done with angles! All the computations (distances, buffers, etc.) have to happen on the chosen sphere/ellipsoid, which makes things more complex. It is easy to make mistakes when working with geographic CRSs, and even smart people fell into this trap (e.g. https://georeferenced.wordpress.com/2014/05/22/worldmapblunders).
4.3 Projected CRSs
Projected CRSs are based on geographic CRSs but include an additional step: a projection on a flat surface. When using a projected CRS, locations are described using Cartesian coordinates called easting and northing (x and y). Projecting a spherical or ellipsoidal surface on a plane will cause deformations. These will affect four properties of the surface: shapes, areas, distances and directions. A map projection can satisfy at most two of these properties. Many map projections (especially global ones) do not fulfill any of these properties but are intended as a compromise (e.g. Robinson). The following table lists a few common projections based on their preserved property.
| Name | Property | Examples |
|---|---|---|
| Conformal | Preserves shape | Mercator, Lambert conformal conic, Stereographic |
| Equal area | Preserves area | Equal Earth, Albers conic, Lambert azimuthal equal-area |
| Equidistant | Preserves distance | Equirectangular, Equidistant conic, Azimuthal equidistant |
| Azimuthal | Preserves direction | Stereographic, Lambert azimuthal equal-area |
The CRS used by almost all mapping websites (OpenStreetMap, Google Maps, etc.) should never be used for any analysis. It is a slightly modified version of the Mercator projection called Web Mercator or Pseudo-Mercator. It has some advantages allowing good visualization speed, but the distortions are massive. Check the following website: https://www.thetruesize.com. Measurements (areas, distances, etc.) computed on data using this CRS will be completely wrong.
There exists a ton of different projections and all of them make different compromises, some are even totally useless (check this beautiful xkcd comic: https://xkcd.com/977). The following websites allow you to visualize the main projection types: https://www.geo-projections.com and https://map-projections.net/singleview.php. Both websites also provides a nice tool to visualize distortions called Tissot’s indicatrices.
Choosing a projection for your data can be challenging, especially if it covers a very large area. If you want to create world maps for which precision is not of utmost importance, I would recommend using the Robinson, Equal Earth or Winkel Tripel projections. For large areas, such as continents or a group of countries, the best choice will depend on the properties you want to preserve. Experts often recommend the Lambert azimuthal equal-area, the azimuthal equidistant and the Lambert conformical conic projections. Do not forget to set the latitude and the longitude of origin to the center of the study area. Using a UTM (Universal Transverse Mercator) projection is also a possibility if your data is contained in a single UTM zone. Since the distortions increase quickly when you move away from the center of a UTM zone, it is advised to restrict the longitudinal extent to 6 degrees from the central meridian of the chosen zone. The stereographic projection is especially interesting for polar regions. Things are getting easier if you work at the country level, since almost every country has its own recommended projected CRS (or CRSs).
It is often easier to work with a projected CRS, except if your data is global or covering a really large area (like a continent). Most GIS software will (still) make the assumption that you’re data is on a flat plane, even if you’re working with a geographic CRS. This behavior can easily create nonsensical results, for example a buffer where the distance is specified in degrees. The sf package is kind of an exception since it will by default perform calculations on a sphere if you use a geographic CRS, thanks to the S2 library.
However if your original data is using a geographic CRS and you only need to compute basic measurements like areas, distances or azimuths, do not project it. As we will see in the next sections, it is possible to compute these values with a high precision using unprojected data.
4.4 CRS representation
With so many CRSs available, we need a way to classify them. That’s what the EPSG (European Petroleum Survey Group) started doing a few years ago. They collected and documented most available CRSs in a data set which is now called the EPSG Geodetic Parameter Dataset (https://epsg.org/home.html). In this data set, every CRS has a unique identification number that can be used in a GIS software instead of writing the full definition of the CRS. The best available transformations between CRSs are also defined. Sadly this data set is still missing a few interesting CRSs and was thus completed by other companies such as ESRI. This is the reason why you’ll sometimes see ESRI codes instead of EPSG for some CRSs. To avoid confusion, CRSs are usually referenced by an SRID (Spatial Reference Identifier), which is made of two components, an authority (such as EPSG or ESRI) and an identifier. If no authority is mentioned you can usually assume it’s coming from the EPSG data set (especially in the open-source GIS world). For clarity, I recommend always specifying the full SRID when working with CRSs. With sf and terra (and most other GIS packages), the SRID has to be written in the form “authority:identifier”.
The following CRSs are especially interesting for us:
| SRID | Name | Description |
|---|---|---|
| EPSG:2056 | CH1903+/LV95 | Projected CRS currently used in Switzerland |
| EPSG:21781 | CH1903/LV03 | Former projected CRS used in Switzerland, you will still find data sets using this one |
| EPSG:4326 | WGS84 | Geographic CRS used for most global data sets, and by GPS devices |
| EPSG:3035 | ETRS89-extended/LAEA Europe | Projected CRS recommended for Europe |
| EPSG:3857 | Pseudo-Mercator | Projected CRS used by online maps |
| EPSG:8857 | Equal Earth Greenwich | Nice equal area projection for world maps |
| ESRI:54030 | Robinson | Aesthetically pleasing projection for world maps |
If you search for the EPSG database on your favorite search engine, you may find the website https://epsg.io. Please do not use it for EPSG codes (it’s still useful for ESRI codes, though)! It is not the official EPSG website, it doesn’t use the latest version of the EPSG database, and therefore some definitions of CRSs are outdated.
Almost all GIS software, including the sf and terra packages, rely on an open-source library called PROJ for their CRS and projection computations. It was created in 1983 by Gerald Evenden and is still actively developed by several contributors. PROJ also provides an online service for transformation grids. Thanks to PROJ, if the CRS of your data is properly defined, (almost) all CRS operations should just work automagically!
When looking for examples on the web, you will often find code snippets using what is called a proj4string to define a CRS or to reproject data. For example the proj4string for the current Swiss CRS looks like this: +proj=somerc +lat_0=46.9524055555556 +lon_0=7.43958333333333 +k_0=1 +x_0=2600000 +y_0=1200000 +ellps=bessel +towgs84=674.374,15.056,405.346,0,0,0,0 +units=m +no_defs +type=crs. This was the standard way of describing CRSs in PROJ until a few years ago. You should NOT use these strings, instead always use the EPSG (or another authority) number to be on the safe side. Otherwise you may get small to medium position errors when reprojecting your data since some arguments of the proj4strings will be ignored.
Similarly, you will sometimes see some CRS definitions using the +init= syntax (e.g., +init=EPSG:2056). This is also a proj4string. This should be avoided for the same reasons, moreover this can also cause problems with other GIS software not recognizing the CRS properly.
A SRID is a convenient way to represent a CRS but it hides all the technical details. You can easily explore how CRSs are stored in modern GIS software by providing a SRID to the st_crs() function. If you use EPSG codes, you can also simply enter the code as an integer (but please don’t do this to avoid confusion). For example if you want to inspect the current Swiss CRS:
st_crs("EPSG:2056")Coordinate Reference System:
User input: EPSG:2056
wkt:
PROJCRS["CH1903+ / LV95",
BASEGEOGCRS["CH1903+",
DATUM["CH1903+",
ELLIPSOID["Bessel 1841",6377397.155,299.1528128,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4150]],
CONVERSION["Swiss Oblique Mercator 1995",
METHOD["Hotine Oblique Mercator (variant B)",
ID["EPSG",9815]],
PARAMETER["Latitude of projection centre",46.9524055555556,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8811]],
PARAMETER["Longitude of projection centre",7.43958333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8812]],
PARAMETER["Azimuth at projection centre",90,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8813]],
PARAMETER["Angle from Rectified to Skew Grid",90,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8814]],
PARAMETER["Scale factor at projection centre",1,
SCALEUNIT["unity",1],
ID["EPSG",8815]],
PARAMETER["Easting at projection centre",2600000,
LENGTHUNIT["metre",1],
ID["EPSG",8816]],
PARAMETER["Northing at projection centre",1200000,
LENGTHUNIT["metre",1],
ID["EPSG",8817]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping (large and medium scale)."],
AREA["Liechtenstein; Switzerland."],
BBOX[45.81,5.95,47.81,10.5]],
ID["EPSG",2056]]
We get the complete description of the CRS with all its parameters (e.g. datum, etc.) and some metadata (e.g. area of application). The format used is unfortunately also named WKT, but this has nothing to do with the WKT format we’ve seen before that is used to define geometries. It is of course possible to assign a CRS to some data using the full WKT description instead of using a SRID.
Geodesists don’t always agree on axis order when documenting CRSs. Hence we have some geographic CRSs storing coordinates as latitude/longitude, such as the widely used EPSG:4326, and others using longitude/latitude. This choice made some people very sad, and therefore they created a CRS called OGC:CRS84 which is exactly the same as EPSG:4326 except for the axis order. Fortunately you don’t have to worry about axis order. The PROJ library will do all the conversions for you!
4.5 CRS query and assignment
We’ve seen how to use the st_crs() function to get information on existing CRSs. The same function can also be used to query the CRS of sf and sfc objects. The output is an object of class crs. You can use the $ method on this object to retrieve the main properties of the CRS (e.g. name). Check the help of the st_crs() function to see all the available properties.
crs_world <- st_crs(World)
class(crs_world)[1] "crs"
crs_world$Name[1] "WGS 84"
crs_worldCoordinate Reference System:
User input: EPSG:4326
wkt:
GEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
USAGE[
SCOPE["Horizontal component of 3D system."],
AREA["World."],
BBOX[-90,-180,90,180]],
ID["EPSG",4326]]
The st_crs() function is also used to assign a CRS to an existing sf or sfc object. As an input, you’ll need the usual authority:identifier SRID. This will replace any previously assigned CRS.
pt1 <- st_as_sfc("POINT(2657271 1219754)")
st_crs(pt1) <- "EPSG:2056"You can also specify a crs object as an input. This allows you to assign a CRS used by another sf or sfc object. This can be useful if the data is using some exotic CRS which doesn’t have an EPSG number.
pt2 <- st_as_sfc("POINT(2600000 1200000)")
st_crs(pt2) <- st_crs(pt1)The terra package has a similar function to query and assign a CRS to SpatRaster and SpatVector objects, the crs() function. When querying the CRS of an object, the default output is the WKT description of the CRS. If you include the describe = TRUE argument, you’ll get a user-friendly description.
r1 <- rast(nrow = 10, ncol = 10, xmin = 2600000, xmax = 2610000, ymin = 1200000, ymax = 1210000, crs = "EPSG:2056")
crs(r1)[1] "PROJCRS[\"CH1903+ / LV95\",\n BASEGEOGCRS[\"CH1903+\",\n DATUM[\"CH1903+\",\n ELLIPSOID[\"Bessel 1841\",6377397.155,299.1528128,\n LENGTHUNIT[\"metre\",1]]],\n PRIMEM[\"Greenwich\",0,\n ANGLEUNIT[\"degree\",0.0174532925199433]],\n ID[\"EPSG\",4150]],\n CONVERSION[\"Swiss Oblique Mercator 1995\",\n METHOD[\"Hotine Oblique Mercator (variant B)\",\n ID[\"EPSG\",9815]],\n PARAMETER[\"Latitude of projection centre\",46.9524055555556,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8811]],\n PARAMETER[\"Longitude of projection centre\",7.43958333333333,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8812]],\n PARAMETER[\"Azimuth at projection centre\",90,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8813]],\n PARAMETER[\"Angle from Rectified to Skew Grid\",90,\n ANGLEUNIT[\"degree\",0.0174532925199433],\n ID[\"EPSG\",8814]],\n PARAMETER[\"Scale factor at projection centre\",1,\n SCALEUNIT[\"unity\",1],\n ID[\"EPSG\",8815]],\n PARAMETER[\"Easting at projection centre\",2600000,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8816]],\n PARAMETER[\"Northing at projection centre\",1200000,\n LENGTHUNIT[\"metre\",1],\n ID[\"EPSG\",8817]]],\n CS[Cartesian,2],\n AXIS[\"(E)\",east,\n ORDER[1],\n LENGTHUNIT[\"metre\",1]],\n AXIS[\"(N)\",north,\n ORDER[2],\n LENGTHUNIT[\"metre\",1]],\n USAGE[\n SCOPE[\"Cadastre, engineering survey, topographic mapping (large and medium scale).\"],\n AREA[\"Liechtenstein; Switzerland.\"],\n BBOX[45.81,5.95,47.81,10.5]],\n ID[\"EPSG\",2056]]"
crs(r1, describe = TRUE) name authority code area
1 CH1903+ / LV95 EPSG 2056 Liechtenstein; Switzerland
extent
1 5.95, 10.50, 45.81, 47.81
Assigning a CRS using the crs() function is done in exactly the same way as with st_crs().
r2 <- rast(nrow = 10, ncol = 10, xmin = 2600000, xmax = 2610000, ymin = 1200000, ymax = 1210000)
crs(r2) <- "EPSG:2056"Similarly, you can also assign the CRS used by another SpatRaster or SpatVector object to your spatial object.
r3 <- rast(nrow = 10, ncol = 10, xmin = 2600000, xmax = 2610000, ymin = 1200000, ymax = 1210000)
crs(r3) <- crs(r1)The st_crs() and crs() functions are used to query or assign a CRS, but they do not perform any CRS transformation!
4.6 CRS transformations
It is sometimes necessary to transform coordinates or spatial objects into another CRS. For example when performing a GIS analysis with multiple data sets, it can be necessary to transform some of them in order to have all the data sets using a common CRS. We almost always speak of CRS or coordinate transformation, but we should actually distinguish between conversion and transformation. Going from one CRS to another without changing datum is called coordinate conversion. This process is lossless and invertible. The equations associated with a conversion are not empirical. Recomputing coordinates in a new datum is called coordinate transformation. The computations are approximate and the processes involved are empirical. Different approaches can be used to perform a coordinate transformation, such as equations, transformation grids, etc. Due to this complexity, multiple transformation paths (also called pipelines) are possible.
Transforming a vector data set is a simple operation, we simply need to transform the points/vertices into the new CRS. The precision is only determined by the transformation used. Transforming a raster data set is more challenging. This is usually done using a technique called warping which first transforms the raster extent and then recomputes the pixel values using resampling. There is thus a loss of information and the precision is determined not only by the transformation used, but also by the raster properties. Moreover the interpolation algorithm used during the resampling must be chosen carefully depending on the raster content (e.g. continuous vs. discrete values). A more direct transformation is also possible but rarely used since the output raster will be curvilinear (i.e. the pixels will not be rectangles). Due to all these reasons, if you need vector and raster data sets with a common CRS, I’d recommend transforming the vector data set to the CRS of the raster data set and not the opposite.
As we’ve seen above, there are often several transformation pipelines available between two CRSs. Fortunately we usually don’t have to care about this complexity since all these transformations are managed by the PROJ library. In case of multiple available pipelines, PROJ will automatically choose the one with the best accuracy. If you need more information about the chosen transformation, you can use the st_proj_pipelines() function in the sf package. Here’s an example showing the transformation pipeline between the Swiss projected CRS and EPSG:4326.
trans <- sf_proj_pipelines(source_crs = "EPSG:2056", target_crs = "EPSG:4326")
transCandidate coordinate operations found: 2
Strict containment: FALSE
Axis order auth compl: FALSE
Source: EPSG:2056
Target: EPSG:4326
Best instantiable operation has accuracy: 1 m
Description: Inverse of Swiss Oblique Mercator 1995 + CH1903+ to WGS 84 (1)
+ axis order change (2D)
Definition: +proj=pipeline +step +inv +proj=somerc +lat_0=46.9524055555556
+lon_0=7.43958333333333 +k_0=1 +x_0=2600000
+y_0=1200000 +ellps=bessel +step +proj=push +v_3
+step +proj=cart +ellps=bessel +step +proj=helmert
+x=674.374 +y=15.056 +z=405.346 +step +inv
+proj=cart +ellps=WGS84 +step +proj=pop +v_3 +step
+proj=unitconvert +xy_in=rad +xy_out=deg
We see that PROJ found two possible pipelines and it automatically chose the best one which has a 1 meter accuracy. The definition also provides a list of all the steps needed for the transformation. The st_proj_pipelines() function returns a proj_pipelines object containing all the possible transformation pipelines (but only the best one is printed). This object is actually a data frame. To check the second, suboptimal, pipeline, we can thus use the usual indexing techniques.
trans <- sf_proj_pipelines(source_crs = "EPSG:2056", target_crs = "EPSG:4326")
trans[2,]Candidate coordinate operations found: 1
Strict containment: FALSE
Axis order auth compl: FALSE
Source: EPSG:2056
Target: EPSG:4326
Best instantiable operation has only ballpark accuracy
Description: Inverse of Swiss Oblique Mercator 1995 + Ballpark geographic
offset from CH1903+ to WGS 84 + axis order change
(2D)
Definition: +proj=pipeline +step +inv +proj=somerc +lat_0=46.9524055555556
+lon_0=7.43958333333333 +k_0=1 +x_0=2600000
+y_0=1200000 +ellps=bessel +step +proj=unitconvert
+xy_in=rad +xy_out=deg
Here’s another example showing the transformation pipelines between the previous (deprecated) Swiss CRS and the one used currently.
sf_proj_pipelines(source_crs = "EPSG:21781", target_crs = "EPSG:2056")Candidate coordinate operations found: 2
Strict containment: FALSE
Axis order auth compl: FALSE
Source: EPSG:21781
Target: EPSG:2056
Best instantiable operation has only ballpark accuracy
Description: Inverse of Swiss Oblique Mercator 1903M + Ballpark geographic
offset from CH1903 to CH1903+ + Swiss Oblique
Mercator 1995
Definition: +proj=pipeline +step +inv +proj=somerc +lat_0=46.9524055555556
+lon_0=7.43958333333333 +k_0=1 +x_0=600000
+y_0=200000 +ellps=bessel +step +proj=somerc
+lat_0=46.9524055555556 +lon_0=7.43958333333333
+k_0=1 +x_0=2600000 +y_0=1200000 +ellps=bessel
Operation 1 is lacking 1 grid with accuracy 0.2 m
Missing grid: ch_swisstopo_CHENyx06a.tif
URL: https://cdn.proj.org/ch_swisstopo_CHENyx06a.tif
In this case PROJ found 2 possible pipelines but it tells us that it cannot use the best one (with 0.2 meter accuracy) since the required transformation grid is not available on our computer. Hence it had to choose the second pipeline which only has ballpark accuracy. Fortunately, the PROJ library also provides an online service called the PROJ Datumgrid CDN (Content Delivery Network) which allows us to automatically download such grids. We first need to enable it in sf using the sf_proj_network() function, and then we check again the available transformation pipelines.
sf_proj_network(enable = TRUE)[1] "https://cdn.proj.org"
sf_proj_pipelines(source_crs = "EPSG:21781", target_crs = "EPSG:2056")Candidate coordinate operations found: 2
Strict containment: FALSE
Axis order auth compl: FALSE
Source: EPSG:21781
Target: EPSG:2056
Best instantiable operation has accuracy: 0.2 m
Description: Inverse of Swiss Oblique Mercator 1903M + CH1903 to CH1903+ (1)
+ Swiss Oblique Mercator 1995
Definition: +proj=pipeline +step +inv +proj=somerc +lat_0=46.9524055555556
+lon_0=7.43958333333333 +k_0=1 +x_0=600000
+y_0=200000 +ellps=bessel +step +proj=hgridshift
+grids=ch_swisstopo_CHENyx06a.tif +step
+proj=somerc +lat_0=46.9524055555556
+lon_0=7.43958333333333 +k_0=1 +x_0=2600000
+y_0=1200000 +ellps=bessel
The more accurate pipeline is now available. Note that the transformation grid will only be available in the current R session and you’ll thus need to activate the CDN to download it again if you start a new R session. If you use this CRS transformation regularly, you can also download the grid and install it manually. To do that, put the grid inside the proj folder of your sf installation (run system.file(package = "sf") to check where sf was installed).
Once a transformation grid is downloaded using the sf_proj_network() function or manually installed inside the sf package, this will obviously only affect CRS transformations performed using sf. With the terra package, the missing grids are downloaded automatically by default. Similarly to sf, it is also possible to manually install a grid in the proj folder of your terra installation (run system.file(package = "terra") to check where terra was installed).
The goal of this section was to provide a general introduction to CRS transformations. To see how to actually perform CRS transformations on real data sets, check Section 5.18 for vector data and Section 6.12 for rasters.