Habitats and Biotopes - Developer Notes
Learnings
The difference between Shapefile and GeoJSON
A Shapefile is an old but still widely used GIS format created by ESRI. A Shapefile is not one file: it is a set of at least three files sharing the same base name (.shp, .shx and .dbf), usually accompanied by others such as .prj. Shapefiles have limitations; here are the two main ones:
- Max attribute name length: 10 characters
- Only one geometry type per file
GeoJSON is a modern, text-based format (JSON) designed for web mapping.
A single .geojson file contains:
- geometry
- attributes
- CRS (usually WGS84)
- everything together in one file
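The 10-character attribute limit is the one most likely to bite when converting GeoJSON to a Shapefile. The following is a minimal sketch (standard library only) that reports which property names would clash if cut to 10 characters. The two "survey_date_*" names are made up for illustration, and the way GDAL actually renames clashing fields is not reproduced here.

```python
from collections import defaultdict

SHAPEFILE_MAX_FIELD_LEN = 10  # DBF field names are limited to 10 characters


def truncation_collisions(property_names):
    """Group property names by their first 10 characters.

    Returns only the groups where two or more names would end up
    with the same truncated name.
    """
    groups = defaultdict(list)
    for name in property_names:
        groups[name[:SHAPEFILE_MAX_FIELD_LEN]].append(name)
    return {short: names for short, names in groups.items() if len(names) > 1}


# "survey_date_start" and "survey_date_end" are hypothetical attribute names.
names = ["f2004_code", "biotope", "survey_date_start", "survey_date_end"]
collisions = truncation_collisions(names)
print(collisions)
```

GeoJSON has no such limit, so the same check is unnecessary when staying in GeoJSON end to end.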
Tricks for working with WFS
Loading heavy WFS layers takes time. For good UX you want to use techniques such as:
- load WMS initially as a preview, then WFS when required, e.g. when a minimum zoom level is reached
- only load WFS when zoomed in enough
- load on map zoom and pan (when view changes).
- debounce WFS requests
- cancel in-flight WFS requests
- optionally, cache tiles / sections: cache GeoJSON per “tile key” (e.g. a rounded bbox or a grid index)
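The "tile key" idea from the last bullet can be sketched as below. This is only an illustration of the grid-index approach, not what the tool does: the cell size, the function names and the `fetch` callable (which would wrap the actual WFS request) are all assumptions.

```python
import math


def tile_key(lon, lat, cell_deg=0.25):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def tile_keys_for_bbox(min_lon, min_lat, max_lon, max_lat, cell_deg=0.25):
    """List every grid cell that a bounding box touches."""
    min_col, min_row = tile_key(min_lon, min_lat, cell_deg)
    max_col, max_row = tile_key(max_lon, max_lat, cell_deg)
    return [
        (col, row)
        for col in range(min_col, max_col + 1)
        for row in range(min_row, max_row + 1)
    ]


class TileCache:
    """In-memory cache of GeoJSON keyed by grid cell."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(key) -> GeoJSON dict, e.g. a WFS request
        self._store = {}

    def get(self, key):
        # Only call fetch the first time a cell is requested.
        if key not in self._store:
            self._store[key] = self._fetch(key)
        return self._store[key]
```

On each (debounced) view change you would compute the keys for the visible bbox and call `get` for each one, so panning back over an area already seen triggers no new requests.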
You can clip WFS results using a cql_filter (if it's supported), e.g.
'cql_filter=INTERSECTS(geom, SRID=4326;' + boundingPolygonWKT + ')'
or
"INTERSECTS(geom, SRID=4326;" + boundingPolygonWKT + ") AND BBOX(geom, " + bboxString + ", 'EPSG:4326')"
About biotope codes
Biotope codes found in surveys may be historical and no longer in use (non-standard). Surveys may have used local extensions or preliminary draft codes, been commissioned before the classification stabilised, or contain surveyor shorthand or typos. All of this is common in ecological survey data. For example, the Intertidal Phase 1 Habitat Survey was:
The JNCC Marine Habitat Classification for Britain and Ireland is here: https://mhc.jncc.gov.uk/
In the intertidal survey, f2004_code is the modern (JNCC) biotope identifier and "biotope" is the original survey code.
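Given those two fields, one way to pick a code to display is to prefer f2004_code and fall back to the original survey code when it is empty. This is a sketch of that rule only; the function name is hypothetical and the sample values are placeholders, not real biotope codes.

```python
def preferred_code(properties):
    """Return (code, source_field) for one feature's properties.

    Prefers the modern f2004_code and falls back to the original
    survey code in "biotope". Returns (None, None) if both are empty.
    """
    for field in ("f2004_code", "biotope"):
        value = (properties.get(field) or "").strip()
        if value:
            return value, field
    return None, None


# Placeholder values, not real codes.
print(preferred_code({"f2004_code": "", "biotope": "old.code"}))
```

Returning the source field alongside the code makes it easy to flag, in the UI, features that only carry a historical code.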
Extracting Gwynedd only biotope data
This was just practice with GDAL; the application ended up using WFS instead. The clipping below is to the administrative Gwynedd boundary, which excludes most of the intertidal and marine space (what we actually want). There may be a marine boundary available, but the generated layer would have been too big anyway, so WFS was used instead.
This may not have actually been necessary, but it did at least cut down the size of the layers. By extracting the Wales biotope data for Gwynedd only, the GeoJSON went from 200MB to 12MB. Further iterations, rather than loading the layer locally, may use the provided WFS endpoint, defined on https://datamap.gov.wales/layergroups/geonode:nrw_intertidal_phase_1_habitat_survey
Goal: From "Intertidal Phase 1 Habitat Survey Wales", extract only the data for Gwynedd coastline
For that I used the GDAL command line tools (rather than QGIS), specifically ogrinfo and ogr2ogr
brew install gdal
- Download the shapefile for "Local Authorities - High Water mark 2016"
- To see the shapefile attributes
ogrinfo localauthorities.shp -so -al
# OTHER USEFUL COMMANDS
# print all features and their attributes (page using less)
ogrinfo localauthorities.shp -al | less
# Dump attributes only (no geometry) to CSV (the CSV driver discards geometry by default):
ogr2ogr -f CSV la_attributes.csv localauthorities.shp
# Dump attributes + geometry to GeoJSON:
ogr2ogr -f GeoJSON out.json localauthorities.shp
# find matching features and print their attributes (-al)
ogrinfo localauthorities.shp -al -where "name_en='Gwynedd'" | less
- Extract just the Gwynedd boundary (actually a bad idea because doesn't include low water areas)
ogr2ogr -f "GeoJSON" gwynedd_boundary.geojson \
localauthorities.shp \
-where "name_en='Gwynedd'"
- Download the shapefile for "Intertidal Phase 1 Habitat Survey Wales". Note: it would have been better to download the GeoJSON (the modern way)
- Clip the Intertidal Phase 1 shapefile to the Gwynedd boundary (changed this to clip to a polygon around Gwynedd and Anglesey)
ogr2ogr -f "GeoJSON" gwynedd/DGW_nrw_ph1_intertidal_biotope_gwynedd.geojson \
nrw_ph1_intertidal_biotope/nrw_ph1_intertidal_biotope.shp \
-clipsrc gwynedd/gwynedd_boundary.geojson
What this does:
- Reads all Intertidal Phase 1 polygons
- Keeps only the geometry that falls inside the Gwynedd boundary (polygons crossing the boundary are cut at it)
- Writes the result to gwynedd/DGW_nrw_ph1_intertidal_biotope_gwynedd.geojson
- Download the GeoJSON for "notes" (these are the features) of "Intertidal Phase 1 Habitat Survey Wales"
- Clip the features to the Gwynedd boundary
ogr2ogr -f "GeoJSON" gwynedd/DGW_nrw_ph1_intertidal_biotope_features_gwynedd.geojson \
nrw_ph1_intertidal_features.geojson \
-clipsrc gwynedd/gwynedd_boundary.geojson
Loading layers into PostGIS
The following loads DGW_nrw_ph1_intertidal_biotope_gwynedd.geojson into a table called "gwynedd_nrw_ph1_intertidal_biotope". The layer in the GeoJSON file is called "nrw_ph1_intertidal_biotope".
ogr2ogr \
-f "PostgreSQL" \
PG:"host=localhost dbname=hmj user=hmj" \
DGW_nrw_ph1_intertidal_biotope_gwynedd.geojson \
-nln gwynedd_nrw_ph1_intertidal_biotope \
-nlt MULTIPOLYGON \
-lco GEOMETRY_NAME=geom \
-lco FID=gid \
-overwrite \
nrw_ph1_intertidal_biotope
What those options do:
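- -f "PostgreSQL" – output format
- PG:"host=localhost dbname=hmj user=hmj" – connection string for the target database
- DGW_nrw_ph1_intertidal_biotope_gwynedd.geojson – your input file
- -nln gwynedd_nrw_ph1_intertidal_biotope – name of the new table
- -nlt MULTIPOLYGON – declare the table's geometry type as MultiPolygon
- -lco GEOMETRY_NAME=geom – call the geometry column geom
- -lco FID=gid – primary key column name
- -overwrite – drop the table if it already exists
- nrw_ph1_intertidal_biotope – the name of the layer to read from the input file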
What loading a layer into PostGIS actually does
A GeoJSON file is a list of geometries, e.g. polygons or points, each with attributes. Each row in the table is simply one of these geometries plus its attributes. What's special is the geom field, as this is the spatial field used by PostGIS in spatial queries and operations, e.g.
SELECT b.f2004_code, h.habitat_type
FROM biotope_layer b
JOIN habitats_layer h
ON ST_Intersects(b.geom, h.geom);

It also means you can do normal SQL queries:
SELECT * FROM biotope_layer b WHERE siteid = 50;
and then display each record's geometry on a map.
How can a layer in PostGIS be displayed on a map?
Send the map GeoJSON (vector data). This means you have to implement an endpoint which queries PostGIS and returns GeoJSON.
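The core of such an endpoint is turning query rows into a FeatureCollection. A framework-independent sketch is below; it assumes each row is a dict whose "geometry" value is a GeoJSON geometry string, such as the output of PostGIS ST_AsGeoJSON(geom). The function name, the row shape and the sample values are assumptions for illustration.

```python
import json


def rows_to_feature_collection(rows, geometry_key="geometry"):
    """Turn query rows into a GeoJSON FeatureCollection.

    Every column except the geometry column becomes a feature property.
    """
    features = []
    for row in rows:
        properties = {k: v for k, v in row.items() if k != geometry_key}
        features.append({
            "type": "Feature",
            "geometry": json.loads(row[geometry_key]),
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical row, shaped like the result of
# SELECT siteid, f2004_code, ST_AsGeoJSON(geom) AS geometry FROM biotope_layer
rows = [{
    "siteid": 50,
    "f2004_code": "X",
    "geometry": '{"type": "Point", "coordinates": [-4.1, 53.2]}',
}]
collection = rows_to_feature_collection(rows)
print(json.dumps(collection))
```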
Using WFS instead
Data Map Wales provides both WMS and WFS endpoints for the biotope layer. This is what the tool actually used (rather than the layer loaded into PostGIS). WMS/WFS allows clipping through a cql_filter parameter, which lets you specify a bounding polygon as WKT:
https://datamap.gov.wales/layergroups/geonode:nrw_intertidal_phase_1_habitat_survey
//rough polygon around Anglesey and Gwynedd:
var boundingPolygonWKT = 'POLYGON((-4.0303302764888 53.287529645121,-3.990504837035 53.342498089685,-4.4533039093003 53.495543207761,-4.7705341339092 53.280550975858,-4.8309589385967 52.690265805667,-3.9383197784404 52.50508529651,-3.2022357940655 52.924367613604,-4.0234638214092 53.283835197482,-4.0296436309835 53.28732440641,-4.0275836944585 53.288966288494,-4.0303302764888 53.287529645121))';
var cqlFilter = encodeURIComponent("INTERSECTS(geom, SRID=4326;" + boundingPolygonWKT + ")");
// wfsBaseUrl is the WFS endpoint of the service, defined elsewhere
var wfsUrl = wfsBaseUrl +
'?service=WFS' +
'&version=2.0.0' +
'&request=GetFeature' +
'&typeNames=geonode:nrw_ph1_intertidal_biotope' +
'&outputFormat=application/json' +
'&cql_filter=' + cqlFilter +
'&srsName=EPSG:4326';
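If the request is built server-side instead, the same URL can be assembled in Python. This is a sketch under assumptions: the base URL is a placeholder, the function names are hypothetical, and the optional BBOX clause only mirrors the combined filter shown earlier in these notes. `urlencode` handles the escaping that `encodeURIComponent` does in the JavaScript version.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def clip_filter(polygon_wkt, bbox=None, geom_column="geom"):
    """Build a CQL expression limiting features to a polygon.

    bbox, if given, is (min_lon, min_lat, max_lon, max_lat) and is
    added as an extra BBOX clause.
    """
    expr = f"INTERSECTS({geom_column}, SRID=4326;{polygon_wkt})"
    if bbox is not None:
        bbox_string = ",".join(str(value) for value in bbox)
        expr += f" AND BBOX({geom_column}, {bbox_string}, 'EPSG:4326')"
    return expr


def build_wfs_url(base_url, type_name, cql_filter):
    """Assemble a WFS GetFeature URL returning GeoJSON."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "outputFormat": "application/json",
        "cql_filter": cql_filter,
        "srsName": "EPSG:4326",
    }
    return base_url + "?" + urlencode(params)


# Placeholder endpoint and a tiny made-up polygon.
wkt = "POLYGON((0 0,1 0,1 1,0 0))"
url = build_wfs_url(
    "https://example.org/wfs",
    "geonode:nrw_ph1_intertidal_biotope",
    clip_filter(wkt, bbox=(-5, 52, -3, 54)),
)
# Decoding the query string gives back the original filter text.
print(parse_qs(urlsplit(url).query)["cql_filter"][0])
```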
Beware the coordinate reference system (CRS)
| Purpose | CRS required |
|---|---|
| GeoJSON | EPSG:4326 |
| Leaflet internal math | EPSG:3857 |
| Tile layers (WMS) | EPSG:3857 |
| PostGIS native storage | EPSG:27700 (or whatever is appropriate) |
Rule of thumb: normalise geometry to EPSG:4326 on import
Leaflet requires EPSG:4326
Lots of UK data is stored in EPSG:27700 (British National Grid eastings/northings). In fact ST_Area, ST_Distance, etc. are often more meaningful in a projected CRS (metres) than in EPSG:4326 (lat/lon degrees).
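A small worked example of why the units matter. The shoelace formula below computes a planar area in squared coordinate units, so the same kind of calculation gives square metres for EPSG:27700 coordinates but a meaningless "square degrees" figure for EPSG:4326. All coordinates are made up, and this is plain Python rather than what ST_Area does internally.

```python
def ring_area(ring):
    """Planar area of a closed ring via the shoelace formula.

    The result is in squared coordinate units: square metres for
    EPSG:27700, "square degrees" (not a real area) for EPSG:4326.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# A 100 m x 100 m square in eastings/northings.
square_27700 = [(250000, 360000), (250100, 360000), (250100, 360100),
                (250000, 360100), (250000, 360000)]
print(ring_area(square_27700))  # 10000.0 (square metres)

# A 0.001 x 0.001 degree square in lon/lat: the number is not an area.
square_4326 = [(-4.0, 53.0), (-3.999, 53.0), (-3.999, 53.001),
               (-4.0, 53.001), (-4.0, 53.0)]
print(ring_area(square_4326))
```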
The GeoJSON states what the CRS is in its "crs" member (a legacy member: RFC 7946 removed it and fixes the CRS to WGS84):
"crs": {
  "type": "name",
  "properties": {
    "name": "urn:ogc:def:crs:EPSG::27700"
  }
}
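To decide whether a file needs transforming, the EPSG code can be read out of that member. A minimal sketch, assuming the name is either a URN like the one above or a short "EPSG:27700" style string; the function name is hypothetical, and anything else (e.g. a CRS84 name) is reported as unknown rather than guessed.

```python
def epsg_from_geojson(doc, default=4326):
    """Return the EPSG code declared in a GeoJSON document.

    Falls back to `default` (WGS84) when there is no "crs" member,
    and returns None when the name does not end in a numeric code.
    """
    crs = doc.get("crs")
    if not crs:
        return default
    name = crs.get("properties", {}).get("name", "")
    code = name.rsplit(":", 1)[-1]
    return int(code) if code.isdigit() else None


doc = {
    "type": "FeatureCollection",
    "crs": {"type": "name",
            "properties": {"name": "urn:ogc:def:crs:EPSG::27700"}},
    "features": [],
}
print(epsg_from_geojson(doc))  # 27700
```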
Leaflet requires EPSG:4326 lat/lon, so when supplying data for Leaflet you may need to transform the CRS.
The following Django endpoint:
- loads polygons from PostGIS (stored in EPSG:27700, OSGB eastings/northings)
- transforms them to EPSG:4326 (lat/lon for Leaflet)
- returns valid GeoJSON
from django.core.serializers import serialize
from django.http import HttpResponse

from .models import GwyneddNrwPh1IntertidalBiotope


def biotopes_geojson(request):
    """
    Minimal example: return GeoJSON for Leaflet.
    Geometry is stored in EPSG:27700 and transformed to EPSG:4326.
    """
    qs = GwyneddNrwPh1IntertidalBiotope.objects.all()
    # The geojson serializer reprojects the model's geometry field to the
    # requested srid (4326 is also its default), so no annotation is needed.
    geojson = serialize(
        "geojson",
        qs,
        geometry_field="geom",
        srid=4326,
        fields=("f2004_code", "biotope"),
    )
    return HttpResponse(geojson, content_type="application/json")
Internally Leaflet uses EPSG:3857 (Web Mercator)
Internally, Leaflet projects to EPSG:3857 (Web Mercator), because it is rendering a flat image of the map.
- When you add a marker, Leaflet converts the lat/lon into 3857 metres.
- When you pan/zoom, Leaflet calculates tile positions in EPSG:3857.
- When requesting WMS tiles, Leaflet typically sends BBOX in EPSG:3857.
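The lat/lon to EPSG:3857 conversion is the standard spherical Mercator formula. The sketch below shows the arithmetic only; Leaflet does this for you, and the function name and sample point are illustrative.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by EPSG:3857


def lonlat_to_web_mercator(lon, lat):
    """Convert EPSG:4326 degrees to EPSG:3857 metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return x, y


# An arbitrary point in north-west Wales.
print(lonlat_to_web_mercator(-4.13, 53.23))
```

Note the argument order: GeoJSON and this function use (lon, lat), whereas Leaflet's own API takes (lat, lon).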
WMS (usually) requires EPSG:3857
WMS returns rendered map images, typically requested as tiles, so they too usually use EPSG:3857. When displaying a WMS layer in a Leaflet map, use L.tileLayer.wms (which takes care of the required coordinate projections):
var layer = L.tileLayer.wms(wmsURL, {
layers: 'ALA:occurrences',
format: 'image/png',
...
...
});
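For a feel of what the BBOX in those tile requests looks like, the extent of a standard XYZ tile in EPSG:3857 can be computed as follows. This is an illustration of the usual Web Mercator tile grid, not Leaflet's actual code; the function name is hypothetical.

```python
ORIGIN_SHIFT = 20037508.342789244  # half the width of the EPSG:3857 world, in metres


def tile_bbox_3857(z, x, y):
    """Return (minx, miny, maxx, maxy) in EPSG:3857 for an XYZ tile.

    Tile (0, 0) is the top-left tile at every zoom level.
    """
    size = 2 * ORIGIN_SHIFT / (2 ** z)  # tile width/height in metres
    minx = -ORIGIN_SHIFT + x * size
    maxy = ORIGIN_SHIFT - y * size
    return (minx, maxy - size, minx + size, maxy)


# The single zoom-0 tile covers the whole projected world.
print(tile_bbox_3857(0, 0, 0))
```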
What is the meaning of siteid and siteno in the NRW / JNCC Phase 1 intertidal datasets?
siteid
- internal numeric identifier
- unique per survey site
- stable within the dataset
- used to group all biotope features for a site
siteno
- a human-readable site code, often like "10.49.1"
- typically hierarchical (area → sub-area → site)
- appears in survey reports, notebooks, hand-drawn maps
- NOT linked to any published site boundary layer
These fields let you do:
“group polygons by site”
but they do not imply the existence of a published site polygon.
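Grouping by site on the client or in a script needs nothing more than the properties. A minimal sketch over a GeoJSON FeatureCollection; the function name is hypothetical and the sample features have no real geometry.

```python
from collections import defaultdict


def group_by_site(feature_collection, key="siteid"):
    """Group GeoJSON features by a property (siteid by default)."""
    groups = defaultdict(list)
    for feature in feature_collection["features"]:
        groups[feature["properties"].get(key)].append(feature)
    return dict(groups)


# Made-up features; geometry omitted for brevity.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"siteid": 118, "siteno": "10.49.1"}},
        {"type": "Feature", "geometry": None, "properties": {"siteid": 118, "siteno": "10.49.1"}},
        {"type": "Feature", "geometry": None, "properties": {"siteid": 50, "siteno": "10.2.3"}},
    ],
}
sites = group_by_site(sample)
print({site: len(features) for site, features in sites.items()})
```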
For JNCC/NRW intertidal biotopes:
- The biotope polygons are the mapped geometry.
- The sites are conceptual units used in the survey planning and reporting.
- Site boundaries were usually:
  - hand-drawn on paper maps during fieldwork
  - digitised only as biotopes, not as “site outlines”
  - not published as separate datasets
So in almost all releases, site boundaries are missing.
So how do you map a “site”?
You can derive it (didn't actually use this).
A “site boundary” is logically:
Site polygon = the union of all biotope polygons with the same siteid
In PostGIS:
SELECT ST_Union(geom) AS site_geom
FROM gwynedd_nrw_ph1_intertidal_biotope
WHERE siteid = 118;
This returns a derived polygon for the site.
You can precompute all site polygons like this:
CREATE TABLE gwynedd_nrw_ph1_intertidal_biotope_sites AS
SELECT
siteid,
siteno,
ST_Union(geom) AS geom
FROM gwynedd_nrw_ph1_intertidal_biotope
GROUP BY siteid, siteno;
Now you have a site-boundary layer.
Create a Django model:
python manage.py inspectdb gwynedd_nrw_ph1_intertidal_biotope_sites \
> habitats_and_biotopes/models_inspected.py
Miscellaneous info
The biotope list (codes and hierarchy structure) can be found here (Marine Habitat Classification for Britain and Ireland):
https://mhc.jncc.gov.uk/media/1050/22_04_full_biotope_list.xlsx
This is a page for a biotope:
https://mhc.jncc.gov.uk/biotopes/jnccmncr00000658
ogr2ogr is a command-line tool (part of GDAL) used to convert, clip, filter, reproject, and manipulate vector geospatial data (Shapefiles, GeoJSON, GPKG, PostGIS, etc.).
There are more examples of biotopes than there are of habitats since biotopes are usually small in size. Habitats are usually classified into three main categories, which include marine, freshwater, and terrestrial habitats. Examples of terrestrial habitats include deserts, forests, savannahs, steppe, grasslands, and glaciers among others. Freshwater habitats include rivers, ponds, lakes, marshes, estuaries and streams, and underground rivers and lakes. Examples of marine habitats are reefs, deep seas, submarine vents, salt marshes, beaches, and the open sea. Examples of biotopes are too numerous to list but include stones, bushes, flower pots, gardens, mud, and much more.
A biotope is defined as a suite of species, adapted to specific environmental conditions, within a particular habitat type. It encompasses both the biotic (living) and abiotic (non-living) components.
Biotope literally means “place of life.” It’s the physical environment that provides the conditions for a particular community of organisms — including things like soil type, hydrology, and microclimate. It’s more about the abiotic factors (the non-living environment).
Habitat refers to the living place of a particular species or group of species — the area where an organism normally lives and grows. It’s a more species-centred concept.
Biotope = the environment as defined by its physical and chemical characteristics (plus the community as a whole).
Habitat = the living space of a particular species within that environment.
Recorder 6 biotope pages
https://www.recorder6.info/WebHelpR6V625/Topics/Report_Wizard_Biotope_Selection.htm
https://www.recorder6.info/WebHelpR6V625/Topics/Biotope_Occurrence_Overview.htm
https://www.recorder6.info/WebHelpR6V625/Topics/Finding_Biotopes.htm
https://www.recorder6.info/WebHelpR6V625/Topics/Biotope_Dictionary_Browser.htm
https://www.recorder6.info/WebHelpR6V625/Topics/New_Biotopes_Overview.htm