Earth Observation Data

Recent work from European member states and other institutes shows promising results in predicting land use, land cover, land use change, crop yield, crop typology and greenery in cities from earth observation data such as satellite and aerial imagery. Various AI/ML models have been built with state-of-the-art methods and tools. The question is whether these kinds of developments can be shared so that more NSIs can benefit from them: build once, run anywhere!

The research question we would like to answer in this work package is: can these existing AI/ML models using earth observation data be generalised over space (countries) and time (periods) and under what conditions?

We want to apply those methods to other countries or other timeframes and compare the results. This would demonstrate which methods and tools work well to date and which do not yield useful results, what data sources are needed (and at what level they are available), what kind of (pre)processing is needed, what infrastructure this (pre)processing requires and how to interpret the results. It would also demonstrate whether a generalised processing pipeline and infrastructure is possible at all and what the specifications of such a pipeline would be. If it proves feasible, we will develop methodological and implementation guidelines. As a side effect, we will have results in terms of (validated) predictions for countries and timeframes other than those of the original study.

Completed Deliverables

Deliverable 7.1 - A report on two AI models using earth observation data, and an investigation of their generalisability to different countries and timeframes.

Recent work

Crop Type Mapping

We have rebuilt, optimized, and fully automated version 2.0 of our crop type mapping pipeline (Sentinel-1 OBIA & ANN/SAM). The code has been successfully pushed to a dedicated branch on GitHub.

Key Improvements & Features in v2.0:

Dynamic country support (NUTS2): We have removed hardcoded orbit/track codes (such as P1, P2) and deprecated date arrays. We introduced an automated script download_nuts_shapefiles.py that dynamically downloads and builds administrative boundary shapefiles for any European country directly from Eurostat.
Greedy set cover orbit optimization: The preprocessor now automatically reads the pass direction (Ascending/Descending) from Sentinel-1 manifest.safe files. Using a greedy set cover solver, it selects a single pass direction and the fewest relative orbits it can find that cover the target country, avoiding mixed passes and minimizing processing time.
Sequential processing and memory safety (Anti-OOM): To prevent Out-Of-Memory (OOM) errors and CPU over-allocation by SNAP (gpt.exe) and GDAL, the entire pipeline (calibration, coregistration, clipping, and classification) executes sequentially (orbit-by-orbit, date-by-date) and utilizes block-based processing (tiled read/write).
Interactive classifier stage selector (CLI Menu): We built an interactive text menu inside the classification scripts (1_OBIA_vector_classifier_modular_ANN.py and the SAM variant). You can run all stages at once (option [A]) or execute steps individually while tuning hyperparameters directly in the terminal (e.g. segmentation scale, hidden layer architecture, or class balancing thresholds).
Segment Anything Model (SAM) Integration: The SAM-based pipeline now features full GPU acceleration (CUDA) support with CPU fallback, detailed installation guidelines, and a direct download path for the official Meta AI weights file (sam_vit_h_4b8939.pth).
Seamless country-wide merging: The unified merger script 2_OBIA_merge_classifications.py automatically scans all processed orbits for a target country, warps and merges them into a single final classification map, and generates a unified Excel metrics report (containing Overall Accuracy, Kappa, F1-scores, and areas in hectares) based on independent validation points.
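
To illustrate the orbit selection idea described above, here is a minimal sketch of a greedy set cover in Python. It is not the code from the repository. It assumes the pass direction and footprint of each relative orbit have already been extracted, and that the target country has been split into tiles; the function names, tile identifiers and orbit numbers are hypothetical. Note that a greedy solver is a heuristic: it usually finds a small cover, but it does not guarantee the true minimum.

```python
from typing import Dict, List, Set, Tuple


def greedy_orbit_cover(coverage: Dict[int, Set[str]], target: Set[str]) -> List[int]:
    """Pick relative orbits greedily until every target tile is covered."""
    uncovered = set(target)
    chosen: List[int] = []
    while uncovered:
        # Take the orbit covering the most still-uncovered tiles;
        # ties go to the lowest orbit number because the keys are sorted.
        best = max(
            sorted(coverage),
            key=lambda orbit: len(coverage[orbit] & uncovered),
            default=None,
        )
        if best is None or not (coverage[best] & uncovered):
            raise ValueError(f"tiles not covered by any orbit: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


def select_pass(
    orbits_by_pass: Dict[str, Dict[int, Set[str]]], target: Set[str]
) -> Tuple[str, List[int]]:
    """Solve the cover per pass direction and keep the one needing fewest orbits."""
    solutions: Dict[str, List[int]] = {}
    for direction, coverage in orbits_by_pass.items():
        try:
            solutions[direction] = greedy_orbit_cover(coverage, target)
        except ValueError:
            continue  # this direction alone cannot cover the country
    if not solutions:
        raise ValueError("no single pass direction covers the target")
    best_direction = min(sorted(solutions), key=lambda d: len(solutions[d]))
    return best_direction, solutions[best_direction]


# Hypothetical example: four tiles (a-d) and made-up relative orbit numbers.
example = {
    "ASCENDING": {15: {"a"}, 88: {"b", "c"}, 161: {"d"}},
    "DESCENDING": {37: {"a", "b", "c"}, 110: {"c", "d"}},
}
direction, orbits = select_pass(example, {"a", "b", "c", "d"})
```

In this toy input the ascending pass needs three orbits and the descending pass needs two, so the sketch selects the descending pass with orbits 37 and 110. Solving each direction separately is what keeps ascending and descending acquisitions from being mixed.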
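
For readers less familiar with the accuracy figures named in the metrics report, the sketch below shows how Overall Accuracy, Cohen's Kappa and per-class F1-scores can be computed from validation points. It is an illustration using only the Python standard library, not the merger script itself. It assumes each validation point carries a reference crop_id and a predicted crop_id; the function name and the sample values are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def accuracy_metrics(reference: Sequence[int], predicted: Sequence[int]) -> Dict[str, object]:
    """Overall accuracy, Cohen's kappa and per-class F1 from paired labels."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("reference and predicted must be non-empty and equally long")

    n = len(reference)
    pairs = Counter(zip(reference, predicted))  # confusion matrix as (ref, pred) counts
    ref_totals = Counter(reference)             # row totals
    pred_totals = Counter(predicted)            # column totals
    classes = sorted(set(reference) | set(predicted))

    # Observed agreement: share of points on the confusion matrix diagonal.
    observed = sum(pairs[(c, c)] for c in classes) / n
    # Agreement expected by chance, from the row and column totals.
    expected = sum(ref_totals[c] * pred_totals[c] for c in classes) / (n * n)
    # Kappa is undefined when chance agreement is 1 (a single class everywhere).
    kappa = (observed - expected) / (1 - expected) if expected < 1 else math.nan

    # F1 = 2*TP / (2*TP + FP + FN), and 2*TP + FP + FN = row total + column total.
    f1 = {c: 2 * pairs[(c, c)] / (ref_totals[c] + pred_totals[c]) for c in classes}

    return {"overall_accuracy": observed, "kappa": kappa, "f1": f1}


# Hypothetical validation points: reference crop_id vs. classified crop_id.
reference_ids = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
predicted_ids = [1, 1, 2, 2, 2, 2, 3, 3, 3, 1]
metrics = accuracy_metrics(reference_ids, predicted_ids)
```

Kappa corrects the overall accuracy for the agreement that would occur by chance, which is why it is typically reported next to Overall Accuracy when class sizes are unbalanced.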

Getting Started

We have written a comprehensive guide in the project’s root README.md file, which covers:

System prerequisites and Conda environment setup for both Windows and Linux.
Environment variables that must be configured in your shell before running the scripts.
Specifications for training/validation shapefiles (samples.shp), requiring the crop_id integer attribute.