DaFab Summer School

DaFab Summer School on Earth Observation and AI: September 22-25, 2025, in Ljubljana, Slovenia

Applying AI to Earth Observation

4 days of lectures and hands-on sessions

Register

Join us for a summer school dedicated to AI applied to Earth Observation data, in the beautiful city of Ljubljana, Slovenia.

This summer school offers a unique opportunity to delve into the cutting-edge intersection of Artificial Intelligence (AI) and Earth Observation (EO), combined with the essential skills for managing large-scale data and workflows in modern computing environments. Participants will gain theoretical knowledge and practical experience in applying AI techniques to analyze EO data, optimizing AI performance, managing complex workflows with Kubernetes, and handling massive datasets.

DaFab is a project funded by ESA under grant agreement 101128693 — HORIZON-EUSPA-2022-SPACE

DaFab Summer School, September 22-25, 2025

DaFab is proud to be supported by AI for SCIENCE 2025 for its summer school on Earth Observation data and AI. Over four days we will cover topics such as AI models for satellite images, Kubernetes workflows, metadata generation and management, and performance analysis.

Each day, the morning is devoted to lectures and the afternoon to hands-on sessions provided by the DaFab Consortium.

Important dates

12 lecturers from all over Europe

The lecturers selected for the summer school come from prestigious organizations across Europe, from both academia and industry. Don't miss the opportunity to exchange with seasoned professionals.

4 days with 2 sessions per day

The program spans 4 days: the first dedicated to AI and EO, the second to AI and performance, the third to workflow management, and the last to Earth Observation and data management.

Access to world-class supercomputers

Hands-on sessions will provide access to leadership-class supercomputers to test, experiment, and learn on real systems.

Program

The DaFab summer school is organized over 4 days, with 3 hours of lectures in the morning and a 2-hour coding and hands-on session in the afternoon.

  • Monday Sep. 22, AI and EO
  • Tuesday Sep. 23, AI and Performance
  • Wednesday Sep. 24, Workflows
  • Thursday Sep. 25, EO and Data Management

9h00-12h00 AI and Earth Observation

9h00 Plenary Session

AI introduction in Remote Sensing

Alena Bakhorina, GCore

Evolution of AI technologies. Era of Foundation Models. Foundation Models in geoscience & datasets used for training. Popular Python frameworks for working with Earth Observation data. European projects in EO and AI.

9h45 Plenary Session

Earth Observation Remote Sensing: From Copernicus to Thales Alenia Space AI Applications

Michelle Aubrun, Thales Alenia Space

Michelle Aubrun received the Engineering degree in civil engineering and geomatics in 2013 and the Ph.D. degree in geography on SAR satellite image processing from the University of Montreal, Montreal, Canada, in 2019. She joined Thales Alenia Space in the beginning of 2018. Since then, she has been involved in several projects concerning image processing by deep learning approaches for remote sensing applications. Since 2020, she has also been a Researcher with the French Research Institute of Technology Saint Exupery, Toulouse, France. Her research interests include image representation learning with self-supervised approaches.

What is Earth Observation remote sensing? What are the key characteristics of EO data? What are its main fields of application? We will also introduce a concrete EO program called Copernicus.

10h30

** 30 minute break **

11h00 Plenary Session

Earth Observation data access tools

Alena Bakhorina, GCore

Introduction to the STAC (SpatioTemporal Asset Catalog) specification for discovering geospatial information. Examples of using Python libraries for efficient data processing.
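To make the STAC specification concrete before the session: a STAC Item is plain JSON, so it can be sketched with the standard library alone. The scene id, bounding box, and asset URL below are placeholders, and the session itself may rely on dedicated libraries such as pystac or pystac-client rather than raw dicts.

```python
import json
from datetime import datetime, timezone

# A minimal STAC Item: an id, a geometry/bbox, a datetime,
# and links to the actual assets (e.g. a cloud-optimized GeoTIFF).
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "S2-example-tile",           # hypothetical scene id
    "bbox": [14.4, 46.0, 14.6, 46.1],  # lon/lat box around Ljubljana
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[14.4, 46.0], [14.6, 46.0], [14.6, 46.1],
                         [14.4, 46.1], [14.4, 46.0]]],
    },
    "properties": {
        "datetime": datetime(2025, 9, 22, tzinfo=timezone.utc).isoformat(),
    },
    "assets": {
        "visual": {
            # placeholder URL, not a real catalog entry
            "href": "https://example.com/S2-example-tile/visual.tif",
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
        }
    },
    "links": [],
}

print(json.dumps(item, indent=2))
```

Because Items are just JSON Features, a catalog search boils down to filtering on `bbox`, `properties["datetime"]`, and collection membership — which is exactly what the STAC API endpoints expose.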

11h45 Plenary Session

Earth Observation Remote Sensing: Thales Alenia Space AI Applications

Michelle Aubrun, Thales Alenia Space

What type of applications does Thales Alenia Space develop using Earth Observation data enhanced by artificial intelligence algorithms? Discover how AI-driven EO solutions are shaping innovative applications in various fields.

12h30

** lunch break **

14h00 Hands-on Session

Hands-on session: participants will have the opportunity to access a remote supercomputer to run tests and experiments.

Prerequisites:

to be detailed

Outline of the sessions

to be detailed

9h00-12h00 AI and Performance

9h00 Plenary Session

Building the High-Performance Core of AI Factories

Farouk Mansouri, LuxProvide

AI factories must be designed to handle the most demanding stages of the machine learning lifecycle—data ingestion, preprocessing, and large-scale model training. These early phases place extreme demands on infrastructure, from GPU-accelerated supercomputers and high-throughput storage to low-latency interconnects and distributed data pipelines. This session explores how to architect and optimize dynamic, high-performance environments capable of processing massive datasets, orchestrating parallel training jobs, and scaling resources efficiently. Participants will gain practical insights into overcoming data bottlenecks, maximizing hardware utilization, and integrating HPC and cloud resources to deliver speed, scalability, and cost-efficiency at the start of the AI production chain.
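As a toy illustration of the data-pipeline idea above (not the session's actual stack), the following standard-library sketch overlaps a simulated ingestion stage with a simulated training stage through a bounded prefetch queue — the same producer/consumer pattern that GPU data loaders use to hide I/O latency:

```python
import queue
import threading
import time

def producer(q: queue.Queue, n_batches: int) -> None:
    """Simulate data ingestion/preprocessing (the I/O-bound stage)."""
    for i in range(n_batches):
        time.sleep(0.01)          # stand-in for reading/decoding a batch
        q.put(i)
    q.put(None)                   # sentinel: no more data

def consumer(q: queue.Queue, results: list) -> None:
    """Simulate training steps consuming prefetched batches."""
    while (batch := q.get()) is not None:
        time.sleep(0.01)          # stand-in for a training step
        results.append(batch)

q = queue.Queue(maxsize=4)        # bounded buffer = prefetch depth
results = []
t = threading.Thread(target=producer, args=(q, 10))
t.start()
consumer(q, results)              # runs concurrently with the producer
t.join()
print(f"consumed {len(results)} batches")
```

With the two stages overlapped, total wall time approaches the slower of the two stages rather than their sum; in a real AI factory the same reasoning drives the sizing of storage throughput against GPU consumption rate.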

9h45 Plenary Session

Evolving Cloud Infrastructures for AI Workloads – High-Performance Foundations for AI Training

Clara Ulken, GCore

At GCore, Clara is leading cross-functional initiatives that align engineering, product, and business strategies to deliver scalable, high-impact technology solutions.

AI development begins with workloads that demand extreme performance. Training large language models and generative AI systems requires GPU acceleration, low-latency interconnects, and scalable storage to handle massive datasets. This session explores how Gcore’s GPU Cloud—powered by NVIDIA A100, H100, and H200 instances with InfiniBand networking—delivers HPC-grade capabilities in a flexible, cloud-native environment. We will look at distributed training strategies, mixed precision techniques, and orchestration tools that transform supercloud-class resources into elastic infrastructures for enterprise AI.

10h30

** 30 minute break **

11h00 Plenary Session

Delivering Agility: From Trained Models to Scalable AI Services

Farouk Mansouri, LuxProvide

Once AI models are trained, the challenge shifts to delivering them as fast, reliable, and cost-effective services. Inference workloads require entirely different priorities—low latency, high availability, and seamless integration into production systems—often through containerized microservices, auto-scaling cloud platforms, or edge computing. This session examines how to bridge the performance–agility gap, from compressing and optimizing models for deployment to building resilient MLOps pipelines for monitoring, retraining, and governance. Real-world patterns and case studies will illustrate how to move from HPC-heavy training to lightweight, scalable inference while maintaining cost control, compliance, and operational excellence.

11h45 Plenary Session

Evolving Cloud Infrastructures for AI Workloads – Agility at Scale: Deploying AI Everywhere

Clara Ulken, GCore

AI is no longer confined to research or centralized datacenters, but it increasingly powers real-time, interactive experiences where every millisecond counts. This session focuses on Gcore’s Everywhere Inference platform, which brings models closer to end users, enabling ultra-low latency, reducing unnecessary backhaul traffic, and supporting regional data handling requirements. Gcore has extended its global edge infrastructure with GPU-powered Points of Presence worldwide, creating a platform designed for workloads where speed and locality make a measurable difference. With Kubernetes integration, autoscaling, and support for hybrid deployments, enterprises can deploy optimized inference services that adapt in real time to diverse workloads. From finance to conversational AI, fraud detection and live content personalization, to immersive gaming and AR/VR, inference at the edge delivers tangible improvements in performance and user experience.

12h30

** lunch break **

14h00 Hands-on Session

Hands-on session: participants will have the opportunity to access a remote supercomputer to run tests and experiments.

Part 1 – HPC Training/Finetuning of an LLM

Goal: Show participants how to run and scale a training job on an HPC system. (45 minutes)

1. Introduction to HPC Infrastructure

o Overview of the HPC cluster architecture (login nodes, compute nodes, GPU nodes)
o SLURM basics: job submission, partitions, scheduling policies
o Storage layout and data transfer tips

2. Exploration of Performance Tools

o htop, nvidia-smi, ibstat for InfiniBand, iostat for disk I/O
o Brief on nvtop or similar GPU monitoring tools
o How to interpret load, memory, and network usage

3. Simple Run

o Launch a minimal LLM training job (small dataset, reduced parameters)
o Show SLURM job submission (sbatch), log checking (squeue, sacct)
o Walk through model directory structure and outputs

4. Run at Scale

o Increase dataset/model size and GPU count
o Demonstrate distributed training (PyTorch DDP, DeepSpeed, or Megatron-LM)
o Monitor scaling behavior and GPU utilization in real time

5. Monitoring & Optimization

o Detect bottlenecks (GPU idle time, I/O wait, network congestion)
o Adjust batch size, precision (FP32 vs FP16), and parallelism strategies
o Quick discussion: trade-offs between speed and accuracy
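The precision trade-off in step 5 can be grounded with a back-of-the-envelope memory estimate. The 7B-parameter model and the "weights + gradients + two Adam moment buffers" accounting below are illustrative assumptions, not the session's actual configuration (real mixed-precision setups also keep FP32 master weights):

```python
def training_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Rough training memory: weights + gradients + 2 optimizer moment
    buffers (Adam), all assumed to be stored at the same precision.
    Ignores activations, master weights, and framework overhead."""
    return 4 * n_params * bytes_per_param / 1e9

n = 7e9  # a hypothetical 7B-parameter model
fp32 = training_memory_gb(n, 4)
fp16 = training_memory_gb(n, 2)
print(f"FP32: {fp32:.0f} GB, FP16: {fp16:.0f} GB")
```

Even this crude estimate shows why halving the precision roughly halves the per-GPU memory footprint, which in turn allows larger batch sizes or fewer GPUs per replica.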

14h45

** 10 minute break **


Part 2 – Kubernetes Deployment of the Trained Model

Goal: Deploy models as a scalable inference service (45 minutes)

1. Introduction to Kubernetes Infrastructure

o K8s architecture: master, worker nodes, pods, services, ingress
o Overview of deployment options: cloud K8s, on-prem, hybrid
o Brief on container images and registries

2. Exploration of Performance Tools

o kubectl top nodes/pods for resource monitoring
o Logs (kubectl logs), kubectl describe for debugging
o K8s dashboard or Lens for visual inspection

3. Simple Run

o Deploy model as a single-replica pod with REST API (FastAPI/Flask)
o Expose via NodePort or port-forwarding
o Test with a sample query

4. Run at Scale

o Scale replicas using kubectl scale or HPA (Horizontal Pod Autoscaler)
o Demonstrate load testing (e.g., hey or ab command)
o Show autoscaling behavior under increasing load

5. Monitoring & Optimization

o Identify bottlenecks: CPU/GPU constraints, network latency
o Optimize container startup, model loading time, batch inference
o Discuss cost/performance trade-offs in scaling
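As a rough illustration of the scaling step above, the replica count for an inference Deployment can be sized with Little's law (in-flight requests = arrival rate × latency). All numbers below are hypothetical; a real HPA would react to observed metrics rather than a static formula:

```python
import math

def replicas_needed(req_per_s: float, latency_s: float,
                    concurrency_per_pod: int) -> int:
    """Little's law sizing: concurrent in-flight requests equal
    arrival rate times per-request latency; divide by how many
    requests one pod can serve concurrently."""
    in_flight = req_per_s * latency_s
    return max(1, math.ceil(in_flight / concurrency_per_pod))

# e.g. 200 req/s at 250 ms latency, 8 concurrent requests per pod
print(replicas_needed(req_per_s=200, latency_s=0.25, concurrency_per_pod=8))
```

The same relation explains the trade-offs in the optimization step: shaving model-loading or batching latency directly reduces the number of replicas, and hence the cost, needed for a given load.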

15h40

** 10 minute break **


Part 3 – End-to-End Challenge & Wrap-Up

Goal: Apply both HPC and K8s learnings to simulate a full AI factory pipeline (45 minutes)

1. End-to-End Integration

o Take the trained model from HPC output
o Package it into a Docker image
o Push image to registry for K8s deployment

2. Performance Challenge

o Teams run the model at scale and optimize throughput & latency
o Compare scaling behavior between HPC training and K8s inference
o Introduce optional constraints (cost cap, latency target)

3. Wrap-Up & Key Takeaways

o Recap performance/agility lessons from lectures
o Share best-practices cheat sheet for HPC and K8s operations
o Q&A + feedback

9h00-12h30 Beyond stand-alone performance: AI Workflow

9h00 Plenary Session

Containers/Docker

Giorgos Saloustros, FORTH

Abstract to be provided

9h45 Plenary Session

Orchestration/Kubernetes

Antony Chazapis

Abstract to be provided

10h30

** 30 minute Coffee break **

11h00 Plenary Session

Workflows

Antony Chazapis

Abstract to be provided

11h45 Plenary Session

Argo

Lefteris Vasilakis

Abstract to be provided

12h30

** lunch break **

14h00 Hands-on Session

Hands-on session: participants will have the opportunity to access a remote supercomputer to run tests and experiments.

Knot

Giorgos Saloustros/Antony Chazapis

Prerequisites:

to be detailed

Outline of the sessions

to be detailed

15h00

** 30 minute Coffee break **

15h30 Hands-on Session

Hands-on session: participants will have the opportunity to access a remote supercomputer to run tests and experiments.

Hands-on: create your own workflow with Knot/Argo

Lefteris Vasilakis/Antonis Tapanlis

Prerequisites:

to be detailed

Outline of the sessions

to be detailed

9h00-12h00 Earth Observation and Data Management

9h00 Plenary Session

The AI Factory for Earth

Stephanie Giard, DDN

Stephanie is a Senior Product Marketing Manager at DDN. She brings extensive experience in Earth Observation from her roles at Planet, Airbus Defence & Space, Boeing, and Ubotica. Based in Fort Collins, CO, she holds a Master’s degree in Remote Sensing and GIS from Boston University.

The Earth Observation data market is undergoing an important evolution. Originally, the market was driven by data acquisition and its cost and complexity. The Earth Observation community has been extremely successful in its effort to democratize data acquisition; as a result, the community is drowning in data and starving for insight. The bottleneck is not satellites or sensors—it's data infrastructure. In this talk we will discuss this ongoing evolution and the key data infrastructure technologies required to address this problem, specifically in the time of AI.

9h45 Plenary Session

Data and Performance: Key factors and Parameters

Jean-Thomas Acquaviva, DDN

In this presentation we will cover the traditional pitfalls of data logistics, from management to exploitation, and introduce useful metrics to assess the scalability and relevance of a data infrastructure.
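One classic example of such a scalability metric is parallel speedup and efficiency under Amdahl's law; the 5% serial fraction below (e.g. a metadata or coordination step that does not parallelize) is an illustrative assumption, not a measured value:

```python
def amdahl_speedup(serial_fraction: float, n_nodes: int) -> float:
    """Amdahl's law: achievable speedup is capped by the fraction of
    the workload that cannot be parallelized."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_nodes)

for n in (2, 8, 32, 128):
    s = amdahl_speedup(0.05, n)   # assume 5% of the pipeline is serial
    print(f"{n:4d} nodes: speedup {s:5.1f}, efficiency {s / n:.0%}")
```

The rapidly dropping efficiency at high node counts is exactly why data-infrastructure design focuses on shrinking the serial, non-scalable parts of the pipeline.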

10h30

** 30 minute break **

11h00 Plenary Session

Metadata Catalog: The needle and the haystack

Dimitrios Xenakis, CERN

TBA

11h45 Plenary Session

Observability and data performance at the time of AI

Jean-Thomas Acquaviva, DDN

In this presentation we will review suitable tools and methodologies to observe I/O performance and tune workloads to remove potential data bottlenecks. This session is an introduction to the hands-on tutorial in the afternoon.
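The session's actual tooling will be presented in the talk; as a minimal standard-library sketch of the underlying idea, the following measures raw write bandwidth to a temporary file. Results are rough and heavily dependent on the filesystem and page cache, which is precisely the kind of effect the observability discussion addresses:

```python
import os
import tempfile
import time

def measure_write_bw(size_mb: int = 64) -> float:
    """Write size_mb mebibytes to a temp file and return a rough MB/s figure."""
    data = os.urandom(1024 * 1024)            # one 1 MiB block
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        t0 = time.perf_counter()
        for _ in range(size_mb):
            f.write(data)
        f.flush()
        os.fsync(f.fileno())                  # force data to disk, not just the page cache
        elapsed = time.perf_counter() - t0
    os.remove(path)
    return size_mb / elapsed

print(f"~{measure_write_bw():.0f} MB/s sequential write")
```

Comparing the number with and without the `os.fsync` call is a quick way to see how much the page cache inflates naive benchmarks — a recurring theme when diagnosing I/O bottlenecks.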

12h30

** lunch break **

14h00 Hands-on Session

Hands-on session: participants will have the opportunity to access a remote supercomputer to run tests and experiments.

Prerequisites:

to be detailed

Outline of the sessions

to be detailed

Steering Committee

DaFab is a European project sponsored by ESA with a focus on applying AI techniques to Earth Observation data. Having observed the limited offering of interdisciplinary events in Europe, we decided to set up this summer school! The summer school is organized by the DaFab consortium as a whole; nevertheless, five members are specifically involved in the organization.




Register for the Summer School on the AI4Science registration page


The summer school will be held at the Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, Ljubljana, Slovenia: Access map









Sponsors

DaFab would like to warmly thank its sponsors: