Fundamental Concepts

Authors
Affiliations
TU Wien

This notebook explains some of the foundational concepts behind open source geospatial data.

What does Open Source mean?

First of all, let’s look at what “open source” actually means.

Open source means that something, like software, data, or a project, is made freely available for anyone to use, study, modify, and share. The key idea is openness: instead of being controlled by a single company or locked behind restrictions, the “source” (the underlying code or data) is accessible to the public. This allows people to collaborate, improve the work together, and adapt it for different needs.

One prominent example of open source geospatial data, which we will be using in the practical part later, is OpenStreetMap.

What is OpenStreetMap (OSM)?

There are various sources for spatial data, but OpenStreetMap (OSM) is arguably the most accessible and comprehensive open option. OSM is a free, collaborative database of the whole world built by volunteers, a concept known as Volunteered Geographic Information (VGI). Crucially, unlike services such as Google Maps, which primarily provide pre-rendered map images (raster tiles), OSM provides access to the raw vector data (points, lines, and polygons). This allows us to analyze the geometry itself rather than just view it. More information is available on the OSM wiki: https://wiki.openstreetmap.org/

OSM data is available in several file formats, including PBF, XML, and JSON. For processing large files, PBF is often recommended because it is a highly compressed binary format. However, we will download our data as an XML file, since this is the native output of the API and is also human-readable, which makes it easier to verify our data and understand its structure while learning.
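To get a feeling for why the XML format is easy to inspect, here is a minimal sketch that reads a tiny, hand-written snippet in the OSM XML layout using Python's standard library. The sample data (IDs, coordinates, names) and the helper name `parse_osm_xml` are made up for illustration; real downloads contain many more elements and attributes.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the OSM XML layout.
# All IDs, coordinates, and names are invented for illustration.
SAMPLE_OSM_XML = """<osm version="0.6" generator="hand-written sample">
  <node id="1" lat="48.2000" lon="16.3700">
    <tag k="amenity" v="cafe"/>
    <tag k="name" v="Example Cafe"/>
  </node>
  <node id="2" lat="48.2010" lon="16.3710"/>
  <node id="3" lat="48.2020" lon="16.3720"/>
  <way id="10">
    <nd ref="2"/>
    <nd ref="3"/>
    <tag k="highway" v="residential"/>
  </way>
</osm>
"""


def read_tags(element):
    """Collect the <tag k="..." v="..."/> children of an element into a dict."""
    return {tag.get("k"): tag.get("v") for tag in element.findall("tag")}


def parse_osm_xml(xml_text):
    """Return (nodes, ways) dictionaries keyed by the element id (as a string)."""
    root = ET.fromstring(xml_text)

    # Nodes are points: each one carries a latitude and a longitude.
    nodes = {}
    for node in root.findall("node"):
        nodes[node.get("id")] = {
            "lat": float(node.get("lat")),
            "lon": float(node.get("lon")),
            "tags": read_tags(node),
        }

    # Ways are ordered lists of node references (lines or closed polygons).
    ways = {}
    for way in root.findall("way"):
        ways[way.get("id")] = {
            "node_refs": [nd.get("ref") for nd in way.findall("nd")],
            "tags": read_tags(way),
        }
    return nodes, ways


nodes, ways = parse_osm_xml(SAMPLE_OSM_XML)
print(len(nodes), "nodes and", len(ways), "way(s) found")
print(nodes["1"]["tags"])
print(ways["10"]["node_refs"])
```

Note how the way does not store coordinates itself; it only references nodes by ID, so the geometry of a line or polygon has to be assembled from the nodes it points to.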

What is an API?

An API (Application Programming Interface) is a way for different software systems to communicate with each other. It allows our code to request data or services from another system without needing to understand how that system is built internally.

In the context of geospatial data, an API lets us make requests like “Give me all the data within a certain area”. We send these requests to a server, and the API responds with structured data (such as XML or JSON) that our program can process.
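As a rough illustration of such a request, the sketch below builds the URL for the `map` call of the OSM API (version 0.6), which returns the data inside a bounding box as XML. The function names, the example coordinates, and the `User-Agent` string are assumptions made for this example; the download function is defined but not called here, and this endpoint is only suitable for small areas.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint of the OSM API v0.6 "map" call (returns XML for a bounding box).
OSM_MAP_ENDPOINT = "https://api.openstreetmap.org/api/0.6/map"


def build_map_request_url(west, south, east, north):
    """Build the request URL; the bbox order is west, south, east, north."""
    if not (west < east and south < north):
        raise ValueError("Expected west < east and south < north")
    bbox = f"{west},{south},{east},{north}"
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return f"{OSM_MAP_ENDPOINT}?{urlencode({'bbox': bbox}, safe=',')}"


def fetch_map_xml(url, timeout=30):
    """Send the request and return the XML response as text (needs network)."""
    # The User-Agent value is a placeholder chosen for this example.
    request = Request(url, headers={"User-Agent": "osm-notebook-example"})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example coordinates (a small, arbitrary box) used only for illustration.
url = build_map_request_url(west=16.36, south=48.20, east=16.38, north=48.21)
print(url)
```

The printed URL is `https://api.openstreetmap.org/api/0.6/map?bbox=16.36,48.2,16.38,48.21`. Calling `fetch_map_xml(url)` would send the request and return the server's XML answer as a string.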

For OpenStreetMap, APIs are especially important because they give us direct access to the underlying geographic data. The Overpass API, together with its web front end Overpass Turbo, uses a specialized query language (Overpass QL) that lets us define precisely what data we want and where we want it from.

What is Overpass Turbo?

Overpass is a powerful read-only API for querying the massive OSM database; Overpass Turbo is a web-based front end for writing and running Overpass queries. Overpass is only one of many ways to download OSM data (another is Geofabrik, https://download.geofabrik.de/), but it has the advantage of being highly customizable in terms of the selected area. Instead of downloading the whole planet file or whole country extracts from Geofabrik, we can use Overpass to download data for very specific regions, such as the area inside a bounding box.
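A bounding-box download of this kind could look roughly like the sketch below, which assembles an Overpass QL query as text and shows how it might be sent to a public Overpass endpoint. The function names, the example coordinates, the timeout values, and the `User-Agent` string are assumptions for illustration; note that Overpass QL expects the bounding box in the order south, west, north, east.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# A public Overpass API instance; other instances exist as well.
OVERPASS_ENDPOINT = "https://overpass-api.de/api/interpreter"


def build_overpass_query(south, west, north, east, timeout_s=25):
    """Build an Overpass QL query for all elements inside a bounding box."""
    # Overpass QL bounding boxes are written as (south,west,north,east).
    bbox = f"{south},{west},{north},{east}"
    return (
        f"[out:xml][timeout:{timeout_s}];\n"  # request XML output
        "(\n"
        f"  node({bbox});\n"
        f"  way({bbox});\n"
        f"  relation({bbox});\n"
        ");\n"
        "out body;\n"      # print the matched elements with their tags
        ">;\n"             # recurse down to the nodes used by ways/relations
        "out skel qt;"     # print those extra elements in a compact form
    )


def run_overpass_query(query, timeout=60):
    """POST the query to the Overpass API and return the XML text (needs network)."""
    payload = urlencode({"data": query}).encode("utf-8")
    # The User-Agent value is a placeholder chosen for this example.
    request = Request(
        OVERPASS_ENDPOINT,
        data=payload,
        headers={"User-Agent": "osm-notebook-example"},
    )
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example coordinates (a small, arbitrary box) used only for illustration.
query = build_overpass_query(south=48.20, west=16.36, north=48.21, east=16.38)
print(query)
```

The printed query text can also be pasted into Overpass Turbo to try it interactively, while `run_overpass_query(query)` would perform the same download from code.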

Summary

In this notebook you have learned:

  • What open source geospatial data is, with OSM as an important example

  • What an API is, with the Overpass API (used through Overpass Turbo) as an important example