Strengthening geospatial data standards at UP42 - introducing a cloud-native asset model

Have you ever wondered what DIMAP, GeoTIFF, SAFE, SHP, NITF, NetCDF, GeoJSON, and EEC have in common? If you know a thing or two about geospatial, you’ve probably already guessed. They’re all examples of data structures and formats from the different providers and constellations that UP42 works with.

It’s quite the list. Each data format usually has its own metadata structure and encoding standard, which makes it hard to ensure compatibility and interoperability. So if you want to use algorithms on our platform to extract insights from your data, you need to adapt them for each input source and processing pipeline. Needless to say, this is resource-intensive and can act as a blocker for visualizations and downstream integrations.

Until now, you also had to download an asset as a compressed ZIP file and decompress it before you could process the data or use individual elements of a scene, even if you didn’t need every band. After downloading, common issues included inconsistent file formats, internal structures, and naming conventions. In short, it was way too complicated.

The solution? Assets in UP42 storage can be easily accessed and streamed!

Earlier this year, UP42 adopted the STAC specification, which allows searching across multiple providers for geospatial assets with a common structure and set of metadata. Thanks to this, users can now do a STAC search in storage beyond simple attributes like date, time, and constellation.
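As a rough illustration of what such a search can look like, the sketch below builds a request body in the general shape defined by the STAC API Item Search specification, using the Filter extension with CQL2 JSON. The helper name, the bounding box, the date range, and the use of the eo:cloud_cover property are illustrative assumptions; the endpoint and the filters actually supported for UP42 storage are described in the technical documentation.

```python
import json
from datetime import date


def build_stac_search(bbox, start, end, max_cloud_cover, limit=10):
    """Build a STAC API search body (hypothetical helper, not part of any SDK)."""
    return {
        # Bounding box as [west, south, east, north] in WGS84 degrees.
        "bbox": list(bbox),
        # Closed datetime interval, formatted as RFC 3339 timestamps.
        "datetime": f"{start.isoformat()}T00:00:00Z/{end.isoformat()}T23:59:59Z",
        "limit": limit,
        # CQL2 JSON filter: keep items with cloud cover at or below the threshold.
        "filter-lang": "cql2-json",
        "filter": {
            "op": "<=",
            "args": [{"property": "eo:cloud_cover"}, max_cloud_cover],
        },
    }


# Example values are assumptions chosen only for illustration.
body = build_stac_search(
    bbox=(13.0, 52.3, 13.8, 52.7),
    start=date(2022, 8, 1),
    end=date(2022, 8, 31),
    max_cloud_cover=10,
)
print(json.dumps(body, indent=2))
```

The body would be sent as JSON in a POST request to a STAC search endpoint; that step is left out because it depends on authentication details not covered here.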

And now, we’ve introduced a cloud-native asset model for assets in UP42 storage. Regardless of provider or delivery format, UP42 will transform the assets into a standard model. This will ensure consistency in how an asset is named, its media type (Cloud Optimized GeoTIFFs, or COGs for short, for raster data; GeoJSON for vector masks), and its role (e.g., data, metadata, previews, or masks). The benefit: you can easily search for and work with the data without having to understand different formats, delivery folder structures, and naming conventions and, most importantly, without having to download it first.

UP42 will transform raster data, which represents the majority of the current catalog and tasking data on the UP42 platform (Airbus, Intermap’s NEXTMap, Capella Space, BlackSky, Satellogic, etc.), into standardized COGs, exposed through STAC. This means that each file is now individually accessible and can be streamed. We will also convert all vector mask files into GeoJSON. So if a delivery contains several cloud mask files in other formats (e.g., Shapefile (.shp, .shx, .dbf, .prj), GML, KML, or KMZ), you’ll be able to access the already transformed GeoJSON files, which follow the same standard as cloud masks from other providers. The cloud-native asset model enables easier integrations for analytic workflows, visualization, and custom platform integrations. It also allows you to validate your catalog and tasking orders, as you can now display visuals and thumbnails (see examples below).

Why cloud optimized GeoTIFFs?

A GeoTIFF is a format for storing raster graphic images with additional spatial information such as map projection, coordinates, and dates. A cloud optimized GeoTIFF (COG) is a regular GeoTIFF file, hosted on an HTTP server, whose internal organization (tiling and overviews) enables fully cloud-based geospatial workflows. It enables streaming of raster data: clients can use HTTP range requests to ask for just the part of the file they need. The format has been optimized for efficient storage, access, and processing in the cloud. Some benefits of COGs include:

Industry support: It’s well-adopted and supported by the geospatial industry, including QGIS, Esri’s ArcGIS, Rasterio, GDAL, and others
Interoperability: A standard format enables the seamless integration and sharing of data across platforms by ensuring compatibility and eliminating the need for complex data transformation
Flexibility: COGs store geospatial data in an optimized way, allowing tiling, rendering, and processing of partial data sets to enhance retrieval of data, making them a great option for managing large geospatial datasets
Processing time: COGs enable on-demand analytics processing and visualization to facilitate the extraction of insights from geospatial data at scale, reducing processing time significantly
Data duplication: COGs reduce data duplication as data can be streamed instead of copied for download purposes
Ease of use: COGs support well-known metadata formats and can be easily consumed using HTTP range requests, mapping to spatial indexes that allow partial access to larger raster files
Compatibility: Last but not least, they’re backward compatible, supporting the original file format of GeoTIFFs
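
To make the range-request idea more concrete, here is a minimal sketch of the two calculations a COG reader performs: working out which internal tiles overlap a requested pixel window, and turning a tile's byte offset and length into an HTTP Range header. The function names, the 512-pixel tile size, and the example offsets are assumptions for illustration. In practice, libraries such as GDAL and Rasterio read the real tile offsets from the file header and handle all of this for you.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """Return (tile_col, tile_row) indices of the tiles overlapping a pixel window."""
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


def range_header(offset, length):
    """Return the HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends, hence the "- 1".
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


# A 300 x 300 pixel window starting at column 1000, row 600 (made-up values).
tiles = tiles_for_window(col_off=1000, row_off=600, width=300, height=300)
print(tiles)  # [(1, 1), (2, 1)]

# Header for a tile stored at byte 16384 that is 4096 bytes long (made-up values).
header = range_header(16384, 4096)
print(header)  # {'Range': 'bytes=16384-20479'}
```

Only two of the file's tiles are needed for this window, which is why a client can fetch a small area of a large scene without downloading the whole file.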

You can find more information on the benefits of COGs here.

The process of transforming UP42 deliveries into our cloud-native asset model

First and foremost, the process is automatic, so no action is required from you. It’s also free: you won’t be charged extra for accessing standardized STAC assets (e.g., COGs, GeoJSONs) in UP42 storage.

Your tasking or catalog order delivery is made available in UP42 storage.
Asset metadata is extracted, and a STAC collection and STAC items are created.
All delivery files are then listed as STAC assets: raster data is transformed into COGs and vector masks into GeoJSON files. STAC assets are returned as part of STAC item response bodies.
You can access the original delivery as well as all the transformed individual files, linked with each other through STAC collections and items.
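
The format decision in the third step can be pictured with a small sketch. This is not UP42's implementation: the extension lists and the function name are assumptions, and the code only picks a target media type without converting anything. The COG media type is the one used for COG assets in STAC, and application/geo+json is the registered media type for GeoJSON.

```python
from pathlib import PurePosixPath

COG = "image/tiff; application=geotiff; profile=cloud-optimized"
GEOJSON = "application/geo+json"

# Illustrative extension lists (assumptions, not an exhaustive or official set).
RASTER_EXTENSIONS = {".jp2", ".tif", ".tiff"}
VECTOR_EXTENSIONS = {".shp", ".gml", ".kml", ".kmz"}


def target_media_type(filename):
    """Return the standardized media type for a delivery file, or None if it is left as-is."""
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix in RASTER_EXTENSIONS:
        return COG
    if suffix in VECTOR_EXTENSIONS:
        return GEOJSON
    return None


# Hypothetical file names, used only to exercise the function.
for name in ["IMG_PHR1A_MS_R1C1.JP2", "masks/CLD_mask.GML", "README.txt"]:
    print(name, "->", target_media_type(name))
```

Files that match neither list, such as plain-text documentation, keep their original format in this sketch.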

Now, let’s look at an optical image example...

{
  <...>,
  "assets": {
    "mgag5ui3_IMG_PHR1A_MS_202208080011285_SEN_6452562101-2_R1C1.JP2": {
      "href": "https://api.up42.com/v2/assets/e4a15000-890b-445c-82bb-e4364baac40b",
      "title": "Multispectral data",
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "roles": ["data", "multispectral"],
      "gsd": 0.7283,
      "eo:bands": [
        {
          "name": "blue",
          "common_name": "blue",
          "center_wavelength": 0.47,
          "full_width_half_max": 0.07
        },
        {
          "name": "green",
          "common_name": "green",
          "center_wavelength": 0.47,
          "full_width_half_max": 0.07
        },
        {
          "name": "red",
          "common_name": "red",
          "center_wavelength": 0.665,
          "full_width_half_max": 0.038
        }
      ]
    }
  },
  <...>
}

You can find a detailed description of all parameters in this example and more information on how to analyze STAC assets in our technical documentation. You can also find more information on the data fields in the GitHub STAC asset documentation by Radiant Earth Foundation.
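
As a small, hedged illustration of working with such a response, the sketch below takes a trimmed copy of the item above, selects the assets that are COGs with the data role, and looks up the position of a band by name. The helper names are hypothetical. The code also assumes that the order of eo:bands matches the band order in the file, which the STAC EO extension recommends but which is worth verifying for your data.

```python
COG_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"

# Trimmed copy of the example item; only the fields used here are kept.
item = {
    "assets": {
        "mgag5ui3_IMG_PHR1A_MS_202208080011285_SEN_6452562101-2_R1C1.JP2": {
            "href": "https://api.up42.com/v2/assets/e4a15000-890b-445c-82bb-e4364baac40b",
            "type": COG_TYPE,
            "roles": ["data", "multispectral"],
            "eo:bands": [{"name": "blue"}, {"name": "green"}, {"name": "red"}],
        }
    }
}


def cog_data_assets(stac_item):
    """Return the assets that are COGs and carry the 'data' role."""
    return {
        key: asset
        for key, asset in stac_item.get("assets", {}).items()
        if asset.get("type") == COG_TYPE and "data" in asset.get("roles", [])
    }


def band_index(asset, band_name):
    """Return the 1-based position of a band, as GDAL and Rasterio count bands."""
    names = [band["name"] for band in asset.get("eo:bands", [])]
    return names.index(band_name) + 1


data_assets = cog_data_assets(item)
for key, asset in data_assets.items():
    print(key, asset["href"], "red band:", band_index(asset, "red"))
```

The href of each selected asset is what a COG-aware client would then open, subject to whatever authentication the API requires.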

An example of a multipart AOI image delivery on the UP42 console:

[Animated GIF: the cloud-native asset model (CNAM) in the UP42 console]

We’re excited to bring you the numerous benefits that STAC and a cloud-native asset model offer. We will also continue to introduce cloud tools and more cloud-native formats (e.g., to cater to multidimensional weather data) to enable more efficient and centralized data management. The goal? Allowing more and more users to unlock the full potential of geospatial data. Expect detailed how-to notebooks and more updates from us soon.

Dobrina Laleva

Senior Product Marketing Manager
