
Have you ever wondered what DIMAP, GeoTIFF, SAFE, SHP, NITF, NetCDF, GeoJSON, and EEC have in common? If you know a thing or two about geospatial, you’ve probably already guessed. They’re all examples of data structures and formats from the different providers and constellations that UP42 works with.

It’s quite the list. Each data format usually has its own metadata structure and encoding standard, which makes it hard to ensure compatibility and interoperability. So if you want to use algorithms on our platform to extract insights from your data, you need to adapt them for each input source and processing pipeline. Needless to say, this is resource-intensive and can act as a blocker for visualizations and downstream integrations.

Until now, you also had to download the asset as a compressed ZIP file and decompress it before you could process the data or use individual elements of a scene, even if you didn’t need every band. After downloading, common issues included inconsistent file formats, internal structures, and naming conventions across assets. In short, it was way too complicated.

The solution? Assets in UP42 storage can be easily accessed and streamed!

Earlier this year, UP42 adopted the STAC specification, which allows searching across multiple providers for geospatial assets with a common structure and set of metadata. Thanks to this, users can now do a STAC search in storage beyond simple attributes like date, time, and constellation.
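To make the idea of a STAC search more concrete, here is a minimal sketch of a request body that follows the general STAC API item-search conventions (`bbox`, `datetime`, `collections`, `limit`). The function name, the example bounding box, and the dates are assumptions for illustration. The sketch deliberately stops short of sending the request, since the endpoint and authentication details are covered in the technical documentation rather than in this post.

```python
import json
from datetime import datetime, timezone


def _rfc3339(dt):
    """Format a timezone-aware datetime as an RFC 3339 UTC timestamp."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_stac_search(bbox, start, end, collections=None, limit=10):
    """Build a STAC API item-search body (hypothetical helper, not an SDK call).

    bbox is [west, south, east, north] in WGS84 degrees.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must be [west, south, east, north]")
    if bbox[1] > bbox[3]:
        raise ValueError("south must not exceed north")
    body = {
        "bbox": list(bbox),
        # A closed interval, written as start/end.
        "datetime": f"{_rfc3339(start)}/{_rfc3339(end)}",
        "limit": limit,
    }
    if collections:
        body["collections"] = list(collections)
    return body


# Example values are made up for illustration.
example_body = build_stac_search(
    [13.0, 52.3, 13.8, 52.7],
    datetime(2022, 8, 1, tzinfo=timezone.utc),
    datetime(2022, 8, 31, tzinfo=timezone.utc),
)
print(json.dumps(example_body, indent=2))
```

Because the body is plain JSON, the same structure can be reused with any HTTP client or STAC-aware library.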

And now, we’ve introduced a cloud-native asset model for assets in UP42 storage. Regardless of provider or delivery format, UP42 will transform the assets into a standard model. This will ensure consistency in how an asset is named, its media type (raster data becomes Cloud Optimized GeoTIFFs, or COGs for short, and vector masks become GeoJSON), and its role (e.g., data, metadata, previews, or masks). The benefit: you can easily search for and work with the data without having to understand different formats, delivery folder structures, or naming conventions. Most importantly, you can do so without having to download it first.

UP42 will transform raster data, which represents the majority of the current catalog and tasking data on the UP42 platform (Airbus, Intermap’s NEXTMap, Capella Space, BlackSky, Satellogic, etc.) into standardized COGs, exposed through STAC. This means that each file is now accessible and can be streamed. We will also convert all vector mask files into GeoJSONs. So if a delivery contains several cloud mask files (e.g., Shapefile (.shp, .shx, .dbf, .prj), GML, KML, KMZ), you’ll be able to access the already transformed GeoJSON files, standardized with cloud masks from other providers. The cloud-native asset model enables easier integrations for analytic workflows, visualization, and custom platform integrations. It also allows you to validate your catalog and tasking orders, as you can now display visuals and thumbnails (see examples below).

Why cloud optimized GeoTIFFs?

A GeoTIFF is a format for storing raster graphic images with additional spatial information such as map projection, coordinates, and dates. A cloud optimized GeoTIFF is a regular GeoTIFF file whose internal layout is organized so that, when it is hosted on an HTTP server, it enables fully cloud-based geospatial workflows. It supports streaming of raster data: clients using HTTP range requests can ask for just the part of the file they need. The format has been optimized for efficient storage, access, and processing in the cloud. Some benefits of COGs include:

  1. Industry support: It’s well-adopted and supported by the geospatial industry, including QGIS, Esri’s ArcGIS, Rasterio, GDAL, and others
  2. Interoperability: A standard format enables the seamless integration and sharing of data across platforms by ensuring compatibility and eliminating the need for complex data transformation
  3. Flexibility: COGs store geospatial data in an optimized way, allowing tiling, rendering, and processing of partial data sets to enhance retrieval of data, making them a great option for managing large geospatial datasets
  4. Processing time: COGs enable on-demand analytics processing and visualization to facilitate the extraction of insights from geospatial data at scale, reducing processing time significantly
  5. Data duplication: COGs reduce data duplication as data can be streamed instead of copied for download purposes
  6. Ease of use: COGs support well-known metadata formats and can be easily consumed using HTTP range requests, mapping to spatial indexes that allow partial access to larger raster files
  7. Compatibility: Last but not least, they’re backward compatible, supporting the original file format of GeoTIFFs

You can find more information on the benefits of COGs here.
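As a rough illustration of partial access, the sketch below builds an HTTP `Range` header and decodes the fixed-size header found at the start of any TIFF file, which tells a reader where the first image file directory (and therefore the tile index) lives. The function names are assumptions. No request is actually sent here, so the bytes are synthetic. In practice, libraries such as GDAL or Rasterio handle these steps for you.

```python
import struct


def range_header(start, length):
    """Return HTTP headers asking for `length` bytes beginning at offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # The Range end position is inclusive.
    return {"Range": f"bytes={start}-{start + length - 1}"}


def parse_tiff_header(first_bytes):
    """Decode byte order, TIFF flavour, and first IFD offset from a file's first bytes."""
    if len(first_bytes) < 8:
        raise ValueError("need at least 8 bytes")
    order = first_bytes[:2]
    if order == b"II":
        endian, byte_order = "<", "little"
    elif order == b"MM":
        endian, byte_order = ">", "big"
    else:
        raise ValueError("not a TIFF file")
    (magic,) = struct.unpack(endian + "H", first_bytes[2:4])
    if magic == 42:  # classic TIFF: 32-bit offset to the first IFD
        (offset,) = struct.unpack(endian + "I", first_bytes[4:8])
        return {"byte_order": byte_order, "bigtiff": False, "first_ifd_offset": offset}
    if magic == 43:  # BigTIFF: 64-bit offset stored after two 16-bit fields
        if len(first_bytes) < 16:
            raise ValueError("need at least 16 bytes for BigTIFF")
        (offset,) = struct.unpack(endian + "Q", first_bytes[8:16])
        return {"byte_order": byte_order, "bigtiff": True, "first_ifd_offset": offset}
    raise ValueError("unexpected TIFF magic number")


# Synthetic classic-TIFF header: little-endian, first IFD at byte 8.
sample = struct.pack("<2sHI", b"II", 42, 8)
print(range_header(0, 16))
print(parse_tiff_header(sample))
```

A client would send the header from `range_header` with its request, read the directory at the returned offset, and then fetch only the tiles that cover its area of interest.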

The process of transforming UP42 deliveries into our cloud-native asset model

First and foremost, the process is automatic: you don’t need to do anything. It’s also free, so you won’t be charged extra for accessing standardized STAC assets (e.g., COGs, GeoJSONs) in UP42 storage.

  1. Your tasking or catalog order delivery is made available in UP42 storage.
  2. Asset metadata is extracted, and a STAC collection and STAC items are created.
  3. All delivery files are then listed as STAC assets. Raster data is transformed into COGs and vector masks into GeoJSONs, and the STAC assets are returned as part of STAC item response bodies.
  4. You can access the original delivery as well as all the transformed individual files, linked with each other through STAC collections and items.
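The mapping described in step 3 can be pictured with a small sketch. This is not UP42's implementation: the extension lists, the function name, and the rule of deciding by file extension are all assumptions made for illustration. Only the COG media type string comes from the example further down, and `application/geo+json` is the registered media type for GeoJSON.

```python
from pathlib import PurePosixPath

COG_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"
GEOJSON_TYPE = "application/geo+json"

# Hypothetical extension lists, chosen for illustration only.
RASTER_EXTENSIONS = {".jp2", ".tif", ".tiff"}
VECTOR_EXTENSIONS = {".shp", ".gml", ".kml", ".kmz"}
# Shapefile companions carry no geometry of their own, so they get no asset here.
SHAPEFILE_SIDECARS = {".shx", ".dbf", ".prj"}


def plan_assets(delivery_files):
    """Map each delivery file to the media type it would have after transformation.

    Files that are neither raster nor vector masks map to None, meaning they
    would be listed unchanged.
    """
    plan = {}
    for name in delivery_files:
        extension = PurePosixPath(name).suffix.lower()
        if extension in SHAPEFILE_SIDECARS:
            continue
        if extension in RASTER_EXTENSIONS:
            plan[name] = COG_TYPE
        elif extension in VECTOR_EXTENSIONS:
            plan[name] = GEOJSON_TYPE
        else:
            plan[name] = None
    return plan


# File names below are invented for the example.
for name, media_type in plan_assets(
    ["IMG_R1C1.JP2", "CLD_MASK.shp", "CLD_MASK.dbf", "README.txt"]
).items():
    print(name, "->", media_type)
```

A real pipeline would inspect file contents and provider metadata rather than rely on extensions, but the outcome is the same in spirit: one predictable media type per kind of data.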

Now, let’s look at an optical image example...

{
  <...>,
  "assets": {
    "mgag5ui3_IMG_PHR1A_MS_202208080011285_SEN_6452562101-2_R1C1.JP2": {
      "href": "https://api.up42.com/v2/assets/e4a15000-890b-445c-82bb-e4364baac40b",
      "title": "Multispectral data",
      "type": "image/tiff; application=geotiff; profile=cloud-optimized",
      "roles": ["data", "multispectral"],
      "gsd": 0.7283,
      "eo:bands": [
        {
          "name": "blue",
          "common_name": "blue",
          "center_wavelength": 0.47,
          "full_width_half_max": 0.07
        },
        {
          "name": "green",
          "common_name": "green",
          "center_wavelength": 0.47,
          "full_width_half_max": 0.07
        },
        {
          "name": "red",
          "common_name": "red",
          "center_wavelength": 0.665,
          "full_width_half_max": 0.038
        }
      ]
    }
  },
  <...>
}

You can find a detailed description of all parameters in this example and more information on how to analyze STAC assets in our technical documentation. You can also find more information on the data fields in the GitHub STAC asset documentation by Radiant Earth Foundation.
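Once a STAC item like the one above has been retrieved, picking out the streamable raster assets takes only a few lines. The sketch below works on an abridged copy of the example (the band list is shortened to names only) and assumes the item is already available as a Python dictionary. The helper name is hypothetical.

```python
COG_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"

# Abridged copy of the example item above.
example_item = {
    "assets": {
        "mgag5ui3_IMG_PHR1A_MS_202208080011285_SEN_6452562101-2_R1C1.JP2": {
            "href": "https://api.up42.com/v2/assets/e4a15000-890b-445c-82bb-e4364baac40b",
            "title": "Multispectral data",
            "type": COG_TYPE,
            "roles": ["data", "multispectral"],
            "gsd": 0.7283,
            "eo:bands": [
                {"name": "blue", "common_name": "blue"},
                {"name": "green", "common_name": "green"},
                {"name": "red", "common_name": "red"},
            ],
        }
    }
}


def cog_data_assets(item):
    """Return the COG assets with the "data" role, along with their band names."""
    selected = []
    for key, asset in item.get("assets", {}).items():
        is_cog = asset.get("type") == COG_TYPE
        is_data = "data" in asset.get("roles", [])
        if is_cog and is_data:
            bands = [
                band.get("common_name", band.get("name"))
                for band in asset.get("eo:bands", [])
            ]
            selected.append({"key": key, "href": asset["href"], "bands": bands})
    return selected


for asset in cog_data_assets(example_item):
    print(asset["href"], asset["bands"])
```

Filtering on media type and role, rather than on file names, is what makes the same code work across providers.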

An example of a multipart AOI image delivery on the UP42 console:

[GIF: multipart AOI image delivery in the UP42 console]

We’re excited to bring you the numerous benefits that STAC and a cloud-native asset model offer. We will also continue to introduce cloud tools and more cloud-native formats (e.g., to cater to multidimensional weather data) to enable more efficient and centralized data management. The goal? Allowing more and more users to unlock the full potential of geospatial data. Expect detailed how-to notebooks and more updates from us soon.


Dobrina Laleva

Senior Product Marketing Manager
