Stereo imagery, DSMs, and machine learning for vegetation management

On August 14, 2003, a few minutes after 4 pm Eastern Daylight Time, portions of the United States and Canada experienced an electric power blackout that affected about 50 million people. The outage lasted for 4 days in some parts of the United States, while some parts of Canada suffered rolling blackouts for over a week.

The culprit? Inadequate vegetation management.

As a result, the United States incurred approximate costs of between 4 billion and 10 billion U.S. dollars. In Canada, there was a net loss of 18.9 million work hours, and the gross domestic product went down by 0.7% that August.

Apart from huge economic losses, poor vegetation management can lead to property loss or, worse, deaths. Power lines are not the only utility at risk. Linear infrastructure, such as pipelines, telecommunication lines, and railway lines, is also vulnerable.

"..., the number of vegetation related incidents in 2009/10 was 11,500 and has risen to nearly 19,000 in 2017/8 - leading to over 1,750 train cancellations."
Recorded incidents of trees or branches in England and Wales, "The Network Rail Vegetation Management Review."

Vegetation management calls for the removal of overgrown trees and shrubs and requires predicting which trees pose a risk due to age, disease, or bad weather.

Unfortunately, manual vegetation management is reactive and focuses on minimizing risks as they are identified. Further, manual methods of monitoring are tedious, expensive, and time-consuming---not to mention error-prone.

Automating the vegetation management process leads to fast and reliable insights, translating to fewer outages and lower costs. Machine learning is a key technology in the automation process, with stereo imagery and Digital Surface Models (DSMs) as important inputs.

Today, we will look at the characteristics of stereo imagery and Digital Surface Models (DSMs). We will then go over the use of machine learning in vegetation management.

Photo by sasha set

What is Vegetation Management?

Vegetation management is the targeted control and removal of unwanted vegetation. It encompasses many methods of retarding, modifying, or eliminating vegetation at regular intervals, based on the risk that vegetation poses to the reliability of infrastructure and utilities.

Vegetation management is selective. Its focus is on vegetation that may grow into or fall onto the infrastructure or utility. Non-threatening vegetation, e.g., plants that only grow to a certain height, is not removed.

Check out this post to learn more about vegetation management methods and the solutions that geospatial technologies provide.

What is a Digital Surface Model (DSM)?

A DSM is an elevation model that captures the natural and built features on the Earth's surface. A DSM is useful for modeling 3-dimensional features above the Earth, such as infrastructure and tree canopies.

It is essential to distinguish a DSM from a Digital Elevation Model (DEM)---which is a term that can sometimes be used interchangeably with a Digital Terrain Model (DTM). A DEM/DTM is a bare earth surface without natural and built features.

For a more detailed explanation, see our post on digital elevation models.

Why is a Digital Surface Model (DSM) Important in Vegetation Management?

Vegetation height is an important parameter in assessing vegetation risk to infrastructure. DSMs are a useful input in computing the height of vegetation. The height is extracted by subtracting a DTM of the area of interest from the corresponding DSM.
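Here is a minimal sketch of that subtraction in Python. It assumes the DSM and DTM are already co-registered rasters of the same shape, loaded as NumPy arrays. The sample values, the `NODATA` sentinel, and the function name `canopy_height` are illustrative assumptions, not part of any particular product.

```python
import numpy as np

NODATA = -9999.0  # assumed sentinel for missing elevation values


def canopy_height(dsm, dtm, nodata=NODATA):
    """Return a canopy height model (CHM) as DSM minus DTM, in the input units."""
    dsm = np.asarray(dsm, dtype=float)
    dtm = np.asarray(dtm, dtype=float)
    if dsm.shape != dtm.shape:
        raise ValueError("DSM and DTM must be co-registered and have the same shape")
    valid = (dsm != nodata) & (dtm != nodata)
    chm = np.full(dsm.shape, np.nan)  # pixels without data stay NaN
    # Negative differences are treated as noise and clipped to zero.
    chm[valid] = np.clip(dsm[valid] - dtm[valid], 0.0, None)
    return chm


# Hypothetical 2 x 2 elevation rasters in meters.
dsm = np.array([[105.0, 112.5], [101.0, NODATA]])
dtm = np.array([[100.0, 100.5], [101.2, 100.0]])
chm = canopy_height(dsm, dtm)
print(chm)
```

Clipping negative heights to zero is a design choice for this sketch. In practice you may prefer to flag such pixels for review, since they often indicate misregistration between the two models.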

Figure: Explanation of the relation between digital surface model (DSM), digital terrain model (DTM), and canopy height model (CHM). Source: Perko, Roland; Raggam, Hannes; Deutscher, Janik; Gutjahr, Karlheinz; Schardt, Mathias (2011). Forest Assessment Using High-Resolution SAR Data in X-Band. Remote Sensing, vol. 3, issue 4, pp. 792-815. doi:10.3390/rs3040792 (CC BY 3.0)

We can extract elevation models from various datasets, including ground survey data, Light Detection and Ranging (LiDAR) data, high-resolution optical imagery such as Hexagon aerial imagery, stereo satellite imagery, or Synthetic Aperture Radar (SAR) imagery.

Since our focus is on large-scale, large-area vegetation management that can be monitored from the sky or space, we shall look at stereo satellite imagery.

What is Stereo Satellite Imagery?

Stereo satellite imagery refers to two (or three) images of the same location acquired by satellite sensors from different viewing angles at similar times. This is unlike monoscopic satellite imagery, where an area is only captured once.

Stereo imaging is important for perceiving depth. Pairs of satellite images can therefore produce stereo-derived 3D elevation models.

While some satellite constellations only capture two images, others, like Pléiades, can acquire an additional image from a near-vertical position. We call this tristereo imaging. The near-vertical acquisition increases the chances of capturing ground that would otherwise be hidden in dense urban or mountainous areas. Hence, tristereo images create more accurate 3-dimensional models than basic stereo.

Basic stereo Source: Pléiades-user-guide

Why is Stereo Imagery Important in Vegetation Management?

Stereo imagery enables the extraction of stereo DEMs and DSMs, which are then used to estimate vegetation heights within or near utility corridors. Furthermore, stereo data comes in pairs of multi-spectral images, making it possible to analyze the proximity, health, and species of vegetation near utilities.

That said, stereo DEMs are prone to errors in densely vegetated areas because of sparse or missing ground samples. This is especially true for dense vegetation on steep slope surfaces, where detected ground points may be inadequate to model the underlying surface accurately. In such cases, LiDAR, SAR, or even survey data may complement stereo satellite imagery.

Nevertheless, the use of satellite stereo imagery in vegetation management offers several advantages, namely:

Remote identification of the health and species of vegetation
Wide area coverage for large-scale vegetation management
Consistent overhead passes for vegetation monitoring
Monitoring of places with restricted access
Historical data for change detection and prediction

So, we now know that by using stereo imagery, we can estimate the location and height of vegetation around infrastructure, its health and species, and therefore how it might behave.

The next step is combining, analyzing, and extracting insights from different datasets at scale. This is where machine learning comes in.

What is Machine Learning?

Machine learning is the ability of a machine to learn from data and thereafter perform a specific task without being explicitly programmed to do so.

The machine learns by using algorithms that take in huge amounts of data, known as training data, and discover patterns. The product of training a machine learning (ML) algorithm is a model that can predict outcomes based on recognized patterns in the data.

The model is evaluated by how accurately it predicts the outcomes of data it has not seen before. Generally, algorithms are run iteratively until the model error is sufficiently minimized or a maximum number of runs has been reached.
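To make the training loop concrete, here is a small sketch in Python. It fits a straight line with gradient descent, stops when the training error drops below a tolerance or a maximum number of runs is reached, and then measures the error on held-out data. The synthetic data, the learning rate, the tolerance, and all variable names are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): y = 2x + 1 plus a little noise.
x = rng.uniform(0.0, 1.0, size=200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.05, size=200)

# Hold out 25% of the samples; the model never sees them during training.
x_train, x_test = x[:150], x[150:]
y_train, y_test = y[:150], y[150:]

w, b = 0.0, 0.0
learning_rate, tolerance, max_runs = 0.5, 0.004, 5000

for run in range(max_runs):
    error = w * x_train + b - y_train
    train_mse = float(np.mean(error ** 2))
    if train_mse < tolerance:  # error is sufficiently small: stop early
        break
    # One gradient descent step on the mean squared error.
    w -= learning_rate * 2.0 * float(np.mean(error * x_train))
    b -= learning_rate * 2.0 * float(np.mean(error))

# Evaluate on data the model has not seen before.
test_mse = float(np.mean((w * x_test + b - y_test) ** 2))
print(f"runs={run}, w={w:.2f}, b={b:.2f}, held-out MSE={test_mse:.4f}")
```

A held-out error that is much larger than the training error would suggest the model has memorized the training data rather than learned a general pattern.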

TensorFlow is one of the most popular open-source platforms for building and training machine learning and deep learning models. Find out how we used it to visualize satellite imagery and labels when training land cover models.

Machine learning is broadly categorized as supervised or unsupervised learning.

In supervised learning, the training data contains both inputs and the resulting outputs. The learning process involves the machine finding the best way to get to the output given the input. The main tasks undertaken by supervised learning are feature classification and regression analysis to predict numeric values.

In unsupervised learning, the training data contains input data only. The aim is to find hidden patterns in the data. Clustering is the primary task undertaken by unsupervised learning. Its goal is to group similar objects.
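As a small illustration of clustering, the sketch below groups unlabeled vegetation heights into two clusters with a plain k-means loop. The height values and the function name `kmeans_1d` are hypothetical, and the deterministic initialization is a simplification for readability.

```python
import numpy as np


def kmeans_1d(values, k=2, max_iter=100):
    """Group 1-D values into k clusters with a plain k-means loop."""
    values = np.asarray(values, dtype=float)
    # Deterministic start: spread the initial centers across the value range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its members (keep it if it has none).
        new_centers = np.array([
            values[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Hypothetical, unlabeled vegetation heights in meters.
heights = [0.4, 0.6, 0.5, 11.8, 12.4, 0.7, 13.1]
labels, centers = kmeans_1d(heights, k=2)
print(labels, centers)
```

No labels such as "shrub" or "tree" are supplied here. The algorithm only discovers that the heights fall into a low group and a high group, and interpreting those groups is left to the analyst.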

Vegetation Railways 2 Photo by Cody King

Geospatial Machine Learning for Vegetation Management

With the increased democratization of Earth Observation (EO) data, utility companies can now leverage ML algorithms and cloud computing to build innovative vegetation management solutions.

A vegetation management solution should be able to combine weather, vegetation, and geographical data with past vegetation-related outage information to predict the location of expected future outages.

In this section, we summarize the steps involved in building such a solution---including a look at some of the ML algorithms used to detect vegetation encroachment in utility corridors.

#1: Data Collection

This entails the collection of different datasets, including:

Utility/asset data, e.g., location, voltage, crossings, height, materials, etc.
Terrain data, e.g., elevation and slope
Weather data, e.g., wind, temperatures, precipitation, etc.
High-resolution satellite and aerial imagery
Failure history, e.g., dates and locations of power outages

Note that, for the training of ML models, historical datasets spanning several years may be required.

#2: Preprocessing the Data

Preprocessing the datasets before they are combined is needed to gain the best possible insights. The goal of preprocessing is to represent or transform the original data into a form that decreases its complexity and increases the accuracy of the ML model. Preprocessing may involve dropping duplicate features, removing outliers, creating new features, etc.
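The three preprocessing tasks named above can be sketched in a few lines of Python. The records, the field names, and the outlier rule (five times the median absolute deviation) are assumptions for illustration. A real pipeline would choose its rules based on the data.

```python
import statistics

# Hypothetical span records: (span_id, tree_height_m, distance_to_line_m).
records = [
    ("A1", 12.0, 4.0),
    ("A1", 12.0, 4.0),   # exact duplicate
    ("A2", 9.5, 6.0),
    ("A3", 11.0, 3.5),
    ("A4", 10.5, 5.0),
    ("A5", 250.0, 4.5),  # implausible height, likely a sensor or entry error
]

# 1. Drop exact duplicates while keeping the original order.
unique = list(dict.fromkeys(records))

# 2. Remove outliers with a median-based rule, which the outlier cannot skew.
heights = [r[1] for r in unique]
median = statistics.median(heights)
mad = statistics.median(abs(h - median) for h in heights)
cleaned = [r for r in unique if abs(r[1] - median) <= 5 * mad]

# 3. Create a new feature: tree height relative to distance from the line.
features = [
    {"span_id": s, "height_m": h, "distance_m": d, "height_to_distance": h / d}
    for s, h, d in cleaned
]
print(features)
```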

Some preprocessing tasks, like breaking up large raster imagery into smaller, more manageable sizes or increasing imagery resolution, can be undertaken using ML algorithms.
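The tiling step on its own can be illustrated without any learned model. Here is a minimal sketch, assuming the raster is already loaded as a 2-D NumPy array. The function name `iter_tiles` and the tile size are illustrative.

```python
import numpy as np


def iter_tiles(raster, tile_size):
    """Yield (row, col, tile) windows that cover a 2-D raster without overlap."""
    rows, cols = raster.shape
    for r in range(0, rows, tile_size):
        for c in range(0, cols, tile_size):
            # Tiles on the right and bottom edges may be smaller than tile_size.
            yield r, c, raster[r:r + tile_size, c:c + tile_size]


# A small stand-in for a large image: 5 rows by 7 columns.
raster = np.arange(5 * 7).reshape(5, 7)
tiles = list(iter_tiles(raster, tile_size=3))
for row, col, tile in tiles:
    print(row, col, tile.shape)
```

Keeping the row and column offset with each tile makes it possible to place model predictions back into the full image afterwards.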

#3: Algorithm Selection and Model Development

Selecting an algorithm and building a model---through training and testing on the preprocessed datasets---is the next step. This is not a straightforward process.

Because different factors play a part in vegetation-related failure, you may need to select different algorithms for different models. For example:

Predicting failure caused by vegetation naturally growing into and contacting infrastructure calls for a model that can predict vegetation growth.
Predicting failure caused by vegetation falling into or contacting infrastructure because of poor health calls for a model that can detect vegetation health.
Predicting failure caused by vegetation falling into or contacting infrastructure due to weather-related factors calls for a model that finds patterns among weather events, geographic location, and infrastructure failure, among other factors.

As you can see, the selection of the most appropriate algorithm mainly depends on what you are trying to achieve and the data. Sometimes, the best approach is to use several algorithms---this is known as ensemble modeling.
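One simple form of ensemble modeling is majority voting, where each model casts a vote and the most common label wins. The sketch below assumes three hypothetical models that each label a corridor span as "risk" or "safe". The model names and outputs are invented for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model class labels for one sample by majority vote."""
    label, _ = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical outputs of three different models for four corridor spans.
model_outputs = {
    "growth_model":  ["risk", "safe", "risk", "safe"],
    "health_model":  ["risk", "safe", "safe", "safe"],
    "weather_model": ["safe", "safe", "risk", "risk"],
}

# zip(*...) regroups the lists so that each tuple holds the votes for one span.
ensemble = [majority_vote(votes) for votes in zip(*model_outputs.values())]
print(ensemble)
```

With an odd number of models and two classes there are no ties. Other ensemble strategies, such as averaging predicted probabilities or stacking, follow the same idea of combining several models.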

For instance, to detect vegetation encroachment on satellite imagery, various indices and machine learning algorithms are used. These include vegetation index-based algorithms, object detection algorithms, and change detection algorithms.

Vegetation Index (VI) Based Algorithms

These are algorithms that use reflectance properties to determine a particular characteristic of vegetation. The most commonly used VI is Normalized Difference Vegetation Index (NDVI).

NDVI uses the red and near-infrared (NIR) bands of the electromagnetic spectrum to distinguish between vegetation types and determine their health. NDVI is calculated using the formula below.

NDVI = (NIR - Red) / (NIR + Red)

For vegetation management, vegetation indices are essential for detecting plant species and dead or dying trees which pose a risk to the infrastructure.
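The NDVI formula translates directly into array code. This is a minimal sketch, assuming the NIR and red bands are reflectance arrays of the same shape. The sample values and the function name `ndvi` are illustrative.

```python
import numpy as np


def ndvi(nir, red):
    """Compute NDVI = (NIR - Red) / (NIR + Red) per pixel."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)  # pixels with a zero denominator stay NaN
    # Divide only where the denominator is non-zero.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# Hypothetical 2 x 2 reflectance values.
nir = np.array([[0.50, 0.30], [0.10, 0.0]])
red = np.array([[0.10, 0.20], [0.10, 0.0]])
values = ndvi(nir, red)
print(values)
```

Values close to 1 generally indicate dense, healthy vegetation, while values near 0 or below indicate sparse vegetation, bare ground, or water.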

Object Detection Based Algorithms

Another method of detecting encroachment is object detection. Object detection involves detecting instances of a specific object in an image.

In vegetation management, the objects of interest are infrastructure (e.g., power lines, towers, railway lines, and transformers) and vegetation. Detecting both makes it possible to determine their proximity to each other.
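Once both kinds of objects have been detected, the proximity check is plain geometry. The sketch below assumes the detector outputs have already been converted to planar coordinates in meters. The span endpoints, tree positions, and the 5 m clearance are hypothetical.

```python
import math


def point_to_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


# Hypothetical detections in a projected coordinate system (meters).
span = ((0.0, 0.0), (100.0, 0.0))  # one power line span
trees = {"T1": (20.0, 3.0), "T2": (50.0, 12.0), "T3": (110.0, 0.0)}
clearance_m = 5.0  # assumed minimum clearance

encroaching = [
    name for name, xy in trees.items()
    if point_to_segment_distance(xy, *span) < clearance_m
]
print(encroaching)
```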

Change Detection Algorithms

Change detection is a critical task in vegetation management. It helps to identify changes in vegetation that could pose a threat to infrastructure.

Change detection involves the identification of changes in vegetation type, health, and species. The algorithm takes in two coregistered images taken at different times, compares them, and outputs a change map.
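The simplest version of this idea is image differencing, sketched below for two co-registered rasters. This is not a learned change detection model, only an illustration of the input and output. The NDVI values, the threshold, and the function name `change_map` are assumptions.

```python
import numpy as np


def change_map(before, after, threshold):
    """Label each pixel as -1 (decrease), 0 (no change) or 1 (increase)."""
    before = np.asarray(before, dtype=float)
    after = np.asarray(after, dtype=float)
    if before.shape != after.shape:
        raise ValueError("images must be co-registered and have the same shape")
    diff = after - before
    result = np.zeros(diff.shape, dtype=int)
    result[diff > threshold] = 1
    result[diff < -threshold] = -1
    return result


# Hypothetical NDVI rasters of the same corridor at two dates.
ndvi_t1 = np.array([[0.70, 0.65], [0.20, 0.60]])
ndvi_t2 = np.array([[0.72, 0.30], [0.55, 0.58]])
changes = change_map(ndvi_t1, ndvi_t2, threshold=0.15)
print(changes)
```

The threshold keeps small fluctuations, such as seasonal or atmospheric differences, from being reported as change.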

#4: Integrating the Models into Vegetation Management Workflows

This entails using the ML models to extract insights from your data at scale. The insights are then used to plan and prioritize vegetation management activities, hence reducing vegetation-related failure.

For example, a LiDAR-derived canopy height model can be overlaid on tree species classified from multispectral imagery by an ML model to rank vegetation from least to most dangerous based on species growth rates.
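A rough sketch of such a ranking is shown below. It assumes linear growth and uses invented growth rates, tree heights, and a height limit, so the numbers are placeholders and not reference values.

```python
# Hypothetical annual growth rates in meters per year, keyed by species.
GROWTH_RATE_M_PER_YEAR = {"poplar": 1.5, "oak": 0.5, "juniper": 0.2}


def years_to_encroachment(height_m, species, limit_m):
    """Estimate the years until a tree reaches the height limit (linear growth)."""
    gap = limit_m - height_m
    if gap <= 0:
        return 0.0  # already at or above the limit
    return gap / GROWTH_RATE_M_PER_YEAR[species]


# (tree id, canopy height in meters, species from the classifier)
trees = [
    ("T1", 9.0, "oak"),
    ("T2", 7.0, "poplar"),
    ("T3", 11.5, "juniper"),
    ("T4", 12.5, "oak"),
]
limit_m = 12.0  # assumed height at which a tree threatens the line

# Most urgent first: the fewest years left before reaching the limit.
ranked = sorted(trees, key=lambda t: years_to_encroachment(t[1], t[2], limit_m))
print([t[0] for t in ranked])
```

Note that the shortest tree here is not the safest one, because a fast-growing species can close the gap sooner than a taller, slow-growing one.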

The performance of the model(s) can be evaluated by comparing the sections predicted as dangerous against those identified by field crews or drone inspections. The results can then be used to improve the model in a continuous feedback loop.
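Two common ways to express that comparison are precision (how many predicted sections were truly dangerous) and recall (how many truly dangerous sections were predicted). Here is a minimal sketch with hypothetical section IDs.

```python
def precision_recall(predicted, confirmed):
    """Compare predicted dangerous sections with field-confirmed ones."""
    predicted, confirmed = set(predicted), set(confirmed)
    true_positives = len(predicted & confirmed)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(confirmed) if confirmed else 0.0
    return precision, recall


# Hypothetical section IDs.
predicted_dangerous = ["S01", "S04", "S07", "S09"]
field_confirmed = ["S04", "S07", "S12"]

precision, recall = precision_recall(predicted_dangerous, field_confirmed)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Low recall means dangerous sections are being missed, which is usually the costlier error for a utility, while low precision means crews are sent to sections that turn out to be safe.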

The Future of Vegetation Management is Here

There are many advantages to using machine learning in vegetation management, but let's keep it simple. The main reason? Addressing vegetation risks before they cause a problem.

Adoption of a data-driven approach allows utility companies to prioritize work based on 'predict and prevent' rather than 'react.' Machine learning is allowing companies to:

Make accurate decisions faster
Improve the safety of both workers and customers
Boost utility reliability
Reduce human errors

Machine learning is not only changing the way utility companies manage vegetation. It is also changing the way they view vegetation management: as a challenge they can overcome by accelerating data collection and analysis.

Whether you are looking for the right data or the AI algorithms needed to uncover insights, find them all on UP42.

Rose Njambi

Contributor
