Make better Machine Learning predictions with spatial algorithms


Are you using machine learning (ML) to support your business operations? If so, you can increase the value you get from ML by incorporating geospatial information. To deliver that additional value, however, the capability must avoid excessive movement or duplication of geospatial data, which is often voluminous, and must not require you to code new algorithms.

To address this opportunity, we are excited to announce the upcoming enhancement of Oracle Machine Learning in Autonomous Database with spatial algorithms that incorporate location into your data science and machine learning-based solutions. This is one of several AI and ML enhancements to Autonomous Database, as described here.

Why include spatial data and algorithms in ML?

The use of spatial data and algorithms in your ML workflows offers benefits for both detecting patterns and making predictions.

Patterns detected by Spatial algorithms include:

  • Clusters – observations that are geographically concentrated. For example, identify groups of traffic accidents over time where all accidents in a group occurred in close proximity to each other. This allows you to relate accidents based on their proximity, and use that relationship to focus on causal factors in those areas.
  • Hotspots/coldspots – observations with values significantly higher/lower than those in the surrounding areas. For example, identify hotspots/coldspots in customer churn. Use this insight to determine what is going wrong in the hotspot areas and what is going right in the coldspot areas.
  • Colocation – observations located in close proximity to a location from another set of information. For example, identify the colocation pattern of your hotel locations with amenities such as shops and restaurants, to help determine which are helping to drive the most business.  
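
The cluster pattern described above can be illustrated with a small density-based (DBSCAN-style) routine that groups incidents using two thresholds: a maximum distance and a minimum number of incidents. This is only a minimal sketch in plain Python, not the OML4Py spatial API; the function names `haversine_km` and `cluster_incidents`, the parameters `max_km` and `min_incidents`, and the sample coordinates are all hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def cluster_incidents(points, max_km, min_incidents):
    """DBSCAN-style clustering (hypothetical helper, not an Oracle API).

    Returns one label per point; -1 marks incidents that belong to no cluster.
    A point counts itself among its neighbors.
    """
    n = len(points)
    # Brute-force neighbor lists; fine for a small illustration.
    neighbors = [
        [j for j in range(n) if haversine_km(points[i], points[j]) <= max_km]
        for i in range(n)
    ]
    labels = [None] * n
    cluster_id = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_incidents:
            labels[i] = -1  # provisional noise; may be claimed as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            if len(neighbors[j]) >= min_incidents:
                queue.extend(neighbors[j])  # core point: keep growing the cluster
    return labels


# Made-up incident locations: two tight groups and one isolated incident.
points = [
    (40.7128, -74.0060), (40.7130, -74.0062), (40.7127, -74.0058),
    (40.7580, -73.9855), (40.7582, -73.9857), (40.7581, -73.9853),
    (40.6500, -73.9500),
]
labels = cluster_incidents(points, max_km=0.1, min_incidents=3)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

Tightening `max_km` or raising `min_incidents` yields fewer, denser clusters. At scale, the neighbor search would normally be delegated to a spatial index rather than the brute-force comparison used here.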

Incident clusters

Pattern detection example: Incidents clustered according to thresholds for proximity and number of incidents. Incidents with the same color are part of the same cluster.


Predictions made by Spatial algorithms:

  • Account for varying influence of predictive factors across geography – predictive models generally consider many attributes (referred to as “features” in ML). Taking the case of predicting customer churn, a non-spatial model accounts for the predictive influence of features such as customer demographics, customer service info, and availability of special offers. Spatial algorithms can improve predictions by accounting for the varying influence of these features across the business’s geography. For example, the availability of special offers may have more predictive influence in some areas and less in others. Given the same set of customer demographics, customer service info, and availability of special offers, a spatial algorithm will make different predictions in different areas because it models the geographically varying influence of these features.
  • Account for the influence of nearby outcomes – spatial algorithms add the ability to determine the predictive influence of nearby outcomes. This predictive influence is referred to as “spatial dependence” and is incorporated into spatial algorithms. If such a spatial effect is observed, then spatial algorithms include it in predictions. For example, a model to predict disease outbreak will incorporate many features such as sanitation conditions, availability of clinics, and demographics. A spatial algorithm can additionally account for outbreaks observed in nearby areas.
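
One common way to express spatial dependence is a "spatial lag" feature: for each location, the average outcome observed at its nearest neighbors. The sketch below shows the idea in plain Python, assuming planar coordinates and a simple k-nearest-neighbor definition of "nearby". The function name `spatial_lag`, the parameter `k`, and the sample data are hypothetical, and this is not how the OML4Py spatial algorithms are necessarily implemented.

```python
from math import dist


def spatial_lag(coords, outcomes, k):
    """Average outcome of each location's k nearest neighbors (excluding itself).

    Hypothetical helper for illustration. Assumes planar (x, y) coordinates,
    k >= 1, and at least two locations.
    """
    lags = []
    for i, here in enumerate(coords):
        # Rank every other location by straight-line distance from this one.
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: dist(here, coords[j]),
        )
        nearest = others[:k]
        lags.append(sum(outcomes[j] for j in nearest) / len(nearest))
    return lags


# Made-up data: two groups of areas along a line, with outbreak rates per area.
coords = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
rates = [8, 9, 7, 1, 2, 1]

lags = spatial_lag(coords, rates, k=2)
print(lags)  # [8.0, 7.5, 8.5, 1.5, 1.0, 1.5]
```

Areas surrounded by high rates receive a high lag value, so adding this column alongside features such as sanitation conditions and clinic availability lets a model learn from nearby outcomes. When the lag is built from the same outcome being predicted, it should be computed only from data available at prediction time to avoid leakage.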

Spatial prediction model

Prediction example: Maps show home value prediction residuals (predicted value – actual value). Lighter colors represent higher accuracy. The spatial model on the right is generally more accurate than the non-spatial model on the left. The bar chart shows a standard model quality score, on which the spatial model also performs better.


These are some of the ways spatial data and algorithms enhance ML, and Oracle Machine Learning provides an ideal platform for realizing these benefits.

New spatial enhancement in Oracle Machine Learning

Spatial data management and analysis features are built into every Oracle Database, including Autonomous Database. These spatial database features are referred to as Oracle Spatial. Machine Learning in Autonomous Database provides data scientists, data engineers, and developers with a complete environment to develop and operationalize models using familiar notebooks and languages, while behind the scenes work is performed by Autonomous Database. The spatial enhancement is coming to Oracle Machine Learning for Python (OML4Py) on Autonomous Database, so you can develop and operationalize geospatial-based predictive models at scale in Python via Oracle Machine Learning Notebooks while leveraging the native support of Oracle Spatial for spatial data management and analysis operations.

What will be included in the enhancement?

Spatial data is generally categorized as spatial vector data (point/line/polygon data such as address locations, utility lines, and trade areas) and spatial raster data (gridded data such as satellite imagery and elevation models). The first phase of this enhancement will include features to incorporate spatial point/line/polygon data into the ML lifecycle in Machine Learning for Python on Autonomous Database. The enhanced Python API is planned to include:

  • Data prep and pre/post processing: load from common spatial formats, fill and scale data, engineer new features, persist results
  • Spatial analysis: exploratory analysis, Spatial SQL operations (i.e. Python API for Oracle Spatial)
  • Spatial ML algorithms: regression, classification, clustering, anomaly detection, colocation analysis
  • Map visualization: map rendering of spatial data and background maps in OML notebooks

For both data science generalists and those with spatial expertise, this enhancement will provide a tremendous opportunity to make better predictions and discover more insights through the use of spatial data and algorithms backed by Autonomous Database.

Need more information?

You will be able to read more about this feature once it becomes available through Oracle Machine Learning Notebooks. In the meantime:

  • Visit the Oracle Machine Learning page for an overview and additional resources related to OML4Py and OML Notebooks
  • Visit the product page to get to know more about Oracle Spatial
  • Visit our YouTube channel for a collection of learning and what’s new videos on Oracle Spatial

 

David Lapp

Senior Principal Product Manager

David Lapp is a Senior Principal Product Manager at Oracle Corporation. His responsibilities include strategy and planning for Oracle’s Spatial and Graph technologies and cloud services, and their use across the Oracle Cloud including machine learning and analytics. Prior to his current role in product management, David spent nearly 10 years in technical pre-sales covering analytics and spatial technology for the North American Public Sector. David is a graduate of the University of Washington.