Uncharted Earns Another Spot on the Leaderboard in AI Competition by DARPA and USGS

When last we checked on the joint DARPA and USGS AI for Critical Mineral Assessment Competition, we were thrilled to discover we had won the Map Georeferencing challenge. Since then, we have submitted our entry for the competition’s recently completed second challenge, Map Feature Extraction. Now that the scores are in, we’re honored to find ourselves on the leaderboard once again, earning top marks in the map symbols category and second best in region features on the way to a third-place finish overall. For more information on each challenge, see DARPA’s announcement of the winners and episode 63 of the Voices from DARPA podcast.

The Feature Extraction challenge is a natural follow-on to solving the Georeferencing challenge. Once the extents of a map are properly determined, we want to identify the areas, icons and lines that chart critical minerals and related geologic waypoints therein. What makes this a particularly vexing problem is that graphical marks on a map typically require a legend to both signify their importance and enable readers to decode them into usable information. For every map, a unique legend defines a unique feature set to find. The core challenge is to design AI solutions that can handle the many different legend and feature set encodings.

Historical maps developed at different times over many decades can be a difficult dataset to work with. Paper maps often suffer from degradation such as stains, folds, and other wear and tear. Pixelization introduced by digital scans compounds these issues. From map to map, marking and labeling conventions may vary greatly. Even within a single map, there may be inconsistently applied labels, handwritten notes, or markings rotated at different angles. There may be so much data that the markings themselves overlap, both on the map and in their assigned symbology. With dozens or even hundreds of legend entries, colors, patterns, shapes, or thicknesses may be so similar that they’re difficult to distinguish.

Geologic map with a legend showing 60 polygon features (many of which have nearly identical colors) and 19 different point and line features. When even human perception has trouble distinguishing map features, computational approaches can be difficult to design. | Image: USGS.

The goal of the Feature Extraction challenge was to determine which areas of a map matched a specific legend entry. There were three separate problems: extracting point symbols, extracting polygons (regions filled with color or patterns), and extracting lines (such as fault lines, extents of deposits, borders, rivers, or roads). Below we highlight our approaches to extracting point symbols and polygons, which placed first and second in their categories, respectively.

Point features mark the location of landmarks such as resource deposits. These marks come in a variety of shapes, sizes, orientations, and colors. Across all the challenge data, there were nearly 100 unique point symbols. To automatically identify them, we developed a hybrid approach encompassing techniques from many domains. We used computer vision, natural language processing, entropy analysis, and edge/contour analysis to preprocess map images and filter out unwanted features, then applied cross correlation analysis to find matches between the legend entry and the processed image at various rotations. Success depended on how closely the map points matched the legend entry. Distinct points were easiest to find; those that resembled lines were among the most challenging.

A hybrid approach to find (1) point symbols from the legend in historical maps. Image preprocessing (2) hides text, fades features that don't match the symbol color, and fades features that are too large. Cross correlation analysis (4) finds probable instances of the legend symbol in the processed image. | Images: USGS, Uncharted
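
To make the cross correlation step concrete, here is a minimal sketch in plain Python. It is not Uncharted's implementation: the function names (`rotate90`, `ncc`, `find_symbol`), the zero-mean normalized form of the correlation, the 90-degree rotation steps, and the score threshold are all assumptions chosen for illustration. The sketch slides a legend symbol over an already preprocessed grayscale grid and reports windows that correlate strongly at any of four rotations.

```python
from math import sqrt

Grid = list[list[float]]  # hypothetical stand-in for a preprocessed grayscale map


def rotate90(g: Grid) -> Grid:
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*g[::-1])]


def ncc(patch: Grid, templ: Grid) -> float:
    """Zero-mean normalized cross correlation of two equal-size grids."""
    p = [v for row in patch for v in row]
    t = [v for row in templ for v in row]
    mp, mt = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mp) * (b - mt) for a, b in zip(p, t))
    den = sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mt) ** 2 for b in t))
    return num / den if den else 0.0  # flat patches carry no shape information


def find_symbol(image: Grid, templ: Grid, threshold: float = 0.9):
    """Return (row, col, rotation_degrees, score) for windows scoring >= threshold."""
    hits = []
    rotated = templ
    for k in range(4):  # assumption: only 0, 90, 180, and 270 degrees are tried
        h, w = len(rotated), len(rotated[0])
        for r in range(len(image) - h + 1):
            for c in range(len(image[0]) - w + 1):
                patch = [row[c:c + w] for row in image[r:r + h]]
                score = ncc(patch, rotated)
                if score >= threshold:
                    hits.append((r, c, 90 * k, score))
        rotated = rotate90(rotated)
    return hits


# Demo: stamp an L-shaped symbol into a blank 9x9 "map", once as drawn and once rotated.
symbol = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]
page = [[0] * 9 for _ in range(9)]
for top, left, glyph in ((1, 1, symbol), (5, 5, rotate90(symbol))):
    for dr, row in enumerate(glyph):
        for dc, v in enumerate(row):
            page[top + dr][left + dc] = v
hits = find_symbol(page, symbol, threshold=0.99)
```

Normalizing the correlation makes the score insensitive to overall brightness and contrast, which is one plausible reason to fade non-matching features beforehand rather than delete them. A real pipeline would likely use an array library and finer rotation steps; the exhaustive loops here are only meant to show the idea.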

Polygon features distinguished by color or pattern pose a very different set of problems. They often overlap, they’re particularly susceptible to pixelization, their labels can vary greatly in size and placement, and their color coding may be difficult to distinguish. As with point features, our approach to identifying polygon features is a hybrid of many computational techniques, built on numerous experiments. It combines computer vision, entropy analysis, and artificial and convolutional neural networks (ANNs and CNNs) to aid with legend analysis and map segmentation; extraction of colors highly likely to match unique legend entries; spatial high-pass filtering to remove noise; and expansion of detected regions.

A hybrid approach to (2) identify map regions likely to match a legend entry based on color and then (3) filter the results to remove noise. With regions mapped to available legend entries, results are expanded out through black spaces (regions that don't match any provided legend entries) until they collide. | Images: USGS, Uncharted
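
The color matching, noise filtering, and region expansion steps can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the competition entry: the function names, the Manhattan RGB distance, the tolerance value, and the legend labels are hypothetical, and a simple "drop isolated pixels" rule stands in for the spatial filtering described above. Expansion is done with a breadth-first search seeded from every labeled pixel, so regions grow at the same rate until they collide.

```python
from collections import deque

Color = tuple[int, int, int]


def match_colors(pixels: list[list[Color]], legend: dict[str, Color], tol: int = 30):
    """Label each pixel with the closest legend entry, or None if none is within tol."""
    def dist(a: Color, b: Color) -> int:
        return sum(abs(x - y) for x, y in zip(a, b))  # assumption: Manhattan RGB distance

    labels = []
    for row in pixels:
        out = []
        for px in row:
            name, color = min(legend.items(), key=lambda kv: dist(px, kv[1]))
            out.append(name if dist(px, color) <= tol else None)
        labels.append(out)
    return labels


def neighbors(r: int, c: int, h: int, w: int):
    """Yield the in-bounds 4-connected neighbors of (r, c)."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < h and 0 <= nc < w:
            yield nr, nc


def drop_isolated(labels):
    """Unlabel pixels with no 4-neighbor of the same label (a crude noise filter)."""
    h, w = len(labels), len(labels[0])
    return [
        [lab if lab is not None
         and any(labels[nr][nc] == lab for nr, nc in neighbors(r, c, h, w)) else None
         for c, lab in enumerate(row)]
        for r, row in enumerate(labels)
    ]


def expand_regions(labels):
    """Grow labeled regions into unlabeled pixels, breadth-first, until they meet."""
    h, w = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    queue = deque((r, c) for r in range(h) for c in range(w) if out[r][c] is not None)
    while queue:
        r, c = queue.popleft()
        for nr, nc in neighbors(r, c, h, w):
            if out[nr][nc] is None:
                out[nr][nc] = out[r][c]
                queue.append((nr, nc))
    return out


# Demo: two legend colors separated by a black band, plus one stray pixel of noise.
legend = {"Qal": (250, 230, 120), "Tv": (200, 90, 90)}  # hypothetical legend entries
q, t, k = legend["Qal"], legend["Tv"], (0, 0, 0)
pixels = [[q, q, k, k, t, t], [q, t, k, k, t, t], [q, q, k, k, t, t]]
cleaned = drop_isolated(match_colors(pixels, legend))
regions = expand_regions(cleaned)
```

In the demo, the stray pixel inside the left region is unlabeled by the filter and then reclaimed during expansion, while the black band is split between the two regions where their fronts meet.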

Ultimately, USGS plans to build on the technological approaches from all the top-ranking teams on both challenges and apply them to millions of maps. They want to increase the pace at which they compile and enhance data for assessments of domestic mineral resources currently vulnerable to supply chain interruptions. That process can take up to four years for a single mineral, and there are 50 different minerals to assess. We’re hopeful that our approaches to all the problems posed in the competition will help in that aim, and we’re looking forward to opportunities to build and strengthen the developed techniques.

Acknowledgements

Thanks to DARPA and USGS for sponsoring and administering the AI for Critical Mineral Assessment Competition and crafting such engaging challenges. And thanks to our fellow entrants for their innovative solutions and spurring us on to devise equally creative approaches.

Uncharted credits: Chris Bethune, David Giesbrecht, and Nathan Kronenfeld.
