How Quadrant prevents fraudulent records within location data supply chains


Good quality data is crucial for the operational efficiency of location-based businesses. Location-based analyses used in critical decision-making processes rely on the quality and accuracy of data powering them.

However, location data supply chains are characterized by a lack of transparency and authenticity, because data vendors prioritize volume over quality. To ensure that our customers make the most of their data purchase, we have made significant strides in identifying and erasing fraudulent data within our datasets. One way we achieve this is by examining our own data sources and constantly monitoring activity within our mobility and POI data feeds. We use data visualization and proprietary AI models to spot instances of low-quality or fraudulent data and remove them before they reach customers.

Identifying fraudulent mobile location data records 

Quadrant sources mobile location data via Server-to-Server (S2S) integrations with app publishers from around the world. Naturally, there are going to be duplicate records among these feeds. We proactively perform overlap analysis and run data through a de-duplicating algorithm to make sure we only ingest and disseminate unique values.  
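As a rough illustration of the de-duplication idea, here is a minimal Python sketch. It is not Quadrant's actual algorithm: the Signal fields, the choice of (device, timestamp, rounded coordinates) as the duplicate key, and the 6-decimal rounding are all assumptions made for the example.

```python
from typing import Iterable, Iterator, NamedTuple


class Signal(NamedTuple):
    # Hypothetical record layout; real feeds will carry more fields.
    device_id: str
    timestamp: int   # Unix seconds (assumed)
    lat: float
    lon: float
    source: str      # name of the publisher feed (assumed)


def deduplicate(signals: Iterable[Signal]) -> Iterator[Signal]:
    """Yield only the first occurrence of each (device, time, position)."""
    seen = set()
    for s in signals:
        # Round coordinates so that trivially different float encodings
        # of the same position still count as duplicates.
        key = (s.device_id, s.timestamp, round(s.lat, 6), round(s.lon, 6))
        if key not in seen:
            seen.add(key)
            yield s
```

Because the key ignores the source field, the same signal arriving through two different publisher feeds is kept only once, which is the overlap case described above.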
 
Unique signals are generally indicative of a healthy data feed. However, there are a few concerns that must still be addressed. 

Lack of movement: A lack of movement tends to be an indicator of low-quality location data, whereas high-quality data shows lots of movement.  

“Kansas farm” (and other similar phenomena): Data that shows far more people at the same coordinates than can reasonably be expected is always a red flag.

Teleportation: The same device appearing in multiple countries or regions within the same 24-hour period points to extreme inaccuracies in the data. 
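The first two concerns can be approximated with simple counting. The sketch below is illustrative only: the (device_id, lat, lon) tuple layout, the rounding precision used to group coordinates, and the thresholds are assumptions, not values taken from Quadrant's pipeline.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Row = Tuple[str, float, float]  # (device_id, lat, lon), an assumed layout


def crowded_coordinates(signals: Iterable[Row], max_devices: int,
                        precision: int = 5) -> Dict[Tuple[float, float], int]:
    """Return coordinates shared by more than max_devices distinct devices."""
    devices_at = defaultdict(set)
    for device_id, lat, lon in signals:
        cell = (round(lat, precision), round(lon, precision))
        devices_at[cell].add(device_id)
    return {cell: len(devs) for cell, devs in devices_at.items()
            if len(devs) > max_devices}


def stationary_devices(signals: Iterable[Row], min_signals: int = 10,
                       precision: int = 4) -> Set[str]:
    """Return devices with at least min_signals signals, all in one cell."""
    cells = defaultdict(set)
    counts = defaultdict(int)
    for device_id, lat, lon in signals:
        cells[device_id].add((round(lat, precision), round(lon, precision)))
        counts[device_id] += 1
    return {d for d in cells
            if counts[d] >= min_signals and len(cells[d]) == 1}
```

A sensible value for max_devices depends on the place: a stadium legitimately holds thousands of devices, while a remote field does not, so a production check would likely need context beyond a single global threshold.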

In the visualizations below, we can see that a device is observed in multiple, far-off locations within very short time frames. If we take travel time and possible modes of transportation into account, we can determine that this activity is not realistic. 

To identify such location data signals, we have designed an algorithm that monitors peculiar movement patterns by gauging the distance a device covers between consecutive timestamps. Once identified, these records are removed from our feeds.
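One common way to implement a check of this kind is to compare the great-circle distance between consecutive signals with the distance a device could plausibly cover in the elapsed time. The following Python sketch assumes a per-device track of (timestamp_seconds, lat, lon) tuples, a 1,000 km/h speed ceiling (roughly airliner speed), and a 1 km tolerance for GPS jitter. These names and thresholds are hypothetical and not Quadrant's production values.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0
Point = Tuple[int, float, float]  # (timestamp_seconds, lat, lon), assumed


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def flag_teleports(track: List[Point], max_speed_kmh: float = 1000.0,
                   min_km: float = 1.0) -> List[int]:
    """Return indices (into track) of signals that are implausibly far
    from the previous signal in time order."""
    ordered = sorted(enumerate(track), key=lambda item: item[1][0])
    flagged = []
    for (_, (t0, lat0, lon0)), (i1, (t1, lat1, lon1)) in zip(ordered, ordered[1:]):
        hours = (t1 - t0) / 3600.0
        dist = haversine_km(lat0, lon0, lat1, lon1)
        # Farthest plausible travel in the elapsed time, with a jitter floor.
        if dist > max(min_km, max_speed_kmh * hours):
            flagged.append(i1)
    return flagged
```

Note that this simple version compares each signal with its immediate predecessor, so a device that "jumps" away and back will have both the jump and the return flagged. A fuller implementation might compare against the last trusted signal instead.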

Hydra Quality Control GIFs (1)

 
Hydra Quality Control GIFs (3)

Fraud detection to maintain accurate POI datasets 

We build custom POI datasets for customers using our exclusive POI data collection platform, Geolancer. Since Geolancer relies on external contributors (who add data in exchange for crypto rewards), we must remain vigilant for the occasional unethical user.  

All our POI data undergo a thorough quality assurance process. We have developed AI models to identify anomalies, for example, when consecutive geo-coordinates are mapped in timeframes that are not humanly possible. 

In the visual below, we see a Geolancer hopping between far-off locations rapidly - something that is highly unusual for a human being.

Geolancers hopping

When it comes to POI data, photos are the ultimate arbiter of truth, and all POIs mapped through Geolancer must include at least one photo. We leverage these images to combat other forms of ‘user ingenuity’. Photos must be taken in front of a POI; hence, users cannot add or verify a location unless they are present at the exact coordinates. Using text recognition, we can validate whether location names, categories, opening hours, and other attributes are accurate.
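To give a feel for how text recognition can support this kind of validation, the sketch below compares a submitted POI name with text that an OCR step has already extracted from the photo. The OCR step itself is out of scope here, and the function names, the token-level fuzzy matching, and the 0.8 similarity threshold are assumptions for illustration rather than a description of Geolancer's models.

```python
import re
from difflib import SequenceMatcher
from typing import List


def normalize(text: str) -> List[str]:
    """Lowercase, replace punctuation with spaces, and split into words."""
    return re.sub(r"[^a-z0-9 ]+", " ", text.lower()).split()


def name_matches_photo(submitted_name: str, ocr_text: str,
                       threshold: float = 0.8) -> bool:
    """True if every word of the submitted name has a close match
    among the words recognized in the photo."""
    name_tokens = normalize(submitted_name)
    ocr_tokens = normalize(ocr_text)
    if not name_tokens or not ocr_tokens:
        return False
    for token in name_tokens:
        # Fuzzy comparison tolerates small OCR mistakes such as l/i swaps.
        best = max(SequenceMatcher(None, token, word).ratio()
                   for word in ocr_tokens)
        if best < threshold:
            return False
    return True
```

For example, name_matches_photo("Sunrise Bakery", "Sunrise Bakery - Open Daily") accepts the submission, while a photo whose recognized text reads "City Hardware Store" would be rejected and could be routed to manual review.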

Our preventative algorithms allow us to immediately identify suspicious behavior so that poor-quality data never makes it into our clients’ data feeds. As a result of our due diligence, we have been able to deliver POI datasets for companies like Gojek with an exceptional 97% accuracy! 

To make the most of their investment, companies must always assess the quality assurance process of their data providers. Supplying the highest quality location data should be a priority for data providers, and performing consistent quality checks on the data supply chain is a crucial step.


 

Want to leverage Quadrant's industry-leading geospatial data to improve your business's performance? Speak to one of our data experts today.

 

 

ABOUT AUTHOR

Roger Sundararaj

Data Scientist | Solidity Engineer at Quadrant
