Location data is a class of data all its own, and the insights it provides are invaluable to businesses. However, it comes with its own set of complexities.
Here are a few things you should be aware of regarding location data and the data economy. They will help you avoid common pitfalls when handling and transacting location data.
You might be buying inaccurate location data
And it might be causing you to make bad business decisions.
Research by Gartner shows that poor data quality costs businesses about $15 million per year.
One classic source of inaccurate location data is the bidstream.
If you have ever used or are still using bidstream data, you might want to read these.
- What is Bidstream?
- Less than 10% of bid-stream location data is high-quality.
- Nearly 80% of location data in the bidstream is inaccurate.
Some providers might sell you IP geolocation data or location data collected from Wi-Fi.
While more accurate compared to bidstream data, IP Geolocation data also has its inaccuracies.
- Android reports wrong location when connected to WiFi.
- Moved out with my router: Location services don’t get it.
For IP geolocation data, a common cause of inaccuracy is that the reported location is that of the service provider's servers rather than the device itself.
So, with such a weak reputation and known flaws, why do businesses still use these sources of location data? The answer comes down to price -- they are cheap.
But as the saying goes: “You get what you pay for.”
Computer Science 101: Garbage in, garbage out. You can't get good results without good data.
When you make big decisions based on potentially inaccurate location data, it can cost your company a lot of money.
Are you willing to take that risk?
Better-informed businesses typically use mobile location data or GPS data.
But these sources are not exempt from problems, particularly in how the data is handled.
Processing, cleaning, and normalising location data is complex and costly
Due to the way mobile location data is captured, there are a few inherent problems in all raw location data:
- Missing fields
- Irregular timestamps
- Wrongly formatted values
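As a rough illustration of what handling these three problems can look like, here is a minimal Python sketch. The field names (`device_id`, `lat`, `lon`, `timestamp`) and the accepted timestamp formats are assumptions made for the example, not a description of any particular provider's schema.

```python
from datetime import datetime, timezone

# Hypothetical schema: these field names are assumptions for illustration.
REQUIRED_FIELDS = ("device_id", "lat", "lon", "timestamp")


def parse_timestamp(value):
    """Convert a timestamp to a timezone-aware UTC datetime, or None."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        number = None
    if number is not None:
        if number > 1e11:  # assumption: values this large are milliseconds
            number /= 1000.0
        try:
            return datetime.fromtimestamp(number, tz=timezone.utc)
        except (OverflowError, OSError, ValueError):
            return None
    try:
        parsed = datetime.fromisoformat(str(value))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: timestamps without a timezone are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def clean_record(raw):
    """Return a normalised record, or None if the record cannot be repaired."""
    # Missing fields: drop records lacking any required value.
    if any(raw.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return None
    # Wrongly formatted values: coordinates must parse as numbers in range.
    try:
        lat = float(raw["lat"])
        lon = float(raw["lon"])
    except (TypeError, ValueError):
        return None
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    # Irregular timestamps: normalise everything to UTC datetimes.
    timestamp = parse_timestamp(raw["timestamp"])
    if timestamp is None:
        return None
    return {
        "device_id": str(raw["device_id"]),
        "lat": lat,
        "lon": lon,
        "timestamp": timestamp,
    }
```

Each record is handled independently, so the logic itself is simple. The cost comes from running it over billions of records on a recurring basis.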
Normally, data processing, cleaning and normalisation should be no issue for capable data practitioners.
However, things get complicated when you have to process billions of datapoints on a consistent basis.
The sheer size of a location dataset makes this process tedious, time-consuming, and demanding of processing power. In turn, that adds costs to your business.
Most businesses simply cannot afford the time, budget, or infrastructure to clean such a large amount of data on a consistent basis.
On top of that, cleaning and filtering location data might leave you with barely any datapoints to analyse. This means you risk being unable to adequately generalise from the insights revealed by the data.
You risk buying duplicate data
Due to the lack of transparency in the location data marketplace, companies will almost never reveal their data source. Often, they will claim to be the owner of the data.
This intricate web of data traders creates another problem in the location data marketplace: Buyers are at risk of purchasing the exact same data from two or more sellers.
Sometimes, sellers might sell you the exact same data. Other times, they might sell you datasets with a significant amount of overlapping datapoints. We call this cross-contamination.
In either case, data buyers are wasting their money buying duplicate data.
One easy way to check whether two sellers are providing the same data is to compare their datasets and look for duplicates or overlap.
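A minimal sketch of such a comparison in Python is shown below. It assumes both sellers deliver records with the same hypothetical fields (`device_id`, `lat`, `lon`, and `timestamp` in epoch seconds) and that device identifiers are comparable across sellers. Real deliveries may need to be mapped onto a common schema first.

```python
def record_key(record):
    """Build a comparable key from a record (hypothetical field names)."""
    return (
        record["device_id"],
        round(float(record["lat"]), 5),  # 5 decimal places is roughly 1 m
        round(float(record["lon"]), 5),
        int(record["timestamp"]),        # assumed to be epoch seconds
    )


def overlap_ratio(dataset_a, dataset_b):
    """Share of the smaller dataset's distinct records found in the other."""
    keys_a = {record_key(r) for r in dataset_a}
    keys_b = {record_key(r) for r in dataset_b}
    if not keys_a or not keys_b:
        return 0.0
    return len(keys_a & keys_b) / min(len(keys_a), len(keys_b))


seller_a = [
    {"device_id": "d1", "lat": 1.30000, "lon": 103.80000, "timestamp": 100},
    {"device_id": "d1", "lat": 1.30010, "lon": 103.80010, "timestamp": 160},
    {"device_id": "d2", "lat": 40.0, "lon": -74.0, "timestamp": 100},
]
seller_b = [
    {"device_id": "d1", "lat": 1.30000, "lon": 103.80000, "timestamp": 100},
    {"device_id": "d1", "lat": 1.30010, "lon": 103.80010, "timestamp": 160},
    {"device_id": "d3", "lat": 51.5, "lon": -0.1, "timestamp": 100},
]
ratio = overlap_ratio(seller_a, seller_b)  # two of three distinct records match
```

A ratio close to 1.0 suggests the two deliveries are largely the same data. Exact matching like this only catches unmodified copies.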
However, bad actors know this and will often modify their data in an attempt to fool you into believing that you are not buying duplicate data.
This brings us to our next point.
You might be buying fake or fraudulent data
And you might not realise it.
Some common techniques bad actors use are introducing a delay to the timestamp or making random changes to the latitude and longitude values.
Sometimes, these bad actors are careless with their data fraud, leaving big tell-tale signs. The following are two examples.
This particular device has been observed to have travelled from California to Sydney in one hour.
Around the world in 80 days? Make that an hour!
Nobody has invented one-hour travel from the United States to Australia. This is quite obviously faked data.
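A check for this kind of impossible journey can be sketched in a few lines of Python. The speed threshold and the `(epoch_seconds, lat, lon)` tuple layout are assumptions chosen for the example. The distance is the great-circle distance from the haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0
# Assumption: anything faster than roughly a commercial jet is implausible.
MAX_PLAUSIBLE_SPEED_KMH = 1000.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def impossible_jumps(points, max_speed_kmh=MAX_PLAUSIBLE_SPEED_KMH):
    """Return consecutive point pairs whose implied speed is implausible.

    points: list of (epoch_seconds, lat, lon) tuples for a single device.
    """
    ordered = sorted(points)
    flagged = []
    for prev, curr in zip(ordered, ordered[1:]):
        hours = (curr[0] - prev[0]) / 3600.0
        if hours <= 0:
            continue  # identical timestamps: speed is undefined, skip the pair
        distance = haversine_km(prev[1], prev[2], curr[1], curr[2])
        if distance / hours > max_speed_kmh:
            flagged.append((prev, curr))
    return flagged


# A device seen in Los Angeles, then in Sydney one hour later.
los_angeles = (0, 34.05, -118.24)
sydney = (3600, -33.87, 151.21)
suspicious = impossible_jumps([los_angeles, sydney])
```

Here the implied speed is roughly 12,000 km/h, so the pair is flagged. A check like this only catches careless tampering.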
Here is another sign that someone is trying to sell faked data. We discovered a huge amount of traffic ... in the middle of the Indian Ocean.
World’s largest pool party, where the ocean is your pool.
The above examples might be funny, but they do not happen often. Data fraudsters are usually good at what they do, and their modifications are often subtle and very hard to detect.
Unless you perform a detailed analysis of the dataset (which is time-consuming and resource-intensive), you might never know the difference between legitimate and tampered location data.
How to clean up the geospatial data space
At Quadrant, we want to make the data world better. We aim to make the location data space safe, fair, and efficient for businesses.
We do this in two primary ways.
Cleaned and refined location data based on your business requirements
We know and understand that it takes lots of time and effort to process and clean terabytes of data.
This is why Quadrant will clean, filter, and refine our location data to suit your business requirements.
The benefit for customers is that you only pay for what you use.
Only need data for downtown California?
We can filter it for you.
Want to receive your location data on a daily or hourly basis?
We can do that for you.
This saves you the valuable time and money you would otherwise spend doing all the cleaning yourself. You do not have to worry about data cleaning and processing anymore.
We ensure data provenance through our blockchain protocol
Our blockchain protocol, Quadrant Protocol, “stamps” the data that we receive, allowing users to trace and verify the origin of the data and its authenticity.
This technology also allows for mapping of the data and data sources. It de-clutters the field of information, helping users to eliminate resellers and trace bad data back to the person or organisation responsible.
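The internals of Quadrant Protocol are not covered in this post. As a generic illustration of why a stamp makes tampering detectable, and not a description of the protocol itself, the Python sketch below fingerprints a record with SHA-256. The function names and the record layout are hypothetical.

```python
import hashlib
import json


def fingerprint(record):
    """SHA-256 digest of a record's canonical JSON form (illustrative only)."""
    # Sorting keys makes the digest independent of field order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record, stamped_digest):
    """True if the record still matches the digest recorded when it was stamped."""
    return fingerprint(record) == stamped_digest


original = {"device_id": "d1", "lat": 1.3, "lon": 103.8, "timestamp": 100}
stamp = fingerprint(original)

# A reseller shifts the timestamp by a minute: the digest no longer matches.
tampered = dict(original, timestamp=160)
still_authentic = verify(original, stamp)
tamper_detected = not verify(tampered, stamp)
```

If a digest like this is recorded somewhere the seller cannot alter, any later change to the record shows up as a mismatch.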
Making the Location Data Economy Clean, Safe, Efficient.
While data has the potential to greatly improve businesses and products, volume alone is not enough.
Through our services and blockchain protocol, Quadrant aims to bring trust and visibility to the data economy.
We are a data and technology company that helps data professionals obtain and utilise high-quality authentic mobile location data.
To learn more about the various sources of location data, visit our location data education page.
To learn more about the business application of location data, download our use cases here.
We are open to guest blogging on location data, big data, and the data marketplace. If you have an interesting article covering these topics, talk to us.