Given the advances in the location data space, it is surprising that many companies still fixate on the number of events per device per day when assessing the quality of location data.
Events are logged by mobile apps when location tracking permissions are granted by the user – this can be as frequently as every second or as infrequently as once a day or less, depending on the data collection methodology.
Challenging existing beliefs
The assumption all too often is that if the event count is below some arbitrary number, say, 10 events per day, the data is therefore of lower quality.
This, however, is up for debate – event quantity per se can be a very misleading, and in some cases a useless, metric.
In fact, it tells us relatively little about the device activity.
There are a few reasons why.
First, when considering a dataset, the evaluation criteria shouldn’t be just about the quantity of events per device per day. It is also important to consider the quality of events.
For example, if a device is seen 100 times per day at a single location, this provides limited insights (apart from telling us this device was at that location all day).
While this may be useful to some organisations, for most businesses it does not provide any actionable information. The exceptions may be delivery services like Amazon or DoorDash, or insights generated from a customer who dwells for many hours at a shopping mall.
A second example would be a device that is seen fewer than 10 times throughout a 24-hour period but in distinct and varied locations (e.g. shopping mall, gym, restaurant, roadways). The insights that can be derived from a device exhibiting such events contain a lot of contextual information and are much more valuable for decision making.
For most use cases, though, a high quantity of data does not necessarily mean high quality. Moreover, quantity as a raw metric doesn’t distinguish between a device seen 24 times between 10:37 and 10:39 and a device seen once per hour, every hour of the day (also 24 times).
Comparing the two hypothetical devices described above, one seen 100 times at a single location and one seen fewer than 10 times across several distinct locations, the data from the second can be argued to be of higher value, despite it having significantly fewer events.
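To make the comparison concrete, here is a minimal sketch of how one might look beyond the raw event count. It is an illustration only, not how Quadrant computes its metrics: the `Event` fields, the `summarize_device_day` helper, the sample coordinates and the choice of rounding coordinates to three decimal places (cells of roughly 100 m in latitude) are all assumptions made for this example.

```python
from collections import namedtuple
from datetime import datetime, timedelta

# Hypothetical event record; the field names are assumptions for illustration.
Event = namedtuple("Event", ["device_id", "timestamp", "lat", "lon"])


def summarize_device_day(events, cell_decimals=3):
    """Return the event count plus two simple 'spread' metrics for one device-day."""
    # Round coordinates so that nearby points fall into the same grid cell.
    cells = {(round(e.lat, cell_decimals), round(e.lon, cell_decimals)) for e in events}
    # Count how many distinct hours of the day contain at least one event.
    hours = {e.timestamp.hour for e in events}
    return {
        "event_count": len(events),
        "distinct_cells": len(cells),
        "active_hours": len(hours),
    }


day = datetime(2019, 9, 1)

# Device seen 24 times between 10:37 and 10:39, always at the same place.
burst = [
    Event("a", day.replace(hour=10, minute=37) + timedelta(seconds=5 * i), 1.3000, 103.8000)
    for i in range(24)
]

# Device seen once per hour, every hour of the day (also 24 events).
hourly = [Event("b", day.replace(hour=h), 1.3000, 103.8000) for h in range(24)]

# Device seen only six times, but at six different (made-up) places.
places = [
    (1.3000, 103.8000), (1.3100, 103.8200), (1.2900, 103.8500),
    (1.3500, 103.7500), (1.2800, 103.8400), (1.3300, 103.9000),
]
varied = [
    Event("c", day.replace(hour=8 + 2 * i), lat, lon)
    for i, (lat, lon) in enumerate(places)
]

for name, events in [("burst", burst), ("hourly", hourly), ("varied", varied)]:
    print(name, summarize_device_day(events))
```

With this sample data, the burst and hourly devices both report 24 events, yet only the hourly device shows 24 active hours, and the six-event device is the only one that covers more than one distinct cell.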
Transparency in the location data space
In our Data Quality Dashboard, we chart events per day (as seen below) alongside a number of other key location data metrics. Our objective is to give data partners transparency and a high-level view of relevant indicators, including devices, events, days, hours and accuracy.
Quadrant Location Data Feed – Hydra, September 2019
Quadrant Location Data Feed – Orion, September 2019
Data buyers must be prepared to dig deeper and look not just at the number of events, but at whether those events are recorded at different locations, as these provide valuable insights and context. Ultimately, different business applications will require location data that fits their specific requirements.
A final thing to note is that data partners can be confident that Quadrant filters out misleading event data. One real-life example from the Quadrant team was a device spotted in three countries in a single hour (teleporting). We flagged it as suspicious and removed it from our data sets.
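One common way to catch this kind of "teleporting" is to check the implied travel speed between consecutive events. The sketch below is a generic illustration and not a description of Quadrant's actual filter: the `has_implausible_jump` helper, the 1,000 km/h speed limit (a little above typical airliner cruising speed), the 1 km jitter tolerance and the sample coordinates are all assumptions.

```python
import math
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def has_implausible_jump(events, max_speed_kmh=1000.0, min_distance_km=1.0):
    """Return True if any consecutive pair of events implies an impossible speed.

    `events` is a list of (timestamp, lat, lon) tuples for a single device.
    Both thresholds are assumed values chosen for this example.
    """
    ordered = sorted(events, key=lambda e: e[0])
    for (t1, lat1, lon1), (t2, lat2, lon2) in zip(ordered, ordered[1:]):
        hours = (t2 - t1).total_seconds() / 3600.0
        dist = haversine_km(lat1, lon1, lat2, lon2)
        # Ignore tiny movements (GPS jitter); flag anything faster than the limit.
        if dist > min_distance_km and dist > max_speed_kmh * hours:
            return True
    return False


# Made-up device seen in three countries within a single hour.
teleporting = [
    (datetime(2019, 9, 1, 9, 0), 1.3521, 103.8198),    # Singapore
    (datetime(2019, 9, 1, 9, 20), 51.5074, -0.1278),   # London
    (datetime(2019, 9, 1, 9, 40), 40.7128, -74.0060),  # New York
]

# Made-up device moving a few kilometres within one city in half an hour.
commuter = [
    (datetime(2019, 9, 1, 9, 0), 1.3521, 103.8198),
    (datetime(2019, 9, 1, 9, 30), 1.3000, 103.8000),
]

print(has_implausible_jump(teleporting))
print(has_implausible_jump(commuter))
```

A speed check like this only looks at consecutive pairs of events, so it is a first-pass screen; a flagged device would still need review before being removed.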
Stay tuned as we dive into more detail on inaccurate and fraudulent data in our next post. For now, we hope you find these quick-hit insights on data quantity versus quality useful!
If you would like to find out more about our location data products, drop us a mail at firstname.lastname@example.org