Location Data Knowledge Base

You want to start using location analytics and intelligence to improve your business profitability but do not know where to start?

Read on to learn all you need to know about location data and geospatial intelligence.

Location Knowledge Base - Table Of Contents

1. Introduction
2. Basics of Location Data
3. Location Data Sources and Types of Location Data
      How Do I Get Location Data?
      Is It Legal to Use Mobile Location Data?
4. Representing Location Data
      Latitude / Longitude
      Geohash
      Types of geo-indexing systems
5. Location Data Attributes and Data Fields
6. Horizontal Accuracy
7. Location Analytics Methods, Techniques, and Industry Applications of Location Data
      Movement and Visitation Analysis
      Origin-Destination Study (O-D Study)
      Week-on-Week Interest Tracking
      Cluster Tracing
      Mobility Analysis (Transportation and Urban Planning)
      Building Audiences, Understanding Movement and Attribution
      Using Location Data in Kepler
8. Location-Based Business and Marketing Solutions

1. Introduction

Welcome to our Quadrant Location Data Knowledge Base! Through these chapters, we will learn about what location data is, why it is useful, and the mechanics behind its collection and real-world usage.

You can also download our eBooks and case studies to learn how Quadrant is helping businesses solve a myriad of challenges with location data.

Follow us on LinkedIn to get updates on new chapters and topics that we are constantly adding, or send us your questions to marketing@quadrant.io!


2. Basics of Location Data

Location data is information about the geographic positions of devices (such as smartphones or tablets) or structures (such as buildings or attractions).

The geographic positions of location data are called coordinates, and they are commonly expressed in Latitude and Longitude format.

Additional attributes such as elevation or altitude may be included to help data users get a more accurate picture of the geographic positions in their data.

People commonly mean GPS data when they talk about location data. In reality, there are various types of location data.

It is important to know how the data is collected, as the collection method determines the accuracy and depth of the data. This has direct implications for the suitability and usability of the data for a business.

lat long map

3. Location Data Sources and Types of Location Data

Global Positioning System (GPS)

GPS provides latitude-longitude coordinates gathered by hardware on a device, such as a car navigation system, a mobile phone, or a fitness tracker, that receives signals from satellites.

The latitude/longitude coordinates generated by the GPS are considered the standard for location data. Your device receives signals from the satellites and it can calculate where it is by measuring the time it takes for the signal to arrive.

This produces very accurate and precise data under the right conditions. The quality of a GPS signal degrades significantly indoors or in locations that obstruct the view of multiple GPS satellites.

App Publishers

GPS data can be pulled directly from its point of origin, mobile devices, through in-app SDKs or via a Server-to-Server (S2S) integration with app publishers.

Data collected from mobile apps has the potential to be very accurate and insightful. Users' movement patterns can be observed in aggregate to uncover deeper insights for businesses. However, the biggest challenge with this method is achieving scale.

These apps require a user’s permission to collect location information, which is generally obtained through an opt-in interface when the user first interacts with the application. With newer restrictions on inter-app tracking, some apps only provide location data when they are open or running in the background, as chosen by the owner of the device.

Bidstream

Bidstream data is data collected from the ad servers when ads are served on mobile apps and websites. Bidstream data is easy to obtain and scale, but it is often incomplete, inaccurate, or even illegitimate.

Many ads do not collect the device's location but instead record the IP address the phone is connected to. This IP address often does not reflect the actual location of the device; e.g., a person sitting in a Starbucks but connected to the university campus Wi-Fi will be recorded as being at the campus. Moreover, due to the speed at which ads need to be delivered, the cached location of the phone is very often recorded because the GPS does not have enough time to update itself.

Centroids, or a large accumulation of device IDs in central locations, are very common with Bidstream data, as many devices may connect to the same IP address and many apps automatically default to the central location of the country where the ad is served.

Beacons

Beacons are hardware transmitters that can sense other devices when they come into close proximity.

The location data collected by beacons is very accurate. They can also collect details such as names and birthdays, which can be very valuable to businesses.

Since beacons are hardware, they have to be purchased and installed at locations businesses want to track. Therefore, as with SDK data, it can be challenging to achieve scale through this method.

Wi-Fi

Wi-Fi enables devices to emit probes to look for access points (routers).

These probes can be measured to calculate the distance between the device and the access point. The precision of Wi-Fi location data is entirely dependent on the Wi-Fi network it is built on.

Wi-Fi networks are great at providing accuracy and precision indoors. Devices can use this infrastructure for more accurate placement when GPS and cell towers are not available, or when these signals are obstructed.

Point of Sale (POS)

POS data is data that stems from consumer transactions. This data usually contains adjacent information such as purchase items, amount spent, and method of payment, which can provide valuable information.

Because POS data is decentralized, it would be difficult to match multiple data sources through this method. POS data also only captures customers who have made an in-store purchase, and does not capture information on people who entered the store but did not buy anything.

Point-of-Interest (POI) Data

POI is also a type of location data, but instead of identifying a device, it describes the physical location of a business, a landmark, or any other point of interest. POI data is typically used together with other types of location data to derive insights and better understand consumer traffic and behaviour.

There are advantages and disadvantages to each source of location data.

Businesses should consider various factors such as budget, accuracy requirements, and use cases when evaluating the source of their location data.

Businesses should also consider the ways different types of location data can complement each other.

How do I get location data?

Most businesses purchase location data or location data feeds from data providers, as they do not have the time, resources, and expertise to collect location data themselves.

However, businesses should be aware that the quality of data from each data provider will vary. Data providers that specialise in providing location data tend to have higher quality data, while the more general data vendors might not have the expertise to provide good quality data.

Due to the nature of data, it is nearly impossible to verify whether a provider is selling authentic data. Businesses should assess the credibility of the data provider to avoid purchasing poor quality or even fraudulent data.

Quality location data is important as it correlates with the accuracy and reliability of the findings and insights. Bad data can result in false findings, which causes businesses to waste lots of time, effort, and money.

Is it legal to use mobile location data?

Unlike the use of data in the digital world (e.g., user data collected on social media), location data is free of context, i.e., it doesn’t record a person’s identity, demographics, or any other personally identifiable information. Businesses worldwide are using location data for the betterment of services, performing studies to improve lives, and solving numerous other challenges.

However, just like with any such information, consent conditions are applicable to the collection of location data. Data privacy laws like GDPR and CCPA empower users to take ownership of their information and govern how businesses are using it.

Under these consent conditions, data collectors must gain the consent of customers to use, store, manage, and share their data, while allowing them to modify or opt-out of their earlier preferences at any point in time.

Download our eBook to learn what consent management is, why it is important, and how you can establish compliance with the stringent consent requirements mandated by today's data privacy laws.


4. Representing Location Data

Latitude / Longitude

latitude-1

The "latitude" of a point on Earth's surface is the angle between the equatorial plane and the straight line that passes through that point and through (or close to) the center of the Earth. The 0° parallel of latitude is designated the Equator, the fundamental plane of all geographic coordinate systems.

longitude-1

The "longitude" of a point on Earth's surface is the angle east or west of a reference meridian (the prime meridian) to another meridian that passes through that point. Fun fact: you can actually step on the prime meridian if you ever visit the Royal Observatory in Greenwich, in southeast London (we highly recommend it, as the view of the city is amazing from there).

globe_illustration

The combination of these two components specifies the position of any location on the surface of Earth. Lat/long data points can be expressed in decimal degrees (DD). The other convention for expressing lat/long is in degrees, minutes, seconds (DMS). For example, below is the same point expressed in DD and DMS (you can find many converters online):

DD: 47.21746, -1.5476425
DMS: 47° 13’ 2.856” N, 1° 32’ 51.513” W

airport_dms_coordinates

You can see DMS coordinates like these at airports, where the gates are marked in degrees, minutes, and seconds.
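The DD and DMS example above can be reproduced with a short conversion sketch. The function names `dd_to_dms` and `dms_to_dd` are our own for illustration, and the sketch assumes the usual sign convention (negative latitude is south, negative longitude is west).

```python
def dd_to_dms(value, is_latitude):
    """Convert decimal degrees to (degrees, minutes, seconds, hemisphere)."""
    if is_latitude:
        hemisphere = "N" if value >= 0 else "S"
    else:
        hemisphere = "E" if value >= 0 else "W"
    magnitude = abs(value)
    degrees = int(magnitude)
    minutes_full = (magnitude - degrees) * 60
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60
    return degrees, minutes, seconds, hemisphere


def dms_to_dd(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds and a hemisphere letter back to decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# The point used in the text: 47.21746, -1.5476425
print(dd_to_dms(47.21746, is_latitude=True))
print(dd_to_dms(-1.5476425, is_latitude=False))
```

Because of floating-point rounding, the seconds value is best rounded to the number of digits you actually need before it is displayed.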

Another important thing to understand about decimal degrees is that they carry a level of precision. The number of decimal places required for a particular precision at the equator is:

degree_precision

A value in decimal degrees to a precision of 4 decimal places is precise to 11.132 meters at the equator. A value in decimal degrees to 5 decimal places is precise to 1.1132 meters at the equator.
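The figures above follow from the fact that one degree spans roughly 111.32 km at the equator. A minimal sketch of that arithmetic is shown here; the function name is hypothetical, and the east-west value uses a simple cosine-of-latitude approximation rather than a full ellipsoid model.

```python
import math

# Approximate length of one degree at the equator, in meters
METERS_PER_DEGREE_AT_EQUATOR = 111_320


def decimal_place_precision_m(decimal_places, latitude_deg=0.0):
    """Return (north_south_m, east_west_m) covered by one step of the last decimal place."""
    step_deg = 10 ** -decimal_places
    north_south = METERS_PER_DEGREE_AT_EQUATOR * step_deg
    # A degree of longitude shrinks with the cosine of the latitude
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for places in range(0, 7):
    ns, ew = decimal_place_precision_m(places)
    print(places, round(ns, 4), round(ew, 4))
```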

Geohash

Invented by Gustavo Niemeyer, Geohash is a geocoding system that allows the expression of a location anywhere in the world using an alphanumeric string. Geohash is a unique string derived by encoding and reducing the two-dimensional geographic coordinates (latitude and longitude) into a string of digits and letters. A Geohash can be as vague or accurate as needed depending on the length of the string.

Geohashes use a Base-32 alphabet, i.e., all digits 0-9 and all lower-case letters except "a", "i", "l" and "o". It is a convenient way to express a location anywhere in the world. Geohashes basically divide the world into a grid with 32 cells. Each cell also contains 32 cells, and each one of these contains 32 cells (and so on repeatedly).

geohash_world

Adding characters to the geohash sub-divides a cell, effectively zooming in to a more detailed area. This is referred to as geohash precision. Geohash Precision is a number between 1 and 12 that specifies the precision (i.e., number of characters) of the geohash. Each additional character of the geohash adds precision to your location.
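To make the encoding concrete, here is a minimal sketch of a geohash encoder written from the public description of the algorithm: bits alternate between longitude and latitude, each bit halves the remaining range, and every 5 bits become one Base-32 character. The function name `geohash_encode` is our own, and this is an illustration rather than the implementation Quadrant uses.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # digits plus letters, without a, i, l, o


def geohash_encode(latitude, longitude, precision=12):
    """Encode a lat/long pair as a geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_longitude = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_longitude:
            rng, value = lon_range, longitude
        else:
            rng, value = lat_range, latitude
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half
        use_longitude = not use_longitude
        bit_count += 1
        if bit_count == 5:  # 5 bits map to one Base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


# Approximate coordinates of the Empire State Building (an assumption for illustration)
full = geohash_encode(40.7484, -73.9857, precision=12)
print(full, full[:6])
```

A shorter geohash of the same point is always a prefix of the longer one, which is why truncating the string is the same as zooming out to a larger cell.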

At Quadrant, we usually provide 12-precision geohash for all the events.

The cell sizes of geohashes of different lengths are as follows; note that the cell width reduces moving away from the equator (to 0 at the poles):

geohash_precision1

Visually:

geohash_precision2

Geohashes have a property that makes them suitable for geospatial queries such as localized search: points that share a long geohash prefix are close to each other (although two nearby points on opposite sides of a cell boundary may not share a prefix).

For example, if you want to list the devices that were seen in and around the Empire State Building, you can first determine the geohashes you want to cover and then run a simple query:

geohash_empire_state

SELECT * FROM table_name WHERE geohash LIKE 'dr5ru6%' OR geohash LIKE 'dr5ru3%' OR geohash LIKE 'dr5rud%' OR geohash LIKE 'dr5ru9%';

Doing this improves processing times and costs, as it allows you to quickly sort through large amounts of data and work on more precise subsets of data. In fact, most data scientists use geohashes to quickly sort through large location data sets, and then build specific queries (such as polygons) around the specific point or area of interest. In doing so, you can reduce your costs and increase your speed of processing, while maintaining accuracy and precision.

Types of geo-indexing systems

Geodata is information about geographic locations that is stored in a format that can be used with a geographic information system (GIS). For example, at Quadrant, our geo data is stored in three different formats which can be used for geospatial analysis: Country Codes, Latitude & Longitude coordinates, and Geohashes.

Country Codes

Usually, the ISO 3166-1 alpha-2 (two-letter) country code represents the locale of the devices, i.e., the devices registered to users from the stipulated countries. At Quadrant, in addition to the country code, we also derive another attribute called 'country', which represents the events/devices that are seen within the geographical boundaries of the stipulated countries. For example, if you want to get the total number of events seen within Singapore by using its country code, you can run a simple query:

SELECT count(*) FROM table_name WHERE country = 'SG';

Lat/long coordinates

Coordinates can be used to identify where an event was recorded. We can use the coordinates to either list devices from a single location:

SELECT * FROM table_name WHERE latitude = 41.9022 AND longitude = -76.37695;

Or we can use a bounding box, which is an area defined by two longitudes and two latitudes, to get information from a certain area or a country.

Bounding box for Australia:

australia_geofence

To get the total number of events seen within Australia by using a bounding box, you can run a simple query:

SELECT count(*) FROM table_name WHERE (latitude BETWEEN -43.96119063892024 and -10.660607953624762 and longitude BETWEEN 112.5 and 154.51171875);

Geofencing

A geo-fence is a virtual perimeter for a real-world geographic area. It could be a radius around a single point, or a predefined set of boundaries. Once a geo-fenced boundary is defined, what businesses can do with it is limited only by their creativity.
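For the simplest case, a radius around a single point, a geo-fence check can be sketched with the haversine great-circle distance. The function names and the building coordinates below are assumptions for illustration, and the Earth is treated as a sphere with a mean radius of 6,371 km.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def in_circular_geofence(lat, lon, center_lat, center_lon, radius_m):
    """True if the point lies within radius_m of the geo-fence center."""
    return haversine_m(lat, lon, center_lat, center_lon) <= radius_m


# Approximate coordinates, for illustration only
empire_state = (40.7484, -73.9857)
chrysler = (40.7516, -73.9755)
print(in_circular_geofence(*chrysler, *empire_state, radius_m=200))
print(in_circular_geofence(*chrysler, *empire_state, radius_m=1500))
```

Geo-fences defined by a set of boundaries (polygons) need a point-in-polygon test instead, which is usually left to a GIS library or a spatial database.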

One common use of geo-fencing is for businesses to set up geo-fences around their competitors and push marketing promotions to customers who enter the zone. This is sometimes referred to as geo-conquesting. Businesses could also provide location-based services within a geo-fenced region.

Geofencing is ideal for catchment area analysis; a catchment area is an area from which a business expects to draw its customers. Catchment areas can help businesses identify where to run their next marketing campaign or set up their next store.

geofence

catchment area

5. Location Data Attributes and Data Fields

Location data generally have some attributes or data fields in common such as latitude, longitude, and horizontal accuracy. Other data fields tend to be dependent on the source of the data.

Below is a non-exhaustive list of attributes found in location data:

Latitude, Longitude, Horizontal Accuracy, Altitude & Elevation

Latitude and longitude show the position of a device or structure. They are commonly accompanied by horizontal accuracy, which tells users the degree of error in a particular data point.

Altitude or elevation pinpoints the height above a reference point, usually sea level.

Timestamp

Timestamps are typically used for logging events alone or in a sequence.

In the case of location data, they provide context to the movement of a particular device.

Location data feeds commonly record timestamps in Unix time, otherwise known as Unix Epoch time, or Epoch for short.
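Unix time counts the seconds elapsed since 1 January 1970 (UTC), so it has to be converted before a person can read it. A minimal sketch of that conversion follows; the function names are our own, and whether a given feed stores seconds or milliseconds is an assumption you should check against the feed's documentation.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch_seconds):
    """Convert Unix time in seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def epoch_millis_to_utc(epoch_millis):
    """Convert Unix time in milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)


print(epoch_to_utc(1_600_000_000).isoformat())  # 2020-09-13T12:26:40+00:00
```

Keeping timestamps in UTC until the final step, and only then converting to the local time zone of the event, avoids most off-by-some-hours mistakes in dwell time or time-of-day analysis.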

IP Address

An Internet Protocol address, commonly known as an IP address, is a numerical label assigned to each device connected to a computer network.

IP addresses can be used for location; however, accuracy can be problematic. One common occurrence when looking up an IP address's location is being directed to the network provider's location.

Depending on the business use case, an IP address may provide a good rough measure of geographic location.

Mobile Ad ID / Device ID

The Mobile Ad ID or Device ID is a unique 36-character identifier for smartphones. Mobile Ad IDs help to identify, track, and differentiate between mobile devices.

The Mobile Ad ID/Device ID is widely used across the marketing ecosystem, especially for targeting devices with ads through demand-side platforms (DSPs) and supply-side platforms (SSPs) in the app advertising supply chain.

For iOS devices, the unique device ID is called the Identifier for Advertisers (IDFA). Apple customers can receive their mobile device ID numbers through iTunes or the Apple App Store. Before the rollout of iOS 6, this device ID was called a Unique Device Identifier (UDID) in the Apple ecosystem.

For Android devices the device ID is called Google Advertising ID (GAID). These mobile device IDs are randomly generated during the first activation of a mobile device and remain constant for the lifetime of the device unless reset by the user. Typically the IDs belonging to iOS devices will be in upper-case and those belonging to Android devices will be in lower-case.

As they are a unique identifier, they can be used to calculate aggregated metrics such as Daily Active Users (DAU), Monthly Active Users (MAU), etc. To learn more about the different queries you can run using the device ID please visit our Resources library.
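As a rough illustration of these aggregated metrics, the sketch below counts unique device IDs per UTC day and per calendar month. The input shape (pairs of device ID and Unix time in seconds) and the function names are assumptions, and the IDs are lower-cased only so that the same ID written in a different case is not counted twice.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_active_users(events):
    """events: iterable of (device_id, epoch_seconds). Returns {date: unique device count}."""
    devices_by_day = defaultdict(set)
    for device_id, epoch_seconds in events:
        day = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).date()
        devices_by_day[day].add(device_id.lower())
    return {day: len(devices) for day, devices in devices_by_day.items()}


def monthly_active_users(events):
    """events: iterable of (device_id, epoch_seconds). Returns {(year, month): unique device count}."""
    devices_by_month = defaultdict(set)
    for device_id, epoch_seconds in events:
        ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
        devices_by_month[(ts.year, ts.month)].add(device_id.lower())
    return {month: len(devices) for month, devices in devices_by_month.items()}


# Hypothetical events: two devices on day one, one returning device on day two
sample = [("id-a", 0), ("ID-A", 3600), ("id-b", 7200), ("id-a", 86400)]
print(daily_active_users(sample))
print(monthly_active_users(sample))
```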


6. Horizontal Accuracy

GPS satellites broadcast their signals in space with a certain accuracy. However, this accuracy is not directly controllable by the app developer, at least not through any software mechanism. The accuracy received depends on additional factors such as satellite geometry, signal blockage, atmospheric conditions, and receiver design features/quality.

Horizontal Accuracy is a radius around a 2D point, implying that the true location is somewhere within the circle formed by the given point as a center and the accuracy as a radius. As shown in the example below, the exact location of the device could be anywhere within the orange shaded circle:

accuracy_radius

Different use cases require different levels of accuracy readings. For example, a relatively weak level of horizontal accuracy would be acceptable in city- or country-level analyses, as opposed to a more precise segmentation of users that visited stores within a retail park.

It is important to understand the difference between the horizontal accuracy of a point and the precision of a point, which, as discussed earlier, is represented by the number of decimal places in the latitude and longitude coordinates. Simply put, the precision of a point is a measure of how exact the pinpoint location of a data point is on a map. The horizontal accuracy is a measure of how close the data point is to the actual (ground truth) point.

Low accuracy / Low precision

low_accuracy_low_precision

Low accuracy / High precision

low_accuracy_high_precision

High accuracy / Low precision

high_accuracy_low_precision

High accuracy / High precision

high_accuracy_high_precision

In summary, if you were to conduct an experiment where you measure the location of a device over time, these two scenarios would describe the difference:

Scenario 1:

High precision means you have low standard deviation from the mean of the distribution (seen on the second and fourth pictures).

For example, the points are measured a small distance apart from each other (all around the Empire State Building, 34th St). The true location of the device may be relatively far away from these points (Chrysler Building, 405 Lexington Ave). However, there is still high precision here because the points are all measured close to each other but low accuracy because they are far off from the true position of the device.

Scenario 2:

High accuracy means that the points you collected are closer to the true location of the device (seen on the third and fourth pictures).

For example, the points are measured close to the true position of the device (Chrysler Building, 405 Lexington Ave), but are not necessarily close to each other (some points recorded at Grand Central Terminal and others at The Westin New York Grand Central, 212 E 42nd St). The precision of these points would be lower compared to the first scenario, but the accuracy would be higher because The Westin and Grand Central Terminal lie closer to the Chrysler building which is the true location of the device.
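The two scenarios can be put into numbers with a small sketch. Here accuracy is summarized as the average distance of the measured points from the true location, and precision as the spread of the points around their own center. Both summaries, the function names, and the coordinates are our own choices for illustration, and distances use a flat-Earth approximation that is only reasonable over a few kilometers.

```python
import math
from statistics import mean

EARTH_RADIUS_M = 6_371_000  # spherical approximation


def _to_local_meters(lat, lon, ref_lat, ref_lon):
    """Project a point to x/y meters relative to a reference point (equirectangular)."""
    x = math.radians(lon - ref_lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_M
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def accuracy_and_precision(points, true_lat, true_lon):
    """Return (mean_error_m, spread_m) for a list of (lat, lon) measurements.

    mean_error_m: average distance from the true location (lower = more accurate).
    spread_m: RMS distance from the measurements' own center (lower = more precise).
    """
    xy = [_to_local_meters(lat, lon, true_lat, true_lon) for lat, lon in points]
    mean_error_m = mean(math.hypot(x, y) for x, y in xy)
    cx = mean(x for x, _ in xy)
    cy = mean(y for _, y in xy)
    spread_m = math.sqrt(mean((x - cx) ** 2 + (y - cy) ** 2 for x, y in xy))
    return mean_error_m, spread_m


true_location = (40.7516, -73.9755)  # approximate Chrysler Building
scenario_1 = [(40.7484, -73.9857), (40.7485, -73.9856), (40.7483, -73.9858)]
scenario_2 = [(40.7527, -73.9772), (40.7510, -73.9740), (40.7520, -73.9760)]
print(accuracy_and_precision(scenario_1, *true_location))
print(accuracy_and_precision(scenario_2, *true_location))
```

With these made-up points, the first scenario shows a small spread but a large error (high precision, low accuracy), and the second shows the reverse.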

7. Location Analytics Methods, Techniques, and Industry Applications of Location Data

PN Heatmap

Movement and Visitation Analysis

Identify trends in foot traffic to determine popular places of interest or commonly travelled routes. One common method of visualising foot traffic is through heatmaps.

Using the insights gained, businesses could analyse the potential traffic and profitability of retail or advertising locations and estimate the peak days or times.

Businesses could also perform movement traffic analysis to uncover audience behaviour and visitation patterns. These findings provide more depth when segmenting audiences and customers.

Origin destination

Origin-Destination Study (O-D Study)

An Origin-Destination Study is used to understand the travel patterns of people. O-D studies are commonly used for transportation planning; however, their usefulness reaches beyond that.

Traditionally, O-D studies were performed manually through roadside surveys. The growth of GPS and tracking technology makes O-D studies less time-consuming and delivers much more accurate results.

Week-on-Week Interest Tracking

One of the most powerful aspects of location data is its ability to derive insights on interest and intention. As users visit certain locations, assumptions can be made on their behaviours to derive likely insights, interests, intent and more.

To illustrate this, let us look at the following example:

interest_tracking_cars

Two devices are seen visiting car showrooms in the afternoon. We see both devices spending time in each location, visiting Peugeot (mid-market brand) and BMW and Mercedes-Benz (luxury brands). We also see that based on their timestamps, the individuals have spent more time visiting the luxury brands, approximately 1 hour at BMW and Mercedes, versus 5 minutes at Peugeot.

Based on this analysis, assumptions can be derived that these individuals are in the market for cars, and likely to be interested in luxury cars. This is a strong indication of interest and intent that can be used by brands for their insights (e.g. understanding dwell time) and marketers (e.g. to build segments), and many more.
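Dwell time in an example like this can be estimated from the timestamps alone. The sketch below assumes the pings of one device have already been matched to a place (for instance with a geo-fence), that each place is visited once, and that dwell time is simply the last ping minus the first ping at that place. The function name and the sample values are hypothetical.

```python
def dwell_times(pings):
    """pings: iterable of (place, epoch_seconds) for ONE device.

    Returns {place: seconds between the first and last ping seen at that place}.
    """
    first_last = {}
    for place, ts in pings:
        if place not in first_last:
            first_last[place] = [ts, ts]
        else:
            first_last[place][0] = min(first_last[place][0], ts)
            first_last[place][1] = max(first_last[place][1], ts)
    return {place: last - first for place, (first, last) in first_last.items()}


# Hypothetical pings: about 5 minutes at one showroom, about 1 hour at another
pings = [("Peugeot", 0), ("Peugeot", 300), ("BMW", 1000), ("BMW", 2800), ("BMW", 4600)]
for place, seconds in dwell_times(pings).items():
    print(place, round(seconds / 60), "minutes")
```

A place with a single ping gets a dwell time of zero under this definition, so sparse data tends to understate how long a device actually stayed.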

As COVID-19 starts to subside and we move towards a new normal, we will see changes in people’s behaviours. Therefore, it will be important to quickly analyse and understand new habits, movement patterns, interests, and behaviours that might impact the way businesses reopen and continue to operate.

interest_tracking_ikea

By looking at footfall patterns for retail locations (in the above case, IKEA Copenhagen and IKEA Stockholm), location data can be used to assess the visitation index or commercial activity of particular store locations as compared to previous timeframes. As location data is variable (i.e. it continuously changes), it is important to baseline or normalize and smooth the data for the analysis. Techniques typically used by statisticians for normalizing and smoothing include:

Normalizing data
• By errors (standard score)
• Means
• Medians
• Standard deviations (student’s t)
• Scale (min-max feature scaling, standardized moment, coefficient of variation)

Smoothing data
• Random walk
• Moving average
• Simple / linear / seasonal exponential
• Savitzky–Golay filter

savitzky_golay_smoothing

Example of Smoothing (note, S-G stands for Savitzky–Golay filter)
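A few of the techniques listed above are simple enough to sketch with the standard library. The functions below show min-max feature scaling, the standard score, a moving average, and simple exponential smoothing. The function names and the sample visit counts are hypothetical, and a Savitzky–Golay filter is not included because it is normally taken from a numerical library.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values to the range 0..1 (min-max feature scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standard_score(values):
    """Express each value as a number of standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("standard score is undefined when all values are equal")
    return [(v - mu) / sigma for v in values]


def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def exponential_smoothing(values, alpha):
    """Simple exponential smoothing; alpha near 1 follows the data more closely."""
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1 - alpha) * smoothed[-1])
    return smoothed


daily_visits = [120, 135, 90, 160, 150, 80, 170]  # hypothetical footfall counts
print(min_max_scale(daily_visits))
print(moving_average(daily_visits, window=3))
print(exponential_smoothing(daily_visits, alpha=0.5))
```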

Aside from looking at specific retail spaces, when tracking economic progress it can also be interesting to run a similar analysis on points of interest such as transportation hubs, industrial complexes, manufacturing parks, road networks, central business districts, etc. These activities can in turn be used to predict and understand economic activity, forecast demand, and understand how reopening strategies are having an impact on different economic sectors.

Cluster Tracing

Cluster analysis, or clustering, refers to grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). In all cases, clusters need to be grouped by an attribute or set of attributes.

In the context of location data, cluster analysis is used to understand the movement patterns of certain groups of people with similar traits (e.g. work location, time spent in certain areas, etc).

Recently, we have seen some interesting uses of clustering, e.g.:
• Understanding the potential spread of a disease from a cluster
• Understanding the transportation used by people living in the same neighborhood
• Understanding the home locations of people working in central business districts (CBDs)
• Understanding the behavior of people who went to a concert
• Finding people who are most likely to be families or co-workers

Some methods of conducting a cluster analysis on location data include:

Method A: ID-based Analysis

When analyzing a specific event or location, e.g. home locations of people working in CBDs, you can geofence the event location and find the unique devices within the area. Using these device IDs and historical data (or also future data), you can determine which locations these devices came from by matching them to specific areas, cities, neighborhoods, etc.

cluster_analysis1

In the above example, by geofencing the CBD area of Sydney, you first identify all unique device IDs seen in the CBD area, which are therefore considered to be working in the area. For this, time is important, and you might want to look at devices seen in the area at least 2 times during weekdays between 9am and 5pm. Based on this, you can then search for those device IDs in the other areas (areas 1 to 5) to determine the number of people working in the CBD by area.
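The ID-based steps above can be sketched as follows. This assumes the pings have already been filtered to the CBD geo-fence, that timestamps are already expressed in local time (treated as UTC here for simplicity), and that "at least 2 times" means at least two different weekdays. The function names and the home-area lookup are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def likely_cbd_workers(cbd_pings, min_days=2):
    """cbd_pings: iterable of (device_id, epoch_seconds) seen inside the CBD geo-fence.

    Returns the set of device IDs seen on at least `min_days` different weekdays
    between 09:00 and 17:00.
    """
    days_seen = defaultdict(set)
    for device_id, epoch_seconds in cbd_pings:
        ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
        if ts.weekday() < 5 and 9 <= ts.hour < 17:  # Monday to Friday, 9am to 5pm
            days_seen[device_id].add(ts.date())
    return {device for device, days in days_seen.items() if len(days) >= min_days}


def workers_by_home_area(workers, home_area_by_device):
    """Count likely CBD workers per home area, given a device-to-area lookup."""
    return Counter(home_area_by_device[d] for d in workers if d in home_area_by_device)


def _ts(day, hour):
    # Helper for the example: March 2021, where 1 March is a Monday
    return datetime(2021, 3, day, hour, tzinfo=timezone.utc).timestamp()


pings = [("a", _ts(1, 10)), ("a", _ts(2, 11)), ("b", _ts(1, 10)), ("c", _ts(6, 10))]
workers = likely_cbd_workers(pings)
print(workers_by_home_area(workers, {"a": "area 1", "b": "area 2"}))
```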

Method B: Density Algorithms

Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm. It is a density-based clustering non-parametric algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away).

cluster_analysis2

The methodology is particularly useful for geospatial analysis because it allows you to set up criteria for a cluster (e.g. minimum of 4 people in a 200-meter radius). Unlike other methodologies of clustering, DBSCAN does not require you to specify the number of clusters in the data a priori. DBSCAN can find arbitrarily shaped clusters. It can even find a cluster surrounded by (but not connected to) a different cluster.
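To show how the two criteria drive the result, here is a deliberately naive DBSCAN sketch using only the standard library. It assumes the points are already projected to x/y meters, compares every point with every other point (so it is only suitable for small samples), and uses the example criteria of 4 points within 200 meters. In practice a library or spatial database implementation would normally be used instead.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each (x, y) point with a cluster number, or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i, including point i itself
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not dense itself
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:  # core point: keep expanding
                queue.extend(j_neighbors)
    return labels


points = [(0, 0), (50, 0), (0, 50), (50, 50),
          (1000, 1000), (1050, 1000), (1000, 1050), (1050, 1050),
          (5000, 5000)]
print(dbscan(points, eps=200, min_pts=4))  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

Note that the number of clusters is never passed in: it falls out of the density criteria, and the isolated point is reported as noise.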

cluster_analysis3

This algorithm has been useful in the fight against COVID-19, with public health officials and epidemiologists using these types of algorithms to identify new or unknown clusters. In the above example, by specifically identifying the areas of clusters A and B, policy makers and medical professionals can make better informed decisions on how to optimize the allocation of medical resources across cities and countries.

One quick point of caution of DBSCAN is that it cannot cluster data sets well with large differences in densities.

For this, we need to look to the OPTICS algorithm. Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data. It is very similar to DBSCAN but allows you to detect clusters in data of varying density. This methodology is very powerful because it uses tree-diagram techniques (a reachability plot) that look for connections between different device IDs.

cluster_analysis4

The image above illustrates OPTICS. In its upper left area, the example data set is shown. The upper right part visualizes the spanning tree produced by OPTICS, and the lower part shows the reachability plot as computed by OPTICS. The yellow points in this image are considered noise and are not assigned to clusters.

Mobility Analysis

Mobility analysis is a term often used in the planning or development of operations. Some good examples are smart city planning, trip management, and transportation planning. Entities like governments need to do mobility analysis to make smart decisions about city and urban redevelopment projects, and consultancies and retail analytics firms perform mobility analysis for market research and capture.

Previously, when location data was scarce, organisations had to rely on approximate data points (e.g. sample surveys and door counting) to make their decisions. But with location data, data capturing has never been faster, cheaper, and more reliable!

The two main use cases where mobile location data can be used for mobility analysis are transportation planning and urban planning.

Transportation Planning

With Covid-19 and the government-imposed lockdown measures, all the major retail shops were closed, and people’s mobility was impacted. Therefore, we saw an increase in the use of historical data and baselining against the previous year’s data.

Today, as we enter the Covid-19 recovery phase, people are starting to move outside (albeit less than before) and governments can use location data to understand new movement patterns. In this new normal, mobility analysis can be used for resource optimization or better planning of transportation services. For example, bus times and pick-up locations can be optimized based on new behaviours or to spread out congested lines.

mobility_analysis_singapore

Sample Origin-Destination visual in Singapore to understand users’ mobility

Sample Methodology:

1. Consider all bus stations as POIs and calculate the density of people waiting for the bus service last year and in the current year. This helps to identify where the number of bus services can be reduced or increased from place to place.

2. Transform the location points spotted at the bus station to origin destination pairs. For each device id, get the first bus station location and last bus station location and the time in-between.

3. Forecast route planning via the four-step travel model:

     a.) Step 1 – Trip Generation
     b.) Step 2 – Trip Distribution
     c.) Step 3 – Mode Choice
     d.) Step 4 – Route Assignment

utps
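Step 2 of the sample methodology can be sketched in a few lines of Python. This is only an illustration under assumptions: the input is taken to be a list of (device_id, station_id, unix_timestamp) tuples that have already been matched to bus stations, and the function names od_pairs and od_matrix are hypothetical.

```python
from collections import Counter, defaultdict

def od_pairs(pings):
    """Build origin-destination pairs from pings observed at bus stations.

    pings: iterable of (device_id, station_id, unix_ts) tuples (assumed format).
    Returns (device_id, origin, destination, travel_seconds) for every device
    whose first and last stations differ.
    """
    by_device = defaultdict(list)
    for device_id, station_id, ts in pings:
        by_device[device_id].append((ts, station_id))

    pairs = []
    for device_id, seen in by_device.items():
        seen.sort()  # chronological order
        (t_first, origin), (t_last, destination) = seen[0], seen[-1]
        if origin != destination:
            pairs.append((device_id, origin, destination, t_last - t_first))
    return pairs

def od_matrix(pairs):
    """Count how many devices travelled between each (origin, destination)."""
    return Counter((origin, dest) for _, origin, dest, _ in pairs)

# Made-up sample data: three devices, three stations.
sample = [
    ("dev1", "A", 100), ("dev1", "B", 400), ("dev1", "C", 900),
    ("dev2", "A", 120), ("dev2", "C", 1000),
    ("dev3", "B", 50),  # a single sighting gives no O-D pair
]
pairs = od_pairs(sample)
matrix = od_matrix(pairs)
```

The resulting counts per origin-destination pair could then serve as one input to the trip distribution step of the four-step model.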

Urban Planning

With the economy reopening after the pandemic, people look forward to spending time outdoors and meeting family members and friends. An increase in footfall at retail stores and food and beverage locations can therefore be expected. However, with more people interacting, the risk of community transmission potentially increases, and if it worsens, a second wave of coronavirus cases may not be far off.

With location data, governments can perform mobility analysis and understand which dense places might have higher risks of coronavirus transmission and add restrictions to prevent places from being overcrowded.

Methodology to perform cluster analysis and find places that are overcrowded and bear potential risk of coronavirus transmission:

a. Perform DBSCAN clustering to find high-density clusters. More on DBSCAN clustering can be found here.
b. Perform clustering via OPTICS.
c. A sample DBSCAN cluster with PostGIS:

dbscan_cluster
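For readers without a PostGIS setup, the idea behind DBSCAN can also be sketched in plain Python. This is a minimal, unoptimised illustration (not the PostGIS query shown above): it assumes points are (latitude, longitude) pairs in degrees, and the eps_m and min_pts values in the example are arbitrary choices.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(a))

def dbscan(points, eps_m, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Brute-force search; includes the point itself.
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point joins the cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels

# Made-up points: four close together and one far away.
points = [
    (1.3040, 103.8318), (1.3041, 103.8319),
    (1.3039, 103.8317), (1.3040, 103.8320),
    (1.3500, 103.9000),
]
labels = dbscan(points, eps_m=50, min_pts=3)
```

The size of each cluster (the number of points sharing a label) can then be used to flag places that may be overcrowded. For real datasets, a spatial index or a library implementation would be needed, since the brute-force neighbour search here is quadratic.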

Building Audiences, Understanding Movement and Attribution

Building Digital Audiences for Targeting

One of the ways in which mobile location data is used most frequently is for advertising and building targetable audiences.

Traditionally, advertisers targeted people based only on demographics or geography. For example, Honda Motors would target males, aged 30-35, living in urban California.

The problem with the traditional approach is that advertisers do not know whether the people they target show any affinity to their products or services. With location data, people's affinity towards a product can be inferred from their movement patterns, and they can be tagged with the matching behaviours.

Advertisers can use this information to perform more personalised targeting and, in turn, potentially increase their advertising ROI.

audiences1

Audiences have different categories like Automotive buyers, Restaurant dining, Shoppers, Travelers etc.

Suppose a Device ID is seen visiting a Honda showroom on July 1st. This ID can be categorized as an automotive buyer and, more specifically, as a mid-market automotive buyer. If the same Device ID is spotted in a beauty parlour a few days later, an additional behaviour (beauty) can be tagged to the existing Device ID. By adding more and more behaviours to the Device IDs, advertisers can build a more rounded picture of their audiences and provide more personalized content.

Audiences can be built by geofencing the appropriate POIs. For example, to build a mid-market automobile audience, you could geofence Honda showrooms and those of competitors (such as Toyota and Hyundai) and collect the Device IDs spotted inside them. It follows that a POI database is necessary to build location-based audiences.
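As a rough sketch of this geofencing step, the Python snippet below collects the Device IDs seen within a fixed radius of any POI. It rests on several assumptions: circular geofences, a 50 m radius, made-up coordinates, and a hypothetical build_audience helper. Real POI databases often provide polygon footprints instead.

```python
import math

def distance_m(lat1, lon1, lat2, lon2):
    """Haversine distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371000 * math.asin(math.sqrt(a))

def build_audience(pings, pois, radius_m):
    """Return the set of device IDs seen within radius_m of any POI.

    pings: iterable of (device_id, lat, lon); pois: iterable of (name, lat, lon).
    """
    audience = set()
    for device_id, lat, lon in pings:
        if any(distance_m(lat, lon, plat, plon) <= radius_m
               for _, plat, plon in pois):
            audience.add(device_id)
    return audience

# Illustrative, made-up coordinates (not real showroom locations).
showrooms = [
    ("honda_showroom", 34.0500, -118.2500),
    ("toyota_showroom", 34.0600, -118.2600),
]
pings = [
    ("dev1", 34.0501, -118.2501),  # a few metres from the first showroom
    ("dev2", 34.0700, -118.3000),  # far from both
    ("dev3", 34.0600, -118.2603),  # close to the second showroom
]
mid_market_auto = build_audience(pings, showrooms, radius_m=50)
```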

Once you have mastered the basics, you can refine your audiences further by adding additional algorithms. One such algorithm removes potential workers in an automobile shop so that we target potential buyers and not workers. To do this, identify all Device IDs within these stores (POIs) that are consistently present over time. For example, if a Device ID is seen at the Honda dealership on Monday, Tuesday, Thursday and Friday a week later, you might assume that it belongs to an employee at the dealership and exclude it from your audience.
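One simple way to express this filter is to count the distinct days on which each Device ID appears at a POI. The sketch below assumes visits have already been matched to a single POI, and the threshold of four distinct days is an arbitrary choice for illustration rather than a recommended value.

```python
from collections import defaultdict
from datetime import date

def likely_employees(visits, min_days):
    """Return device IDs seen at the POI on at least min_days distinct days.

    visits: iterable of (device_id, visit_date) for one POI (assumed format).
    """
    days_seen = defaultdict(set)
    for device_id, visit_date in visits:
        days_seen[device_id].add(visit_date)
    return {d for d, days in days_seen.items() if len(days) >= min_days}

# Made-up visits to one dealership.
visits = [
    ("dev1", date(2020, 7, 6)), ("dev1", date(2020, 7, 7)),
    ("dev1", date(2020, 7, 9)), ("dev1", date(2020, 7, 10)),
    ("dev2", date(2020, 7, 7)),
    ("dev3", date(2020, 7, 6)), ("dev3", date(2020, 7, 6)),  # two pings, one day
]
employees = likely_employees(visits, min_days=4)
audience = {device_id for device_id, _ in visits} - employees
```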

Another important consideration is that audiences have a lifecycle, which differs by audience type. For instance, a person planning to buy an automobile usually takes 3-4 months to reach a decision. Hence the automobile buyer behaviour has a lifecycle of 3-4 months, after which it would be removed from the Device ID. By contrast, if a person uses a Samsung phone, the Samsung user behaviour could be valid for 1-2 years, as people usually buy phones via contracts.
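The lifecycle idea can be sketched as a simple expiry check. The behaviour names and the exact day counts below are assumptions chosen to fall within the ranges mentioned above (about 4 months and about 2 years).

```python
from datetime import date, timedelta

# Illustrative lifecycles in days (assumed values).
LIFECYCLE_DAYS = {"automotive_buyer": 120, "samsung_user": 730}

def active_behaviours(tags, today):
    """Return the behaviours that are still within their lifecycle.

    tags: list of (behaviour, date_tagged) attached to one device ID.
    """
    return sorted(
        behaviour for behaviour, tagged in tags
        if today - tagged <= timedelta(days=LIFECYCLE_DAYS[behaviour])
    )

tags = [("automotive_buyer", date(2020, 7, 1)), ("samsung_user", date(2020, 7, 1))]
after_two_months = active_behaviours(tags, date(2020, 9, 1))
after_six_months = active_behaviours(tags, date(2021, 1, 1))
```

After two months both behaviours are still attached to the device, while after six months only the longer-lived one remains.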

Understanding movement and attribution

Aside from creating audiences, location data has high potential when it comes to OOH advertising.

OOH (Out-of-Home) advertising, as the name suggests, is simply advertising that reaches people when they are outside their home (e.g. billboards, in-car ads, bus stop boards, etc.).

Just like digital advertising above, the success of OOH advertising is evaluated by understanding how successful an ad was at driving a viewer to purchase (either online or in store). This is often referred to as ROI measurement, attribution or drive-to-store. Location data is extremely useful for OOH advertisers who want to attribute viewers of their assets back to either physical locations such as stores or directly back to online purchases by using the Device ID. For example, if 20 out of 100 people who saw the Apple billboard went to the Apple Store nearby, then the attribution rate to the store is 20%.

Below are two common ways in which OOH advertisers use location data:

1. Evaluate attribution of an OOH campaign:

Say, for example, a fashion apparel store (H&M) launched an OOH campaign on the same road where it is located and wants to know how many people visited the store after seeing the advertisement.

• Filter the location data during the campaign time period
• Filter data points around the advertisement location and the store location
• Classify data points in advertisement location as set A and data points in store location as set B
• Let total_devices = footfall from set A
• Let attributed_devices = distinct devices in set A that are in set B with time stamp of the event being greater in set B than set A
• Divide attributed_devices/total_devices: this would give you the attribution percentage

audiences2
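The attribution steps listed above can be sketched as follows. The snippet assumes that set A and set B are lists of (device_id, unix_timestamp) pairs already filtered to the campaign period and to the advertisement and store geofences. The function name and sample values are hypothetical.

```python
def attribution_rate(ad_pings, store_pings):
    """Share of devices seen at the advertisement that later appear at the store.

    ad_pings (set A) and store_pings (set B): iterables of (device_id, unix_ts).
    """
    # Earliest time each device was seen near the advertisement.
    first_exposure = {}
    for device_id, ts in ad_pings:
        first_exposure[device_id] = min(ts, first_exposure.get(device_id, ts))

    total_devices = len(first_exposure)
    attributed_devices = {
        device_id for device_id, ts in store_pings
        if device_id in first_exposure and ts > first_exposure[device_id]
    }
    return len(attributed_devices) / total_devices if total_devices else 0.0

# Made-up sample: four devices saw the ad, one of them visited the store afterwards.
set_a = [("dev1", 100), ("dev2", 150), ("dev3", 200), ("dev4", 250), ("dev1", 300)]
set_b = [("dev1", 500), ("dev3", 50), ("dev9", 600)]
rate = attribution_rate(set_a, set_b)
```

Here dev3 is not attributed because its store visit happened before it was seen at the advertisement, and dev9 is ignored because it was never seen at the advertisement. Multiply the rate by 100 to express it as a percentage.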

2. Find the right spot for OOH advertising to influence people's buying decisions:

Say, for instance, McDonald's wants to increase morning visits to a specific store in the downtown area and is planning to launch a campaign. It wants to find the right location for an OOH campaign. In this case:

• Take a sample of data (1 day or 1 week)
• Find devices spotted in the target McDonald's and the surrounding area in the morning over 1 week.
• Using the device IDs, find the mode of transportation and the routes they take to reach the store. Find the most common route.
• Find the optimal billboard locations along the most common route. These would be the right spots to increase OOH attribution.
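The last two steps can be approximated by ranking road segments by the number of distinct devices that pass through them on the way to the store. The sketch below assumes each device's pings have already been matched to an ordered list of road segment IDs (that map-matching step is not shown), and the segment names are invented.

```python
from collections import Counter

def rank_segments(trips):
    """Rank road segments by how many distinct devices pass through them.

    trips: dict mapping device_id to the ordered list of segment IDs travelled
    before reaching the store (assumed to come from an upstream map-matching step).
    """
    counts = Counter()
    for route in trips.values():
        counts.update(set(route))  # count each device once per segment
    return counts.most_common()

# Made-up morning trips to the target store.
trips = {
    "dev1": ["s1", "s2", "s5"],
    "dev2": ["s3", "s2", "s5"],
    "dev3": ["s4", "s5"],
}
ranked = rank_segments(trips)
```

The segments at the top of the ranking are natural candidates for billboard placement, subject to where advertising inventory is actually available.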

Using Location Data in Kepler

Kepler is an open-source geospatial analysis tool with an easy drag-and-drop interface. Once you provide input data as CSV, GeoJSON, or a URL, the map lets you design plots and save the results as an image, HTML, map, etc.

Let us work through an example below!

Due to the coronavirus, many retail shops and restaurants have closed, and the visitation patterns of shopping malls have changed because of the circuit breaker imposed in Singapore.

To analyse these changes, a sample of location data is used to find patterns at the ION Orchard shopping mall, Singapore, before the circuit breaker (April 4, 2020) and after the circuit breaker was imposed (April 11, 2020).

kepler0_1

ION Orchard Shopping Mall, Singapore

The input sample dataset contains the fields device_id, latitude, longitude, day and timestamp in CSV format, as shown below:

kepler0_2

Steps to perform our analysis:

1.) Load data to Kepler (https://kepler.gl/demo) either by drag and drop or browsing files

kepler1

Loading data to Kepler

By default, Kepler identifies the latitude and longitude columns and creates a point layer, as shown below:

kepler2

2.) Converting plot types: plot types include point, arc, heatmap, grid and line (useful for taxi driving patterns). Many more plot types are available; see details here.

In this case, we convert the point layer to a heatmap:

kepler3

3.) The density patterns of 4th April and 11th April can be compared using timestamp filtering.

kepler4

Filter Data
     a) Select the Timestamp field to filter data
     b) The timeframe can be adjusted and played back in intervals

kepler5

4th April, ION Orchard, Before CB imposed in Singapore

kepler6

11th April, ION Orchard, After CB imposed in Singapore

Even with this simple data manipulation in Kepler, we can see evidence that the number of mall visitors declined after the circuit breaker was imposed in Singapore. To dig into an individual device, the data can also be filtered by device_id (in the same way as for timestamp above).
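The visual impression from the heatmaps can be cross-checked with a quick count of distinct devices per day. The sketch below assumes a CSV with the same columns as the sample dataset. The values and the date and timestamp formats are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the same column layout as the sample dataset.
SAMPLE_CSV = """device_id,latitude,longitude,day,timestamp
dev1,1.3040,103.8318,2020-04-04,2020-04-04 12:01:00
dev2,1.3041,103.8319,2020-04-04,2020-04-04 12:05:00
dev3,1.3039,103.8317,2020-04-04,2020-04-04 13:10:00
dev1,1.3040,103.8320,2020-04-04,2020-04-04 14:30:00
dev2,1.3040,103.8318,2020-04-11,2020-04-11 12:02:00
"""

def devices_per_day(csv_text):
    """Count distinct device IDs for each value of the 'day' column."""
    seen = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        seen[row["day"]].add(row["device_id"])
    return {day: len(ids) for day, ids in seen.items()}

per_day = devices_per_day(SAMPLE_CSV)
```

With a real export, you would read the file from disk instead of an in-memory string and compare the counts for 4th April and 11th April.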

Another interesting idea is to filter or sort by color.

4.) Create a new layer to color code devices and hide the previous heatmap. Fill in the latitude and longitude fields and select color based on ‘device_id’, as shown below.

kepler7

5.) Save visualizations in HTML format. As highlighted in the top left corner of the figure below, click Share -> Export Map -> Select HTML -> Allow users to edit the map -> Export.

kepler8

Other visualizations such as clustering, grid heatmaps and 3D views are also available, and the background map can be changed to Satellite, Light (with or without borders), etc. More information on Kepler is available on their website.

A Kepler API is also available and can be easily integrated with Python.


8. Location-Based Business and Marketing Solutions

icon_targeted_advertising

Targeted Advertising

By sending your advertisements to the right people at the right time, you can improve the effectiveness of your advertisements and increase your conversions. Marketers often use location-based audiences to better deliver their ads to the right people.

icon_marketing_communication

Marketing Communication

Context is key when it comes to delivering the right marketing communication to your audience. By knowing where they come from, you can craft contextually relevant marketing messages that resonate with your audience, resulting in a more effective marketing campaign.

icon_customer_analysis

Customer Analysis

Analysing your customers' movement and visitation patterns may uncover insights that you could leverage to attract and retain customers.

icon_site_selection

Site Planning & Selection

Use location data to plan and select the best site to open your next store or to place your next outdoor advertisement.

icon_business_forecasting

Business Forecasting

Uncover correlations between footfall levels, inventory levels and business performance. Useful for demand and supply planning, or for business and economic research.

icon_attribution_analysis

Attribution Analysis

Measure the performance of your offline advertisements. Often used in Out-of-Home (OOH) advertising or Digital-Out-of-Home advertising (DOOH).


Mobile Location Data Feeds

icon_hydra

Hydra

Quadrant's entire data feed.

Great for organisations who want to conduct all data analysis in-house.

icon_custom

Custom Feed

Mobile location data processed and filtered based on your specified criteria.

Deploy the data almost immediately with minimal processing.

Quadrant's publicly available Data Quality Dashboard contains a suite of metrics, allowing you to conduct a high level evaluation of our data before contacting us.
