Do you want to start using location analytics and intelligence to improve your business's profitability but don't know where to start?
Read on to learn all you need to know about location data and geospatial intelligence.
1. Introduction
2. Basics of Location Data
3. Location Data Sources and Types of Location Data
How Do I Get Location Data?
Is It Legal to Use Mobile Location Data?
4. Representing Location Data
Latitude / Longitude
Geohash
Types of Geo-Indexing Systems
5. Location Data Attributes and Data Fields
6. Horizontal Accuracy
7. Location Analytics Methods, Techniques, and Industry Applications of Location Data
Movement and Visitation Analysis
Origin-Destination Study (O-D Study)
Week-on-Week Interest Tracking
Cluster Tracing
Mobility Analysis (Transportation and Urban Planning)
Building Audiences, Understanding Movement and Attribution
Using Location Data in Kepler
8. Location-Based Business and Marketing Solutions
Welcome to our Quadrant Location Data Knowledge Base! Through these chapters, we will learn about what location data is, why it is useful, and the mechanics behind its collection and real-world usage.
You can also download our eBooks and case studies to learn how Quadrant is helping businesses solve a myriad of challenges with location data.
Follow us on LinkedIn to get updates on new chapters and topics that we are constantly adding, or send us your questions to marketing@quadrant.io!
The geographic positions in location data are called coordinates, and they are commonly expressed in latitude and longitude format.
Additional attributes such as elevation or altitude may also be included; these help data users get a more accurate picture of the geographic positions in their data.
People commonly mean GPS data when they talk about location data. In reality, there are various types of location data.
It is important to know how the data is collected, as the collection method determines the accuracy and depth of the data, which in turn has direct implications for its suitability and usability for a business.
GPS provides latitude-longitude coordinates gathered by hardware on a device that communicates with satellites, such as a car navigation system, a mobile phone, or a fitness tracker.
The latitude/longitude coordinates generated by the GPS are considered the standard for location data. Your device receives signals from the satellites and it can calculate where it is by measuring the time it takes for the signal to arrive.
This produces very accurate and precise data under the right conditions. The quality of a GPS signal degrades significantly indoors or in locations that obstruct the view of multiple GPS satellites.
GPS data can be pulled directly from its point of origin - mobile devices - through in-app SDKs or via a Server-to-Server (S2S) integration with app publishers.
Data collected from mobile apps has the potential to be very accurate and insightful. Users' movement patterns can be observed in aggregate to uncover deeper insights for businesses. However, the biggest challenge with this method is achieving scale.
These apps require a user's permission to collect location information, generally obtained through an opt-in interface when the user first interacts with the application. With newer restrictions on inter-app tracking, some apps only provide location data when they are open or running in the background, as chosen by the owner of the device.
Bidstream data is data collected from the ad servers when ads are served on mobile apps and websites. Bidstream data is easy to obtain and scale, but it is often incomplete, inaccurate, or even illegitimate.
Many ads do not collect the location but instead record the IP address the phone is connected to. This IP often does not reflect the actual location of the device, e.g., a person sitting in a Starbucks but connected to the university campus Wi-Fi will be recorded as being at the campus. Moreover, due to the speed at which ads need to be delivered, the cached location of the phone is very often recorded because the GPS does not have enough time to update itself.
Centroids, or a large accumulation of device IDs in central locations, are very common with Bidstream data, as many devices may connect to the same IP address and many apps automatically default to the central location of the country where the ad is served.
Beacons are hardware transmitters that can sense other devices when they come into close proximity.
The location data collected by beacons is very accurate. They can also collect details such as name and birthdays, which can be very valuable to businesses.
Since beacons are hardware, they have to be purchased and installed at locations businesses want to track. Therefore, as with SDK data, it can be challenging to achieve scale through this method.
Wi-Fi enables devices to emit probes to look for access points (routers).
These probes can be measured to calculate the distance between the device and the access point. The precision of Wi-Fi location data is entirely dependent on the Wi-Fi network it is built on.
Wi-Fi networks are great at providing accuracy and precision indoors. Devices can use this infrastructure for more accurate placement when GPS and cell towers are not available, or when these signals are obstructed.
POS data is data that stems from consumer transactions. This data usually contains adjacent information such as purchase items, amount spent, and method of payment, which can provide valuable information.
Because POS data is decentralized, it would be difficult to match multiple data sources through this method. POS data also only captures customers who have made an in-store purchase, and does not capture information on people who entered the store but did not buy anything.
POI is also a type of location data, but instead of identifying a device, it describes the physical location of a business, a landmark, or any other point of interest. POI data is typically used together with other types of location data to derive insights and better understand consumer traffic and behaviour.
There are advantages and disadvantages to each source of location data.
Businesses should consider various factors such as budget, accuracy requirements, and use cases when evaluating the source of their location data.
Businesses should also consider the ways different types of location data can complement each other.
Most businesses purchase location data or location data feeds from data providers, as they do not have the time, resources, and expertise to collect location data themselves.
However, businesses should be aware that the quality of data from each data provider will vary. Data providers that specialise in providing location data tend to have higher quality data, while the more general data vendors might not have the expertise to provide good quality data.
Due to the nature of the data, it is nearly impossible to verify whether a provider is selling authentic data. Businesses should assess the credibility of the data provider to avoid purchasing poor quality or even fraudulent data.
Quality location data is important as it correlates with the accuracy and reliability of the findings and insights. Bad data can result in false findings, which causes businesses to waste lots of time, effort, and money.
Unlike the use of data in the digital world (e.g., user data collected on social media), location data is free of context, i.e., it doesn’t record a person’s identity, demographics, or any other personally identifiable information. Businesses worldwide are using location data for the betterment of services, performing studies to improve lives, and solving numerous other challenges.
However, just like with any such information, consent conditions are applicable to the collection of location data. Data privacy laws like GDPR and CCPA empower users to take ownership of their information and govern how businesses are using it.
Under these consent conditions, data collectors must gain the consent of customers to use, store, manage, and share their data, while allowing them to modify or opt-out of their earlier preferences at any point in time.
Download our eBook to learn what consent management is, why it is important, and how you can establish compliance with the stringent consent requirements mandated by today's data privacy laws.
The "latitude" of a point on Earth's surface is the angle between the equatorial plane and the straight line that passes through that point and through (or close to) the center of the Earth. The 0° parallel of latitude is designated the Equator, the fundamental plane of all geographic coordinate systems.
The "longitude" of a point on Earth's surface is the angle east or west of a reference meridian to another meridian that passes through that point. Fun fact, you can actually step on the meridian if you ever visit British Royal Observatory in Greenwich, in southeast London (highly recommend it as the view of the city is amazing from there).
The combination of these two components specifies the position of any location on the surface of Earth. Lat/long data points can be expressed in decimal degrees (DD). The other convention for expressing lat/long is in degrees, minutes, seconds (DMS). For example, below is the same point expressed in DD and DMS (you can find many converters online):
DD: 47.21746, -1.5476425
DMS: 47° 13’ 2.856” N, 1° 32’ 51.513” W
You can see these DMS coordinates at airports, where the gates are marked in degrees, minutes, and seconds.
Another important thing to understand about decimal degrees is that they carry a level of precision, determined by the number of decimal places.
A value in decimal degrees to a precision of 4 decimal places is precise to 11.132 meters at the equator. A value in decimal degrees to 5 decimal places is precise to 1.1132 meters at the equator.
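To make the two notations and the precision rule concrete, here is a minimal Python sketch using the example coordinates above; the function name is illustrative only.

# Minimal sketch: decimal degrees (DD) to degrees-minutes-seconds (DMS),
# plus the approximate ground precision of each decimal place at the equator.

def dd_to_dms(dd: float) -> str:
    """Convert a decimal-degree value to a D° M' S" string (sign kept on degrees)."""
    sign = "-" if dd < 0 else ""
    dd = abs(dd)
    degrees = int(dd)
    minutes = int((dd - degrees) * 60)
    seconds = (dd - degrees - minutes / 60) * 3600
    return f'{sign}{degrees}° {minutes}\' {seconds:.3f}"'

print(dd_to_dms(47.21746))     # 47° 13' 2.856"
print(dd_to_dms(-1.5476425))   # -1° 32' 51.513"

# One degree of latitude is ~111.32 km, so each extra decimal place divides that by 10.
for places in range(1, 7):
    meters = 111_320 / (10 ** places)
    print(f"{places} decimal place(s) ~ {meters:g} m at the equator")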
Invented by Gustavo Niemeyer, Geohash is a geocoding system that allows the expression of a location anywhere in the world using an alphanumeric string. Geohash is a unique string derived by encoding and reducing the two-dimensional geographic coordinates (latitude and longitude) into a string of digits and letters. A Geohash can be as vague or accurate as needed depending on the length of the string.
Geohashes use Base-32 alphabet encoding, i.e., all digits 0-9 and almost all lower-case letters except "a", "i", "l" and "o". It is a convenient way to express a location anywhere in the world. Geohashes basically divide the world into a grid with 32 cells. Each cell also contains 32 cells, and each one of these contains 32 cells (and so on repeatedly).
Adding characters to the geohash sub-divides a cell, effectively zooming in to a more detailed area. This is referred to as geohash precision. Geohash Precision is a number between 1 and 12 that specifies the precision (i.e., number of characters) of the geohash. Each additional character of the geohash adds precision to your location.
At Quadrant, we usually provide a precision-12 geohash for all events.
The cell sizes of geohashes of different lengths are as follows; note that the cell width reduces moving away from the equator (to 0 at the poles):
Visually:
Geohashes have a property that makes them suitable for geospatial queries like localized search: points that are near each other share the same geohash prefixes.
For example, if you want to list the number of persons who were seen in and around the Empire State Building, you can first determine the geohashes you want to cover and then run a simple query:
SELECT * FROM table_name WHERE geohash like 'dr5ru6%' or geohash like 'dr5ru3%' or geohash like 'dr5rud%' or geohash like 'dr5ru9%';
Doing this improves processing times and costs, as it allows you to quickly sort through large amounts of data and work on more precise subsets of data. In fact, most data scientists use geohashes to quickly sort through large location data sets, and then build specific queries (such as polygons) around the specific point/area of interest. In doing so, you can reduce your costs and increase your speed of processing, while maintaining accuracy and precision.
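As a concrete illustration of the prefix property, here is a minimal Python sketch, assuming the third-party pygeohash package; the coordinates are approximate and for illustration only.

# Minimal sketch of geohash encoding and the prefix property, assuming the
# third-party `pygeohash` package (pip install pygeohash).
import pygeohash as pgh

# Approximate coordinates of the Empire State Building (illustrative only).
lat, lon = 40.748440, -73.985664

full = pgh.encode(lat, lon, precision=12)   # precision-12 geohash string
prefix = full[:6]                           # coarser cell containing the point

print(full, prefix)
# Nearby points share the same leading characters, which is why prefix queries
# such as WHERE geohash LIKE 'dr5ru6%' above work for localized search.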
Geodata is information about geographic locations that is stored in a format that can be used with a geographic information system (GIS). For example, at Quadrant, our geo data is stored in three different formats which can be used for geospatial analysis: Country Codes, Latitude & Longitude coordinates, and Geohashes.
Usually, the two-letter ISO alpha-2 country code represents the locale of the devices, i.e. the devices registered to users from the stipulated countries. At Quadrant, in addition to the country code, we also derive another attribute called 'country', which represents the events/devices that are seen within the geographical boundaries of the stipulated countries. For example, if you want to get the total number of events seen within Singapore by using its country code, you can run a simple query: SELECT count(*) FROM table_name WHERE country = 'SG';
Coordinates can be used to identify where an event was recorded. We can use the coordinates to either list devices from a single location: SELECT * FROM table_name WHERE latitude = '41.9022' and longitude = '-76.37695';
Or we can use a bounding box, which is an area defined by two longitudes and two latitudes, to get information from a certain area or a country.
Bounding box for Australia:
To get the total number of events seen within Australia by using a bounding box, you can run a simple query:
SELECT count(*) FROM table_name WHERE (latitude BETWEEN -43.96119063892024 and -10.660607953624762 and longitude BETWEEN 112.5 and 154.51171875);
A geo-fence is a virtual perimeter for a real-world geographic area. It could be a radius around a single point or a predefined set of boundaries. Once a geo-fenced boundary is defined, what businesses can do with it is limited only by their creativity.
One common use of geo-fencing is for businesses to set up geo-fences around their competitors and push marketing promotions to customers who enter the zone; this is sometimes referred to as geo-conquest. Businesses can also provide location-based services within a geo-fenced region.
Geofencing is ideal for catchment area analysis; a catchment is an area from which a business expects to draw its customers. Catchment areas can help businesses identify where to run their next marketing campaign or set up their next store.
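For a simple circular geofence (a radius around a single point), a minimal Python sketch might look like the following; the store coordinates and the 200-meter radius are hypothetical.

# Minimal sketch: check whether a device ping falls inside a circular geofence.
# The store coordinates and 200 m radius below are hypothetical.
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_M = 6_371_000

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/long points, in meters."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

store = (1.3039, 103.8318)          # hypothetical store location
ping = (1.3046, 103.8325)           # a device event

inside = haversine_m(*store, *ping) <= 200   # 200 m geofence radius
print(inside)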
Location data generally has some attributes or data fields in common, such as latitude, longitude, and horizontal accuracy. Other data fields tend to depend on the source of the data.
Below is a non-exhaustive list of attributes found in location data:
Latitude and longitude show the position of a device or structure. They are commonly accompanied by horizontal accuracy, which tells users the degree of error in a particular data point.
Altitude or elevation pinpoints the height above a reference point, usually sea level.
Timestamps are typically used for logging events alone or in a sequence.
In the case of location data, they provide context to the movement of a particular device.
Location data feeds commonly record timestamps in Unix time, otherwise known as Unix Epoch time, or Epoch for short.
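For example, here is a minimal Python sketch for converting an epoch timestamp into a readable UTC datetime; the timestamp value is arbitrary.

# Minimal sketch: convert a Unix epoch timestamp (seconds elapsed since
# 1970-01-01 00:00:00 UTC) into a human-readable datetime. The value is arbitrary;
# if a feed records milliseconds instead of seconds, divide by 1000 first.
from datetime import datetime, timezone

epoch_seconds = 1586563200
dt = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
print(dt.isoformat())   # 2020-04-11T00:00:00+00:00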
An Internet Protocol address, commonly known as an IP address, is a numerical label assigned to each device connected to a computer network.
IP addresses can be used for location; however, accuracy can be problematic. One common occurrence when looking up an IP address's location is being directed to the network provider's location.
Depending on the business use case, IP addresses may provide a good rough measure of geographic location.
The Mobile Ad ID or Device ID is a unique 36-character identifier assigned to a smartphone. Mobile Ad IDs help to identify, track, and differentiate between mobile devices.
The Mobile Ad ID/Device ID is widely used across the marketing ecosystem, especially for targeting devices with ads through demand-side platforms (DSPs) and supply-side platforms (SSPs) in the app advertising supply chain.
For iOS devices the unique device ID is called Identifier For Advertising (IDFA). Apple customers can receive their mobile device ID numbers through iTunes or the Apple App Store. Before the rollout of iOS 6, this device ID was called a Unique Device Identifier (UDID) in the Apple ecosystem.
For Android devices the device ID is called Google Advertising ID (GAID). These mobile device IDs are randomly generated during the first activation of a mobile device and remain constant for the lifetime of the device unless reset by the user. Typically the IDs belonging to iOS devices will be in upper-case and those belonging to Android devices will be in lower-case.
As they are a unique identifier, they can be used to calculate aggregated metrics such as Daily Active Users (DAU), Monthly Active Users (MAU), etc. To learn more about the different queries you can run using the device ID please visit our Resources library.
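As an illustration, here is a minimal pandas sketch for computing DAU and MAU from an events feed; the file and column names (device_id, timestamp) are assumptions about the feed layout.

# Minimal sketch: daily and monthly active users from an events feed, assuming
# columns named `device_id` and `timestamp` (Unix seconds). File name is illustrative.
import pandas as pd

events = pd.read_csv("events.csv")
events["ts"] = pd.to_datetime(events["timestamp"], unit="s")

dau = events.groupby(events["ts"].dt.date)["device_id"].nunique()            # unique devices per day
mau = events.groupby(events["ts"].dt.to_period("M"))["device_id"].nunique()  # unique devices per month
print(dau.head())
print(mau.head())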
GPS satellites broadcast their signals in space with a certain accuracy. However, this accuracy is not directly controllable by the app developer, at least not through any software mechanism. The accuracy received depends on additional factors such as satellite geometry, signal blockage, atmospheric conditions, and receiver design features/quality.
Horizontal Accuracy is a radius around a 2D point, implying that the true location is somewhere within the circle formed by the given point as a center and the accuracy as a radius. As shown in the example below, the exact location of the device will be anywhere within the orange shaded circle:
Different use cases require different levels of accuracy readings. For example, a relatively weak level of horizontal accuracy would be acceptable in city- or country-level analyses, as opposed to a more precise segmentation of users that visited stores within a retail park.
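In practice, this often means filtering on the horizontal accuracy field before analysis; here is a minimal pandas sketch, assuming a horizontal_accuracy column expressed in meters (the file name and thresholds are illustrative).

# Minimal sketch: filter events on horizontal accuracy before analysis, assuming a
# `horizontal_accuracy` column expressed in meters. The thresholds are illustrative.
import pandas as pd

events = pd.read_csv("events.csv")
store_level = events[events["horizontal_accuracy"] <= 20]    # tight enough for store-level visits
city_level = events[events["horizontal_accuracy"] <= 500]    # looser radius is fine city-wide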
It is important to understand the difference between the horizontal accuracy of a point and the precision of a point, which, as discussed in an earlier chapter, is represented by the number of decimal places in the latitude and longitude coordinates. Simply put, the precision of a point is a measure of how exact the pinpoint location of a data point is on a map. The horizontal accuracy is a measure of how close the data point is to the actual (ground truth) point.
Low accuracy / Low precision
Low accuracy / High precision
High accuracy / Low precision
High accuracy / High precision
In summary, if you were to conduct an experiment where you measure the location of a device over time, these two scenarios would describe the difference:
Scenario 1:
High precision means you have low standard deviation from the mean of the distribution (seen on the second and fourth pictures).
For example, the points are measured a small distance apart from each other (all around the Empire State Building, 34th St). The true location of the device may be relatively far away from these points (Chrysler Building, 405 Lexington Ave). However, there is still high precision here because the points are all measured close to each other but low accuracy because they are far off from the true position of the device.
Scenario 2:
High accuracy means that the points you collected are closer to the true location of the device (seen on the third and fourth pictures).
For example, the points are measured close to the true position of the device (Chrysler Building, 405 Lexington Ave), but are not necessarily close to each other (some points recorded at Grand Central Terminal and others at The Westin New York Grand Central, 212 E 42nd St). The precision of these points would be lower compared to the first scenario, but the accuracy would be higher because The Westin and Grand Central Terminal lie closer to the Chrysler building which is the true location of the device.
Identify trends in foot traffic to determine popular places of interest or commonly travelled routes. One common method of visualising foot traffic is through heatmaps.
Using the insights gained, businesses can analyse the potential traffic and profitability of retail or advertising locations and estimate peak days or times.
Businesses can also perform movement and traffic analysis to uncover audience behaviour and visitation patterns. These findings provide more depth when segmenting audiences and customers.
An Origin-Destination (O-D) study is used to understand the travel patterns of people. O-D studies are commonly used for transportation planning; however, their usefulness reaches beyond that.
Traditionally, O-D studies were performed manually through roadside surveys. The growth of GPS and tracking technology has made O-D studies less time-consuming while delivering much more accurate results.
One of the most powerful aspects of location data is its ability to derive insights on interest and intention. As users visit certain locations, assumptions can be made on their behaviours to derive likely insights, interests, intent and more.
To illustrate this, let us look at the following example:
Two devices are seen visiting car showrooms in the afternoon. We see both devices spending time in each location, visiting Peugeot (mid-market brand) and BMW and Mercedes-Benz (luxury brands). We also see that based on their timestamps, the individuals have spent more time visiting the luxury brands, approximately 1 hour at BMW and Mercedes, versus 5 minutes at Peugeot.
Based on this analysis, assumptions can be made that these individuals are in the market for cars and are likely to be interested in luxury cars. This is a strong indication of interest and intent that can be used by brands for insights (e.g. understanding dwell time), by marketers (e.g. to build segments), and more.
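As a minimal sketch of the dwell-time part of this analysis, assuming the showroom events have already been geofenced and tagged with a poi_name column (file and column names are illustrative):

# Minimal sketch: approximate dwell time per device and showroom, assuming events
# were already geofenced and tagged with a `poi_name` column.
import pandas as pd

events = pd.read_csv("showroom_events.csv")
events["ts"] = pd.to_datetime(events["timestamp"], unit="s")

dwell = (
    events.groupby(["device_id", "poi_name"])["ts"]
    .agg(lambda t: t.max() - t.min())          # last ping minus first ping per device and POI
    .rename("dwell_time")
)
print(dwell.sort_values(ascending=False).head())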
As COVID-19 starts to subside and we move towards a new normal, we will see changes in people's behaviours. Therefore, it will be important to quickly analyse and understand new habits, movement patterns, interests, and behaviours, which might affect the way that businesses reopen and continue to operate.
By looking at footfall patterns for retail locations (in the above case, IKEA Copenhagen and IKEA Stockholm), location data can be used to assess the visitation index or commercial activity of particular store locations as compared to previous timeframes. As location data is variable (i.e. it continuously changes), it is important to baseline, or normalize and smooth, the data for the analysis. Techniques typically used by statisticians for normalizing and smoothing include the following (a short smoothing sketch follows the list):
Normalizing data
• By errors (standard score)
• Means
• Medians
• Standard deviations (student’s t)
• Scale (min-max feature scaling, standardized moment, coefficient of variation)
Smoothing Data
• Random walk
• Moving average
• Simple / linear / seasonal exponential
• Savitzky–Golay filter
Example of Smoothing (note, S-G stands for Savitzky–Golay filter)
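As a minimal sketch of the normalizing and smoothing step, assuming a daily footfall series stored in a CSV with date and visits columns (file and column names are illustrative), using pandas and SciPy:

# Minimal sketch: normalize and smooth a daily footfall series, assuming a CSV with
# `date` and `visits` columns.
import pandas as pd
from scipy.signal import savgol_filter

footfall = pd.read_csv("daily_footfall.csv", index_col="date", parse_dates=True)["visits"]

moving_avg = footfall.rolling(window=7, center=True).mean()            # 7-day moving average
sg = pd.Series(savgol_filter(footfall.values, window_length=7, polyorder=2),
               index=footfall.index)                                    # Savitzky-Golay filter

# Min-max feature scaling so different stores can be compared on one 0-1 scale.
normalized = (footfall - footfall.min()) / (footfall.max() - footfall.min())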
Aside from looking at specific retail spaces, when tracking economic progress it can also be interesting to run a similar analysis on points of interest such as transportation hubs, industrial complexes, manufacturing parks, road networks, central business districts, etc. These analyses can in turn be used to predict and understand economic activity, forecast demand, and understand how reopening strategies are having an impact on different economic sectors.
Cluster analysis, or clustering, refers to grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). In all cases, clusters need to be grouped by an attribute or set of attributes.
In the context of location data, cluster analysis is used to understand the movement patterns of certain groups of people with similar traits (e.g. work location, time spent in certain areas, etc.).
Recently, we have seen some interesting uses of clustering, e.g.:
• Understanding the potential spread of a disease from a cluster
• Understanding the transportation used by people living in the same neighborhood
• Understanding the home locations of people working in central business districts (CBDs)
• Understanding the behavior of people who went to a concert
• Finding people who are most likely to be families or co-workers
Some methods of conducting cluster analysis on location data include:
Method A: ID-based Analysis
When analyzing a specific event or location, e.g. the home locations of people working in CBDs, you can geofence the event location and find the unique devices within the area. Using these device IDs and historical data (or future data), you can determine which locations these devices came from, matching them to specific areas, cities, neighborhoods, etc.
In the above example, geofencing the CBD area of Sydney, you first identify all unique device IDs seen in the CBD area that can therefore be considered to be working in the area. Here, time is important, and you might want to look at devices seen in the area at least 2 times during weekdays between 9am – 5pm. Based on this, you can then search for those device IDs in the other areas (areas 1 – 5) to determine the number of people working in the CBD by area.
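Below is a minimal pandas sketch of this ID-based approach, under the assumption that each event has already been labelled with the area it falls in; the file name, area labels, and the 2-weekday threshold are illustrative.

# Minimal sketch of the ID-based approach: find devices that look like CBD workers,
# then count where else those devices are seen. Assumes an `area` column has already
# been assigned to each event (e.g. 'CBD', 'Area 1' ... 'Area 5') and a Unix `timestamp`.
import pandas as pd

events = pd.read_csv("sydney_events.csv")
events["ts"] = pd.to_datetime(events["timestamp"], unit="s")

# Working-hours weekday events inside the CBD geofence.
cbd = events[
    (events["area"] == "CBD")
    & (events["ts"].dt.weekday < 5)
    & (events["ts"].dt.hour.between(9, 17))
]

# Treat devices seen in the CBD on at least 2 distinct weekdays as likely workers.
days_seen = cbd.groupby("device_id")["ts"].apply(lambda t: t.dt.date.nunique())
workers = set(days_seen[days_seen >= 2].index)

# Count how many likely CBD workers are also seen in each other area.
elsewhere = events[events["device_id"].isin(workers) & (events["area"] != "CBD")]
print(elsewhere.groupby("area")["device_id"].nunique())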
Method B: Density Algorithms
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm. It is a density-based clustering non-parametric algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away).
The methodology is particularly useful for geo-spatial analysis because it allows you to set up criteria for a cluster (e.g. a minimum of 4 people in a 200-meter radius). Unlike other clustering methodologies, DBSCAN does not require you to specify the number of clusters in the data a priori. DBSCAN can find arbitrarily shaped clusters. It can even find a cluster surrounded by (but not connected to) a different cluster.
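As a minimal sketch of this kind of criterion, below is an illustrative scikit-learn DBSCAN example over latitude/longitude points using the haversine metric; the file and column names are assumptions.

# Minimal sketch: DBSCAN over lat/long points using the haversine metric,
# with roughly "at least 4 points within a 200 m radius" as the cluster criterion.
import numpy as np
import pandas as pd
from sklearn.cluster import DBSCAN

events = pd.read_csv("events.csv")
coords = np.radians(events[["latitude", "longitude"]].to_numpy())

EARTH_RADIUS_M = 6_371_000
eps = 200 / EARTH_RADIUS_M                               # 200 m expressed in radians

labels = DBSCAN(eps=eps, min_samples=4, metric="haversine").fit_predict(coords)
events["cluster"] = labels                               # -1 means noise (no cluster)
print(events["cluster"].value_counts().head())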
This algorithm has been useful in the fight against COVID-19, with public health officials and epidemiologists using these types of algorithms to identify new or unknown clusters. In the above example, by specifically identifying the areas of clusters A and B, policy makers and medical professionals can make better-informed decisions on how to optimize the allocation of medical resources across cities and countries.
One quick point of caution with DBSCAN is that it cannot cluster data sets with large differences in densities well.
For this, we need to look to the OPTICS algorithm. Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data. It is very similar to DBSCAN but allows you to detect clusters in data of varying density. This methodology is very powerful because it uses tree-diagram techniques (the reachability plot) to look for connections between different device IDs.
The image above illustrates OPTICS. In its upper left area, the example data set is shown. The upper right part visualizes the spanning tree produced by OPTICS, and the lower part shows the reachability plot as computed by OPTICS. The yellow points in this image are considered noise and are not assigned to clusters.
Mobility analysis is a term often used in planning or in the development of operations; some good examples are smart city planning, trip management, and transportation planning. Entities like governments need to perform mobility analysis to make smart decisions about city and urban redevelopment projects, while consulting and retail analytics firms perform mobility analysis for market research and market capture.
Previously, when location data was scarce, organisations had to rely on approximate data points (e.g. sample surveys and door counting) to make their decisions. But with location data, data capturing has never been faster, cheaper, and more reliable!
The two main use cases where mobile location data can be used for mobility analysis are transportation planning and urban planning.
With Covid-19 and the government-imposed lockdown measures, all major retail shops were closed and people's mobility was affected. Therefore, we saw an increase in the use of historical data and baselining against the previous year's data.
Today, as we enter the Covid-19 recovery phase, people are starting to move outside (albeit relatively less than before) and governments can use location data to understand new movement patterns. In this new normal, mobility analysis can be used for resource optimization or better planning of transportation services. For example, bus times and pick-up locations can be optimized based on new behaviours or to spread out congested lines.
Sample Origin Destination visual in Singapore to understand user’s mobility
Sample Methodology:
1. Consider all bus stations as POIs and calculate the density of people waiting for the bus service last year and this year. This helps identify how many bus services can be reduced or increased from place to place.
2. Transform the location points spotted at the bus stations into origin-destination pairs. For each device ID, get the first bus station location, the last bus station location, and the time in between (see the sketch after this list).
3. Forecast route planning via the four-step travel model:
a.) Step 1 – Trip Generation
b.) Step 2 – Trip Distribution
c.) Step 3 – Mode Choice
d.) Step 4 – Route Assignment
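As a minimal pandas sketch of step 2, assuming the events have already been matched to bus-station POIs in a station column (file and column names are illustrative):

# Minimal sketch of step 2: build one origin-destination pair per device
# from events already matched to bus-station POIs (`station` column is assumed).
import pandas as pd

events = pd.read_csv("bus_station_events.csv")
events["ts"] = pd.to_datetime(events["timestamp"], unit="s")
events = events.sort_values(["device_id", "ts"])

first = events.groupby("device_id").first()              # first station seen per device
last = events.groupby("device_id").last()                # last station seen per device

od_pairs = pd.DataFrame({
    "origin": first["station"],
    "destination": last["station"],
    "travel_time": last["ts"] - first["ts"],
})
print(od_pairs["origin"].value_counts().head())          # busiest origins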
With the economy reopening after the pandemic, people look forward to enjoying their time outdoors and meeting family members and friends. With this, there is no doubt that an increase in footfall at retail stores and food and beverage locations can be expected. However, with more people interacting, the risk of community transmission increases, and if it worsens, a second wave of coronavirus will not be far off.
With location data, governments can perform mobility analysis to understand which dense places might have higher risks of coronavirus transmission and add restrictions to prevent places from becoming overcrowded.
Methodology to perform cluster analysis and find places that are overcrowded and bear a potential risk of coronavirus transmission:
a. Perform DBSCAN clustering to find high-density clusters. More on DBSCAN clustering can be found here.
b. Perform clustering via OPTICS.
c. A sample DBSCAN cluster with POSTGIS:
One of the ways in which mobile location data is used most frequently is for advertising and building targetable audiences.
Traditionally, advertisers targeted people based only on demographics or geography. For example, Honda Motors would target males, aged 30-35, living in urban California.
The problem with the traditional approach is that advertisers do not know whether they are targeting people who show an affinity for their products or services. With location data, people's affinity for a product can be understood by using movement patterns and tagging devices with the right behaviours.
This information can be used by advertisers to perform more personalised targeting and, in turn, potentially increase their advertising ROI.
Audiences fall into different categories, such as automotive buyers, restaurant diners, shoppers, travellers, etc.
Consider a Device ID that is seen visiting a Honda showroom on July 1st. This ID can be categorized as an automotive buyer and, more specifically, as a mid-market automotive buyer. If the same Device ID is spotted at a beauty parlour a few days later, an additional behaviour (beauty) can be tagged to the existing Device ID. By adding more and more behaviours to Device IDs, advertisers can build a more rounded, complete picture of their audiences and provide more personalized content.
Audiences can be built by geofencing appropriate POI locations. For example, to build a mid-market automobile audience, you could geofence Honda showrooms and those of its competitors (such as Toyota and Hyundai) and collect the Device IDs spotted there. It therefore goes without saying that a POI database is necessary to build location-based audiences.
Once you have mastered the basics, you can refine further by adding additional algorithms. One algorithm that could be added removes potential workers at an automobile shop so that we target potential buyers and not employees. To do this, you will want to identify all Device IDs within these stores (POIs) that are consistently present over time. For example, if a Device ID is seen at the Honda dealership on Monday, Tuesday, Thursday, and again on Friday a week later, you might want to assume that it belongs to an employee at the dealership and exclude it from your audience.
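A minimal pandas sketch of such an employee filter, assuming showroom events have already been geofenced into a CSV file (the file name and the 3-distinct-days threshold are illustrative):

# Minimal sketch: exclude likely dealership employees from an automotive audience.
# Assumes showroom events are already geofenced; the threshold below is illustrative.
import pandas as pd

visits = pd.read_csv("honda_showroom_events.csv")
visits["date"] = pd.to_datetime(visits["timestamp"], unit="s").dt.date

# Devices seen at the showroom on 3 or more distinct days are treated as staff.
days_per_device = visits.groupby("device_id")["date"].nunique()
likely_staff = set(days_per_device[days_per_device >= 3].index)

audience = set(visits["device_id"]) - likely_staff        # prospective buyers only
print(len(audience), "devices kept,", len(likely_staff), "excluded as likely staff")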
Another important consideration is that audiences have a lifecycle which differs based on the type of audience. For instance, a person planning to buy an automobile usually takes 3-4 months to reach a decision. Hence automobile buyers have a lifecycle of 3-4 months, after which the behaviour would be removed from the Device ID. In contrast, if a person uses a Samsung phone, the Samsung user behaviour could be valid for 1-2 years, as people usually buy phones on contracts.
Aside from creating audiences, location data has high potential when it comes to OOH advertising.
OOH (Out-of-Home) advertising, as the name states, is simply advertising that reaches people when they are outside their homes (e.g. billboards, in-car ads, bus stop boards, etc.).
Just like the digital advertising above, the success of OOH advertising is evaluated by understanding how successful an ad was at driving a viewer to purchase (either online or in store). This is often referred to as ROI measurement, attribution, or drive-to-store. Location data is extremely useful for OOH advertisers who want to attribute viewers of their assets back either to physical locations such as stores or directly to online purchases by using the Device ID. For example, if 20 out of 100 people who saw the Apple billboard went to the Apple store nearby, then the attribution rate to the store is 20%.
Below are two common ways in which OOH advertisers use location data:
1. Evaluate attribution of an OOH campaign:
Say, for example, a fashion apparel store (H&M) has launched an OOH campaign on the same road where one of its stores is located and wants to know how many people visited the store after seeing the advertisement (a minimal sketch follows the steps below).
• Filter the location data during the campaign time period
• Filter data points around the advertisement location and the store location
• Classify data points in advertisement location as set A and data points in store location as set B
• Let total_devices = footfall from set A
• Let attributed_devices = distinct devices in set A that also appear in set B, with the timestamp of the event in set B being later than in set A
• Divide attributed_devices/total_devices: this would give you the attribution percentage
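Putting these steps together, a minimal pandas sketch might look like the following, assuming set A and set B have already been filtered into two CSV files (file and column names are illustrative):

# Minimal sketch of the attribution calculation described above.
# Assumes two pre-filtered DataFrames: `set_a` (events near the billboard) and
# `set_b` (events at the store), each with `device_id` and Unix `timestamp` columns.
import pandas as pd

set_a = pd.read_csv("billboard_events.csv")
set_b = pd.read_csv("store_events.csv")

first_seen_ad = set_a.groupby("device_id")["timestamp"].min()
seen_store = set_b.groupby("device_id")["timestamp"].max()

both = first_seen_ad.to_frame("ad_ts").join(seen_store.to_frame("store_ts"), how="inner")
attributed = both[both["store_ts"] > both["ad_ts"]]       # store visit after ad exposure

total_devices = set_a["device_id"].nunique()
attribution_rate = len(attributed) / total_devices
print(f"Attribution rate: {attribution_rate:.1%}")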
2. Find the right spot for OOH advertising to influence people's buying decisions:
Say, for instance, McDonald's wants to increase morning visits to a specific store in the downtown area and is planning to launch a campaign. They want to find the right location for an OOH campaign. In this case:
• Take a sample of data (1 day or 1 week)
• Find devices spotted at the target McDonald's and the surrounding area over 1 week during morning hours.
• Using those device IDs, find the transportation modes and the routes people take to reach the store, and identify the most common route.
• Find the optimal billboard locations along the most common route. These would be the right spots to increase OOH attribution.
If you want to learn more, check out these resources:
Kepler is an open-source geospatial analysis tool with an easy drag-and-drop mechanism. By providing input data in the form of a CSV file, GeoJSON, or URL, you can design plots on the map and save the results as an image, HTML file, map, etc.
Let us work through an example below!
Due to the coronavirus, many retail shops and restaurants were closed, and the trends and visitation patterns of shopping malls changed due to the circuit breaker imposed in Singapore.
To analyse these changes, a sample location dataset is used to find visitation patterns at the ION Orchard shopping mall in Singapore before the circuit breaker (April 4, 2020) and after the circuit breaker was imposed (April 11, 2020).
ION Orchard Shopping Mall, Singapore
The input sample dataset contains device_id, latitude, longitude, day, and timestamp in CSV format, as shown below:
Steps to perform our analysis:
1.) Load data to Kepler (https://kepler.gl/demo) either by drag and drop or browsing files
Loading data to Kepler
By default, Kepler identifies latitude & longitude and forms a layer in point form, as shown below:
2.) Converting plot types: plot types include point, arc, heatmap, grid, or line (useful for taxi driving patterns). There are many more plot types available – see details here.
In this case, converting point form to heatmap:
3.) Comparing the density pattern between 4th April and 11th April can be done using timestamp filtering.
Filter Data
a) Select Timestamp field to filter data
b) The timeframe can be adjusted, and the filter can be run in intervals
4th April, ION Orchard, Before CB imposed in Singapore
11th April, ION Orchard, After CB imposed in Singapore
Already, by using this simple data manipulation in Kepler, we can see evidence that the number of mall visitors declined after the circuit breaker was imposed in Singapore. Additionally, to dig into each device, the device_id can also be filtered (as was done for the timestamp above).
Another interesting idea is to filter or sort by color.
4.) Create a new layer for color coding devices and hide the previous heatmap. Fill in the latitude and longitude fields and select color based on ‘device_id’, as shown below.
5.) Save visualizations in HTML format. As highlighted in the figure below (top-left corner), click Share -> Export Map -> Select HTML -> Allow users to edit the map -> Export.
There are other visualizations such as clustering, grid heatmaps, and 3D visualization, and even the background map can be changed to satellite, light background, with or without borders, etc. More information on Kepler is available on their website.
Kepler API is also available and can be easily integrated with Python.
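For example, here is a minimal sketch for loading the sample dataset into Kepler from Python (e.g. in a Jupyter notebook), assuming the keplergl package is installed; the CSV file name is illustrative.

# Minimal sketch: load the sample CSV into Kepler from Python, assuming the
# `keplergl` package is installed (pip install keplergl). File name is illustrative.
import pandas as pd
from keplergl import KeplerGl

df = pd.read_csv("ion_orchard_sample.csv")   # device_id, latitude, longitude, day, timestamp

kepler_map = KeplerGl(height=600)
kepler_map.add_data(data=df, name="ion_orchard_events")
kepler_map.save_to_html(file_name="ion_orchard_map.html")  # equivalent to Share -> Export Map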
By sending your advertisements to the right people at the right time, you can improve the effectiveness of your advertisements and increase your conversions. Marketers often use location-based audiences to better deliver their ads to the right people.
Context is key when it comes to delivering the right marketing communication to your audience. By knowing where they come from, you can craft contextually relevant marketing messages that resonate with your audience, resulting in a more effective marketing campaign.
Analysing your customers' movement and visitation patterns may uncover insights which you can leverage to attract and retain customers.
Use location data to plan and select the best site to open your next store or to place your next outdoor advertisement.
Uncover correlation between footfall levels, inventory levels and business performance. Good for demand and supply planning or for business and economic research.
Measure the performance of your offline advertisements. Often used in Out-of-Home (OOH) advertising or Digital-Out-of-Home advertising (DOOH).
Quadrant's entire data feed.
Great for organisations who want to conduct all data analysis in-house.
Mobile location data processed and filtered based on your specified criteria.
Deploy the data almost immediately with minimal processing.