Data Migration Guide

For legacy Foursquare customers migrating to our new, integrated dataset

Foursquare Places now combines the best aspects of both legacy POI data sets. It unifies Factual’s best-in-class methods for collecting POI data and core attributes with Foursquare’s strength in collecting fresh first-party and user-generated content, and it leads the industry in data accuracy and freshness.

This migration guide provides additional details for partners and clients who currently use Legacy Foursquare Venues data in their products and services. It is intended for product managers and developers who own dataset ingestion and usage within their products and code.

Delivery Format and Cadence

New Foursquare Places deliveries are customized for each client and offered on a client-defined schedule. The standard options are monthly, weekly, or daily. Monthly deliveries are posted on the 1st day of every month. Weekly deliveries are posted on the same day every week; the default is Sunday, but clients can choose the day of the week they prefer. Foursquare Places is now available on Amazon S3 or Amazon Data Exchange. Clients are required to supply their own ARN for access to Foursquare S3 buckets.

Quality Filters

We apply a quality filter to all client deliverables to ensure that only the highest-quality POIs are delivered. The current quality filtering logic:

  • Excludes VRS Low (likely “fake” or not real)
  • Excludes Existence High Recall (likely closed)
  • Excludes “People” POIs (e.g., doctors, insurance agents)

Some legacy and new customers may prefer to receive any or all of the POI types above, or to do the filtering on their end. For these use cases, we offer datasets that are not filtered with the quality logic listed above. These datasets do, however, have commercial viability logic applied to remove POIs that are less relevant for enterprise customers (POIs that serve our consumer app check-in use case). The commercial viability filtering logic:

  • Excludes Deleted POIs
  • Excludes Private POIs
  • Excludes Geography Categories (States/Municipalities, City, County)

Attribute Data Type & Format Changes

This section details the data type and format changes for specific attribute migrations listed in the Legacy Foursquare Places Schema Mapping. Most new Foursquare Places attributes keep the same data type and format as their Legacy Foursquare counterparts. A handful of attribute migrations involve simple format and/or data type changes, which are outlined here.

Special Considerations: For collection data types such as Arrays and Lists, flat file deliveries are generated with JSON formatting of these data types. This formatting provides a standard way to ingest text-based values across a variety of languages and frameworks. The information below describes a recommended data type after ingestion and shows an example of the values as delivered in a text-based flat file.

Core Data

translatedvenuenames ⇨ name_translated

Legacy Foursquare | New Foursquare Places
Array(String) | String(JSON)
[[ニューヨーク マリオット マーキス, ja], [New York Marriott Marquis, en]] | [{"lang":"ja", "name":"ニューヨーク マリオット マーキス"},{"lang":"en", "name":"New York Marriott Marquis"}]

The attribute translatedvenuenames is being replaced with name_translated. The legacy attribute was an Array of Arrays(String), each holding a name value and its associated language code. The new attribute is a JSON-formatted array of two-keyed objects containing the language code and name values.
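
As a minimal sketch, assuming name_translated arrives as the JSON text shown in the example above, the value can be turned into a language-to-name lookup in Python. The function name parse_name_translated is hypothetical, and treating an empty value as "no translations" is an assumption.

```python
import json


def parse_name_translated(raw):
    """Return a {language code: name} dict from a name_translated JSON string."""
    if not raw:  # assumption: an empty or missing value means no translations
        return {}
    return {entry["lang"]: entry["name"] for entry in json.loads(raw)}


raw_names = (
    '[{"lang":"ja", "name":"ニューヨーク マリオット マーキス"},'
    '{"lang":"en", "name":"New York Marriott Marquis"}]'
)
names = parse_name_translated(raw_names)
print(names["en"])  # New York Marriott Marquis
```

A dict keyed by language code makes lookups simple, at the cost of keeping only one name per language if a code ever repeats.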

state ⇨ region

Legacy Foursquare | New Foursquare Places
Florida | FL

The attribute state is being replaced with region. The new attribute represents any sub-national municipal unit, such as a state or province. US states are now abbreviated, whereas the legacy attribute stored the full name.

countrycode ⇨ country

Legacy Foursquare | New Foursquare Places
US | us

The attribute countrycode is being replaced with country. The new attribute uses the standard 2-letter ISO country code in lowercase format.

score_openclose ⇨ closed_bucket

The attribute score_openclose is being replaced with closed_bucket. The legacy attribute was the result of a scoring model that produced the following values:

  • VeryHigh: indicates the POI is marked as closed in our database.
  • High: indicates the POI is likely closed.
  • Low: indicates the POI is not likely closed.
  • Null: indicates there was not enough confidence in the data to make a judgment.

Coverage was low across the dataset.

The new attribute is the result of a new model trained on thousands of human annotations of Foursquare’s POIs. It uses features such as how recently internet sources for the POI have been updated and the last time the POI had a check-in, tip, or photo. This results in a new set of values that are easier to understand, with near 100% coverage across the data. The new values are:

  • VeryLikelyClosed: indicates places with a probability greater than 90% of being closed
  • LikelyClosed: indicates places with a probability of 70–90% of being closed
  • Unsure: indicates places with a probability of less than 70% of being either closed or open
  • LikelyOpen: indicates places with a probability of 70–90% of being open
  • VeryLikelyOpen: indicates places with a probability greater than 90% of being open

The new model is currently implemented for the US only, and the Rest of the World will follow in Q3.
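
One way a consumer might use these values is to drop places the model considers closed. The sketch below is illustrative only: the function name, the choice to treat both VeryLikelyClosed and LikelyClosed as closed, and the sample records are all assumptions.

```python
# Buckets treated as "closed" in this sketch. The cut-off is an assumption,
# not a Foursquare recommendation; adjust it to your own tolerance.
CLOSED_BUCKETS = {"VeryLikelyClosed", "LikelyClosed"}


def keep_probably_open(records):
    """Return the records whose closed_bucket is not in CLOSED_BUCKETS."""
    return [r for r in records if r.get("closed_bucket") not in CLOSED_BUCKETS]


# Made-up records for illustration.
records = [
    {"name": "Cafe A", "closed_bucket": "VeryLikelyOpen"},
    {"name": "Cafe B", "closed_bucket": "LikelyClosed"},
    {"name": "Cafe C", "closed_bucket": "Unsure"},
]
open_names = [r["name"] for r in keep_probably_open(records)]
print(open_names)  # ['Cafe A', 'Cafe C']
```

Records without a closed_bucket value are kept here, which may matter for regions the new model does not yet cover.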

category_primary_id & category_secondary_id ⇨ fsq_category_ids

The attributes category_primary_id and category_secondary_id are being replaced with fsq_category_ids. The legacy attributes were strings that stored two of the legacy category IDs, which were 24-character hexadecimal values. The new attribute is an array of integers that can store more than two category IDs. The category ID format has changed with the new taxonomy: IDs are now values between 10000 and 19056. Please refer to the Legacy Foursquare Category Mapping for more details on the Category and Taxonomy changes.
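
A minimal sketch of reading fsq_category_ids from a flat file, assuming the array is delivered as JSON text as described under Special Considerations. The function name is hypothetical, the range is treated as inclusive (an assumption), and the sample IDs are placeholders chosen only to fall within the stated range, not real category assignments.

```python
import json


def parse_category_ids(raw):
    """Parse a JSON array of integer category IDs and check the stated range."""
    ids = json.loads(raw) if raw else []
    for cid in ids:
        # Range from the text above (10000 to 19056), treated as inclusive.
        if not (isinstance(cid, int) and 10000 <= cid <= 19056):
            raise ValueError(f"unexpected category id: {cid!r}")
    return ids


category_ids = parse_category_ids("[13000, 13065]")  # placeholder values
print(category_ids)  # [13000, 13065]
```

Failing loudly on out-of-range values is one way to catch legacy hexadecimal IDs that slip into a pipeline during the migration.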

category_primary & category_secondary ⇨ fsq_category_labels

The attributes category_primary and category_secondary are being replaced with fsq_category_labels. The legacy attributes were strings that stored two of the category names associated with category_primary_id and category_secondary_id respectively. The new attribute is an array of strings that can store more than two category labels. The categories and taxonomy have been reorganized. Each parent-level category is now one of the following:

  • Arts and Entertainment
  • Business and Professional Services
  • Community and Government
  • Dining and Drinking
  • Event
  • Health and Medicine
  • Landmarks and Outdoors
  • Retail
  • Sports and Recreation
  • Travel and Transportation

Please refer to the Legacy Foursquare Category Mapping for more details on the Category and Taxonomy changes.

chainid ⇨ fsq_chain_id

Legacy Foursquare | New Foursquare Places
String | Array(String)
556ca462a7c87f63786aa4d8 | ["3fae3191-08ff-4b62-9077-d8a1182d6ef2"]

The attribute chainid is being replaced with fsq_chain_id. The new attribute is an Array of strings that can represent multiple chain associations. The ID value format has changed from a 24-character hexadecimal value to a 128-bit UUID.
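
As a hedged sketch, the new ID format can be checked with Python's standard uuid module, assuming fsq_chain_id arrives as the JSON text shown in the example above. The function name parse_chain_ids is hypothetical.

```python
import json
import uuid


def parse_chain_ids(raw):
    """Parse fsq_chain_id JSON text into a list of uuid.UUID objects."""
    if not raw:  # assumption: an empty value means no chain association
        return []
    # uuid.UUID raises ValueError for strings that are not valid UUIDs,
    # such as a legacy 24-character hexadecimal ID.
    return [uuid.UUID(value) for value in json.loads(raw)]


chain_ids = parse_chain_ids('["3fae3191-08ff-4b62-9077-d8a1182d6ef2"]')
print(chain_ids[0])  # 3fae3191-08ff-4b62-9077-d8a1182d6ef2
```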

chainname ⇨ fsq_chain_name

Legacy Foursquare | New Foursquare Places
String | Array(String)
IKEA | ["IKEA"]

The attribute chainname is being replaced with fsq_chain_name. The new attribute is an Array of strings that allows for multiple chain associations. An example of multiple chain associations would be a car dealership that sells vehicles under multiple brands.

Rich Data

hours ⇨ hours

Legacy Foursquare | New Foursquare Attribute
String | Array(String)
[[1, 990, 1320,], [2, 990, 1320,], [3, 990, 1320,], [4, 990, 1380,], [5, 690, 1440,], [6, 1050, 1440,], [7, 1050, 1320,]] | {"saturday":[["9:00","18:00"]],"tuesday":[["9:00","18:00"]],"friday":[["9:00","18:00"]],"thursday":[["9:00","18:00"]],"wednesday":[["9:00","18:00"]],"monday":[["9:00","18:00"]]}

The hours attribute is being replaced with a new data type and format. The legacy attribute was an Array of Arrays of open and close times on a 24-hour scale, with a number corresponding to a day of the week. The new data type is a String of JSON with up to seven keys, one for each day of the week. Each key has an Array of Arrays showing the open and close times on a 24-hour scale. A given day can have multiple open and close times that represent periodic daily openings. For example, a cafe might have brunch hours in the morning, be closed for the afternoon, and have a second opening in the evening.
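
The structure described above can be queried with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, a day missing from the object is treated as closed, intervals are assumed not to cross midnight, and the split Monday hours in the sample are made up to illustrate multiple openings in one day.

```python
import json


def to_minutes(hhmm):
    """Convert an 'H:MM' or 'HH:MM' string to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def is_open(raw_hours, day, time):
    """Return True if `time` falls inside any open/close pair for `day`."""
    schedule = json.loads(raw_hours)
    now = to_minutes(time)
    # A day missing from the object is treated as closed (assumption).
    return any(
        to_minutes(start) <= now < to_minutes(end)
        for start, end in schedule.get(day, [])
    )


# Sample value: Saturday follows the example above; Monday is made up.
raw_hours = (
    '{"saturday":[["9:00","18:00"]],'
    '"monday":[["9:00","12:00"],["17:00","22:00"]]}'
)
print(is_open(raw_hours, "monday", "11:30"))  # True
print(is_open(raw_hours, "monday", "14:00"))  # False
print(is_open(raw_hours, "sunday", "10:00"))  # False
```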

photo1, photo2, photo3 ... ⇨ photos

Legacy Foursquare | New Foursquare Attribute
String | Ordered Array(String)
http://ir.4sqi.net/img/general/original/_photo_1_.jpg | ["http://ir.4sqi.net/img/general/original/_photo_1_.jpg", "http://ir.4sqi.net/img/general/original/_photo_2_.jpg"]
http://ir.4sqi.net/img/general/original/_photo_2_.jpg |

The legacy attributes storing photo references (photo1, photo2, photo3, etc.) are being replaced with one new attribute called photos. This new attribute is an Ordered Array of strings, storing all photo references in one collection. The order is based on a ranking using the TrueSkill algorithm. A second attribute, total_photos, represents the total number of photos uploaded for the POI.

taste1, taste2, taste3 ... ⇨ tastes

Legacy Foursquare | New Foursquare Attribute
String | Ordered Array(String)
clippers | ["clippers","Lakers games","live music","concerts","NBA games"]
Lakers games |
live music |
concerts |
NBA games |

The legacy attributes taste1, taste2, taste3, etc. are being replaced with one attribute called tastes. Tastes are derived from user input in Foursquare apps. The new attribute is an Ordered Array of Strings that stores all tastes in one collection. The order is based on a ranking that measures affinity and frequency.

tip1, tip2, tip3 ... ⇨ tips

Legacy Foursquare | New Foursquare Attribute
String | Ordered Array(String)
[5116c6e6e4b02c4e257efcfd, Watch Grammy Night LIVE on VH1.com starting Sunday 6/5C!] | [["5116c6e6e4b02c4e257efcfd", "Watch Grammy Night LIVE on VH1.com starting Sunday 6/5C!"],["50c11139e4b0fe2a3acf99c0", "This is where we won our first-ever Stanley Cup on June 11, 2012"]]
[50c11139e4b0fe2a3acf99c0, This is where we won our first-ever Stanley Cup on June 11, 2012] |

The legacy attributes tip1, tip2, tip3, etc. are being replaced with one attribute called tips. The new attribute is an Ordered Array of Arrays. Each inner Array is a pair of strings: the tip ID and the tip text. The outer Array is ordered by the total number of likes from users in the Foursquare user community.
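
A minimal sketch of unpacking the new tips value, assuming it is delivered as the JSON text shown in the example above. The function name parse_tips is hypothetical.

```python
import json


def parse_tips(raw):
    """Return a list of (tip_id, tip_text) tuples, preserving delivery order."""
    if not raw:  # assumption: an empty value means the POI has no tips
        return []
    return [(tip_id, text) for tip_id, text in json.loads(raw)]


raw_tips = (
    '[["5116c6e6e4b02c4e257efcfd", '
    '"Watch Grammy Night LIVE on VH1.com starting Sunday 6/5C!"],'
    '["50c11139e4b0fe2a3acf99c0", '
    '"This is where we won our first-ever Stanley Cup on June 11, 2012"]]'
)
tips = parse_tips(raw_tips)
top_tip_id, top_tip_text = tips[0]  # first entry is the highest-ranked tip
print(top_tip_id)  # 5116c6e6e4b02c4e257efcfd
```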

Boolean Tags

Legacy Foursquare | New Foursquare Places
String (t\f) | Boolean
t | TRUE

All legacy boolean tag attributes that stored true or false as a String with values of 't' or 'f' are being replaced with true Boolean values of TRUE or FALSE.
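
Code that handles both legacy and new deliveries during a transition may need to accept either representation. The sketch below assumes the new values appear in a text-based flat file as the literal strings TRUE and FALSE; the function name and the strict error handling are illustrative choices, not part of the delivery specification.

```python
# Accepted text representations: legacy 't'/'f' and new 'TRUE'/'FALSE'.
_TAG_VALUES = {"t": True, "f": False, "TRUE": True, "FALSE": False}


def tag_to_bool(value):
    """Convert a legacy or new boolean tag string to a Python bool."""
    if value not in _TAG_VALUES:
        raise ValueError(f"unexpected boolean tag value: {value!r}")
    return _TAG_VALUES[value]


print(tag_to_bool("t"))      # True
print(tag_to_bool("FALSE"))  # False
```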

