Partner Graph File Transfer

Overview


With the Partner Graph File Transfer, your AWS S3 bucket will be filled at regular time intervals with new and updated Match Partner UIDs coming from the cascading process.

The interval at which ID5 pushes files may change over time. Build your system to look for new files and process them as they become available, rather than relying on an expected interval from ID5. We recommend checking for new files every 5 minutes.
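As a minimal sketch of one polling pass, assuming you keep a set of object keys you have already processed: the function name select_new_keys and the sample keys (including partner GVL ID 20) are hypothetical, and in practice the key list would come from an S3 listing call in your own tooling.

```python
from typing import Iterable, List, Set


def select_new_keys(listed_keys: Iterable[str], processed: Set[str]) -> List[str]:
    """Return .csv keys that have not been processed yet, sorted by name.

    Day folders and file names are zero-padded, so sorting the keys of one
    partner by name also sorts them chronologically.
    """
    return sorted(
        key for key in listed_keys
        if key.endswith(".csv") and key not in processed
    )


# One polling pass. A real job would repeat this on a schedule
# (for example every 5 minutes) and persist `processed` between runs.
processed = {"incremental/20/20221106/000000.csv"}
listed = [
    "incremental/20/20221106/003000.csv",
    "incremental/20/20221106/000000.csv",
    "incremental/20/20221106/010000.csv",
]
new_keys = select_new_keys(listed, processed)
processed.update(new_keys)
```

Tracking processed keys, rather than computing the next expected file name, keeps the job independent of the delivery interval.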

Per-Partner Delivery Method

In this delivery method, data will be delivered to a directory called /incremental at the root of your bucket with subdirectories broken out by Match Partner (directory names based on the partner’s Global Vendor List ID), then folders per UTC day /YYYYMMDD/, each containing files named with a timestamp throughout the day. For instance:

s3://[bucket]/incremental/[match partner 1 GVL ID]/20221106/000000.csv
s3://[bucket]/incremental/[match partner 1 GVL ID]/20221106/003000.csv
...
s3://[bucket]/incremental/[match partner 1 GVL ID]/20221106/233000.csv
s3://[bucket]/incremental/[match partner 1 GVL ID]/20221107/000000.csv
s3://[bucket]/incremental/[match partner 1 GVL ID]/20221107/003000.csv
...
s3://[bucket]/incremental/[match partner 1 GVL ID]/20221107/180000.csv
s3://[bucket]/incremental/[match partner 2 GVL ID]/20221106/000000.csv
...
s3://[bucket]/incremental/[match partner 2 GVL ID]/20221106/233000.csv
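
The layout above can be split back into its parts when routing files. The following is a sketch only: the names KEY_PATTERN, IncrementalKey and parse_incremental_key are hypothetical, the key is assumed to be given without the bucket prefix, and the file-name timestamp is assumed to be HHMMSS in UTC, like the day folder.

```python
import re
from datetime import datetime, timezone
from typing import NamedTuple, Optional

# incremental/<partner GVL ID>/<YYYYMMDD>/<HHMMSS>.csv
KEY_PATTERN = re.compile(
    r"^incremental/(?P<gvl_id>[^/]+)/(?P<day>\d{8})/(?P<time>\d{6})\.csv$"
)


class IncrementalKey(NamedTuple):
    partner_gvl_id: str
    timestamp: datetime


def parse_incremental_key(key: str) -> Optional[IncrementalKey]:
    """Parse a per-partner object key; return None if it does not match."""
    match = KEY_PATTERN.match(key)
    if match is None:
        return None
    try:
        timestamp = datetime.strptime(
            match["day"] + match["time"], "%Y%m%d%H%M%S"
        ).replace(tzinfo=timezone.utc)
    except ValueError:
        return None  # digits present but not a valid date or time
    return IncrementalKey(match["gvl_id"], timestamp)


parsed = parse_incremental_key("incremental/20/20221106/003000.csv")
```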

Each .csv file contains the incremental mappings between your UIDs and the requested Match Partner’s UIDs in a pipe-separated format. In other words, only new or changed (since the last file) UID pairs will be included in each file.

Available Match Partners are defined by contract between you and ID5. To change the list of Match Partners, please reach out to your ID5 representative.

File Format

Column             Type     Description
UID                String   Your User ID for this user
MatchPartnerGVLID  Integer  The Global Vendor List ID of the Match Partner
MatchPartnerUID    String   The Match Partner’s User ID for this user

Example Output File

$ cat /incremental/matchpartner1GVLID/20220305/000300.csv
550e8400-e29b-41d4-a716-446655440000|matchpartner1GVLID|100000187421490458
d2a8378f-fe56-4ec2-96d1-3c05df02bb48|matchpartner1GVLID|1000002629570693845
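
A minimal sketch of ingesting a per-partner file follows. It assumes the file has no header line and exactly three pipe-separated columns, as in the example above; the function name apply_per_partner_file, the in-memory dictionary, and the GVL ID value 20 are illustrative assumptions. All values are kept as strings so that long numeric UIDs are not altered.

```python
import csv
import io
from typing import Dict, Iterable, Tuple

# (your UID, Match Partner GVL ID) -> Match Partner UID
Mappings = Dict[Tuple[str, str], str]


def apply_per_partner_file(lines: Iterable[str], mappings: Mappings) -> int:
    """Upsert every row of one incremental file; return the number applied."""
    applied = 0
    for row in csv.reader(lines, delimiter="|"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        uid, gvl_id, partner_uid = row
        mappings[(uid, gvl_id)] = partner_uid  # new pair or changed pair
        applied += 1
    return applied


sample = io.StringIO(
    "550e8400-e29b-41d4-a716-446655440000|20|100000187421490458\n"
    "d2a8378f-fe56-4ec2-96d1-3c05df02bb48|20|1000002629570693845\n"
)
mappings: Mappings = {}
count = apply_per_partner_file(sample, mappings)
```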

Single File Delivery Method

This delivery method has been deprecated. This documentation remains here only for legacy integrations.

Data will be delivered to a directory called /incremental at the root of your bucket with subdirectories per UTC day /YYYYMMDD, each containing files named with a timestamp for when they were run. For example:

s3://[bucket]/incremental/20221106/000000.csv
s3://[bucket]/incremental/20221106/003000.csv
...
s3://[bucket]/incremental/20221106/233000.csv
s3://[bucket]/incremental/20221107/000000.csv
s3://[bucket]/incremental/20221107/003000.csv
...

File Format

Each .csv file contains, in a pipe-separated format, the mappings between your UIDs and the requested Match Partners’ UIDs that changed since the last time we ran our streaming job. The first line in the file lists GVL IDs: the source GVL ID, followed by the Global Vendor List ID of each Match Partner. (Available Match Partners are defined by contract between you and ID5. To change the list of Match Partners, please reach out to your ID5 representative.) Each subsequent line represents a single user, identified by your UID or the ID5 ID, followed by the UIDs of all requested Match Partners that had changes.

A blank column ("") does NOT mean that the mapping for this user does not exist; it means that there is no update. Treat these files as purely additive to your existing mappings, not as a replacement.

The first file ID5 delivers to you will not automatically contain the entire match table; instead it will contain any IDs collected/changed since the last time we ran our streaming job. If you would like to receive the entire match table, let us know.

Header Values

Column               Type     Description
Source GVLID         Integer  Your GVL ID or the ID5 GVL ID (131)
MatchPartner1 GVLID  Integer  Match Partner 1’s GVL ID
MatchPartner2 GVLID  Integer  Match Partner 2’s GVL ID (if applicable)

Row Values

Column            Type    Description
UID               String  Your User ID or the ID5 ID for this user
MatchPartner1UID  String  Match Partner 1’s User ID for this user, surrounded by double quotes
MatchPartner2UID  String  Match Partner 2’s User ID for this user, surrounded by double quotes (if applicable)

Another way to look at this format is as follows:

[YOUR GVLID]|[GVLID1]...
"[YOUR UIDa]"|"[UID1]"...
"[YOUR UIDb]"|"[UID2]"...
"[YOUR UIDc]"|"[UID3]"...
"[YOUR UIDz]"|"[UIDn]"...

where |[GVLID1] and |[UIDn] will repeat for all Match Partners.

Your code to ingest the data file should be able to handle new Match Partners or a different order of Match Partners at any time. This way, if there’s a commercial request to add more partners, we don’t need to coordinate a release to ensure your processes don’t break.
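
One way to meet this requirement is to read the partner order from the header line of every file instead of hard-coding it. The sketch below assumes the layout described in this section; the function name apply_single_file, the nested dictionary, and the sample values are illustrative only. Blank columns are skipped so that existing mappings are never overwritten with an empty value.

```python
import csv
import io
from typing import Dict, Iterable

# your UID -> {Match Partner GVL ID -> Match Partner UID}
Table = Dict[str, Dict[str, str]]


def apply_single_file(lines: Iterable[str], table: Table) -> None:
    """Merge one single-file delivery into `table` without removing anything."""
    reader = csv.reader(lines, delimiter="|", quotechar='"')
    header = next(reader, None)
    if header is None:
        return  # empty file
    partner_ids = header[1:]  # header[0] is the source GVL ID
    for row in reader:
        if not row:
            continue  # blank line
        uid, values = row[0], row[1:]
        for gvl_id, partner_uid in zip(partner_ids, values):
            if partner_uid == "":
                continue  # blank means "no update", not "no mapping"
            table.setdefault(uid, {})[gvl_id] = partner_uid


sample = io.StringIO(
    '35|20|45|109\n'
    '"AAAAAA"|"111111"|"222222"|"333333"\n'
    '"CCCCCC"|"777777"|""|"999999"\n'
)
# An existing mapping for partner 45 that the blank column must not erase.
table: Table = {"CCCCCC": {"45": "888888"}}
apply_single_file(sample, table)
```

Because the columns are looked up by the GVL IDs in the header, a file with an extra partner or a different column order is merged the same way.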

Example Output File

Assuming your GVL ID is 35 and you are matching with partners with GVL IDs 20, 45, and 109:

$ cat /incremental/20220305/003000.csv
35|20|45|109
"AAAAAA"|"111111"|"222222"|"333333"
"BBBBBB"|"444444"|"555555"|"666666"
"CCCCCC"|"777777"|""|"999999"

Mapping Table Refreshes

In addition to the incremental updates that we push throughout the day, ID5 can also stream the full match table to you on a regular basis. This ensures a couple of things:

  • If any data is lost during the streaming process, the full extract will recover the data, rather than waiting for a change from that user to be pushed
  • If any users have opted out or had their mappings expire, this will allow you to remove them from your mapping tables since they will no longer be included in the full extract
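
Both points can be illustrated with a small reconciliation step. This is a sketch under assumptions: the full extract has already been loaded into a dictionary, apply_full_extract is a hypothetical name, and incremental files that arrive while the extract is being produced are not considered.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (your UID, Match Partner GVL ID)


def apply_full_extract(current: Dict[Key, str], full: Dict[Key, str]) -> Set[Key]:
    """Make `current` equal to `full` in place; return the keys that were removed."""
    removed = set(current) - set(full)  # opted out or expired
    for key in removed:
        del current[key]
    current.update(full)  # recovers pairs missed by incremental files
    return removed


current = {("A", "20"): "1", ("B", "20"): "2"}
full = {("A", "20"): "1x", ("C", "20"): "3"}
removed = apply_full_extract(current, full)
```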

If you wish to receive these refreshes, please contact your ID5 representative.

Mapping Table Refresh File Location and Format

The files follow the same format as the incremental updates above, depending on whether you’ve chosen Single File or Per-Partner Files. The location of the data files, though, differs from that of the incremental files so that you can process weekly refreshes separately. The files will be pushed to:

Per-Partner File Location

s3://[bucket]/full-extracts/[match partner GVLID]/[datetime].csv

Single File Location

s3://[bucket]/full-extracts/[datetime].csv
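
Since the two kinds of files live under different prefixes, a processing job can route them by key. A minimal sketch, assuming keys are given without the bucket name; classify_key and its return labels are hypothetical, and the [datetime] part of the file name is deliberately not parsed because its exact format is not stated here.

```python
def classify_key(key: str) -> str:
    """Label an object key by the top-level directory it was delivered to."""
    if key.startswith("incremental/"):
        return "incremental"
    if key.startswith("full-extracts/"):
        return "full-extract"
    return "unknown"


kinds = [
    classify_key("incremental/20/20221106/000000.csv"),
    classify_key("full-extracts/20/example.csv"),
    classify_key("other/file.csv"),
]
```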

Cleaning Up / Deleting Old Files

By default, ID5 does not delete any files we place in the S3 bucket. When we push files to the bucket, we perform a sync operation. This means that if you have deleted a file from the S3 bucket but it still exists on the ID5 servers, we will push the file to the bucket again.

We keep files on our servers for approximately 30 days.

If your ETL process includes deleting files from the bucket, please let us know so we can work together on a solution that meets your needs.

We recommend that you delete only files that are more than 30 days old to avoid any issues. If you’d like, we can also automatically delete old files from the bucket once we have removed them from our servers.
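
A cleanup job could pick deletion candidates from the day folder in each key. The sketch below is an assumption-laden illustration: keys_safe_to_delete and DAY_FOLDER are hypothetical names, only incremental-style keys (.../YYYYMMDD/HHMMSS.csv) are considered, and the caller supplies the current UTC date. Because files are kept on ID5 servers for approximately 30 days, a larger min_age_days adds a safety margin.

```python
import re
from datetime import date, timedelta
from typing import Iterable, List

# Matches the trailing /YYYYMMDD/HHMMSS.csv of an incremental key.
DAY_FOLDER = re.compile(r"/(\d{8})/\d{6}\.csv$")


def keys_safe_to_delete(
    keys: Iterable[str], today: date, min_age_days: int = 30
) -> List[str]:
    """Return keys whose day folder is strictly older than `min_age_days`."""
    cutoff = today - timedelta(days=min_age_days)
    old = []
    for key in keys:
        match = DAY_FOLDER.search(key)
        if match is None:
            continue  # unrecognized layout; leave the file alone
        digits = match.group(1)
        try:
            day = date(int(digits[:4]), int(digits[4:6]), int(digits[6:]))
        except ValueError:
            continue  # not a real calendar date
        if day < cutoff:
            old.append(key)
    return old


candidates = keys_safe_to_delete(
    [
        "incremental/20/20221106/000000.csv",
        "incremental/20/20221110/000000.csv",
        "incremental/20/20221201/000000.csv",
    ],
    today=date(2022, 12, 10),
)
```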