3. Data file ingestion

Updated by bruno.morini@retargetly.com

Note: Before sending data from custom users to the SFTP repository, the users must be synchronized. That process is documented in this article: 1. Proceso de Sync de cookies de usuario (user cookie sync process).

This document describes how to set up a data transfer integration over SFTP or an AWS Bucket so that the client can enrich the information of its users within the DMP platform. The integration consists of the following steps:

  1. Configuring the SFTP repository / AWS Bucket
  2. Uploading data files to the repository
  3. Obtaining the status of each uploaded file

For file format specifications, refer to step 4. File format specs.

3.1. Configuring the SFTP repository / AWS Bucket

This can be done in two different ways:

3.1.A) Partner provides data file hosting.

In this scenario, Retargetly downloads the data files from an SFTP repository or AWS Bucket provided by the partner.

The partner must set up an SFTP repository or AWS Bucket from which Retargetly will read the data files. The partner must also ensure that the repository has enough space to host all the files that are uploaded periodically, and that each file remains available for at least 30 days after its creation date. This allows the system to extract the files again after a failure, so that no information is lost.

Retargetly will send the following file so that the partner can create the SFTP repository:

  • retargetly.pub -> [file containing Retargetly's public key for SFTP access]

This public key must be installed on the SFTP server so that the "retargetly" user can log in with it.
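
As an illustration of the key installation step, here is a minimal sketch in Python. It assumes an OpenSSH-style server, where a user's accepted keys are listed in the .ssh/authorized_keys file inside that user's home directory; other SFTP servers install keys differently. The function name install_public_key and the paths passed to it are hypothetical.

```python
import os
from pathlib import Path


def install_public_key(pub_key_file: Path, user_home: Path) -> Path:
    """Add the key from pub_key_file to <user_home>/.ssh/authorized_keys.

    Assumption: an OpenSSH-style server. The key is added only once,
    so running the function again does not create duplicates.
    """
    key = pub_key_file.read_text().strip()

    # OpenSSH expects the .ssh directory to be private to the user.
    ssh_dir = user_home / ".ssh"
    ssh_dir.mkdir(mode=0o700, parents=True, exist_ok=True)

    auth_file = ssh_dir / "authorized_keys"
    lines = []
    if auth_file.exists():
        lines = [ln.strip() for ln in auth_file.read_text().splitlines() if ln.strip()]

    if key not in lines:
        lines.append(key)

    # Rewrite the file with one key per line and restrictive permissions.
    auth_file.write_text("\n".join(lines) + "\n")
    os.chmod(auth_file, 0o600)
    return auth_file
```

For example, install_public_key(Path("retargetly.pub"), Path("/home/retargetly")) would register the key for the "retargetly" user, assuming that is the user's home directory.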

The partner must then send Retargetly the following connection details (only the values shown in square brackets need to be filled in):

  • Protocol: SFTP
  • User: retargetly
  • Host: [host address]
  • Port: [port number]

If the partner provides an AWS Bucket instead, Retargetly's public key is not needed. The partner should share the following information:

  • AWS Access Key Id. Ex: MXLBICMFR5LPFC7B2AXD
  • AWS Secret Access Key. Ex: xF2mJwPuoMSkUTuVhmlqkfxPMWlkAplBxG2wfbSX
  • AWS Bucket repository folder. Ex: s3://partner-repository-for-retargetly/datafiles/
  • AWS Region. Ex: us-east-1
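
Most S3 clients take the bucket name and the folder (key prefix) as separate values, so it can help to check that the folder URI splits cleanly before sharing it. The following is a minimal sketch using only the Python standard library; the function name split_s3_uri is hypothetical and is not part of any Retargetly tooling.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket name, key prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The path starts with "/", which is not part of the key prefix.
    return parsed.netloc, parsed.path.lstrip("/")


# Example folder taken from the list above.
bucket, prefix = split_s3_uri("s3://partner-repository-for-retargetly/datafiles/")
```

Here bucket holds the bucket name and prefix holds the folder inside it.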

3.1.B) Retargetly provides data file hosting.

In this scenario, the partner uploads the data files to Retargetly's SFTP servers.

3.2. Uploading data files to the repository

The file generation process must comply with the following policies:

  • Each file must not exceed 250 MB.
  • Each file must be kept for at least 30 days after its generation date. After that, it can be deleted.
  • For file format specifications, refer to step 4. File format specs.
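
The size and retention policies above can be checked before uploading or cleaning up files. The sketch below is only an illustration: the names MAX_FILE_BYTES, RETENTION, exceeds_size_limit and can_be_deleted are hypothetical, and it assumes that "250 MB" means 250,000,000 bytes (the stricter reading, since the exact unit is not specified here).

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Assumption: 250 MB interpreted as decimal megabytes (stricter than 250 MiB).
MAX_FILE_BYTES = 250 * 1000 * 1000
# Files must remain in the repository for at least 30 days.
RETENTION = timedelta(days=30)


def exceeds_size_limit(path: Path) -> bool:
    """Return True if the file is larger than the assumed 250 MB limit."""
    return path.stat().st_size > MAX_FILE_BYTES


def can_be_deleted(generated_at: datetime, now: datetime) -> bool:
    """Return True once at least 30 days have passed since generation."""
    return now - generated_at >= RETENTION
```

A file that exceeds the limit would need to be split into several smaller files before upload.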

3.3. Obtaining the status of each uploaded file

You can check the status of each file uploaded to the SFTP repository / AWS Bucket. A file can be in one of 3 states:

  • In process
  • Failed
  • Successful

Each state is reported through a sibling file, created in the same folder as the file to be ingested and named after it, with one of the following extensions appended:

  • .processing -> file without content
  • .failed -> file with an error message
  • .success -> file with the processing results

For example, if the system is processing the file

/custom_name_0000.tsv.gz

then this file will be present in the same folder:

/custom_name_0000.tsv.gz.processing

Once processing has finished, the .processing file is deleted. If the execution was successful, this file is created:

/custom_name_0000.tsv.gz.success
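
The status convention described in this section can be read programmatically from a folder listing. The following is a minimal sketch, assuming the listing has already been obtained (for example from an SFTP directory listing or an S3 object listing) as plain file names; the function name file_status and the STATUS_SUFFIXES table are hypothetical.

```python
from typing import Iterable, Optional

# Final states are checked first, in case a stale .processing file remains.
STATUS_SUFFIXES = [
    (".success", "Successful"),
    (".failed", "Failed"),
    (".processing", "In process"),
]


def file_status(data_file: str, listing: Iterable[str]) -> Optional[str]:
    """Return the status of data_file based on its sibling status files.

    listing holds the file names found in the same folder as data_file.
    Returns None when no status file exists yet.
    """
    names = set(listing)
    for suffix, status in STATUS_SUFFIXES:
        if data_file + suffix in names:
            return status
    return None
```

For instance, calling file_status("custom_name_0000.tsv.gz", names) on a folder that contains custom_name_0000.tsv.gz.processing reports the file as in process. For a failed file, the error message can then be read from the contents of the .failed file.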

 

 
