Marketing Data Sync

MDS - Marketing Data Sync

Solution based on the Google Megalista project.

Sample integration code for onboarding offline/CRM data from BigQuery as custom audiences or offline conversions in Google Ads, Google Analytics 360, Google Display & Video 360, Google Campaign Manager and Facebook Ads.

Supported integrations

How does it work

MDS was designed to separate the configuration of conversion/audience upload rules from the engine, giving non-technical teams (e.g., Media and Business Intelligence) more freedom to set up multiple upload rules on their own.

The solution consists of (1) a Google Spreadsheet (template) in which all rules are defined by mapping a data source (BigQuery table) to a destination (data upload endpoint), and (2) an Apache Beam workflow running on Google Dataflow, scheduled to upload the data in batch mode.

Prerequisites

Google Cloud Services

Access Requirements

These are the minimum roles necessary to deploy MDS:

APIs

Required APIs will depend on the upload endpoints in use. We recommend enabling all of them:
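
As a reference, the sketch below enables the APIs most commonly needed by the supported destinations via gcloud. This list is an assumption; adjust it to the upload endpoints you actually use.

gcloud services enable \
  bigquery.googleapis.com \
  dataflow.googleapis.com \
  sheets.googleapis.com \
  googleads.googleapis.com \
  analytics.googleapis.com \
  dfareporting.googleapis.com \
  displayvideo.googleapis.com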

Installation

Create a copy of the configuration Spreadsheet

WIP

Creating required access tokens

To access campaigns and user lists on Google’s platforms, this dataflow will need OAuth tokens for an account that can authenticate in those systems.

To create them, follow these steps:

Creating a bucket on Cloud Storage

This bucket will hold the deployed code for this solution. To create it, navigate to the Storage link on the top-left menu on GCP and click on Create bucket. You can use the Regional location type and Standard storage class for this bucket.
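
Alternatively, a bucket with the same settings can be created from the command line with gsutil; the bucket name below is a placeholder:

gsutil mb -p ${GCP_PROJECT_ID} -c standard -l us-central1 gs://your-mds-bucket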

Running MDS

We recommend first running it locally to make sure that everything works. Create some sample tables on BigQuery for one of the uploaders and verify that the data arrives correctly at the destination. After that is done, upload the Dataflow template to GCP and try running it manually via the UI to make sure it works. Lastly, configure Cloud Scheduler to run MDS at the desired frequency and you’ll have a fully functional data integration pipeline.

Running locally

python3 mds_dataflow/main.py \
  --runner DirectRunner \
  --developer_token ${GOOGLE_ADS_DEVELOPER_TOKEN} \
  --setup_sheet_id ${CONFIGURATION_SHEET_ID} \
  --refresh_token ${REFRESH_TOKEN} \
  --access_token ${ACCESS_TOKEN} \
  --client_id ${CLIENT_ID} \
  --client_secret ${CLIENT_SECRET} \
  --project ${GCP_PROJECT_ID} \
  --region us-central1 \
  --temp_location gs://${GCS_BUCKET}/tmp

Deploying Pipeline

To deploy, use the following command: ./deploy_cloud.sh project_id bucket_name region_name
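
For example, using the same placeholder values as the local run above, the call would look like this:

./deploy_cloud.sh ${GCP_PROJECT_ID} ${GCS_BUCKET} us-central1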

Manually executing pipeline using Dataflow UI

To execute the pipeline, use the following steps:
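
If you prefer the command line to the UI, a deployed template can also be launched with gcloud. The template path and parameter name below are assumptions; they must match what deploy_cloud.sh generated and the runtime parameters the template actually exposes:

gcloud dataflow jobs run mds-manual-run \
  --gcs-location gs://${GCS_BUCKET}/templates/mds \
  --region us-central1 \
  --parameters setup_sheet_id=${CONFIGURATION_SHEET_ID}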

Scheduling pipeline

To schedule daily/hourly runs, go to Cloud Scheduler:
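
As a sketch, a daily job can also be created from the command line; the launch URI, message body, and service account below are assumptions and must point to your deployed template and to the Service Account created in the next step:

gcloud scheduler jobs create http mds-daily-run \
  --schedule "0 5 * * *" \
  --uri "https://dataflow.googleapis.com/v1b3/projects/${GCP_PROJECT_ID}/locations/us-central1/templates:launch?gcsPath=gs://${GCS_BUCKET}/templates/mds" \
  --http-method POST \
  --oauth-service-account-email ${SERVICE_ACCOUNT_EMAIL} \
  --message-body '{"jobName": "mds-scheduled-run"}'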

Creating a Service Account

It’s recommended to create a new Service Account to be used with Cloud Scheduler.
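
A minimal sketch with gcloud (the account name is a placeholder, and the exact role your setup requires may differ; it needs permission to launch Dataflow jobs):

gcloud iam service-accounts create mds-scheduler \
  --display-name "MDS Cloud Scheduler"

gcloud projects add-iam-policy-binding ${GCP_PROJECT_ID} \
  --member "serviceAccount:mds-scheduler@${GCP_PROJECT_ID}.iam.gserviceaccount.com" \
  --role roles/dataflow.admin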

Usage

Every upload method expects as its source a BigQuery table with specific fields, in addition to specific configuration metadata. For details on how to set up your upload routines, refer to the MDS Wiki or the MDS user guide.
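
As a purely illustrative sketch (the dataset, table, and field names below are hypothetical; the real schema for each uploader is described in the Wiki and user guide), a source table could be created like this:

bq mk --table ${GCP_PROJECT_ID}:mds_data.customer_match_audience user_email:STRING,phone_number:STRING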

Mandatory requirements

Only contributions that meet the following requirements will be accepted:

Support:

DP6 Koopa-troopa Team

e-mail: koopas@dp6.com.br