The User Deletion API for Google Analytics allows you to delete user data. In this post, I’ll give a brief introduction to what the API allows you to do and how you can use it in Python.

How it works

The official documentation describes the API as follows:

The Google Analytics User Deletion API allows customers to process deletions of data associated with a given user identifier.

Two things are important here: the data associated with a given user identifier and the user identifier itself.

The data you can delete is limited to the data that Google Analytics directly identifies with a user. This includes the data you see in the User Explorer report. This does not include aggregated data.

So let’s consider one example of data you don’t want to see in your reports:

You have captured email addresses in your page reports.

You would like to remove this data. And you might think:

I’ll just look up the client IDs of the sessions where it happened and delete the data with the User Deletion API.

This API does not allow you to do this.

The user identifiers you can use are the Client ID and the User ID. The Client ID is a cookie value, so one real-life user can have several of them (for example, one per browser or device). The User ID is often a unique identifier of a user (e.g. from a CRM system) and a real-life user generally only has one.

In other words: the User Deletion API allows you to delete personal data that you are allowed to capture in Google Analytics (the Client ID and User ID) and the data associated with these identifiers.
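Because one person can have several Client IDs plus a User ID, deleting "a user" usually means sending one request per identifier. Here is a minimal sketch of how you could collect those requests. The helper name build_deletion_bodies and all the example values are made up for illustration; the body layout is the same one used in the function later in this post.

```python
def build_deletion_bodies(property_id, client_ids=(), user_id=None):
    """Return one request body per identifier known for a single person.

    Hypothetical helper: a person can have several Client IDs and,
    generally, at most one User ID.
    """
    # Drop duplicate Client IDs while keeping their original order.
    unique_client_ids = list(dict.fromkeys(client_ids))

    identifiers = [("CLIENT_ID", cid) for cid in unique_client_ids]
    if user_id is not None:
        identifiers.append(("USER_ID", user_id))

    return [
        {
            "kind": "analytics#userDeletionRequest",
            "id": {"type": id_type, "userId": id_value},
            "webPropertyId": property_id,
        }
        for id_type, id_value in identifiers
    ]


# Placeholder values, not real identifiers.
bodies = build_deletion_bodies(
    property_id="UA-XXXXXXXX-1",
    client_ids=["1111111111.1111111111", "2222222222.2222222222"],
    user_id="crm-user-42",
)
for body in bodies:
    print(body["id"])
```

Each body in the list is then sent as its own deletion request.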

Set it up for Python

The technical documentation of the API is limited at best. Luckily, the API is part of the Google Analytics API v3, so I could reuse my experience with the reporting part of that API to set up a working Python script. It turns out to be pretty easy. Assuming that you already have a functional analytics service object (which I call analytics in my project), the code you can use is as easy as this:

def delete_user_by_id_type_and_id_value_from_property_id(id_type, id_value, property_id):
    # Send the deletion request; `analytics` is the existing service object.
    deletion_request = analytics.userDeletion().userDeletionRequest().upsert(
        body={
            'kind': 'analytics#userDeletionRequest',
            'id': {
                'type': id_type,
                'userId': id_value,
            },
            'webPropertyId': property_id,
        }
    ).execute()

    # Print the API response to confirm the request went through.
    print(deletion_request)

The code prints the request so you can see if it has been sent successfully. All you have to do now is call the function with the correct values:

delete_user_by_id_type_and_id_value_from_property_id(
    id_type='CLIENT_ID',
    id_value='1543358643.1574329799',
    property_id='UA-43136363-2'
)
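If you don't have an analytics service object yet, the sketch below shows one way to build it. It is not necessarily how my own project does it: it assumes you authenticate with a service account JSON key, that the google-api-python-client and google-auth packages are installed, and that the service account has sufficient access to the property. The function name and the key file name are hypothetical.

```python
# OAuth scope for the User Deletion API.
USER_DELETION_SCOPE = "https://www.googleapis.com/auth/analytics.user.deletion"


def build_analytics_service(key_file_path):
    """Build an Analytics API v3 service object from a service account key.

    key_file_path is a hypothetical path to the JSON key file of a
    service account (an assumption of this sketch).
    """
    # Imported inside the function so this module can still be loaded
    # when the Google client libraries are not installed.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_file(
        key_file_path, scopes=[USER_DELETION_SCOPE]
    )
    return build("analytics", "v3", credentials=credentials)


# Usage (needs a real key file, so it is left commented out here):
# analytics = build_analytics_service("service-account-key.json")
```

With analytics defined this way, the function call above should work without further changes.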

If your request is sent successfully, you can look up your Client ID in the Google Analytics User Explorer report and see this beautiful little message:

Boom. Your data is (about to be) deleted.

It takes time

Keep in mind that part of the data is deleted within 72 hours. Full deletion of user-associated data can take up to two months. And this note in Google’s documentation is really important:

If you make use of the BigQuery export, you must also process your own deletions there.
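If you want to note down when a deletion should be finished, the two time windows above translate into simple date arithmetic. This is only a rough sketch: the function name is made up, and I approximate "two months" as 62 days, which is my own assumption and not a number from Google.

```python
from datetime import datetime, timedelta, timezone

# Windows taken from the text above. The 62 days are an approximation of
# "two months" chosen for this illustration only.
PARTIAL_DELETION_WINDOW = timedelta(hours=72)
FULL_DELETION_WINDOW = timedelta(days=62)


def deletion_milestones(requested_at):
    """Return the latest expected moments for both deletion stages."""
    return {
        "partial_deletion_by": requested_at + PARTIAL_DELETION_WINDOW,
        "full_deletion_by": requested_at + FULL_DELETION_WINDOW,
    }


# Example request time (made up).
requested_at = datetime(2019, 11, 21, 9, 0, tzinfo=timezone.utc)
milestones = deletion_milestones(requested_at)
for label, moment in milestones.items():
    print(label, moment.isoformat())
```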

Happy data deleting!
