BigQuery Export schema

30/11/2019

This article explains the format and schema of the Google Analytics for Firebase data that is exported to BigQuery.

Datasets

For each Firebase project that is linked to BigQuery, a single dataset named "analytics_" is added to your BigQuery project. Property ID refers to your Analytics Property ID, which you can find in your Analytics Settings in Firebase. Each app for which BigQuery exporting is enabled will export its data to that single dataset.

Tables

Within each dataset, a table is imported for each day of export. These tables have the format "events_YYYYMMDD". Additionally, a table is imported for events received throughout the current day.  This table is named “events_intraday_YYYYMMDD" and it is populated in real-time as events are collected.

Rows

Each row within a table corresponds to a single event uploaded by the Analytics SDK.

Columns

The columns within the export are listed below.

 

Field name Data type Description
App    
app_info RECORD A record of information on the app.
app_info.id STRING The package name or bundle ID of the app.
app_info.firebase_app_id STRING The Firebase App ID associated with the app
app_info.install_source STRING The store that installed the app.
app_info.version STRING The app's versionName (Android) or short bundle version.
Device    
device RECORD A record of device information.
device.category STRING The device category (mobile, tablet, desktop).
device.mobile_brand_name STRING The device brand name.
device.mobile_model_name STRING The device model name.
device.mobile_marketing_name STRING The device marketing name.
device.mobile_os_hardware_model STRING The device model information retrieved directly from the operating system.
device.operating_system STRING The operating system of the device.
device.operating_system_version STRING The OS version.
device.vendor_id STRING IDFV (present only if IDFA is not collected).
device.advertising_id STRING Advertising ID/IDFA.
device.language STRING The OS language.
device.time_zone_offset_seconds INTEGER The offset from GMT in seconds.
device.is_limited_ad_tracking BOOLEAN The device's Limit Ad Tracking setting.
Stream and platform    
stream_id STRING The numeric ID of the stream.
platform STRING The platform on which the app was built.
User    
user_first_touch_timestamp INTEGER The time (in microseconds) at which the user first opened the app.
user_id STRING The user ID set via the setUserId API.
user_pseudo_id STRING The pseudonymous id (e.g., app instance ID) for the user.
user_properties RECORD A repeated record of user properties set with the setUserProperty API.
user_properties.key STRING The name of the user property.
user_properties.value RECORD A record for the user property value.
user_properties.value.string_value STRING The string value of the user property.
user_properties.value.int_value INTEGER The integer value of the user property.
user_properties.value.double_value FLOAT The double value of the user property.
user_properties.value.float_value FLOAT This field is currently unused.
user_properties.value.set_timestamp_micros INTEGER The time (in microseconds) at which the user property was last set.
user_ltv RECORD A record of Lifetime Value information about the user. This field is not populated in intraday tables.
user_ltv.revenue FLOAT The Lifetime Value (revenue) of the user. This field is not populated in intraday tables.
user_ltv.currency STRING The Lifetime Value (currency) of the user. This field is not populated in intraday tables.
Campaign   Note: traffic_source attribution is based on cross-channel last click.
traffic_source RECORD Name of the traffic source used to acquire the user. This field is not populated in intraday tables.
traffic_source.name STRING The name of the marketing campaign that acquired the user. This field is not populated in intraday tables.
traffic_source.medium STRING The name of the medium (paid search, organic search, email, etc.) that acquired the user. This field is not populated in intraday tables.
traffic_source.source STRING The name of the network that acquired the user. This field is not populated in intraday tables.
Geo    
geo RECORD A record of the user's geographic information.
geo.continent STRING The continent from which events were reported, based on IP address.
geo.sub_continent STRING The subcontinent from which events were reported, based on IP address.
geo.country STRING The country from which events were reported, based on IP address.
geo.region STRING The region from which events were reported, based on IP address.
geo.metro STRING The metro from which events were reported, based on IP address.
geo.city STRING The city from which events were reported, based on IP address.
Event    
event_date STRING The date on which the event was logged (YYYYMMDD format in the registered timezone of your app).
event_timestamp INTEGER The time (in microseconds, UTC) at which the event was logged on the client.
event_previous_timestamp INTEGER The time (in microseconds, UTC) at which the event was previously logged on the client.
event_name STRING The name of the event.
event_params RECORD A repeated record of the parameters associated with this event.
event_params.key STRING The event parameter's key.
event_params.value RECORD A record of the event parameter's value.
event_params.value.string_value STRING The string value of the event parameter.
event_params.value.int_value INTEGER The integer value of the event parameter.
event_params.value.double_value FLOAT The double value of the event parameter.
event_params.value.float_value FLOAT The float value of the event parameter.  This field is currently unused.
event_value_in_usd FLOAT The currency-converted value (in USD) of the event's "value" parameter.
event_bundle_sequence_id INTEGER The sequential ID of the bundle in which these events were uploaded.
event_server_timestamp_offset INTEGER Timestamp offset between collection time and upload time in micros.
Web    
web_info RECORD A record of information for web data.
web_info.hostname STRING The hostname associated with the logged event.
web_info.browser STRING The browser in which the user viewed content.
web_info.browser_version STRING The version of the browser in which the user viewed content.

 

 

Use this script to migrate existing BigQuery datasets from the old export schema to the new one

  1. Log in to Cloud Platform Console >: Manager resources page.
  2. Open the project whose data you want to migrate, and click Activate Google Cloud Shell at the top of the page.
  3. When the shell opens, copy the script below to a file named migration_script.sql:
    1. Issue the command cat > migration_script.sql
    2. Copy and paste the script below into the shell.
    3. Press Ctrl+D to save and exit.

    Script (migration_script.sql):

      SELECT
      @date AS event_date,
      event.timestamp_micros AS event_timestamp,
      event.previous_timestamp_micros AS event_previous_timestamp,
      event.name AS event_name,
      event.value_in_usd  AS event_value_in_usd,
       user_dim.bundle_info.bundle_sequence_id AS event_bundle_sequence_id,
      user_dim.bundle_info.server_timestamp_offset_micros as event_server_timestamp_offset,
      (
      SELECT
        ARRAY_AGG(STRUCT(event_param.key AS key,
            STRUCT(event_param.value.string_value AS string_value,
              event_param.value.int_value AS int_value,
              event_param.value.double_value AS double_value, 
              event_param.value.float_value AS float_value) AS value))
      FROM
        UNNEST(event.params) AS event_param) AS event_params,
      user_dim.first_open_timestamp_micros AS user_first_touch_timestamp,
      user_dim.user_id AS user_id,
      user_dim.app_info.app_instance_id AS user_pseudo_id,
      "" AS stream_id,
      user_dim.app_info.app_platform AS platform,
      STRUCT( user_dim.ltv_info.revenue AS revenue,
        user_dim.ltv_info.currency AS currency ) AS user_ltv,
      STRUCT( user_dim.traffic_source.user_acquired_campaign AS name,
          user_dim.traffic_source.user_acquired_medium AS medium,
          user_dim.traffic_source.user_acquired_source AS source ) AS traffic_source,
      STRUCT( user_dim.geo_info.continent AS continent,
        user_dim.geo_info.country AS country,
        user_dim.geo_info.region AS region,
        user_dim.geo_info.city AS city ) AS geo,
      STRUCT( user_dim.device_info.device_category AS category,
        user_dim.device_info.mobile_brand_name,
        user_dim.device_info.mobile_model_name,
        user_dim.device_info.mobile_marketing_name,
        user_dim.device_info.device_model AS mobile_os_hardware_model,
        @platform AS operating_system,
        user_dim.device_info.platform_version AS operating_system_version,
        user_dim.device_info.device_id AS vendor_id,
        user_dim.device_info.resettable_device_id AS advertising_id,
        user_dim.device_info.user_default_language AS language,
        user_dim.device_info.device_time_zone_offset_seconds AS time_zone_offset_seconds,
        IF(user_dim.device_info.limited_ad_tracking, "Yes", "No") AS is_limited_ad_tracking ) AS device,
      STRUCT( user_dim.app_info.app_id AS id,
        @firebase_app_id  AS firebase_app_id,
        user_dim.app_info.app_version AS version,
        user_dim.app_info.app_store AS install_source ) AS app_info,
      (
      SELECT
        ARRAY_AGG(STRUCT(user_property.key AS key,
            STRUCT(user_property.value.value.string_value AS string_value,
              user_property.value.value.int_value AS int_value,
              user_property.value.value.double_value AS double_value,
              user_property.value.value.float_value AS float_value,
              user_property.value.set_timestamp_usec AS set_timestamp_micros ) AS value))
      FROM
        UNNEST(user_dim.user_properties) AS user_property) AS user_properties
    FROM
      `SCRIPT_GENERATED_TABLE_NAME`,
      UNNEST(event_dim) AS event
      
  4. Open a new shell, and copy the bash script below to a file named migration.sh:
    1. Issue the command cat > migration.sh
    2. Copy and paste the script below into the shell.
    3. Press Ctrl+D to save and exit.
    Modify the following script to include your Analytics property ID, BigQuery project ID, Firebase app ID, BigQuery dataset name, and the start and end dates of the data you want.

    Script (migration.sh):

    # Analytics Property ID for the Project. Find this in Analytics Settings in Firebase
    PROPERTY_ID=your Analytics property ID
    
    # Bigquery Export Project
    BQ_PROJECT_ID="your BigQuery Project ID" (e.g., "firebase-public-project")
    
    # Firebase App ID for the app.
    FIREBASE_APP_ID="your Firebase App ID" (e.g., "1:300830567303:ios:09b1ab1d3ca29bda")
    
    # Dataset to import from.
    BQ_DATASET="name of BigQuery dataset you want to import from" (e.g., "com_firebase_demo_IOS")
    
    # Platform
    PLATFORM="platform of the app. ANDROID or IOS"
    
    # Date range for which you want to run migration, [START_DATE,END_DATE] inclusive.
    START_DATE=20180324
    END_DATE=20180327
    
    # Do not modify the script below, unless you know what you are doing :)
    startdate=$(date -d"$START_DATE"  +%Y%m%d) || exit -1
    enddate=$(date -d"$END_DATE"  +%Y%m%d) || exit -1
    
    # Iterate through the dates.
    DATE="$startdate"
    while [ "$DATE" -le "$enddate" ]; do
    
            # BQ table constructed from above params.
            BQ_TABLE="$BQ_PROJECT_ID.$BQ_DATASET.app_events_$DATE"
    
            echo "Migrating $BQ_TABLE"
    
            cat migration_script.sql | sed -e "s/SCRIPT_GENERATED_TABLE_NAME/$BQ_TABLE/g" | bq query \
            --debug_mode \
            --allow_large_results \
            --noflatten_results \
            --use_legacy_sql=False \
            --destination_table analytics_$PROPERTY_ID.events_$DATE \
            --batch \
            --append_table \
            --parameter=firebase_app_id::$FIREBASE_APP_ID \
            --parameter=date::$DATE \
            --parameter=platform::$PLATFORM \
            --project_id=$BQ_PROJECT_ID
    
    
            temp=$(date -I -d "$DATE + 1 day")
            DATE=$(date -d "$temp" +%Y%m%d)
    
    done
    exit
    
    # END OF SCRIPT
    
    
  5. Open a new shell, and issue the command bash migration.sh

 

Old export schema

user_dim

Field Name Data Type Description
user_dim RECORD A record of user dimensions.
user_dim.user_id STRING The user ID set via the setUserId API.
user_dim.first_open_timestamp_micros INTEGER The time (in microseconds) at which the user first opened the app.
user_dim.user_properties RECORD A repeated record of user properties set with the setUserProperty API.
user_dim.user_properties.key STRING The name of the user property
user_dim.user_properties.value RECORD A record for information about the user property.
user_dim.user_properties.value.value RECORD A record for the user property value.
user_dim.user_properties.value.value.string_value STRING The string value of the user property.
user_dim.user_properties.value.value.int_value INTEGER The integer value of the user property.
user_dim.user_properties.value.value.double_value FLOAT The double value of the user property.
user_dim.user_properties.value.set_timestamp_usec INTEGER The time (in microseconds) at which the user property was last set.
user_dim.user_properties.value.index INTEGER The index (0-24) of the user property.
user_dim.device_info RECORD A record of device information.
user_dim.device_info.device_category STRING The device category (mobile, tablet, desktop).
user_dim.device_info.mobile_brand_name STRING The device brand name.
user_dim.device_info.mobile_model_name STRING The device model name.
user_dim.device_info.mobile_marketing_name STRING The device marketing name.
user_dim.device_info.device_model STRING The device model.
user_dim.device_info.platform_version STRING The OS version.
user_dim.device_info.device_id STRING IDFV (present only if IDFA is not available).
user_dim.device_info.resettable_device_id STRING Advertising ID/IDFA.
user_dim.device_info.user_default_language STRING The OS language.
user_dim.device_info.device_time_zone_offset_seconds INTEGER The offset from GMT in seconds.
user_dim.device_info.limited_ad_tracking BOOLEAN The device's Limit Ad Tracking setting.
user_dim.geo_info RECORD A record of the user's geographic information.
user_dim.geo_info.continent STRING The continent from which events were reported, based on IP address.
user_dim.geo_info.country STRING The country from which events were reported, based on IP address.
user_dim.geo_info.region STRING The region from which events were reported, based on IP address.
user_dim.geo_info.city STRING The city from which events were reported, based on IP address.
user_dim.app_info RECORD A record of information on the app.
user_dim.app_info.app_version STRING The app's versionName (Android) or short bundle version.
user_dim.app_info.app_instance_id STRING The unique id for this instance of the app.
user_dim.app_info.app_store STRING The store which installed this app.
user_dim.app_info.app_platform STRING The platform on which this app is running.
user_dim.traffic_source RECORD Name of the traffic source used to acquired the user. This field is not populated in intraday tables.
user_dim.traffic_source.user_acquired_campaign STRING The name of the marketing campaign which acquired the user. This field is not populated in intraday tables.
user_dim.traffic_source.user_acquired_medium STRING The name of the medium (paid search, organic search, email, etc.) which acquired the user. This field is not populated in intraday tables.
user_dim.traffic_source.user_acquired_source STRING The name of the network which acquired the user. This field is not populated in intraday tables.
user_dim.bundle_info RECORD A record of information regarding the bundle in which these events were uploaded.
user_dim.bundle_info.bundle_sequence_id INTEGER The sequential id of the bundle in which these events were uploaded.
user_dim.ltv_info RECORD A record of Lifetime Value information about this user. This field is not populated in intraday tables.
user_dim.ltv_info.revenue FLOAT The Lifetime Value (revenue) of this user. This field is not populated in intraday tables.
user_dim.ltv_info.currency STRING The Lifetime Value (currency) of this user. This field is not populated in intraday tables.
 

event_dim

Field Name Data Type Description
event_dim RECORD A repeated record of information pertaining to events in this bundle.
event_dim.date STRING The date on which this event was logged (YYYYMMDD format in the registered timezone of your app.)
event_dim.name STRING The name of this event.
event_dim.params RECORD A repeated record of the parameters associated with this event.
event_dim.params.key STRING The event parameter's key.
event_dim.params.value RECORD A record of the event parameter's value.
event_dim.params.value.string_value STRING The string value of the event parameter.
event_dim.params.value.int_value INTEGER The integer value of the event parameter.
event_dim.params.value.double_value FLOAT The double value of the event parameter.
event_dim.timestamp_micros INTEGER The time (in microseconds, UTC) at which this event was logged on the client.
event_dim.previous_timestamp_micros INTEGER The time (in microseconds, UTC) at which this event was previously logged on the client.

* Nguồn: Google Analytics