Insights & Analysis

Create Advanced Google Shopping Insights using Merchant Center BigQuery exports

Price competitiveness and product benchmark data, already available in the Merchant Center interface, is made available in Google BigQuery. Leveraging these exports, you can create advanced insights into your product catalog that can be used for automation as well.

  • Export product and brand level market insight data (based on all the merchants on Google Shopping / Merchant center) to Google BigQuery
  • Enrich your productfeed with the Merchant Center data to create valuable advanced insights and automations
  • BigQuery example query

Connecting Merchant Center to Google BigQuery

First step, you will need a Google Cloud Project with an active billing account. If not present already, you can check out these instructions.

The official Google documentation provides all the info to set up the data transfer service between Merchant Center and Google BigQuery. When the connection is set up, Merchant Center will run a daily export of several reports.

merchant-center-bigquery-export-flow.svg Source: Google

Merchant Center exports

Several reports can be exported by the transfer service. You don't have to export all the reports (can be configured when setting up the export).

Be aware that the Merchant Center exports are not that small in terms of GBs. For every available Google product category and country, it will export 10.000 top products for example. So it could generate some costs in your Google Cloud project. See the tips in the last chapter of this article.

The most important exports;

ReportTable nameDescription
Products and product issuesProduct_[ACCOUNT]Productdata provided to Merchant Center, enriched with information like status, item issues, Google product categories and (country) destinations
Price benchmarksProducts_PriceBenchmarks_[ACCOUNT]Based on the price competitiveness report in the Merchant Center Interface. When available, it provides a price benchmark on product level based on data of other retailers (matching based on GTIN).
Best Sellers - Top ProductsBestSellers_TopProducts_[ACCOUNT]A ranked list of top products (max 10.000 products per country and Google product category). Based on data of all merchants (Shopping ads and unpaid listings). The same product can be featured multiple times in this table. Per ranking country + multiple times per Google product (sub) categories.
Best Sellers - Top Products InventoryBestSellers_TopProducts_Inventory_[ACCOUNT-ID]Mapping between your product feed and top products. The same product-ID can exist multiple times times (but has a unique ranking-ID). Per ranking country and in multiple Google product categories.
Best Sellers - Top BrandsBestSellers_TopBrands_[ACCOUNT]A ranked list of top brands (10.000 per country and Google product category). Based on data of all merchants (Shopping ads and unpaid listings).

Also see this overview in the official Google documentation.

Some of the most interesting properties and definition;

  • price_benchmark_value: The average (click-weighted) price of a specific product, based on all merchants who advertise that same product on Shopping ads. Matching products between merchants is based on GTIN.
  • rank: The popularity of the product on Shopping ads, on Google product category and country level. Popularity is based on the estimated number of products sold. The rank updates daily, but data in the exports can be delayed up to 2 days.
  • previous_rank: The change in rank over the previous 7 days.
  • relative_demand.min / .max / .bucket: A product's current estimated demand in relation to the product with the highest popularity rank in the same category and country. Also includes buckets (e.g. Very High)
  • previous_relative_demand.min / .max / .bucket: Same as above, only calculated on the last 7 days.
  • price_range.min / price_range.max: The lower and upper pricerange without decimals. Doesn't include shipping costs.

Practical use cases & business value

Combining the different reports, you can think of the following use cases;

  • Product feed status: Get Merchant Center and Shopping errors and notifications related to your product feeds.
  • Product inventory optimization: Using the best seller reports, you can check which popular products are already in your inventory and what potential products could be added, based on the complete list of top products per Google product categories AND country.
  • Price and bid optimization: Using the price competitiveness benchmark information, you can check how you are doing in terms of price against competitors.
  • Trending products and brands: Identify trending products / brands based on the changes in best seller rank over the last X days / weeks / months. Differentiate per country
  • Product feed enrichment: Enrich you product feed with Google market insights data
  • Rule based automation: Based on the product feed enrichment, pause and enable campaigns / products in channels (e.g. Google Shopping) or adjust bids. For example based on price competitiveness.

Enrich product feed with Merchant Center data

Example query

To show you how the data can be combined, I've constructed a query that will join and combine data out of different Merchant Center tables on product level. Some remarks:

  • Google doesn't provide data for every product in your feed. Either a product is not in the bestsellers list or Google just doesn't provide or calculate it (not enough data and probably some other reasons). Or you have unique products that other merchants don't offer.
  • Use the BestSellers_TopProducts_Inventory as a mapping table to join the BestSellers_TopProducts data with your Product data by joining on rank_id (available in both tables)
  • The "original" product-ID as known in your systems is not directly available in the different exports. It's available in the product_id field but you have te split the value and select the last part since Google added additional information to it, example: online:nl:NL:<original-product-id>
  • The reports contain a lot of data. For example the best seller reports contains 10.000 best sellers in every Google product category and country. For efficiency reasons, try to select only the countries needed. Also important because the same product can be duplicated over different countries in the table Google provides.
  • The reports contain nested and repeated data, make sure you understand the table structures. For example the product errors and notifications are nested.

Example: Enrich product feed with benchmark and price competitiveness data

The following query will result in this table that you could match on your original productfeed and could serve as input for automation rules (on product level)

enrich-productfeed-with-merchent-center-data.PNG

Make sure you replace the dataset / table names with the correct values in your situation.

View code on GitHub

1-- Author: Krisjan Oldekamp
2-- https://stacktonic.com/article/create-advanced-google-shopping-insights-using-merchant-center-big-query-exports
3
4-- set variable to change the fetch date easily
5declare gmc_fetch_date date default date('2021-09-05');
6
7with
8-- get productfeed uploaded in gmc for specific date
9gmc_products as (
10    select
11        _partitiondate as date,
12        *
13    from
14        `dataset.products_<merchant-id>`
15    where
16        _partitiondate = gmc_fetch_date
17        and target_country in ('nl')
18),
19-- get mapping table to join rank (bestsellers) and productfeed
20gmc_product_inventory as (
21    select
22        *
23    from
24        `dataset.bestsellers_topproducts_inventory_<merchant-id>`
25    where
26        _partitiondate = gmc_fetch_date
27        and (regexp_contains(rank_id, ':nl:'))
28),
29-- get price benchmark information
30gmc_price_benchmarks as (
31    select
32        _partitiondate as date,
33        *
34    from
35        `dataset.products_pricebenchmarks_<merchant-id>`
36    where
37        _partitiondate = gmc_fetch_date
38        and country_of_sale in ('nl')  -- specify the countrycodes, since dataset countains all countries	
39    ),
40-- get bestseller information (if product ranks in bestseller list -> 10000 products per google product category and country).
41gmc_bestsellers as (
42    select  
43        _partitiondate as date,
44        inventory.product_id as product_id,
45        bestseller.rank_timestamp,
46        bestseller.rank_id,
47        bestseller.rank,
48        bestseller.previous_rank,
49        if(inventory.product_id is null,false,true) as product_in_inventory,
50        bestseller.ranking_country,
51        bestseller.ranking_category as ranking_category_id,
52        (select name from bestseller.ranking_category_path where locale = 'nl-nl') as ranking_category_path,
53        (select name from bestseller.product_title where locale = 'nl-nl') as product_title,
54        concat(gtins[safe_offset(0)],'|',gtins[safe_offset(1)],'|',gtins[safe_offset(2)],'|') as gtins,
55        bestseller.brand,
56        bestseller.google_brand_id,
57        bestseller.google_product_category as google_product_category_id,
58        (select name from bestseller.google_product_category_path where locale = 'nl-nl') as google_category_path,
59        bestseller.price_range.min as price_range_min,
60        bestseller.price_range.max as price_range_max,
61        bestseller.price_range.currency as price_range_currency,
62        bestseller.relative_demand.min as relative_demand_min,
63        bestseller.relative_demand.max as relative_demand_max,
64        bestseller.relative_demand.bucket as relative_demand_bucket,
65        bestseller.previous_relative_demand.min as previous_relative_demand_min,
66        bestseller.previous_relative_demand.max as previous_relative_demand_max,
67        bestseller.previous_relative_demand.bucket as previous_relative_demand_bucket
68    from 
69        `dataset.bestsellers_topproducts_<merchant-id>` as bestseller
70    -- only join products when available in the productfeed and the bestsellers list
71    inner join gmc_product_inventory as inventory
72        on bestseller.rank_id = inventory.rank_id
73    where 
74        _partitiondate = gmc_fetch_date
75        and ranking_country in ('nl') 
76)
77
78    -- join all the data on product level
79select
80    products.date,
81    products.product_id as product_id_gms,
82    split(products.product_id, ':')[safe_offset(3)] as product_id,
83    products.target_country,
84    products.title,
85    products.price.value,
86    benchmarks.price_benchmark_value,
87    (cast(benchmarks.price_benchmark_value as float64) - cast(products.price.value as float64)) as price_benchmark_diff,
88    benchmarks.price_benchmark_currency,
89    bestsellers.rank,
90    bestsellers.previous_rank,
91    bestsellers.price_range_min,
92    bestsellers.price_range_max,
93    bestsellers.relative_demand_min,
94    bestsellers.relative_demand_max,
95    bestsellers.relative_demand_bucket,
96    bestsellers.previous_relative_demand_min,
97    bestsellers.previous_relative_demand_max,
98    bestsellers.previous_relative_demand_bucket
99from gmc_products as products
100-- join price benchmark data
101left join gmc_price_benchmarks as benchmarks
102    on products.product_id = benchmarks.product_id
103    and products.date = benchmarks.date
104-- join bestseller data
105left join gmc_bestsellers as bestsellers
106    on products.product_id = bestsellers.product_id
107    and products.date = bestsellers.date

Closing thoughts

Using the BigQuery Merchant Center exports, you can generate insights at scale and use it for automation. But keep in mind, the exports hold a lot of data, so it could generate some costs in your Google Cloud project (and you probably won't need all the data). The tables are date partitioned, so you could write a Python script or use Airflow to delete partitions older than X days to keep the amount of data in BigQuery (probably after you've created some aggregate reports).

Did you like this article? Stay hydrated and get notified when new articles, tutorials or other interesting updates are published. No spam and you can unsubscribe at any time. You can also check my Twitter account for updates and other interesting material.