Create Advanced Google Shopping Insights using Merchant Center BigQuery exports
Price competitiveness and product benchmark data, already available in the Merchant Center interface, is made available in Google BigQuery. Leveraging these exports, you can create advanced insights into your product catalog that can be used for automation as well.
- Export product and brand level market insight data (based on all the merchants on Google Shopping / Merchant center) to Google BigQuery
- Enrich your productfeed with the Merchant Center data to create valuable advanced insights and automations
- BigQuery example query
Connecting Merchant Center to Google BigQuery
First step, you will need a Google Cloud Project with an active billing account. If not present already, you can check out these instructions.
The official Google documentation provides all the info to set up the data transfer service between Merchant Center and Google BigQuery. When the connection is set up, Merchant Center will run a daily export of several reports.
Merchant Center exports
Several reports can be exported by the transfer service. You don't have to export all the reports (can be configured when setting up the export).
Be aware that the Merchant Center exports are not that small in terms of GBs. For every available Google product category and country, it will export 10.000 top products for example. So it could generate some costs in your Google Cloud project. See the tips in the last chapter of this article.
The most important exports;
|Products and product issues||Product_[ACCOUNT]||Productdata provided to Merchant Center, enriched with information like status, item issues, Google product categories and (country) destinations|
|Price benchmarks||Products_PriceBenchmarks_[ACCOUNT]||Based on the price competitiveness report in the Merchant Center Interface. When available, it provides a price benchmark on product level based on data of other retailers (matching based on GTIN).|
|Best Sellers - Top Products||BestSellers_TopProducts_[ACCOUNT]||A ranked list of top products (max 10.000 products per country and Google product category). Based on data of all merchants (Shopping ads and unpaid listings). The same product can be featured multiple times in this table. Per ranking country + multiple times per Google product (sub) categories.|
|Best Sellers - Top Products Inventory||BestSellers_TopProducts_Inventory_[ACCOUNT-ID]||Mapping between your product feed and top products. The same product-ID can exist multiple times times (but has a unique ranking-ID). Per ranking country and in multiple Google product categories.|
|Best Sellers - Top Brands||BestSellers_TopBrands_[ACCOUNT]||A ranked list of top brands (10.000 per country and Google product category). Based on data of all merchants (Shopping ads and unpaid listings).|
Also see this overview in the official Google documentation.
Some of the most interesting properties and definition;
price_benchmark_value: The average (click-weighted) price of a specific product, based on all merchants who advertise that same product on Shopping ads. Matching products between merchants is based on GTIN.
rank: The popularity of the product on Shopping ads, on Google product category and country level. Popularity is based on the estimated number of products sold. The rank updates daily, but data in the exports can be delayed up to 2 days.
previous_rank: The change in rank over the previous 7 days.
relative_demand.min / .max / .bucket: A product's current estimated demand in relation to the product with the highest popularity rank in the same category and country. Also includes buckets (e.g. Very High)
previous_relative_demand.min / .max / .bucket: Same as above, only calculated on the last 7 days.
price_range.max: The lower and upper pricerange without decimals. Doesn't include shipping costs.
Practical use cases & business value
Combining the different reports, you can think of the following use cases;
- Product feed status: Get Merchant Center and Shopping errors and notifications related to your product feeds.
- Product inventory optimization: Using the best seller reports, you can check which popular products are already in your inventory and what potential products could be added, based on the complete list of top products per Google product categories AND country.
- Price and bid optimization: Using the price competitiveness benchmark information, you can check how you are doing in terms of price against competitors.
- Trending products and brands: Identify trending products / brands based on the changes in best seller rank over the last X days / weeks / months. Differentiate per country
- Product feed enrichment: Enrich you product feed with Google market insights data
- Rule based automation: Based on the product feed enrichment, pause and enable campaigns / products in channels (e.g. Google Shopping) or adjust bids. For example based on price competitiveness.
To show you how the data can be combined, I've constructed a query that will join and combine data out of different Merchant Center tables on product level. Some remarks:
- Google doesn't provide data for every product in your feed. Either a product is not in the bestsellers list or Google just doesn't provide or calculate it (not enough data and probably some other reasons). Or you have unique products that other merchants don't offer.
- Use the
BestSellers_TopProducts_Inventoryas a mapping table to join the
BestSellers_TopProductsdata with your
Productdata by joining on
rank_id(available in both tables)
- The "original" product-ID as known in your systems is not directly available in the different exports. It's available in the
product_idfield but you have te split the value and select the last part since Google added additional information to it, example:
- The reports contain a lot of data. For example the best seller reports contains 10.000 best sellers in every Google product category and country. For efficiency reasons, try to select only the countries needed. Also important because the same product can be duplicated over different countries in the table Google provides.
- The reports contain nested and repeated data, make sure you understand the table structures. For example the product errors and notifications are nested.
Example: Enrich product feed with benchmark and price competitiveness data
The following query will result in this table that you could match on your original productfeed and could serve as input for automation rules (on product level)
Make sure you replace the dataset / table names with the correct values in your situation.
View code on GitHub
1-- Author: Krisjan Oldekamp 2-- https://stacktonic.com/article/create-advanced-google-shopping-insights-using-merchant-center-big-query-exports 3 4-- set variable to change the fetch date easily 5declare gmc_fetch_date date default date('2021-09-05'); 6 7with 8-- get productfeed uploaded in gmc for specific date 9gmc_products as ( 10 select 11 _partitiondate as date, 12 * 13 from 14 `dataset.products_<merchant-id>` 15 where 16 _partitiondate = gmc_fetch_date 17 and target_country in ('nl') 18), 19-- get mapping table to join rank (bestsellers) and productfeed 20gmc_product_inventory as ( 21 select 22 * 23 from 24 `dataset.bestsellers_topproducts_inventory_<merchant-id>` 25 where 26 _partitiondate = gmc_fetch_date 27 and (regexp_contains(rank_id, ':nl:')) 28), 29-- get price benchmark information 30gmc_price_benchmarks as ( 31 select 32 _partitiondate as date, 33 * 34 from 35 `dataset.products_pricebenchmarks_<merchant-id>` 36 where 37 _partitiondate = gmc_fetch_date 38 and country_of_sale in ('nl') -- specify the countrycodes, since dataset countains all countries 39 ), 40-- get bestseller information (if product ranks in bestseller list -> 10000 products per google product category and country). 41gmc_bestsellers as ( 42 select 43 _partitiondate as date, 44 inventory.product_id as product_id, 45 bestseller.rank_timestamp, 46 bestseller.rank_id, 47 bestseller.rank, 48 bestseller.previous_rank, 49 if(inventory.product_id is null,false,true) as product_in_inventory, 50 bestseller.ranking_country, 51 bestseller.ranking_category as ranking_category_id, 52 (select name from bestseller.ranking_category_path where locale = 'nl-nl') as ranking_category_path, 53 (select name from bestseller.product_title where locale = 'nl-nl') as product_title, 54 concat(gtins[safe_offset(0)],'|',gtins[safe_offset(1)],'|',gtins[safe_offset(2)],'|') as gtins, 55 bestseller.brand, 56 bestseller.google_brand_id, 57 bestseller.google_product_category as google_product_category_id, 58 (select name from bestseller.google_product_category_path where locale = 'nl-nl') as google_category_path, 59 bestseller.price_range.min as price_range_min, 60 bestseller.price_range.max as price_range_max, 61 bestseller.price_range.currency as price_range_currency, 62 bestseller.relative_demand.min as relative_demand_min, 63 bestseller.relative_demand.max as relative_demand_max, 64 bestseller.relative_demand.bucket as relative_demand_bucket, 65 bestseller.previous_relative_demand.min as previous_relative_demand_min, 66 bestseller.previous_relative_demand.max as previous_relative_demand_max, 67 bestseller.previous_relative_demand.bucket as previous_relative_demand_bucket 68 from 69 `dataset.bestsellers_topproducts_<merchant-id>` as bestseller 70 -- only join products when available in the productfeed and the bestsellers list 71 inner join gmc_product_inventory as inventory 72 on bestseller.rank_id = inventory.rank_id 73 where 74 _partitiondate = gmc_fetch_date 75 and ranking_country in ('nl') 76) 77 78 -- join all the data on product level 79select 80 products.date, 81 products.product_id as product_id_gms, 82 split(products.product_id, ':')[safe_offset(3)] as product_id, 83 products.target_country, 84 products.title, 85 products.price.value, 86 benchmarks.price_benchmark_value, 87 (cast(benchmarks.price_benchmark_value as float64) - cast(products.price.value as float64)) as price_benchmark_diff, 88 benchmarks.price_benchmark_currency, 89 bestsellers.rank, 90 bestsellers.previous_rank, 91 bestsellers.price_range_min, 92 bestsellers.price_range_max, 93 bestsellers.relative_demand_min, 94 bestsellers.relative_demand_max, 95 bestsellers.relative_demand_bucket, 96 bestsellers.previous_relative_demand_min, 97 bestsellers.previous_relative_demand_max, 98 bestsellers.previous_relative_demand_bucket 99from gmc_products as products 100-- join price benchmark data 101left join gmc_price_benchmarks as benchmarks 102 on products.product_id = benchmarks.product_id 103 and products.date = benchmarks.date 104-- join bestseller data 105left join gmc_bestsellers as bestsellers 106 on products.product_id = bestsellers.product_id 107 and products.date = bestsellers.date
Using the BigQuery Merchant Center exports, you can generate insights at scale and use it for automation. But keep in mind, the exports hold a lot of data, so it could generate some costs in your Google Cloud project (and you probably won't need all the data). The tables are date partitioned, so you could write a Python script or use Airflow to delete partitions older than X days to keep the amount of data in BigQuery (probably after you've created some aggregate reports).