How to set up error alerts for Google Tag Manager Server on App Engine
Running Google Tag Manager (GTM) Server in Google App Engine? Learn how to set up alerts that fire when 4xx and/or 5xx errors exceed a certain threshold.
When using GTM Server in production, it's important that the service runs reliably and that, when it does not, you're notified straight away. This tutorial will show you how to:
- Set up alerts when 4xx and 5xx errors exceed a certain threshold
- Use Google Cloud Monitoring to do so
- Send notifications by email, SMS, Slack or other channels
What is Google Cloud Monitoring?
Google Cloud Monitoring gives you (performance) insights into the services you run within Google Cloud. Additionally, it enables you to create alerts based on all sorts of (performance) metrics.
You can set up alerts by creating an alerting policy. You can do this in the drag-and-drop interface of the GCP console, or you can upload a policy file using the Monitoring API (not all types of alerting policies and rules can be configured in the GCP interface).
When an incident occurs, alerts can be sent to different notification channels like:
- Email / SMS
- Slack
- Webhooks
- Cloud Pub/Sub
Create alerts for Google App Engine / GTM Server
Google Tag Manager Server can (and often will) handle a lot of traffic. The ratio of 4xx and 5xx errors to all responses can be a good indicator of your GTM Server's stability. For example, a high share of 5xx errors can point to server stability or scaling issues.
Our goal was to set up a policy that triggers when the ratio of 4xx and 5xx errors to the total number of requests exceeds a certain threshold. We tried to drag and drop the related metrics to create this policy in the interface, but we didn't succeed.
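To make the ratio concrete, here is a minimal Python sketch of the calculation for a single time window. The function name error_ratio and the sample counts are made up for illustration; Cloud Monitoring does this calculation itself on the response_count metric, so this is only meant to show what is being compared against the threshold.

```python
def error_ratio(counts_by_status: dict[int, int]) -> float:
    """Return the share of 4xx and 5xx responses among all responses."""
    total = sum(counts_by_status.values())
    if total == 0:
        return 0.0  # no traffic in the window, so there is no ratio to report
    errors = sum(
        count
        for status, count in counts_by_status.items()
        if 400 <= status < 600
    )
    return errors / total


# Hypothetical response counts for one 5-minute window.
window = {200: 9800, 204: 50, 404: 90, 500: 60}
ratio = error_ratio(window)
should_alert = ratio > 0.01  # 0.01 corresponds to a 1% threshold
print(f"error ratio: {ratio:.3%}, alert: {should_alert}")
```

In this sample, 150 of the 10,000 responses are errors, so the ratio is above a 1% threshold.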
Luckily, I came across this example in the Google documentation, explaining how to set up a metric-ratio policy using the Monitoring API (and indeed, it's currently not possible to set this up by dragging and dropping in the interface).
So caution! When you want to update your policy, make sure you do it through the Monitoring API and not in the GCP interface. Editing it in the interface seems to overwrite some of the settings, after which the policy no longer triggers any alerts.
I've adjusted the policy from the example a bit. The policy will:
- Take both 4xx and 5xx errors into account.
- Use an error threshold of 1% (thresholdValue). However, if you have a lot of traffic, 1% of all responses is still a lot of errors, so adjust it to your specific setup.
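To get a feel for what 1% means at different traffic levels, the arithmetic can be sketched as follows. The helper errors_at_threshold and the request volumes are hypothetical; plug in your own numbers per alignment window.

```python
THRESHOLD = 0.01  # same value as thresholdValue in the policy


def errors_at_threshold(requests_per_window: int, threshold: float = THRESHOLD) -> int:
    """Number of error responses that corresponds to the threshold ratio."""
    return round(requests_per_window * threshold)


# Hypothetical request volumes per 5-minute window.
for requests in (1_000, 100_000, 10_000_000):
    limit = errors_at_threshold(requests)
    print(f"{requests:>10} requests: alert once errors exceed {limit}")
```

If the absolute number at your traffic level is more than you are comfortable with, a lower thresholdValue may suit your setup better.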
Save the policy below to your local filesystem; in this tutorial we've named it gtm_server_alerting_policy.json.
Make sure you replace <your-project-id> with your GCP project ID.
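If you'd rather not edit the file by hand, the replacement can be scripted. The following is a minimal sketch, not part of the original setup: the function fill_project_id and the project ID my-gtm-project are made up, and the sample string is a shortened stand-in for the real policy file. Parsing the result as JSON also catches a malformed policy before you upload it.

```python
import json

PLACEHOLDER = "<your-project-id>"


def fill_project_id(policy_text: str, project_id: str) -> str:
    """Replace every placeholder, then confirm the result parses as JSON."""
    filled = policy_text.replace(PLACEHOLDER, project_id)
    json.loads(filled)  # raises json.JSONDecodeError if the policy is malformed
    return filled


# Shortened stand-in for the policy file; "my-gtm-project" is a made-up ID.
sample = '{"filter": "project=\\"<your-project-id>\\" AND resource.type=\\"gae_app\\""}'
print(fill_project_id(sample, "my-gtm-project"))
```

For the real file, you would read gtm_server_alerting_policy.json as text, pass it through the function, and write the result back.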
Next, you can use the Cloud Shell to upload the policy file:
- In the Google Cloud Console, open the Cloud Shell (icon in the top right corner of the interface). The Cloud Shell opens at the bottom of your screen.
- Click the "Open Editor" button (at the top of the Shell window) and upload the gtm_server_alerting_policy.json file.
- Click the "Open Terminal" button and use the following command to create the policy:
gcloud alpha monitoring policies create --policy-from-file="gtm_server_alerting_policy.json"
- The policy is now created and visible in the GCP interface (Monitoring > Alerting).
- Configure your notification channels (like email or Slack) in the GCP interface (also under Monitoring > Alerting).
- You're all set!
gtm_server_alerting_policy.json
{
  "displayName": "HTTP error count exceeds 1 percent for GTM Server / App Engine",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "Ratio: HTTP 4xx and 5xx error-response counts / All HTTP response counts",
      "conditionThreshold": {
        "filter": "metric.label.response_code>=\"400\" AND metric.label.response_code<\"600\" AND metric.type=\"appengine.googleapis.com/http/server/response_count\" AND project=\"<your-project-id>\" AND resource.type=\"gae_app\"",
        "aggregations": [
          {
            "alignmentPeriod": "300s",
            "crossSeriesReducer": "REDUCE_SUM",
            "groupByFields": [
              "project",
              "resource.label.module_id",
              "resource.label.version_id"
            ],
            "perSeriesAligner": "ALIGN_DELTA"
          }
        ],
        "denominatorFilter": "metric.type=\"appengine.googleapis.com/http/server/response_count\" AND project=\"<your-project-id>\" AND resource.type=\"gae_app\"",
        "denominatorAggregations": [
          {
            "alignmentPeriod": "300s",
            "crossSeriesReducer": "REDUCE_SUM",
            "groupByFields": [
              "project",
              "resource.label.module_id",
              "resource.label.version_id"
            ],
            "perSeriesAligner": "ALIGN_DELTA"
          }
        ],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 0.01,
        "duration": "0s",
        "trigger": {
          "count": 1
        }
      }
    }
  ]
}
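Before uploading an adjusted policy, a quick local sanity check can catch obvious mistakes. The sketch below is our own and not an official validator: the function check_ratio_condition is hypothetical, and the rules it applies are assumptions. As far as we understand the API documentation, the numerator and denominator aggregations should use the same alignment period; the check that thresholdValue lies between 0 and 1 is simply our own rule of thumb for a ratio.

```python
def check_ratio_condition(policy: dict) -> list[str]:
    """Return a list of problems found in the ratio conditions; empty means none."""
    problems = []
    for condition in policy.get("conditions", []):
        threshold = condition.get("conditionThreshold", {})
        if "denominatorFilter" not in threshold:
            problems.append("no denominatorFilter, so this is not a ratio condition")
            continue
        numerator = threshold.get("aggregations", [])
        denominator = threshold.get("denominatorAggregations", [])
        num_periods = [agg.get("alignmentPeriod") for agg in numerator]
        den_periods = [agg.get("alignmentPeriod") for agg in denominator]
        if num_periods != den_periods:
            problems.append("numerator and denominator alignment periods differ")
        value = threshold.get("thresholdValue")
        if not isinstance(value, (int, float)) or not 0 < value < 1:
            problems.append("thresholdValue should be a fraction between 0 and 1")
    return problems


# Trimmed-down stand-in for the policy above (filters shortened for brevity).
sample_policy = {
    "conditions": [
        {
            "conditionThreshold": {
                "filter": "numerator filter",
                "aggregations": [{"alignmentPeriod": "300s"}],
                "denominatorFilter": "denominator filter",
                "denominatorAggregations": [{"alignmentPeriod": "300s"}],
                "thresholdValue": 0.01,
            }
        }
    ]
}
print(check_ratio_condition(sample_policy) or "no problems found")
```

To check the real file, load gtm_server_alerting_policy.json with json.load and pass the resulting dictionary to the function.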