
Google Cloud Monitoring Log-based Metrics and the Misunderstood resource.type

I've got a short one today. It's mainly a small issue that might bother you. At first, I thought the GCP console was just broken again, but it turns out I was simply using it wrong this time. Have you ever created a log-based metric and then tried to find it in Metrics Explorer? The UI is a bit misleading: it lists every GCP resource there is. Which one is the correct one?

[Screenshot: Metrics Explorer for Google Cloud Monitoring log-based metrics]

You might say: well, just switch off that Show only active resources & metrics toggle, you silly! No, no. What if the data simply isn't there yet? What if you want to create an alert that fires as soon as the very first data point arrives? In that case, there is a time frame with no data at all.

But what's wrong? Why so many resources? First, go back to how the metric was created. Let's say you have the following filter:

jsonPayload.response.statusCode>=500  
severity > "warning"  
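
If you happen to manage the metric with Terraform, the same filter typically lives in a google_logging_metric resource. Here is a minimal sketch; the metric name warehouseLoad matches the one referenced in the alert policy later in this post, everything else is an assumption about your setup:

resource "google_logging_metric" "warehouse_load" {
  # Counter metric over the log entries matched by the filter above
  name   = "warehouseLoad"
  filter = "jsonPayload.response.statusCode>=500 AND severity>\"warning\""

  metric_descriptor {
    metric_kind = "DELTA"
    value_type  = "INT64"
  }
}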

Go to the first log entry displayed by the filter in Logs Explorer. Expand the JSON and check the resource field and its type:

resource: {  
  labels: {}  
  type: "cloud_run_revision"  
}  

And that's basically it. The problem with the UI is that it doesn't know which resource.type will be matched, so it suggests it could be any of them. Let's see what happens if you add the correct resource.type to the filter:

jsonPayload.response.statusCode>=500  
severity > "warning"  
resource.type="cloud_run_revision"  

This means Google Cloud Monitoring now knows we are working just with Cloud Run and shows only one metric in the UI:

[Screenshot: Metrics Explorer for Google Cloud Monitoring log-based metrics, now showing a single resource]
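
If you went the Terraform route sketched above, this just means extending the filter argument of the google_logging_metric resource (again, a sketch under the same assumptions):

  filter = "jsonPayload.response.statusCode>=500 AND severity>\"warning\" AND resource.type=\"cloud_run_revision\""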

Alert policy issues

This can also bite you when you are creating alert policies. Let's roll back the resource.type filter and prepare the following Terraform code:

resource "google_monitoring_alert_policy" "delivery_alert_policy" {  
 combiner           = "OR"  
 display_name       = "Load Error - warehouseLoad"  
 enabled            = false  
 notification_channels = []  
 project            = …  
 user_labels        = {}  
 conditions {  
     display_name = "Load Error - logging/user/warehouseLoad"  
  
     condition_threshold {  
         comparison   = "COMPARISON_GT"  
         duration     = "0s"  
         filter       =   
"metric.type=\"logging.googleapis.com/user/warehouseLoad\""  
         threshold_value = 0  
  
         aggregations {  
             alignment_period  = "3600s"  
             cross_series_reducer = "REDUCE_SUM"  
             group_by_fields   = []  
             per_series_aligner   = "ALIGN_DELTA"  
         }  
  
         trigger {  
             count   = 1  
             percent = 0  
         }  
     }  
 }  
}

The filter in the condition_threshold block plays a slightly different role here. If I leave it like this and try to apply the setup, I get the following message:

│ Error: Error creating AlertPolicy: googleapi: Error 400: Field alert_policy.conditions[0].condition_threshold.filter had an invalid value of "metric.type="logging.googleapis.com/user/warehouseLoad"": must specify a restriction on "resource.type" in the filter; see "https://cloud.google.com/monitoring/api/resources" for a list of available resource types.

Makes sense, but what if I put anything other than cloud_run_revision in there? Will I be allowed to do that? Let's try resource.type="metric", which sounds like a resource that might fit:

filter = "metric.type=\"logging.googleapis.com/user/warehouseLoad\" AND resource.type=\"metric\""

And after terraform apply:

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.  

Sounds like it's done, but you may be unpleasantly surprised: there is no data matching the alert policy filter. The resource type can legitimately differ when the log filter matches multiple resources, but that's rarely the case. In most cases, the resource type in your alert policy filter needs to match the resource type of the logs matched by your Logs Explorer filter.

The correct alert policy in this example looks like the following:

resource "google_monitoring_alert_policy" "delivery_alert_policy" {  
     combiner           = "OR"  
     display_name       = "Load Error - warehouseLoad"  
     enabled            = false  
     notification_channels = []  
     project            = …  
     user_labels        = {}
     conditions {  
          display_name = "Load Error - logging/user/warehouseLoad"  
  
          condition_threshold {  
                     comparison  = "COMPARISON_GT"  
                     duration    = "0s"  
                     filter      =   
"metric.type=\"logging.googleapis.com/user/warehouseLoad\" AND   
resource.type=\"cloud_run_revision\""  
            threshold_value = 0  
  
                        aggregations {  
                              alignment_period = "3600s"  
                              cross_series_reducer = "REDUCE_SUM"  
                              group_by_fields  = []  
                              per_series_aligner   = "ALIGN_DELTA"  
                        }  
  
                        trigger {  
                              count   = 1  
                              percent = 0  
                        }  
             }  
       }  
}

And that's all. It's simple, yet it can get rather tricky, especially if you don't check your logs first. Hopefully, I spared you a few minutes of googling. If you have any questions or additional info, please comment and subscribe.

Martin Beránek
DevOps Team Lead

Martin spent the last few years working as an architect of cloud solutions. His main focus since he joined Ackee has been implementing procedures to speed up the whole development process.
