
Send Messages From Pub/Sub To BigQuery Cheaper with Cloud Run

EDIT NOTE: On 28 July 2022, GCP introduced a new option for pushing messages from Pub/Sub directly to BigQuery. If you need to customize the data transfer, this blog post might still be useful.

The last time I wrote about this topic, I mentioned how expensive the Dataflow template is for moving messages from a Pub/Sub subscription to a BigQuery table. My solution was to have a Cloud Function subscribe to the Pub/Sub topic. This works fine for most cases, but we hit the limit on the maximum size of a submitted message, which is only 10 MB for a Cloud Function. So what can you do to accept larger messages?

For some reason, we even had issues with messages of only 300 KB. The log looks like this:

Function execution could not start, status: 'request too large'

The log does not say how large the message was, and it is hard to determine the actual size. Is there anything you can do about the issue? Currently, there seems to be no way to raise the message size limit. The only thing you can do is use a different consumer.

For this reason, I have prepared an implementation of the same thing with Cloud Run. The maximum message size for Cloud Run is 32 MB. The implementation uses an OIDC token for service account (SA) authorization. Therefore, the service is not open to the public and can be considered as safe as a Cloud Function.

The Terraform code to enable SA usage looks like this:

resource "google_pubsub_subscription" "default" {
  name  = "pubsub_to_bq_${split(".", var.bigquery_table)[2]}_${lower(random_string.random.result)}"
  topic = var.topic_name

  push_config {
    oidc_token {
      service_account_email = google_service_account.sa.email
    }
    push_endpoint = one(google_cloud_run_service.default.status)["url"]
  }
}
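For the OIDC token in the subscription above to be accepted, the service account also needs permission to invoke the Cloud Run service. The following is a minimal sketch of such a binding, not necessarily how the module does it. The resource label `invoker` is my own choice; the references to `google_cloud_run_service.default` and `google_service_account.sa` assume the same resources as in the subscription.

```hcl
# Sketch only: allow the push subscription's service account to call the service.
# The label "invoker" is hypothetical; the referenced resources are assumed to
# be the ones used by the subscription.
resource "google_cloud_run_service_iam_member" "invoker" {
  project  = google_cloud_run_service.default.project
  location = google_cloud_run_service.default.location
  service  = google_cloud_run_service.default.name
  role     = "roles/run.invoker"
  member   = "serviceAccount:${google_service_account.sa.email}"
}
```

As long as `allUsers` is not granted the same role, unauthenticated requests to the endpoint are rejected, which is what keeps the service closed to the public.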

One last thing: you have to build the Docker image yourself and push it to the gcr.io registry. At the time of writing, GCP does not allow running images straight from Docker Hub. I also checked for possible billing increases, but with such a short container runtime you will probably stay within the free tier.

Hopefully, it is a Terraform module you might enjoy, and if you are interested, it's on GitHub.
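To illustrate where the pushed image ends up, here is a rough sketch of a Cloud Run service definition pointing at gcr.io. The variables `var.project` and `var.region`, the service name, and the image name `pubsub-to-bq` are all placeholders I made up for this example; use whatever you actually pushed.

```hcl
# Sketch only: a Cloud Run service running an image from gcr.io.
# var.project, var.region, the service name and the image name are
# hypothetical placeholders.
resource "google_cloud_run_service" "default" {
  name     = "pubsub-to-bq"
  project  = var.project
  location = var.region

  template {
    spec {
      containers {
        # The image must already exist in the registry before applying.
        image = "gcr.io/${var.project}/pubsub-to-bq:latest"
      }
    }
  }

  # Send all traffic to the latest revision.
  traffic {
    percent         = 100
    latest_revision = true
  }
}
```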

Martin Beránek
DevOps Team Lead
Martin has spent the last few years working as an architect of cloud solutions. His main focus ever since he joined Ackee has been implementing procedures to speed up the whole development process.
