This course will demonstrate some of the more advanced options that are available in Google Cloud Pub/Sub. These options include filtering and ordering messages, creating and enforcing schemas, as well as replaying previously delivered messages.
Learning Objectives
- Filtering and ordering Pub/Sub messages
- Creating and enforcing message schemas
- Handling duplicate or undeliverable messages
- Replaying and purging messages
- Monitoring your topics for problems
Intended Audience
- GCP Developers
- GCP Data Engineers
- Anyone preparing for a Google Cloud certification (such as the Professional Data Engineer exam)
Prerequisites
- Some experience with Cloud Pub/Sub
- Access to a Google Cloud Platform account is recommended
In Cloud Pub/Sub, you can format your message data however you want, as long as it can be encoded into a string. However, this free-form nature introduces some problems. What happens if your subscribers are expecting a JSON object and they get a plain string instead? It would be ideal if you could define and enforce a specific format. Well, you can do just that with a message schema.
A message schema creates an enforced contract between the publisher and subscribers. It makes it much easier for multiple people and teams to share the same topics. Everyone knows and understands the format that the message data must follow. If the schema defines a JSON object, then you cannot publish a message containing just a string.
Now, you cannot assign a schema to a topic that has already been created. You must create your schema first, and then create the topic that will use it. This also means you can reuse schemas and enforce them across multiple topics. So it is possible to create a single, company-wide standard for all messages.
Schemas can be defined using either the Apache Avro format (which uses JSON) or the Protocol Buffers format (which is Google’s method of serializing structured data). Attempting to publish a message that does not validate against the topic's schema will produce an error, and the operation will fail.
So now I want to show you how to create and enforce a message schema.
The first step is to create the schema. I’m going to call it “schema-1”. Next, I have two options. I can use the Avro format, or I can use the Protocol Buffer format. I am going to pick Avro, since most viewers are probably already familiar with JSON.
The default provided schema works, but I am going to change it slightly to make it easier to understand. So you can see it has two fields. The first is a name field which will contain a string. The second is a user ID field which will contain a number. Any message published will need to have these two fields.
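A schema along the lines of the one described here might look like the following sketch. The record name `User`, the field names `name` and `user_id`, and the choice of `int` for the number are assumptions for illustration; the demo does not spell out the exact definition. The snippet only parses the Avro definition with Python's standard library to confirm it is well-formed JSON. It does not contact Pub/Sub.

```python
import json

# Hypothetical Avro schema with the two fields described in the demo.
# The record name and field names are assumptions, not taken from the video.
AVRO_SCHEMA = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "user_id", "type": "int"}
  ]
}
"""

# An Avro schema definition is itself JSON, so it can be parsed directly.
schema = json.loads(AVRO_SCHEMA)

# Map each field name to its declared Avro type.
field_types = {field["name"]: field["type"] for field in schema["fields"]}
print(field_types)
```

A definition like this is what you would paste into the schema editor before clicking “Validate”.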
To make sure that you created a valid schema, you can click on the “Validate” button here.
Now that I have a schema, I can create a topic. I’ll call it “topic-4”. Now I need to be sure to check “Use a schema”. If I forget to do this, I am going to have to delete the topic and create another. You cannot go back and add a schema later.
Now I need to pick the schema. And I’ll leave the message encoding as JSON.
So this will create the topic and it’s going to enforce the specified message format on all messages. Any message I try to publish must validate against the schema.
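Outside the console, the same choices (which schema, which encoding) end up in the topic's schema settings. As a rough sketch, the helper below builds the JSON body you would send when creating a topic through the Pub/Sub REST API. The function name `topic_request_body` and the project ID `my-project` are hypothetical, and the code only builds and prints the body. It does not send any request.

```python
import json

# Message encodings a topic with a schema can use.
ALLOWED_ENCODINGS = ("JSON", "BINARY")


def topic_request_body(project_id, schema_id, encoding="JSON"):
    """Build the request body for creating a topic that enforces a schema.

    Hypothetical helper for illustration; it performs no network calls.
    """
    if encoding not in ALLOWED_ENCODINGS:
        raise ValueError(f"encoding must be one of {ALLOWED_ENCODINGS}")
    return {
        "schemaSettings": {
            # Schemas are referenced by their full resource name.
            "schema": f"projects/{project_id}/schemas/{schema_id}",
            "encoding": encoding,
        }
    }


# "my-project" is a placeholder project ID; "schema-1" is the schema from the demo.
body = topic_request_body("my-project", "schema-1")
print(json.dumps(body, indent=2))
```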
Before I can test this, I need to create a subscription. So let me do that.
Now let’s see what happens when I try to publish two messages. The first message will be in a valid format. The second message is going to be invalid.
As you can see, in order to successfully publish a message to this topic, it must validate against the provided schema. Any subscriber can now assume that every message it receives will fit this format.
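To make the valid and invalid cases concrete, here is a simplified local check written in plain Python. It is not how Pub/Sub validates messages internally, and the field names `name` and `user_id` are assumptions standing in for the two fields of the demo schema. It only illustrates the kind of message that would pass and the kind that would be rejected.

```python
import json

# Hypothetical field names and Python types mirroring the two-field demo schema.
REQUIRED_FIELDS = {"name": str, "user_id": int}


def validate(raw):
    """Return a list of problems; an empty list means the message looks valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["message is not valid JSON"]
    if not isinstance(data, dict):
        return ["message must be a JSON object"]

    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[field], bool) or not isinstance(data[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


print(validate('{"name": "Alice", "user_id": 42}'))  # passes the check
print(validate('"just a string"'))                   # fails: not a JSON object
```

With a real topic, the rejection happens on the publish call itself, so the invalid message never reaches a subscriber.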
So now you know how to create and use schemas.
Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.
Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.
When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.