Replay and Purge

Difficulty: Advanced
Duration: 45m
Description

This course will demonstrate some of the more advanced options that are available in Google Cloud Pub/Sub. These options include filtering and ordering messages, creating and enforcing schemas, as well as replaying previously delivered messages.

Learning Objectives

  • Filtering and ordering Pub/Sub messages
  • Creating and enforcing message schemas
  • Handling duplicate or undeliverable messages
  • Replaying and purging messages
  • Monitoring your topics for problems

Intended Audience

  • GCP Developers
  • GCP Data Engineers
  • Anyone preparing for a Google Cloud certification (such as the Professional Data Engineer exam)

Prerequisites

  • Some experience with Cloud Pub/Sub
  • Access to a Google Cloud Platform account is recommended
Transcript

Normally, any messages that you have acknowledged in a subscription are no longer accessible.  Generally this is the desired behavior.  But what if some messages were erroneously acknowledged due to a bug?  You want some way to recover those lost messages.

Luckily, it is possible to “replay” previously-acknowledged messages by using the “seek” feature.  You can use a seek operation to mark previously acknowledged messages as unacknowledged.  This forces those messages to be redelivered.  You can also use seek to “purge” messages that have not been delivered yet by changing their state to “acknowledged”.

So let’s look at the two ways to perform a seek operation: “Seek to timestamp” and “Seek to snapshot”.  

“Seeking to a timestamp” is pretty straightforward.  First, you pick a time, and then all messages that were published before that time are marked as “acknowledged”.  All messages published after that time are marked as “unacknowledged”.  So it’s basically a “quick and dirty” way to alter the acknowledgement state of messages in bulk.

So if you lost some messages an hour ago due to a bug, you can use “seek to timestamp” to replay all messages published in the last hour.  Or, if a bug added some bad messages to a topic, you can prevent those from ever being delivered by seeking to a time in the future.
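To make the two cases above concrete, here is a minimal sketch in plain Python.  It is a toy in-memory model, not the Pub/Sub client library: the `Message` class and the `seek_to_timestamp` function are hypothetical names made up for illustration, and treating a message published exactly at the chosen time as “unacknowledged” is an assumption of the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Message:
    """Hypothetical stand-in for a message held by a subscription."""
    data: str
    publish_time: datetime
    acked: bool = False


def seek_to_timestamp(messages, when):
    """Toy model of a seek: ack everything published before `when`,
    and un-ack everything published at or after it."""
    for msg in messages:
        msg.acked = msg.publish_time < when


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
backlog = [
    Message("old", now - timedelta(hours=3), acked=True),
    Message("lost-to-bug", now - timedelta(minutes=30), acked=True),
    Message("pending", now - timedelta(minutes=5), acked=False),
]

# Replay: seek one hour into the past, so the last hour is redelivered.
seek_to_timestamp(backlog, now - timedelta(hours=1))
replayed = [m.data for m in backlog if not m.acked]

# Purge: seek to a time in the future, so nothing is left to deliver.
seek_to_timestamp(backlog, now + timedelta(minutes=1))
remaining = [m.data for m in backlog if not m.acked]
```

In this model the replay brings back both the message lost to the bug and the one that was still pending, while the purge leaves nothing to deliver.  That is the “all or nothing” behavior in miniature.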

Using “seek to timestamp” is pretty straightforward.  But it does have a few issues.  First, it requires you to enable the “retain acknowledged messages” option for your subscription.  If you have not enabled this feature, then messages are deleted upon acknowledgement and cannot be recovered.  Even when it is enabled, you still cannot recover any messages older than the subscription’s message retention duration, which is 7 days by default.  So with the default setting, any message older than 7 days is gone.  You should also be aware that enabling “retain acknowledged messages” will increase your costs.  More retained messages means more money.
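The retention rules just described can be written down as a small check.  This is only a sketch of the reasoning, not real Pub/Sub code: the `is_replayable` function and its parameters are hypothetical, and the 7-day default is taken from the figure mentioned above.

```python
from datetime import datetime, timedelta, timezone


def is_replayable(publish_time, acked, now, retain_acked,
                  retention=timedelta(days=7)):
    """Toy check: could a seek still bring this message back?"""
    if now - publish_time > retention:
        return False  # older than the retention duration, so it is gone
    if acked and not retain_acked:
        return False  # deleted on acknowledgement, nothing left to replay
    return True


now = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
one_hour_old = now - timedelta(hours=1)
eight_days_old = now - timedelta(days=8)

# Acked an hour ago: recoverable only if acked messages are retained.
with_retention = is_replayable(one_hour_old, True, now, retain_acked=True)
without_retention = is_replayable(one_hour_old, True, now, retain_acked=False)

# Past the retention window: gone even with the option enabled.
too_old = is_replayable(eight_days_old, True, now, retain_acked=True)
```

Here `with_retention` is `True`, while `without_retention` and `too_old` are both `False`.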

The second main issue is that this operation is “all or nothing”.  You are going to mark every message prior to the timestamp as acknowledged, and every message after as unacknowledged.  If you need greater control, then “seek to timestamp” won’t be the right solution.

Luckily, there is a second option: “seek to snapshot”.  You can generate and use snapshots to replay and purge messages as well.  A snapshot captures the message acknowledgment state of a subscription at a specific time.  Snapshots do not store any message data, only the acknowledgment state.  So they cannot be used to restore deleted messages.  This also means that a snapshot is only usable as long as the messages still exist in the topic.  As soon as the oldest message in a snapshot is removed from the topic, that snapshot expires and is deleted.

“Seeking to snapshot” essentially copies the state of the messages in the snapshot to a subscription.  You create snapshots to store the acknowledgement status of messages in a topic.  And then later, you can restore that state for any subscriptions to that topic.
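Here is a rough sketch of that idea in plain Python.  Again, this is a toy model and not the Pub/Sub API: `create_snapshot` and `seek_to_snapshot` are hypothetical helper functions, the message IDs are made up, and the assumption is that any message published after the snapshot was taken comes back as unacknowledged.

```python
def create_snapshot(ack_state):
    """Toy snapshot: remember only WHICH message IDs were acked.
    No message data is stored."""
    return frozenset(msg_id for msg_id, acked in ack_state.items() if acked)


def seek_to_snapshot(ack_state, snapshot):
    """Toy seek: copy the snapshot's ack state back onto the subscription.
    Messages the snapshot never saw as acked become unacknowledged."""
    return {msg_id: (msg_id in snapshot) for msg_id in ack_state}


# State of a subscription just before a risky deployment.
subscription = {"m1": True, "m2": False}
before_deploy = create_snapshot(subscription)

# A buggy subscriber then acks m2 and a newly published m3.
subscription = {"m1": True, "m2": True, "m3": True}

# Restore the pre-deployment state: m2 and m3 will be redelivered.
subscription = seek_to_snapshot(subscription, before_deploy)
to_redeliver = sorted(m for m, acked in subscription.items() if not acked)
```

After the seek, `m1` stays acknowledged and both `m2` and `m3` are queued for redelivery.  Notice that the snapshot itself holds only IDs, which is why it cannot bring back a message that has already been deleted.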

Snapshots do not require that you enable the “retain acknowledged messages” option in your subscription.  So using “seek to snapshot” can be cheaper than using “seek to timestamp”.  However, you will need to make sure you generate snapshots for the points in time you want to seek to.  You can schedule recurring snapshots, or you can take snapshots just before making any significant changes.

Whichever method you choose, the two types of “seek” operations give you greater control over which messages are delivered for a subscription.  They can make debugging your subscriber code much easier, as well as make it possible to recover from an incident.

About the Author

Daniel began his career as a Software Engineer, focusing mostly on web and mobile development. After twenty years of dealing with insufficient training and fragmented documentation, he decided to use his extensive experience to help the next generation of engineers.

Daniel has spent his most recent years designing and running technical classes for both Amazon and Microsoft. Today at Cloud Academy, he is working on building out an extensive Google Cloud training library.

When he isn’t working or tinkering in his home lab, Daniel enjoys BBQing, target shooting, and watching classic movies.