1. Home
  2. Training Library
  3. Microsoft Azure
  4. Courses
  5. Developing for Autoscaling on Azure

Transient Faults: Other considerations

Start course

Develop your skills for autoscaling on Azure with this course from Cloud Academy. Learn how to improve your teams and development skills and understand how they relate to scalable solutions. What's more, in this course you can analyze and execute how to deal with transient faults.

This Course is made up of 19 lectures that will guide you through the process from beginning to end. 

To discover more Azure Courses visit our content training library.

Learning Objectives

  • Learn how to develop applications for autoscale
  • Prepare for the Azure AZ-303 certification
  • Design and Implement code that addresses singleton application instances


Intended Audience

This course is recommended for:

  • IT Professionals preparing for Azure certification
  • IT Professionals that need to develop applications that can autoscale


There are no prior requirements necessary in order to do this training course, although an understanding of MS Azure will prove helpful



When you are sorting out values to use for the number of retries and for retry intervals for a given policy, be sure to consider whether the operation on the service or resources is part of a long-running or multi-step operation. If it is, it might be tough or worse yet, expensive, to compensate all the other steps that might have already succeeded when one fails. Cases such as these might call for a very long interval and/or a large number of retries, assuming they don't block any other operations due to holds or locks on scarce resources. Another consideration to account for is whether retrying the same operation will cause data inconsistencies. This can happen when some parts of a multi-step process are repeated and the operations are not idempotent. An example of this would be an operation that increments a value. If the operation is repeated, it's going to produce an invalid result. 

Another example would be a scenario where repeating a specific operation sends a message to a queue more than once. This may cause an inconsistency in the message consumer if it can't detect duplicate messages. To prevent inconsistencies, ensure that your design treats each step as an idempotent operation. Another consideration to keep in mind is the scope of operations that will be retried. It may be easier, for example, to implement retry code at a higher level, which encompasses several lower level operations. In such a case, you could just retry all the underlying operations, if one higher level operation fails. Keep in mind, however, that doing so can also result in idempotency issues and unnecessary rollback operations. In a case where you do choose a retry scope that includes several operations, be sure that you consider the total latency of all operations when determining the retry intervals when monitoring the time taken and before raising alerts for failures. You should also consider how a particular retry strategy might affect neighbors and other tenants in a shared application. Effects on neighbors with whom you are sharing resources or services should also be considered. Overly aggressive retry policies can cause transient faults to occur for your neighbors and even for any applications that share the resources and services with yours.

About the Author
Learning Paths

Tom is a 25+ year veteran of the IT industry, having worked in environments as large as 40k seats and as small as 50 seats. Throughout the course of a long an interesting career, he has built an in-depth skillset that spans numerous IT disciplines. Tom has designed and architected small, large, and global IT solutions.

In addition to the Cloud Platform and Infrastructure MCSE certification, Tom also carries several other Microsoft certifications. His ability to see things from a strategic perspective allows Tom to architect solutions that closely align with business needs.

In his spare time, Tom enjoys camping, fishing, and playing poker.