Managing Indexing in Azure Cognitive Search

Concurrency

Overview

Difficulty: Intermediate
Duration: 38m
Students: 88
Ratings: 5/5
Description

This course will focus on the skills required to manage and maintain the indexing process for an Azure Cognitive Search solution. As data changes within a given data source, the requirement to rebuild an index or set up the schedule for an index becomes very important. Understanding all of the functions related to the indexing process is important when you know that there are going to be periodic updates to the underlying data source, and this course will teach you the skills to perform all of those functions.

Learning Objectives

  • Manage re-indexing
  • Rebuild indexes
  • Schedule and monitor indexing
  • Implement incremental indexing
  • Manage concurrency
  • Push data to an index
  • Troubleshoot indexing for a pipeline

Intended Audience

  • Developers who will be including full-text search in their applications
  • Data Engineers focused on providing better accessibility to organizational data
  • AI Engineers who will be providing AI combined with search functionality in their solutions

Prerequisites

Candidates for this course should have a strong understanding of data sources and the operational requirements for those data source changes. Candidates should also be able to use REST-based APIs and SDKs to build knowledge mining solutions on Azure.

Transcript

Hi there, in this video, we wanna talk about concurrency. Now, concurrency can mean a lot of things to a lot of different people, but in the scope of your search indexes, it's concurrency of the operations that are occurring across all of the pieces that make up your search solution. That, of course, includes the index, especially if you're gonna be using push methods to put data into the index. But it also applies when pulling data from, say, multiple data sources: if a data source is being updated at that moment, you wanna know about it.

Now, when it comes to your search solution, you're gonna be focusing on an optimistic concurrency model. Inside of your search solution, this is done by using access condition checks. Azure maintains the version information behind those checks for you, and the checks themselves are applied on API calls that write to the pieces of your overall search solution: your indexes, indexers, data sources, skillsets, and synonym maps.

All of your search resources have an ETag, or entity tag, that gives you version information for that resource, so you know the state the resource was in the last time you checked it. Those ETags are updated automatically every time one of these resources is updated within your overall solution. And then the primary piece of your concurrency is doing a check of that ETag before performing an update.
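The ETag behavior described here can be sketched with a small in-memory model. This is an illustration only, not Azure code: `InMemoryResourceStore`, `PreconditionFailed`, and the `hotels-index` resource are hypothetical names, and a random hex string stands in for whatever ETag format the real service uses.

```python
import copy
import uuid


class PreconditionFailed(Exception):
    """Raised when the caller's ETag is stale (a real service would answer HTTP 412)."""


class InMemoryResourceStore:
    """Hypothetical stand-in for a search service; not an Azure API."""

    def __init__(self):
        self._resources = {}  # name -> (etag, definition)

    def get(self, name):
        """Return the current ETag and a private copy of the definition."""
        etag, definition = self._resources[name]
        return etag, copy.deepcopy(definition)

    def put(self, name, definition, if_match=None):
        """Create or update a resource; the ETag is checked only if one is supplied."""
        current = self._resources.get(name)
        if if_match is not None and (current is None or current[0] != if_match):
            raise PreconditionFailed(f"{name} changed since ETag {if_match}")
        new_etag = uuid.uuid4().hex  # every successful write produces a new tag
        self._resources[name] = (new_etag, copy.deepcopy(definition))
        return new_etag


store = InMemoryResourceStore()
store.put("hotels-index", {"fields": ["id", "name"]})

# Two clients read the same version of the resource.
etag_a, def_a = store.get("hotels-index")
etag_b, def_b = store.get("hotels-index")

# Client A updates first, so the stored ETag changes.
def_a["fields"].append("rating")
new_etag = store.put("hotels-index", def_a, if_match=etag_a)

# Client B still holds the old ETag, so its conditional update is rejected.
def_b["fields"].append("city")
try:
    store.put("hotels-index", def_b, if_match=etag_b)
    b_rejected = False
except PreconditionFailed:
    b_rejected = True

print("Client B rejected:", b_rejected)
```

Note that a `put` without `if_match` skips the check entirely, which models a last-write-wins update; the protection only applies when the caller passes the ETag it last saw.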

So if that ETag has, in fact, changed since the last time that you did your resource check, you're gonna need to make sure to validate those changes before making your update. Now, how are you gonna do that? You're gonna do that by maintaining a local copy of your ETag within your code. And then you can check it against the ETag that's currently on the resource. If you find out, for example, that your SQL database data source got updated recently, then you're gonna need to validate that the operation you're performing is actually going to succeed.
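One way to use a locally stored ETag is to send it back in an `If-Match` header on the update request, so the service applies the change only if the resource is unchanged. The sketch below builds such a request with Python's standard library without sending it. The helper name `build_conditional_put`, the service name, the key placeholder, the ETag value, and the `2020-06-30` API version are all assumptions for illustration; check the REST reference for the values that apply to your service.

```python
import json
import urllib.request


def build_conditional_put(service_name, index_name, definition, etag, api_key,
                          api_version="2020-06-30"):
    """Build (but do not send) a PUT that should succeed only if the ETag still matches."""
    url = (f"https://{service_name}.search.windows.net"
           f"/indexes/{index_name}?api-version={api_version}")
    headers = {
        "Content-Type": "application/json",
        "api-key": api_key,
        "If-Match": etag,  # the local copy of the ETag from the last read
    }
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder values (assumptions); a real ETag would come from a previous read of the index.
local_etag = '"0x1234567890ABCDEF"'
request = build_conditional_put(
    "my-service",
    "hotels-index",
    {"name": "hotels-index", "fields": []},
    local_etag,
    "<admin-api-key>",
)

print(request.get_method(), request.full_url)
print(request.get_header("If-match"))  # urllib stores header names capitalized this way
```

If a request like this were sent with `urllib.request.urlopen` and the resource had changed in the meantime, the expected outcome is an `HTTPError` with status 412 (Precondition Failed). That is the point at which the code would read the resource again, validate the changes, and decide whether to retry.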

Now, some examples of how to do this kind of ETag checking can be found in a GitHub repository created by the Azure search engineering team. The repository is called search-dotnet-getting-started, and because there are numerous different solutions inside of this repo, the specific folder you want is DotNetETagsExplainer. That solution provides some .NET sample code for how to manage your ETags moving forward.

About the Author

Brian has been working in the Cloud space for more than a decade as both a Cloud Architect and Cloud Engineer. He has experience building Application Development, Infrastructure, and AI-based architectures using many different OSS and Non-OSS based technologies. In addition to his work at Cloud Academy, he is always trying to educate customers about how to get started in the cloud with his many blogs and videos. He is currently working as a Lead Azure Engineer in the Public Sector space.