Source control repositories are an important part of software development. They make it possible to recover earlier versions of data, and they also enable continuous integration. Azure DevOps includes features common to modern software development such as source control repositories, continuous integration pipelines, agile planning tools, and more. This course covers several topics that fall under the umbrella of configuring repositories for Azure DevOps.
Learning Objectives
- Integrating GitHub repositories with Azure DevOps Pipelines
- Configuring permissions in source control repositories
- Configuring tags to organize source control repositories
- Recovering data using Git commands
- Purging data from source control
Intended Audience
- Software engineers
- DevOps engineers
- Site reliability engineers
Prerequisites
- Be comfortable using Git
- Be familiar with Azure DevOps
Hello and welcome. In this content, we'll explore how to recover data from Git repositories. Most version control usage revolves around committing and pushing changes. Eventually, however, changes will need to be recovered. How a change is recovered depends on the state of the change. This content covers three recovery scenarios: first, reverting uncommitted changes; second, reverting a local branch to a previous state; and finally, reverting one or more shared commits.
I'm using this web-based IDE to demonstrate these scenarios. I'll start by cloning a repository for these demos. This is a private repository hosted on Azure DevOps Repos. Then I'm going to change into the repository directory.
The first scenario is reverting uncommitted changes. Running the git status command shows that there are no changes to the repository. I'm going to open the app.py file from the web app directory and add a function named new_feature.
The file has been changed, and the IDE automatically saved the changes. Running the git status command again shows an unstaged change: the app.py file has been modified, but it has not been staged for commit. For this scenario, imagine that the IDE is unable to revert the change that was just made to the app.py file. Git can revert the change with the git checkout command. I'm going to run git checkout and pass the path of the changed file as an argument. This instructs Git to check out the currently committed version of the file, thereby reverting the change. Notice the new_feature function has disappeared from the file view. So, to summarize the first scenario, reverting an uncommitted change is accomplished by checking out the currently committed version of the changed file using the git checkout command.
The second scenario is reverting a local branch to a previous state.
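Looking back at the first scenario, here is a minimal sketch of those steps in a throwaway repository. The webapp directory name, the file contents, the identity settings, and the commit message are assumptions for illustration, not the contents of the demo repository. On Git 2.23 and later, git restore followed by the file path is an alternative way to discard the same change.

```shell
#!/bin/sh
set -eu

# Throwaway repository; every name below is a hypothetical stand-in for the demo repo.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

# Commit an initial version of the file.
mkdir webapp
printf 'def index():\n    return "ok"\n' > webapp/app.py
git add webapp/app.py
git commit -q -m "Add app"

# Make an uncommitted change, similar to adding new_feature in the demo.
printf 'def new_feature():\n    pass\n' >> webapp/app.py
git status --short    # lists webapp/app.py as modified

# Check out the committed version of the file, discarding the change.
git checkout -- webapp/app.py
git status --short    # prints nothing because the working tree is clean again
```

The double dash before the path tells Git that what follows is a file path rather than a branch name, which avoids ambiguity when the two could be confused.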
The git reset command is used to revert a branch to the state of a previous commit. For this scenario, imagine the branch being reset is not shared with other users. Shared branches must be handled differently, which is covered in scenario three. A word of warning: the git reset command can result in lost work if it is not run carefully. I'm going to cover three modes of operation for this command. These modes are hard resets, soft resets, and mixed resets.
A hard reset discards all changes made after the specified commit. This includes changes to the working tree. The specified commit becomes the branch head. Running the git log command with the --oneline flag displays the commits for the current branch. Currently, the branch head points to the commit with the hash starting with 9132. I'm going to perform a hard reset of this branch to the commit made prior to adding the tests. For this, I'll use the git reset command with the --hard flag and the hash of the commit to reset to. Now that I've reset the branch to the previous commit, displaying the log again shows that the new branch head is the specified commit. Hard resets remove all of the changes after the specified commit and reset the working tree.
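A hard reset along these lines can be sketched in a throwaway repository. The file names, identity settings, and commit messages here are assumptions for illustration, and the commit hashes will differ from the ones in the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and messages are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "app" > app.py
git add app.py
git commit -q -m "Add app"
target="$(git rev-parse --short HEAD)"    # the commit to return to

echo "tests" > test_app.py
git add test_app.py
git commit -q -m "Add tests"

git --no-pager log --oneline    # two commits; the branch head is "Add tests"

# Discard everything after $target: later commits, staged changes,
# and working tree edits to tracked files.
git reset --hard "$target"

git --no-pager log --oneline    # one commit; the branch head is "Add app"
ls                              # test_app.py is no longer present
```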
This mode is destructive and should be used carefully.
I'm going to demonstrate soft resets next. Soft resets are a less destructive form of reset. They move the branch head but leave the staging area and the working tree untouched, so uncommitted changes to files are not reset. I'm going to modify the app.py file by adding a comment. This file has been changed in the working tree without being committed. The git status command shows it as having changes that have not been committed. Running the git log command displays the commits for the current branch. Currently, the branch head points to the commit with the hash starting with 7c9. I'm going to perform a soft reset of this branch to the commit with the hash starting with add. To do that, I'll use the git reset command with the --soft flag followed by the commit hash. Displaying the log now shows the branch head has changed. However, running the git status command again shows that the app.py file still has uncommitted changes. So, soft resets preserve uncommitted changes in the working tree while resetting the branch back to the previous state. I'm going to use a hard reset to remove the changes made to the app.py file before the next demonstration.
I'm going to demonstrate mixed resets next.
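The soft reset can be sketched the same way in a throwaway repository. The file names, identity settings, and commit messages are assumptions for illustration. In this sketch, the change from the commit that is dropped stays staged, while the uncommitted edit to app.py stays in the working tree as an unstaged change.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and messages are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "app" > app.py
git add app.py
git commit -q -m "Add app"
target="$(git rev-parse --short HEAD)"    # the commit to return to

echo "tests" > test_app.py
git add test_app.py
git commit -q -m "Add tests"

# An uncommitted edit in the working tree, like the comment added in the demo.
echo "# a comment" >> app.py

# Move the branch head back without touching the staging area or working tree.
git reset --soft "$target"

git --no-pager log --oneline    # only "Add app" remains in the history
git status --short              # test_app.py is staged; app.py is modified but unstaged
```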
Mixed is the default mode when none is specified. This mode resets the branch and the staging area to the specified commit, and it keeps all subsequent changes, along with any existing changes in the working tree, as unstaged changes. I'm going to set up the scenario by adding a comment to the app.py file. I'll also add a new file. I'm going to reset this branch to the commit with the hash starting with 41e. To do that, I'll run the git reset command with the --mixed flag and specify the hash. Notice the app.py file has been unstaged. Running the git status command confirms that the app.py file is unstaged. It also shows two untracked entries: the newly created demo.txt file and the previously committed web proxy directory, which includes a configuration file for nginx. This web proxy directory was added in a commit that occurred after the current branch head, which is the commit with the hash starting with 41e.
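Here is a rough sketch of a mixed reset in a throwaway repository. The webproxy directory name, the file contents, the identity settings, and the commit messages are assumptions for illustration and only loosely mirror the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file names and messages are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "app" > app.py
git add app.py
git commit -q -m "Add app"
target="$(git rev-parse --short HEAD)"    # the commit to return to

mkdir webproxy
echo "events {}" > webproxy/nginx.conf
git add webproxy
git commit -q -m "Add proxy config"

# Set up the scenario: a staged edit and a brand new file.
echo "# a comment" >> app.py
git add app.py
echo "demo" > demo.txt

# Reset the branch head and the staging area, but keep the working tree.
git reset --mixed "$target"

git status --short    # app.py is modified but unstaged; demo.txt and webproxy/ are untracked
```

Because --mixed is the default, running git reset with only the commit hash behaves the same way.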
Mixed resets preserve the files committed after the commit used for the reset. So, to summarize the second scenario, the git reset command can reset the state of a branch to a previous commit. This should only be used for commits which have not been shared. Removing commits from a shared repository will cause issues for other users because their commit histories will no longer match, and this leads us into our final scenario.
The final scenario is reverting one or more shared commits. Once commits are shared, they're part of a history built upon by other repository users. The git revert command can be used to add a new commit that reverts a previous commit. This leaves the original commit intact, which prevents the types of issues caused by removing commits. For this demonstration, I've reset the repository to its original state. The git log command shows the original commit history. I'm going to revert the change made by a specific commit.
I have the commit with the hash starting with 7c9 open in the web interface. Notice the change that was made in this commit. These two environment variables are passed to the secret config function. For this scenario, I'm going to revert this change so that these last two environment variables are not passed to the secret config function. To do this, I'm going to use the git revert command and specify the hash starting with 7c9. This opens the text editor configured for Git so that a commit message can be added. I'll use the default provided by Git. Notice the app.py file has changed to reflect the reverted code.
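As a minimal sketch of this scenario, the commands below revert a commit in a throwaway repository. The secret_config calls, the variable names, the identity settings, and the commit messages are assumptions for illustration. The sketch also passes the --no-edit flag so that Git accepts the default commit message without opening an editor, which differs from the interactive flow in the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository; file contents and messages are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

printf 'secret_config("DB_USER")\n' > app.py
git add app.py
git commit -q -m "Add app"

printf 'secret_config("DB_USER", "API_KEY", "API_SECRET")\n' > app.py
git commit -q -am "Pass extra environment variables"
bad="$(git rev-parse --short HEAD)"    # the commit to revert

# Add a new commit that undoes $bad; the original commit stays in the history.
git revert --no-edit "$bad"

git --no-pager log --oneline    # three commits, with the revert commit as the branch head
cat app.py                      # the file matches the version from the first commit
```

Since the history only grows, other users who already pulled the original commit can pull the revert commit without any conflict in commit histories.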
The git log command shows that the branch head is now pointing to a commit which reverts the previous change. This command can be used to revert shared commits because it doesn't rewrite the commit history. Rather, it adds a new commit which reverts the specified changes.
The commands demonstrated in this content are very useful for recovering data. However, they also have the potential to be highly destructive and can cause data loss. Use caution when performing any action that rewrites the Git history. Okay, we're going to wrap up here. I hope this demonstration has been helpful. Thank you so much for watching.
Ben Lambert is a software engineer and was previously the lead author for DevOps and Microsoft Azure training content at Cloud Academy. His courses and learning paths covered Cloud Ecosystem technologies such as DC/OS, configuration management tools, and containers. As a software engineer, Ben’s experience includes building highly available web and mobile apps. When he’s not building software, he’s hiking, camping, or creating video games.