The DevOps Institute is a collaborative effort between recognized and experienced leaders in the DevOps, InfoSec and ITSM space and acts as a learning community for DevOps practices. This DevOps Foundations course has been developed in partnership with the DevOps Institute to provide you with a common understanding of DevOps goals, business value, vocabulary, concepts, and practices. By completing this course, you will gain an understanding of the core DevOps concepts, the essential vocabulary, and the core knowledge and principles that underpin DevOps practices.
This course is made up of 8 lectures and an assessment exam at the end. Upon completion of this course and the exam, students will be prepped and ready to sit the industry-recognized DevOps Institute Foundation certification exam.
Learning Objectives
- Recognize and explain the core DevOps concepts.
- Understand the principles and practices of infrastructure automation and infrastructure as code.
- Recognize and explain the core roles and responsibilities of a DevOps practice.
- Be prepared to sit the DevOps Institute Foundation certification exam after completing the course and assessment exam.
Intended Audience
- Individuals and teams looking to gain an understanding and shared knowledge of core DevOps principles.
Prerequisites
- A basic understanding of IT roles and responsibilities. We recommend completing the Considering a Career in Cloud Computing? learning path prior to taking this course.
- [Instructor] Hi, and welcome back. In lecture five, we'll explore culture, behaviors and operating models. First, we will explore defining a culture for DevOps, and then we will look at behavioral models, organizational models, and then how we can work towards identifying a target operating model for a DevOps practice. Now practices alone are not enough for success with DevOps. We need to implement behavioral change over and above a change to the tools and processes. Sascha Bates is a consultant at Chef, and she makes a good point, saying that "tools and processes are a reflection of your cultural choices." Now, culture eats strategy for breakfast. In 2006, Mark Fields of Ford Motor Company attributed this quote to Peter Drucker. Peter Drucker was a management consultant whose writings contributed to the philosophical and practical foundations of the modern business corporation. He introduced the concept known as management by objectives, and he often argued that a company's culture would trump any attempt to create a strategy that was incompatible with that culture. So what is organizational culture? We can try to define it, and there are a few definitions out there. For example: the values and behaviors that contribute to the unique social and psychological environment of an organization. But in reality, culture is a nebulous term and it is very subjective. As Lloyd Taylor puts it, "You can't directly change culture. But you can change behavior, and behavior becomes culture."
So focusing on the behaviors and values helps us to address the challenges that we often see with culture, as we can understand how an organization deals with things like blame, fear, failure, learning, innovation, experimentation, trust and respect. We need to influence the behaviors and values if we want to change the culture of our organization. Now DevOps can help us overcome the notion of cultural debt. Cultural debt occurs when cultural considerations are disregarded or deferred in favor of growth and innovation. It's a common scenario. And the effective interest rate on cultural debt is often higher than that on technical debt, which is the debt we tend to focus on. Cultural debt happens when silos become impenetrable, when we have fiefdoms, when companies hire the wrong people, when employees don't feel empowered or don't feel their contributions matter; or worse, when people's contributions are not acknowledged, and when people aren't given the time or resources needed to make improvements because the feedback loops are either negative or non-existent. People tend to put what's happening today over investing in the future. And culture is an investment in the future. Let's look at the characteristics of a DevOps culture. Having a shared vision, goals and incentives is crucial, and that's what can break down silos. An example of a safe environment is one where people feel comfortable speaking up, i.e., there's less of a blame culture, and postmortems are less about attributing blame than about defining solutions and scenarios that can help ensure those situations don't occur again. And a safe environment requires high trust and a learning culture.
Now some companies recognize teams or individuals that enable the organization to learn from failure. Such awards send a message about your company's values, and that's a very important thing. Organizational culture is one of the strongest predictors of both IT performance and overall performance of the organization. So we need to shift thoughts and behaviors; we need to adopt a lean-forward approach to doing so. Organizations may find a need to tackle one behavior at a time, as many are interrelated. It's often good to look at people's current behavior to determine a good starting point. Ultimately, we want to shift from command and control to collaboration. We want to shift from task orientation to outcome orientation. We want to shift from blame to responsibility. And most importantly, from reactive to proactive. Real cultural change takes time, and it needs to be incremental and performed at a realistic pace. High-trust organizations tend to encourage good information flow, cross-functional collaboration, shared responsibilities, learning from failures, and new ideas.
They also empower people, which enables people to move more quickly as they don't have to wait for someone else to make a decision or to take an action. As an example, organizations that require a high percentage of changes to be approved by, say, a change advisory board that only meets weekly are going to slow down the change management process and increase the cost of handling changes. Conversely, organizations that increase their use of standard change procedures can make low-risk changes more quickly and at a lower cost. The Spotify engineering culture is a very interesting take on culture change from Henrik.
- [Henrik] Our culture is based on Agile principles. All engineering happens in squads, and we try to keep them loosely coupled and tightly aligned. We like cross pollination and have an internal open source model for code. Squads do small and frequent releases, which is enabled by decoupling. Our self-service model minimizes the need for handoffs, and we use release trains and feature toggles to get stuff into production early and often. And since culture is all about the people, we focus on motivation, community and trust rather than structure and control. That was part one. And now I'd like to talk about failure. Our founder, Daniel, put it nicely, "We aim to make mistakes faster than anyone else." Yeah I know, sounds a bit crazy. But here's the idea. To build something really cool, we will inevitably make mistakes along the way, right? But each failure is also a learning, so when we do fail, we want it to happen fast so we can learn fast and therefore improve fast. It's a strategy for long term success. It's like with kids. You can keep a toddler in the crib and she'll be safe, but she won't learn much and won't be very happy. If you instead let her run around and explore the world, she'll fail and fall sometimes, but she'll be happier and develop faster. And the wounds? Well, they usually heal.
So Spotify is a fail-friendly environment. We're more interested in fast failure recovery than failure avoidance. Our internal blog has articles like "celebrate failure" and stories like "how we shot ourselves in the foot." Some squads even have a fail wall where people show off their latest failures and learnings. Failing without learning is, well, just failing. So when something goes wrong, we usually follow up with a post mortem. This is never about whose fault it was, it's about what happened, what did we learn, what will we change? Post mortems are actually part of our incident management workflow. So an incident ticket isn't closed when the problem is solved, it's closed when we've captured the learnings to avoid the same problem in the future. Fix the process, not just the product. In addition, all squads do retrospectives every few weeks to talk about what's working well and what to improve next. All in all, Spotify has a strong culture of continuous improvement, driven from below and supported from above. However, failure must be non-lethal, or we don't live to fail again.
So we promote the concept of limited blast radius. The architecture is quite decoupled. So if a squad makes a mistake, it will usually only impact a small part of the system and not bring everything down. And since the squad has end-to-end responsibility for their stuff without handoffs, they can usually fix the problem fast. Also, most new features are rolled out gradually, starting with just a tiny percent of all users and closely monitored. Once the feature proves to be stable, we gradually roll it out to the rest of the world. So if something goes wrong, it normally only affects a small part of the system for a small number of users for a short period of time. This limited blast radius gives squads courage to do lots of small experiments and learn really fast instead of wasting time trying to predict and control all risk in advance. Mario Andretti puts it nicely: "If everything is under control, you're going too slow."
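To make the gradual rollout mechanics concrete, here is a minimal sketch of a percentage-based feature toggle check; the flag name, rollout percentage and hashing scheme are invented for illustration and are not Spotify's actual implementation.

```python
import hashlib

# Hypothetical rollout percentages for feature flags; in practice these would
# live in a config service so they can be raised without redeploying.
ROLLOUT_PERCENT = {"new_discover_page": 1}  # start with 1% of users

def is_enabled(flag: str, user_id: str) -> bool:
    """Deterministically bucket a user into [0, 100) and compare the bucket
    against the flag's current rollout percentage."""
    percent = ROLLOUT_PERCENT.get(flag, 0)
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# The same small, stable slice of users sees the new feature, which limits
# the blast radius if something goes wrong.
if is_enabled("new_discover_page", user_id="user-42"):
    print("render the new experience")
else:
    print("render the existing experience")
```

Because the bucketing is deterministic, raising the percentage only adds users to the exposed group, so a misbehaving feature can be dialed back down without affecting everyone.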
All right, let's talk about product development. Our product development approach is based on Lean startup principles and is summarized by the mantra: think it, build it, ship it, tweak it. The biggest risk is always building the wrong thing. So before deciding to build a new product or major feature, we try to inform ourselves with research. Do people actually want this? Does it solve a real problem for them? Then we define a narrative, kind of like a press release or an elevator pitch showing off the benefits. For example, "radio you can save," or "follow your favorite artist." We also define hypotheses. How will this feature impact user behavior and our core metrics? Will they share more music? Will they log in more often? And we build various prototypes and have people try them out to get a sense of what the feature might feel like and how people react. Once we feel confident this thing is worth building, we go ahead and build an MVP, a minimum viable product, just enough to fulfill the narrative but far from feature complete. You might call it the minimum lovable product. The next stage of learning happens once we put something into production. So we wanna get there as quickly as possible. We release the MVP to just a few percent of all users and use techniques like A/B testing to measure the impact and test our hypotheses. The squad monitors the data and continues tweaking and redeploying until they see the desired impact. Then they gradually roll out to the rest of the world while taking the time needed to sort out practical stuff like operational issues and scaling. By the time the product or feature is fully rolled out, we already know it's a success because if it isn't, we don't roll it out.
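As a rough, hypothetical sketch of how one of those hypotheses might be evaluated once the MVP is live, a squad could compare a core metric between the control group and the MVP group; the metric, group sizes and significance threshold below are invented for illustration, not Spotify's real numbers.

```python
from math import sqrt

# Hypothetical results for the hypothesis "the MVP makes people share more music".
control = {"users": 10_000, "shares": 1_180}  # existing experience
variant = {"users": 10_000, "shares": 1_245}  # MVP behind the toggle

p1 = control["shares"] / control["users"]
p2 = variant["shares"] / variant["users"]
pooled = (control["shares"] + variant["shares"]) / (control["users"] + variant["users"])

# Two-proportion z-test: is the observed difference bigger than random noise?
se = sqrt(pooled * (1 - pooled) * (1 / control["users"] + 1 / variant["users"]))
z = (p2 - p1) / se

print(f"control rate {p1:.3f}, variant rate {p2:.3f}, z = {z:.2f}")
# A |z| above roughly 1.96 suggests significance at about the 95% level;
# otherwise the squad keeps tweaking and redeploying before rolling out further.
```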
Impact is always more important than velocity, so a feature isn't really considered done until it has achieved the desired impact. Note that like most things in this video, this is how we try to work, but our actual track record, of course, varies. Now with all this experimentation going on, how do we actually plan? How do we know what's gonna be released by which date? Well, the short answer is we mostly don't. We care more about innovation than predictability, and 100% predictability means 0% innovation. On a scale, we'd probably be somewhere around here. Of course sometimes we do need to make delivery commitments, like for partner integrations or marketing events, and that sometimes involves standard Agile planning techniques like velocity and burn-up charts. But if we have to promise a date, we generally defer that commitment until the feature is already proven and close to ready. By minimizing the need for predictability, squads can focus on delivering value instead of being a slave to someone's arbitrary plan.
One product owner said, "I think of my squad as a group of volunteers that are here to work on something they are super passionate about." So where do ideas come from? An amazing new product always starts with a person and a spark of inspiration, but it will only become real if people are allowed to play around and try things out. So we encourage everyone to spend about 10% of their time doing hack days or hack weeks. That's when people get to experiment and build whatever they want, like this dial-a-song product. Just pick it up and dial the number of the song you wanna listen to. Is it useful? Doesn't matter, the point is if we try enough ideas, we're bound to strike gold from time to time. And quite often, the knowledge gained is worth more than the actual hack itself. Plus, it's fun. As part of this, we do a Spotify-wide hack week twice per year, hundreds of people hacking away for a whole week. The mantra is make cool things real. Build whatever you want with whoever you want in whatever way you want. And then we have a big demo and party on Friday. We're always surprised by how much cool stuff can be built in just a week with this kind of creative freedom. Whether it's a helicopter made of lollipop sticks or a whole new way of discovering music, it turns out that innovation isn't really that hard. People are natural innovators, so just get out of their way and let them try things out. In general, our culture is very experiment-friendly. For example, should we use tool A or tool B? Don't know, let's try both and compare. Or do we really need sprint planning meetings? Don't know, let's skip a few and see if we miss them. Or should we show five or 10 top songs on the artist page? Don't know, let's test both and measure the impact. Even the Spotify-wide hack week started as an experiment, and now it's part of the culture.
So instead of arguing an issue to death, we try to talk about things like what's the hypothesis, what did we learn and what will we try next? This gives us more data-driven decisions and fewer opinion-driven, ego-driven or authority-driven decisions. Although we are happy to experiment and try different ways of doing things, our culture is very waste-repellent. People will quickly stop doing anything that doesn't add value. If it works, keep it. Otherwise, dump it. For example, some things that work for us so far are retrospectives, daily standups, Google Docs, Git and guild unconferences. And some things that don't work for us are time reports, handoffs, separate test teams or test phases and task estimates. We mostly just don't do these things. We're strongly allergic to useless meetings and anything remotely near corporate BS. One common source of waste is what we call big projects. Basically, anything that requires a lot of squads to work tightly coordinated for many months.
Big projects mean big risk, so we are organized to minimize the need for them and instead try hard to break projects into a series of smaller efforts. Sometimes, however, there is a good reason to do a big project, and the potential benefits outweigh the risks. And in those cases, we have found some practices to be essential. Visualize progress using various combinations of physical and electronic boards. Do a daily sync meeting where all squads involved meet up to resolve dependencies. Do a demo every week or two where all the pieces come together so we can evaluate the integrated product together with stakeholders. These practices reduce risk and waste because of the improved collaboration and short feedback loops. We've also found that a project needs a small, tight leadership group to keep an eye on the big picture.
Typically, we have a tech lead, a product lead and sometimes a design lead working tightly together. On the whole, we're still experimenting a lot with how to do big projects and we're not so good at it yet. One thing we keep wrestling with is growth pain. As we grow, we risk falling into chaos. But if we overcompensate and add too much structure and process, we risk getting stuck in bureaucracy instead. And that's even worse. So the key question is really what is the minimum viable bureaucracy, the least amount of structure and process we can get away with to avoid total chaos? Both sides cause waste but in different ways, so the waste-repellent culture and Agile mindset helps us stay balanced. The key thing about reducing waste is to visualize it and talk about it often. So in addition to retrospectives and post mortems, many squads and tribes have big, visible improvement boards that show things like what's blocking us and what are we doing about it? We also like to talk about definition of awesome. For example, awesome for this squad means things like really finishing stuff, easily ramping up new team members and no recurring tasks or bugs. And our definition of awesome architecture includes I can build, test and ship my feature within a week. I use data to learn from it, and my improved version is live in week two. Keep in mind though, awesome is a direction, not a place, so it doesn't even have to be realistic. But if we can agree on what awesome would look like, it helps focus our improvement efforts and track progress.
Here's an example of an improvement tracking board inspired by a lean technique called Toyota Kata. The top left shows the current situation; in this case, the squad was having quality problems. The bottom left shows the definition of awesome: in a perfect world, we'd have no quality problems at all. The top right is a realistic target condition: if we were one step closer to awesome, what would that look like? And finally, the bottom right shows the next three concrete actions that will move us towards the target condition. As these get done, the squad fills it up with new actions. Boards like this live on the wall in the squad room and are typically followed up at the next retrospective. All right, I realize that maybe this video makes it seem like everything at Spotify is just great. Well, truth is we have plenty of problems to deal with, and I could give you a long list of pain points but I won't, because it would go out of date quickly. We grow fast and change fast, and quite often a seemingly brilliant solution today will cause a nasty new problem tomorrow just because we've grown and everything is different. However, most problems are short-lived because people actually do something about them. This company is pretty good at changing the architecture, process, organization or whatever is needed to solve a problem, and that's really the key point. Healthy culture heals broken process.
Since culture is so important, we put a lot of effort into strengthening it. This video is just one small example. No one actually owns culture, but we do have quite a lot of people focusing on it, groups such as people operations and about 30 or so Agile coaches spread across all squads. And we do boot camps where new hires form a temporary squad. They get to solve a real problem while also learning about our tech stack and processes and learning to work together as a team all in one week. It's an intense but fun way to really get the culture. They often manage to put code into production in that time, which is impressive. But again, failing is perfectly okay as long as they learn. Mainly though, culture spreads through storytelling, whether it happens on the blog, at a post mortem, a demo or at lunch. As long as we keep sharing our successes and failures and learnings with each other, I think the culture will stay healthy. At the end of the day, culture in any organization is really just the sum of everyone's attitudes and actions. You are the culture, so model the behavior you wanna see.
- [Instructor] Now high trust organizations encourage good information flow, so Westrum's study looked at how organizations respond to problems and opportunities. This study was initially conducted to look at how culture affects the performance of a medical unit, particularly as it relates to safety. The concepts are, however, very applicable to any type of organization, especially DevOps in my view. The types described were pathological, bureaucratic and generative, all shaped by the preoccupations of the organization's leaders. In other words, team leaders shape the organization's culture by creating incentive structures that reward certain behaviors.
Now in the context of DevOps, techniques we can use to create and maintain a high trust culture could include encouraging and creating boundary-spanning teams; making quality, availability and security everyone's responsibility; and holding blameless postmortems. When incidents and outages occur, we need to develop effective countermeasures and create global learning from those outcomes. We wanna maximize everyone's creativity to find novel solutions to problems.
Now let's consider the characteristics of high-trust, generative cultures. According to the 2015 State of DevOps Report, burnout is associated with pathological cultures, and bureaucratic cultures are more likely to experience the morale issues that come from having narrow responsibilities or from that us-versus-them culture. I think a key perspective is keeping the individual at the center of cultural discussions and initiatives. There needs to be an element of self-empowerment so the individual feels motivated to change how they do things. People don't resist change, they resist being changed. Now cultural change is never easy, but the reality is people typically don't resist their own ideas. When people participate in deciding what to change and how to change it, those decisions are more likely to be accepted.
To adopt DevOps practices, the culture of the company must enable and encourage collaboration. The culture has to make it okay for people to make mistakes and to speak their minds. Organizational change management practices must be introduced. And organizational change management provides the preparation, motivation and education people need to embrace and support change in the organization. Now people adapt to change at different paces. The diffusion of innovations theory, which was developed by E.M. Rogers in 1962, is a very established social science theory that explains how, over time, an idea or product gains momentum and diffuses, or spreads, through a specific population or social system. Now the end result of this diffusion is that people, as part of a social system, adopt a new behavior, product or idea. Adoption means that a person does something differently than they did previously, i.e., they purchase a new product, perform a new behavior, et cetera. So the key to adoption is that the person must perceive the idea, behavior or product as new or innovative. It is through that perception that diffusion is possible.
Adoption of a new idea, behavior or product does not happen simultaneously in a social system. Rather, it is a process whereby some people are more apt to adopt the innovation than others. Researchers have found that people who adopt an innovation early have different characteristics than people who adopt an innovation later. When promoting an innovation to a target population, it is important to understand the characteristics of that population that will help or hinder adoption of your innovation. So there are five established adopter categories, and while the majority of the general population tends to fall in the middle categories, it is still necessary to understand the characteristics of the target population. When promoting an innovation, there are different strategies used to appeal to the different adopter categories.
So first, the innovators. These are people who want to be first to try the innovation. They're venturesome, and they're interested in new ideas. These people are very willing to take risks and are often the first to develop new ideas themselves. Very little, if anything, needs to be done to appeal to this group of the population. Then we have the early adopters. These are people who represent opinion leaders. They enjoy leadership roles and embrace change opportunities. They are usually already aware of the need to change and so are very comfortable adopting new ideas. Strategies to appeal to this group of the population include how-to manuals, information sheets on implementation and social media where implementation ideas are shared. This group does not generally need information to convince them to change.
Then we have the early majority. Now these people are rarely leaders, but they do adopt new ideas before the average person. That said, they typically need to see evidence that the innovation works before they're willing to adopt it. So strategies to appeal to this group of the population include success stories and evidence of the innovation's effectiveness. Then we have our late majority. These people are skeptical of change and will only adopt an innovation after it has been tried by the majority. So strategies to appeal to this population group include information on how many other people have tried the innovation and have adopted it successfully.
And then we have the conservatives, or the laggards. These people are bound by tradition and generally very conservative. They are very skeptical of change, and they are the hardest group to bring on board. Strategies to appeal to this group of the population include statistics, fear appeals and pressure from people in other adopter groups. Innovators generally get the early adopters on board, then work with them to get the early majority's buy-in, and then the three groups band together so the late majority see the benefits and are won over. Most people tell us not to worry about the conservatives because they will either join or leave. Change can also be well expressed in the Kübler-Ross Change Curve as the stages of acceptance. Elisabeth Kübler-Ross was a Swiss-American psychiatrist who introduced this model to reflect the stages of grief, which can occur in any order at any time. Now I think this model also holds very true when looking at changes in business, especially around DevOps and cloud technologies.
I'm sure we all have experienced one or two of these stages in some degree in an IT project, so how do we go about minimizing the impact of these stages? So first, denial. We want to create alignment and rapport as soon as possible to help people get over the block of denial. For frustration, we want to over communicate the why, the what and the benefits, trying to reduce that sense of frustration by empowering people with knowledge. And when you get that sense of despondency, or in the extreme, depression about something not going as it was expected, we need to look for ways to spark personal motivation so people feel empowered to do things and can see their own personal benefit in this project. Then on the upside, when experimentation happens, we need to develop and support capability. And I think perhaps easily overlooked is the need to share the knowledge when a decision or final point is reached. That way, we can really use the knowledge and people feel empowered to do it again. It also ensures that any information we've learned at the end is fed back down the curve. Now communication can be improved with ChatOps.
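As a minimal, hypothetical illustration of the ChatOps idea, a deployment script might post a status message to a team channel through an incoming webhook; the URL and payload shape below are placeholders, since each chat platform defines its own webhook format.

```python
import json
import urllib.request

# Placeholder webhook URL; a real one comes from your chat platform's
# incoming-webhook configuration and should be stored as a secret.
WEBHOOK_URL = "https://chat.example.com/hooks/deployments"

def notify(channel: str, text: str) -> None:
    """Post a short status message so the whole team sees it immediately."""
    payload = json.dumps({"channel": channel, "text": text}).encode()
    request = urllib.request.Request(
        WEBHOOK_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(request)  # fire-and-forget, for the sake of the example

notify("#deployments", "checkout-service v1.4.2 deployed to production by the pipeline")
```

The point is not the code itself but that the event lands where the whole team is already looking, rather than in an email thread or a ticket no one reads.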
Chat platforms make things more immediate and keep people up to date. It doesn't have to be a chat platform; tools like wikis, GitHub and Confluence can provide the same level of interactivity and communication. We wanna encourage collaborative relationships. So collaboration enables people to work together to achieve goals that they could not reach individually. The desired result is to ensure parity. Now communication tends to be passive: I will tell you, you will listen to me, and vice versa. Collaboration, on the other hand, is active: I ask for your input, your feedback and your opinion. What do you think? Now you can't force people to collaborate, but you can help people change their behavior to be more collaborative. Setting up meetings, arranging workspaces, things like that just help people start to ask why and to be proactive rather than inactive. So it's natural to expect some conflict when we try to change anything. Changing an organization is not an easy business.
The key is being ready and willing to deal with it effectively. Now many individuals avoid conflict of any kind, particularly when it appears to risk their success individually or on a project. Organizational change management actually improves when conflict is acknowledged and opposing or conflicting opinions are allowed to be expressed. Everyone is capable of using any of the conflict-handling modes. However, most likely we all have a default mode that we tend to overuse or use in the wrong situation. So understanding other people's use of these modes can also help leaders adapt their own behavior and potentially influence the behavior of those around them. So when are the modes appropriate or inappropriate? Well, competing is assertive and uncooperative, simply because as an individual pursues his or her own concerns, it is generally at another person's expense. It's a power-oriented mode in which you use whatever power seems appropriate to win your own position: your ability to argue, your rank, your economic situation.
Competing means standing up for your rights while defending a position which you believe is correct or simply trying to win. So accommodating on the other hand is unassertive and cooperative, the complete opposite of competing. When accommodating, the individual neglects his or her own concerns to satisfy the concerns of the other person. There's an element of self-sacrifice in this mode. Accommodating might take the form of selfless generosity or charity, obeying another person's order when you would prefer not to, or yielding to another point of view. Avoiding is unassertive and uncooperative. The person neither pursues his or her own concerns nor those of the other individual, thus he or she does not deal with the conflict. So avoiding might take the form of diplomacy, sidestepping an issue, postponing an issue until a better time, or simply withdrawing from a threatening situation. Now collaboration or collaborating is both assertive and cooperative, the complete opposite of avoiding. Collaborating involves an attempt to work with others to find some solution that fully satisfies their concerns. It means digging into an issue to pinpoint the underlying needs and wants of the two individuals.
Collaborating between two persons might take the form of exploring a disagreement to learn from each other's insights while trying to find a creative solution to an interpersonal problem. Compromising is moderate in both assertiveness and cooperativeness. The objective with compromising is to find some expedient, mutually acceptable solution that partially satisfies both parties. It falls between competing and accommodating. Compromising gives up more than competing but less than accommodating. And likewise, it addresses an issue more directly than avoiding, but it doesn't explore it in as much depth as collaborating. In some situations, compromising might mean splitting the difference between the two positions, exchanging concessions or seeking a quick middle-ground solution.
Now inevitably with change comes fatigue, so how do we avoid change fatigue? Change fatigue can erode people's commitment to a project or an organization. And when too much is changing at one time, especially when change is being dictated from the top down, people can start to feel helpless; people can begin to perceive that things are never going to get any better and that there's nothing they can do about it. They start to catastrophize. People also sometimes need help connecting the dots to show how their change, or the work that they're doing, is affecting the overall change program. And most importantly, show people that what they're doing is contributing to the change in a positive way. People are sometimes working on a project such as automation, and they don't necessarily understand that it's part of a greater DevOps initiative. So it's all about empowering new behaviors. Now many of these activities are aimed at engaging and educating existing employees. It's all about learning new behaviors. According to the research report What Smart Businesses Know About DevOps, over 70% of DevOps leaders are retraining existing personnel while only 50% are hiring new resources. So putting in place communities of practice can help enable people to share their knowledge, and training plans can help people learn by providing a simple framework that encourages them to learn new techniques and behaviors. Let's listen to Target share their experiences in moving to a DevOps culture.
- So good morning everybody. And thank you, Gene, that was a really nice introduction. We're really excited to be back here at DevOps Enterprise Summit this year. It was such an incredible conference last year. It was fun to, you know, tell our story and connect with other enterprise IT shops to, you know, share learnings and well, honestly, war stories as well with other large IT shops that are going through similar transformations. So I'm Heather Mickman, I lead the Enterprise API Integration and Cloud Engineering team at Target. I've been working in tech for almost 20 years. I've worked across all parts of IT from Support and Ops, security, enterprise architecture and of course, software development. And I do work with mainframe as well, less now than previously. And I'm here with Ross Clanton this morning, who leads our engineering practices across our IT org. Ross has been at Target for about 17 years and has worked in many parts of the IT organization as well. So we both have a shared passion for DevOps and driving change across our IT organization. It's been a really fun, though often challenging journey, over the last three, actually I think we're approaching maybe four years that we've been pushing this change across Target. So what we're gonna talk about today are the different phases of driving DevOps and rebuilding our engineering culture at Target. We have started and have made some really great progress starting to shift a large, legacy enterprise delivery machine to be a more kind of modern, nimble, Agile technology team. Last year we talked about kind of the early days of this change, you know, when we were really operating kind of skunk-works, small change-agent teams off to the side, but we've made some really incredible progress over the last year and so that's what we'll talk about today. Change slides. There we go. So before we dive into that story, just a little background on Target.
I'm gonna assume that most people are familiar with Target. Yeah? Quick show of hands. So we're one of the largest retailers in the US, just shy of 1,800 stores. We've been around for 53 years, and we're also the second largest importer in the US and have 38 distribution centers. So we're pretty huge scale from a brick and mortar store perspective, as well as a large online presence. We have a really massive supply chain and lots of technology. It takes a lot of technology to make that happen. So we have hundreds, actually probably thousands of applications to support our business. In addition to three data centers, we also leverage large telcos' data centers and we have a growing presence on public cloud providers as well. So in my job, I'm accountable to two groups of people: Target guests and Target engineers. So first, Target guests. At Target, we call our customers our guests and our guests are at the center of every decision we make, whether it's, you know, creating a new experience to make shopping even more fun or just how I'm prioritizing work on a daily basis. Now the second group of folks that I'm accountable to is our engineers. Target really lost sight of the importance of engineering years ago, and we are a technology company, and so we can't be successful if we don't have the engineers within our IT organization to make that magic happen. So it really is our commitment to those two groups of people that is behind our passion for DevOps and rebuilding our engineering culture at Target. Now saying that I have a commitment to both our guests and engineers is, I mean, that's kind of an easy thing for me to say, but it was really, really hard to be an engineer at Target.
Some specific complexities we faced were culture, right? We had outsourced and off-shored our engineering work. Technology was really treated as a commodity, and lots of big implications of that, right? Probably at the top of the list would be just loss of IP and lower quality solutions; we just really didn't own our technology work. Now we also had a culture of when something breaks, we just would stop changes, right? And I'm sure probably many in the audience have that same reaction as well, the same kind of knee jerk, you know, what do we do when we start to have, you know, lots of meltdowns within production? But then what that starts to do, the implications of that are pretty big, right? If we're always freezing production, that means we're now pushing our changes in big batches, and that kind of makes the problem worse, right? We see even more breakage. So secondly, Target's IT organization is incredibly complex, or was incredibly complex, and difficult to work in. There are just lots of silos, and we literally would have silos within silos. Just to get a server provisioned, as an example, that would take 10 different teams to make that happen. So imagine, that's just provisioning a server. Imagine what it would take to actually build an app. And there just wasn't end-to-end accountability because of all of the different handoffs. So last but not least then was our system complexity, right? So I mentioned we've been around for 53 years, and the way that we had been operating in kind of that project mode versus end-to-end system accountability just meant that over the years we have built up a ton of technology debt. So we knew that we had to overcome these issues.
We had lost team member engineering, we had a lot of zombie processes and projects that just needed to be removed, we had to break down silos and we needed to move more quickly. So we had a lot, a lot of work to do, and we knew we had to start doing something different. So that's kind of the story, that's the set up to where we started Target's DevOps journey and the rebuilding of our engineering culture. So what we have seen over the last three, four years are kind of four stages in our journey. And we didn't plan these stages, but as we've reflected on, you know, what we've seen over the last four years, this is how it's kind of unfolded. So first was enabling our change agents. And those change agents are important in the early days because they can demonstrate success, which then helps to start growing a grassroots movement where more and more passionate folks start to get bought into, you know, the different changes that are being talked about.
And then that got us to our tops down buy in. That was super important, and that actually sped up a lot this year as our new CIO, Mike McNamara, started. And now we're in the scale it phase, right, trying to figure out how we start taking these different changes and blow them out across our organization. So I'm gonna talk about my team as the early change agents and also about growing the grassroots movement, and then Ross is gonna talk about gaining tops down alignment and how we're starting to expand across our large IT organization. So more than four years ago, Target realized that we really needed to step up our game in our multi-channel guest experience. But that was really hard for us to do because a lot of our core data was locked in legacy systems, right? So I mentioned the mainframe, right? And it would take three to six months of work just to develop integrations to get at core data. And that's because there were multiple sources of truth. Our dot com online systems were separate from our store systems. There were different data structures, different data owners and of course, lots of different priorities. Then, even when we could get at that data, it would be many, many months of manual testing to ensure that we weren't, you know, impacting critical business capabilities. Because even small changes could result in problems across the ecosystem because there were so many unique point-to-point integrations in a very tightly coupled architecture.
Now I touched on the complexities of our delivery model. So it was really project focused, many different teams specializing in just kind of one part of the waterfall delivery model. So our BAs would gather requirements, and that would be handed off to a tech lead to do the high-level design, and then that would be handed off to the different centers of excellence to actually do the development work, and then handed off to the testing team and then the packaging team, then the deployment team. So it was complicated. There were easily, you know, 20, 30 different teams that would need to be engaged, all of which had conflicting priorities. So that meant kind of heavy project management; lots of coordination was needed to figure out those dependencies in the handoffs. So it really felt like our development was more an exercise of waiting in different queues rather than delivering results and getting stuff done. So around that time, I was leading a team charged with a strategy to build APIs to expose our core data like product information, store locations, inventory availability, the kind of data that you would imagine, excuse me, that would be important for a retailer. And that was really exciting, right? Because now, instead of spending three to six months developing integrations to just get store location information, operating hours, whether or not there's a Starbucks in a store, teams could just reuse our Locations API. When Instacart started local grocery delivery in some pilot cities across the country, which by the way is amazing; if you haven't used it, I highly recommend it, they could quickly partner with Target and just use our Products API to get the product's information, availability and pricing.
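To make the reuse point concrete, a consuming team's call to such a shared API might look something like the sketch below; the endpoint, parameters and response fields are invented for illustration and do not reflect Target's actual API contracts.

```python
import json
import urllib.request

# Hypothetical endpoint; the real Locations API contract is not shown in the talk.
BASE_URL = "https://api.example.com/locations/v1/stores"

def stores_near(zip_code: str) -> list:
    """Fetch nearby stores (location, hours, amenities) from the shared API
    instead of building another months-long point-to-point integration."""
    with urllib.request.urlopen(f"{BASE_URL}?zip={zip_code}") as response:
        return json.load(response)["stores"]

for store in stores_near("55403"):
    print(store["name"], store["operating_hours"], store.get("has_starbucks"))
```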
Makes a lot of sense, right? But one thing that we knew: in addition to building these APIs and exposing that core data, it was really important to challenge how we were doing our work, because we wanted to be able to deliver new capabilities in days and not months. And we also needed to ensure that we could scale these APIs because we knew that they would be highly reused. And I also knew that I couldn't build the kickass software engineering team that I wanted to if we were operating in our traditional delivery model, where team member work was more about managing contractors than it was about actually building technology solutions. So we started talking about Agile, not waterfall, having our team members do the engineering work, not contractors. We insisted on full stack ownership so that queuing and waiting were decreased and we actually had control of our ecosystem. We brought in more modern development tools, things like GitHub, Jenkins and Chef. Now a couple of years into the building of these APIs and the platform to expose and manage them, we also needed to start bringing in technologies to give us the scale that we needed. So things like Cassandra and Kafka to keep up with the growing volumes. So as we brought in these tools and technologies, they weren't blessed by the enterprise process. In fact, when we asked for permission, we were told no. But we did it anyways because we knew we needed to.
So our API platform and APIs, looks like we're missing some of the data up there... Our platform and APIs have scaled as the speed of new guest experiences continued to grow. We have more than 90 APIs with monthly volumes of more than 17 billion. This time last year, when I was giving this same talk and had some of the similar metrics, I'm sure that will be shown here shortly... There, there we go! That's right. So the traffic that I talked about last year was 1.5 billion per month. So the jump from, you know, to 17 billion is obviously pretty significant for us. We do more than 80 deployments per week across our stage and production environments, and the NPV on this investment has been really amazing. Now I don't know about you, but every time I kind of work with my finance partners and go through that budgeting process, it's an exercise, but they love when we go through kind of our yearly planning cycle with this because it's a really easy story for us to tell.
So these APIs have enabled many internal and external capabilities. I'm sure it will be shown shortly. Yup, there we go. So I mentioned Instacart. We also do curbside pick up; the ability to buy online and pick up in stores is enabled by these APIs, as well as store fulfillment of online orders. Cartwheel as well, and Rich Pins on Pinterest. Anyways, I won't go through all of them, there's a few listed there. And we continued to see more and more growth. So digital sales were up 42% last holiday season, and we saw a rise of 32, or 30% rather, in Q2 this year as well. And by our peak holiday season this year, more than 450 of our 1,800 stores will help fulfill online orders, up from just over 140 right now. So our APIs and platforms have continued to scale, and they need to, right, so that we can continue to support the awesome strategy that our business teams continue to come up with. We're investing heavily in technology because we know we need to.
This year alone, Target will spend more than $1 billion on technology, because we are a technology company, and we are rebuilding our engineering culture with a commitment to DevOps. So that's my story of a team of passionate change agents delivering results to demonstrate success. But that change needed to start happening across our broader organization, not just in pockets. We needed to start a grassroots movement. So that's the second phase of the Target DevOps journey. And Ross and I kind of talk about how we think about that phase starting: I leaned over the wall one morning, the shared cube wall that I had with Ross, and I was like hey, what do you think about co-sponsoring an internal DevOps days? I just read about one that ING did and it sounds pretty cool. He's like, yeah! So we did, we made it happen, and we didn't know like who was really gonna show up to our first DevOps days event. Like, was it just gonna be us and, you know, a handful of people? And so we were amazed when more than 100 people showed up to our first event, because we were still in the early days then, where when people would hear the phrase DevOps muttered in the hallways, they would, you know, roll their eyes and walk away, or you know, nobody wanted to talk about it, right? It was still seen as something that only Silicon Valley web companies could do. We also started pairing an automation hackathon with our DevOps days. And so that was great because it was a day where we got our engineers together and started talking about, and helping everyone understand, what is infrastructure as code, what is a CICD pipeline, how do you automate test cases? So it was a great way for engineers to sit side by side and share and learn.
Something else that's really important to our story is starting to connect to the larger technology community. So historically, that's not something that Target IT has done well, but that's changing. We brought in amazing external speakers for our DevOps events. All of the folks listed here, it has been, I mean, amazing to have them come and talk to us. We've also created a lot of fun branding around our commitment to technology. I even have my May the Force Be With You socks on right now. So rebuilding our engineering culture also meant that we needed to rebuild our technology brand. So we started a tech blog. We knew we were doing amazing engineering work and we wanted to tell the world about it. So you can check it out at target.github.io. We contribute to and we use open source. We're speaking at conferences, we host a ton of local meetups in the Twin Cities, and of course we brag as much as we can on Twitter. We now have more than 975 followers. You can see that kind of graph on the bottom there, 975 followers on our internal community. I mean, we've hosted six internal DevOps days events; the last one was actually just last Thursday. So our grassroots movement has continued to grow, and we knew it was now time to start getting our senior leadership team bought in. So Ross is gonna describe how we built the tops down support and started scaling.
- All right. So when we were here last year, we were still kind of in this bottoms up phase. I mean, we were just kind of on the cusp of really starting to focus tops down, and that's really what happened this year. So in 2015, we kind of entered the year with suddenly a lot of executive attention. They saw some of the wins that some of these early adopter teams were having, the momentum was building bottoms up, and we actually moved into kind of a transformation that Target has been going through. And DevOps and Agile became really kind of core pillars or goals that we baked in to how we're changing our organization, and we suddenly got a lot of attention from the top, so much so that our executives, our VPs, would do town hall or huddle meetings to kind of share how we were progressing through our transformation with the teams. They were using the word DevOps so much, like my team jokingly said we should make a drinking game out of it. We never actually did, but if you think back to what Heather said a year ago, people wouldn't even use the word. Like, it was frowned upon to even talk about it, and now we're using it, it was like a daily conversation in the hallways. So to drive tops down support, we started to, you know, we have thousands of people in our IT organization, it's a massive organization, how do you scale things and how do you kind of look at how you're gonna drive this change across the organization? So tops down, we aligned with the ThoughtWorks CICD maturity model. We used that to start baseline measuring our different products across the company that are in our different portfolios. We set some goals and aligned some DevOps champions throughout the organization to help drive these practices within their spaces and be champions within their specific teams. And then we also did something that I thought was great. We were so inspired by the conference last year.
I know Gene shared my quote at the beginning, and we're like great, we've got all this interest and attention now, but we have hundreds of middle management even just in the IT organization. And a lot of them hadn't been exposed to this thinking and they weren't exposed to what, you know, we get exposed to when we're interacting with folks out here, so we decided to do our own little kind of mini DOES summit inside Target, and we invited folks that we thought had really relevant stories based on where we were going. So you can see the folks we had there, so Gene came in and keynoted, and we had Jason Cox, Scott Prugh, Jonny Wooldridge, Courtney Kissler and Nicole Forsgren all kind of making up this team. We made it a one-day event, and they essentially gave the same presentations they gave at the conference last year, but we were able to have all of our management at Target in our IT organization involved in that discussion and learning from those folks. And that was a tipping point for us in our movement. I mean, we energized a lot of people, and suddenly everyone was coming to the table saying all right, how can we get involved, how can we help? So now we had to figure out, so that's tops down, great, we've got the alignment, we've got support. Now how are we gonna actually scale this? That's the challenge. We gave this talk at Velocity a few months back and actually asked the audience, you know, there were a few hundred people in the audience, who's figured that out? Like, who's figured out how to scale DevOps practices across a massive organization? I think two hands went up, and I did get a chance to connect with the Capital One guys after that.
That was one of the hands that went up. You should talk with those guys, it's really impressive what they're doing. Topo is rocking it, so wherever you're at, great job. But we knew that, you know, we had to scale these practices across a very large organization, so how are we gonna go about doing that? First, we had to look at our structure, and there were some structural changes that we had to make. The model we're moving from, and we're in the process of this transition right now, you know, our organization was largely a COBIT-based, highly segmented model where you, Heather talked about it earlier, you had all these teams that had to come together to get anything done. And we're moving essentially to a product and service model, very simplified accountabilities, people are accountable for their entire stack. We've established key practice areas within the organization. So think of skills that we want the entire organization to learn and apply in their jobs, it's a key role that I actually have. And in our delivery model, we largely are shifting from a waterfall-based model that we applied to almost everything where every unit of work that came into our IT organization was a project.
That was the only way to get resources, to get money, to get anything done: you had to have a project, and you'd have hundreds of projects and it was crazy. And so we've shifted to this product focus, and the models that are largely embedded into the organization now are mostly Scrum and Kanban-based models to actually do the work, based on, you know, the characteristics of what the team is made up of. And then our technology modernization strategy, you know, we have a lot of legacy, we have a lot of different, you know, tightly coupled integrations in our environment. For key strategic capabilities, we are very focused on moving those things to a more modern technology architecture. So you know, API-based, loosely coupled architecture, lightweight, lean, self-service, we're trying to drive as much self-service into the organization as we can, things that are really optimized for cloud-based development and CICD practices. We're still working out how to apply this to legacy as well. Obviously, you know, different challenges, but we are trying to get as much right with our modernization strategy as we can. So great, we moved a bunch of boxes around, people changed reporting relationships. That doesn't actually change much of anything in terms of how people work, so now let's talk about how. How are we trying to impact how our teams are doing work? So one of the first things we did is we started to converge the movements. We had a large Agile movement and we had a large DevOps movement and we were kind of loosely connected, but we weren't as close as we probably could have been. We converged those efforts, and so we're kind of driving those jointly as part of our transformation. And we also started pulling in our organizational effectiveness organization to really look at how do we scale learning? And it's gonna be a combination, it is a combination of traditional, you know, training experiences, but more focus on coaching and hands-on immersive experiences. So we came up with an idea, this is one of my favorite parts of our journey right now. We created an internal incubator environment.
Publicly, I'll often refer to it as like a transformation emergence center we called the dojo. So the place you go to practice your craft. And the dojo is huge. You could fit probably three or four of these rooms in our dojo, and the idea is teams come to the dojo with their real work, we have coaches and subject matter experts that work with them in a very immersive environment and show them how to apply new practices to get that work done. They typically are in there for a long enough period of time for some of the stuff to sink in. There's a lot of activity there. We largely are trying to create what an engineering culture should look and feel like in the dojo, and you can see pictures of the dojo on the screen there. Our CIO is up at the top there. He's made multiple visits in. Gene's been out there, John Willis has been out there, it's been great to see some of the evangelists in the community out experiencing and participating in what we're doing. So the dojo really has three primary services. The first one, and I've talked a little bit about these last year.
They weren't as formalized then as they are now. The most important one, I would say, is what we call challenges. A challenge is at least a 30-day experience, sometimes longer, where a Scrum team will actually come in with their work and work in the dojo in a fully immersive environment. They're doing two-day sprints, they're demoing every two days, and we're teaching them all of these engineering practices, as well as Agile practices, in terms of how they're doing their work. We get them immersed long enough that they get comfortable working in that way, and then after the challenge is done, they'll move back out to wherever the team came from. Then there are flash builds, something we did a lot last year. We do them a little less now, but think of those as one to three-day events where the team is a little more confident coming in, and it's more of a way to quickly and collaboratively build a solution, build a product. And then open labs is really kind of open to the masses.
Twice a week people can come down to the dojo, we have all these SMEs there and available to work with them, and that's important because we wanna be accessible to everyone. We can't put everyone through challenges; we tend to focus those on our more strategic priorities. So we wanna be accessible to everyone, but we can't put them all through the formal training. So how do we actually prioritize demand? For challenges, which are the main service we provide there, we do focus on our strategic priorities. We can run up to 10 of them concurrently in the dojo, and we have eight going right now actually. If we don't have a capacity issue in terms of coaches or space, then that's great, we'll take in whatever challenge people wanna do. When we do start to run into capacity issues, then we focus back on those priorities. So that's the formal way for folks to get in. And then informally, we encourage people to bring their friends down, come to the demos. We have an open demo lounge in the dojo that everyone can see and participate in. Bring your friends down and take some WIP off someone's Kanban board even.
It's great to grow that culture informally and organically as well. So, results. The dojo has only been up for four or five months, and when we started, we had like one challenge going, and then we had a couple. Right now we have eight concurrent. We've done 14 challenges and six flash builds, and we've had over 200 learners cycle through that environment. And they've gotten all kinds of great results. I'm not gonna read all this, but the results people are largely getting are things like learning how to take something that would previously take three to six months, building a full stack environment, and doing it themselves as an empowered team in minutes. They can do that themselves when they come out of the dojo. Learning how to work in a collaborative environment where everyone can contribute and merge code, and it's not one person's role in some big batch activity that has to happen at the end of a cycle. Learning how to work in that kind of highly collaborative environment where you're supported working with each other on these Scrum teams. And we've actually put our executives, and this was actually our CIO's idea, through a hands-on DevOps workshop. They actually pushed code for real. They wrote some lines of code. And this is our VPs, directors, our CIO. We kind of fed them what they had to write, and it wasn't to teach them how to be good coders; it was for them to build empathy and understanding for how their engineers are supposed to work in this model. That was huge, that was one of the best events I've seen with that group together. It was highly interactive.
We were able to dive into some deep concepts with them, and it worked out so well that we're actually rolling that same workshop out across a lot of our middle management now as well, so we're gonna do multiple iterations of that. And by the way, some of the things that have come through the dojo are very strategic capabilities. We've had our point-of-sale system in there, we've had inventory, pricing, promo; these are very, very important retail capabilities. So what have we learned over the last six months? One, MVPs rock. People have a tendency to plan for everything; they wanna boil the ocean. In the dojo, we talk about the skateboard. I don't know if people are familiar with that diagram, but if you wanna build a car and you wanna show value along the way, build a skateboard, then a scooter, then a bike, then the car. Don't build a tire, then another tire. Your customer gets no value out of a tire, right? So it's about getting people oriented around what it really means to build an MVP and continue to build off of it. We've learned that a successful challenge needs a good charter. We were a little loose at the beginning, and it was hard to get alignment and commitment from folks at times, because it is a big commitment to spend 30 days in this environment, especially when people were sometimes assigned to multiple things. So doing a more formal charter where people actually signed off on their commitment was really important. Don't overly focus on one area.
When I started my organization, we were probably a bit infrastructure-focused, because that's where I sat in the organization, and so a lot of our coaching was oriented around config management and infrastructure as code. We've now expanded to cover more of the software engineering practices as well. Befriend your landlord. Everyone I've talked to in a large enterprise that's trying to go through this type of transformation has challenges dealing with their space people, and it's amazing how difficult it can be to get alignment on making a space that open and collaborative. Take them on the journey with you as much as you can. At times I feel like we were pushing pretty hard on them, but be friends with them, take them on the journey. Expect the unexpected. We lost one of our champions, a VP who was a big driver for this movement. That risked setting us back; we had to regroup and make sure we didn't lose momentum. And then ultimately, get comfortable with being uncomfortable. This is a big change for teams to go through, and it is uncomfortable. We have this sign plastered in the dojo. We have all kinds of cool signs, but Get Comfortable with Being Uncomfortable is an important one.
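As a rough, hypothetical illustration of the infrastructure-as-code style of practice that coaching covers, the sketch below shows what self-service, code-driven environment provisioning can look like in Python with boto3. The region, AMI and subnet IDs, and the provision_dev_environment function are illustrative placeholders rather than anything from the speaker's actual tooling, and the script assumes AWS credentials are already configured.

    # A minimal sketch (not the speaker's actual tooling) of self-service,
    # code-driven environment provisioning using Python and boto3.
    # Assumes AWS credentials are configured; the region, AMI ID and subnet ID
    # below are hypothetical placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    def provision_dev_environment(ami_id: str, subnet_id: str) -> str:
        """Launch a single tagged development instance and return its ID."""
        response = ec2.run_instances(
            ImageId=ami_id,
            InstanceType="t3.micro",
            MinCount=1,
            MaxCount=1,
            SubnetId=subnet_id,
            TagSpecifications=[{
                "ResourceType": "instance",
                "Tags": [{"Key": "Team", "Value": "dojo-challenge"}],
            }],
        )
        instance_id = response["Instances"][0]["InstanceId"]
        # Block until the instance is actually running before handing it over.
        ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
        return instance_id

    if __name__ == "__main__":
        # Placeholder IDs; substitute values from your own account.
        print(provision_dev_environment("ami-0123456789abcdef0", "subnet-0123456789abcdef0"))

In practice a team would more likely put a declarative tool such as Terraform or CloudFormation behind a pipeline, but the effect is the same: an engineer runs code and gets an environment in minutes instead of waiting months in a ticket queue.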
Where we're going: we have a large captive center of employees in Bangalore, and we're trying to drive this movement across the entire organization. There have been a lot of great things happening over there from a DevOps perspective, and we're actually looking to formalize the dojo there as well, so I might have more to share on that later this year. And advice for others. I'll go real quickly here because I think we're short on time, but don't wait to start. If you're thinking about how you're gonna approach DevOps, what kind of transformation you're gonna make, just start, start doing something. You can plan and talk about this stuff forever. Be exclusively inclusive. We say that kind of tongue-in-cheek, but when you're pushing for this kind of change, some people will feel like you're actually being exclusive, because you're challenging the way that they're working. You gotta figure out how to pull everyone in close. Try not to make it the cool kids' club; unfortunately we get that label at times, and we're trying not to, but that's how it can be viewed internally within an organization. Empower your change agents. If people are seeing challenges, they've got to be bold, they've got to speak up. If someone speaks up, others are probably feeling those same things, and they will join in and you'll get your movement started. Unlearn what you've learned.
This is hard. I've spent four years trying to build my Agile mindset and my DevOps mindset, and I'm still on that journey. For people who aren't going after this as hard as folks like me, it's hard. Executives have learned to think a certain way; they've gotten to where they're at because they've driven IT a certain way. Teams have been operating a certain way. You gotta work on breaking that down. And then connect into that broader community. This is a really powerful, supportive community. People actually wanna help each other here, so tap into that and get involved. And what are we still looking for help with? This is our latest experiment in how we're trying to scale, but we still need to figure out how to scale these practices across thousands of people. That's a big challenge, so if anyone has ideas there, I'd love to hear from you. The other thing I would say is that I feel like we get a lot of credit for what we're doing in the community, but we're still just starting this journey, and I wanna make sure people realize that this is a long and hard journey. We're going at it hard, we're sharing our story as we go, we're doing some really cool things, but we've got a long way to go.
- [Instructor] Okay, so that concludes lecture five.
Andrew is fanatical about helping business teams gain the maximum ROI possible from adopting, using, and optimizing Public Cloud Services. Having built 70+ Cloud Academy courses, Andrew has helped over 50,000 students master cloud computing by sharing the skills and experiences he gained during 20+ years leading digital teams in code and consulting. Before joining Cloud Academy, Andrew worked for AWS and for AWS technology partners Ooyala and Adobe.