1. Home
  2. Training Library
  3. Programming
  4. Programming Courses
  5. Fundamentals of Data Structures in C++

Arrays and Linked Chains

Contents

keyboard_tab
Introduction
1
Course Overview
PREVIEW54s
Conclusion

The course is part of this learning path

Start course
Overview
Difficulty
Intermediate
Duration
2h 28m
Students
17
Ratings
3.6/5
starstarstarstar-halfstar-border
Description

In this course, we will explore the fundamental concepts of abstract data types (ADTs) and data structures. Then, we'll discuss specific ADTs and common implementations. We'll first learn some basic terminology and then learn about array-based implementations, as well as a new approach using a data structure called the linked chain. We will put all that into practice through some demo projects, including a link-based queue and a stack that uses a linked list.

Intended Audience

  • Beginner coders, new to C++
  • Developers looking to upskill by adding C++ to their CV
  • College students and anyone studying C++

Prerequisites

To get the most out of this course, you should have a basic understanding of the fundamentals of C++.

Transcript

In the previous lecture, we distinguished between an abstract data type or ADT  and a data structure. We learned that ADTs expressed the responsibilities of an object but don't specify how that object is supposed to perform those responsibilities. We also learned that data structures are implementations of ADTs for collections of elements. ADTs specify what something does but not how. The data structures fill in the how, that is, the implementation and format of the data that is required to accomplish the goals of that ADT. The first and one of the most fundamental data structures in C++ that you're probably already familiar with is the array. Specifically, we're looking at the built-in array. It's very simple. As a bit of review and perhaps extending your knowledge of arrays, an array is a collection of elements. Therefore, that makes it a data structure.

When you declare an array, for example, int myArray[5] in C++, this creates an array named myArray of size five. So the valid indices of this array are 0, 1, 2, 3, and 4. As you can see, the highest index allowed is one less than the size of the array. This is, of course, due to the fact that we index arrays from zero. Another thing about arrays is that the memory allocated for them is contiguous. That is, the memory for each element in the array is adjacent to the other elements in the array. This enables arrays to support random access similarly to the way random access memory or RAM works. The base address is the address of the element zero. The system can add to the base address a certain number of bytes in order to access a particular index. That's why it's called random access.

In fact, we can access any element in the array, and the system does so by performing a little bit of basic math and without looping through the different elements of the array sequentially. So, as you can see, you can start an array at, say, address 1000. These addresses are in decimal to make it a little easier. Usually, we show memory in hexadecimal, but starting at 1000 and each integer takes up, we'll say four bytes which is pretty typical for C++. At 1004, the element at one which is seven starts, 1008 is 22, 1012 is 64, and 1016 is 30 at index four. But we said, of course, that we don't want to have to go through in an iterative or sequential fashion just to get to one of these cells. Now, this isn't too bad, but what if we had 400,000 elements or even 1000 elements, that could make it very inefficient to start at the beginning and go all the way through.

So, given an index and data type size, you can actually calculate the memory address immediately that you want, that  is, you can randomly access an element at a particular index. If someone asks for something at, we'll say index three, as the examples given, you take that base address of 1000, which is where the array starts,  and you add the index times the size of the data type. In this case, it's index three, times four bytes which is 12. So, 1,000 + 12 = 1012. And that is how it jumps in memory to the location at index three, which is the element 64. Pretty cool. One disadvantage of an array is it's fairly resource intensive, relatively speaking to add elements to anywhere but the end of the array. This is especially true if the order of the elements in the array matters. So, if we have 100 elements and you have to insert an element at say index 50 where something already lives, you have to start at the end and move every element after index 50 over to its right.

Think of a long, very full row of seats in a movie theater, also called a cinema in some countries. Maybe there's an empty seat at the end of the row, but you want to sit next to your friends right in the middle. So, you have to ask everyone to scoot over a seat until the seat right next to your friends opens up. Another disadvantage to arrays is they are fixed size. If you declare them using the standard notation, their size is completely unchangeable. An alternative is to use dynamic arrays, which is a data structure that's a variation on the simple built-in array, and is actually used by data structures like the STL vector. You use a pointer to point to the allocated memory for an array. Still, if it gets full in the dynamic array, you can't just add cells to the end, you don't know what lives there.

You have to reallocate a larger empty array on the heap, copy over all the elements from the old array, point to the new array as the array you want to work with and destroy the old array. That's very resource intensive as well. But by this point in the course, you're all pros at using arrays. So, I won't belabor that issue much longer. If you need to review, please go back to the section on arrays and go through that again to knock some of the dust off. But the big question is, what do we do about some of these problems with arrays? Do we have an alternative? What if we would really like a data structure of some sort that's pretty easy to insert something in the front middle orient? What if we want some sort of collection of elements that we can keep growing as needed and not have to worry so much about it filling up and having to make a brand new one and then copying all the elements over like we do with arrays. Well, you're in luck. In comes the linked chain data structure. The core to a linked chain is the node. The node is an individual entity that contains the data that we are interested in, which corresponds to the element we would have in an array as a particular element in its cell. However, in addition to the data we are containing in our collection, we also have to have a pointer to the next node in the chain, and that's how we create a collection. Typically, we have two classes involved: a node class holding the data and the pointers to the next, and a linked something class which we will use as single pointer to point to the first node in the chain, and then each node will point to the next node in the chain.

So, we have to maintain that first node in order to add, remove, and modify nodes in the chain. For our simple little example, we're going to create a node class and just construct a raw linked chain without building a full class around it. Just to get a feel for what it looks like and what it feels like what it does, we'll implement a linked chain of nodes containing integers. Obviously, you could make the nodes of any types, or even use templates, allowing for a single implementation that can handle a vast number of data types, but we'll focus on integers for simplicity. Let's create a new project called LinkedChainFun. So I've already got Visual Studio started up here and I'm at the project screen here with C++ as the language, and a directory that I'm happy with here LinkedChainFun.

I'm going to hit 'Create' and now we need to create a main.cpp. So, main.cpp. And also I'm going to make a header file called Node.h. And in fact I'll put the entire code in the .h file, which is something we hadn't really been doing before so much, but we'll do it for this section because it makes things a little bit easier and cleaner, and you don't have as many files floating around. have as many files floating around. You could break it up into an implementation and specification file. We'll get the skeleton at very least here. Include Node.h. Here we go. So far so good. And now we're going to make our little node class ifndef NODE_H, define NODE_H. Okay. And we're going to have class Node, public. I don't like how it in dense that. I'm going to do private. The private data in this case, it's just going to be the data literally, and then also a pointer to next, public, data, Node*next.

And then we're going to also have setData if we want to change the data at some point. All right, we'll fill the code in just a moment. setNext, which will take a pointer to what the next node should be. getData, and that also has a little partner for the other piece of data which is getNext. Now, some professors will kind of do the, C style authors will kind of do it too sometimes, and they might just use a struct for the node, and that's fine. They might also maybe make a class but they'll make the data public and then they won't bother providing any methods. But I chose to do it kind of a more object oriented way so that you can see how it might be done this way. I would say this is probably better. There we go. this - > next > next;. We've got setData, data = data;, setNext this - > next = next;. And then we've got return data; and of course, return next;. Looks good so far and tell essence wasn't working for some reason so we may have an issue. We'll have to iron out and see what's going on. Normally, again it's not good to return private data member that's a pointer. That's another thing to make note. This gives the outside world full access to it. In the examples, we'll use the node classes exclusively used by our various linked entities so we could make the linked entities a friend of the node class and keep the data private without providing methods as one solution. But for right now, we're just going to keep it super simple. We're being a little sloppy at the moment but we'll write better code in future lectures coming up. Don't worry. And some of this we might keep a little bit sloppy just so you can see the simpler aspects.

So, what we're going to do, we're going to inside of, I guess we'll start outside of the main, we're going to create a couple methods to help us. I'm going to create one called createChain(). These are not member methods as you can see. Reference to the pointer, the head of the list. We also have printChain(Node* head);. We want to print the chain. So, now I'm going to copy all of these, put them under main and then we'll start working with it. Inside of main, I'll get some of this party started here. We'll create theHead here. theHead will point to the head of the chain that's returned. Then we're going to call printChain (theHead); and also deleteChain(theHead); as well. Since we have to manager of memory, we need this deleteChain. So, how do we create the chain in the first place? So, what does this look like? It's very different from what we’ve dealt with arrays. So, we're going to create an arbitrary chain of nodes. We're just going to start with nullptr head. It's a good idea to always set it to nullptr to start with. I usually just do that so I can get the code to compile. In this case, it may not be that necessary but null’s a good value as a default for pointers. So in this one, I'm just going to create a chain of 25 values. head = new Node (i and head);. So, the data is just going to be the value i, so zero through 24, essentially on each of these nodes. And then you'll notice it doesn't seem like we're doing much here but really what we're doing is we're setting head = the new Node that has this data and then points to the next or the previous head. So the head, actually in this case it would be necessary to do this. The head is currently null.

The first node that gets created at index i = 0 has data zero and then sets its next to theHead and then we set theHead equal to that node. So, what that does is that creates our new head as that new node we just created. The next time through i = 1, it would create the node with the data, remember the nodes separate from the chain at this point, It creates it with the data i  which is one at that point. And then, it also points as its next value, it's next pointer, to the current head and then it becomes the head because we set theHead equal to that whatever this returns. So, this might look a little confusing but just remember this whole thing needs to be done first before this gets assigned. So when over here, this is referring to the previous head, not the current head. The current head gets updated this way. And then at the very end, we update the head.

So, we're adding to the front of the list and it's growing from that front all the way down the chain. Deleting the chain. Not too particularly difficult, nodeToDelete; while the head is not nullptr because we know at the very end, we have the nullptr. We're going to do this. We're going to say nodeToDelete = head;. Then we're going to say head = -> getNext();. And then we're going to delete, the nodeToDelete. Now you might think well why don't we just say head or deleteHead and then head = headsNext. Well the problem is, once you delete the head, remember it deletes that node. It invalidates it. So, that means that that object no longer exists. So therefore, it doesn't have the information about the rest of the chain and that actually causes a memory leak. So, we have to grab the head at the beginning. Then we actually move the head to the next node in the chain and then we delete what used to be the head because  we disconnected from the chain and then we delete it. Now, printChain is going to look pretty similar but it's not exact and there's some important distinctions.

So, he said walker =  head and while (walker != nullptr), what are we going to do? We're going to use cout and we're going to print out the data. And then set walker = walker -> getNext();. So, in this case we're not using head and you might wonder why don't we just use head and why do we need this walker thing. Walker copies the address of head. And then as long as it's not an all pointer, it walks the chain getting the data and then going to the next. If we updated head, remember that head is the data we're trying to keep track of. We've actually passed to them. Here, it's not a good idea to, you don't really want to update the chain itself. So, it's a good idea to just get in the habit of grabbing the value of the address of head here and not trying to update it to the next. So, let's see what we get and see if there's any errors. No errors. And we will start without debugging of course and that's pretty much what we expect.

Remember? Because it started at the beginning and we had zero added to the chain and then the next time through when i was one, it added a one and then pointed to this was the head, the zero was the head. And then the one was pointing to that, the one node and then the two node gets added and points to the one node. So, the chains actually connected this way, if you think of the arrows going down like that. So, that's pretty cool. And then at the very end we go through and we destroy each of the nodes in turn. Awesome. So, we maintain a node pointer as a variable, the head of the list and each time we add a node in our creation process, we add the new node at the beginning to make it the new head and then the old head is passed into the node constructor so that it is the next in the chain. We print the chain using a walker too. You guessed it. Walk the chain and print out the data held in each node. Technically speaking, I could have just used the parameter directly for printChain since I didn't pass it by reference.

so, it's just a copy of the address and doesn't really affect the argument itself but this walker technique is common inside of these linked collection classes and in those situations, the head node typically is a field. So in that case, it's a data member of the class so you definitely don't have a copy, you have to make a copy. If you don't make a copy of the entire chain will get left in our wake as we walk the list. We definitely don't want that unless we're deleting the list. So before moving on, I have a challenge for you. I'd like you to modify our code so that after the print method has printed out all the elements in the list, it also prints out how many elements were in the list. You don't have to modify the node. You just modify something in printChain here. So, pause the video and give this one your best shot. Come back when you're done or if you need some help. How did that go for you? Were you able to complete this challenge?

This one was pretty straightforward if you were understanding what's going on. Let's update the code and add the counting feature to our list. So, we go down to the printChain here. I'm going to add a counter right here, maybe right below there. counter = 0. So, we start with null nodes. And then every time we're getting ready to go across another node, we take the counter and increment it. And then of course, and while there, and then of course we're going to actually printout something like "Number of elements":. And then we print out the <<< counter <<< endl;. And let's run it again. Make sure it looks good. It succeeds. And it'll print out the.  This is an exception that happens sometimes if you have antivirus. So, I will actually shut that off. So, live television.

So, maybe that will help someone else if they get that same problem. You can actually go to your Settings if you have something like McAfee LiveSafe and then you can go to the real time scan and that actually has to be turned off. So, I'm going to actually tell it when I restart my pc for now because it keeps thinking that it'll say insufficient resources which doesn't really make sense but that's what visual studio thinks is happening. It's not actually what's happening. I think I mentioned that at another point in the course but just wanted to make sure that you were aware of it again. There we go. Now it worked. So, it just seems to randomly do it and sometimes it does it especially when there's loops.

It thinks you're up to something and it thinks it's a virus. So, it's because the program just comes out of nowhere and it's like where did that come from? So, anyway focus again. We've got our list printed here and then we've got a number of elements printed out, which is what we did here. That's 25 which is exactly what we expect. Awesome work. So, we see the number printed out as well as the size of the list as well. In the next lecture, we will begin our deeper dive into Data Structures and learn about the ADT List. I'll see you there.

 

About the Author
Students
316
Courses
20
Learning Paths
4

John has a Ph.D. in Computer Science and is a professional software engineer and consultant, as well as a computer science university professor and department chair.

Covered Topics