CISSP: Domain 2, Module 1
This course is the first of two modules of Domain 2 of the CISSP, covering asset security
The objectives of this course are to provide you with an understanding of:
- How to classify information and supporting assets using policies and categorization systems
- How to determine and maintain ownership using data management practices
- Protecting privacy, through regulations, standards, and technology
This course is designed for those looking to take the most in-demand information security professional certification currently available, the CISSP.
Any experience relating to information security would be advantageous, but not essential. All topics discussed are thoroughly explained and presented in a way allowing the information to be absorbed by everyone, regardless of experience within the security field.
If you have thoughts or suggestions for this course, please contact Cloud Academy at firstname.lastname@example.org.
So we're going to move on to section three in which we're going to discuss how to protect privacy, and all the different features that will fit into that.
Now, the whole idea of privacy, of course, is that this is about information that discretely and specifically identifies individual persons. One of the underlying principles is that the individuals, the data subjects, should have control over their own personal information. The basic rules, and these are essentially paraphrases of what the laws actually say, are as follows. If personal information is obtained by some party, it needs to be obtained fairly and lawfully. It should be used for the original purpose only, and this purpose should be specified by whoever the collector of this information is. It should be adequate, relevant, and not excessive to the stated purpose. In general, it should be accurate and up-to-date, and there needs to be a process by which this is done. It must be accessible to the subject at all times, within reason. It needs to be kept secure by whoever is collecting and holding this data and making use of it for a specific purpose. And once that purpose is accomplished, the data needs to be properly destroyed.
Now, the European Union's Data Protection Directive 95/46/EC, quite well known in privacy circles, has a lot to say about this particular point. It says in its body that there are different things that will be done to strengthen the individual's rights in the European Union, which in its own way resembles the United States, in that it is a coalition of nations that have formed a political and legal bond amongst themselves. A directive coming from the EU acts, in that context, in much the same way that a federal law acts here in the United States, so a directive applying to all member countries of the EU has the effect of a federal law. Enhancing the single market dimension means that it sets a standard, usually assumed to be a minimum standard, that applies to all member states. They can of course pass their own laws individually, and these will most commonly take effect as extensions to the EU-level law within their own countries' boundaries. But this enhances the idea of a single basic standard that all will adhere to as a minimum. It adds revising the data protection rules in the area of police and criminal justice. This is an area where the rules need to enable proper police and judicial uses. But it needs to be looked at closely to make sure that, while law enforcement is not inappropriately hampered, this is done without sacrificing or compromising individuals' rightful management of their own privacy. It seeks to establish a foundation for how data, when it's transferred outside of the EU, is going to be handled by those transferring it and at its destination. It seeks to create a framework for better enforcement of the rules so that checks and balances are properly defined and enforced. The directive has since been replaced by the General Data Protection Regulation, which went into effect in May of 2018 and which we commonly refer to as GDPR.
Now, the GDPR itself, a very widely known law that went into effect in May of 2018, is based on a Swiss law template. As I was saying, this acts as a federal type of law in that it applies to all EU member nations and sets a minimum standard of regulation and enforcement. Member nations can of course pass their own extensions to it, which will work within the boundaries of each individual member nation that wishes to do so. What it does is level the playing field as a harmonizing agent, so that all nations within the EU, and anyone else subscribing to its principles, adhere to the same minimum standards. Now, saying it's minimal means it sets the lowest level that would be compliant. It doesn't mean that the standard is very, very low; it simply means that it specifies a minimum. It is intended to work with the United States as well, providing a framework for organizations within the United States who are doing business with organizations in the EU, or in other places that have adopted this same practice, so that the same standards will be in use at sources and destinations to ensure that the privacy and data protection principles are actively at work.
Now, as with all of these laws, they need to define their domain, in that they will have to define the various terms that they mean. So, for purposes of the GDPR, we have these definitions, Data Controller means the party who in accordance with domestic law is competent to decide about the contents and use of personal data regardless of whether or not such data are in fact collected. Now, this is a term that is presumed to mean potentially the owner of the data and it can be an organization that does the collection, but it is definitely the organization in whose possession this data is placed either through creation or collection. The term the Data Processor is some form of individual or organization that in some meaningful way manipulates the data, it can be a third party or it could be the Data Controller itself, they can be co-located in the same country or the Data Processor can be remote from the Data Controller.
Now, generally speaking, the data subject is the person about whom the data is being collected, processed, stored, or all of the above. It is a discretely defined individual, and the data can contain a wide variety of attributes regarding that individual. Personal data is the individual's data that is being collected about them. Now, transborder flows of personal information means movements of personal data across national boundaries. More specifically, this means movement of this information across EU member state borders or to some destination outside of the EU itself. Now, the objectives of delineating data management roles and responsibilities are to make sure that all parties concerned have clearly defined roles and the functions associated with those roles. We have to establish data ownership throughout all phases of a project so that a responsible party knows about and is in control of that data. It means that we have to have principles and practices that enforce data accountability, which of course means that there must be sanctions to deal with those times when that accountability has not been upheld. And we have to make sure that we have adequate and agreed-upon data quality and metadata metrics, so that we know what the data should be and, when it isn't what it should be, how to go about measuring and then correcting that situation.
The data owner is assigned the responsibilities of making all these definitions and historically has the responsibility of enforcing the policy, which over its life cycle is going to protect the data, ensure that it meets the CINA standards that have been set for it, that it is available only to those who are authorized and require access, and that throughout its life cycle, each phase has the proper protections in place and so that it is ultimately destroyed when it has outlived its purpose. Now, typically their responsibilities include this, they have to determine the impact of the information on the mission of the organization, which means they have to know what benefits we accrue from having that information, and what impacts will occur if that information is not available. They have to understand all the financial impacts what the cost of acquiring, maintaining, managing, replacing that information, assuming, of course, that it can be replaced. By policy, they also must determine who has a need.
Now, this is not a person of course, this is a role, and the question is why that role would need access to this particular information, and in what form and under what conditions, so that release can be judged or prevented as the case might be. And then there are the metrics involved in determining the state of the information's integrity: when it needs to be updated, or when it's no longer accurate or no longer needed, so that it can be eliminated. Now, throughout this entire process, documentation must be addressed and developed to ensure that we memorialize the program, so that those who come after will be able to deal with it and those who need to use it now have the guidance they need at their fingertips. Now, the data owners, the information owners, need to establish and document all these various characteristics, and they need to commit to metrics and criteria as well as they can in each case. That covers the ownership, intellectual property rights, and copyright of their data, so that we establish ownership. We have to address the statutory and non-statutory obligations relevant to the business to ensure that the data is compliant.
Now, statutory of course means in accordance with law; non-statutory means that we may have contractual obligations that have to be met. Now, those are not legally binding except within the context of that particular contract, but the data must still be determined as being compliant or not compliant with the contract. Then there are the policies for data security, disclosure control, release, pricing, dissemination, and all the other aspects of this particular program's execution. There is the agreement reached between users and customers on the conditions of use, and it needs to be set out in a memorandum of agreement or a license agreement before the data can be released. And this is just typical control of how the information is going to be allowed to be used.
The data custodian, historically, is a role that works on behalf of the data or information owner. They have the responsibility of seeing to the daily adherence to this policy, so that the policy ownership is the information owner's, and the data custodianship is a role that usually works for the data owner or information owner and enforces the policy. So their responsibilities will include adherence to the appropriate and relevant data policy and the ownership guidelines established in that policy. They have to ensure that the information is accessible to appropriate users, and that those levels of access are in accordance with that policy. They may have to engage in fundamental data set maintenance, which may include things like data storage, archiving, destruction at the appropriate time, and so on. They will have to take care of the data set documentation and the updates to it. And then they have to enforce quality and validation of any additions to the data set, including periodic audits to ensure ongoing data integrity is maintained. So as I said, data custodianship is typically assigned to a role that functions as an employee of, or a subordinate to, the data owner. And they are the ones who have the responsibility of enforcing the policy established by the data owner.
We have to establish metrics and methods for managing data quality. These need to be in place so that, at the point of creation or capture, we are able to establish what that quality should be. Then, in keeping with this policy, we need to understand how to do data manipulation prior to digitization, so that we know how the data is prepared and we ensure that proper checks and balances are in place to protect it from any sort of corrupting influence. We need to identify the collection and its recording to ensure proper methods are used when it's digitized, that is to say, when data entry is being done. Procedures need to be in place to ensure that error prevention is performed before the data is actually edited. The data is documented, we have proper methods for storage and archiving, the data presentation and dissemination is properly controlled, and proper uses are defined and established to ensure proper access control for those gaining access to it. So the standards need to be set governing the various attributes that you see on the slide. What defines its accuracy, or its precision, or its resolution, or its reliability? All of these characteristics need to be defined as would be appropriate for the individual data elements. Obviously, for those elements where, say, precision is not an appropriate term to apply to quality or integrity, you would of course exclude it. But what this reflects is that these are the things we need to define, making clear what the minimum acceptable standard would be and what the various boundaries would be for the data covered under these particular quality standards, so that we know that it is what it should be at all times.
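As a rough illustration of setting minimum acceptable standards per data element, here is a minimal Python sketch. The record fields (`age`, `balance`, `country`) and the rules themselves are hypothetical and are not taken from the course slides; they only show how a standard can be written down so that data entry can be checked against it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class QualityRule:
    """One minimum acceptable standard for a single data element."""
    field: str
    description: str
    check: Callable[[object], bool]


# Hypothetical rules for a hypothetical customer record.
RULES = [
    QualityRule("age", "accuracy: whole number between 0 and 130",
                lambda v: isinstance(v, int) and 0 <= v <= 130),
    QualityRule("balance", "precision: at most two decimal places",
                lambda v: isinstance(v, (int, float)) and round(v, 2) == v),
    QualityRule("country", "consistency: two-letter upper-case code",
                lambda v: isinstance(v, str) and len(v) == 2 and v.isupper()),
]


def validate(record: dict) -> list[str]:
    """Return a description of every rule the record fails (empty if none)."""
    failures = []
    for rule in RULES:
        if rule.field not in record or not rule.check(record[rule.field]):
            failures.append(f"{rule.field}: {rule.description}")
    return failures
```

Calling `validate` at the point of data entry is one way to perform error prevention before a record is accepted. An attribute that does not apply to an element is simply left out of the rule list.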
We characterize quality control and quality assurance in two different but related ways. Quality control is an assessment of quality based on an internal standard. That's the term control. This involves assessing the standards, processes, and procedures that are used to control, monitor, and correct the quality. Quality assurance is an assessment of quality based on standards external to the process. And as you'll notice, the text in red really highlights the difference between the two types of quality. This involves making sure that we review what is being provided to ensure that the external standard is being met properly. So we have to have ways, as we just covered, of measuring the data quality, and we have to attempt to verify that the data is what it should be in all of its pertinent respects. And since we have the processes of data entry and then data usage, we want to prevent, where possible, data errors from being created, entered, or included in the overall database. And then we need ways of detecting and correcting errors that have gotten in and need to be removed. Now, the documentation regarding this data is key to maintaining the quality, as is the definition process for all the steps we've just covered.
The objectives of this documentation will be to accompany the data throughout its useful lifespan within our organization. It helps us see that there are different ways the data can be used, and different forms that a specific data element might have, that can be included in different processes throughout our organization, so that instead of regenerating, reacquiring, or rebuying data, we're able to make multiple uses of it. This is the process that we know of as code generation. We want to be sure that data users understand these various contextual elements. It's equally important that they understand what the limitations of the data sets are. Data can say what it says, but data cannot be interpreted or made to say things that it literally will not say. We have to understand that there are limitations, and we should never attempt to make data say things that it doesn't actually say; that hampers its integrity and employs the data in ways that it probably should not be used. Because people need data throughout all phases of their work, we need to facilitate their discovery of this data. The alternative to facilitating discovery is not publishing it, not allowing people to know that the data already exists within the organization in some form, and that puts them in a position of having to define what they need and then going to acquire it, not knowing that it's present already. And that raises the other issue of data integration. Data that they need, they will go and acquire in some other way, and then if it gets included in the product or process that they're working on, it poses the issue of us having to analyze and then integrate this new data, which in its original form may not be something that we know about or whose quality standards we recognize, and so we're now faced with that.
So facilitation of the discovery of data sets is really a very important part of this process, to ensure that known data, known good, known high quality data, is what is used when it's appropriate, and that we have ways of acquiring it if what is needed is not present in-house, so that we can apply the standards that we have to it. And we want to be sure that, along with that, we facilitate the interoperability of data sets and the data exchange process. Part of all of this will be to take the data and, as we categorize it, put it in containers, if you will. We give it appropriate titles and file names, we have descriptions of data content, and we have the metadata that elaborates on what that is. These need to be accurate and need to reflect what the data actually is, because we have the underlying assumption that only authorized people will be getting at it. If we try to come up with some sort of naming convention that we employ in some obscure way to keep it from revealing what it actually contains, we complicate the data usage problem and we run the risk of harming its integrity. And so having proper names, proper content, and metadata for authorized users to look up and determine whether or not it's what they need is really the only sound way of proceeding. Our standards should of course describe these objects and features to make sure that they are properly named and properly collected, and the standards should describe usage, the parameters and quality measures that go along with it, and how the data is used in the organization in normal usage and occasional usage types of situations. The better we define all of this, the better job we do at enabling authorized users to gain access and make appropriate uses of the data. In doing so, we can probably save money, we can save time, and we can produce a higher level of quality, and thus trust, in the data products that are generated from the usage of all of these data elements.
We improve data consistency by rendering it more trustworthy, and we have fewer data integration problems that go along with all this. It means that people who acquire and use this data have a better understanding of what the data actually is, and that includes the proper uses and the limitations of that data. And it allows us to do better documentation of the information resources overall.
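To make the idea of accurate titles, file names, descriptions, and metadata a little more concrete, here is a minimal Python sketch of a data set catalog that authorized users could search. The `DatasetRecord` fields and the `search` helper are hypothetical names chosen for illustration and do not reflect any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Descriptive metadata that travels with a data set."""
    title: str
    file_name: str
    description: str
    owner_role: str
    keywords: set[str] = field(default_factory=set)
    limitations: str = ""  # what the data cannot be made to say


def search(catalog: list[DatasetRecord], term: str) -> list[DatasetRecord]:
    """Find data sets whose title, description, or keywords mention the term."""
    term = term.lower()
    return [
        record for record in catalog
        if term in record.title.lower()
        or term in record.description.lower()
        or term in {keyword.lower() for keyword in record.keywords}
    ]
```

Because the catalog describes each data set plainly, a user can discover that suitable data already exists in-house before going out to acquire it again.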
Throughout the data lifecycle, we need to specify the data modeling, the processing, and the database maintenance and security that are required from the point of creation or capture all the way through to the final step of destruction or some other form of disposal. Throughout all of these steps, we have to perform data audits to ensure that we're monitoring for proper use, and that the data is continually effective, that it has relevance and meaning in the various ways that it's being used, and, when it no longer does, to determine why, so that it can be refreshed, disposed of, or replaced. And then we need to make sure that the data is archived if it has a regulatory or contractual obligation for that, and that this takes place at the appropriate time. So understanding the user requirements will be the very first step, so that the data we will have can be acquired, created, properly configured, and supplied to meet them. And databases, of course, being the ultimate data repositories, need to be designed in accordance with those user requirements to ensure that they are appropriate and meet those needs.
Questions that need to be asked during this process will be, as you see here: What are the database's objectives? How will the database assist in meeting them? Who are the stakeholders? Who has a vested interest in its success? And so on down these lists of questions: Who will use it? What will the database hold? Some of these questions relate to its design, and some of them relate to the end user, how they're going to be coming into contact with it and what they're going to be seeking. That way, how the data is entered, formatted, characterized, accessed, searched for, et cetera, all becomes something that is ergonomically reasonable, possible, and even made easier by understanding, from a user requirements standpoint, how our data modeling needs to take place. So we begin with a concept of design. And these concepts need to be embodied in the mechanics and the storage structures to make sure that the database meets those users' needs. Whatever the data is, it is of course what will be stored. And so the database needs to reflect the type of data that it is. For example, a database that stores information regarding music or movies, such as IMDb or a media player on your computer, needs to be designed in accordance with what it will be storing, the object and its metadata. If it's a hierarchical database, it needs to be designed along the same lines, but appropriate to its particular type, structure, and use. The data objects, of course, need to have their relationships mapped out: what they are, and whether they're simple or complex. And having this diagram might be one of the best ways to begin the process of doing proper design.
Now, database management has to not only construct but also operate, modify, and update the database structure in accordance with its usage. Major hardware changes and software changes are going to occur periodically. It's not so important to know that major changes in hardware can be expected every one to two years and in software every one to three. The fact is, these two different things, hardware and software, are going to change on some cycle, and trying to forecast the precise nature or the precise duration of the cycles is not really the objective. The objective here is to map out your product roadmap so that you create something that is designed from its beginnings to adapt and evolve as things change over time, along with the business environment in which it will be used. Along with that, the data audit process needs to be defined, and it needs to employ the methods and the standards that we have defined up to this point in this process. In this data audit, we need to look at how the information is being used, who's using it, and whether or not the standards that we apply to it are valid, or if we have a case of, for example, data that is an exception to the established standard, and whether those exceptions occur more than a certain percentage of the time. If that's the case, it may reflect that the standard is not well defined, or that the data has changed in its character such that it no longer conforms to that standard, and the entire relationship between the standard and the data needs to be reexamined to redefine one or the other or both. We need to constantly refer to the resources and services being provided to ensure that this catalog continues to be appropriate to end user needs. And then looking at the data flow between organizational elements or the persons in them gives us the ability to understand how the data moves and how it's being employed, and by which party.
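The exception check described in the data audit can be sketched in a few lines of Python. This is only an illustration under assumed values: the 5 percent default threshold and the function names are hypothetical, and each organization would set its own percentage.

```python
from typing import Callable, Sequence


def exception_rate(records: Sequence, conforms: Callable[[object], bool]) -> float:
    """Fraction of records that do not conform to the established standard."""
    if not records:
        return 0.0
    exceptions = sum(1 for record in records if not conforms(record))
    return exceptions / len(records)


def standard_needs_review(records: Sequence,
                          conforms: Callable[[object], bool],
                          threshold: float = 0.05) -> bool:
    """True when exceptions occur more often than the agreed threshold.

    A True result suggests re-examining the standard, the data, or both.
    """
    return exception_rate(records, conforms) > threshold
```

For example, with a standard of "five-digit postal code" and the values `["12345", "9021", "ABCDE", "54321"]`, half of the records are exceptions, so the relationship between the standard and the data would be flagged for review.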
From this data audit, we're able to establish a better process for doing capacity planning, knowing how the data grows or shrinks and how it changes, and understanding how it is used, and this facilitates a better understanding of how to enable better sharing and reuse of the data. And monitoring the data holdings helps us identify where data leaks or data losses might take place, and plan controls for those particular places. It lets us judge how the control, management, and usage practices are working and whether or not they're truly relevant and effective for the data. Part of how that will drive what we're going to do on our product roadmap will be what sort of machines we need. And this can be anything from real hardware in our shop to cloud virtual equivalents. It helps us understand what network infrastructure we have, how it impacts the access and usage of the data, and how it may need to change. Database maintenance and updating are procedures that need to be looked at on a regular basis to ensure that they're proper and efficient. Part of every data management structure should be how the database that holds it all will be backed up, how it will be recovered at those times when the database might take an exception, stop, or be corrupted, and then of course how the data will be archived when it reaches the end of its useful life. Part of every data policy that we're going to have will be to define what access, sharing, and dissemination rules we're going to put in place.
So we need to understand data ownership because that is where the responsibility for creating embedding in policy and then enforcing these rules will take place. It has to look at the needs of who require access to the data? So that it can properly address and not impair how it's going to be used by those authorized users. It needs to look at the various types and the differentiated levels of access that users will have to it, to ensure that the levels themselves are properly defined, giving the proper kind of access, restricting improper type, and making sure that we have a better way of assigning it to the various people. We have to always look at cost because this system like all the things does not come free and at the very least, it requires manpower and the work necessary to ensure that it's being done and that it's cost effective, that it's not an inefficient method. Drawing from all the requirements we have gained from the stakeholders, we have to understand what the appropriate format will be for delivery to the end user. And they're the only ones who can tell us that, so it does mean we will have to actively engage with them to understand what that is, because that leads to system design considerations to ensure that we are delivering the right data in the right format to the right location. And there will always be the issue of determining privacy concerns and whether public domain plays a role in ensuring how the data is collected. Just because it's free doesn't mean it's something that we want, but just because it isn't free, doesn't mean it's something we should avoid and try to find something that is, because we may sacrifice other requirements for the sake of getting it for free. We have to examine by necessity, what liability issues may exist. There may be data elements inside the metadata, for example, that may have a direct impact on this data and any legal ramifications should it end up in the public eye or in the wrong hands. 
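The differentiated levels of access mentioned above can be sketched as a simple role-to-level mapping. This is a minimal Python illustration, not a full access control system; the role names and the four levels are hypothetical, and in practice the mapping would come from the policy set by the data owner.

```python
from enum import IntEnum


class Access(IntEnum):
    """Differentiated access levels, ordered from least to most privileged."""
    NONE = 0
    READ = 1
    WRITE = 2
    ADMIN = 3


# Hypothetical policy: the data owner decides which role gets which level.
POLICY = {
    "analyst": Access.READ,
    "data_custodian": Access.WRITE,
    "data_owner": Access.ADMIN,
}


def is_allowed(role: str, requested: Access) -> bool:
    """Grant the request only if the role's level covers it; unknown roles get NONE."""
    return POLICY.get(role, Access.NONE) >= requested
```

Assigning access to roles rather than to named individuals keeps the policy stable as people move between jobs.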
Having a disclaimer is a warning kind of control, a deterrent, to let people know that there are certain things they need to be aware of: limits on what the data will say, limits on how it can be used, and limits on whether or not you're authorized. Legal impacts might be something that would be included as a statement in the disclaimer. We need to consider where the data is at rest, and whether there are legal or jurisdictional issues that may be specific or unique to that particular geography. We need to consider whether the data moves on the wire, and whether there are similar issues associated with how it moves, where it's coming from, where it's going to, and where the data is being consumed. And again, the same question must be asked and answered: what specific legal or jurisdictional issues might be in place? Other aspects include single use or multiple use, and what that does to our versioning and other types of controls. Do we need to institute a program of data obfuscation, or other forms of restriction, to protect sensitive data from being visible to people who are unauthorized, or to people who are authorized for only certain parts of a data set rather than the entire set? And then what is the impact of the data being available? Now, one of the aspects of a more technical nature is what we do about data remanence.
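Returning to the data obfuscation question raised above, here is a minimal Python sketch of field-level masking for users who are authorized for only part of a record. The function names and the choice to leave the last four characters visible are assumptions made for illustration.

```python
def mask_value(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hide all but the last `visible` characters of a value."""
    if visible <= 0 or len(value) <= visible:
        return fill * len(value)
    return fill * (len(value) - visible) + value[-visible:]


def obfuscate(record: dict, sensitive_fields: set, cleared_fields: frozenset = frozenset()) -> dict:
    """Return a copy of the record with sensitive fields masked.

    Fields named in `cleared_fields` are ones this particular user is
    authorized to see in full, so they are left untouched.
    """
    return {
        key: mask_value(str(value))
        if key in sensitive_fields and key not in cleared_fields
        else value
        for key, value in record.items()
    }
```

The original record is never modified, so the same stored data can be presented differently to users with different authorizations.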
First, data remanence means the residual physical representation of the data. It may have been erased in some way, such as by simply doing a file delete, or it may have been partially overwritten, but part of it remains. The issue with data remanence is that the data needs to be removed entirely. And as we know, erasing or deleting a file does not, in fact, delete any of the file's contents. So even after the media is erased, there's probably going to be something we need to do following that to ensure that the data is in fact actually gone, or has been rendered into a state where it cannot be reconstructed. On a traditional rotating type of hard drive, the data, as you know, is magnetically written onto the drive by altering the magnetic field on the hard drive platter.
Now compare that to a solid state drive, which does not use magnetic platters; it uses flash memory, which is chip-based. Now, there are methods that will apply to both, but there are methods that will apply primarily or exclusively to traditional rotating hard drives, and these are clearing, purging, and destruction. So for the next few slides, we're going to talk specifically about these methods. The idea behind clearing is that we're going to remove sensitive data. Then we're going to put down a new image on top of that, and then re-provision the device; at least that is the initial plan for this particular device. Typically it's going to be redistributed to persons within our organization. In other words, it's not going to go outside of the organization itself. Now, to recover the information that has been cleared, it may require specific types of tools: utilities such as Norton Utilities, or even forensic tools. But whatever the case, clearing by this definition is going to be done to a machine that will be re-provisioned within our organization. If it's not, if it's going to be put out to another party, say as a donation or sold to other parties, then we need to do something stronger; we need to do purging.
When we do purging, we're going to use much stronger methods to garble or completely remove any data that was there, so that whatever the technique might be, even strong forensic techniques, that data cannot be found and reconstructed by any known method. Now, the easiest way of course to deal with any of this would be to simply destroy the media. Here we have a way of taking a computer, be it a laptop, a desktop, or a server, pulling the drives out of it, and setting them up to be destroyed, either by us internally or by a service such as Shred-it or Iron Mountain. If we're going to clear, it means that there may be some data left, but because of who the device is going to go to, this is a low concern. If we're going to purge, then we'll use stronger methods to actually remove the data or ensure that whatever might be left cannot be reconstructed into something that could be exploitable, so that recovery would take the strongest possible methods and probably still not be successful. With destruction, we eliminate all of that simply by destroying the drive and rendering it into a state from which it cannot be reconstructed at all.
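The choice between the three methods, as described above, depends on where the device is going next. A minimal Python sketch of that decision follows; the function and parameter names are hypothetical, and a real policy would also weigh the sensitivity of the data.

```python
from enum import Enum


class Method(Enum):
    CLEAR = "clear"      # device is re-provisioned inside the organization
    PURGE = "purge"      # device is donated or sold to another party
    DESTROY = "destroy"  # device is not going to be reused at all


def choose_method(leaves_organization: bool, will_be_reused: bool) -> Method:
    """Pick a sanitization method from the device's planned destination."""
    if not will_be_reused:
        return Method.DESTROY
    if leaves_organization:
        return Method.PURGE
    return Method.CLEAR
```

For instance, `choose_method(leaves_organization=True, will_be_reused=True)` selects purging, which matches the donation or resale case.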
So we can do a specific formula of overwriting. We can do degaussing, a method by which we demagnetize the drive, or we can do a newer method called Crypto Erase. When it comes to media destruction, whatever method you use, incineration, grinding, crushing, or phase transition, these are going to effectively destroy the media in such a way that it cannot be restored, reverse engineered, or reassembled. Now, for the sake of clarity, and in the event that you might see a question on this, I draw your attention to the lower right-hand box that says, "Curie temperature." This is not a single fixed number of degrees; it varies with the material. When any sort of magnetic media is exposed to a heat source that raises it to its Curie temperature, that is the temperature at which the material loses its magnetization, and the data recorded on it is permanently lost. Typically we achieve that by doing incineration. For solid state drives, we're going to try something relatively new. We're going to use Crypto Erase. For SSDs, which are not based on magnetic rotating drives, we cannot use the method known as degaussing, because that functions through magnetic coercion to realign the atomic structure of the magnetic coating on the platters of the rotating hard drive. Since SSDs do not use magnetism, this method will not work, and since they use flash memory, overwriting does not work reliably either. So we have to use a different method.
So the data destruction that we do here involves the technique we've come to know as Crypto Erase. Crypto Erase is typically based on an encryption capability that SSD manufacturers now normally build in, so that information can be erased and thus rendered non-restorable to a human-readable form. The cryptographic erasure takes place by using symmetric encryption, such as AES or similar, to encrypt the data marked to be erased so that it cannot be decrypted and restored to a human-readable form. Once the Crypto Erase operation has been performed, crypto shredding is then performed on the key, a bit-level dispersion of the key, so that the key itself cannot be reclaimed and then used to bring that Crypto Erased data back into a human-readable form.
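The idea behind Crypto Erase can be sketched in a few lines of Python. This is a toy model under stated assumptions: the class `SelfEncryptingStore` and the helper `keystream_xor` are hypothetical names, the cipher is a simple SHA-256 keystream standing in for AES (it is not suitable for real use), and the sketch assumes data is encrypted as it is written, which is how self-encrypting drives commonly behave.

```python
import hashlib
import os


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a SHA-256 based keystream.

    Stand-in for AES in this illustration only.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class SelfEncryptingStore:
    """Hypothetical model of a drive that only ever stores ciphertext."""

    def __init__(self):
        self._key = bytearray(os.urandom(32))  # media encryption key
        self._nonce = os.urandom(16)
        self._ciphertext = b""

    def write(self, plaintext: bytes) -> None:
        self._ciphertext = keystream_xor(bytes(self._key), self._nonce, plaintext)

    def read(self) -> bytes:
        if self._key is None:
            raise RuntimeError("key destroyed; data is unrecoverable")
        return keystream_xor(bytes(self._key), self._nonce, self._ciphertext)

    def crypto_erase(self) -> None:
        # "Shred" the key: overwrite its bytes, then drop the reference.
        for i in range(len(self._key)):
            self._key[i] = 0
        self._key = None
```

After `crypto_erase()` the ciphertext is still physically present, but without the key there is nothing to turn it back into a human-readable form, which is why the erase can be fast regardless of how much data is stored.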
Now, when it comes to cloud data erasure, how can I be certain that the data has in fact been erased? Cloud providers typically use a technique called erasure coding. What this does is break the data up into fragments. The data is expanded with redundant encoding, so that whenever the data object is erased by its reference name, all the different parts stored in different locations throughout the data storage array are found and destroyed by these methods. Now, for these different methods, we find the standards for media sanitization provided by NIST in Special Publication 800-88 Revision 1, updated as of December 2014. A comparable model would be the Communications Security Establishment in Canada's guidance for clearing and declassifying electronic data storage devices, which we find here as of 2006. There is also the NSA/Central Security Service media destruction guidance; New Zealand has its own manual, and the Australian Government likewise has its own manual. And it should come as no surprise that most of the methods discussed in each of these reflect the same standards we've just been discussing, such as degaussing, a specific formula of overwriting, physical destruction and, the most recent of them, Crypto Erase.
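As a rough illustration of the erasure coding idea mentioned above, the following Python sketch splits an object into fragments, adds one XOR parity fragment, spreads them across storage nodes, and removes every fragment when the object is deleted by name. Everything here is an assumption made for the example: real providers use stronger codes (Reed-Solomon style schemes, for instance), and the names `ToyObjectStore`, `encode` and `recover` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int = 3):
    """Split data into k equal fragments plus one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\x00")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity], len(data)


def recover(fragments, missing_index: int) -> bytes:
    """Rebuild any single lost fragment by XOR-ing all the others."""
    present = [f for i, f in enumerate(fragments) if i != missing_index]
    return reduce(xor_bytes, present)


class ToyObjectStore:
    """Hypothetical store: one fragment of each object per node."""

    def __init__(self, node_count: int = 4):
        self.nodes = [dict() for _ in range(node_count)]
        self.lengths = {}

    def put(self, name: str, data: bytes) -> None:
        fragments, length = encode(data, k=len(self.nodes) - 1)
        for node, fragment in zip(self.nodes, fragments):
            node[name] = fragment
        self.lengths[name] = length

    def get(self, name: str) -> bytes:
        data_fragments = [node[name] for node in self.nodes[:-1]]
        return b"".join(data_fragments)[:self.lengths[name]]

    def delete(self, name: str) -> int:
        """Erase the object by name on every node; return fragments removed."""
        removed = 0
        for node in self.nodes:
            if node.pop(name, None) is not None:
                removed += 1
        self.lengths.pop(name, None)
        return removed
```

The design point to notice is in `delete`: because the fragments are tracked under the object's reference name, erasing by that name reaches every location where a piece was stored, including the redundant parity piece.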
As with all of the other official guides that I've mentioned in this course, you will not be asked for the details of each of these, but you should have a general understanding of what the methods are, why you would choose one over another, and what the benefits of each one are.
All right, we've reached the end of our first module in Asset Management. We're going to continue on with our next chapter, but this concludes our first portion of this module.
About the Author
Mr. Leo has been in Information Systems for 38 years, and an Information Security professional for over 36 years. He has worked internationally as a Systems Analyst/Engineer, and as a Security and Privacy Consultant. His past employers include IBM, St. Luke’s Episcopal Hospital, Computer Sciences Corporation, and Rockwell International. A NASA contractor for 22 years, from 1998 to 2002 he was Director of Security Engineering and Chief Security Architect for Mission Control at the Johnson Space Center. From 2002 to 2006 Mr. Leo was the Director of Information Systems, and Chief Information Security Officer for the Managed Care Division of the University of Texas Medical Branch in Galveston, Texas.
Upon attaining his CISSP license in 1997, Mr. Leo joined ISC2 (a professional role) as Chairman of the Curriculum Development Committee, and served in this role until 2004. During this time, he formulated and directed the effort that produced what became and remains the standard curriculum used to train CISSP candidates worldwide. He has maintained his professional standards as a professional educator and has trained and certified nearly 8500 CISSP candidates since 1998, and nearly 2500 in HIPAA compliance certification since 2004. Mr. Leo is an ISC2 Certified Instructor.