CSSLP Domain 5:2 - Security Testing



This course covers section two of CSSLP Domain Five and looks at security testing. You'll learn the different types of tests and how they can be performed to confirm the security of your code.

Learning Objectives

Obtain a solid understanding of the following topics:

  • Types of security testing
  • Security testing methodologies

Intended Audience

This course is intended for anyone looking to develop secure software as well as those studying for the CSSLP certification.


Any experience relating to information security would be advantageous, but not essential. All topics discussed are thoroughly explained and presented in a way allowing the information to be absorbed by everyone, regardless of experience within the security field.


We're going to continue now with security testing. So far, we've covered the various types of software testing for quality assurance and the different methodologies for security testing. We're now going to look at testing as it pertains to software security issues: the different types of tests and how they can be performed to attest that the security of the code is as it should be.

There are, of course, various types of security tests, and they take a different approach and perspective from the other forms of testing that we have discussed thus far. Security testing is employed specifically to uncover vulnerabilities, threats, flaws, and risks in software and, ideally, to prevent malicious attacks by addressing and fixing these things. Now, some forms are passive, as in the observation of a system component or aspect that we examine for certain characteristics or features.

Other forms are active, which involve a dynamic manipulation of processes and programs to provide a range of inputs or actions and determine what outputs or conditions are produced. So essentially, cause and effect. The purpose here is to identify the discoverable weaknesses of the test object that can in turn result in some form of negative impact on the organization or the system, and this is done with a view towards mitigating or compensating for them.

There are, of course, many different kinds of testing tools. It is not important for the CSSLP to know everything about everything, but we do need a thorough understanding of what each tool is properly used for, how it can be used, and how it can affect the overall state of software security.

So here we see different kinds of tool types: information gathering or reconnaissance tools, vulnerability scanners, fingerprinting tools, sniffers and protocol analyzers, password crackers, various types of web security tools, wireless tools, reverse engineering tools (assemblers and disassemblers, debuggers, and decompilers), source code analysis tools, vulnerability exploitation tools, security-oriented operating systems, and, what is becoming more important these days, privacy testing tools.

So as we mentioned earlier, every program, every system, every component, and every entity has an attack surface of some description. This is the sum of all possible entry points for unauthorized access into a system or entity, or into various parts of its infrastructure. The attack surface includes all vulnerabilities and endpoints that can be exploited to carry out a security attack of some kind. It is also the entire area of an organization or a system that is susceptible to hacking, including the human end through social engineering.

Now, even though the product of this name (Microsoft's Attack Surface Analyzer) was built to examine the Windows platform, every entity, every operation, every program, and every system needs to have its attack surface examined carefully, looking at the various components and, in the case of software, obviously, its programs and the infrastructure in which they operate.

One of the more common forms of testing and examination makes use of a scanner of some description. What a scanner does is examine or enumerate a test object in search of specific characteristics or behaviors. For example, network scanners generate a map of the type and extent of the computer ecosystem.

We have scanners that look at the operating system itself and examine, as before, the attributes governing the operating system. If it's a passive type, it doesn't actually contact the host, but scans for static information, as in version fingerprinting. There are, of course, active versions, which use crafted packets to stimulate a response and then examine the contents of the response packets.

There are vulnerability scanners, which scan to detect the presence or absence of defined vulnerabilities and whether or not they have been remediated. There is also banner grabbing, which, like fingerprinting, reveals substantial information about the scanned object. Some form of passive scanning is normally the first step that an attacker will take when conducting intelligence gathering on a target, and also the first step when the operations team wants to begin deployment of a new application or system.
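To make banner grabbing concrete, here is a minimal sketch in Python. It is an illustration rather than a description of any particular scanning tool: the function names `grab_banner` and `parse_ssh_banner` are hypothetical, and the parser assumes the service announces itself with an SSH-style identification string of the form `SSH-<protocol>-<software> <comment>`. Only run the network part against systems you are authorized to test.

```python
import socket
from typing import Optional


def grab_banner(host: str, port: int, timeout: float = 2.0) -> str:
    """Connect to host:port and return whatever the service sends first."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        try:
            data = conn.recv(1024)
        except socket.timeout:
            data = b""  # some services wait for the client to speak first
    return data.decode("utf-8", errors="replace").strip()


def parse_ssh_banner(banner: str) -> Optional[dict]:
    """Split an SSH-style banner into protocol, software, and comment.

    Returns None when the banner does not look like an SSH identification
    string, so callers can fall back to other fingerprinting checks.
    """
    if not banner.startswith("SSH-"):
        return None
    ident, _, comment = banner.partition(" ")
    parts = ident.split("-", 2)
    if len(parts) != 3:
        return None
    _, protocol, software = parts
    return {"protocol": protocol, "software": software, "comment": comment}
```

Even this small amount of text can reveal a product name and version, which is why banners are often trimmed or suppressed on hardened systems.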

In many cases, what follows scanning will be the preparation of a penetration test, and there are five common steps in preparing and executing one. The main objective here is to determine how the software target can be compromised by exploiting vulnerabilities found during the scans. Ideally, the penetration test measures the resiliency of a target by simulating real attacks and evaluating the results. It attempts to emulate the actions of a potential attacker. And I want to stress that it is emulating them; it is not actually attempting to cause damage, but to determine with reasonable precision how much and what type of damage could be caused.

Now, in most cases, pen testing is done after the software has been deployed, but this need not necessarily be the case. It is advisable to perform black box assessments using pen testing techniques before deployment to determine the presence of security controls, and then again after deployment to ensure that they're working effectively and can withstand the various attacks.

When pen testing is performed post-deployment, it is important to recognize that there are rules of engagement that need to be established and followed, and that the penetration test itself must be methodically and properly conducted within those rules. These rules of engagement should explicitly define the scope of the penetration test for the testing team, irrespective of whether they are internal or external providers of this service. That means that we have to have a definition of scope, including restrictions on, for example, the IP addresses and software interfaces that are to be tested, and other aspects.

Most importantly, the environmental data and the infrastructural and application interfaces that are not in scope must be identified prior to the test and communicated to the pen testing team. Then, during the test, monitoring must be in place to ensure that the pen testing team does not go beyond the scope of the test. NIST Special Publication 800-115, the Technical Guide to Information Security Testing and Assessment, provides guidance on how to conduct pen testing.
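The in-scope and out-of-scope rules described above can be checked mechanically before any test traffic is sent. The following is a minimal sketch using Python's standard `ipaddress` module. The function name `is_in_scope` and the address ranges are assumptions made for illustration (the ranges come from the blocks reserved for documentation), not part of any standard or tool.

```python
import ipaddress
from typing import Iterable


def is_in_scope(target: str, allowed: Iterable[str], excluded: Iterable[str]) -> bool:
    """Return True only if target is inside an allowed network
    and not inside any explicitly excluded network."""
    addr = ipaddress.ip_address(target)
    in_allowed = any(addr in ipaddress.ip_network(net) for net in allowed)
    in_excluded = any(addr in ipaddress.ip_network(net) for net in excluded)
    return in_allowed and not in_excluded


# Hypothetical rules of engagement: one test network, one protected host.
ALLOWED = ["192.0.2.0/24"]
EXCLUDED = ["192.0.2.200/32"]

for candidate in ("192.0.2.10", "192.0.2.200", "198.51.100.1"):
    verdict = "in scope" if is_in_scope(candidate, ALLOWED, EXCLUDED) else "OUT OF SCOPE"
    print(candidate, verdict)
```

Exclusions take precedence over the allowed ranges in this sketch, which reflects the point that out-of-scope items must be protected even when they sit inside a network that is otherwise being tested.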

The OSSTMM, the Open Source Security Testing Methodology Manual, also covers secure software concepts, and it's known for its prescriptive guidance on the activities that need to be performed before, during, and after a pen test, including the measurement of results. Post-deployment pen testing can also be used as a mechanism to certify the software or system, whatever happens to be in scope, as part of the validation and verification activity inside of certification and accreditation.

The final stage, exfiltration and reporting, is used to capture the results and produce a report of the exercise. Depending upon how the test is scoped and focused, this report can have several uses. It provides insight into the state of security, of course. It can be used as a reference for corrective action both now and later. And it can be used to define security controls that will mitigate identified vulnerabilities, thus supporting implementation of those controls.

It demonstrates that the team and the entity itself are employing due diligence and due care processes for compliance. And it enhances SDLC activities such as security risk assessments, certification and accreditation processes, and various other process improvements.

One form of testing that is used is called fuzzing. Fuzzing is a type of brute force testing, also known as fault injection testing, in which faults, in the form of random and pseudo-random input data, are injected into the software and we observe the behavior.

The results should indicate the extent and the effectiveness of the input validation measures within the software itself. Fuzzing is used not only to test applications and their APIs, but also protocols and file formats. Typically, it is employed to find coding defects and security bugs that can result in buffer overflows, unhandled exceptions, hanging threads that can cause denial of service, state machine logic faults, and buffer boundary checking defects.

Now it can be done as a black box test, which is the most common form, but it can also be done as a white or gray box test. And it's broadly categorized into two types, generation based and mutation based. In generation-based fuzzing, smart fuzzers are used: the specification or format of the input that the software expects is programmed into the fuzz tool, which creates the data by introducing anomalies into the known data content and structures, such as checksums, bit flags, offsets, messages, and sequencing. In other words, there is foreknowledge of the data format or protocol, and the fuzz data is generated from scratch based on that specification and format. This is why generation-based fuzzing is also referred to as smart or intelligent fuzzing.

A majority of successful fuzzers operate as generation-based fuzzers, and these are preferred because they have a detailed understanding of the format or protocol specification that is being tested. Generation-based fuzzers have relatively greater code coverage and are more thorough in their testing approach, but they can be time consuming, as the fuzzer first has to import the known data format or structure and then generate the variations based on it. This is why an appropriate amount of time should be allocated in the project plan when smart fuzzing is part of the test strategy. The main shortcoming of this type of fuzzing is that it's based on known formats and structures, and so test coverage for new or proprietary protocols is, as a result, limited or even nonexistent.
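As a rough illustration of generation-based fuzzing, the sketch below builds test cases from scratch for a made-up message format and deliberately introduces anomalies into the structured fields. Everything about the format is an assumption for this example: one type byte, a two-byte big-endian length, the payload, and a one-byte checksum (sum of the preceding bytes modulo 256). The names `build_message` and `generate_cases` are hypothetical.

```python
import struct
from typing import Iterator, Optional, Tuple


def build_message(msg_type: int, payload: bytes,
                  length: Optional[int] = None) -> bytes:
    """Build one message; `length` overrides the declared payload length."""
    declared = len(payload) if length is None else length
    body = struct.pack(">BH", msg_type, declared) + payload
    checksum = sum(body) % 256
    return body + bytes([checksum])


def generate_cases(payload: bytes = b"hello") -> Iterator[Tuple[str, bytes]]:
    """Yield (label, message) pairs: one valid case, then known anomalies."""
    valid = build_message(1, payload)
    yield "valid", valid
    # Length field disagrees with the real payload size.
    yield "length_too_large", build_message(1, payload, length=len(payload) + 100)
    yield "length_zero", build_message(1, payload, length=0)
    yield "length_max", build_message(1, payload, length=0xFFFF)
    # Correct structure, corrupted checksum byte.
    yield "bad_checksum", valid[:-1] + bytes([valid[-1] ^ 0xFF])
    # Values the specification does not define.
    yield "unknown_type", build_message(0xFF, payload)
    yield "empty_payload", build_message(1, b"")


for label, message in generate_cases():
    print(f"{label:18} {message.hex()}")
```

Because each case differs from the valid message in exactly one known way, a failure in the target can be traced back to the specific field that was not validated.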

Now, unlike generation-based fuzzing, with mutation-based or dumb fuzzers there is no foreknowledge of the data format or protocol specification, so the fuzz data is created by corrupting or mutating existing data samples, if they exist, by recursion or replacement. This is done randomly and blindly, which is why mutation-based fuzzing is also referred to as dumb fuzzing. It can be dangerous, leading to denial of service, destruction, and complete disruption of the software's operations. So it is typically recommended to perform this type of fuzzing in a simulated environment, as opposed to the production environment.
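A mutation-based fuzzer can be sketched in a few lines. The example below is a simplified illustration, not a production tool: it blindly corrupts a sample input and records any exception other than a clean rejection. The names `mutate`, `fuzz`, and `toy_parser` are hypothetical, and `toy_parser` is a deliberately buggy stand-in for the software under test that trusts its own length byte.

```python
import random
from typing import Callable, List, Tuple


def mutate(sample: bytes, rng: random.Random, max_changes: int = 4) -> bytes:
    """Apply a few random byte-level corruptions to a known-good sample."""
    data = bytearray(sample)
    for _ in range(rng.randint(1, max_changes)):
        op = rng.choice(("flip", "replace", "insert", "delete")) if data else "insert"
        pos = rng.randrange(len(data)) if data else 0
        if op == "flip":
            data[pos] ^= 1 << rng.randrange(8)   # flip a single bit
        elif op == "replace":
            data[pos] = rng.randrange(256)       # overwrite with a random byte
        elif op == "insert":
            data.insert(pos, rng.randrange(256))
        else:
            del data[pos]
    return bytes(data)


def fuzz(target: Callable[[bytes], object], sample: bytes,
         iterations: int = 1000, seed: int = 0) -> List[Tuple[bytes, str]]:
    """Feed mutated inputs to target; collect inputs that crash it."""
    rng = random.Random(seed)  # fixed seed so failures can be reproduced
    failures = []
    for _ in range(iterations):
        case = mutate(sample, rng)
        try:
            target(case)
        except ValueError:
            pass  # input was rejected cleanly, which is the desired behavior
        except Exception as exc:
            failures.append((case, repr(exc)))
    return failures


def toy_parser(data: bytes) -> int:
    """Deliberately buggy target: trusts the length byte without checking it."""
    if len(data) < 2:
        raise ValueError("too short")
    return data[data[0]]  # IndexError when the length byte exceeds the data


crashes = fuzz(toy_parser, b"\x03abcdef")
print(f"{len(crashes)} crashing inputs found")
```

The fixed seed matters in practice: a crash that cannot be reproduced is very hard to fix, so recording the seed or the exact failing input is part of the test result.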

Next, we have simulation testing. This, of course, involves setting up an environment to simulate the intended production environment as reasonably and as realistically as possible. It is, however, a very costly endeavor if the production environment is very specialized or very complex, or if the application itself requires extensive setup and configuration, since all of these conditions generate a great deal more variability. Simulation testing can be thought of as a final exam for the software target. This is the last pre-production test of the full system to check operations, performance, and other items likely to appear only in the full setup.

Now, this is a last run through to evaluate the performance in something like a real world scenario, before it actually gets committed to full production release.

Now all of these things have been testing for various conditions. We have yet to mention testing for failure. Obviously, it's important to test for errors and defect conditions, including those that do not cause actual failure. We have functional issues. This means, essentially, that nothing happens if the function is not activated, but that when it is, something happens.

We also have to look at generational issues: dead code in version 2.1, for example, that gets activated in version 3.0. Then we have to look at what happens when that occurs and what the results are going to be. Testing must include formulaic logic where formulas are used. Returning an incorrect value may not be a failure in and of itself, but an action taken based on that incorrect value could be catastrophic. Here's an example: the NASA Mars Climate Orbiter is famous for failing in its mission.

It was supposed to be an orbiter, but instead it either crashed into the surface of the planet or continued in a long period orbit around the sun, possibly eventually falling into the sun itself. The cause was a mismatch between metric units, which is what NASA used, and English units, which is what the builder of the system used. The impact of this seemingly very simple mismatch was a $327 million loss and a total failure of the mission.
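One way to make this class of error testable is to stop passing bare numbers between components and attach the unit at the boundary. The sketch below is an illustration only and does not represent the actual mission software. It assumes the mismatched quantity is an impulse expressed in pound-force seconds on one side and newton seconds on the other, and the `Impulse` class is a hypothetical name.

```python
from dataclasses import dataclass

# One pound-force second expressed in newton seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse value that is always stored in newton seconds."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value: float) -> "Impulse":
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value: float) -> "Impulse":
        # Convert once, at the boundary, so downstream code sees one unit.
        return cls(float(value) * LBF_S_TO_N_S)


# The same raw number, interpreted two ways.
raw_value = 100.0
correct = Impulse.from_pound_force_seconds(raw_value)
misread = Impulse.from_newton_seconds(raw_value)

print(f"correct: {correct.newton_seconds:.2f} N*s")
print(f"misread: {misread.newton_seconds:.2f} N*s")
print(f"error factor: {correct.newton_seconds / misread.newton_seconds:.2f}")
```

Neither interpretation raises an error or crashes, which is exactly the point made above: the incorrect value is not a failure in itself, and only a test that compares against an independently known result will catch it.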

Now, a Fagan inspection could have been used to avoid both the flaw and the cost, had it been employed earlier in testing, but somehow this particular defect was missed. We must, of course, test for performance and capacity management as part of our program of testing for failure. These tests are concerned with how the program reacts under varying load conditions, all the way up to, and possibly beyond, what we consider to be its maximum load handling capability. We also need to look at how this loading at various levels, including overloading, affects memory handling and other scalability issues.

There comes a point when we're going to have to evaluate how cryptography is being handled, and we need to validate that it's being performed in the correct way. Now it seems a bit obvious to say that doing it right is easier and less costly than doing it wrong. The most common mistakes to make starting out include developing your own proprietary algorithm (and yes, people are still doing that) and then implementing it badly, poorly implementing the various system elements that contribute to the cryptography's function, and improperly configuring supporting elements.

Now we do know from history that even proven algorithms can be totally defeated by bad implementation. So what we want to do is be sure that we design well and implement better using proven algorithms, and that we verify and validate that we have implemented them in the optimal fashion. We need to bear in mind throughout cryptographic validation that security by obscurity is a largely, but not completely, failed idea as part of any overall security strategy. In this particular case, we need to be very mindful of which parts need to be kept hidden, such as the various randomized seeds and, obviously, the keys themselves, but in keeping with Kerckhoffs's axiom, we need to leave other significant portions available for scrutiny.

One of the worst things that can happen involving cryptography is for it to be entirely a black box, because that means that if any component should happen to go missing, we may find ourselves having to use brute force to try to regain access to what's been obscured and locked up, seemingly now permanently. So we want to be sure that we bear that in mind throughout. Thus, using well known and proven modules is the best approach. History shows that these carry the least trouble. And this goes for hashing just as it does for actual cryptography.
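As a small illustration of relying on proven modules rather than home-grown ones, the sketch below uses only Python's standard library: `secrets` for key material, `hmac` and `hashlib` for a keyed integrity tag, and a known-answer test against the published SHA-256 digest of the message "abc". The helper names `new_key`, `sign`, `verify`, and `sha256_known_answer_ok` are hypothetical, and the sketch is a validation aid rather than a complete key management design.

```python
import hashlib
import hmac
import secrets

# Published SHA-256 digest of the three-byte message "abc".
SHA256_ABC = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"


def sha256_known_answer_ok() -> bool:
    """Known-answer test: the module must reproduce a published digest."""
    return hashlib.sha256(b"abc").hexdigest() == SHA256_ABC


def new_key(num_bytes: int = 32) -> bytes:
    """Generate key material from the operating system's secure source."""
    return secrets.token_bytes(num_bytes)


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


key = new_key()
tag = sign(key, b"release-1.4.2.tar.gz")
print("known-answer test passed:", sha256_known_answer_ok())
print("tag verifies:", verify(key, b"release-1.4.2.tar.gz", tag))
print("tampered message verifies:", verify(key, b"release-1.4.3.tar.gz", tag))
```

The algorithm and the code are fully open to inspection here; only the key is secret, which is the division between hidden and visible parts described above.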

In the course of doing our cryptographic validation, we will need standards to apply to it. These include very commonly known ones such as FIPS 140-2, which is still quoted many times and has been valid since December of 2002. It was, however, superseded by FIPS 140-3 in September of 2019. As of this recording, no certifications against 140-3 have been issued, but it is the current reigning standard. And then we have the Secure Hash Standard, specified in FIPS 180-4.

Now the other issues that can prevent cryptography from being properly validated include inadequate randomness, poor key management, and a common failure that hackers look for in virtually every program: hard-coded keys that they can acquire.

Regression testing is an important part of the program, but it also has its own challenges. Software changes and evolves over time; this is common knowledge. And it is often in operation in many different versions across a wide range of entities, possibly even in multiple versions within the same entity.

Now this presents an issue because when we're patching and updating software, regression testing must be performed to ensure that the patch or other updates have not adversely affected other functions in that particular environment across all the different versions that may be present, and it becomes increasingly critical as operational versions continue to age. The longer it serves, the more likely these impacts will occur. And thus, the more critical regression testing becomes to ensure that that hasn't happened. Or if it has, that we identify it and we can compensate for it.

One approach is what's called the delta approach. It makes regression testing more efficient because it leads us to map the various changes and flows from one level to the next. We document the change history and how the software was built, and the regression analysis process can take advantage of this and become more efficient.

Now, after all the testing is done, we have to get down to evaluating the results, the impacts, and the corrective action plan that inevitably arises from the testing program.
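Returning briefly to the delta approach to regression testing: the idea of using the documented change history to focus the regression run can be sketched as a simple lookup. This is one possible reading of the approach, not a standard algorithm. The function name `select_regression_tests`, the file names, and the test names are all hypothetical, and the mapping from tests to source files is assumed to come from the build and change records.

```python
from typing import Dict, Iterable, List, Set


def select_regression_tests(changed_files: Iterable[str],
                            coverage_map: Dict[str, Set[str]]) -> List[str]:
    """Return the tests that exercise at least one changed file.

    coverage_map maps a test name to the set of source files it touches.
    """
    changed = set(changed_files)
    return sorted(
        test for test, files in coverage_map.items() if changed & set(files)
    )


# Hypothetical mapping recorded from earlier builds.
COVERAGE = {
    "test_login": {"auth.py", "session.py"},
    "test_checkout": {"cart.py", "payment.py"},
    "test_password_reset": {"auth.py", "mailer.py"},
    "test_reports": {"reports.py"},
}

# A patch that touched only the authentication module.
print(select_regression_tests(["auth.py"], COVERAGE))
```

A selection like this reduces the regression run, but it is only as good as the recorded mapping, so a periodic full run is still a sensible safeguard as versions age.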

Now, these that you see here are the main decision points when considering what to do with the findings from the testing program that you've been engaged in. Each flaw, bug, or other problem that's been discovered has to be evaluated in context and in comparison to the other flaws and bugs that were discovered, because not all of them are of the same extent, the same impact, or the same priority. Not all must be fixed now; some should be, of course, but others can come later. And depending upon the findings and the decisions we make, some may never need to be fixed, but they will have to be kept under constant surveillance so that, should the condition change from our present decision, we can respond appropriately.

Now, some of the things that we find may not be exploitable at all, some may have little impact even if they are, while others can prove catastrophic. That's part of determining priority, and whether to fix now or later. We also have economic factors to consider: some may be comparatively easy or cheap to fix, while others may be difficult or expensive. But whatever may occur, it should all be documented as a critical part of the system's history, and then reapplied and augmented with each return visit.
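The decision points described above (exploitability, impact, and cost to fix) can be captured in a simple triage record so that decisions are documented and repeatable. The sketch below is purely illustrative: the 0 to 5 scales, the thresholds, and the names `Finding`, `triage`, and `prioritize` are assumptions, not an established scoring scheme.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Finding:
    name: str
    exploitability: int  # 0 = not exploitable, 5 = trivially exploitable
    impact: int          # 1 = negligible, 5 = catastrophic
    fix_cost: int        # 1 = cheap, 5 = very expensive

    @property
    def risk(self) -> int:
        return self.exploitability * self.impact


def triage(finding: Finding) -> str:
    """Map a finding to a decision using assumed thresholds."""
    if finding.exploitability == 0:
        return "monitor"      # not fixed, but kept under surveillance
    if finding.risk >= 15:
        return "fix now"
    if finding.risk >= 6:
        return "fix later"
    return "monitor"


def prioritize(findings: List[Finding]) -> List[Finding]:
    """Highest risk first; among equal risks, cheapest fix first."""
    return sorted(findings, key=lambda f: (-f.risk, f.fix_cost))


FINDINGS = [
    Finding("verbose error page", exploitability=2, impact=2, fix_cost=1),
    Finding("SQL injection in login", exploitability=5, impact=5, fix_cost=3),
    Finding("dead code path", exploitability=0, impact=4, fix_cost=2),
    Finding("weak session timeout", exploitability=3, impact=3, fix_cost=2),
]

for item in prioritize(FINDINGS):
    print(f"{item.name:26} risk={item.risk:2}  decision={triage(item)}")
```

Keeping the record alongside the decision also serves the documentation point made above: when a condition changes, the finding can be rescored rather than rediscovered.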

Like most things, the more you do this, the better the records you keep, and the more keenly you apply the lessons learned, the better the quality and security of the software will be over its lifecycle, which is, in the end, the goal.

We've discussed testing and security testing, and the topics that we've covered include the standards for quality assurance, approach and methodology, and various types and methods of security testing. We've talked about the environmental conditions and various testing considerations. We've spoken of scanning, fuzzing, doing simulations, the importance of cryptographic validation, regression testing, and ultimately what it all leads to: the impact analysis, the corrective action plans, and the decisions about priority and how we're going to pursue a mitigation program.

About the Author

Mr. Leo has been in Information Systems for 38 years, and an Information Security professional for over 36 years. He has worked internationally as a Systems Analyst/Engineer, and as a Security and Privacy Consultant. His past employers include IBM, St. Luke’s Episcopal Hospital, Computer Sciences Corporation, and Rockwell International. A NASA contractor for 22 years, from 1998 to 2002 he was Director of Security Engineering and Chief Security Architect for Mission Control at the Johnson Space Center. From 2002 to 2006 Mr. Leo was the Director of Information Systems, and Chief Information Security Officer for the Managed Care Division of the University of Texas Medical Branch in Galveston, Texas.


Upon attaining his CISSP license in 1997, Mr. Leo joined ISC2 (a professional role) as Chairman of the Curriculum Development Committee, and served in this role until 2004. During this time, he formulated and directed the effort that produced what became and remains the standard curriculum used to train CISSP candidates worldwide. He has maintained his professional standards as a professional educator and has trained and certified nearly 8500 CISSP candidates since 1998, and nearly 2500 in HIPAA compliance certification since 2004. Mr. Leo is an ISC2 Certified Instructor.
