This course is the 5th of 6 modules within Domain 3 of the CISSP, covering security architecture and engineering.
Learning Objectives
The objectives of this course are to provide you with and understanding of:
- Common attacks against cryptography
- How to assess and mitigate vulnerabilities in web-based systems, discussing suggested protections, XML, SAML and OWASP
- How to assess and mitigate vulnerabilities in mobile systems, discussing risks from remote computing and risk for mobile workers
Intended Audience
This course is designed for those looking to take the most in-demand information security professional certification currently available, the CISSP.
Prerequisites
Any experience relating to information security would be advantageous, but not essential. All topics discussed are thoroughly explained and presented in a way allowing the information to be absorbed by everyone, regardless of experience within the security field.
Feedback
If you have thoughts or suggestions for this course, please contact Cloud Academy at support@cloudacademy.com.
Welcome back. In this session we're going to begin with section eight and continue our coverage of domain three. In talking about common attacks against cryptography, we are of course going to include the complementary technology of hashing.
Now when it comes to the attacks on hashing algorithms in particular, there are three primary ways that these attacks proceed. We have the brute force attack, which of course is trying every combination in a particular space based on the length of the hash until the one that is sought pops up. We have collisions of course, where we take a fake message which we want to substitute for the genuine one, and then attempt to produce a hash value for that faked message that corresponds exactly to the one that is on the message we want to set our spoof against and replace with our spoofed version. And then we have cryptanalytic techniques that can be used to attack the hashing algorithm in other ways.
Now brute force is, of course, trying every possible combination in the space in which the hash that we seek is going to be found. Going back to what I said earlier about the key space, or in this case the hashing space, defined as two to the power of whatever the length of the hash value might be, such as 128 bits with MD5, the brute force will begin at the very first one that appears in this space and continue trying each one until it finds the one that matches. This is the one that is the equivalent value to the one we're trying to simulate and collides with the known hash.
We have the rainbow table, also called the time memory trade off. A rainbow table is a look up table of pre-calculated hash values that are stored and then the attacker will do a simple lookup on. Inside the rainbow table will be a sequence of hash values based on the calculation of any passwords and that typically is where the rainbow table is used as an attach method, calculated in any one of about two dozen different hash values and then stored. So the attacker would typically take a password from any standard keyboard, put it into the entry field, do a search and then arrive at any one of about two dozen different hash values that could be produced based on the algorithm that is being used by the target system.
Now the cryptographic attacks on the algorithms are done in several different ways. We have, of course, the brute force method, which as in the hashing case, we try every combination of characters from all zeros to all ones for the entire length of the key space, which is again based on the length of the key in bits. There is the replay attack, which seeks to capture credentials in motion and then replay them to the target system. We have interception of keys performed as a man-in-the-middle attack. We have statistical attacks, which seek to exploit potential vulnerabilities in the way that the encryption system generates the randomness features of the particular implementation. We have frequency analysis, typically used against substitution types of ciphers. And then we have implementation attacks, like the kind that was employed against WEP and its RC4 algorithm. Now the replay attack causes disruption and damage by the attacker who resends repeated files to the host. If there are no checks against such things as time-stamping, use of time-based tokens or other forms of sequence verification codes in the receiving software, the system may be subjected to duplicate processing files. In an implementation attack, what the attacker seeks to do is instead of doing a head-on attack against the algorithm itself, it exploits the various aspects and features that appear in the implementation.
A very famous form of this was the attack against WEB and its RC4 algorithm. In this particular case the implementation sought to optimize performance in a low-power situation, trying to keep in mind that the devices that were generating these factors were using USB supplied power, which is very low having very small, low powered CPUs in the various devices that would be processing the encryption algorithms. Various things were installed using these thoughts so that it would perform correctly, but in doing so they crippled the true strength of this algorithm and so the implementation was what was attacked rather than the algorithm itself. Now in the case of an attacker wanting to recover the ciphertext in its plain text form, we have the ciphertext-only attack. Now this is one of the most difficult forms of attacks because the only thing the attacker really has to go on is the captured ciphertext. Normally this requires some kind of an augmented attack in that they are going to use side-channel techniques as well, attacking various other aspects of the system while they attempt to attack the ciphertext itself. They're looking for patterns. They're looking for any sort of clue that they can pick up, either through the direct or through the side-channel avenues that they're going to have to explore. And the more pieces they gather the simpler the attack, and simple really isn't the right word because it never becomes easy in a ciphertext only. But it becomes easier the more pieces of the clues that they pick up through side-channel or direct they're able to accumulate.
Now something that is easier than the ciphertext-only attack is the known plaintext. In this particular type of an attack, the attacker is not necessarily trying to recover the actual plaintext because, as the name indicates, he has access to plaintext or samples of plaintext. So what they're trying to do is in fact recover the keys. The goal of this type of an attack is to recover the key so that whatever traffic they have in hand now that they're using for the actual attack, they're able to prosecute this in a way that ultimately will reveal keys, the algorithms and other aspects that will enable them to recover the keys to decrypt the traffic that they have captured that they have not yet been able to decipher. And so they will use a trial and error method, trying key-pairs and text and plaintext samples in an effort to recover the keys. The chosen ciphertext is a compliment to the known plaintext in that the attacker has access to a decryption device, or software, that be believes is similar to what his target is using, and he's attempting to defeat the cryptographic protection by decrypting chosen pieces of the ciphertext to discover the key. Now an adaptive form of this would be the same, except that the attacker can modify the ciphertext prior to putting it through the algorithm. Basically, what the attacker is attempting to do is run either a plaintext through the encryption algorithm with a chosen key, and then taking the resulting ciphertext from that, running that back through with the same and perhaps a different key, to see what the results are going to be, so that through experimentation, trial and error, they're eventually able to hone in on the keys that either entirely, or in part, recover the traffic that have yet to be deciphered. Now in this chosen plaintext attack, in this particular case the attacker is listening to the sender's traffic, which is encrypted, but it contains elements that the attacker has either decrypted in part or recognizes in some way. The attacker then chooses plaintext exemplars that are frequently known clues to the original sender's message content. The attacker then sends these plaintext exemplars to the encryption source, intending to plant them as seed or clues to how to decrypt the content that the attacker will then capture when the sender rebroadcasts it. Then the encryption source will encrypt the attacker's plaintext exemplars, embedding them in the message traffic, and then in broadcasting them the attacker will then pick them up again. The attacker then receives them back from the source in such a way that the attacker knows which cipher text corresponds to teach plaintext, thus the now-embedded clues serve to assist the decryption of the original content.
One historical example of this attack being successful was used by the U.S. military prior to the Battle of Midway. The Japanese, in broadcasting their traffic, the U.S. discovered that they had not changed their cryptokeys in quite some, by some estimates years. There was a pair of characters, AF, or alpha fox, that continually appeared and reappeared in the Japanese message traffic up to the time that the battle was expected. The U.S. was listening, captured the traffic, and through a variety of reduction methods, was able to determine that there was a high probability, note that there was not certainty because Japanese forces were moving in other directions as well, but a high probability that alpha fox indicated the Midway Island. And so, through a process of elimination, the United States took a gamble and sent back traffic to say that Midway Island was suffering from various supplies deficiencies and water deficiencies, in the hopes that Japanese, who they knew were listening, would take that message, re-encode it, embedding the clue alpha fox, meaning Midway at this point, confirming what the U.S. had expected to be the case, and then retransmitting it to the various Japanese forces at sea. As it turns out this is exactly what happened, and the clue was then received through the U.S. listening to the continued Japanese broadcasts and then were able to make the decision about where to go.
A more complex sort of attack is the differential cryptanalysis. One of the attack forms is looking at multiple executions of various ciphers and various keys in an effort to determine how long and how much machine time it takes, how much machine effort it takes, to do a particular algorithm with a particular key, and they look at the differentials between each execution. Now, by executing this and measuring the exact execution times and power required by a crypto device, which they believe to be what the adversary or target is using to perform these encryption or decryption operations, they are able to calculate the differential between the varying strengths that they're using to zero in on the key length, the value of the key, and the algorithm being used. Linear cryptoanalysis takes a known plaintext and uses an linear approximation to describe the behavior of the blocks cipher. Using pairs and given sufficient numbers of pairs of plaintext and their corresponding ciphertext, the information about the key that is trying to obtain the key value, the key length, and so on, and the increased amounts of data will usually give a higher probability of success. But this, like many other techniques, is its own form of trial and error.
Algebraic attacks, typically very highly mathematic, are used against classes of ciphers that display a very high degree of mathematical complexity in their construction and in their performance. Now frequency analysis is typically one that is used in conjunction with some of these other attacks. By itself it's especially useful attacking a substitution-only type of cipher, where the statistics of the plaintext language are known. This is to say that, in a frequency analysis attack, you're trying to examine the cipher text and compare it to plaintext to determine the frequency of letter appearance and their positions. So in a substitution-only cipher, where one group of letters is substituted possibly by some sort of code or some sort of shifting scheme, the number of letters, and the letters, will appear to be replicated with the same statistical analysis that appear in the plaintext, thus giving a relationship between plaintext and ciphertext based on this pattern.
A factoring attack is typically the type used against the RSA, which uses factorization of large prime numbers in the calculation of keys. Using this form, an attacker will attempt to use the factorization of decimal digits, some 200 digits long, in an attempt to try to find the keys by solving these particular problems. One of the easiest ways, you could say a low-cost method, would be to do social engineering for key discovery. Now there are varying techniques, of course, in social engineering, coercion bribery, befriending. By doing these things with people in positions of responsibility or those presumed to have these particular keys or passwords or whatever they thing is we're after, spies were once upon a time gaining access to systems without having to exercise any technical expertise whatever, except in the ability to manipulate people. This as I said is a very low-cost method, meaning you don't have to employ a lot of machine time, but you do simply have to be a very good manipulator of people. It can take a fairly long period of time if the individual doesn't respond to befriending techniques or they don't respond to coercion or bribery readily, but in terms of the cost of machine time it can take a lot less time to do social engineering attacks than to actually do the methods requiring a lot of computer hours.
There is, of course, the family of dictionary-style attacks. Now these are most commonly used against password files. What they exploit is the poor password construction habits that many people will exercise, people who use pattern passwords or simple ones, or as we commonly joke about, how the most common password is the word password itself. And other forms that are based on natural words. It starts out with an encrypted or a hashed dictionary, and then runs through to compare patterns until it gets a match on the one that we're looking for, and then that can be fed into a system and then the system itself can be accessed if that attack is successful. Reverse engineering is also common. One way of doing it would be a competing firm buying a crypto product from another firm and then trying to reverse engineer the product, not trying to recover key, not trying to recover ciphertext from plaintext from cipher text, but trying to figure out how the machine itself works.
Now as we know, all finished cryptographic products and modules are themselves black boxes in the sense that no one, including its inventors and programmers, should ever be able to predict what keys will be generated, because if they can then certainly any attacker can also do that. But in trying to reverse engineer the product they're able to figure out exactly how it works and that could, of course, reveal other sorts of vulnerabilities or features that might give them an advantage. This way they're able to find out design weaknesses or flaws in the system, and gain other forms of crucial information about how it works. Now the birthday attack is common attack against hashing functions. And the birthday attack basically asks the question, how many people must be in the same room, and that, of course, is a group of people selected at random, for the chance to be better than 50% that someone in that room will have the same birthday as you? So please note, in this particular example, we're starting with a fixed birthday, your own, and we want to know how many people, selected at random, would give you the chance of having better than 50% likelihood that you would have one more person in that room with that same birthday. And the average number is 253, and this looks at the possibility of a match on one out of the 366 birthdays that are possible. This is called the strong collision resistant form.
The weak collision resistant form asks a similar question with a couple of distinct differences. How many people must be in the same room for the chance to be better than 50% the two people have the same birthday? So instead of beginning with a specific birthday that you want to match, say your own, you're trying to match anybody on a birthday that occurs during the 366 birthdays. Notice that the difference is about 1/10th the number of people that you would have to have for the fixed birthday. Now what this represents is if you are trying to match the hash value of any message produced, and spoof that particular message with a falsely inserted one, trying to match any possible hash gives you something on the order of 1/10th of what it would take to match a specific message. If you're trying to match the specific one, as you see, it's an order of magnitude greater.
Now the birthday attack suggests that it would be easier for an attacker to find two message with the same value than to match a specific one. So we have the hashing value equals n. A brute force would find it at the hash value 2n, but a brute force defined, you need two matching hashes, two times the quantity n divided by two, n equaling the length of the hash in bits. So a hashing algorithm with a hash output of 64 bits is more vulnerable to the birthday attack than a hash double that number with 128. Now, another point to remember, because our systems are binary, a bit length of 128 bits is not double the size of 64. It is, in fact, doubled 64 times worth of doublings over 64. 64 bits produces one, then by adding one more bit to 65, it doubles the size, and with each additional bit, it doubles the size from the previous doubling, so it's literally been doubled 64 times from 64 to 128. So this is not a simple math problem, this is a very complex math problem. There is, of course, another relatively simple way, and this requires an element of luck that the others don't. Temporary files can store a wide variety of things in them, imagines, text, downloaded files, partial downloads. They could indeed store downloaded or used passwords. Now most systems are set up, and most savvy users will set up their systems to delete all temporary files when they log out of a system, but if that step has not been taken, if they have not been deleted or overwritten, then they may be available to reveal their contents to an attacker if the attacker knows to look. So this is one of the things that would fall into the area of either configuration or implementation weaknesses that can go overlooked and needs to be addressed.
Mr. Leo has been in Information System for 38 years, and an Information Security professional for over 36 years. He has worked internationally as a Systems Analyst/Engineer, and as a Security and Privacy Consultant. His past employers include IBM, St. Luke’s Episcopal Hospital, Computer Sciences Corporation, and Rockwell International. A NASA contractor for 22 years, from 1998 to 2002 he was Director of Security Engineering and Chief Security Architect for Mission Control at the Johnson Space Center. From 2002 to 2006 Mr. Leo was the Director of Information Systems, and Chief Information Security Officer for the Managed Care Division of the University of Texas Medical Branch in Galveston, Texas.
Upon attaining his CISSP license in 1997, Mr. Leo joined ISC2 (a professional role) as Chairman of the Curriculum Development Committee, and served in this role until 2004. During this time, he formulated and directed the effort that produced what became and remains the standard curriculum used to train CISSP candidates worldwide. He has maintained his professional standards as a professional educator and has since trained and certified nearly 8500 CISSP candidates since 1998, and nearly 2500 in HIPAA compliance certification since 2004. Mr. leo is an ISC2 Certified Instructor.