Module 9 - Cryptography
Many security controls are based on cryptographic systems, and cryptography can help counter threats such as unauthorised access to, disclosure of, and modification of information. This course provides a basic understanding of what cryptography is and how it works through symmetric ciphers, hash functions, asymmetric ciphers, and digital signatures.
The objectives of this course are to provide you with an understanding of:
- What cryptography is
- How cryptography works through symmetric ciphers, hash functions, asymmetric ciphers, and digital signatures
- Key exchange and management
- Models of protection
This course is ideal for members of information security management teams, IT managers, security and systems managers, information asset owners and employees with legal compliance responsibilities. It acts as a foundation for more advanced managerial or technical qualifications.
There are no specific prerequisites for this course; however, a basic knowledge of IT, an understanding of the general principles of information technology security, and awareness of the issues involved with security control activity would be advantageous.
We welcome all feedback and suggestions - please contact us at email@example.com if you are unsure about where to start or if you would like help getting started.
Welcome to this video on cryptography.
Throughout this course you’ve seen references to many security threats to information, including:
· Unauthorised access to information in an information system;
· Unauthorised disclosure;
· Unauthorised modification;
· Misrepresentation of the origin of a message; and
· Repudiation of a message.
Many security controls are based on cryptographic systems, and cryptography can assist in countering all these threats. This video will give you a basic understanding of what cryptography is and how it works through symmetric ciphers, hash functions, asymmetric ciphers and digital signatures. We’ll also look at key exchange and management, different models of protection, and cryptanalysis.
The video is supported by a quiz for you to check your understanding as you work through it.
The security services that cryptography can provide are:
· Confidentiality, which is the protection of information so that only the originator and intended recipients can see it. This means encrypting data which could be travelling around a network, like an email in transit. Only authorised users will be able to decrypt the messages or files.
· Authentication, which is where the identity of an entity is verified, for example when a user logs into a computer. The entity could be a user, a message or a device. User authentication normally relates to the verification of a user’s claimed identity when accessing a system, whilst message authenticity generally involves the recipient of a message verifying the identity of the originator.
· Integrity involves mechanisms which ensure that, if data has been modified, changed or deleted, the modification can be detected. This includes detection of any attempt to insert data into communications traffic.
· Non-repudiation prevents a party in a communication exchange from claiming that the transaction didn’t happen. There are various forms of non-repudiation, including non-repudiation of origin and non-repudiation of receipt.
The word cryptography is derived from the Greek word ‘kruptos’, meaning hidden. Cryptography is concerned with the creation of cryptograms which represent encrypted information that’s unintelligible until it’s decrypted. Cryptanalysis, on the other hand, is concerned with revealing the hidden data within cryptograms.
In this video we’re going to look at four primary areas:
· Symmetric ciphers;
· Hash functions and message authentication codes;
· Asymmetric key ciphers; and
· Digital signatures.
First, we’ll look at symmetric cryptography. Symmetric ciphers are algorithms that use keys with the same value both to encrypt data into ciphertext and decrypt the message from ciphertext back into its original form.
Here we can see how the sender uses a symmetric cipher to encrypt the message with a symmetric key, which creates ciphertext. This is then transmitted to the recipient.
Using the decryption function of the same algorithm, the recipient decrypts the ciphertext back into the plaintext message.
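To make the idea of a shared key concrete, here is a minimal sketch in Python. It is a toy: it simply XORs each byte of the message with a random key of the same length, and it is not DES, AES or any cipher you should deploy. The names `xor_bytes` and `shared_key` are made up for this illustration. The point is only that the same key value is used in both directions.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (toy cipher, illustration only)."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"Meet at noon"

# The symmetric key: a random value known only to the sender and the recipient.
shared_key = secrets.token_bytes(len(plaintext))

# Sender: plaintext + key -> ciphertext, which is what travels over the network.
ciphertext = xor_bytes(plaintext, shared_key)

# Recipient: ciphertext + the same key -> original plaintext.
recovered = xor_bytes(ciphertext, shared_key)

print(ciphertext.hex())
print(recovered)
```

In practice you would use a vetted implementation of a standard algorithm rather than anything hand-written like this.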
Symmetric algorithms can be used to protect both communications traffic and files. There are two forms of symmetric cipher:
· Stream ciphers which encrypt one bit at a time; and
· Block ciphers which operate on chunks or blocks of data, encrypting and decrypting one block at a time. This approach is used in well-known protocols like TLS/SSL and IPsec.
The most common symmetric block algorithms are Data Encryption Standard, DES, and Advanced Encryption Standard, AES. AES is more recent and is widely supported in modern products. Protocols like TLS can use AES.
Sensitive government systems may use proprietary algorithms rather than commercially available ones, but they work in the same way.
Symmetric ciphers provide confidentiality by using the same keys to encrypt and decrypt the data. The strength of protection lies in the secrecy of the keys, not in the secrecy of the algorithm.
Most commercial algorithms are openly published, so it’s assumed that attackers not only have knowledge of the algorithm but also know the system they’re attacking.
If the strength of protection lies in the secrecy of the keys, then the keys need to be kept secret during their generation, distribution, use, and archiving, until they’re finally destroyed. Keys are large numbers that are randomly generated, with the size depending on the algorithm. For example:
· Standard DES uses 56-bit keys; however Triple-DES can support up to 168-bit keys.
· AES comes in 3 key sizes: 128, 192, or 256-bits.
As DES uses such small key sizes, AES is now the recommended approach.
Note that generating truly random numbers is impossible using standard computer systems which are deterministic - the same input to the same starting state will always result in the same output. However, there are programs which can produce numbers which are sufficiently random or pseudo-random. Truly random numbers can be generated by some physical processes, for example radioactive decay, which are known to be random.
The challenge is to generate keys randomly and keep them protected throughout their lifecycle, particularly during use and distribution. If an attacker can guess a key, the security of the system is compromised.
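As a rough illustration of the difference, the sketch below uses two modules from the Python standard library. `secrets` draws on the operating system's cryptographically strong random source and is intended for keys, whereas `random` is a deterministic generator that repeats itself exactly when given the same seed. The helper name `seeded_bytes` is invented for this example.

```python
import random
import secrets

# A 256-bit key (32 bytes), the largest AES key size, from the OS's secure source.
aes256_key = secrets.token_bytes(32)
print(len(aes256_key) * 8, "bit key:", aes256_key.hex())


def seeded_bytes(seed: int, length: int) -> bytes:
    """Bytes from a deterministic generator: same seed, same output, so guessable."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))


# The same starting state always produces the same "random" output.
print(seeded_bytes(1234, 16) == seeded_bytes(1234, 16))
```

If an attacker can learn or guess the seed of a deterministic generator, they can regenerate every key it produced, which is why it should not be used for key generation.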
Now, let’s look at hash functions. These are one-way functions used in the creation of message authentication codes and digital signatures. They work by taking a data block of any length, for example an email message, and processing it through the hashing algorithm. The result is a number which is called a hash or a message digest.
The hash has a fixed length which depends entirely on which algorithm has been used to create it. For example, if the same algorithm was used to hash a 500-byte email and then a document of 50 MB, the resultant hashes would be exactly the same size in bits. However, the actual values will be different. Any two files on a system, even ones that differ by a single bit, will produce entirely different hashes.
Another important feature is that a hash function is one-way. This means it’s computationally unfeasible to determine the original message from a hash.
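These two properties are easy to see with Python's standard `hashlib` module. The sketch below uses SHA-256 as an example algorithm and a 5 MB buffer as a stand-in for the large document, purely to keep it quick to run. The sample messages are invented.

```python
import hashlib

small_input = b"x" * 500                 # a 500-byte message
large_input = b"x" * (5 * 1024 * 1024)   # a 5 MB stand-in for a large document

small_hash = hashlib.sha256(small_input).digest()
large_hash = hashlib.sha256(large_input).digest()

# Same algorithm, so the digests are the same size whatever the input length.
print(len(small_hash) * 8, len(large_hash) * 8)

# A one-character change in the input gives a completely different digest.
first = hashlib.sha256(b"Pay Bob 100 pounds").hexdigest()
second = hashlib.sha256(b"Pay Bob 900 pounds").hexdigest()
print(first)
print(second)
```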
Types of hash function include:
· Message Digest 5, or MD5, which has a length of 128-bits and is considered cryptographically weak;
· SHA or SHA-1, which has a digest length of 160-bits; and
· SHA-2, which includes a number of hash functions, with digests from 224 to 512 bits in length. The most widely used are the 256-bit and 512-bit versions, which are usually referred to as SHA-256 and SHA-512 respectively.
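The digest lengths listed above can be checked with Python's `hashlib`, which reports each algorithm's output size in bytes. This is only a quick sanity check. The dictionary name `digest_bits` is made up here, and some locked-down Python builds may refuse to provide MD5.

```python
import hashlib

# Algorithm names as hashlib knows them; sha224 to sha512 belong to the SHA-2 family.
algorithms = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]

# digest_size is in bytes, so multiply by 8 to get the length in bits.
digest_bits = {name: hashlib.new(name).digest_size * 8 for name in algorithms}

for name, bits in digest_bits.items():
    print(f"{name:>7}: {bits} bits")
```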
A cryptographic hash algorithm has three basic requirements:
· The input can be of any length, but the output has a fixed length; although we say the input can be of any length, there are limits but they are very large;
· It’s one-way, meaning that the original input value is computationally unfeasible to calculate from the hash value; and
· It’s collision free, meaning that it’s computationally unfeasible to find two inputs that generate the same hash value.
It’s been shown that MD5 and SHA-1 aren’t collision free, which is why they’re no longer recommended for use. Most security professionals and cryptography experts recommend SHA-2 for all new applications.
Hash algorithms are designed to be computationally fast to run compared to other kinds of symmetric and asymmetric algorithms.
This is a simplified illustration of how a cryptographic hash algorithm can be used:
· First, the sender generates a message;
· The system takes the message and processes it through the algorithm, outputting a message digest;
· Then the sender sends the original message and the message digest to the recipient;
· The recipient then takes the message and calculates their own version of the message digest; and finally
· The recipient compares the digest sent with the original message, with the one they calculated.
If the data is changed or manipulated during transmission, the resultant hash will be different. Any change at all to the original message will result in a completely different digest being generated, even just a single space or comma out of place.
To prevent replay attacks, or the insertion of messages into a message stream, sequence numbers and timestamps can be added to each message before it’s hashed. For example, if a sequence number is added to each message, starting at one, the receiver of a message can determine if a number is missing or repeated.
There’s a major problem with this simple method. An eavesdropper could intercept the sender’s message and hash, then substitute their own message and corresponding hash. The recipient not only needs to know that the message hasn’t been altered, they also need to know from whom it came - they need to be able to verify the message’s authenticity.
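The steps above, and the weakness just described, can be sketched in a few lines of Python. SHA-256 is used as an example hash, and the function names and messages are invented for this illustration.

```python
import hashlib
import hmac


def message_digest(message: bytes) -> str:
    """Hash the message and return the digest as a hex string."""
    return hashlib.sha256(message).hexdigest()


def recipient_accepts(message: bytes, received_digest: str) -> bool:
    """Recipient recalculates the digest and compares it with the one received."""
    return hmac.compare_digest(message_digest(message), received_digest)


# Sender creates the message and its digest, then sends both.
original = b"Pay Bob 100 pounds"
sent_digest = message_digest(original)
print(recipient_accepts(original, sent_digest))   # unchanged message is accepted

# Case 1: the message is altered in transit but the digest is left alone.
altered = b"Pay Bob 900 pounds"
print(recipient_accepts(altered, sent_digest))    # the change is detected

# Case 2: an eavesdropper replaces BOTH the message and the digest.
forged = b"Pay Eve 100 pounds"
forged_digest = message_digest(forged)
print(recipient_accepts(forged, forged_digest))   # accepted: a plain hash proves nothing about origin
```

The last case is the gap that a secret key is needed to close.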
Hashing algorithms can also provide authentication; in this case, message authentication. This function is called a Message Authentication Code, or MAC. The stages in the process are:
· The sender generates a message;
· The sender then links a secret key, or value, to the message;
· The system takes the resultant output and processes it with a hashing function, producing a message digest;
· Then, the sender transmits the message and message digest to the recipient;
· The recipient takes the message and secret key, which they already have, and calculates a message digest based on both inputs; and finally
· The recipient compares the received digest with the digest they just calculated.
MACs are predicated on the assumption that only the sender and recipient know the symmetric key value. It’s this assumption that allows the recipient to use the MAC function to verify that the sender was the originator of the message. The input data could, for example, contain the sequence numbers for use as a sequencing key, enabling the recipient to detect replay, lost or inserted messages.
A more complex and cryptographically secure version of MAC is known as hash-based message authentication coding, or HMAC.
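Python's standard library includes an HMAC implementation, so the MAC process can be sketched directly. The example below assumes HMAC with SHA-256 and an 8-byte sequence number placed in front of each message. The function names and that framing are choices made for this illustration, not part of any particular protocol.

```python
import hashlib
import hmac
import secrets

# The secret key that only the sender and the recipient are assumed to hold.
shared_key = secrets.token_bytes(32)


def make_tag(key: bytes, sequence: int, message: bytes) -> bytes:
    """Compute an HMAC-SHA-256 tag over the sequence number plus the message."""
    data = sequence.to_bytes(8, "big") + message
    return hmac.new(key, data, hashlib.sha256).digest()


def tag_is_valid(key: bytes, sequence: int, message: bytes, tag: bytes) -> bool:
    """Recipient recomputes the tag with their copy of the key and compares."""
    return hmac.compare_digest(make_tag(key, sequence, message), tag)


message = b"Pay Bob 100 pounds"
tag = make_tag(shared_key, 1, message)

print(tag_is_valid(shared_key, 1, message, tag))                # genuine message
print(tag_is_valid(shared_key, 1, b"Pay Eve 100 pounds", tag))  # altered message
print(tag_is_valid(shared_key, 2, message, tag))                # replayed under a new sequence number

# An attacker without the key cannot produce a tag the recipient will accept.
attacker_tag = make_tag(secrets.token_bytes(32), 1, b"Pay Eve 100 pounds")
print(tag_is_valid(shared_key, 1, b"Pay Eve 100 pounds", attacker_tag))
```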
The third area we’re going to look at is asymmetric algorithms. These are based on the creation of two mathematically linked keys which form a unique pair; it’s computationally unfeasible to deduce the private key from the public key.
The private key is kept by the user and the public key is distributed to anyone who needs it. The private key decrypts messages encrypted by a public key and vice versa.
Here, an asymmetric key cipher uses the recipient’s public key to generate a ciphertext that only the recipient can decrypt. In this case the sender must possess the recipient’s public key. The recipient holds the private key and distributes the public key to anyone who needs it – often through a network directory.
The sender uses the recipient’s public key with the encryption algorithm to produce ciphertext from the plaintext message. The ciphertext is then sent to the recipient who has sole access to the private key. This means that only the recipient can decrypt the ciphertext. Anyone trying to intercept it can get hold of the recipient’s public key, but this can’t be used to decrypt the ciphertext.
In this mode, asymmetric ciphers are used to protect data being sent to the recipient, so that only that recipient can decrypt the message. In most applications, this technique is used to share symmetric keys used in communications encryption.
For example, if A generates a symmetric key needed to encrypt a mobile phone conversation with B, they use B’s public key to encrypt this symmetric key. On receiving the encrypted symmetric key, B uses the private key to obtain the symmetric key. The communications are then encrypted using the faster symmetric encryption algorithm, but the keys were exchanged using a very strong asymmetric exchange. This combination of symmetric and asymmetric key encryption is sometimes known as hybrid encryption.
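The hybrid pattern in this example can be sketched with toy building blocks. Everything cryptographic here is an assumption made for illustration: B's key pair is textbook RSA built from two well-known Mersenne primes (far too small and far too predictable for real use, and with no padding), and the "symmetric cipher" is just an XOR against a keystream derived with SHAKE-256. Only the shape of the exchange matters.

```python
import hashlib
import secrets

# --- B's toy RSA key pair (illustration only, NOT secure) ---
P, Q = 2**61 - 1, 2**89 - 1           # two known primes
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent, kept by B


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR data with a keystream derived from the key."""
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


# --- A's side ---
message = b"Hello B, this call is now encrypted"
session_key = secrets.token_bytes(16)                        # fresh symmetric key
wrapped_key = pow(int.from_bytes(session_key, "big"), E, N)  # encrypted with B's PUBLIC key
ciphertext = keystream_xor(session_key, message)             # bulk data uses the fast symmetric cipher

# --- B's side ---
unwrapped_key = pow(wrapped_key, D, N).to_bytes(16, "big")   # recovered with B's PRIVATE key
plaintext = keystream_xor(unwrapped_key, ciphertext)

print(unwrapped_key == session_key)
print(plaintext)
```

Only the short session key goes through the slow asymmetric step; the message itself is handled entirely by the symmetric cipher.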
The asymmetric algorithm process can also be reversed. Here, the sender uses their own private key to encrypt a message. The resultant data is sent to the recipient, who uses the sender’s public key to decrypt the message. Anyone with the sender’s public key can decrypt the message.
This isn’t really providing confidentiality. Instead it’s proving that only the sender could have created the message in the first place, because they’re the only entity in possession of the private key. This mode of operation provides the security attribute known as authentication.
The main feature of public key cryptosystems is that the encryption and decryption keys are different.
Systems have different characteristics, depending on how they’re used. There’s no exchange of secret keys because it’s the public key that is made available to the other entity. Anyone with the public key can encrypt data and send it to a recipient.
The private key holder is the only entity able to decrypt a message that’s been encrypted with the corresponding public key.
Anybody wanting to receive a message protected by a public key cryptosystem needs to publish their corresponding public key in a location that’s accessible to the entity that needs it.
Key-pairs – both public and private – are generated through one-way functions which are easy to compute, but hard to reverse. In RSA, for example, the one-way function relies on the difficulty of factorising the product of two large prime numbers. In some cases, elliptic curve mathematics is used instead.
The main issue faced by public key cryptosystems is preventing a recipient’s public key being substituted by an attacker’s public key. So, while public keys need to be accessible, it’s essential to know they’re the correct ones. Public keys are therefore digitally signed by a certification authority and packaged in a certificate, whose authenticity can be proven.
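To see why the size of the primes matters, here is a textbook RSA key pair built from two deliberately tiny primes, 61 and 53. This is a classroom sketch with no padding and no realistic key size. The helper `factorise` is invented for this example and simply tries divisors in turn.

```python
# --- Key generation: easy to compute ---
p, q = 61, 53
n = p * q                      # 3233, published as part of the public key
phi = (p - 1) * (q - 1)        # 3120, kept secret along with p and q
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (2753)

secret_value = 65
ciphertext = pow(secret_value, e, n)   # encrypt with the public key (e, n)
recovered = pow(ciphertext, d, n)      # decrypt with the private key (d, n)
print(recovered == secret_value)


# --- Attack: reverse the one-way function by factorising n ---
def factorise(modulus: int) -> tuple[int, int]:
    """Trial division: instant for tiny numbers, hopeless for real key sizes."""
    factor = 2
    while modulus % factor:
        factor += 1
    return factor, modulus // factor


found_p, found_q = factorise(n)
cracked_d = pow(e, -1, (found_p - 1) * (found_q - 1))
print(found_p, found_q, cracked_d == d)
```

With a modulus this small the private key falls out immediately. Real keys use primes hundreds of digits long so that no known factorising method finishes in a useful time.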
The dominant asymmetric algorithm publicly available today is RSA.
Symmetric cryptography is very fast in comparison to asymmetric cryptography. It is used to encrypt large quantities of data, like mobile phone communications streams, or VPNs, whereas asymmetric ciphers are used for encrypting and decrypting small items of information, like symmetric keys and message digests.
The final area we’ll look at is digital signatures, which can also be used to verify message integrity and authenticity, and provide non-repudiation. In this illustration:
· The sender hashes the message, then encrypts the hash with their private key; the encrypted hash is effectively the digital signature;
· The message and the digital signature are then sent to the recipient;
· The recipient decrypts the encrypted hash using the sender’s public key. They also hash the message that they’ve received and, if the hashes match, the message hasn’t been tampered with.
The fact that the hash was decrypted using the sender’s public key proves its origin.
Digital signatures provide a level of protection from unauthorised message modification or manipulation and prove the origin of a message through authentication.
As the sender has sole custody of the private key relating to the digital signature, it’s very difficult for them to deny they sent the message in the first place. This is called non-repudiation.
Only one entity, the holder of the private key, can digitally sign the message digest. However, as many recipients as necessary can verify it.
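The sign-and-verify steps can be sketched with the same kind of toy RSA key pair. The assumptions are worth stating plainly: the primes are two well-known Mersenne primes (much too small for real use), the SHA-256 hash is reduced to fit the toy modulus, and there is no signature padding. The function names are made up for this illustration.

```python
import hashlib

# --- Sender's toy RSA key pair (illustration only, NOT secure) ---
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                             # public
E = 65537                             # public
D = pow(E, -1, (P - 1) * (Q - 1))     # private, held only by the sender


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sender: 'encrypt' the hash with the PRIVATE key to form the signature."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone: 'decrypt' the signature with the PUBLIC key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"I agree to the contract"
signature = sign(message)

print(verify(message, signature))                            # genuine
print(verify(b"I agree to the contract!", signature))        # message altered
print(verify(message, signature + 1))                        # signature altered
```

Note that `verify` needs only the public values, which is why any number of recipients can check a signature that only one party could have produced.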
Public keys require protection and are usually encapsulated within a certificate. This protection provides integrity of the certificate and detects any attempt at substitution. A certificate is basically a digitally signed public key.
Certificates are digital documents attesting to the binding of a public key to an individual or entity. They verify that a public key belongs to a specific individual. Certificates can help prevent an individual from using a phoney public key to impersonate someone else.
In their simplest form, they contain a public key and a name. Commonly though, a certificate will also contain an expiration date, along with the name of the certifying authority that issued it, a serial number and specific information based on the certificate’s usage.
Most importantly, a certificate contains the digital signature of the issuing authority.
It would be a fair question to ask why it’s necessary to put public keys inside a certificate. In this kind of cryptosystem, the authenticity of messages and communications is being attested and the integrity of the message and origin is being proven.
The integrity and non-repudiation of signed documents need to be maintained, and a public key can’t be trusted on its own because:
· It’s easy to forge; and
· There’s a need to tie the key and identity together to establish that the entity owns the public key.
So, a mechanism is required to provide the protection of the public key through certification by a known and trusted entity. There needs to be an unimpeachable binding between the public key and its owner. This is the basic concept behind a Public Key Infrastructure.
To produce a certificate:
· The user, or end-entity, generates a key-pair;
· The public key is then sent together with some identifying information about the end entity, including their name and email address;
· Then, this information is passed to a trusted third-party organization, the Certificate Authority, which checks the attached information and signs the public key, generating a certificate;
· The Certificate Authority uses their own private key to provide this digital signature; and finally
· The certificate is returned to the user and made available for use.
Certificates are like a passport; they provide an identity that should be irrefutable and have the following characteristics:
· The holder must prove their identity to the authority to be issued the credential;
· Details of the issuing authority are clear and can be proven as authentic;
· Identity information is clear and can be proven as authentic; and
· The certificate includes a validity period, with clear start and end dates.
A digital certificate is fundamentally a digitally signed public key, where the signature is generated by the certificate authority using their own private key.
The certificate authority's public key is used to validate the authenticity of that digital signature. Conveniently, the certificate authority's public key is itself distributed within a certificate. This is referred to as a trusted certificate.
Let’s use the example of Bob and Alice to illustrate this:
· Alice wants to send a digitally signed message to Bob;
· Bob obtains the certificate authority’s public key in the form of a trusted certificate;
· Alice then uses her own private key to digitally sign the message intended for Bob;
· Alice sends the message to Bob along with the digital signature and her certificate; remember, her public key is mathematically linked to the private key used to sign the message;
· On receipt of the message, Bob verifies Alice’s certificate is authentic by checking it was issued by the certificate authority;
· Bob then uses the public key from the trusted certificate to validate the digital signature on Alice’s certificate. He now knows the public key in Alice's certificate is valid;
· Then Bob uses this public key to validate that the digital signature in the message is authentic. If the digital signature checks out, Bob knows the message came from Alice and hasn’t been modified.
Alice and Bob now have trust in each other, which is established through the certificate authority. Other trust models are possible using public key cryptography, but these are outside the scope of this course.
It’s worth noting that if, for some reason, Alice’s private key is compromised, there must be a way to tell the people who trust her that her certificate is no longer valid. The most common mechanism is for the certificate authority to publish a Certificate Revocation List, or CRL, which is essentially a list of the serial numbers of certificates that are no longer trustworthy.
When Bob’s verifying Alice’s certificate, he should check to see if it’s on a CRL; if it is, he won’t use it.
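The whole Alice and Bob exchange, including the revocation check, can be sketched as follows. This is a heavily simplified model: the key pairs are toy textbook RSA built from known Mersenne primes, the "certificate" is a plain dictionary rather than a real X.509 structure, and every name, serial number and field is hypothetical.

```python
import hashlib

E = 65537  # public exponent shared by both toy key pairs


def make_keypair(p: int, q: int) -> tuple[int, int]:
    """Return (public modulus, private exponent) for toy RSA. NOT secure."""
    return p * q, pow(E, -1, (p - 1) * (q - 1))


def sign(data: bytes, n: int, d: int) -> int:
    """Sign the hash of the data with a private key."""
    return pow(int.from_bytes(hashlib.sha256(data).digest(), "big") % n, d, n)


def verify(data: bytes, signature: int, n: int) -> bool:
    """Check a signature using only the public key."""
    return pow(signature, E, n) == int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def cert_body(name: str, serial: int, public_key: int) -> bytes:
    """The fields the certificate authority signs."""
    return f"name={name};serial={serial};key={public_key}".encode()


ca_public, ca_private = make_keypair(2**61 - 1, 2**89 - 1)
alice_public, alice_private = make_keypair(2**107 - 1, 2**127 - 1)

# The CA checks Alice's identity, then signs her name and public key.
alice_cert = {
    "name": "Alice", "serial": 1001, "public_key": alice_public,
    "ca_signature": sign(cert_body("Alice", 1001, alice_public), ca_public, ca_private),
}

# Alice signs her message with her PRIVATE key and sends message, signature and certificate.
message = b"Hello Bob, it's Alice"
signature = sign(message, alice_public, alice_private)


def bob_accepts(msg: bytes, sig: int, cert: dict, revoked_serials: set[int]) -> bool:
    """Bob's checks: CA signature on the certificate, the CRL, then the message signature."""
    body = cert_body(cert["name"], cert["serial"], cert["public_key"])
    if not verify(body, cert["ca_signature"], ca_public):
        return False                      # certificate not issued by the trusted CA
    if cert["serial"] in revoked_serials:
        return False                      # certificate appears on the CRL
    return verify(msg, sig, cert["public_key"])


print(bob_accepts(message, signature, alice_cert, revoked_serials=set()))
print(bob_accepts(message, signature, alice_cert, revoked_serials={1001}))
```

Bob starts with nothing but the certificate authority's public key, and every other decision follows from that one trusted value.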
Trusted certificates are those issued by certificate authorities. The security obtained from the public key infrastructure is derived from the integrity of the trusted certificates and the certificate authority. That’s why they’re referred to as trust anchors.
Here’s a screenshot of some of the trusted certificates that are preinstalled within Mozilla’s Firefox web browser. Other web browsers, such as Internet Explorer, have similar mechanisms for showing which certificates are installed and which ones are trusted.
It’s possible to install new trusted certificates in a web browser’s certificate store, however they must be legitimate. Only trusted certificates from known sources should be installed; a bogus certificate issued by a bogus certificate authority can be used for nefarious communications.
As you’ve seen, keys are at the heart of cryptography systems.
Because the algorithms are normally public, it’s the keys that make the cryptography system strong, so the way they’re generated, distributed, and managed is important.
Key management is an extremely complicated subject. However, there are some basic principles you need to understand for this course:
· Distribution of a symmetric key starts when a user generates a secret key and provides a copy of it to the entity they want to communicate with.
· Secret keys are like passwords. They work on a small scale, but if every communicating pair of users needs its own unique secret key, the number of keys rises dramatically with the number of users, since n users need n × (n − 1) / 2 keys. For example, 10 users require 45 separate keys, 100 users require 4,950 keys, and 1,000 users require 499,500 keys. This approach doesn’t scale well.
· A more common approach is using an asymmetric cipher to encrypt the symmetric key. For example, if Alice needs to give Bob her secret key, she obtains Bob’s public key and uses that to encrypt the symmetric secret key. When Bob receives the encrypted secret key, he uses his own private key to decrypt it. This is how TLS and S/MIME protocols exchange keys between two communicating endpoints.
· Public keys can also be distributed using a public key infrastructure. In many cases, Lightweight Directory Access Protocol (LDAP) servers can be used to distribute public keys, although these are typically in the form of digital certificates. This method allows senders to look up the recipient's digital certificate in the directory.
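The key counts quoted for pairwise symmetric keys come from counting the number of distinct pairs of users, which is n × (n − 1) / 2 for n users. A short Python check, with a function name invented for this example:

```python
import math


def pairwise_keys(users: int) -> int:
    """Number of unique secret keys if every pair of users shares its own key."""
    return math.comb(users, 2)   # same as users * (users - 1) // 2


for users in (10, 100, 1000):
    print(f"{users:>5} users -> {pairwise_keys(users):>7,} keys")
```

Multiplying the number of users by ten multiplies the number of keys by roughly a hundred, which is why this approach becomes unmanageable so quickly.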
Another important key management and cryptographic scheme is known as Pretty Good Privacy or PGP for short. PGP is a data encryption and decryption technology used for signing, encrypting and decrypting emails, files, directories and entire disks. It makes use of both symmetric and asymmetric ciphers but doesn’t use a public key infrastructure.
Rather than relying on certificate authorities, PGP relies on a web of trust. Each user in the web of trust acts as a certificate authority for other users they trust. If someone trusts a user, they also inherit the trust of the others in that web of trust.
We’ve covered various cryptographic mechanisms in this video. Now we’ll look at the different models of protection.
If an organization wants to protect data being exchanged between two users over an open network, then there are three approaches, or models, that can be used:
· First, the organization can implement network level protection, such as a Virtual Private Network or VPN;
· The organization can implement application-level protection, such as Web Service Security or S/MIME to provide application-to-application protection; or
· The organization can implement controls at the data level, such as WinZip, for directly encrypting the data being exchanged via their email service.
Typically, organizations implement one of these models of protection. If an organization has implemented a VPN between different sites, they won’t generally implement secure email. However, depending on their risk appetite, some organizations that handle more sensitive information could implement several layers of controls.
To illustrate, an organization might have a VPN for inter-site communications but, when two users wish to exchange sensitive information, they’re advised to use WinZip to further protect the information at the data level.
Let’s move on to look at cryptanalysis.
Cryptanalysis is used to defeat cryptography and access the contents of encrypted messages even if the key is unknown.
The easiest way of breaking a symmetric cipher is to perform an exhaustive key search, also known as a brute force attack. If the first possible key is set as 000,000 and the last possible key is 999,999 then using every possible key combination will eventually decrypt the message. That’s why the key length of a symmetric cipher should be very long.
The length of a key gives rise to the key space. The Data Encryption Standard has a key space of 2 to the power of 56. Whilst this is a very large number, recent advances in cryptanalysis and computing power have meant that the Data Encryption Standard can be broken in hours or even minutes. The Advanced Encryption Standard has a longer key length and is also inherently more secure, so it is now recommended.
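To get a feel for these numbers, the sketch below works out the size of each key space and how long an exhaustive search would take in the worst case. The search rate of one trillion keys per second is a purely hypothetical figure chosen for the arithmetic, not a measurement of any real system.

```python
ASSUMED_KEYS_PER_SECOND = 10**12   # hypothetical attacker speed, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 3600


def key_space(bits: int) -> int:
    """Number of possible keys for a given key length in bits."""
    return 2**bits


def worst_case_seconds(bits: int, rate: int = ASSUMED_KEYS_PER_SECOND) -> float:
    """Time to try every key at the assumed rate."""
    return key_space(bits) / rate


des_hours = worst_case_seconds(56) / 3600
aes128_years = worst_case_seconds(128) / SECONDS_PER_YEAR

print(f"DES key space:     {key_space(56):,} keys")
print(f"DES worst case:    about {des_hours:.0f} hours at the assumed rate")
print(f"AES-128 key space: {key_space(128):,} keys")
print(f"AES-128 worst case: about {aes128_years:.2e} years at the assumed rate")
```

Each extra bit doubles the work, so moving from 56 bits to 128 bits does not make the search a little harder; it multiplies it by 2 to the power of 72.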
There are other means by which symmetric ciphers can be attacked, but these generally mean finding a mathematical weakness in the cipher. So, the algorithms need to be computationally complex.
One other factor to consider is cover time. Not all information needs to be kept secret forever; it may become less valuable over time. The cover time is the time that an item of information needs to be kept secret.
Finally, we’ll look at policies. Most organizations which use cryptography will have policies covering how it’s used. These might relate to:
· Key generation, distribution and handling. For example, if an organization has their own trusted certificate, the policy defines how the trusted certificate is provided to a user, or if secret keys are manually distributed, it defines the controls required to protect the keys in transit;
· Information protection, which relates to how cryptography is used within the organization, for example, whether laptops should use disk encryption or whether internal web servers should use TLS to protect communications;
· Approved algorithms, which define the ciphers and algorithms used and their key lengths. Different cipher suites may be needed for different types of data, for example AES-128 and SHA-256 might be used for certain types of data, but for very sensitive information AES-256 and SHA-512 would be used; and
· Export control policies. Cryptography is export-controlled but some countries also have import controls. A few countries also prohibit the use of cryptography unless it’s approved by the government. Multinational organizations must be careful where they use cryptography; many countries are signatories to the Wassenaar Arrangement, which defines the goods that are export controlled.
That’s the end of this video on cryptography.
About the Author
Fred is a trainer and consultant specializing in cyber security. His educational background is in physics, having a BSc and a couple of master’s degrees, one in astrophysics and the other in nuclear and particle physics. However, most of his professional life has been spent in IT, covering a broad range of activities including system management, programming (originally in C but more recently Python, Ruby et al), database design and management as well as networking. From networking it was a natural progression to IT security and cyber security more generally. As well as having many professional credentials reflecting the breadth of his experience (including CASP, CISM and CCISO), he is a Certified Ethical Hacker and a GCHQ Certified Trainer for a number of cybersecurity courses, including CISMP, CISSP and GDPR Practitioner.