Chapter 15 Security Blanket
Now that we've seen some Internet-related security threats and solutions, let's talk a bit about securing your own computer. First, we'll dive into cryptography to get a sense of how we can actually prevent some of the attacks we saw last time by ensuring that prying eyes can't see the information exchanged between two parties. After that, we'll see some threats to the data on the computer that I'm sure is near and dear to your heart, as well as some concrete ways you can protect your digital information!
Last time, we saw how technologies like HTTPS and WPA2 used encryption to securely transfer data between computers. Now, let's take a look at a few ways of actually going about encrypting and decrypting textual information. One of the simplest methods of encrypting and decrypting text is called the Caesar cipher, named after the big J himself. The Caesar cipher works by shifting, or rotating, each character in the alphabet over by some value. For example, we could rotate every character by 1, which would mean that an A becomes a B, a B becomes a C, a Z becomes an A, and so on. If we rotate every character by 2, then A becomes C, B becomes D, and Z becomes B.
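Let's try an example. We want to encrypt the word "banana," and let's pick a rotation of 13 (we can pick any number we want for the rotation). Here's a table of every letter in the alphabet rotated by 13, or ROT13 for short:
Plain:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
ROT13:  N O P Q R S T U V W X Y Z A B C D E F G H I J K L M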
Great, now we know how to encrypt the letters of any message! So, the word "banana" becomes "onanan" under this encryption scheme. If we were to transmit this message over a network, then any prying eyes would just see what looks like a nonsense word, so our message can be sent safely! On the other hand, if we told the recipient of our message how many characters we'd be rotating by in advance, then they can decrypt the message by shifting the characters in the other direction. Knowing the size of the rotation is the key to being able to understand our encrypted message, so we call this additional piece of information the key, which is known by both the sender and receiver.
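If you'd like to see the rotation in action, here's a minimal Python sketch of the Caesar cipher. The function name `caesar` and the choice to leave spaces and punctuation untouched are assumptions made for illustration, not part of any standard.

```python
import string


def caesar(text, key):
    """Rotate every letter in `text` forward by `key` positions."""
    result = []
    for ch in text:
        if ch in string.ascii_letters:
            # Work relative to 'a' or 'A' so that case is preserved.
            base = ord("a") if ch.islower() else ord("A")
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return "".join(result)


ciphertext = caesar("banana", 13)
print(ciphertext)               # onanan
print(caesar(ciphertext, -13))  # banana
```

Decrypting is just encrypting with the negative of the key, which is exactly the "shift in the other direction" idea described above.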
So, we're safe now, right? Sadly, not quite. Since there are 26 values in the alphabet, that means that there are only 26 different possible keys for the Caesar cipher. To see for yourself, compare the ciphertext for a key of 27 with a key of 1: you'll get the same thing! An attacker, then, could potentially crack our encryption scheme simply by trying every possible key on the ciphertext. Eventually, one of the keys the attacker tries will be the key that we picked, so they'll be able to decrypt our message. This type of attack is called a brute-force attack, since it consists simply of trying everything possible until something works. Since there are such a small number of keys, this won't take long at all, so sending your bank account password over the Internet using the Caesar cipher might not be the best idea in the world. Maybe we should be more secure and encrypt our information with ROT26?
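To get a feel for how little work a brute-force attack on the Caesar cipher takes, here's a rough sketch that simply tries all 26 keys. The helper names `caesar` and `brute_force` are made up for this example.

```python
import string


def caesar(text, key):
    """Rotate every letter in `text` forward by `key` positions."""
    result = []
    for ch in text:
        if ch in string.ascii_letters:
            base = ord("a") if ch.islower() else ord("A")
            result.append(chr((ord(ch) - base + key) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)


def brute_force(ciphertext):
    """Return one (key, candidate plaintext) pair for every possible key."""
    return [(key, caesar(ciphertext, -key)) for key in range(26)]


# An attacker just reads down the list until something looks like English.
for key, guess in brute_force("onanan"):
    print(key, guess)
```

One of the 26 printed lines (the one for key 13) reads "banana", so the attacker is done almost immediately. Notice too that a key of 26 rotates every letter all the way back to itself.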
It looks like having such a small number of keys to choose from ultimately led to our demise. In order to increase the number of possible keys, let's try a different approach. Rather than rotating each character in our message by the same number of positions, let's try rotating each letter by a different amount! For example, we can rotate the first character by 2, the second character by 4, and the third character by 6, then lather, rinse, and repeat. If our message is "banana," then we'll rotate the "b" by 2 to get "d", the "a" by 4 to get "e", and the "n" by 6 to get "t". Now, the next letter will be rotated by 2 again, so "a" will become a "c", and so on. Eventually, we'll get "detcrg" as our ciphertext. At first glance, this doesn't look any more readable than "onanan", but let's see how long it will take our attacker to crack our encryption. We still have 26 possible values for each rotation (since we're still using the English alphabet), but since we have 3 different rotation values, we have a total of 26 × 26 × 26 = 26^3 = 17,576 different keys! Now we're cooking. If we pick an even longer key, then we can make it even more time-consuming for a brute-force attack to succeed, especially if the attacker doesn't know how long our key is.
Telling the recipient of our message the long series of numbers that we used for our key might be kinda annoying. Instead, we can use an encoding similar to ASCII to write down our key. In ASCII, we said that a value of 65 corresponded to an A, which still seems a bit arbitrary if you ask me. Let's instead say that a value of 0 corresponds to an A, a 1 is a B, a 2 is a C, and so on. Now, rather than saying our key is "2 4 6", we can say that our key is "ceg", which is a bit easier to remember. Notice that this doesn't change how encryption works at all; it's just a different way of writing down the key. This type of encryption is called the Vigenère cipher, incidentally named after a Frenchman who didn't actually invent it. So, in the Caesar cipher we rotate each character by the same amount and our key is a single number, while in the Vigenère cipher we rotate each character by a different amount, and our key is a sequence of letters.
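Here's a small Python sketch of the Vigenère cipher using the letters-as-numbers encoding just described (a is 0, b is 1, and so on). The function name `vigenere` is hypothetical, and one assumption is baked in: the key only advances when we hit a letter, so punctuation and spaces pass through without "using up" a position in the key.

```python
def vigenere(text, key, decrypt=False):
    """Rotate each letter of `text` by the matching letter of `key`."""
    shifts = [ord(k) - ord("a") for k in key.lower()]  # "ceg" -> [2, 4, 6]
    result = []
    letters_seen = 0  # position in the key; only letters advance it
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("a") if ch.islower() else ord("A")
            shift = shifts[letters_seen % len(shifts)]
            if decrypt:
                shift = -shift
            result.append(chr((ord(ch) - base + shift) % 26 + base))
            letters_seen += 1
        else:
            result.append(ch)
    return "".join(result)


print(vigenere("banana", "ceg"))                # detcrg
print(vigenere("detcrg", "ceg", decrypt=True))  # banana
```

A key of a single letter turns this right back into the Caesar cipher, which is a nice way to see how the two are related.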
Let's try another one! Instead of a prehistoric-sounding word like "ceg" (try saying it out loud, it's kinda fun)
So, is the Vigenère cipher the be-all and end-all of ciphers? While it's stronger than the Caesar cipher, it's still pretty weak. Using some trickery that involves looking at the frequency of characters in the ciphertext, it can still be cracked relatively quickly. In fact, there are plenty of freely available tools online that will do just that. Looks like we can still do better!
Asymmetric Key Encryption
In both the Caesar and Vigenère ciphers, both the sender and receiver had to agree on some kind of key before the message was sent. In general, encryption methods that rely on this kind of exchange are called symmetric key cryptography, because the same key is used by the sender to encrypt the message and the receiver to decrypt the message. More advanced symmetric key algorithms include AES (used by the U.S. government) and Blowfish, but they use the same basic principles as Caesar and Vigenère. However, if the key is ever compromised, then any attacker can decrypt any message that has been encrypted with the key, which isn't so good. The process of transferring a key from one party to another is thus a potential weakness for all of these methods!
Another technique is called asymmetric key cryptography, which doesn't require both parties to agree on some kind of shared key. While symmetric key cryptography uses just one key, asymmetric key cryptography uses two different keys: a public key and a private key. As their names suggest, public keys are designed to be shared with anyone on the Internet, while private keys are designed to be kept private, just like passwords. Now, the process of encrypting and decrypting a message requires both a public key and a private key. For example, if Alice wants to send a message to Bob, she can encrypt it using Bob's public key, which is publicly available. Now, the only way to decrypt that message is with Bob's private key, so only Bob can actually read the message that Alice encrypted. No need for Alice and Bob to share any kind of secret key! As an aside, we can also go the other way around. Let's say I receive a message from the President, and I want to make sure that the message is genuine (and not sent by anyone else). If the President encrypts the message with his private key, then I can use his public key to decrypt the message and prove its authenticity. Because only the President has access to his private key, he can use it to "sign" messages that can be verified with his public key.
How messages are actually encrypted and decrypted depends on a lot of scary math, but at a high level, these methods rely on something called a trapdoor one-way function, which is essentially an operation that's really hard to reverse-engineer. Let's take a look at an example. If I have any two prime numbers, it's really easy for me to multiply them together to get a third number. So, if I have the prime numbers 2459 and 8863, I can pull out a calculator and tell you that the product of these two numbers is 21794117 without too much effort. However, if you just give me the number 21794117, it's much harder for me to tell you what prime numbers were multiplied together to get that product! We can see, then, that multiplication is pretty easy, but prime factorization (the reverse direction) is much more difficult! Using a technique called RSA, we can encrypt and decrypt messages using this fact. Essentially, we can use the product of two prime numbers to encrypt a message, so the message can only be decrypted if we know what those two prime numbers are, which we just saw was difficult to figure out from the encrypted message alone. By the way, if someone does figure out a way to calculate prime factorizations efficiently, then we're in a lot of trouble! A lot of smart people think there's no way to do so, but this is actually one of the largest unsolved problems in computer science.
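To make the "easy one way, hard the other" idea a bit more concrete, here's a toy Python sketch built on the two primes from the example. Everything about it is a simplification: real RSA uses primes that are hundreds of digits long plus extra padding steps, and the public exponent 65537 and the helper names (`factor`, `encrypt`, `decrypt`) are assumptions chosen for illustration. Treat it as a sketch of the idea, not as usable encryption.

```python
def factor(n):
    """Find two factors of n by trial division (hopeless for huge n)."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 1
    return n, 1  # no divisor found, so n itself is prime


# The easy direction: multiplying two primes.
p, q = 2459, 8863
n = p * q
print(n)          # 21794117
# The hard direction: recovering the primes takes thousands of guesses
# here, and would take impossibly many for real-world key sizes.
print(factor(n))  # (2459, 8863)

# Toy RSA key pair built from those primes.
phi = (p - 1) * (q - 1)
e = 65537            # public exponent (an assumed, commonly used value)
d = pow(e, -1, phi)  # private exponent; needs Python 3.8 or newer


def encrypt(message):
    """Encrypt a number smaller than n with the public key (n, e)."""
    return pow(message, e, n)


def decrypt(ciphertext):
    """Decrypt with the private key (n, d)."""
    return pow(ciphertext, d, n)


secret = 1234567
print(decrypt(encrypt(secret)) == secret)  # True

# Signing goes the other way: private key first, public key to check.
signature = pow(secret, d, n)
print(pow(signature, e, n) == secret)      # True
```

Computing `d` requires `phi`, and computing `phi` requires knowing `p` and `q`. That's why an attacker who only sees `n` is stuck with the factoring problem.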
We can also combine symmetric and asymmetric key cryptography! Before, we said that a weakness with symmetric key cryptography was the need to agree on some kind of shared key without anyone else seeing. We talked about mixing paints together a few sections ago, and I miss that. So, let's say that Alice and Bob want to agree on a secret color, which they can hypothetically use as a key to encrypt and decrypt messages, but they know that a third person, Eve, can see every piece of information they exchange with each other. First, Alice and Bob will publicly agree on a starting color. They know that Eve is going to find out what this color is, so it will serve as a public key. Let's say that both Alice and Bob like green, so they choose that. Now, here's what everyone knows:
Now, Alice and Bob will both choose another color without telling anyone else, so this is their private key. Let's say that Alice picks yellow and Bob chooses purple. At this point, here's what everyone knows:
Next, both Alice and Bob will now mix together their public and private colors in order to get a new color. Mixing two colors together is another one-way function: it's easy to mix chartreuse and blanched almond, but given a mixed color, it's really hard to figure out exactly what two colors were mixed. After Alice and Bob mix their colors, here's what we have:
Now that Alice and Bob both have a mixed color, they'll send it to each other, which means that Eve will find out. Now, the situation looks like this:
Finally, Alice and Bob will mix their original private color with the mixed color they received, which means that they'll end up with the same color!
Remember though, that Eve has been able to see everything exchanged between Alice and Bob. That means that Eve knows the public colors they started with, Alice's mixed color, and Bob's mixed color. However, none of this information is particularly useful, since Eve has no way of knowing Alice and Bob's shared secret without knowing their secret keys! So, using this process, called Diffie-Hellman, Alice and Bob were able to agree on a shared key using asymmetric key cryptography. Cool, huh?
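The paint analogy maps pretty directly onto numbers. In the sketch below, "mixing" is modular exponentiation, the public starting color is the pair `p` and `g`, and the private colors are two secret numbers. All of the specific values are tiny, made-up ones chosen so the arithmetic is easy to follow; a real exchange would use enormous numbers.

```python
# Public "starting color": everyone, including Eve, knows these.
p = 23  # a prime modulus
g = 5   # a base

# Private "colors": never sent over the network.
alice_secret = 4
bob_secret = 3

# Each person mixes the public color with their private color.
alice_mixed = pow(g, alice_secret, p)
bob_mixed = pow(g, bob_secret, p)
print(alice_mixed, bob_mixed)  # 4 10

# They swap mixed colors (Eve sees both), then mix in their own secret.
alice_shared = pow(bob_mixed, alice_secret, p)
bob_shared = pow(alice_mixed, bob_secret, p)
print(alice_shared, bob_shared)  # 18 18
```

Eve has seen 23, 5, 4, and 10, but to get 18 she'd need to work backwards to one of the secret numbers, which is the hard direction of the one-way function once the numbers get large.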
Now that we have a handle on cryptography, let's take a look at a few different types of malware, which is an oh-so-clever combination of the words "malicious" and "software." As this elegant portmanteau suggests, malware is a general term for software whose purpose in life is to make your life miserable by launching some kind of malicious attack on your computer or from your computer. So, malware is software that has been designed to be intentionally destructive by some not so nice person out there, which is different than software that crashes because of some problems with its design.
There's a good chance that you've already had an experience with a computer virus on your machine or a machine owned by a friend or family member. Typically, viruses take destructive action like slowing down your computer or deleting files because some person on the Internet thought it would be fun to wreak havoc on your poor computer. Much like biological viruses reproduce by first infecting an existing healthy cell, a computer virus usually attaches itself to some other kind of file or program, whether that be a PDF, spreadsheet, or screensaver. Then, when a user opens up that program, the virus launches and begins its path of destruction. So, viruses generally require some kind of human interaction in order to infect a machine, which usually comes in the form of running an infected program. That infected program may come to you as an email attachment or perhaps as a file downloaded from a sketchy web page, so always think before opening something if you have any doubts! Once a virus has been loaded, it has the ability to make new copies of itself, but because viruses need some kind of human interaction to launch, viruses can't automatically propagate themselves to other computers.
The Internet has already seen its fair share of viruses over the past few decades. For example, a virus called the Melissa virus spread like wildfire back in 1999. Melissa was contained within a Microsoft Word document called LIST.DOC, which apparently contained passwords to a number of pornographic websites and was named after a Miami stripper. Ooookay, let's move on. When someone opened the Word document, you know, just out of curiosity, the virus loaded itself into the computer's memory. Then, the virus sent a copy of itself to the first 50 contacts in the user's Microsoft Outlook address book (as an email attachment) and potentially attached itself to new Word documents created on the computer. So, if one of the 50 recipients of that email opened the attachment, 50 new people would receive a copy of the virus. The email traffic generated by this virus was enough to jam corporate networks, so much so that many large companies had to shut down their email services to protect themselves. However, it wasn't until someone actually opened up the infected Word document that the virus was actually able to do anything, so users could protect themselves from the virus simply by ignoring or deleting the email.
Worms are very similar to viruses, but they have a few key differences. While we saw that viruses required some kind of user interaction in order to spread from computer to computer, worms have the ability to self-propagate, so they can spread to other machines without the user doing anything. Worms spread automatically to other devices by exploiting security holes on the computers on a network, which means that worms have the potential to flood the network with traffic to bring it down. Worms may also be used to create a botnet, which is a network of zombie computers controlled by the author of the worm. A botnet might be used to launch something called a DDoS, or Distributed Denial of Service, attack, in which a network of computers makes a huge number of requests to the same web server in an attempt to overwhelm it and bring it down. The larger the botnet, the more successful such an attack might be, since more computers are making requests at the same time. Finally, unlike a virus, which attaches itself to an existing document or program, a worm might exist as a stand-alone program.
The Conficker worm, first detected in November of 2008, created one of the largest botnets in the history of the Internet. The worm targeted vulnerabilities in the Windows operating system, and it's suspected that it was able to infect over ten million computers. That means that the author of the worm could publish instructions that would be blindly executed by millions of computers, and people feared that the Conficker botnet was powerful enough to bring down the Internet itself. While that didn't happen (phew!), people did use Conficker to attack banks and national defense networks, causing millions in damages. That's a lot of damage for such a tiny little worm, if you ask me. While the Conficker worm was particularly difficult to remove at first, software utilities have since been developed to clean your computer should it become infected.
Another popular worm was the ILOVEYOU worm, which spread in mid-2000. While the Conficker worm created a huge botnet, the ILOVEYOU worm instead destroyed lots of files on infected computers. In particular, the worm targeted JPEG, MP3, and other files that you probably wouldn't want to lose. To spread itself to other computers, ILOVEYOU utilized both IRC, an early form of instant messaging, as well as email. However, ILOVEYOU one-upped the Melissa virus by emailing everyone in a user's address book, rather than just the first 50. So, this is to say that computer worms can be used for a variety of destructive purposes, and they're particularly devastating because of their self-propagation abilities.
Spyware, Trojans and Adware
One last type of malware is called spyware, which describes any piece of malware whose purpose in life is to steal information from your computer. For example, spyware might install a keylogger, which is a small program that records every key you press on the keyboard and sends it to some remote source. Using a keylogger, an attacker could figure out your passwords, bank account numbers, and really anything else you might type. Even more creepily, spyware could include software that lets attackers view your screen (or even webcam) or take control of your machine!
A trojan, named after the famed Trojan Horse, is one type of spyware. Trojans are essentially malware in disguise. While a trojan looks like it might be a useful piece of software, like a game or some kind of system utility, it might actually destroy files on your computer or allow someone else to control your computer. Unlike a virus or worm, though, trojans usually don't try to replicate themselves, since their purpose is to allow an attacker to access the data on your computer. So, a trojan is essentially a clever way for attackers to get spyware onto your computer!
Another type of spyware is adware, which as you might guess, is malware that displays unwanted advertisements on your computer. For example, adware might cause unwanted pop-up messages as you're trying to use your computer, or adware might track your Internet-browsing habits in order to generate additional revenue for companies. That being said, adware might also include less harmful applications. Many free software programs include advertisements as a way to generate money without requiring users to pay to download the software. Sites like Google and Facebook offer small advertisements on the peripheries of their web pages that other sites have paid to place. Free iPhone or Android apps take a similar approach, as most people don't mind the presence of a few ads if it means they don't need to pay for the app! (And those that do can usually pay to remove them.) In these contexts, adware isn't so bad, but adware that displays advertisements all over your computer without your permission sounds malicious to me.
Now that we've seen a few different types of malware, let's discuss a few ways you can protect yourself from harm. Luckily, while there may be lots of malicious attackers out there writing programs designed to wipe your hard drive, make your computer unusably slow, or steal your personal information, there are also lots of people writing software to combat viruses, worms, and spyware. Installing anti-malware software designed to detect and remove threats to your computer is a very good idea, particularly if you know you have a good deal of sensitive information on your machine. Anti-malware software can scan your computer for threats, so it's also a good idea to schedule regular, automatic scans to alert you of any potentially harmful software on your computer. Here are a few examples of anti-malware applications you can install on your PC or Mac:
- Adaware Anti-Spyware
- Avast Antivirus
- AVG Antivirus
- Clam Antivirus for Windows and Mac
- ESET Antivirus
- Spybot Anti-Spyware
- Norton Security
However, using anti-virus software on your computer isn't a license to make reckless security decisions! Even with anti-virus software installed, it's important to think before opening sketchy email attachments or software applications. We've seen how the devastating effects of viruses and trojans can be prevented simply by ignoring an infected email or choosing not to install a malicious program. So, think twice before opening that .exe file attached in a chain email sent from your high school friend who you haven't talked to in years!
Another important piece of online security is password safety. The registration page for many a web page nowadays will ask you to choose a password with some minimum length that isn't based on any dictionary words and includes capital letters, numbers, and symbols. Yes, it might be slightly annoying to come up with and remember a password that meets all of these requirements, but this is for your own good. Earlier, we saw how easy it was to crack the Caesar cipher using a brute-force attack because there were so few keys to try. The same idea applies here: if you choose a purely numerical password, then a brute-force attack only needs to try passwords built from 10 different characters. So, if an attacker knows that your password is 8 numbers, then there are only 10^8 = 100 million different things you possibly could have chosen. However, if your password could have characters from the nearly 100 different printable ASCII characters, then an attacker has to try something on the order of 10 quadrillion (or 10 million billion, if you prefer) different passwords! In the same vein, picking a longer password also means that a brute-force search will have to search over a larger space of passwords.
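The arithmetic behind those numbers is just "alphabet size raised to the password length," which we can sketch in a couple of lines of Python. The guessing rate used below (one billion guesses per second) is a purely hypothetical figure picked to make the comparison vivid, and the function name `keyspace` is made up for this example.

```python
def keyspace(alphabet_size, length):
    """Number of different passwords of exactly `length` characters."""
    return alphabet_size ** length


digits_only = keyspace(10, 8)      # 8 digits
printable_ascii = keyspace(95, 8)  # 8 characters from 95 printable ones

print(digits_only)  # 100000000
print(printable_ascii)

# Hypothetical attacker speed, assumed only for the sake of comparison.
GUESSES_PER_SECOND = 1_000_000_000

print(digits_only / GUESSES_PER_SECOND, "seconds")
print(printable_ascii / GUESSES_PER_SECOND / 86400, "days")

# Length matters too: each extra character multiplies the work by 95.
print(keyspace(95, 9) // keyspace(95, 8))  # 95
```

Under that assumed rate, the all-digit password falls in a fraction of a second, while the mixed-character one takes months, and every additional character makes things about a hundred times worse for the attacker.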
In an attempt to crack your password more efficiently, an attacker might use something called a dictionary attack, which takes advantage of passwords that are based on words. Many people are much more likely to choose a password like "monkey1" than a password like "sdfov87q4ivb", simply because the former is easier to remember. So, if an attacker starts by guessing passwords that are based on actual words and thus more likely to be chosen by humans, there's a chance they can crack your password faster than they would have if they simply used a brute-force search. The solution here is simply to make sure your password isn't based on any dictionary words, so this approach wouldn't work!
In addition to choosing passwords, it's also important to manage them effectively. For example, if you're on an unencrypted Wi-Fi network, it's probably not the best idea to use an unencrypted instant messaging client to chat a highly classified corporate password to your co-worker. Similarly, writing down passwords on sticky notes is just asking for trouble, since an attacker walking through the building might (literally) stumble upon what should be secret information. Instead, there are plenty of applications like LastPass and KeePass that are designed to store your passwords securely. Web browsers like Chrome and Firefox also allow you to securely store your passwords in the cloud, so they'll also be available on other computers you log into.
However, in order to store your passwords securely, websites shouldn't actually save your passwords in the clear in a database. Instead, sites should store a hash of your password, which is simply the result of applying some kind of one-way function. For example, MD5 and SHA-1 are two methods for hashing text. Since we saw that these one-way functions are hard to reverse, even if the site's database is compromised, attackers won't immediately have access to your password. However, password hashes are still subject to attacks like brute-force. LinkedIn recently had a security breach that led to password hashes being released to the public, and attackers were able to figure out over 150,000 passwords because many users had chosen bad passwords! Check out this infographic for more information. As you can see, the most common passwords on the site were short and based on words in the English language, so you should always choose a strong password!
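Here's a rough sketch of why a leaked hash doesn't protect a weak password. It hashes passwords with SHA-1 (only because that's one of the functions mentioned above) using Python's standard `hashlib` module, then runs a tiny dictionary attack against the result. The word list and the handful of suffixes are made up for illustration; real attackers use lists with millions of entries.

```python
import hashlib


def hash_password(password):
    """Return the SHA-1 hash of a password as a hex string."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash, wordlist):
    """Guess common words (plus a few suffixes) until a hash matches."""
    for word in wordlist:
        for suffix in ["", "1", "123"]:
            guess = word + suffix
            if hash_password(guess) == target_hash:
                return guess
    return None  # nothing in the word list matched


# A hypothetical, very short list of popular passwords.
wordlist = ["password", "monkey", "dragon", "letmein"]

leaked_weak = hash_password("monkey1")
leaked_strong = hash_password("sdfov87q4ivb")

print(dictionary_attack(leaked_weak, wordlist))    # monkey1
print(dictionary_attack(leaked_strong, wordlist))  # None
```

The attacker never reverses the hash. They just hash lots of likely guesses and compare, which is why a password based on a dictionary word is in trouble even when the site did the right thing and stored only hashes.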
Speaking of other computers, by the way, always be careful when using a public computer! You never know what on Earth could be installed on one of those, since the person who used it before you may not be as well-versed in security as you are. So, if possible, it might be best to avoid doing your online banking on a computer the public has physical access to.
Let's look at one last bit of data-related security before we finish up. Earlier, we saw how information is stored on both hard drives and RAM. Hard drives, for example, have a big table that keeps track of where files are actually stored on the drive, whether those locations are platters on a HDD or transistors on an SSD. But, what happens when you empty your recycle bin and delete files from your computer? Perhaps not what you'd expect! The bits of a deleted file aren't actually removed from the hard drive. Instead, what gets erased is the entry in the master file table, which means that your computer has essentially forgotten where the bits of the file are, although they are still very much there. If your hard drive were a book, then deleting a file would be more like removing a chapter from the table of contents than ripping a page out of the book and shredding it! So, someone might be able to use forensic analysis to scour your hard drive looking for bits of files that are still on the drive. Of course, bits and pieces of deleted files may have been overwritten by the operating system (since it forgot where the files were), but analysts may be able to recover large portions of deleted files from a hard drive. By the same logic, files that have been "freshly" deleted may be easier to recover than those that were removed from the drive a long time ago.
Don't worry, though, there are a few approaches we can take to solve this problem. Before ever selling your computer to someone else, make sure to wipe (or physically remove) the hard drive. You might hear that formatting a hard drive is the right thing to do here, but formatting a drive doesn't necessarily erase all the 0s and 1s that have been stored on the drive. Instead, it overwrites a limited amount of data and tells the operating system that the disk is blank. So, data from a formatted drive may still be recovered, even though it will be a bit more difficult. Rather than formatting the drive, a safer solution is to securely delete everything on the drive using a utility like DBAN. DBAN and other applications like it completely fill up your hard drive with random numbers or zeros in order to ensure that everything on the drive is overwritten and unrecoverable. To be even more secure, the US Government mandates that all government drives be completely wiped, which means the entire disk will be overwritten with random data and verified, at least seven times.
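To illustrate the difference between "forgetting where a file is" and actually overwriting it, here's a small Python sketch that scribbles random bytes over a single file before deleting it. The function name `overwrite_and_delete` is hypothetical, and this is only a toy version of the idea: depending on the file system and the type of drive (SSDs in particular), old copies of the data may survive elsewhere, so it's no substitute for a dedicated wiping utility.

```python
import os
import tempfile


def overwrite_and_delete(path, passes=1):
    """Overwrite a file's bytes with random data, then remove it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)                  # go back to the start of the file
            f.write(os.urandom(size))  # replace every byte with noise
            f.flush()
            os.fsync(f.fileno())       # ask the OS to push it to disk
    os.remove(path)                    # now drop the table-of-contents entry


# Try it on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"my super secret diary")
    secret_path = tmp.name

overwrite_and_delete(secret_path, passes=3)
print(os.path.exists(secret_path))  # False
```

A plain delete would only perform the last step, `os.remove`, which is the "erase the table of contents entry" behavior described earlier.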
Another solution is to encrypt your entire hard drive. We've seen how we can use encryption to scramble the data of text-based messages, but we can apply the same techniques to every single bit on your hard drive! For example, we can pick a strong key with which to encrypt our hard drive, and all data will be encrypted before being written to the hard drive and decrypted after being read from the hard drive. Now, if someone gets a hold of your HDD, the only way they'll be able to read the information from it is with the password you chose when you encrypted the drive.
Your hard drive isn't the only piece of hardware that's vulnerable to forensic analysis. When we talked about RAM, we mentioned that once you turned your computer off, the contents of RAM were lost. However, some fairly recent research has revealed that's not quite the whole truth. While the data in RAM will eventually fade away, it will actually do so predictably and gradually. So, even after a computer has been turned off, it's totally possible to dump the entire contents of its RAM! If passwords or bank account numbers are stored in RAM, then even if you're using an encrypted hard drive, an attacker could be able to access them. If you're using an encrypted hard drive, then there's a good chance the encryption key is also stored in RAM, which means that an attacker could be able to decrypt your encrypted hard drive with access to the physical hardware. In fact, if the physical sticks of RAM are kept very cold, then the contents of RAM can actually be kept around for even longer. Check out the below video for a demo, where we can see that almost all of the bits stored in RAM have been preserved ten minutes after the computer was powered off!
Give the ROT13 encryption of this message: "That's one small step for man; one giant leap for mankind." Assume that you can leave any punctuation as is, so you don't need to worry about encrypting those characters.
The following message has been encrypted with ROT5 encryption. What does it say? "Mtzxyts, Ywfsvznqnyd Gfxj mjwj. Ymj Jflqj mfx qfsiji."
How are the Caesar cipher and Vigenère cipher similar and different? Why is the Vigenère cipher more secure than the Caesar cipher?
Using the Vigenère cipher and the key "ALICE", encrypt this message: "Sentence first—verdict afterwards." Just like last time, you can leave any punctuation as is.
What's the difference between symmetric key cryptography and asymmetric key cryptography? Why might we want to combine the two approaches in some cases?
Explain the roles of the public key and the private key in the encryption process of asymmetric key cryptography.
Compare and contrast viruses and worms.
Why is it important to choose a strong password when signing up for a website?
What happens when you delete a file from your hard drive?