From the development of password-guessing algorithms based on human behaviour to the building of highly efficient GPU cracking platforms able to guess most passwords in a matter of seconds, password cracking has evolved.
This article highlights the most interesting things we have learned about passwords over the last few years.
Easy to remember, hard to guess
A password is a secret, so it is supposed to be robust against guessing. Some people try to make their passwords more complex by combining random characters.
However, it is rather difficult to remember a password made of mixed letters, numbers and symbols. People tend to create a personal algorithm to generate passwords that are more complex but still easy to remember. They take a word and append a number such as a date, add symbols, capitalise some letters…
This substitute-and-append method is widely used by the few who care at least a little about their passwords. It was good practice for a while, until rule-based and mask-based password cracking tools such as John the Ripper or maskprocessor emerged.
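To show why this habit stopped being safe, here is a minimal Python sketch of the kind of mangling a rule-based cracker applies to a dictionary word. The substitution table, the suffix list and the `mangle` name are our own illustrative assumptions, not the rule syntax of any real tool.

```python
from itertools import product

# Hypothetical substitution table; real rule sets are far larger.
LEET = {"a": "@4", "e": "3", "i": "1!", "o": "0", "s": "$5"}

def mangle(word, suffixes=("", "1", "123", "!", "1984")):
    """Yield candidates built from `word` by substituting, capitalising and appending."""
    # For each position, the options are the character itself plus its substitutes.
    options = [c + LEET.get(c.lower(), "") for c in word]
    for chars in product(*options):
        base = "".join(chars)
        # Try the plain variant and the one with a capital first letter.
        for variant in (base, base.capitalize()):
            for suffix in suffixes:
                yield variant + suffix

candidates = set(mangle("bimbo"))
print("b1mb01984" in candidates)
```

With these assumptions, a single dictionary word such as "bimbo" already covers a password like "b1mb01984".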
Then it was suggested that people should use passphrases, because it's much easier to remember a sequence of words than random characters.
A passphrase is a sequence of words mostly used for user authentication or as a cryptographic key. It is used like a password but is meant to be longer for added security.
Some people claim that, because of their length, passphrases are much safer than passwords. That is not entirely true. Because it is human nature to build on what we know, and because some people draw on a narrow range of ideas, we can often guess the words they use to build their passphrases.
If I ask you to give me four different words to build a passphrase right now, they will probably be linked to what surrounds you, to very basic words or to what you like, maybe even to this article.
Most of the time people will use the following elements to build a passphrase.
If an attacker is able to gather information about a target, through social media or social engineering, he or she can build a dictionary of likely words and use a tool such as princeprocessor to base the attack on it.
Know your enemy
A motivated attacker will passively gather as much information as he or she can. The more you expose publicly, the easier it will be for an attacker to get into your mind and generate a dictionary.
Social media is the best source of information for reconnaissance. By simply typing someone's name on Facebook or Twitter it is possible to learn what the person likes (hobbies, animals, food), who he or she talks to (relatives, friends, soulmate) and probably the city he or she lives in (location).
A non-passive approach would be social engineering: an attacker can try to gather more sensitive information through psychological manipulation.
We have all heard stories of attackers tricking people into revealing their password by claiming to be from the IT department and asking for the first and last letters of the password for verification.
Targeted passphrase cracking
We tested the strength of passphrases with 19 non-technical friends. We explained the principle of a passphrase to them and asked them to register an account on a fictitious dating website.
The website stored the MD5 hash of each passphrase and the email address in a MySQL database.
Then we simulated a database leak. Our goal was to crack as many passphrases as possible in less than 7 days.
We know our friends: we know what they like, the words they use and how they think. We also have a large history of instant messaging conversations with them.
Based on that knowledge we built a dictionary of words for each of them.
To better explain how we proceeded, let's take one of our candidates, "Peter", as an example.
Peter is a master of trolling and likes to spend time on websites like 9gag and 4chan. Peter likes to play video games: he spent a lot of time on Diablo and Hearthstone, and we also saw him playing Pokemon on his Game Boy.
Finally, Peter talks a lot on instant messengers, so we aggregated all the messages he sent to us.
Our first step was to write down words that Peter would be likely to use for his password. For that we used a template containing common words in Peter's native language, mostly grammatical articles and numbers.
Then we used a Python script to sort all the data we had about Peter. From that data we obtained a list of the words he uses the most.
- youloose, thegame
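The script itself is not reproduced here. As a rough sketch of the idea, counting the most frequent words in a pile of messages could look like the following; the tokenizer, the `min_length` filter and the sample messages are assumptions made for illustration.

```python
import re
from collections import Counter

def top_words(messages, n=20, min_length=3):
    """Return the n most frequent words across messages, most frequent first."""
    counts = Counter()
    for message in messages:
        # Lower-case and keep only runs of letters; a deliberately simple tokenizer.
        for word in re.findall(r"[^\W\d_]+", message.lower()):
            if len(word) >= min_length:
                counts[word] += 1
    return [word for word, _ in counts.most_common(n)]

# Made-up sample data, not taken from the experiment.
sample = ["you loose the game", "the game again", "Game over"]
print(top_words(sample, n=2))
```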
Then we extended the dictionary with some words related to what he likes.
We ended up with a custom dictionary of about 180 words.
The next step was to actually crack Peter's passphrase using that custom dictionary.
We used princeprocessor to generate combinations and hashcat with custom rules to crack the passphrase.
After a few hours Peter's passphrase was found. It was a 24-character alphabetic passphrase made of 5 distinct words, with a capital letter as the first character.
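To give an idea of what this step amounts to, here is a deliberately naive Python sketch that combines dictionary words and compares each candidate against an unsalted MD5 hash. It is not what princeprocessor and hashcat do internally, and it is orders of magnitude slower. The function name, the word list and the target passphrase are hypothetical.

```python
import hashlib
from itertools import permutations

def crack_passphrase(target_md5, words, max_words=3):
    """Try ordered combinations of distinct words against an unsalted MD5 hex digest."""
    for count in range(1, max_words + 1):
        for combo in permutations(words, count):
            joined = "".join(combo)
            # Also try the variant with a capital first letter.
            for candidate in (joined, joined.capitalize()):
                digest = hashlib.md5(candidate.encode("utf-8")).hexdigest()
                if digest == target_md5:
                    return candidate
    return None

# Hypothetical dictionary and leaked hash, for illustration only.
words = ["game", "the", "pokemon", "diablo"]
leaked = hashlib.md5(b"Thegamepokemon").hexdigest()
print(crack_passphrase(leaked, words))
```

The number of candidates grows very quickly with the number of words per passphrase, which is why a small, well-targeted dictionary matters so much.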
With enough computing power and good knowledge of your target, it becomes trivial to break a weakly created passphrase.
We asked the creators of the passphrases we couldn't crack to tell us what they were, and several of them couldn't remember theirs properly because it was too long.
All the test subjects created a passphrase related to what they like, where they were born or where they live.
It would be great to run the same study on a more heterogeneous population. You can contact us to get the website framework we used.
Most user database leaks show that people's choices of password structure follow several known patterns.
Although most passwords contain only lower-case letters and numbers, it's more common to see a capital letter as the first character than as the second.
This is because when you type, you are used to starting a sentence with a capital letter. Unconsciously, this habit pushes you to do the same when you create a password.
When people include at least some digits in a password, two patterns are very common.
- Letter substitution with numbers.
- Appending meaningful numbers at the end of the password.
Such methods generate passwords such as "sup3rg0rg30s", "ashley69" or "b1mb01984". Note that the numbers are prepended to the password only around 10% of the time.
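If you want to measure these patterns on a leaked list yourself, a small classifier is enough. The sketch below is one possible way to do it; the category names are our own, and a password that mixes substitution and appended digits simply falls into the "inside" bucket.

```python
import re
from collections import Counter

def digit_pattern(password):
    """Classify where the digits sit in a password."""
    if not re.search(r"\d", password):
        return "none"
    if password.isdigit():
        return "digits_only"
    if re.fullmatch(r"\D+\d+", password):
        return "appended"      # e.g. a word followed by a year
    if re.fullmatch(r"\d+\D+", password):
        return "prepended"
    return "inside"            # digits mixed in, typically letter substitution

sample = ["sup3rg0rg30s", "ashley69", "b1mb01984", "1984bimbo", "password"]
print(Counter(digit_pattern(p) for p in sample))
```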
Most thorough authentication systems require at least one special character in the password.
Our study indicates that symbols follow the same two patterns as numbers.
We can encounter passwords such as "b@ndit", "Blues!" or "k!mmy$".
People don't like to put symbols in passwords, so when one is mandatory they add it at the end of the password they have already typed.
Some people follow the logic of written language when generating their passwords, so a password is more likely to end with a full stop than to start with one.
The most used symbols are not the same at the beginning, in the middle or at the end of a password.
In general, the five most used symbols in passwords are the exclamation mark, the at sign, the full stop, the underscore and the hash sign.
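The same kind of counting works for symbols. As a sketch, the function below tallies which symbol appears at the start, in the middle and at the end of each password. We assume here that "symbol" means any ASCII punctuation character, and the sample list only reuses the examples given above.

```python
import string
from collections import Counter

def symbol_positions(passwords):
    """Tally symbols by where they appear: start, middle or end of the password."""
    tallies = {"start": Counter(), "middle": Counter(), "end": Counter()}
    for password in passwords:
        last = len(password) - 1
        for index, char in enumerate(password):
            if char in string.punctuation:
                if index == 0:
                    where = "start"
                elif index == last:
                    where = "end"
                else:
                    where = "middle"
                tallies[where][char] += 1
    return tallies

tallies = symbol_positions(["b@ndit", "Blues!", "k!mmy$"])
print(tallies["middle"], tallies["end"])
```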
Password cracking tools
If an attacker has gathered what he or she thinks is enough information about the target to start a password cracking attack, it's still important to pick the right tools.
We won't be talking about hashcat or John the Ripper because they are not the topic of this article. What we would like to introduce are the tools released by Jens Steube last year.
All the following tools write password candidates to standard output, so you only have to pipe them into your favourite password cracker.
The princeprocessor is a password candidate generator. The acronym PRINCE stands for "PRobability INfinite Chained Elements".
This tool takes a list of words as input and produces combinations of these words in a smart manner.
It is a great tool for targeted attacks: you can use a dictionary of likely words, as seen earlier, to generate complex passphrases based on the target's theme or environment.
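One way to picture the "chained elements" idea is to build every chain of dictionary words that adds up to a chosen total length. The sketch below does only that. It ignores the ordering and probability logic of the real tool, and the `chains` function and its parameters are our own simplification.

```python
def chains(words, target_length, max_elements=4):
    """Yield every concatenation of up to max_elements words that is exactly target_length long."""
    def extend(prefix, remaining, depth):
        if remaining == 0:
            yield prefix
            return
        if depth == max_elements:
            return
        for word in words:
            # Only follow words that still fit in the remaining length.
            if len(word) <= remaining:
                yield from extend(prefix + word, remaining - len(word), depth + 1)
    yield from extend("", target_length, 0)

for candidate in chains(["the", "game", "you"], 7):
    print(candidate)
```

Words may repeat within a chain in this sketch, which is a choice we made to keep the code short.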
The maskprocessor is a high-performance word generator with a per-position configurable character set.
We have seen earlier some examples of how humans design their passwords. We can exploit these habits to create a mask for the password generator, which reduces the key-space and speeds up the password cracking process.
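As an illustration of how much a mask shrinks the key-space, here is a small Python sketch that expands a mask written with the `?l`, `?u` and `?d` placeholders (lower-case, upper-case, digit). It supports only this subset, treats every other character as a literal, and is our own toy implementation rather than the tool's code.

```python
import math
import string
from itertools import product

CHARSETS = {
    "l": string.ascii_lowercase,
    "u": string.ascii_uppercase,
    "d": string.digits,
}

def parse_mask(mask):
    """Turn a mask such as '?u?l?d' into one character set per position."""
    positions = []
    i = 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask) and mask[i + 1] in CHARSETS:
            positions.append(CHARSETS[mask[i + 1]])
            i += 2
        else:
            positions.append(mask[i])  # literal character
            i += 1
    return positions

def expand_mask(mask):
    """Yield every candidate matching the mask."""
    for chars in product(*parse_mask(mask)):
        yield "".join(chars)

def keyspace(mask):
    """Number of candidates the mask describes, without generating them."""
    return math.prod(len(charset) for charset in parse_mask(mask))

print(keyspace("?u?l?l?l?l?l?d?d"), 62 ** 8)
```

A capital letter, five lower-case letters and two digits give 26^6 × 10^2, about 3.1 × 10^10 candidates, against roughly 2.2 × 10^14 for the full 8-character alphanumeric key-space.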
The statsprocessor is a word generator based on per-position Markov chains.
This tool generates words using pre-computed statistics of the most probable letter following another letter.
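The principle can be sketched in a few lines of Python: count, for every position, which character follows which in a training list, then only walk through the most frequent successors. The training words, the `top` cut-off and the function names are assumptions made for this example, and the real tool relies on a pre-computed statistics file instead.

```python
from collections import Counter

def train(words):
    """Count, for each position, which character follows the previous one."""
    stats = {}
    for word in words:
        previous = ""  # empty string marks the start of a word
        for position, char in enumerate(word):
            stats.setdefault((position, previous), Counter())[char] += 1
            previous = char
    return stats

def generate(stats, length, top=2, prefix=""):
    """Yield words of `length`, following only the `top` most frequent successors."""
    if len(prefix) == length:
        yield prefix
        return
    previous = prefix[-1] if prefix else ""
    successors = stats.get((len(prefix), previous), Counter())
    for char, _ in successors.most_common(top):
        yield from generate(stats, length, top, prefix + char)

stats = train(["pass", "pass", "past", "pain"])
print(list(generate(stats, 4, top=2)))
```

The most probable candidates come out first, so the cracker spends its time on likely words before unlikely ones.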
Password cracking companies
Over the last few years, some companies specialising in "high-performance password cracking" have emerged.
Sagitta HPC, a leader in this market, sells high-end GPU and CPU compute nodes.
The performance of their hardware is very impressive. On Twitter earlier this year, Jeremy Gosney, founder and CEO of Sagitta HPC, posted a link entitled "The world's fastest 8-GPU".
The Gist displays a benchmark run with cudaHashcat on their most recent hardware, 8x GTX Titan X, reaching 300 GH/s on NTLM and 150 GH/s on MD5.
At such a speed it would take only about a day to test the whole alphanumeric key-space of length 9 against an MD5 hash.
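The arithmetic behind that estimate is easy to check. The snippet below assumes a 62-character alphabet (lower-case, upper-case, digits), passwords of exactly 9 characters and the 150 GH/s MD5 rate quoted above.

```python
ALPHANUMERIC = 26 + 26 + 10  # lower-case, upper-case and digits

def exhaust_hours(charset_size, length, hashes_per_second):
    """Hours needed to try every password of exactly `length` characters."""
    return charset_size ** length / hashes_per_second / 3600

print(round(exhaust_hours(ALPHANUMERIC, 9, 150e9), 1))
```

That is 62^9, about 1.35 × 10^16 candidates, which works out to roughly 25 hours.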
The amazing part of these nodes is that, if you have enough money to afford several of them and a good power supply, it is possible to make them work together in a cluster.
Follow some good practices to reduce the risk of getting your passwords cracked.
- Secret and private
- Unique for each account
- At least 10 characters
- Mixture of lower-case, upper-case, digits and symbols
- Easily remembered (passphrases)
Please stop hashing your passwords
Regularly, in large developer communities, I see people recommending a hash function with a salt to secure user passwords. Well, it's not the best practice.
Years ago it was common to use MD5 or SHA1 to hash user passwords so that they didn't show up in clear text in the database.
MD5, SHA1 and SHA2 are cryptographic hash functions. A hash function is designed to be fast, to identify data uniquely and to verify integrity.
Obviously, the faster the hash function is, the faster the brute-force process will be.
Key derivation to the rescue
Bcrypt, scrypt and PBKDF2 are key derivation functions for passwords. Key derivation functions are often used in conjunction with non-secret parameters such as a salt to derive one or more keys from a common secret value.
Just like a hash function, a key derivation function returns a value of fixed length. It is also a one-way function, so the original values are unrecoverable.
The strong point of a key derivation function is that computing it is very expensive in processing power, which means it is a lot more time-consuming to break than a conventional hash.
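As a minimal sketch, here is how a password could be stored and checked with PBKDF2 using only Python's standard library (`hashlib.pbkdf2_hmac`). The iteration count, the salt size and the function names are assumptions chosen for the example, so pick values that match current guidance and your own hardware.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumption: tune so one derivation takes a noticeable time

def hash_password(password, salt=None, iterations=ITERATIONS):
    """Derive a key from the password; returns (salt, derived_key)."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for every account
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived

def verify_password(password, salt, expected, iterations=ITERATIONS):
    """Recompute the derivation and compare in constant time."""
    _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
```

Each guess now costs the attacker hundreds of thousands of hash computations instead of one, which is exactly the slowdown we are after.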