Some lessons from cracking the compromised Linkedin password database

Here’s a blow-by-blow account of a security researcher’s attempts to crack the compromised Linkedin database. This is a very good example of ethical hacking.

It’s good to get into the mind of someone who found the password m0c.nideknil in the list. At first glance, it looked like a decent password to me. It’s 12 characters long, with one special character and a number. Then I read on, and they explained the problem with that password: it’s “linkedin” backwards with four garbage characters. So it’s hard for a human to guess it, but easy for a computer algorithm to guess.

If a good guy takes that approach, it means some of the bad guys take that approach too.
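To make that concrete, here is a minimal sketch of the kind of "mangling" rules a password cracker might apply to a short list of base words. The substitution table and the `mangle` and `candidates` names are my own illustration, not the researcher's actual tooling; real cracking tools ship with far larger rule sets.

```python
# Illustrative look-alike substitutions; an assumed subset, not a real rule set.
SUBSTITUTIONS = {"o": "0", "i": "1", "e": "3", "a": "4", "s": "5"}


def mangle(word):
    """Return simple variants of `word` that an attacker might try."""
    variants = {word, word.capitalize()}
    # Apply each substitution on its own (e.g. only o -> 0).
    for plain, lookalike in SUBSTITUTIONS.items():
        variants.add(word.replace(plain, lookalike))
    # Apply all substitutions at once.
    variants.add(word.translate(str.maketrans(SUBSTITUTIONS)))
    # Add every variant spelled backwards.
    variants |= {v[::-1] for v in variants}
    return variants


def candidates(base_words):
    """Yield every mangled guess for every base word."""
    for word in base_words:
        yield from mangle(word)


guesses = set(candidates(["linkedin", "linkedin.com"]))
print("m0c.nideknil" in guesses)  # True: o -> 0, then reversed
```

Two base words and a handful of rules are enough to reach a password that looks random to a human, which is why site-related words make poor starting points.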

Linkedin messed up because they hashed their passwords in an entirely predictable way that anyone who can open a command prompt and type sha1sum can duplicate. Linkedin failed to add any randomness (a salt) to the hashes, which is standard security practice. Not adding randomness is like having a doorknob with a lock, and never bothering to lock the door.
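Here is a rough sketch of the difference, using Python's standard hashlib. The unsalted case uses SHA-1, but the same point holds for MD5 or any other fast, unsalted hash. The salted case uses PBKDF2 with a random 16-byte salt; that choice, the iteration count, and the function names are my assumptions for illustration, not a description of what any particular site does.

```python
import hashlib
import os


def unsalted_hash(password):
    """Predictable: the same password always yields the same hash."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def salted_hash(password, salt=None):
    """Hash with a per-user random salt; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # fresh randomness for each stored password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000
    )
    return salt, digest


# Without a salt, two users with the same password get identical hashes,
# so one precomputed table of guesses cracks both at once.
print(unsalted_hash("linkedin") == unsalted_hash("linkedin"))

# With random salts, the same password produces different stored values.
salt_a, hash_a = salted_hash("linkedin")
salt_b, hash_b = salted_hash("linkedin")
print(hash_a == hash_b)

# Checking a login: reuse the stored salt and compare digests.
_, check = salted_hash("linkedin", salt_a)
print(check == hash_a)
```

The salt is stored next to the hash and is not a secret. Its job is to force the attacker to redo the guessing work for every account instead of once for the whole database.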

The only way to save yourself in this situation is to use extremely long, complex passwords. I once worked for someone who assumed that the password database would be stolen, so they required hopelessly complex passwords. So we had specific rules to protect our passwords in the event that someone swiped the database.

This ethical hacker’s methodology gives some insight into that policy, which remains the most draconian password policy I have ever seen. The policy required 16 characters minimum, 2 uppercase, 2 lowercase, 2 numbers, and 2 special characters. No keyboard patterns (think 1234qwerty!@#$QWERTY or 1q2w3e4r!Q@W#E$R). No dictionary words, period.

And then the person implementing this decided to impose an extra rule or two on top of what management required. I never liked her much because she would do things like this, and then visibly enjoy it when she saw it making our lives difficult, but never explain what her reasoning was, if there was good reasoning. She disallowed words spelled backwards. And then she cackled like the Wicked Witch of the West when she found out it took us an hour to come up with a new password every 45 days, because of course if you’re going to institute a draconian password policy, why not make the password life short?

We argued that this wasn’t a good idea, because you were shrinking the pool of possible passwords by disallowing anything that has a word in it–especially a word spelled backwards.

Another reason this wasn’t a good idea was that one person would figure out a password that worked, then share it with everyone else. I won’t say everyone else used those shared passwords, but I’m sure some did. Coming up with passwords that worked became a game, and not an easy one at that.

But now I understand the logic. Passwords containing dictionary words, spelled forward or backwards, are easy for a computer to guess. A 16-character password containing an 8-character dictionary word is, in effect, closer to a 9-character password: the attacker guesses every possible dictionary word as if it were a single character, along with every possible character combination for the remaining 8 characters. Disallowing the dictionary words shrinks the pool of possible passwords, so you lengthen the password requirements to make up for it. It could be the original target was 12 characters, so they lengthened it to 16 in order to keep the password strength where they wanted it.
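To put rough numbers on that, here is a back-of-the-envelope sketch. The 95-character alphabet (printable ASCII) and the 200,000-word attack dictionary are assumptions I picked for illustration; the exact figures move with those choices, but the shape of the result does not.

```python
import math

PRINTABLE = 95              # assumed alphabet: printable ASCII characters
DICTIONARY_WORDS = 200_000  # assumed attack dictionary size, reversed words included


def equivalent_length(search_space):
    """Length of a fully random password with the same number of guesses."""
    return math.log2(search_space) / math.log2(PRINTABLE)


# Sixteen truly random characters.
random_16 = PRINTABLE ** 16

# An 8-letter dictionary word placed at any of 9 starting positions,
# with the other 8 characters chosen at random.
word_plus_8 = DICTIONARY_WORDS * 9 * PRINTABLE ** 8

print(f"random 16 chars:   {math.log2(random_16):.1f} bits")
print(f"word + 8 random:   {math.log2(word_plus_8):.1f} bits")
print(f"equivalent length: {equivalent_length(word_plus_8):.1f} random characters")
```

Under these assumptions the 8-letter word contributes about as much as a few random characters, not eight, so the 16-character password with a word in it lands far closer to the 9-character end of the range than to the 16-character end.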

Now, I’m not certain she was aware that this was a good idea. Other people I worked with at the time would argue she was not. If she was aware of it, she would have picked up some much-needed goodwill if she had spent five minutes at the whiteboard explaining why she implemented the policy the way she did.

Those of us who were having to remember passwords like j*g^P0]b!6Qx$7Hn still wouldn’t like it, but at least we would understand. Being a good security person doesn’t mean you have to enjoy your users’ suffering.

6 thoughts on “Some lessons from cracking the compromised Linkedin password database”

  • June 12, 2012 at 10:38 am

    m0c.nideknil is more than just “linkedin” backwards, with four random characters. It’s actually linkedin.com with a 0 substituted for the o (a fairly common character substitution), and then that whole thing backwards.

    • June 15, 2012 at 7:44 pm

      Nice catch on m0c.nideknil, Timothy. I hadn’t caught that m0c. was just .com backwards, with the common 0-for-o substitution.

  • June 12, 2012 at 11:49 pm

    David, can you satisfy idle curiosity for me?
    I propose a password, as an example:
    i*G#$%Q$*QH

    The question is whether that’s a good password or not?
    Before you answer, you need to know how it was derived. It’s a method I saw maybe ten years ago, and I’m sure it was probably pretty good then, but I’m not so sure now.

    Basically, it is a dictionary word,
    Libertarian

    That word was reverse-cased (upper and lower case or capitalisation inverted), then transposed one row up and half-a-space left on the keyboard. If necessary, it would have wrapped from upper row to lower.

    I suspect the answer is it’s no good, because once the method became common it would only double the necessary checking to transpose anything this way, or triple it to account for moving half-a-space right rather than left.

    Would it be worthwhile to do everything else to create a good password, then transform it this way?

    • June 13, 2012 at 6:46 pm

      Don, I’m concerned that it’s derived from a dictionary word, and derived via a means that’s easy for a computer to simulate. Like you say, move left or right and factor that into the algorithm, and you’ve tripled the number of passwords you have to try. But it’s only triple… A good Perl or Python coder could write a program to transpose a dictionary like that in a short period of time. Maybe the resulting program would take an hour or two to run. Then, once you have the transposed passwords, you can try a few million passwords in a matter of hours if you throw enough hardware at the problem.

      If the hardware exists, the crooks have it. State-of-the-art hardware is free when you’re spending other people’s money, and that’s something computer criminals have been doing for 30 years.

      But I don’t think it would take much to turn it into a very good password. One option would be to deliberately misspell the word before feeding it into the formula. Another, maybe better option would be to append something, anything, onto the beginning or the end afterward. Because then, even if I know your formula, if you misspelled the word, I won’t catch it unless that misspelled word happens to be in my attack dictionary. And even if I know your formula, I still have to figure out what you’re appending to it.

      Making a non-dictionary password and then transposing it like you suggest would work well, if you’re willing to do the work. But how much better is it, really, than i*G#$%Q$*QH]]]]? And i*G#$%Q$*QH]]]] is less work to type. Append something better than ]]]] to it, and it’s better still.

  • June 13, 2012 at 3:09 pm

    Don, I don’t want to speak for Dave (and I am interested in hearing his opinion as well), but to me, that’s a good (maybe great) password.

    There are (essentially) two ways to crack passwords: word lists, and brute force. (Rainbow tables could, I guess, be considered a third method, although really it’s just a combination of the two.)

    For a word list attack to work, that would mean that I would have to exactly guess your password. So, if your password is “pencil” and I use a list of words in the dictionary, I will get your password. I will be trying variations of “pencil” as well, so if it’s “Pencil” or “p3nc1l”, I’ll eventually find it.

    The second method of cracking passwords, using brute force, takes time. You can stack the odds in your favor by using numbers and symbols, but it really comes down to a matter of time and computing power. Dave has written about cracking passwords with brute force before (https://dfarq.homeip.net/2011/06/the-death-of-8-character-passwords/) and I agree with him.

    The password you posted above, in my opinion, is pretty secure. It’s not in the dictionary so that’s a win, and it’s long enough that brute forcing it would take a while. In the case of the Linkedin attack, people are cracking the easy passwords and then moving on. Chances are people are not going to single you out specifically and dedicate hours and dollars toward cracking your specific password. Just don’t be the lowest hanging fruit, that’s all. And don’t use that password on other sites — if you do, the bad guys only have to crack it once.

    When people ask me about passwords I tell them to pick one that’s (a) at least 12 characters long, (b) difficult enough that no part of it appears in a dictionary, and (c) easy enough that you can remember it without writing it down. Personally, I use KeePass, which generates my passwords and stores them. Most of my passwords are 24 characters long and a garbled, messy combination of numbers, letters and symbols. I have no idea what they are.

  • June 14, 2012 at 3:03 pm

    Thank you, gentlemen. Having said it, I was better able to think about it, and I was coming down firmly on the side of “Not good”. Once you get into brute force attacks, then dictionary words are easy. Checking these variations is only an arithmetic progression harder – 2x or 3x. That’s easy for brute force – normally they have to worry about geometric or exponential numbers of things to check. I agree with Dave about ways to make it good, but the simple way I first set out is not secure these days – or wouldn’t be, the first time a Russian Mafioso read that the method had been recommended.
