-
-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop using MD5 for passwords #314
Comments
To do that we need the password reset functionality, |
What should we use instead anyway? |
Sha256 is still safe so it should be good, |
Looks like sha256 + salt is not good enough for passwords => http://security.stackexchange.com/questions/90064/how-secure-are-sha256-salt-hashes-for-password-storage?lq=1 According to another SO question, we probably want to use PBKDF2-Haxe. |
I knew about PBKDF2 but I assumed it wasn't available in haxe, |
It would be good to come back to this, MD5 really shouldn't be used anymore. This should be considered a major security concern for the Haxe ecosystem, especially as it is growing. |
Instead of forcing everyone to reset their passwords, we could temporarily rehash the existing passwords with a new algorithm and then update them properly when the user next logs in: https://paragonie.com/blog/2016/02/how-safely-store-password-in-2016#legacy-hashes There is however a problem with how to handle old clients... right now, the client hashes the password with md5 before sending it off to the server, instead of the password being hashed on the server side. We might have to simply reject We could do this by adding an extra parameter to the |
My idea right now:
P.S. I've no experience in handling passwords in a system. Feel free to comment. |
I don't really see the point in having a transition period like that. There would be no extra security provided by the other hash during this period, as cracking the MD5 hash would also crack the new algorithm since the password was the same. Until the year is up and the md5 hashes are removed, there is no benefit, but the registrations would still be broken in return for no extra added security. I think it's better just to break compabitilty at the same time as rolling out the change. So this is what I think would be good:
I think it's fair to have these breaking changes, as it's easy to update haxelib and the current situation of hashing on the client is extremely insecure. Therefore, we should prevent library authors from using these old clients to register/submit when it can easily be avoided by updating their haxelib. Also, it will have no effect on users who aren't library authors. I've started working on a binding for argon2 and some of these changes, so I can make a PR soon to give a better idea of how it might work. |
Once a user have their new password hash stored in the DB, the old md5 wouldn't be used anymore. But I'm fine with disabling lib submission with the old hash anytime.
Letting the server able to read raw passwords is a bad idea. Passwords will be leak if the server provider/maintainer (e.g. Digital Ocean, or myself) is a bad actor. The server should only receive hashed passwords and the hashing is performed on user's machine. |
That's possible to do with the plan I sent above, if we accept submissions from old clients. However, there is an issue with this. Let's suppose we want to implement this, so it's possible to use the md5 to login with old clients until the new hash is stored. We'll need a way to keep track of whether the hash has been updated (either with a Now, theoretically, there would be two ways an old clients submission request can fail: either the account no longer uses md5, so the request is outright rejected, or the account still uses it but the password is incorrect. The problem lies in how to handle these two errors.
We can avoid this problem by disabling submissions from old clients immediately, and that also gives us the freedom to implement a proper solution and not have to carry around a legacy field in the remoting api. So I do think doing that is the best solution.
It doesn't really make a difference if the server receives a hashed password or the plain password. If the server is compromised in this way, even if the password is hashed on the client, then the bad actor could as well see the hash, which essentially acts as the password itself, and knowing the hash is enough to give them access to the account (using a custom client). Note that the password would still be hashed before being stored anywhere long term. Also, for the client to hash the password, there would have to be public access to any user's salt, which otherwise would be private apart from in the case of a database leak. Client-side hashing is not a standard practice, and overall it leads to much worse security than server side hashing: |
I think the conclusion is that we need hashing on both client side and server side. This is also suggested in the discussions in the stackexchange pages you referenced. |
Only if you use unique salt on both client and server (and a different one on each), which doesn't sound very practical in haxelib case since it would require one more back-and-forth between server and client for exchanging salt, as initial requests comes from the client. Failing to do that would make hashing on client side less secure than sending plain password. |
I really don't see the benefit of hashing on both the client and server. If there is a bad actor who can modify the code running on the server to retrieve the password before it is hashed, then it doesn't matter if what the client sends is a hash or the raw password, either one will give access to the account. They can just modify the client to send the hash directly to log in. Also, if they are someone with that kind of power, they already have access to the server so they can simply just replace an author's libraries on the server without having to log into their account, and thus eventually get whatever code they want running on users' machines when they download the library. Hashing only on the server side is common practice and there is really little to gain with client side hashing. However, there are a lot of drawbacks.
In my opinion, the best solution is just to follow the standard practice, which will lead to a simpler solution and avoid all the extra problems. |
In #564, I implemented the gradual migration to argon2id as I described. If we were to hash on the client as well (which I personally still oppose), we'll probably need to go for bcrypt instead of argon2. BCrypt has a Haxe implementation so it won't be much of a hassle if we ever decide to migrate the client to a target other than neko (or indeed if someone wants to make their own haxelib client using this repo as a library). However, the C implementation of argon2 is widely used so it is less likely to have security issues than any Haxe algorithm implementations. Having a C dependency is alright for the server but it is a hassle for the client, which is another reason why hashing on the client introduces complexity. |
I also see no reason to do client-side hashing. Using one password per platform is what users should do anyway… |
Hashing on the client-side is not about protecting the haxelib account, but to protect the users account on other sites. (This is discussed in the stackexchange pages @tobil4sk commented eariler) Of course people shouldn't reuse password, but it's common. The haxelib client has been doing client-side password hashing since the beginning. Let's not make a change that would improve security in one aspect but remove existing security measures. |
But that makes haxelib very vulnerable (md5 hashes dump from a hacked database anywhere can be directly used on it without even having to crack the passwords) |
I'm not against replacing md5 with another algo. I just insist we need client-side hashing (with an additional server-side hashing too). |
I am also against client-side hashing, the server should be using TLS/SSL always, which as far as I know (not much) should make MITM attacks not be a concern. |
@andyli Can we look into this more? Delaying something important for security shouldn't be happening. |
@tobil4sk Thanks for your work, and sorry for blocking this long-awaited improvement. Let me reiterate one last time what my concern is and what I think would satisfy everyone here. My main concern is that the lack of client-side hashing would give the haxelib server operator (myself, the current hosting provider, and whoever got access to the haxelib server) read access to the passwords. It means haxelib users who reuse their passwords may give out access to their other online accounts. I have yet seen anyone address this point since I mentioned it in this thread. Here is my final proposal - keep the current client-side md5 hashing and add server-side argon2id (or any modern secure method) hashing. This at least wouldn't be a security downgrade regarding my main concern. We then got better hashing on the server-side, which is what everyone here agreed to be most important. This proposal also has the benefit of no change on the haxelib client, making it a server-only change. i.e. haxelib users no need to upgrade their haxelib clients for this enhancement. |
If the server operator is evil and sees the md5 hashes, they can still crack the password. Client side hashing only makes sense if it uses a proper key derivation function. As for password reuse: I would really hope that all programmers use a password manager and randomly generated passwords. Also, an evil server operator can do things like replacing major haxelibs with malware so I think compromised passwords would be the least of our concerns in such a case. TLDR: Keeping md5 in the client has at best no benefit and at worst is a security issue. |
Right, I know client-side md5 hashing is a weak protection against evil server operator, but it's still better than no protection. To be extra clear, my bottomline is "do not reduce existing security measures". Removing client-side hashing would reduce protection against evil server operator, and is also a breaking change that require a haxelib client upgrade from the user side.
I can't see how keeping md5 in the client (when strong server-side hashing is added) is worse. |
While I find this evil server operator argument somewhat contrived, I also don't see how the MD5 client-side hashing could be worse. I can see how there's a very strong argument that it has no benefit, but in that case we might as well keep it in order to have less friction. IMO the decision should hinge entirely on that aspect: If it's harmless then let's keep it, if it has the potential to cause problems then let's not keep it. |
Having looked into it a little more, it looks like "password shucking" attacks would also work for argon2id+md5 (the usual example given is bcrypt+md5, but the attack doesn't depend on any specifics of bcrypt as far as I can see) |
If I understand correctly, password shucking requires an attacker to target individual users and will only work on users who re-used a password across websites. I think most of you in this thread don't care much about people who re-use passwords. |
Keeping the (unsalted) Md5 algorithm around (even if combined with a modern hashing algorithm) is a security liability. Rehashing with a better algorithm won't mitigate the flaws with Md5, so the hash produced by https://www.scottbrady91.com/authentication/beware-of-password-shucking |
From the source you referenced (https://www.scottbrady91.com/authentication/beware-of-password-shucking),
|
I now read a bit about password shucking, if I understand correctly it's basically this situation expressed in Haxe code: import haxe.crypto.BCrypt;
function main() {
var leakedMd5Hashes = ["48f03653ae8b6d062bb8b7ae177cb391"];
var leakedBCryptHashes = ["$2b$10$jziC4wqx5Zz6oIRXidy2ceQM8lhQP8jIhJvy/1hOV4xlEQPsNIeq."];
var leakedSalt = "$2b$10$jziC4wqx5Zz6oIRXidy2ce";
for (md5 in leakedMd5Hashes) {
var bc = BCrypt.encode(md5, leakedSalt);
for (bc2 in leakedBCryptHashes) {
if (bc2 == bc) {
trace("FOUND MATCH: " + md5);
return;
}
}
}
} Is that accurate? That quote from Scott Brady "As a herd immunity, I think rehashing alone will still help." is a bit confusing, but I think what he means is that |
This part of the article discusses rehashing for the purpose of hash migration, it doesn't justify keeping Md5 around forever. |
Yes, it means it's an immediate improvement without breaking change (haxelib client update). If you want to do it properly, at the cost of having a breaking change, use a stronge algo on the client side as well. Let me state it again, my bottomline is "do not reduce existing security measures". The existing client-side md5 hashing has been providing (weak, but exist) protection against server operator. |
@Simn I believe it's more like: function main() {
var victim = "someone"; // haxelib account username
var leakedMd5Hashes = [
// ... md5 hashes leaked from other breached sites
];
for (md5 in leakedMd5Hashes) {
final succeeded = tryHaxelibSubmit(victim, md5);
if (succeeded) {
// Attacker gained access to the victim's haxelib account.
// Other haxelib accounts are still safe.
return;
}
}
} |
Ah yeah, that's basically the same thing except that we don't need any of the BCrypt leaks because the server checks this for us. Hmm, that's actually a quite convincing argument against client-side hashing... But then again I'm changing my opinion on this with every comment at the moment. |
It will also work for anyone who has a weak or common password, because the hashes for these would be readily available in pre-computed tables.
The Md5 algorithm being involved at all is actively harmful for security. An attacker trying to crack the hashes has a much easier job trying to crack the rehashed passwords than if we were using purely argon2id (because of password shucking).
The server operator can already change the contents of a library (including haxelib itself) and therefore run arbitrary code on the machine of anyone who's installed it. That's enough power to get the password even with proper client side hashing, and to do a lot worse.
@Simn That is correct. Then once they work out which md5 hash is used, they can just crack the md5 hash and they have the original password, without having to crack argon2id. So it makes the task as easy as cracking md5 and they have stripped off any protection that would have been provided by argon2id. |
I was thinking of that as well. I know it's tempting to fall into a "we don't care about those idiots" line of thinking here, but the reality of the situation is that any such fallout is going to affect us regardless. If someone hacks Nicolas' haxelib account tomorrow because his password is hunter2 then we can't just sit there and say "should have known better". P.S.: Please don't turn into an evil server operator, Andy. :) |
No hashing, client and/or server side, will save the users who use a weak or common password. |
The attacker has to obtain the md5 hash first. The md5 hashes in the haxelib DB will be removed once we rehash them.
What I said earlier:
This is still being prevented (weakly) by the client-side md5 hash. |
The real question here is not "md5 or not" it's "client-side hashing or not". Client side hashing has some drawbacks:
The argument in favor is that client side hashing provides protection against an evildoer with access to the haxelib server in the case of password reuse. I argue that this protection is minimal and does not justify the complications. Putting the threat of malware distribution aside: if you're concerned about (evil) server operators seeing passwords the proper solution would be (for example) the Secure Remote Password protocol or WebAuthn.
There are huge amounts of leaked databases online, we should assume any attacker trying this has found relevant hashes already.
Md5 is broken, especially combined with the lack of salting (which means rainbow tables can be used). |
So let's do this now?
Breaking changes are inevitable in the long term. It's just my proposal can provide an immediate improvement without a breaking change, benefiting everyone including haxelib users of both existing and future clients.
haxelib, both client and server side should migrate off Neko, which is officially deprecated. I do understand it would require more time and work, that's why I proposed having client-side md5 + server-side whatever as an immediate improvement.
I considered those but didn't mentioned since it's an even bigger change. For the record, I do believe elimilating password altogether is the right approach. For many many years, Debian has been doing that (packagers sign packages with public key). OPAM uses a Github repo for storing library info, you don't even need a dedicated "OPAM" account.
Yes, and many of them are leaked plain-text password, which is useful for generic dictionary attack, which is not preventable with server-side-only hashing. There is not much difference whether client-side md5 is used or not.
I know, but it's not a good reason to introduce a breaking change just to remove client-side md5 hashing. We can improve the current situation without a breaking-change. Later, after the immediate improvement, we can pick a even better solution at the cost of a breaking-change. |
Okay. Let's say we temporarily go with md5+argon. Since we do not want to keep md5 in the loop forever (right?), how do we then upgrade those passwords when we get rid of md5 on the client?
Again, client side hashing introduces technical complexity and slows down future algorithm upgrades for little gain. That said, updating the server to rehash passwords is better than doing nothing so given the lack of consensus here it's probably a good idea to do that as soon as possible. |
I agree with this. It's a required step anyway and the removal of the client-side MD5 hashing could be done at a later point. |
Yes, this is best to do immediately because without any server-side hashing, a database leak would immediately give direct access to all accounts. I've created a new pull request adapted from the old one: #619. The earthly builds are failing for me though, and it doesn't seem to be exclusive to my branch... (well, apart from the current error actually x) ) |
quoting @ibilon from HaxeFoundation/haxe.org#177:
The text was updated successfully, but these errors were encountered: