Simply put, I want to generate a random number between 1 and 100 from an MD5 hash.
I will always use an email string as the argument.
The reason is that I want to create a simple A/B test that picks a specific email layout depending on the number returned.
This is my solution, but I do not know if it is accurate, and frankly I'm not entirely sure I know what I am doing.
parseInt(crypto.createHash('md5').update('[email protected]').digest("hex"), 16) % 10**2
52 is returned
Could anyone lead me in the right direction and give a detailed explanation on what is happening? Is there a better way?
Thanks.
Asked Jul 13, 2020 at 14:05 by Benjamints (edited Jul 13, 2020 at 14:18)
- You want the same MD5 hash to always produce the same "random" number? – kindall Commented Jul 13, 2020 at 14:20
- How do you define "accurate?" – Robert Harvey Commented Jul 13, 2020 at 14:20
- @kindall Yes for that particular email, but each email will have its own random number which won't change, just like my example above. – Benjamints Commented Jul 13, 2020 at 14:27
- @RobertHarvey I think the above comment will answer your question. – Benjamints Commented Jul 13, 2020 at 14:28
- I think you should change the title. It doesn't seem like you really want a 'random' number, but you want to assign deterministic values to an email address in a pseudo-random way by using the hash, right? You're using a hash to 'scramble' the email address so you don't cluster similar email addresses with the same setting, right? – Jason Goemaat Commented Jul 13, 2020 at 15:12
3 Answers
If you use the seedrandom library, you can seed the number generator.
Seeding a number generator ensures that it always produces the same sequence of numbers when starting from "scratch".
It's great for auto-generated mazes and game levels (think level seeds in Minecraft).
Using this library, you could do something like the below to get consistent results.
var myrng = new Math.seedrandom('[email protected]');
console.log("int 1: " + myrng.int32());
console.log("int 2: " + myrng.int32());
console.log("int 3: " + myrng.int32());
console.log("int 4: " + myrng.int32());
console.log('round 2, repeating the above with same seed');
myrng = new Math.seedrandom('[email protected]');
console.log("int 1: " + myrng.int32());
console.log("int 2: " + myrng.int32());
console.log("int 3: " + myrng.int32());
console.log("int 4: " + myrng.int32());
console.log('round 3, repeating the above with a different seed');
myrng = new Math.seedrandom('#[email protected]');
console.log("int 1: " + myrng.int32());
console.log("int 2: " + myrng.int32());
console.log("int 3: " + myrng.int32());
console.log("int 4: " + myrng.int32());
<script src="//cdnjs.cloudflare.com/ajax/libs/seedrandom/3.0.5/seedrandom.min.js">
</script>
Your first part is creating the MD5 hash:
crypto.createHash('md5').update('[email protected]').digest("hex")
// e820bb4aba5ad74c5a6ff1aca16641f6
The MD5 hash produced is a 32-digit hexadecimal number. parseInt(hash, 16) parses that as an integer. That's too big to be represented exactly as an integer though (it is far beyond Number.MAX_SAFE_INTEGER), so you get a large floating point number. I'm not sure what exactly is going on in that conversion. Then you take the modulo 100, which gives a value from 0 to 99.
It would be a little easier to reason about if you take the first two hex digits of the MD5 to get a number from 0-255 and base your settings off that.
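A minimal sketch of that first-two-digits idea, assuming Node's built-in crypto module. The helper names, the sample address and the 128 cut-off are made up for illustration and are not from the question:

```javascript
const crypto = require('crypto');

// Hypothetical helper: map an email to a stable bucket in 0..255
// using only the first byte (first two hex digits) of its MD5 digest.
function firstByteBucket(email) {
  const hex = crypto.createHash('md5').update(email).digest('hex');
  return parseInt(hex.slice(0, 2), 16); // 0x00..0xff, small enough to be exact
}

// Hypothetical 50/50 split: buckets 0..127 get layout A, 128..255 get layout B.
function layoutFor(email) {
  return firstByteBucket(email) < 128 ? 'A' : 'B';
}

console.log(layoutFor('user@example.com'));
```

Because 256 is a power of two, an even split like this has no modulo bias; other ratios just need a different cut-off.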
Numbers in JavaScript are just 64-bit floats and hence are only good for about 15 decimal digits. Taking the modulus of values with ~39 decimal digits means the low-order bits are all effectively zero, so you'll get relatively sparse output, e.g.:
// Count how often each remainder 0..99 occurs for random floats up to 2**128.
const a = Array(100).fill(0)
for (let i = 0; i < 10000; i++) {
  const d = Math.random() * 2**128
  a[d % 100] += 1
}
Note that Math.random() * 2**128 is roughly equivalent to generating the hash of a random email. This gives me an a like:
[
409, 0, 0, 0, 408, 0, 0, 0, 408, 0, 0, 0,
398, 0, 0, 0, 420, 0, 0, 0, 434, 0, 0, 0,
356, 0, 0, 0, 401, 0, 0, 0, 398, 0, 0, 0,
423, 0, 0, 0, 346, 0, 0, 0, 397, 0, 0, 0,
406, 0, 0, 0, 378, 0, 0, 0, 429, 0, 0, 0,
410, 0, 0, 0, 421, 0, 0, 0, 358, 0, 0, 0,
389, 0, 0, 0, 363, 0, 0, 0, 398, 0, 0, 0,
398, 0, 0, 0, 426, 0, 0, 0, 396, 0, 0, 0,
430, 0, 0, 0
]
This indicates that only values divisible by 4 are possible, and hence 75 of your 100 values will never be used.
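The same effect can be seen on real digests. The sketch below compares the float-based modulo from the question with an exact BigInt modulo; the function names and sample addresses are invented for illustration:

```javascript
const crypto = require('crypto');

// What the original one-liner does: round the 128-bit value to a 64-bit float, then take % 100.
function floatMod100(hex) {
  return parseInt(hex, 16) % 100;
}

// Exact 128-bit arithmetic with BigInt, for comparison.
function exactMod100(hex) {
  return Number(BigInt('0x' + hex) % 100n);
}

// Made-up sample addresses, purely for illustration.
for (const email of ['alice@example.com', 'bob@example.com', 'carol@example.com']) {
  const hex = crypto.createHash('md5').update(email).digest('hex');
  console.log(email, floatMod100(hex), exactMod100(hex));
}

// Largest possible digest: the float version rounds up to 2**128 before the modulo.
console.log(floatMod100('f'.repeat(32)), exactMod100('f'.repeat(32))); // 56 55
```

The float column should only ever show multiples of 4, while the exact column can take any value from 0 to 99.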
As James K. Polk commented, taking the modulus is also slightly biased, but the above is a much bigger issue. I'd also second the suggestion of using division, as this keeps the high-order bits and maintains the entropy, something like:
digest = crypto.createHash('md5').update('[email protected]').digest("hex")
Math.floor(parseInt(digest, 16) / 2**128 * 100)
You can use something similar to the above loop to see that this gives a uniform distribution of outputs.
Note that both of the above generate values from 0 to 99, so you probably want to add 1 to the result.
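A rough way to check the uniformity claim for the division approach, as a sketch only. The helper name and the generated addresses are made up, and the Math.min clamp is an extra guard I added against the extremely unlikely case where the float rounds up to exactly 2**128:

```javascript
const crypto = require('crypto');

// Scale the 128-bit digest into 0..99 by division, then shift to 1..100.
function bucketByDivision(email) {
  const digest = crypto.createHash('md5').update(email).digest('hex');
  const scaled = Math.floor(parseInt(digest, 16) / 2 ** 128 * 100);
  return Math.min(scaled, 99) + 1; // clamp is a precaution, not part of the answer above
}

// Tally buckets for 10000 made-up addresses.
const counts = Array(100).fill(0);
for (let i = 0; i < 10000; i++) {
  counts[bucketByDivision(`user${i}@example.com`) - 1] += 1;
}

console.log(counts);
console.log('empty buckets:', counts.filter((c) => c === 0).length);
```

With this approach every bucket should receive roughly 100 hits, rather than three quarters of them staying at zero.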
Another way is to go with Node's BigInt type, something like:
digest = crypto.createHash('md5').update('[email protected]').digest()
Number(digest.readBigUInt64BE() / (2n**64n / 100n + 1n))
which avoids converting to strings and back again, but gets to basically the same answer.
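Wrapped up for the A/B use case, the BigInt version might look like the sketch below. It assumes a Node version where Buffer has readBigUInt64BE; the helper names and the percentage parameter are hypothetical:

```javascript
const crypto = require('crypto');

// Hypothetical helper: stable bucket in 1..100 from the top 64 bits of the MD5 digest,
// using the same divisor as the expression above.
function bucket1To100(email) {
  const digest = crypto.createHash('md5').update(email).digest();
  const top64 = digest.readBigUInt64BE(0);
  return Number(top64 / (2n ** 64n / 100n + 1n)) + 1;
}

// Hypothetical A/B switch: true for roughly `percent` percent of addresses.
function inTreatment(email, percent) {
  return bucket1To100(email) <= percent;
}

console.log(bucket1To100('user@example.com'), inTreatment('user@example.com', 50));
```

The same address always lands in the same bucket, so raising the percentage later only adds addresses to the treatment group and never moves existing ones out.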