http://xkcd.com/936/

Not worth creating a project for, and it might be interesting to see what changes people would make.

Non-standard dependencies:

  • words, for the dictionary
  • zsh (though this will probably work just fine with bash, too)
#!/usr/bin/zsh
# Author: @sxan@midwest.social
# 2025-02-23

final=(xargs echo)  # default final stage: pass the result through unchanged
count=6             # default number of words
while getopts d opt; do
	case $opt in
		d)
			final=(tr 'A-Z' 'a-z')  # -d: downcase the entire result
			;;
		*)
			printf "Password generator based on the correcthorse algorithm from http://xkcd.com/936//n/n"
			printf "USAGE: %s [-d] [#]\n" "$0"
			printf " -d  make the result all lower case; otherwise, each word will be capitalized.\n"
			printf " #   the number of words to include. Defaults to 6."
			exit 1
			;;
	esac
done
shift $(($OPTIND - 1))
[[ $# -gt 0 ]] && count=$1

# Draw 2·N candidates (stripping apostrophe suffixes creates duplicates),
# strip those suffixes and capitalize each word, then dedupe, pick N,
# and join into a single string. ("${final[@]}" rather than $final so the
# array also expands correctly under bash.)
shuf -n $((count * 2)) /usr/share/dict/american-english | \
	sed 's/'"'"'.*//; s/^\(\w\)/\U\1/' | \
	sort | uniq | shuf -n $count | xargs echo | \
	tr -d ' ' | "${final[@]}"
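
Assuming you save this as pony and make it executable, it runs like so (the output shown is illustrative, not real shuf output):

	./pony        # six capitalized words, e.g. CorrectHorseBatteryStapleCloudAnchor
	./pony -d 4   # four words, all lower case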

What’s going on here:

Nearly 30% of the American dictionary (34,242 words) contains apostrophes. They could be left in to help satisfy password requirements that demand “special characters,” but correcthorse isn’t an algorithm that handles idiot “password best practices” well anyway. So, since every word with an apostrophe has a pair word without one, we pull 2·N words to make sure we have enough. Then we strip the plural/possessive suffixes and capitalize every word. Then we remove duplicates and select our N words from the result. Finally, we compact that into a space-less string of words, and if the user passed the -d option, we downcase the entire thing.
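
Concretely, a single run might flow through those stages like this (the words are illustrative, not real shuf output):

	shuf -n 12 ...               ->  horse  dad's  dad  battery's  staple  correct ...
	sed (strip + capitalize)     ->  Horse  Dad  Dad  Battery  Staple  Correct ...
	sort | uniq                  ->  Battery  Correct  Dad  Horse  Staple ...
	shuf -n 6 | xargs | tr -d    ->  CorrectHorseBatteryStapleDad...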

Without the user options, this really could be a 1-liner; that’s how it started:

alias pony="shuf -n 12 /usr/share/dict/american-english | sed 's/'\"'\"'.*//; s/^\(\w\)/\U\1/' | sort | uniq | shuf -n 6 | xargs echo | tr -d ' '"
  • I don’t feel great about the 2n solution to apostrophes. You could just as well end up with 2n words with apostrophes, no? It’s not particularly robust.

    It doesn’t matter - the algorithm takes the stems, it doesn’t drop the words. “Dad’s” becomes “Dad”. If you get both “Dad’s” and “Dad”, you might indeed get a passphrase containing “DadDad” - but that’s not a weakness. Good randomness doesn’t include a guarantee of no duplicates. In fact, the uniq call reduces the quality of the passphrase: “DadDadDadDadDadDad” is a perfectly good phrase.

    But it’s a good catch in another way: I’d considered only plurals and possessives, but the American dictionary word file does indeed include many words with more than one apostrophe suffix. No word of more than one letter appears more than 5 times, so 5n would guarantee enough different words. But the best thing about your comment is that it exposes another weakness: the dictionary contains several 1-letter “words”, and one of them - “O” - has 25 variations with apostrophes. They’re all names: “O’Connell”, “O’Keefe”, etc. The next largest is “L” with 8 appearances: all borrowed words from French, such as “L’Amour”.
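
    If you want to reproduce those counts, a one-liner along these lines should do it (a sketch, assuming the Debian-style american-english word list):

    grep "'" /usr/share/dict/american-english | sed "s/'.*//" | sort | uniq -c | sort -rn | head

    It strips everything from the apostrophe on, then counts how often each stem appears.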

    I don’t see a simple solution to excluding names, although a tweak could ensure that we get no single letter words. However, maybe simplifying the algorithm would be better: simply grab N words and delete any apostrophes. You might end up with mush like “OBrianMustveHed”, but perhaps that’s not a bad thing.
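
    That simplified variant might look something like this (a sketch; the grep drops the 1-letter entries before shuf picks, so we still end up with all N words):

    grep .. /usr/share/dict/american-english | shuf -n 6 | tr -d "'" | xargs echo | tr -d ' '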

    Perhaps the best implementation would be the simplest:

    alias pony="shuf -n 6 /usr/share/dict/american-english | xargs echo | tr -d ' '
    

    Leave in the apostrophes; more random bits. Leave in the spaces, if they’re legal characters in the authentication program, and you get even more.
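
    And if spaces are legal, drop the final tr entirely:

    alias pony="shuf -n 6 /usr/share/dict/american-english | xargs echo"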

    • apotheotic (she/her)@beehaw.org

      Aaaaah I totally misunderstood why you were taking 2n. You were taking 2n in case the truncated string was the same as one you already had. Makes more sense now.