I've been into reading random papers from SSRN lately. There's some really good stuff on there, like the paper I mentioned in my last post.
https://hq.ssrn.com/rankings/Ranking_display.cfm?TRN_gID=10
Sunday, July 7, 2019
Thursday, July 4, 2019
Sunday, June 30, 2019
The law of small numbers
I was listening to a podcast when I heard about an interesting probability result in the same vein as the Monty Hall Problem. The new problem is this: Flip a coin 100 times and record the results. Now pick a random flip in the set and see if the next 3 flips are all heads; if so, we call this a streak. Repeat until you find a streak of 3. Now what is the probability that the 4th flip, the one right after the streak, is also heads? Is it 50% like we would expect? It turns out to be closer to 46%, which is not very far from 50%, but is a clear and consistent bias.
You can download the paper here, and I recommend you read through the introduction, which is pretty easy to follow. I think it does a good job of explaining what is going on. Since no one will do that, here is a table from the paper which helps give some intuition.
This represents every possible outcome from flipping a coin 3 times and looking for a 'streak' of 1 heads. There are eight total possible outcomes, all equally likely. In the first two, the streak of 1 heads never happens, or happens only on the last flip, where there is no following flip to look at. Those are thrown away and ignored. In the other six possible outcomes we do get a streak at least once, and earlier than the last flip. The underlined flips represent the possible candidates for the flip that follows a streak. If we pick the preceding streak, then the underlined flip is the one we are trying to predict. In three out of the six outcomes with a streak, the following flip will not be heads. In two out of the six outcomes the following flip will always be heads. And in the remaining possible outcome it could be either heads or tails with 50/50 probability, depending on which streak you pick.
If you list out all the possible outcomes from any combination of streak length and total flips, you can see that some number of the heads flips are 'consumed' by the streaks themselves. Those flips can never be following a streak, because they are part of the run of heads needed to define the streak. On the other hand, the tails have no restrictions; they are all available to occur in the flip immediately following a streak. There are simply more tails available to go in the candidate position. The effect gets smaller as you decrease the streak length or increase the total number of flips in a set.
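To make the counting argument concrete, here is a minimal Python sketch that lists every possible sequence and averages the share of heads among the flips that follow a streak. This is my own illustration, not code from the paper or from my Go program, and the function name expected_continuation is made up for this example. It assumes overlapping streaks all count, and that sequences with no usable streak are thrown away, as described above.

```python
from fractions import Fraction
from itertools import product


def expected_continuation(n_flips, streak_len):
    """Average, over all equally likely sequences with a usable streak,
    of the share of post-streak flips that are heads.
    Returns None when no sequence has a usable streak."""
    total = Fraction(0)
    usable = 0
    for flips in product("HT", repeat=n_flips):
        # Candidate flips: positions right after streak_len heads in a row.
        followers = [
            flips[i]
            for i in range(streak_len, n_flips)
            if all(f == "H" for f in flips[i - streak_len:i])
        ]
        if not followers:
            continue  # no streak, or the only streak ends on the last flip
        usable += 1
        total += Fraction(followers.count("H"), len(followers))
    return total / usable if usable else None


print(expected_continuation(3, 1))  # the 3 flip, streak of 1 case from the table
```

For the 3 flip case this gives 5/12, or about 41.7%, which matches adding up the six usable rows by hand: (0 + 0 + 0 + 1 + 1 + 1/2) / 6. Brute force like this only works for short sequences, since the number of outcomes doubles with every extra flip.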
I found this very surprising, so I wanted to test it out. I wrote a Ruby script to simulate various coin flips and look for streaks of different lengths, and output the results. I then decided to rewrite it in a compiled language so it would be faster. I decided to try out Go, as I've never used it before and I was hoping for something with a bit more syntactic sugar than C.
https://github.com/StephenWetzel/coin-flips-go
Here are the results of a bunch of combinations of streak lengths and numbers of flips from the Go program:
Looking for a streak of length 1 in 10 total flips. Performed 10000 rounds, and 9973 were successful, found 45.29% continued the streak.
Looking for a streak of length 1 in 100 total flips. Performed 10000 rounds, and 10000 were successful, found 49.43% continued the streak.
Looking for a streak of length 1 in 1000 total flips. Performed 10000 rounds, and 10000 were successful, found 49.91% continued the streak.
Looking for a streak of length 2 in 10 total flips. Performed 10000 rounds, and 8203 were successful, found 38.16% continued the streak.
Looking for a streak of length 2 in 100 total flips. Performed 10000 rounds, and 10000 were successful, found 47.72% continued the streak.
Looking for a streak of length 2 in 1000 total flips. Performed 10000 rounds, and 10000 were successful, found 50.15% continued the streak.
Looking for a streak of length 3 in 10 total flips. Performed 10000 rounds, and 4797 were successful, found 34.88% continued the streak.
Looking for a streak of length 3 in 100 total flips. Performed 10000 rounds, and 9995 were successful, found 45.84% continued the streak.
Looking for a streak of length 3 in 1000 total flips. Performed 10000 rounds, and 10000 were successful, found 49.78% continued the streak.
Looking for a streak of length 4 in 10 total flips. Performed 10000 rounds, and 2152 were successful, found 35.83% continued the streak.
Looking for a streak of length 4 in 100 total flips. Performed 10000 rounds, and 9637 were successful, found 40.61% continued the streak.
Looking for a streak of length 4 in 1000 total flips. Performed 10000 rounds, and 10000 were successful, found 49.21% continued the streak.
Looking for a streak of length 5 in 10 total flips. Performed 10000 rounds, and 985 were successful, found 37.36% continued the streak.
Looking for a streak of length 5 in 100 total flips. Performed 10000 rounds, and 7860 were successful, found 38.66% continued the streak.
Looking for a streak of length 5 in 1000 total flips. Performed 10000 rounds, and 10000 were successful, found 48.91% continued the streak.
Looking for a streak of length 6 in 10 total flips. Performed 10000 rounds, and 388 were successful, found 35.82% continued the streak.
Looking for a streak of length 6 in 100 total flips. Performed 10000 rounds, and 5190 were successful, found 35.24% continued the streak.
Looking for a streak of length 6 in 1000 total flips. Performed 10000 rounds, and 9996 were successful, found 46.68% continued the streak.
Looking for a streak of length 7 in 10 total flips. Performed 10000 rounds, and 140 were successful, found 40.71% continued the streak.
Looking for a streak of length 7 in 100 total flips. Performed 10000 rounds, and 2997 were successful, found 33.83% continued the streak.
Looking for a streak of length 7 in 1000 total flips. Performed 10000 rounds, and 9761 were successful, found 42.40% continued the streak.
Looking for a streak of length 8 in 10 total flips. Performed 10000 rounds, and 52 were successful, found 36.54% continued the streak.
Looking for a streak of length 8 in 100 total flips. Performed 10000 rounds, and 1634 were successful, found 33.60% continued the streak.
Looking for a streak of length 8 in 1000 total flips. Performed 10000 rounds, and 8365 were successful, found 38.27% continued the streak.
Looking for a streak of length 9 in 10 total flips. Performed 10000 rounds, and 17 were successful, found 47.06% continued the streak.
Looking for a streak of length 9 in 100 total flips. Performed 10000 rounds, and 784 were successful, found 33.04% continued the streak.
Looking for a streak of length 9 in 1000 total flips. Performed 10000 rounds, and 6037 were successful, found 35.80% continued the streak.
Looking for a streak of length 10 in 10 total flips. Performed 10000 rounds, and 0 were successful, found NaN% continued the streak.
Looking for a streak of length 10 in 100 total flips. Performed 10000 rounds, and 381 were successful, found 30.71% continued the streak.
Looking for a streak of length 10 in 1000 total flips. Performed 10000 rounds, and 3615 were successful, found 33.91% continued the streak.
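For anyone who wants to try this without installing Go, here is a rough Python sketch of the same kind of simulation. It is not a port of the program linked above; the function name simulate and the details of how a streak is picked are my assumptions for this example. It treats overlapping streaks as separate candidates, only counts streaks that have a flip after them, and picks one candidate at random per round.

```python
import random


def simulate(n_flips, streak_len, rounds, rng):
    """Return (successful_rounds, share_that_continued) for one configuration."""
    successes = 0
    continued = 0
    for _ in range(rounds):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True = heads
        # Start index of every run of streak_len heads that has a flip after it.
        starts = [
            i
            for i in range(n_flips - streak_len)
            if all(flips[i:i + streak_len])
        ]
        if not starts:
            continue  # no usable streak in this round
        successes += 1
        pick = rng.choice(starts)
        if flips[pick + streak_len]:
            continued += 1
    # With zero successful rounds there is nothing to average, hence NaN.
    share = continued / successes if successes else float("nan")
    return successes, share


rng = random.Random()
for streak_len in (1, 3, 10):
    for n_flips in (10, 100):
        ok, share = simulate(n_flips, streak_len, 10000, rng)
        print(f"streak {streak_len} in {n_flips} flips: "
              f"{ok} successful, {share:.2%} continued")
```

A streak of 10 in 10 flips can never have a following flip, so that configuration always ends with zero successful rounds and a NaN percentage, just like in the output above.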
Labels:
Stuff I Wrote
Friday, May 31, 2019
Tuesday, April 30, 2019
Should You Time The Market?
https://ofdollarsanddata.com/even-god-couldnt-beat-dollar-cost-averaging/
You have 2 investment strategies to choose from.
- Dollar-cost averaging (DCA): You invest $100 (inflation-adjusted) every month for all 40 years.
- Buy the Dip: You save $100 (inflation-adjusted) each month and only buy when the market is in a dip. A “dip” is defined as anytime when the market is not at an all-time high. But, I am going to make this second strategy even better. Not only will you buy the dip, but I am going to make you omniscient (i.e. “God”) about when you buy. You will know exactly when the market is at the absolute bottom between any two all-time highs. This will ensure that when you do buy the dip, it is always at the lowest possible price.
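The two strategies can be sketched in a few lines of Python. This is only my reading of the rules quoted above, not the article's code: the function names and the toy price list are made up, it ignores inflation and dividends, and it assumes the omniscient buyer only buys at the bottom of dips that end in a new all-time high, keeping any leftover savings as cash.

```python
def dca_value(prices, monthly=100.0):
    """Invest the same amount every month; return the final portfolio value."""
    shares = sum(monthly / p for p in prices)
    return shares * prices[-1]


def dip_bottoms(prices):
    """Indices of the lowest price between each pair of consecutive all-time highs."""
    bottoms = []
    last_high = 0  # index of the most recent all-time high
    for i in range(1, len(prices)):
        if prices[i] >= prices[last_high]:
            if i - last_high > 1:  # there was a dip between the two highs
                dip = range(last_high + 1, i)
                bottoms.append(min(dip, key=lambda j: prices[j]))
            last_high = i
    return bottoms


def buy_the_dip_value(prices, monthly=100.0):
    """Save every month, but only invest (everything saved so far) at dip bottoms."""
    bottoms = set(dip_bottoms(prices))
    cash = 0.0
    shares = 0.0
    for i, p in enumerate(prices):
        cash += monthly
        if i in bottoms:
            shares += cash / p
            cash = 0.0
    return shares * prices[-1] + cash


# Hypothetical monthly prices, only to exercise the functions.
toy_prices = [100, 110, 90, 80, 95, 120, 115, 130]
print(dip_bottoms(toy_prices))
print(dca_value(toy_prices), buy_the_dip_value(toy_prices))
```

Eight made-up months say nothing about which strategy is better. The interesting comparison needs decades of real market data, which is what the linked article uses.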
Labels:
Links
Friday, March 15, 2019
Everything Smarthome
This is a long, but enjoyable article in broken Russian-English about everything smarthome in 2019.
https://vas3k.com/blog/dumbass_home/
Labels:
Links
Wednesday, February 27, 2019
Password strength
Dropbox has a password strength estimator called zxcvbn that I like a lot. It estimates the entropy in your password by looking for matches against dictionaries and leaked password lists. It has long bothered me when sites estimate password strength purely based on complexity. These sites say a password like Password!1 is much more secure than one like zbuwcramudbpvreorkno (a score of 72% vs 21% respectively). I discuss this in more detail in my How to be secure online post.
However, a while ago Dropbox changed their algorithm to favor length over resistance to dictionary attacks. There is some logic in their decision, but I really feel like something is lost by not having the old algorithm. So, I made a demo comparing the two so you can find passwords both algorithms agree are strong. At the same time, I finally hooked up this domain I bought a while ago to my github pages site.
Labels:
Stuff I Wrote
Thursday, January 31, 2019
Tuesday, December 25, 2018
How to Be Secure Online: The Blog Post
I've read a lot recently about some new types of attacks I wasn't aware of before. Most of these can be defended against pretty easily; it's just a matter of knowing the threats. I wanted to summarize some of the things everyone should be doing at this point, but most people aren't.
Use a password manager
At this point, you really should be using a password manager. You have to assume some of the sites you use will be breached in any given year, and when they are, the username and password you use there will be tried on other popular sites. The only way to be safe is to use different random passwords for every site. There is no way you can memorize random passwords for every site, even if you limit it to only the sites whose security you actually care about.
However, security isn't the only benefit of a password manager; it is also much more convenient. You can memorize one really good random password, with no restrictions on maximum length or allowed characters, and then use random passwords on every site. You'll never have to worry about password complexity restrictions, or being forced to change your password again. Just generate a new 30 character random password and let the password manager worry about keeping track of it.
I wrote about password managers in more detail here. If you just want the easiest path, then LastPass will work fine. I use KeePassXC, which is open source and offline. You have to copy the password file between computers and phones yourself, using something like Dropbox, or the open source Syncthing.
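Any decent password manager has a generator built in, so you shouldn't need to write your own. Just to show there is nothing magic going on, here is a minimal Python sketch of generating a 30 character random password. The function name random_password and the choice of alphabet are my own assumptions, not what any particular password manager does.

```python
import secrets
import string

# Letters, digits and ASCII punctuation: 94 characters in total.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_password(length=30, alphabet=ALPHABET):
    """Pick each character independently using a cryptographically secure RNG."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password())
```

The important part is using the secrets module rather than random, since random is not meant for security purposes.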
Use a long password
You should only need one or two passwords if you are using a password manager, so you can make them very strong. You should make your password very long, and not worry about complexity too much.
I've always been bothered by password strength estimators that score you based on complexity. A classic example of a bad password estimator is http://www.passwordmeter.com/
If I generate a random 20 character password, but one that consists of only lowercase letters, like xznmjetjsciqukhspaxv, passwordmeter.com gives that a score of 21% (weak). A 6 character random password like z&*4uV gets a score of 64% (strong), merely because it has lower case, upper case, digits, and special characters. Tacking on 2 more characters, z&*4uV.9, gets you to 100% (very strong). While that is an ok password, the 20 character one is much, much stronger, despite being all lower case. Even if the attacker knew that your password was all lowercase, there would still be over 10^28 possibilities (26^20). Trying every possible 6 character password, even with all 95 normal keyboard characters possible, is only about 10^12 possibilities (95^6). That makes the 20 character password tens of quadrillions of times more secure than the 6 character one. Even the 8 character one is trillions of times worse than the 20 character one.
Luckily, people are starting to wise up to how useless things like replacing o with 0 are. NIST has updated password guidelines that are a great summary of what restrictions should be on password systems. Password estimators like the one above used to be much more common, and even major companies used them. A long time ago I made my own password estimator, which attempted to replace common dictionary words and then figure out the number of possible combinations. However, Dropbox has a way better version of that called zxcvbn, named for the bottom row of letters on a keyboard. Using zxcvbn as a password would seem random to many estimators, but it isn't actually, and attackers were already trying keyboard patterns.
At some point, zxcvbn changed its algorithm for calculating entropy. I didn't like this change, so I made a page with both the new and old versions of it so you can compare the two.
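If you want to check the brute force numbers in this section yourself, the arithmetic is just alphabet size raised to the password length. Here is a small Python sketch; the function name search_space is made up for this example, and it assumes the attacker knows the length and character set and simply tries every combination.

```python
import math


def search_space(alphabet_size, length):
    """Number of passwords of exactly this length over the given alphabet."""
    return alphabet_size ** length


lower_20 = search_space(26, 20)  # 20 random lowercase letters
full_6 = search_space(95, 6)     # 6 characters from 95 keyboard characters
full_8 = search_space(95, 8)     # 8 characters from 95 keyboard characters

for label, n in [("20 lowercase", lower_20), ("6 mixed", full_6), ("8 mixed", full_8)]:
    print(f"{label}: about 10^{math.log10(n):.1f} guesses, {math.log2(n):.0f} bits")

print(f"20 lowercase vs 6 mixed: {lower_20 / full_6:.1e} times larger")
print(f"20 lowercase vs 8 mixed: {lower_20 / full_8:.1e} times larger")
```

Every extra character multiplies the search space by the alphabet size, which is why length wins so quickly over adding a few symbol types.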
Don't use SMS for 2 factor authentication
Don't use actual cell phone numbers with a traditional carrier, like Verizon, for 2 factor auth. It is quite easy, and increasingly common, to intercept SMS codes via SIM swapping attacks. All an attacker needs is your phone number; then they call your carrier and pretend to be you with a new phone and SIM card, and ask for your number to be ported to the new phone. Then they request a 2 factor auth code and it goes to the phone they have instead of yours.
If you are going to use 2 factor auth, you should use a hardware device like a Yubikey, or an app like Authy. If the service only supports SMS based 2 factor auth, then use a VoIP number like Google Voice, which can't be easily ported to a new carrier.
The worst part of this is that using plain SMS for 2 factor auth can make you less secure than no 2 factor auth, because an attacker attempting to social engineer their way into your account will be more believable if they have access to SMS codes being sent to them, versus if there is no 2 factor turned on. In some cases services allow you to reset your password using only your SMS phone number, so someone who knows your phone number, but not your password, can reset it and get into your account.
Freeze your credit
After the Equifax data breach it's safe to assume that if you have a credit history in the US, that history, including SSN and date of birth, was leaked. To open new accounts one typically only needs SSN, DOB and name. To prove your identity online you are sometimes asked security questions generated from your credit history (things like what bank was your car loan in 2015 with?). All those things were leaked.
A credit freeze simply adds a random PIN that will be needed to open new accounts, i.e., any time someone wants to do a hard pull of your credit with one of the reporting agencies, they will require you to lift the freeze using the PIN. Note that you can still use your existing accounts with the freeze in place; it's only opening new accounts that will be blocked. You can quickly and temporarily remove a freeze (called thawing) within a few minutes. See here or here for more info on how to freeze your credit.
When freezing your credit, make sure they use the word "Freeze" on the page. Be careful not to do any sort of credit monitoring or "locking"; those are paid services that are less effective than freezes. They will push those hard, both because they can charge for them, and because people freezing their credit restricts the agencies from doing whatever they want with your info. Worse still, if the monitoring is with a third party, they will require your SSN and other info to monitor your credit, giving your info to yet another database that will inevitably be leaked at some point.
Labels:
Links,
Stuff I Wrote


