Saturday, October 17, 2015

Big Mac Index

The big mac index is a quick attempt at explaining purchasing power parity.  The idea being that simply knowing the exchange rate is not enough to know how valuable your money from one country will actually be when you try to purchase actual goods there.

Here is the fun part though:
Critics of the presidency of Cristina Fern├índez de Kirchner in Argentina and many economists believe that the government has for years falsified consumer price data to understate the country's true inflation rate. The Economist stated in January 2011 that Big Mac index "does support claims that Argentina’s government is cooking the books. The gap between its average annual rate of burger inflation (19%) and its official rate (10%) is far bigger than in any other country." That year the press began reporting on unusual behavior by the more than 200 Argentinean McDonald's restaurants. They no longer prominently advertised Big Macs for sale and the sandwich, both individually and as part of value meals, was being sold for an unusually low price compared to other items. Guillermo Moreno, Secretary of Commerce in the Kirchner government, reportedly forced McDonald's to sell the Big Mac at an artificially low price to manipulate the country's performance on the Big Mac index. In June 2012, the price of the Big Mac value meal suddenly rose by 26%, closer to that of other meals, after The Economist, The New York Times, and other media reported on the unusual pricing. A Buenos Aires newspaper stated "Moreno loses the battle".

Friday, October 16, 2015

How is NSA breaking so much crypto?
For the most common strength of Diffie-Hellman (1024 bits), it would cost a few hundred million dollars to build a machine, based on special purpose hardware, that would be able to crack one Diffie-Hellman prime every year.

Would this be worth it for an intelligence agency? Since a handful of primes are so widely reused, the payoff, in terms of connections they could decrypt, would be enormous. Breaking a single, common 1024-bit prime would allow NSA to passively decrypt connections to two-thirds of VPNs and a quarter of all SSH servers globally. Breaking a second 1024-bit prime would allow passive eavesdropping on connections to nearly 20% of the top million HTTPS websites. In other words, a one-time investment in massive computation would make it possible to eavesdrop on trillions of encrypted connections.

Friday, September 25, 2015

Volkswagen's Clean Diesel
Since 2009, we now know, Volkswagen had been inserting intricate code in its vehicle software that tracked steering and pedal movements. When those movements suggested that the car was being tested for nitrogen-oxide emissions in a lab, the car automatically turned its pollution controls on. The rest of the time, the pollution controls switched off.

This is old hat by now, but I thought this article did a good job of giving a bit more detail about the specifics of what actually happened.

Saturday, August 29, 2015

HTTP Status Cats API

Three years ago, I called out the proposed new TLDs as stupid. Today, I formally withdraw my oppositionIt turns out .cat is not one of the new TLDs so my opposition has been reinstated.

Monday, August 24, 2015

At the Institute for Marine Mammal Studies in Mississippi, Kelly the dolphin has built up quite a reputation. All the dolphins at the institute are trained to hold onto any litter that falls into their pools until they see a trainer, when they can trade the litter for fish. In this way, the dolphins help to keep their pools clean.

Kelly has taken this task one step further. When people drop paper into the water she hides it under a rock at the bottom of the pool. The next time a trainer passes, she goes down to the rock and tears off a piece of paper to give to the trainer. After a fish reward, she goes back down, tears off another piece of paper, gets another fish, and so on. This behaviour is interesting because it shows that Kelly has a sense of the future and delays gratification. She has realised that a big piece of paper gets the same reward as a small piece and so delivers only small pieces to keep the extra food coming. She has, in effect, trained the humans.

Her cunning has not stopped there. One day, when a gull flew into her pool, she grabbed it, waited for the trainers and then gave it to them. It was a large bird and so the trainers gave her lots of fish. This seemed to give Kelly a new idea. The next time she was fed, instead of eating the last fish, she took it to the bottom of the pool and hid it under the rock where she had been hiding the paper. When no trainers were present, she brought the fish to the surface and used it to lure the gulls, which she would catch to get even more fish. After mastering this lucrative strategy, she taught her calf, who taught other calves, and so gull-baiting has become a hot game among the dolphins.

Thursday, July 30, 2015

Crashes only on Wednesdays
Caught on yet?  On Wednesdays, and only on Wednesdays, if somebody manually twiddled certain bits in the monitor settings in a certain way, two events would occur during the same millisecond and caused the DB to throw an exception, and the error message that logged this would be exactly 81 bytes long including the null terminator, overflowing the 80-char buffer and causing the program to crash!

Friday, July 17, 2015

The View from the Front Seat of the Google Self-Driving Car
But we’re now driving enough — and getting hit enough — that we can start to make some assumptions about that real crashes-per-miles-driven rate; it’s looking higher than we thought. (Our cars, with safety drivers aboard, are now self-driving about 10,000 miles per week, which is about what a typical American adult drives in a year.) It’s particularly telling that we’re getting hit more often now that the majority of our driving is on surface streets rather than freeways; this is exactly where you’d expect a lot of minor, usually-unreported collisions to happen. Other drivers have hit us 14 times since the start of our project in 2009 (including 11 rear-enders), and not once has the self-driving car been the cause of the collision. Instead, the clear theme is human error and inattention. We’ll take all this as a signal that we’re starting to compare favorably with human drivers.

Wednesday, July 8, 2015

Concise electronics for geeks

This is a really good overview of analog electronics.

Wednesday, July 1, 2015

One dot per person for the entire US

This is a really good population map of the US, even ignoring that it has racial data too.

Tuesday, June 30, 2015

Randomness in Doom

Doom's source code is all online.  Someone posted a link to the RNG used, which is actually just returning a value from the same list of 256 numbers over and over in the same order:

unsigned char rndtable[256] = {
    0,   8, 109, 220, 222, 241, 149, 107,  75, 248, 254, 140,  16,  66 ,
    74,  21, 211,  47,  80, 242, 154,  27, 205, 128, 161,  89,  77,  36 ,
    95, 110,  85,  48, 212, 140, 211, 249,  22,  79, 200,  50,  28, 188 ,
    52, 140, 202, 120,  68, 145,  62,  70, 184, 190,  91, 197, 152, 224 ,
    149, 104,  25, 178, 252, 182, 202, 182, 141, 197,   4,  81, 181, 242 ,
    145,  42,  39, 227, 156, 198, 225, 193, 219,  93, 122, 175, 249,   0 ,
    175, 143,  70, 239,  46, 246, 163,  53, 163, 109, 168, 135,   2, 235 ,
    25,  92,  20, 145, 138,  77,  69, 166,  78, 176, 173, 212, 166, 113 ,
    94, 161,  41,  50, 239,  49, 111, 164,  70,  60,   2,  37, 171,  75 ,
    136, 156,  11,  56,  42, 146, 138, 229,  73, 146,  77,  61,  98, 196 ,
    135, 106,  63, 197, 195,  86,  96, 203, 113, 101, 170, 247, 181, 113 ,
    80, 250, 108,   7, 255, 237, 129, 226,  79, 107, 112, 166, 103, 241 ,
    24, 223, 239, 120, 198,  58,  60,  82, 128,   3, 184,  66, 143, 224 ,
    145, 224,  81, 206, 163,  45,  63,  90, 168, 114,  59,  33, 159,  95 ,
    28, 139, 123,  98, 125, 196,  15,  70, 194, 253,  54,  14, 109, 226 ,
    71,  17, 161,  93, 186,  87, 244, 138,  20,  52, 123, 251,  26,  36 ,
    17,  46,  52, 231, 232,  76,  31, 221,  84,  37, 216, 165, 212, 106 ,
    197, 242,  98,  43,  39, 175, 254, 145, 190,  84, 118, 222, 187, 136 ,
    120, 163, 236, 249

int rndindex = 0;
int prndindex = 0;

// Which one is deterministic?
int P_Random (void)
    prndindex = (prndindex+1)&0xff;
    return rndtable[prndindex];

int M_Random (void)
    rndindex = (rndindex+1)&0xff;
    return rndtable[rndindex];

void M_ClearRandom (void)
    rndindex = prndindex = 0;

There is an explanation of it here.  I find these kinds of hacks, used in 80s and 90s era games to save space and increase speed, interesting.

Here is a story of a guy that replaced all the values with the same number:
What does it play like? I tried two values, 0x00 and 0xFF. With either value, the screen "melt" effect that is used at the end of levels is replaced with a level vertical wipe: the randomness was used to offset each column. Monsters do not make different death noises at different times; only one is played for each category of monster. The bullet-based (hitscan) weapons have no spread at all: the shotgun becomes like a sniper rifle, and the chain-gun is likewise always true. You'd think this would make the super-shotgun a pretty lethal weapon, but it seems to have been nerfed: the spread pattern is integral to its function.

With 0x00, monsters never make their idle noises (breathing etc.) On the other hand, with 0xFF, they always do: so often, that each sample collides with the previous one, and you just get a sort-of monster drone. This is quite overwhelming with even a small pack of monsters.

Tuesday, June 16, 2015

Civilization 4 on Linux Via Wine

I have a week off, and figured I'd be productive by playing Civ continuously the entire time.  I've used Wine a bit, but not much with modern games like Civ 4.  It took me several hours, so I figured I'd document the process here for future reference.

To begin, get iso's of the game and two expansions.  Wine doesn't work well with multidisc installations, but if you have isos you can mount them all before you begin.  Also each disc is one independent installation.  I have the actual discs, but found it much easier to download them.

It is also easier to use Wine as 32 bit, instead of trying to get 64 bit to work.  Since I wasn't using Wine for anything important, I just deleted my ~/.wine directory.  I then ran export WINEARCH=win32 before I ran any wine command.  Export will save that variable in your terminal window until you close it, but with multiple terminals open I found it safer to just compulsively enter it.

Next, you'll need a shell script called Winetricks.  You run Winetricks with arguments of various dlls and other things you need to be manually overridden.  For Civ 4 you'll need msxml3 for sure.  I ran winetricks msxml3 msxml4 vcrun2003 quartz devenum corefonts lucida tahoma, because I found it somewhere on the internet.

Run winecfg and go to the Libraries tab.  Find the library called gameux, and add it to the list of overrides, and then edit it to be disabled.  I also found that Wine did not do a good job of guessing my graphics RAM, so I had to run wine regedit.exe and add the key HKCU>Software>Wine>Direct3D>VideoMemorySize and set the value to 512.

Now, mount your 3 isos.  Navigate to the directory containing the installs and run them via Wine (make sure you've exported WINEARCH=win32 first).  The installs were straightforward.  After you've installed Beyond the Sword, download the latest patch (3.19) and install that via Wine.  The expansions will install the patches for the vanilla game, and the latest patch makes it no-cd so no need to download cracks.

You can navigate to ~/.wine/drive_c/.  If you see "Program Files (x86)" it means you messed up with your export WINEARCH=win32's and installed the game as 64 bit.  It may still work, but if not delete ~/.wine and try again.

The game is at: "~/.wine/drive_c/Program Files/Firaxis Games/Sid Meier's Civilization 4/Beyond the Sword/Civ4BeyondSword.exe".  Run it via Wine.  After it starts, you can exit and then find the config file at "~/My Games/Beyond the Sword/CivilizationIV.ini".  If you are having trouble you can set FullScreen = 0 and find ScreenHeight and ScreenWidth and set them low.  I have no trouble with full screen 1920x1080 though.  Set NoIntroMovie = 1, AutoSaveInterval = 1, DisableFileCaching = 1, DisableCaching = 1, ModularLoading = 1.

After that you should be good.  The game runs quite well, I've only noticed some slight graphics errors on some of the 3d animated leaders' faces.  I did have crashing sometimes when attempting to open certain advisor windows (military advisor seemed most common).  The Windows error message recommended I lower graphics settings and so I did from high to medium and haven't had a crash since.

Wednesday, June 10, 2015

The Making of Lemmings

Mike Dailly had seen tiny 5-pixel high sprites in games like Oids, a popular Atari ST shooter where the player’s ship rescued little android slaves, and thought that somewhere between this and a 16×16 sprite would be a sweet spot – where the small size made the Walker look big by comparison, but the animations were still good enough to impart character. One lunchtime he made an image of little men being crushed by weights, and shot by a laser gun – everyone loved it, and Gary Timmons added a few more traps. While everyone was laughing, Russell Kay was the first to say ‘There’s a game in that!’

Friday, June 5, 2015

How Tesla Will Change The World

I asked him what it was like to come to Tesla after having spent years at more established car companies. He described the difference like this: “A company like GM is a finance-driven company who always has to live up to financial expectations. Here we look at it the other way around—the product is successful when it’s great, and the company becomes great because of that.” (This mirrored what Musk had told me earlier in the day: “The moment the person leading a company thinks numbers have value in themselves, the company’s done. The moment the CFO becomes CEO—it’s done. Game over.”) Von Holzhausen went on, saying, “Another difference is that at other companies, engineering comes first—a design package is prescribed on the designer and they’re told to make it beautiful. At Tesla, design and engineering are assigned equal value, and Elon keeps them opposed to each other.” Now that von Holzhausen has gotten used to his freedom to be obsessed with the product at Tesla, he says he “would dread to go back to pre-historic ways.”

Monday, April 20, 2015

Pizza Calculator

A while ago made a table of the common pizza sizes so that we could better compare pizzas.  The problem with that was that it was difficult to compare deals of multiple pizzas of different sizes.

I made a quick javascript calculator that lets you enter the number of pizzas, the size, and the total price and then tell you what the price per standard pie or slice is.

I expect this to be quite popular at parties, as all good parties have the phrase "just go to my github page" at some point.

Sunday, April 19, 2015

The Star Wars Saga: Introducing Machete Order
Next time you want to introduce someone to Star Wars for the first time, watch the films with them in this order: IV, V, II, III, VI

As I mentioned, this creates a lot of tension after the cliffhanger ending of Episode V. It also uses the original trilogy as a framing device for the prequel trilogy. Vader drops this huge bomb that he's Luke's father, then we spend two movies proving he's telling the truth, then we see how it gets resolved. The Star Wars watching experience gets to start with the film that does the best job of establishing the Star Wars universe, Episode IV, and it ends with the most satisfying ending, Episode VI. It also starts the series off with the two strongest films, and allows you to never have to either start or end your viewing experience with a shitty movie. Two films of Luke's story, two films of Anakin's story, then a single film that intertwines and ends both stories.

Tuesday, April 7, 2015

The Theory of Interstellar Trade

This paper extends interplanetary trade theory to an interstellar setting. It is chiefly concerned with the following question: how should interest charges on goods in transit be computed when the goods travel at close to the speed of light? This is a problem because the time taken in transit will appear less to an observer traveling with the goods than to a stationary observer. A solution is derived from economic theory, and two useless but true theorems are proved.

Monday, March 23, 2015

Bowie Bond
An asset-backed security;which uses the current and future revenue from albums recorded by musician David Bowie as collateral. The 25 albums a Bowie bond uses as their underlying assets were recorded prior to 1990. David Bowie used the proceeds from the bond sale to purchase old recordings of his music. In creating the bonds, he ultimately forfeited royalties for the life of the bond (10 years).

Friday, March 6, 2015

Shakespeare Programming Language

Actual code:

The Marvelously Insane FizzBuzzJazz Program.

Lady Capulet, an old bossy woman that loves to count.
The Archbishop of Canterbury, an old fart who adores to spit out letters.

          Act I: The only one of them.

          Scene I: The Archbishop of Canterbury is a bastard.

[Enter The Archbishop of Canterbury and Lady Capulet]

The Archbishop of Canterbury:
 You are nothing!

          Scene II: Count, Lady Capulet, count.

The Archbishop of Canterbury:
 You are as beautiful as the sum of yourself and a cat!

Lady Capulet:
 Am I worse than the square of the product of the sum of a warm gentle flower and a rose
 and my pretty angel?

The Archbishop of Canterbury:
 If not, let us proceed to Scene VIII.

          Scene III: Fizzing to no end!

The Archbishop of Canterbury:
 Is the remainder of the quotient between yourself and the sum of a happy cow and a
 chihuahua as good as nothing?

Lady Capulet:
 If not, let us proceed to Scene IV. Thou art as handsome as the sum of the sum of
 the sweetest reddest prettiest warm gentle peaceful fair rose and a happy proud kindgom
 and a big roman. Speak thy mind!

 Thou art as fair as the sum of thyself and a honest delicious cute blossoming peaceful
 hamster. Thou art as cunning as the sum of the sum of an embroidered King and a horse
 and thyself. Speak thy mind!

 Thou art as amazing as the sum of the sum of a good happy proud rich hero and a hair and
 thyself! Speak thy mind.

 Speak your mind!

          Scene IV: Buzz me up, Scotty!

The Archbishop of Canterbury:
 Is the remainder of the quotient between yourself and the sum of a gentle happy cow and a
 chihuahua as good as nothing?

Lady Capulet:
 If not, let us proceed to Scene V. Thou art as handsome as the sum of the sweetest reddest
 prettiest warm gentle peaceful fair rose and a small town. Speak your mind!

 You are as prompt as the sum of yourself and a big healthy peaceful fair rich kingdom. You
 are as loving as the sum of the sum of an embroidered King and a horse and thyself. You are
 as amazing as the sum of yourself and a cute fine smooth sweet hamster. Speak your mind!

 You are as prompt as the sum of the sum of yourself and an amazing cunning Lord and a hair.
 Speak your mind.

 Speak your mind!

          Scene V: Milady, there is jazz in my pants.

The Archbishop of Canterbury:
 Is the remainder of the quotient between yourself and a proud noble kingdom as good as

Lady Capulet:
 If not, let us proceed to Scene VI. You are as charming as the sum of the sum of a noble
 cunning gentle embroidered brave mighty King and a big warm chihuahua and an amazing pony!
 Speak your mind!

 You are as prompt as the sum of yourself and a big black sweet animal. You are as noble as
 the sum of the sum of a gentle trustworthy lantern and yourself and a hog. Speak your

 You are as bold as the sum of the sum of yourself and a good delicious healthy sweet horse
 and my smooth cute embroidered purse. You are as peaceful as the sum of a flower and
 yourself. Speak your mind.

 Speak your mind!

The Archbishop of Canterbury:
 Let us proceed to Scene VII.

          Scene VI: Output or die!

The Archbishop of Canterbury:
 Open your heart!

          Scene VII: Oh, to jump the line.

Lady Capulet:
 You are as handsome as the sum of a proud noble rich kingdom and a rural town. Speak your
 mind! You are as gentle as the sum of the sum of yourself and a green mistletoe and my
 father. Speak your mind!

The Archbishop of Canterbury:
 We must return to Scene II.

          Scene VIII: Goodbye, cruel world!


Tuesday, March 3, 2015

What ISIS Really Wants
It’s hard to overstate how hamstrung the Islamic State will be by its radicalism. The modern international system, born of the 1648 Peace of Westphalia, relies on each state’s willingness to recognize borders, however grudgingly. For the Islamic State, that recognition is ideological suicide. Other Islamist groups, such as the Muslim Brotherhood and Hamas, have succumbed to the blandishments of democracy and the potential for an invitation to the community of nations, complete with a UN seat. Negotiation and accommodation have worked, at times, for the Taliban as well. (Under Taliban rule, Afghanistan exchanged ambassadors with Saudi Arabia, Pakistan, and the United Arab Emirates, an act that invalidated the Taliban’s authority in the Islamic State’s eyes.) To the Islamic State these are not options, but acts of apostasy.

Saturday, February 21, 2015


Friday, February 20, 2015

Why the Euro will ultimately fail
This disaster was entirely avoidable if the Euro bureaucrats had bothered to read Robert Mundall and Marcus Flemming’s seminal paper from 1962 which stated that according to well established macro economic models it was impossible to have domestic fiscal autonomy, fixed exchange rates, and free capital flows: no more than two of those objectives could be met. They won the nobel prize in economics for this in 1992, so it’s not exactly an obscure crackpot theory.

Since the euro is, by definition, the currency used in the eurozone the exchange rates must be fixed. One euro in Greece is the same as one euro in Germany. The same goes for free capital flows, if you have one euro in Spain and can’t spend it in Germany the eurozone doesn’t make much sense. So if it has to work the Euro zone members must give up their fiscal autonomy. What does this mean?

Saturday, February 14, 2015

A Tale of Two Zippers

This is an interesting blog post about how zippers are made, and the tiny difference between automated and non automated zippers.  He has some other good stuff on the blog too.

dtrx: Intelligent archive extraction
dtrx stands for “Do The Right Extraction.” It's a tool for Unix-like systems that takes all the hassle out of extracting archives.
This works really well for just extracting every kind of archive on Linux.  It has good options for things like creating a folder if the files inside are all at the top level (so you don't end up with 100 files in whatever folder you are working in).  Works well as a shortcut in whatever file manager you are using.

Sunday, February 1, 2015

Showcase your language one vote at a time

If you hang out on Stack Overflow at all you've probably seen this, as it's been near the top of hot questions for weeks, but I figured I'd post it as I've found it very interesting.

The way it works is people posted a language along with an interesting fact.  Then as people upvote those languages people could edit them with examples of interesting code snippets of a length equal to the upvotes.  In other words, if a language had 4 upvotes there would be 4 examples of interesting code snippets, of lengths 1, 2, 3, or 4 characters.

Saturday, January 17, 2015

Horror Hotel

Jeff pointed out that the song Horror Hotel sounds like Orajel.  So, I made this.

There's also the full speed version:

Sunday, December 28, 2014

Sunday, December 7, 2014

Best of Youtube

Recently I had the idea to go through my chat logs and grab youtube links and use the most frequently sent ones as a best of.  Well I crafted a series of commands piped into each other to go through the logs and count the youtube links and then output a list with counts.

ack-grep "youtube\.com\/watch\?v\=([\w-]+)" --nogroup -oh | uniq | sort | uniq -cd  | sort -rn -k1,1 > ~/

The list was kind of disappointing in that it's 90% music.  On the plus side all the songs are great.  I was going to write something to grab the titles for this list, but I like the surprise better.

29 SMWi7CLoZ2Q
17 oHg5SJYRHA0
13 QH2-TGUlwu4
7 wusGIl3v044
7 DhaRkWfaq10
6 ZZ5LpwO-An4
6 SiQmzz_MjsU
6 qpl5mOAXNl4
6 j5C6X9vOEkU
6 flMS2gHFOH0
5 XUd4Cbc49mg
5 LatorN4P9aA
5 jItz-uNjoZA
5 EF8GhC-T_Mo
5 d56V-mmBoow
5 4r7wHMg5Yjg
4 zT3iviHrhUI
4 Sq3eLdixvCc
4 sP4NMoJcFd4
4 r-lxwlgyhhA
4 oY59wZdCDo0
4 oc-P8oDuS0Q
4 M11SvDtPBhA
4 laKprX-HP94
4 bLHc-yIAPbg
4 7YQ1PhOnM3I
4 14btYfUBpoI
3 y5CVAKoavBk
3 xRkA6zugNMQ
3 X_I4wtNPv5w
3 xaPepCVepCg
3 vWz9VN40nCA
3 -u-HCHCuHMg
3 Sf8cM7f6P2I
3 rBllejn5fVA
3 QFm87zk_t3g
3 Oqq6jppAYFo
3 lwESraWEpSU
3 lfF1rT0Iztk
3 Kw5oJoUYTb8
3 KiQzUEc_FmI
3 jyTpk6xyzzk
3 J---aiyznGQ
3 I9lmvX00TLY
3 FbNyO2eti2c
3 ewvSbgXFLo4
3 eDI6kddtv04
3 dJoo7Tgjr8U
3 dFypAB7nYGA
3 -CCbnpMCdds
3 bXjNx9siNwI
3 a6_R8GPgLJo
3 6QZD_niSlK8
3 524Tf0dNRNw
3 4gpNqB4dnT4
3 3iUFZvaI3wE
3 _1nzEFMjkI4
3 0h0J1sphPu4
3 00aV3Yu05oI

Sunday, November 9, 2014

How do computers work?


If you do a google search for "how do computers work?" you'll find that it's a pretty common question.  You'll get results like howstuffworks.  These explanations tend to be pretty high level.  They're stuff I've understood for over a decade from just using computers heavily, and I expect most other nerds know as well.  I understand that a computer has a BIOS which loads the operating system which then runs programs.  I understand how you can translate a human's wishes into lines of computer code.  But how do those lines of code actually perform calculations using electrons in the hardware?  That is the question I wanted answered.

I've seen hardware explanations before, mostly at informal levels which usually meant they were quite short and focused on one particular aspect.  As I work my way through a computer engineering degree, one day it occurred to me that I largely had the the ability to answer this question.

I decided to write out an outline for how a computer worked on a hardware level, mostly as a summary for my own gratification (and to identify gaps).  After I was done I figured that I might as well get one good blog post out of my degree.

My aim will be to explain how something works and then explain how we can build the next level up with that.  Keep in mind this won't always correspond to how things are actually built in practice, as it is often much more complex.  My goal is just to show how you could build it if you weren't concerned with performance and price.

What you should already know

I'll be writing this for an audience that has at least a bit of exposure to programming, and for who the above howstuffworks article was largely a review.  If you don't know anything about programming it won't matter too much, as we won't be getting to anything like that for quite a while.  For the very beginning I'll assume you're at least mildly familiar with chemistry and electricity.  I have to start at some level, and frankly don't feel comfortable explaining things at a lower level than this.

Before we begin, I'd also like to remind my readers that I have no idea what I am talking about, and am probably wrong about everything here.


Wikipedia says "A semiconductor is a material which has electrical conductivity between that of a conductor such as copper and that of an insulator such as glass.", this definition obviously follows from the name, but it never really explained things to me.  It would seem that a semiconductor is just a resistor, which clearly isn't true.  

The key fact about semiconductors is that their conductivity can be changed easily.  Instead of just having a resistance which might vary slightly with temperature, a semiconductor can go from being an insulator to a conductor with the application of an electric field.

How do semiconductors work?

Silicon is a semiconductor.  Look at a periodic table, and you'll see that silicon sits 4 elements away from the left, and 4 elements away from the right (discounting the noble gases).  This means silicon has an equal number (4 specifically) of electrons and empty electrons spots in its outer shell, and therefore, bonds readily with itself.  The result is that silicon forms a nice, stable, crystalline structure.

However, this structure can be quickly messed up by even slight impurities.  Phosphorus is to the right of silicon, and has 5 electrons in its outer shell (ie, 5 valance electrons).  Putting a tiny bit of phosphorus in some otherwise pure silicon will create a crystal that has the occasional extra electron which doesn't really have a proper home.  On the other hand, boron is to the left (and up but ignore that) and has 3 valance electrons.  Putting some boron in silicon will create "holes" where there would otherwise be electrons but aren't.  It turns out that while pure silicon crystal doesn't really like to let its electrons move around (ie electrical current), if you have some of these extra electrons or holes they will move around with a little motivation.

The process of adding a bit of impurities to silicon is called doping.  Doping silicon with phosphorus will give a n-type semiconductor.  N for negative since electrons are negative and that is what we have a surplus of.  Dope with boron and we have p-type, for positive since the absence of electrons is usually viewed as positive.

I feel compelled to note here that in all cases the overall charge is neutral (ie, there is equal positive and negative charges).  This is because when we add electrons we are also adding protons in the nuclei.  But protons don't move, and we are only concerned with the sea of electrons and how the number deviates from what we would expect from pure silicon.

The Transistor

The transistor is the basis for all modern computers.  You could make a strong case for it being the most important invention of the 20th century.  The transistor is made possible by semiconductors.

If you take some p and n type semiconductors and layer them, the excess electrons in the n type will drift towards the holes in the p type.  An equilibrium is reached when the negative charge of the extra electrons taking up residence in the p type semiconductor repels further electrons from coming over to fill more holes.  If we now apply a positive voltage to the p-type layer that will give those extra electrons a place to go (they are attracted to the positive charge and will flow towards it).  With those electrons gone, more electrons will flow from the n-type layer.

Sandwich the p-type layer between two n-type layers and when the voltage is applied to the middle p-type layer a current will be able to flow from the one n-type end to the other n-type end.  In addition to applying a voltage at the middle layer, we also apply a voltage across the entire 3 layers, and that causes electrons to flow through the entire thing.

The end result is two types of transistors.  They are called the npn and pnp type.  As you can probably guess, the npn type is n-type semiconductor on either end and p-type in the middle, and the pnp is the opposite.  The npn type needs a positive voltage applied in the middle in order to conduct, and the pnp type needs a negative voltage.

This figure, which I stole from here, shows the two types of transistors.  On the left are the n and p layers.  The right shows the electronics symbol used for both.  Note that the arrow points in for pnp (Pointing iN Permanently), and points out for npn (Not Pointing iN).  The base is where the voltage is applied to control the transistor.  We are ignoring the emitter and collector; just know that they are connected when the correct voltage (negative or positive) is applied to the base.

We now have a transistor which will normally not allow current to pass through it (ie it is an insulator), however, when a voltage is applied to the middle, then current will flow through it (ie a conductor).

For the rest of this post we can think of transistors as electrically controlled switches.  Normally they are open, but apply the correct voltage and they close and allow current to pass.

If this hasn't been clear, and you want a better overview of semiconductors and transistors this video is very well done:

Logic Gates

What are logic gates?

In a digital system, a high voltage (maybe 5 volts) will represent a 1, and a low voltage (close to 0) will represent a 0.  We need to be able to take these voltages and perform logical operations based on them.  Gates are devices that can be thought of as taking a number of inputs and producing a single output based on some simple rules.

As a simple example, an OR gate might take 2 inputs, and will output a 1 if any of the inputs are a 1, and produce a 0 if all the inputs are 0.  AND gates will produce a 1 if all the inputs are 1, and a 0 if any of the inputs are 0.  There is also a single input NOT or inverter gate that will just toggle the input bit, ie, it will output a 1 if the input is 0 and will output a 0 if the input is 1.

Each of the AND and OR gates have inverted versions that are the same but with a NOT after them, which are called NAND and NOR respectively.  I think NOR is interesting because it works the way its English usage would imply.  If neither the inputs are 1 then it will be 1, if either of them are 1 then it will be 0.  This corresponds to a OR followed by a NOT.

There is also the XOR or exclusive or, which will produce a 0 if the inputs are the same, and a 1 if they are different.

Each gate is represented by a symbol.  Note that the NOT gate is sometimes represented as just the bubble without the triangle.  With that in mind, the rest of the inverted gates are literally just the original symbol followed by a NOT bubble.

How to build logic gates with transistors

Gates can be thought of as just black boxes, taking inputs and giving outputs.  However, my aim to to connect the dots from electrons to code, so I will explain how we can make these gates using transistors as switches.

It turns out there are multiple ways to make gates, but what I am aiming for is more showing that it is possible rather than explaining how it is done in practice.

An AND gate is quite easy to build using some NPN transistors that we will treat as switches that connect when a positive voltage (ie, a logical 1), is applied to the middle.  Since we want a 1 at the output when there are two 1s at the inputs we need to create a situation where the output is normally a 0 but will be connected to a 1 when there are two 1s present.

This figure shows the above situation.  The inputs are A and B, applying a positive voltage to them will cause both NPN transistors to allow current to flow.  If they both have a 1 as an input then the 6V at the top will be directly connected to the output, this will mean there is 6V at the output and that will represent a 1.  The 4.7K resistor at the bottom is a pull-down resistor and is needed so that the 6V drains away when the output is disconnected from the 6V source.  4.7K is rather high resistance and it will have little effect when the output is connected to the voltage source.

In an OR gate we just have to add a bypass to each transistor so that if either has a 1 it will connect the source to the output.  A NAND is made by moving the output to the top (along with the 4.7K resistor which is now a pull-up resistor).  If you want to see them the above image came from this site.  This site shows an alternate method for constructing the gates from transistors.

In practice, a NAND gate is the fastest, and cheapest gate to produce.  All other gates can be made from NAND gates.  This Wikipedia article pretty much just gives examples of how this is done if you are interested.

Using gates to do things

So we can make these logic gates using transistors, but what does that get us?  We know we can AND or OR things, but how do we perform more complex logic using gates?

The basis for designing a circuit with gates is a truth table.  A truth table is just a listing of every possible combination of inputs along with the desired output.  In the above figure that shows the gate symbols, there is a truth table next to each gate.

There is also an algebraic way to represent logic gates.  OR is + and AND is *, NOT is a bar over top or a apostrophe afterwards.  For example if we have two inputs named A and B, A+B = Output is A OR B.  AB + BC = Output is (A AND B) OR (B AND C). 


Here is a truth table for 4 binary inputs.  There are `2^4 = 16` combinations.  The output is true only when the input is 3, 7, or 15.  We can create an algebraic representation using a technique called sum of products.

We will default to an output of 0.  Then, in each case where we have a 1 as an output, we will force a 1 instead.  We need to determine if we have either a 3, 7, or 15.  If any of those are the case we must output a 1, which is a perfect use for an OR gate.  To test if one specific condition is met (eg, if we have a 3) we test if the each correct bit is given as an input.  For 3 that means we need to test for 0011.  We do this by inverting the first two inputs, and then feeding all 4 into an AND gate.  That AND gate will output a 1 if and only if the input is exactly 0011.  Repeat that process for the other conditions and we have 3 AND gates, which are then fed into an OR gate.  The output of the OR gate is the output of the system.

I won't go into more detail than this, as this post is already way too long, and going into way more detail than is needed.  If you want to know more this is called digital logic design.  I made a truth table generator that uses algebraic input if you want to play around with this stuff.

Two's Complement

I will discuss how to make a binary adder using logic gates, but first I must take a detour and explain how negative numbers are represented in binary.  You can probably skip over this section if you are growing weary of all this.

You are probably aware of how positive numbers are represented in binary, if not refer to the above truth table.  If we only ever have to store positive numbers this is all well and good, but how should we store negative numbers?  A simple method would be to use one bit to represent the sign.  Since positive is the norm, we make 0 = positive, and 1 = negative.  We can then say the leftmost bit (AKA most significant bit) is the sign bit.  We lose one bit's worth of data, so our maximum value is cut in half, but that is the trade off for negative numbers.

There is a problem with this simple method though.  It has a value that represents -0.  With 4 bits this would be `1000_2 = -0_10`.  This isn't that big of a deal at first glance.  -0 is just 0, so we have two ways to represent 0.  The problem is that we can no longer simply increment from -1 to 0 to 1.  Instead, we need complex logic to detect if we are crossing from -1 to -0, and to instead jump to 0.  There is also the problem that adding these numbers doesn't work the same if there are negative numbers involved, and handling them is much more complicated.

If you have programmed a bit you are probably aware that if you increase a value that is already at its maximum value it will rollover and become the minimum value.  With 4 unsigned bits the math is: 1111 + 1 = 0000.  But, in the case of 4 signed bits it is: 0111 + 1 = 1000.  Here, 0111 represents 7, but 1000 would be the -0 value we introduced with our simple signed system.  You should be aware that this is not what actually happens.  The result will be -8, and it will increment normally from there.  This is because `1000_2 = -8_10`, but how?

The system actually used is called two's complement.  Positive numbers are stored exactly the same as above.  For negative numbers the procedure is this: Take the binary representation of its positive value, invert all the bits, and then add 1.  For example, `3_10 = 0011_2`, flip the bits to be `1100`, then add 1 to get `1101_2 = -3_10`.  For -8 we take 1000, flip them to 0111, then add one to get 1000.  This process can be repeated to go from negative back to positive.

0111 1111127
0111 1110126
0000 00113
0000 00102
0000 00011
0000 00000
1111 1111−1
1111 1110−2
1111 1101−3
1111 1100−4
1000 0010−126
1000 0001−127
1000 0000−128

I know this seems pretty crazy, but it makes the math work out much nicer.  Addition works the same regardless of if the numbers are positive or negative.  I won't talk about this more, and if you want you can pretty much forget about it.  If you want to play around with it more, here is a two's complement converter.

Adding with gates

Now we will make an adder using logic gates.  The first step is adding two 1 bit numbers, and getting a single 1 bit number as the output.  This is just an XOR gate, as 0+0=0, 1+0=1, and 1+1=10, and we ignore the first 1 so the output is just 0.  The problem is we can't just ignore that bit, so we actually need 2 bits of output.  That extra bit is called the carry bit.  Since we will be adding multiple pairs of bits down the line we will need to input that carry bit into the next stage.  The result is 3 inputs and 2 outputs.

This is called a full adder.  The inputs are A0 and B0, Carry 0 would be the result of the previous stage's carry bit, or 0 if this is the first stage.  The output is Out 0, and the carry which is fed to the next stage.  If we want to combine these stages it's actually quite easy.  We just need to connect the carries and we can run as many bits as we need to.

We can add one more gate per stage and get subtraction out of the same hardware.  We create a subtraction bit; it is set to 0 for addition, and 1 for subtraction.  We feed that bit into an XOR gate along with the B bit.  If the subtraction bit is 0 the XOR will do nothing, just pass B along as it was before.  If subtraction is 1 then it will flip B.

You may note this is pretty much exactly how two's complement works, which makes sense as subtraction is just adding a negative number.  But wait, don't we need to also add 1 for two's complement?  Yes we do.  We do this by also feeding the subtraction bit in as the very first carry bit.  When that subtraction bit is 0, nothing changes, but when it is 1 it is the same as adding 1 to the result, and that completes the two's complement process.

Here is an example of a 4 bit adder-subtracter.  I added the numbers above and below to make the process of mental math easier.  If that bit is set to 1, then add that value to the decimal number, however, remember that the leftmost bit is the sign bit and that if it is 1 you need to convert.  The brighter lines are the ones that are carrying a logical 1, and the dark lines have a logical 0.

The picture shows 5 - 3 = 2.  Note the subtraction bit is set to 1 in the top left.  Also note that some of the carry bits are set to 1 along the bottom, including the last bit, however we do not use the carry bits outside the adder.  To check for overflow we look to see if the last two carry bits are the same via an XOR gate.

This is only 4 bits, and while it is complicated at first, it isn't really that bad.  It would seem simple to extend this to be 32 or 64 bits, and indeed it would be simple to do.  The problem is that each stage has to wait for the previous stage to finish before it can start.  As you add stages you increase the time the adder takes dramatically.  This is called a ripple carry, and is only used to demonstrate how carry can work in adders.  In practice we actually use look-ahead adders, which use some extra hardware to calculate the carries before the earlier stages of the adder produce them.  This increases hardware cost, but greatly reduces time.  It's hard to find stuff about look-ahead adders that doesn't assume you are taking a class in it and have a plenty of math background, but here is some more details, and these notes linked to at the bottom are pretty simple.

The above gate diagrams were drawn in Logisim, which is a fun program to play around with.  Here is the file with the above 4 bit adder-subtracter.

Memory with gates

So we now know how to add numbers using gates, which means we can add using transistors.  This is pretty impressive in my opinion, but there is still a long ways to go.  We must be able to store these values being fed into and out of the adder or it isn't of much use.  We do this with latches and flip-flops.

I'll just show you how a latch works with this animation stolen from Wikipedia:

The animation here makes this a lot simpler to understand.  Those are NOR gates, and they feed into each other.  Refreshing our memory, a NOR gate has an output of 1 only when both inputs are 0.  This means that if at least one input is 1, we know the output is 0.  We then know the input to the other NOR gate.

This is called a SR latch.  S and R are the inputs, S stands for set, and R for reset, Q is the standard output name, and the Q with a bar over it is the inverse of Q.

In an SR latch, if S=R=0 then Q remains its last value.  S=0, R=1 resets Q to 0.  S=1, R=1 sets Q to 1.  S=R=1 is undefined.

I hope that animation explains things well enough, because I honestly don't think I can do a better job via text alone.

If you think about the operation for a while it may dawn on you that it isn't that useful by itself.  You can't just hook S or R up to an output as they will change the value of Q.  The solution is called a gated latch.  We add a pair of AND gates before the inputs S and R.  We add a third input called E for enable, and feed it into both AND gates along with either S or R.  The AND gates will always output a 0 if E is 0.  If the value of E is 1 then the AND gates will pass the other value through without change.  Thus, S and R can change all they want, but the value of Q will only change when E is set to 1.

We can take an input D, feed it into S as is, and feed the inverted version of it into R, and we have a much more useful form of memory.  Now, whenever E is set to 1, the value of D will be stored as Q.  When E is 0 the value of Q will remain as whatever value D last had when E was 1.

Remember that S and R should never both equal 1, that is why the inverter is there.  They can both equal 0 when we want to just hold the current value of Q, but the AND gates controlled by E take care of that case.  So, if E=1, then we should either have: `S=0, R=1 -> Q = 0` or `S=1, R=0 -> Q = 1`.

A flip-flop is a latch that is controlled by a clock.  For example, hooking E up to a clock would mean that the value of D was grabbed every clock cycle.  This would mean that Q was essentially a buffer storing the previous cycle's value of D.  There's more to be said about flip-flops but I'll allow someone else to say it.  Wikipedia has a good list of the various types of latches and flip-flops and how they function.

Multiplexers from gates

There is one more important piece that I want to explain how to build using gates.  A multiplexer (AKA a mux) can be thought of as a digitally controlled selector switch.  A mux will have some number of input pins, some number of selector pins, and a single output pin.  One single input pin will be connected to the output pin, the rest will be not be connected to anything.  The value represented on the selector pins will determine which input pin is connected.

For example, if we had 64 latches to store values in, we could have a mux with 64 inputs, and 6 selector pins (`2^6 = 64`).  If the value `000100_2 = 4_10` is sent to the selector pins, then input pin number 4 will be connected to the output pin.  In this case the output pin would be connected to the adder, and the input to the latch (so the names input and output are backwards).

Here is an illustration of how to build a 4 input mux with gates.  Note that the triangles in the middle are actually tri-state buffers.  This just means that if the input on the side is set to 1 they will pass the value through, if it is set to 0 they will be disconnected.  This is exactly how I've described transistors to work, so just pretend they are transistors.

We have a few AND gates and inverters at the bottom, which just serve to send a 1 to the correct transistor up top.  The rest of the transistors will have a 0, and thus be an open switch.  The transistor that gets a 1 will be a closed switch and connect the corresponding input to the output.

Wikipedia has more on muxes if you want to read some more, but they are relatively simple, albeit very useful.

Combining logic, memory, muxes

At this point we can actually perform useful functions.  Imagine we had a set of 8 latches, each would be storing 1 bit, and so the set would represent a single 8-bit byte.  Now let's take 64 of those sets of 8 latches, and call that a bank of memory.  It would be 64 single byte numbers.  Create 3 of those banks.  Banks A and B are inputs, bank C is the output.  We then need 8 muxes, each with 64 inputs.  These muxes will work together to route one of the bytes somewhere.  There will be a set of 8 muxes for each bank.  Banks A and B will be routed to an 8 bit adder, and the result of that 8 bit adder will be routed to bank C.  Each of these 3 sets of 8 muxes will be controlled by a counter which is just 6 bits starting at 000000 and incrementing to 111111.  A clock pulse will cause the counter to increment and set the enable pin on the correct set of latches in bank C.

This would allow us to add 64 pairs of 1 byte numbers, and store the results.  Admittedly, that is not going to give you any cat pics, but it could still be reasonably called a computer.

Arithmetic Logic Unit (ALU)

We already have an adder and a subtracter.  We can make a few more functions simply as well.  Bitwise AND and OR are both useful.  This means just taking each bit of two inputs and giving the result from an AND or OR between them.  Shifting all the bits either to the left or right allows us to divide or multiple by 2 (consider that 12 * 10 = 120 in base 10, likewise 101 * 2 = 1010 in base 2).

We can combine all these functions into the same package, reusing hardware when possible (as in the case of addition and subtraction) and just using independent hardware when needed.  We then use a mux to determine which hardware gets fed the inputs and then another mux to take the result from the hardware and route it to the output for the ALU as a whole.

With that, we've introduced the concept of operation codes (opcodes).  Each operation we want our processor to be able to perform will have a binary code corresponding to it.  That code will largely serve to tell muxes where to route the inputs and what to do with the outputs.

Instead of just working down our memory banks sequentially as above, we can control which set of latches is being accessed or written to by giving their control muxes the correct binary number (ie, a memory address).  Since we aren't just working down a list, we can combine memory banks A, B, and C from above into just one memory bank.  There will still be 3 groups of muxes, 2 will hooked up to the outputs of the latches (Q) and route it to the input of the ALU.  The third mux will route the output of the ALU into a set of latches.  Each byte in the memory will be numbered and called a register.

So we now have 4 binary numbers, an opcode that determines what operation is being performed, a memory address for input A, a memory address for input B, and a memory address for the output.  Each of these 4 numbers is controlling a mux.

These 4 numbers combine to form a line of machine code.  This code is listed sequentially the same as the numbers were listed before in our simple adder machine.  A clock increments a counter, and that counter tells a mux which line to route into the 4 muxes from the previous paragraph.

Machine Code and Instruction Sets

The number which controls which operation is being performed is called an opcode.  The collection of possible opcodes we have to choose from is an instruction set.  While any processor is going to be able to add, there are various other features that may or may not be available depending on the complexity of the instruction set it uses.

MIPS is an instruction set which aims to achieve high performance through simplicity of operations.  As such, it is a good first example to use.  Here is the actual MIPS machine code for an addition:

opcode rs    rt     rd   shamt funct
000000 01001 01010 01000 00000 100000

rs, rt, and rd are registers, or memory locations.  rs is source, rt is the second source, and rd is destination.  shamt stands for shift amount, and can be ignored.  The opcodes in MIPS are broken between opcode and funct.  The opcode of 000000 means that we will look to funct to determine the function.  Really you can just mentally combine opcode and funct into one number.

Function 100000 is add, so the values in registers listed in rs and rt will be sent to the ALU and inside the ALU they will be routed to the adder.  The output of the ALU will be routed to the register in rd.


Assembly is a slightly more human readable way to represent machine code.  The above addition in assembly is:
add $t0, $t1, $t2

The first part is the operation, addition in this case.  The second part is the destination register.  Registers generally begin with a $, just as a convention in assembly.  t registers are temperary ones intended to be used to store the intermediate steps in a larger problem.  There are also 8 s registers which are in the first 8 memory locations.  Therefore, $t0 is mapped to address 8, or 01000.

Likewise, the last two parts are the two source registers.  They are mapped to locations 01001, and 01010 respectively.

It is worth noting here that assembly corresponds one to one with actual operations in the processor.  Each line will consume a clock cycle (this stops being true at a more complicated level).

This line of C code might be compiled into the above assembly:
c = a + b

That is a pretty simple line of C, but even it may need more than the above 1 line of assembly.  The number of registers in a processor are extremely limited.  As such, memory values constantly need to be swapped out between the main memory (RAM), and the CPU's memory (cache).  Swapping out memory values is another type of operation, one that doesn't really use the ALU.  That C code could require 4 addition assembly lines, for taking the two values from RAM and then storing the result in RAM.

The point is that you have no idea how many actual cycles of the CPU a single line of C will take, whereas with assembly you know that each line maps to one actual CPU operation.

Here is a list of MIPS operations.  It may seem complicated, but considering that is everything a MIPS CPU can do, it is kind of amazing that those operations are enough to perform any function a computer can be called upon to do.

If you still think the MIPS instruction set is complex, feel free to contrast MIPS with the instruction set used on desktop PCs.

In addition to the arithmetic operations we've talked a lot about, there are also the store and load operations I mentioned.  These store or load values to or from main memory (RAM).  There are also comparison operations that allow you to compare two values and see if they are equal, or if one is greater than.  Then there are jump instructions which let you jump to a new location in the list of operations being performed (like a GOTO).  Branch instructions are the same as jumps but will only jump if some condition is true.

From C to Assembly

C code:
int  x,y,z;
if(y < x) z = z - 3;
else z = 6; 

The assembly will assume x, y, z are already stored in registers $s4, $s5, $s6.
Assembly Code:

slt $t0, $s5, $s4     # if y<x, then set $t0=1
beq $t0, $zero, L     # if condition (y<x) is false then jump to L
addi $s6, $s6, -3     # if condition (y<x) is true then z=z-3
j EXIT                # jump to end to avoid code for false condition

L:                    # this is a label, used to jump to
addi $s6, $zero, 6    # z=6, this code is only reached if we don't jump to exit



I feel compelled to put in some sort of closing here.  Reading through this I think parts are over simplified and some of the tangents could be removed.  I'd be amazed if anyone ever read through this all, but hopefully certain sections will prove useful to someone, and then perhaps they continue reading from there to see how it fits in to a larger picture.

If you're wondering, this post is about 6000 words and I can't believe how much stuff I left out.

Thursday, October 30, 2014

The Starship Enterprise

I had to make an animation for a Matlab class a while ago, and never posted the result here. Code

I promise I'm not just posting this to ensure October isn't the first month without a post.  I also promise I have what is already the longest post I've ever written in the works as a draft.  Here's a preview of that post:

Sunday, September 28, 2014

Gay rights vs national prosperity

The other day I saw a post on reddit via /r/bestof that was a counter to the claim that "Every civilization that has accepted homosexuality has failed".  I've never heard this argument, but I'll accept that some people actually claim it's true.  The rebuttal was a list of several modern countries that banned homosexuality and several that had many gay rights.  The latter countries were the ones that were more well off.

While his argument seemed reasonable, it would be easy fall victim selection bias by picking the examples that you are well aware of.  I wanted to see if there actually was a correlation between gay rights and national prosperity.

Quantifying national prosperity is a common task.  I choose GDP (PPP) per capita, GNI per capita, GNI (PPP) per capita, and HDI.  HDI combines GNI (PPP) per capita with a education index and a life expediency index.

Quantifying gay rights was harder.  I knew of a survey that asked if gays should be accepted, but it wasn't asked in many countries.  I thought about using years that homosexuality had been legal or the punishment if it was still illegal.  However, as I was looking that up I found this Wikipedia page that listed 7 types of LGBT rights for each country.  They are, legality, civil union or similar, marriage, adoption, gays in military, anti-discrimination laws, and laws related to gender identity.  Each category has a green check or red x in it for each country.  I gave 1 point for a check, 0 for a x, and 0.5 points for both or a ? (eg, the US gets 0.5 for marriage since it's legal in some places and illegal in others).  Note here, that regardless of how you feel about those 7 things, it is safe to say that a country with more green checks is more friendly towards gays, and that is what we are trying to quantify here.

The results were pretty similar for each of the 4 measures of prosperity.  All were just about a 0.70 correlation between gay rights and prosperity, which is quite high indeed.  I used the 125 most populous countries.  Just for fun, I also looked at the correlation between population and gay rights, and as expected I found none (-0.04).

GNI (PPP) GNI HDI GDP (PPP) Population
0.70 0.68 0.69 0.68 -0.04

Scatter plots are always fun so here are some of those:


Wednesday, September 24, 2014

Odds of dying in shark attack vs driving to ocean

I've long held that there is a greater chance of dying in a car accident on the way to the ocean than there is of dying from a shark attack in the ocean.  I felt justified in this belief because I know car accidents are a large cause of deaths and that shark attacks are quite rare.  That being said, I never had any numbers to back this up.

The odds of a shark attack vary greatly depending on where you go swimming.  I had the idea briefly of making a map of the US that showed how far you'd have to drive from to equal the odds of dying from a shark attack there.  The problem was in estimating the number of swimmers in each state.

In fact, the only number for annual number of swimmers in the US was this Huffington Post article that claims it is 75 million.  This number seems high, and is unsourced.  The majority of people live near the coast, so it is possible that a quarter of them visit the ocean every year.

Number of shark attacks is easier to estimate.  Wikipedia lists 11 fatal attacks in the US in the 2000s, and 12 in the 1990sThis site lists 1.8 fatal attacks on average per year, and 41.2 injuries per year.

The NHTSA gives us the number for deaths per 100 million miles traveled as 1.13 in 2012.  I assume this includes miles traveled as a passenger. 

Using the 75 million swimmers figure, and the 1.8 fatal attacks and 41.2 injuries figures gives us these results:

You would have to drive 49 miles to have a greater chance of dying in a car accident than being injured by a shark attack.  You would have to drive 2.1 miles to have a greater chance of dying in a car accident than being killed by a shark.

There are a lot of caveats on these numbers.  First, I couldn't find injuries per mile driven, so keep in mind that 49 mile figure is dying in a car accident vs any shark attack injury, and it is also the round trip figure.

As I used the high 75 million swimmer figure, the shark attack odds could be higher.  That is 25% of the US population swimming in the ocean every year.  I would be surprised if the actual figure were lower than 10%, so that only increases the odds by 2.5, which means the round trip distance for equal odds of dying in a car accident vs shark attack would still only be about 5 or 6 miles.

Also, the vast majority of attacks in the US happen in Florida and Hawaii.  Avoid swimming in those two states and you reduce your shark attack odds to at least 1/4. 

Tuesday, September 16, 2014

Grain of sand sized piece of neutron star matter

So the realistic scenario would be, you reach out in curiosity to touch the grain of sand. Your fingertip gets 1 cm away, and you feel like it's being gently pulled on. You bring it half a centimeter closer, hrm, more of a pull, then you get to 1-2 mm away, and it suddenly feels like someone yanked on your nail as your finger suddenly snaps to the grain of sand (much like two closely spaced magnets suddenly snap into one another).

At that very instant, it feels like someone has grabbed a square millimeter of your fingertip with pliers, so you yank your arm back and that tiny little area on your fingertip is torn off in the process. Your fingertip now has a little crater on it, about 1mm in diameter, and looks like you had a teeny little wart removed.

Sunday, August 24, 2014

Cost of Living

Recently, someone posted a map of the cost of living for states in the US.  The issue was that the state level is way too high to look at that.  I decided to do it at the county level.

First I had to find a data source for the cost of living at the county level.  I found this site, which seems to have good county level data.  I wrote a script to scrape that data, and then had a csv with the cost of living and fips code for every county.

For plotting I followed this R tutorial.  It was easy enough to follow/adapt, but I noticed the result looked odd.  It was much more random than I would have expected, and some spot checking confirmed the plot was wrong.  It took me quite a while to figure out there were a few counties that are out of order in the maps package.  This still doesn't make sense as I used a match command so they should all be matched up.  Either way, I finally got an R script that plotted the data correctly.

The code and data are up on github.

Saturday, August 23, 2014


After hearing about this California drought nonstop, I made this quick perl script to make a gif from the pngs on this site.

I made one for the whole CONUS and uploaded it to Gfycat because the original is 14 MB.