Sunday, December 7, 2014

Best of Youtube

Recently I had the idea to go through my chat logs and grab youtube links and use the most frequently sent ones as a best of.  Well I crafted a series of commands piped into each other to go through the logs and count the youtube links and then output a list with counts.

ack-grep "youtube\.com\/watch\?v\=([\w-]+)" --nogroup -oh | uniq | sort | uniq -cd  | sort -rn -k1,1 > ~/

The list was kind of disappointing in that it's 90% music.  On the plus side all the songs are great.  I was going to write something to grab the titles for this list, but I like the surprise better.

29 SMWi7CLoZ2Q
17 oHg5SJYRHA0
13 QH2-TGUlwu4
7 wusGIl3v044
7 DhaRkWfaq10
6 ZZ5LpwO-An4
6 SiQmzz_MjsU
6 qpl5mOAXNl4
6 j5C6X9vOEkU
6 flMS2gHFOH0
5 XUd4Cbc49mg
5 LatorN4P9aA
5 jItz-uNjoZA
5 EF8GhC-T_Mo
5 d56V-mmBoow
5 4r7wHMg5Yjg
4 zT3iviHrhUI
4 Sq3eLdixvCc
4 sP4NMoJcFd4
4 r-lxwlgyhhA
4 oY59wZdCDo0
4 oc-P8oDuS0Q
4 M11SvDtPBhA
4 laKprX-HP94
4 bLHc-yIAPbg
4 7YQ1PhOnM3I
4 14btYfUBpoI
3 y5CVAKoavBk
3 xRkA6zugNMQ
3 X_I4wtNPv5w
3 xaPepCVepCg
3 vWz9VN40nCA
3 -u-HCHCuHMg
3 Sf8cM7f6P2I
3 rBllejn5fVA
3 QFm87zk_t3g
3 Oqq6jppAYFo
3 lwESraWEpSU
3 lfF1rT0Iztk
3 Kw5oJoUYTb8
3 KiQzUEc_FmI
3 jyTpk6xyzzk
3 J---aiyznGQ
3 I9lmvX00TLY
3 FbNyO2eti2c
3 ewvSbgXFLo4
3 eDI6kddtv04
3 dJoo7Tgjr8U
3 dFypAB7nYGA
3 -CCbnpMCdds
3 bXjNx9siNwI
3 a6_R8GPgLJo
3 6QZD_niSlK8
3 524Tf0dNRNw
3 4gpNqB4dnT4
3 3iUFZvaI3wE
3 _1nzEFMjkI4
3 0h0J1sphPu4
3 00aV3Yu05oI

Sunday, November 9, 2014

How do computers work?


If you do a google search for "how do computers work?" you'll find that it's a pretty common question.  You'll get results like howstuffworks.  These explanations tend to be pretty high level.  They're stuff I've understood for over a decade from just using computers heavily, and I expect most other nerds know as well.  I understand that a computer has a BIOS which loads the operating system which then runs programs.  I understand how you can translate a human's wishes into lines of computer code.  But how do those lines of code actually perform calculations using electrons in the hardware?  That is the question I wanted answered.

I've seen hardware explanations before, mostly at informal levels which usually meant they were quite short and focused on one particular aspect.  As I work my way through a computer engineering degree, one day it occurred to me that I largely had the the ability to answer this question.

I decided to write out an outline for how a computer worked on a hardware level, mostly as a summary for my own gratification (and to identify gaps).  After I was done I figured that I might as well get one good blog post out of my degree.

My aim will be to explain how something works and then explain how we can build the next level up with that.  Keep in mind this won't always correspond to how things are actually built in practice, as it is often much more complex.  My goal is just to show how you could build it if you weren't concerned with performance and price.

What you should already know

I'll be writing this for an audience that has at least a bit of exposure to programming, and for who the above howstuffworks article was largely a review.  If you don't know anything about programming it won't matter too much, as we won't be getting to anything like that for quite a while.  For the very beginning I'll assume you're at least mildly familiar with chemistry and electricity.  I have to start at some level, and frankly don't feel comfortable explaining things at a lower level than this.

Before we begin, I'd also like to remind my readers that I have no idea what I am talking about, and am probably wrong about everything here.


Wikipedia says "A semiconductor is a material which has electrical conductivity between that of a conductor such as copper and that of an insulator such as glass.", this definition obviously follows from the name, but it never really explained things to me.  It would seem that a semiconductor is just a resistor, which clearly isn't true.  

The key fact about semiconductors is that their conductivity can be changed easily.  Instead of just having a resistance which might vary slightly with temperature, a semiconductor can go from being an insulator to a conductor with the application of an electric field.

How do semiconductors work?

Silicon is a semiconductor.  Look at a periodic table, and you'll see that silicon sits 4 elements away from the left, and 4 elements away from the right (discounting the noble gases).  This means silicon has an equal number (4 specifically) of electrons and empty electrons spots in its outer shell, and therefore, bonds readily with itself.  The result is that silicon forms a nice, stable, crystalline structure.

However, this structure can be quickly messed up by even slight impurities.  Phosphorus is to the right of silicon, and has 5 electrons in its outer shell (ie, 5 valance electrons).  Putting a tiny bit of phosphorus in some otherwise pure silicon will create a crystal that has the occasional extra electron which doesn't really have a proper home.  On the other hand, boron is to the left (and up but ignore that) and has 3 valance electrons.  Putting some boron in silicon will create "holes" where there would otherwise be electrons but aren't.  It turns out that while pure silicon crystal doesn't really like to let its electrons move around (ie electrical current), if you have some of these extra electrons or holes they will move around with a little motivation.

The process of adding a bit of impurities to silicon is called doping.  Doping silicon with phosphorus will give a n-type semiconductor.  N for negative since electrons are negative and that is what we have a surplus of.  Dope with boron and we have p-type, for positive since the absence of electrons is usually viewed as positive.

I feel compelled to note here that in all cases the overall charge is neutral (ie, there is equal positive and negative charges).  This is because when we add electrons we are also adding protons in the nuclei.  But protons don't move, and we are only concerned with the sea of electrons and how the number deviates from what we would expect from pure silicon.

The Transistor

The transistor is the basis for all modern computers.  You could make a strong case for it being the most important invention of the 20th century.  The transistor is made possible by semiconductors.

If you take some p and n type semiconductors and layer them, the excess electrons in the n type will drift towards the holes in the p type.  An equilibrium is reached when the negative charge of the extra electrons taking up residence in the p type semiconductor repels further electrons from coming over to fill more holes.  If we now apply a positive voltage to the p-type layer that will give those extra electrons a place to go (they are attracted to the positive charge and will flow towards it).  With those electrons gone, more electrons will flow from the n-type layer.

Sandwich the p-type layer between two n-type layers and when the voltage is applied to the middle p-type layer a current will be able to flow from the one n-type end to the other n-type end.  In addition to applying a voltage at the middle layer, we also apply a voltage across the entire 3 layers, and that causes electrons to flow through the entire thing.

The end result is two types of transistors.  They are called the npn and pnp type.  As you can probably guess, the npn type is n-type semiconductor on either end and p-type in the middle, and the pnp is the opposite.  The npn type needs a positive voltage applied in the middle in order to conduct, and the pnp type needs a negative voltage.

This figure, which I stole from here, shows the two types of transistors.  On the left are the n and p layers.  The right shows the electronics symbol used for both.  Note that the arrow points in for pnp (Pointing iN Permanently), and points out for npn (Not Pointing iN).  The base is where the voltage is applied to control the transistor.  We are ignoring the emitter and collector; just know that they are connected when the correct voltage (negative or positive) is applied to the base.

We now have a transistor which will normally not allow current to pass through it (ie it is an insulator), however, when a voltage is applied to the middle, then current will flow through it (ie a conductor).

For the rest of this post we can think of transistors as electrically controlled switches.  Normally they are open, but apply the correct voltage and they close and allow current to pass.

If this hasn't been clear, and you want a better overview of semiconductors and transistors this video is very well done:

Logic Gates

What are logic gates?

In a digital system, a high voltage (maybe 5 volts) will represent a 1, and a low voltage (close to 0) will represent a 0.  We need to be able to take these voltages and perform logical operations based on them.  Gates are devices that can be thought of as taking a number of inputs and producing a single output based on some simple rules.

As a simple example, an OR gate might take 2 inputs, and will output a 1 if any of the inputs are a 1, and produce a 0 if all the inputs are 0.  AND gates will produce a 1 if all the inputs are 1, and a 0 if any of the inputs are 0.  There is also a single input NOT or inverter gate that will just toggle the input bit, ie, it will output a 1 if the input is 0 and will output a 0 if the input is 1.

Each of the AND and OR gates have inverted versions that are the same but with a NOT after them, which are called NAND and NOR respectively.  I think NOR is interesting because it works the way its English usage would imply.  If neither the inputs are 1 then it will be 1, if either of them are 1 then it will be 0.  This corresponds to a OR followed by a NOT.

There is also the XOR or exclusive or, which will produce a 0 if the inputs are the same, and a 1 if they are different.

Each gate is represented by a symbol.  Note that the NOT gate is sometimes represented as just the bubble without the triangle.  With that in mind, the rest of the inverted gates are literally just the original symbol followed by a NOT bubble.

How to build logic gates with transistors

Gates can be thought of as just black boxes, taking inputs and giving outputs.  However, my aim to to connect the dots from electrons to code, so I will explain how we can make these gates using transistors as switches.

It turns out there are multiple ways to make gates, but what I am aiming for is more showing that it is possible rather than explaining how it is done in practice.

An AND gate is quite easy to build using some NPN transistors that we will treat as switches that connect when a positive voltage (ie, a logical 1), is applied to the middle.  Since we want a 1 at the output when there are two 1s at the inputs we need to create a situation where the output is normally a 0 but will be connected to a 1 when there are two 1s present.

This figure shows the above situation.  The inputs are A and B, applying a positive voltage to them will cause both NPN transistors to allow current to flow.  If they both have a 1 as an input then the 6V at the top will be directly connected to the output, this will mean there is 6V at the output and that will represent a 1.  The 4.7K resistor at the bottom is a pull-down resistor and is needed so that the 6V drains away when the output is disconnected from the 6V source.  4.7K is rather high resistance and it will have little effect when the output is connected to the voltage source.

In an OR gate we just have to add a bypass to each transistor so that if either has a 1 it will connect the source to the output.  A NAND is made by moving the output to the top (along with the 4.7K resistor which is now a pull-up resistor).  If you want to see them the above image came from this site.  This site shows an alternate method for constructing the gates from transistors.

In practice, a NAND gate is the fastest, and cheapest gate to produce.  All other gates can be made from NAND gates.  This Wikipedia article pretty much just gives examples of how this is done if you are interested.

Using gates to do things

So we can make these logic gates using transistors, but what does that get us?  We know we can AND or OR things, but how do we perform more complex logic using gates?

The basis for designing a circuit with gates is a truth table.  A truth table is just a listing of every possible combination of inputs along with the desired output.  In the above figure that shows the gate symbols, there is a truth table next to each gate.

There is also an algebraic way to represent logic gates.  OR is + and AND is *, NOT is a bar over top or a apostrophe afterwards.  For example if we have two inputs named A and B, A+B = Output is A OR B.  AB + BC = Output is (A AND B) OR (B AND C). 


Here is a truth table for 4 binary inputs.  There are `2^4 = 16` combinations.  The output is true only when the input is 3, 7, or 15.  We can create an algebraic representation using a technique called sum of products.

We will default to an output of 0.  Then, in each case where we have a 1 as an output, we will force a 1 instead.  We need to determine if we have either a 3, 7, or 15.  If any of those are the case we must output a 1, which is a perfect use for an OR gate.  To test if one specific condition is met (eg, if we have a 3) we test if the each correct bit is given as an input.  For 3 that means we need to test for 0011.  We do this by inverting the first two inputs, and then feeding all 4 into an AND gate.  That AND gate will output a 1 if and only if the input is exactly 0011.  Repeat that process for the other conditions and we have 3 AND gates, which are then fed into an OR gate.  The output of the OR gate is the output of the system.

I won't go into more detail than this, as this post is already way too long, and going into way more detail than is needed.  If you want to know more this is called digital logic design.  I made a truth table generator that uses algebraic input if you want to play around with this stuff.

Two's Complement

I will discuss how to make a binary adder using logic gates, but first I must take a detour and explain how negative numbers are represented in binary.  You can probably skip over this section if you are growing weary of all this.

You are probably aware of how positive numbers are represented in binary, if not refer to the above truth table.  If we only ever have to store positive numbers this is all well and good, but how should we store negative numbers?  A simple method would be to use one bit to represent the sign.  Since positive is the norm, we make 0 = positive, and 1 = negative.  We can then say the leftmost bit (AKA most significant bit) is the sign bit.  We lose one bit's worth of data, so our maximum value is cut in half, but that is the trade off for negative numbers.

There is a problem with this simple method though.  It has a value that represents -0.  With 4 bits this would be `1000_2 = -0_10`.  This isn't that big of a deal at first glance.  -0 is just 0, so we have two ways to represent 0.  The problem is that we can no longer simply increment from -1 to 0 to 1.  Instead, we need complex logic to detect if we are crossing from -1 to -0, and to instead jump to 0.  There is also the problem that adding these numbers doesn't work the same if there are negative numbers involved, and handling them is much more complicated.

If you have programmed a bit you are probably aware that if you increase a value that is already at its maximum value it will rollover and become the minimum value.  With 4 unsigned bits the math is: 1111 + 1 = 0000.  But, in the case of 4 signed bits it is: 0111 + 1 = 1000.  Here, 0111 represents 7, but 1000 would be the -0 value we introduced with our simple signed system.  You should be aware that this is not what actually happens.  The result will be -8, and it will increment normally from there.  This is because `1000_2 = -8_10`, but how?

The system actually used is called two's complement.  Positive numbers are stored exactly the same as above.  For negative numbers the procedure is this: Take the binary representation of its positive value, invert all the bits, and then add 1.  For example, `3_10 = 0011_2`, flip the bits to be `1100`, then add 1 to get `1101_2 = -3_10`.  For -8 we take 1000, flip them to 0111, then add one to get 1000.  This process can be repeated to go from negative back to positive.

0111 1111127
0111 1110126
0000 00113
0000 00102
0000 00011
0000 00000
1111 1111−1
1111 1110−2
1111 1101−3
1111 1100−4
1000 0010−126
1000 0001−127
1000 0000−128

I know this seems pretty crazy, but it makes the math work out much nicer.  Addition works the same regardless of if the numbers are positive or negative.  I won't talk about this more, and if you want you can pretty much forget about it.  If you want to play around with it more, here is a two's complement converter.

Adding with gates

Now we will make an adder using logic gates.  The first step is adding two 1 bit numbers, and getting a single 1 bit number as the output.  This is just an XOR gate, as 0+0=0, 1+0=1, and 1+1=10, and we ignore the first 1 so the output is just 0.  The problem is we can't just ignore that bit, so we actually need 2 bits of output.  That extra bit is called the carry bit.  Since we will be adding multiple pairs of bits down the line we will need to input that carry bit into the next stage.  The result is 3 inputs and 2 outputs.

This is called a full adder.  The inputs are A0 and B0, Carry 0 would be the result of the previous stage's carry bit, or 0 if this is the first stage.  The output is Out 0, and the carry which is fed to the next stage.  If we want to combine these stages it's actually quite easy.  We just need to connect the carries and we can run as many bits as we need to.

We can add one more gate per stage and get subtraction out of the same hardware.  We create a subtraction bit; it is set to 0 for addition, and 1 for subtraction.  We feed that bit into an XOR gate along with the B bit.  If the subtraction bit is 0 the XOR will do nothing, just pass B along as it was before.  If subtraction is 1 then it will flip B.

You may note this is pretty much exactly how two's complement works, which makes sense as subtraction is just adding a negative number.  But wait, don't we need to also add 1 for two's complement?  Yes we do.  We do this by also feeding the subtraction bit in as the very first carry bit.  When that subtraction bit is 0, nothing changes, but when it is 1 it is the same as adding 1 to the result, and that completes the two's complement process.

Here is an example of a 4 bit adder-subtracter.  I added the numbers above and below to make the process of mental math easier.  If that bit is set to 1, then add that value to the decimal number, however, remember that the leftmost bit is the sign bit and that if it is 1 you need to convert.  The brighter lines are the ones that are carrying a logical 1, and the dark lines have a logical 0.

The picture shows 5 - 3 = 2.  Note the subtraction bit is set to 1 in the top left.  Also note that some of the carry bits are set to 1 along the bottom, including the last bit, however we do not use the carry bits outside the adder.  To check for overflow we look to see if the last two carry bits are the same via an XOR gate.

This is only 4 bits, and while it is complicated at first, it isn't really that bad.  It would seem simple to extend this to be 32 or 64 bits, and indeed it would be simple to do.  The problem is that each stage has to wait for the previous stage to finish before it can start.  As you add stages you increase the time the adder takes dramatically.  This is called a ripple carry, and is only used to demonstrate how carry can work in adders.  In practice we actually use look-ahead adders, which use some extra hardware to calculate the carries before the earlier stages of the adder produce them.  This increases hardware cost, but greatly reduces time.  It's hard to find stuff about look-ahead adders that doesn't assume you are taking a class in it and have a plenty of math background, but here is some more details, and these notes linked to at the bottom are pretty simple.

The above gate diagrams were drawn in Logisim, which is a fun program to play around with.  Here is the file with the above 4 bit adder-subtracter.

Memory with gates

So we now know how to add numbers using gates, which means we can add using transistors.  This is pretty impressive in my opinion, but there is still a long ways to go.  We must be able to store these values being fed into and out of the adder or it isn't of much use.  We do this with latches and flip-flops.

I'll just show you how a latch works with this animation stolen from Wikipedia:

The animation here makes this a lot simpler to understand.  Those are NOR gates, and they feed into each other.  Refreshing our memory, a NOR gate has an output of 1 only when both inputs are 0.  This means that if at least one input is 1, we know the output is 0.  We then know the input to the other NOR gate.

This is called a SR latch.  S and R are the inputs, S stands for set (equal to 1), and R for reset (back to 0), Q is the standard output name, and the Q with a bar over it is the inverse of Q.

In an SR latch, if S=R=0 then Q remains its last value.  S=0, R=1 resets Q to 0.  S=1, R=0 sets Q to 1.  S=R=1 is undefined.

I hope that animation explains things well enough, because I honestly don't think I can do a better job via text alone.

If you think about the operation for a while it may dawn on you that it isn't that useful by itself.  You can't just hook S or R up to an output as they will change the value of Q.  The solution is called a gated latch.  We add a pair of AND gates before the inputs S and R.  We add a third input called E for enable, and feed it into both AND gates along with either S or R.  The AND gates will always output a 0 if E is 0.  If the value of E is 1 then the AND gates will pass the other value through without change.  Thus, S and R can change all they want, but the value of Q will only change when E is set to 1.

We can take an input D, feed it into S as is, and feed the inverted version of it into R, and we have a much more useful form of memory.  Now, whenever E is set to 1, the value of D will be stored as Q.  When E is 0 the value of Q will remain as whatever value D last had when E was 1.

Remember that S and R should never both equal 1, that is why the inverter is there.  They can both equal 0 when we want to just hold the current value of Q, but the AND gates controlled by E take care of that case.  So, if E=1, then we should either have: `S=0, R=1 -> Q = 0` or `S=1, R=0 -> Q = 1`.

A flip-flop is a latch that is controlled by a clock.  For example, hooking E up to a clock would mean that the value of D was grabbed every clock cycle.  This would mean that Q was essentially a buffer storing the previous cycle's value of D.  There's more to be said about flip-flops but I'll allow someone else to say it.  Wikipedia has a good list of the various types of latches and flip-flops and how they function.

Multiplexers from gates

There is one more important piece that I want to explain how to build using gates.  A multiplexer (AKA a mux) can be thought of as a digitally controlled selector switch.  A mux will have some number of input pins, some number of selector pins, and a single output pin.  One single input pin will be connected to the output pin, the rest will be not be connected to anything.  The value represented on the selector pins will determine which input pin is connected.

For example, if we had 64 latches to store values in, we could have a mux with 64 inputs, and 6 selector pins (`2^6 = 64`).  If the value `000100_2 = 4_10` is sent to the selector pins, then input pin number 4 will be connected to the output pin.  In this case the output pin would be connected to the adder, and the input to the latch (so the names input and output are backwards).

Here is an illustration of how to build a 4 input mux with gates.  Note that the triangles in the middle are actually tri-state buffers.  This just means that if the input on the side is set to 1 they will pass the value through, if it is set to 0 they will be disconnected.  This is exactly how I've described transistors to work, so just pretend they are transistors.

We have a few AND gates and inverters at the bottom, which just serve to send a 1 to the correct transistor up top.  The rest of the transistors will have a 0, and thus be an open switch.  The transistor that gets a 1 will be a closed switch and connect the corresponding input to the output.

Wikipedia has more on muxes if you want to read some more, but they are relatively simple, albeit very useful.

Combining logic, memory, muxes

At this point we can actually perform useful functions.  Imagine we had a set of 8 latches, each would be storing 1 bit, and so the set would represent a single 8-bit byte.  Now let's take 64 of those sets of 8 latches, and call that a bank of memory.  It would be 64 single byte numbers.  Create 3 of those banks.  Banks A and B are inputs, bank C is the output.  We then need 8 muxes, each with 64 inputs.  These muxes will work together to route one of the bytes somewhere.  There will be a set of 8 muxes for each bank.  Banks A and B will be routed to an 8 bit adder, and the result of that 8 bit adder will be routed to bank C.  Each of these 3 sets of 8 muxes will be controlled by a counter which is just 6 bits starting at 000000 and incrementing to 111111.  A clock pulse will cause the counter to increment and set the enable pin on the correct set of latches in bank C.

This would allow us to add 64 pairs of 1 byte numbers, and store the results.  Admittedly, that is not going to give you any cat pics, but it could still be reasonably called a computer.

Arithmetic Logic Unit (ALU)

We already have an adder and a subtracter.  We can make a few more functions simply as well.  Bitwise AND and OR are both useful.  This means just taking each bit of two inputs and giving the result from an AND or OR between them.  Shifting all the bits either to the left or right allows us to divide or multiple by 2 (consider that 12 * 10 = 120 in base 10, likewise 101 * 2 = 1010 in base 2).

We can combine all these functions into the same package, reusing hardware when possible (as in the case of addition and subtraction) and just using independent hardware when needed.  We then use a mux to determine which hardware gets fed the inputs and then another mux to take the result from the hardware and route it to the output for the ALU as a whole.

With that, we've introduced the concept of operation codes (opcodes).  Each operation we want our processor to be able to perform will have a binary code corresponding to it.  That code will largely serve to tell muxes where to route the inputs and what to do with the outputs.

Instead of just working down our memory banks sequentially as above, we can control which set of latches is being accessed or written to by giving their control muxes the correct binary number (ie, a memory address).  Since we aren't just working down a list, we can combine memory banks A, B, and C from above into just one memory bank.  There will still be 3 groups of muxes, 2 will hooked up to the outputs of the latches (Q) and route it to the input of the ALU.  The third mux will route the output of the ALU into a set of latches.  Each byte in the memory will be numbered and called a register.

So we now have 4 binary numbers, an opcode that determines what operation is being performed, a memory address for input A, a memory address for input B, and a memory address for the output.  Each of these 4 numbers is controlling a mux.

These 4 numbers combine to form a line of machine code.  This code is listed sequentially the same as the numbers were listed before in our simple adder machine.  A clock increments a counter, and that counter tells a mux which line to route into the 4 muxes from the previous paragraph.

Machine Code and Instruction Sets

The number which controls which operation is being performed is called an opcode.  The collection of possible opcodes we have to choose from is an instruction set.  While any processor is going to be able to add, there are various other features that may or may not be available depending on the complexity of the instruction set it uses.

MIPS is an instruction set which aims to achieve high performance through simplicity of operations.  As such, it is a good first example to use.  Here is the actual MIPS machine code for an addition:

opcode rs    rt     rd   shamt funct
000000 01001 01010 01000 00000 100000

rs, rt, and rd are registers, or memory locations.  rs is source, rt is the second source, and rd is destination.  shamt stands for shift amount, and can be ignored.  The opcodes in MIPS are broken between opcode and funct.  The opcode of 000000 means that we will look to funct to determine the function.  Really you can just mentally combine opcode and funct into one number.

Function 100000 is add, so the values in registers listed in rs and rt will be sent to the ALU and inside the ALU they will be routed to the adder.  The output of the ALU will be routed to the register in rd.


Assembly is a slightly more human readable way to represent machine code.  The above addition in assembly is:
add $t0, $t1, $t2

The first part is the operation, addition in this case.  The second part is the destination register.  Registers generally begin with a $, just as a convention in assembly.  t registers are temperary ones intended to be used to store the intermediate steps in a larger problem.  There are also 8 s registers which are in the first 8 memory locations.  Therefore, $t0 is mapped to address 8, or 01000.

Likewise, the last two parts are the two source registers.  They are mapped to locations 01001, and 01010 respectively.

It is worth noting here that assembly corresponds one to one with actual operations in the processor.  Each line will consume a clock cycle (this stops being true at a more complicated level).

This line of C code might be compiled into the above assembly:
c = a + b

That is a pretty simple line of C, but even it may need more than the above 1 line of assembly.  The number of registers in a processor are extremely limited.  As such, memory values constantly need to be swapped out between the main memory (RAM), and the CPU's memory (cache).  Swapping out memory values is another type of operation, one that doesn't really use the ALU.  That C code could require 4 addition assembly lines, for taking the two values from RAM and then storing the result in RAM.

The point is that you have no idea how many actual cycles of the CPU a single line of C will take, whereas with assembly you know that each line maps to one actual CPU operation.

Here is a list of MIPS operations.  It may seem complicated, but considering that is everything a MIPS CPU can do, it is kind of amazing that those operations are enough to perform any function a computer can be called upon to do.

If you still think the MIPS instruction set is complex, feel free to contrast MIPS with the instruction set used on desktop PCs.

In addition to the arithmetic operations we've talked a lot about, there are also the store and load operations I mentioned.  These store or load values to or from main memory (RAM).  There are also comparison operations that allow you to compare two values and see if they are equal, or if one is greater than.  Then there are jump instructions which let you jump to a new location in the list of operations being performed (like a GOTO).  Branch instructions are the same as jumps but will only jump if some condition is true.

From C to Assembly

C code:
int  x,y,z;
if(y < x) z = z - 3;
else z = 6; 

The assembly will assume x, y, z are already stored in registers $s4, $s5, $s6.
Assembly Code:

slt $t0, $s5, $s4     # if y<x, then set $t0=1
beq $t0, $zero, L     # if condition (y<x) is false then jump to L
addi $s6, $s6, -3     # if condition (y<x) is true then z=z-3
j EXIT                # jump to end to avoid code for false condition

L:                    # this is a label, used to jump to
addi $s6, $zero, 6    # z=6, this code is only reached if we don't jump to exit



I feel compelled to put in some sort of closing here.  Reading through this I think parts are over simplified and some of the tangents could be removed.  I'd be amazed if anyone ever read through this all, but hopefully certain sections will prove useful to someone, and then perhaps they continue reading from there to see how it fits in to a larger picture.

If you're wondering, this post is about 6000 words and I can't believe how much stuff I left out.

Thursday, October 30, 2014

The Starship Enterprise

I had to make an animation for a Matlab class a while ago, and never posted the result here. Code

I promise I'm not just posting this to ensure October isn't the first month without a post.  I also promise I have what is already the longest post I've ever written in the works as a draft.  Here's a preview of that post:

Sunday, September 28, 2014

Gay rights vs national prosperity

The other day I saw a post on reddit via /r/bestof that was a counter to the claim that "Every civilization that has accepted homosexuality has failed".  I've never heard this argument, but I'll accept that some people actually claim it's true.  The rebuttal was a list of several modern countries that banned homosexuality and several that had many gay rights; the latter countries were the ones that were more well off.

While his argument seemed reasonable, it would be easy to fall victim to selection bias by picking the examples that you are well aware of.  I wanted to see if there actually was a correlation between gay rights and national prosperity.

Quantifying national prosperity is a common task.  I chose GDP (PPP) per capita, GNI per capita, GNI (PPP) per capita, and HDI.  HDI combines GNI (PPP) per capita with a education index and a life expediency index.

Quantifying gay rights was harder.  I knew of a survey that asked if gays should be accepted, but it wasn't asked in many countries.  I thought about using years that homosexuality had been legal or the punishment if it was still illegal.  However, as I was looking that up I found this Wikipedia page that listed 7 types of LGBT rights for each country.  They are: legality, civil union or similar, marriage, adoption, gays in military, anti-discrimination laws, and laws related to gender identity.  Each category has a green check or red x in it for each country.  I gave 1 point for a check, 0 for a x, and 0.5 points for both or a ? (eg, the US gets 0.5 for marriage since it's legal in some places and illegal in others).  Note here, that regardless of how you feel about those 7 things, it is safe to say that a country with more green checks is more friendly towards gays, and that is what we are trying to quantify here.

The results were pretty similar for each of the 4 measures of prosperity.  All were just about a 0.70 correlation between gay rights and prosperity, which is quite high indeed.  I used the 125 most populous countries.  Just for fun, I also looked at the correlation between population and gay rights, and as expected I found none (-0.04).

GNI (PPP) GNI HDI GDP (PPP) Population
0.70 0.68 0.69 0.68 -0.04

Scatter plots are always fun so here are some of those:


Wednesday, September 24, 2014

Odds of dying in shark attack vs driving to ocean

I've long held that there is a greater chance of dying in a car accident on the way to the ocean than there is of dying from a shark attack in the ocean.  I felt justified in this belief because I know car accidents are a large cause of deaths and that shark attacks are quite rare.  That being said, I never had any numbers to back this up.

The odds of a shark attack vary greatly depending on where you go swimming.  I had the idea briefly of making a map of the US that showed how far you'd have to drive from to equal the odds of dying from a shark attack there.  The problem was in estimating the number of swimmers in each state.

In fact, the only number for annual number of swimmers in the US was this Huffington Post article that claims it is 75 million.  This number seems high, and is unsourced.  The majority of people live near the coast, so it is possible that a quarter of them visit the ocean every year.

Number of shark attacks is easier to estimate.  Wikipedia lists 11 fatal attacks in the US in the 2000s, and 12 in the 1990sThis site lists 1.8 fatal attacks on average per year, and 41.2 injuries per year.

The NHTSA gives us the number for deaths per 100 million miles traveled as 1.13 in 2012.  I assume this includes miles traveled as a passenger. 

Using the 75 million swimmers figure, and the 1.8 fatal attacks and 41.2 injuries figures gives us these results:

You would have to drive 49 miles to have a greater chance of dying in a car accident than being injured by a shark attack.  You would have to drive 2.1 miles to have a greater chance of dying in a car accident than being killed by a shark.

There are a lot of caveats on these numbers.  First, I couldn't find injuries per mile driven, so keep in mind that 49 mile figure is dying in a car accident vs any shark attack injury, and it is also the round trip figure.

As I used the high 75 million swimmer figure, the shark attack odds could be higher.  That is 25% of the US population swimming in the ocean every year.  I would be surprised if the actual figure were lower than 10%, so that only increases the odds by 2.5, which means the round trip distance for equal odds of dying in a car accident vs shark attack would still only be about 5 or 6 miles.

Also, the vast majority of attacks in the US happen in Florida and Hawaii.  Avoid swimming in those two states and you reduce your shark attack odds to at least 1/4. 

Tuesday, September 16, 2014

Grain of sand sized piece of neutron star matter

So the realistic scenario would be, you reach out in curiosity to touch the grain of sand. Your fingertip gets 1 cm away, and you feel like it's being gently pulled on. You bring it half a centimeter closer, hrm, more of a pull, then you get to 1-2 mm away, and it suddenly feels like someone yanked on your nail as your finger suddenly snaps to the grain of sand (much like two closely spaced magnets suddenly snap into one another).

At that very instant, it feels like someone has grabbed a square millimeter of your fingertip with pliers, so you yank your arm back and that tiny little area on your fingertip is torn off in the process. Your fingertip now has a little crater on it, about 1mm in diameter, and looks like you had a teeny little wart removed.

Sunday, August 24, 2014

Cost of Living

Recently, someone posted a map of the cost of living for states in the US.  The issue was that the state level is way too high to look at that.  I decided to do it at the county level.

First I had to find a data source for the cost of living at the county level.  I found this site, which seems to have good county level data.  I wrote a script to scrape that data, and then had a csv with the cost of living and fips code for every county.

For plotting I followed this R tutorial.  It was easy enough to follow/adapt, but I noticed the result looked odd.  It was much more random than I would have expected, and some spot checking confirmed the plot was wrong.  It took me quite a while to figure out there were a few counties that are out of order in the maps package.  This still doesn't make sense as I used a match command so they should all be matched up.  Either way, I finally got an R script that plotted the data correctly.

The code and data are up on github.

Saturday, August 23, 2014


After hearing about this California drought nonstop, I made this quick perl script to make a gif from the pngs on this site.

I made one for the whole CONUS and uploaded it to Gfycat because the original is 14 MB.

Thursday, August 7, 2014

Learn X in Y minutes

This site is a great reference for people that already know how to program but switch between languages frequently, and get caught up on the subtle syntax differences between them.

Wednesday, July 16, 2014

Tax Brackets

I saw a post where someone was advocating a sliding tax scale instead of tax brackets.  The thing is I think many people misunderstand that tax brackets don't mean that two people making $37k and $89k pay the same tax rate, despite both belonging to the 25% bracket.

I wrote a quick perl script to calculate the effective tax rate for taxable incomes $0 - $1,000,000 and plotted it.  Let me stress taxable incomes here, this is after deductions are applied.

Here's another for just up to $100k:

The result is more or less a logarithmic curve, which is realistically the curve you'd end up with using a sliding scale since you couldn't have the rate rise to infinity.  The other alternative would be a logistic curve.

If you thirst for sweet sweet data here it is:

Monday, June 30, 2014

On being sane in insane places
Rosenhan's study was done in two parts. The first part involved the use of healthy associates or "pseudopatients" (three women and five men) who briefly feigned auditory hallucinations in an attempt to gain admission to 12 different psychiatric hospitals in five different States in various locations in the United States. All were admitted and diagnosed with psychiatric disorders. After admission, the pseudopatients acted normally and told staff that they felt fine and had not experienced any more hallucinations. All were forced to admit to having a mental illness and agree to take antipsychotic drugs as a condition of their release. The average time that the patients spent in the hospital was 19 days. All but one were diagnosed with schizophrenia "in remission" before their release. The second part of his study involved an offended hospital administration challenging Rosenhan to send pseudopatients to its facility, whom its staff would then detect. Rosenhan agreed and in the following weeks out of 193 new patients the staff identified 41 as potential pseudopatients, with 19 of these receiving suspicion from at least 1 psychiatrist and 1 other staff member. In fact Rosenhan had sent no one to the hospital.

Friday, June 13, 2014

Detroit: A Hurrican Without Water

This is a blog that goes through Google and Bing street view images and compares recent ones to old ones.  This shows the rapid deterioration of houses in Detroit.

Keep in mind the oldest images are from 2008, and the newest are from 2013.  So, 5 years is all it takes for a couple of houses to turn into couches.

Let me stress that these pictures are of the same place.

Friday, May 30, 2014

Everything there is to know about NES Tetris

When I was a kid, I kept track of the pieces I received in Tetris on a piece of paper, because I was convinced that the in game stats were lying (they weren't).

This guy wrote 25,000 words about the inner workings of Tetris (including an interesting RNG, which is actually truly random in practice).

Even if you don't understand or care about the technical details I recommend you skim through it, as there were a lot of interesting items.

Friday, May 2, 2014

The Graphing Calculator Story
People around the Apple campus saw us all the time and assumed we belonged. Few asked who we were or what we were doing.When someone did ask me, I never lied, but relied on the power of corporate apathy. The conversations usually went like this:
Q: Do you work here?
A: No.
Q: You mean you're a contractor?
A: Actually, no.
Q: But then who's paying you?
A: No one.
Q: How do you live?
A: I live simply.
Q: (Incredulously) What are you doing here?!

Sunday, April 27, 2014

Block editing text in Notepad++ or Geany

The other day I showed the concept of block editing text to 3 different people in 3 different contexts.  All were unaware of it, and agreed it was a very useful feature.

Block editing allows you to drag your cursor down across several rows of text and edit them all at once.  Two main uses are to either type on several rows at once, or to select a column of data to be copied to somewhere else.

In Notepad++ it is Alt + Shift, in Geany is it Ctrl + Shift.

Consider this data:
-7.1382    6918.30976
-7.4637    7244.35
-7.7947    7585.7757
-8.1307    7943
-8.4715    8317.637
-8.8169    8709.635
-9.1664    9120.1
-9.5199    9549.9258
-9.8770   10000

If you wanted to remove those negatives, there are a few ways you could go about it, but with the ability to drag your cursor across several rows it is a simple matter to drag it down those rows and delete the negative.

If you wanted to paste this into a spreadsheet it may be able to figure out how to separate the columns, but by just selecting one column at a time and pasting it you can guarantee each column will end up organized properly.

I think block editing is pretty self explanatory, and the stolen gif does a good job of showing its potential.  However, I'd like to also touch on using find and replace.

Ctrl+H will bring up the find and replace box.  There is an option for regular, extended (escape sequences), or regex.  I prefer the extended option.  Let's say we wanted to enclose all the above data in quotes, separated by commas, and semicolons for each pair.  In other words we want this output:
"-7.1382", "6918.30976";
"-7.4637", "7244.35";
"-7.7947", "7585.7757";
"-8.1307", "7943";
"-8.4715", "8317.637";
"-8.8169", "8709.635";
"-9.1664", "9120.1";
"-9.5199", "9549.9258";
"-9.8770", "10000"

This is a pretty common format for input to various programs.  First, I would use block editing to change the spaces to ", " which admittedly wouldn't work for the last one.  If there were more of varied length I'd have to use a regex to check for multiple spaces (\s+).  The real usefulness comes from find and replace with newlines.

The newline character (\n) represents the each line break.  Search for that, and in the replace box enter ";\n"  If you want the output to just be on the same line leave out the \n from the replace text, maybe use a space instead.  On Windows you will likely need to use \r\n instead of just \n (\r = return, \n = newline).

These two techniques have saved me countless hours manually formating text.

Jack Churchill
Lieutenant Colonel John Malcolm Thorpe Fleming "Jack" Churchill, DSO & Bar, MC & Bar (16 September 1906 – 8 March 1996), nicknamed Fighting Jack Churchill and Mad Jack, was a British soldier who fought throughout the Second World War armed with a longbow, and a Scottish sword (a basket-hilted claybeg commonly but incorrectly called a claymore). He is known for the motto "any officer who goes into action without his sword is improperly dressed." Churchill also carried out the last recorded bow and arrow killing in action, shooting a German officer in 1940 in a French village.

Monday, March 24, 2014

Backups to/from a Linux machine from Windows

I'm currently using Windows 7 on a laptop and Linux at home on my PC.  I plan on switching the laptop to Linux at some point, but for now I wanted to be able to back files up to my home PC over the internet.  Now if you're a normal person there are plenty of services that do this for you (eg, Dropbox), but if you're me, then you insist upon doing it manually and having total control.

Linux on Windows
Step one is to get a Linux like environment on Windows.  This means installing Cygwin.  You will want to install openssh, rsync, git, and whatever languages you may want to program in. 

With Cygwin you should be able to directly use linux commands in the Windows command prompt.  Use 'ls' to test this.

SSH (Secure SHell) allows you to log in to a remote machine over an encrypted connection.  Once logged in you have the same control as you would in a shell prompt on that machine locally.  It supports public key encryption so that we can login without a password prompt.

As it is, you can log in with just the password using ssh dale@  Replace dale with your account on the remote machine.  You can use an actual IP to log in over the internet, but first you have to forward port 22 on your router.

Generate Keys
Now it is time to generate some ssh keys to allow you to log into your home computer remotely.  There are two ways to use ssh keys.  You can generate keys that require a password to use, or you can generate keys that do not require a password.  If you use passwordless keys, then you will have a private key file that anyone could use to login to any computer where you've set it up to allow logins with that key.  Of course, you can keep the key on an encrypted laptop, and if they laptop is stolen or you have reason to believe the key was stolen it's as easy as deleting the key from the remote computer.

Passwordless keys are nice, and I use one to admin my site, but for my home PC I didn't want to allow access with just the keyfile.  This complicated the commands slightly, but I will get to that later.

Generating keys is easy.  Just run ssh-keygen, and answer the questions.  You now have the .ssh directory in your home directory.  Inside, should be (at least) two files.  First is id_rsa, which is your private key and must be protected.  The second file is, and that is your pubic key.  As the name implies there is no need to keep this a secret.  You will place this file on any machine you want to be able to log in to. To allow you to log in to a remote machine, find the .ssh directory in your home directory, and then the authorized_keys file in it.  Copy and paste the public key text string into that file. You may need to restart some processes, but you should now be able to log in from the laptop to the remote machine.  You use the same command as above, but with the username you put in the ssh key (which can be the same as you use on different machines, Linux will pair it with @host to make it unique).  

Now that you have the ability to log in to the remote machine you can make some scripts to move files between them.  rsync is nice for this because it will automatically use ssh to tunnel the connections to hide the data on public networks, and it does a good job of only sending the data that has changed. Because I used a password with my key I wanted to combine the rsync commands into one, so I only had to enter my password once.  I made 4 batch files.  Two each to receive files, and to send files, and then one set for going over the local network, and one for going over the internet. My local receive command looks like this:
rsync -ahvz ---progress --stats dale@ /cygdrive/C/Users/Dale/Documents/
You can check the flags if you want, but I found those work well for me.  Note that it takes all the contents of the Documents folder on the remote machine and dumps it into the Documents folder on the laptop.  Also notice how cygwin handles drives.  You use /cygdrive/C/ to represent C: all forward slashes. For sending files it was a little harder:
rsync -ahvz --progress --stats Other Programming School dale@
I can't remember exactly why, but I couldn't just use the whole folder.  I had send those individual folders.  It took a little while to get it, but this syntax works.  Note that 'Other' 'Programming', and 'School' are each files on the laptop that need to be sent to Documents on the remote machine. For over the internet versions, just replace the IP with your home IP, and make sure you set up port forwarding.  If you plan on accessing the remote machine over the internet, it makes sense use the host file to create a shortcut for the IP.  The host file is at: C:\windows\system32\drivers\etc\hosts,  Simply add a line with the IP and then a name, like: 123.456.78.90 home .  Then you can replace that IP with the name in scripts or on the command line.  

These are all pretty rudimentary, but they have been working well for me for a few months.   As it stands, they never delete a file, so if you move something you must manually delete it's old location on both the remote machine and local between syncs.  There are flags to do this automatically, but I kind of like it as is.

Sunday, March 23, 2014


I never know what the world looks like outside my little internet bubble.  I'm often surprised when I bring something up that has, from my perspective, been talked about constantly for weeks and people don't know what I'm talking about.  I have a feeling this 2048 game is a case of this.

It has spawned an absurd number of forks and variants.  At one point there were 10 threads about it on the top 100 posts on hackernews.  I thought it was some crazy new revolutionary technology.  Nope, silly flash game.

Anyway, while I didn't care about the original that much, one of the clones has already eaten several hours of my (very valuable) time.

In case you haven't played any of these, the point is to move squares with the same number together and they will merge into a single square with the sum.  You end up with powers of 2, and are aiming for the highest possible value.  The goal being the eponymous 2048 square.

In the tetris version the pieces will fold together due to gravity, and can be quickly moved left and right for unlimited length combos.

The key is a good organization strategy, that is robust enough to handle the 4s and 8s.  I've gotten to 1024 a few times, but 2048 eludes me.

Another random variant, is this one where you are placing the blocks and the AI is playing against you collapsing them.

Sunday, March 16, 2014

The Earth and Moon

I stumbled upon this image of the Earth and the Moon photoshopped together to scale.  I decided to make my own in 1920x1080.  I took the images and the average radii, orbit, and inclination from wikipedia

8 kB PNG:

28 kB JPG

Wednesday, March 12, 2014

What’s gone wrong with democracy
Yet these days the exhilaration generated by events like those in Kiev is mixed with anxiety, for a troubling pattern has repeated itself in capital after capital. The people mass in the main square. Regime-sanctioned thugs try to fight back but lose their nerve in the face of popular intransigence and global news coverage. The world applauds the collapse of the regime and offers to help build a democracy. But turfing out an autocrat turns out to be much easier than setting up a viable democratic government. The new regime stumbles, the economy flounders and the country finds itself in a state at least as bad as it was before. This is what happened in much of the Arab spring, and also in Ukraine’s Orange revolution a decade ago. In 2004 Mr Yanukovych was ousted from office by vast street protests, only to be re-elected to the presidency (with the help of huge amounts of Russian money) in 2010, after the opposition politicians who replaced him turned out to be just as hopeless.

Wednesday, February 19, 2014

The Highest-Denominated Bill Ever Issued Gives Value to Worthless Zimbabwe Currency
A 100-trillion-dollar bill, it turns out, is worth about $5.

That's the going rate for Zimbabwe's highest denomination note, the biggest ever produced for legal tender—and a national symbol of monetary policy run amok. At one point in 2009, a hundred-trillion-dollar bill couldn't buy a bus ticket in the capital of Harare.

But since then the value of the Zimbabwe dollar has soared. Not in Zimbabwe, where the currency has been abandoned, but on eBay.

The notes are a hot commodity among currency collectors and novelty buyers, fetching 15 times what they were officially worth in circulation. In the past decade, President Robert Mugabe and his allies attempted to prop up the economy—and their government—by printing money. Instead, the country's central bankers sparked hyperinflation by issuing bills with more zeros.

Sunday, February 16, 2014

Cryptic Crossword: Amateur Crypto and Reverse Engineering
Something that my friend had noticed was that when we scrambled a puzzle twice in a row, the two keys would be different, but only in the first half. The third and fourth digits were the same. At first I thought that this might be due to scrambling the same grid, but further exploration suggested that it was entirely due to temporal proximity. So naturally, I tried running two instances of the Across Lite program at the same time, and hit Alt-S S on both of them as quickly as possible. In this way I obtained two grids scrambled with the same key.

With this technique, I had my first inroad, a way to start making some actual progress in the investigation. I could now create two crossword grids that differed in some specific way, scramble them both with the same key, and then compare the results, seeing directly how a change input affected the scrambled output.

Monday, January 27, 2014

Truth Tables

I made a truth table generator.  There are many of these, but as usual, none met my standards for working exactly how I wanted them to.

You can change what operators are used to represent AND, OR, and NOT, and you can use short hand of writing variables together for AND or OR.  It also shows what actual logical comparison it makes in javascript so you know if what you are writing is being interpreted correctly.  I wanted to make the columns in the table color code to show which functions were equivalent, but that was annoying; so it just displays decimal values at the end for you to manually compare.

Tuesday, January 21, 2014

Snapping windows to half screen in Fluxbox

I've recently begun using Windows 7 more heavily, and one feature I do miss when on Linux, despite initially scoffing at it, is the ability to snap a window to half the screen.  I figured this wouldn't be too hard to replicate, and it turns out I was right.  There is a program called wmctrl which should be usable with any windows manager, but Fluxbox had some built in commands I used.

I stole a very good idea from this thread about using the numpad to move windows to either the left/right, top/bottom, or the four corners of the screen.

Here is what I added to my ~/.fluxbox/keys file:
#1920 / 2 = 960
#1080 / 2 = 540
Control KP_0 :Minimize
Control KP_1 :MacroCmd {ResizeTo 958  540} {MoveTo 00 00 LowerLeft}   
Control KP_2 :MacroCmd {ResizeTo 1920 540} {MoveTo 00 00 LowerLeft}   
Control KP_3 :MacroCmd {ResizeTo 958  540} {MoveTo 00 00 LowerRight}   
Control KP_4 :MacroCmd {ResizeTo 958 1060} {MoveTo 00 00 UpperLeft}   
Control KP_5 :Maximize
Control KP_6 :MacroCmd {ResizeTo 958 1060} {MoveTo 00 00 UpperRight}   
Control KP_7 :MacroCmd {ResizeTo 958  540} {MoveTo 00 00 UpperLeft}   
Control KP_8 :MacroCmd {ResizeTo 1920 540} {MoveTo 00 00 UpperLeft}   
Control KP_9 :MacroCmd {ResizeTo 958  540} {MoveTo 00 00 UpperRight}  

Note the subtraction of 2 pixels from the left/right measurements to account for window borders.  Also, note the subtraction of 20 pixels from the full height measurement to account for my taskbar.  Finally, note that the title bar will be overlapped when stacked vertically.

Here is a sample after using Ctrl+7, Ctrl+1, Ctrl+6:

Wednesday, January 1, 2014

The Best of The Onion

A reddit thread asked for favorite headlines, and someone at The Onion coded up a page for the highest voted ones.