Wednesday, December 18, 2013

Boston Dynamics

https://www.youtube.com/watch?v=5S4ZPvr6ry4

Google buying Boston Dynamics is old news at this point, but I wanted to see what kind of things they were up to after that Bigdog robot.  If this video doesn't seem that impressive, skip to about 3:55, and imagine that thing walking towards you, possibly carrying a plasma rifle of some sort, asking you for your real name.

Wednesday, November 6, 2013

Electrocution in Water

http://www.youtube.com/watch?v=dcrY59nGxBg
How Dangerous is it to swim in a pool when there is live wire in the water? What are the chances of electrocution? Take a look and you may get some ideas!

Friday, October 4, 2013

Burritos

https://medium.com/comedy-corner/fd08c0babb57

Jesus already gave me two burrito forks. One at the end of each arm. They’re called fucking HANDS.

A fork. My god. I haven’t cried since I was six, but I’m fucking sobbing now.

People eat burritos with forks?

God is sorry he made us.

Friday, September 20, 2013

New Jersey Municipalities

So by popular demand, a post on the types of municipalities in New Jersey.

The structure of NJ Municipalities is rather unique, and complicated.  To begin, we must discuss the 5 traditional (ie used before 1900) types, all of which were recently redefined in the late 1980s:


Township is the oldest form.  It began as a direct democracy with a town hall meeting.  That system was replaced in the late 1800s and then again in 1989.

Boroughs began as a special form, requiring an act of legislature for each.  In the late 1800s any township or area not exceeding 4 square miles and a population of 5000 could form a borough.  Boroughs were considered a separate school district and thus could avoid paying taxes as well as exercise greater control over their own schools.  This, of course, led to the infamous Borough Fever.  The legislature removed the ability for boroughs to self-incorporate.  The latest rewrite came in 1987.

Cities were created in the late 1800s by various special laws, with no real pattern, besides a max population cap of 12,000.  In the late 1980s when all the municipality laws were rewritten there were only 11 cities. 

Towns were created in the late 1800s for municipalities over 5000.  The law was rewritten with the rest in the late 1980s.

Villages were also created in the late 1800s for areas of at least 300 people.  In the late 1980s rewrite villages were all but abolished (there was only 1 at the time).  Now, they operate under the same rules as a township, but with some name changes.  For example, mayor is called president of the board, which is a terrible name for the leader of a village, it should probably be chieftain.


Currently, all 5 of these types of municipalities are legally equal.  This is opposed to other states, where townships are made up of towns or boroughs.

Now that we've discussed the 5 types of municipalities, we must discuss the 12 forms of government.  To begin, each of the 5 types has a default form of the same name.  They may either keep the default or change to one of the 7 modern forms.


To begin, there is the special charter, which is another name for 'other'.  As in, a special charter granted by the state legislature which doesn't fit one of the other forms.

Next there is the Walsh Act, and the 1923 Municipal Manager Law.  Both created in the early 1900s.

The most important of the modern forms, though, are the 4 Faulkner Act forms.  The Faulkner Act (aka Optional Municipal Charter Law) was passed in 1950 and created 4 new optional forms: Mayor-Council, Council-Manager, Small Municipality and Mayor-Council-Administrator.  Each of these forms has several sub plans, designated by letters (eg plan B).


For some stats I went to this wiki page:
https://en.wikipedia.org/wiki/List_of_municipalities_in_New_Jersey

Unfortunately, it doesn't have most forms, only the types.  It also seems to be wrong in a few cases.  I used the links to the individual pages, and scraped the form from them directly.  Even so, I know there are at least a few errors.  For example, there are no more village forms, the last changed a few years ago.  My data still lists 1.  Still, I hope that it is accurate for a few years ago.


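| Form | Number | Min | Median | Mean | Max |
|---|---|---|---|---|---|
| All | 565 | 5 | 8,097 | 15,561 | 277,140 |
| Borough | 218 | 296 | 5,776 | 7,167 | 42,704 |
| Township | 141 | 16 | 6,902 | 11,085 | 92,843 |
| Faulkner (all) | 136 | 492 | 23,417 | 32,983 | 277,140 |
| Faulkner Act (Mayor-Council) | 73 | 492 | 30,719 | 45,143 | 277,140 |
| Faulkner Act (Council-Manager) | 43 | 1,170 | 22,866 | 22,929 | 62,300 |
| Walsh Act | 30 | 5 | 4,300 | 12,144 | 66,455 |
| Faulkner Act (Small Municipality) | 17 | 1,673 | 5,357 | 7,327 | 26,535 |
| City | 13 | 1,115 | 8,624 | 14,958 | 64,270 |
| Special Charter | 10 | 8,213 | 24,338 | 29,126 | 66,522 |
| Town | 9 | 2,681 | 13,620 | 14,271 | 40,684 |
| 1923 Municipal Manager Law | 7 | 67 | 24,136 | 28,871 | 84,136 |
| Faulkner Act (Mayor-Council-Administrator) | 3 | 13,183 | 25,850 | 26,592 | 40,742 |
| Village | 1 | 194 | 194 | 194 | 194 |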


This is a good PDF with a good historical overview:
http://www.njslom.org/history_municipal_govt.pdf

This page gives a good overview of how each form operates:
http://www.njslom.org/types.html


Here's my data:
https://github.com/DaleSwanson/nj-city-wiki/blob/master/nj.csv

Tuesday, September 17, 2013

High Point Itineraries

A week or so ago I made a post comparing state high point elevations to dates of admission to the union.  Todd commented that he wanted to see itineraries for the two different trips.  I thought it wouldn't be that hard to make a site to generate the itineraries for any HP trip, and it would be good practice at both Rails and Github.  I've done about as much work as I see myself doing on it, so here it is:

http://hp-itinerary.herokuapp.com/create

It should be pretty easy to figure out, but I'll go over the details here.

Probably the biggest negative is that it does not support Alaska and Hawaii.   You can't drive to Hawaii, and probably wouldn't to Alaska, so they don't really lend themselves to pregenerated itineraries.  I thought about adding some base time like 12 hours for each to represent a flight, but then I needed to account for time to nearest airport, which is pretty significant for some of the HPs.

You enter state abbreviations in the top box, in the order they will be traveled.  Use any separator.  Click on the dots on the map (red for HPs, green for cities) to add them to the end of the list.  Cities are abbreviated with 3 letter airport codes.  You can use full names, but only if there are no spaces in the name.  There are currently 5 built in trips with buttons to add them to the box.  These include two of our trips (NE, SE).

After entering states and cities in order, you can click 'create' to be taken to the itinerary page.

Clear will clear out the box, and reverse will reverse the order of the points in it.

There are some options below the main box, all can be left as their defaults.

Daily start/end time is the time of the day you will begin or stop hiking or driving (use either 24 hour time or 'p' for pm hours).  For example if you want your days to start at 4am and end at 9pm, you would enter 4 in the start box, and either 21, 9p, or '9 pm' in the end box.  If you enter a start time after the end time it will just swap them.
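
As a rough illustration of the start/end rules above, here is a minimal Python sketch.  It is not the site's actual code (that is Rails); the function names `parse_hour` and `day_window` are made up for this example, and it only handles whole hours.

```python
import re

def parse_hour(text):
    """Turn '4', '21', '9p' or '9 pm' into an hour from 0 to 23."""
    match = re.fullmatch(r"\s*(\d{1,2})\s*(?:([ap])m?)?\s*", text.lower())
    if match is None:
        raise ValueError(f"can't read a time from {text!r}")
    hour = int(match.group(1))
    suffix = match.group(2)
    if suffix == "p" and hour < 12:
        hour += 12          # 9p -> 21
    elif suffix == "a" and hour == 12:
        hour = 0            # 12a -> midnight
    if hour > 23:
        raise ValueError(f"hour out of range in {text!r}")
    return hour

def day_window(start_text, end_text):
    """Return (start, end) hours, swapping them if they are reversed."""
    start, end = parse_hour(start_text), parse_hour(end_text)
    if start > end:
        start, end = end, start
    return start, end

print(day_window("4", "9 pm"))   # (4, 21)
print(day_window("21", "4"))     # (4, 21)
```

The swap in `day_window` mirrors the behavior described above for a start time entered after the end time.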

Driving/Hiking time scale is what to multiply the default data by.  The driving times come from Google directions, which I find to be slightly slow, so I might use 0.9 there.  The hiking times aren't great, they just assume 2 mph average hiking speed.  I compared those times to our hikes and they are reasonably close.

Overhead time is the number of minutes to add to every hike.  Consider this to be the time taken to get out of the car and stop at the summit for a picture.  Somewhat confusingly it will be doubled for most hikes, because it is considered one way.
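
To make the doubling concrete, here is a small Python sketch of how a hike's duration could be estimated from these options.  This is my reading of the description, not the site's code: the name `hike_hours` is hypothetical, and I'm assuming the time scale applies to the walking time but not to the overhead.

```python
HIKING_MPH = 2.0  # the default average hiking speed mentioned above

def hike_hours(one_way_miles, overhead_minutes=0, time_scale=1.0, round_trip=True):
    """Estimated hours for a hike, with overhead charged once per direction."""
    one_way = one_way_miles / HIKING_MPH * time_scale + overhead_minutes / 60
    return one_way * (2 if round_trip else 1)

print(hike_hours(3, overhead_minutes=15))                    # 2 * (1.5 + 0.25) = 3.5
print(hike_hours(3, overhead_minutes=15, round_trip=False))  # 1.75
```

The one-way case is the exception where the HP is the start or end of the trip.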


The display page should also be pretty self-explanatory.  It lists the day, with the day count and actual date.  The code can handle any starting date, but there is currently no place to enter one other than today, mainly because I didn't feel like parsing dates in javascript.  There are two types of task, drive or hike.  For each it lists the start and end time (in 24 hour format), the duration (in decimal hours), and distance (miles).  There is also a link.  For drives it's to Google Directions, and for hikes it's a Google search for the peak and summitpost.org, which should take you directly to the page for most.

Note here that (RT) means round trip, and that is what most hikes will be.  The exception will be if the HP is either the starting or ending terminus.

Note that it does seem to handle multi day drives.  Although, I'd be careful to double check them.

At the end is total distance and time for the drive and hike.  There is no overall Google Directions link because the syntax of the URL was more annoying than I felt like dealing with right now.

You can link directly to the URL given for an itinerary:
http://hp-itinerary.herokuapp.com/display/PHL;CT;RI;MA;NH;ME;VT;NY;NJ;PHL



Some other random notes:

I've deployed this to Heroku.  I suppose I should give them credit for free hosting of a dynamic site.  However, at least half the development time was spent trying to get it to work.  The main issue was that the 'toolbelt' they insist you use kept breaking my Rails installation.  Then there was the fact that I pretty much had to switch to PostgreSQL for development to make uploading my database work.  That being said, once I got it working it was pretty easy to work with.

The code is pretty sloppy overall.  Maybe I'll refactor it all one day.

I'd vaguely like to add traveling salesman logic to find the fastest route.  I honestly don't think this would be that hard, but isn't likely to happen any time soon.

This is by far the most time I've ever spent for what amounts to a one-off mildly-funny inside joke.


Here's all the code:
https://github.com/DaleSwanson/itinerary

Monty Hall Problem

I discussed this problem a while ago, but as it is one of my favorite probability problems I felt it deserved a revisit.  What makes the problem great, is that nearly everyone (myself included) gets it wrong at first.  Wikipedia even claims: "Paul Erdős, one of the most prolific mathematicians in history, remained unconvinced until he was shown a computer simulation confirming the predicted result (Vazsonyi 1999)."
Suppose you’re on a game show, and you’re given the choice of three doors. Behind one door is a car, behind the others, goats. You pick a door, say #1, and the host, who knows what’s behind the doors, opens another door, say #3, which has a goat. He says to you, "Do you want to pick door #2?" Is it to your advantage to switch your choice of doors?


Yes; you should switch. The first door has a 1/3 chance of winning, but the second door has a 2/3 chance. Here’s a good way to visualize what happened. Suppose there are a million doors, and you pick door #1. Then the host, who knows what’s behind the doors and will always avoid the one with the prize, opens them all except door #777,777. You’d switch to that door pretty fast, wouldn’t you?

This is the article that brought the problem to a large audience.  She published many of the responses, which are almost all claiming she is wrong.

http://marilynvossavant.com/game-show-problem/  



http://en.wikipedia.org/wiki/Monty_Hall_problem
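
If, like Erdős, you'd rather see a simulation, here is a minimal Python sketch of one.  The function names and the trial count are my own choices, and it assumes the standard rules: the host always opens a goat door that isn't yours.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of rounds won when always staying or always switching."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

Staying should come out near 1/3 and switching near 2/3.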

Friday, September 6, 2013

Xaos

https://www.youtube.com/watch?v=9ltIvooYa1U

http://en.wikipedia.org/wiki/XaoS

So for the last hour I've been zooming in on fractals.  This Xaos program is pretty sweet.  Left click to zoom, p for random color scheme, and a digit for a different fractal (1 to jump back to default zoom of Mandelbrot).  If you go into Calculation > Iterations, and turn it up (mine is at 500) it'll let you zoom further.

The United States of America: Onwards and Upwards

I awoke this morning to some highpoint trivia, which is obviously the best way to start the day:

(11:11:59 AM) Todd Nappi: well you probably didnt read that HP records post I sent you awhile back 
(11:12:06 AM) Todd Nappi: But there are two up for grabs
(11:12:13 AM) Todd Nappi: doing them in Ascending height order 
(11:12:20 AM) Todd Nappi: And doing them in order of admittance to the union

My first thought was that order of admittance would actually work pretty well.  It would move generally west, and you end in one of the far travel ones of Hawaii and Alaska.  Then, I realized that ascending order would be pretty similar.  I wondered how similar and have been practicing R lately, so I found out.

To start with the big reveal, the correlation between state highpoint elevations and dates of admission is 0.703.  What's more, p = 0.000000013, or about a 1 in 77.5 million chance of random data of this size giving this correlation.  This is pretty conclusive: The United States has been adding states of ever increasing height in an effort to increase average state highpoint elevation.

I did some calculations and here is the relationship between year (y), and height (h):
`y = 0.01338 h + 1757.6`
`h = 74.738 y - 131 360`

Using these formulas we can predict that the US will annex Nepal/Tibet sometime around 2146, and Mars in 2748.
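
Here is a quick Python check of that arithmetic (the original numbers came from R).  The constants are the fitted values above; the Everest elevation of 29,029 ft is the commonly cited figure, not something from my data, and I'm assuming h is in feet.

```python
SLOPE = 0.01338      # years per foot, from the fit above
INTERCEPT = 1757.6   # predicted year for a highpoint at sea level

def year_for_height(feet):
    """Predicted admission year for a highpoint of the given elevation."""
    return SLOPE * feet + INTERCEPT

def height_for_year(year):
    """Inverse of year_for_height: the elevation implied by a year."""
    return (year - INTERCEPT) / SLOPE

EVEREST_FT = 29_029  # commonly cited elevation; an assumption here
print(round(year_for_height(EVEREST_FT)))   # 2146
print(round(height_for_year(2748)))         # elevation implied by the Mars date
```

Note that the second formula above is just the first one solved for h, since 1 / 0.01338 is about 74.738.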


Tuesday, September 3, 2013

A basic intro to Git and Github

Git is a system for making backups of source code.  It allows you to atomize each change you make to the code, which can then be rolled back independently of later changes.  Github is a website where people can publish code that is tracked in git.  It allows for easy collaboration between multiple people on single projects.

I've wanted to learn more about Git for a while, since it is the trendy version control system.  However, it has a notoriously steep learning curve.  As such, there are many overviews available online.  In keeping with the general theme of this blog I've decided to provide my own inferior overview.

As I said, there are many guides online, but I liked this one.

The main thing I want to summarize is the different conceptual levels a file goes through in the backup process:
  • To begin, we have untracked files.  All your files will begin at this level, and git will do nothing with them until you tell it to.

  • The first time you add a file, it becomes tracked.  This means git will monitor it and when you run git status, it will alert you to changes.

  • You must explicitly stage files which are part of a single change.  Files that are staged will all be saved by the next step.  You stage files with git add.

  • When you commit changes it takes the currently staged files and makes a record of the changes.  At this point the changes are backed up locally, and you can roll them back later.  You can skip the staged step and directly commit any tracked file that has changed with git commit -a.

  • For remote backups, you can push your changes to a site like Github.  Once the files are uploaded there, others can see them and download them with pull.
This is the basic gist of using git as a single user backup system.  If you want to collaborate on files that's when things like branches and merges become more useful.

Friday, August 30, 2013

Anatomy of a hack: How crackers ransack passwords like “qeadzcwrsfxv1331”

http://arstechnica.com/security/2013/05/how-crackers-make-minced-meat-out-of-your-passwords/
One of the things Gosney and other crackers have found is that passwords for a particular site are remarkably similar, despite being generated by users who have never met each other. After cracking such a large percentage of hashes from this unknown site, the next step was to analyze the plains and mimic the patterns when attempting to guess the remaining passwords. The result is a series of statistically generated brute-force attacks based on a mathematical system known as Markov chains. Hashcat makes it simple to implement this method. By looking at the list of passwords that already have been cracked, it performs probabilistically ordered, per-position brute-force attacks. Gosney thinks of it as an "intelligent brute-force" that uses statistics to drastically limit the keyspace.

Where a classic brute-force tries "aaa," "aab," "aac," and so on, a Markov attack makes highly educated guesses. It analyzes plains to determine where certain types of characters are likely to appear in a password. A Markov attack with a length of seven and a threshold of 65 tries all possible seven-character passwords with the 65 most likely characters for each position. It drops the keyspace of a classic brute-force from 95^7 to 65^7, a benefit that saves an attacker about four hours. And since passwords show surprising uniformity when it comes to the types of characters used in each position (in general, capital letters come at the beginning, lower-case letters come in the middle, and symbols and numbers come at the end), Markov attacks are able to crack almost as many passwords as a straight brute-force.
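
To put numbers on that keyspace reduction, here is a short Python calculation.  It only works out the two counts from the quote; the 95 is the number of printable ASCII characters, and nothing here says anything about actual cracking speed.

```python
full = 95 ** 7     # every printable ASCII character in each of 7 positions
markov = 65 ** 7   # only the 65 likeliest characters per position

print(f"{full:,}")     # 69,833,729,609,375
print(f"{markov:,}")   # 4,902,227,890,625
print(f"{full / markov:.1f}x fewer guesses")
```

So the threshold of 65 cuts the number of candidates by a factor of roughly 14.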

Thursday, August 29, 2013

An overview of open source licenses

I know what you are saying, 'wait a minute, you already did a post on open source licenses in May of 2011'.  Well yes, but that was more of an intro to the concepts of open source licenses, and the virus like nature of the GPL.  Here I will do more of a review of the major licenses, as well as some good sites I've found to explain them.

During my recent WMGK parser post, I published the code to github, and that required picking a license.  This led to much more research than would strictly be necessary for something no one will ever see or use ever.  I want to now spread that knowledge to all of you.

In uncharacteristic fashion, I will give a quick summary of the licenses here, as opposed to forcing you to read through all my drunken ramblings (although we are already three paragraphs in):

  • If you provide no explicit license on your code it will default to the most restrictive one possible.  That is, no one will be able to redistribute your code in any way, unless they get your express written consent first.

  • If you want people to be able to do anything they want with your code, and not even have to attribute you, then use Unlicense, which is effectively the public domain, with a disclaimer that you can't be sued if the code burns down their house.  This is good for short code snippets that aren't worth a complex license.

  • If you want people to be able to do anything they want with your code, as long as they attribute you, then you can use the MIT or BSD licenses.  The MIT license seems to be somewhat more popular, and is probably simpler to read.  The BSD license has a 2 clause version that is basically the same as MIT, and a 3 clause version that also prohibits the use of your organization's name to endorse the derivative works.  There is also the Apache license, which prevents people from using software patents to sue; it is probably better than pure MIT for longer programs.

  • If you want people to be able to release programs based on your program, as long as they also release the source code to those derivative works, then use the GPL.  There are two versions in common use, v2 and v3.  The main update was an attempt to deal with software being released on physical devices, as well as some software patent stuff.

If you want to review software licenses in greater depth, and I know you do, here are two good sites with simple color coded overviews.

http://choosealicense.com/licenses/

http://www.tldrlegal.com/browse

Monday, August 26, 2013

An analysis of radio song play frequency for 102.9 WMGK

Update: I've looked at this again in 2023.  The original post is below.


Recently I had to drive a car with only a radio for music.  After just a few days I was annoyed at the variety, or lack thereof.  I decided to scrape the station's web site for recently played data and see what the long term trends were.

The local classic rock station is 102.9 WMGK.  They have a recently played page which is easily scrapable.  This is in contrast to the other classic rock stations I wanted to scrape for comparison, which either listed only the last few songs or embedded the list in Flash.

I began scraping on June 26th and August 24th makes 60 days worth of data.

Thanks to my recent post, we know classic rock songs average 4:48 in length.  There are 86,400 minutes in 60 days.  That would be enough time for 18,000 songs.  If we assume only 75% of the time is actually music that's 13,500 songs.  During these 60 days WMGK played 14,286 songs.

I won't speculate about what this means about the actual percentage of music on the air.  Slight changes in average song length have a big effect on numbers.

So, now it is time for the interesting questions.  How many of those were unique songs?  How many songs would be needed to represent 25% or 75% coverage?

In case it isn't clear what I mean by coverage, I mean how many songs represent 10% (or whatever) of the total songs played.  For example, if a station played 1 song 10 times, and then 10 other songs 1 time each, for a total of 20 plays, that 1 top song would give 50% coverage.
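
Here is that definition as a minimal Python sketch (my real scripts are Perl; the function name `songs_for_coverage` is just for this example).  It counts how many of the most-played songs are needed to reach a given share of all plays.

```python
from collections import Counter

def songs_for_coverage(plays, target):
    """Smallest number of top songs whose plays make up `target` of all plays."""
    counts = sorted(Counter(plays).values(), reverse=True)
    total = sum(counts)
    running = 0
    for n_songs, count in enumerate(counts, start=1):
        running += count
        if running >= target * total:
            return n_songs
    return len(counts)

# The example above: one song played 10 times, ten songs played once each.
plays = ["hit"] * 10 + [f"deep cut {i}" for i in range(10)]
print(songs_for_coverage(plays, 0.50))  # 1
print(songs_for_coverage(plays, 0.75))  # 6
```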

So, without further ado here are the stats:

| Coverage | Songs |
|---|---|
| 10.04% | 28 |
| 25.27% | 77 |
| 50.23% | 172 |
| 75.13% | 279 |
| 90.03% | 373 |
| 100.00% | 924 |

924 unique songs out of 14,286 plays means about 6.5% of plays were songs being played for the first time in 60 days.  Honestly, that's not bad.  However, the 50% and 75% coverages are awful.  I have 30k unique songs in my playlist, and that's not even particularly impressive.  Admittedly, they aren't all one genre, but Led Zeppelin has 86 studio songs, all of which would be suitable for a classic rock station.

The key take away is that there are about 300 to 350 songs that get played at least every other day.  Then, they occasionally toss in a deep cut.  There is no middle ground; either a song is played every other day, or once every few months.  I made some (read: over 100) graphs that illustrate this, but for now let's stick to tables.

Want to guess what the most popular bands or songs are?

Top songs:
| Plays per 30 days | Band | Song |
|---|---|---|
| 27.5 | Warren Zevon | Werewolves Of London |
| 27 | Cars | Just What I Needed |
| 27 | Blue Oyster Cult | Burnin' For You |
| 27 | Steve Miller Band | Rock 'n Me |
| 26.5 | Supertramp | The Logical Song |
| 26.5 | David Bowie | Changes |
| 26.5 | Pink Floyd | Another Brick In The Wall |
| 26.5 | Electric Light Orchestra | Do Ya |
| 26 | J. Geils Band | Centerfold |
| 26 | War | Low Rider |

Top Bands:
| Plays per 30 days | Band |
|---|---|
| 356.5 | Rolling Stones |
| 334.5 | Led Zeppelin |
| 283 | Beatles |
| 183 | Pink Floyd |
| 177.5 | Who |
| 169.5 | Van Halen |
| 164 | Queen |
| 154.5 | Journey |
| 143 | Cars |
| 138.5 | Billy Joel |
| 135.5 | Tom Petty And The Heartbreaker |
| 135.5 | Steve Miller Band |
| 135 | Electric Light Orchestra |
| 132.5 | Foreigner |
| 112 | Supertramp |
| 107.5 | Boston |
| 107.5 | Bad Company |
| 105.5 | Aerosmith |
| 105.5 | Creedence Clearwater Revival |
| 105 | Elton John |


I would not have guessed most of those top songs.  Note how much higher the top three bands are than the rest; there is a second smaller drop off after Foreigner.  Also interesting is that none of the top three bands have a top 10 song.  I would have also guessed Beatles to be the top band.

Let's start looking at the graphs.


Here we have two graphs of bands.  The first is limited to the top 50 and has labels.  The second is all bands, and without labels, just showing the general trend.





These next two graphs really show the tendency to play the same songs.  The first shows how many songs were played various number of times in 60 days.  There are three clusters.  First, songs that were played once or twice.  Then there is a wasteland from 11 to 26 plays in 60 days.  After that, there is the main group of songs that are played every other day.  That tapers off, and leads to the last group of songs which are played almost every day.  Keep in mind that the number of plays compounds the number of songs in that group.  20 songs each played 35 times is a lot more plays than 200 songs played once.
The second graph that illustrates this point is this one of every single song.  It is a bit confusing as there are way too many songs to see individual bars, but you can look at the slopes to see the same three groups as before.  The top songs as the peak on the left.  Then the plateau of the normal roster, followed by a steep drop off to the songs played a few times.  The steep drop off illustrates the lack of a middle ground.

The last general WMGK graph is this one that shows the average daily plays in a given hour of the day.  It shows there is a drop off in the morning hours from 5am - 8am.  The night hours of 12am - 4am are the highest.  It's interesting that there is a clear rise at midnight.  I don't think WMGK has a morning show, so I'm not sure why there is the drop off.  At first I thought they increased ads during peak commuting hours, but there is no drop off in the evening.  My guess is they must provide short news and traffic conditions during those hours.



I made that graph so that I could compare individual bands to see if different bands were being played more at certain times (eg longer songs being relegated to the middle of the night).  Unfortunately, I don't have enough data to really answer this.  A typical band might have a few plays in a given hour for the entire 60 day period.  I suppose I could have increased the buckets to 6 hour windows, but was too lazy to code this.


The rest of the graphs compare WMGK plays to last.fm plays for a given band.  I'll post a few here.  All of the graphs are in this gallery on my site.

I had a hard time with the comparisons.  First, there was the fact that some of the titles had subtle differences.  This meant I had to hard code the differences in a hash.  There is also the problem of songs with multiple versions on last.fm, this will tend to change the order slightly.  Also, the last.fm api only gives play data from the last 6 months.  For most classic rock this doesn't matter, but, eg, David Bowie had a recent album, and thus his graphs are hugely skewed towards it.

Then I couldn't decide which songs to show.  I ended up making two graphs for each band.  One shows all the WMGK plays and the other shows the top 20 last.fm songs.  Each has the other's plays on them, but are sorted differently.  I think the last.fm graph is more interesting, as it shows which popular songs WMGK doesn't play.  In their defense, some songs simply aren't classic rock.  On the other hand, the WMGK sorted graph shows the sharp drop off when you go from the frequently played songs down to the deep cuts (that aren't that deep).

For example, here are the two Billy Joel graphs:

I think they both show the vast difference of opinion of what good Billy Joel songs are.


Cars:


A very different Genesis:


Led Zeppelin, showing the drop off:


A not-that-surprising Pink Floyd.  Except, why does WMGK love Young Lust?


Not big Rush fans:


WMGK doesn't seem to know where the Styx CD gets good:


A crime against humanity:


Finally, I used this as an excuse to learn some git and github use.  All the code, text files, and graphs are available there. Here are some files of interest:

Text file of all plays

Text file of all songs, with play counts

Text file of all bands, with play counts

Perl script to scrape WMGK site

Perl script which provides general stats and graphs for all WMGK

Perl script which provides specific stats from one band passed to it on command line

Wednesday, August 21, 2013

Genre Average Song Lengths

I was working on a different post and had to write a script to get the average song length of a genre from last.fm. I wrote a quick perl script and figured I'd get some other genres to compare. I grabbed the 250 top tags, and then grabbed the 50 top tracks for each tag.

I didn't really expect anything revolutionary.  Prog has longer songs than punk.  Some of the tags don't represent genres (eg Live), but I didn't remove them, both because they are interesting too, and I'm lazy.  Here are the results.

There's a slight positive correlation (0.23) between song length and ranking.  That is, longer genres are generally less popular.  Mean and median length across all genres are quite close at 4:20, and 4:14 respectively.

A graph and table of the top 50 most popular:





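| Rank | Genre | Average Length |
|---|---|---|
| 48 | progressive metal | 6:40 |
| 26 | progressive rock | 6:02 |
| 15 | ambient | 5:45 |
| 24 | black metal | 5:38 |
| 42 | trance | 5:29 |
| 36 | classical | 5:24 |
| 29 | heavy metal | 5:14 |
| 40 | thrash metal | 5:14 |
| 49 | trip-hop | 5:08 |
| 19 | hard rock | 4:51 |
| 9 | classic rock | 4:48 |
| 44 | favorites | 4:47 |
| 11 | jazz | 4:38 |
| 27 | death metal | 4:34 |
| 25 | chillout | 4:33 |
| 17 | experimental | 4:32 |
| 50 | hip hop | 4:31 |
| 7 | metal | 4:30 |
| 45 | reggae | 4:29 |
| 33 | industrial | 4:28 |
| 10 | alternative rock | 4:25 |
| 38 | rap | 4:18 |
| 18 | electronica | 4:16 |
| 43 | japanese | 4:16 |
| 41 | metalcore | 4:14 |
| 1 | rock | 4:13 |
| 3 | alternative | 4:11 |
| 16 | singer-songwriter | 4:08 |
| 28 | instrumental | 4:06 |
| 5 | electronic | 4:04 |
| 14 | folk | 4:04 |
| 39 | 90s | 4:02 |
| 21 | 80s | 4:00 |
| 8 | female vocalists | 3:58 |
| 22 | dance | 3:55 |
| 20 | hip-hop | 3:54 |
| 30 | british | 3:52 |
| 46 | acoustic | 3:50 |
| 4 | indie | 3:46 |
| 6 | pop | 3:45 |
| 2 | seen live | 3:41 |
| 37 | emo | 3:41 |
| 34 | soul | 3:40 |
| 13 | indie rock | 3:39 |
| 47 | country | 3:38 |
| 35 | blues | 3:28 |
| 31 | punk rock | 3:26 |
| 32 | soundtrack | 3:16 |
| 23 | hardcore | 3:09 |
| 12 | punk | 2:58 |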

Sunday, August 18, 2013

A Comparison of PNG Compression Methods

You may remember my past post about which image format you should be using.  The summary was use PNG for computer generated graphics.  I briefly touched on the fact that PNG allows you to use 24 bit or 8 bit colors.  It is, however, much more complicated than that.


Intro

There are three color modes available in PNG.  The mode with the fullest color space is RGB 24 bit.  The second mode is indexed.  It allows you to use 8 bits or less for up to 256 colors.  Additionally, it allows you to choose the palette.  The last mode is grayscale.  As the name implies it's black and white, but allows a gradient of grays.  By comparison, a 1 bit indexed mode would only allow full black or full white, no grays.

Generally, switching to indexed offers a pretty big savings in file size.  However, you can often get even greater savings by going to less than 8 bit color.  Doing so will generally require generating a custom palette to take full advantage of the reduced color space.  If the file is black and white, it can sometimes be worth it to switch to grayscale or to 1 bit indexed.

There really is no simple way to know which will be best, besides trial and error.  Doing this for one file is a bit annoying, but doing it for many files is pretty unreasonable.  As such, many PNG compressors have popped up to help automate this process.

It turns out that the compression process is not so clean cut, and thus many of the various programs perform differently on various files.  I decided to look into whether my current program was a good choice.  I found a few old comparisons, but not any from the last few years.  I decided to just download a bunch of programs and try them out.

I didn't download too many, preferring those that either had recent development, or were spoken highly of somewhere.


Programs Tested

So without further ado, here are the candidates:

TinyPNG - Has been my go to.  It's the only web based option here.  It's quite fast and has a great interface.

pngquant v1.8.3 - Some research showed that pngquant was the library being used at TinyPNG.  I figured if I could have the same compression as TinyPNG as a command line tool, that would be ideal.

pngcrush v1.7.9 - This is the standard PNG compression utility.

optipng v0.7.4 - Another standard utility, based on pngcrush.

pngout 20130221 - Based largely on the recommendation of this 6 year old Jeff Atwood post.  It is still being developed though.

Before I get into the results I should note the command line switches used.  In all cases it's none.  Some of the utilities are geared more towards automatic use than others.  There is no doubt you could achieve much better results if you were willing to experiment with the options.  However, that wasn't the use case I was testing here.  I was concerned with processing dozens or more files quickly and easily.

If you want to see the images used I uploaded them to imgur here.  Unfortunately, imgur is no dummy, they compressed the files themselves.  They also resized the largest ones and converted some to jpg when that made sense.  If you want all the files, both the original and the results, I uploaded them as a zip to mediafire.

Results

Filesizes in KB:
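| Name | Original | optipng | pngcrush | pngout | quantpng | tinypng | t2o | smallest |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| a.png | 885 | 765 | 879 | 769 | 335 | 329 | 328 | 328 |
| b.png | 5.74 | 3.22 | 5.05 | 2.77 | 3.36 | 3.27 | 3.27 | 2.77 |
| c.png | 2727 | 2723 | 2720 | 2582 | 2725 | 2587 | 2583 | 2582 |
| e.png | 84.6 | 32.4 | 84.6 | 37.2 | 33.2 | 32.4 | 32.4 | 32.4 |
| f.png | 22.9 | 19.7 | 20.9 | 16.4 | 19.6 | 18.1 | 16.6 | 16.4 |
| g.png | 44.2 | 17.6 | 23.6 | 31.3 | 15.0 | 14.0 | 13.5 | 13.5 |
| i.png | 717 | 550 | 616 | 529 | 189 | 181 | 181 | 181 |
| m.png | 30.7 | 29.1 | 29.1 | 26.8 | 29.4 | 27.9 | 27.1 | 26.8 |
| n.png | 48.0 | 47.9 | 48.0 | 42.5 | 26.6 | 25.6 | 24.0 | 24.0 |
| o.png | 22.2 | 22.2 | 22.2 | 19.1 | 22.2 | 22.2 | 19.1 | 19.1 |
| p.png | 1044 | 1019 | 1019 | 973 | 288 | 281 | 281 | 281 |
| r.png | 545 | 188 | 243 | 174 | 188 | 174 | 174 | 174 |
| s.png | 476 | 311 | 316 | 352 | 156 | 147 | 146 | 146 |
| t.png | 11.6 | 11.6 | 11.6 | 11.6 | 12.6 | 12.0 | 11.6 | 11.6 |
| w.png | 79.4 | 58.7 | 59.9 | 73.7 | 39.8 | 36.5 | 36.1 | 36.1 |
| x.png | 60.7 | 60.7 | 60.7 | 60.2 | 61.9 | 58.6 | 58.6 | 58.6 |
| y.png | 90.0 | 78.9 | 90.0 | 70.8 | 48.0 | 45.1 | 42.2 | 42.2 |
| z.png | 104.6 | 104.6 | 104.6 | 102.1 | 52.6 | 50.8 | 50.0 | 50.0 |
| sum | 7000 | 6043 | 6354 | 5873 | 4246 | 4045 | 4027 | 4025 |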


Ratio to the best compression method:
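| Name | Original | optipng | pngcrush | pngout | quantpng | tinypng | t2o |
| --- | --- | --- | --- | --- | --- | --- | --- |
| a.png | 270% | 233% | 268% | 234% | 102% | 100% | 100% |
| b.png | 207% | 116% | 182% | 100% | 121% | 118% | 118% |
| c.png | 106% | 105% | 105% | 100% | 106% | 100% | 100% |
| e.png | 261% | 100% | 261% | 115% | 102% | 100% | 100% |
| f.png | 140% | 120% | 127% | 100% | 120% | 111% | 101% |
| g.png | 326% | 130% | 174% | 231% | 111% | 103% | 100% |
| i.png | 397% | 304% | 340% | 292% | 105% | 100% | 100% |
| m.png | 114% | 108% | 108% | 100% | 109% | 104% | 101% |
| n.png | 200% | 200% | 200% | 177% | 111% | 107% | 100% |
| o.png | 116% | 116% | 116% | 100% | 117% | 117% | 100% |
| p.png | 372% | 363% | 363% | 347% | 102% | 100% | 100% |
| r.png | 314% | 108% | 140% | 100% | 108% | 100% | 100% |
| s.png | 326% | 213% | 217% | 241% | 107% | 101% | 100% |
| t.png | 100% | 100% | 100% | 100% | 108% | 103% | 100% |
| w.png | 220% | 162% | 166% | 204% | 110% | 101% | 100% |
| x.png | 104% | 104% | 104% | 103% | 106% | 100% | 100% |
| y.png | 213% | 187% | 213% | 168% | 114% | 107% | 100% |
| z.png | 209% | 209% | 209% | 204% | 105% | 102% | 100% |
| sum | 174% | 150% | 158% | 146% | 105% | 100% | 100% |
| geo | 201.0% | 152.0% | 174.1% | 152.2% | 109.0% | 103.9% | 101.1% |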

In the first table, sum for all columns is just the sum of the column.  In the second table, sum is the ratio of that column's sum to the smallest sum.  In other words, a sum ratio of 100% should indicate a perfect score, but it's skewed by the large file sizes.  The geometric mean solves this for the normalized ratios.  Or does it?  Yeah, probably.
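To make the difference between the two summary rows concrete, here is a rough sketch that recomputes both for the pngout column.  The sizes are copied from the first table; the function names are my own, and because the table values are already rounded the results will only land close to the published 146% and 152.2%, not necessarily match them digit for digit.

```python
import math

# Sizes in KB copied from the results table (pngout and smallest columns),
# in file order a, b, c, e, f, g, i, m, n, o, p, r, s, t, w, x, y, z.
pngout = [769, 2.77, 2582, 37.2, 16.4, 31.3, 529, 26.8, 42.5,
          19.1, 973, 174, 352, 11.6, 73.7, 60.2, 70.8, 102.1]
smallest = [328, 2.77, 2582, 32.4, 16.4, 13.5, 181, 26.8, 24.0,
            19.1, 281, 174, 146, 11.6, 36.1, 58.6, 42.2, 50.0]


def sum_ratio(sizes, best):
    """Ratio of total size to the best possible total (dominated by big files)."""
    return sum(sizes) / sum(best)


def geometric_mean_ratio(sizes, best):
    """Geometric mean of the per-file ratios (every file counts equally)."""
    logs = [math.log(s / b) for s, b in zip(sizes, best)]
    return math.exp(sum(logs) / len(logs))


print(f"sum ratio: {sum_ratio(pngout, smallest):.1%}")
print(f"geo mean:  {geometric_mean_ratio(pngout, smallest):.1%}")
```

The sum ratio is pulled toward 100% by c.png, which is huge and which pngout handles best, while the geometric mean treats c.png as just one file out of eighteen.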


Analysis

I think the takeaway here is that TinyPNG is the winner.  That being said, you'll note there were three cases where it did significantly worse than pngout.  Images b and o were both black and white, and pngout turned b into grayscale, but both left o as indexed.  What's more, e was grayscaled by pngout for a worse size than the indexed TinyPNG version.  Pretty clear evidence that grayscale vs indexed isn't clear-cut.

pngquant (quantpng in the tables) is consistently a few percent higher than TinyPNG, lending credence to the idea that TinyPNG is using pngquant, but then also doing something else to squeeze a bit of filesize out.  On the other hand, pngquant actually made a few files worse, so maybe there is some overhead on its end.

The alert reader will have noticed there is a seventh algorithm listed which wasn't defined above.  t2o is a response to the fact that either TinyPNG or pngout was the optimal algorithm.  It is the TinyPNG files run through pngout.

While it almost always does better than TinyPNG or pngout alone, you'll note the one case where it failed to return to the level pngout had achieved on the raw file.

I suppose my strategy will be to continue to use TinyPNG, and if I really need to minimize a file, run it through pngout independently, and compare the results.
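The "compare the results" step is easy to automate once each tool has written its output into its own folder.  The sketch below assumes exactly that layout (one directory per tool, same file names in each); the function name and the directory names in the usage line are made up for illustration, and it only compares sizes, it does not run any of the compressors.

```python
from pathlib import Path


def smallest_outputs(tool_dirs):
    """Pick the smallest version of each PNG across several output folders.

    tool_dirs maps a tool name to the directory holding that tool's output.
    Returns {filename: (tool, size_in_bytes)}; on a tie the first tool wins.
    """
    best = {}
    for tool, directory in tool_dirs.items():
        for path in sorted(Path(directory).glob("*.png")):
            size = path.stat().st_size
            if path.name not in best or size < best[path.name][1]:
                best[path.name] = (tool, size)
    return best


if __name__ == "__main__":
    # Hypothetical folder names; point these at wherever the outputs live.
    results = smallest_outputs({"tinypng": "out/tinypng", "pngout": "out/pngout"})
    for name, (tool, size) in sorted(results.items()):
        print(f"{name}: {tool} ({size / 1024:.1f} KB)")
```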

Tuesday, July 30, 2013

Ohm's Circle

There are four common variables in circuits.  P = Power, V = Voltage, I = Current, and R = Resistance.  Knowing any two of those you can find the other two.  To do this you must derive the right formula though.  I always remember that V = IR, and P = IV.  From that you can find these:
`V = IR = sqrt(PR) = P / I`
`P = VI = V^2 / R = I^2 R`
`I = P/V = V/R = sqrt(P / R)`
`R = V/I = V^2 / P = P/I^2`
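For anyone who would rather not look the formulas up at all, here is a small sketch that applies them: give it any two of the four values and it fills in the rest.  The function name and the keyword-argument interface are my own choices, and it assumes ordinary positive, nonzero values (it makes no attempt to handle zero current or resistance).

```python
import math


def solve_circuit(P=None, V=None, I=None, R=None):
    """Given exactly two of P (W), V (V), I (A), R (ohm), return all four."""
    given = [x for x in (P, V, I, R) if x is not None]
    if len(given) != 2:
        raise ValueError("supply exactly two of P, V, I, R")

    # First recover V and I from whichever pair was supplied.
    if V is not None and I is not None:
        pass
    elif V is not None and R is not None:
        I = V / R
    elif V is not None and P is not None:
        I = P / V
    elif I is not None and R is not None:
        V = I * R
    elif I is not None and P is not None:
        V = P / I
    else:  # P and R
        V = math.sqrt(P * R)
        I = math.sqrt(P / R)

    # With V and I known, the other two follow from P = VI and R = V/I.
    return {"P": V * I, "V": V, "I": I, "R": V / I}


print(solve_circuit(V=12, R=6))
print(solve_circuit(P=24, R=6))
```

Both calls describe the same circuit (12 V across 6 ohms), so they should agree with each other.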

I discovered something called Ohm's Circle which presents those formulas quite nicely.  However, I always just search for it and have never found a particularly well executed one.  Two of the top Google image results have mistakes.  So, I decided to make my own version.  I derived the above formulas by hand to double check it was all correct.


Friday, July 26, 2013

8 horrible courtroom jokes and their ensuing legal calamities

http://www.salon.com/2013/07/26/8_horrible_courtroom_jokes_and_their_ensuing_legal_calamity/
Defending Texas’s abortion restrictions before the Supreme Court, attorney Mr. Jay Floyd decided to open oral argument with a sexist joke. Arguing against two female attorneys, Floyd begins: “It’s an old joke, but when a man argues against two beautiful ladies like this, they are going to have the last word.” The joke is demeaning and (as Floyd himself admits) unoriginal, but it also lacks the saving grace of at least being funny. A recording of the oral argument, which can be listened to here, demonstrates just how badly the joke bombed with the Supreme Court. Painful silence endures for just over three seconds.

Saturday, July 20, 2013

The story behind football's innovative yellow first down line

http://sportsillustrated.cnn.com/nfl/news/20130718/nfl-birth-of-the-yellow-line/#all
While the line looks simple on TV, the technology behind it is very complex. Sensors were placed on the three main game cameras (at midfield and the two 20 yard lines), capturing the pan, tilt and zoom movements of those cameras 30 times a second. A three-dimensional, virtual model of each field had to be constructed, with the exact measurements of the crown of the field (the center of the field is always higher, for drainage, than the sides and ends, but the precise levels vary in each venue). An exhaustive color palette had to be produced on the fly as the game progressed, so that the green pixels of grass the yellow line replaced would be exactly the "right greens" even as shadows crossed the field and changed the grass hues -- an essential feature to assure replacing only the green blades of grass and not the green threads of a Packers or Eagles jersey.

Wednesday, July 17, 2013

Overriding BIOS limited CPU speed while running on battery

The Wrath of Dell

As it turns out, my work at salvaging a power adapter for my Dell Inspiron 6000 was not as much of a smashing success as I had initially thought.  I noticed that the battery charging tray indicator wasn't going away.  This happened sometimes, and required me to unplug the cord a few times.  However, I had duct taped the cord in place to prevent it from falling out, and didn't care to fix it.

Eventually I noticed that the power level was only going down.  I also noticed that I had suddenly lost the ability to watch .mp4, or just about any video over 480p.  The internet seemed even slower than usual, and I started noticing my cronjobs running even with their low nice levels.

I had a suspicion, but didn't want to look into it for fear of confirmation.  I eventually caved and discovered my battery was not charging.  The BIOS listed the AC adapter as an unknown device, and thus refused to charge.

I deduced that when I shorted the unknown-at-the-time third wire to the +19.5V wire I must have fried the chip on the motherboard that looked for other Dell products.

Further research confirmed my deepest fear: In addition to refusing to charge the battery, Dell forced the laptop to run at half speed if it didn't detect a Dell adapter.  I could certainly live with the battery not charging, as my laptop had long ago become a desktop, but an 800 MHz CPU was pushing it.


The Search for a Fix

I began looking for a hacked BIOS to undo all the Dell nonsense with renewed gusto.  Unfortunately, while it seemed possible, it didn't seem anyone had published anything.  I briefly considered hacking the BIOS myself, as my previous BIOS messing withs have gone well.

Eventually, I discovered that while the battery charging was controlled by the BIOS, the CPU speed was controlled by the OS, with the BIOS only able to suggest speeds.

I learned of CPU governors, which are profiles that control the speed.  By running the command cpufreq-info I found my current governor was ondemand, and I had the common performance governor available.  I could update the governor with the following (piped through sudo tee, since with sudo echo the redirect is done by the unprivileged shell and fails):
echo performance | sudo tee /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

I confirmed this with cpufreq-info, but it also showed my speed still at 800MHz.  The policy said that the speed had to be between 800MHz and 800MHz and that the governor could decide which speed to use between those limits.  This made me suspect that perhaps the BIOS had more control over the speed than I had hoped.  Still, I persevered as it seemed other Dell users had fixed this, although they were being coy about exactly how.


The Voyage to 1.6 GHz

The breakthrough came from this site and its comments, which revealed there was a separate BIOS limit that I was missing.

For me, the key file was at: /sys/devices/system/cpu/cpu0/cpufreq/bios_limit  You can edit this file similarly to the above command, but I recommend opening it with a text editor to see its current setting.  For me, it was at 800000, and needed to be changed to 1600000, evidently in kHz.

This is all well and good, but it will be overwritten by the BIOS on each boot.  To prevent that you can edit your GRUB file at: /etc/default/grub to contain this magic line: GRUB_CMDLINE_LINUX="processor.ignore_ppc=1"  Follow that by running update-grub to, as the name implies, update grub, and reboot.

Now, cpufreq-info revealed that the policy could pick speeds from 800MHz to 1.6GHz, and what's more it was picking 1.6GHz.  I left the governor at ondemand since it was successfully raising the speed as needed.
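If you want to check the same numbers without cpufreq-info, the values live in plain text files under the cpufreq directory.  Below is a rough sketch that reads them; the file names are the standard sysfs ones, but the function names are mine, not every machine exposes bios_limit, and "capped" here is simply my shorthand for a BIOS limit below the hardware maximum.

```python
from pathlib import Path

CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")
FREQ_FILES = ("bios_limit", "cpuinfo_max_freq", "scaling_min_freq", "scaling_max_freq")


def read_cpufreq(base=CPUFREQ_DIR):
    """Return the governor and whichever frequency files exist, in MHz."""
    base = Path(base)
    info = {}
    governor = base / "scaling_governor"
    if governor.exists():
        info["governor"] = governor.read_text().strip()
    for name in FREQ_FILES:
        path = base / name
        if path.exists():
            info[name] = int(path.read_text().strip()) / 1000  # kHz -> MHz
    return info


def is_bios_capped(info):
    """True when the BIOS limit is below what the hardware reports as its max."""
    limit = info.get("bios_limit")
    hardware_max = info.get("cpuinfo_max_freq")
    return limit is not None and hardware_max is not None and limit < hardware_max


if __name__ == "__main__":
    info = read_cpufreq()
    for key, value in info.items():
        print(key, value)
    print("BIOS cap active:", is_bios_capped(info))
```

On my laptop before the fix this would have shown a bios_limit of 800 against a hardware maximum of 1600.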

PS
On a page full of technical Linux commands I found the most recent comment good:
"You all sound mega tech savvy so this might be a stupid answer but i fixed mine by blowing down the end of the cable….."

Friday, July 5, 2013

Why the McWrap Is So Important to McDonald's

http://www.businessweek.com/printer/articles/132030-why-the-mcwrap-is-so-important-to-mcdonalds
After lengthy discussions with produce suppliers around the country, Coudreaut managed to add one new ingredient to the McDonald’s arsenal: the English cucumber. That might not seem like a big change, but when the chain added sliced apples to its menu, it immediately became one of the largest buyers of apples in the country. The company had to build up reserves of edamame before it introduced its Asian salad. Coudreaut would like to add guacamole one day. Who knows what that would do to the avocado supply?

For cucumbers, McDonald’s went to Taylor Farms and Fresh Express, a producer and distributor owned by Chiquita. Then it had to figure out how the vegetable could be sliced, evenly, before it reached the restaurants. The chain expects to use about 6 million pounds this year. McDonald’s also tested the size of the chicken breast and the amount of lettuce. Initially, the sandwich was made with a half-breast of chicken and loads of produce. “We talked a lot about the veggies,” says Leslie Truelove, director of marketing in the U.S. “But we went too far. People thought it was a salad.” People wanted more meat. Now the wrap has a full breast of chicken, a handful of shredded lettuce, 10 leaves of spring mix, two cucumber slices, and two tomato slices—no more, no less.

Thursday, June 20, 2013

Dell AC Adapter

I bring you today another chapter in my ongoing saga to squeeze every last minute of life out of my laptop.  About a month ago, my laptop randomly shut off.  Investigation revealed that I had managed to rip open the power cord by getting it caught between a table and the concrete floor.  I noticed that only bending the cable certain ways would cause the short.  So, I did the logical thing and wrapped the cord in duct tape and forgot about it.

Fast forward to last night, and the power indicator tells me I'm running on battery power; the LED on the power brick is off.  I attempted to coax the cable into working, but to no avail.  I shut my laptop off to preserve potentially precious battery and went to sleep.

Attempt 1

This morning I removed the duct tape bandage and began to examine the tear in the cable.  The cable is similar to the stereo cable from my last adventure, except (unbeknownst to me at this point) there is another layer of wire under the first.

I'll explain here that standard DC barrel connectors have a ground and a positive voltage.  The ground is the outside of the plug, and the positive is the inside.  Similarly, in the cable, the ground is the outer wire, and then under an insulator there is the positive wire.  These are both wound around a common axis (ie coaxial).

The outer wire was almost completely broken, so I cut it and attempted to splice it back together.  The inner wire was in much better shape, with only a few broken strands.  I at first attempted to just wrap that up, but ended up cutting it and splicing as well.  I soldered these two up and used a multimeter to search for shorts between ground and positive.  I powered it up and detected 19.5 V between the ground and positive.

However, booting up my laptop with it confirmed my suspicion that there was indeed a third inner wire.  Dell conveniently adds their own third wire to standard DC cables for ~~making more money through proprietary power adapter sales~~ safety.  This third wire was shorted to the positive and I figured whatever it was connected to in the power brick was now destroyed.

The third wire is a data wire.  It sends a signal from the laptop to the adapter to make sure you are using an official Dell adapter.  If not it complains, but still powers the laptop.  However, it does not charge the battery.  There was some debate online about whether the battery would indeed charge, just at a slower rate, but I did not get any charge over the few hours I had it hooked up like this.

Attempt 2

It just so happens that I had procured a second power adapter long ago.  In fact, this adapter was the second one.  The first had failed a few years ago.  It developed a noticeable kink near the power brick, and with that, presumably a short. 

I decided to open it up and have a look.  This video purports to show the adapter being opened.  Note the comments accusing him of faking the opening.  I'll agree that there is no way that was the first time he was opening the brick.  It took me about an hour to get mine open, and I had to use just about every method I read online or could think of.  I started with just a screwdriver, but that did nothing but tear up the plastic.  I put some alcohol, and then acetone in the seam, which also did nothing but begin to dissolve the outer plastic.  I hit it with a hammer on the corners, which along with constant attacks with the screwdriver did begin to weaken the adapter's defenses.  I only stabbed myself with the screwdriver three times before I put on a glove.

While the adapter was weakened, there were some parts that were clearly intent on making a valiant last stand.  I ended up using a clamp to squeeze the bottom piece away from the top piece.  This still required a constant attack with the screwdriver.  When it finally came apart, both parts of the plastic case looked like they had been gnawed on by a honey badger.

Now I pulled back the insulation to reveal the wire below.  While this wire was about half broken it didn't seem like the second layer of insulation had any holes.  It didn't seem like there was any short here.

I tested this with the multimeter and it confirmed the positive was separate from the ground.  However, moving the cable around created a short.  Further investigation revealed that the short was actually way up by the plug.  This was rather annoying as I had hoped to use this better plug.

Attempt 3

I began to rip apart the plug to determine what was wrong with it.  The rubber came off easily with a razor blade.  Once off, it was clear that the insulation between the two wires stopped early and there was a short.  I don't really know how this design could possibly work.  It seems to be just begging for the insulation to wiggle its way down a mm and create a short.

The two wires were shrink tubed and soldered on to opposite sides of the plug.  I had to get at least the ground wire free so that I could wrap the inner wire up to keep them separated.  The problem was the wires were encased in something similar to hot glue, but much more rugged.  I suspect it was just liquid plastic poured in.

It was quite strongly bonded to the shrink tubing.  I tried acetone, which seemed to help, but I still had to chip away with screwdrivers and wire cutters little by little until the wire was free.

I wrapped the inner positive wire up in some electrical tape, and then wrapped the outer wire as well.  I added several layers of duct tape around the whole thing for good measure.

Now I revisited the kink by the power brick.  Here I decided to use a bit of solder to keep the broken half of the wire together.  I did this and applied a generous amount of electrical tape.

I was pretty skeptical that this would produce a working cord.  I did a lot of testing with the multimeter looking for shorts.  Then I left it powered and did a thorough wiggle test.  It passed both of those, so I plugged it into the laptop.  The laptop recognized it as a fellow Dell product and graced me by charging the battery.


Saturday, June 15, 2013

The 156 Best Episodes of Star Trek

As promised I've gone through the data from IMDb for all the Star Trek series, and found the best episodes across all the series.  I took the top 25% of episodes in each series and combined them.  Then I cut it off to only those episodes that had a rating of 8 or above, which only cut off a dozen or so episodes.  Here are the episodes, sorted by series and then air date.  I'd add you should probably watch any two-part episodes, even if they aren't listed, because they are usually both good and important to the plot.  In DS9 the ending arc begins with S7E16, and even though some didn't make the cut, skipping any of those would be madness.
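The selection described above boils down to a few lines.  This is only a sketch of the idea, not the exact script I ran: the dictionary layout, the function name, and rounding the 25% cutoff up with ceil are all assumptions, and it sorts by the SxxEyy code as a stand-in for air date.

```python
import math
from collections import defaultdict

SERIES_ORDER = ["TOS", "TNG", "DS9", "VOY", "ENT"]


def best_episodes(episodes, top_fraction=0.25, min_rating=8.0):
    """Keep the top fraction of each series, then drop anything under min_rating.

    episodes is a list of dicts with "series", "episode" (e.g. "S01E14"),
    "rating" and "title" keys.
    """
    by_series = defaultdict(list)
    for ep in episodes:
        by_series[ep["series"]].append(ep)

    picked = []
    for eps in by_series.values():
        keep = math.ceil(len(eps) * top_fraction)  # assumed rounding rule
        ranked = sorted(eps, key=lambda e: e["rating"], reverse=True)
        picked.extend(e for e in ranked[:keep] if e["rating"] >= min_rating)

    # Sort by series, then by episode code as a proxy for air date.
    return sorted(picked, key=lambda e: (SERIES_ORDER.index(e["series"]), e["episode"]))
```

Because the 25% cut is applied per series before the rating floor, a weaker series still gets its best quarter considered, and the floor of 8 then trims whatever in that quarter isn't actually good.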

I've previously written some overviews of each series, so maybe couple that with this.

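| Series | Episode | Rating | Num Ratings | Title |
| --- | --- | --- | --- | --- |
| TOS | S01E10 | 8.1 | 877 | The Corbomite Maneuver |
| TOS | S01E11 | 8.4 | 854 | The Menagerie: Part I |
| TOS | S01E12 | 8.3 | 806 | The Menagerie: Part II |
| TOS | S01E14 | 8.6 | 1042 | Balance of Terror |
| TOS | S01E18 | 8 | 805 | Arena |
| TOS | S01E22 | 8.6 | 1083 | Space Seed |
| TOS | S01E23 | 8 | 739 | A Taste of Armageddon |
| TOS | S01E25 | 8.3 | 787 | The Devil in the Dark |
| TOS | S01E26 | 8.1 | 717 | Errand of Mercy |
| TOS | S01E28 | 8.7 | 1326 | The City on the Edge of Forever |
| TOS | S02E01 | 8.6 | 871 | Amok Time |
| TOS | S02E04 | 8.6 | 981 | Mirror, Mirror |
| TOS | S02E06 | 8.7 | 845 | The Doomsday Machine |
| TOS | S02E10 | 8.5 | 687 | Journey to Babel |
| TOS | S02E15 | 8.6 | 960 | The Trouble with Tribbles |
| TOS | S03E02 | 8.4 | 665 | The Enterprise Incident |
| TOS | S03E23 | 8.1 | 578 | All Our Yesterdays |
| TNG | S01E24 | 8.1 | 669 | Conspiracy |
| TNG | S02E03 | 8.1 | 697 | Elementary, Dear Data |
| TNG | S02E08 | 8 | 607 | A Matter of Honor |
| TNG | S02E09 | 8.6 | 882 | The Measure of a Man |
| TNG | S02E16 | 8.5 | 834 | Q Who? |
| TNG | S03E10 | 8.2 | 638 | The Defector |
| TNG | S03E13 | 8.4 | 678 | Déjà Q |
| TNG | S03E15 | 8.5 | 1019 | Yesterday's Enterprise |
| TNG | S03E16 | 8.1 | 685 | The Offspring |
| TNG | S03E17 | 8.1 | 571 | Sins of the Father |
| TNG | S03E23 | 8.1 | 585 | Sarek |
| TNG | S03E26 | 8.6 | 1074 | The Best of Both Worlds: Part 1 |
| TNG | S04E01 | 8.6 | 958 | The Best of Both Worlds: Part 2 |
| TNG | S04E07 | 8.3 | 570 | Reunion |
| TNG | S04E11 | 8.1 | 583 | Data's Day |
| TNG | S04E12 | 8 | 545 | The Wounded |
| TNG | S04E14 | 8.2 | 607 | Clues |
| TNG | S04E21 | 8.1 | 612 | The Drumhead |
| TNG | S04E26 | 8.3 | 550 | Redemption |
| TNG | S05E01 | 8.2 | 545 | Redemption II |
| TNG | S05E02 | 8.4 | 780 | Darmok |
| TNG | S05E07 | 8.3 | 562 | Unification I |
| TNG | S05E08 | 8.3 | 549 | Unification II |
| TNG | S05E14 | 8.3 | 563 | Conundrum |
| TNG | S05E18 | 8.7 | 771 | Cause and Effect |
| TNG | S05E23 | 8.5 | 662 | I Borg |
| TNG | S05E24 | 8.1 | 533 | The Next Phase |
| TNG | S05E25 | 8.5 | 1462 | The Inner Light |
| TNG | S05E26 | 8.2 | 575 | Time's Arrow: Part 1 |
| TNG | S06E01 | 8.2 | 587 | Time's Arrow: Part 2 |
| TNG | S06E04 | 8.4 | 617 | Relics |
| TNG | S06E10 | 8.2 | 587 | Chain of Command: Part 1 |
| TNG | S06E11 | 8.6 | 611 | Chain of Command: Part 2 |
| TNG | S06E12 | 8.5 | 593 | Ship in a Bottle |
| TNG | S06E15 | 8.7 | 689 | Tapestry |
| TNG | S06E21 | 8.2 | 538 | Frame of Mind |
| TNG | S06E25 | 8.5 | 546 | Timescape |
| TNG | S06E26 | 8.2 | 499 | Descent: Part 1 |
| TNG | S07E11 | 8.6 | 599 | Parallels |
| TNG | S07E12 | 8.3 | 518 | The Pegasus |
| TNG | S07E15 | 8.3 | 577 | Lower Decks |
| TNG | S07E25 | 8.4 | 4923 | All Good Things... |
| DS9 | S01E18 | 8.7 | 505 | Duet |
| DS9 | S02E08 | 8.2 | 337 | Necessary Evil |
| DS9 | S02E14 | 8.2 | 337 | Whispers |
| DS9 | S02E19 | 8 | 323 | Blood Oath |
| DS9 | S02E22 | 8 | 320 | The Wire |
| DS9 | S02E23 | 8 | 314 | Crossover |
| DS9 | S02E26 | 8.5 | 326 | The Jem'Hadar |
| DS9 | S03E01 | 8.2 | 323 | The Search: Part 1 |
| DS9 | S03E02 | 8 | 317 | The Search: Part 2 |
| DS9 | S03E09 | 8 | 314 | Defiant |
| DS9 | S03E20 | 8.5 | 337 | Improbable Cause |
| DS9 | S03E21 | 8.6 | 346 | The Die Is Cast |
| DS9 | S03E26 | 8.2 | 299 | The Adversary |
| DS9 | S04E01 | 8.6 | 429 | The Way of the Warrior |
| DS9 | S04E02 | 8.7 | 653 | The Visitor |
| DS9 | S04E07 | 8.3 | 378 | Little Green Men |
| DS9 | S04E10 | 8.1 | 302 | Homefront |
| DS9 | S04E22 | 8 | 285 | To the Death |
| DS9 | S04E25 | 8.1 | 286 | Broken Link |
| DS9 | S05E01 | 8.1 | 293 | Apocalypse Rising |
| DS9 | S05E02 | 8 | 277 | The Ship |
| DS9 | S05E06 | 8.4 | 621 | Trials and Tribble-ations |
| DS9 | S05E14 | 8.5 | 312 | In Purgatory's Shadow |
| DS9 | S05E15 | 8.5 | 310 | By Inferno's Light |
| DS9 | S05E26 | 8.6 | 337 | Call to Arms |
| DS9 | S06E01 | 8.4 | 300 | A Time to Stand |
| DS9 | S06E02 | 8.6 | 306 | Rocks and Shoals |
| DS9 | S06E05 | 8.4 | 288 | Favor the Bold |
| DS9 | S06E06 | 8.6 | 341 | Sacrifice of Angels |
| DS9 | S06E13 | 8.5 | 541 | Far Beyond the Stars |
| DS9 | S06E18 | 8.1 | 283 | Inquisition |
| DS9 | S06E19 | 8.3 | 657 | In the Pale Moonlight |
| DS9 | S06E26 | 8.1 | 268 | Tears of the Prophets |
| DS9 | S07E06 | 8.1 | 248 | Treachery, Faith, and the Great River |
| DS9 | S07E08 | 8.4 | 334 | The Siege of AR-558 |
| DS9 | S07E16 | 8.1 | 276 | Inter Arma Enim Silent Leges |
| DS9 | S07E20 | 8.4 | 245 | The Changing Face of Evil |
| DS9 | S07E21 | 8.1 | 234 | When It Rains... |
| DS9 | S07E22 | 8.5 | 249 | Tacking Into the Wind |
| DS9 | S07E24 | 8.2 | 257 | The Dogs of War |
| DS9 | S07E25 | 8.4 | 438 | What You Leave Behind |
| VOY | S01E06 | 8.1 | 350 | Eye of the Needle |
| VOY | S02E18 | 8.2 | 330 | Death Wish |
| VOY | S02E21 | 8.1 | 299 | Deadlock |
| VOY | S03E08 | 8.2 | 340 | Future's End: Part 1 |
| VOY | S03E09 | 8.1 | 311 | Future's End: Part 2 |
| VOY | S03E21 | 8 | 264 | Before and After |
| VOY | S03E23 | 8.3 | 310 | Distant Origin |
| VOY | S03E25 | 8 | 260 | Worst Case Scenario |
| VOY | S03E26 | 8.3 | 397 | Scorpion: Part 1 |
| VOY | S04E01 | 8.6 | 370 | Scorpion: Part 2 |
| VOY | S04E08 | 8.5 | 357 | Year of Hell: Part 1 |
| VOY | S04E09 | 8.3 | 328 | Year of Hell: Part 2 |
| VOY | S04E14 | 8.6 | 360 | Message in a Bottle |
| VOY | S04E16 | 8.1 | 246 | Prey |
| VOY | S04E23 | 8.6 | 339 | Living Witness |
| VOY | S04E25 | 8 | 270 | One |
| VOY | S04E26 | 8 | 254 | Hope and Fear |
| VOY | S05E02 | 8.3 | 345 | Drone |
| VOY | S05E06 | 8.5 | 357 | Timeless |
| VOY | S05E11 | 8.1 | 299 | Latent Image |
| VOY | S05E15 | 8.2 | 290 | Dark Frontier: Part 1 |
| VOY | S05E16 | 8.2 | 270 | Dark Frontier: Part 2 |
| VOY | S05E22 | 8.1 | 284 | Someone to Watch Over Me |
| VOY | S05E24 | 8.4 | 292 | Relativity |
| VOY | S05E26 | 8.3 | 279 | Equinox: Part 1 |
| VOY | S06E01 | 8 | 292 | Equinox: Part 2 |
| VOY | S06E04 | 8.3 | 319 | Tinker Tenor Doctor Spy |
| VOY | S06E10 | 8.2 | 283 | Pathfinder |
| VOY | S06E12 | 8.6 | 394 | Blink of an Eye |
| VOY | S06E24 | 8.2 | 256 | Life Line |
| VOY | S06E26 | 8 | 242 | Unimatrix Zero: Part 1 |
| VOY | S07E25 | 8.3 | 540 | Endgame |
| ENT | S01E26 | 8.5 | 236 | Shockwave: Part 1 |
| ENT | S02E01 | 8.5 | 656 | Shockwave: Part 2 |
| ENT | S02E22 | 8.3 | 256 | Cogenitor |
| ENT | S02E23 | 8.4 | 726 | Regeneration |
| ENT | S03E08 | 8.6 | 310 | Twilight |
| ENT | S03E13 | 8.4 | 211 | Proving Ground |
| ENT | S03E18 | 8.6 | 229 | Azati Prime |
| ENT | S03E19 | 8.5 | 203 | Damage |
| ENT | S03E22 | 8.6 | 197 | The Council |
| ENT | S03E23 | 8.7 | 204 | Countdown |
| ENT | S03E24 | 8.7 | 223 | Zero Hour |
| ENT | S04E04 | 8.5 | 218 | Borderland |
| ENT | S04E05 | 8.5 | 208 | Cold Station 12 |
| ENT | S04E06 | 8.6 | 212 | The Augments |
| ENT | S04E07 | 8.7 | 202 | The Forge |
| ENT | S04E08 | 8.6 | 195 | Awakening |
| ENT | S04E09 | 8.6 | 203 | Kir'Shara |
| ENT | S04E12 | 8.7 | 204 | Babel One |
| ENT | S04E13 | 8.6 | 190 | United |
| ENT | S04E14 | 8.6 | 195 | The Aenar |
| ENT | S04E16 | 8.4 | 190 | Divergence |
| ENT | S04E18 | 8.7 | 360 | In a Mirror, Darkly: Part 1 |
| ENT | S04E19 | 8.7 | 327 | In a Mirror, Darkly: Part 2 |
| ENT | S04E21 | 8.4 | 227 | Terra Prime |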