Wednesday, February 9, 2011

Charts: The Movie

Some time ago, I used to go to Newegg once a month and find the price of flash in the various sizes, then figure out the price per gig for each.  I was paid by the government to do this (this sentence is technically not a lie).  After the grant money ran out I knew I couldn't keep up the literally minutes per month it took to maintain this spreadsheet.

After a while, I decided to revive this practice but automate it this time.  I wrote a Perl script to scrape Newegg and find the prices.  It runs every day and keeps the full HTML pages, so the data is much more complete.  As I'm sure most people know, you can get these daily flash price updates on my site.  While I keep the full record of data on my computer, the only data available on my site is the current price of flash.  For a while, I've been planning to come up with some way to present the historical data in a form other than a .csv file.
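For the curious, the price-per-gig step looks roughly like the sketch below.  It's in Python rather than the Perl I actually use, the listings are made up, and taking the cheapest listing for each size is an assumption for illustration, as are the function names and the row layout.

```python
from datetime import date

def price_per_gb(listings):
    """Map each capacity (GB) to the lowest price per GB seen for it.

    `listings` is a list of (capacity_gb, price_usd) pairs. Keeping only
    the cheapest listing per size is an assumption made for this sketch.
    """
    best = {}
    for capacity_gb, price_usd in listings:
        per_gb = price_usd / capacity_gb
        if capacity_gb not in best or per_gb < best[capacity_gb]:
            best[capacity_gb] = per_gb
    return best

def csv_row(day, per_gb, sizes):
    """Build one history row: ISO date, then $/GB per size (blank if missing)."""
    return [day.isoformat()] + [
        f"{per_gb[s]:.2f}" if s in per_gb else "" for s in sizes
    ]

# Made-up listings: (capacity in GB, price in USD)
listings = [(4, 9.00), (8, 15.99), (8, 13.60), (16, 28.00)]
per_gb = price_per_gb(listings)
row = csv_row(date(2011, 2, 9), per_gb, [4, 8, 16, 32])
print(",".join(row))
```

Appending one row like this per day is what produces a .csv history that a plotting program can read later.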

In not-entirely-unrelated news, I downloaded a program called gnuplot.  I was tired of how annoying LibreOffice's chart generation is to use.  Gnuplot is a pretty straightforward plotting program.  It's programming-based, command-line, and open source.  I suppose it's sort of like a Matlab/Octave dedicated to charts.  As with most serious programs, the learning curve was a bit steep, but not too bad if you are just looking to do simple stuff (I downloaded the program for the first time less than 24 hours ago).

To tie these two storylines together, I plotted the Newegg data using gnuplot.  With this new command-line plotting ability I could generate a daily chart of the flash prices, as I had been wanting to do for a while.  Well, fast forward a few hours and a few new Perl scripts, and three charts are now generated every morning when the Newegg script runs.  There's a total history chart starting in Feb 2009, a past year chart, and a past 3 months chart.  I played around with the possible charting options for a while before I settled on those as the most readable.  They are all capped at $5/GB so the expensive outliers don't squash the relevant part of the chart.
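If you want to try something similar, here is a rough sketch of how the three charts could be driven from a script.  It's Python that writes out gnuplot commands, not my actual scripts.  The file names, the CSV layout (date in column 1, one $/GB column per size), the exact start day in Feb 2009, and the 365-day and 90-day windows are all assumptions.

```python
from datetime import date, timedelta

def gnuplot_script(csv_path, png_path, start, end, sizes, cap=5):
    """Return gnuplot commands that plot $/GB over time for each size.

    Assumes a CSV with an ISO date in column 1 and one price-per-GB
    column per entry in `sizes`, starting at column 2 (hypothetical layout).
    """
    lines = [
        "set terminal png size 800,600",
        f'set output "{png_path}"',
        'set datafile separator ","',
        "set xdata time",                 # treat the x column as dates
        'set timefmt "%Y-%m-%d"',         # how dates look in the file
        'set format x "%b %y"',           # how dates look on the axis
        f'set xrange ["{start.isoformat()}":"{end.isoformat()}"]',
        f"set yrange [0:{cap}]",          # cap the chart at $5/GB
        'set ylabel "$/GB"',
    ]
    series = [
        f'"{csv_path}" using 1:{col} with lines title "{size}GB"'
        for col, size in enumerate(sizes, start=2)
    ]
    lines.append("plot " + ", ".join(series))
    return "\n".join(lines) + "\n"

today = date(2011, 2, 9)
windows = {
    "all": date(2009, 2, 1),              # exact start day is a guess
    "year": today - timedelta(days=365),
    "3mo": today - timedelta(days=90),
}
scripts = {
    name: gnuplot_script("flash.csv", f"flash_{name}.png", start, today, [4, 8, 16])
    for name, start in windows.items()
}
print(scripts["year"])
```

Each generated script can be saved to a file and handed to gnuplot on the command line, which is the part that makes it easy to run from a daily job.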

The hodgepodge of scripts that makes this happen every day is getting pretty good.  That said, I spent exactly 0 seconds debugging the new scripts.  Also, I may make changes as I see how it goes.  In other words, don't be surprised if it doesn't work.
