The local classic rock station is 102.9 WMGK. They have a recently played page that is easy to scrape. By contrast, the other classic rock stations I wanted to scrape for comparison either listed only the last few songs or embedded the list in Flash.
I began scraping on June 26th, so running through August 24th gives 60 days' worth of data.
Thanks to my recent post, we know classic rock songs average 4:48 in length. There are 86,400 minutes in 60 days. That would be enough time for 18,000 songs. If we assume only 75% of the time is actually music that's 13,500 songs. During these 60 days WMGK played 14,286 songs.
I won't speculate about what this means about the actual percentage of music on the air. Slight changes in average song length have a big effect on numbers.
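To see how sensitive that estimate is, here is a minimal Python sketch of the arithmetic above. It is only an illustration, not part of the Perl code for this project; the function name `songs_that_fit` and the alternative song lengths of 4.5 and 5.0 minutes are made up for the example.

```python
def songs_that_fit(days, avg_song_minutes, music_fraction=1.0):
    """Estimate how many songs fit into `days` of airtime."""
    total_minutes = days * 24 * 60                  # 60 days -> 86,400 minutes
    music_minutes = total_minutes * music_fraction  # share of airtime that is music
    return round(music_minutes / avg_song_minutes)


if __name__ == "__main__":
    # 4:48 is 4.8 minutes; 4.5 and 5.0 are assumed values, for comparison only.
    for avg in (4.5, 4.8, 5.0):
        print(avg, songs_that_fit(60, avg), songs_that_fit(60, avg, 0.75))
```

Moving the average from 4:48 to 5:00 drops the all-music estimate from 18,000 to 17,280 songs, which is why I'm wary of reading too much into the gap.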
So, now it is time for the interesting questions. How many of those were unique songs? How many songs would be needed to represent 25% or 75% coverage?
In case it isn't clear what I mean by coverage, I mean how many songs represent 10% (or whatever) of the total songs played. For example, if a station played 1 song 10 times, and then 10 other songs 1 time each, for a total of 20 plays, that 1 top song would give 50% coverage.
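For the code-minded, coverage can be sketched like this in Python. This is a rough illustration rather than the script I actually used, and `songs_for_coverage` is a name invented for the example.

```python
def songs_for_coverage(play_counts, fraction):
    """Return how many of the most-played songs account for `fraction` of all plays."""
    total = sum(play_counts)
    running = 0
    # Walk the songs from most played to least played, accumulating plays.
    for n_songs, plays in enumerate(sorted(play_counts, reverse=True), start=1):
        running += plays
        if running >= fraction * total:
            return n_songs
    return len(play_counts)


if __name__ == "__main__":
    # The toy station from the text: one song played 10 times, ten songs played once.
    toy_station = [10] + [1] * 10
    print(songs_for_coverage(toy_station, 0.50))  # prints 1
```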
So, without further ado here are the stats:
924 unique songs out of 14,286 plays means about 6.5% of plays were a song's first appearance in the 60 days. Honestly, that's not bad. However, the 50% and 75% coverages are awful. I have 30k unique songs in my playlist, and that's not even particularly impressive. Admittedly, they aren't all one genre, but Led Zeppelin has 86 studio songs, all of which would be suitable for a classic rock station.
The key takeaway is that there are about 300 to 350 songs that get played at least every other day. Then, they occasionally toss in a deep cut. There is no middle ground; either a song is played every other day, or once every few months. I made some (read: over 100) graphs that illustrate this, but for now let's stick to tables.
Want to guess what the most popular bands or songs are?
|Plays per 30 days|Band|Song|
|---|---|---|
|27.5|Warren Zevon|Werewolves Of London|
|27|Cars|Just What I Needed|
|27|Blue Oyster Cult|Burnin' For You|
|27|Steve Miller Band|Rock 'n Me|
|26.5|Supertramp|The Logical Song|
|26.5|Pink Floyd|Another Brick In The Wall|
|26.5|Electric Light Orchestra|Do Ya|
|26|J. Geils Band|Centerfold|

|Plays per 30 days|Band|
|---|---|
|135.5|Tom Petty And The Heartbreaker|
|135.5|Steve Miller Band|
|135|Electric Light Orchestra|
|105.5|Creedence Clearwater Revival|
I would not have guessed most of those top songs. Note how much higher the top three bands are than the rest; there is a second smaller drop off after Foreigner. Also interesting is that the top band, Tom Petty, has no song among the top songs listed above, while Steve Miller Band and Electric Light Orchestra have one each. I would also have guessed the Beatles would be the top band.
Let's start looking at the graphs.
Here we have two graphs of bands. The first is limited to the top 50 and has labels. The second is all bands, and without labels, just showing the general trend.
These next two graphs really show the tendency to play the same songs. The first shows how many songs were played a given number of times in 60 days. There are three clusters. First, songs that were played once or twice. Then there is a wasteland from 11 to 26 plays in 60 days. After that, there is the main group of songs that are played every other day. That tapers off, and leads to the last group of songs, which are played almost every day. Keep in mind that a group's share of the airtime is its number of songs multiplied by its number of plays. 20 songs each played 35 times is a lot more plays than 200 songs played once.
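To make that last point concrete, here is a small Python sketch that builds the "songs per play count" histogram and then weights each bar by its play count. It is an illustration only, not the Perl that produced the graphs; the function names and the toy song names are invented.

```python
from collections import Counter


def songs_per_play_count(plays_by_song):
    """Map a play count to the number of songs that were played that many times."""
    return Counter(plays_by_song.values())


def total_plays_per_play_count(plays_by_song):
    """Weight each histogram bar by its play count to get total plays per bar."""
    histogram = songs_per_play_count(plays_by_song)
    return {count: count * n_songs for count, n_songs in histogram.items()}


if __name__ == "__main__":
    # Made-up data: 20 heavy-rotation songs and 200 one-off deep cuts.
    plays = {f"heavy-{i}": 35 for i in range(20)}
    plays.update({f"deep-{i}": 1 for i in range(200)})
    print(songs_per_play_count(plays))        # 20 songs at 35 plays, 200 songs at 1 play
    print(total_plays_per_play_count(plays))  # 700 plays versus 200 plays
```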
The last general WMGK graph is this one that shows the average daily plays in a given hour of the day. It shows there is a drop off in the morning hours from 5am - 8am. The night hours of 12am - 4am are the highest. It's interesting that there is a clear rise at midnight. I don't think WMGK has a morning show, so I'm not sure why there is the drop off. At first I thought they increased ads during peak commuting hours, but there is no drop off in the evening. My guess is they must provide short news and traffic conditions during those hours.
I made that graph so that I could compare individual bands to see if different bands were being played more at certain times (e.g., longer songs being relegated to the middle of the night). Unfortunately, I don't have enough data to really answer this. A typical band might have only a few plays in a given hour over the entire 60-day period. I suppose I could have widened the buckets to 6-hour windows, but I was too lazy to code this.
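For what it's worth, the wider buckets would not take much code. Here is a hedged Python sketch of the idea, assuming each play is available as a `datetime`; `plays_per_window` is a hypothetical helper, not something in my scripts, and the timestamps are made up.

```python
from collections import Counter
from datetime import datetime


def plays_per_window(play_times, window_hours=6):
    """Count plays per time-of-day window, keyed by the window's starting hour."""
    if 24 % window_hours != 0:
        raise ValueError("window_hours must divide 24 evenly")
    counts = Counter()
    for played_at in play_times:
        # Integer division snaps the hour down to the start of its window.
        window_start = (played_at.hour // window_hours) * window_hours
        counts[window_start] += 1
    return counts


if __name__ == "__main__":
    # Made-up timestamps for one band.
    sample = [
        datetime(2013, 7, 1, 0, 15),
        datetime(2013, 7, 1, 5, 59),
        datetime(2013, 7, 2, 6, 0),
        datetime(2013, 7, 3, 23, 30),
    ]
    print(plays_per_window(sample))
```

With 6-hour windows the keys are 0, 6, 12, and 18, so a band with a handful of plays per hour would end up with a few dozen per bucket.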
The rest of the graphs compare WMGK plays to last.fm plays for a given band. I'll post a few here. All of the graphs are in this gallery on my site.
I had a hard time with the comparisons. First, there was the fact that some of the titles had subtle differences. This meant I had to hard code the differences in a hash. There is also the problem of songs with multiple versions on last.fm, which tends to change the order slightly. Also, the last.fm API only gives play data from the last 6 months. For most classic rock this doesn't matter, but David Bowie, for example, had a recent album, and thus his graphs are hugely skewed towards it.
Then I couldn't decide which songs to show. I ended up making two graphs for each band. One shows all the WMGK plays and the other shows the top 20 last.fm songs. Each graph also shows the other source's plays, but the two are sorted differently. I think the last.fm graph is more interesting, as it shows which popular songs WMGK doesn't play. In their defense, some songs simply aren't classic rock. On the other hand, the WMGK sorted graph shows the sharp drop off when you go from the frequently played songs down to the deep cuts (that aren't that deep).
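The title matching looks roughly like the following Python sketch: normalize both titles, then fall back to a hand-maintained lookup for the stubborn cases. This is an approximation of the idea, not my actual Perl. The function names are invented, and the single override entry is a hypothetical example, not a confirmed last.fm spelling.

```python
import re

# Hypothetical override: normalized station title -> normalized last.fm title.
TITLE_OVERRIDES = {
    "another brick in the wall": "another brick in the wall part 2",
}


def canonical(title):
    """Lowercase a title, drop apostrophes, and collapse other punctuation to spaces."""
    lowered = title.lower().replace("'", "")
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", lowered)
    return " ".join(cleaned.split())


def match_key(title, overrides=TITLE_OVERRIDES):
    """Return the key used to line up a station title with a last.fm title."""
    key = canonical(title)
    return overrides.get(key, key)


if __name__ == "__main__":
    print(match_key("Burnin' For You"))            # burnin for you
    print(match_key("Another Brick In The Wall"))  # another brick in the wall part 2
```

Normalizing first keeps the hard-coded table small, since it only has to cover differences that punctuation and case can't explain.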
For example, here are the two Billy Joel graphs:
A very different Genesis:
Led Zeppelin, showing the drop off:
A not-that-surprising Pink Floyd. Except, why does WMGK love Young Lust?
Not big Rush fans:
WMGK doesn't seem to know where the Styx CD gets good:
A crime against humanity:
Finally, I used this as an excuse to learn some Git and GitHub. All the code, text files, and graphs are available there. Here are some files of interest:
Text file of all plays
Text file of all songs, with play counts
Text file of all bands, with play counts
Perl script to scrape WMGK site
Perl script which provides general stats and graphs for all WMGK
Perl script which provides specific stats from one band passed to it on command line