Thursday, December 17, 2020

Gaia’s stellar motion for the next 1.6 million years

The stars are constantly moving across the sky. Known as proper motion, this motion is imperceptible to the unaided eye but is being measured with increasing precision by Gaia. This animation shows the proper motions of 40 000 stars, all located within 100 parsecs (326 light years) of the Solar System. The animation begins with the stars in their current positions; the brightness of each dot representing the brightness of the star it represents.

As the animation begins, the trails grow, showing how the stars will change position over the next 80,000 years. Short trails indicate that the star is moving more slowly across the sky, whereas long trails indicate faster motion. To avoid the animation becoming too difficult to interpret, the oldest parts of the trails are erased to only show the newer parts of the stellar motions into the future.

Sometimes it appears as if a star is accelerating (as indicated by a longer trail). This is due to the star getting closer to us. Proper motion is a measure of angular velocity, which means that close-by stars appear to move more quicker across the sky even when their speed is the same as that of other, more distant stars.

Towards the end of the animation, the stars appear to congregate on the right side of the image, leaving the left side emptier. This is an artefact and is caused by the average motion of the Solar System with respect to the surrounding stars.

The animation ends by showing star trails for 400 thousand years into the future.

Saturday, December 12, 2020

Cameras and Lenses

Over the course of this article we’ll build a simple camera from first principles. Our first steps will be very modest – we’ll simply try to take any picture. To do that we need to have a sensor capable of detecting and measuring light that shines onto it.

Saturday, December 5, 2020

Your Smart TV is probably ignoring your PiHole

Smart devices manufacturers often “hard-code” in a public DNS server, like Google’s, and their devices ignore whatever DNS server is assigned by your router - such as your PiHole.

Nearly 70% of smart TVs and 46% of game consoles were found to contain hardcoded DNS settings - allowing them to simply ignore your local network’s DNS server entirely. On average, Smart TVs generate an average of 60 megabytes of outgoing Internet traffic per day, all the while bypassing tools like PiHole.

Thursday, December 3, 2020

AI Generated Music

We started with the original SampleRNN research code in theano. It's a hierarchical LSTM network. LSTMs can be trained to generate sequences. Sequences of whatever. Could be text. Could be weather. We train it on the raw acoustic waveforms of metal albums. As it listens, it tries to guess the next fraction of a millisecond. It plays this game millions of times over a few days. After training, we ask it to come up with its own music, similar to how a weather forecast machine can be asked to invent centuries of seemingly plausible weather patterns.

It hallucinates 10 hours of music this way. That's way too much. So we built another tool to explore and curate it. We find the bits we like and arrange them into an album for human consumption.

It's a challenge to train nets. There's all these hyperparameters to try. How big is it? What's the learning rate? How many tiers of the hierarchy? Which gradient descent optimizer? How does it sample from the distribution? If you get it wrong, it sounds like white noise, silence, or barely anything. It's like brewing beer. How much yeast? How much sugar? You set the parameters early on, and you don't know if it's going to taste good until way later.

We trained 100s of nets until we found good hyperparameters and we published it for the world to use.

Monday, November 30, 2020

DeepMind Solved Protein Folding

We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment.


You're welcome

Thursday, October 8, 2020

Reverse engineering my cable modem and turning it into an SDR

This is the type of nerdy hacking that makes me jealous.

After removing a few screws from the plastic housing to get access to the board, my first thought was to look for UART headers to take a peek at the serial console. After identifying two candidates consisting of four vias surrounded by a rectangle near the edge of the PCB, it was time to identify the pins. Using a multimeter, the ground pin can be easily identified by checking the continuity with one of the metal shields on board. The VCC pin can be identified by measuring the voltage of each pin when powering on the board. It should be a steady 3.3v, or in some cases 1.8v or 5v. This pin is not needed, but is still useful to identify the operating voltage and eliminate one candidate for the Tx and Rx pins. While booting, the Tx pin will sit on average a little lower than the VCC pin and drop much lower when a lot of data is being output. This leaves the last pin as Rx.

Tuesday, October 6, 2020

The economics of vending machines

It is estimated that roughly ⅓ of the world’s ~15m vending machines are located in the US.

Of these 5m US-based vending machines, ~2m are currently in operation, collectively bringing in $7.4B in annual revenue for those who own them. This means that the average American adult spends ~$35 per year on vending machine items.

What makes the vending industry truly unique is its stratification: The landscape is composed of thousands of small-time independent operators — and no single entity owns >5% of the market.


Thursday, October 1, 2020

Test if your email is letting the sender know when you view an email

There are a ton of ways companies can track if you view an email.  This site tests which of these methods work even if you are blocking images for example:

You have to click the link in the first email, then click "test this email" for a second email that actually runs the test by the way.  I was confused at first why it wasn't doing anything.

Tuesday, September 29, 2020

Wednesday, August 26, 2020

Walk with me though the hilariously inconsistent on-screen titles of Star Trek's two-part episodes.

 I couldn't resist the pedantry of this post.

"The Best of Both Worlds"
"The Best of Both Worlds" Part II
Okay, here we go. This is TNG's first actual two-parter. Note now the "Part II" is placed outside the quotes, adopting the style from TOS before it. The difference, other than dropping the "Part I" from part one, is that we’re not using ALL CAPS anymore, so we learn that “Part” is meant to be rendered in title case, with the “P” capitalized. A boring fact that you'll soon learn is the only constant in the universe.

"Redemption II"
Okay, another season-ending cliffhanger resolved! But... now we're just naming them like heavy metal albums, I guess. The only actual established rule for TNG so far is that "part one" does not get a roman numeral…


Thursday, July 9, 2020

A Graphical Analysis of Women's Tops Sold on Goodwill's Website

I set up a script that collected information on listings for more than four million women's shirts for sale through Goodwill's website, going back to mid-2014. The information is deeply flawed—a Goodwill online auction is very different from a Goodwill store—but we can get an idea of how thrift store offerings have changed through the years. There's more info on data collection method below.

Wednesday, July 1, 2020

Using AWS S3 Glacier Deep Archive For Personal Backups

I've been using AWS S3 for personal backups, and it's working well.  The hardest part of doing anything in AWS is that you have no idea what it will cost until you actually do it; they are masters of nickle and dime charging.  With that in mind, I wanted to wait until I had a few months of solid data before reporting on how it's been working for me.

If you know me, this may surprise you, but my backup strategy is a bit complex.  However, the relevant part for this post is that my documents folder is about 16 GB and I'm keeping a full backup of that, with daily diffs, for about $0.02 a month.


I did a post estimating the costs last year, and the result has lined up with that.

Here is the relevant part of my AWS bill for May 2020 (June looks to be the same, but isn't complete yet):

There are also some regular S3 line items, since I believe the file list is stored there even when the files are in Deep Archive.  However, I'm far below the cost thresholds there.


I have a local documents folder on my SSD, that gets backed up to a network version nightly via an rsync script.  Folders that are no longer being updated (eg, my school folder) I will delete from my local version and just keep on the network version.

Every month I create a full zip of my local documents folder and upload to S3.  Then every day I create a zip of just the files that have changed in the last 40 days.  I chose 40 days to to provide some overlap.  You could be more clever and just get files that changed since the first of the month, but I wanted to keep the process simple due to how important it is.  I also do a yearly backup of the full network version of this folder, which has a lot of stuff that hasn't changed in years in it.

The result is that I could do a full recovery by pulling the most recent monthly backup and then the most recent daily backup, and replacing the files in the monthly with the newer versions from the daily.  I'd also have to pull the most recent yearly, and extract that to a separate location.

This feels like a pretty simple recovery, all things considered.


The full backup:

And the diff backup:

If you want to adapt these scripts it should be pretty straightforward.  You'll have to have 7zip installed and have the command line aws client set up.  Create a nice long random password and store it in the password file.  Make sure you have a system for retrieving that password if you lose everything.

There's a feature to warn if the compressed file is larger than expected, since that will cost money.  The numbers are arbitrary, and work for me, you'd have to adjust them.  Also if you want to get the emailed warnings you'll have to set up mail and change the email address.

If you do want to use S3 Deep Archive for backups I really recommend reading my previous post, because there are a lot of caveats.  I highly encourage you to combine your files into a single file, because that will reduce the per file costs dramatically.

Also, note there is nothing here to delete these backups.  If all you care about is being able to restore the current version, then you can delete any but the newest version.  Keeping them all gives you the ability to restore at any point in time.  If you do delete them, keep in mind there is a limit to how fast you can delete things on Deep Archive.


I realize there are easier, free-er, and arguable better solutions out there for personal backups.  That's it, I don't have a 'but,'.  If you're reading this blog, this should not be a surprise.  Now that I have real data, I'm thinking about backing up some of my harder to find media here too.  I estimate 1 TB should cost about $12 per year in any of the cheapest regions.

Saturday, April 4, 2020

Stateless Password Managers

An idea I've had for a while is a password generator where you take a master password, an optional per site password, and the site domain name, combine and hash them to get a unique password for any site.

This system has a unique benefit over traditional password managers in that you can't lose your passwords.  Even if all your electronics were destroyed and you woke up naked in China tomorrow you could get your passwords just by using an online version of the tool (or failing that, manually doing the steps yourself with a hash generator).

However, the system has a unique drawback of not remembering what the password requirements are.  Some sites require special characters, some don't allow them, some require more than 10 characters, some allow for a max of 8.  It would be easy to translate your hash into whatever set of requirements you have, but you still need to either remember that, or store it somewhere else.

Today I discovered this idea has been implemented, a lot.  It's called a stateless password manager, or a deterministic password manager.  Two examples are:

And here is an article discussing the flaws in this system:

Tuesday, March 24, 2020

Social Distancing Scoreboard

According to the World Health Organization and the CDC, social distancing is currently the most effective way to slow the spread of COVID-19. We created this interactive Scoreboard, updated daily, to empower organizations to measure and understand the efficacy of social distancing initiatives at the local level.

Sunday, March 15, 2020

How do laser distance measures work?

I recently bought a laser tape measure; it's pretty great.  One button to turn it on, then it gives you instant distance measurements to wherever you point the laser.  There are more expensive ones that do further distances, but the one I got was $30 and goes up to 65 feet.  I compared it to a normal tape measure and it was accurate and repeatable to an eighth of an inch.  I was pretty impressed with it, and it was a great toy to add to my collection of measuring devices.

However, I began to wonder how it worked, especially since it worked so well, and was so cheap.

How laser distance measures don't work

In principle it would be simple.  Light has a very well known speed, so all you have to do is measure how long it takes for the light to go out and reflect back.  Distance = speed x time.  You could encode a binary number in the laser, just a counter incrementing and resetting when it runs out of numbers.  Measure what number is being reflected back and how long ago you sent that number out and you know how long it took to come back.

However, the devil is in the details, and getting that time precise enough to measure an 1/8th of an inch is going to be hard.

An 1/8th of an inch is 3.175 mm.  The speed of light is 299,792,458 m/s.  Or 299,792,458,000 mm/s.  3.175 mm / 299,792,458,000 mm/s = 1.059066002254133e-11 seconds.  Which is about 10.59 picoseconds.  Take the inverse of that and it's 94.42 Gigahertz.  I'm going to go out on a limb and assume that the $30 laser tape measure I have in my pocket doesn't have a 100 GHz clock inside of it.

How do they actually work?

Instead of transmitting a counter, just send an alternating pulse.  It doesn't have to be very fast, a MHz would be enough.  Then your reflected pulse is the same wave, but delayed slightly.  You only care about measuring the difference in time of the leading and falling edges of the two waves, or delta.  This means you can just compare the two waves using an XOR gate, which is just a fancy way of saying "tell me whenever these waves are different".

Here's an example

Where the top red line is the original signal, and the second blue line is the reflected version.  Then the third green line is the XORed delta of the two.

When you measure something slightly further away the reflected wave gets more delayed and the delta version gets a longer pulse.

Are logic gates fast enough? 

Logic gates like these are cheaper and faster than the circuitry you'd need for a timer.  However, they still aren't quite fast enough for the precision we see in these tools.  Luckily though, a delay doesn't really impact the measurement.  As long as it's a consistent delay on both the rising and falling edges of the two waves.

All you end up with is a slightly offset delta signal.

Who will measure the measurer?

It might seem like we're back to square one here, with the need to precisely measure the time of that pulse, but we actually just need take the average of that signal.  There are a variety of ways we can do this, but as a proof of concept, imagine the delta signal is charging a capacitor, which is simultaneously being drained by a constant resistor.  You'd end up with a level of charge in the capacitor which would translate into what percentage of time the delta single is high.

Now, all you have to do is measure the charge in the capacitor and turn that into a measurement you display.  Let's review what we need:
  • Laser transmitter and optical sensor.
  • MHz clock to turn laser on and off.
  • XOR circuit to compare the two transmitted and received signals.
  • A capacitor and resistor circuit to find average of the digital signal.
  • A way to measure the charge in the capacitor.
  • Something to take that measurement and convert it into the distance.
  • A display.
None of this is very expensive.  I'm pretty amazed they can combine them for less than $30, but at that point, you'd be losing money not to buy one.

Saturday, February 29, 2020

Guessing Smart Phone PINs by Monitoring the Accelerometer
In controlled settings, our prediction model can on average classify the PIN entered 43% of the time and pattern 73% of the time within 5 attempts when selecting from a test set of 50 PINs and 50 patterns. In uncontrolled settings, while users are walking, our model can still classify 20% of the PINs and 40% of the patterns within 5 attempts.

Tuesday, December 31, 2019

Predictions for the decade, from 2010

This is a good look back at what people thought the 2010s would bring at the start of them.

Wednesday, October 30, 2019

A comparision of AWS S3 Glacier Deep Archive region pricing

I'm considering using S3 for personal backups.  They recently introduced a new tier of storage called "S3 Glacier Deep Archive" which is intended for storing files that you will likely never, or perhaps once need to read.  Every geographic region AWS offers storage in has its own pricing.  I couldn't find a nice table with all the prices compared so I found the price to store 1 TB for 1 year in each region:

Using their tool:

If you're considering this keep in mind there are some important caveats.  First you pay for each request, which means if you're storing 1,000,000 files you will pay $50 just for the requests.  Doesn't matter if each file is 1 MB, or 1 KB, or even 1 byte each, it's $0.50 per 1000 PUT requests.  You will then also pay storage fees every month on top of that.  As far as I can tell, you don't pay for the bandwidth to upload the files.

Retrieving the files has more caveats.  First you need to pick a speed, standard or bulk.  Standard takes up to 12 hours, and bulk is up to 48 hours.  Standard also costs about 10x as much as bulk.  And here you pay for the individual requests, the data retrieved, and (I believe) bandwidth to download from S3.

So if you're storing many smallish files (documents) you're probably much better off combing them all into a single zip file, to reduce the number of requests you have to do.  On the other hand if you're storing large files (videos), you'd probably be better off leaving them on their own so that ideally you just need to recover one or two, and then don't have to pay for the bandwidth to download them all.

I made this table to compare some scenarios.  The first 3 rows shows the costs to retrieve 1 TB split across either 1, 1024, or1048576 file.  The less file scenarios are cheaper, but not by a ton, and keep in mind if you only needed a few of those files it'd be much cheaper to just grab those individual files if they weren't zipped together.

The bottom 2 rows shows the cost to get 1 GB of files, either as 1 file or 1024 files.  Here the cost is negligible, pretty much however you store and access it.

So it seems in any case the bandwidth is the biggest cost.  Still, since you generally only pay for bandwidth out of S3 and not in to it, you should never really have to pay this, unless you're recovering from a pretty major disaster.  There is also the option to use AWS Snowball, where they will mail you a physical drive which you keep for up to 10 days then mail back.  That works out to be $200 + $0.03 per GB vs just $0.09 per GB for bandwidth.  So you need to be transferring 10s of TBs before it makes sense.

Wednesday, August 14, 2019

Build a computer out of NAND gates in stages.  This is essentially a game version of my post about how computers work.

Sunday, July 7, 2019

Social Science Research Network

 I've been into reading random papers from SSRN lately.  There's some really good stuff on there, like the paper I mentioned in my last post.

Sunday, June 30, 2019

The law of small numbers

I was listening to a podcast when I heard about an interesting probability result in the same vein as the Monty Hall Problem.  The new problem is this: Flip a coin 100 times and record the results.  Now pick random flips in the set and see if the next 3 flips are all heads; if so we call this a streak.  Repeat until you find a streak of 3.  Now what is the probability that the 4th flip is also heads?  Is it 50% like we would expect?  It turns out to be closer to 46%, which is not very far from 50%, but is also a clear trend.

You can download the paper here, and I recommend you read through the introduction, which is pretty easy to follow.  I think does a good job of explaining what is going on.  Since no one will do that, here is a table from the paper which helps give some intuition.

This represents every possible outcome from flipping a coin 3 times and looking for a 'streak' of 1 heads.  There are eight total possible outcomes, all equally likely.   In the first two, the streak of 1 heads never happens, or happens on the last flip where there is no following flip to look at.  Those are thrown away and ignored.  In the other six possible outcomes we do get a streak, at least once, and earlier than the last flip.  The underlined flips represent the possible candidates for the flip that is following a streak.  If we pick the preceding streak, then the underlined flips will be the one we are trying to predict.  In three out of the six outcomes with a streak, the following flip will not be heads.  In two out of the six outcomes the following flip will always be heads.  And in the remaining possible outcome it could be either head or tails with 50/50 probability depending on which streak you pick.

If you list out all the possible outcomes from any combination of streak length and total flips, you can see that some number of the heads flips are 'consumed' by the streaks themselves.  Those flips can never be following a streak, because they are part of the streak needed to define the streak.  On the other hand, the tails have no restrictions, they are all available to occur in the flip immediately following a streak.  There are simply more tails available to go in the candidate position.  The effect gets smaller as you decrease the streak length or increase the total number of flips in a set.

I found this very surprising, so I wanted to test it out.  I wrote a Ruby script to simulate various coin flips and look for streaks of different lengths, and output the results.  I then decided to rewrite it in a compiled language so it would be faster.  I decided to try out Go, as I've never used it before and I was hoping for something with a bit more syntactic sugar than C.

Here are the results of a bunch of combinations of streak lengths and numbers of flips from the Go program:
Looking for a streak of length  1 in    10 total flips. Performed 10000 rounds, and   9973 were successful, found 45.29% continued the streak.
Looking for a streak of length  1 in   100 total flips. Performed 10000 rounds, and  10000 were successful, found 49.43% continued the streak.
Looking for a streak of length  1 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 49.91% continued the streak.
Looking for a streak of length  2 in    10 total flips. Performed 10000 rounds, and   8203 were successful, found 38.16% continued the streak.
Looking for a streak of length  2 in   100 total flips. Performed 10000 rounds, and  10000 were successful, found 47.72% continued the streak.
Looking for a streak of length  2 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 50.15% continued the streak.
Looking for a streak of length  3 in    10 total flips. Performed 10000 rounds, and   4797 were successful, found 34.88% continued the streak.
Looking for a streak of length  3 in   100 total flips. Performed 10000 rounds, and   9995 were successful, found 45.84% continued the streak.
Looking for a streak of length  3 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 49.78% continued the streak.
Looking for a streak of length  4 in    10 total flips. Performed 10000 rounds, and   2152 were successful, found 35.83% continued the streak.
Looking for a streak of length  4 in   100 total flips. Performed 10000 rounds, and   9637 were successful, found 40.61% continued the streak.
Looking for a streak of length  4 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 49.21% continued the streak.
Looking for a streak of length  5 in    10 total flips. Performed 10000 rounds, and    985 were successful, found 37.36% continued the streak.
Looking for a streak of length  5 in   100 total flips. Performed 10000 rounds, and   7860 were successful, found 38.66% continued the streak.
Looking for a streak of length  5 in  1000 total flips. Performed 10000 rounds, and  10000 were successful, found 48.91% continued the streak.
Looking for a streak of length  6 in    10 total flips. Performed 10000 rounds, and    388 were successful, found 35.82% continued the streak.
Looking for a streak of length  6 in   100 total flips. Performed 10000 rounds, and   5190 were successful, found 35.24% continued the streak.
Looking for a streak of length  6 in  1000 total flips. Performed 10000 rounds, and   9996 were successful, found 46.68% continued the streak.
Looking for a streak of length  7 in    10 total flips. Performed 10000 rounds, and    140 were successful, found 40.71% continued the streak.
Looking for a streak of length  7 in   100 total flips. Performed 10000 rounds, and   2997 were successful, found 33.83% continued the streak.
Looking for a streak of length  7 in  1000 total flips. Performed 10000 rounds, and   9761 were successful, found 42.40% continued the streak.
Looking for a streak of length  8 in    10 total flips. Performed 10000 rounds, and     52 were successful, found 36.54% continued the streak.
Looking for a streak of length  8 in   100 total flips. Performed 10000 rounds, and   1634 were successful, found 33.60% continued the streak.
Looking for a streak of length  8 in  1000 total flips. Performed 10000 rounds, and   8365 were successful, found 38.27% continued the streak.
Looking for a streak of length  9 in    10 total flips. Performed 10000 rounds, and     17 were successful, found 47.06% continued the streak.
Looking for a streak of length  9 in   100 total flips. Performed 10000 rounds, and    784 were successful, found 33.04% continued the streak.
Looking for a streak of length  9 in  1000 total flips. Performed 10000 rounds, and   6037 were successful, found 35.80% continued the streak.
Looking for a streak of length 10 in    10 total flips. Performed 10000 rounds, and      0 were successful, found NaN% continued the streak.
Looking for a streak of length 10 in   100 total flips. Performed 10000 rounds, and    381 were successful, found 30.71% continued the streak.
Looking for a streak of length 10 in  1000 total flips. Performed 10000 rounds, and   3615 were successful, found 33.91% continued the streak.

Tuesday, April 30, 2019

Should You Time The Market?
You have 2 investment strategies to choose from.
  1. Dollar-cost averaging (DCA):  You invest $100 (inflation-adjusted) every month for all 40 years.
  2. Buy the Dip: You save $100 (inflation-adjusted) each month and only buy when the market is in a dip.  A “dip” is defined as anytime when the market is not at an all-time high.  But, I am going to make this second strategy even better.  Not only will you buy the dip, but I am going to make you omniscient (i.e. “God”) about when you buy.  You will know exactly when the market is at the absolute bottom between any two all-time highs.  This will ensure that when you do buy the dip, it is always at the lowest possible price.

Making a DIY smartwatch

Friday, March 15, 2019

Everything Smarthome

This is a long, but enjoyable article in broken Russian-English about everything smarthome in 2019.

Wednesday, February 27, 2019

Password strength

Dropbox has a password strength estimator called zxcvbn that I like a lot.  It estimates entropy in your password by looking for dictionary or password list leak matches.  It's long bothered me when sites estimate password strength purely based on complexity.  These sites say a password like Password!1 is much more secure than one like zbuwcramudbpvreorkno (a score of 72% vs 21% respectively).  I discuss this in more detail in my How to be secure online post.

However, a while ago Dropbox changed their algorithm to favor length over resistance to dictionary attacks.  There is some logic in their decision, but I really feel like something is lost by not having the old algorithm.  So, I made a demo comparing the two so you can find passwords both algorithms agree are strong.  At the same time, I finally hooked up this domain I bought a while ago to my github pages site.

Thursday, January 31, 2019


Friday, November 16, 2018

Invisibly inserting usernames into text with Zero-Width Characters
Zero-width characters are invisible, ‘non-printing’ characters that are not displayed by the majority of applications. F​or exam​ple, I’ve ins​erted 10 ze​ro-width spa​ces in​to thi​s sentence, c​an you tel​​l? (Hint: paste the sentence into Diff Checker to see the locations of the characters!). These characters can be used to ‘fingerprint’ text for certain users.

Sunday, November 4, 2018

The FBI of the National Park Service
Last August, I traveled to Yosemite National Park to meet up with Shott’s colleague, ISB special agent Jeff Sullivan, an affable, self-deprecating, 35-year veteran of the Park Service. Sullivan has played a role in investigating nearly every major crime and mystery that’s taken place in Yosemite over the past quarter-century, which made him the ideal guide for a tour of the shadowy side of America’s fifth most visited national park. See that grassy expanse, dotted with wildflowers? That’s where park visitors discovered the skull of a still-unidentified young woman, a murder claimed by the prolific serial killer Henry Lee Lucas. That lush meadow? Once, someone found a dead bear there, its head neatly severed from its body. (The ISB sent the bear’s remains to the park’s wildlife lab in Oregon, hoping to discover clues about who’d poached it. The lab called back a few weeks later: The poacher you’re looking for is a mountain lion.) 
Sullivan and I drove up to Glacier Point, where he told me about the rockslide in 1996 that killed one and injured at least 11. The dust cloud it kicked up was so massive it blocked out the sun; until Sullivan arrived on the scene, he’d been sure there would be dozens of casualties. Next to us, a bored teenager flung a water bottle into the abyss. Watching it fall seemed to cause Sullivan physical pain. He leaned in close and flashed his badge at the kid. “Don’t throw water bottles,” he said quietly.

Monday, October 22, 2018

How to set up Raspberry Pis without a keyboard, mouse, or monitor

There are plenty of guides out there about how to set up headless Raspberry Pis, but they get out of date quickly, and I do this often enough that I'm constantly searching for up to date ones.  So for my own benefit here's my documentation of the process.

Download Raspbian Lite.  This is the version without the GUI components.

Put your SD card in your computer and use lsblk to identify which drive your SD card is. Be careful, if you use the wrong drive below you will overwrite your main hard drive.

Use dd to copy the date over.  They constantly recommend you use the program Etcher, but I've never had it work successfully.  The command is sudo dd bs=4M if=2018-10-09-raspbian-stretch-lite.img of=/dev/sde conv=fsync status=progress

Your card should have 2 partitions, open the boot partition and add an empty file called ssh to enable ssh, and create a file called wpa_supllicant.conf to configure wifi.  The contents of the file are this:

ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev


You can either put your actual password in there as the psk, or use the tool wpa_passphrase to convert your password into a hash that will also work.

Put the card in the Pi, boot it up, and it should connect to your network and you should be able to ssh in with username pi and password raspberry.  Note that you need to boot once for it to expand the filesystem.

You should put your public key in ~/.ssh/authorized_keys and turn off password ssh access.  You should also run sudo raspi-config once you ssh in, and update with sudo apt update && sudo apt upgrade

Saturday, October 6, 2018

Blockchain Technology Overview

NIST just published a good overview of blockchain technologies.  Very thorough, yet digestible for non-technical readers.
Blockchains are tamper evident and tamper resistant digital ledgers implemented in a distributed fashion (i.e., without a central repository) and usually without a central authority (i.e., a bank, company, or government). At their basic level, they enable a community of users to record transactions in a shared ledger within that community, such that under normal operation of the blockchain network no transaction can be changed once published. This document provides a high-level technical overview of blockchain technology. The purpose is to help readers understand how blockchain technology works.

GoogleMeetRoulette: Joining random meetings
Let’s see… I generated a meeting, got a Google Meet phone number, and all subsequent phone numbers are also from the same carrier. Let me call and see if I get the Google Meet greeting to confirm. Bingo! For countries like Australia and Spain, Google Meet phone numbers are assigned in batches that are sequential. I can generate a meeting myself and just check the subsequent phone numbers to obtain more Google Meet numbers. You can use them to join/find meetings in the US as the phone numbers from other countries are not specific to meetings in that country, they are global.

10 seconds per call, 3 PINs at a time, 10,000 PINs to try. It would take about 9 hours to cover all PIN combinations making one call at a time. Because Twilio is designed to make calls at scale, we can make hundreds of calls at the same time making the process much faster. The script fires so many calls that the line will be busy sometimes. Not a problem! The script will detect failed calls and simply retry. Actually, Twilio notifies of failed calls immediately using webhooks making the script very efficient handling calls that did not go through.  
I did some benchmarks and on average it takes 25 minutes to try all 10k PINs and find 15 different valid PINs for 15 different meetings for a cost of $16. Not bad!

Saturday, September 29, 2018

Man in the browser attack

Recently I heard of the man in the browser attack and thought it was interesting.  This is malware that is installed in your browser (as an extension for example), and silently waits for you to do a bank transfer.  When you do it can simple change the to account and routing numbers you submit to that of the attacker.  Everything looks fine to you, and wire transfers already take days to process.  Things like strong passwords and 2 factor authentication won't help since you are logging into your real bank's website.

Monday, August 27, 2018

wideNES - Peeking Past the Edge of NES Games
At the end of each frame, the CPU updates the PPU on what has changed. This involves setting new sprite positions, new level data, and —crucially for wideNES— new viewport offsets. Since wideNES runs in an emulator, it’s really easy to track the values written to the PPUSCROLL register, which means it’s incredibly easy to calculate how much of the screen has scrolled between any two frames!

Hmm, what would happen if instead of painting each new frame directly over the old frame, new frames are instead painted overlapping the previous frame, but offset by the current screen scroll? Well, over time, more and more of the level would be left on-screen, gradually building up a complete picture of the level!

Friday, August 24, 2018

How I recorded user behaviour on my competitor’s websites
I spoofed the back button in Chrome and sent people to my version of search results and competitor websites where I recorded everything with Lucky Orange.