Saturday, February 28, 2026

You should use the full alphabet when generating random codes

This is going to be a post with a pretty limited audience, even by this blog's standards.  I'll try to keep this post short so let me get to the point I want to make: If you're generating a random code to represent something (like a94a8fe5), rather than just using hexadecimal you should do the work to convert it to the full alphanumeric space (a-z and 0-9) (e.g., 3z4xlz).

As for why, you may notice that the second code (3z4xlz) is shorter than the first (a94a8fe5).  Despite this, the chance that you randomly generate two identical codes by accident (a 'collision') is about the same for both (technically it's twice as likely for the 6 character one vs the 8 character hex one, but we're looking at orders of magnitude of probability here, and you could change it to 7 characters and the chances would be far lower than the 8 character hex one).
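To put rough numbers on that comparison, here is a small Python sketch that counts how many distinct codes each format can represent.  The function name keyspace is just something I made up for illustration.

```python
import math

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct codes of the given length over the alphabet."""
    return alphabet_size ** length

hex8 = keyspace(16, 8)    # 8 hexadecimal characters
b36_6 = keyspace(36, 6)   # 6 characters from a-z and 0-9
b36_7 = keyspace(36, 7)   # 7 characters from a-z and 0-9

# For a fixed number of codes, collision odds scale roughly with 1 / keyspace.
print(hex8 / b36_6)   # a little under 2
print(b36_7 / hex8)   # a little over 18

# Each hex character carries 4 bits; each a-z/0-9 character carries about 5.17.
print(math.log2(16), math.log2(36))
```

So the 6 character alphanumeric code has roughly half the space of the 8 character hex code, and the 7 character one has roughly 18 times the space.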

If you're not a software developer, this might seem like a weird point to be making, but hexadecimal is very common in programming, and as a result it's very easy to output.  When a programmer needs a random code, they will often just generate a bunch of hexadecimal characters and truncate it to however many they need.  It takes extra work to convert them to the full alphanumeric space.  It's not a lot of effort, but most don't bother to do it; most likely because they don't think to do it.
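As a minimal sketch of what that extra work can look like in Python (the names ALPHABET, random_code, and to_base36 are my own, not from any library), you can either pick characters straight from the 36 character alphabet, or re-encode a number you already have:

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_lowercase  # 0-9 then a-z, 36 characters

def random_code(length: int) -> str:
    """Pick each character independently and uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def to_base36(value: int) -> str:
    """Encode a non-negative integer using ALPHABET as base-36 digits."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value:
        value, remainder = divmod(value, 36)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))

print(random_code(6))                   # a fresh 6 character code
print(to_base36(int("a94a8fe5", 16)))   # re-encode an existing hex value
```

Picking characters directly is usually the simpler route.  If you re-encode a fixed width hex value instead, the leading base-36 character is not uniformly distributed, so truncating the result does not give you evenly random codes.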

You may now be asking, "Well why not use upper and lower case letters and numbers, won't that be even better?" and you'd be right.  However, I often don't actually use that range, for three reasons.  First, by having both upper and lower case letters you increase the chances of confusion if a person ever has to read the codes.  Second, it's not as big of an improvement as you may think (if you're generating 1000 codes of 8 characters each, the chance of a collision goes down by nearly 3 orders of magnitude if you go from hex to the full alphanumeric space, and then by nearly 2 more orders of magnitude if you then increase to both upper and lower case [1 in 8,590; 1 in 5,642,220; and 1 in 436,680,211 respectively]).  And last, the full upper and lower case (and numbers) space is 62 characters, and at that point you might as well just pick 2 more characters you are ok with (like _ and . ) and just use base 64, which is easy to generate and work with.
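If you want to check those odds yourself, here is a short Python sketch using a common birthday-problem approximation.  This is my assumption about a reasonable formula, not necessarily the exact one behind the figures above, so expect the results to land close to them rather than match digit for digit.

```python
import math

def collision_probability(count: int, alphabet_size: int, length: int) -> float:
    """Approximate chance that `count` random codes contain at least one duplicate."""
    space = alphabet_size ** length
    pairs = count * (count - 1) / 2       # number of pairs that could collide
    return -math.expm1(-pairs / space)    # 1 - exp(-pairs/space), stable for tiny values

for name, size in [("hex", 16), ("a-z0-9", 36), ("a-zA-Z0-9", 62)]:
    p = collision_probability(1000, size, 8)
    print(f"{name:10} about 1 in {1 / p:,.0f}")
```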

Now, if you're asking "How did he calculate those 1 in x probabilities so quickly and easily?" have I got an answer for you.  Since I create these sorts of codes at both work and in my personal life somewhat often, I created a tool so I can cite it when discussing this with others and trying to convince them why they shouldn't use hex for this.

https://wetzel.dev/tools/collisions.html

Like most things I make, it has a bit of a learning curve, but I think once you get the hang of it, it is very easy to use to make these types of comparisons.  The idea is you would use this to help answer a question like "I'm going to be generating random codes, and I think I might generate X number of them total, and I'd like the risk of a collision to be below 1 in Y odds, how many characters do I need to use?"
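That question can also be sketched in a few lines of Python.  The function name min_code_length and the target of 1,000 codes at 1 in 1,000,000 odds are hypothetical, and the birthday approximation used here is an assumption rather than a description of how the tool works.

```python
import math

def min_code_length(count: int, one_in: float, alphabet_size: int) -> int:
    """Smallest length whose approximate collision chance is below 1 in `one_in`."""
    pairs = count * (count - 1) / 2
    length = 1
    while -math.expm1(-pairs / alphabet_size ** length) >= 1 / one_in:
        length += 1
    return length

# Hypothetical target: 1,000 codes with collision odds below 1 in 1,000,000.
for name, size in [("hex", 16), ("a-z0-9", 36), ("a-zA-Z0-9", 62)]:
    print(name, min_code_length(1000, 1_000_000, size))
```

Under this approximation, that target works out to 10 hex characters, 8 lowercase alphanumeric characters, or 7 mixed case alphanumeric characters.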

Maybe this post wasn't actually brief, but if you remove the parentheticals it's about 10 words, so that counts for something.  I'll end this post by just quoting the description from the page here for some reason.

This tool calculates how likely it is that when generating random strings of characters you will get two with the same value (a collision). When generating a large number of these random values the chance of a collision goes up quickly, often much faster than expected. For example, if you're generating random 4 digit numbers, there are 10,000 possibilities, and so the chance of any one of those random numbers being the same as another is 1 in 10,000. However, every time you generate a new number you must compare it to all prior numbers, and so the number of comparisons can get very large, which can lead to the chance of a collision being higher than expected. The Birthday Paradox describes the surprising fact that it only takes 23 people before the chance that two of them share the same birthday is over 50%. That is not a paradox, but the number is much lower than people generally expect.
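The birthday figure is easy to check exactly.  Here is a small Python sketch (the function name is mine) that multiplies out the chance that each new value avoids all the earlier ones:

```python
def exact_collision_probability(count: int, space: int) -> float:
    """Exact chance of at least one duplicate among `count` uniform draws from `space` values."""
    no_collision = 1.0
    for drawn in range(count):
        # The next draw must avoid the `drawn` values already taken.
        no_collision *= (space - drawn) / space
    return 1.0 - no_collision

print(exact_collision_probability(23, 365))      # just over 0.5
print(exact_collision_probability(22, 365))      # just under 0.5
print(exact_collision_probability(119, 10_000))  # 4 digit numbers: already past 0.5
```

The last line shows the same effect for the 4 digit example: with 10,000 possibilities, 119 random numbers are enough for a collision to be more likely than not.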

I made this tool mainly to show how much better using the full space of lowercase letters and numbers (36 characters) is than just using hex (16 characters) when generating random IDs. Developers often default to generating random IDs in hex because it's easy, but using the full space of the alphabet greatly reduces the chance that two randomly generated IDs will be the same, and including upper and lowercase letters reduces it much more. As an example, if you're going to have 6 character IDs, the chance of a collision if you generate 1000 is 2.9% with hex, 0.023% with alphanumeric, and 0.00088% with upper and lower alphanumeric. You could generate over 50,000 of the case sensitive IDs and still have a lower chance of a collision (2.18%) than if you used just hex and only generated 1000 (2.93%).

Note this tool is using native JavaScript numbers, which have a precision limit of about 15 to 17 significant decimal digits. When showing the chance of no collisions and using values that give a very small chance of a collision you'll see 1, when really the answer is a very small amount less than 1. Just know that the chance of a collision is never 0, and so the chance of no collision is never truly 1. Viewing the chance of a collision (rather than the chance of no collision) should never round to 0.
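The same rounding shows up in Python, whose floats are the same 64-bit IEEE 754 doubles as JavaScript numbers.  This sketch uses a hypothetical example of 1,000 case sensitive IDs of 20 characters each, and is not taken from the tool's source:

```python
import math

# Hypothetical example: 1,000 case sensitive alphanumeric IDs of 20 characters each.
count, space = 1000, 62 ** 20
x = count * (count - 1) / 2 / space   # expected number of colliding pairs, about 7e-31

no_collision = math.exp(-x)           # rounds to exactly 1.0 in a 64-bit float
naive_collision = 1.0 - no_collision  # so the complement collapses to 0.0
stable_collision = -math.expm1(-x)    # expm1 keeps the tiny value intact

print(no_collision, naive_collision, stable_collision)
```

Computing the collision chance directly, rather than as 1 minus the no-collision chance, is what keeps it from rounding to 0.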


Sunday, November 30, 2025

The Alien Style of Deep Learning Generative Design

https://medium.com/intuitionmachine/the-alien-look-of-deep-learning-generative-design-5c5f871f7d10

What happens when you have Deep Learning begin to generate your designs? The common misconception would be that a machine’s design would look ‘mechanical’ or ‘logical’. However, what we seem to be finding is that they look very organic; in fact, they look like an alien biology. Take a look at some of these fascinating designs.

Sunday, September 28, 2025

Custom PCBs for ESP32 I2C sensors to use with Home Assistant

Background 

I have a post from a few years ago about how I use I2C sensors with ESP32s to make relatively cheap and easy smart home sensors.  These all run software called ESPHome, which is really easy to use for these DIY sensors, and if you're running Home Assistant and have any interest in DIYing sensors I'd highly recommend you check it out.

I won't go into too much detail about how these work here, since you can go back and read that post, but the key is that all my sensors use a protocol called I2C, which is just a way to do short distance, low speed communication over 4 wires (power, ground, clock, and data).  The nice thing is that I2C uses a bus, where many sensors all share those same 4 wires, so you don't need dedicated pins for every sensor.  All you need to do is find a sensor that detects what you want to measure and uses I2C (and confirm the voltage is 5V or 3V3, more on that later); then you can hook multiple sensors up to 4 pins on the ESP32, and run ESPHome on it.

The only tricky part is how you physically connect those 4 wires from multiple sensors to the same 4 pins on the ESP32.  That's not hard to do for a one off, where you don't care what it looks like, but if you're trying to make a lot of these, easily, and have something that looks semi-presentable, it becomes a lot harder.

 

An example of my previous DIY solution to I2C sensors
 

Above you can see what I came up with before.  It's just some protoboard with bare copper wires (scavenged from ethernet cables).  It works fine, but it's more annoying to wire everything up than you might think.

For a long time I've figured someone has to make a basic circuit board for exactly this purpose: just connecting the correct 4 pins on the ESP32 to a bus of 4 pin sockets, so you can plug in whatever I2C sensors you want.  These types of boards are commonly called "hats" and while they seem more common for Raspberry Pis they exist for ESP32s as well.

Either way, I could never find one, and so I decided to make this my first foray into designing and printing custom PCBs.

Getting the PCBs 

I'm trying not to make this post into one of my trademark long sagas, so I'll skip over a lot of the design details; I'm no expert anyway.  I will say, I used KiCad to do the design, starting with the schematic editor, and then switching to the PCB editor.  The only items I'm placing are pin sockets and the board only needs 2 layers.  If you're interested in doing something similar there are many tutorials online.

The revised board I made

I then used JLCPCB to have the boards printed.  There's another company called PCBWay which I've also heard good things about, and the prices were similar.  Both these sites want you to upload "Gerber files" which I exported from KiCad (including exporting drill files, which was not obvious).  Then I just zipped the files I got from the export, and that zip file is the Gerber file.

Order details of my custom PCB from JLCPCB

This is a screenshot of my order details from JLCPCB.  I'm including it mainly so that anyone else who wants to get some of these boards made can use it as a reference.  I didn't change much, besides choosing lead free solder and black as the board color, because I want my PCB all black.

Tariffs 

There was a lot of uncertainty for me as to what, if any, tariffs I'd have to pay to import these boards from China to the US.  So allow me to post another data point here.  I ordered the boards on Aug 11th, 2025, and it cost $5.22 total shipped, after a first time customer type discount of $9.46.  They were delivered on Aug 20th, and the government has yet to come to my house demanding import tariffs, so I think I'm in the clear.  Of course, this info is already like 3 iterations of tariffs out of date, so only time will tell what you will pay.

20 Custom PCBs, printed and shipped from China for $5

My one regret

I had one regret in the design, which I've since fixed in case I ever get more of these printed, or anyone else uses my design.  I mentioned above that when buying I2C sensors you have to check if they are 5 V or 3.3 V (called 3V3).  The ESP32 supplies both voltages (if you power it from USB) so you can use either, but if you're powering all the sensors from the same bus they all need to be the same.  All the sensors I've bought are 5V, but I figured it might be nice to have the option to use either.  So I designed a solder jumper in the middle of the board.  The 5V supply comes in from one side, and the 3V3 comes in on the other, then the output is in the middle.  You need to connect one of those two sides with a blob of solder to the middle pad, but not all 3 together.

I didn't really think much about this, other than patting myself on the back for being so clever to design the flexibility into my boards, but when I got them I realized how annoying it was to actually bridge the solder.  It's not impossible, but it isn't trivial.  Searching around for how to do this led to a lot of discussions basically saying, "yeah, it sucks, so don't use solder bridges".  So, I redesigned the boards to just have 3 through holes there, so you can solder in male pins and use a normal jumper cap to select the voltage, or even just solder a wire between two of the holes, which would be pretty easy to do.

It's not worth getting new boards made, but if/when I ever get more made I will use the new design.

The boards with headers soldered on and one attached to an ESP32

Here you can see 5 boards with the female headers soldered on.  The bottom left board shows the solder jumper bridged.  The stack on the right and up top show unsoldered boards.

Side view of the finished board, with 3 sensors connected

Top view of the finished board

 

These pictures show the finished product.  There are 3 sensors attached to this board, with 2 spare sets of pins where I could attach 2 more (although I'd have to solder on the headers).  The sensors here are a BH1750 light sensor, an SHT45 temperature and humidity sensor, and an SCD41 CO2 sensor.

The eagle-eyed among you may have noticed that 2 of these sensors are dangling off cables, losing a lot of the nice presentation the board should provide.  There are a few reasons for that.  First, I don't really care, as this board went behind a TV in my bedroom where you can't see it.  Second, I always dangle the temperature sensors off a wire so they get less heat from the ESP32 itself.  Third, though, the SCD41s all seem to have a slightly different pin order from what I need.  Most I2C sensors seem to use the order of VCC, GND, SCL, and SDA (either direction is fine, since you can solder the pin header to either side).  But the SCD41s all swap the GND and VCC, which means I need to use the wires so I can swap the order there.

What you need if you want to do this yourself

If you want to do something similar, you can download my Gerber files here and upload them to JLCPCB or PCBWay to get them printed.  Refer to my above screenshot for the options I chose.

This is the old version I had printed, with the solder jumper, and this is the newer version that I haven't actually tested, but has the easier to use through hole jumpers.

Then you need to buy ESP32s.  I get my ESP32s and sensors on AliExpress these days, although you can also find them on Amazon for a bit more.   This is the ESP32 listing I used, but these don't tend to last long.  If you need to search for your own ESP32, just make sure it is the 30 pin version.  Just literally count the pins and make sure it's 15 on each side.  Then, imagine you are holding the board with the USB port facing down: the bottom left pin should be VIN/VCC, and the pin above that should be GND.  The bottom right pin should be 3V3 (and the one above that should be GND, but that doesn't actually matter for my board).  Then in the upper right the top 5 pins should be labeled D23, D22, TX0, RX0, and D21.  Of those, D21 and D22 are the ones used, and technically those pins can be changed in the ESPHome config.  I don't think I've ever seen a 30 pin version of an ESP32 that didn't have the pins arranged like that, but it's good to double check.  Also note they make micro USB and USB C powered versions.

When buying sensors, it's going to depend on what you want to measure.  For temperature, BME280s are good sensors, with temperature, humidity and pressure.  The SHT45s are supposed to be a bit more accurate, with the trade off of not having pressure.  SCD41s are CO2 sensors.  The BH1750 is a light sensor.  They make SHT41 and SHT40 for temperature and SCD40 for CO2, which detect the same things but are less accurate for less money.  Whatever sensor you get, make sure it is I2C, and then try to get all either 5V or 3V3.  Most will be compatible with either voltage, but check the listing.  Then try to get ones with pins in the order of VCC, GND, SCL, and SDA.  It's not the end of the world if they don't match that order, but you'll have to use dupont wires to swap the order.  Sometimes they will have more pins other than those, but they don't matter; you can just leave them unconnected.

There are a lot of types of I2C sensors available, just search AliExpress or Amazon for "I2C sensor" to see what is out there.  Here's a $24 LiDAR sensor that works with ESPHome.  Almost any should work with this board and an ESP32 running ESPHome, but you'll want to do a bit of research if you're buying something I didn't mention above.

Then you need 15 pin female headers and 4 pin female headers.  A kit like this has both sizes, but if you're going to make a bunch of these, it's probably cheaper to buy a pack of like 100 of the 4 pin and 20 of the 15 pin ones.

You'll also need power, but the ESP32 power requirements are very low, so any spare USB power supply you have should work.  You should even be able to power them from any USB ports on the back of your TV or any other appliances you have.

Then you'll have to solder on the headers.  Refer to my above pictures as a guide.

Then you set up ESPHome and flash it to the boards.  There are a bunch of guides on how to do that, and you can refer to my previous post for an example of my YAML config.

Sunday, August 31, 2025

WeatherStar 4000+

https://weatherstar.netbymatt.com/

A really good simulation of the 90s style Local on the 8s forecast, complete with smooth jazz.


 

Wednesday, July 2, 2025

Xfinity using WiFi signals in your house to detect motion

Subject to applicable law, Comcast may disclose information generated by your WiFi Motion to third parties without further notice to you in connection with any law enforcement investigation or proceeding, any dispute to which Comcast is a party, or pursuant to a court order or subpoena. 

https://news.ycombinator.com/item?id=44426726

Friday, May 16, 2025

A leap year check in three instructions

 https://hueffner.de/falk/blog/a-leap-year-check-in-three-instructions.html

With the following code, we can check whether a year 0 ≤ y ≤ 102499 is a leap year with only about 3 CPU instructions:

#include <stdbool.h>
#include <stdint.h>

bool is_leap_year_fast(uint32_t y) {
    return ((y * 1073750999) & 3221352463) <= 126976;
}

How does this work? The answer is surprisingly complex. This article explains it, mostly to have some fun with bit-twiddling; at the end, I'll briefly discuss the practical use.
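If you'd rather check the claim than take it on faith, here is a small Python sketch that compares the expression against the standard Gregorian rule for every year in the stated range.  The port and the reference function are my own, and the 0xFFFFFFFF mask is there to emulate the uint32_t wraparound that the C version relies on.

```python
def is_leap_year_fast(y: int) -> bool:
    """Python port of the C expression; the mask emulates uint32_t wraparound."""
    product = (y * 1073750999) & 0xFFFFFFFF
    return (product & 3221352463) <= 126976

def is_leap_year_reference(y: int) -> bool:
    """Standard Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

# Compare the two over the range the article states: 0 <= y <= 102499.
mismatches = [
    y for y in range(0, 102500)
    if is_leap_year_fast(y) != is_leap_year_reference(y)
]
print(len(mismatches))  # should be 0 if the article's claim holds
```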

 

Bonus link:

That article links several times to this site, which lets you input high level code like C/Ruby/Javascript/etc and see the compiled/assembly output.

https://godbolt.org/z/PWs8saMYd