art with code


Hardware hacking

I got into these Pi devices recently. At first it was a simple "I want an easy way to control sites accessible on my office WiFi to stop wasting time when I should be working", so I set up an old broken laptop to prototype a simple service to do that. Then I replaced the laptop with a small Orange Pi hacker board. And got some wires and switches and breadboard and LEDs and resistors and ... hey, is that a Raspberry Pi 3B+? I'll get that too, maybe I can use it for something else...


I took apart a cheap RC car. Bought a soldering station to desolder the wires from the motor control board. Then got a motor controller board (an L298N, a big old chip for controlling 9-35V motors with 2A current draw -- not a good match for 3V 2A Tamiya FA-130 motors in the RC car), wired the motors to it, and the control inputs to the Raspberry Pi GPIOs.

Add some WebSockets and an Intel RealSense camera I had in a desk drawer and hey FPV web controlled car that sees in the dark with a funky structured light IR pattern. (The Intel camera is .. not really the right thing for this, it's more of a mini Kinect and outputs only 480x270 video over the RPi's USB 2 bus. And apparently Z16 depth data as well, but I haven't managed to read that out.) Getting the video streaming low-latency enough to use for driving was a big hassle.

Experimented with powering the car from the usual 5 AA batteries, then the same USB power bank that's powering the RPi (welcome to current spike crash land, the motors can suck up to 2A when stalling), and a separate USB power bank ("Hmm, it's a bit sluggish." The steering motor has two 5.6 ohm resistors wired to it and the L298N has a voltage drop of almost 1.5V at 5V, giving me about 1W of steering power with the USB. The original controller uses a tiny MX1508, which has a voltage drop something like 0.05V. Coupled with the 7.5V battery pack, the car originally steers at 5W. So, yeah, 5x loss in snappiness. Swap motor controller for a MX1508 and replace the resistors with 2.7R or 2.2R? Or keep the L298N and use 1.2R resistors.) Then went back to the 5 AA batteries. Screw it, got some NiMHs.

Tip: Don't mix up GND and +7.5V in the L298N. It doesn't work and gets very hot after a few seconds. Thankfully that didn't destroy the RPi. Nor did plugging the L298N +5V and GND to RPi +5V and GND -- you're supposed to use a jumper to bridge the +12V and GND pins on the L298N, then plug just the GND to the RPi GND (at least that's my latest understanding). I.. might be wrong on the RPi GND part, the hypothesis is that having shared ground for the L298N and the RPi gives a ground reference for the motor control pins coming from the RPi.

Tip 2: Don't wipe off the solder from the tip of the iron, then leave it at 350C for a minute. It'll turn black. The black stuff is oxides. Oxides don't conduct heat well and solder doesn't stick to it. Wipe / buff it off, then re-tin the tip of the iron. The tin should stick to the iron and form a protective layer around it.

Destroyed the power switch of the car. A big power bank in a metal shell, sliding around in a crash, crush. It was used to control the circuit of the AA battery pack. Replaced it with a heavy-duty AC switch of doom.

Cut a USB charging cable in half to splice the power wires into the motor controller. Hey, it works! Hey, it'd be nicer if it was 10 cm longer.

Cut a CAT6 in half and spliced the ends to two RJ45 wall sockets. Plugged the other two into a router. He he he, in-socket firewall.

Got a cheapo digital multimeter. Feel so EE.

Thinking of surface mount components. Like, how to build with them without the pain of soldering and PCB production. Would be neat to build the circuit on the surface of the car seats.

4-color printer with conductive, insulating, N, and P inks. And a scanner to align successive layers.

The kid really likes buttons that turn on LEDs. Should add those to the car.

Hey, the GPIO lib has stuff for I2C and SPI. Hey, there are these super-cheap ESP32 / ESP8266 WiFi boards look neat. Hey, cool, a tiny laser ToF rangefinder.

Man, the IoT rabbit hole runs deep.

(Since writing the initial part, I swapped the L298N for a MX1508 motor controller, and the D415 for a small Raspberry Pi camera module. And got a bunch of sensors, including an ultrasound rangefinder and the tiny laser rangefinder.)


1 GB/s from Samba

Hey, finally on the fast network (GPU died, so I replaced it with an IB card :P)

In other news, new workstation time.


Three thoughts


We're living on a grain of sand, five meters away from a light bulb. Of all the solar energy falling onto our tiny grain of sand, we are currently able to utilize 0.01%. The total energy falling onto our tiny grain of sand is roughly 0.00000005% of the total output of the light bulb.


Gravity sorting. In a gravity well, denser particles end up below lighter particles given time. Here's a list of densities of common elements. It matches the distribution of elements on Earth: air above water, water above silicon, silicon above iron. Note that the rare elements tend to be at the bottom of the list. There are three reasons for that. 1) Heavy elements are rare in the Universe. You need supernovas to produce them. 2) Heavy elements get gravity sorted below iron and mostly reside in the middle of the planet. 3) Some elements - like gold - don't oxidize easily, so they don't bubble up as lighter oxides. For a counter-example, tungsten has a similar density as gold (19.3 g/cm3), but oxidizes as wolframite (Fe,Mn)WO4, which has a density of 7.5 g/cm3 (close to elemental iron). As a result, the annual global tungsten production is 75,000 tons, whereas the annual gold production is 3,100 tons.

Elemental abundances from Wikipedia

The Earth formed from dust in the protoplanetary disk, similarly to other planets and objects in the Solar System. As a result, asteroids should have similar overall composition as the Earth.

Gravity sorting acts slower on less massive objects. You should see relatively more heavy elements at the surface of less massive objects. The force of gravity is also weaker, making the grains of matter less densely packed. This should make mining asteroids less energy-intensive compared to mining on Earth. Asteroid ores should also be more abundant in metals compared to Earth ores. At the extreme end, you'd have iridium, which is 500x more common on asteroids compared to the Earth's crust. Couple that with - let's say - factor of two energy reduction compared to Earth mining, and an asteroid mine could generate 1000x the iridium per watt compared to an Earth-based operation.

Suppose you want to have reusable space launch vehicles to minimize launch costs. Rockets that land. Your customers pay you to launch their stuff to space, then you recover your rocket to save on costs. You don't want to recover the stage that gets to LEO because it'd be too expensive. But what if the returning LEO vehicle brought back a ton of gold. That'd be worth $40 million at today's prices. And the amount of gold brought back this way would be small enough (say, 25 tons per year) that it wouldn't put much dent into gold demand or affect the price of gold. If anything, you could sell it as Star Gold for jewelry at 10x the spot price. Even at spot price, it'd go a long way towards covering the cost of the entire launch.


In a few decades, it looks likely that we'll be able to simulate humans in real-time using a computer the size of a match box. Match box humans. If it takes half a square meter of solar panel to power one match box human, you could sustain a population of 2 trillion in the US alone. If you were to use fusion to convert all the cosmic dust that falls on Earth (let's say, 100 tons) to energy at 1% mass-to-energy ratio, it would generate around 1 PW of energy, which could sustain 10 trillion match box humans running at 100 watts per person.


The carbon capture gambit

To control the climate of the planet, stopping the loss of carbon to air is not enough. To gain control of the temperature of the planet and the carbon economy, we need to mine the carbon that's been lost into the air.

There are proposed air-mining mechanisms for carbon that run at a cost of 3-7% of GDP. And that's before significant optimizations. To gain control of the climate and amass vast carbon reserves, we would only need a sum of money equivalent to a few years' worth of economic growth. If we invest now, we'll get a massive head start on future mining operations by other countries, and will end up in an immensely powerful geopolitical position. Not only will we have control over the climate of the planet, we also end up in the possession of all the carbon lost into the air.

Control over carbon will give us control over the carbon-based fuel economy. Most of the carbon accessible carbon on the planet is right there in the air, ready for the taking.

Every year, more than 7 gigatons of carbon is lost in the atmosphere in the form of CO2. We can estimate the market value of this lost carbon using the price of coal. Coal is around 50% carbon, with the rest composed of impurities. The price of coal is around $40 per ton. Pure carbon could be twice as valuable as coal. At $80 per ton, 7 gigatons of carbon would be worth $560 billion dollars. More than half a trillion dollars. Every. Single. Year.


Crank, the SUBSET-SUM edition

SUBSET-SUM: 2^n possible answers for n-element input. Input composed of n k-bit numbers, find a subset that sums up to a target number. Total input length is n*k bits. If n is log(k), solvable in polynomial time to n*k (2^n ~= k ~= n*k). If k is log2(n), also solvable in polynomial time (2^k unique numbers ~= n => you've got all the numbers, greedy solution always works). Therefore k = p(n).

If you pre-sort the numbers, you get a partially sorted binary enumeration. E.g. If you have 4 numbers in the input, you can state your answers as binary numbers from 0001 to 1111. Because the numbers are sorted, you know that 0100 > 0010, regardless of the numbers involved.

In addition to ordered bit patterns, the output space is composed on ambiguous bit patterns. Think of 0110 and 1001. You can't tell which one is larger without evaluating the sum in some way.

The kickers are that for any given bit pattern, you can generate an ordered pattern that is smaller or larger to it. And the number of ambiguous bit patterns between the two ordered patterns grows exponentially to n. Indeed, the number of ambiguous bit patterns for a given bit pattern grows exponentially with n as well.

Think of this 24-bit pattern: 000000111111111111000000. It's got the middle 12 bits set, surrounded by 6 zero bits on either side. Because of the ordered pattern properties (if you move a bit up, the sum is larger, if you move a bit down, the sum is smaller), you know that any bit pattern with 12 bits set where all the bits are below the 6th bit are equal or smaller to the mid-bits-set pattern. Likewise, any bit pattern with zeroes above the 18th bit are equal to or larger to the mid-bits-set pattern. The bit patterns ambiguous to the mid-bits-set pattern are ones where you have bits moved to above and below the pattern. This would constitute around 212 patterns, or 2n/2.

You might think that this is pretty nice to use for solution search, but note that the ordered pattern comparison only decreases the size of the output space under consideration by 2-n/4. With 24 bits, this comparison would reduce your output space to 63/64 of the original. With 28 bits, 127/128. With 30 bits, 255/256. And so on.

Even if you manage to use some fancy combination of comparisons and pattern recognition to rule out that portion of the search space with each comparison, you're still running at 2n/4 complexity.

Digression: The SUBSET-SUM search graph is composed of n layers. If you can solve one layer in p(n) time, you can solve the problem in p(n) time (because n*p(n) ~ p(n)). And of course, if you can't solve a layer in p(n), you can't solve the full problem either. This doesn't really help much, since the number of sums in the middle layer grows exponentially with n, and the number of unique sums grows exponentially with k. (If you think of the middle layer as a line with a little tick mark for every number generated on it, adding one to n up-to-doubles the number of tick marks on the line. Adding one to k doubles the length of the line. Adding one to n increases the size of the input by k bits. Adding one to k increases the size of the input by n bits.)

Digression 2: You can easily generate a number that's got the maximum distance from the target number equal to max(input), just sum them up until you reach a number that's higher than the target. Of course, max(input) is around 2k. I believe you could also generate a number that has a distance around the difference between two numbers in the input (greedily swap numbers to move the generated number towards the target). The difference between two adjacent input numbers is around 2k/n.

Digression 3: Even if you could get to p(n) distance from the target, you'd have to then iterate towards the target. You can generate an ordered pattern in the right direction, but that'll jump you an exponential distance. And after you've jumped to the closest ordered pattern, you need to find the next largest number for it. And finding the next largest number for a given bit pattern is tough. You need to resolve the ambiguous patterns for it. Which are exponential in count. And where every evaluation of a pattern yields you a portion of the solution that shrinks exponentially with n. If so, finding the next largest number to a pattern sounds like it's harder than NP (how would you verify the adjacency in polynomial time?)

Digression 4: Let me propose a series of simplified problems for SUBSET-SUM. First, let's limit ourselves to problems where a solution exists, and where exactly half of the numbers are used for the generated number. The first simplified problem is finding an adjacent number for such a generated number. The second is finding a number that's p(n) distant from a generated number. The third is finding the actual generated number. I don't think any of these can be solved in p(n)...


Even more crank on P != NP

Size limits for the input and the output

If the input is exponentially smaller than the output, the problem is not in NP. If the input is power-of-two exponentially larger than the output, the problem is in P.

In the first case, the output is exponentially longer than the input and therefore takes exponentially longer to print out, even on a non-deterministic machine. Even if you could print it out instantly, you need to check the output with a deterministic verifier that needs to run in polynomial time to the original input, and reading the exponentially larger output into the verifier would take exponential time to the original input. So exponentially longer outputs are not in NP.

In the second case, if you have a polynomial time deterministic verification algorithm, you can create a solver by adding an output space enumerator in front of it. If the output is exponentially smaller than the input, hey, you've got a polynomial time solver. If |output| = log2(|input|), enumerating the outputs takes 2|output| = 2log2(|input|) = |input| steps. From NP, you have the verifier that runs in p(|input|) so a solver that enumerates the output space and tries the verifier on each possible solution would run in 2|output| * p(|input|), which becomes |input| * p(|input|) if the output is exponentially smaller than the input like above. (So any problems in NP where the output is a single bit are also in P.)

Due to these size limits, for P ?= NP the size of the output must be polynomial(-ish??) to the size of the input.

Size limits for the solver

How about the size of the deterministic solver relative to the input. If the deterministic solver machine is by necessity exponentially larger than the input (by necessity meaning that for any given part of the machine there exists an input that makes the machine access that part), there also exists an input that causes the machine to move the tape a distance exponential to the size of the input, which takes exponential time relative to the input. So the deterministic solver needs to have a size at most polynomial to the size of the input.

For the lower bound, consider the case where the machine extracts zero information from the input. This machine just enumerates over the output space, checking each answer using the verifier. The runtime of the machine is 2|output| * p(|input|), the code is fixed in size, and it uses |output| extra tape as memory. At the other end is the answer table machine: it's an array of precomputed solutions where the key is the input. The size of the answer table machine is |output| * 2|input| (and due to the above tape movement limitations, will take exponential time to run relative to some input.)

Knowing nothing about the input is out, and knowing everything about the input is out. Every bit in the program encodes information about the input-output-relation, allowing the solution search to proceed faster than brute-force enumeration when the program encounters an input that matches the encoded information. The program can also learn from the input, but it can't learn an exponential amount of information from the input (because that'd take exponential time.) Wild guess: If the amount of information used by the program is exponentially less than the amount of information in the input-output-relation, the program would end up using brute-force search over parts of the output that are polynomial in size to the input, which would take exponential time to the input. If true, there would be no sub-polynomial-space programs for P ?= NP that run in polynomial time.

Given the existence of the deterministic polynomial time verifier, we know that you can extract information from the input-output-pair in polynomial time, even though the amount of information extracted is very small. How small? Well, at least you get |output| / 2|output| bits of info. After trying every 2|output| outputs, you would have |output| bits and a correct output. If there are several correct outputs, you'd gain numberOfOutputs * |output| / 2|output| bits. If the numberOfOutputs grows exponentially with the output size, this would give you a polynomial time solver. But, if this were the case, someone would've noticed it. Another wild guess: the amount of information we get out of the verifier shrinks exponentially in relation to the output size (and therefore the input size, which has a polynomial size relation to the output).

The number of input enumerations is 2|input|, ditto for the output 2p(|input|). We have a deterministic polynomial time verifier that can extract solution information from the input, but the amount of information extracted looks to be a logarithm of the input size. We also have a hypothetical solver program that encodes a polynomially growing amount of information about the input-output-relation, allowing it to avoid brute-force search whenever it encounters an input that matches the encoded information.

If the number of patterns that can be encoded in the input grows exponentially with the input length, then as the problem size grows towards infinity the question becomes: can you discern between an infinite number of input patterns with a zero-length solver program and a verifier that can extract zero bits of information from the input? For P to be NP, the number of patterns in the input needs to grow as a polynomial to the length of the input.


Some more rubbish on P != NP

Feeling crank today, so:

Non-deterministic machines are one-bit oracles. Given a branch in execution, they always follow the correct path. You can use this property to write a universal non-deterministic machine:


NonDeterministic done, correctOutput;

while (!done) {

Its execution time is linear to the size of the output. There are still problems that take non-polynomial time compared to the input. For instance, "given an input of n bits, print out n! bits".

The question in P ?= NP then is: "given a problem where the output size is a polynomial of the input size, is it possible to emulate a non-deterministic machine using a deterministic machine with at most a polynomial slowdown". Since the ND machine takes a number of steps linear to the output size, which is polynomial of the input size, it solves the problem in polynomial time to the input. If the deterministic machine can emulate the ND machine at a polynomial slowdown, its running time is a polynomial of the output size, which is a polynomial of a polynomial of the input size.

Maybe you could also ask if it's possible to prove that there exists a problem for which an optimal program running on a deterministic machine requires a number of steps exponential to the size of its output.

Make a bridge from output size to input size? ...exponential to the size of its output and where the output size grows linearly with input size.

Blog Archive

About Me

My photo

Built art installations, web sites, graphics libraries, web browsers, mobile apps, desktop apps, media player themes, many nutty prototypes