fhtr: 2013

2013-12-28

Techniques for faster loading

Speed is nice, right? Previously fhtr.net was taking half a second to get to the start of the first WebGL frame. With a hot cache, that is. Now, that's just TERRIBLE! Imagine the horror of having to wait for half a second for a website to reload. What would you do, who could you tell, where could you go! Nothing, no one and nowhere! You'd be stuck there for 500 milliseconds, blinking once, moving your eyes a few times, blinking twice, maybe even thrice before the site managed to get its things in order and show you pretty pictures. Clearly this can not stand.

Fear not, I am here to tell you that this situation can be amended. Just follow these 10 weird tricks and you too can make your static HTML page load in no time at all!

OK, let's get started! First off, I trimmed down the page size some. By swapping the 400 kB three.js library for 1 kB of WebGL helper functions, I brought the JS size down to 8 kB. This helped, but I still had to wait, like, 350 ms to see anything. Jeez.

My next step in getting the page load faster was to make my small SoundCloud player widget load the SoundCloud sdk.js asynchronously and do a timeout polling loop to initialize the widget once the SDK had loaded. Now, that didn't help all that much, but hey, at least now I was in control of my destiny and didn't have to wait for external servers to get around to serving the content before being able to execute crazy shaders. I also inlined the little logo image as a data URL in the HTML to avoid that dreadful extra HTTP request.

To further investigate the reason for the slowness in the page load, I turned my eye to the devtools network pane. 'Lo behold, what a travesty! I was using synchronous XHR gets to load in the two fragment shaders. For one to start loading, the other had to finish. And they were both loaded by my main script, which was in a file separate from the HTML.

I didn't want to inline the JS and the shaders into the HTML because I don't have a build script ready for that. But I could still fix a few things. I made the XHRs asynchronous so that the shaders load in parallel. Then I moved the shader loader out of the script file into a small script inlined in the HTML. Now the shaders start loading as the HTML loads, similar to the main script file.

Timing the code segments a bit, I noticed that my code for generating a random 256x256 texture was taking ~16 ms. Not too long, but hey, way too long, right? So I moved that out of the main script file and inlined it into the HTML, after the shader-loading snippet. Now the texture is generated while the browser is downloading the shaders and the script file. This squeezed out a few extra milliseconds in the hot cache scenario. Later on, I stopped being stupid and used an Uint8Array for the texture instead of a canvas, bringing the texture creation time to 2ms. Yay! Now it's practically free as it takes the same amount to generate the texture as it takes to load in the scripts.

My other major delays in getting the first frame to the screen were creating a WebGL context (15 ms or so), compiling the shaders (25 ms a pop) and setting up the initial shader uniforms (7 ms per shader). To optimize those, I made the page compile and set up only the first visible shader, and pared down the initial uniforms so as not to overlap with the ones I set before every draw. That brought down my shader setup cost from 64 ms to 26 ms. As for the WebGL context setup, I moved it into the inline script, after texture generation, so that it overlaps with the I/O. Maybe it helps by a millisecond or so. Maybe not.

As for caching. I'm using AppCache. It downloads the whole site on your first visit and keeps it cached. On subsequent visits (even if you're offline!), you get the page served from cache. Which is nice. And a bit of a hassle, as AppCache is going to require some extra logic to update the page after you have downloaded a new version of it.

Well, then. What is the result of all this effort? Let me tell you. On a hot cache and a particularly auspicious alignment of the stars in the eternal firmanent of space, when I sit down in front of my iMac and hit reload, the first WebGL frame starts executing in 56 milliseconds. That's less than the time it takes you to move your eyes from the address bar to the page itself. It's still TOO SLOW because, as everyone knows, websites should load in less than a single frame (at 60Hz).

Furthermore, alas, in a new tab, setting up the page takes ~350 ms. And with a cold cache - who can tell - 650 ms or more. Therefore, the next step in this journey is to spend a few minutes switching the page from GitHub Pages to Amazon CloudFront and hopefully claw back a few hundred ms of I/O time.

Carry on, wayward soldier.

2013-12-26

Opus 2, GLSL ray tracing tutorial

A bit older shader on fhtr.net this time, from the time I was getting started with ray tracing. This one's path tracing spheres and triangles with motion blur. Might even work on Windows this time, but YMMV. I should get a test machine for that.

"So", I hear you ask, "how does that work?" I'm glad you asked! It's a common Shadertoy-style WebGL shader where two triangles and a trivial vertex shader are coupled with a delightfully insane fragment shader. In this case, the fragment shader takes the current time as its input and uses it to compute the positions of a few objects and the color of the background. The shader then shoots a couple of rays through the scene, bouncing them off the objects to figure out the color of the current pixel. And yes, it does all this on every single pixel.

Path tracing

The path tracing part is pretty simple. The shader shoots out a ray and tests it against the scene. On detecting a hit, the ray is reflected and tested again. If the ray doesn't hit anything, I set its color to the background color. If the ray does hit something and gets reflected into the background, I multiply the object color with the background color. If the reflected ray hits another object, I give up and leave the color black. To make this one-bounce lighting look less harsh, I mix the ray color with the fog color based on the distance traveled by the ray. For the fog color, I'm using the background color so that the objects blend in well with the bg.

Digression: The nice thing about ray tracing -based rendering is that a lot of the things that are difficult and hacky to do with rasterization suddenly become simple. Reflections, refractions, transparent meshes, shadows, focus blur, motion blur, adaptive anti-aliasing schemes, all are pretty much "shoot an extra ray, add up the light from the two rays, done!" It just gets slowwwer the more rays you trace.

And if you're doing path tracing, you're going to need a whole lot of rays. The idea behind path tracing is to shoot a ray into the scene and randomly bounce it around until it becomes opaque enough that further hits with light sources wouldn't contribute to the pixel color. By summing up enough of these random ray paths, you arrive at a decent approximation of the light arriving to the pixel through all the different paths in the scene.

For specular materials, you can get away with a relatively small amount of rays, as the variance between ray paths is very small. A mirror-like surface is going to reflect the ray along its normal, so any two rays are going to behave pretty much the same. Throw in diffuse materials, and you're in a whole new world of hurt. A fully diffuse surface can reflect a ray into any direction visible from it, so you're going to need to trace a whole lot of paths to approximate the light hitting a diffuse surface.

Motion blur

The motion blur in the shader is very simple. Grab a random number from a texture, multiply the frame exposure time with that, add this time delta to the current time and trace! Now every ray is jittered in time between the current frame and the next frame. That alone gets you motion blur, albeit in a very noisy fashion.

I'm using two time-jittered rays per pixel in the shader, first one at a random time in the first half of the exposure time, the second in the second half. Then I add them together and divide by two to get the final motion blurred color. It looks quite a bit better without totally crashing the frame rate. For high-quality motion blur, you can bump the ray count to a hundred or so and revel in your 0.4 fps.

Background color

I made the background by adding up a bunch of gradients based on the sun position and the plane of the horizon. The gist of the technique is to take the dot product between two directions and multiply the gradient color with that. Quick explanation: the dot of two directions (or unit vectors) is the cosine of the angle between them, its value varies from 1 at zero angle to 0 at a 90 degree angle and -1 at 180 degrees. To make a nice diffuse sun, take the dot between the sun direction and the ray direction, clamp it to 0..1-range, raise it to some power to tighten up the highlight and multiply with the sun color. Done! In code: bgCol += pow(max(0.0, dot(sunDir, rayDir)), 256.0)*sunColor;

You can also use this technique to make gradients that go from the horizon to the zenith. Instead of using the sun direction, you use the up vector. Again, super simple: bgCol += pow(max(0.0, dot(vec3(0.0, 1.0, 0.0), rayDir)), 2.0)*skyColor;

By mixing a couple of these gradients you can get very nice results. Say, use low-pow sun gradient to make haze, high-pow for the sun disk, a horizon-up gradient for the skydome, horizon-down gradient for the ground and a reversed high-pow horizon gradient to add horizon glow (like in the shader in the previous post).

Let's write a path tracer!

Here's a small walk-through of a path tracer like this: First, normalize the coordinates of the current pixel to -1..1 range and scale them by the aspect ratio of the canvas so that we get square pixels. Second, set up the current ray position and direction. Third, shoot out the ray and test for intersections. If we have a hit, multiply the ray color with the hit color, reflect the ray off the surface normal and shoot it out again.

vec2 uv = (-1.0 + 2.0*gl_FragCoord.xy / iResolution.xy) * vec2(iResolution.x/iResolution.y, 1.0);
vec3 ro = vec3(0.0, 0.0, -6.0);     // Ray origin.
vec3 rd = normalize(vec3(uv, 1.0)); // Ray direction.
vec3 transmit = vec3(1.0);          // How much light the ray lets through.
vec3 light = vec3(0.0);             // How much light hits the eye through the ray.

float epsilon = 0.001;

float bounce_count = 2.0; // How many rays we trace.

for (int i=0; i<bounce_count; i++) {
  float dist = intersect(ro, rd);
  if (dist > 0.0) { // Object hit.
    transmit *= material(ro, rd); // Make the ray more opaque.
    vec3 nml = normal(ro, rd);    // Get surface normal for reflecting the ray.
    ro += rd*dist;                // Move the ray to the hit point.
    rd = reflect(rd, nml);        // Reflect the ray.
    ro += rd*epsilon;             // Move the ray off the surface to avoid hitting the same point twice.
  } else { // Background hit.
    light += transmit * background(rd); // Put the background light through the ray and add it to the light seen by the eye.
    break;                              // Don't bounce off the background.
  }
}

gl_FragColor = vec4(light, 1.0); // Set pixel color to the amount of light seen.

Spheres!

Ok, let's get real! Time to trace two spheres! Spheres are pretty easy to trace. A point is on the surface of a sphere if (point-center)^2 = r^2. A point is on a ray (o, d) if point = o+d*t | t > 0. By combining these two equations, we get ((o+d*t)-center)^2 = r^2 | t > 0, which we then have to solve for t. First, let's write it out and shuffle the terms a bit.

(o + d*t - c) • (o + d*t - c) = r^2

o•o + o•dt - o•c + o•dt + dt•dt - c•dt - c•o - c•dt + c•c = r^2

(d•d)t^2 + (2*o•dt - 2*c•dt) + o•o - 2o•c + c•c - r^2 = 0

(d•d)t^2 + 2*(o-c)•dt + (o-c)•(o-c) - r^2 = 0

Ok, that looks better. Now we can use the formula for solving quadratic equation and go on our merry way: for Ax^2 + Bx + C = 0, x = (-B +- sqrt(B^2-4AC)) / 2A. From the above equation, we get

A = (d•d)t^2   // We can optimize this to t^2 if the direction d is a unit vector.
               // To explain, first remember that a•b = length(a)*length(b)*cos(angleBetween(a,b))
               // Now if we set both vectors to be the same:
               // a•a = length(a)^2 * cos(0) = length(a)^2, as cos(0) = 1
               // And for unit vectors, that simplifies to u•u = 1^2 = 1
B = 2*(o-c)•d
C = (o-c)•(o-c) - r^2

And solve

D = B*B - 4*A*C
if (D < 0) {
  return No solution;
}
t = -B - sqrt(D);
if (t < 0) { // Closest intersection behind the ray
  t += 2*sqrt(D); // t = -B + sqrt(D)
}
if (t < 0) {
  return Sphere is behind the ray.
}
return Distance to sphere is t.

In GLSL, it's five lines of math and a comparison in the end.

float rayIntersectsSphere(vec3 ray, vec3 dir, vec3 center, float radius, float closestHit)
{
  vec3 rc = ray-center;
  float c = dot(rc, rc) - (radius*radius);
  float b = dot(dir, rc);
  float d = b*b - c;
  float t = -b - sqrt(abs(d));
  if (d < 0.0 || t < 0.0 || t > closestHit) {
    return closestHit; // Didn't hit, or wasn't the closest hit.
  } else {
    return t;
  }
}

Demos

Right, enough theory for now, I think. Here's a minimal ray tracer using the ideas above.

float sphere(vec3 ray, vec3 dir, vec3 center, float radius)
{
 vec3 rc = ray-center;
 float c = dot(rc, rc) - (radius*radius);
 float b = dot(dir, rc);
 float d = b*b - c;
 float t = -b - sqrt(abs(d));
 float st = step(0.0, min(t,d));
 return mix(-1.0, t, st);
}

vec3 background(float t, vec3 rd)
{
 vec3 light = normalize(vec3(sin(t), 0.6, cos(t)));
 float sun = max(0.0, dot(rd, light));
 float sky = max(0.0, dot(rd, vec3(0.0, 1.0, 0.0)));
 float ground = max(0.0, -dot(rd, vec3(0.0, 1.0, 0.0)));
 return 
  (pow(sun, 256.0)+0.2*pow(sun, 2.0))*vec3(2.0, 1.6, 1.0) +
  pow(ground, 0.5)*vec3(0.4, 0.3, 0.2) +
  pow(sky, 1.0)*vec3(0.5, 0.6, 0.7);
}

void main(void)
{
 vec2 uv = (-1.0 + 2.0*gl_FragCoord.xy / iResolution.xy) * 
  vec2(iResolution.x/iResolution.y, 1.0);
 vec3 ro = vec3(0.0, 0.0, -3.0);
 vec3 rd = normalize(vec3(uv, 1.0));
 vec3 p = vec3(0.0, 0.0, 0.0);
 float t = sphere(ro, rd, p, 1.0);
 vec3 nml = normalize(p - (ro+rd*t));
 vec3 bgCol = background(iGlobalTime, rd);
 rd = reflect(rd, nml);
 vec3 col = background(iGlobalTime, rd) * vec3(0.9, 0.8, 1.0);
 gl_FragColor = vec4( mix(bgCol, col, step(0.0, t)), 1.0 );
}

Demo of the minimal ray tracer.

I also made a version that traces three spheres with motion blur. You can check out the source on Shadertoy.

Demo of a motion blur tracer.

Conclusion

If you got all the way down here, well done! I hope I managed to shed some light on the mysterious negaworld of writing crazy fragment shaders. Do try and write your own, it's good fun.

Simple ray tracers are fun and easy to write and you can get 60 FPS on simple scenes, even on integrated graphics. Of course, if you want to render something more complex than a couple of spheres, you're probably in a world of pain. I should give it a try...

P.S. I put together a bunch of screenshots from my ray tracing shaders to make a project pitch deck. You can check out the PDF here. The first couple screenies have decals and clouds painted in PS, the rest are straight from real-time demos.

2013-12-22

Opus

Cooking a bit with shaders on fhtr.net. You won't see the cube on Windows as ANGLE doesn't like complex loops very much. The clouds are made by raymarching fractional brownian motion (= hierarchical noise, p=currentPoint; d=0.5; f=0; octaves.times{ f += d*noise(p); p *= 2; d /= 2; }), with the noise function from iq's LUT-based noise (and a dynamically generated noise LUT texture using the power of canvas).

I don't know where that's going. A cursed artifact, long dead. Flying neon beams lighting up the night, the last shards of the sun seeping into the iridescent tarsands. Powercrystals booming to life.

2013-10-03

PDF export tweaks

I did a small tweak to Message's Print to PDF -feature. It now uses the CSS @page { size: 1920px 1080px; } attribute on Chrome to make you nice full HD PDF slides, and falls back to normal printing on Firefox.

What I'd really like to see is a way to save the page to a PDF from JavaScript with something like window.print('pdf'). It'd open a "Save as PDF"-dialog to save a PDF of the current page. In a nutshell, it'd be a shortcut to press the "Save as PDF"-button in the print dialog, with print annotations turned off and defaulting to zero margins.

I created a Chromium issue on that: DOM API for Save as PDF.

2013-10-01

Message update - start of October

PDF Export

Let's see... what happened over the last two weeks in Message? The latest news is probably the biggest. Yes! You can now create PDFs of your presentations. Now you can grab a PDF and email your slides to everyone out there without having to worry about the browser they're using. Similarly, you'll have a good fallback for those times when you need to use computer at the venue to do your presentation.

To create a PDF of your slides, go to the "Share"-tab and click on the new "Print to PDF"-button. You'll get a print dialog where you can save your slides as a PDF, or print them if you need a physical copy. To make the PDF look good on screen, choose landscape mode in the print dialog (Page Setup in Firefox). Now click on "Save as PDF" and you should receive a nice PDF of your presentation.

Markdown template

What's that? You want to use Markdown to write your slides? Here's a template that lets you do just that: Markdown template. It's lacking a few things on the CSS side, namely auto-scaling images, so you'll have to decide how to deal with those.

To use the Markdown template, paste this template ID into the "Load Template"-input in the "Theme"-tab and hit enter: 52488616c072d10200000006

Note: loading a new template replaces your existing theme and JS. Fear not, you can revert back to them using the version history.

Spinning theme template

Here's another template, this one's based on the Basics of Three.js slide deck. It's got fancy CSS 3D animations and tilted title slides with animated headings to double up on the awesome. The name of the template is Spinning [preview] and you can load it up using the template ID 5219ec577a35d40200000002

Friendly URLs

The stable Message URLs look quite intense, right? That long list of alphanumerics in v.fhtr.net/5219ec577a35d40200000002, not very appetizing. What if you could refer to the presentation with v.fhtr.net/ilmari.heikkinen/spinning-theme? Turns out, now you can!

One caveat though, if you change the title of your presentation, the friendly URL changes as well. If you don't change the title, all is well (as long as you don't have several presentations with the same title). I'll get around to fixing that soon-ish.

Similarly, you can visit a person's profile at message.fhtr.net/username. To find out your username, head over to the Account page, click on the "Presentations"-link and look at the person=username -part of the page URL.

Business

I'm also trying to make this thing profitable and make it grow faster, thus far with no success. In that vein, I was experimenting with making a paywall for a few days (PayPal-powered Subscribe-button) and seeing if I get any subscriptions. Result: Nope. Though I had just ~6 uniques/day at that time.

After a few days, I turned off the paywall and went with in-app payments by putting the "Subscribe"-button on the presentation list page. Now you see the "Subscribe"-button if you're already signed up (and have a better idea of the app). Few days of that now, no takers yet. Not very many people on the app either.

I also experimented with AdWords to buy traffic onto the site. Got a few hundred hits, no signups. Which tells me that the landing page didn't sell for the people who clicked on the ad, and that the sign up flow is very broken.

In a nutshell: business part of the equation is still lacking very much, ditto for salable product. So... time to start looking for freelance gigs while learning about selling a product and looking for people who could help on that side.

2013-09-20

The latest motivational flicks

Here are two documentaries that will change your perspective on life.

The first one is Touching the Void. It recounts the experiences of two British mountain climbers trying to climb a previously unclimbed mountain in the Peruvian Andes. Perseverance is the word.

The second movie is Jiro Dreams of Sushi. It's a documentary about a top-rate sushi chef. His life, seventy years of trying to top yourself every single day, trying to improve your craft.

Ars longa, vita brevis, eh?

2013-09-16

Presentation books, videos and articles

Here's a small list of presentation resources that I'm collecting. Now to go through the Kindle archive to add in some more good stuff!

2013-09-15

Message weekly - Blog, Feedback forums, HTML Export, Complexity warnings

Welcome to the second weekly update on Message - the super-fast presentation app. This week brings news of yet more successful presentations done using Message and a whole lot of new features. Let's get started with discussing the new Message blog.

Blog

Message now has it's own blog up at https://message.fhtr.net/blog. The Message blog will be introducing new features, giving you presentation tips, and acting as a playground for Message's publishing features. No worries though, I'll keep doing these wrap-up posts to keep you updated.

The Message blog is built on top of Message, so it's strictly speaking a presentation. I like it as a demonstration of the power of Message's theming system. In addition, blog-style publishing requires a few extra features that will make publishing presentations nicer as well. For example, to do a blog well you need to track page views, have draft-publish workflow, load data dynamically, have nice post URLs, have a TOC, have comments and likes — most of which would improve the presentations as well. So hey, clear and motivated goals.

Fullscreen mode

If you look at the blog screenshot closely, you may notice a "Fullscreen"-button in the bottom right corner of the screen. Yes, you can now go fullscreen with your presentations.

The fullscreen mode hides the "Edit"-button as well so that you have a screen full of presentation and can run your presos straight from the editor preview. To exit the fullscreen mode, hit ESC and you'll be back to the editor in no time.

Feedback forums

I set up an UserVoice forum for Message to have a place to collect feature requests and other issues from y'all. Currently 6 issues posted, 5.5 resolved. And the remaining half issue would be PDF export of some sort. Speaking of which, @BlurSpline wrote this little snippet to create PDFs from Message presos.

If you have any ideas or find bugs on the Message site, feel free to head over to the feedback forum and create a new issue. Or ping me on Twitter: @ilmarihei.

HTML export

You can now export your presentations as HTML bundles. The bundle is a single HTML file with your stylesheets, custom scripts and images inlined into the document. So even if you don't have Internet available, you should be able to do your presentation from the bundle file.

The Export as HTML feature is still experimental, so let me know if you run into any problems. I'll keep tinkering on it (and the whole offline story) this week.

Private share links and shorter share links

As you can see in the screenshot above, the share pane now has a private share link. If you don't want to publish a preso for all the world to see, you can send the private share link instead.

In other news, I changed the public Message share links from "viewmessage.fhtr.net?page=view&id=$ID" to "v.fhtr.net/$ID". Now they look a bit nicer. And 23 characters shorter.

Complexity warnings

Ever seen a wall of text? Yes, that slide with sixteen bullet points and a five-line quote from a technical textbook. They are real, sad to say. Worst of all, they destroy presentations.

A wall of text does active battle with whatever you're saying. People can only deal with one linguistic task at a time. You can't listen to speech while you're reading and vice versa. Heck, even listening to music makes you worse at reading.

Your presentation slides should have very few words on them. Your slides are the headlines and illustrations. What you say is the article body.

Easier said than done. When you want to explain your thinking, it's so easy to add a few extra words on the slide. And perhaps a few bullet points. Maybe an inspiring quote or two.

It's too easy to make slides that make your audience stop listening to you. So, we took a swing at solving the problem.

Message now has slide complexity warnings. When your slide is getting a bit too verbose, you get a yellow "Complex Slide"-warning. It's a sign that you should pause for a few seconds when you show that slide. Maybe take a sip of water. Because your audience is going to be too busy reading the slide to listen to you.

If you keep on writing, the warning will turn into a red "Overcomplex Slide"-warning. This is the kind of slide that will likely make your presentation worse. Your audience is going to stop listening to you for the duration of this slide. Try to break overcomplex slides into three slides.

Slide complexity warnings are turned on on all new Message presentations. They need a snippet of code in the slide parser, so old presos don't have them by default. You can add the warnings to an old preso by loading a new preso as a template for the old one.

New version history names

The version history dropdown has somewhat cryptic names now. Gone are the undescriptive names like "Version 203", replaced with something like "→ 203 | T-3m 9/15/2013 15:03:20 PM | -2+39". Let me unravel that for you. The arrow points to the version at page load time. The T-3m means "this change was done 3 minutes ago" and the timestamp gives you a more exact time. The last part tells you how many characters were deleted and how many added, so that you have an idea of the size of the change.

Suppose you wanted to go to the version from yesterday, before you decided to rewrite a large part of your presentation. You'd open the dropdown and scroll it up to yesterday's edits, find one with a large number of deleted character and pick the version before that. Zing!

Paywalls

Last but not least, I'm slowly trying to move this project to the "paying my bills"-phase. And this will require asking you, dear reader, to pay for the value you're getting out of Message. I was thinking of going with a one-off payment instead of monthly subscriptions (but don't hold me to that, plans change.) Chew on the image on the right for a minute. Does it make you want to whip out your credit card or no?

I also have a mock up ad below there. Who knows if that price point is too low (for consumers) or too high (for corporates). I guess we'll find out.

That's all this week, you can see all these amazing features in action on Message. It's still in beta, so sign-up is free!

2013-09-03

Latest on Message

The last week's been busy on the Message front, here's a quick recap:

Yes, Message has code highlighting for your slides. Start a line with > and type along. I might replace that with four spaces to go with Markdown-style slides. Or heck, do a full Markdown parser for the slides and share it as a template.

Your presentations are now displayed on a nice responsive grid with big live previews of the slides. You can flip through your presos right there without necessarily having to open the editor. Magic in action, enabled by a ground-breaking technology platform.

Speaking of ground-breaking, your slides now have a full version history. Accidentally deleted a couple slides while thumb-typing? No problem, go to the version dropdown and pick a previous version. Just think of the amount of lost work you'll save with this single dropdown. We're giving all those hours back to you. You're welcome.

To protect your work even better, Message now prompts you if have unsaved changes and you try to leave the page. It's like having a friend watch out for you and catch you when you're about to have an accident.

Last but definitely not least, you can now use other presentations as templates. Made truly beautiful slide transitions for your previous preso? Now you can load them straight into your other presos. No more arduous copy-pasting of the style definitions. And best of all, you can publish your template for others to use.

To make your presos look better from word one, I changed the default theme from the old minimalistic black-on-white one to a more advanced one. It's the same theme used in the "How to use Message"-presentation. Yes, you get to enjoy the full power of Message's theming capabilities without any extra work.

In addition to the features highlighted above, the last week's new features include a list of the latest public presentations, custom descriptions for your presentations when sharing them on social networks, and Message is now using the ACE code editor for editing your slides and themes.

And hey, I can't wait to get cracking on some new themes, it's been far too long that I've been head-down in coding and market research. Creative output!

See Message at message.fhtr.net and sign up for free during the beta phase.

2013-08-25

Message

Hey peeps, I made a webapp for making presentations quickly. It's kinda rough and geeky but give it a spin.

https://message.fhtr.net

Feature highlights

super fast to write preso slides
embed images, video, audio and websites
timing marks to keep your script straight
practice count notation to help you hone your delivery
your edits are synced between devices so that you can use your phone to work on your preso in the train, then whip out your laptop at work and continue on it
full themability with Stylus
heck, you can even hack the slide parser JS to customize everything

I also use it as a mobile notepad, which is kinda nice... Maybe I should make a separate product for that.

2013-08-23

Adventures in OT, part 2

I've got a couple new tidbits to add to my previous post about implementing an operational transformation library. Operational transformations are about recording your edits and streaming them to the other collaborators. The concept is simple and the implementation isn't super-complicated, but it can be a bit tricky.

Anyway, I was hooking up the transformation library to inputs and textareas. Which, well, I guess you should just use a diff algorithm, it's probably less of a browser-compatibility hassle. I ended up doing it the hard way.

I'm monitoring events on the input element and keeping track of the cursor position using selectionStart and selectionEnd. When the contents of the element change, I figure out what kind of change it was and record a cursor movement followed by insert & delete events.

If the selection is empty, the change is an insert or a delete. If the input value is longer than the previous value, the change is an insert. If the value is shorter, the change is a delete. This logic works fine until you run into iOS autocorrect.

With autocorrect, the selection is empty but the cursor jumps multiple characters forwards. So I had to add a hack to the tune of "if the cursor jumps forward, set the selection length to the distance between last cursor position and current cursor position". Which works for simple autocorrect cases, but not for the autocorrect popup menu suggest action. Anyway, it sort of works. And this is why you should use a diff algorithm to record your edits.

Here's the code I've got at the moment:

// Setup event recording for the input element.
// Record the edits to the named field inside the changelist.
// Uses username as the name for the cursor.
var recordEdits = function(input, name, changelist, username) {
input._lastValue = input.value;
input._cursor = {
pos: input.selectionStart,
length: input.selectionEnd-input.selectionStart
};

getEventNames(input).forEach(function(n) {
input.addEventListener(n, recordEdit(name, changelist, username), false);
});
};

// Get the list of event names for the element.
var getEventNames = function(elem) {
var a = Object.getOwnPropertyNames(elem).join(" ").match(/\bon\S+/g).join(" ").replace(/\bon/g, '').split(" ");
// Take out touch events as Mobile Safari
// doesn't like touch listeners on textareas inside iframes.
a = a.filter(function(s) { return !(/^(touch|gesture)/).test(s); });
return a;
};

// Makes an event listener to record edits.
var recordEdit = function(name, changelist, username) {
return function(ev) {
// Update cursor position.
var cursor = {
pos: this.selectionStart,
length: this.selectionEnd-this.selectionStart
};

// The value has changed. Let's record the change.
if (this.value !== this._lastValue) {

// Hack to handle autocorrect edits.
if (cursor.pos - this._cursor.pos > 1 && this._cursor.length === 0) {
this._cursor.length = cursor.pos - this._cursor.pos;
}

// Handle selection replacements.
// Triggers when you've got something selected and you
// either type something or paste something.
if (this._cursor.length > 0) {
ChangeList.addChange(changelist, {edit: name, value: {cursor: {id: username, pos: this._cursor.pos}}});
ChangeList.addChange(changelist, {edit: name, value: {del: this._cursor.length}});
}

// How much the length of the value has changed.
// A positive value indicates that text insertion.
// A negative value means that you've deleted text.
var d = (this.value.length-this._lastValue.length+this._cursor.length);

if (d > 0) {
// Create a text insertion change.
// Note that we insert the text from the
// previous cursor position. The current cursor position
// is after the inserted text.
ChangeList.addChange(changelist, {edit: name, value: {cursor: {id: username, pos: this._cursor.pos}}});
ChangeList.addChange(changelist, {edit: name, value: {add: this.value.substr(this._cursor.pos, d)}});
} else if (d < 0) {
// Create a text deletion change.
ChangeList.addChange(changelist, {edit: name, value: {cursor: {id: username, pos: cursor.pos}}});
ChangeList.addChange(changelist, {edit: name, value: {del: -d}});
}
this._lastValue = this.value;
}
this._cursor = cursor;
};
};

Ok, so it's a bit hacky. And it probably doesn't work all that well on IE. Perusing the code for ShareJS, it had a comment about IE converting \n linebreaks into \r\n linebreaks. Which would screw up cursor positions compared to other browsers.

Enough about hooking up input events. The other thing I got kinda working are writes based on a log table. Usually you'd need to have a document lock to do writes reliably. Because you need to transform new events based on events that arrived from other clients. So you can't insert new changes directly into the changelist, but need to read in tail = changelist[newChanges.lastSeenVersion..-1], then rebase the newChanges on top of the tail and push the transformed changes to the changelist. Which means that you want to lock changelist during tail read, transform and transform push. Otherwise you might get other changes in before the push and your transform goes FUBAR.

But hey, if you have an atomic push, you could do this: record the lastSeenVersion for the newChanges and push it into the changelist without transformation. Then change the changelist execution function so that it transforms the changes that have lastSeenVersion set. That way the result stays the same compared to pre-transforming the changes, though it'll cost you some performance. Not to worry though, you transform the newChanges and atomically replace the non-transformed version with the transformed version. Zing! Now the changelist results remain consistent even if other writers push more junk on top of the changelist while you're transforming.

2013-08-19

Algebra with compass and straightedge

Thought: could you use a compass and a straightedge to do basic math?

Let's see. First we need to define a couple of things. Like zero. Let's use a zero-length point as the zero. Draw a long straight line, put a mark on it somewhere in the middle. There's your zero and line of numbers.

How to represent numbers? Line segments, right? Take the ruler and draw a few short lines. You can move the short lines to your line of numbers using the compass. Measure line with the compass, move the compass to the line, make a mark with the other end, there you go. Our numbers are lines that are on the number line.

With that, we can add positive numbers. Suppose you have three lines: A, B and C. To add them together, you'd first copy A to the number line, then B starting at one end of A and C at one end of B. The resulting long line is the three lines added together.

To get a proper group going, we're going to need a way to negate lines. So, let's add a direction to the lines. Put a short line A on the number line with one end at zero. Jot a small arrowhead at the other end. Now you can negate A by flipping it around so that the arrow points in the opposite direction. A + -A would go from zero to A, then all the way back to zero. To draw A and -A, first move A to start at zero, then rotate the compass 180 degrees and make a mark on the other side of zero. That's -A.

To add these arrows together, we need to add a rule. You need to attach the non-arrow start of a line to the arrowhead of the previous line. And the result of addition can be represented as a single arrow, going from the start of the first line to be added with the arrowhead at the arrowhead of the last line.

Ok, we have numbers and their negations. We can add lines and subtract lines. And they seem to obey associativity A + (B + C) = (A + B) + C and commutativity A + B = B + A. The addition operator requires us to define A as any arrow pointing to the same direction as A and that has the same length as A. Other group axioms, A + 0 = A seems to be ok, ditto for A + -A = 0.

Right, I think we have an additive group. How about multiplication?

You can multiply two of these vectors together by doing the addition but then rotating the second arrow by 90 degrees. Then add the first vector after the second, rotate it, and repeat once more for the second vector. Now you have a rectangle. The area of the rectangle is the result of the multiplication.

Now wait a minute, how do you convert that into a vector on the number line? It's an area, a completely new dimension, totally different unit from the number line segments. Right. Well, maybe this is not the kind of multiplication we really want. Let's try to find something else.

Ok, first, define one. It's some vector. Probably pretty short. Or maybe long. It's arbitrary. So we have our one.

Let's draw a multiplication line. It's just a long straight line. Jot a zero mark on there, copy the one-vector on the line. Now draw another one-vector at the end of the previous one, but rotated 90 degrees. Draw a dotted line from the zero mark to the end of the second one. There we go, 1 multiplied by 1. Is 1.

Now take some other lines, say 4 and 5. The 4-line is 4 times longer than the one-line. And the 5-line 5 times. Multiply 4 by 5. Draw the 4-line in the place of the second one-line. So that it's standing up, one unit away from zero. Then draw a line from zero through the end of the 4-line. Right, that's the line that multiplies numbers by 4. When it's one distant from zero, it's 4 units up. When the distance is two, it's 8 units up. And so on.

Now copy the 5-line on the multiplication line. And draw a normal to the multiplication line through the end of the 5-line. This is the line that multiplies numbers by 5. Where the 4-multiplier line and the 5-multiplier line cross, there's the result. The normal vector from the multiplication line to the result point is the number 4 times 5. To convert it into a number line number, rotate it 90 degrees so that it's flat on the number line.

Division is, well, there's a trick to it. You want to create the number 1/A. So let's do the multiplication line thing again. This time, draw an upright one-vector A units away from zero. Draw a line from zero through the end of that vector. Then draw a normal line one unit away from zero. The crossing point of those two lines is 1/A.

2013-08-14

Quick thought on Hyperloop

Hyperloop basics (or my understanding)

A lightweight car resting on an air cushion, propelled along a low-pressure tube by track-mounted linear motors. The car counteracts the air piling up in front of the vehicle by pumping it to the rear with a compressor

Low air pressure and air cushion make for low friction, so one linear motor can slingshot the vehicle for a hundred klick glide. The vehicle has batteries and electric motors to pump air, power the cabin and control the air bearing to maintain distance from the tube walls.

Thought experiment

Speed of sound in air is primarily determined by the square root of air temperature. If you fill the tube with hot air, the speed of sound inside the tube goes up. In 1000 C air you can do subsonic travel at 750 m/s. Go for 2200 C and it's 1000 m/s, or Mach 3 in normal atmosphere.

The thermal conductivity of air has a linear relation to its density: at 0.1% density thermal conductivity is also 0.1%. So, if you can cool the tube material and the vehicle to counter a 1 C temperature increase at normal air density, that cooling would suffice to counter 1000 C at 0.1% density.

But heat moves from cold to hot. If the air temperature is 1000 C, you can cool the vehicle by heating a part of it above 1000 C. Alternatively, you could use a thermal buffer to soak up the heat during brief hot air stints, and release the heat when transferring to a slow & cold part of the loop.

2013-08-13

Writing an Operational Transformation library

Recently I've been indulging myself in that most noble of endeavours, re-implementing existing technology for no good reason. The object in this matter: Operational Transformations. A.k.a. the thing that you can use to make multi-user-editable documents.

The concept is simple: record the user actions on the client, send them to the server, the server merges them into the document and sends the resulting changes out to other clients. The implementation is, as I have had the pleasure to discover, not quite as simple.

Sure, maintaining the list of changes, that part is easy. Have an array for the event log, push changes that come from the client into the log, send back the events that happened since the client's last synced version.

ChangeList = function() {
this.log = [];
this.version = 0;
this.newChanges = [];
this.snapshots = {};
};

ChangeList.prototype.add = function(event) {
this.newChanges.push(event);
};

ChangeList.prototype.addNewChanges = function(newChanges) {
this.log.push({version: this.version++, changes: newChanges});
};

ChangeList.prototype.changesSince = function(version) {
var log = this.log, i = this.log.length - 1;
while (i >= 0 && log[i].version > version) {
i--;
}
return log.slice(i+1);
};

ChangeList.prototype.sync = function(cl) {
var changes = this.changesSince(cl.version);
var newChanges = cl.newChanges.splice(0);
this.normalize(changes, newChanges);
this.addNewChanges(newChanges);
for (var i=0; i<changes.length; i++) {
cl.log.push(changes[i]);
}
cl.version = this.version;
};

ChangeList.prototype.normalize = function(prevChanges, newChanges) {
// Tweak newChanges according to prevChanges.
};

The problems start when you have to implement the normalize method. For operations that are associative and commutative, it's simple: you don't have to do anything. (1+2)+3 = (3+1)+2 and so forth, the order of the operations doesn't matter. Likewise for idempotent operations: abs(x) = abs(abs(x)).

The troublesome ones are the operations where the order does matter. For example, (1+0)*2 != (2*0)+1. Or, say, editing text. The raison d'être of this whole library. It being one of the complex cases does make life difficult for the poor implementer.

How does one represent text editing operations anyhow? You should at least have one operation that writes to some point in the text. And another that erases what you have written. It would also be nice to keep track of the cursor position and preserve local sanity.

Imagine that you're stuck in the middle of a sentence, trying to think up the exact word to use. At the same time, Joe Danger is hammering his keyboard at full steam on the page above you, chronicling his daring exploits in forgotten jungles and deadly deserts. The truly excellent thing for your cursor to do would be to stay there, stuck in the middle of the sentence, blinking lazily away without a worry in the world. As Joe's tale of adventure and peril grows, it should push your sentence and your cursor forward at the same pace.

ChangeList.prototype.normalizeString = function(prevChanges, newChanges) {
var i,j,k;
var cursors = {};
for (i=0; i<prevChanges.length; i++) {
var changes = prevChanges[i].changes;
for (j=0; j<changes.length; j++) {

// Find all text-editing ops in prevChanges.
var cmd = changes[j];
var edit = (cmd.write || cmd.erase);
if (edit) {
var pos = edit.pos;
var move = edit.value.length || -edit.value;

// Adjust text-editing ops and cursors in newChanges accordingly.
for (k=0; k<newChanges.length; k++) {
var ncmd = newChanges[k];
var nedit = ncmd.cursor || ncmd.write || ncmd.erase;
if (nedit.pos > pos) {

// Move new ops that come after the edit.
nedit.pos = Math.max(pos, nedit.pos+move);

}
}

}

}
}
};

Alright, now your local space make senses, even if the text elsewhere is in a state of flux. I think we're getting somewhere. Oh wait, how do you get the final text out? All we have now is a list of changes. A list of changes is completely useless unless you can execute it.

To execute the changelist, we should go through the changes and replay them. For that we need an initial state and a way to apply changes to it. In functional programming parlance, foldl applyChange initialState changes. In JavaScript, pretty much the same.

ChangeList.prototype.execString = function() {
var snapshot = this.snapshots.string;
if (!snapshot) {
snapshot = {value: '', version: 0};
}

// Execute changes on top of the snapshot.
var initialState = snapshot.value;
var changes = this.changesSince(snapshot.version);

var result = initialState;
for (var i=0; i<changes.length; i++) {
result = this.applyStringChanges(result, changes[i]);
}

// Update the snapshot to the latest synced state.
this.snapshots.string = {value: result, version: this.version};

// Apply unsynced changes to get the final result.
return this.applyStringChanges(result, {version: -1, changes: this.newChanges});
};

ChangeList.prototype.applyStringChanges = function(state, changeList) {
var i, changes = changeList.changes;
for (i=0; i<changes.length; i++) {
var ch = changes[i];

if (ch.write) {
state = state.slice(0, ch.write.pos)
.concat(ch.write.value, state.slice(ch.write.pos));

} else if (ch.erase) {
state = state.slice(0, ch.erase.pos)
.concat(state.slice(ch.erase.pos + ch.erase.value));
}
}
return state;
};

Hooray! String playback is in! Now the easy part is over! I haven't yet made textareas emit string change objects and maintain cursor state. I believe that it will be full of Bad Times fighting browser incompatibilities and APIs. But who knows, it might be simple as well. Excited to find out!

The OO-like implementation above was written solely for this post, if you want to get the slightly nutty tested version with JSON editing, here's a gist. I'll put up a proper GitHub repo once I manage to sync the edits with clientside editors and prove that it actually works.

Thanks for reading, have a cookie. 🍪

2013-07-21

10 billion user challenge

Storage

A person using a PC with a keyboard and a mouse can generate something like 100 input events per second. Assuming we can compress those to two bytes per event, the maximum a person can output on a computer is around 200 bytes per second.

In 24 hours, that'd amount to around 17 megs. A year would fit in 6 GB. If you wanted to have the system used by everyone in the world, it'd generate 173 petabytes per day and 63 exabytes per year. With double redundancy, you'd have to store 126 exabytes of data.

If you were to store the data on 4TB hard disks, you'd need 31 million of them. At current prices, that'd come to 5.6 billion dollars per year, which comes to 56 cents per person. (If you bought solar PV to power the drives and expected 20 year panel lifetime, you could power each drive at 35 cents per year. Total electricity cost $10 million a year.)

Let's optimize that a little bit. The input stream is likely to average less than 1 byte per second as not everyone swings their mouse around every second of the day. And most people sleep a third of their day. The input event stream is likely to be highly compressible as well. Good compressors squeeze text to a fifth of the original size. Putting all of the above together, the compressed realistic input stream per person might be around 0.1 bytes per second.

Now the per person storage per year shrinks to 6 megabytes with double redundancy. The disk cost falls to 2.3 million per year. Total electricity $5000 per year.

Access

It's not enough to just store all the things made by everyone, everywhere, forever. You want to have people following interesting events and broadcasting what they're doing. Suppose the latest rubber-stamped planetary dear leader wants to address every single person out there on the subject of taxes, general mood, weather, etc. The resulting 3 hour monologue composed mostly of hastily-written text messages needs to be sent to all the ten billion humble subjects in real-time. At 160 bytes per minute, the combined bandwidth consumed by the stream would be 27 GBps.

Bear in mind that all the also listeners want to tell their 150 closest friends about their thoughts about dear leader's hair-do and dance moves. Each person is now on the receiving end of 150 streams and is sending out a stream to 150 listeners. At 160 bytes per minute, the total bandwidth used by each person in each direction is 400 bytes per second. Each person would also produce something like three random reads and writes per second. The combined bandwidth used by the population would sum up to 8 TBps and require 60 billion IOPS.

To achieve 60 billion IOPS, you'd need around 6,000 DRAM chips. Or 600,000 super-fast SSDs. Let's say that you can spend a bit of time in RAM and turn the random writes into streaming log writes. If you needed 10 TB of DRAM to achieve that, the cost would be around $70000 (DDR3 spot price for a 512MB chip is $3.5). For the SSD solution, you'd have to spend 60 million. On the other hand, you would also get an extra 77 PB of fast storage.

Software

What kind of math lets you piece all this stuff together? ( ._. )

2013-07-10

The $1000 cluster

$50 gets you a quad-core ARM A9 with 2 GB of DDR3 and 8 GB of flash. Buy twenty. Plug them into a $80 switch with 24 ports.

Hello, kinda slow 80-core cluster with 40 GB RAM and 160 GB flash, plus 80 ARM Mali-400 GPU cores. Total bill of materials $1080.

You get something like peak 192 DP GFLOPS (512 SP) from the CPUs, 480 GFLOPS from the GPUs, 60 GBps aggregate memory bandwidth, 10-20 megs of L2, 5 megs of L1. And all the fun of programming a cluster with a 100 Mbps interconnect.

That said, if you have the right workload, you might have something interesting here. I kinda want to build one just for the fun of it.

Or, wait for the A15 with Mali-T600 series to go mainstream. You'll get double the CPU performance, OpenCL, and triple the GPU perf.

2013-06-10

Pivoting for no good reason

I moved PoemYou away from being a hypothetically useful greeting card service to being a completely useless ephemereal link stash. With five embedded media items in boxes and no history of previous items held in them. I sort of like the feeling. And it only works on Safari and Chrome. Because CSS 3D suffers from the browser 3D engine problem, namely that browser devs aren't very interested in being 3D engine devs [edit: Sorry, this is too snarky. It's an uncommon use for CSS 3D and it's difficult to implement -> not very high priority.] And the same goes for most sane web developers.

Still, it's got some cool cloud effects on Safari. The clouds are stretched 64x64px divs with gradient fills and low opacity. The effect looks surprisingly okay and performs alright.

But where to go from there. I could make it into a blogging platform. Give every user a box. Let them put something inside it. Have a page with the boxes of the people you are following. Have a birthday every day opening your new presents.

For mobiles and non-CSS 3D devices, I'd like to turn the boxes into letters or something equally... flat. I think you could still make some nice reveal animations for them, maybe a PNG spritesheet or three.js canvas renderer. Or pre-render the frames into a spritesheet. Anyway, the reveal, the reveal, it's the deal.

2013-05-16

Adventures in page speed optimization

I started building this little virtual gift box / greeting card service a few months ago, based on a birthday card I made for my wife. My progress on that has been quite stop-and-go due to paying freelance work taking priority. And inevitably, I got sucked into the whole spiral of optimizing page load speed (instead of, you know, making the graphics amazing and finding out if people actually want to use it. Oh well, failing builds character, no?) Anyway, here's the story of the page speed optimizations I've got in place thus far.

The page I started off with was a pretty basic setup with an HTML file with stylesheets in link tags, a half dozen PNGs and a bunch of scripts with a window.onload handler to start running the animation loop. Nothing too exciting, and not really all that fast to load. And frankly, the page load times were perfectly acceptable. I just got into the optimization thing. So hey. Go for it, eh?

My first step was to deploy the whole thing to S3 with a small script that sets up the Expires-headers as I want them and sends all text content over gzipped with the Content-Encoding header set accordingly. I don't actually have any uncompressed assets on the server, everything's compressed beforehand and set as Content-Encoding: gzip. So I don't need a server to do on-the-fly compression and save a few kB of disk space (ha.) The Expires-headers are set to one day for HTML and JS, and a year for the unchanging images. Still, I need to go and update the headers daily as they're static on S3.

The second thing I did was set up the serving from the Amazon CloudFront CDN, to make the page load reasonably quickly from locations around the world. CloudFront is pretty easy to get set up, just make it track an S3 bucket and create a CNAME record to give the CloudFront distribution a more friendly name. I'm serving everything from CloudFront at the moment, which does make testing changes on live a bit of a hassle as I need to invalidate the index.html document to force updates, and those take a half hour or so to go through.

With the simple things out of the way, I started changing the page structure and loading order to make things faster. I had a bunch of social buttons on the page, which were causing extra scripts to load and some maybe even some rendering hiccups. Let's load those only when they're needed. I've got a front page that doesn't need any of the social stuff, so I made the social scripts get injected by JS when flipping to the sharing page. Now the front page is fast to load and doesn't make spurious script loads. The only problem was that I had to go and figure out how to call the social scripts to make them initialize the button code on the sharing page.

I also used the Closure compiler to bundle and minify all my several JS files into a single bundled file, getting rid of several HTTP requests during page load in the process. The amount of CSS I had was small enough that I decided to just inline it into the page and save an HTTP request by doing that as well. Later on I moved to using Compass/SASS to compile my CSS and then inlined it into the page in my build script. I also ended up inlining the JS in the build script to get rid of another HTTP request.

After inlining the scripts and the CSS into the HTML, I was down to a single HTTP request for the textual page content. I started off with around ten HTTP requests for loading the page, so this made the page a good deal faster on bad networks. And as S3 charges per request, it's also a wee bit cheaper.

I still had seven images loaded on the page, keeping the page request total at around 10. And they were a bit large filesize-wise. So first I used Compass's sprite generator to bundle all the images into a single image sprite and then used ImageOptim to squeeze the resulting PNG even smaller. This got my page HTTP requests down to 3: the HTML with the inlined JS and CSS, the image sprite, and the favicon.

I also wanted to have Google Analytics on the site to figure out how people are using the site and how I could make it better. The Analytics script is kind of like the social scripts: it injects a script tag to the page, which may slow things down a bit. Well, eh, let's take control over that aspect as well. I moved the Analytics script to get injected after the window onload event fires and the first frame is ready to render. Which got my time from navigation start to first frame down by a little bit.

On my machine with cold cache, the page now renders the first frame in around 120 ms from the start of navigation, depending on CPU speed, network latency, DNS and the phase of the moon. On hot cache the first frame is at around 60-70 ms. So it's pretty fast to load. Still, I'd like to knock down the current 60 kB gzipped page weight to something smaller. Perhaps replace the images with procedural stuff. On second thought, screw that. Make better art instead.

Thank you for reading all the way here, have a cookie. Also, all feedback welcome. Cheers.

2013-04-27

Earth, the beautiful

Boolean Algebra, the crunchy version

First some ring axioms:
x + (y + z) = (x + y) + z
x + 0 = x
x + -x = 0
x + y = y + x

x * (y * z) = (x * y) * z
x * (y + z) = (x*y) + (x*z)
x * 1 = x
x * y = y * x

Let's define AND next:

a AND b = a * b

a AND 1 = a because x * 1 = x

a AND 0 = 0

x * 0 = 0 derives from
x * (y + z) = (x*y) + (x*z): call with z = 0 =>
x * (y+0) = (x * y) + (x * 0) <=>
x * y = (x * y) + (x * 0) <=>
x * 0 = 0

For 0 AND b = 0 and 1 AND b = b, apply x * y = y * x to the above.
Applying all the 0,1 combinations to AND we get the truth table:
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1

Then NOT:

NOT a = 1 + -a

NOT 0 = 1 + -0, apply -0 = 0 and x + 0 = x

NOT 0 = 1

-0 = 0 because

(0 + -0) = 0, apply x + y = y + x

(-0 + 0) = 0, apply x + 0 = x

-0 = 0

NOT 1 = 1 + -1, apply x + -x = 0

NOT 1 = 0

When we combine NOT with AND, we get NAND:

a NAND b = NOT (a AND b)

NOT (0 AND 0) = 1

NOT (0 AND 1) = 1

NOT (1 AND 0) = 1

NOT (1 AND 1) = 0

Writing it out as an equation:

a NAND b = 1 + -(a * b)

NAND is functionally complete so we can write all the 16 possible (binary, binary) -> binary -functions using only NANDs.

2013-04-21

Boolean Algebra

I started implementing the logic gates in JS for the heck of it. In the process I figured out an interesting algebra for logic circuits. Let's start off with writing some basic JS logic to simulate an abstract transistor. This crummy transistor should give out a 1 when its two input pins are 1. And if either of those is 0 it should return a 0. Let's see:

Transistor = function(control, input) {
  if (control) {
    return input;
  } else {
    return 0;
  }
};

Hmm, well, that does work but it's a bit ugly. And if you know about transistors, they're not really like that. You can use a transistor to amplify a signal. Hook up a faint signal to the control and a powerful input and the transistor scales the input by the control signal. Hey, it's like multiplication!

Transistor = function(control, input) {
  return control * input;
};

Ok, so a transistor is the multiplication operator. How does that work with binary logic? If we take the set {0, 1} as the type of the inputs for both control and input, all works fine: a 1 in the control passes the input through unchanged, a 0 in the control makes the input 0. From that you can see that 1 is the multiplicative identity and multiplying by 0 annihilates the other operand.

To make a NAND gate we need a bit more than just a transistor though. We either need inverse transistors (for a CMOS NAND) or a ground wire (for a NMOS NAND). The NMOS version is a bit simpler so let's write a ground function for it next. A grounded circuit returns a 0 if the ground wire is connected, otherwise it passes the input through untouched.

Ground = function(ground, input) {
  if (ground) {
    return 0;
  } else {
    return input;
  }
};

Again, that's a bit ugly. Let's turn that into an arithmetic representation on our {0, 1} algebra.

Ground = function(ground, input) {
  return (1-ground)*input;
};

Right, it seems we had to introduce a few new concepts there. Algebra-wise 1-ground is shorthand for 1 + -ground. Which means that we need a second operator for our algebra: the addition. And we also have the additive inverse, thanks to the minus sign there. And as 1-0 should return 1, we also get the additive identity. Add 0 to any input and what you get is the input, unchanged.

The algebra we have now has addition, additive identity, additive inverse, multiplication and a multiplicative identity. I think we also have to expand our set to include the additive inverses, netting us the set {-1, 0, 1}. I haven't found a use for the negative numbers in the binary logic though. The alternative would be to turn (1-x) into an axiomatic operation of some sort, I guess.

Oh, also, the inverse transistor has the same formula as the ground operator (if a different wiring implementation).

ITransistor = function(control, input) {
  return (1-ground)*input;
};

Ok, enough setup, let's make a NAND gate! A NAND gate takes two inputs and outputs a 0 if both its inputs are 1, in other cases it outputs a 1. An NMOS NAND gate has two transistors controlled between a voltage source and ground, with the idea that if both transistors are on, the source is connected to the ground and the output signal is zero. And if either of the transistors is off, the source is not connected to the ground and the output signal is the source voltage.

NAND = function(a, b) {
  return Ground(Transistor(b, Transistor(a, 1)), 1);
};

The thing about the NAND gate is that it's functionally complete. You can implement any of the other binary logic gates using NAND gates. Don't believe me? Watch this!

NOT = function(v) {
  return NAND(v, v);
};

BUFFER = function(v) {
  return NOT(NOT(v));
};

AND = function(a, b) {
  return NOT(NAND(a, b));
};

Got that? NAND with the same signal to both inputs is a 1 if the signal is 0 and a 0 if the signal is a 1. And AND is the inverse of NAND. Still not enough? Let's do OR, NOR and XOR as well.

NOR = function(a, b) {
  return AND(NOT(a), NOT(b));
};

OR = function(a, b) {
  return NOT(NOR(a, b));
};

XOR = function(a, b) {
  return AND(OR(a, b), NAND(a, b));
};

With that we have functions to generate the truth tables 1110, 0001, 0111, 1000 and 0110 (I'm writing the entries as the outputs for the 2-bit numbers 00, 01, 10 and 11. So 1110 means the function returns the following: 00 -> 1, 01 -> 1, 10 -> 1, 11 -> 0.) But hey, there are 16 different permutations for those truth tables. Let's write more functions.

// 0000
ZERO = function(a, b) {
  return AND(AND(a, b), NAND(a, b));
};

// 0001 = AND
// 0010
XLEFT = function(a, b) {
  return AND(a, NOT(b));
};
// 0011
LEFT = function(a, b) {
  return AND(a, ONE(b, b));
};
// 0100
XRIGHT = function(a, b) {
  return AND(b, NOT(a));
};
// 0101
RIGHT = function(a, b) {
  return AND(b, ONE(a, a));
};
// 0110 = XOR
// 0111 = OR
// 1000 = NOR
// 1001
XNOR = function(a, b) {
  return NOT(XOR(a, b));
};
// 1010
NRIGHT = function(a, b) {
  return NOT(RIGHT(a, b));
};
// 1011
XNRIGHT = function(a, b) {
  return NOT(XRIGHT(a, b));
};
// 1100
NLEFT = function(a, b) {
  return NOT(LEFT(a, b));
};
// 1101
XNLEFT = function(a, b) {
  return NOT(XLEFT(a, b));
};
// 1110 = NAND
// 1111
ONE = function(a, b) {
  return OR(AND(a, b), NAND(a, b));
};

As you can see from the above, you can implement all the binary functions in {0, 1} using only NAND gates. And since our definition of NAND is in terms of a transistor and a ground operator, which are grounded in some basic arithmetic operations, you can actually write the gates as arithmetic functions.

NAND = function(a, b) {
  //   (1 - (b * (a * 1)) * 1
  // = 1 - (b * a)
  return 1 - b*a;
};

NOT = function(v) {
  // 1 - v*v (v*v = 0 when v = 0, v*v = 1 when v = 1)
  // so in {0,1}, 1-v*v = 1-v
  return 1 - v;
};

AND = function(a, b) {
  //   1 - (1 - b*a)
  // = 1 - 1 + b*a
  return b*a;
};

NOR = function(a, b) {
  // (1-b) * (1-a)
  return (1-b) * (1-a); 
};

OR = function(a, b) {
  //   1 - ((1-b) * (1-a))
  // = 1 - (1*1 + 1*-a + -b*1 + -b*-a)
  // = 1 - (1 - a - b + ba)
  // = 1 - 1 + a + b - ba
  // = a + b - ba
  return a+b - b*a;
};

XOR = function(a,b) {
  // (1-ba)*(a+b-ba)
  return (1-b*a)*(a+b-b*a);
};

And so forth.

Oh, and one more thing. You know the CPU inside your computer? It's made up of NAND gates. Lots and lots of NAND gates hooked up together to compose all the logic in the CPU. And if you think of each NAND as the equation 1-ab, you could think of the CPU as one humongous equation etched onto a silicon chip.

I've got the code for the logic gates and some adder generator code up on GitHub.

2013-04-18

NAND gates

And this should ground the previous to transistors and voltage planes.

I wonder if the implementation of an actual adder circuit would be more complicated. Probably lots of implementation details to consider. But yeah, hook the input pins of the adder to switches, put LEDs on the output pins.

2013-04-17

Circuit diagrams

I was listening to a Feynman lecture on what computers are and started doodling logic circuits based on that. First I made a 1-bit adder, then extended that into a 2-bit adder, doodled a 3-bit adder and generalized from there to arbitrary n-bit adders. Then I made a 4-bit adder out of two 2-bit adders and drew this thing in Photoshop that generalizes the idea to 2n-bit adders. And tried building the rest of the logic gates using just NAND gates. Fun fun :)

I tested the logic gates and the 1-bit and 2-bit adders in JavaScript but haven't tested the 4-bit and n-bit adders.

2013-02-04

Hosting a static website on AWS

Here's a quick guide to hosting a static website on Amazon Web Services. In this guide we set up a root domain for your site content, a www subdomain that redirects to the root domain, and a cdn subdomain that uses Amazon CloudFront as a content distribution network to make your site load faster.

Setting up S3 buckets for hosting the files

First off, go to aws.amazon.com and either log in or sign up for an AWS account. You'll need a credit card for that, mind. Now that you're logged in, go to the S3 console. There? Great!

The second step is to create a new bucket for your site content. Click on the big blue button that says Create Bucket. For the name, enter the domain of your website. Pick the region closest to your intended audience.

Now let's upload the site content into the bucket. Click on the name of the bucket to enter it. Now click the blue Upload button at the top of the page and add the files you want to upload.

As an aside, if you want a nice programmable way to upload files, install this aws shell script and maybe use this automagical hacky upload script of mine. You can also use one of the graphical clients like 3Hub.

Now go back to the S3 console buckets list and click on the magnifying glass next to the bucket name. Look to the right side of the page and select Static Website Hosting from the properties. Click Enable website hosting, write index.html to the Index Document field and click Save. Almost online now! You can see the URL for the bucket endpoint above the website hosting toggling. But if you try to visit the bucket endpoint URL now, it will give you an access denied error.

To make the site publicly visible, we need to edit the permissions for the bucket. Click on Permissions under the bucket Properties. Now click on Add bucket policy and paste in the following policy (remember to change MYBUCKET to the name of your bucket).

{
  "Version":"2008-10-17",
  "Statement":[{
 "Sid":"AddPerm",
        "Effect":"Allow",
   "Principal": {
            "AWS": "*"
         },
      "Action":["s3:GetObject"],
      "Resource":["arn:aws:s3:::MYBUCKET/*"
      ]
    }
  ]
}

Click on Save and then the blue Save button. Done! We're online! You can go to the bucket endpoint URL now and see your site load.

Using your own domain with Route 53

To use your own domain for the site, you can create a CNAME record from www.your_domain to the bucket endpoint URL. Now requests to the www.your_domain go to the bucket, zing! To make the root domain (your_domain with no www) go to the bucket as well, you either have to use Amazon's Route 53 to host your domain's DNS records or (if your DNS provider supports it) set up a redirect page to the www.your_domain from the root domain.

To use Route 53 for hosting your domain's DNS, go to the AWS Console and select Route 53 from Services. Click on Create Hosted Zone, enter your domain name and click on the Create button at the bottom of the form. Ok, now see the four server names in the Delegation Set? Go to your domain registrar and set those servers as the name servers for your domain.

Now tick your hosted zone to select it and click on the Go to Record Sets button. Click on Create Record Set. Leave the subdomain field empty, create an A-record, set Alias to yes and pick your bucket. Now requests to your domain will be served from the bucket. To make www.your_domain go to the bucket as well, create a CNAME record that points to your_domain.

Content delivery network using CloudFront

Let's turn on the CDN now. Go to Services in the AWS console and click on CloudFront. To create a CDN for your bucket, click on Create Distribution. Select Download as the type of the distribution. Type the name of your bucket into the Origin Domain Name field and pick your bucket from the list of completions. Most of the form is alright as-is, so scroll down to Distribution Settings. Enter cdn.your_domain as the CNAME. Set default root object to index.html. Scroll to the bottom of the page and click Create Distribution.

To set up the cdn.your_domain subdomain for the CloudFront distribution, copy the distribution domain name (click on the [i] -button if you're in the list of distributions) and create a new CNAME record that directs cdn.your_domain to the distribution domain name. There we go, your CDN should be live now.

Now when you have images on your website, remember to use point them to your CDN server for best loading performance. Instead of src="http://my_domain/image.jpg", use src="http://cdn.my_domain/image.jpg". The CDN works best for content that change rarely if ever. For frequently changing files it's best to use regular S3, otherwise your edits take anywhere from a few minutes to an hour to appear on the site.

If you need to force a file to be reloaded on the CDN, you can go to Distribution Settings for the distribution (the [i]-icon). On the settings panel, go to the Invalidations tab. Click on Create Invalidation and enter the filenames in the bucket that you want to invalidate. The invalidation takes a few minutes to work and forces the CDN to refresh the invalidated files. You can do it a couple times a month for free, but if you do it a lot it's going to start costing money.