white one - Making of


white one is my first 4k intro and my first serious demoscene production (as far as something like that can be serious). I'm new to C coding and to sizecoding in particular, so there were a lot of things to be learned which I'll try to summarize here. Download and run the executable (nvidia only, sorry) or watch the video capture first:

A 4k intro is an executable file of at most 4 kilobytes (4096 bytes) that generates video and audio. That is, it puts something moving on your screen and something audible on your speakers. The finished product runs for a few minutes, has some coherent theme and design and, ideally, sound and visual effects that complement each other. On top of that, it's a technical challenge: It's impossible to store 3D models, textures or sound samples in 4 kilobytes, so you have to generate these things at runtime if you need them.

Overwhelmed by the impossibility of all this I started messing around.

I had been lurking at a few demoparties, but never released anything nontrivial - I do know some OpenGL, but I am normally coding Lisp, which tends to produce executables that are measured in megabytes. Obviously, that had to change if I wanted to contribute a small intro. Playing with the GL Shading Language had always been fun for me, so it was clear that something shader-heavy was the only option. And I had some experience with C from microcontroller hacking.

When some friends of mine implemented a cellular automaton in a fragment shader in summer, I took their code and started messing with the shader. A cellular automaton implements a function on a grid of cells (here: pixels). The function takes the whole grid as input and produces a single pixel's value as output. To get one new frame, it has to be applied once for each pixel, using the previous frame as input.

A very simple function of that kind would, for example, assign to each pixel the value of its left neighbour, thus moving the whole image to the right by one pixel with each frame.
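A minimal sketch of that shift rule in Python, on a tiny grid of numbers instead of pixels. The function name `step` and the wrap-around at the left border are assumptions made for the example; the shader itself runs on the GPU and looks nothing like this.

```python
def step(grid):
    """Apply the rule once: every cell takes the value of its left neighbour."""
    height, width = len(grid), len(grid[0])
    new_grid = []
    for y in range(height):
        row = []
        for x in range(width):
            # Assumption: wrap around at the left border. Clamping would work too.
            row.append(grid[y][(x - 1) % width])
        new_grid.append(row)
    return new_grid


frame = [[1, 0, 0, 0],
         [0, 2, 0, 0]]
frame = step(frame)  # the whole "image" has moved one cell to the right
```

Note that the new frame is built from the old one only; writing into `grid` while reading from it would mix two generations.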

[Image: foobar.png]

The Shader

(you might want to skip this section if you are more interested in how to make 4k intros in general)

First, I experimented with a function that looked like this:

1. Look up this pixel's current color.
2. Convert it to HSV colors. 
3. Use the Hue value as an angle and walk (dist_s*S + dist_v*V + dist_sv*S*V + dist_base) pixels in that direction in the image, where the dist_something values are arbitrarily chosen by me.
4. Look up the color value from there and output this as the pixel's new color.
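The four steps can be sketched on the CPU roughly like this. This is an illustration only, not the intro's GLSL: the function name `feedback_step`, the nearest-pixel lookup and the wrap-around at the image borders are assumptions (the real shader reads a texture with bilinear filtering).

```python
import colorsys
import math


def feedback_step(image, dist_s, dist_v, dist_sv, dist_base):
    """One frame of the rule. `image` is a list of rows of (r, g, b) floats in [0, 1]."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            r, g, b = image[y][x]                    # 1. this pixel's color
            h, s, v = colorsys.rgb_to_hsv(r, g, b)   # 2. convert to HSV
            angle = 2.0 * math.pi * h                # 3. hue as an angle ...
            dist = dist_s * s + dist_v * v + dist_sv * s * v + dist_base
            # ... and walk `dist` pixels in that direction.
            # Assumption: nearest pixel, wrapping around at the borders.
            sx = int(round(x + dist * math.cos(angle))) % width
            sy = int(round(y + dist * math.sin(angle))) % height
            row.append(image[sy][sx])                # 4. color found there
        result.append(row)
    return result
```

With all `dist_` weights at zero except `dist_base`, every pixel walks the same distance and only the hue decides the direction, which is a handy sanity check before turning the other knobs.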

I made a small program that allowed me to play with the four parameters. Depending on the choice of the four parameters, interesting yet ugly things happened, e.g. solid-color patches would move about the screen until they collided and merged to a single color, or something vaguely resembling smoke would appear.

I realized I wanted some more features and added new parameters to adjust the output color in HSV color space, e.g. rotate the color wheel by incrementing H, decrease/increase the saturation S and darken/lighten the image by modifying V. A new parameter for weighted averaging between the old and the new image was also introduced. Interestingly, instead of using values between 0 and 1 for this blending, I ended up using values below 0 or above 1 for many of the more interesting effects :)
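To see what a blend weight outside [0, 1] does, here is a small sketch of weighted averaging between two colors. The name `blend` and the convention that `mix` weights the new image are assumptions; the post does not say which way round the shader does it, or whether it clamps the result.

```python
def blend(old, new, mix):
    """Weighted average of two (r, g, b) colors; mix=0 gives old, mix=1 gives new."""
    return tuple((1.0 - mix) * o + mix * n for o, n in zip(old, new))


grey, white = (0.5, 0.5, 0.5), (1.0, 1.0, 1.0)
halfway = blend(grey, white, 0.5)     # ordinary interpolation between the two
overshoot = blend(grey, white, 2.0)   # extrapolates past `new`
undershoot = blend(grey, white, -1.0) # extrapolates away from `new`
```

Outside [0, 1] the formula no longer averages but extrapolates: it exaggerates the difference between the two frames instead of smoothing it out.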

The next addition was a parameter to control smoothing the image. A value of 0 would mean no smoothing, 1 would mean a standard smoothing kernel of [[0 1 0] [1 4 1] [0 1 0]]. We are now only one more parameter away from the first effect that actually made it into the final intro which is also the first one on the screen:

(This will be the last time I bore you with concrete numbers)

smooth: -2.400000
dist_base: 1.400001
dist_s: 3.099999 
dist_v: -1.900000
dist_sv: -2.400000
mix_bias: -0.480000 (alpha blending between frames)
h_cycle: 0.020000 (cycle the color wheel by a small amount)
rotate: 0.001000

As you can see, the smoothing parameter is outside its normal range of [0,1] as well, and the kernel that is defined this way looks a lot like a sharpening kernel. If you look closely at the very beginning you can see checkerboard-like pixels in the center that are due to this sharpening. The one missing parameter, called "rotation", rotates the input image by a certain angle before reading from it - or so I thought. I discovered later that I had used the rotated image just for the first read (the one determining in which direction to walk for the second read), but the second read was on the unrotated image. When I fixed this, everything became a lot less interesting, so I kept the bug.
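Why a negative smoothing value sharpens can be shown with a small sketch. It assumes, and this is only a guess at how the parameter works, that the value blends linearly between "no filter" (the identity kernel) and the smoothing kernel [[0 1 0] [1 4 1] [0 1 0]] normalized by its sum of 8. The function name `smoothing_kernel` is made up for the example.

```python
def smoothing_kernel(smooth):
    """3x3 kernel for a given smoothing value (assumed linear blend, see above)."""
    identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
    blur = [[0, 1, 0], [1, 4, 1], [0, 1, 0]]
    norm = 8.0  # sum of the blur kernel's entries
    return [[(1.0 - smooth) * identity[j][i] + smooth * blur[j][i] / norm
             for i in range(3)]
            for j in range(3)]


kernel = smoothing_kernel(-2.4)
# Center weight is about 2.2, the four direct neighbours about -0.3 each:
# a positive center with negative neighbours is the shape of a sharpening kernel.
```

Under this assumption the weights still sum to 1 for any value of `smooth`, so overall brightness is preserved while local differences get amplified.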

In the periphery, the image gets blurred because it is rotated by less than a full pixel and OpenGL's bilinear texture filtering blends neighbouring values. If you look very closely in the final 1080p production, you can see four areas where the rotation is just about one pixel and the checkerboard pattern re-emerges. I also introduced a parameter for zooming that works just like the rotation parameter (i.e. it is broken as well).

The last parameter controls alpha blending of the final image with an additional image. The reason I had to do this is that some effects can get stuck unpredictably: Once the screen is of a solid color it will stay a solid color forever. I needed a way to re-introduce some variation in the image to get the effects started again in this case. Naively, I thought there would be enough space in the 4 kilobytes to render some primitive geometry (procedural trees and such) so I could use those as a starting point for an LSD trip. In the end the rendered geometry just became a rotating white "1" (|) that is only visible in some effects. Anyway, now there are two inputs: the last frame and some rendered geometry.

Here is the final shader code in GLSL. The rgbToHsv and hsvToRgb functions are not pasted here and behave as you would expect ;)

So this is basically the heart of the intro. It is arguably less than readable since we had compressed the code a bit already. GLSL code must be given as a string to OpenGL and is compiled by the graphics card driver, so the whole shader code has to be present in the executable. Later, all variables and functions would be renamed to one-character names and so on, making it even less readable.

After some time I had collected about 70 different parameter sets that I had found interesting, many of them similar and some ugly. My eyes hurt. Here are some more examples:

[Images: bla-1.png, bla-3.png, bla-4.png, wellen_almost_converged.png]

The Rest

Now that the shader functionality was fixed (any change would invalidate many of the 70 precious effects), I needed to wrap it in a 4k executable, get some music and make some kind of timeline. Fortunately there are now some nice tools that help the aspiring demo coder with these tasks. For the music, many 4k intros today use 4klang as a synthesizer. 4klang is designed for 4k intros and produces an object file containing the synth, the instruments and the music data, which you can link against. I also found iq's 4k framework, which is more of a collection of frameworks for different kinds of 1k and 4k intros. I just took the one for OpenGL shader-on-a-quad intros and started reading its code, figuring out how it works. The third tool is Crinkler, a size-optimizing and compressing linker that replaces/calls Visual Studio's built-in linker and is probably used in 95% of today's 4k and 64k intros on Windows. Since Crinkler is Windows-specific, now was the time to switch operating systems and start Visual Studio for coding on Windows for the first time in ten years or so. After a few nights I even stopped using emacs shortcuts.

In the meantime I had coaxed my brother into composing music. While he familiarized himself with the 4klang workflow (using OpenMPT for composing) I continued experimenting with the framework. So now we had become a team of two and kept each other updated while I was distracted for a few months by my last exam. This is how it went for him:

Music

Clemens speaking. So, to be honest, I didn't know at all what I was getting myself into - I'm new to music making, had never seen a tracker or stack-based synthesizer before, and because I live far far away, I knew nothing about the visuals but a few old screenshots.

To my surprise, I had a lot of fun. Once you understand stack-based synthesis, 4klang is straightforward and gives you really tight control of the sound - you just play and create. There's also not much of an interface to get in your way and I miss that when going back to the pseudo-analogue button soup of Logic or Ableton. But since it's so open, you've got to play with it for a while to find out what you can do.

  • Don't be afraid of 4klang's store unit, because that could well be the most awesome little feature of this synth. It allows you to store the current top value of the stack in (almost) any parameter of (almost) any other unit. That's a simple concept, but it really pays off to follow alcatraz' advice to go crazy with it. You can build multi-purpose instruments that behave differently based on the current pitch. Or get evolving sounds, by adding LFOs or envelope units to provide you with control values that change on a per-note-timescale. Then modulate the parameters of these generators, again, with the current pitch or with a dedicated control instrument to get evolving and pitch-dependent sounds. Add more meta levels. Try and see what happens if you filter the control values before feeding them into Store units. And so on. It's really a lot of fun.

(4klang) Drums by umruehren

(4klang) Gebrumml Gbrumml by umruehren

  • Store the output of your main envelope into said envelope's own gain parameter. I'm still not convinced that this should work, but it will add a lot of punch to the sound, e.g. for a bass drum.

  • I had some problems with phase shifts (for instruments that made heavy use of the reverb unit), resulting in louder-than-usual click noises between notes. In that case, you can get quite far by hand-adjusting the phase and/or detune of the offending oscillators via control instruments... for each note change. Note the glitches:

(4klang) jell-o by umruehren

So now for the really difficult part: What kind of sound will match the visuals? And how do you write music, anyway? Sadly, there's not much to say about my creative process - because, obviously, when I say "creative process" I mean "messing around until something works or time's up". At least I managed to decide on "minimalistic but full of little fuzzy things" as a general aim before I started. But what does that even mean? So, instead of planning, I built some 4 or 5 proto-tracks that all went into the trash can for very good reasons (but each one was a necessary step) until suddenly, well, time was up and we stuck to the last one.

I'm almost sure that that is the only correct way of doing things.

(4klang) Mod4- by umruehren

Music size optimization

About two weeks before the party, I had a track ready. But at 2.5k compressed (about 500 bytes non-negotiable 4klang baseline + 1k for instruments + something short of 1k of note values) that was completely unusable for our 4k. So something had to go.

The final track weighed in at about 1.5k compressed, which means I trashed a kilobyte of instrument definitions and note values. The price for that is of course a reduction of the richness-of-sound (you delete less important instruments, merge instruments, simplify instruments) and reduction of variety (reduce the number of different substrings in each instrument track, delete less brilliant parts and generally repeat more stuff). Strangely, I think if anything, the track got better because of the size constraints.

Mastering

There's just one big regret I have: We should have tested the track on a range of different speakers before submitting, because obviously the party PA behaved differently from the Nubert HiFi speakers we used for the final "mastering". Using the HiFi, I evened out a lot of high frequencies a few days before the party, and the result was mud. That's probably a rookie mistake so don't do it.

One last thing. I've read a few comments by non-coder musicians who complained that 4klang wasn't nice and easy enough - you're just being lazy. All the hard work has been done for you. So grab the chance if someone asks you to learn this stuff - it's a lot of fun and kind of addictive.

So much for the music.

Finishing

I got 4klang plugged into the framework with halcy's help, which proved more difficult than I had expected because the executables I first compiled (even the 4klang example) produced weird glitches on my machine, but not on other machines... Eventually we got it working (without understanding the problem) while my brother's musical output became better and better. This was about a month before tUM, the demoparty where we wanted to release the yet-unnamed intro.

Again with the help of some friends at entropia I got the basic functionality of the Linux feedback-explorer ported into the framework. I eliminated many of the 70 original effects, reducing them to about 20, and identified some unnecessary parameters. Still, 20 times 14 float values of four bytes each (1120 bytes) would be a bit too much to fit in the 4 kilobytes together with the shader, the music, the C glue and the yet-imaginary timeline. I edited the table by hand and with Python tools written for the purpose to try and find a compact representation. Many of the values where precision was not that important were merged to the same value so that Crinkler could compress better. They were also mapped to 16-bit integers and restored at runtime, which did hurt precision but did not destroy the effects, as long as one chose the right mapping and the few really delicate values ended up being restored close to their original value. The parameter table now took up about 500 bytes and compressed to about 200 bytes, which was okay. The whole thing, including the beta music, was 4.8 kilobytes - there were three weeks to go and no timeline or concept yet.
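The idea of mapping floats to 16-bit integers and restoring them at runtime can be sketched as follows. This is not the tooling actually used: the linear mapping, the range `LO`..`HI` and the names `quantize` and `restore` are all assumptions for illustration. The sample values are the eight parameters of the first effect.

```python
import struct

LO, HI = -4.0, 4.0  # hypothetical value range covered by the mapping


def quantize(value, lo=LO, hi=HI):
    """Map a float in [lo, hi] to an unsigned 16-bit integer (0..65535)."""
    scaled = (value - lo) / (hi - lo)
    return max(0, min(65535, round(scaled * 65535)))


def restore(code, lo=LO, hi=HI):
    """Inverse mapping, as it would run at startup of the intro."""
    return lo + (hi - lo) * code / 65535


params = [-2.4, 1.4, 3.1, -1.9, -2.4, -0.48, 0.02, 0.001]
packed = struct.pack("<%dH" % len(params), *(quantize(p) for p in params))
restored = [restore(c) for c in struct.unpack("<%dH" % len(params), packed)]

# Table sizes for 20 effects with 14 parameters each:
FLOAT_TABLE_BYTES = 20 * 14 * 4  # 32-bit floats
INT16_TABLE_BYTES = 20 * 14 * 2  # 16-bit integers
```

With this range the step size is 8/65535, so every value comes back within about 0.00006 of the original. That is harmless for a value like 3.1 but a noticeable relative error for something as small as 0.001, which is why the choice of mapping matters for the delicate parameters.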

We met around Christmas at our parents' house and spent hours creating a sequence of the effects. We introduced a parameter flag to control whether one effect's parameters would be blended into the next effect's parameters while it was running or if the transition would be abrupt. We made a timeline. Syncing sound and visuals late at night proved harmful to sleep quality and after the first night of dreaming of spirals we forced ourselves to take some breaks every few hours. The rendered geometry was cut from fancy twisting spiral-ribbons to the single white quad. We finally arrived at a size of 4200 bytes when we headed for my place in Karlsruhe, right next to where tUM takes place. The intro had even got our grandmother's approval over Christmas, which is not something that many people can say about their LSD-themed 4k intros.
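The blend flag could work roughly like the following sketch. The function name `params_at`, the `progress` value running from 0 to 1 over the duration of an effect, and the purely linear blend are assumptions; the post only says that the flag switches between blended and abrupt transitions.

```python
def params_at(current, following, progress, blend):
    """Parameter set to use `progress` (0..1) of the way through the current effect."""
    if not blend:
        # Abrupt transition: keep the current values until the next effect starts.
        return list(current)
    # Blended transition: slide linearly towards the next effect's values.
    return [(1.0 - progress) * a + progress * b
            for a, b in zip(current, following)]


effect_a = [0.0, 2.0]
effect_b = [1.0, 4.0]
halfway_blended = params_at(effect_a, effect_b, 0.5, blend=True)
halfway_abrupt = params_at(effect_a, effect_b, 0.5, blend=False)
```

One flag per effect is cheap to store in the parameter table, and it gives the timeline both slow morphs and hard cuts that can land on a beat.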

At the partyplace we tested the beta version on the compo machine - it worked. Relaxed, we plucked a few more bytes here and there, did some last syncing, hid the mouse cursor, made executables for the various resolutions and won the 4k competition. Here is the live footage of the crowd going wild:

Finally, after ten years of being intimidated by the demoscene, we started to see sceners no longer as godlike, but as talented and motivated people. Had we learned this lesson earlier, we would have started making demos years ago.

Download the original 4k executable from pouet. We didn't bother making the source pretty, but you can have a look if you want.

All content made by Johann Korndörfer and Clemens Korndörfer. Released CreativeCommons 3.0 ShareAlike.

2 Comments

Nice post on a long journey!

Many thanks for sharing this info.
Based on your ideas I wrote my own version...
http://softologyblog.wordpress.com/2012/08/16/video-feedback-simulation-take-2/

Jason.