
This article describes how to render a ridiculously large 500 megapixel Buddhabrot with iteration depths in the range of millions. Lisp source included. Scroll down for the final image.


buddhabrot_small_detail.jpg

Why would you want to do this? To have it printed and on your wall, of course (see pictures below).

I will begin by outlining the general idea of the Buddhabrot (see the Wikipedia article for more), then talk about my optimizations and rendering choices.

The Buddhabrot

You typically render a Buddhabrot by choosing a set of starting points in the complex plane (i will call them "samples" from now on), running each sample c through the iteration z_(k+1) = z_k^2 + c with z_0 = 0+0i and, for those that grow very large after some time (i.e. those that are outside the Mandelbrot Set), remembering all the z_k visited during the iteration. All those visited points from all the escaping samples form a point cloud in the complex plane that is usually called a Buddhabrot when visualized.
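To make this concrete, here is a minimal Common Lisp sketch of the iteration (not the optimized renderer discussed below; the function name and the bailout radius of 2 are my own choices):

```lisp
(defun iterate-sample (cr ci max-n)
  "Iterate z <- z^2 + c starting from z = 0.  Returns the list of visited
points (as (re . im) conses) if the sample escapes within MAX-N steps,
or NIL if it is considered to be inside the Mandelbrot Set."
  (let ((zr 0.0d0) (zi 0.0d0) (orbit '()))
    (dotimes (k max-n nil)                            ; dwell limit reached -> NIL
      (let ((new-zr (+ (- (* zr zr) (* zi zi)) cr))   ; real part of z^2 + c
            (new-zi (+ (* 2.0d0 zr zi) ci)))          ; imaginary part of z^2 + c
        (setf zr new-zr
              zi new-zi))
      (push (cons zr zi) orbit)
      (when (> (+ (* zr zr) (* zi zi)) 4.0d0)         ; |z| > 2: the sample escapes
        (return (nreverse orbit))))))
```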


You need a maximum iteration depth n (sometimes called the Dwell Limit) because many samples will be within the Mandelbrot Set and the iteration would run indefinitely for those. Whenever the maximum iteration depth is reached, the sample is considered to be inside the Mandelbrot Set - although it possibly isn't and might have escaped after another 1000 iterations. There is no way to be sure.

We now consider only those samples "good" whose iteration escaped before reaching the maximum iteration depth, i.e. those that are definitely outside the Mandelbrot Set. Each good sample yields a set of points whose size equals the number of iteration steps i it took to find out that the sample is outside the Mandelbrot Set. Sometimes this is a very small number like 2 or 3, but samples nearer to the border of the Mandelbrot Set take much longer. A sample can take thousands or millions of iterations before escaping, thus leaving a much larger footprint in the resulting point cloud. It seems that for every natural number n there is a sample that takes at least n iteration steps, although for higher values of n it becomes harder and harder to find one.

Deep Iteration

It has been observed by many people that just rendering the Buddhabrot like this - using all the escaping samples' visited points - gives a nebula-like image, but that a more interesting image can be made by considering only those samples "interesting" whose iteration time i exceeds some minimum iteration time k with k < n. For example, you could use k = 1000 and n = 10000, i.e. keep only those samples that do not escape before their 1000th iteration but do escape before their 10000th iteration. If you look at the visited path of one of those more deeply iterated good samples in the complex plane, you will notice a pattern: it forms some kind of orbit, sometimes a circle or a set of circles or other shapes. I have no idea why the orbits are shaped this way and i don't know if anyone else does. Here is a part of the orbit of the sample -0.9114822433916717 + 0.2523940788714053i, for example (containing nearly five million points):

orbit.png

Incidentally, all interesting samples like this one are located very very close to the border of the Mandelbrot Set and make good targets to zoom into with your favourite fractal explorer. Try it.
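Expressed in code, the "interesting" test is just a thin filter on top of the iteration sketched above (min-k and max-n correspond to k and n; the function name is again my own):

```lisp
(defun interesting-orbit (cr ci min-k max-n)
  "Return the orbit of CR + CI*i if it escapes after at least MIN-K but
before MAX-N iterations, NIL otherwise.  Builds on ITERATE-SAMPLE above."
  (let ((orbit (iterate-sample cr ci max-n)))
    (when (and orbit (>= (length orbit) min-k))
      orbit)))
```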

Since the longer orbits look nice, we will try to find only "interesting" samples that take a long time to finally escape. There have been several approaches to finding them. The most obvious one is of course just choosing a sample randomly and seeing how it goes, which means throwing away something like 99.999% of them. You could also use a regularly spaced grid of samples, which amounts to basically the same thing. The interesting samples are all located near the Mandelbrot Set's border, so there must be some spatial pattern to their distribution which might help us find them. If you have ever played around with the Set a bit you probably know that the border is infinitely long and calling it "complex" is a bit of an understatement, but surely we can do better than just randomly sampling the plane?

Optimization

Mandelbrot_Set_-_Periodicities_coloured.png

  • Rejecting all samples from within the two biggest parts of the Mandelbrot Set, the "Main Cardioid" and the circular "Period-2-Bulb", is a must-have. This simple check removes a large share of the points inside the Set, which are very costly because they iterate until the dwell limit is reached. I just used the formula from the Wikipedia article (a Lisp version of the check is sketched after this list). The areas removed by this test are shown in red on the image of the Mandelbrot Set above (my output images are rotated clockwise by 90 degrees, don't get confused).
  • People have also used the Metropolis-Hastings algorithm, treating the mapping c -> i as a probability density function. The method is especially useful for rendering a small detail of the bigger image, but does not work too well for the whole image with very big values of k. I kept finding many interesting samples with it, but their orbits turned out to be very similar because several good samples that are very close to each other produce orbits that are almost identical - giving an image of a single blurred orbit. I increased the mutation distance enough to produce sufficiently distinctive orbits, but then the speedup was gone. In the end i didn't use this method. It is probably more useful for rendering the "normal" Buddhabrot or a detail of it without the minimum iteration constraint.
  • To partly remove some of the costly larger areas (like the non-circular bulbs labeled "3" and up in the image above) from within the Set i divided the plane into rectangular cells and chose random samples only from those cells that contained some part of the border of the Mandelbrot Set. Depending on the cell grid resolution and the iteration depth this gave a speedup of 3 to 8.
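For reference, the cardioid and period-2-bulb test from the Wikipedia article can be written like this (a sketch; the function name is mine):

```lisp
(defun inside-cardioid-or-bulb-p (cr ci)
  "T if CR + CI*i lies inside the main cardioid or the period-2 bulb,
so the sample can be rejected without iterating."
  (let* ((x (- cr 0.25d0))
         (q (+ (* x x) (* ci ci)))
         (u (+ cr 1.0d0)))
    (or (<= (* q (+ q x)) (* 0.25d0 ci ci))      ; main cardioid
        (<= (+ (* u u) (* ci ci)) 0.0625d0))))   ; period-2 bulb
```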


"The"  Buddhabrotbuddhabrot_symmetry.jpg

Since we still choose our samples randomly, every rendering will look different - depending on which orbits your random search finds. There is no "the" Buddhabrot (see Melinda's comment below - rendering for an infinite amount of time yields "the" Buddhabrot). This also means that the image is not symmetric on the small scale, which is a good thing if you ask me. You could of course get a symmetric image of the same quality in half the time by mirroring it, but i think it's worth waiting for the asymmetric one. On the right is a part of the final rendering that exhibits superficial symmetry but is composed of asymmetric orbits. Click it to view this detail in the final resolution.

If i had let the renderer run for much longer (maybe a week or a month), more and more orbits would have shown up, occluding each other and finally blurring the whole thing into the standard nebula-like Buddhabrot. The key is to render just long enough for a sufficient number of orbits to appear and then stop.

Lisp Implementation

The renderer is written in Common Lisp, which is not your classic number crunching language. If the goal was to do it as fast as possible, you'd probably end up doing it in C with inline assembly or on the GPU. On the other hand, i had never really optimized any lisp code to the limit and wanted to do just that: see how far you can get. One long night of low-level optimization at entropia with the help of some friends later, there was only a factor of 2 left compared to the C implementation that one entropian made. Which is quite okay, considering the cumulative speedup of about 2000 compared to my first implementation. Trading sbcl's builtin complex numbers for traditional pairs of floats, declaring all types, working in-place instead of on the stack, considering cache efficiency, prompting usage of inline arithmetic instead of jumps to costly sbcl functions, etc. gained us a factor of about 3 to 5, while the big rest was due to the algorithmic improvements.
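To give a flavour of that optimization style, here is a sketch of an inner loop with the kind of declarations sbcl needs to emit inline float arithmetic (illustrative only, not the actual renderer code):

```lisp
(declaim (optimize (speed 3) (safety 0) (debug 0)))

(defun escape-time (cr ci max-n)
  "Number of iterations until CR + CI*i escapes, or NIL if the dwell
limit MAX-N is reached.  Illustrates the declaration style only."
  (declare (type double-float cr ci)
           (type fixnum max-n))
  (let ((zr 0.0d0) (zi 0.0d0))
    (declare (type double-float zr zi))
    (dotimes (k max-n nil)
      (let ((new-zr (+ (- (* zr zr) (* zi zi)) cr)))
        (declare (type double-float new-zr))
        (setf zi (+ (* 2.0d0 zr zi) ci)   ; uses the old value of ZR
              zr new-zr))
      (when (> (+ (* zr zr) (* zi zi)) 4.0d0)
        (return (1+ k))))))
```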

The renderer uses double-floats throughout. I think i can safely say that the orbits' shapes are not due to cumulative floating-point errors (which i had suspected at some time) because tiny variations in the starting value usually do not produce significantly different orbits. Orbits are not chaotic systems.

Since tracing the few good samples is quite cheap compared to finding them, the renderer first collects some good samples in multiple threads by randomly sampling from the grid cells (see optimizations above) and only afterwards processes them to keep track of their jumping around on the complex plane. Their paths are recorded in a big 16-bit greyscale image which is repeatedly written to disk.
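The recording step essentially maps each visited complex point to a pixel and increments a counter, roughly like this (a sketch; the view rectangle and all names are made up for illustration):

```lisp
(defun plot-orbit (orbit counts width height)
  "Accumulate one orbit into the 16-bit hit-count buffer COUNTS.
The view rectangle [-2,2] x [-2,2] is an arbitrary choice."
  (declare (type (simple-array (unsigned-byte 16) (* *)) counts))
  (dolist (p orbit)
    (let ((x (floor (* width  (/ (+ (car p) 2.0d0) 4.0d0))))   ; re -> column
          (y (floor (* height (/ (+ (cdr p) 2.0d0) 4.0d0)))))  ; im -> row
      (when (and (< -1 x width) (< -1 y height))
        (setf (aref counts y x)
              (min 65535 (1+ (aref counts y x))))))))          ; clamp, no wrap-around
```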

Postprocessing, Color and Noise

Converting this greyscale HDR image to an LDR image with color has to be done by hand, which is a nontrivial task for most image manipulation programs because 20000x25000 is not exactly your standard image resolution. Gimp and Cinepaint both crash or freeze (and Gimp can't handle 16-bit images), but Photoshop worked for me. I also cheated a bit by gently surface-blurring the image, removing some noise.

The noise is the only reason you need the really large iteration depths in the range of millions that i used. If you increase both the minimum iteration time and the maximum iteration limit, your orbits will consist of more and more points, allowing you to choose a higher resolution. If, on the other hand, you get fewer than, say, five hits per pixel in the darker regions of the orbit, you will notice that one pixel may get only two hits while its neighbour gets ten, which shows up as noise in the output image. If you iterate more deeply, this effect evens out and the signal-to-noise ratio increases. To remove noise, you can always downscale the output image or, as i did, selectively blur the image a bit to improve the look of the noisy areas.

For the output image that i consider the most successful one, i used a minimum iteration time of 1000000 and a maximum iteration depth of 5000000. It took about 16 hours on my university's 8-core Xeon machine which i abused for this task overnight.

Results

Here is a scaled version of about 5000x6000 pixels for your firefox's non-crashing convenience.

buddhabrot_medium.jpg

buddhabrot_print.jpg

I am quite satisfied with the digitally printed one as well, although some of the darker reds lose their saturation.

Do not open the large image in your browser. It is a 93MiB JPG file of 20000x25000 pixels which consumes 1.5 GiB of RAM when opened. Don't try this at home. I warned you.

You can also have a look at the Lisp source.

Images created by Johann Korndoerfer (cupe) and released CC-Attribution 3.0 Germany, except the original Mandelbrot bulb image by Wikipedia user Hoehue.

Talk: Computer Graphics, Part 1


As promised earlier, i gave a talk at entropia in December. I tried to touch on many of the non-hardware-oriented topics in computer graphics, with a focus on those that are interesting (to me, at least) and easy to explain without too much background knowledge in math or computer science.

The talk was an experiment for me in several ways. First, the presentation style: the slides should contain as little text as possible, providing only visual assistance to what i said rather than guiding the talk themselves. In typical university or conference talks, i usually read the whole content-stuffed slide as soon as it appears without listening to what the lecturer says; the remaining two thirds of the time the slide is shown consist of waiting for the next slide while trying not to fall asleep. I did not want to go as far in the opposite direction as Lawrence Lessig does because, obviously, the talk would be graphics-oriented and some of the pictures would need time to sink in and be explained. I think the style worked out fine and i surprised myself by talking quite a lot about slides with just one picture on them, although i had not been practicing.

Second, the technical means: it was also the first time for me to use the LaTeX Beamer class for slides. Creating the slides was painless enough with the help of my latex preprocessor, but the result did not convince me: (La)TeX is not made for displaying pixel data on screen, causing images that should be displayed exactly 1:1 to be resized. Most pdf viewers use nearest-neighbour interpolation for resizing pictures, making all my nice pictures look bad - which is somewhat embarrassing for a talk that covers image filtering and interpolation. With evince i found a pdf viewer that performs other interpolation methods, making the result bearable. Nevertheless i will have to find some other way to show images next time. Possibly i will create the slides as images themselves and present them with some image viewer like gqview. And maybe i’ll end up writing another minimalist slide-defining markup language that creates png images.

Third, i wanted to be able to publish the slides. This meant restricting myself to pictures released under some reasonably free licence or making my own. I ended up using many pictures from the Wikimedia Commons, usually modifying them to fit the black background. I also made many of the pictures myself. I included a file in the archive containing the slides that provides proper attribution to the creators of the CC-Attribution images. This involved finding the authors of each of the images, which was quite painful (and the reason it took so long to publish the slides). I’d really appreciate a Creative Commons license or something similar that does not force users of my work to remember or research my name - releasing into the Public Domain is not possible in Germany. Just granting all users every imaginable right to do as they wish (which i hereby do) leaves me doubting the lawfulness of this statement. I’d really like some standard way of doing this.

All in all things went well and i certainly learned something in the two months it took to prepare a talk that was over in about 90 minutes. As i invested this considerable amount of time into something that did not last that long, i will certainly consider giving it again (or recycling the slides for a similar talk).

The talk’s article in the amazing entropia wiki contains a link to the slides, some of which may not make much sense without commentary.


Project Overview: perfectstorm


with-health-bars.png

perfectstorm is a real time strategy game study written in common lisp using OpenGL for graphics display and cairo for texture generation. It is in active development with many of the basic features still unimplemented, but i decided the effort put into it justifies some public documentation.

Terrain Generation, 90s-style


If you have played with Bryce or Terragen you know that Terrain Generation is fun and useless. The Midpoint Displacement Algorithm is one of the simplest methods to build a triangle mesh that resembles the terrain found in nature. You can see its output on the right.

displacement.jpg

While this looks like a mountain of sorts, the algorithm does not produce good mountains every time you invoke it. Instead, the result is quite random and almost always bad. Have a look at the algorithm (a small Lisp sketch follows the list):

  1. Make one triangle.
  2. Subdivide all your triangles into four new ones (use the edge midpoints as new vertices).
  3. Move your new midpoints along the y-axis by a random offset.
  4. Repeat 2 and 3 with half the offset.
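Here is the same idea in one dimension - a hedged Common Lisp sketch that midpoint-displaces a list of heights instead of a triangle mesh (names and defaults are mine):

```lisp
(defun midpoint-displace-1d (passes &key (initial-offset 1.0))
  "One-dimensional analogue of the algorithm above: repeatedly insert a
displaced midpoint between neighbouring heights, halving the random
offset with every pass."
  (let ((heights (list 0.0 0.0))
        (offset initial-offset))
    (dotimes (pass passes heights)
      (let ((refined '()))
        (loop for (a b) on heights
              do (push a refined)
                 (when b
                   (push (+ (/ (+ a b) 2.0)                     ; midpoint ...
                            (- (random (* 2.0 offset)) offset)) ; ... plus random offset
                         refined)))
        (setf heights (nreverse refined)
              offset (/ offset 2.0))))))
```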

The algorithm refines its terrain model recursively with every step, producing self-similar (in other words fractal) patches of terrain. It is short and easy to understand. Too bad real mountains are only superficially self-similar. One of the most visible shortcomings of the algorithm can be seen when the random displacement is replaced by its expected value. The “expected value” for the algorithm looks like this:

displacement-norandom.jpg

These furrows can be seen in most randomly displaced terrains, too, when viewed from specific angles or directly from above. (The yellow cube is the light source)

A better approach might be to use Perlin noise as a heightmap since it does not introduce visible artifacts. I’ll probably test this later.

Concerning the implementation, i spent most of the time implementing the fundamental mesh operations like splitting lines and triangles. Since this will not be the last time i handle triangle meshes, and since it might save other Lispers the pain, i am thinking about expanding this and publishing it as a general mesh handling and drawing library for Common Lisp.

a quick guide to shaders


A shader is a small program that is executed on the graphics card. On modern graphics cards, up to 128 shader programs can be executed in parallel. Conventional parallel programming is hard because there are resources to be shared, but shader programs are completely independent, allowing the graphics card manufacturers to add more shader pipelines at will. Currently there are two important types of shaders: vertex shaders and fragment shaders. For each frame that is rendered, all visible vertices (read: nodes of the mesh) are processed by vertex shaders which can modify their position or attributes like normals, color or any other user-defined attributes.

fraktale_teekanne.png

One fragment corresponds to one pixel on the screen. For each fragment, the fragment shader is called. A fragment belongs to the polygon visible at the position of the corresponding pixel on the screen, i.e. the intersection of the pixel ray with the polygon defines the fragment’s position in the 3d world. The fragment shader thus knows its position within the polygon and can use this information, e.g. for looking up (and returning) the color of the polygon’s texture at that place. When using fragment shaders, OpenGL will not do texture mapping by itself; you have to write a trivial fragment shader to accomplish what would happen automatically without shaders.

Information can pass from the vertex shader to the fragment shader, but not in the other direction. The shader programmer may define arbitrary “varying” variables which are written by the vertex shader and are, linearly interpolated between the vertices, available to the fragment shader. For each triangle three vertex shader executions need to be made, but a much larger number of fragment shader executions is needed, depending on the size of the triangle on the screen.

Naturally, the fragment shader has access to all of the triangle’s textures. For advanced shaders, like those in modern games, these auxiliary textures usually contain a normal map and/or a height map, i.e. the direction of the normal for each texel (the three coordinates of the normal are stored as rgb) or the height of the texel compared to the triangle plane (as a grey value) is available. With this information a fragment shader can do arbitrarily complex stuff, ranging from good old bump mapping to a wide variety of occlusion and parallax mapping techniques, the use of gloss maps or the design of weird materials, their colors changing with the angle the camera looks at them.

Since it is also possible to pass values (e.g. a tick count or wind speed) from OpenGL to the shaders, they are used to implement flags fluttering in the wind, rocket engines causing ripples in the air or other animations or effects you would have thought the CPU knew about.

In my experiments i used GLSL (the OpenGL Shader Language), a c-like mid-level language that is compiled into assembly code that can be executed on the graphics hardware. GLSL provides clipping, matrix multiplication, dot product, etc. as primitive operations, along with some useful predefined special variables that are always present and need not be declared.

In the quest for more and more parallelization in computing, shaders may be the most successful example since shader parallelization itself demands no intervention by the programmer at all. Shaders are bound to become even more complex: with geometry shaders (available on graphics cards labeled “Shader Model 4” or “DirectX 10”, something my setup lacks) it becomes possible to generate geometry on the fly and even run particle systems entirely on the GPU. It can be expected that even more parallelizable work will be taken from the CPU and put onto the GPU, providing an infinite playground for the demo coder. Graphics programming has never been this easy and fun. 500 GFLOPS are just $400 $200 away :)

If you are interested, have a look at typhoon labs’ hard-to-find ex-website, hosting Shader Designer, the tool used in the screenshot above as well as a very nice GLSL tutorial with shiny pictures.