Part 1 is here.
In Part 1, I explained how to create a wire-frame 3d engine in Freebasic. In Part 2, I'll extend it to include filled flat surfaces and textures. If you have not yet read Part 1, read it first as I'll pick up where I left off.
- Filling a quad
To render solid-looking objects made of convex quadrilateral faces, such as the cubes in the example program, there are basically three things that need to be done:
This is basically the same as finding the lines that make up the outer wireframe for a quad, but now instead of just finding start and end screen points for the line and using the LINE command, we need to know screen points in between.
For each line we just need a single sx location for each sy row, so it's not quite like finding points to draw an unbroken line. Also, we need to know if the line forms a left edge or a right edge for the quad we want to draw.
Since a quad is a closed figure, you might think that all we have to do is find both edges for each sy row, and then the leftmost one is the left edge and the rightmost one is the right edge. However, it's possible that part of the 3-d quad extends, not merely off-screen, but actually behind the plane of the viewer. In such cases, there may be only one side edge determined by our lines; the projection of the tile to the view plane extends infinitely to the other side (if only we had an infinitely wide screen to draw it on). To visualize this, stand next to a wall which can can see one end of, with the other end of the wall somewhere behind your head. In practice, of course, the edge of the screen will become the other edge.
So, it would be good to know if a given line forms a right edge or a left edge without having to worry about what the other edge is up to. The easiest way to do this is to use a clockwise or counter-CW convention: for example, if a line starts at point 1 of a quad and goes to point 2, and the screen projection of point 1 is lower on the screen than that of point 2, we can call that a left edge; if point 1 is higher on the screen than point 2, then call it a right edge. And so on. Going around the quad, points 1-2-3-4-1, we will always go counterclockwise that way.
If we were to look at the quad from the opposite side (from in back of it), then the edges would be backwards and that would no longer work. However, if the quad is a face of a solid structure, then from outside the structure we could never see the back face of the quad anyway. This convention works well with backface culling, in which we don't bother trying to draw backfaces. Backfaces can be determined using the cross product as discussed in part 1.
If we don't want to use backface culling (for example, we want some quads to be double-sided, as for a thin-walled open container) then it's easy to fix the problem: as usual, test to see if it is a backface (using the sign of the appropriate cross product). If it is, then swap around the points (exchanging #1 and #3) or reverse the edge convention. I will not bother doing this in the example program, so I'll just use backface culling.
This, finally, is the classic problem in 3-d graphics: determining if each pixel is blocked by something closer. There are many ways to do it, some more efficient than others, and some algorithms work better under certain conditions (number of tiles, number of pixels, some tiles much bigger than others, etc.) so there is no single best approach for all situations. This had led to a variety of creative approaches.
The most basic approach is the "Painters' Algorithm": Sort the tiles by distance (relative z coordinate), and draw each tile in order from back (further) to front (closer), overwriting the ones in back as needed. This works well in simple cases. When it works, it can be very fast and efficient. But it has drawbacks.
In general, each tile covers a range of distances, so it's not possible to really sort them by distance. It is possible, unless they interpenetrate, to determine if a given tile is in front of (can block) a second tile, in back of it, or neither (in which case it doesn't matter which you assume). But it is possible for three tiles to form a sort of "rock-paper-scissors" situation, in which A is in front of B, B is in front of C, and C is in front of A. To visualize it, you can build such a situation yourself using three rulers or other sticks. The Painters' Algorithm fails for such cases.
Such cases are rare in practice, but there are other problems. If two objects touch each other, they may not be supposed to interpenetrate, but in rotated coordinates they might end up interpenetrating slightly. If they do, the sorting process can give the wrong one as the one 'in front'. If there are a lot of small tiles, the sorting itself can be slow. Using a fast sorting algorithm, such as quicksort, is important in such cases.
A more advanced variation is the c-buffer (coverage buffer). Instead of drawing back-to-front, draw front-to-back, but keep a record (in the c-buffer) for each pixel of whether it has been drawn to during the current frame. If it has, don't redraw there. This has no advantage if the shading method is fast (such as flat shading), but saves calculation if using a slower method such as texture mapping, since you only draw the colored pixel once.
The z-buffer is a popular alternative. For each pixel, keep a record of the relative z value of the tile visible there, in the z-buffer. If the next tile you check that pixel for has a z value closer to the viewer, then replace the color and the z value for that pixel; otherwise, leave it alone. Since the comparison is for a single projected point, rather than for whole tiles, this works for any geometry, even with interpenetrating tiles. It can be slow if there are many tiles that project onto an average pixel.
I prefer a similar algorithm, the 1/z buffer (aka "w-buffer"). This works much like the z-buffer but has 2 advantages: For flat tiles, 1/z varies linearly with the screen coordinates, while z does not; this makes 1/z faster to compute. Also, it's easy to reinitialize the 1/z buffer for the next frame, since setting 1/z to 0 makes it 'further away' than any possible object.
To see that 1/z varies linearly:
The equation for a plane in 3-d space can be written as
N dot R = d Nx * x + Ny * y + Nz * z = d
where N = (Nx,Ny,Nz) is a vector normal to (perpendicular to) the plane, R = (x,y,z) is any point on the plane (assume it's in relative coordinates), and d is a constant.
To find d, plug in the relative coordinates of one of the points on the tile; for example, let R = rel3d(qface(q).p1).
If d = 0, then the plane goes through the origin (0,0,0) which is the eyepoint since these are relative coordinates. Planes like that are not visible, so for a visible plane in front of the viewer, d is never 0.
From the screen coordinates Sx and Sy, we know x/z and y/z, and we want to find 1/z. Divide the equation by z and by d:
(Nx * x/z + Ny * y/z + Nz) / d = 1/z
In practice, with the screen scale factored in to Sx and Sy, and since z < 0 for visible objects, it's more convenient to use w = -scrscale / z.
Since w varies linearly with Sx, there's no need to put the whole formula in the inner loop. Just set the initial value of w at the left edge of the tile, and increment w as Sx increases:
increment_w = (change in w)/(change in Sx) = Nx / d
This works OK but there is a practical complication in some cases. Since I'm using Sx and Sy to find 1/z given the equation for the plane of the tile, my estimate for 1/z is only as good as the assumption that the tile projects onto the pixel (Sx,Sy). That assumption is 100% valid for the interior pixels of a tile projection.
For an edge, though, the middle of the pixel might not actually fall within the tile. If the tile ends at Sx = 74.8, and the pixel with Sx = 75 is taken as the rightmost pixel, then the pixel is not actually within the tile. In most cases this doesn't matter, since it's close.
However, is the tile is nearly edge-on compared to the viewpoint, then even a small difference in Sx can lead to a large difference in z. This can cause the pixel from tile # 1 to poke through a closer tile #2, when in fact tile #1 should be completely blocked by tile #2.
One solution would be to not include an edge pixel in the tile unless the center of the pixel falls within the tile, instead of rounding off to the nearest pixel. That could shave a couple of pixels off from small, distant tiles, which might not be so good. It's not really an issue for the big tiles used for cubes in the example program, but in principle, if fine details in the shapes of objects were included, it could matter. Still, I plan to try that method for the next installment.
The solution I used this time is to calculate the 1/z values for edge pixels in a different way: by interpolating the 1/z values of the points that determine the edge line segment. That insures that the 1/z value won't go out of bounds from what the tile includes. For consistency, I could have done the same for top and bottom pixels, but I didn't want to put that much effort into a relatively minor correction.
Basically, a pixel is colored in based on which part of the tile projects onto it. To determine this, the vector in 3-d space from the point on the tile which projects onto the pixel, to (for instance) one point at the corner of the tile, is needed. It may help to think of the tile as being rectangular, though it need not be.
In the simplest case, the texture map gives color as a look-up-table function of horizontal and vertical coordinates along the tile. In more general cases, the tile can be rotated or non-rectangular. (In principle, the texture map could even be rotated or skewed relative to the rectangular edges of a tile, but I won't get into that. If you understand the concepts it wouldn't be hard to do it.)
If we know the (x,y,z) coordinates of the point - which we can find since we know x/z, y/z, and 1/z - then the appropriate distances along the tile can be found by taking the dot product of that vector with a pair of perpendicular vectors pointing along the grain directions of the texture map.
If there are n_p pixels from one side of the tile to the other, then we want an integer that varies between 0 and n_p-1, telling us which column of the texture map to use. Often, n_p is 64 or 256, but it could be anything.
Let (p2-p1) be the vector from one corner of the tile to the next. (An adjacent corner, not an opposite corner of a rectangle.) A vector in that same direction, but corresponding to what we want for texture mapping, is
Ta = (p2-p1) * (n_p) / |(p2-p1)|^2
Let (R-p1) be the vector from the projected 3-d point (the place where the pixel we see shows the color of) to a corner point of the tile.
Now, for a rectangular tile, A = (Ta dot (R-p1)) gives a number varying from 0 (if R = p1) to n_p (if R = p2). If we round fractions down, then if R is just shy of p2, A will be n_p-1. Suppose there are more pixels than texture columns; the pixels just shy of the right edge will be rounded down to N_p-1.
In practice, since we're getting R from Sx and Sy, R is approximate and could be even outside the tile for an edge point, so we could have for example A = -0.8.
Clearly, some way of restricting the range of A is needed. Using if..then statements or the MOD() function works but is slow. It doesn't need to be pixel-perfect; it does need to be fast.
In practice, the AND operator can act like a fast mod() function. For example, (cint(A) and 255) gives an integer value from 0 to 255. That works because the binary expression for 255 is &b11111111. I've found that cint() works best as int() is too slow. Since cint() rounds off instead of down, the value of A is first adjusted by subtracting 0.5.
We actually know Sx, Sy, and w rather than x, y, and z, and an inner loop is no place to be solving for different variables than we already have. The dot product, A = (Ta dot (R-p1)), can be replaced by an equivalent expression using the variables that we have. Plugging it all in, and subtracting the half-pixel roundoff correction, the result is:
A = num_a' / w - const = [Ta.x * (Sx - Cx) + Ta.y * (Cy - Sy) - Ta.z * scrscale] / w - (Ta dot p1) - 0.5
It's better to write it as:
A = num_a / w = [Ta.x * (Sx - Cx) + Ta.y * (Cy - Sy) - Ta.z * scrscale - (Ta dot p1) * w - 0.5 * w] / w
The first way looks simpler, and for a one-time calculation it would be, but there's a constant to subtract out after the division. That would have to be done for every pass through the inner loop, so it's slower than it could be.
Using the second format, note that the numerator, num_a, still depends linearly on Sx. Sx appears explicitly in the term multiplying Ta.x, and implicitly in w in the last two terms, which is OK since w is linear in Sx.
Like before, the fastest thing to do is to set the initial value of the numerator and increment it as Sx increases. The texture column, A, is obtained by dividing num_a by w.
increment_num_a = Ta.x - ((Ta dot p1) - 0.5) * increment_w
That accounts for one direction on the texture map, horizontal for example. A similar formula is used for the perpendicular direction, such as vertical, with B being the texture row much like A is the column. The vector Tb must be perpendicular to both Ta and to the tile normal N, so it's obtained using a cross product. The magnitude of Tb is set equal to that of Ta.
The inner loop ends up looking like this (left and right side pixels are handled seperately):
for sx = Lside(y)+1 to Rside(y)-1 numa += inca: numb += incb: wz += incw 'increment texture & 1/z as sx increases if wz>wrow[sx] then wrow[sx]=wz: row[sx]=textur(texn,umod(numa/wz),umod(numb/wz)) next sx
where numa is the numerator above, inca is its increment per unit sx (and similarly for numb and inb), wrow is a pointer to the w-buffer ("wz" is w = -scrscale/z), row is a pointer to the screen, textur is the texture map, texn is the number of the particular texture, and umod() is a fast "mod" macro. Even though side pixels are handled seperately, the umod() is still needed since it could be a top or bottom pixel.
As the above code shows, the screen pixel data is being updated if the tile is closer to the viewer than any previously drawn tiles (that is, if wz > wrow[sx]). wrow[sx] is a pointer to the pixel sx on the current row.
In FreeBasic, the screenlock command prevents the actual display from updating while the screen pixel data is being changed. Thanks to that, there's no need to use a buffer image (array) to hold the image while it's being drawn. That helps speed things up. Screenunlock is used after the image is complete.
On the other hand, if the screen is not necessarily filled with tiles - meaning there is blank space - then it's necessary to use cls to clear the screen before drawing. This step slows it down a bit, but not more than it would be if this were removed but the viewpoint were in a fully enclosed room with other objects potentially visible.
To save room in the basic program, include file "vect3d.bi" contains the vector definitions and functions needed for the example program, graf-wz0.bas:
For the next installment, I plan to try my f-x buffer, a fix for edge/top/bottom pixels, and make other additions.
I welcome comments. This thread is for discussing Part 2.