Ray Tracing: Homework 3

This homework coincided with a lot of my other deadlines, which is why I could not manage to finish every aspect of it. Later in this post, I will detail which parts of the ray tracer I finished and which parts were left out.

Prework

I started this homework by discarding the parser.cpp given to us and writing a totally new parser. My opinion of the XML format had been souring even before HW1 was released: I believe XML is an ancient and unwieldy file format. I compare it to the internet's old-but-preserved protocols, whose fields survive only for the sake of backward compatibility. The main reason I think so is the noisy syntax (all those angle brackets and closing tags).

My dislike of the XML format was only half of my motivation. The other half concerned the nature of parser.cpp itself. The parser given to us is mostly unable to handle erroneous input, producing undesirable segmentation faults. For example, in homework 1 there was a scene whose near-plane field had 5 numbers; if I recall correctly, that error caused the parser to segfault. Also, some scenes assume default values for missing fields, and that logic is likewise absent from the parser we were given. So, it was our duty to edit the parser.

In the first homework, making small edits to the parser was sustainable (consider that I was unable to parse the .ply files in the first and second homeworks and did not have the time to do so). However, in the second and third homeworks, I realized that writing a separate .ply parser, integrating it with the current parser.cpp, and also adding support for the new scenes' fields made that patching style of work unsustainable for me. I guess the C++-style input and output through streams and operators (<<, >>) played a role in the decision too. I never had a chance to learn what a stream really is (I probably missed a class in the Data Structures course when it was taught), and I refuse to learn it now, since I have no analytical energy left this semester for any additional viewpoints, topics, or abstract concepts beyond the stream of those I am already exposed to.

Long story short, all these reasons led me to write a parser from scratch; that way I would be patching not the parser itself, but its sustainability and maintainability. I decided to create a new binary file format called .795: my xml_to_795.c program would first save the scenes into this format, and then another program called 795parser.c++ would parse those scene files. I got help from ChatGPT and started to code these two programs simultaneously. However, it did not take me long to see that this intermediate-file task was not going to be finished before the homework deadline. So I let go of the idea of writing an intermediate file and reading it back, and decided instead to write a single integrated parser for the scenes, storing the scene fields inside C structs and C++ containers.

With GPT's help, I can now finally see what the parser does, and I can easily troubleshoot it. I find C-style parsing (scanf, atof, atoi, strtok, and things of that nature) really appealing to the eye; one can follow what the program does much better that way. I also created a .ply file parser from scratch with ChatGPT, without using special .ply libraries such as happly. I want to make a serious point here: not using special libraries actually reduced the complexity of coding and maintaining the parser, rather than increasing it as one might expect. The reason is a combination of two facts. Number one: learning a library such as happly, which you will probably use once or twice in your life, has become an unfruitful burden nowadays, even more so if the library is an ancient one; LLMs and modern development environments are taking over such repetitive tasks. Number two: the .ply libraries out there are designed to read all or most kinds of .ply files, whereas our .ply files are just a tiny subset of them, and an LLM is great at generating such a narrow parser at light speed; you make a few small fixes and the parser is ready!
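To give a feel for the C-style approach, here is a minimal sketch of reading a .ply header, assuming the ASCII subset our scenes use (the function name and the simplifications are mine, not my actual parser):

    #include <cstdio>
    #include <cstring>
    #include <cstdlib>

    /* Minimal sketch: scan an ASCII .ply header for the vertex and face
       counts and stop at end_header. Real inputs need more validation. */
    int read_ply_header(FILE *f, int *n_vertices, int *n_faces)
    {
        char line[256];
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "element vertex", 14) == 0)
                *n_vertices = atoi(line + 14);
            else if (strncmp(line, "element face", 12) == 0)
                *n_faces = atoi(line + 12);
            else if (strncmp(line, "end_header", 10) == 0)
                return 0;  /* success: vertex data starts after this line */
        }
        return -1;         /* malformed header */
    }

The vertex and face lines can then be consumed with fscanf in the same spirit.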

Now that the parser story was finished, I went on to cleaning up my program. I first decided to commit a crime and put all of my ray tracer code into a single raytracer.cpp file. That worked well (even the parser and the X library debug-window handlers were in there!) until the program went beyond 1000 lines. I decided that was too much for me to handle and started looking for another way to structure the project. After some talks with ChatGPT and some surfing on the internet, I came across something called CMake (or is it Cmake? C Make?). I had always heard about it and decided to learn what it is. I had learned what Makefiles were for only a couple of months earlier, and was already thinking that managing dependencies manually inside a Makefile was a cross to bear; I was thirsty to use CMake. However, I had no time to learn how to use it, so I had ChatGPT write my CMakeLists.txt and tell me how to use it. That went smoothly, and I was able to separate my project directory into /src and /build parts.
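For reference, the generated CMakeLists.txt boils down to something of this shape (a sketch; the target name and the X11 part reflect my setup, not the exact generated file):

    cmake_minimum_required(VERSION 3.10)
    project(raytracer CXX)

    # Everything under /src becomes the single raytracer executable.
    file(GLOB SOURCES src/*.cpp)
    add_executable(raytracer ${SOURCES})

    # X11 is only needed for the debug window.
    find_package(X11 REQUIRED)
    target_include_directories(raytracer PRIVATE ${X11_INCLUDE_DIR})
    target_link_libraries(raytracer PRIVATE ${X11_LIBRARIES})

With this, cmake -S . -B build followed by cmake --build build keeps all build artifacts inside /build.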

I then divided raytracer.cpp into chunks of source files. On the second try I split my source code into more manageable chunks, cleaned up my old code, and started the actual work. I began by porting to Eigen, a header-only C++ library (which probably increases the compilation time of every file that includes it). I had been using my own custom vector library before that.
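The port itself was mostly mechanical; a custom-vector routine like a triangle normal maps to Eigen like this (illustrative):

    #include <Eigen/Dense>

    using Vec3 = Eigen::Vector3f;

    // Illustrative: the same one-liner my old vector library provided.
    Vec3 triangleNormal(const Vec3 &a, const Vec3 &b, const Vec3 &c)
    {
        return (b - a).cross(c - a).normalized();
    }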

HW2

I was unable to finish the second homework's tasks, and I was aware that a BVH-like acceleration structure was crucial for developing the concepts of this and future homeworks. But before all that, I want to show that I can finally parse the .ply files correctly, with the results I lacked in HW1:

After that, I attempted to rename my box structure as AlignedBox. That is when VSCode suggested a bunch of classes named AlignedBox3f, AlignedBox3i, etc. Then I realized the Eigen library already had such a class, so I redefined my AABB class as a wrapper around it.

I then thought about all the ways a BVH-like structure could be handled; I am sure there are many different approaches. For example: there could be a single global bounding box, and that box would contain more boxes or triangles, with the leaf boxes containing at least one triangle each. Another way could be storing the triangles in a global container and keeping only triangle pointers inside a similar box tree. Yet another could be storing the boxes themselves in a global container, assigning one of them as the global box, and storing children as pointers. Probably none of these methods is better than the others in every aspect. In terms of runtime cost, storing pointers drastically reduces the memory needs of the program while hurting performance through numerous jumps between different memory regions (caching problems). Storing triangles inside box structures may increase the scene setup time and, if we also keep a global container holding all the triangles in the scene, the runtime memory needs, while enabling vectorization optimizations and thus improving performance.

However, none of these were my primary concern at implementation time. My concern was the debugging complexity of the program, because I knew the first attempts would most probably fail to achieve log(number of triangles)-like complexity and would thus be useless, and I did not have much time to fool around. Since this is a 3D problem and primates (literally) like me have evolved to interpret the 3D world, I decided I should be able to debug the scene whenever I want, seeing the boxes instead of the triangles. I then quickly settled on this design: the global triangle container remains. There is a vector<vector<AlignedBox>> object in the global scope, holding layers of bounding boxes, where the layers represent the levels of the tree starting from the root. Each box holds a pointer to this multi-dimensional box container, which lets it find its children inside it, along with its child box indices and a vector<Triangle> object. In the implementation each box can hold any number of triangles, but I only put triangles in the leaf boxes.
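In code, the design boils down to something like the following sketch (names are illustrative, not my exact declarations):

    #include <vector>
    #include <Eigen/Geometry>

    struct Triangle { int v0, v1, v2; int materialId; };

    struct Box {
        Eigen::AlignedBox3f bounds;              // wraps Eigen's AABB class
        std::vector<int> children;               // indices into the next layer
        std::vector<Triangle> triangles;         // filled only for leaf boxes
        std::vector<std::vector<Box>> *layers;   // back-pointer to the container below
    };

    // Layers of the tree, from the root level down; layer 0 holds the
    // single scene-wide box.
    std::vector<std::vector<Box>> gBoxLayers;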

The box-ray intersection function takes an argument, bool debug, with a default value. If this argument is false, the intersection function becomes recursive (sketched a bit further below). Here is an erroneous result where I was finding the intersection correctly but returning the box's garbage material (the box object also had a material):


After solving this initial issue, I tested my structure with only a single layer (one box containing the whole scene):


This was the result of my BVH structure with increasing depths; the results were as expected.
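Before going further, here is roughly what that recursive box-ray intersection looks like, building on the Box sketch above (Ray, the AABB test, and the triangle test are assumed helpers; the debug branch is what draws the boxes in the videos):

    // Assumed helpers, declared only to make the sketch self-contained.
    struct Ray { Eigen::Vector3f origin, dir; };
    struct Hit { bool ok = false; float t = 1e30f; int materialId = -1; };
    bool rayHitsAABB(const Ray &ray, const Eigen::AlignedBox3f &box);
    Hit boxAsHit(const Ray &ray, const Box &box);      // hit on the box surface
    Hit hitTriangle(const Ray &ray, const Triangle &tri);
    Hit closest(Hit a, Hit b) { return (b.ok && b.t < a.t) ? b : a; }

    Hit intersect(const Ray &ray, const Box &box, int level, bool debug = false)
    {
        if (!rayHitsAABB(ray, box.bounds)) return {};  // prune the whole subtree
        if (debug) return boxAsHit(ray, box);          // visualize the box itself
        Hit best;
        for (const Triangle &tri : box.triangles)      // leaf contents
            best = closest(best, hitTriangle(ray, tri));
        for (int c : box.children)                     // recurse into child boxes
            best = closest(best, intersect(ray, (*box.layers)[level + 1][c], level + 1));
        return best;
    }

My garbage-material bug above came from exactly this kind of debug branch: it returned a box whose material field was never initialized.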

Now I want to describe my BVH-like structure and how I generate it. After generating the scene box, the function accelerate0(...) generates the next depths (up to the desired depth, defined by a macro) iteratively. In each iteration, each box in the last depth is divided into 8 equal-sized areas. Then, for each triangle in the scene and for each area, I test whether the vertex1 field of that triangle lies inside the area; if so, the box representing the area is extended to contain the whole triangle. Finally, each new box is pushed into the new level of the global box container if it is not empty (the size of the box gives that hint).

After all the levels of boxes are created, the leaf boxes are tested against all the triangles in the scene with the same inclusion rule, and the matching triangles are pushed into each box's triangle container.
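A sketch of one subdivision step, with the vertex1 rule described above (octantOf, vertex1Of, and triangleBounds are assumed helpers):

    // Assumed helpers: the i-th of 8 equal sub-volumes of a box, a
    // triangle's first vertex, and a triangle's own bounding box.
    Eigen::AlignedBox3f octantOf(const Eigen::AlignedBox3f &box, int i);
    Eigen::Vector3f vertex1Of(const Triangle &tri);
    Eigen::AlignedBox3f triangleBounds(const Triangle &tri);

    void subdivide(const Box &parent, const std::vector<Triangle> &scene,
                   std::vector<Box> &nextLevel)
    {
        Box child[8] = {};
        for (const Triangle &tri : scene)              // every triangle, every time
            for (int i = 0; i < 8; ++i)
                if (octantOf(parent.bounds, i).contains(vertex1Of(tri)))
                    child[i].bounds.extend(triangleBounds(tri)); // grow to fit whole triangle
        for (int i = 0; i < 8; ++i)
            if (!child[i].bounds.isEmpty())            // drop empty octants
                nextLevel.push_back(child[i]);
    }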

The result for only the global container was as below:

The material of the box here is uninitialized, and the behavior likely stems from some overflow.

The visualization for a second depth (the performance of this configuration was given above) is here:

BVH-like structure with 2 levels / depth 2

Here are more depths:


Here are performance results from the Chinese Dragon scene:



The event log here just shows which containers are taken into account as the global container when testing ray-scene intersections. -1 means only the scene triangles are considered; 0 means the first depth (the scene box) of the global box multicontainer. 1 would mean the second depth of the container (if it exists), but that would be meaningless and is left in only for debugging. (I took the box visualization videos using those.)

You can see the improvement as the tree depth increases. This was similar in my other scenes:

Tuning the BVH-like Structure Generation

Maybe you have already spotted a complexity problem in the generation part: I was always testing all the triangles in the scene against each box to see whether they are included in it. I was aware of this redundancy while writing that part of the code, but I did not really have the time to optimize it until I needed to. The Chinese Dragon was OK, but generating other_dragon's scene structure took really long:

One minute was bearable, but I wanted to decrease this time. I thought about storing the "covered" triangles inside every box; I could do that, but it would drastically increase the memory complexity of the generation step. And I would need to clean all of this up after the BVH-tree generation finished, so as not to use excessive memory throughout the program for the sake of a one-time initialization optimization.

After considering crazy non-memory-intensive solutions, such as a hierarchical boolean-array system where each level's array marks whether a group of triangles included in the parent box also falls inside the current box, I settled on a heuristic: each box would store the range of triangle indices that fall inside it. My bet was that the triangles in the given meshes are probably not in random order.
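A minimal sketch of that heuristic (illustrative names):

    #include <algorithm>
    #include <climits>

    struct IndexRange { int first = INT_MAX, last = -1; };

    // Record that scene triangle i fell into this box.
    inline void noteTriangle(IndexRange &r, int i)
    {
        r.first = std::min(r.first, i);
        r.last  = std::max(r.last, i);
    }
    // When subdividing this box later, only triangles in [first, last] are
    // tested, instead of the whole scene. This only pays off because mesh
    // triangles tend to come in a roughly spatially coherent order.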

The result was satisfying:


However, I made an off-by-one mistake with the last index, and the bunny had holes in it:


I solved this easily.

Final Remarks

Here is the result of the depth-5 tree on Other Dragon:

Since I did not have even an hour to waste, I needed a way to estimate the finishing times of the scenes from the program's own progress. I estimated the time as a non-algebraic function of the number of rows finished (y). This is because an acceleration structure makes the scene render quickly until a bounding box is hit, after which the calculations slow down. A classic linear estimator would therefore underestimate the time to finish the scene, while a differential estimator would (most of the time) overestimate it. The current estimator has some flaws, but I can get by with it for now; I plan to upgrade it in the following homeworks.
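For reference, the two baseline estimators mentioned above would look like this (a sketch, not my exact formula):

    // Predicted total render time, with H rows in the image and y finished.
    double linearEstimate(double elapsed, int y, int H)
    {
        return elapsed * (double)H / y;         // assumes every row costs the same
    }
    double differentialEstimate(double elapsed, double lastRowTime, int y, int H)
    {
        return elapsed + lastRowTime * (H - y); // assumes the latest row cost persists
    }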

Another problem with the structure generation is that it is not very adaptive. The depth is hard-coded, and it keeps generating at least one child box per parent even if only a single triangle is left. This causes the following behavior; notice the triangle counts and then check the times to finish the scenes.

This issue may come from two main causes: the boxes may overlap too much, or the depth of the tree may be far beyond what the Science Tree scene needs, creating huge numbers of boxes. These causes could be tested easily (at least the second one), but I decided to leave it there; otherwise I would never have started HW3.

HW3

Note that I still had not implemented transformations when starting this homework. I decided to give the parts of the homework that do not need any transformations a go; after I finished those parts, I had literally no time left to implement transformations.

Sampling

I decided to start by implementing sampling, and the first step was sampling the pixels. Because of my hurry, I went with uniformly random samples within the pixel. I also wrote the program so that the pixel is sampled in every scene; the only difference is that my default sample count is 1. This is the result of the cornellbox scene from homework 1:


Notice the noisy edges. Otherwise, the image was near-identical to the original.
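The in-pixel jitter itself is tiny (a sketch; in my renderer the RNG state would be handled more carefully):

    #include <random>
    #include <utility>

    std::mt19937 rng{std::random_device{}()};
    std::uniform_real_distribution<float> uni(0.0f, 1.0f);

    // A uniformly random sample position inside pixel (x, y); with a
    // sample count of 1 this still replaces the old pixel-center ray.
    std::pair<float, float> samplePixel(int x, int y)
    {
        return { x + uni(rng), y + uni(rng) };
    }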

Then I tried the cornellbox_area scene, treating the area light as a point light instead. The result was this:


After that, I implemented the sampling of the area light. The first result was like this:


I looked for the error in my implementation, and after 30 minutes I found out that the size of the light is not its radius: the light is actually a square, and the size is the length of one of its edges. I did not change my area light's sampling parameter, but I calculated its area as a square so as not to skew the estimated average contribution of the light in the scene. I also clamped the pixel colors only after collecting all the samples. This is important, because one can spend a really long time figuring out that a lighting difference is caused by clamping the color value at each sample.

This was the end result. I know the remaining noise in the scene is due to my totally random sampling.
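A sketch of the square-light sampling and the clamp-after-averaging order (the light's position and edge directions u, v are assumed fields; rng and uni are the ones from the sketch above):

    #include <Eigen/Dense>

    // One sample point on a square light of edge length `size`, centered
    // at `pos`, spanned by unit edge directions u and v.
    Eigen::Vector3f sampleSquareLight(const Eigen::Vector3f &pos,
                                      const Eigen::Vector3f &u,
                                      const Eigen::Vector3f &v, float size)
    {
        float a = uni(rng) - 0.5f, b = uni(rng) - 0.5f;  // in [-0.5, 0.5)
        return pos + size * (a * u + b * v);
    }

    // Per pixel: average first, clamp last.
    //   color = clamp(sum_of_sample_colors / N);   // correct
    //   color = sum(clamp(sample_color)) / N;      // the subtle bug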

DOF

After finishing the area lights, I worked my way to the DOF calculation. Since I could not spot a scene that uses DOF but does not include any transformations, I made my own: I took the Spheres scene from homework 1 and added the DOF fields under the camera section. You can find that scene in my submission. I then implemented the lens according to the focus distance and aperture, modeling the camera aperture as a circle and sampling on the lens accordingly. The first result looked so bad that I thought I had made a mistake:

I then realized that I had not even touched the focus distance. Since I was working on a windowing mechanism, I could click the screen and send a ray through that pixel using my algorithm. I did that and made the program print the position of the first intersection, then found the approximate position of the surface of the closest blue sphere: it was around 14 units away. I set my focus distance to 14, but the result was nearly the same, so I tried different focus distances. Here is the result of changing the focus distance to 1000:


Then I realized that I had defined the focus distance as the distance from the image plane, not from the camera. I changed the focus distance to 4, and the result was this:

This seemed correct enough for me. This part was done in one shot.
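For completeness, here is the thin-lens sampling as I ended up understanding it: a circular aperture, and the focus distance measured from the image plane (a sketch; the camera basis vectors, the Ray type from the earlier sketch, and rng/uni are assumed):

    // One depth-of-field ray: pick a point on the circular aperture of
    // diameter `aperture`, keep the focal point fixed, aim at it.
    Ray lensRay(const Eigen::Vector3f &pixelPos, const Eigen::Vector3f &camPos,
                const Eigen::Vector3f &right, const Eigen::Vector3f &up,
                float aperture, float focusDist)
    {
        float dx, dy;                  // uniform in the unit disk, by rejection
        do {
            dx = uni(rng) - 0.5f;
            dy = uni(rng) - 0.5f;
        } while (dx * dx + dy * dy > 0.25f);
        Eigen::Vector3f lensPos = camPos + aperture * (dx * right + dy * up);
        Eigen::Vector3f dir = (pixelPos - camPos).normalized();
        Eigen::Vector3f focal = pixelPos + focusDist * dir; // from the image plane
        return { lensPos, (focal - lensPos).normalized() };
    }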

Conclusion and Final Remarks

  • I hope to finish the remaining parts of this homework before the next one. The motion blur and glossy reflections are missing.
  • I am sorry to edit this blog to report that I forgot to remove the debug definition in the submitted homework. The program will try to create an X window and render the scene in that window. Hitting Space starts the rendering without the BVH structure; hitting the Up key starts the rendering with a depth-6 BVH tree. If desired, I can show the results on my computer.
  • In my BVH tree, I only put triangles into boxes; the spheres are still handled globally.
  • I plan to improve my BVH-like structure:

    • I plan to add automatic-termination logic to decide the depth of the tree automatically.
    • I plan to optimize the generation part if needed.
    • I plan to add a new intersection test for boxes that are cubic enough, and compare the results to see whether there is any significant difference.
    • I plan to add support for a different acceleration structure, to compare between them.

And here are some visualizations of my BVH tree. I strongly believe these visuals will be helpful in the early weeks of the CENG477 and CENG795 courses, by helping students visually comprehend acceleration structures.



