Blog

  • Hw2 Blog

I decided to start by improving the performance of my ray tracer, since I hadn't done any work in that area. Please keep in mind that the measurements were taken on an M3 Max CPU with 16 cores. Also, each new measurement uses all of the optimizations that came before it.

My ray tracer was single-threaded, so the easiest performance gain seemed to be adding multithreading. My implementation is simple. The image gets broken up into 50×50 regions that get pushed into a queue, then n (the number of cores) worker threads spawn. Each thread pops a region from the queue with concurrency protection, then renders that region, looping until the queue is empty. After that, all the worker threads are joined, and the image gets written as before. Here's a worker thread's loop:

void renderTask(
    std::queue<RenderRegion>& regions,
    std::mutex& regionsMutex,
    const Camera& camera,
    Pixel* buf
) {
    while (true) {
        // Pop the next region under the lock; the RAII lock also protects
        // us if anything below throws.
        std::unique_lock<std::mutex> lock(regionsMutex);
        if (regions.empty()) {
            return; // queue drained: this worker is done
        }
        auto region = regions.front();
        regions.pop();
        lock.unlock(); // release before the slow rendering work

        renderChunk(camera, buf, region);
    }
}
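
For completeness, here is roughly how the workers get spawned and joined around that loop; splitIntoRegions and writeImage are illustrative helper names rather than my exact code:

std::queue<RenderRegion> regions = splitIntoRegions(imageWidth, imageHeight, 50);
std::mutex regionsMutex;

// Spawn one worker per hardware core.
std::vector<std::thread> workers;
unsigned n = std::thread::hardware_concurrency();
for (unsigned i = 0; i < n; ++i) {
    workers.emplace_back(renderTask, std::ref(regions), std::ref(regionsMutex),
                         std::cref(camera), buf);
}

// Wait until every region has been rendered, then write the image.
for (auto& worker : workers) {
    worker.join();
}
writeImage(buf);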
    
Scene                  Single-threaded   Multi-threaded
bunny_with_plane       ~58000 ms         ~5000 ms
simple                 ~40 ms            ~20 ms
cornellbox_recursive   ~140 ms           ~35 ms
bunny                  ~5400 ms          ~550 ms
Table 1: Performance gain from multi-threading.

Multi-threading improved the performance dramatically, especially in the larger scenes. For further performance improvement, I decided to implement back-face culling: if the dot product of the ray direction and a face normal is positive, the face points away from the ray and (for closed, opaque meshes) can never be visible, so the intersection test can be skipped entirely. I simply added an if statement to the mesh and triangle intersections:

    if (attemptCull && glm::dot(ray.direction, face.normal) > 0) {
        return false;
    }

    Back-face culling improved performance significantly in scenes with large meshes.

Scene              Without Back-face Culling   With Back-face Culling
bunny              ~550 ms                     ~430 ms
bunny_with_plane   ~5000 ms                    ~4000 ms
low_poly_smooth    ~3500 ms                    ~2800 ms
Table 2: Performance gain from back-face culling

My next step was to implement a BVH. I started by adding a simple AABB class so that I could intersect rays with it.

class AABB {
public:
    glm::vec3 axesMin;
    glm::vec3 axesMax;

    // Slab method: intersect the ray with each axis-aligned slab and
    // check that the entry interval overlaps the exit interval.
    bool hitRay(const Ray& ray) const {
        // Note: std::numeric_limits<float>::min() is the smallest
        // *positive* float; lowest() is the value we actually want here.
        float t1Max = std::numeric_limits<float>::lowest();
        float t2Min = std::numeric_limits<float>::max();

        for (int axis = 0; axis < 3; ++axis) {
            auto oneOverRayDirection = 1.0f / ray.direction[axis];
            auto t1 = (axesMin[axis] - ray.origin[axis]) * oneOverRayDirection;
            auto t2 = (axesMax[axis] - ray.origin[axis]) * oneOverRayDirection;

            if (oneOverRayDirection < 0.0f) {
                std::swap(t1, t2);
            }

            t1Max = t1 > t1Max ? t1 : t1Max;
            t2Min = t2 < t2Min ? t2 : t2Min;
        }

        return t1Max <= t2Min;
    }
};

My next step was to compute the bounding boxes of each triangle, sphere, and mesh. For this purpose, I added an instance of my AABB class to my base class Shape. After that, I added the virtual method computeBoundingBox to Shape and implemented it for each type of shape except planes.
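
As a sketch of what that looks like (Sphere stands in for the other shapes; the member names are illustrative and may differ from my actual classes):

class Shape {
public:
    AABB boundingBox;

    virtual void computeBoundingBox() = 0;
    virtual ~Shape() = default;
};

class Sphere : public Shape {
public:
    glm::vec3 center;
    float radius;

    // A sphere's box is just the center offset by the radius on each axis.
    void computeBoundingBox() override {
        boundingBox.axesMin = center - glm::vec3(radius);
        boundingBox.axesMax = center + glm::vec3(radius);
    }
};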

I had initially planned to continue with the declaration of the BVH class and then BVH tree construction, but I got a little sidetracked out of curiosity. I added a single if statement to my naive ray intersection method, under the mesh intersections:

    if (!mesh.boundingBox.hitRay(ray)) continue;

    I took measurements of scenes with large meshes.

Scene              Without Mesh Bounding Box Check   With Mesh Bounding Box Check
bunny              ~430 ms                           ~200 ms
bunny_with_plane   ~4000 ms                          ~1500 ms
low_poly_smooth    ~2800 ms                          ~870 ms
Table 3: Simple bounds check

With a simple bounding box check, large meshes could now be rendered much faster than before. I carried on by defining the BVHNode class, including the traversal method.

    class BVHNode {
    public:
        enum NodeType {
            LEAF,
            INTERNAL
        };
    
        NodeType nodeType;
    
        // INTERNAL
        AABB boundingBox;
        std::unique_ptr<BVHNode> left;
        std::unique_ptr<BVHNode> right;
        // LEAF
        void* object;
    
    // Collects every leaf object whose chain of bounding boxes the ray
    // pierces; the actual shape intersections happen on the returned list.
    bool searchNodes(const Ray& ray, std::vector<void*>& hitList) const {
        if (!boundingBox.hitRay(ray)) {
            return false;
        }

        if (nodeType == LEAF) {
            hitList.push_back(object);
            return true;
        }

        bool hitLeft = left != nullptr && left->searchNodes(ray, hitList);
        bool hitRight = right != nullptr && right->searchNodes(ray, hitList);

        return hitLeft || hitRight;
    }
    };

Then it was straightforward to write the BVH class with a dummy buildTree method.

    class BVH {
    public:
        std::unique_ptr<BVHNode> root;
    
    bool hitRay(const Ray& ray, std::vector<void*>& hitList) const {
        return root != nullptr && root->searchNodes(ray, hitList);
    }
    
        static BVH buildTree(std::vector<Shape*>& shapes, size_t start, size_t end) {
            BVH result;
    
            result.root = buildNodes(shapes, start, end);
    
            return result;
        }
    private:
        static std::unique_ptr<BVHNode> buildNodes(std::vector<Shape*>& shapes, size_t start, size_t end) {
            return std::make_unique<BVHNode>();
        }
    };

Then, I implemented the tree-building method. To avoid cluttering the post I won't reproduce my full code, but the algorithm is basically this (a condensed sketch follows the list):

1-) Take the array of objects with an upper and a lower bound.

2-) Choose a random axis to split the objects on.

3-) Sort the objects with respect to the chosen axis. The comparison values used during this sort are the midpoints of the objects' bounding boxes (i.e., the equal-count heuristic).

4-) Split the sorted array into two equal segments, then recursively call the method again with new upper and lower bounds.

5-) Once the children have been computed, combine their bounding boxes to compute the current node's bounding box and return the result.
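
Here is a condensed sketch of that method; it assumes a midpoint() helper on AABB and a static AABB::merge, neither of which I've shown:

static std::unique_ptr<BVHNode> buildNodes(std::vector<Shape*>& shapes,
                                           size_t start, size_t end) {
    auto node = std::make_unique<BVHNode>();

    // 1-) A single object becomes a leaf.
    if (end - start == 1) {
        node->nodeType = BVHNode::LEAF;
        node->object = shapes[start];
        node->boundingBox = shapes[start]->boundingBox;
        return node;
    }

    // 2-) Choose a random split axis.
    int axis = rand() % 3;

    // 3-) Sort by the midpoints of the bounding boxes along that axis.
    std::sort(shapes.begin() + start, shapes.begin() + end,
              [axis](const Shape* a, const Shape* b) {
                  return a->boundingBox.midpoint()[axis]
                       < b->boundingBox.midpoint()[axis];
              });

    // 4-) Split into two equal halves and recurse.
    size_t mid = start + (end - start) / 2;
    node->nodeType = BVHNode::INTERNAL;
    node->left = buildNodes(shapes, start, mid);
    node->right = buildNodes(shapes, mid, end);

    // 5-) The node's box is the union of its children's boxes.
    node->boundingBox = AABB::merge(node->left->boundingBox,
                                    node->right->boundingBox);
    return node;
}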

With this under the belt, I measured times again. Unfortunately, the results were pretty much the same as before. After inspecting the code a little, I realized that the implementation was not complete for mesh objects. Namely, I was successfully "bringing" the rays to the bounding box of the mesh, but after that I was calling the naive intersection method to actually intersect with the object, which is slow because it iterates through every single face. To fix this, I extended my BVH implementation to support two types of trees: outer and inner. The initial tree over whole shapes became the outer tree, and I constructed an inner tree over each mesh's faces with the same algorithm.
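
Roughly, the mesh intersection then becomes a two-step lookup; innerBVH, Face, and intersectFace are illustrative names:

bool Mesh::hitRay(const Ray& ray, HitInfo& hitInfo) const {
    // The outer tree already brought the ray to this mesh's bounding box;
    // the inner tree narrows it down to a handful of candidate faces.
    std::vector<void*> candidates;
    if (!innerBVH.hitRay(ray, candidates)) {
        return false;
    }

    bool hit = false;
    for (void* candidate : candidates) {
        hit |= intersectFace(ray, *static_cast<const Face*>(candidate), hitInfo);
    }
    return hit;
}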

    I tested the performance once again:

Scene              Without BVH   With BVH
bunny              ~200 ms       ~15 ms
bunny_with_plane   ~1500 ms      ~70 ms
low_poly_smooth    ~870 ms       ~160 ms
Table 4: BVH

  • Hw1 Blog

I wanted to be able to see my progress as I wrote the code, so I decided to start by implementing the simplest renderer possible that would let me at least partially see what's going on in the scene. First, I wrote the parser. Once everything was successfully loaded from the .json, I computed the necessary vectors and coordinates (m, u, v, q, etc.) to get the ray equations. Second, I loaded the vertices and the shapes. After the shapes were loaded, I wrote the hitRay functions for each shape so that I could calculate ray-shape intersections. I shaded every ray that hit an object fully white, which let me finally see some results.
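
The primary ray generation ended up looking roughly like this; the camera field names are illustrative and may not match my parser exactly:

// m: image plane center, u: right vector, v: true up, q: top-left corner.
glm::vec3 m = camera.position + camera.gaze * camera.nearDistance;
glm::vec3 u = glm::normalize(glm::cross(camera.gaze, camera.up));
glm::vec3 v = glm::cross(u, camera.gaze);
glm::vec3 q = m + camera.left * u + camera.top * v;

// Pixel (x, y) maps to a point on the image plane, sampled at its center.
float su = (camera.right - camera.left) * (x + 0.5f) / imageWidth;
float sv = (camera.top - camera.bottom) * (y + 0.5f) / imageHeight;
Ray ray{camera.position, glm::normalize(q + su * u - sv * v - camera.position)};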

    simple.png
    bunny.png

    It was relatively straightforward to implement ambient shading since all I had to do was a single multiplication:
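
Concretely, it is something like this (member names are illustrative):

// Per-channel product of the material's ambient reflectance
// and the scene's ambient light radiance.
glm::vec3 color = material.ambientReflectance * scene.ambientLight;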

    cornellbox.png ambient

Coding diffuse shading was the next step. On intersections, I looped through the point lights, calculated the diffuse values, and added them to the already-calculated ambient values.
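
The loop looks roughly like this (light.intensity and the material names are illustrative):

for (const auto& light : pointLights) {
    glm::vec3 toLight = light.position - hit.point;
    float dist2 = glm::dot(toLight, toLight);      // squared distance for falloff
    glm::vec3 l = toLight / std::sqrt(dist2);
    float nDotL = std::max(glm::dot(hit.normal, l), 0.0f);
    color += material.diffuseReflectance * nDotL * (light.intensity / dist2);
}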

    spheres.png diffuse

While I had made good progress, I also realized that my objects were coming out horizontally stretched. After some debugging, I found that this was happening because I had forgotten to normalize u before using it in calculations.

    spheres.png with normalized u

Adding specular lighting wasn't much different from adding diffuse lighting; I just added the Phong shading values to the loop I previously mentioned.
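
Sketching the half-vector (Blinn-Phong) variant of that term, reusing l and dist2 from the diffuse snippet above (names are again illustrative):

// Added inside the same point-light loop as the diffuse term.
glm::vec3 toEye = glm::normalize(-ray.direction);
glm::vec3 half = glm::normalize(l + toEye);                 // half vector
float nDotH = std::max(glm::dot(hit.normal, half), 0.0f);
color += material.specularReflectance
       * std::pow(nDotH, material.phongExponent)
       * (light.intensity / dist2);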

    spheres.png with specular lighting

Now it was time to implement mirror materials. On intersections with such materials, I recursively shot new rays in the reflection direction, with the origin moved a bit along the surface normal, until the maximum recursion depth was reached.
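
The core of it is only a few lines; kEpsilon and shade are illustrative names:

// Reflect about the normal and nudge the origin along the normal
// so the reflected ray doesn't immediately re-hit the same surface.
glm::vec3 reflectDir = glm::reflect(ray.direction, hit.normal);
Ray reflected{hit.point + kEpsilon * hit.normal, reflectDir};
if (depth < maxRecursionDepth) {
    color += material.mirrorReflectance * shade(reflected, depth + 1);
}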

    spheres_mirror.png after implementing mirror

The next type of material to implement was conductors. This was extremely similar to shading mirror objects; I just had to calculate the reflection ratio alongside the reflected ray's shading.
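
The reflection ratio comes from the Fresnel equations for conductors; a sketch, where n and k are the material's refractive and absorption indices and cosTheta is the cosine of the angle between the incoming ray and the normal:

float conductorFresnel(float cosTheta, float n, float k) {
    float cos2 = cosTheta * cosTheta;
    float n2k2 = n * n + k * k;
    // Average the perpendicular and parallel polarized components.
    float rs = (n2k2 - 2.0f * n * cosTheta + cos2)
             / (n2k2 + 2.0f * n * cosTheta + cos2);
    float rp = (n2k2 * cos2 - 2.0f * n * cosTheta + 1.0f)
             / (n2k2 * cos2 + 2.0f * n * cosTheta + 1.0f);
    return 0.5f * (rs + rp);
}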

    cornellbox_recursive.png with conductors

Implementing dielectrics proved to be the trickiest. I first coded the reflected component of dielectrics.

    cornellbox_recursive.png with reflected dielectrics

Adding in the refracted light, however, was not so simple. I kept getting nonsensical results even though I was sure that my transmission ratio was correct.

cornellbox_recursive.png with erroneous dielectric refractions

After debugging and reading lecture slides for a good while, I realized that this was happening because I did not handle rays traveling within dielectrics correctly. Namely, I needed to flip the surface normals for such rays before using them in the shading calculations.
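
A sketch of a refraction helper with that fix baked in; here d is the incoming direction, n the surface normal, and eta the ratio of refractive indices for a ray entering the medium:

// Returns false on total internal reflection; flips the normal and the
// index ratio when the ray is already traveling inside the dielectric.
bool refractRay(const glm::vec3& d, glm::vec3 n, float eta, glm::vec3& t) {
    float cosTheta = glm::dot(-d, n);
    if (cosTheta < 0.0f) {   // ray is inside the object
        n = -n;
        eta = 1.0f / eta;
        cosTheta = -cosTheta;
    }
    float k = 1.0f - eta * eta * (1.0f - cosTheta * cosTheta);
    if (k < 0.0f) {
        return false;        // total internal reflection
    }
    t = eta * d + (eta * cosTheta - std::sqrt(k)) * n;
    return true;
}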

    cornellbox_recursive.png with dielectrics

My last step was to implement smoothly shaded meshes. I calculated vertex normals according to the formulas, but I just kept getting images that were pretty much identical to the flat-shaded ones.

    low_poly_scene_smooth.png with incorrect smooth shading

My error here was calculating vertex normals separately for each individual mesh, when I needed to calculate them by looking at the vertices themselves, without considering which mesh they belong to.
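
The fixed version accumulates over a single global vertex array, roughly like this (face.v0/v1/v2 are illustrative index names):

// One normal slot per global vertex, shared by every mesh.
std::vector<glm::vec3> vertexNormals(vertices.size(), glm::vec3(0.0f));
for (const auto& mesh : meshes) {
    for (const auto& face : mesh.faces) {
        glm::vec3 n = glm::cross(vertices[face.v1] - vertices[face.v0],
                                 vertices[face.v2] - vertices[face.v0]);
        vertexNormals[face.v0] += n;   // unnormalized cross product,
        vertexNormals[face.v1] += n;   // so larger triangles weigh more
        vertexNormals[face.v2] += n;
    }
}
for (auto& n : vertexNormals) {
    n = glm::normalize(n);
}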

    low_poly_scene_smooth.png smoothly shaded

With this, I was pretty much done. I wrote a Makefile to build the binary, tested it on the ineks, and submitted my code.

  • Hello world!

    Welcome to METU Blog Service. This is your first post. Edit or delete it, then start blogging!