Implement the binary tree traversal with the following functionalities

Could traverse the tree in preorder, inorder, and postorder.
Could access all the nodes on the path from the root to the current one.
Could be suspended and resumed.
Traverse the tree in O(N) time complexity

Implementation

struct TreeNode {
      int val;
      TreeNode *left;
      TreeNode *right;
}

We use the definition of the binary tree in Leetcode. Notice that there is no pointer to the parent in each TreeNode. We take baby steps towards our goal and implement these functionalities one by one.

Three Orders

Assume that you already know all three traversal orders. It’s easy to implement all three traversal orders in one recursive function.

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, G InorderFunc, H postorderFunc, TreeNode* node) {
    if(!node) {
        return;
    }

    preorderFunc(node);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    inorderFunc(node);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    postorderFunc(node);
}

Path from the Root

By analyzing the recursive traversal function, we know that each call stack of a TreeNode is invoked by its parent (except the root node).

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, G InorderFunc, H postorderFunc, TreeNode* node) {
    if(!node) {
        return;
    }

    preorderFunc(node);
    // new call stack created by executing the following line of code.
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    inorderFunc(node);
    // new call stack created by executing the following line of code.
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    postorderFunc(node);
}

Consider the following snapshot of call stacks while traversing the tree. If we could access the previous call stacks, we could retrieve the values of node variables on previous stacks. These nodes build the path from the current node to the root.

┌─────────────────────────────────┐
│Traverse(root->....->left->right)│
├─────────────────────────────────┤
│Traverse(root->....->left)       │
├─────────────────────────────────┤
│                                 │
│         .............           │
├─────────────────────────────────┤
│Traverse(root->left->right)      │
├─────────────────────────────────┤
│Traverse(root->left)             │
├─────────────────────────────────┤
│Traverse(root)                   │
├─────────────────────────────────┤
│someFunc that calls Traverse()   │
└─────────────────────────────────┘

However, accessing a given variable directly in the previous call stack is not possible without the help of a library like libunwind. We have to maintain the path by ourselves and pass the path’s reference during the recursion.

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, 
              G InorderFunc, 
              H postorderFunc, 
              TreeNode* node,
              // path should be initialized as empty.
              std::vector<TreeNode*>& path) {
    if(!node) {
        return;
    }
    
    // push current node into `path` so that its children could start 
    // building their path following their parent.
    path.push_back(node);
    preorderFunc(node, path);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left, path);
    inorderFunc(node, path);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left, path);
    postorderFunc(node, path);
    // We are going back to our parent.
    // We have to remove ourself from the path.
    path.pop_back(node);
}

Suspend and Resume

Suspension is kind of tricky. We pass a reference to a bool shouldStop in the parameters.

We check shouldStop before and after calling preorderFunc, inorderFunc, and postOrderFunc. We check it after calling them because shouldStop could be modified during their execution. During the check after their execution, if shouldStop is true, we set the stopToken as the current node.

stopToken serves as a handle indicating where to resume.

(Assume that shouldStop will only be modified on the current thread.)

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, 
              G InorderFunc, 
              H postorderFunc, 
              TreeNode* node,
              std::vector<TreeNode*>& path,
              bool& shouldStop,
              TreeNode*& stopToken) {
    if(!node) {
        return;
    }
    
    auto CheckAndSetStopTokenInvoke = 
        [&shouldStop, &stopToken](auto func, auto&&... args) {
            if(!shouldStop) {
                func(std::forward<decltype(args)>(args)...);
            }

            if(shouldStop && !stopToken) {
                stopToken = node;
            }
        };

    path.push_back(node);
    CheckAndSetStopTokenInvoke(preorderFunc, node, path);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left, path, shouldStop, stopToken);
    CheckAndSetStopTokenInvoke(inorderFunc, node, path);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left, path, shouldStop, stopToken);
    CheckAndSetStopTokenInvoke(postorderFunc, node, path);
    path.pop_back(node);
}

The previous code does not really “stop” when shouldStop is set. It will continue its tree traversal without calling preorderFunc, inorderFunc, and postOrderFunc.

Compared to suspending, resuming is much easier to implement. Given the stopToken set by Traverse on suspending, we first traverse to stopToken to resume what the stack is like while suspending without calling preorderFunc, inorderFunc, and postOrderFunc. Upon we find stopToken, we could do our traversal as usual.

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, 
              G InorderFunc, 
              H postorderFunc, 
              TreeNode* node,
              std::vector<TreeNode*>& path,
              bool& shouldStop,
              TreeNode*& stopToken,
              TreeNode* const resumeToken,
              bool shouldResume) {
    if(!node) {
        return;
    }

    if(resumeToken == node) {
        shouldResume = true;
    }
    
    auto CheckAndSetStopTokenInvoke = 
        [&shouldStop, &stopToken](auto func, auto&&... args) {
            if(!shouldStop && shouldResume) {
                func(std::forward<decltype(args)>(args)...);
            }

            if(shouldStop && !stopToken) {
                stopToken = node;
            }
        };

    path.push_back(node);
    CheckAndSetStopTokenInvoke(preorderFunc, node, path);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left, path, shouldStop, stopToken, resumeToken, shouldResume);
    CheckAndSetStopTokenInvoke(inorderFunc, node, path);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left, path, shouldStop, stopToken, resumeToken, shouldResume);
    CheckAndSetStopTokenInvoke(postorderFunc, node, path);
    path.pop_back(node);
}

Look what a miserable piece of junk we have here. We get nine parameters in the parameter list of Traverse. We check two flags every here and there to make it suspend or resume. Furthermore, this function could not even really “stop” executing when shouldStop is true!

Maybe we should consider refactoring the code to pack all the parameters into a control struct as suggested in the Refactoring book. Maybe we should stop passing all the flags and pack the whole function into a function object. Maybe…

But wait! There should be an elegant way to implement the suspending and resuming.

The Elegant Way

The root of all evil is that we cannot access the call stack. If only we could access the call stack, we do not need to build the path ourselves and pass it. If only we could access the call stack, we could take a snapshot of the call stack upon suspending. Or even better, just a summary of the call stack with some variables we need could be helpful.

A summary of the call stack? That sounds interesting. How about simulating the call stack of the recursion by ourselves? In others words, we abandon the function call convention provided by the operating system. Instead, on this very problem, we consider building our own stack so that we could have all the privileges to access and even modify it!

Extract the Variables

Let’s first extract variables that are necessary to our stack.

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, G InorderFunc, H postorderFunc, TreeNode* node) {
    if(!node) {
        return;
    }

    preorderFunc(node);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    inorderFunc(node);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    postorderFunc(node);
}

Please take a closer look at the initial code segment we had at the very beginning. List all the variables on the stack:

preorderFunc
InorderFunc
postorderFunc
node

The first three of them remain unchanged during the recursion. However, node changes in each recursive call. Therefore, node is the only variable necessary to our stack.

Program Counter

Besides the stack variables, another essential provided by the call stack of the operating system is the program counter. Program counter stores the address of the next instruction to be executed. While returning from the callee, the caller should resume its execution using the saved program counter.

We do not need to get the program counter and save it in our case. Not to mention that it’s hard to get the current program counter when you are not debugging.

Take a closer look at the code segment again. What are the three most important checkpoints?

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, G InorderFunc, H postorderFunc, TreeNode* node) {
    if(!node) {
        return;
    }
    
    // check point 1, before the first recursion.
    preorderFunc(node);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    // check point 2, between the two recursions.
    inorderFunc(node);
    Traverse(preorderFunc, InorderFunc, postorderFunc, node->left);
    // check point 3, after the second recursion.
    postorderFunc(node);
}

We do not need to know exactly where our program counter is. The three checkpoints are enough to instruct us:

what function to call: preorderFunc, inorderFunc, or postorderFunc
From which node should we start our recursion: node->left or node->right

Thus, the program counter for our own stack could be described with just an enum.

enum class ProgramCounter : int {
    BEFORE_FIRST_REC,
    BETWEEN_TWO_REC,
    AFTER_SECOND_REC
};

Recursive to Iterative

Consider the call stack while traversing this binary tree with the following code.

void Traverse(TreeNode* node) {
    if(!node) {
        return;
    }

    Traverse(node->left);
    Traverse(node->right);
}

                              ┌──────┐            ┌──────┐
                              │ null │            │ null │
                    ┌──────┐  ├──────┤  ┌──────┐  ├──────┤  ┌──────┐
                    │   4  │  │   4  │  │   4  │  │   4  │  │   4  │
          ┌──────┐  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤
          │   2  │  │   2  │  │   2  │  │   2  │  │   2  │  │   2  │
┌──────┐  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤
│   1  │  │   1  │  │   1  │  │   1  │  │   1  │  │   1  │  │   1  │
├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤
│caller│  │caller│  │caller│  │caller│  │caller│  │caller│  │caller│
└──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘

                    ┌──────┐            ┌──────┐
                    │ null │            │ null │
          ┌──────┐  ├──────┤  ┌──────┐  ├──────┤  ┌──────┐
          │   5  │  │   5  │  │   5  │  │   5  │  │   5  │
┌──────┐  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ┌──────┐
│   2  │  │   2  │  │   2  │  │   2  │  │   2  │  │   2  │  │   2  │
├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤
│   1  │  │   1  │  │   1  │  │   1  │  │   1  │  │   1  │  │   1  │
├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤
│caller│  │caller│  │caller│  │caller│  │caller│  │caller│  │caller│
└──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘


                    ┌──────┐            ┌──────┐
                    │ null │            │ null │
          ┌──────┐  ├──────┤  ┌──────┐  ├──────┤  ┌──────┐
          │   3  │  │   3  │  │   3  │  │   3  │  │   3  │
┌──────┐  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ┌──────┐
│   1  │  │   1  │  │   1  │  │   1  │  │   1  │  │   1  │  │   1  │
├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤  ├──────┤
│caller│  │caller│  │caller│  │caller│  │caller│  │caller│  │caller│
└──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘  └──────┘



┌──────┐
│caller│
└──────┘

These are some facts we could conclude from the graph above:

On first encountering a node, it keeps visiting the left subtree.
After first visiting a node, it pushes the node onto the top of the stack.
A node will always appear at the top of the stack for three times.
- The first time is for preoderFun. This corresponds to BEFORE_FIRST_REC
- The second time is for inorderFun. This corresponds to BETWEEN_TWO_REC
- The third time is for postorderFunc. This corresponds to AFTER_SECOND_REC
  Then, it will be popped out of the stack.
If there is no node on the stack, the traversal is finished.

Put It All Together

Combing all the information we have above, we could have a code snippet for traversing the binary tree.

enum class ProgramCounter : int {
    INVALID_STATE,
    BEFORE_FIRST_REC,
    BETWEEN_TWO_REC,
    AFTER_SECOND_REC
};

ProgramCounter NextPCState(ProgramCounter pc) {
    switch(pc) {
        case ProgramCounter::BEFORE_FIRST_REC:
            return ProgramCounter::BETWEEN_TWO_REC;
        case ProgramCounter::BETWEEN_TWO_REC:
            return ProgramCounter::AFTER_SECOND_REC;
        default:
            return ProgramCounter::INVALID_STATE;
    }
}

struct StackInfo {
    TreeNode* node_;
    ProgramCounter pc_;
};

template<typename F, typename G, typename H>
void Traverse(F preorderFunc, G inorderFunc, H postorderFunc, TreeNode* root) {
 
  auto s = std::vector<StackInfo>{};
  while(true) {
    while(root) {
      preorderFunc(root);
      // push the node into the stack at the first time we encoutner it
      s.push_back(StackInfo{root, ProgramCounter::BEFORE_FIRST_REC});
      // move to left continuously until there is no left child to move to.
      root = root->left;
    }
    
    if(s.empty()) {
      // the stack is empty, the traversal finishes here.
      break;
    }

    // move the program counter on the second or third time
    // popping this node from the stack
    s.back().pc_ = NextPCState(s.back().pc_);

    if(s.back().pc_ == ProgramCounter::BETWEEN_TWO_REC) {
      inorderFunc(s.back().node_);
      // traverse the right subtree.
      root = s.back().node_->right;
    }

    if(s.back().pc_ == ProgramCounter::AFTER_SECOND_REC) {
      postorderFunc(s.back().node_);
      // there is no subtree to traverse. 
      // pop the node out of the stack.
      root = nullptr;
      s.pop_back();
    }
  }
}

Suspendable Binary Tree Traversal

Simply extract root and the stack s as data members to make it suspendable. Then we encapsulate the function into a function object.

To access the path from the root and the level information of the current node, we also have two utility functions, PathFromRoot and Level.

To make it step-wise, the outer while(true) loop in the Traverse function is removed. Besides, the inner while(root) loop is also replaced with an if(root_) statement.

template<typename F, typename G, typename H>
class Traverse {
public:
  Traverse(F preorderFunc, G inorderFunc, H postorderFunc, TreeNode* root) 
  : preorderFunc_{preorderFunc}
  , inorderFunc_{inorderFunc}
  , postorderFunc_{postorderFunc}
  , root_{root}
  {}
  
  bool Next() {
    if(root_) {
      preorderFunc_(root_);
      s_.push_back(StackInfo{root_, ProgramCounter::BEFORE_FIRST_REC});
      root_ = root_->left;
      return true;
    }

    if(s_.empty()) {
      return false;
    }

    s_.back().pc_ = NextPCState(s_.back().pc_);

    if(s_.back().pc_ == ProgramCounter::BETWEEN_TWO_REC) {
      inorderFunc_(s_.back().node_);
      root_ = s_.back().node_->right;
    } else if(s_.back().pc_ == ProgramCounter::AFTER_SECOND_REC) {
      postorderFunc_(s_.back().node_);
      root_ = nullptr;
      s_.pop_back();
    }

    return true;
  }

  std::vector<TreeNode*> PathFromRoot() const {
      auto result = std::vector<TreeNode*>{};
      std::transform(s_.begin(), s_.end(), std::back_inserter(result), [](const StackInfo& info){
          return info.node_;
      });
      return result;
  }

  int Level() const {
      return s_.size();
  }
  
private:
  F preorderFunc_;
  G inorderFunc_;
  H postorderFunc_;
  TreeNode* root_ = nullptr;
  std::vector<StackInfo> s_ = {};
};

Let’s test our Traverse!


void TestTraverseSimple() {
  auto* root = new TreeNode{7,
                            new TreeNode{3},
                            new TreeNode{15,
                                         new TreeNode{9},
                                         new TreeNode{20}}};
  auto prefixPrint = [](const std::string& prefix){
    return [prefix](TreeNode* x){
      std::cout<<prefix<<": "<<x->val<<std::endl;
    };
  };

  auto traverser = Traverse{prefixPrint("Preorder"),
                            prefixPrint("Inorder"),
                            prefixPrint("Inorder"),
                            root};
  auto should_continue = true;
  while(should_continue){
    std::cout<<"call Next() --> ";
    should_continue = traverser.Next();
    if(!should_continue) {
      std::cout<<"FINISHED!";
    }
  }
}

execute the code above, the result is as expected: every time we call Next(), either one of preorderFunc, inorderFunc, or postorderFunc will be called.

call Next() --> Preorder: 1
call Next() --> Preorder: 2
call Next() --> Preorder: 4
call Next() --> Inorder: 4
call Next() --> Inorder: 4
call Next() --> Inorder: 2
call Next() --> Preorder: 5
call Next() --> Inorder: 5
call Next() --> Inorder: 5
call Next() --> Inorder: 2
call Next() --> Inorder: 1
call Next() --> Preorder: 3
call Next() --> Inorder: 3
call Next() --> Inorder: 3
call Next() --> Inorder: 1
call Next() --> FINISHED!

Use in Practice

Just Plain Traversal

Let us use our Traverse function for the plain traversing of a binary, which seems overkill but serves well as a teaser.

145. Binary Tree Preorder Traversal

After constructing the function object, we just keep calling next() until next() returns false.

vector<int> preorderTraversal(TreeNode* root) {
    auto result = std::vector<int>{};
    auto placeHolder = [](TreeNode*){};
    auto resultBuilder = [&result](TreeNode* node){
        result.push_back(node->val);
    };
    // build the function object
    // C++17 is supported in Leetcode, thus we have CTAD, yeah!
    auto t = Traverse{resultBuilder, placeHolder, placeHolder, root};
    // keep calling next()
    while(t.next());
    return result;
}

Binary Tree Forward Iterator

LC173 Binary Tree Iterator

Iterate a BST with inorde traversal. We have to trim our code a little bit to meet its requirement.

Remember that in the previous section, every time we call Next() in our code, either one of the three provided functions will be called. However, this problem requires that whenever we call next(), only the inorderFunc will be invoked.

There is also another difference between the problem API and our previous one. In the problem, the iteration is controlled by a pair of functions — next() and hasNext() instead of just one Next() function in our previous code.

We could notice that the postorder traversal is not needed anymore. Thus, after consuming the node with invoking inorderFunc, we could pop the stack frame out of the call stack. Besides, there is no need to keep the program counter anymore. This is because we only have two states now — ProgramCounter::BEFORE_FIRST_REC and ProgramCounter::BETWEEN_TWO_REC. When the stack frame is popped out, it must be in the state ProgramCounter::BETWEEN_TWO_REC. After it popped out, there is no need to keep its state.

Therefore, we could simplify our Traverse a bit. The while(node_) loop comes back because we skip preorder turns. We do not need to bring the outer while(true). The stack frame is popped out immediately after being revisited in the inorder turn. Thus, there is no postorder turn following the inorder turn. So we do not need to loop to the next inorder turn.

class BSTIterator {
public:
    BSTIterator(TreeNode* root) 
    :node_{root}
    {}
    
    int next() {
        while(node_) {
            stack_.push_back(node_);
            node_ = node_->left;
        }

        if(stack_.empty()) {
            return INT_MIN;
        }

        auto* n = stack_.back();
        stack_.pop_back();
        auto result = n->val;
        node_ = n->right;
        return result;
    }
    
    bool hasNext() {
      return node_ || !stack_.empty();
    }
private:
  TreeNode* node_ = nullptr;
  std::vector<TreeNode*> stack_ = {};
};