Several Patterns for Handling States in C++

July 2, 2023

When programming, it is common to encounter situations where state needs to be controlled. In such cases, there are several patterns that can be employed.

Note that I chose to use C++ for its ease of use, but these patterns can be adapted to other languages to a certain extent.

Pattern 1: Function that takes the current state and an input value, and returns an output value along with the new state

This is structurally the simplest pattern. The code example would look as follows:

#include <utility>

struct State { /*...*/ };
struct Input { /*...*/ };
struct Output { /*...*/ };

std::pair<State, Output> update(State state, Input in) {
    State new_state;
    Output out;
    // ...
    return {new_state, out};
}

In this code, three data types are defined: State to represent the state, Input to represent the input value, and Output to represent the output value. The update function takes the current state and an input value, and returns a pair of the new state and the output value. If the current state needs to be updated with the new state, it can be done outside the function.

auto [new_state, out] = update(state, in);
state = new_state;  // Update the actual state

The update function itself becomes a pure function without side effects.
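To make the skeleton above concrete, here is a minimal sketch with the placeholder types filled in. The fields total, delta, and is_positive are hypothetical and chosen only for illustration; they are not part of the pattern itself.

```cpp
#include <utility>

// Hypothetical concrete types, chosen only for illustration.
struct State { int total = 0; };
struct Input { int delta = 0; };
struct Output { bool is_positive = false; };

// Pure: reads only its arguments and returns freshly built values.
std::pair<State, Output> update(State state, Input in) {
    State new_state{state.total + in.delta};
    Output out{new_state.total > 0};
    return {new_state, out};
}
```

Calling update with the same state and input always gives the same result, and the state object held by the caller is left untouched until the caller decides to assign the new one.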

If the struct is relatively large, using references or pointers to reduce the cost of passing arguments/return values during function calls can be beneficial. However, I will only provide an example without explaining it, as it is beyond the scope of the main discussion.

#include <memory>   // std::unique_ptr, std::make_unique
#include <utility>  // std::pair, std::move

std::pair<std::unique_ptr<State>, std::unique_ptr<Output>> update(const State& state, const Input& in) {
    auto new_state = std::make_unique<State>();
    auto out = std::make_unique<Output>();
    // ...
    return {std::move(new_state), std::move(out)};
}

One advantage of this pattern is that the previous state remains unchanged, allowing you to keep a history of states in a ring buffer or perform validation on whether to actually update the state outside the function.
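The history and validation ideas can be sketched roughly as follows. The History type, the try_update helper, the buffer size of 4, and the rule that the total must not go negative are all assumptions made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <utility>

struct State { int total = 0; };
struct Input { int delta = 0; };
struct Output { int total = 0; };

std::pair<State, Output> update(State state, Input in) {
    State new_state{state.total + in.delta};
    return {new_state, Output{new_state.total}};
}

// Hypothetical fixed-size ring buffer holding the most recent states.
struct History {
    std::array<State, 4> slots{};
    std::size_t next = 0;

    void push(const State& s) {
        slots[next] = s;
        next = (next + 1) % slots.size();  // wrap around, overwriting the oldest
    }
};

// Computes a candidate state and commits it only if it passes validation.
// Returns true when the candidate was accepted.
bool try_update(State& current, History& history, Input in) {
    State candidate = update(current, in).first;
    if (candidate.total < 0) {
        return false;  // rejected: current and history are left untouched
    }
    history.push(current);  // remember the state being replaced
    current = candidate;
    return true;
}
```

Because update never touches its argument, rejecting a candidate costs nothing more than discarding it.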

However, with this pattern, since a new state is created in memory every time, it may unnecessarily impact performance if the memory footprint of the managed State is large.

Pattern 2: Function that takes a reference to the state and an input value, modifies the referenced state, and returns an output value

This method addresses the drawback of Pattern 1.

Output update(State& state, Input in) {
    Output out;
    // ...
    return out;
}

By reusing the State and modifying it in-place, the update function is no longer pure, but it eliminates the need to create a new state in memory each time.
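As a rough illustration of why in-place modification helps, consider a state that owns a growing buffer. The samples, sum, and mean fields below are hypothetical names for this sketch only.

```cpp
#include <vector>

// Hypothetical state that would be expensive to copy on every update.
struct State {
    std::vector<int> samples;
    long long sum = 0;
};
struct Input { int value = 0; };
struct Output { double mean = 0.0; };

// Appends the input to the existing buffer instead of building a new State.
Output update(State& state, Input in) {
    state.samples.push_back(in.value);
    state.sum += in.value;
    return Output{static_cast<double>(state.sum) /
                  static_cast<double>(state.samples.size())};
}
```

With Pattern 1, each call would have to copy or move the whole vector into a new State; here only one element is appended.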

Pointers can also be used instead of references.

Output update(State* state, Input in) {
    Output out;
    // ...
    return out;
}

This approach is adopted by C standard library functions related to file operations, such as:

char *fgets(char *str, int n, FILE *stream);

In the case of the fgets function, the third argument FILE* stream represents the file’s state.
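A small sketch of how that looks from the caller's side: the read_lines helper and the 256-byte buffer are assumptions for this example, while std::fgets is the standard function shown above.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Reads the remaining content of an already-open stream, one chunk per
// std::fgets call. Each call advances the position stored inside *stream,
// so consecutive calls return consecutive lines. Lines longer than the
// buffer are returned in several pieces.
std::vector<std::string> read_lines(std::FILE* stream) {
    std::vector<std::string> lines;
    char buf[256];
    while (std::fgets(buf, static_cast<int>(sizeof buf), stream) != nullptr) {
        lines.emplace_back(buf);
    }
    return lines;
}
```

The caller never sees the file position directly; it lives inside the FILE object and is modified through the pointer, which is exactly the shape of Pattern 2.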

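In my observation, opinions differ on whether to use references or pointers for this pattern. A pointer makes the mutation visible at the call site (update(&state, in)) and can be null, while a reference can never be null and needs no dereferencing inside the function.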

Pattern 3: Method that takes an input value, modifies the class’s state, and returns an output value

In C++, Pattern 2 can be written with syntactic sugar: the update function becomes a method of a class (formally, a member function), and the state is passed implicitly through the this pointer.

struct State {
    /*...*/
    Output update(Input in) {
        Output out;
        // ...
        return out;
    }
};

This is the so-called object-oriented programming approach and is likely the most common method across various languages.
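One thing the member-function form adds over Pattern 2 is access control. In the sketch below, the total_ field and the concrete Input/Output members are hypothetical; the point is only that the state can be made private so that update is the sole way to change it.

```cpp
struct Input { int delta = 0; };
struct Output { int total = 0; };

class State {
public:
    // Modifies this object's state and reports the new total.
    Output update(Input in) {
        total_ += in.delta;
        return Output{total_};
    }

private:
    int total_ = 0;  // not reachable from outside the class
};
```

Each State object carries its own total_, so several independent instances can coexist, just as with Pattern 2.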

Pattern 4: Lambda that captures the state by reference, takes an input value, modifies the referenced state, and returns an output value

This is conceptually similar to Patterns 2 and 3, but it uses lambdas.

State state;
auto update = [&state](Input in) -> Output {
    Output out;
    // ...
    return out;
};

The generated lambda function can be reused by the caller.

Input in;
Output out = update(in);

One drawback of this pattern is that dangling references can easily occur. It is important to ensure that the captured variable state remains alive for as long as the update lambda is in use.
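If the lambda has to outlive the scope that created the state, one possible variation (not the pattern described above, but a close relative) is to let the lambda own its state through an init-capture and mark it mutable. The make_update factory and the field names are assumptions for this sketch.

```cpp
#include <functional>

struct State { int total = 0; };
struct Input { int delta = 0; };
struct Output { int total = 0; };

// Returns an updater that owns a copy of the initial state, so there is
// no reference that could dangle after this function returns.
std::function<Output(Input)> make_update(State initial) {
    return [state = initial](Input in) mutable -> Output {
        state.total += in.delta;  // modifies the lambda's own copy
        return Output{state.total};
    };
}
```

The trade-off is that the state is no longer visible from outside the lambda, and copying the returned std::function also copies the state, so the two copies evolve independently from then on.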

Pattern 5: Function that uses a function-scope static variable

If you only need to maintain a single state throughout the program, do not need to separate the state's creation from its update logic, and do not need to read the state from other functions, a function-scope static variable can be used.

A function-scope static variable is initialized only once, the first time control passes through its declaration, and it keeps its value between calls.

Output update(Input in) {
    static State state;
    Output out;
    // ...
    return out;
}

While it has the major limitation of not being able to hold multiple instances, it can be a powerful approach when only one state needs to be managed and a single update pattern is sufficient.
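A minimal sketch of this behavior, using a hypothetical running total as the state:

```cpp
struct Input { int delta = 0; };
struct Output { int total = 0; };

Output update(Input in) {
    // Initialized to 0 once; later calls skip the initialization
    // and see the value left by the previous call.
    static int total = 0;
    total += in.delta;
    return Output{total};
}
```

Every caller in the program shares the same total, and no code outside update can read or reset it.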

Pattern 6: Function that uses file-scope static/global variables

In the case of Pattern 5, there was a strict constraint that only the update function could modify the state. However, by moving the state variable outside and making it accessible, multiple modification methods become possible.

static State state;

Output update(Input in) {
    Output out;
    // ... Update the state
    return out;
}

void reset() {
    // ... Reset the state
}

While this is still a viable approach in C, any function in the file (or, for a non-static global, anywhere in the program) can now modify the state. That makes changes harder to trace, so this pattern should be avoided as much as possible in serious programming.
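If this pattern is used anyway in C++, an unnamed namespace gives the variable internal linkage in the same way as the static keyword, which at least confines access to a single source file. The sketch below fills in the placeholders with a hypothetical running total.

```cpp
struct State { int total = 0; };
struct Input { int delta = 0; };
struct Output { int total = 0; };

namespace {
// Internal linkage: visible only inside this translation unit.
State state;
}  // namespace

Output update(Input in) {
    state.total += in.delta;
    return Output{state.total};
}

void reset() {
    state = State{};  // back to the default-constructed state
}
```

Both update and reset now touch the same variable, which is the flexibility this pattern buys and also the reason its changes are harder to follow.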