object oriented – Create C++14 library where each class has 2 similar variants


I’m writing a C++ library which does some computation on vectors of audio data.

The library supports both GPUs (using Thrust, a C++ STL-like library for GPUs) and CPUs (using the STL). I’m using CUDA Toolkit 10.2, whose newest supported host compiler is GCC 8, which limits me to C++14. All of this is on an amd64 desktop computer running Fedora 32.

The library contains several classes, and each class has a CPU and a GPU version. I’m looking for a neat way to define the CPU/GPU variants without duplicating code. Sometimes when I fix a bug in the GPU algorithm, I forget to go and fix it in the CPU algorithm, and vice versa. Also, it would be nice if the mechanism worked at the library level, so that if I instantiate “AlgorithmA-CPU”, it internally uses “AlgorithmB-CPU”, and similarly for GPU.

Here’s a simple example:

struct WindowCPU {
    std::vector<float> window{1.0F, 2.0F, 3.0F};
};

struct WindowGPU {
    thrust::device_vector<float> window{1.0F, 2.0F, 3.0F};
};

class AlgorithmCPU {
public:
    std::vector<float> scratch_buf;
    WindowCPU window;

    AlgorithmCPU(size_t size) : scratch_buf(size, 0.0F) {}

    void process_input(std::vector<float>& input) {
        // using thrust, the code ends up looking identical
        thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
    }
};

class AlgorithmGPU {
public:
    thrust::device_vector<float> scratch_buf;
    WindowGPU window;

    AlgorithmGPU(size_t size) : scratch_buf(size, 0.0F) {}

    void process_input(thrust::device_vector<float>& input) {
        // using thrust, the code ends up looking identical
        thrust::transform(input.begin(), input.end(), scratch_buf.begin(), some_functor());
    }
};

The example is overly simplified, but it illustrates the problem shared by all of my algorithms – the code is identical except for the data types: CPU uses “std::vector”, while GPU uses “thrust::device_vector”. Also, there is a sort of “cascading” specialization – “AlgorithmCPU” uses “WindowCPU”, and similarly for GPU.
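Concretely, what I’d like is to write each algorithm only once, templated on the container type. Here is a sketch of what I mean (the doubling `some_functor` is a placeholder I invented, and `std::transform` stands in for `thrust::transform` so the sketch compiles without CUDA – Thrust’s `transform` accepts both host and device iterators):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Placeholder for the real functor; here it just doubles each sample.
struct some_functor {
    float operator()(float x) const { return x * 2.0F; }
};

// The whole algorithm written once, parameterized on the vector type.
// Vec would be std::vector<float> for the CPU variant and
// thrust::device_vector<float> for the GPU variant.
template <typename Vec>
class Algorithm {
public:
    Vec scratch_buf;

    explicit Algorithm(std::size_t size) : scratch_buf(size, 0.0F) {}

    void process_input(Vec& input) {
        std::transform(input.begin(), input.end(), scratch_buf.begin(),
                       some_functor());
    }
};

using AlgorithmCPU = Algorithm<std::vector<float>>;
// using AlgorithmGPU = Algorithm<thrust::device_vector<float>>;
```

But I don’t see how to extend this per-class trick into a library-wide convention.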

Here’s one real example I have in my code currently, applied to the above fake algorithm, to reduce code duplication:

template <typename A>
static void _execute_algorithm_priv(A input, A output) {
    thrust::transform(input.begin(), input.end(), output.begin(), some_functor());
}

// GPU specialization
void AlgorithmGPU::process_input(thrust::device_vector<float>& input)
{
    _execute_algorithm_priv<thrust::device_vector<float>&>(input, scratch_buf);
}

// CPU specialization
void AlgorithmCPU::process_input(std::vector<float>& input)
{
    _execute_algorithm_priv<std::vector<float>&>(input, scratch_buf);
}

Now in the real code, I have many algorithms, some are huge. My imagination can’t stretch to a global library-wide solution. I thought of something using an enum:

enum ComputeBackend {
    GPU,
    CPU
};

Afterwards, I would create class templates based on the enum – but I’d need to map the enum to different data types:

template <ComputeBackend b> class Algorithm {
    // Somehow define other types based on the compute backend.
    // In pseudocode:
    //
    //   if (b == ComputeBackend::CPU) {
    //       vector_type = std::vector<float>;
    //       other_type  = Ipp32f;
    //   } else {
    //       vector_type = thrust::device_vector<float>;
    //       other_type  = Npp32f;
    //   }
};
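What I have in mind is roughly a traits struct specialized on the enum, with each algorithm written once against the traits. Here is a compilable sketch of the idea (the `BackendTraits` name is mine, and `std::vector` stands in for `thrust::device_vector` so it builds without CUDA):

```cpp
#include <cstddef>
#include <vector>

enum ComputeBackend {
    GPU,
    CPU
};

// Stand-in for thrust::device_vector<float>, so the sketch compiles
// without CUDA; the real code would use the Thrust type here.
using fake_device_vector = std::vector<float>;

// Primary template deliberately left undefined: an unsupported
// backend value becomes a compile-time error.
template <ComputeBackend B> struct BackendTraits;

template <> struct BackendTraits<ComputeBackend::CPU> {
    using vector_type = std::vector<float>;
    // using other_type = Ipp32f;
};

template <> struct BackendTraits<ComputeBackend::GPU> {
    using vector_type = fake_device_vector;  // thrust::device_vector<float> in real code
    // using other_type = Npp32f;
};

// Each algorithm is then defined once, pulling its types from the traits.
template <ComputeBackend B>
class Algorithm {
public:
    using vector_type = typename BackendTraits<B>::vector_type;
    vector_type scratch_buf;

    explicit Algorithm(std::size_t size) : scratch_buf(size, 0.0F) {}
};
```

The “cascading” requirement would follow naturally, since `Algorithm<B>` could hold a `Window<B>` member that resolves through the same traits.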

I read about “if constexpr”, but that is a C++17 feature, so I don’t believe I can use it in C++14.
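For the type selection alone, `std::conditional_t` might already be enough – it is available in C++14 and picks between two types at compile time. A minimal sketch (again with `std::deque` as a stand-in for the Thrust type so it builds without CUDA):

```cpp
#include <deque>
#include <type_traits>
#include <vector>

enum ComputeBackend {
    GPU,
    CPU
};

// std::conditional_t<cond, A, B> is A when cond is true, else B.
// std::deque<float> stands in for thrust::device_vector<float> here.
template <ComputeBackend B>
using vector_type = std::conditional_t<B == ComputeBackend::CPU,
                                       std::vector<float>,
                                       std::deque<float>>;
```

But with several types per backend, writing one `std::conditional_t` per type seems clumsier than a single traits struct, so I’m unsure which direction scales better.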