Fast Grouping and Summation of Matrix Elements

I'm looking for the fastest way to partition the elements of a large matrix into N×M blocks and sum the elements of each block.
To be clear, I am not interested in the partitioned matrix itself, only in the final result that contains the block sums.

I will show you an example below:

Let's say that I have the following 8×9 matrix (8 rows, 9 columns):

test = Array[Subscript[a, ##] &, {8, 9}]

I partition it into N×M sub-matrices, in this example 2×3 (2 rows by 3 columns):

subtest = Partition[test, {2, 3}]

and then I sum the elements of each sub-matrix:

out = MapAt[Total[#, -1] &, subtest, {All, All}];

I could use other methods to sum the sub-matrices, such as:

out = Total /@ Flatten /@ # & /@ subtest;

Or I could use two nested Table calls, or For loops.
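
For reference, the nested-Table variant could look roughly like the sketch below. The name blockSumTable is made up for illustration, and the sketch assumes that the matrix dimensions are exact multiples of the block size (unlike Partition, it does not silently drop incomplete blocks).

```mathematica
(* Hypothetical helper: sums each n x m block of mat with two nested Table iterators. *)
(* Assumes Length[mat] is a multiple of n and the row length is a multiple of m. *)
blockSumTable[mat_, n_, m_] := Table[
  Total[mat[[i ;; i + n - 1, j ;; j + m - 1]], 2], (* sum of all entries in one block *)
  {i, 1, Length[mat], n},          (* first row of each block *)
  {j, 1, Length[First[mat]], m}]   (* first column of each block *)

out = blockSumTable[test, 2, 3];
```

Each Part span extracts one block, so this does the same work as Partition followed by Total, just with explicit indices.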

My question is: what is the fastest way to do this? I have to do it on a 48k x 48k matrix, so I really need something reasonably fast.
Should I try compiling nested loops to C (I'm not sure; I have never tried that)?
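
To make the compiled-loop idea concrete, here is a minimal, untimed sketch using Compile. The name blockSumCompiled is hypothetical, and the sketch assumes an integer matrix with at least one complete block. CompilationTarget -> "C" needs a working C compiler; without one, Mathematica should fall back to its own virtual machine with a warning. Whether this beats the built-in functions is not claimed here.

```mathematica
(* Sketch only: blockSumCompiled is a hypothetical name. *)
(* Rows and columns beyond the last complete block are ignored, as with Partition. *)
blockSumCompiled = Compile[{{mat, _Integer, 2}, {n, _Integer}, {m, _Integer}},
   Module[{res = Table[0, {Quotient[Length[mat], n]}, {Quotient[Length[mat[[1]]], m]}]},
    Do[
     (* add entry (i, j) to the accumulator of the block that contains it *)
     res[[Quotient[i - 1, n] + 1, Quotient[j - 1, m] + 1]] += mat[[i, j]],
     {i, n*Length[res]}, {j, m*Length[res[[1]]]}];
    res],
   CompilationTarget -> "C"];

(* usage on a small 0/1 matrix with 2 x 3 blocks *)
out = blockSumCompiled[RandomInteger[1, {8, 9}], 2, 3];
```

The loop visits every matrix entry once and accumulates it directly, so no partitioned intermediate array is built.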

It may be relevant that all entries of the matrix are non-negative integers.

Here is a (redundant) example with numerical values, which can easily be scaled up to larger matrices:

test = RandomInteger[1, {8, 9}];

{{0, 0, 1, 0, 1, 1, 0, 0, 0}, {1, 1, 0, 1, 0, 0, 1, 0, 0}, {0, 0, 1,
1, 1, 1, 0, 0, 0}, {0, 0, 1, 0, 0, 1, 1, 0, 1}, {0, 0, 1, 1, 0, 0,
1, 1, 1}, {1, 0, 0, 1, 0, 1, 1, 1, 1}, {1, 0, 1, 1, 1, 0, 1, 1,
0}, {0, 0, 1, 1, 1, 0, 1, 0, 1}}

m = 3
n = 2
out = MapAt[Total[#, -1] &, Partition[test, {n, m}], {All, All}]

{{3, 3, 1}, {2, 4, 2}, {2, 3, 6}, {3, 4, 4}}
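
To compare the two approaches shown above on something larger, a simple timing harness along these lines could be used. The 1200×1200 size and the names big, out1 and out2 are arbitrary choices for illustration; the timings depend on the machine, so none are quoted here.

```mathematica
n = 2; m = 3;                          (* block size: n rows by m columns *)
big = RandomInteger[1, {1200, 1200}];  (* dimensions are multiples of n and m *)

(* time each method in seconds; the trailing ; suppresses the large result *)
First@AbsoluteTiming[out1 = MapAt[Total[#, -1] &, Partition[big, {n, m}], {All, All}];]
First@AbsoluteTiming[out2 = Total /@ Flatten /@ # & /@ Partition[big, {n, m}];]

out1 === out2   (* both methods should give identical block sums *)
```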