I'm looking for the fastest way to partition a large matrix into blocks of N×M elements and sum the elements of each block.

To be perfectly clear, I am not interested in the partitioned matrix itself, only in the final result containing the sum of each block.

For example, let's say that I have the following 8×9 matrix:

```
test = Array[Subscript[a, ##] &, {8, 9}]
```

I partition it into N×M sub-matrices, in this example 2×3 (2 rows by 3 columns):

```
subtest = Partition[test, {2, 3}]
```

and then I sum the elements of each sub-matrix:

```
out = MapAt[Total[#, -1] &, subtest, {All, All}];
```

I could use other methods to sum the sub-matrices, such as:

```
out = Total /@ Flatten /@ # & /@ subtest;
```

Alternatively, I could use two nested `Table` calls or `For` loops.
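To make the nested-`Table` idea concrete, here is a minimal sketch. It assumes the same 8×9 symbolic matrix and 2×3 block size as above; the names `rows` and `cols` are only illustrative. It extracts each block with `Part` and sums it, so no partitioned array is built.

```mathematica
(* Assumed example data: the 8x9 matrix and 2x3 block size used in this post *)
test = Array[Subscript[a, ##] &, {8, 9}];
{n, m} = {2, 3};  (* n = block height, m = block width *)
{rows, cols} = Dimensions[test];

(* Step through the top-left corner (i, j) of every complete block,
   extract the n x m block with Part, and sum all of its entries *)
out = Table[
   Total[test[[i ;; i + n - 1, j ;; j + m - 1]], 2],
   {i, 1, rows - n + 1, n}, {j, 1, cols - m + 1, m}];

Dimensions[out]  (* one sum per block *)
```

Like `Partition`, this ignores any incomplete blocks at the right and bottom edges.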

My question is: what is the fastest way to do this? I have to do it on a 48k × 48k matrix, so I really need something reasonably fast.

Should I try to compile nested loops to C? I am not sure, as I have never tried that.
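For reference, a compiled nested-loop version might look roughly like the sketch below. The function name `blockSumCompiled` is hypothetical, `CompilationTarget -> "C"` assumes a working C compiler is installed, and this has not been benchmarked, so treat it as an illustration of the idea rather than a known-fast solution.

```mathematica
(* Hypothetical helper: accumulates every entry of mat into the sum of the
   n x m block it belongs to. Rows and columns that do not fill a complete
   block are ignored, matching the behavior of Partition. *)
blockSumCompiled = Compile[{{mat, _Integer, 2}, {n, _Integer}, {m, _Integer}},
   Module[{br = Quotient[Length[mat], n],
     bc = Quotient[Length[mat[[1]]], m], out, r, c},
    out = Table[0, {br}, {bc}];
    Do[
     r = Quotient[i - 1, n] + 1;  (* block row index of entry (i, j) *)
     c = Quotient[j - 1, m] + 1;  (* block column index of entry (i, j) *)
     out[[r, c]] = out[[r, c]] + mat[[i, j]],
     {i, n br}, {j, m bc}];
    out],
   CompilationTarget -> "C"];

(* Small check against the Partition-based approach on random 0/1 data *)
test = RandomInteger[1, {8, 9}];
blockSumCompiled[test, 2, 3] ==
 MapAt[Total[#, -1] &, Partition[test, {2, 3}], {All, All}]
```

The compiled function only accepts a packed integer matrix, which fits the data here but would not work for the symbolic example.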

Note that all entries in the matrix are non-negative integers.

Here is a (redundant) example with numerical values. The dimensions can be increased to test larger matrices:

```
test = RandomInteger[1, {8, 9}];
```

{{0, 0, 1, 0, 1, 1, 0, 0, 0}, {1, 1, 0, 1, 0, 0, 1, 0, 0},
{0, 0, 1, 1, 1, 1, 0, 0, 0}, {0, 0, 1, 0, 0, 1, 1, 0, 1},
{0, 0, 1, 1, 0, 0, 1, 1, 1}, {1, 0, 0, 1, 0, 1, 1, 1, 1},
{1, 0, 1, 1, 1, 0, 1, 1, 0}, {0, 0, 1, 1, 1, 0, 1, 0, 1}}

```
m = 3
n = 2
out = MapAt[Total[#, -1] &, Partition[test, {n, m}], {All, All}]
```

{{3, 3, 1}, {2, 4, 2}, {2, 3, 6}, {3, 4, 4}}