algorithms – CountDistinct on a range

I have a dataset with and ID and a date looking like:

+----+------------+
| ID |    date    |
+----+------------+
| A  | 2021-07-20 |
| B  | 2021-07-20 |
| C  | 2021-07-20 |
| A  | 2021-07-20 |
| A  | 2021-07-21 |
| C  | 2021-07-21 |
| C  | 2021-07-21 |
| B  | 2021-07-22 |
| C  | 2021-07-22 |
| D  | 2021-07-22 |
| .  | ...        |
+----+------------+

I have currently 123 distinct dates and few millions of ID.

Unfortunately, I am not allowed to export the ID (even if I anonymise it). I would like to know if there is a way to preprocess the data to be able to store and query the number of distinct ID on a date range.

For example between 2021-07-20 and 2021-07-21 there is only 3 distincts ID. I don’t want to have 3 distinct on 2021-07-20 and 2 distinct on 2021-07-21 leading to 5 (I have added duplicated entries for this example but they are already dropped on my side).

I was thinking precomputing each ranges of days (123 * 123 / 2 combinations) but in term of scalability, it is not awesome (but better than nothing).

The other solution is to pivot the date in column and aggregate over all column with a count but I will have only groups of 1-2 individuals (because there is 2**123 combinations possibles of ady sequence per ID).

Any ideas is welcome and thanks for your support and time 😉

Regards,

Nicolas