linear algebra – Why does this trick to derive the formula for $[A^n,B]$ in terms of repeated commutators work so well?

It is a known result that, given generic noncommuting operators $A, B$, we have
$$ A^n B = \sum_{k=0}^n \binom{n}{k} \operatorname{ad}^k(A)(B) \, A^{n-k}. \tag{A} $$
This can be proven, for example, by induction with little work.
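As a quick sanity check of (A), here is a minimal numerical sketch. It assumes that $A$ and $B$ are random real $4 \times 4$ matrices and that $\operatorname{ad}(A)(B) = AB - BA$; the helper names `ad_power` and `rhs_formula_A` are made up for this illustration.

```python
import numpy as np
from math import comb

def ad_power(A, B, k):
    """Return ad^k(A)(B), i.e. the k-fold nested commutator [A, [A, ... [A, B]]]."""
    D = B
    for _ in range(k):
        D = A @ D - D @ A
    return D

def rhs_formula_A(A, B, n):
    """Right-hand side of (A): sum over k of binom(n, k) * ad^k(A)(B) * A^(n-k)."""
    return sum(
        comb(n, k) * ad_power(A, B, k) @ np.linalg.matrix_power(A, n - k)
        for k in range(n + 1)
    )

# Assumption: random matrices stand in for "generic noncommuting operators".
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
n = 5

lhs = np.linalg.matrix_power(A, n) @ B
formula_holds = np.allclose(lhs, rhs_formula_A(A, B, n))
```

This only tests the identity on one random instance, so it is evidence rather than a proof.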

However, while trying to better understand this formula, I realized that there is a much simpler way to derive it, at least on a formal and intuitive level.

The trick

Let $\hat{\mathcal S}$ and $\hat{\mathcal C}$ (standing for "shift" and "commute", respectively) denote operators that act on expressions of the form $A^k D^j A^\ell$ (writing, for simplicity, $D^j \equiv \operatorname{ad}^j(A)(B)$) as follows:

\begin{align}
\hat{\mathcal S}(A^k D^j A^\ell)
&= A^{k-1} D^j A^{\ell+1}, \\
\hat{\mathcal C}(A^k D^j A^\ell)
&= A^{k-1} D^{j+1} A^\ell.
\end{align}

In other words, $\hat{\mathcal S}$ "moves" the central $D$ block one step to the left, while $\hat{\mathcal C}$ makes it "eat" the neighbouring $A$ factor.

It's not hard to see that $\hat{\mathcal S} + \hat{\mathcal C} = \mathbb 1$, which is just another way to state the identity
$$ A[A,B] = [A,B]A + [A,[A,B]]. $$
Moreover, and crucially, $\hat{\mathcal S}$ and $\hat{\mathcal C}$ commute.
For this reason, I can write

$$ A^n B = (\hat{\mathcal S} + \hat{\mathcal C})^n (A^n B) = \sum_{k=0}^n \binom{n}{k} \hat{\mathcal S}^{n-k} \hat{\mathcal C}^{k} (A^n B), $$
which immediately gives me (A), without any need for recursion or other tricks.
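To see the bookkeeping behind this expansion, here is a small sketch that treats the expressions purely formally. It assumes each term $A^k D^j A^\ell$ is encoded as the exponent triple `(k, j, l)`, with $\hat{\mathcal S}$ sending it to `(k-1, j, l+1)` and $\hat{\mathcal C}$ sending it to `(k-1, j+1, l)`; the function names are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from math import comb

# A term A^k D^j A^l is stored as the exponent triple (k, j, l).
def shift(term):
    """S: move one A factor from the left of the D block to its right."""
    k, j, l = term
    return (k - 1, j, l + 1)

def commute(term):
    """C: let the D block absorb one A factor from its left, D^j -> D^(j+1)."""
    k, j, l = term
    return (k - 1, j + 1, l)

def apply_s_plus_c(expr):
    """Apply S + C to a formal sum {term: coefficient}; assumes k >= 1 in every term."""
    out = Counter()
    for term, coeff in expr.items():
        out[shift(term)] += coeff
        out[commute(term)] += coeff
    return out

n = 4
expr = Counter({(n, 0, 0): 1})  # the single term A^n B
for _ in range(n):
    expr = apply_s_plus_c(expr)

# Terms D^k A^(n-k) with binomial coefficients, as on the right-hand side of (A).
expected = Counter({(0, k, n - k): comb(n, k) for k in range(n + 1)})
matches_formula_A = expr == expected
```

On exponent triples, `shift` and `commute` are just translations by fixed vectors, which is one way to see why they commute and why binomial coefficients appear.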

The question

Now, this is all fine and dandy, but it leaves me wondering why this kind of thing works.
It seems to me that I am somehow getting around the inconvenience of dealing with noncommuting operators by moving to a space of "super-operators", in which the same operation can be expressed in terms of commuting "super-operators".

I am not even sure how one would formalize these "super-operators" $\hat{\mathcal S}, \hat{\mathcal C}$, because they seem to act on "strings of operators" rather than on elements of the operator algebra itself.

Is there a way to formalize this way of handling expressions? Is this a well-known method in this context (I had never seen it, but I am not very familiar with this kind of manipulation)?