C# – Best practice for designing multi-pass rendering with a D3D11 render queue

The goal: issue as few calls as possible on the D3D11 ImmediateContext while still rendering advanced effects such as toon shading, outlines, or anything else that requires drawing the same vertex buffers with at least two vertex and pixel shader programs, by efficiently sorting opaque draw calls front-to-back and transparent draw calls back-to-front.

What I did: I abstracted the calls to the D3D11 immediate context behind a DrawCall structure that acts as a GPU command recorder. A call to DrawCall.SetIndexBuffer creates a C# Action that issues the corresponding call on the ImmediateContext when the draw call is executed. DrawCall's internal GPU command list is sorted as follows: set shaders, set constant buffers, set blend state, set stencil state, set vertex buffers …. I use SharpDX and C#.
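To make the recorder idea concrete, here is a minimal sketch of how such a DrawCall could look. It is not my actual code: `FakeContext` is a hypothetical stand-in that only logs calls instead of talking to SharpDX, `CommandKind` and `Record` are illustrative names, and putting `Draw` last is an assumption, since the order after the vertex buffers is not spelled out above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Order follows the list above; Draw being last is an assumption.
public enum CommandKind { Shaders, ConstantBuffers, BlendState, StencilState, VertexBuffers, Draw }

// Hypothetical stand-in for the D3D11 immediate context: it only logs calls.
public sealed class FakeContext
{
    public readonly List<string> Log = new List<string>();
}

public sealed class DrawCall
{
    // Each recorded command is a deferred Action plus the kind used for sorting.
    private readonly List<(CommandKind Kind, Action<FakeContext> Run)> _commands
        = new List<(CommandKind Kind, Action<FakeContext> Run)>();

    // Records a command; nothing touches the context until Execute is called.
    public void Record(CommandKind kind, string description)
        => _commands.Add((kind, ctx => ctx.Log.Add(description)));

    public void Execute(FakeContext ctx)
    {
        // OrderBy is a stable sort, so commands of the same kind keep their recording order.
        foreach (var command in _commands.OrderBy(c => c.Kind))
            command.Run(ctx);
    }
}
```

Recording `Draw`, then `VertexBuffers`, then `Shaders` and calling `Execute` replays them in the sorted order (shaders, vertex buffers, draw), regardless of the order in which they were recorded.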

My problem: I use a 64-bit sort key for my GPU commands. When rendering complex effects such as outlines, or anything else that requires state changes between passes, the sorting approach no longer seems to work. For example, I render opaque objects first, then transparent objects. Now I want to render an opaque object with a diffuse shader, and it should have a red transparent outline. For this I need to change the stencil and blend states between the two DrawIndexed calls. According to my sort key, these GPU commands end up in different draw loops (opaque vs. transparent), but the stencil state requires that both DrawIndexed calls be issued one right after the other.
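The conflict can be reproduced with a small sketch. The key layout here is an assumption made for illustration (bit 63 as the transparent flag, bits 32..62 as a shader id, bits 0..31 as depth), not my real layout, and it ignores the back-to-front ordering of transparent objects to keep the example short. All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortKeyDemo
{
    // Assumed layout: bit 63 = transparent flag, bits 32..62 = shader id, bits 0..31 = depth.
    public static ulong MakeKey(bool transparent, uint shaderId, uint depth)
        => ((transparent ? 1UL : 0UL) << 63)
         | ((ulong)(shaderId & 0x7FFFFFFFu) << 32)
         | depth;

    // Builds a tiny render queue and returns the pass names in sorted order.
    public static List<string> SortedNames()
    {
        var queue = new List<(ulong Key, string Name)>
        {
            (MakeKey(false, 1, 10), "A diffuse"),  // opaque pass, writes stencil
            (MakeKey(true,  2, 10), "A outline"),  // transparent pass, tests stencil
            (MakeKey(false, 1, 20), "B diffuse"),  // unrelated opaque object
            (MakeKey(true,  3, 5),  "C glass"),    // unrelated transparent object
        };
        return queue.OrderBy(e => e.Key).Select(e => e.Name).ToList();
    }
}
```

With this layout the transparent flag dominates the key, so every opaque pass is drawn before any transparent one and object B's pass lands between the two passes of object A. That is exactly the ordering that breaks a stencil setup that relies on the two draws being adjacent.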

Should I implement linked GPU commands for multi-pass rendering, or are there better ways to handle such effects?