I’ve got a moderately large table (148 million rows) that’s got pretty common org chart attributes to it. Names are changed, but it’s the same idea. I was thinking of adding a hierarchyid column to the mix; then I could index that and hopefully make some of the queries faster.
This query is modeled on the msdocs sample on hierarchy id. Before I ran it, I put indexes on (employeeid), (managerid, include employeeid), and (officeid, managerid, employeeid) hoping to make it as fast as possible. But the query took over 5 hours to run.
I looked at the plan, and the (officeid, managerid, employeeid) index is used for both parts of the cte.
What I don’t understand is that the ROW_NUMBER() windowing function also appears to have added a SORT node, which ss says is taking 94% of the query time. It’s already using an index that is pre-ordered in the same way, so why is there a SORT node taking up 94% of the time?
WITH paths(path, duns, extended_gu, extended_parent) AS ( -- This section provides the value for the root of the hierarchy SELECT hierarchyid::GetRoot() AS path, employeeid, officeid, managerid FROM OrgChart where officeid = 0 -- This section provides values for all nodes except the root UNION ALL SELECT CAST(p.path.ToString() + CAST(ROW_NUMBER() OVER (PARTITION BY RD.officeid, RD.managerid ORDER BY RD.officeid, RD.managerid) AS varchar(30)) + "https://dba.stackexchange.com/" AS hierarchyid), RD.employeeid, RD.managerid, Rd.officeid FROM OrgChart rd JOIN paths AS p ON rd.managerid = P.employeeid ) UPDATE rd set tree=P.Path FROM Paths P INNER JOIN OrgChart rd on rd.duns = P.Duns