recursion – Trace reveals unexpected evaluations and internal calls for a recursive function

I would like help in identifying the problem in my implementation of the recursive function illustrated below. But first, a little context.

There is a well-known solution to the problem of finding the longest common subsequence among two sequences. The algorithm is as follows:

Reverse the sequences S1 and S2, and call the reversed sequences RS1 and RS2

Define a recursive getLCS function, which takes RS1, RS2 and an empty ACC accumulator and does the following:

If RS1 or RS2 is empty, getLCS returns ACC.

Otherwise, if h1, the head of RS1, is equal to h2, the head of RS2, getLCS returns getLCS (tail (RS1), tail (RS2), {h1} union ACC).

Otherwise, getLCS returns the longest of getLCS (queue (RS1), RS2, ACC) and getLCS (RS1, queue (RS2), ACC).

My implementation of Wolfram Language is as follows:

getLCS({}, _, acc_) := acc;
getLCS(_, {}, acc_) := acc;
getLCS({s_, t1___}, {s_, t2___}, acc_) := getLCS({t1}, {t2}, Prepend(acc, s));
getLCS({s1_, t1___}, {s2_, t2___}, acc_) := {
    getLCS({t1}, {s2, t2}, acc), getLCS({s1, t1}, {t2}, acc)
} // MaximalBy(Length) // First;
getLCS(lst1_, lst2_) := getLCS(Reverse(lst1), Reverse(lst2), {});

The implementation works, but is extremely slow, even for short sequences. Operation Trace sure:

getLCS({1, 2, 3}, {1, 2, 4}) // Trace

reveals a long list of internal calls whose innocent eyes must be protected.

For reference, here is (essentially) the same implementation in F#:

let getLCS<'A when 'A : equality> (lst1 : list<'A>) (lst2 : list<'A>) : list<'A> =
    let rec f (cLst1R : list<'A>) (cLst2R : list<'A>) (acc : list<'A>) : list<'A> =
        match cLst1R, cLst2R with
        | (), _ | _, () -> acc
        | h1 :: t1, h2 :: t2 when (h1 = h2) -> f t1 t2 (h1 :: acc)
        | _ :: t1, _ :: t2 ->
            ((f t1 cLst2R acc); (f cLst1R t2 acc)) |> List.maxBy List.length
    f (List.rev lst1) (List.rev lst2) ()

the F# the version runs much faster. I know it Wolfram Language is not as slow in comparison. So my questions are:

  1. What are the shortcomings of my Wolfram Language Implementation?
  2. Why Trace show a stack of calls so different from what would be considered a normal mental image of the stack of execution implied by the definition?

Thanks in advance.