graphs – Error type chart with simultaneous errors

1. What is an error type chart?

Our monitoring system measures the parameters of telecommunications services at regular intervals and, based on the measured values, automatically decides whether there were service errors. For every monitored service, our dashboard shows a chart that displays, as a percentage, the ratio of error-detecting measurements to all measurements. Different service error types are displayed in different colors.

2. EASY CASE: Mutually exclusive error types

In most cases, one measurement can detect at most one service error type, so I could use stacked bar charts. This way the user can examine not just the percentages of the different error types, but also the total percentage of errors, which is shown by the total height of the stack. Example chart with 6 possible error types:

[example chart: stacked bars, 6 error types]

The users like this solution.

3. MY PROBLEM: Simultaneous error types

In rare cases one measurement can detect more than one service error type, even all of them.

a] I cannot use stacked bar charts, because the height of the stack would be misleading. For example, if there are 100 measurements and 5 of them detect errors, but each of the 5 detects the same 3 error types, then the top of the stack would be at 15%, while the true total error ratio is 5%.

b] I also cannot use unstacked bar charts, because at the smallest supported display resolution there are only 10-15 pixels per time interval, so placing the bars next to each other would leave only 1-2 pixels per bar.

c] Both problems could be solved by unstacked line charts. Here is an example with 7 error types:

[example chart: unstacked lines, 7 error types]

Unfortunately, the users don’t like it, because “it’s very hard to see the total error percentage”. Although another chart displays the total error percentage, they would still prefer to see its value on the error type chart. I could add yet another, possibly thicker line showing the total error ratio, but in some cases there are 16 different error types, and I’m not sure that adding a 17th line with a special meaning wouldn’t make the chart too complex.
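
For concreteness, the mismatch from a] can be reproduced with a few lines of Python (hypothetical data; the error type names A, B, C are made up):

```python
def error_ratios(measurements, types):
    """measurements: one set of detected error types per measurement
    (an empty set means the measurement found no error)."""
    n = len(measurements)
    per_type = {t: sum(t in m for m in measurements) / n for t in types}
    total = sum(bool(m) for m in measurements) / n
    return per_type, total

# 100 measurements, 5 of which detect the same 3 error types:
ms = [set() for _ in range(95)] + [{"A", "B", "C"} for _ in range(5)]
per_type, total = error_ratios(ms, ["A", "B", "C"])
# sum(per_type.values()) is 0.15 (the stack height), but total is only 0.05
```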

4. MY QUESTION: How could I display on the same chart not just the different error type ratios, but also the total error ratio in a meaningful way?

algorithms – Paths in time dependent graphs

Given a time dependent (also called temporal) graph $G := (N, (E_i)_{i=1}^t)$ with sets of directed edges $E_i$, where $i \in \{1, \dots, t\}$, $t$ is the last timestamp and $N$ is the set of nodes.
The timestamp increments by one when going from one vertex to another.
I was trying to implement an algorithm that would find at least two node-disjoint paths to two given destination nodes $d_1$ and $d_2$, such that at any given timestamp, path 1 and path 2 never occupy the same node.
Reading about time dependent graphs, I realized that there seems to be no simple way to compute such paths: you would have to implement backtracking, because the available edges change at every step.
I have read about modified Dijkstra algorithms and the like, but I still couldn’t figure out how to implement one.

  1. How would I find a path? It doesn’t have to be a shortest or minimum-cost path. Regular pathfinding algorithms keep an array of visited nodes to prevent cycles, but in a time dependent graph it’s possible that the only existing path must return to an already visited node.

  2. My graph can also become disconnected, depending on the current timestamp. How would I continue finding a path in that case?

I thought about putting adjacent nodes into a map of this sort: Map<Edge, Integer>, so that I could sort the given sets of directed edges $(E_i)_{i=1}^t$ by their availability (timestamp), with the timestamp being an Integer and the Edge a tuple of two nodes as Integers (int, int).
It would be best if the algorithm could compute this in polynomial time, but I couldn’t come up with a solution.
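
One way around issue 1 is to search over (node, timestamp) states instead of plain nodes: each state is visited at most once, but the same node may reappear at a later timestamp. A minimal BFS sketch in Python (my own illustration, assuming edge_sets[t] holds the directed edges usable at timestamp t):

```python
from collections import deque

def temporal_path(edge_sets, source, target):
    """Breadth-first search over (node, timestamp) states.

    edge_sets[t] is the set of directed edges (u, v) usable at
    timestamp t.  Because the state includes the timestamp, a node
    can be revisited later -- a plain visited-node set would wrongly
    forbid that.
    """
    start = (source, 0)
    parent = {start: None}
    queue = deque([start])
    while queue:
        node, t = queue.popleft()
        if node == target:
            # Reconstruct the path by walking parents back to the start.
            path, state = [], (node, t)
            while state is not None:
                path.append(state[0])
                state = parent[state]
            return path[::-1]
        if t >= len(edge_sets):
            continue  # no more timestamps: dead end
        for u, v in edge_sets[t]:
            if u == node and (v, t + 1) not in parent:
                parent[(v, t + 1)] = (node, t)
                queue.append((v, t + 1))
    return None  # target unreachable within the available timestamps
```

If waiting at a node between timestamps is allowed, add the self-transition (node, t) to (node, t + 1) as well; that also handles intervals where the graph is momentarily disconnected (issue 2), at the cost of a state space of size |N| times t.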

graphs and networks – Retain Edge Ordering in DAG Conversion

I would like to convert an undirected graph into a directed graph such that, for each tuple, the first value is the head and the second the tail of a directed edge.

Initially I thought that using DirectedGraph[G, “Acyclic”] would straightforwardly solve this, but in an example with edges E={{1,2},{2,3},{5,3},{3,4},{4,6}}, the {5,3} edge ends up directed from 3 to 5.

[graph plot of the example]

I could convert the graph into something like E={{1,2},{2,4},{3,4},{4,5},{5,6}}, and I can imagine an algorithm that detects edges whose head label is larger than the tail label and relabels the vertices. For example, I could locate a bad pair and swap labels to get {{1,2},{2,5},{3,5},{5,4},{4,6}}, swapping the vertex coordinates in step, and iterate until I reach a result like {{1,2},{2,4},{3,4},{4,5},{5,6}}. But I wonder what simpler and/or alternative methods you might think of.
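
On the relabeling idea: if the orientation first value to second value is acyclic, numbering the vertices in topological order makes every edge point from a smaller to a larger label in a single pass, with no iterated swapping. A sketch outside Mathematica, in Python, using Kahn’s algorithm with a smallest-vertex-first tie-break:

```python
import heapq

def relabel_monotone(edges):
    """Relabel vertices so every directed edge (u, v) satisfies u < v.

    Treats each tuple as a directed edge first -> second; this works
    exactly when that orientation is acyclic.  A topological sort
    (Kahn's algorithm, smallest vertex first) gives the new labels.
    """
    nodes = {v for e in edges for v in e}
    indeg = {v: 0 for v in nodes}
    succ = {v: [] for v in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    ready = [v for v in nodes if indeg[v] == 0]
    heapq.heapify(ready)
    label, next_label = {}, 1
    while ready:
        u = heapq.heappop(ready)
        label[u] = next_label
        next_label += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                heapq.heappush(ready, v)
    if len(label) < len(nodes):
        raise ValueError("orientation contains a cycle")
    return [(label[u], label[v]) for u, v in edges]
```

On the example edges this yields {{1,2},{2,4},{3,4},{4,5},{5,6}} directly; vertex coordinates can be permuted by the same label map.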

graphs – SMA*+: What if a removed node gets re-generated via another predecessor?

One last question came to me while reading the paper on SMA*+, about setting the $f$-cost of re-generated nodes.

Well, first, it looks like the part of the algorithm where we update a node’s predecessor when we find a shorter path to it is missing from the paper. I assume it works as in Dijkstra or A*.

But then I wonder what happens if, between the time we prune a node $n$ and the time we re-generate it, it was rediscovered through another, shorter path. If I follow Algorithm 1, when we re-generate a node, we set its $f$-cost to whatever it was before. If we do this blindly, we might decrease or increase $f(n)$ arbitrarily, which I guess could be bad. So, how do we do it correctly?

graphs – SMA*+: Usefulness of culling heuristics

The paper on SMA*+ proposes a very interesting idea: a culling heuristic that differs from the full path cost estimate (the so-called $f$-cost).

In their benchmark they use a culling heuristic equal to the $f$-cost: $c(n) = f(n)$. And I actually can’t think of a valid reason to do otherwise. Ever. Are there really cases where having a culling heuristic different from the $f$-cost is useful? Or have the authors just not thought this through?

In my understanding, pruning a node that doesn’t maximize the $f$-cost can only lead to more node re-expansions than necessary. Indeed, we always expand the open node with the smallest $f$-cost, so we progress by expanding nodes with monotonically increasing $f$-cost. Therefore, if $f(n_1) < f(n_2)$ and $c(n_1) > c(n_2)$ and we prune $n_1$, then we will re-expand $n_1$ before considering expanding $n_2$, even if $n_2$ is more interesting according to the culling heuristic. If instead we had $c(n_1) < c(n_2)$, we would prune $n_2$, and we would only re-expand $n_2$ after we’ve exhausted all the nodes with a smaller $f$-cost; and if the goal is found in the meantime, we never re-expand $n_2$ at all.

It seems to me that if we have some domain knowledge, we should put it in the heuristic $h(n)$ and never choose a culling heuristic other than $c(n) = f(n)$.

graphs – SMA*+: f-cost estimation of re-generated nodes

I was reading the paper on SMA*+, which is very interesting, as it implements most of the improvements I thought of while reading the paper on SMA*. But I have 3 questions, which I think are related to my misunderstanding of the cost evaluation of re-generated nodes.

Can there be some memory saved in not fully expanding a node immediately?

Third paragraph of section 4:

Unlike SMA*, SMA*+ fully expands a node each iteration, instead of adding only one successor to the open list every iteration. While adding one successor at a time may seem more efficient, the overhead required to determine which successor to add, adds unnecessary complexity and decreases performance with minimal memory advantage.

Maybe in this context “minimal” is a euphemism for “none” and I’m just arguing semantics here. But if taken literally, I actually don’t see a case with a memory gain in adding one successor at a time.

I mean, if the heuristic is admissible, then every successor $n_i$ of a node $b$ has a higher $f$-cost: $f(b) \le f(n_i)$. And since we’re expanding $b$, we know it already has the lowest $f$-cost of all the open nodes. Therefore we won’t switch to expanding another node, and thus SMA* will always fully expand a node once it has started, no matter what.

Or did I miss something that might make the progressive expansion worth it in some cases?

Does SMA* store its children when they are removed?

The SMA*+ paper says in section 4:

The original SMA* makes similar progress by keeping removed nodes in memory, until their parent is removed.

Which I think is surprising. When I read the algorithm in the SMA* paper, section 3.1, I see:

Procedure BACKUP(n):
    if n is completed and has a parent then
        f(n) <- least f-cost of all its successors
        if f(n) changed, BACKUP(parent(n)).
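
Read literally, a Python rendering of that procedure could look like the following (the Node fields are my assumptions; the paper does not spell them out):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    f: float
    parent: "Node | None" = None
    successors: "list[Node]" = field(default_factory=list)
    completed: bool = True

def backup(n):
    """BACKUP as quoted above: keep only the least successor f-cost,
    propagating upward whenever that value changes."""
    if n.completed and n.parent is not None:
        new_f = min(s.f for s in n.successors)
        if new_f != n.f:
            n.f = new_f
            backup(n.parent)
```

Note that nothing here retains the successor nodes themselves, which is what makes the quoted claim in the SMA*+ paper surprising to me.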

I think this means it stores only the least $f$-cost of all the successors: not the successor nodes, not even one $f$-cost per successor.

Do I misunderstand the way SMA* recomputes the $f$-cost of re-generated successor nodes?

Why do we need to set $f(n) \leftarrow \max(f(b), g(n) + h(n))$?

In SMA* I thought this was needed because we forget the actual value of $f(n)$ when culling the node. Therefore we estimate its value using $f(b)$, which is always less than or equal to the $f$-cost of all of $b$’s successors (if the heuristic is admissible). This $\max$ then tries to be the tightest underestimate of the real path cost through the node. But here, in SMA*+, we store the $f$-cost of all the children, so why do we still need this $\max$?

graphs and networks – Generate Aztec triangle of size n automatically?

In the paper titled “Perfect Matchings of Cellular Graphs” by Mihai Ciucu, the Aztec triangle of size $n$ ($n = 1, 2, 3, 4, 5, \dots$) is equivalent to a triangular grid of $n^2$ squares ($n^2 = 1, 4, 9, 16, 25, \dots$).

See the following example:

Is there an automatic way to generate such patterns?

Also, is there a general method to generate the Aztec diamond of order $n$ (not just the triangle)?

Thank you very much!
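
Not a Mathematica answer, but the coordinates are easy to generate programmatically if (and this is my guess at the intended pattern) the triangular grid stacks centered rows of 1, 3, 5, … unit squares; a Python sketch:

```python
def triangular_grid_squares(n):
    """Lower-left corners of the unit squares in a triangular grid
    of n**2 squares, assuming row r from the top (r = 0 .. n-1)
    holds 2*r + 1 centered squares: 1 + 3 + ... + (2n - 1) = n**2.
    """
    squares = []
    for r in range(n):
        x0 = n - 1 - r          # shift so each row is centered
        for c in range(2 * r + 1):
            squares.append((x0 + c, n - 1 - r))
    return squares
```

Rendering these as unit squares reproduces a stepped triangle; the exact convention in the paper may differ, so treat the row rule as an assumption.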

matching theory – Number of edges in bipartite graphs

Let $G$ be a bipartite graph with $n$ vertices in each color class.

If $G$ contains no perfect matching, the number of edges can still be $\Omega(n^2)$ (for example, take the complete bipartite graph and delete all edges at one vertex).

  1. Suppose $G$ contains exactly one perfect matching; what is the maximum number of edges it can have?

  2. Suppose $G$ contains exactly two perfect matchings; what is the maximum number of edges it can have?

I think $2n-1$ is an upper bound for 1. Essentially, fix a perfect matching; as long as we do not create a complete $2 \times 2$ subgraph, we can keep adding edges, and there are exactly $n-1$ additional edges we can add. For 2., I think the upper limit is $2n$, by the same calculation.
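
Such conjectures can be sanity-checked on small cases by brute force over all bipartite graphs on $n + n$ vertices, recording the largest edge count among those with exactly $k$ perfect matchings. A Python sketch (exponential, so only feasible for very small $n$; the function name is mine):

```python
from itertools import combinations, permutations

def max_edges_with_k_perfect_matchings(n, k):
    """Largest edge count over all bipartite graphs on n + n vertices
    having exactly k perfect matchings (brute force, tiny n only)."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    for m in range(len(cells), -1, -1):   # try the largest graphs first
        for edges in map(set, combinations(cells, m)):
            # Count perfect matchings by checking every permutation.
            pm = sum(
                all((i, p[i]) in edges for i in range(n))
                for p in permutations(range(n))
            )
            if pm == k:
                return m                  # first hit is the maximum
    return -1
```

Running it for $n = 2, 3$ with $k = 1$ is a quick way to test the $2n-1$ conjecture before looking for a proof.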

colorings – What is a guaranteed number of colors, depending on the graph’s arboricity

Your explanation is correct.

And you cannot, in general, do better than $f(a) = 2a$. For example, take the complete graph on $4$ vertices $a, b, c, d$. Its arboricity is $2$, since $(a,b), (b,c), (c,d)$ form the first tree and the remaining edges $(a,c), (a,d), (b,d)$ form a second tree. Yet the graph needs exactly four colors.
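
The four-color claim for $K_4$ is easy to verify by brute force; a small Python check (chromatic_number is a throwaway helper written for this answer):

```python
from itertools import product

def chromatic_number(n, edges):
    """Smallest k such that the graph on vertices 0..n-1 admits a
    proper k-coloring (brute force, fine for tiny graphs)."""
    for k in range(1, n + 1):
        for coloring in product(range(k), repeat=n):
            if all(coloring[u] != coloring[v] for u, v in edges):
                return k
    return n

# K4 on vertices 0..3: arboricity 2, yet it needs 4 = 2 * 2 colors.
K4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
```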

Applications of directed graphs with weights in certain algebras

In my research, directed graphs with weights in certain (mostly non-commutative) $\mathbb{C}$-algebras arise at some points and need to be investigated for specific combinatorial quantities. This investigation can, however, be detached from the concrete application in my research. So I am wondering whether there are other applications of these weighted graphs, which are well suited to capture certain combinatorial quantities of the graph:

The arising algebras:
The rational function field $F$ in several variables, and the non-commutative algebra $B := \mathbb{C}\langle x_i, y_i : i \in \mathbb{N} \rangle / ( x_i y_i - a_i : i \in \mathbb{N} )$ for some $a_i \in \mathbb{C}$.

Some examples of the combinatorial quantities which can be captured:
(1) Let $F := K(x)$ be the rational function field over $K$ in the variable $x$. Let $\Gamma$ be the directed graph with enumerated vertices $v_1, v_2$, edges $E := \{ e_{1,1}, e_{1,2}, e_{2,1}, e_{2,2} \}$ and the source and terminus functions $s, t : E \to V$ given by $( s( e_{i,j} ), t( e_{i,j} ) ) = ( v_i, v_j )$. Let $w : E \to F$ be the weight function on $\Gamma$ defined by $w( e_{1,1} ) := x$, $w( e_{1,2} ) := 1$, $w( e_{2,1} ) := 1$, $w( e_{2,2} ) := 1/x$. Let $A = ( a_{i,j} )_{i,j}$ be the $w$-adjacency matrix of $\Gamma$, i.e. $a_{i,j} = \sum_{ ( s( e ), t( e ) ) = ( v_i, v_j ) } w( e )$. Then we have
$$ A = \begin{pmatrix} x & 1 \\ 1 & 1/x \end{pmatrix} $$
and the entries of $A^n$ are Laurent polynomials. Hence, $A^n$ sorts the paths of length $n$ by the difference between the numbers of appearances of $e_{1,1}$ and $e_{2,2}$.
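
Example (1) can be checked mechanically by representing each Laurent polynomial in $x$ as an exponent-to-coefficient dict; a Python sketch:

```python
def lmul(p, q):
    """Multiply two Laurent polynomials in x, stored as {exponent: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = out.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def ladd(p, q):
    """Add two Laurent polynomials."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}

def mat_mul(A, B):
    """Multiply matrices whose entries are Laurent polynomials."""
    n = len(A)
    C = [[{} for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] = ladd(C[i][j], lmul(A[i][k], B[k][j]))
    return C

# The w-adjacency matrix of the example: {1: 1} is x, {0: 1} is 1, {-1: 1} is 1/x.
A = [[{1: 1}, {0: 1}],
     [{0: 1}, {-1: 1}]]
A2 = mat_mul(A, A)
# A2[0][0] == {2: 1, 0: 1}, i.e. x^2 + 1: one closed path of length 2 uses
# e_{1,1} twice, the other uses e_{1,2} then e_{2,1}.
```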

(2) Going one step further, one could also replace $F$ with the Jacobson algebra $B := \mathbb{C} \langle x, y \rangle / ( x y - 1 )$ and set $w( e_{1,1} ) := x$, $w( e_{1,2} ) := 1$, $w( e_{2,1} ) := 1$, $w( e_{2,2} ) := y$. Then we have
$$ A = \begin{pmatrix} x & 1 \\ 1 & y \end{pmatrix} $$
and $A^n$ has entries of the form $\sum_{k,l} c_{k,l} y^k x^l$, where we also write $x$ and $y$ for their classes here. Hence, $A^n$ sorts the paths of length $n$ not only by the difference between the numbers of appearances of $e_{1,1}$ and $e_{2,2}$, but also by the appearances of $e_{1,1}$ and $e_{2,2}$ which are not “eliminated”.