search algorithms – Find dominated or subsumed linear inequalities efficiently

Given is a set of $N$ linear inequalities of the form $a_1x_1 + a_2x_2 + \dots + a_Mx_M \geq RHS$, where the $a_i$ and $RHS$ are integers. Inequality $A$ dominates (or subsumes) inequality $B$ if every coefficient of $A$ is less than or equal to the corresponding coefficient of $B$ and $RHS_A \geq RHS_B$. Most inequalities are sparse, i.e. most coefficients are zero. Usually, both $N > 1000$ and $M > 1000$. I’m trying to identify dominating or subsuming inequalities efficiently.
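To make the definition concrete, here is a minimal sketch of the pairwise test. The representation is an assumption for illustration only: each inequality is a pair `(coeffs, rhs)`, where `coeffs` is a dict from variable index to non-zero integer coefficient.

```python
def dominates(a, b):
    """Return True if inequality a dominates inequality b.

    Each inequality is a hypothetical (coeffs, rhs) pair, where coeffs maps
    a variable index to its non-zero integer coefficient (sparse storage).
    """
    coeffs_a, rhs_a = a
    coeffs_b, rhs_b = b
    if rhs_a < rhs_b:
        return False
    # Only variables that appear in at least one of the two inequalities
    # can violate the coefficient condition; all others are 0 <= 0.
    for i in coeffs_a.keys() | coeffs_b.keys():
        if coeffs_a.get(i, 0) > coeffs_b.get(i, 0):
            return False
    return True
```

The cost of one test is proportional to the number of non-zero coefficients of the two inequalities, so the expensive part is the number of pairs, not the individual comparison.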

What are quick ways to find
a) if an inequality is dominated by any other inequality, and
b) which inequalities are dominated by a given inequality?

(Monte Carlo algorithms that report correct results only most of the time are fine.)

Eén, N. and Biere, A., 2005, June. Effective preprocessing in SAT through variable and clause elimination. In International conference on theory and applications of satisfiability testing (pp. 61-75). Springer, Berlin, Heidelberg. (chapter 4.2) discuss a variant of this problem in the context where all coefficients are binary, i.e. $a_i \in \{0, 1\}$. They propose occurrence lists, one per variable $x_i$, containing all inequalities with non-zero coefficients $a_i$. Additionally they use a hashing scheme, similar to Bloom filters, to quickly eliminate candidates. I don’t see how to translate this to non-binary coefficients.
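For the binary case, the combination of occurrence lists and signatures could be sketched roughly as follows. This is an illustration in the spirit of that scheme, not the paper's code: the data layout (each inequality as a pair of a variable set and a right-hand side), the 64-bit signature, and all function names are assumptions.

```python
from collections import defaultdict

SIGNATURE_BITS = 64  # assumed signature width


def signature(support):
    """Bloom-filter-like bitmask: one bit per variable, hashed by index."""
    sig = 0
    for var in support:
        sig |= 1 << (var % SIGNATURE_BITS)
    return sig


def build_index(inequalities):
    """inequalities: list of (support, rhs); support is a set of variable indices."""
    occurrences = defaultdict(list)
    signatures = []
    for idx, (support, _rhs) in enumerate(inequalities):
        signatures.append(signature(support))
        for var in support:
            occurrences[var].append(idx)
    return occurrences, signatures


def dominated_by(a_idx, inequalities, occurrences, signatures):
    """Indices of all inequalities dominated by inequality a_idx (binary case)."""
    support_a, rhs_a = inequalities[a_idx]
    if support_a:
        # Any dominated inequality must contain every variable of A, so it
        # suffices to scan the shortest occurrence list among A's variables.
        pivot = min(support_a, key=lambda var: len(occurrences[var]))
        candidates = occurrences[pivot]
    else:
        candidates = range(len(inequalities))
    result = []
    for b_idx in candidates:
        if b_idx == a_idx:
            continue
        # A bit set in sig(A) but not in sig(B) rules out support(A) <= support(B).
        if signatures[a_idx] & ~signatures[b_idx]:
            continue
        support_b, rhs_b = inequalities[b_idx]
        if support_a <= support_b and rhs_a >= rhs_b:
            result.append(b_idx)
    return result
```

With binary coefficients, "all coefficients less or equal" reduces to a subset test on the supports, which is what makes the signature filter sound here.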

Achterberg, T., Bixby, R.E., Gu, Z., Rothberg, E. and Weninger, D., 2020. Presolve reductions in mixed integer programming. INFORMS Journal on Computing, 32(2), pp.473-506. (chapter 5.2) discuss a variant of the problem, but don’t solve it. They first hash inequalities by indices of non-zero coefficients and the coefficient values. Additionally, they limit their search to a small subset of inequalities.

Schulz, S., 2013. Simple and efficient clause subsumption with feature vector indexing. In Automated Reasoning and Mathematics (pp. 45-67). Springer, Berlin, Heidelberg. describes Feature Vectors for clauses and a kd-tree-like data structure to query combinations of feature vectors.

  • The trivial solution is to check all pairs of inequalities in $O(N^2)$. Unfortunately, that’s too slow for my application where I have millions of inequalities.
  • I tried performing a random projection of the coefficients for every inequality, resulting in a projection $p_j$ for every inequality. An inequality $j$ can only dominate inequality $k$ if $p_j \leq p_k$. Thus we don’t have to check all pairs. I repeat this process 10 times with multiple random projections, and use the random projection where I have to check the fewest pairs. In practice this is not effective, as most coefficients are zero, and it’s unlikely that a random projection focuses exactly on the few non-zero elements. It doesn’t help nearly enough.
  • Similarly I implemented Feature Vectors, but couldn’t replicate the performance reported by Schulz.
  • AFAIK, multi-dimensional data structures break down in high-dimensional scenarios (curse of dimensionality). I’m not aware of indexing techniques that work for high-dimensional range queries.
  • I thought about Bloom filters, unsuccessfully.
  • I thought about randomized algorithms, unsuccessfully.
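For reference, the random-projection filter from the list above could look roughly like this. It is a sketch under stated assumptions: the projection weights are positive integers (so that coefficient-wise domination implies $p_j \leq p_k$ and no rounding occurs), inequalities are `(coeffs, rhs)` pairs with sparse dict coefficients, and the function names are made up.

```python
import bisect
import random


def project(coeffs, weights):
    """Weighted sum of the non-zero coefficients (exact integer arithmetic)."""
    return sum(weights[i] * value for i, value in coeffs.items())


def candidate_pairs(inequalities, num_vars, seed=0):
    """Yield (j, k) pairs for which j could still dominate k.

    With positive weights, coefficient-wise a_j <= a_k implies p_j <= p_k,
    so every pair with p_j > p_k can be skipped without a full check.
    """
    rng = random.Random(seed)
    weights = [rng.randint(1, 1000) for _ in range(num_vars)]
    projections = [project(coeffs, weights) for coeffs, _rhs in inequalities]
    order = sorted(range(len(inequalities)), key=projections.__getitem__)
    sorted_projections = [projections[i] for i in order]
    for j in range(len(inequalities)):
        # First position whose projection is >= p_j (ties are kept).
        start = bisect.bisect_left(sorted_projections, projections[j])
        for k in order[start:]:
            if k != j:
                yield j, k
```

The filter only looks at the coefficients, so each surviving pair still needs the full test including the right-hand sides.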

Do you have any other ideas?

algorithms – Software for triangulation flip graphs?

I need to generate flip graphs on around 10 points (more would be nice). Specifically, I would like flip graphs on subsets of the integer lattice, so the coordinates of each point are integers. Is there any software for this? If not, what language would be best to build something simple?

Thanks!

algorithms – Can’t wrap my head around building a good suffix table for Boyer-Moore

My resources were the following video, as well as this video.

Basically, in one of the videos they state that the good suffix table for the pattern “ABCBAB” is the following:

k  suffix  d2
1  B       2
2  AB      4
3  BAB     4
4  CBAB    4
5  BCBAB   4

and for “DRIDI”:

k  suffix  d2
1  I       2
2  DI      5
3  IDI     5
4  RIDI    5

If all of the above is correct, I can’t understand why, in the first table, we have d2 = 4 for k = 5 (because we match the A with the suffix “BCBAB”?), whereas in the second table, for k = 4, we cannot do the same (match D with the D inside the suffix “RIDI”), so that the value of d2 is the length of the pattern, i.e. 5, and not 3.

What’s going on here?
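One way to probe the two tables is to compute them by brute force. The sketch below assumes the videos use the common textbook definition of d2: take the rightmost other occurrence of the suffix that is not preceded by the same character as the suffix itself; if there is none, fall back to the longest proper prefix of the pattern that equals a suffix of length l < k and shift by m - l; otherwise shift by the pattern length m. The function name is made up.

```python
def good_suffix_table(pattern):
    """Brute-force good-suffix shifts d2 for k = 1 .. len(pattern) - 1."""
    m = len(pattern)
    table = {}
    for k in range(1, m):
        suffix = pattern[m - k:]
        shift = None
        # Case 1: rightmost other occurrence of the suffix that is not
        # preceded by the character that precedes the suffix itself.
        for start in range(m - k - 1, -1, -1):
            if pattern[start:start + k] == suffix:
                same_predecessor = start > 0 and pattern[start - 1] == pattern[m - k - 1]
                if not same_predecessor:
                    shift = (m - k) - start
                    break
        if shift is None:
            # Case 2: longest prefix of length l < k that equals the suffix
            # of length l; if none exists, shift by the whole pattern.
            shift = m
            for l in range(k - 1, 0, -1):
                if pattern[:l] == pattern[m - l:]:
                    shift = m - l
                    break
        table[k] = shift
    return table


print(good_suffix_table("ABCBAB"))  # {1: 2, 2: 4, 3: 4, 4: 4, 5: 4}
print(good_suffix_table("DRIDI"))   # {1: 2, 2: 5, 3: 5, 4: 5}
```

Under this definition, the value 4 for k = 5 in “ABCBAB” comes from the second case (the prefix “AB” equals the suffix “AB”, so the shift is 6 - 2), not from matching a single A. In “DRIDI” no prefix equals a suffix, so the shift falls back to the pattern length.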

algorithms – Improving the time complexity

I’m currently stuck on a question. First I’ll state the question, then I’ll give my own algorithm for solving it. The problem is that my algorithm is too slow. Any help would be appreciated.

Question:

Given numbers a and b such that a <= b: if at each step we add to the current number one of its divisors (other than 1 and the number itself), what is the minimum number of steps required to get from a to b?

a = 4, b = 24 => 4 -> 6 -> 9 -> 12 -> 18 -> 24

This is the code that I’ve written for the above question.

#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    int a, b;
    std::cin >> a >> b;
    // array[i] = minimum number of steps from i to b, or -1 if b is unreachable from i
    std::vector<int> array(b + 1);
    array[b] = 0;
    for (int i = b - 1; i >= a; i--) {
        int min = INT32_MAX;
        for (int j = b; j > i; j--) {
            int difference = j - i;
            // j is one step from i if (j - i) divides i and is neither 1 nor i
            if (i % difference == 0 && min > array[j] && array[j] != -1 && difference != 1 && difference != i) {
                min = array[j];
            }
        }
        if (min == INT32_MAX)
            array[i] = -1;
        else
            array[i] = min + 1;
    }

    std::cout << array[a];
}

As you can see, I’ve used dynamic programming to solve the question. The output is correct, but the code is too slow. Can you help me improve its speed?
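One possible direction, sketched here in Python rather than C++, is a breadth-first search forward from a that enumerates the divisors of each visited number in about sqrt(n) time instead of scanning every j between i and b. This is a different technique from the dynamic programme above, and the function names are made up. It keeps the same conventions: divisors 1 and n are excluded, and -1 means b is unreachable.

```python
from collections import deque


def proper_divisors(n):
    """Divisors of n other than 1 and n, found by trial division up to sqrt(n)."""
    divisors = set()
    d = 2
    while d * d <= n:
        if n % d == 0:
            divisors.add(d)
            divisors.add(n // d)
        d += 1
    return divisors


def min_steps(a, b):
    """Minimum number of 'add a proper divisor' steps from a to b, or -1."""
    if a == b:
        return 0
    steps = {a: 0}
    queue = deque([a])
    while queue:
        n = queue.popleft()
        for d in proper_divisors(n):
            nxt = n + d
            if nxt > b or nxt in steps:
                continue  # overshoots b, or already reached in fewer steps
            steps[nxt] = steps[n] + 1
            if nxt == b:
                return steps[nxt]
            queue.append(nxt)
    return -1


print(min_steps(4, 24))  # 5
```

Every number between a and b is expanded at most once, so the work is roughly (b - a) · sqrt(b) instead of quadratic in b - a.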

algorithms – Bellman Ford Source & Negative Cycles

I’m having some trouble with the Bellman-Ford algorithm in the following exercise: [image of the exercise graph]

My guess is that, in the end, the algorithm should detect the negative cycle BD, DF, FG, GB, since B is still decreasing at the V-th (i.e. 7th) iteration. However, changing the distance of the source does not make much sense to me. Is the source to be considered at a fixed distance of 0, or can it change like all the other vertices? Or is it correct to assume, as I did, that the algorithm will detect a negative cycle in the end?

Moreover, assuming I proceed with the computations, the order I should follow is not clear to me. Since vertex A is not connected, should I go on relaxing from C (I guess not, since it is not directly connected to B)? Or should I go on from D, which is the first vertex in alphabetical order directly connected to B?
Thank you
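For reference, a minimal sketch of the textbook algorithm. In this formulation the source is initialised to 0 but is then relaxed like every other vertex, so its distance can drop if it lies on a negative cycle; an extra pass over the edges reports such a cycle. The edge-list representation and the function name are assumptions, and any graph passed in is the caller's, since the exercise's graph is not reproduced here.

```python
def bellman_ford(vertices, edges, source):
    """Return (dist, has_negative_cycle) for a list of (u, v, weight) edges."""
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    # |V| - 1 rounds; in each round every edge is relaxed once, in list order.
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w  # applies to the source as well
                changed = True
        if not changed:
            break
    # If any edge can still be relaxed, a negative cycle is reachable.
    has_negative_cycle = any(dist[u] + w < dist[v] for u, v, w in edges)
    return dist, has_negative_cycle
```

The order in which edges are relaxed within a round affects the intermediate distances but not the final answer or the cycle detection, so an exercise normally has to fix that order explicitly.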

algorithms – Is transfer learning applied only to similar datasets?

I am trying to build a CNN model for logos of different brands. First, I wrote a CNN from scratch and trained it, which gave me 70% accuracy. I have 40 classes in total, and each class has 100 images. I know the dataset is too small, which is why I am now moving to a pretrained model. I have read on the Internet that transfer learning is applied only to similar datasets: for example, a VGG model trained on cats and dogs can also be used for training on tiger images. Can I use it on a different kind of dataset, as I am doing with MobileNet on brand logos? Will it give good accuracy?

algorithms – How to position images within a canvas while avoiding overlapping

I have a canvas. On this canvas, I want to position 1, 2 or 3 images. The images can be positioned anywhere on the canvas just as long as they’re within the canvas. The images move by their top-left corner. To make sure that the images do not leave the canvas, I am doing this:

image_topLeft_x_position = random.randint(0, (canvas_width - logo_width))

image_topLeft_y_position = random.randint(0, (canvas_height - logo_height))

I can save these locations in a list of tuples like this:

locations = [(image_topLeft_x_position, image_topLeft_y_position)]

After saving a location, I want to generate new locations within the canvas that don’t overlap with the previous ones. However, I’ve been stuck on this problem for well over a day. Any help is appreciated.
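One simple approach is rejection sampling: draw a random position exactly as above, test the resulting rectangle against the ones already placed, and retry on overlap. The sketch below assumes axis-aligned images and integer pixel sizes; the function names and the `max_attempts` limit are made up for illustration.

```python
import random


def rectangles_overlap(a, b):
    """a, b are (x, y, width, height) with (x, y) the top-left corner."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Two rectangles overlap only if they overlap on both axes.
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def place_images(canvas_width, canvas_height, sizes, max_attempts=1000):
    """Place one rectangle per (width, height) in sizes without overlaps."""
    placed = []
    for width, height in sizes:
        for _ in range(max_attempts):
            x = random.randint(0, canvas_width - width)
            y = random.randint(0, canvas_height - height)
            candidate = (x, y, width, height)
            if not any(rectangles_overlap(candidate, other) for other in placed):
                placed.append(candidate)
                break
        else:
            # No free spot found; the canvas may be too crowded.
            raise RuntimeError("could not place image without overlap")
    return placed
```

For one to three images that are small relative to the canvas, a handful of retries is usually enough; if the images nearly fill the canvas, random retries can fail and a deterministic layout would be needed instead.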

algorithms – Characteristics of max-heaps

I have just reached the priority queue section of the algorithms book I am currently reading, which covers max-heaps and min-heaps. The book asks the reader a question about the content of each chapter. In this chapter, it asks the reader to agree or disagree with the following statement and then to answer the question below.

Statement: “Joe made a claim that the minimum element of a max-heap which is binary must be one of its leaf nodes”.

Question: “Which of the following statements did you have to assume to hold true in order to decide decisively whether the statement above is true or false?”
(1) All nodes in the max-heap should be distinct/unique.
(2) There is only one maximum among the nodes

After some thought, I agreed with the statement, since I know that, by the properties of a max-heap, every parent node must be greater than or equal to its children. In order to decide, I had to suppose that (1) is true. Would assuming that (2) is true also be helpful, or even more appropriate than supposing that (1) is true, for reaching a decisive answer?
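The effect of assumption (1) can be explored with a small experiment. The sketch below assumes the usual array layout of a binary heap (children of index i at 2i + 1 and 2i + 2); the function names and the two example heaps are made up.

```python
def is_max_heap(heap):
    """Check that every node is <= its parent in the array layout."""
    return all(heap[(i - 1) // 2] >= heap[i] for i in range(1, len(heap)))


def minimum_positions(heap):
    """Return (internal, leaves): indices holding the minimum value."""
    smallest = min(heap)
    n = len(heap)
    internal = [i for i in range(n) if heap[i] == smallest and 2 * i + 1 < n]
    leaves = [i for i in range(n) if heap[i] == smallest and 2 * i + 1 >= n]
    return internal, leaves


distinct = [9, 5, 8, 1, 3]
with_duplicates = [3, 1, 2, 1]
print(is_max_heap(distinct), minimum_positions(distinct))                # True ([], [3])
print(is_max_heap(with_duplicates), minimum_positions(with_duplicates))  # True ([1], [3])
```

In the second heap there is only one maximum, yet the minimum value also sits at an internal node, because that node's child is equal to it. With distinct values this cannot happen, since every internal node is strictly greater than its children.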

algorithms – How does a CNN deal with rotation invariance in pictures?

I am trying to build a CNN model and train it on images. When we apply a kernel to an image and extract its features, are those features rotation invariant, or do we have to apply some rotation-invariance technique? A few people on Stack Overflow say that max pooling provides rotational invariance; others say that there are rotation-invariant CNN architectures. Please give a solid explanation of how a CNN deals with rotated pictures, and elaborate on the answer.

In machine learning, we use feature extraction techniques such as SIFT and SURF and apply some algorithm to the result; those features are scale and rotation invariant. What about CNNs?

algorithms – Optimization problem over bidirectional connected graph

A company has several automatic vertical warehouses (called elevators). Each elevator has several trays, and each tray has several slots. A slot contains a given quantity of a given article. Elevators, trays and slots are identified by unique IDs, and each slot also has access to the ID of its own article. The same tray can contain multiple slots with the same article.

I have to design an algorithm which, given an order list (we can model it as a dictionary with articles’ IDs as key and quantity needed as values), returns the minimum list of trays needed to satisfy the order.

This is quite different from a standard warehouse optimization problem because we are not considering the physical distances between trays: the elevators are automatic and deliver the tray to us, while in the classical problem it is the human who moves toward the tray to pick the item from it.

A distance function d(a, b) is given, which returns 1 if trays a and b are on different elevators and 2 if they’re on the same one. That may seem counter-intuitive, but remember that these are vertical elevators, so the time needed to change trays on the same elevator is greater than the time to move to a different elevator with the tray already in the bay. Furthermore, if we use several elevators at the same time, we can “parallelize” the picking process (man A picks from elevator 1 while man B picks from elevator 2…). Anyway, this function is given and I cannot change it.

After reading through articles about warehouse optimization, I’ve decided to take a heuristic approach.
I’ve modeled a sort of Euclidean space, with one axis for each article contained in the order. Then we can consider a tray as a point in that space, whose coordinate on each axis equals the quantity that the tray holds of the article corresponding to that axis. In the same way we can imagine our order as a point.
Then I’ve created a heuristic function, f(order o, tray t), which returns the “Euclidean distance” from point-tray t to point-order o. The idea is that the “nearer” a tray is to the order, the more articles we can pick out of it.

So, to satisfy the order, I simply compute f(order o, tray t) for each tray in every elevator. Then I sort the trays by this value and greedily take the tray with the minimum distance from the order. This is repeated until we have collected enough articles to satisfy the order.
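The greedy procedure described above could be sketched roughly as follows. Several details are assumptions made for illustration: orders and tray contents are dicts from article ID to quantity (slots with the same article already summed per tray), the distance is recomputed against the remaining order after each pick, and the function names are made up.

```python
import math


def distance_to_order(order, tray_contents):
    """Euclidean distance between the order point and the tray point,
    measured only along the axes (articles) that appear in the order."""
    return math.sqrt(sum((qty - tray_contents.get(article, 0)) ** 2
                         for article, qty in order.items()))


def greedy_pick(order, trays):
    """Repeatedly take the tray nearest to the remaining order.

    Returns (picked tray IDs, remaining unsatisfied quantities).
    """
    remaining = dict(order)
    available = dict(trays)
    picked = []
    while remaining and available:
        best = min(available, key=lambda t: distance_to_order(remaining, available[t]))
        contents = available.pop(best)
        useful = False
        for article in list(remaining):
            taken = min(remaining[article], contents.get(article, 0))
            if taken:
                useful = True
                remaining[article] -= taken
                if remaining[article] == 0:
                    del remaining[article]
        if useful:
            picked.append(best)  # trays that contribute nothing are skipped
    return picked, remaining
```

A non-empty second return value means the stock in the given trays was not enough to satisfy the order.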

Now I’d like to find a better solution that also takes into account the physical distances between trays returned by function d.

I’ve tried to build a graph in which each node is a tray and is connected to each of the others by a directed edge. The weight of the edge from node i to node j is equal to the physical distance from tray i to tray j, plus the heuristic function computed on tray j:

--> w(i, j) = d(i, j) + f(order, j)
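As an illustration, the edge weights could be computed like this. The dict-based layout, the `elevator_of` mapping (tray ID to elevator ID) passed to `d` as a third argument, and the function names are assumptions; only the formula itself comes from the description above.

```python
import math


def f(order, tray_contents):
    """Euclidean distance from the tray point to the order point."""
    return math.sqrt(sum((qty - tray_contents.get(article, 0)) ** 2
                         for article, qty in order.items()))


def d(tray_a, tray_b, elevator_of):
    """Given distance: 1 for different elevators, 2 for the same elevator."""
    return 1 if elevator_of[tray_a] != elevator_of[tray_b] else 2


def edge_weights(order, trays, elevator_of):
    """w(i, j) = d(i, j) + f(order, j) for every ordered pair of distinct trays."""
    return {(i, j): d(i, j, elevator_of) + f(order, trays[j])
            for i in trays for j in trays if i != j}
```

Because f depends only on the target tray j, w(i, j) and w(j, i) generally differ even though d is symmetric.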

This results in a fully connected, bidirectional graph (each node is linked with every other node by both an incoming and an outgoing edge).
I want to apply some shortest-path algorithm (or any other useful algorithm from graph theory) to this graph, but I couldn’t find anything really helpful. I’ve tried to apply the A* search algorithm (using function d as the gScore and my function f as the heuristic), but it gives me no result. I think A* can’t be applied to such a graph (bidirectional and fully connected).

Is there any algorithm I can apply to such a graph? Or maybe a graph is not the right structure to represent my problem. I’m open to new solutions.