linear algebra – Efficiently populate a SparseArray from a set of rules for a constrained basis

I want to populate quite a large SparseArray ($10^6 \times 10^6$) efficiently. It is basically a spin-system Hamiltonian with a constrained Hilbert space. Unlike the examples I have looked at in this forum, this introduces a problem: when using BitXor to emulate terms like $\sigma_x$, in many cases the result is a value that is not in the Hilbert space, which means I have to search the basis to find the position it corresponds to before storing it in the SparseArray. Let me first show the code I have written.

First to generate the constrained Hilbert space. I am not going to explain the constraint in detail because that may be confusing, but the following code generates it efficiently enough for me so one can assume this is the Hilbert space.

N1 = 24;
X1 = Select[Range[0, 2^N1 - 1], BitAnd[#1, BitShiftLeft[#1, 1]] == 0 &];
X1 = Select[X1, BitGet[#1, 0]*BitGet[#1, N1 - 1] == 0 &];
len = Length[X1];

X1 stores the allowed basis vectors and len stores the length of X1. Now the Hamiltonian term which I want to calculate is $\sigma_x$, or in terms of creation and annihilation operators, $\tau \sum_l^{N_1} (d_l^{\dagger} + d_l)$.
To generate this I wrote the following code,

SetAttributes[f, Listable];
Needs["Combinatorica`"];
g[j_, k_] := BinarySearch[X1, BitXor[X1[[j]], 2^k]];
f[i_] := (s1 = Mod[i, len] + 1; s2 = IntegerPart[i/len]; k1 = g[s1, s2];
    If[IntegerQ[k1], {s1, k1} -> Tau]);
Print[AbsoluteTiming[Q1 = SparseArray[DeleteCases[f[Range[0, len*N1 - 1]],
        Null]];]];

This takes approximately 217 seconds on my i7 machine, whereas similar code written with ordinary loop constructs in Fortran takes about a second to evaluate. I expect Mathematica to take longer than a second, of course, since it is not a low-level language, but a gap of almost two orders of magnitude suggests there is a faster way of doing this. Can anyone help me find it?

I actually need to go to sizes an order of magnitude larger than in the example, hence the need for a speedup. While I could generate this data in Fortran and import it into Mathematica from a file, I want to keep that as a last resort after exhausting all possibilities for a speedup here.

Edit: In the slow code snippet I do the following. I take a basis vector and apply a BitXor gate one by one to all the bit positions, i.e. from 0 to N1-1. Each application gives a new integer, which may or may not be present in the Hilbert space. I check whether it is present, and find its position in the basis array, using BinarySearch. If it is present, I create a rule {x1, x2} -> tau, where x1 is the position of the basis vector I started with and x2 is the position of the basis vector I get after the operation. If it is not present, a Null is returned, which I delete before building the SparseArray. I try to make this operation listable using the function f: with Range I encode both the basis-vector position and the bit position in a single value so as to vectorize the code, and the Mod and IntegerPart operations inside f then split that value back into the basis-vector position and the bit position.
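The procedure described in this edit (flip a bit, binary-search the sorted constrained basis, keep the hit) can be sketched outside Mathematica as well. Here is a rough Python equivalent using bisect for the binary search; the function and variable names are illustrative, not from the original code.

```python
# Sketch of the construction described above: for each basis state, flip each
# bit in turn, binary-search the sorted basis for the result, and record a
# matrix element tau when the flipped state is still in the constrained space.
from bisect import bisect_left

def build_entries(basis, n_sites, tau):
    """basis: sorted list of allowed states. Returns {(row, col): tau}, 0-based."""
    entries = {}
    for row, state in enumerate(basis):
        for bit in range(n_sites):
            flipped = state ^ (1 << bit)              # BitXor with 2^bit
            col = bisect_left(basis, flipped)
            if col < len(basis) and basis[col] == flipped:
                entries[(row, col)] = tau             # the rule {x1, x2} -> tau
    return entries
```

Since a bit flip is its own inverse, the resulting entry set is symmetric, which makes for a quick sanity check on any implementation.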

python – Compute integral of spline by chunk more efficiently

I have a program that alternates between two phases, REST and REG. During each phase, two positive quantities, a and d, are computed. At the end of a REG phase, I compute a score based on the previous REG and REST phases. This computation is very clumsy, and I would like a better, faster implementation (in particular, I would like to get rid of the for loop).

Generate sample data:

#%% Imports
from scipy.interpolate import InterpolatedUnivariateSpline
import numpy as np
from matplotlib import pyplot as plt

#%% Generate random data
timestamps_REST = np.linspace(0, 10, 30, endpoint=True)
a_REST = np.random.rand(30)
d_REST = np.random.rand(30)

timestamps_REG = np.linspace(11, 45, 80, endpoint=True)
a_REG = np.random.rand(80)
d_REG = np.random.rand(80)

#%% Spline interpolation
a_REST_spline = InterpolatedUnivariateSpline(timestamps_REST, a_REST, k=1)
a_REG_spline = InterpolatedUnivariateSpline(timestamps_REG, a_REG, k=1)
d_REST_spline = InterpolatedUnivariateSpline(timestamps_REST, d_REST, k=1)
d_REG_spline = InterpolatedUnivariateSpline(timestamps_REG, d_REG, k=1)

In the sample data generated above, the quantities a and d are estimated at:

  • 30 points equally spaced for the REST phase
  • 80 points equally spaced for the REG phase

The REST and REG phase durations/lengths differ: 0 to 10 for the REST phase and 11 to 45 for the REG phase.

In practice, both the sampling rate and the duration are not constant. This is why I decided to use an interpolated spline to compute the integral of each signal later on.

Visualize/Plots

To visualize the data, you can use the snippet below:

#%% Plot
f, ax = plt.subplots(2, 2, figsize=(10, 5))
ax[0, 0].set_title('REST', fontsize=20)
ax[0, 1].set_title('REG', fontsize=20)
for a in ax.flatten():
    a.set_yticks([])
ax[0, 0].set_ylabel('a', fontsize=20)
ax[1, 0].set_ylabel('d', fontsize=20)

# Spline x-axis
plot_spline_REST_timestamps = np.linspace(0, 10, 2000, endpoint=True)
plot_spline_REG_timestamps = np.linspace(11, 45, 6800, endpoint=True)

# a REST
ax[0, 0].plot(timestamps_REST, a_REST, 'ro', ms=5)
ax[0, 0].plot(plot_spline_REST_timestamps, a_REST_spline(plot_spline_REST_timestamps), 'b', lw=2)
ax[0, 0].fill_between(plot_spline_REST_timestamps, a_REST_spline(plot_spline_REST_timestamps), color='b', alpha=0.3)

# d REST
ax[1, 0].plot(timestamps_REST, d_REST, 'ro', ms=5)
ax[1, 0].plot(plot_spline_REST_timestamps, d_REST_spline(plot_spline_REST_timestamps), 'teal', lw=2)
ax[1, 0].fill_between(plot_spline_REST_timestamps, d_REST_spline(plot_spline_REST_timestamps), color='teal', alpha=0.3)

# a REG
ax[0, 1].plot(timestamps_REG, a_REG, 'ro', ms=5)
ax[0, 1].plot(plot_spline_REG_timestamps, a_REG_spline(plot_spline_REG_timestamps), 'b', lw=2)
ax[0, 1].fill_between(plot_spline_REG_timestamps, a_REG_spline(plot_spline_REG_timestamps), color='b', alpha=0.3)

# d REG
ax[1, 1].plot(timestamps_REG, d_REG, 'ro', ms=5)
ax[1, 1].plot(plot_spline_REG_timestamps, d_REG_spline(plot_spline_REG_timestamps), 'teal', lw=2)
ax[1, 1].fill_between(plot_spline_REG_timestamps, d_REG_spline(plot_spline_REG_timestamps), color='teal', alpha=0.3)

sample data

Score computation: code to optimize

As said, I compute a score based on the integrals. The idea is to compute the score as:

max(0, Area(a_REG) - Area(a_REST)) + max(0, Area(d_REST) - Area(d_REG))

where Area represents the colored area in the plot above, i.e. the area between the curve and 0, i.e. the integral. In simpler terms, the score increases if:

  • the area in blue in the REG phase is larger than the area in blue in the REST phase
  • the area in teal in the REG phase is smaller than the area in teal in the REST phase

N.B: The integral computation can be changed to another method if need be.

Now, instead of considering the entire duration at once, I want to compute this score by chunk, and to increase the score at the end of each chunk. Moreover, the baseline, which previously was the area during the REST phase, is now computed by taking the median of the chunk areas during the REST phase.

The chunked version, below, needs some optimization.

#%% Score
score = 0

# compute baseline for REST phase
rest_chunks = np.linspace(timestamps_REST[0], timestamps_REST[-1], num=11, endpoint=True)
rest_baselines = [[], []]
for k, _ in enumerate(rest_chunks):
    if k == 0:
        continue

    rest_baselines[0].append(a_REST_spline.integral(rest_chunks[k-1], rest_chunks[k]) / (rest_chunks[k] - rest_chunks[k-1]))
    rest_baselines[1].append(d_REST_spline.integral(rest_chunks[k-1], rest_chunks[k]) / (rest_chunks[k] - rest_chunks[k-1]))

rest_baseline = (np.median(rest_baselines[0]), np.median(rest_baselines[1]))

# Score per chunk
reg_chunks = np.linspace(timestamps_REG[0], timestamps_REG[-1], num=21, endpoint=True)
for k, _ in enumerate(reg_chunks):
    if k == 0:
        continue

    # alpha
    score += max((a_REG_spline.integral(reg_chunks[k-1], reg_chunks[k]) / (reg_chunks[k] - reg_chunks[k-1])) - rest_baseline[0], 0) * 50 * 10**13

    # delta (uses the d baseline, rest_baseline[1])
    score += max(rest_baseline[1] - (d_REG_spline.integral(reg_chunks[k-1], reg_chunks[k]) / (reg_chunks[k] - reg_chunks[k-1])), 0) * 50 * 10**13

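One possible loop-free rewrite (a sketch, not a drop-in replacement: it regenerates the sample data so it runs standalone, and it relies on the spline's antiderivative to get all chunk integrals in one vectorized call):

```python
# Vectorized chunk scoring: integrate each chunk as a difference of the
# spline's antiderivative, then apply median/clipping with NumPy.
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

rng = np.random.default_rng(0)
timestamps_REST = np.linspace(0, 10, 30)
timestamps_REG = np.linspace(11, 45, 80)
a_REST_spline = InterpolatedUnivariateSpline(timestamps_REST, rng.random(30), k=1)
d_REST_spline = InterpolatedUnivariateSpline(timestamps_REST, rng.random(30), k=1)
a_REG_spline = InterpolatedUnivariateSpline(timestamps_REG, rng.random(80), k=1)
d_REG_spline = InterpolatedUnivariateSpline(timestamps_REG, rng.random(80), k=1)

def chunk_means(spline, chunks):
    """Mean spline value over each interval [chunks[k-1], chunks[k]]."""
    F = spline.antiderivative()        # F' = spline, so each integral is F(b) - F(a)
    return np.diff(F(chunks)) / np.diff(chunks)

rest_chunks = np.linspace(timestamps_REST[0], timestamps_REST[-1], num=11)
reg_chunks = np.linspace(timestamps_REG[0], timestamps_REG[-1], num=21)

a_baseline = np.median(chunk_means(a_REST_spline, rest_chunks))
d_baseline = np.median(chunk_means(d_REST_spline, rest_chunks))

score = (np.maximum(chunk_means(a_REG_spline, reg_chunks) - a_baseline, 0).sum()
         + np.maximum(d_baseline - chunk_means(d_REG_spline, reg_chunks), 0).sum()) * 50 * 10**13
```

Evaluating the antiderivative at all chunk edges replaces the per-chunk `.integral()` calls, and `np.maximum` replaces the per-chunk `max(..., 0)`.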
Finally, you can notice above that the chunk size depends on the phase duration and on the num parameter of the np.linspace() function. Instead, I would much prefer to define a chunk_size setting: if the duration is larger than chunk_size, the algorithm cuts the signal into chunks of size chunk_size, with the last chunk possibly being shorter. As this is not really an optimization aspect, let's call this a bonus to this problem.
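For the bonus, one way to build chunk edges from a chunk_size setting (a sketch; make_chunks is a hypothetical helper name, and the last chunk comes out shorter whenever the duration is not a multiple of chunk_size):

```python
import numpy as np

def make_chunks(t_start, t_end, chunk_size):
    """Edges of consecutive chunks of width chunk_size; the last may be shorter."""
    edges = np.arange(t_start, t_end, chunk_size, dtype=float)  # excludes t_end
    return np.append(edges, t_end)                              # close the last chunk
```

For example, make_chunks(0, 10, 3) gives edges [0, 3, 6, 9, 10]; the result can be passed wherever rest_chunks/reg_chunks were used above.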

algorithms – Efficiently enumerating all “good” strings given the ability to say whether a partial specification can be good

Suppose that I want to enumerate all English language words of length 5.
If I’ve got nothing more than a check of whether an arbitrary string is an English word, I have to do 26^5 checks.
However, suppose that I can check a partial specification to see if there are any words consistent with it.
By a partial specification I mean an assignment of some letters to some positions, with the others free to vary. E.g. “is there a word that starts with “HO” and ends with “Y” (but has any of the 26^2 configurations of the third and fourth letters)?”

An obvious algorithm would be to formulate it as a 26-ary tree and traverse it depth-first, stopping once a pattern is deemed not “good” (e.g. as far as I’m aware there are no five-letter English words that start with “AAA”, so we can skip checking “AAAAA”, “AAAAB”, …, “AAAZZ”).

Is it possible to do any better (average, worst case)? How about under additional assumptions on the distribution of letters (any such learning would need to be part of the algorithm, since my actual problem is not about natural language)? As another view on the second question: what sorts of heuristics are helpful?
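For the prefix special case of a partial specification, the depth-first pruned traversal described above can be sketched as follows. Here has_completion is a hypothetical oracle answering "does any good string start with this prefix?"; the general problem also allows fixing non-prefix positions, which this sketch does not exploit.

```python
# Depth-first enumeration with pruning: extend a prefix one letter at a time
# and descend only when the oracle says some good string completes it.
def enumerate_good(alphabet, length, has_completion, prefix=""):
    if len(prefix) == length:
        yield prefix
        return
    for c in alphabet:
        candidate = prefix + c
        if has_completion(candidate):      # otherwise prune the whole subtree
            yield from enumerate_good(alphabet, length, has_completion, candidate)
```

The number of oracle calls is proportional to the number of tree nodes with at least one good completion, rather than to the full 26^5 leaves.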

opengl – Rendering overlapping normal map textures to a 2d scene efficiently

I am using modern OpenGL to render a 2D, non-grid/tiled world map. I’ve generated some simple normal-map textures to render over the base world map to provide terrain elevation/detail shading. The terrain is not tiled (it is triangulated from noise), so the majority of these terrain elevation features can overlap. This is good, as it gives a more continuous appearance to mountains etc.

However the normal map shader needs to sample the base terrain color and apply the lighting value to it before returning the output color. So I can’t render two overlapping normal textures without the second ‘cutting out’ the first where they overlap (the second texture render cannot see the output of the first).

My solution to this was to use a 3-pass render-to-texture via an FBO. I attempt to divide the normal textures’ destination locations into 3 non-overlapping groups and render 1 group in each pass. This works up to a point, but of course where a texture overlaps more than 3 neighbours the cut-out problem remains.

I could just increase the number of passes/groups to 4, 5, 6… and perhaps this will resolve most cut out issues. While this would probably still provide reasonable performance on my system I am guessing there is a limit where integrated graphics cards may struggle.

Is there an alternative solution for this that could scale better to lower end systems? Or is perhaps even a 5 or 6 pass full-screen render viable on even integrated graphics these days?

Note:

The first render pass draws the entire 2D scene (minus terrain shading) and is obviously the slowest, but it happens only once. The remaining passes make a screen size copy of the previous pass to a second texture (rendering to a screen sized quad) and then render normal textures to this copy using the first texture as input. Adding more passes would only repeat the screen sized texture copy and normal texture rendering.

search algorithms – Find dominated or subsumed linear inequalities efficiently

Given a set of $N$ linear inequalities of the form $a_1x_1 + a_2x_2 + \dots + a_Mx_M \geq RHS$, where the $a_i$ and $RHS$ are integers. Inequality $A$ dominates or subsumes inequality $B$ if all its coefficients are less than or equal to those of $B$ and $RHS_A \geq RHS_B$. Most inequalities are sparse, i.e. most coefficients are zero. Usually, both $N > 1000$ and $M > 1000$. I’m trying to identify dominating or subsuming inequalities efficiently.

What are quick ways to find
a) if an inequality is dominated by any other inequality, and
b) which inequalities are dominated by a given inequality?

(Monte Carlo algorithms that only report correct results most of the time are fine.)

Eén, N. and Biere, A., 2005, June. Effective preprocessing in SAT through variable and clause elimination. In International conference on theory and applications of satisfiability testing (pp. 61-75). Springer, Berlin, Heidelberg. (chapter 4.2) discuss a variant of this problem in the context where all coefficients are binary, i.e. $a_i \in \{0, 1\}$. They propose occurrence lists, one per variable $x_i$, containing all inequalities with non-zero coefficients $a_i$. Additionally they use a hashing scheme, similar to Bloom filters, to quickly eliminate candidates. I don’t see how to translate this to non-binary coefficients.

Achterberg, T., Bixby, R.E., Gu, Z., Rothberg, E. and Weninger, D., 2020. Presolve reductions in mixed integer programming. INFORMS Journal on Computing, 32(2), pp.473-506. (chapter 5.2) discuss a variant of the problem, but don’t solve it. They first hash inequalities by indices of non-zero coefficients and the coefficient values. Additionally, they limit their search to a small subset of inequalities.

Schulz, S., 2013. Simple and efficient clause subsumption with feature vector indexing. In Automated Reasoning and Mathematics (pp. 45-67). Springer, Berlin, Heidelberg. describes Feature Vectors for clauses and a kd-tree-like data structure to query combinations of feature vectors.

  • The trivial solution is to check all pairs of inequalities in $O(N^2)$. Unfortunately, that’s too slow for my application, where I have millions of inequalities.
  • I tried performing a random projection of the coefficients of every inequality, resulting in a projection $p_j$ for every inequality. An inequality $j$ can only dominate inequality $k$ if $p_j \leq p_k$, so we don’t have to check all pairs. I repeat this process 10 times with multiple random projections and use the random projection for which I have to check the fewest pairs. In practice this is not effective: most coefficients are zero, and it’s unlikely that a random projection focuses exactly on the few non-zero elements. It doesn’t help nearly enough.
  • Similarly I implemented Feature Vectors, but couldn’t replicate the performance reported by Schulz.
  • AFAIK, multi-dimensional data structures break down in high-dimensional scenarios (curse of dimensionality). I’m not aware of indexing techniques that work for high-dimensional range queries.
  • I thought about Bloom filters, unsuccessfully.
  • I thought about randomized algorithms, unsuccessfully.

Do you have any other ideas?
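One idea (my own sketch, not from the cited papers) that generalizes sign-based filtering to integer coefficients: derive a cheap necessary condition from coefficient signs. If $A$ dominates $B$, every variable where $A$'s coefficient is positive must also be positive in $B$, and every variable where $B$'s coefficient is negative must also be negative in $A$. Bitmask signatures over variable indices then rule out most pairs before the exact coefficient-wise check. Representation and names below are illustrative.

```python
# Sign-signature prefilter for domination: a necessary condition checkable
# with two bitwise operations, followed by the exact sparse comparison.
def signatures(ineq):
    """Bitmasks of variable indices with positive / negative coefficients."""
    pos = neg = 0
    for i, a in ineq["coeffs"].items():     # sparse dict {var_index: coeff}
        if a > 0:
            pos |= 1 << i
        elif a < 0:
            neg |= 1 << i
    return pos, neg

def may_dominate(sig_a, sig_b):
    """Necessary condition: positive support of A within B's, and
    negative support of B within A's."""
    pos_a, neg_a = sig_a
    pos_b, neg_b = sig_b
    return (pos_a & ~pos_b) == 0 and (neg_b & ~neg_a) == 0

def dominates(a, b):
    """Exact check: every coefficient of A <= that of B, and RHS_A >= RHS_B."""
    if a["rhs"] < b["rhs"]:
        return False
    for i in set(a["coeffs"]) | set(b["coeffs"]):
        if a["coeffs"].get(i, 0) > b["coeffs"].get(i, 0):
            return False
    return True
```

Signatures can be bucketed or sorted so that only pairs passing may_dominate are compared exactly; how much this prunes depends on how varied the sparsity patterns are.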

git – How to calculate lead time to deploy most efficiently?

I want to calculate the lead time to deploy of a team in Azure DevOps.

Lead time is the amount of time it takes for a commit to get into production. Thus the code must reach the master branch before deployment.

However, following every commit from its original feature branch into master, from where it gets deployed, sounds very clumsy.

So my question is: is my complicated approach necessary, or is there a more efficient way of calculating the timespan between the first commit on a feature branch and the merge into master that this commit transitively becomes part of?

Thank you very much in advance.

How to organize photos efficiently (in terms of ‘naming’, ‘tagging’ and FOSS)?

I have thousands of photos in multiple directories. Any tool or idea to organize them properly will be helpful.

Here are the details:

I had been ignorant all along about organizing these until recently, when I tried to search for a particular picture. I am now using a bash script to find and delete duplicates (via md5 hashes of the data) and ImageMagick's identify to extract EXIF:DateTimeOriginal. My image file names look like yyyymmdd_serialNo.extension. For example,

20210101_1.JPG

Most of the pictures taken on my DSLR and iPhone have this info, except for the very old ones. Even fdupes can’t detect the duplicates among those, so I could not but rely on my visual capabilities (a pain, as you can imagine). Any thoughts on improving what I am doing? I’d appreciate it.

java – How do I efficiently read random lines from a TXT or CSV file?

Context of the problem:

  1. I have made a chess GUI (Java)
  2. The GUI is capable of loading chess puzzles/problems to solve
  3. Of said puzzles, I have gotten my hands on a database which is just shy of a million entries

The problem/question:
How does one most efficiently go about getting a random puzzle from this database?

Obviously keeping the database in memory is not an option, despite it already being compressed. I stripped all data that isn’t needed from it and in the end converted it into byte arrays, all to no avail. The full database always ends up taking somewhere between 100 and 200 MB of memory. A tenth of that would be acceptable (for my purposes).

Even worse: when processing the entire database (in an attempt to keep it all in memory), the file-to-memory process took upwards of 700 MB of memory.

Let’s say I have the following:

  • The database as either a txt or csv file
  • The number of lines in said file (which equals the number of puzzles)

With that, am I in some way capable of grabbing either a random or a specific puzzle from the file (albeit asynchronously, that doesn’t matter) without having to load the entire database into memory?

Thanks for your time!