How to teach agents to play a video game with reinforcement learning

I'm thinking of building an ML system using reinforcement learning.
The goal is to create one or more agents that can play complex PvP games (such as auto-battler games like Auto Chess or TFT, or maybe even MOBA games). Obviously, I do not have the game's source code. The first problem is teaching the agents how the game works at all: for example, what actions can be performed in a given game state.
At this point, I do not know how to do it.

Does anyone have an idea of how to start?
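In case a concrete starting point helps: the usual approach is to wrap the game behind an environment interface that exposes the state, the legal actions, and a reward signal, so the agent only ever interacts with that wrapper. Below is a minimal, purely illustrative sketch in that style; the game rules, class name, and reward are all invented, not taken from any real game or RL library.

```python
import random

class AutoBattlerEnv:
    """Hypothetical, drastically simplified auto-battler: each turn the
    agent either buys a unit (if it has gold) or passes. A real wrapper
    would read this state from the actual game (API, memory, or screen)."""

    def reset(self):
        self.gold, self.units, self.turn = 3, 0, 0
        return (self.gold, self.units)              # the observation

    def legal_actions(self):
        # this method answers "what actions can be performed in this state"
        return ["buy", "pass"] if self.gold > 0 else ["pass"]

    def step(self, action):
        assert action in self.legal_actions()
        if action == "buy":
            self.gold -= 1
            self.units += 1
        self.turn += 1
        done = self.turn >= 5
        reward = self.units if done else 0          # reward only at game end
        return (self.gold, self.units), reward, done

# a random "agent" playing one episode
env = AutoBattlerEnv()
obs, done, total = env.reset(), False, 0
while not done:
    obs, reward, done = env.step(random.choice(env.legal_actions()))
    total += reward
print("episode return:", total)
```

Once the game sits behind such an interface, standard RL algorithms can be pointed at it; for games you cannot instrument, the wrapper typically reads pixels and simulates mouse/keyboard input instead.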

machine learning – The output of ClassifierFunction depends on $ContextPath

I'm trying to use Classify to predict the head of a Mathematica expression from the heads of its arguments, which may be functions whose heads are not on the current $ContextPath. With the default method "LogisticRegression", the behavior is as expected: the output does not depend on $ContextPath.

a = Test`testA;
b = Test`testB;

clf = Classify[{{a} -> 1, {b} -> 2}];

clf[{a}, "Probabilities"]
Block[{$ContextPath = {"Test`"}}, clf[{a}, "Probabilities"]]
clf[{"foo"}, "Probabilities"]

<|1 -> 0.999679, 2 -> 0.000321061|>
<|1 -> 0.999679, 2 -> 0.000321061|>
<|1 -> 0.999997, 2 -> 3.21752*10^-6|>

However, when I pass Method -> "Markov", the symbol is treated as equivalent to an unknown symbol whenever its context is on $ContextPath.

clf2 = Classify[{{a} -> 1, {b} -> 2}, Method -> "Markov"];

clf2[{a}, "Probabilities"]
Block[{$ContextPath = {"Test`"}}, clf2[{a}, "Probabilities"]]
clf2[{"foo"}, "Probabilities"]

<|1 -> 0.916597, 2 -> 0.0834028|>
<|1 -> 0.5, 2 -> 0.5|>
<|1 -> 0.5, 2 -> 0.5|>

Since SequencePredict always uses Method -> "Markov", it always shows this second behavior.

  • Is this a bug?
  • What is the root cause? Hash does not depend on $ContextPath, and overloading ToString has no effect.
  • Is there an elegant workaround other than manually hashing the input (which does not work well for SequencePredict)?

machine learning – Clustering – Example of complete linkage affected by ties

I am studying unsupervised learning methods (clustering) and have come across the complete-linkage method. I have also seen the following statement:

Unlike single linkage, the complete-linkage method can be strongly affected by ties (cases where two groups/clusters have the same distance value in the distance matrix).

I would like an example where this happens and, if possible, an explanation of why.
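To make the request concrete, here is a tiny self-contained sketch (toy code, not a library implementation; the three-point distances are invented for illustration) of how a tie can change the complete-linkage result:

```python
from itertools import combinations

def complete_linkage(points, d, tiebreak):
    """Agglomerative clustering with complete linkage: merge the pair of
    clusters whose *maximum* pairwise distance is smallest. `tiebreak`
    lists merged clusters in order of preference when distances tie."""
    def dist(c1, c2):
        return max(d[frozenset((x, y))] for x in c1 for y in c2)

    clusters = [frozenset([p]) for p in points]
    history = []
    while len(clusters) > 1:
        pairs = list(combinations(clusters, 2))
        dmin = min(dist(a, b) for a, b in pairs)
        tied = [(a, b) for a, b in pairs if dist(a, b) == dmin]
        # resolve the tie according to the supplied preference order
        def pref(ab):
            merged = tuple(sorted(ab[0] | ab[1]))
            return tiebreak.index(merged) if merged in tiebreak else len(tiebreak)
        a, b = min(tied, key=pref)
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        history.append((tuple(sorted(a | b)), dmin))
    return history

# tie: d(A,B) = d(B,C) = 1, while d(A,C) = 2
d = {frozenset('AB'): 1, frozenset('BC'): 1, frozenset('AC'): 2}
h1 = complete_linkage('ABC', d, tiebreak=[('A', 'B')])
h2 = complete_linkage('ABC', d, tiebreak=[('B', 'C')])
print(h1)  # first merge ('A', 'B'); second merge at height 2
print(h2)  # first merge ('B', 'C'); second merge at height 2
```

Note how the asymmetry with single linkage shows up here: under single linkage the second merge would also happen at distance 1, so the tie is harmless; under complete linkage the second merge happens at distance 2, so cutting the dendrogram anywhere between 1 and 2 yields either {A,B},{C} or {A},{B,C} depending purely on how the tie was resolved.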

machine learning – Image Recognition Algorithm

I know this is a very subjective and broad question, but I am just trying to take a step in the right direction.

How would one implement a product or brand recognition algorithm for images?

For example: I take a picture of a MacBook with my phone and it recognizes that it is a MacBook (ideally it would also determine the exact model of the computer), or I take a photo of headphones and the algorithm outputs the matching model, e.g. Sony XX-100 (ideally this should work for both boxed and unboxed items).

So my questions are:
1) Is this even possible?
2) If yes, where would one start? (I am not asking for step-by-step instructions, just a general overview.)

Thanks in advance, and I apologize if this question does not quite follow the rules of this forum.
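As a sketch of the general shape such systems take: one common approach is to embed each image with a pretrained CNN and recognize products by nearest-neighbor lookup against a catalog of reference embeddings. The snippet below fakes the embeddings (the catalog names and vectors are invented) to show only the retrieval step:

```python
import math

# Hypothetical catalog: in a real system each vector would come from a
# pretrained CNN feature extractor applied to reference product photos.
catalog = {
    "macbook_14": [0.9, 0.1, 0.0],
    "headphones_xx100": [0.1, 0.8, 0.3],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def recognize(query_embedding):
    # nearest catalog entry by cosine similarity
    return max(catalog, key=lambda name: cosine(catalog[name], query_embedding))

# a query embedding close to the laptop's reference vector
print(recognize([0.85, 0.2, 0.05]))  # → macbook_14
```

The hard part in practice is the embedding itself (typically a network fine-tuned so that photos of the same product land close together); the retrieval step above stays essentially this simple even with millions of catalog items.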

machine learning – How can I get a NetTrainResultsObject after NetTrain with a NetGraph?

Apparently, the result of NetTrain can be a NetTrainResultsObject that I should be able to query with properties like "ValidationLoss". After training, my object seems to be stuck in the default mode of returning only the final net (the "FinalNet" property). I did not see anything in the documentation that forbids what I am trying to do. Any advice? Thank you!

What follows does not work:

net = NetGraph[{layers}, {NetPort["Input1"], NetPort["Output1"], connections}, "Input1" -> {1, 83}]

trained = NetTrain[NetInitialize[net], ..]

after successful training:

trained["ValidationLoss"]

NetGraph::invindata2: The data provided to port "Input1" was not a 1 * 83 matrix of real numbers (or a list of them).
$Failed

machine learning – Running LSTM on multiple columns separately

I have an LSTM neural network that I use to test its prediction capability, and it works for one column. Now I want to run it on multiple columns (for different items) and calculate the ABSE for each column. For example, if I have two columns:

(image: a table with two columns, Item1 and Item2)

It must calculate the ABSE function for each column separately.

My code below fails. Can someone help me?

This is what I tried, but I get a ValueError:

ValueError: non-broadcastable output operand with shape (1,1) doesn't 
match the broadcast shape (1,2)

This is happening on the line:

 ---> 51     trainPredict = scaler.inverse_transform(trainPredict)

The code:

def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return numpy.array(dataX), numpy.array(dataY)

def ABSE(a, b):
    ABSE = abs((b-a)/b)
    return numpy.mean(ABSE)

columns = df[['Item1', 'Item2']]

for i in columns:
    # normalize the dataset
    scaler = MinMaxScaler(feature_range=(0, 1))
    dataset = scaler.fit_transform(dataset)
    # split into train and test sets
    train_size = int(len(dataset) * 0.5)
    test_size = len(dataset) - train_size
    train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
    look_back = 1
    trainX, trainY = create_dataset(train, look_back)
    testX, testY = create_dataset(test, look_back)
    trainX = numpy.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
    testX = numpy.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
    # create and fit the LSTM network
    model = Sequential()
    model.add(LSTM(1, input_shape=(1, look_back)))
    model.compile(loss='mean_squared_error', optimizer='adam')
    model.fit(trainX, trainY, epochs=1, batch_size=1, verbose=0)
    # make predictions
    trainPredict = model.predict(trainX)
    testPredict = model.predict(testX)
    # invert predictions
    trainPredict = scaler.inverse_transform(trainPredict)
    trainY = scaler.inverse_transform([trainY])
    testPredict = scaler.inverse_transform(testPredict)
    testY = scaler.inverse_transform([testY])
    # calculate ABSE for this column
    trainScore = ABSE(trainY[0], trainPredict[:, 0])
    print('Train Score: %.2f ABSE' % (trainScore))
    testScore = ABSE(testY[0], testPredict[:, 0])
    print('Test Score: %.2f ABSE' % (testScore))
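For what it is worth, the ValueError itself can be reproduced without Keras: a MinMaxScaler-style object fit on a 2-column array stores one (min, max) pair per column, and therefore rejects the 1-column array that model.predict returns. Below is a pure-Python sketch of that mismatch; TinyMinMax is a toy stand-in for illustration, not sklearn's class.

```python
class TinyMinMax:
    """Toy stand-in for MinMaxScaler: stores one (min, max) per column."""
    def fit_transform(self, rows):
        cols = list(zip(*rows))
        self.lo = [min(c) for c in cols]
        self.hi = [max(c) for c in cols]
        return [[(v - l) / (h - l) for v, l, h in zip(r, self.lo, self.hi)]
                for r in rows]

    def inverse_transform(self, rows):
        if len(rows[0]) != len(self.lo):
            raise ValueError(f"expected {len(self.lo)} columns, got {len(rows[0])}")
        return [[v * (h - l) + l for v, l, h in zip(r, self.lo, self.hi)]
                for r in rows]

data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]

scaler = TinyMinMax()
scaler.fit_transform(data)                 # fit on BOTH columns at once
try:
    scaler.inverse_transform([[0.5]])      # 1 column, like a model's output
    error = None
except ValueError as e:
    error = str(e)
print(error)  # → expected 2 columns, got 1

# the shape-compatible alternative: fit one scaler per column
col_scaler = TinyMinMax()
scaled = col_scaler.fit_transform([[r[0]] for r in data])
restored = col_scaler.inverse_transform([[scaled[0][0]]])
print(restored)  # → [[1.0]]
```

The last two lines mirror fitting a fresh MinMaxScaler on each column's data inside the loop, so that inverse_transform sees the same number of columns the scaler was fit on.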

machine learning – Retrieve a boolean vector from scalar products


I want to determine a boolean vector $b \in \{0,1\}^n$ composed of zeros and ones, but I cannot access it directly. I can only call a black-box computer code that takes the dot product of $b$ with a real-valued vector $v \in \mathbb{R}^n$ of my choice. Namely, access to $b$ is available only through evaluations of the map
$$ v \mapsto b^T v. $$
How can I recover all the entries of $b$ using as few of these dot products as possible? (Maybe even just one dot product?)

Below, I detail some ideas I had that might work in theory but do not work in practice (I think). Concretely, we can assume that $n \approx 1 \text{ million}$ and that the arithmetic is done in double-precision floating point. This question emerged as a sub-problem in a machine learning application.

Idea 1:

One idea I had is to use a vector with fast-growing entries. Say, for example, $n = 9$. Then we could use the vector
$$ v = \begin{bmatrix} 1 & 10 & 100 & 1000 & \dots \end{bmatrix}^T. $$
We could then read off $b$ as the digits of $b^T v$. The problem with this solution is that the entries grow so quickly that, in finite-precision arithmetic, this does not work for large $n$.
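Idea 1 can, however, be salvaged in finite precision by splitting the vector into blocks: with powers of 2 instead of 10, up to about 50 bits fit exactly into one double-precision dot product, so roughly $n/50$ queries suffice instead of one. A self-contained sketch (the oracle here is simulated, since the real black box is not available):

```python
import random

def dot(b, v):
    # stand-in for the black-box oracle v -> b^T v
    return sum(bi * vi for bi, vi in zip(b, v))

def recover(n, oracle, bits_per_query=50):
    """Recover b exactly using ceil(n / bits_per_query) dot products.
    Each query packs a block of bits into the binary digits of the result;
    sums of distinct powers of two below 2^53 are exact in doubles."""
    b = []
    for start in range(0, n, bits_per_query):
        width = min(bits_per_query, n - start)
        v = [0.0] * n
        for j in range(width):
            v[start + j] = float(2 ** j)   # 1, 2, 4, ...: all exact
        s = int(oracle(v))
        b.extend((s >> j) & 1 for j in range(width))
    return b

random.seed(0)
b_true = [random.randint(0, 1) for _ in range(137)]
b_rec = recover(137, lambda v: dot(b_true, v))
print(b_rec == b_true)  # → True
```

For $n \approx 10^6$ this is about 20,000 queries; whether that counts as "few" depends on the cost of one oracle call, but unlike the base-10 version it is exact at any $n$.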

Idea 2:

Another idea I had was to use a vector with algebraically independent entries. Then determining $b$ from $b^T v$ is a subset-sum problem.

For example, if $n = 3$ and
$$ v = \begin{bmatrix} \pi & e & 1 \end{bmatrix}^T, $$
then $b^T v$ takes one of finitely many values,
$$ b^T v \in \{0, ~\pi, ~e, ~1, ~\pi + e, ~\pi + 1, ~e + 1, ~\pi + e + 1\}. $$
We can determine which of these cases occurred, thereby determining $b$.

But this seems rather combinatorial, and therefore intractable for large $n$.

Does the weighted max-cut problem have applications in machine learning? If so, what are they?

At first, I imagined that the weighted max-cut problem (WMCP) could be useful for binary classifiers, but since the standard WMCP has no constraints of the form "these groups of nodes must end up on opposite sides of the cut", it seems that this is not the case.

deep learning – Where do the filters of the convolutional layers come from?

My question is about the kernels, or filters, of the convolutional layers in a CNN. We can specify the size of the filters and the number of filters in a convolutional layer, but we never specify the filter values themselves. I know that a filter is a matrix that is slid across the image.

I wonder whether there is an algorithm that generates the filters and optimizes them during training. Thank you.
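Short answer: yes, the filter values are learned. They are initialized (pseudo-)randomly and then updated by gradient descent on the training loss, exactly like the weights of a dense layer. A toy illustration of that principle, with a 1-D filter, an invented target, and no ML library:

```python
import random

random.seed(1)
target = [1.0, 0.0, -1.0]   # hypothetical "true" filter hidden in the data
w = [random.uniform(-0.5, 0.5) for _ in range(3)]   # random initialization

def conv(x, w):
    # valid 1-D convolution (really cross-correlation, as in CNNs)
    return [sum(w[k] * x[i + k] for k in range(3)) for i in range(len(x) - 2)]

lr = 0.05
for step in range(2000):
    x = [random.uniform(-1, 1) for _ in range(8)]   # random training signal
    y_true = conv(x, target)
    y_pred = conv(x, w)
    # gradient of the mean squared error with respect to each filter weight
    for k in range(3):
        g = sum(2 * (yp - yt) * x[i + k]
                for i, (yp, yt) in enumerate(zip(y_pred, y_true))) / len(y_true)
        w[k] -= lr * g

print([round(v, 2) for v in w])  # converges to approximately [1.0, 0.0, -1.0]
```

In a real framework this is what, e.g., Keras's Conv2D does for you: the `kernel_initializer` argument draws the starting values and the optimizer updates them during `fit`; useful filters (edge detectors, texture detectors, ...) emerge from the optimization rather than being hand-specified.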

machine learning – CNN predicts only one class and the accuracy stays stuck

My model is a binary classifier.

With the exact same architecture, the model sometimes reaches high accuracy (90%, etc.); at other times it predicts only one class (so the accuracy stays locked at a single value the whole time); and on other occasions the loss value is "nan" (too big or too small for the loss to be a number, I guess).

I have tried simplifying my architecture (down to 2 Conv2D layers and 2 dense layers), seeding the kernel initializers with a fixed random seed, and changing the learning rate, but none of these actually fixes the inconsistency: they may help the model train once with great accuracy, but if I run it again without changing any code, I get a very different result (an unchanging accuracy because it predicts only one class all the time, or a "nan" loss).

How can I solve these problems:
1. The model making unchanging predictions for the entire run (predicting only one class all the time).
2. Inconsistent, non-reproducible results (the above problems come and go without any code modification).
3. Randomly getting "nan" loss values. (How can I get rid of them permanently?)

Thank you!!!!
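On point 3 specifically, one classic mechanism for "nan" losses is a learning rate large enough that the updates diverge, overflow to infinity, and then produce nan through inf minus inf. The toy gradient-descent run below (on f(x) = x², nothing to do with the actual model above) shows that mechanism in isolation:

```python
def descend(lr, steps=2000):
    """Gradient descent on f(x) = x^2, whose gradient is 2x."""
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x
    return x

print(descend(0.1))   # converges toward 0
print(descend(1.5))   # each step doubles |x|: overflow to inf, then nan
```

Lowering the learning rate (or clipping gradients) removes this particular failure mode; the stuck one-class accuracy and the run-to-run variance are usually attacked separately, by fixing all random seeds and checking the class balance of the training data.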