Machine Learning – Are fully connected layers necessary in convnets?

I'm trying to create a reinforcement learning model for a grid-based game. One feature of the game is that the board can get larger mid-game, and I know this is the kind of problem machine learning does not handle well. One idea I had was to train a convolutional neural network with no fully connected layer at the end, so that the size of the network's input and output can vary together and the model still works, because it needs no additional weights to produce an output. In addition, adding one or two fully connected layers at the end would mean a very large number of additional weights to train, because the action space of the problem is very large (15,232 discrete actions if I remove the part of the game that expands). If my calculations are correct, a single fully connected layer would have 464 million additional weights, which seems like a lot compared to my 223,350 trainable parameters.
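The counts in the paragraph above can be checked with a little arithmetic. The following is only a sketch: the 28 by 16 board and the 34 action channels come from the post, but `HIDDEN_CHANNELS = 68` is an assumption, chosen because it happens to reproduce the roughly 464 million figure, and the real network may differ. The helper names are hypothetical. For comparison, it also counts the weights of a 1x1 convolutional output head, which does not depend on the board size.

```python
# Board and action dimensions taken from the post.
WIDTH, HEIGHT = 28, 16
ACTION_CHANNELS = 34

# ASSUMPTION: 68 channels in the last hidden feature map. This number is not
# stated in the post; it is picked because it reproduces the ~464M figure.
HIDDEN_CHANNELS = 68


def dense_weights(in_features: int, out_features: int) -> int:
    """Weights in a fully connected layer (biases excluded)."""
    return in_features * out_features


def conv_weights(in_channels: int, out_channels: int, kernel: int = 1) -> int:
    """Weights in a 2D convolution (biases excluded); no board-size term."""
    return kernel * kernel * in_channels * out_channels


# Every (channel, x, y) cell of the action tensor is one discrete action.
num_actions = ACTION_CHANNELS * WIDTH * HEIGHT

# A dense head must connect every flattened hidden feature to every action.
hidden_features = HIDDEN_CHANNELS * WIDTH * HEIGHT
fc_head = dense_weights(hidden_features, num_actions)

# A 1x1 convolutional head only mixes channels at each board cell.
conv_head = conv_weights(HIDDEN_CHANNELS, ACTION_CHANNELS, kernel=1)

print(f"discrete actions:      {num_actions:,}")
print(f"dense head weights:    {fc_head:,}")
print(f"1x1 conv head weights: {conv_head:,}")
```

The dense count grows with the square of the board area, while the convolutional count stays fixed when the board expands.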

Here is an iteration of this type of network that I tested:
[Image: Convolutional network architecture without fully connected layers]
For this model, the observation space is 92 layers (channels) deep, 28 wide, and 16 high. The action space is 34 layers deep, with the same width of 28 and height of 16. This network does not give impressive results (although they are better than random choices), which could be due to a multitude of reasons. However, I do not know whether the design itself is a bad idea, because I find it hard to reason about this kind of architecture in general.
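To make the size-independence argument concrete, here is a small sketch that tracks the output shape and parameter count of a stack of stride-1, "same"-padded convolutions. The layer list is hypothetical: only the 92 input channels and 34 output channels come from the post, and the hidden widths and 3x3 kernels are placeholders, not the tested architecture.

```python
from typing import List, Tuple

# HYPOTHETICAL layer list: (in_channels, out_channels, kernel_size).
# Only the 92 input and 34 output channels come from the post.
LAYERS: List[Tuple[int, int, int]] = [(92, 64, 3), (64, 64, 3), (64, 34, 3)]


def param_count(layers: List[Tuple[int, int, int]]) -> int:
    """Trainable parameters (weights plus biases) of the conv stack."""
    return sum(k * k * cin * cout + cout for cin, cout, k in layers)


def output_shape(
    layers: List[Tuple[int, int, int]], height: int, width: int
) -> Tuple[int, int, int]:
    """(channels, height, width) after stride-1 convs with padding k // 2."""
    channels = layers[0][0]
    for cin, cout, k in layers:
        assert cin == channels, "channel mismatch between consecutive layers"
        pad = k // 2  # 'same' padding for odd kernel sizes
        height = height + 2 * pad - k + 1
        width = width + 2 * pad - k + 1
        channels = cout
    return channels, height, width


# The same weights serve the original board and a larger one.
for h, w in [(16, 28), (20, 32)]:
    print((h, w), "->", output_shape(LAYERS, h, w), "params:", param_count(LAYERS))
```

Under these assumptions the parameter count never involves the board height or width, so the output grid simply follows the input grid as the board grows.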

Is a fully connected layer at the end necessary for the network to learn global ideas about the observation space?