The range of applications of AI and neural networks is currently expanding rapidly. In particular, neural networks are widely used in computer vision, in image and video enhancement, and in text analysis. In most cases, however, they are treated as a "black box", with no understanding of their internal mechanisms, which is unacceptable when neural networks are applied to critical tasks. This research direction is devoted to analyzing the training of neural networks using methods of information theory. The idea is to estimate two mutual information quantities: between the output of a hidden layer of the network and the class label (this estimate shows how well the hidden layer has learned the target) and between the output of the hidden layer and the input data (this estimate shows how well the network has learned to discard insignificant features). Analyzing these quantities shows how well the neural network is trained. The main difficulty lies in estimating mutual information between high-dimensional vectors.
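As a minimal sketch of the idea, the two quantities can be approximated with a plug-in (binning) estimator: each dimension of the hidden-layer output is discretized into bins, each binned activation vector is mapped to a discrete symbol, and mutual information is computed from the empirical joint distribution of symbols and labels. This is an illustrative assumption, not the method the text prescribes; the function names (`discretize`, `discrete_mi`) and the choice of 10 bins are hypothetical, and the estimator's bias grows quickly with dimension, which is exactly the difficulty the text mentions.

```python
import numpy as np

def discretize(t, n_bins=10):
    # Bin each activation dimension into n_bins equal-width bins,
    # then map each binned activation vector to a single discrete symbol.
    edges = np.linspace(t.min(), t.max(), n_bins + 1)[1:-1]
    binned = np.digitize(t, edges)
    _, symbols = np.unique(binned, axis=0, return_inverse=True)
    return symbols

def discrete_mi(a, b):
    # Plug-in estimate of I(A; B) in nats from paired discrete samples.
    joint = np.zeros((a.max() + 1, b.max() + 1))
    np.add.at(joint, (a, b), 1.0)          # joint counts
    pab = joint / joint.sum()              # empirical joint distribution
    pa = pab.sum(axis=1, keepdims=True)    # marginal of A
    pb = pab.sum(axis=0, keepdims=True)    # marginal of B
    nz = pab > 0                           # skip zero-probability cells
    return float(np.sum(pab[nz] * np.log(pab[nz] / (pa @ pb)[nz])))

# Hypothetical usage: T is the hidden-layer output (n_samples, n_units),
# X the (discretized) input, y the integer class labels.
# I(T; Y) = discrete_mi(discretize(T), y)
# I(T; X) = discrete_mi(discretize(T), discretize(X))
```

For a perfectly informative hidden layer (e.g. one-hot activations matching balanced binary labels), the estimate recovers I(T; Y) = H(Y) = log 2; for a constant layer it is exactly zero.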