Neural Networks
I first came across the term Neural Networks through a meme implying that they are nothing more than a bunch of if and else statements. In reality, Neural Networks rely on advanced mathematical concepts, which can intimidate beginners seeing them for the first time. Modern programming languages used for artificial intelligence offer ready-to-use libraries that abstract away the mathematics, but that does not mean one should set aside the fundamentals. In this post, I want to talk a little about Neural Networks, a well-known topic nowadays.
Introduction
Neural Networks are information-processing models inspired by the human brain. Like the brain, a neural network is composed of interconnected units called neurons and is capable of acquiring knowledge over time.
Nomenclature
Neuron, Unit or Perceptron
All three terms refer to the basic unit of a Neural Network.
Layers
A group of neurons forms a layer. Neurons in the same layer are not connected to each other, as this would create infinite loops in the forward pass of the network. Layers can be of three types:
Input Layer: This layer provides the input values for the Neural Network.
Hidden Layer: This layer sits between the input and output layers and increases the representational power of the Neural Network, as long as it is well designed.
Output Layer: This layer produces the network's result, such as class scores (arbitrary real-valued numbers) or some kind of real-valued target.
Connections or Synapses
These are the lines or arrows that connect neurons in different layers. Each connection can have a weight associated with it, which scales the signal passing through the connection.
Activation Function
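It is a mathematical function applied to a neuron's weighted sum of inputs plus its bias in order to produce the neuron's output. It introduces nonlinearity, which lets the network model relationships that a purely linear model cannot. The small changes in the values of the weights and biases that bring the network closer to the correct or expected result are made by the learning algorithm, not by the activation function itself.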
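To make the idea of an activation function concrete, here is a minimal sketch of three commonly used ones. The function names are my own choice for illustration and do not come from any particular library.

```python
import math

def step(z):
    """Heaviside step: 1 if the input is non-negative, otherwise 0."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Rectified linear unit: keeps positive values, zeroes out negatives."""
    return max(0.0, z)

# Compare the three functions on a few sample inputs.
for z in (-2.0, 0.0, 2.0):
    print(z, step(z), round(sigmoid(z), 3), relu(z))
```

Each function takes the value a neuron has computed from its inputs and turns it into the neuron's output.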
Bias
Biases are learnable values present in the neurons themselves. A neuron's bias is added to its weighted sum of inputs, shifting the result and helping the neuron reach the expected value.
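The pieces of the nomenclature (inputs, weights, bias, and activation function) can be put together in a small sketch of what a single neuron and a single layer compute. The numeric values are arbitrary and chosen only for illustration, and the sigmoid is just one possible activation function.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, used here as the activation function."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus the bias, passed through the sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

def layer_output(inputs, weight_rows, biases):
    """One output per neuron; each neuron has its own weights and bias."""
    return [neuron_output(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Illustrative values only: 3 inputs feeding a layer of 2 neurons.
inputs = [1.0, 0.5, -1.0]
weight_rows = [[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]]
biases = [0.0, 0.1]
hidden = layer_output(inputs, weight_rows, biases)
print(hidden)
```

Stacking several calls to layer_output, each one fed with the previous result, gives the forward pass of a multi-layer network.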
Types of Training
Supervised
Supervised training is the process in which the input vectors are fed to the network and processed, and the outputs are then compared with the correct answers. The weights of the neurons are modified at each iteration according to the difference between the network's answers and the expected results.
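As a sketch of training against known correct answers, the classic perceptron learning rule is shown below on the logical AND function. This is one simple algorithm chosen for illustration, not the only way to train a network, and the helper names are hypothetical. Integer weights and a learning rate of 1 keep the arithmetic exact.

```python
def predict(weights, bias, x):
    """Step activation on the weighted sum: returns 1 or 0."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0

def train_perceptron(samples, epochs=10, lr=1):
    """Perceptron rule: nudge weights and bias by the prediction error."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)  # -1, 0 or +1
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Each sample pairs an input vector with its correct answer (logical AND).
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print(predictions)  # [0, 0, 0, 1]
```

The weights only change when the prediction disagrees with the expected answer, which is the feedback that defines this kind of training.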
Unsupervised
Unsupervised training occurs when only the input set is provided, without the correct answers, so that the network extracts properties from the data according to its internal representations, without any feedback.
Types of Networks
Hopfield Network
This Neural Network model was inspired by Physics concepts and its main feature is associative memory.
The main applications of Hopfield Networks rely on their ability to restore patterns and correct errors: when presented with an incomplete or corrupted version of a stored pattern, the network converges to the closest memory that was previously stored.
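The restoring behavior can be sketched with a tiny Hopfield Network. This version assumes patterns made of +1 and -1 values, Hebbian weights (each weight is the sum of products of the two connected values across the stored patterns), and updates applied one neuron at a time in a fixed order. All of these are common textbook choices made for illustration.

```python
def train_hopfield(patterns):
    """Hebbian rule: weights[i][j] accumulates p[i] * p[j], no self-connections."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights

def recall(weights, state, sweeps=5):
    """Update each neuron in turn from the weighted sum of all the others."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            state[i] = 1 if field >= 0 else -1
    return state

stored = [1, 1, 1, 1, -1, -1, -1, -1]
weights = train_hopfield([stored])

# Corrupt two positions of the stored pattern and let the network repair it.
noisy = [1, -1, 1, 1, -1, -1, 1, -1]
restored = recall(weights, noisy)
print(restored == stored)  # True
```

With a single stored pattern and only two corrupted positions, the network settles back on the stored memory during the first sweep.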
Kohonen Map
This Neural Network model uses the algorithm developed by Teuvo Kohonen in 1982. It is considered simple and is capable of organizing complex, high-dimensional data into groups (clusters) according to their relationships. The algorithm is known as a self-organizing map (SOM).
Kohonen maps are used in problems where the patterns are unknown or indeterminate.
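The self-organizing idea can be sketched in one dimension. The version below is a heavily simplified illustration: three nodes arranged in a line, scalar inputs, a fixed learning rate, and a fixed neighborhood in which the winner gets a full update and its direct neighbors half of it. Practical implementations usually work on a two-dimensional grid and shrink both the learning rate and the neighborhood over time.

```python
def best_matching_unit(weights, x):
    """Index of the node whose weight is closest to the input x."""
    return min(range(len(weights)), key=lambda k: abs(weights[k] - x))

def neighborhood(k, bmu):
    """Full update for the winner, half for direct neighbors, none otherwise."""
    distance = abs(k - bmu)
    if distance == 0:
        return 1.0
    return 0.5 if distance == 1 else 0.0

def train_som(data, weights, epochs=10, lr=0.5):
    """Pull the winning node and its neighbors toward each input in turn."""
    weights = list(weights)
    for _ in range(epochs):
        for x in data:
            bmu = best_matching_unit(weights, x)
            for k in range(len(weights)):
                weights[k] += lr * neighborhood(k, bmu) * (x - weights[k])
    return weights

# Two unlabeled groups of values: one near 0 and one near 1.
data = [0.0, 0.1, 0.9, 1.0]
som = train_som(data, [0.4, 0.5, 0.6])
clusters = [best_matching_unit(som, x) for x in data]
print(clusters)  # [0, 0, 2, 2]
```

No correct answers are supplied at any point; the two groups emerge only from how the nodes compete for the inputs.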
How We Measure a Neural Network
The two metrics that people commonly use to measure the size of Neural Networks are the number of neurons or, more commonly, the number of parameters.
Consider a fully connected Neural Network with 3 inputs, one hidden layer of 4 neurons, and an output layer of 2 neurons. It has 4 + 2 = 6 neurons (not counting the inputs), [3 x 4] + [4 x 2] = 20 weights, and 4 + 2 = 6 biases, for a total of 26 learnable parameters.
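The same arithmetic can be written as a short function. It assumes a fully connected network described by its layer sizes, with 3 inputs, 4 hidden neurons, and 2 outputs as implied by the weight count; the function name is hypothetical.

```python
def count_parameters(layer_sizes):
    """Return (weights, biases) for a fully connected network.

    layer_sizes lists the number of units per layer, inputs first.
    """
    # Every unit in one layer connects to every unit in the next layer.
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # One bias per neuron; the input layer has none.
    biases = sum(layer_sizes[1:])
    return weights, biases

weights, biases = count_parameters([3, 4, 2])
print(weights, biases, weights + biases)  # 20 6 26
```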
Conclusion
Modern Neural Networks contain on the order of 100 million parameters and are generally composed of approximately 10 to 20 layers, hence the name deep learning.
In the literature it is well accepted that a network with a single hidden layer can approximate any continuous function, given enough neurons (the universal approximation theorem). A commonly cited rule of thumb goes further and holds that two hidden layers can represent essentially any relationship between data, even those that cannot be represented by equations. More than two hidden layers are typically needed only in more complex problems like Time Series and Computer Vision.