Do you know what a Neural Network is? A Neural Network, also known as an Artificial Neural Network (ANN), is a machine learning structure inspired by the human brain. In humans, information flows between the brain and the rest of the body through neurons; Neural Networks work in a similar way.
Neural Networks consist of an input layer, hidden layers, and an output layer. Each layer has units called neurons. The number of neurons in the input and output layers matches the number of expected inputs and outputs. In a feedforward Network, the information enters the input layer and flows through the hidden layers until it reaches the output layer. Each neuron receives the values from the previous layer and processes them using weights and an activation function. Neural Networks also define the basis of Deep Learning. According to IBM (https://www.ibm.com/topics/neural-networks), a Neural Network with more than three layers is considered Deep Learning.
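To make the per-neuron computation concrete, here is a minimal NumPy sketch of a forward pass through one hidden layer and one output layer; the layer sizes and random weights are purely illustrative:

```python
import numpy as np

def relu(z):
    # Activation function: keeps positive values, zeroes out negatives.
    return np.maximum(0.0, z)

def dense_layer(x, W, b, activation):
    # Each neuron computes a weighted sum of the previous layer's
    # outputs plus a bias, then applies the activation function.
    return activation(x @ W + b)

rng = np.random.default_rng(42)
x = rng.normal(size=(1, 3))                      # one sample, 3 input features
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)    # hidden layer: 4 neurons
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)    # output layer: 2 neurons

hidden = dense_layer(x, W1, b1, relu)
output = dense_layer(hidden, W2, b2, lambda z: z)  # linear output layer
```

The output has one value per output neuron, exactly as described above; in a real network the weights would be learned during training rather than drawn at random.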
Furthermore, it is necessary to specify a cost function and an optimization technique for training the Network. The cost function depends on the application. For example, the mean squared error (MSE) is suitable for regression tasks. Likewise, many optimization methods are available, but the Adam algorithm has become very popular in recent years.
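As a minimal sketch of these choices, a small feedforward regression network can be compiled in Keras with MSE as the cost function and Adam as the optimizer; the layer sizes and the 10 input features are illustrative assumptions:

```python
import tensorflow as tf

# A small feedforward (dense) regression network. The layer sizes
# are illustrative, not tuned for any particular dataset.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),            # 10 input features
    tf.keras.layers.Dense(32, activation="relu"),  # hidden layer
    tf.keras.layers.Dense(1)                       # single regression output
])

# Cost function and optimization technique, as discussed above:
# MSE for regression, optimized with Adam.
model.compile(optimizer="adam", loss="mse")
```

After `compile`, the model is ready for `model.fit(X, y)` on a regression dataset.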
Among the advantages of Neural Networks, the following can be highlighted:
- Neural Networks are flexible and can be applied to different problems.
- Neural Networks usually produce accurate results.
- Neural Networks can handle unorganized data, segregating and categorizing it.
Among the disadvantages of Neural Networks, the following can be highlighted:
- Neural Networks are black box models, i.e., models that don't provide explainable results. Consequently, they are unsuitable in areas where explainability is required.
- Neural Networks usually demand more training time than simpler models.
- Neural Networks require more data for training.
- Neural Networks are more computationally expensive.
In 2015, Google released TensorFlow, a Python library that is very useful for working with Neural Networks. The book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow" by Aurélien Géron presents the main concepts concerning Neural Networks and how to implement them.
My GitHub has a personal implementation of Neural Networks for finance and energy datasets, with hyperparameters optimized using Random Search. The algorithm searches for the best model structure and saves the best model. You can find this implementation at the following link:
- https://github.com/kaikerochaalves/NeuralNetwork.git
There are several types of Neural Networks, as you can see below.
Types of Neural Networks
- Convolutional neural network (CNN):
Convolutional neural networks are networks that implement at least one convolutional layer. A CNN usually consists of a convolutional layer, followed by a pooling layer, and it may have further convolutional layers. The final layer must be fully connected. CNNs are widely applied to image detection, recognition, and computer vision due to their ability to handle large amounts of data and produce accurate results. Furthermore, there is no need for manual feature engineering. However, two drawbacks can be highlighted: CNNs need a large amount of labeled data and are prone to overfitting. My GitHub page has a CNN application for forecasting time series (https://github.com/kaikerochaalves/CNN-convolutional-neural-network-.git).
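The layout described above (convolution, pooling, convolution, fully connected) can be sketched in Keras as follows; the 28x28 grayscale input and the 10 output classes are illustrative assumptions:

```python
import tensorflow as tf

# Convolution -> pooling -> convolution -> fully connected, as in the
# layout above. Filter counts and input shape are illustrative.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),               # grayscale images
    tf.keras.layers.Conv2D(16, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),              # downsamples features
    tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax")         # fully connected output
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

Each convolutional layer learns its own filters, which is why no manual feature engineering is needed.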
- Temporal convolutional network (TCN):
TCN is a specific type of CNN that presents advantages over other networks, such as faster simulations due to parallelism, better control over the memory size, no problems with exploding gradients (the gradient becomes too large) or vanishing gradients (the gradient becomes too small), and lower memory requirements for training. However, transfer learning may not be as simple as with a regular CNN. More information about TCN can be found at datasciencecentral.com. My GitHub page has a TCN application for forecasting time series (https://github.com/kaikerochaalves/TCN-temporal-convolutional-network-.git).
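The core idea behind a TCN, a stack of causal 1D convolutions with growing dilation rates so the receptive field expands without pooling, can be sketched in Keras; the sequence length and filter counts below are illustrative assumptions:

```python
import tensorflow as tf

# Stack of causal 1D convolutions with increasing dilation rates:
# causal padding means each output only sees past time steps, and
# the dilations widen the receptive field without pooling.
model = tf.keras.Sequential([tf.keras.layers.Input(shape=(100, 1))])  # 100 steps, 1 feature
for rate in (1, 2, 4, 8):
    model.add(tf.keras.layers.Conv1D(
        filters=16, kernel_size=2, dilation_rate=rate,
        padding="causal", activation="relu"))
model.add(tf.keras.layers.Conv1D(filters=1, kernel_size=1))  # one forecast per step
```

Because every layer runs as a convolution over the whole sequence at once, training parallelizes well, which is the source of the faster simulations mentioned above.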
- WaveNet:
WaveNet is a type of CNN that stacks convolutional layers, doubling the dilation rate at every layer, and has no pooling layers. WaveNet can generate high-quality speech waveforms. My GitHub page has a WaveNet application for forecasting time series (https://github.com/kaikerochaalves/WaveNet.git).
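A WaveNet-style stack can be sketched in Keras by restarting the doubling dilation sequence a couple of times; all shapes and filter counts here are illustrative assumptions, not taken from the repository:

```python
import tensorflow as tf

# WaveNet-style stack: no pooling layers, and the dilation rate
# doubles at every convolutional layer; the doubling sequence is
# restarted, here twice. The time dimension is left variable.
model = tf.keras.Sequential([tf.keras.layers.Input(shape=(None, 1))])
for rate in (1, 2, 4, 8) * 2:                    # two stacks of doubling dilations
    model.add(tf.keras.layers.Conv1D(
        filters=16, kernel_size=2, padding="causal",
        dilation_rate=rate, activation="relu"))
model.add(tf.keras.layers.Conv1D(filters=1, kernel_size=1))

out = model(tf.zeros((1, 64, 1)))                # 64-step dummy sequence
```

Doubling the dilation rate makes the receptive field grow exponentially with depth, which is how WaveNet covers long audio contexts with relatively few layers.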
- Recurrent neural network (RNN):
RNN is a neural network that models sequential data, such as time series and natural language, and can capture dependencies and relationships within sequences. Another advantage is that the input data don't need to have a fixed size. However, among RNN disadvantages, the following can be highlighted: i) the gradient can become too large (exploding), causing instability; ii) the gradient can become too small (vanishing), limiting long-term relationships; iii) it is difficult to keep past information in very long sequences; iv) RNNs can become biased toward more recent data; and v) parallelization is a challenge for RNNs. My GitHub page has an RNN application for forecasting time series (https://github.com/kaikerochaalves/RNN-recurrent-neural-network-.git).
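A minimal recurrent forecaster in Keras can illustrate the flexible input size: the time dimension is left as None, so sequences of any length are accepted (the unit counts are illustrative):

```python
import tensorflow as tf

# Minimal recurrent forecaster. The time dimension is None, so the
# input sequences need not have a fixed length.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 1)),                # any sequence length
    tf.keras.layers.SimpleRNN(20, return_sequences=True),  # pass full sequence on
    tf.keras.layers.SimpleRNN(20),                         # keep only the last state
    tf.keras.layers.Dense(1)                               # next-value forecast
])
model.compile(optimizer="adam", loss="mse")
```

The same model can be fed windows of 10 or 50 time steps without rebuilding, which is the variable-input-size advantage mentioned above.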
- Long short-term memory (LSTM):
LSTM is a type of RNN used to process time series with unknown intervals. LSTM inherits the advantages of the RNN and can also deal with the vanishing gradient problem present in the RNN. Among the disadvantages, the following can be highlighted: i) higher computational cost than other neural networks, including plain RNNs; ii) LSTMs are more susceptible to overfitting when trained with insufficient data; iii) many hyperparameters to tune; and iv) long training time. My GitHub page has an LSTM application for forecasting time series (https://github.com/kaikerochaalves/LSTM-long-short-term-memory.git).
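A minimal LSTM forecaster can be sketched as below; the 30-step window and 20 units are illustrative assumptions. With standard Keras defaults, an LSTM layer carries four times the weights of a SimpleRNN layer of the same size (one weight set per gate plus the candidate state), which reflects the extra computational cost mentioned above:

```python
import tensorflow as tf

# One-step-ahead LSTM forecaster. Window length and unit count are
# illustrative. The gated cell keeps a separate long-term state,
# which is what mitigates the vanishing gradient problem, at the
# price of 4x the weights of a plain RNN cell of the same size.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(30, 1)),   # 30 past time steps, 1 feature
    tf.keras.layers.LSTM(20),
    tf.keras.layers.Dense(1)                # next-step forecast
])
model.compile(optimizer="adam", loss="mse")
```

For 20 units on one input feature, the LSTM layer alone holds 4 × (20 × (20 + 1) + 20) = 1760 trainable weights.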
- Gated recurrent unit (GRU):
GRU is a type of RNN that addresses the vanishing gradient problem, like LSTM, and presents other advantages in specific cases, such as faster simulations and lower memory usage. However, LSTM is more accurate when using datasets with long sequences. My GitHub page has a GRU application for forecasting time series (https://github.com/kaikerochaalves/GRU-gated-recurrent-unit.git).
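Swapping the recurrent cell is enough to compare the two. The sketch below (illustrative shapes and unit counts) shows that a GRU model carries fewer weights than an LSTM model of the same size, which is where the memory savings come from:

```python
import tensorflow as tf

def forecaster(cell):
    # Same window/forecast setup, different recurrent cell. Shapes
    # are illustrative: 30 past steps of one feature -> next value.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(30, 1)),
        cell,
        tf.keras.layers.Dense(1)
    ])

gru_model = forecaster(tf.keras.layers.GRU(20))
lstm_model = forecaster(tf.keras.layers.LSTM(20))

# The GRU merges the LSTM's gating into fewer gates, so for the same
# number of units it has fewer weights to store and update:
print(gru_model.count_params(), lstm_model.count_params())
```

Fewer weights per unit means less memory and faster training steps, matching the trade-off described above.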