Zadeh [1,2] inspired a class of machine learning models, referred to as fuzzy inference systems (FIS), that combine accuracy and interpretability and have been successfully applied in many areas. Fuzzy models are so named because their basic structure is composed of rules that use fuzzy sets to model the data. Each fuzzy rule has two main parts, the antecedent and the consequent: the antecedent models the inputs, and the consequent models the output. There are two main types of fuzzy models: Mamdani [3] and Takagi-Sugeno-Kang (TSK) [4]. They treat the antecedent in the same way but differ in the consequent: Mamdani uses fuzzy sets to calculate the output, whereas TSK uses polynomial functions. Consequently, Mamdani systems provide more interpretable results. There is no formal mathematical definition of interpretability, but Miller [5] defines it as the ability to present or explain to humans the reasons for a given result. TSK, on the other hand, usually yields more accurate results and usually needs fewer rules than Mamdani-based models to describe highly nonlinear and complex systems. The main idea behind TSK is to approximate a nonlinear system by a collection of linear subsystems, estimating the output as a weighted combination of the local output of each rule. Both model types are explained in more detail below.
1. Mamdani Fuzzy System
A Mamdani fuzzy system is a rule-based model that consists of four main components: fuzzifier, rule base, inference engine, and defuzzifier.
- Fuzzification
Fuzzification maps crisp values into fuzzy sets (linguistic values). Given a universe of discourse X, a fuzzy set A is defined by the membership degree $\mu_{A}(x) \in [0,1]$ of each element x, as follows:
\begin{equation}
A = \left\{ (x, \mu_{A}(x)) \mid x \in X \right\}
\end{equation}
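As a minimal sketch of the definition above, a fuzzy set over a discrete universe can be stored as a list of pairs $(x, \mu_{A}(x))$. The universe $\{0, 1, \ldots, 10\}$ and the membership function `close_to_five` are hypothetical choices made only for illustration; they do not come from the source.

```python
def close_to_five(x):
    """Hypothetical membership function: 1 at x = 5, falling linearly to 0 at x = 0 and x = 10."""
    return max(0.0, 1.0 - abs(x - 5.0) / 5.0)


# Hypothetical discrete universe of discourse X = {0, 1, ..., 10}
universe = range(11)

# Fuzzy set A = {(x, mu_A(x)) | x in X}
fuzzy_set = [(x, close_to_five(x)) for x in universe]

for x, mu in fuzzy_set:
    print(f"x = {x:2d}, mu_A(x) = {mu:.1f}")
```

Every crisp value in the universe is paired with a degree of membership in $[0,1]$, which is all that the fuzzifier needs to produce.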
- Fuzzy Inference Engine
The fuzzy inference engine consists of three main steps: fuzzy operator, implication, and aggregation. Each step is explained below:
- Fuzzy operator: In Mamdani fuzzy systems, the input is usually a vector X of values (x) describing different features. Consequently, the model must combine the membership values of the individual features to represent the input X by applying an intersection or union operator, commonly referred to as t-norm (triangular norm) and t-conorm (triangular conorm), respectively. Many t-norms and t-conorms exist, but the most commonly implemented are the minimum and the product for intersection and the maximum for union. For m features with membership values $\mu_{1}, \ldots, \mu_{m}$, they are given by the following equations:
\begin{equation}\label{min}
\mu_{A}(X) = \min_{i \in \{1, 2, \ldots, m\}} \mu_{i}
\end{equation}
\begin{equation}\label{prod}
\mu_{A}(X) = \prod_{i=1}^{m} \mu_{i}
\end{equation}
\begin{equation}\label{max}
\mu_{A}(X) = \max_{i \in \{1, 2, \ldots, m\}} \mu_{i}
\end{equation}
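The three operators above can be sketched in a few lines. The function names and the example membership values are illustrative assumptions; only the minimum, product, and maximum operations themselves come from the equations.

```python
import math


def t_norm_min(memberships):
    """Intersection by the minimum t-norm."""
    return min(memberships)


def t_norm_product(memberships):
    """Intersection by the product t-norm."""
    return math.prod(memberships)


def t_conorm_max(memberships):
    """Union by the maximum t-conorm."""
    return max(memberships)


# Hypothetical membership values of m = 3 features
mu = [0.8, 0.5, 0.9]

print("min    :", t_norm_min(mu))
print("product:", t_norm_product(mu))
print("max    :", t_conorm_max(mu))
```

Note that the product never exceeds the minimum for values in $[0,1]$, so the product t-norm yields the more conservative firing strength of the two intersections.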
- Implication: The membership value resulting from the previous step is used to shape the membership function of the consequent part. Fuzzy sets can be referred to as linguistic values because they can be described in words; for example, fuzzy set 1 (feature 1) can represent small, fuzzy set 2 (feature 2) high, and fuzzy set 3 (output) medium.
- Aggregation: The last step in the fuzzy inference engine is aggregation, which combines the output fuzzy sets resulting from the implication process of all R rules into a single fuzzy set. A commonly used aggregation technique is the maximum over all rule outputs, given in the following equation, but other methods are applicable (e.g., sum, probabilistic OR).
\begin{equation}\label{max_aggregation}
\mu_{Aggregated}(x) = \max \left\{ \mu_{output_{1}}(x), \mu_{output_{2}}(x), \ldots, \mu_{output_{R}}(x) \right\}
\end{equation}
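The implication and aggregation steps can be illustrated on a discretised output universe. This is a sketch under stated assumptions: the text does not name an implication operator, so the minimum (clipping) is assumed here, and the output universe, the consequent sets `low` and `high`, and the firing strengths are all hypothetical values.

```python
# Hypothetical discretised output universe
ys = [0, 2, 4, 6, 8, 10]

# Hypothetical consequent fuzzy sets sampled on ys
low = [1.0, 0.6, 0.2, 0.0, 0.0, 0.0]
high = [0.0, 0.0, 0.2, 0.6, 1.0, 1.0]


def implication_min(consequent, strength):
    """Assumed min implication: clip the consequent set at the rule's firing strength."""
    return [min(mu, strength) for mu in consequent]


def aggregate_max(rule_outputs):
    """Max aggregation: pointwise maximum over the output sets of all rules."""
    return [max(values) for values in zip(*rule_outputs)]


# Rule 1 fires with strength 0.3 and concludes "low";
# rule 2 fires with strength 0.7 and concludes "high".
output_1 = implication_min(low, 0.3)
output_2 = implication_min(high, 0.7)

aggregated = aggregate_max([output_1, output_2])
print(aggregated)
```

The aggregated list is still a fuzzy set over the output universe; turning it into a single number is the job of the defuzzification step.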
- Defuzzification
Finally, defuzzification computes a crisp value from the aggregated fuzzy set; in other words, it transforms the aggregated fuzzy set into a single value. Many defuzzification approaches can be found in the literature, for example in Zimmermann and in Lee, and the method must be chosen based on the application. Chakraverty et al. present algorithms for four well-known defuzzification methods: max-membership, centroid, weighted-average, and mean–max.
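Two of the methods named above can be sketched for a discretised fuzzy set. The sampled values are hypothetical, and the mean-max method is interpreted here as the mean of all points that attain the maximum membership, which is an assumption about the intended definition rather than the algorithm given by Chakraverty et al.

```python
def centroid(ys, mus):
    """Discrete centroid: membership-weighted mean of the output values."""
    total = sum(mus)
    if total == 0:
        raise ValueError("empty fuzzy set: all membership values are zero")
    return sum(y * mu for y, mu in zip(ys, mus)) / total


def mean_of_max(ys, mus):
    """Mean of the output values that attain the maximum membership (assumed definition)."""
    peak = max(mus)
    peaks = [y for y, mu in zip(ys, mus) if mu == peak]
    return sum(peaks) / len(peaks)


# Hypothetical aggregated fuzzy set sampled on a discrete output universe
ys = [0, 1, 2, 3, 4]
mus = [0.0, 0.5, 1.0, 1.0, 0.0]

print("centroid   :", centroid(ys, mus))
print("mean of max:", mean_of_max(ys, mus))
```

The two methods generally disagree on asymmetric sets such as this one, which is one reason the choice of method depends on the application.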
2. Takagi-Sugeno-Kang Fuzzy Model
A TSK fuzzy model consists of a set of fuzzy functional rules that describe a nonlinear system through linear subsystems. The i-th fuzzy rule of a TSK model is described as follows:
\begin{equation}\label{TSrules}
{\cal R}_{i}: \quad \mbox{IF} \quad \underbrace{x \quad \mbox{is } {\cal A}_{i}}_\text{Antecedent} \quad \mbox{THEN} \quad \underbrace{y_{i} = f_{i}(x,\theta_{i})}_\text{Consequent}
\end{equation}
where ${\cal R}_{i}$ is the i-th fuzzy rule, $x$ is the input vector with $p$ attributes, ${\cal A}_{i}$ is the fuzzy set of the i-th fuzzy rule, and $y_{i}$ is the output of the i-th rule, calculated as a function $f_{i}$ of the input and the consequent parameters $\theta_{i}$.
A TSK model consists of two main parts: the antecedent and the consequent. The antecedent part defines the rules and comprises the fuzzification process and the fuzzy inference engine, while the consequent part computes the model's output. The TSK mechanism is depicted in Fig. 1. Unlike Mamdani, TSK has no defuzzification step, because its consequent part contains no fuzzy sets: TSK computes the crisp output directly as a linear function of the input variables. The steps of TSK are described below.
- Fuzzification
Fuzzification maps an input variable (x) into a fuzzy set (linguistic value). A fuzzy set A is characterized by a membership function $\mu_{A}$ that maps each element of X, the universe of discourse, into the interval [0,1]. This concept is described in the following equation:
\begin{equation}\label{FuzzySet}
A = \left\{ (x, \mu_{A}(x)) \mid x \in X \right\}
\end{equation}
Gaussian, triangular, and trapezoidal are three commonly used membership functions in the literature, but others can be found.
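The three membership functions named above can be sketched as follows. The parameterisations (center and spread for the Gaussian, breakpoints for the triangular and trapezoidal shapes) are the usual textbook forms and are assumptions here, since the text does not give formulas.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership: 1 at the center, decaying with spread sigma."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership with feet at a and d and plateau on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical parameters, evaluated at a few crisp inputs
for x in (1.0, 3.0, 5.0, 7.0):
    print(
        x,
        round(gaussian(x, 5.0, 2.0), 3),
        round(triangular(x, 0.0, 5.0, 10.0), 3),
        round(trapezoidal(x, 0.0, 2.0, 6.0, 8.0), 3),
    )
```

The Gaussian is smooth and never exactly zero, whereas the triangular and trapezoidal functions have bounded support, which affects how many rules respond to a given input.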
- Fuzzy inference engine
After calculating the membership values of the input variables for each rule, the model combines these values into a single value per rule, referred to as the firing strength. Given an input vector with $p$ attributes, each rule is composed of $p$ fuzzy sets, one for each attribute, and the rule's firing strength is computed as a function of the $p$ membership values. This combination uses the concepts of t-norm (triangular norm) and t-conorm (triangular conorm), where the t-norm corresponds to the intersection and the t-conorm to the union of the fuzzy sets. Minimum, maximum, and product are three commonly used fuzzy operators.
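A firing strength computation can be sketched as below. The choice of Gaussian membership functions and of the product t-norm is an assumption made for illustration (the text lists them among the common options but does not fix them), and the rule parameters are hypothetical.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership function (assumed shape for this sketch)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def firing_strength(x, centers, sigmas):
    """Product t-norm over the p attribute memberships of a single rule."""
    memberships = [gaussian(xj, c, s) for xj, c, s in zip(x, centers, sigmas)]
    return math.prod(memberships)


# Hypothetical input with p = 2 attributes and two hypothetical rules
x = [1.0, 2.0]
rules = [
    {"centers": [1.0, 2.0], "sigmas": [1.0, 1.0]},  # centered on the input
    {"centers": [2.0, 2.0], "sigmas": [1.0, 1.0]},  # one sigma away in attribute 1
]

strengths = [firing_strength(x, r["centers"], r["sigmas"]) for r in rules]
print(strengths)
```

The rule whose fuzzy sets are centered on the input fires with strength 1, and the strength decays as the input moves away from the rule's center.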
- Consequent part
Finally, the model computes the output in two steps. First, it calculates the local output of each rule according to the following equation:
\begin{equation}\label{RuleOutput}
\hat{y}_{i} = \sum_{j=1}^{p+1} xe_{j} \theta_{i,j} = (xe)^{T} \theta_{i}
\end{equation}
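The local output of one rule can be sketched as an inner product. The text does not define $xe$; since the sum runs over $p+1$ terms, this sketch assumes that $xe$ is the input vector extended with a leading 1 for the bias term, i.e. $xe = [1, x_{1}, \ldots, x_{p}]^{T}$. The position of the 1 and the example parameter values are assumptions.

```python
def rule_output(x, theta):
    """Local linear output of one rule: (xe)^T theta.

    Assumes xe = [1, x_1, ..., x_p], so theta[0] acts as the bias term.
    """
    xe = [1.0] + list(x)
    if len(theta) != len(xe):
        raise ValueError("theta must have p + 1 entries")
    return sum(xe_j * theta_j for xe_j, theta_j in zip(xe, theta))


# Hypothetical input with p = 2 attributes and hypothetical consequent parameters
x = [2.0, 3.0]
theta_i = [1.0, 0.5, -1.0]

# 1.0 * 1.0 + 2.0 * 0.5 + 3.0 * (-1.0)
print(rule_output(x, theta_i))
```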
After that, the model computes the final output as a weighted average of the local outputs, as described in the following equation:
\begin{equation}\label{Output1}
\hat{y} = \sum_{i=1}^{R_{max}} w_{i} \hat{y}_{i}
\end{equation}
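The final output can be sketched as follows. The text does not state how the weights $w_{i}$ are obtained; this sketch assumes the common choice of normalised firing strengths, so that the weights sum to 1 and the result is a weighted average. The firing strengths and local outputs are hypothetical values.

```python
def tsk_output(strengths, local_outputs):
    """Weighted average of the local rule outputs.

    Assumes w_i = strength_i / sum(strengths), i.e. normalised firing strengths.
    """
    total = sum(strengths)
    if total == 0:
        raise ValueError("no rule fired for this input")
    weights = [s / total for s in strengths]
    return sum(w * y for w, y in zip(weights, local_outputs))


# Hypothetical firing strengths and local outputs of two rules
strengths = [0.125, 0.375]
local_outputs = [10.0, 20.0]

# Weights are 0.25 and 0.75, so the output is 0.25 * 10 + 0.75 * 20
print(tsk_output(strengths, local_outputs))
```

Because the weights sum to 1, the output always lies between the smallest and largest local outputs, which is what lets a small set of local linear models blend into a smooth nonlinear mapping.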
REFERENCES
[1] L. A. Zadeh, Fuzzy algorithms, Information and Control 12 (1968) 94–102.
https://doi.org/10.1016/S0019-9958(68)90211-8.
[2] L. A. Zadeh, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Transactions on Systems, Man, and Cybernetics SMC-3 (1) (1973) 28–44.
https://doi.org/10.1109/TSMC.1973.5408575.
[3] E. H. Mamdani, Application of fuzzy algorithms for control of simple dynamic plant, in: Proceedings of the Institution of Electrical Engineers, Vol. 121, IET, 1974, pp. 1585–1588.
https://doi.org/10.1049/piee.1974.0328.
[4] T. Takagi, M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Transactions on Systems, Man, and Cybernetics SMC-15 (1) (1985) 116–132.
https://doi.org/10.1109/TSMC.1985.6313399.
[5] T. Miller, Explanation in artificial intelligence: Insights from the social sciences, Artificial Intelligence 267 (2019) 1–38.
https://doi.org/10.1016/j.artint.2018.07.007.
[6] A new Takagi–Sugeno–Kang model to time series forecasting, Engineering Applications of Artificial Intelligence 133 (2024) 108155.
https://doi.org/10.1016/j.engappai.2024.108155.