Abstract

Precise and reliable hydrological runoff prediction plays a significant role in the optimal management of hydropower resources. Nevertheless, hydrological runoff exhibits nonlinear dynamics in practice, and constructing runoff prediction models that can deal with this nonlinearity is a challenging task. To overcome this difficulty, this paper proposes a novel three-stage hybrid model, namely, CVS (CEEMDAN-VMD-SVM), which couples the support vector machine (SVM) with a two-stage signal decomposition methodology combining complete ensemble empirical mode decomposition with additive noise (CEEMDAN) and variational mode decomposition (VMD) to obtain inclusive information of the runoff time series. Hydrological runoff data of the Swat River, Pakistan, from 1961 to 2015 were taken for prediction. CEEMDAN decomposes the runoff time series into subcomponents, and VMD further decomposes the high-frequency component obtained from CEEMDAN to improve the prediction performance. Afterward, the SVM algorithm is applied to the decomposed subcomponents for prediction. Finally, four statistical indices are utilized to measure the performance of the CVS model compared with other models including CEEMDAN-VMD-MLP (multilayer perceptron), CEEMDAN-SVM, VMD-SVM, CEEMDAN-MLP, VMD-MLP, SVM, and MLP. During the training period, the CVS model reduces RMSE by 71.28% and 40.06% compared with the MLP and CEEMDAN-VMD-SVM models, respectively. During the testing period, it reduces RMSE by 68.37% and 35.33% compared with the MLP and CEEMDAN-VMD-SVM models, respectively. The results highlight that the CVS model outperforms the other models in terms of accuracy and error reduction. The research also highlights the superiority of the other hybrid models over the standalone models in predicting hydrological runoff. Therefore, the proposed hybrid model is applicable to the nonlinear features of runoff time series and is feasible for future planning and management of water resources.

1. Introduction

Managing water resources is crucial for several reasons, such as the development of future water bodies, the efficient exploitation of hydropower for power generation and irrigation purposes, the prevention of disputes, and the protection of existing water bodies from overexploitation and pollution [1, 2]. Hydrological runoff prediction is an important area in managing water resources, and accurate runoff planning and prediction can prove to be a feasible measure. Runoff and rainfall prediction depend on nonlinear factors including precipitation, uneven flow, topography, anthropic activities, and evaporation, which make the runoff prediction task a great challenge [3, 4].

Process-driven and data-driven approaches are the two main approaches for runoff prediction. The data-driven approaches are becoming more popular for extracting accurate predictions owing to their rapid growth, the increasing power of computation, and their lower information needs compared with the process-driven approaches [5]. However, these artificial intelligence- (AI-) based models exhibit drawbacks such as the sensitivity of SVM to parameter selection and the overfitting problems faced by artificial neural networks (ANNs) [6]. Furthermore, preprocessing of the input and/or output data is required to enable these models to handle nonstationary data [7]. The direct use of the input signal for modeling in AI-based models may not deliver acceptable results; however, the model performance can be improved by applying a preprocessing technique [8]. Appropriate data preprocessing techniques are required to eliminate noise and extract the trend from hydrological time series [9].

To overcome the inadequacies of the data-driven models and to obtain more reliable and accurate predictions, hybrid models have been proposed for the prediction of hydrological time series [10, 11]. By employing data-driven approaches, hybrid techniques can obtain inclusive and correct information about the different parameters, with the additional benefit of improved prediction accuracy. Moreover, these techniques can detect the periodicity, volatility, and random nature of runoff processes [12].

Statistical models are widely employed for modeling of runoff time series [13]. The main disadvantages of these models are the requirements of stationarity and linearity of the runoff data. They also require a specific time series data length for a robust prediction result [14]. Therefore, the modeling competency of statistical models is limited by their linear nature when predicting runoff time series, which exhibit a highly nonlinear and nonstationary nature [15, 16]. Machine learning (ML) models are appropriate for the nonstationarity and nonlinearity exhibited by the runoff time series, overcome the constraints of the statistical time series models, and achieve better performance and accuracy than the conventional statistical models [17, 18]. A comprehensive evaluation of machine learning techniques is provided in [19–21]. SVM and MLP are popular ML techniques in the field of runoff prediction. SVM is a very efficient and robust algorithm that has been applied in numerous runoff prediction studies due to its better performance. Moreover, SVM offers excellent generalization ability and promising results compared with other machine learning methods for hydrological runoff prediction [22–25].

SVM does not suffer from the problem of local minima and requires less computation time than ANN; therefore, there are fewer chances of overfitting and poor prediction results compared to ANN [26–28]. SVM obtains the best compromise between learning capability and model complexity, based on limited model information, to obtain the best result [29, 30]. Furthermore, global optimization can be used to improve the parameters of SVM, which results in better prediction performance than ANN [31]. The MLP represents an advanced version of ANN and is more popular among hydrologists than other ANNs [32]. Many research works have used MLP for runoff prediction [33–36]. The previous research mentioned above highlights the superior performance of SVM for runoff prediction; therefore, in the present study, SVM has been chosen as the final stage to accomplish the task of runoff prediction.

In the field of runoff prediction, hybrid methods coupled with ML techniques were proposed to improve the prediction accuracy and to obtain better management of data [37, 38]. Hybrid ML methods in the runoff prediction field offer advantages of automated and timely performance evaluation and management of the ensemble algorithms [32]. The research in [39, 40] presents a review of the applications of the hybrid ML methods for runoff-rainfall prediction.

Decomposition techniques can be applied as data preprocessing tools to study the nonlinear and nonstationary characteristics of runoff series, such as ensemble empirical mode decomposition (EEMD) and variational mode decomposition (VMD) [17, 41, 42]. Time series decomposition finds successful implementation in improving the performance of ML methods used for runoff modeling. The decomposition method decomposes the original time series into several individual components; afterward, the ML models are employed for prediction purposes [17]. The components obtained as a result of employing an effective decomposition method are much easier to evaluate than an original time series [43].

The hydrological time series has been analyzed by many researchers by employing wavelet transform (WT) due to its excellent performance in conditions with multiple resolutions in frequency and time domains [44, 45]. The WT represents a Fourier transform with an adjustable window requiring a stable signal in the WT window. Consequently, WT is prone to the restrictions of the Fourier transform. Although WT provides high resolution in both the time domain and the frequency domain, some false harmonic waves are produced during WT due to certain limitations of this method. Therefore, the selection of WT basis functions is crucial due to its significant influence on the process of the wavelet decomposition. Empirical mode decomposition (EMD) was proposed to overcome the limitations of the WT [46]. EMD decomposes the trend components or the multiscale fluctuation in the signal to smoothen the signal and generates intrinsic mode functions along with a residual. EMD technique reflects a more accurate representation of the nonlinearity and nonstationary in the original series compared to the WT technique. Therefore, EMD is considered as a more efficient way to process complex signals than the WT. The hydrological time series in classical hydrology can be considered as a combination of random, periodic, and trend components. The high-frequency and low-frequency components along with the residual obtained through EMD technique in the case of perfect decomposition can be approximated as random and periodic components along with the trend [47, 48]. EMD finds successful implementations in the hydrological research [49, 50], but EMD exhibits problems such as the mode mixing of IMFs and the orthogonality effect which affects the precision of prediction and the performance of EMD. Therefore, EEMD was developed to resolve the issues and to lessen the impact of mode mixing as faced by EMD [51, 52]. 
Nevertheless, the limitations of mode mixing of some signals and the end effect still exist in EEMD-based techniques [53]. Complete ensemble empirical mode decomposition with additive noise (CEEMDAN) is an advanced technique that overcomes the issues faced by EMD and EEMD like mode mixing and computational complexity, respectively. It is possible to achieve the reconstruction error close to zero by utilizing the CEEMDAN technique and by requiring fewer integration times, with the addition of adaptive white noise at each step [43]. However, CEEMDAN is also unable to completely resolve the issue faced by EEMD, such as the presence of residual noise in the modes and appearance of signal information later than in EEMD with specious modes in the initial decomposition stages [54].

VMD is another adaptive and nonrecursive signal analysis technique which, unlike empirical mode techniques, decomposes the original series into multiple modes and updates them [55]. VMD is more robust to noise and sampling, with outstanding performance in frequency search and separation. VMD can alleviate the mode mixing problem and extract the time-frequency features precisely by yielding narrow-banded modes [56]. VMD is a comparatively new technique for hydrological applications [17], and relatively few research works exist regarding applications of VMD for runoff prediction.

A single-layer hybrid model consisting of a decomposition technique and a machine learning method is one of the most frequently employed methods to analyze time series. Hybrid models based on a single-layer decomposition technique can enhance the predictive performance for nonlinear time series to some extent but are unable to fully capture the nonlinearity and nonstationarity of the original signals. Consequently, hybrid models based on a two-layer decomposition methodology are employed to overcome the limitations of the single-layer decomposition technique [43]. Therefore, this study proposes a three-stage hybrid model based on CEEMDAN, VMD, and SVM and examines its applicability to the runoff time series. The first decomposition stage employs the CEEMDAN technique and decomposes the runoff series into random, periodic, and trend components, intending to improve the prediction of nonlinear and nonstationary monthly runoff series. The VMD is proposed as an additional decomposition technique to diminish stochastic behaviors, noise, and trends of the data. Finally, the SVM algorithm predicts the monthly runoff data series.

The main objectives of the present research are as follows:
(1) The development of an ML and signal decomposition-based hybrid model by taking the hydrological runoff data of the Swat River, Pakistan.
(2) Applicability of the hybrid model for the runoff prediction.
(3) Verifying the performance and accuracy of the proposed model by comparing results with similar models used to predict the runoff time series.

The rest of the paper is arranged as follows. The modeling techniques and the proposed approach are described in Section 2. The results and discussion are presented in Section 3, while Section 4 concludes the research work. The research will be useful for prediction and planning purposes and will provide new directions in the field of hydrology.

2. Materials and Methods

2.1. Decomposition Techniques
2.1.1. CEEMDAN

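In CEEMDAN, the information regarding noise is shared between all workers, as opposed to EEMD, to efficiently solve the mode mixing issue of EMD [57]. The CEEMDAN technique achieves a near-zero reconstruction error by adding a finite number of adaptive white noises at every stage, with a smaller average number of integration times. This enables CEEMDAN to avoid the mode mixing and the computational complexity issues [43]. The CEEMDAN process proceeds as follows:

Step 1: create the original time series with added noise:

$$x^{i}(t) = x(t) + \varepsilon_{0}\, w^{i}(t), \quad i = 1, 2, \ldots, I,$$

where $x(t)$ is the original series, $w^{i}(t)$ is the $i$-th white-noise realization, and $\varepsilon_{0}$ is the noise amplitude.

Step 2: use CEEMDAN to get the first IMF for each $x^{i}(t)$ and take the average:

$$\overline{IMF}_{1}(t) = \frac{1}{I}\sum_{i=1}^{I} IMF_{1}^{i}(t).$$

The first residual is $r_{1}(t) = x(t) - \overline{IMF}_{1}(t)$. Step 2 is similar to EMD.

Step 3: CEEMDAN gets the second and the remaining IMFs by decomposing the residual with the added noise as shown in the following:

$$\overline{IMF}_{2}(t) = \frac{1}{I}\sum_{i=1}^{I} E_{1}\left(r_{1}(t) + \varepsilon_{1} E_{1}\left(w^{i}(t)\right)\right),$$

where $E_{1}(\cdot)$ represents the first IMF decomposed from a signal by EMD. Similarly, the $k$-th IMF and the residual can be calculated as

$$\overline{IMF}_{k}(t) = \frac{1}{I}\sum_{i=1}^{I} E_{1}\left(r_{k-1}(t) + \varepsilon_{k-1} E_{k-1}\left(w^{i}(t)\right)\right), \qquad r_{k}(t) = r_{k-1}(t) - \overline{IMF}_{k}(t).$$

Step 4: CEEMDAN obtains numerous IMFs and computes the residual as shown in the following:

$$R(t) = x(t) - \sum_{k=1}^{K} \overline{IMF}_{k}(t).$$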

Based on VMD, this study introduces a second decomposition of IMF1, owing to the unpredictability and the highest frequency of IMF1.
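The recursion in Steps 1 to 4 can be illustrated with a minimal Python sketch. It is only a structural illustration under stated assumptions: the mode extractor `first_mode` is a hypothetical stand-in (the signal minus a centered moving average) and is not true EMD sifting, and the names `kth_mode` and `ceemdan_sketch` are introduced here for illustration. What the sketch does show faithfully is the ensemble averaging over noise realizations and the residual update, which make the reconstruction error vanish by construction.

```python
import math
import random


def first_mode(x, window=3):
    """Stand-in for the operator E_1: x minus its centered moving average.

    Assumption for illustration only; real CEEMDAN uses EMD sifting here.
    """
    n, half = len(x), window // 2
    out = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        out.append(x[t] - sum(x[lo:hi]) / (hi - lo))
    return out


def kth_mode(x, k, window=3):
    """Operator E_k: peel first modes off x repeatedly and return the k-th one."""
    residual = list(x)
    mode = first_mode(residual, window)
    for _ in range(k - 1):
        residual = [r - m for r, m in zip(residual, mode)]
        mode = first_mode(residual, window)
    return mode


def ceemdan_sketch(x, n_modes=3, n_realizations=20, eps=0.2, seed=0):
    """Return (list of averaged modes, final residual) following Steps 1 to 4."""
    rng = random.Random(seed)
    n = len(x)
    noises = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_realizations)]
    imfs, residual = [], list(x)
    for k in range(1, n_modes + 1):
        acc = [0.0] * n
        for w in noises:
            # Step 1 uses raw noise; Step 3 uses the (k-1)-th mode of the noise.
            pert = w if k == 1 else kth_mode(w, k - 1)
            noisy = [r + eps * p for r, p in zip(residual, pert)]
            acc = [a + m for a, m in zip(acc, first_mode(noisy))]
        imf = [a / n_realizations for a in acc]  # ensemble average
        imfs.append(imf)
        residual = [r - m for r, m in zip(residual, imf)]  # r_k = r_{k-1} - IMF_k
    return imfs, residual


signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.1 * t) for t in range(60)]
modes, trend = ceemdan_sketch(signal, n_modes=3, n_realizations=10)
rebuilt = [sum(m[t] for m in modes) + trend[t] for t in range(len(signal))]
max_error = max(abs(a - b) for a, b in zip(rebuilt, signal))
```

Because each residual is defined as the previous residual minus the averaged mode, summing the modes and the final residual recovers the input up to floating-point error, regardless of which mode extractor is plugged in.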

2.1.2. Variational Mode Decomposition

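VMD is a quasi-orthogonal and adaptive decomposition method, where the modes are obtained nonrecursively [55]. It approximates the corresponding modes concurrently and determines the relevant band adaptively [53]. The VMD can be expressed as [55]

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t),$$

where $\{u_k\}$ and $\{\omega_k\}$ represent expressions related to all modes along with their central frequency, and $f(t)$ is the signal being decomposed.

Furthermore, $\delta$ and $*$ represent Dirac distribution and convolution, respectively. The term of the quadratic penalty, weighted by $\alpha$, and the Lagrangian multiplier $\lambda$ are used to convert this constrained optimization problem to an unconstrained one [58]:

$$L\left(\{u_k\},\{\omega_k\},\lambda\right) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle.$$

The above equation can be resolved using different approaches, and the two stages of the equation are given as follows:

(i) $u_k$ minimization:

$$\hat{u}_k^{n+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha\left(\omega - \omega_k\right)^2}$$

(ii) $\omega_k$ minimization:

$$\omega_k^{n+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_k(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_k(\omega) \right|^2 d\omega}$$

where $n$ denotes the number of iterations and $\hat{f}(\omega)$, $\hat{u}_i(\omega)$, $\hat{\lambda}(\omega)$, and $\hat{u}_k^{n+1}(\omega)$ show the Fourier transform of $f(t)$, $u_i(t)$, $\lambda(t)$, and $u_k^{n+1}(t)$, respectively.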

The VMD technique relies on three fundamental concepts: Wiener filtering, frequency mixing and heterodyne demodulation, and the analytic signal and Hilbert transform. The original signal is decomposed into IMFs that reproduce the original signal with different sparsity features. In contrast to the earlier decomposition techniques, VMD relies on the alternate direction method of multipliers (ADMM) for the reconstruction process [53]. VMD utilizes the principle of the variational mode to obtain the IMFs, thereby minimizing the sum of the estimated bandwidth of each IMF, which makes this technique different from EMD. The bandwidth and center frequency of the IMFs are revised in the course of solving the variational model. The frequency domain of the signal results in adaptive segmentation of the signal band, and additionally, the IMF obtained has a narrow band [31].

The number of intrinsic modes determines how well the data are resolved for an acceptable prediction model of a time series; therefore, the determination of the number of intrinsic modes is vital in the VMD process. If fewer intrinsic mode components than required are chosen, the original time series dataset cannot be fully characterized. On the contrary, an excess of intrinsic modes may result in poor performance because the errors of the individual prediction units add up in the accumulation stage [59]. Nevertheless, the IMFs generated by the VMD process are usually smoother than the mode functions obtained by other techniques like EEMD and wavelet transforms [60]. This lessens the accumulation of error over time. Another important aspect of VMD is the selection of several parameters, which requires trial and error methods [3].
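The two alternating updates at the heart of VMD (the Wiener-filter-like mode update and the center-of-gravity frequency update) can be sketched on a discretized, one-sided spectrum. This is a hedged illustration rather than a full VMD implementation: the function names, the toy spectrum, and the discretization of the integrals as sums are assumptions made here, and the outer ADMM loop and the multiplier update are omitted.

```python
def update_mode(f_hat, others_hat, lam_hat, freqs, omega_k, alpha):
    """One mode update in the frequency domain.

    f_hat      : spectrum of the signal at each frequency bin
    others_hat : sum of the spectra of all other modes at each bin
    lam_hat    : spectrum of the Lagrangian multiplier at each bin
    """
    return [
        (f - o + lam / 2.0) / (1.0 + 2.0 * alpha * (w - omega_k) ** 2)
        for f, o, lam, w in zip(f_hat, others_hat, lam_hat, freqs)
    ]


def update_center(u_hat, freqs):
    """New center frequency: power-weighted mean frequency of the mode."""
    power = [abs(u) ** 2 for u in u_hat]
    return sum(w * p for w, p in zip(freqs, power)) / sum(power)


# Toy spectrum with all energy in the bin at 0.2 (hypothetical data).
freqs = [0.0, 0.1, 0.2, 0.3, 0.4]
f_hat = [0.0, 0.0, 1.0, 0.0, 0.0]
zeros = [0.0] * len(freqs)

u_hat = update_mode(f_hat, zeros, zeros, freqs, omega_k=0.1, alpha=2000)
new_omega = update_center(u_hat, freqs)
```

Starting from a center frequency of 0.1, the mode update attenuates the component at 0.2 by the factor 1/(1 + 2α(0.2 − 0.1)²), and the center update then moves the frequency estimate to where the remaining energy lies. A larger α narrows the band that each mode is allowed to occupy.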

2.2. Machine Learning Techniques
2.2.1. Support Vector Machine

SVM is a nonlinear search algorithm [61] used to minimize the expected errors in ML and to reduce the issue of overfitting [62]. Based on the training, by considering the past data, SVM predicts a forward quantity in time [32]. In the case of properly determined kernel filters and support vectors, SVM performs more efficiently than ANNs [24]. SVM works by constructing a hyperplane to enable the maximum sorting between the samples and to minimize the sample to the hyperplane distance [31].

SVMs were developed for binary classification but are also applicable to regression problems by introducing a loss function. The SVM algorithm only deals with linear problems. In the case of a nonlinear system, a nonlinear mapping $\varphi(\cdot)$ is used to map the input vector into a high-dimensional feature space; afterward, linear regression is performed in this space. In the case of the radial basis function [63],

$$K\left(x_i, x_j\right) = \exp\left(-\frac{\left\| x_i - x_j \right\|^2}{2\sigma^2}\right).$$

Support vector regression (SVR) is used to apply SVM [24]. SVR, based on the structural risk minimization theory and the Vapnik–Chervonenkis dimension model, is a feasible method to deal with prediction problems [26, 64]. Equation (13) gives the standard form of the SVR model:

$$f(x) = w^{T}\varphi(x) + b,$$

where $\varphi(x)$ is the nonlinear mapping of the input $x$.

The coefficients $w$ and $b$ are estimated by minimizing the risk function.

Three parameters dominate the accuracy of the SVR network when the quality and span of the training samples are fixed: $\varepsilon$ controls the width of the epsilon tube in the training loss function, $\sigma$ controls the width of the Gaussian kernel function, and $C$ is the regularization parameter, which controls the degree of empirical risk of SVR [65–67].
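To make the roles of the kernel width and the epsilon tube concrete, the following minimal Python sketch evaluates a Gaussian (RBF) kernel, the epsilon-insensitive loss, and a kernel-expansion prediction. The parametrization of the kernel by a width `sigma`, the function names, and the toy support vectors and coefficients are assumptions for illustration; the sketch does not train an SVR.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 sigma^2)); a larger sigma means a wider kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors inside the tube of half-width epsilon cost nothing."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, dual_coefs, bias, sigma):
    """Kernel expansion f(x) = sum_i c_i K(x_i, x) + b for already fitted coefficients."""
    return sum(
        c * rbf_kernel(sv, x, sigma) for sv, c in zip(support_vectors, dual_coefs)
    ) + bias


# Hypothetical fitted model with two support vectors.
svs = [[0.0, 0.0], [1.0, 1.0]]
coefs = [0.5, -0.25]
prediction = svr_predict([0.0, 0.0], svs, coefs, bias=0.1, sigma=1.0)
```

In this toy case the query point coincides with the first support vector, so its kernel value is 1 and the second support vector contributes −0.25·exp(−1). The regularization parameter C does not appear at prediction time; it bounds the magnitude of the coefficients during training.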

2.2.2. Multilayer Perceptron

The MLP is the most widely used type of ANN for modeling the hydrological runoff data [68]. MLP belongs to feedforward neural networks. MLP can approximate both integrable and continuous functions. MLP consists of neurons arranged in layers in the form of groups. The input nodes in MLP are all in one layer, while the hidden portion has one or more hidden layers. The selection of layers is dependent on the problem being considered, and there are no specific rules for the selection of these layers [69]. Many algorithms [70–73] have been proposed for finding the optimum structure of the network, but the optimal solution of parameters is not guaranteed by any of these methods. Figure 1 illustrates a simple structure of the MLP network.

In MLP, the nodes of the input layer denote the length of the input data, while the neurons of the output layer show the length of the output data [74]. The calculations in the MLP network are performed successively from the input layer to the output layer. Nodes at the same level are computed at the same time, without interfering with each other during the process [75]. The value of each node is obtained from the weighted sum of all nodes in the preceding layer. The following formula can be used to calculate the value of each node in MLP:

$$x_{j}^{l} = f\left(W_{j}^{l} \cdot x^{l-1} + b^{l}\right),$$

where $f$ is the activation function and $W_{j}^{l}$ denotes the weight vector. The value vector of all neurons in the layer $l-1$ is $x^{l-1}$. $x_{j}^{l}$ shows the $j$-th neuron value in the layer $l$, and the bias of layer $l$ is represented by $b^{l}$.

The linear and nonlinear functions are the most widely used activation functions. MLP is essentially a single-layer perceptron in the case of a linear activation function. The most commonly used nonlinear activation is the sigmoid function [74], which is given by

$$f(x) = \frac{1}{1 + e^{-x}}.$$

The loss function of the actual value and the ideal output can be expressed as

$$L\left(y, \hat{y}\right) = \left\| y - \hat{y} \right\|,$$

where $y$ denotes the actual value, the output value is given by $\hat{y}$, and the distance norm is shown by $\left\| \cdot \right\|$.

The backpropagation (BP) algorithm is usually used to adjust the parameters of MLP and serves to minimize the loss function. The gradient descent (GD) algorithm is the simplest and most commonly employed parameter adjustment algorithm. The stochastic gradient descent (SGD) algorithm is another useful algorithm to adjust the parameters of MLP [74]. The SGD algorithm performs well for the optimization process; however, it exhibits a slow convergence rate. Additionally, gradient descent may become trapped near saddle points of the loss function [76]. Several alternative approaches exist to address these issues and to update the parameters of the neural network. These adaptive approaches diagonally scale the gradient through an approximation of the curvature of the function [77]. Adam (adaptive moment estimation) is among the most widely used optimizers in deep learning and is well suited to nonstationary objectives [78].
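The layer-by-layer computation described in this section, where each node applies an activation to a weighted sum of the previous layer plus a bias, can be sketched in plain Python. This is a minimal forward pass only, with hypothetical weights; the linear output neuron is an assumption made here, and training by backpropagation is not shown.

```python
import math


def sigmoid(z):
    """Sigmoid activation 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer: activation(W x + b), one row of W per neuron."""
    outputs = []
    for w_row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(w_row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def mlp_forward(x, layers):
    """Propagate x through a list of (weights, biases, activation) layers."""
    for weights, biases, activation in layers:
        x = dense(x, weights, biases, activation)
    return x


# Tiny hypothetical network: 2 inputs -> 1 sigmoid hidden neuron -> 1 linear output.
layers = [
    ([[0.0, 0.0]], [0.0], sigmoid),  # hidden value is sigmoid(0) = 0.5
    ([[2.0]], [1.0], None),          # output is 2 * 0.5 + 1
]
output = mlp_forward([3.0, -1.0], layers)
```

The same `mlp_forward` function would handle the larger architecture used later in the paper (two sigmoid hidden layers followed by a single output neuron) by passing three layer tuples with appropriately sized weight matrices.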

2.3. Quantitative Performance Indices

In this research, several statistical indices are used to assess the agreement between the observed and the predicted runoff data. The root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), mean squared error (MSE), and the coefficient of determination (R2) were applied to evaluate the reliability of the prediction model.

2.3.1. RMSE

RMSE measures the deviation between the predicted and the observed values and represents the extent of dispersion of a dataset; a smaller value represents better performance of the algorithm:

$$\text{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}$$

2.3.2. MAE

MAE reflects the actual condition of the prediction error, and a smaller value represents better performance of the algorithm:

$$\text{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right|$$

2.3.3. MAPE

MAPE is a measure of the model's prediction outcome, and a smaller value represents superior performance of the algorithm. The MAPE is useful for qualitative assessment of the model's accuracy and depends on the complexity of the application [31]:

$$\text{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$

2.3.4. MSE

MSE represents the average of the squared differences between the predicted and the observed values, and a smaller value represents better performance of the algorithm:

$$\text{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2$$

2.3.5. Coefficient of Determination (R2)

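R2 summarizes the error by evaluating the linear correlation between the observed and the predicted data, with values ranging between 0 and 100%:

$$R^2 = \frac{\left[\sum_{i=1}^{N}\left(y_i - \bar{y}\right)\left(\hat{y}_i - \bar{\hat{y}}\right)\right]^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2 \sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)^2},$$

where $\bar{y}$ and $\bar{\hat{y}}$ are the means of the observed and the predicted values.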

In expressions (11)–(18), $\hat{y}_i$ and $y_i$ denote the $i$th predicted and actual values of runoff, respectively, whereas $N$ shows the total number of predictions.
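The five indices of this section can be computed with a few lines of Python. The sketch below is an illustration under assumptions: MAPE is returned as a fraction rather than a percentage, R2 is taken as the squared Pearson correlation between observed and predicted values (in line with its description as a linear correlation measure), and the sample values are hypothetical.

```python
import math


def mse(obs, pred):
    """Mean squared error."""
    return sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(mse(obs, pred))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def mape(obs, pred):
    """Mean absolute percentage error as a fraction; observations must be nonzero."""
    return sum(abs((p - o) / o) for o, p in zip(obs, pred)) / len(obs)


def r2(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov ** 2 / (var_o * var_p)


observed = [1.0, 2.0, 3.0, 4.0]   # hypothetical runoff values
predicted = [1.5, 2.0, 2.5, 4.0]
scores = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
    "MSE": mse(observed, predicted),
    "R2": r2(observed, predicted),
}
```

Note that R2 defined this way equals 1 for any perfectly linear relation between predictions and observations, even a biased one, which is why it is reported alongside the error indices rather than instead of them.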

2.4. Proposed Hybrid Modelling

As explained above, hydrological runoff exhibits nonstationary and nonlinear characteristics [79, 80]. These properties of runoff result in the undesirable performance of many prediction models, along with poor generalization due to the requirements of many pseudo-variations, which also affects the accurate knowledge of data variations [81]. Therefore, this paper proposes a three-stage hybrid model by coupling the ML method with signal decomposition techniques for a reliable and accurate runoff prediction.

The flowcharts representing the main steps of the proposed methodology and seven other models developed for comparison with the proposed CVS model are given in Figure 2. Furthermore, the main steps of the CVS model are explained as follows:
Step 1: the Pearson correlation coefficient method was applied to the original runoff time series and its lagged values to determine the appropriate input variable, and the time-lagged series having the highest value of the correlation coefficient with the original runoff series was chosen as an initial input for the decomposition model
Step 2: the CEEMDAN technique was applied to the lagged runoff time series obtained as a result of Step 1, which decomposes the series into subcomponents (IMFs and residual) having different frequencies
Step 3: the high-pass component (IMF1) produced by CEEMDAN was further decomposed by VMD
Step 4: the SVM algorithm was applied to construct a prediction model for the whole dataset containing extracted IMF components and the runoff data signal to make a prediction for each component accordingly
Step 5: to produce a collective output, the predicted results of all extracted IMFs obtained by the SVM algorithm were reconstructed to produce the final prediction result of the original runoff series
Step 6: finally, the statistical performance metrics evaluated the results in the training and testing periods

3. Results and Discussion

3.1. Case Study

The present research considers the runoff data of the Swat River basin collected from the Water and Power Development Authority (WAPDA), Pakistan, for the prediction purpose. The Swat River is a perennial river located in the northern part of Khyber-Pakhtunkhwa Province, Pakistan (Figure 3). It originates from Hindu Kush mountains and flows through the Kalam valley to Madyan and lower areas of Swat valley up to Chakdara. The river outflows into the Kabul river and has a total length of 240 km. The Swat River serves the purpose of irrigation, power generation, and a natural habitat for fishes and birds. The catchment area of the river is generally hilly, with altitudes ranging from 360 m to 4,500 m approximately, from south to north. The location of the catchment of the Swat River basin is between longitude 70°59′ east to 72°47′ east and latitude of 34°00′ north to 35°56′ north [82].

3.2. Data Selection

The monthly runoff data of the Swat River from 1961 to 2015 were taken at Chakdara hydrological station in the Swat River catchment. The data are available on a daily basis, and to obtain monthly data, the average monthly data were calculated from the daily data. The monthly runoff data series is shown in Figure 4 and was selected for prediction.
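The aggregation from daily to monthly values described above can be sketched as follows. This is a minimal standard-library illustration with hypothetical sample values; the function name `monthly_means` and the `(date, flow)` input layout are assumptions, and the actual study may have used pandas for this step.

```python
from collections import defaultdict
from datetime import date


def monthly_means(daily):
    """Average daily (date, flow) records into {(year, month): mean flow}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, flow in daily:
        key = (day.year, day.month)
        sums[key] += flow
        counts[key] += 1
    # Sort keys so the result follows calendar order.
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Hypothetical daily flows, for illustration only.
daily_records = [
    (date(1961, 1, 1), 10.0),
    (date(1961, 1, 2), 20.0),
    (date(1961, 2, 1), 5.0),
]
monthly = monthly_means(daily_records)
```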

For developing the CVS hybrid model, runoff data are divided into training (approximately 80% of the whole dataset) and testing datasets (approximately 20% of the whole dataset) to predict a 1-month-ahead runoff. To compare the effectiveness of the proposed CVS model, seven other models were used for evaluation: CEEMDAN-VMD-MLP, CEEMDAN-SVM, VMD-SVM, SVM, CEEMDAN-MLP, VMD-MLP, and MLP. Afterward, four error indices (RMSE, MAE, MAPE, and MSE) together with R2 are employed to compare the performance of the proposed model with other models.
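As a rough illustration of the 80/20 division, the sketch below splits a monthly series in time order. That the split is chronological (earlier months for training, later months for testing) is an assumption made here, as is the function name; the series is a placeholder of 660 values, corresponding to 55 years of monthly data from 1961 to 2015.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into leading training and trailing testing parts."""
    cut = int(round(len(series) * train_fraction))
    return series[:cut], series[cut:]


n_months = 55 * 12                      # 1961 to 2015 inclusive
placeholder_series = list(range(n_months))
train, test = chronological_split(placeholder_series)
```

Keeping the time order avoids leaking future values into the training set, which matters for a 1-month-ahead prediction task.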

The research work was carried out using a 64-bit Windows 10 operating system on a 3.70 GHz, Intel (R) Core i7-10510U CPU with 16 GB memory. The analyses were performed using Matlab R2015a software and Python 3.6 relying on pandas and NumPy packages. The optimal parameters were selected after different trials and errors, considering the best results. SVM and MLP networks were developed with Keras using Google Tensorflow backend. MLP network in all MLP-based models was developed with two hidden layers having 64 and 32 hidden neurons, respectively, with sigmoid activation function, while the output layer has 1 neuron to predict runoff. Moreover, different learning rates were selected for each MLP-based model. Due to the nonstationary and noisy nature of the runoff time series, we applied the adaptive moment estimation (Adam) optimizer for efficient stochastic optimization [83].

For SVR-based models, the radial basis function (RBF) was selected as a kernel for all models with different values of C and for each model. In the case of CEEMDAN, the standard deviation of noise was selected as 0.2, the number of realizations allowed was chosen as 500, while the allowed maximum number of sifting iterations was taken as 5000. The values of the different parameters of CEEMDAN are taken from [57], and the same reference explains the detailed procedure of the parameter selection for CEEMDAN. The selected parameters for VMD include moderate bandwidth constraint, alpha = 2000; uniform initialization of omegas, init = 1; criterion for the tolerance of convergence, tol = 1e-7; and noise tolerance, tau = 0, while the value for the number of modes, K was chosen through correlation analysis of the frequency modes generated by CEEMDAN.

3.3. Analysis

In developing ML-based hydrological models, the selection of suitable input variables is one of the most important steps [84]. The autocorrelation function (ACF) determines an appropriate input dataset for the model corresponding to the runoff at the output by applying a lag time to the original runoff time series [3, 17, 85]. Therefore, to determine a suitable input dataset for the hybrid model in the present study, an ACF was applied to the runoff time series by applying a monthly time lag for a year. As evident from Figure 5, Q12 shows the highest value of correlation; therefore, the Q12 dataset was selected as an input for runoff prediction.
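The lag selection described above can be sketched by correlating the series with lagged copies of itself and keeping the lag with the highest coefficient. The code is a simplified illustration: it uses the plain Pearson correlation between the overlapping segments rather than a specific ACF estimator, the function names are hypothetical, and the synthetic series with a 12-month cycle merely stands in for the runoff data.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def lag_correlations(series, max_lag=12):
    """Correlation between the series and its copy delayed by 1..max_lag steps."""
    return {
        lag: pearson(series[lag:], series[:-lag]) for lag in range(1, max_lag + 1)
    }


def best_lag(series, max_lag=12):
    """Lag whose delayed series correlates most strongly with the original."""
    corr = lag_correlations(series, max_lag)
    return max(corr, key=corr.get)


# Synthetic monthly series with an annual cycle (10 years), for illustration only.
synthetic = [10.0 + 5.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(120)]
selected = best_lag(synthetic)
correlations = lag_correlations(synthetic)
```

For a strongly seasonal monthly series, the 12-month lag lines up each month with the same month of the previous year, which is consistent with Q12 being selected in the study.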

By employing the CEEMDAN as a preprocessing technique, the selected runoff data series after a time lag (input signal) was decomposed into a sequence of eight independent IMFs and a residual, i.e., eight quasi-stable components and one trend component are obtained due to the decomposition of a nonstationary runoff data series (Figure 6). The denoising process of the time series is not required since CEEMDAN has good antinoise features [43]. It is evident that the IMF1 component has the highest frequency and shows strong nonlinearity and significant fluctuations. However, the remaining IMFs (IMF2∼IMF8) and the residual indicate a stable and regular fluctuation which shows a gradual reduction in the frequency with an increase in the wavelength.

The secondary decomposition of IMF1 was carried out by VMD due to the presence of high oscillatory fluctuations in IMF1. The trial and error method was used to select several parameters in the VMD technique [3]. The value of the K parameter can also be determined in ensemble decomposition techniques by correlation analysis [31]. In the present study, we will also obtain K value by correlation analysis of the intrinsic modes produced by CEEMDAN. The correlation coefficient between the input signal and the IMFs including the residual was calculated (Table 1).

The third IMF shows a strong correlation with the input signal, and IMF3 was considered as a borderline for the selection of IMFs as values of K in VMD. IMFs 1 and 2 showed less correlation than IMF3 and were considered as one value of K for VMD decomposition, while the remaining IMFs 3–9 including the residual were taken as seven values of K. Hence, we obtained the value of K = 8 for VMD. Figure 7 shows the decomposition results of IMF1 by VMD.
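The counting rule described above can be written out explicitly. With nine CEEMDAN components (IMF1 to IMF8 plus the residual, indexed 1 to 9) and IMF3 as the borderline, the components before the borderline contribute a single mode and each component from the borderline onward contributes one mode. The symbols $n$ (number of components) and $m$ (borderline index) are introduced here only for illustration:

$$K = 1 + (n - m + 1) = 1 + (9 - 3 + 1) = 8$$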

The VMD produces smoother intrinsic modes compared with other decomposition techniques [60], which is also verified by the decomposition result of IMF1 (Figure 7). The decomposed time series components obtained after applying CEEMDAN and VMD, along with the runoff data series, were applied as an input to SVR for training and validation. SVR was used to predict VF1-VF8; afterward, the prediction results of IMF1 were combined with those of the IMFs (IMF2–IMF8 and residual) produced by CEEMDAN to obtain the prediction results of the runoff time series of the Swat River. The performance of the CVS model during training and testing periods is evaluated and compared with CEEMDAN-VMD-MLP, CEEMDAN-SVM, VMD-SVM, CEEMDAN-MLP, VMD-MLP, SVM, and MLP models to verify the effectiveness of the proposed model. The results are presented in Figures 8–14 and Tables 2 and 3.

Boxplots (Figures 12 and 13) indicate the quartile-based range of the predicted and original (observed) runoff, while whiskers show the variability outside the 25th to 75th percentiles. The testing phase indicates more skewness and dispersion in prediction compared with the training phase. From the above figures showing training and testing results, it is evident that the CVS model simulates the runoff well and behaves more stably than all the other models, indicating the superior capability of the CVS model in nonlinear runoff modeling. Moreover, the CVS model mimics the runoff better than the other models in both training and testing phases, and overall, the hybrid approach performs better than the individual models. Furthermore, the CVS model shows better prediction in the training period compared with the testing period.

According to the results (Tables 2 and 3), during the training period the CVS model showed the lowest error, with the lowest RMSE (0.1185), MAE (0.0941), MAPE (0.2398), and MSE (0.0140), whereas the MLP model showed the poorest performance of all the models, with the highest RMSE (0.4126), MAE (0.3957), MAPE (1.1065), and MSE (0.1702). During the testing period, MLP again showed the greatest error, with RMSE (0.4578), MAE (0.4282), MAPE (1.1896), and MSE (0.2096), while the CVS model outperformed all the other models with the lowest RMSE (0.1448), MAE (0.1192), MAPE (0.26276), and MSE (0.0209). To elaborate on the performance of the CVS model for runoff prediction, a comparison of R² values for the different models is provided in Figure 14.
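For reference, the four error indices can be computed as in the sketch below. It uses the standard textbook definitions and assumes that MAPE is expressed as a fraction rather than a percentage; the scaling used in the study may differ, and the toy series is hypothetical.

```python
import math

def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    """Root mean squared error, the square root of MSE."""
    return math.sqrt(mse(observed, predicted))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mape(observed, predicted):
    """Mean absolute percentage error as a fraction (assumed scaling).
    Observed values must be nonzero."""
    return sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)

# Toy series (hypothetical values).
observed = [2.0, 4.0, 5.0]
predicted = [1.0, 4.0, 7.0]
metrics = {
    "MSE": mse(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "MAPE": mape(observed, predicted),
}

# Consistency check on the reported values: MSE should equal RMSE squared.
reported_rmse_squared = round(0.1185 ** 2, 4)  # agrees with the reported MSE of 0.0140
```

Because MSE is the square of RMSE, the two indices always rank the models identically; MAE and MAPE add information about typical and relative error sizes.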

The correlations between the original and predicted runoff are lower for the standalone models (SVM and MLP) than for the hybrid models (Figure 14). The CVS model shows the highest correlation for the training (R² = 0.9856) and testing (R² = 0.9804) periods, while MLP shows the lowest correlation during the training (R² = 0.8263) and testing (R² = 0.8050) periods. The performance of the standalone models was significantly improved in the hybrid combinations, which reflects the significance of the hybrid models.
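Two definitions of R² are in common use, and the text does not state which one underlies Figure 14. The sketch below shows both, on the assumption that the reported values are squared Pearson correlations between observed and predicted runoff; the data are hypothetical.

```python
def r2_correlation(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

def r2_determination(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mo = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mo) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Toy example (hypothetical values): a prediction with a constant bias.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]
r2_corr = r2_correlation(observed, predicted)
r2_det = r2_determination(observed, predicted)
```

The squared correlation ignores a constant bias, whereas the coefficient of determination penalizes it, so the two values coincide only when the predictions are unbiased and correctly scaled.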

Based on the calculation method in [81], the CVS model performs better during the training period, reducing RMSE by 71.28%, MAE by 76.22%, MAPE by 80.14%, and MSE by 91.77% compared with the MLP model, and RMSE by 40.06%, MAE by 41.15%, MAPE by 22.50%, and MSE by 64.10% compared with the CEEMDAN-VMD-SVM model. During the testing period, the error reductions are 68.37% for RMSE, 72.16% for MAE, 77.91% for MAPE, and 90.03% for MSE compared with the MLP model, and 35.33% for RMSE, 35.43% for MAE, 36.02% for MAPE, and 52.28% for MSE compared with the CEEMDAN-VMD-SVM model.
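The reported reductions are consistent with a simple relative-error formula, sketched below. The formula is an assumption about the method in [81]; it is checked here only against the RMSE and MSE values quoted in the text for the MLP and CVS models.

```python
def percent_reduction(baseline_error, proposed_error):
    """Relative error reduction of the proposed model over a baseline, in percent."""
    return (baseline_error - proposed_error) / baseline_error * 100.0

# Error values quoted in the text (MLP as baseline, CVS as proposed model).
train_rmse = percent_reduction(0.4126, 0.1185)  # about 71.28
train_mse = percent_reduction(0.1702, 0.0140)   # about 91.77
test_rmse = percent_reduction(0.4578, 0.1448)   # about 68.37
test_mse = percent_reduction(0.2096, 0.0209)    # about 90.03
```

A positive value indicates that the proposed model has a lower error than the baseline; a value of 100% would correspond to zero error.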

The figures and tables show that the three-stage CEEMDAN-VMD-MLP model also performs better than the two-stage models (CEEMDAN-MLP and VMD-MLP); however, its performance is inferior to that of the two-stage VMD-SVM hybrid model in error reduction and accuracy. Furthermore, all the hybrid models perform better than the standalone models with direct prediction. The results highlight that decomposition-based ensemble models are better than standalone ML models, since decomposition breaks the complex input signal into simpler subcomponents that are easier to predict and analyze. It can also be concluded (Tables 2 and 3) that the VMD technique is superior to the CEEMDAN technique, since the VMD-based models (VMD-MLP and VMD-SVM) perform better than the corresponding CEEMDAN-based models (CEEMDAN-MLP and CEEMDAN-SVM).

3.3.1. Extreme Value Analysis

Figure 15 shows the performance of the different models in predicting the extreme values of observed runoff during the training and testing periods. The three-stage hybrid models show a superior capability to predict the extreme values of runoff, compared with all the other models, during both periods. The CVS model shows the best results in predicting the extreme values of runoff, while the MLP model shows the poorest. Furthermore, the two-stage hybrid models also perform relatively better than the standalone models. All the models predict the extreme values of the maximum and minimum runoff better in the training period than in the testing period. Taking the maximum runoff during the training period as an example, the CVS, CEEMDAN-VMD-MLP, VMD-SVM, CEEMDAN-SVM, VMD-MLP, CEEMDAN-MLP, SVM, and MLP models show errors of 4.47%, 5.45%, 19.67%, 24.03%, 13.78%, 23.45%, 27.29%, and 28.46%, respectively, in predicting the observed runoff; during the testing period, these models show errors of 6.89%, 7.19%, 27.45%, 33.53%, 16.22%, 24.70%, 29.96%, and 31.40%, respectively. Therefore, the proposed model shows a comparatively satisfactory performance in tracking the extreme values of runoff compared with all the other models.
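One plausible way to obtain such extreme-value errors is sketched below. It assumes that the error is the absolute relative difference, in percent, between the predicted and observed runoff at the time steps where the observed series reaches its maximum and minimum; the text does not specify the exact procedure, and the data are hypothetical.

```python
def extreme_value_errors(observed, predicted):
    """Percentage error of the prediction at the observed maximum and minimum."""
    indices = range(len(observed))
    i_max = max(indices, key=lambda i: observed[i])
    i_min = min(indices, key=lambda i: observed[i])

    def pct_err(i):
        # Relative error at time step i; the observed value must be nonzero.
        return abs(predicted[i] - observed[i]) / abs(observed[i]) * 100.0

    return {"max": pct_err(i_max), "min": pct_err(i_min)}

# Toy series (hypothetical values, not the Swat River data).
observed = [10.0, 50.0, 200.0, 30.0]
predicted = [12.0, 45.0, 190.0, 33.0]
errors = extreme_value_errors(observed, predicted)
```

Evaluating the prediction at the time of the observed extreme, rather than comparing the two series' extremes regardless of timing, also penalizes a model that predicts the peak magnitude correctly but at the wrong time step.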

Although runoff is a complex process to predict, all the hybrid models generally performed well in all simulations. The results support the findings of [86–88], according to which it is practically impossible for a single model to predict the complex hydrological runoff precisely, owing to the effects of external factors. The superiority of the proposed hybrid approach demonstrates the viability of coupling decomposition with ML for hydrological prediction and can also provide a practical reference for similar prediction tasks. The CVS model identifies the intricate nonlinear connection between the original runoff data and the prediction with the best accuracy and performance among the compared models. Nevertheless, the performance of the model is highly dependent on the reliability of the hydrological time series data, the mode selection in VMD, and the hyperparameter selection for the SVM algorithm.

This study deals with monthly runoff prediction using decomposition-based ML models. However, it is also essential to explore the performance of the CEEMDAN-VMD-based ML models on a daily, weekly, and annual basis for effective river basin management, reservoir operation and planning, and allocation of water resources. Furthermore, segregating the hydrological data into normal, drought, and wet periods and examining the performance of the models in each period would also provide an effective approach to runoff prediction. Moreover, this study has the limitation that it considers only runoff as a predictor for runoff modeling, without considering the runoff factors (groundwater flow, surface, and subsurface factors), hydrophysical factors (infiltration, evaporation, etc.), and human factors. Consequently, the authors suggest the implementation of advanced techniques in the future to address the limitations of the present study for more reliable hydrological runoff studies. It is expected that this research will provide new directions for the study of hydrological time series prediction, which will be useful for scientific and technical communities.

4. Conclusions

This paper proposes a three-stage hybrid prediction model that links the robustness of CEEMDAN-VMD with the SVM algorithm to enhance prediction accuracy and minimize prediction error for the hydrological runoff time series. Five hybrid models and two standalone models were used for benchmark comparison. The models were developed using the runoff data of the Swat River, Pakistan. Four statistical performance assessment measures were employed to assess the performance of the various models. Considering the results for prediction accuracy and error reduction, the following can be concluded from this research work regarding runoff time series prediction:
(i) Three-stage hybrid models (CVS and CEEMDAN-VMD-MLP), which couple a two-stage signal decomposition methodology (CEEMDAN-VMD) with ML techniques (MLP and SVM), perform better than the two-stage hybrid models (CEEMDAN-SVM, VMD-SVM, CEEMDAN-MLP, and VMD-MLP) and the standalone models (MLP and SVM).
(ii) The CVS model showed superior performance to all the other models in the training and testing periods. The uncertainties and prediction errors associated with the proposed CVS model were lower than those of the other models, which endorses the significance of the proposed model for hydrological runoff prediction.
(iii) Two-stage hybrid models, which combine a single-stage signal decomposition methodology with ML techniques, exhibit superior performance to standalone models.
(iv) ML techniques (SVM and MLP) are applicable to runoff time series prediction, and the SVM algorithm is superior to MLP.
(v) Both signal decomposition techniques (VMD and CEEMDAN) significantly improved the prediction results, showing that both techniques are applicable to complex, noisy, and nonstationary runoff time series. VMD performed better than CEEMDAN in all cases.
(vi) Limitations: the quantity and quality of the available data play a significant role in the prediction task, and it is not easy to meet this requirement. ML techniques are sensitive to parameter and hyperparameter selection. Furthermore, ML techniques lack physical relations and concepts, which adds complexity to the structuring of ML models.
(vii) Significance and future directions: the research is important for managing a study area whose runoff exhibits higher-order trends and noise. The error criteria determine the results of the performance evaluation, and the study judged performance using well-known performance measures; the superior results indicate the suitability of the CVS model for prediction purposes. Moreover, all the hybrid models performed better than the standalone models. Therefore, hybrid models combining decomposition techniques with ML methods can play a role in forthcoming prediction studies. The financial, societal, and ecological benefits of precise runoff prediction call for further improvements in runoff prediction; therefore, future research will consider new approaches based on deep learning models to study the nonlinear connections among runoff, temperature, climate conditions, and precipitation.

Data Availability

The runoff data of Swat River, Pakistan, used to support the findings of this study are included within the article. The data are also available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to express sincere thanks to all the faculty and colleagues at Laboratory for Operation and Control of Cascaded Hydropower Station, China Three Gorges University, for providing guidance and valuable information regarding compilation of this research work. This work was supported by the National Natural Science Foundation of China (grant no. 51607105) and Provincial Natural Science Foundation of Hubei Province (grant no. 2016CFA097).