
Saturday, 5 May 2018

Application of feature selection and regression models for chlorophyll-a prediction in a shallow lake

Abstract

As a representative index of algal blooms, the concentration of chlorophyll-a (Chl-a) is a key parameter of concern for environmental managers. The relationships between environmental variables and Chl-a are complex and difficult to establish. Two machine learning methods, support vector machine for regression (SVR) and random forest (RF), were used in this study to predict Chl-a concentration from multiple variables. To improve model accuracy and reduce the number of inputs, two feature selection methods, the minimum redundancy maximum relevance method (mRMR) and RF, were integrated with the regression models. The results showed that the RF model had a higher predictive ability than the SVR model. Furthermore, its lower computational time and the fact that it requires no prior data transformation also indicated better applicability of the RF model. The comparison between the mRMR-RF and RF-RF ensemble models showed that RF-RF yielded better performance with fewer variables. Seven variables selected from the candidate predictors captured most of the information, and their potential implications for Chl-a were discussed according to their level of importance. Overall, the RF-RF ensemble model can be considered a useful approach for identifying the significant stressors and achieving satisfactory prediction of Chl-a concentration.
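
To make the mRMR idea mentioned in the abstract concrete, here is a minimal sketch of greedy mRMR feature selection in plain Python. It is not the authors' implementation: the abstract gives no details, so this sketch assumes the difference form of the criterion (relevance minus mean redundancy) and swaps in absolute Pearson correlation for the mutual information that mRMR normally uses. The data are synthetic, and the variable names (temp, tp, temp_copy, noise, chla) are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(vx * vy)


def mrmr_score(name, features, relevance, selected):
    """Relevance to the target minus mean redundancy with already-selected features."""
    if not selected:
        return relevance[name]
    redundancy = sum(
        abs(pearson(features[name], features[s])) for s in selected
    ) / len(selected)
    return relevance[name] - redundancy


def mrmr_select(features, target, k):
    """Greedily pick k feature names using the difference form of mRMR.

    Assumption: |Pearson r| stands in for mutual information here.
    """
    relevance = {name: abs(pearson(col, target)) for name, col in features.items()}
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        best = max(
            remaining,
            key=lambda name: mrmr_score(name, features, relevance, selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic data (hypothetical variables, not taken from the study).
rng = random.Random(0)
n = 500
temp = [rng.gauss(20, 5) for _ in range(n)]          # water temperature
tp = [rng.gauss(0.1, 0.03) for _ in range(n)]        # total phosphorus
temp_copy = [t + rng.gauss(0, 0.1) for t in temp]    # near-duplicate of temp
noise = [rng.gauss(0, 1) for _ in range(n)]          # irrelevant predictor
chla = [2.0 * t + 300.0 * p + rng.gauss(0, 1) for t, p in zip(temp, tp)]

features = {"temp": temp, "tp": tp, "temp_copy": temp_copy, "noise": noise}
selected = mrmr_select(features, chla, k=2)
print(selected)
```

On this synthetic data the first pick should be one of the two temperature columns and the second should be tp, because the redundancy penalty pushes the near-duplicate temperature column below an independent predictor. The selected subset would then be passed to a regression model such as RF, which is the mRMR-RF arrangement the abstract compares against RF-RF.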



https://ift.tt/2JVDNyw
