Random Forest Regression in Maize Yield Prediction

Sitienei, Miriam and Anapapa, Ayubu and Otieno, Argwings (2023) Random Forest Regression in Maize Yield Prediction. Asian Journal of Probability and Statistics, 23 (4). pp. 43-52. ISSN 2582-0230

Official URL: https://doi.org/10.9734/ajpas/2023/v23i4511

Abstract

Artificial intelligence is the discipline of making computers act intelligently, and machine learning is a subset of artificial intelligence that enables machines to learn from previous data without explicit programming. In agriculture, machine learning is applied to increase crop yield and quality. Its adoption is driven by the emergence of big data technologies and high-performance computing, which provide new opportunities to unravel, quantify, and comprehend data-intensive agricultural operational processes. Random Forest is an ensemble technique, used primarily for prediction, that reduces overfitting by combining many decision trees. During the training phase, it draws subsets of the data by sampling the training set randomly with replacement, and each subset is used to train one decision tree. Prediction accuracy tends to improve as the number of trees in the forest increases. Maize is a staple food in Kenya, and a sufficient supply supports farmers' food security and economic stability. This study predicted maize yield in Uasin Gishu County, Kenya, using Random Forest regression. The study employed a mixed-methods research design. Data were collected with structured questionnaires containing quantitative and qualitative variables, administered directly to representative farmers in 30 clustered wards. The questionnaire covered 30 maize production-related variables and was completed by 900 randomly selected maize producers across the 30 wards. The model identified the important variables in the dataset and predicted maize yield. The predictions were evaluated with standard regression metrics: Root Mean Squared Error (RMSE) = 0.52199, Mean Squared Error (MSE) = 0.27248, and Mean Absolute Error (MAE) = 0.471722. The model also indicated the contribution of each variable to the overall prediction.
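
The procedure the abstract describes (bootstrap samples drawn with replacement, one decision tree per sample, predictions averaged across trees, then RMSE, MSE, and MAE computed on held-out data) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the three features are hypothetical stand-ins for survey variables, and the tree depth, leaf size, and number of trees are arbitrary choices.

```python
import math
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, rng, max_features, depth=0, max_depth=4, min_leaf=5):
    """Grow a small regression tree. A leaf is a float, a split is a tuple."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return sum(y) / len(y)  # leaf predicts the mean target
    best = None  # (error, feature index, threshold)
    # Each split only considers a random subset of the features.
    for f in rng.sample(range(len(X[0])), max_features):
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, t)
    if best is None:
        return sum(y) / len(y)
    _, f, t = best
    lo = [i for i, row in enumerate(X) if row[f] <= t]
    hi = [i for i, row in enumerate(X) if row[f] > t]

    def grow(idx):
        return fit_tree([X[i] for i in idx], [y[i] for i in idx],
                        rng, max_features, depth + 1, max_depth, min_leaf)

    return (f, t, grow(lo), grow(hi))


def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(X, y, n_trees=15, max_features=2, seed=0):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(y)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_tree([X[i] for i in idx], [y[i] for i in idx],
                               rng, max_features))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)


# Synthetic data (assumption): three hypothetical features in [0, 1];
# the third is pure noise and does not affect the target.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(160)]
y = [2.0 * r[0] + r[1] ** 2 + data_rng.gauss(0, 0.1) for r in X]
X_train, y_train, X_test, y_test = X[:120], y[:120], X[120:], y[120:]

forest = fit_forest(X_train, y_train)
errors = [predict_forest(forest, r) - t for r, t in zip(X_test, y_test)]
mse = sum(e * e for e in errors) / len(errors)
rmse = math.sqrt(mse)
mae = sum(abs(e) for e in errors) / len(errors)
print(f"RMSE={rmse:.4f}  MSE={mse:.4f}  MAE={mae:.4f}")
```

Because RMSE is the square root of MSE, the two reported values can be checked against each other: the square root of 0.27248 is approximately 0.5220, which agrees with the reported RMSE of 0.52199.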

Item Type:	Article
Subjects:	Impact Archive > Mathematical Science
Depositing User:	Managing Editor
Date Deposited:	03 Oct 2023 11:58
Last Modified:	03 Oct 2023 11:58
URI:	http://research.sdpublishers.net/id/eprint/2951