Abstract: Hydrothermal liquefaction (HTL) is a sustainable approach to producing bio-oil from biomass (microalgae). However, identifying and optimizing the process parameters by experimentation is time-consuming and labor-intensive. Machine learning (ML) algorithms such as linear regression (LR), random forest (RF), and decision tree (DT) can instead be applied to identify the critical process parameters and predict the bio-oil production rate. The input parameters for the ML algorithms include the proximate analysis, elemental composition, and biochemical composition of the feedstock, together with the operating conditions. More than 100 data points were collected from the literature, with a maximum of 18 input features. The dataset was split into four subsets based on feature importance and the Pearson correlation matrix, and each subset was used with all three ML algorithms. Among the three algorithms, RF gave the best bio-oil prediction, using dataset 1 with 6 input features (R² = 0.84 and RMSE = 0.01). The best model was then validated with new data, and the difference ranged from -1.3% to +7.8%. Datasets 1 and 4 were also compared statistically; the p-value was >0.05, indicating no significant difference in the predicted oil yield. Feature importance indicates that bio-oil production is driven mainly by temperature, time, and pressure.
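The quantities reported above (Pearson correlation, R², RMSE, and the percentage difference on validation data) can be illustrated with a minimal sketch using only the Python standard library. The function names are hypothetical, and the sign convention for the percentage difference (predicted minus measured, relative to measured) is an assumption, since the abstract does not state it. This is not the authors' implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def percent_difference(measured, predicted):
    """Signed difference in percent (assumed convention: relative to measured)."""
    return 100.0 * (predicted - measured) / measured
```

An RMSE of 0.01 is only meaningful relative to the scale of the target, so the units of the yield (fraction or percent) matter when comparing it with other studies.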