Here’s the ranking of the formulation variables for **maximising encapsulation efficiency (encap\_efficiency)**, from most to least important:

---

## 📊 Ranked Variable Importance

1. **solid\_lipid (categorical type)**

   * **Rationale:** The chemistry of the solid lipid (chain length, polarity, crystallinity) directly dictates how well CBD partitions into the lipid matrix. In lipid nanoparticle studies, the choice of solid lipid is often reported as the largest source of variance in encapsulation efficiency.

2. **solid\_lipid\_input (mass ratio)**

   * **Rationale:** Sets the structural capacity of the lipid core. The solid lipid loading affects crystallinity, drug solubility, and entrapment, and it interacts strongly and nonlinearly with solid\_lipid type.

3. **liquid\_lipid\_input (mass ratio)**

   * **Rationale:** The liquid lipid modulates matrix fluidity and limits drug expulsion caused by lipid crystallisation. Its balance with the solid lipid strongly affects efficiency.

4. **surfactant\_input (mass ratio)**

   * **Rationale:** The surfactant stabilises droplets and prevents leakage. Too little leads to poor encapsulation; too much increases CBD solubilisation in the aqueous phase, which also reduces encapsulation. Important, but usually secondary to lipid composition.

5. **drug\_input (amount of CBD fed)**

   * **Rationale:** Mainly sets the loading challenge. A higher input can reduce efficiency once the matrix saturates or drug leaks out, but its role is less fundamental than lipid composition because efficiency is measured as a percentage of the drug fed, not as an absolute amount.
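
To make the "percentage, not absolute amount" point concrete, here is a minimal sketch. It assumes the common indirect definition of encapsulation efficiency, EE% = (drug fed − unencapsulated drug) / drug fed × 100. The function name, the `free_drug_mg` argument, and the numbers are illustrative assumptions, not values from this dataset, and the dataset may define encap\_efficiency differently.

```python
def encapsulation_efficiency(drug_input_mg, free_drug_mg):
    """Return EE% under the assumed indirect definition.

    drug_input_mg: total CBD fed to the formulation (hypothetical units: mg)
    free_drug_mg:  CBD recovered unencapsulated in the aqueous phase
    """
    if drug_input_mg <= 0:
        raise ValueError("drug_input_mg must be positive")
    if not 0 <= free_drug_mg <= drug_input_mg:
        raise ValueError("free_drug_mg must lie between 0 and drug_input_mg")
    entrapped_mg = drug_input_mg - free_drug_mg
    return entrapped_mg / drug_input_mg * 100.0


# Hypothetical case: the lipid matrix entraps 8 mg in both runs,
# but the amount of CBD fed doubles from 10 mg to 20 mg.
low_feed = encapsulation_efficiency(drug_input_mg=10.0, free_drug_mg=2.0)
high_feed = encapsulation_efficiency(drug_input_mg=20.0, free_drug_mg=12.0)

print(f"low feed:  {low_feed:.1f}% EE")
print(f"high feed: {high_feed:.1f}% EE")
```

With the same absolute amount entrapped, doubling the feed halves the efficiency, which is why drug\_input behaves more like a loading stress than a structural driver.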

---

✅ **Summary:**

* **Most important drivers:** lipid *type* and lipid *ratios* (solid + liquid).
* **Moderate influence:** surfactant\_input (surfactant mass ratio).
* **Least direct influence:** drug\_input, since efficiency is relative, not absolute.

Would you like me to also suggest a **feature engineering strategy** (e.g., solid:liquid ratio, lipid:surfactant balance) that could boost model performance on this dataset?
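
As a starting point, the two ratios mentioned above could be derived along these lines. This is a minimal sketch: it assumes each formulation is a plain dict keyed by the column names used in this ranking. The derived feature names (`solid_fraction`, `lipid_to_surfactant`) and the example values are hypothetical.

```python
def add_ratio_features(row):
    """Return a copy of one formulation record with two derived ratio features."""
    solid = row["solid_lipid_input"]
    liquid = row["liquid_lipid_input"]
    surfactant = row["surfactant_input"]
    total_lipid = solid + liquid

    out = dict(row)  # leave the caller's record untouched
    # Solid share of the lipid phase: bounded in [0, 1], so it stays finite
    # even when liquid_lipid_input is 0 (a raw solid/liquid ratio would not).
    out["solid_fraction"] = solid / total_lipid if total_lipid > 0 else float("nan")
    # Total lipid per unit surfactant: the lipid:surfactant balance.
    out["lipid_to_surfactant"] = total_lipid / surfactant if surfactant > 0 else float("nan")
    return out


# Hypothetical formulation record; the values are made up for illustration.
example = add_ratio_features({
    "solid_lipid": "lipid_A",
    "solid_lipid_input": 0.6,
    "liquid_lipid_input": 0.2,
    "surfactant_input": 0.4,
    "drug_input": 0.1,
})
print(example["solid_fraction"], example["lipid_to_surfactant"])
```

Using the solid fraction instead of a raw solid:liquid quotient is a design choice for numerical stability; either form carries the same information whenever both lipid inputs are positive.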
