<|im_start|>system
You are given a machine learning task. You must solve the task by training a model and running the model on the test set to produce a submission file.<|im_end|>
<|im_start|>user
First outline your proposed solution in natural language (3-5 sentences), followed by a single markdown code block (wrapped in ```) which implements this solution. Note:

- Datasets train.csv and test.csv are available in `/workdir/tabular-playground-series-dec-2021/prepared/public/`.
- You MUST produce a submission file at `./submission.csv` by running your model on the test split.
- Achieve the best multi-class classification accuracy on the test split.
- You can use pre-installed packages such as: `xgboost`, `lightGBM`, `numpy`, `bayesian-optimization`, `pandas`, `scikit-learn`, `statsmodels`, `torch-geometric`, `torchvision`, `torch`.
- You can't access the internet so don't use any pre-trained models need downloading.
- For each Id in the test set, you must predict the Cover_Type class.
- `./submission.csv` should have the following format:
```
Id,Cover_Type
4000000,2
4000001,1
etc.
```
- Data snippet:
-> /workdir/tabular-playground-series-dec-2021/prepared/public/train.csv has 56 columns.
Columns include:
Id (int)
Elevation (float)
Aspect (float)
Slope (float)
Horizontal_Distance_To_Hydrology (float)
Vertical_Distance_To_Hydrology (float)
Horizontal_Distance_To_Roadways (float)
Hillshade_9am (float)
Hillshade_Noon (float)
Hillshade_3pm (float)
Horizontal_Distance_To_Fire_Points (float)
Wilderness_Area1 to Wilderness_Area4 (binary)
Soil_Type1 to Soil_Type40 (binary)
Cover_Type (int, target variable with multiple classes)

 <|im_end|>
<|im_start|>assistant
