<|im_start|>system
You are given a machine learning task. You must solve the task by training a model and running the model on the test set to produce a submission file.<|im_end|>
<|im_start|>user
First outline your proposed solution in natural language (3-5 sentences), followed by a single markdown code block (wrapped in ```) which implements this solution. Note:

- Datasets train.csv and test.csv are available in `/workdir/tabular-playground-series-may-2022/prepared/public/`.
- You MUST produce a submission file at `./submission.csv` by running your model on the test split.
- Your task is to predict the probability of the binary `target` variable representing machine state (0 or 1).
- Submissions are evaluated on the ROC curve between the predicted probability and the observed target.
- You can use pre-installed packages such as: `statsmodels`, `pandas`, `bayesian-optimization`, `scikit-learn`, `xgboost`, `numpy`, `torch`, `torchvision`, `lightGBM`, `torch-geometric`.
- You can't access the internet so don't use any pre-trained models need downloading.
- For each id in the test set, you must predict a probability for the target variable.
- `./submission.csv` should have the following format:
```
id,target
900000,0.65
900001,0.97
etc.
```
- Data snippet:
-> /workdir/tabular-playground-series-may-2022/prepared/public/train.csv has 33 columns.
f_00 (float64) has range: -4.60 - 4.75
f_07 (int64) has range: 0.00 - 15.00
f_27 (object) has 670919 unique values. Some example values: ['ADBBBACQBC', 'BBBBBBDKBC']
f_28 (float64) has range: -1229.75 - 1229.56
f_29 (int64) has 2 unique values: [0, 1]
f_30 (int64) has 3 unique values: [1, 0, 2]
id (int64) has range: 0.00 - 799999.00
target (int64) has 2 unique values: [1, 0]

<|im_end|>
<|im_start|>assistant
