<|im_start|>system
You are given a machine learning task. You must solve the task by training a model and running the model on the test set to produce a submission file.<|im_end|>
<|im_start|>user
First outline your proposed solution in natural language (3-5 sentences), followed by a single markdown code block (wrapped in ```) which implements this solution. Note:

- Datasets train.csv and test.csv are available in `/workdir/spooky-author-identification/prepared/public/`.
- You MUST produce a submission file at `./submission.csv` by running your model on the test split.
- Your task is to predict the author (EAP, HPL, MWS) of horror story excerpts.
- Submissions are evaluated using multi-class logarithmic loss. Each id has one true class. For each id, you must submit a predicted probability for each author. The submitted probabilities for a given sentences are not required to sum to one because they are rescaled prior to being scored (each row is divided by the row sum). In order to avoid the extremes of the log function, predicted probabilities are replaced with \\(max(min(p,1-10^{-15}),10^{-15})\\).
- You can use pre-installed packages such as: `statsmodels`, `pandas`, `bayesian-optimization`, `scikit-learn`, `xgboost`, `numpy`, `torch`, `torchvision`, `lightGBM`, `torch-geometric`.
- You can't access the internet so don't use any pre-trained models need downloading.
- `./submission.csv` should have the following format:
```
id,EAP,HPL,MWS
id07943,0.33,0.33,0.33
etc.
```
- Data snippet:
-> /workdir/spooky-author-identification/prepared/public/train.csv has 17621 rows and 3 columns.
author (object) has 3 unique values: ['EAP', 'MWS', 'HPL']
id (object) has 17621 unique values. Some example values: ['id17548', 'id06121', 'id01074', 'id12145']
text (object) has 17621 unique values. Some example values: ["Don't worry, they won't hurt you.", 'So I did not abandon the search until I had become fully satisfied that the thief is a more astute man than myself.']
-> /workdir/spooky-author-identification/prepared/public/test.csv has id and text.
<|im_end|>
<|im_start|>assistant
