BASELINE_PROMPT = """Take a deep breath and think step by step. You are one of the best programmers in the world. 
You will be given the problem background, the paths to the training and test files, and a description of the data.
Your task is to create a basic solution for the given problem.
Your solution must contain correct Python code that reads the data, performs basic data processing, trains and tests the model, and creates a submission file.


## COMPETITION DESCRIPTION AND DATA DESCRIPTION ##
{background_data}


## MANDATORY STEP-BY-STEP PLAN ##
**To ensure the submission is correct, you MUST follow this plan precisely, especially for handling the test data and creating the submission file.**

**1.  Load Data:**
    * Load the training data from `{train_path}`.
    * **Crucially, load the `sample_submission.csv` file located at `{dataset_directory}/sample_submission.csv`.**

**2.  Prepare Test Set Processing:**
    * **Extract the column containing the IDs (e.g., 'id', 'image_name') from the `sample_submission.csv` DataFrame. This list of IDs and its specific order MUST be preserved and used for processing the test data.**
    * **Do NOT use `os.listdir` or any other directory scanning function to determine the order of test files. The ONLY correct order is the one in `sample_submission.csv`.**

**3.  Data Processing and Feature Engineering:**
    * Perform necessary preprocessing on the training and test data.

**4.  Model Training:**
    * Define, train, and validate your model using the training data.

**5.  Prediction on Test Set:**
    * Iterate through the list of IDs you extracted from `sample_submission.csv` in **Step 2**.
    * For each ID, load the corresponding test file (e.g., image, text document).
    * Make a prediction using your trained model.
    * Store these predictions in a list. **Ensure the order of predictions exactly matches the order of the IDs.**

**6.  Create Submission File:**
    * Create a new DataFrame for the submission.
    * The first column must be the ID column, taken directly from the `sample_submission.csv` you loaded earlier.
    * The second column (and any subsequent ones) should be your list of predictions.
    * Save this DataFrame to the path specified in `{submission_file}` without the index.
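
The ID handling in Steps 1, 2, and 6 can be sketched as follows. This is only an illustration under stated assumptions, not a required implementation: the helper names `read_submission_ids` and `write_submission` are hypothetical, the ID is assumed to be the first column, and a single prediction column is assumed. It uses the standard-library `csv` module to stay dependency-free; the same steps apply to a pandas DataFrame.

```python
import csv


def read_submission_ids(sample_path):
    # Hypothetical helper: return the header row and the IDs in file order.
    # Assumption: the first column of sample_submission.csv holds the ID.
    with open(sample_path, newline="") as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    ids = [row[0] for row in body]
    return header, ids


def write_submission(out_path, header, ids, predictions):
    # Hypothetical helper: write one prediction per ID, keeping the ID order.
    # Assumption: there is exactly one prediction column.
    assert len(ids) == len(predictions), "one prediction per ID is required"
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        for sample_id, prediction in zip(ids, predictions):
            writer.writerow([sample_id, prediction])
```

Because the IDs are read once and reused when writing, the row count and row order of the output match `sample_submission.csv` by construction.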


## PATHS TO FILES ##
All files are in the `{dataset_directory}` directory.
The original training data are divided into two groups:
1) for training {train_path}
2) for testing {test_path}

Use these paths when reading the files!

## CODE ALLOCATION ##
You must format your code like this:
```python
<some python code>
```

## IMPORTANT NOTE
    1) Do not use try-except blocks. I should be able to see all the code errors on startup.
    2) Do not use models such as LightGBM or those that do not train well on GPUs, unless they are statistical models or similar. All training should be completed in a short period of time. If you are training models, try to utilise all GPUs.
    3) At the end of the code, you must save the submission to this file: {submission_file}
    4) Remember: Always consider resource constraints and prioritize efficiency in your code.

## SUBMISSION FILE FORMAT

**As a final check, remember that your generated submission file must exactly match the format of the provided `sample_submission.csv` file.** This includes column names, the number of rows, and the exact IDs in the exact same order. Following the mandatory plan above will guarantee this.

#############

# DEVICE INFO

{device_info}

#############

# RESOURCE USAGE INSTRUCTIONS

Use the system resources efficiently based on the info above.

  - If CUDA is available, prefer GPU for heavy computations (e.g., matrix ops, model inference).
  - Otherwise, use all CPU cores/threads where parallelism helps.
  - Avoid memory overhead; use batching/streaming if RAM is limited.
  - Always choose the most efficient device (`cuda` or `cpu`) for your tasks.
"""