hypothesis_and_feedback: |-
  =========================================================
  {% for experiment, feedback in trace.hist %}
  # Trial {{ loop.index }}: 
  ## Hypothesis
  {{ experiment.hypothesis }}
  ## Specific task: 
  {% for task in experiment.sub_tasks %}
  {% if task is not none and task.get_task_brief_information is defined %}
  {{ task.get_task_brief_information() }}
  {% endif %}
  {% endfor %}
  ## Backtest Analysis and Feedback:
  {% if experiment.result is not none %}
  Backtest Result: {{ experiment.result.loc[["IC", "1day.excess_return_without_cost.annualized_return", "1day.excess_return_without_cost.max_drawdown"]] }}
  {% endif %}
  Observation: {{ feedback.observations }}
  Hypothesis Evaluation: {{ feedback.hypothesis_evaluation }}
  Decision (Whether the hypothesis was successful): {{ feedback.decision }}
  =========================================================
  {% endfor %}

last_hypothesis_and_feedback: |-
  ## Hypothesis
  {{ experiment.hypothesis }}
  ## Specific task: 
  {% for task in experiment.sub_tasks %}
  {% if task is not none and task.get_task_brief_information is defined %}
  {{ task.get_task_brief_information() }}
  {% endif %}
  {% endfor %}
  ## Backtest Analysis and Feedback:
  {% if experiment.result is not none %}
  Backtest Result: {{ experiment.result.loc[["IC", "1day.excess_return_without_cost.annualized_return", "1day.excess_return_without_cost.max_drawdown"]] }}
  {% endif %}
  Training Log: 
  Here, you need to focus on analyzing whether there are any issues with the training. If any problems are identified, you must correct them in the next iteration and clearly describe how the changes will be made in the hypothesis.
  {{ experiment.stdout }} 
  Observation: {{ feedback.observations }}
  Evaluation: {{ feedback.hypothesis_evaluation }}
  Decision (Whether this experiment is SOTA): {{ feedback.decision }}
  New Hypothesis (Given in feedback stage, just for reference, and can be accepted or rejected in the next round): {{ feedback.new_hypothesis }}
  Reasoning (Justification for the new hypothesis): {{ feedback.reason }}

sota_hypothesis_and_feedback: |-
  ## Hypothesis
  {{ experiment.hypothesis }}
  ## Specific task: 
  {% for task in experiment.sub_tasks %}
  {% if task is not none and task.get_task_brief_information is defined %}
  {{ task.get_task_brief_information() }}
  {% endif %}
  {% endfor %}
  ## Backtest Analysis and Feedback:
  {% if experiment.result is not none %}
  Backtest Result: {{ experiment.result.loc[["IC", "1day.excess_return_without_cost.annualized_return", "1day.excess_return_without_cost.max_drawdown"]] }}
  {% endif %}
  Training Log: {{ experiment.stdout }}
  Observation: {{ feedback.observations }}
  Evaluation: {{ feedback.hypothesis_evaluation }}
  Decision (Whether this experiment is SOTA): {{ feedback.decision }}

hypothesis_output_format: |-
  The output should follow JSON format. The schema is as follows:
  {
  "hypothesis": "An exact, testable, and innovative statement derived from previous experimental trace analysis. Avoid overly general ideas and ensure precision. The hypothesis should clearly specify the exact approach and expected improvement in performance in two or three sentences.",
  "reason": "Provide a clear, logical explanation for why this hypothesis was proposed, grounded in evidence (e.g., trace history, domain principles). Reason should be short with no more than two sentences.",
  }

factor_hypothesis_output_format: |-
  The output should follow JSON format. The schema is as follows:
  {
  "hypothesis": "The new hypothesis generated based on the information provided. Limit in two or three sentences.",
  "reason": "The reason why you generate this hypothesis. It should be comprehensive and logical. It should cover the other keys below and extend them. Limit in two or three sentences.",
  }

hypothesis_output_format_with_action: |-
  The output should follow JSON format. The schema is as follows:
  {
  "action": "If `hypothesis_specification` provides the action you need to take, please follow "hypothesis_specification" to choose the action. Otherwise, based on previous experimental results, suggest the action you believe is most appropriate at the moment. It should be one of [`factor`, `model`].",
  "hypothesis": "The new hypothesis generated based on the information provided,should be a string.",
  "reason": "The reason why you generate this hypothesis. It should be comprehensive and logical. It should cover the other keys below and extend them. Limit in two or three sentences.",
  }

model_hypothesis_specification: |-
  1. First, observe and analyze the overall experimental progression in `hypothesis_and_feedback`. Analyze where the previous model designs were inadequate — whether it was due to parameter settings, architectural flaws, or a lack of novelty (proposing entirely new concepts is highly encouraged as long as they demonstrate effectiveness).
  2. Second, `last_hypothesis_and_feedback` and `sota_hypothesis_and_feedback` are key references you should pay close attention to. You can choose to optimize based on either of them or generate new ideas to form hypotheses and experiments.
  3. If there is no prior experiment or result available at the beginning, you can start by implementing a simple and small architecture.
  4. If a series of attempts fail to achieve SOTA, consider exploring entirely new directions; at this point, it is acceptable to return to simple architectures.
  5. Focus exclusively on the architecture of PyTorch models. Each hypothesis should specifically address architectural decisions, such as layer configurations, activation functions, regularization methods, and overall model structure. DO NOT do any feature-specific processing. Instead, you can propose innovative transformations on the input time-series data to enhance model training effectiveness.
  6. Avoid including aspects unrelated to architecture, such as input features or optimization strategies.
  7. Sometimes, when training performance is poor, adjusting hyperparameters can also be an effective strategy for improvement.
  8. Use standard libraries for baseline models, but also explore custom architecture designs to investigate novel structures. After sufficient trials with traditional models, aim for innovation comparable to top-tier AI conferences (NeurIPS, ICLR, ICML, SIGKDD, etc.) in time series modeling.

factor_hypothesis_specification: |-
  1. **1-5 Factors per Generation:**
    - Ensure each generation produces 1-5 factors.
    - Balance simplicity and complexity to build a robust factor library.
    - Make full use of the financial data provided to you instead of focusing solely on a specific field.
  2. **Simple and Effective Factors First:**
    - Start with factors that are simple, easy to achieve and likely effective.
    - Concisely explain why these factors are expected to work.
    - Avoid complex or combined factors initially.
  3. **Gradual Complexity Increase:**
    - Introduce more complex factors (e.g. machine learning based factors, factors use mult-dimentional factor raw data, etc.) as more experimental results are gathered.
    - Combine factors only after simpler ones are tested and validated.
  4. **New Directions and Optimizations:**
    - If multiple consecutive iterations fail to produce factors surpassing SOTA, consider switching to a new direction and can starting with simple factors again.
    - If optimizing a specific type of factor, proceed from simple to complex.
  5. Note
    - Highlight that factors surpassing SOTA are included in the library to avoid re-implementation.
    - No matter how many factors you plan to generate, only reply with one set of hypothesis and reason. The hypothesis can include the proposal of multiple factors at the same time.

factor_experiment_output_format: |-
  The output should follow JSON format. The schema is as follows:
  {
      "factor name 1": {
          "description": "description of factor 1, start with its type, e.g. [Momentum Factor]",
          "formulation": "latex formulation of factor 1",
          "variables": {
              "variable or function name 1": "description of variable or function 1",
              "variable or function name 2": "description of variable or function 2"
          }
      },
      "factor name 2": {
          "description": "description of factor 2, start with its type, e.g. [Machine Learning based Factor]",
          "formulation": "latex formulation of factor 2",
          "variables": {
              "variable or function name 1": "description of variable or function 1",
              "variable or function name 2": "description of variable or function 2"
          }
      }
      # Don't add ellipsis (...) or any filler text that might cause JSON parsing errors here!
  }

model_experiment_output_format: |-
  So far please only design one model to test the hypothesis! 
  The output should follow JSON format. The schema is as follows (value in training_hyperparameters is a basic setting for reference, you CAN CHANGE depends on the previous training log): 
  {
    "model_name (The name of the model)": {
        "description": "A detailed description of the model",
        "formulation": "A LaTeX formula representing the model's formulation",
        "architecture": "A detailed description of the model's architecture, e.g., neural network layers or tree structures",
        "variables": {
            "\\hat{y}_u": "The predicted output for node u",
            "variable_name_2": "Description of variable 2",
            "variable_name_3": "Description of variable 3"
        },
        "hyperparameters": {
            "hyperparameter_name_1": "value of hyperparameter 1",
            "hyperparameter_name_2": "value of hyperparameter 2",
            "hyperparameter_name_3": "value of hyperparameter 3"
        },
        "training_hyperparameters" {  # All values are for reference; you can set them yourself
            "n_epochs": "100",
            "lr": "1e-3",
            "early_stop": 10,
            "batch_size": 256,
            "weight_decay": 1e-4,
        }
        "model_type": "Tabular or TimeSeries"  # Should be one of "Tabular" or "TimeSeries"
    },
  }

factor_feedback_generation:
  system: |-
    You are a professional financial result analysis assistant in data-driven R&D. 
    The task is described in the following scenario:

    {{ scenario }}
    
    You will receive a hypothesis, multiple tasks with their factors, their results, and the SOTA result. 
    Your feedback should specify whether the current result supports or refutes the hypothesis, compare it with previous SOTA (State of the Art) results, and suggest improvements or new directions.
    
    Please understand the following operation logic and then make your feedback that is suitable for the scenario:
      1. Logic Explanation:
        a) All factors that have surpassed SOTA in previous attempts will be included in the SOTA factor library.
        b) New experiments will generate new factors, which will be combined with the factors in the SOTA library.
        c) These combined factors will be backtested and compared against the current SOTA to enable continuous iteration.
      2. Development Directions:
        a) New Direction: Propose a new factor direction for exploration and development.
        b) Optimization of Existing Direction:
          - Suggest further improvements to that factor (this can include further optimization of the factor or proposing a direction that combines better with the factor).
          - Avoid re-implementing previous factors as those that surpassed SOTA are already included in the factor library and will be used in each run.
      3. Final Goal: To continuously accumulate factors that surpass each iteration to maintain the best SOTA.
    
    When judging the results:
      1. Any small improvement should be considered for inclusion as SOTA (set `Replace Best Result` as yes).
      2. If the new factor(s) shows an improvement in the annualized return, recommend it to replace the current best result.
      3. Minor variations in other metrics are acceptable as long as the annualized return improves.

    Consider Changing Direction for Significant Gaps with SOTA:
      - If the new results significantly differ from the SOTA, consider exploring a new direction (write new type factors).
      - Avoid re-implementing previous factors as those that surpassed SOTA are already included in the factor library and will be used in each run.

    Please provide detailed and constructive feedback for future exploration.
    Respond in JSON format. Example JSON structure for Result Analysis:
    {
      "Observations": "Your overall observations here",
      "Feedback for Hypothesis": "Observations related to the hypothesis",
      "New Hypothesis": "Your new hypothesis here",
      "Reasoning": "Reasoning for the new hypothesis",
      "Replace Best Result": "yes or no"
    }
  user: |-
    Target hypothesis: 
    {{ hypothesis_text }}
    Tasks and Factors:
    {% for task in task_details %}
      - {{ task.factor_name }}: {{ task.factor_description }}
        - Factor Formulation: {{ task.factor_formulation }}
        - Variables: {{ task.variables }}
        - Factor Implementation: {{ task.factor_implementation }}
        {% if task.factor_implementation == "False" %}
        **Note: This factor was not implemented in the current experiment. Only the hypothesis for implemented factors can be verified.**
        {% endif %}
    {% endfor %}
    Combined Results: 
    {{ combined_result }}
    
    Analyze the combined result in the context of its ability to:
    1. Support or refute the hypothesis.
    2. Show improvement or deterioration compared to the SOTA experiment.
    
    Note: Only factors with 'Factor Implementation' as True are implemented and tested in this experiment. If 'Factor Implementation' is False, the hypothesis for that factor cannot be verified in this run.

model_feedback_generation:
  system: |-
    You are a professional quantitative analysis assistant in top-tier hedge fund.

    The task is described in the following scenario:
    {{ scenario }}

    You will receive a quantitative model hypothesis, its specific task description, and it market backtest result. 
    Your feedback should specify whether the current result supports or refutes the hypothesis, compare it with previous SOTA results, examine the model's training logs to analyze whether there are issues with hyperparameter settings, and suggest improvements or new directions.

    Please provide detailed and constructive feedback.
    Example JSON Structure for Result Analysis:
    {
      "Observations": "First analyze the model's training logs to determine whether there are any issues with its parameter settings. Then clearly summarize the current results and the SOTA results with exact scores and any notable patterns. Limit your summary to no more than three concise, data-focused sentences.",
      "Feedback for Hypothesis": "Explicitly confirm or refute the hypothesis based on specific data points or performance trends. Limit to two sentences.",
      "New Hypothesis": "Propose a revised hypothesis, considering observed patterns and limitations in the current one. Limit to no more than two sentences.",
      "Reasoning": "Explain the rationale for the new hypothesis using specific trends or performance shifts. Be concise but technically complete. Limit to two sentences.",
      "Decision": <true or false>,
    }

    
  user: |-
    {% if sota_hypothesis %} 
    # SOTA Round Information:
    Hypothesis: {{ sota_hypothesis.hypothesis }}
    Specific Task: {{ sota_task }}
    Code Implementation: {{ sota_code }}
    Result: {{ sota_result }}
    {% else %}
    # This is the first round. No previous information available. As long as the performance is not too negative (eg.ICIR is greater than 0), treat it as successful. Do not set the threshold too high.  
    {% endif %} 
    
    # Current Round Information:
    Hypothesis: {{ hypothesis.hypothesis }}
    Why propose this hypothesis: {{ hypothesis.reason }}
    Specific Task: {{ exp.sub_tasks[0].get_task_information() }}
    Code Implementation: {{ exp.sub_workspace_list[0].file_dict.get("model.py") }}
    Training Log: {{ exp.stdout }}
    Result: {{ exp_result }}

    # When judging the results:
    1. **Recommendation for Replacement:**
      - If the new model's performance shows an improvement in the annualized return, recommend it to replace the current SOTA result.
      - Minor variations in other metrics are acceptable as long as the annualized return improves.
    2.  Consider Changing Direction When Results Are Significantly Worse Than SOTA:
      - If the new results significantly worse than the SOTA, consider exploring a new direction, like change a model architecture.

action_gen:
  system: |-
    Quantitative investment is a data-driven approach to asset management that relies on mathematical models, statistical techniques, and computational methods to analyze financial markets and make investment decisions. Two essential components of this approach are factors and models.
  
    You are one of the most authoritative quantitative researchers at a top Wall Street hedge fund. I need your expertise to develop new factors and models that can enhance our investment returns. Based on the given context, I will ask for your assistance in designing and implementing either factors or a model.

    You will receive a series of experiments, including their factors and models, and their results. 
    Your task is to analyze the previous experiments and decide whether the next experiment should focus on factors or models.

    Example JSON Structure for your return:
    {
      "action": "factor" or "model",  # You must choose one of the two
    }

  user: |-
    {% if hypothesis_and_feedback|length == 0 %}
    It is the first round of hypothesis generation. The user has no hypothesis on this scenario yet.
    {% else %}
    The former hypothesis and the corresponding feedbacks are as follows:
    {{ hypothesis_and_feedback }}
    {% endif %}

  
    {% if last_hypothesis_and_feedback != "" %}
    Here is the last trial's hypothesis and the corresponding feedback. The main feedback includes a new hypothesis for your reference only. You should evaluate the entire reasoning chain to decide whether to adopt it, propose a more suitable hypothesis, or transfer and optimize it for another scenario (e.g., factor/model), since transfers are generally encouraged:
    {{ last_hypothesis_and_feedback }}
    {% endif %}