extract_factors_system: |-
    user will treat your factor name as key to store the factor, don't put any interaction message in the content. Just response the output without any interaction and explanation.
    All names should be in English.
    Respond with your analysis in JSON format. The JSON schema should include:
    {
        "summary": "The summary of this report",
        "factors": {
            "Name of factor 1": "Description to factor 1",
            "Name of factor 2": "Description to factor 2"
        },
        "models": {
            "Name of model 1": "Description to model 1",
            "Name of model 2": "Description to model 2"
        }
    }

extract_factors_follow_user: |-
    Please continue extracting the factors. Please ignore factors appeared in former messages. If no factor is found, please return an empty dict.
    Notice: You should not miss any factor in the report! Some factors might appear several times in the report. You can repeat them to avoid missing other factors.
    Respond with your analysis in JSON format. The JSON schema should include:
    {
        "factors": {
            "Name of factor 1": "Description to factor 1",
            "Name of factor 2": "Description to factor 2"
        }
    }

extract_factor_formulation_system: |-
    I have a financial engineering research report and a list of factors extracted from it. I need assistance in extracting specific information based on the report and the provided list of factors. The tasks are as follows:

    1. For each factor, I need its calculation formula in LaTeX format. The variable names within the formulas should not contain spaces; instead, use underscores to connect words. Ensure that the factor names within the formulas are consistent with the ones I've provided.
    2. For each factor formula, provide explanations for the variables and functions used. The explanations should be in English, and the variable and function names should match those used in the formulas.

    Here are the sources of data I have:

    1. Stock Trade Data Table: Contains information on stock trades, including daily open, close, high, low, VWAP prices, volume, and turnover.
    2. Financial Data Table: Contains company financial statements, such as the balance sheet, income statement, and cash flow statement.
    3. Stock Fundamental Data Table: Contains basic information about stocks, like total shares outstanding, free float shares, industry classification, market classification, etc.
    4. High-Frequency Data: Contains price and volume of each stock at the minute level, including open, close, high, low, volume, and VWAP.

    Please expand the formulation to use the source data I have provided. If the number of factors exceeds the token limit, extract the formulas for as many factors as possible without exceeding the limit. Ensure to avoid syntax errors related to special characters in JSON, especially with backslashes and underscores in LaTeX.

    Provide your analysis in JSON format, using the following schema:
    {
        "factor name 1": {
            "formulation": "latex formulation of factor 1",
            "variables": {
                "variable or function name 1": "description of variable or function 1",
                "variable or function name 2": "description of variable or function 2"
            }
        },
        "factor name 2": {
            "formulation": "latex formulation of factor 2",
            "variables": {
                "variable or function name 1": "description of variable or function 1",
                "variable or function name 2": "description of variable or function 2"
            }
        }
    }


extract_factor_formulation_user: |-
    ===========================Report content:=============================
    {{ report_content }}
    ===========================Factor list in dataframe=============================
    {{ factor_dict }}


classify_system: |-
    Your job is classify whether the user input document is a quantitative investment research report. The user will input a document and you should classify it based on the following conditions:
    1. The document is about finance other than other fields like biology, physics, chemistry, etc.
    2. The document is a research report on stock selection (which needs to be strictly separated from time selection and base selection) in the field of metalworking quantification.
    3. The document involves the composition of factors or models, or tests their performance.

    If the document meets all the above conditions, please return 1; otherwise, please return 0.

    Please respond with your decision in JSON format. Just respond the output json string without any interaction and explanation.
    The JSON schema should include:
    {
        "class": 1
    }

factor_viability_system: |-
    User has designed several factors in quant investment. Please help the user to check the viability of these factors.


    User will provide a pandas dataframe like table containing following information:
    1. The name of the factor;
    2. The simple description of the factor;
    3. The formulation of the factor in latex format;
    4. The description to the variables and functions in the formulation of the factor.

    User has several source data:
    1. The Stock Trade Data Table containing information about stock trades, such as daily open, close, high, low, vwap prices, volume, and turnover;
    2. The Financial Data Table containing company financial statements such as the balance sheet, income statement, and cash flow statement;
    3. The Stock Fundamental Data Table containing basic information about stocks, like total shares outstanding, free float shares, industry classification, market classification, etc;
    4. The high frequency data containing price and volume of each stock containing open close high low volume vwap in each minute;
    5. The Consensus Expectations Factor containing the consensus expectations of the analysts about the future performance of the company.


    A viable factor should satisfy the following conditions:
    1. The factor should be able to be calculated in daily frequency;
    2. The factor should be able to be calculated based on each stock;
    3. The factor should be able to be calculated based on the source data provided by user.

    You should give decision to each factor provided by the user. You should reject the factor based on very solid reason.
    Please return true to the viable factor and false to the non-viable factor.

    Notice, you can just return part of the factors due to token limit. Your factor name should be the same as the user's factor name.

    Please respond with your decision in JSON format. Just respond the output json string without any interaction and explanation.
    The JSON schema should include:
    {
        "Name to factor 1":
        {
            "viability": true,
            "reason": "The reason to the viability of this factor"
        },
        "Name to factor 2":
        {
            "viability": false,
            "reason": "The reason to the non-viability of this factor"
        }
        "Name to factor 3":
        {
            "viability": true,
            "reason": "The reason to the viability of this factor"
        }
    }

factor_relevance_system: |-
    User has designed several factors in quant investment. Please help the user to check the relevance of these factors to be real quant investment factors.

    User will provide a pandas dataframe like table containing following information:
    1. The name of the factor;
    2. The simple description of the factor;
    3. The formulation of the factor in latex format;
    4. The description to the variables and functions in the formulation of the factor.

    A relevant factor should satisfy the following conditions:
    1. The factor should be able to be calculated in daily frequency;
    2. The factor should be able to be calculated based on each stock;
    3. The factor should only be calculated based on mathematical manipulation, not based on subjective judgment or natural language analysis.

    You should give decision to each factor provided by the user. You should reject the factor based on very solid reason.
    Please return true to the relevant factor and false to the irrelevant factor.

    Notice, you can just return part of the factors due to token limit. Your factor name should be the same as the user's factor name.

    Please respond with your decision in JSON format. Just respond the output json string without any interaction and explanation.
    The JSON schema should include:
    {
        "Name to factor 1":
        {
            "relevance": true,
            "reason": "The reason to the relevance of this factor"
        },
        "Name to factor 2":
        {
            "relevance": false,
            "reason": "The reason to the non-relevance of this factor"
        }
        "Name to factor 3":
        {
            "relevance": true,
            "reason": "The reason to the relevance of this factor"
        }
    }


factor_duplicate_system: |-
    User has designed several factors in quant investment. Please help the user to duplicate these factors.

    User will provide a pandas dataframe like table containing following information:
    1. The name of the factor;
    2. The simple description of the factor;
    3. The formulation of the factor in latex format;
    4. The description to the variables and functions in the formulation of the factor.

    User wants to find whether there are duplicated groups. The factors in a duplicate group should satisfy the following conditions:
    1. They might differ in the name, description, formulation, or the description to the variables and functions in the formulation, some upper or lower case difference is included;
    2. They should be talking about exactly the same factor;
    3. If horizon information like 1 day, 5 days, 10 days, etc is provided, the horizon information should be the same.

    To make your response valid, we have some very important constraint for you to follow! Listed here:
    1. You should be very confident to put duplicated factors into a group;
    2. A group should contain at least two factors;
    3. To a factor which has no duplication, don't put them into your response;
    4. To avoid merging too many similar factor, don't put more than ten factors into a group!
    You should always follow the above constraints to make your response valid. 

    Your response JSON schema should include:
    [
        [
            "factor name 1",
            "factor name 2"
        ],
        [
            "factor name 5",
            "factor name 6"
        ],
        [
            "factor name 7",
            "factor name 8",
            "factor name 9"
        ]
    ]
    Your response is a list of lists. Each list represents a duplicate group containing all the factor names in this group. 
    The factor names in the list should be unique and the factor names should be the same as the user's factor name.
    To avoid reaching token limit, don't respond more than fifty groups in one response. You should respond the output json string without any interaction and explanation.