import json
from transformers import AutoTokenizer

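# Path to the input JSON file: a list of objects, each with an "instruction" field.
input_path = ""
# Path where the same list is written, with a "prompt" field added to each item.
output_path = ""

# Model name or local path of the tokenizer whose chat template is applied.
tokenizer = AutoTokenizer.from_pretrained("")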

prompt = """You are an AI assistant to help me rephrase questions. Follow the given examples.

Question: Olivia has $23. She bought five bagels for $3 each. How much money does she have left?
Rephrase the above question: What is the amount of money that Olivia has left after purchasing five bagels
for $3 each, if she initially had $23?

Question: Michael had 58 golf balls. On tuesday, he lost 23 golf balls. On wednesday, he lost 2 more. How
many golf balls did he have at the end of wednesday?
Rephrase the above question: After losing 23 golf balls on Tuesday and an additional 2 on Wednesday, how
many golf balls does Michael have left if he initially had 58 golf balls?

Question: Angelo and Melanie want to plan how many hours over the next week they should study together
for their test next week. They have 2 chapters of their textbook to study and 4 worksheets to memorize.
They figure out that they should dedicate 3 hours to each chapter of their textbook and 1.5 hours for each
worksheet. If they plan to study no more than 4 hours each day, how many days should they plan to study
total over the next week if they take a 10-minute break every hour, include 3 10-minute snack breaks each
day, and 30 minutes for lunch each day?
Rephrase the above question: Angelo and Melanie need to study 2 chapters in their textbook and 4
worksheets for their upcoming test. They have planned to dedicate 3 hours for each chapter and 1.5 hours for
each worksheet. They can study for a maximum of 4 hours each day, taking into account 10-minute breaks
every hour, 3 10-minute snack breaks per day, and 30 minutes for lunch. How many days do they need to
study in total over the next week to complete their study plan?

Question: Leah had 32 chocolates and her sister had 42. If they ate 35, how many pieces do they have left in
total?
Rephrase the above question: If Leah had 32 chocolates and her sister had 42, and they both consumed 35
chocolates, what is the total number of chocolates that they have left?

Question: There were nine computers in the server room. Five more computers were installed each day,
from monday to thursday. How many computers are now in the server room?
Rephrase the above question: If there were initially nine computers in the server room and five more
computers were added each day from Monday to Thursday, what is the current total number of computers in
the server room?

Question: Jason had 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many
lollipops did Jason give to Denny?
Rephrase the above question: If Jason initially had 20 lollipops and now has 12 after giving some to Denny,
how many lollipops did he give to Denny?

Question: Sam bought a dozen boxes, each with 30 highlighter pens inside, for $10 each box. He rearranged
five of these boxes into packages of six highlighters each and sold them for $3 per package. He sold the
rest of the highlighters separately at the rate of three pens for $2. How much profit did he make in total, in
dollars?
Rephrase the above question: Sam purchased 12 boxes, each containing 30 highlighter pens, at $10 per
box. He repackaged five of these boxes into sets of six highlighters and sold them for $3 per set. He sold
the remaining highlighters individually at a rate of three pens for $2. What is the total profit he made in dollars?

Question: There are 15 trees in the grove. Grove workers will plant trees in the grove today. After they are
done, there will be 21 trees. How many trees did the grove workers plant today?
Rephrase the above question: If there were initially 15 trees in the grove and the grove workers are planning
to plant more trees today, resulting in a total of 21 trees, how many trees did the workers plant today?

Question: {}
Rephrase the above question: """

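# Disabled variant, kept for reference: sends the raw instruction to the model
# without the few-shot rephrasing prompt.
# with open(input_path) as f: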
#     data = json.load(f)
#     for item in data:
#         messages = [
#             {"role": "system", "content": "You are a helpful assistant."},
#             {"role": "user", "content": item["instruction"]}
#         ]
#         item["prompt"] = tokenizer.apply_chat_template(
#                         messages,
#                         tokenize=False,
#                         add_generation_prompt=True)

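# Load the input records; the file handle is only needed while parsing.
with open(input_path, encoding="utf-8") as f:
    data = json.load(f)

for item in data:
    # Wrap each question in the few-shot rephrasing prompt.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt.format(item["instruction"])},
    ]
    # Render the chat template to plain text. enable_thinking is forwarded to
    # the template; a template that does not use this variable should ignore it.
    item["prompt"] = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )

# ensure_ascii=False writes non-ASCII characters verbatim, so force UTF-8.
with open(output_path, "w", encoding="utf-8") as f:
    json.dump(data, f, indent=4, ensure_ascii=False)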
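

# Optional sanity check: a minimal sketch, not part of the original pipeline.
# The helper name `find_bad_prompts` and the default `marker` are assumptions
# chosen for illustration. The marker is the text that ends every few-shot
# example, without its trailing space, since some chat templates trim message
# content.
def find_bad_prompts(items, marker="Rephrase the above question:"):
    """Return the indices of items whose "prompt" is missing or lacks `marker`."""
    return [
        i for i, item in enumerate(items)
        if marker not in item.get("prompt", "")
    ]

# Possible use once the prompts have been built: assert not find_bad_prompts(data)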
