"""
Evaluate the performance of LLM-based reasoning.
"""
