[
    {
        "question": "Passage: A queen ant (formally known as a gyne) is an adult, reproducing female ant in an ant colony; generally she will be the mother of all the other ants in that colony. Some female ants, such as Cataglyphis cursor, do not need to mate to produce offspring, reproducing through asexual parthenogenesis or cloning, and all of those offspring will be female. Others, like those in the genus Crematogaster, mate in a nuptial flight. Queen offspring develop from larvae specially fed in order to become sexually mature among most species. Depending on the species, there can be either a single mother queen, or potentially, hundreds of fertile queens in some species. Queen ants have one of the longest life-spans of any known insect -- up to 30 years. A queen of Lasius niger was held in captivity by German entomologist Hermann Appel for 283\u20444 years; also a Pogonomyrmex owyheei has a maximum estimated longevity of 30 years in the field. Question: do queen ants give birth to queen ants?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Some schools, however, have opted to compete in a sport at a higher level and are allowed to do so by the NCAA under certain circumstances. First, schools in Divisions II and III are allowed to classify one men's sport and one women's sport as Division I (except for football and basketball), provided that they were sponsoring said sports at Division I level prior to 2011. In addition to this, a lower-division school may compete as a Division I member in a given sport if the NCAA does not sponsor a championship in that sport for the school's own division. Division II schools may award scholarships and operate under Division I rules in their Division I sports. Division III schools cannot award scholarships in their Division I sports (except as noted below), but can operate under most Division I rules in those sports. Question: can a school be division 1 and 2?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Stenotype keyboards enable the trained user to input text as fast as 225 wpm or faster at very high accuracy for an extended period of time, which is sufficient for real-time activities such as court reporting or closed captioning. While dropout rates are very high--in some cases, only 10% or even less graduate--stenotype students are usually able to reach speeds of 100--120 wpm within six months, which is faster than most alphanumeric typists. Guinness World Records gives 360 wpm with 97.23% accuracy as the highest achieved speed using a stenotype. Question: is it possible to type 200 words per minute?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Rite Aid began in 1962, opening its first store in Scranton, Pennsylvania; it was called Thrift D Discount Center. After several years of growth, Rite Aid adopted its current name and debuted as a public company in 1968. As of 2017, Rite Aid is publicly traded on the New York Stock Exchange under the symbol RAD. Its major competitors are CVS and Walgreens. In late 2015, Walgreens announced that it would acquire Rite Aid for $9.4 billion pending approval. However, on June 29, 2017, over fear of antitrust regulations, Walgreens Boots Alliance announced it would buy roughly half of Rite Aid's stores for $5.18 billion. On September 19, 2017, the Federal Trade Commission (FTC) approved a fourth deal agreement to purchase Rite Aid with 1,932 stores for $4.38 billion total. Question: are walgreens and rite aid owned by the same company?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 75"
    },
    {
        "question": "Passage: ``Stop and identify'' statutes are statutory laws in the United States that authorize police to legally obtain the identification of someone whom they reasonably suspect of having committed a crime. If there is no reasonable suspicion that a crime has been committed, is being committed, or is about to be committed, an individual is not required to provide identification, even in ``Stop and ID'' states. Question: do i have to show identification to a police officer?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The 1 Series is BMW's entry level of model range. Unusually for a small car, the 1 Series range is mostly rear-wheel drive, (except for the F52 sedan, which is front-wheel drive) with optional all-wheel drive being available on some models. Question: is a bmw 1 series rear wheel drive?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The Bridge on the River Kwai is a 1957 British-American epic war film directed by David Lean and based on the novel Le Pont de la Rivi\u00e8re Kwa\u00ef (1952) by Pierre Boulle. The film is a work of fiction that uses the historical setting of the construction of the Burma Railway in 1942--1943. The cast included William Holden, Jack Hawkins, and Alec Guinness and Sessue Hayakawa. Question: was the movie bridge on the river kwai a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Sony has licensed Digital8 technology to at least one other firm (Hitachi), which marketed a few models for a while; but as of October 2005 only Sony sells Digital8 consumer equipment. Digital8's main rival is the consumer MiniDV format, which uses narrower tape and a correspondingly smaller cassette shell. Since both technologies share the same logical audio/video format, Digital8 can theoretically equal MiniDV or even DVCAM in A/V performance. But as of 2005, Digital8 has been relegated to the entry-level camcorder market, where price, not performance, is the driving factor. Meanwhile, MiniDV is the de facto standard of the domestic digital tape camcorder market. Question: are mini dv tapes the same as 8mm?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The Emancipation Proclamation has been ridiculed, notably in an influential passage by Richard Hofstadter for ``freeing'' only the slaves over which the Union had no power. These slaves were freed due to Lincoln's ``war powers''. This act cleared up the issue of contraband slaves. It automatically clarified the status of over 100,000 now-former slaves. Some 20,000 to 50,000 slaves were freed the day it went into effect in parts of nine of the ten states to which it applied (Texas being the exception). In every Confederate state (except Tennessee and Texas), the Proclamation went into immediate effect in Union-occupied areas and at least 20,000 slaves were freed at once on January 1, 1863. Question: did the emancipation proclamation apply to all states?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Nacho did not appear in Breaking Bad, but is named in the second season episode ``Better Call Saul'', the same episode that introduced Saul Goodman. In the episode ``Better Call Saul'', Walter White and Jesse Pinkman threaten Saul at gunpoint into representing Badger after Badger is arrested for selling drugs. Unaware who his captors are, Saul tries to pin some type of blame on ``Ignacio'', who is later confirmed by producers to be the same person as the character Nacho Varga who appears in the series Better Call Saul. Question: is nacho from better call saul in breaking bad?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The dairy cow will produce large amounts of milk in its lifetime. Production levels peak at around 40 to 60 days after calving. Production declines steadily afterwards until milking is stopped at about 10 months. The cow is ``dried off'' for about sixty days before calving again. Within a 12 to 14-month inter-calving cycle, the milking period is about 305 days or 10 months long. Among many variables, certain breeds produce more milk than others within a range of around 6,800 to 17,000 kg (15,000 to 37,500 lbs) of milk per year. Question: does a cow produce milk all the time?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a decline in production after the peak period and the cow is dried off for about sixty days before calving again. Therefore the score is: 40"
    },
    {
        "question": "Passage: While corn chips and tortilla chips are both made from corn, the corn in tortilla chips is subjected to the nixtamalization process, resulting in a milder flavor and aroma, and a less rigid texture. Tortilla chips also tend to be larger, thinner, and less salty than corn chips. Question: are corn chips and tortilla chips the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There are several differences between corn chips and tortilla chips, as described in the passage. While both are made from corn, the tortilla chips undergo a nixtamalization process, which results in a different flavor, aroma, and texture. Additionally, tortilla chips are generally larger, thinner, and less salty than corn chips.\\n\\nTherefore the score is: 85"
    },
    {
        "question": "Passage: Diary of a Wimpy Kid is a satirical realistic fiction comedy novel for children and teenagers written and illustrated by Jeff Kinney. It is the first book in the Diary of a Wimpy Kid series. The book is about a boy named Greg Heffley and his struggles to fit in as he begins middle school. Question: are diary of a wimpy kid books graphic novels?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Frozen is a 2010 American thriller-drama film written and directed by Adam Green and starring Emma Bell, Shawn Ashmore and Kevin Zegers. Question: is frozen ski movie based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information given in the passage about whether the movie Frozen is based on a true story. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The U.S. Constitution uses but does not define the phrase ``natural born Citizen'', and various opinions have been offered over time regarding its precise meaning. The consensus of early 21st-century constitutional scholars, together with relevant case law, is that natural-born citizens include, subject to exceptions, those born in the United States. Many scholars have also concluded that those who meet the legal requirements for U.S. citizenship ``at the moment of birth'', regardless of place of birth, are also natural-born citizens. Every president to date was either a citizen at the adoption of the Constitution in 1789 or was born in the United States; of these there have been seven that had at least one parent who was not born on U.S. soil. Question: do you have to be born in us to become president?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Croatia national football team have appeared in the FIFA World Cup on five occasions (in 1998, 2002, 2006, 2014 and 2018) since gaining independence in 1991. Before that, from 1930 to 1990 Croatia was part of Yugoslavia. For World Cup records and appearances in that period, see Yugoslavia national football team and Serbia at the FIFA World Cup. Their best result thus far was silver position at the 2018 final, where they lost 4-2 to France. Question: has croatia ever won the soccer world cup?",
        "pred_ans": " Answer: False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the 1770s, blacks throughout New England began sending petitions to northern legislatures demanding freedom. Five of the Northern self-declared states adopted policies to at least gradually abolish slavery: Pennsylvania (1780), New Hampshire and Massachusetts (1783), Connecticut and Rhode Island (1784). Vermont had abolished slavery in 1777, while it was still independent, and when it joined the United States as the 14th state in 1791, it was the first state to join untainted by slavery. These state jurisdictions thus enacted the first abolition laws in the Americas. By 1804 (including, New York (1799), New Jersey (1804)), all of the northern states had abolished slavery or set measures in place to gradually abolish it. Question: were the new england states free states in 1854?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The Cavaliers--Warriors rivalry is a National Basketball Association (NBA) rivalry between the Cleveland Cavaliers and the Golden State Warriors. While the two teams have played each other since the Cavaliers joined the league in 1970, their rivalry began to develop in the 2014--15 season, when they met in the first of four consecutive NBA Finals series. Prior to the streak beginning, no pair of teams had faced each other in more than two consecutive Finals. Of these four series, the Warriors have won three championships (2015, 2017, and 2018), and the Cavaliers won in 2016. Question: has cleveland ever beat golden state in the finals?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In 1974, cost overruns at the Four Seasons Hotel Vancouver nearly led the company into bankruptcy. As a result, the company began shifting to its current, management-only business model, eliminating costs associated with buying land and buildings. The company went public in 1986. In the 1990s, Four Seasons and Ritz-Carlton began direct competition, with Ritz-Carlton emphasizing a uniform look while Four Seasons emphasized local architecture and styles with uniform service; in the end Four Seasons gained market share. Question: are ritz carlton and four seasons the same company?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A second season aired from April 1 to June 17, 2017 on MBS and other television networks. Funimation and Crunchyroll streamed the second season on their respective websites, while Adult Swim aired a dubbed version. The opening theme song is ``Opfert eure Herzen!'' (\u5fc3\u81d3\u3092\u6367\u3052\u3088!, Shinz\u014d o Sasageyo!, lit. ``Dedicate Your Hearts!'') by Linked Horizon and the ending theme song is ``Y\u016bgure no Tori'' (\u5915\u66ae\u308c\u306e\u9ce5) by Shinsei Kamattechan. Question: does attack on titan have a season 2?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Mamma Mia! Here We Go Again was announced on May 19, 2017, with a release date of July 20, 2018. It was written and directed by Ol Parker. On September 27, 2017, Benny Andersson confirmed 3 ABBA songs that would be featured in the film: ``When I Kissed the Teacher,'' ``I Wonder (Departure),'' and ``Angeleyes.'' ``I Wonder (Departure)'' was cut from the film, but is included on the soundtrack album. Question: does the new mama mia have abba music?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Canadian law requires that all persons entering Canada must carry proof of both citizenship and identity. A valid U.S. passport or passport card is preferred, although a birth certificate, naturalization certificate, citizenship certificate, or another document proving U.S. nationality, together with a government-issued photo ID (such as a driver's license) are acceptable to establish identity and nationality. However, the documents required to return to the United States can be more restrictive (for example, a birth certificate and photo ID are insufficient) -- see the section below on Return entry into the U.S. Question: can i get into canada with a birth certificate?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Nickelodeon announced a new 2D animated series based on the franchise, which will debut in July 2018. Question: is teenage mutant ninja turtles still on tv?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Some schools around America were integrated before the mid-20th century, the first ever school being Lowell High School in Massachusetts, which has accepted students of all races at its inception. The earliest known African American student, Caroline Van Vronker, attended the school in 1843. The integration of all American schools was a major catalyst for the civil rights action and racial violence that occurred in the United States during the latter half of the 20th century. Question: was integration the rule in the northern states?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is not enough information in the passage to determine if integration was the rule in the northern states. The passage states that some schools in America were integrated before the mid-20th century, and Lowell High School in Massachusetts is mentioned as an example. However, it does not provide information about the prevalence of integration in the northern states as a whole. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The current Claret Jug was first awarded to Walter Hagen for winning the 1928 Open. The winner must return the trophy before the next year's Open, and receives a replica to keep permanently. Three other replicas exist: one in the British Museum of Golf at St Andrews, and two used for travelling exhibitions. Question: does the winner of the british open get to keep the jug?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Plans for any future installments for the series have been shelved. Director D.J. Caruso confirmed that he would like to direct a sequel, but in an interview with MTV Hollywood Crush Lore has stated that any questions or requests for a sequel should be directed to producer Michael Bay. Question: are there going to be any more i am number four movies?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: From and to the spinal cord are projections of the peripheral nervous system in the form of spinal nerves (sometimes segmental nerves). The nerves connect the spinal cord to skin, joints, muscles etc. and allow for the transmission of efferent motor as well as afferent sensory signals and stimuli. This allows for voluntary and involuntary motions of muscles, as well as the perception of senses. All in all 31 spinal nerves project from the brain stem, some forming plexa as they branch out, such as the brachial plexa, sacral plexa etc. Each spinal nerve will carry both sensory and motor signals, but the nerves synapse at different regions of the spinal cord, either from the periphery to sensory relay neurons that relay the information to the CNS or from the CNS to motor neurons, which relay the information out. Question: are sensory neurons part of the central nervous system?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. According to the passage, sensory neurons are part of the peripheral nervous system, not the central nervous system. Therefore the score is: 50"
    },
    {
        "question": "Passage: When the ocean floor has a gradual slope, the wave will steepen until the crest becomes unstable, resulting in turbulent whitewater spilling down the face of the wave. This continues as the wave approaches the shore, and the wave's energy is slowly dissipated in the whitewater. Because of this, spilling waves break for a longer time than other waves, and create a relatively gentle wave. Onshore wind conditions make spillers more likely. Question: is the turbulent water created by breaking waves?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. The correct answer should be \\\"Yes\\\". Therefore the score is: 80."
    },
    {
        "question": "Passage: The Thompson submachine gun is an American submachine gun, invented by John T. Thompson in 1918, that became infamous during the Prohibition era, becoming a signature weapon of various police syndicates in the United States. It was a common sight in the media of the time, being used by both law enforcement officers and criminals. The Thompson submachine gun was also known informally as the ``Tommy Gun'', ``Annihilator'', ``Chicago Typewriter'', ``Chicago Piano'', ``Chicago Style'', ``Chicago Organ Grinder'', ``Trench Broom'', ``Trench Sweeper'', ``The Chopper'', and simply ``The Thompson''. Question: is the tommy gun a sub machine gun?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The 1936 Montreux Convention provides for a free passage of civilian ships between the international waters of the Black and the Mediterranean Seas. However, a single country (Turkey) has a complete control over the straits connecting the two seas. The 1982 amendments to the Montreux Convention allow Turkey to close the Straits at its discretion in both wartime and peacetime. Question: can you get to the black sea from the mediterranean?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The film received positive reviews from critics, and was nominated for four Academy Awards: Best Picture, Best Supporting Actor for Michael Clarke Duncan, Best Sound, and Best Screenplay Based on Material Previously Produced or Published. Question: was the green mile nominated for any oscars?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In high-energy particle colliders, matter creation events have yielded a wide variety of exotic heavy particles precipitating out of colliding photon jets (see two-photon physics). Currently, two-photon physics studies creation of various fermion pairs both theoretically and experimentally (using particle accelerators, air showers, radioactive isotopes, etc.). Question: is it possible to create mass from energy?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The chocolate bar is structured in two layers; a lightly-whipped nougat layer, with a lower layer of cereal 'crispies', these are then coated in milk chocolate. Originally the bar contained raisins within the base layer; however, consumer research in the mid-1980s led to these being removed and the current formulation being introduced. Television adverts in the 1970s featured Willie Rushton before a mascot named Dougie the Double Decker Dog was introduced. Question: did a double deckers have raisins in it?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Whole-wheat flour (in the US) or wholemeal flour (in the UK) is a powdery substance, a basic food ingredient, derived by grinding or mashing the whole grain of wheat, also known as the wheatberry. Whole-wheat flour is used in baking of breads and other baked goods, and also typically mixed with other lighter ``white'' unbleached or bleached flours (that have been treated with flour bleaching agent(s)) to restore nutrients to the white flours (especially fiber, protein, and vitamins), texture, and body that are lost in milling and other processing to the finished baked goods or other food(s). Question: is wholemeal flour the same as whole wheat?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The beetroot is the taproot portion of the beet plant, usually known in North America as the beet, also table beet, garden beet, red beet, or golden beet. It is one of several of the cultivated varieties of Beta vulgaris grown for their edible taproots and their leaves (called beet greens). These varieties have been classified as B. vulgaris subsp. vulgaris Conditiva Group. Question: are the leaves of the beet plant edible?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: An acute triangle is a triangle with all three angles acute (less than 90\u00b0). An obtuse triangle is one with one obtuse angle (greater than 90\u00b0) and two acute angles. Since a triangle's angles must sum to 180\u00b0, no triangle can have more than one obtuse angle. Question: is it possible to have a triangle with two obtuse angles?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There are three types of triangles based on the measure of their angles:\\n\\n1. Acute triangle: all three angles are acute (less than 90\u00b0)  \\n2. Obtuse triangle: one angle is obtuse (greater than 90\u00b0) and the other two are acute  \\n3. Right triangle: one angle is 90\u00b0 and the other two are acute\\n\\nSince a triangle's angles must sum to 180\u00b0, it is impossible to have a triangle with two obtuse angles.\\n\\nTherefore the score is: 100"
    },
    {
        "question": "Passage: The frontier walls built by different dynasties have multiple courses. Collectively, they stretch from Dandong in the east to Lop Lake in the west, from present-day Sino-Russian border in the north to Qinghai in the south; along an arc that roughly delineates the edge of Mongolian steppe. A comprehensive archaeological survey, using advanced technologies, has concluded that the walls built by the Ming dynasty measure 8,850 km (5,500 mi). This is made up of 6,259 km (3,889 mi) sections of actual wall, 359 km (223 mi) of trenches and 2,232 km (1,387 mi) of natural defensive barriers such as hills and rivers. Another archaeological survey found that the entire wall with all of its branches measures out to be 21,196 km (13,171 mi). Today, the Great Wall is generally recognized as one of the most impressive architectural feats in history. Question: is the great wall of china all around china?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Worldwide, there were 24.452 million NGVs by 2016, led by China (5.0 million), Iran (4.00 million), India (3.045 million), Pakistan (3.0 million), Argentina (2.295 million), Brazil (1.781 million), and Italy (1.001 million). The Asia-Pacific region leads the world with 6.8 million vehicles, followed by Latin America with 4.2 million. In Latin America, almost 90% of NGVs have bi-fuel engines, allowing these vehicles to run on either gasoline or CNG. In Pakistan, almost every vehicle converted to (or manufactured for) alternative fuel use typically retains the capability of running on gasoline. Question: are there cars that run on natural gas?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Showtime announced 13 April 2009, that it had renewed the show for a fourth and final season. The network ordered 10 episodes that were first broadcast on 11 April 2010. The series finale was broadcast on 20 June 2010. The final season was shown in Canada on CBC starting 22 September 2010, and ending on 23 November 2010. Question: is there a season 5 of the tudors?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The blue whale (Balaenoptera musculus) is a marine mammal belonging to the baleen whale parvorder, Mysticeti. At up to 30 metres (98 ft) in length and with a maximum recorded weight of 173 tonnes (190 short tons), it is the largest animal known to have ever existed. Question: is the blue whale the largest animal ever?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " \u9996\u5148\uff0c\u6211\u9700\u8981\u9605\u8bfb\u60a8\u7ed9\u51fa\u7684\u95ee\u9898\u548c\u5b66\u751f\u7684\u7b54\u6848\u3002\\n\\n\u95ee\u9898\uff1a```Passage: The blue whale (Balaenoptera musculus) is a marine mammal belonging to the baleen whale parvorder, Mysticeti. At up to 30 metres (98 ft) in length and with a maximum recorded weight of 173 tonnes (190 short tons), it is the largest animal known to have ever existed. Question: is the blue whale the largest animal ever?```\\n\\n\u5b66\u751f\u7684\u7b54\u6848\uff1a``` True```\\n\\n\u6839\u636e\u95ee\u9898\u4e2d\u7684\u6bb5\u843d\uff0c\u5b66\u751f\u7684\u7b54\u6848\u662f\u6b63\u786e\u7684\u3002\u56e0\u6b64\uff0c\u5206\u6570\u662f\uff1a100"
    },
    {
        "question": "Passage: Section 1. No person shall be elected to the office of the President more than twice, and no person who has held the office of President, or acted as President, for more than two years of a term to which some other person was elected President shall be elected to the office of the President more than once. But this Article shall not apply to any person holding the office of President when this Article was proposed by the Congress, and shall not prevent any person who may be holding the office of President, or acting as President, during the term within which this article becomes operative from holding the office of President or acting as President during the remainder of such term. Question: can an american president run for 3 terms?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Nearly two-thirds of the valley's land area is part of the city of Los Angeles. The other incorporated cities in the valley are Glendale, Burbank, San Fernando, Hidden Hills, Agoura Hills, and Calabasas. Question: is pasadena part of the san fernando valley?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Climate change is now a major political talking point in Australia in the last two decades. Persistent drought, and resulting water restrictions during the first decade of the twenty-first century, are an example of natural events' tangible effect on economic and political realities . Question: are there any major water concerns for australia?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Home Park, previously known as the Little Park (and originally Lydecroft Park), is a private 655-acre (265 ha) Royal park, administered by the Crown Estate. It lies on the eastern side of Windsor Castle in the town and former civil parish of Windsor in the English county of Berkshire. Question: is home park windsor open to the public?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: There is no state law prohibiting consumption of alcohol by minors while on private property, but many municipalities prohibit underage consumption unless parents or adult relatives are present. Public schools are not permitted to have ``24/7'' conduct policies which sanction students for alcohol consumption outside of school. Minors are allowed to enter licensed establishments, and while state law does not prohibit bars and nightclubs from having events such as ``teen nights,'' or ``18 to party, 21 to drink,'' some municipalities impose restrictions. It is legal for a person under 21 to be in a location where underage drinking is occurring, and New Jersey does not have an ``internal possession'' statute criminalizing underage drinking after the fact. Question: can you go into a liquor store under 21 in nj?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Like the thymus, the spleen possesses only efferent lymphatic vessels. The spleen is part of the lymphatic system. Both the short gastric arteries and the splenic artery supply it with blood. Question: is the spleen part of the circulatory system?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Criminal Code creates criminal offences with respect to different aspects of hate propaganda. Those offences are decided in the criminal courts and carry penal sanctions, such as fines, probation orders and imprisonment. The federal government also has standards with respect to hate publications in federal laws relating to broadcasting. Question: can you be jailed in canada for offensive speech?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: ``American statutes typically provide that, in absence of issue and subject to the share of a surviving spouse, intestate property passes to the parents or to the surviving parent of the decedent''. Under the civil law system of computation and its various modified forms that are widely adopted by statute in the United States, ``a claimant's degree of kinship is the total of (1) the number of the steps, counting one from each generation, from the decedent up to the nearest common ancestor of the decedent and the claimant, and (2) the number of steps from the common ancestor down to the claimant.'' ``The claimant having the lowest degree count (i.e., the nearest or next of kin) is entitled to the property.'' ``If there are two or more claimants who stand in equal degree of kinship to the decedent, they share per capita.'' Question: can you choose who is your next of kin?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Central Bank of India has approached the Reserve Bank of India (RBI) for permission to open representative offices in five more locations - Singapore, Dubai, Doha and London. Question: is central bank of india and reserve bank of india same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The United States Department of Justice (DOJ), also known as the Justice Department, is a federal executive department of the U.S. government, responsible for the enforcement of the law and administration of justice in the United States, equivalent to the justice or interior ministries of other countries. The department was formed in 1870 during the Ulysses S. Grant administration. Question: is the justice department part of the judicial branch?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In most jurisdictions, the theft of traffic signage is treated like any other theft with respect to prosecution and sentencing. If, however, the theft leads to an injury, then the thieves may be found criminally liable for the injury as well, provided that an injury of that sort was a foreseeable consequence of such a theft. In one notable United States case, three people were found guilty of manslaughter for stealing a stop sign and thereby causing a deadly collision. This was publicized in the novel Driver's Ed by Caroline B. Cooney. Question: is it illegal to take a stop sign?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Baby back ribs (also back ribs or loin ribs) are taken from the top of the rib cage between the spine and the spare ribs, below the loin muscle. They have meat between the bones and on top of the bones, and are shorter, curved, and sometimes meatier than spare ribs. The rack is shorter at one end, due to the natural tapering of a pig's rib cage. The shortest bones are typically only about 3 in (7.6 cm) and the longest is usually about 6 in (15 cm), depending on the size of the hog. A pig side has 15 to 16 ribs (depending on the breed), but usually two or three are left on the shoulder when it is separated from the loin. So, a rack of back ribs contains a minimum of eight ribs (some may be trimmed if damaged), but can include up to 13 ribs, depending on how it has been prepared by the butcher. A typical commercial rack has 10--13 bones. If fewer than 10 bones are present, butchers call them ``cheater racks''. Question: are pork loin ribs the same as baby back?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Fish and shellfish concentrate mercury in their bodies, often in the form of methylmercury, a highly toxic organic compound of mercury. Fish products have been shown to contain varying amounts of heavy metals, particularly mercury and fat-soluble pollutants from water pollution. Species of fish that are long-lived and high on the food chain, such as marlin, tuna, shark, swordfish, king mackerel and tilefish (Gulf of Mexico) contain higher concentrations of mercury than others. Question: does tuna have a lot of mercury in it?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In the United Kingdom and the Crown dependencies, any household watching or recording live television transmissions as they are being broadcast (terrestrial, satellite, cable, or Internet) is required to hold a television licence. Businesses, hospitals, schools and a range of other organisations are also required to hold television licences to watch and record live TV broadcasts. A television licence is also required to receive video on demand programme services provided by the BBC, on the iPlayer catch-up service. Question: do you need a license to watch tv in england?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Liquefied petroleum gas or liquid petroleum gas (LPG or LP gas), also referred to as simply propane or butane, are flammable mixtures of hydrocarbon gases used as fuel in heating appliances, cooking equipment, and vehicles. Question: is liquid petroleum gas the same as propane?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Advanced maternal age is associated with adverse reproductive effects such as increased risk of infertility, and that the children have chromosomal abnormalities. The corresponding paternal age effect is less pronounced. Question: does a woman's age affect birth defects?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: As with the previous installments of the franchise, players must mimic the on-screen dancer's choreography to a chosen song using either motion controllers or the game's associated Just Dance Controller app on a smartphone. A new ``Super'' judgment was added between ``Good'' and ``Perfect''. The ``Dance Lab'' mode features medleys of choreography representing different professions and animals, while a new ``Kids Mode'' was designed to provide a gameplay experience and choreography tailored towards younger players. Question: is super better than perfect in just dance?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Although Sam is mentioned occasionally following his departure -- most notably calling Josh to tell him to ``roll with the punches'' after the latter unwittingly caused the defection of a Democratic Senator -- he is not seen in the series until the last episodes of the seventh and final season, following the election of Congressman Matt Santos as President. Resolving the debate over the result of the California 47th's special election, it is implied that Sam was defeated by Congressman Webb and declined the promotion to Senior Counselor to the President that had been suggested by Toby. After summarily quitting politics, Sam remained in his home state of California and joined an unnamed law firm in Los Angeles which pays him a salary that would ``make (Josh) puke''. Question: does sam come back to the west wing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Full-size car is a marketing term used in North America for an automobile larger than a mid-size car. Traditional U.S. full-size passenger cars were designed to be comfortable for six occupants and their luggage for long-distance driving. The United States Environmental Protection Agency (EPA) currently uses the term large car to denote full-size cars based on their combined interior passenger and luggage volume. Question: is a midsize car bigger than a full size?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passage clearly states that full-size cars are larger than mid-size cars. Therefore the score is: 60"
    },
    {
        "question": "Passage: Twenty-eight states have national parks, as do the territories of American Samoa and the United States Virgin Islands. California has the most (nine), followed by Alaska (eight), Utah (five), and Colorado (four). The largest national park is Wrangell--St. Elias in Alaska: at over 8 million acres (32,375 km), it is larger than each of the nine smallest states. The next three largest parks are also in Alaska. The smallest park is Gateway Arch National Park, Missouri, at approximately 192.83 acres (0.7804 km). The total area protected by national parks is approximately 52.2 million acres (211,000 km), for an average of 870 thousand acres (3,500 km) but a median of only 229 thousand acres (930 km). Question: is there a national park in all 50 states?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: For the runners, an infield fly is little different from an ordinary fly ball. If an infield fly is caught, the runners must retouch their original bases (``tag up'') after the catch before attempting to advance. If an infield fly is not caught, no tag up is required and the runners may advance at their own risk. The only difference is that the umpire's declaration that the batter is out removes force plays and gives runners the option of staying on the base. Question: can base runners advance on an infield fly rule?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a crucial detail missing from the student's answer. While it is true that runners can advance on an infield fly rule, it is important to mention that they must retouch their original bases (tag up) after the catch before attempting to advance. Therefore, the score is: 70/100"
    },
    {
        "question": "Passage: A non-disclosure agreement (NDA), also known as a confidentiality agreement (CA), confidential disclosure agreement (CDA), proprietary information agreement (PIA) or secrecy agreement (SA), is a legal contract between at least two parties that outlines confidential material, knowledge, or information that the parties wish to share with one another for certain purposes, but wish to restrict access to or by third parties. The most common forms of these are in doctor--patient confidentiality (physician--patient privilege), attorney--client privilege, priest--penitent privilege, and bank--client confidentiality agreements. Question: is there a difference between a confidentiality agreement and a non disclosure agreement?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: On May 29, 2008 an Australian woman, Meera Thangarajah (age 34), who had an ectopic pregnancy in the ovary, gave birth to a healthy full term 6 pound 3 ounce (2.8 kg) baby girl, Durga, via caesarean section. She had no problems or complications during the 38\u2010week pregnancy. Question: can an ectopic pregnancy be carried to term?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a crucial detail missing from the student's answer. While it is true that the woman in the passage had an ectopic pregnancy, it is important to note that it was not carried to term. An ectopic pregnancy, by definition, occurs when a fertilized egg implants outside the uterus, usually in the fallopian tube. This condition is usually life-threatening and cannot be carried to term, as the embryo cannot develop properly outside the uterus. In this case, the woman had the ectopic pregnancy surgically removed and then gave birth to a healthy baby girl.\\n\\nTherefore, the score is: 50/100"
    },
    {
        "question": "Passage: After mating, the female spends extra time basking to keep her eggs warm. She may also have a change of diet, eating only certain foods, or not eating as much as she normally would. A female can lay between two and 30 eggs depending on body size and other factors. One female can lay up to five clutches in the same year, and clutches are usually spaced 12 to 36 days apart. The time between mating and egg-laying can be days or weeks. The actual egg fertilization takes place during the egg-laying. This process also permits the laying of fertile eggs the following season, as the sperm can remain viable and available in the female's body in the absence of mating. During the last weeks of gestation, the female spends less time in the water and smells and scratches at the ground, indicating she is searching for a suitable place to lay her eggs. The female excavates a hole, using her hind legs, and lays her eggs in it. Question: do red slider turtles lay eggs in water?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mention in the passage that the female red slider turtle excavates a hole using her hind legs and lays her eggs in it. Since the passage does not specify if the hole is in water or not, the student's answer of 'False' cannot be fully confirmed. However, based on the information provided, it can be inferred that red slider turtles might not always lay eggs in water. Therefore the score is: 70"
    },
    {
        "question": "Passage: Seoul (/so\u028al/, like soul; Korean: \uc11c\uc6b8 (s\u028c.ul) ( listen); lit. ``Capital''), officially the Seoul Special City, is the capital and largest metropolis of South Korea. With surrounding Incheon metropolis and Gyeonggi province, Seoul forms the heart of the Seoul Capital Area, home to roughly half of the country's population. Seoul is ranked as the fourth largest metropolitan economy in the world and is larger than London and Paris. Question: is seoul the largest city in the world?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There are not enough details in the passage to determine if Seoul is the largest city in the world, so the student's answer cannot be accurately assessed. Therefore the score is: 50."
    },
    {
        "question": "Passage: Salted duck eggs are normally boiled or steamed before being peeled and eaten as a condiment to congee or cooked with other foods as a flavoring. The egg white has a sharp, salty taste. The orange red yolk is rich, fatty, and less salty. The yolk is prized and is used in Chinese mooncakes to symbolize the moon. Question: do salted duck eggs need to be cooked?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no clear Yes/No answer to the question \\\"do salted duck eggs need to be cooked?\\\" from the given passage. The passage describes that salted duck eggs are normally boiled or steamed before being peeled and eaten, but it doesn't explicitly state that they must be cooked. Therefore, the score is: 50/100"
    },
    {
        "question": "Passage: Founded in the United States between 1980 and 1990, the American Bully was produced using a foundation of American Staffordshire Terriers and American Pit Bull Terriers bred to several bulldog-type breeds. It was created with the purpose of being a family companion dog. Question: is an american bully the same as an american bulldog?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The story begins nineteen years after the events of Harry Potter and the Deathly Hallows and follows Harry Potter, now a Ministry of Magic employee, and his younger son Albus Severus Potter, who is about to attend Hogwarts School of Witchcraft and Wizardry. Question: is harry potter and the cursed child a prequel?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Spencer joined the A-Team briefly near the ending of the third season after having been invited by Mona at the Radley while hospitalized. Initially, Spencer was extremely determined to be part of the team. However, she later unfolds the truth behind the disappearance of Toby and became a double agent as well. Likewise Toby, she got kicked off from the team. She is the ``A'' who kidnapped Malcolm, causing a break up between Ezra and Aria. Question: is spencer on the a team pretty little liars?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A dry thunderstorm or heat storm, is a thunderstorm that produces thunder and lightning, but most or all of its precipitation evaporates before reaching the ground, and dry lightning is the term which is used to refer to lightning strikes occurring in this situation. Both are so common in the American West that they are sometimes used interchangeably. The latter term is a technical misnomer since lightning itself is neither wet nor dry. Question: does it always rain when there's lightning?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no mention of the score in the passage."
    },
    {
        "question": "Passage: In informal games, it is customary to announce ``check'' when making a move that puts the opponent's king in check. In formal competitions, however, check is rarely announced. Question: is it required to say check in chess?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Year zero does not exist in the Anno Domini system usually used to number years in the Gregorian calendar and in its predecessor, the Julian calendar. In this system, the year 1 BC is followed by AD 1. However, there is a year zero in astronomical year numbering (where it coincides with the Julian year 1 BC) and in ISO 8601:2004 (where it coincides with the Gregorian year 1 BC) as well as in all Buddhist and Hindu calendars. Question: is there a year 0 in the gregorian calendar?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Canadian law requires that all persons entering Canada must carry proof of both citizenship and identity. A valid U.S. passport or passport card is preferred, although a birth certificate, naturalization certificate, citizenship certificate, or another document proving U.S. nationality, together with a government-issued photo ID (such as a driver's license) are acceptable to establish identity and nationality. However, the documents required to return to the United States can be more restrictive (for example, a birth certificate and photo ID are insufficient) -- see the section below on Return entry into the U.S. Question: can you get into canada with just a drivers license?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Scarlett is a 1991 novel by Alexandra Ripley, written as a sequel to Margaret Mitchell's 1936 novel, Gone with the Wind. The book debuted on The New York Times bestsellers list, but both critics and fans of the original novel found Ripley's version to be inconsistent with the literary quality of Gone with the Wind. Question: is there a second gone with the wind?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Although Ireland is a member of the European Union, it has an opt-out from the Schengen Area and is therefore able to set its own visa policy. Ireland also operates the Common Travel Area with the United Kingdom, the Channel Islands and the Isle of Man which allows for open internal borders between the countries and territories. Established in 1923, it permits British and Irish citizens to freely move around the Common Travel Area with minimal identity documents. Question: can we travel to ireland with schengen visa?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Thereferee's answer: False\\n\\nTherefore the score is: 80"
    },
    {
        "question": "Passage: Sales are prohibited on Memorial Day, Independence Day, Labor Day, Easter, Thanksgiving and Christmas unless the local unit of government has voted to allow Sunday Sales. If Sunday Sales are allowed, sales are prohibited only on Easter Sunday, Christmas and Thanksgiving. Sales are prohibited between 11:00 PM and 9:00 AM. Cities and counties which allow off-premises sales are prohibited from allowing Sunday liquor sales after 8:00 PM, but may not require retail liquor stores to close before 8:00 PM on other days. No sales are allowed at less than cost. All employees must be at least 21 years of age. Question: can you buy beer on easter sunday in kansas?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The economy of the city is mostly dominated by its financial centre, port facilities, tourism and the manufacturing sector which include textiles, chemicals, plastics and pharmaceuticals. Port Louis is home to the biggest port facility in the Indian Ocean region and one of Africa's major financial centers. Question: is port louis mauritius in the atlantic or pacific ocean?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A table of contents usually includes the titles or descriptions of the first-level headers, such as chapter titles in longer works, and often includes second-level or section titles (A-heads) within the chapters as well, and occasionally even third-level titles (subsections or B-heads). The depth of detail in tables of contents depends on the length of the work, with longer works having less. Formal reports (ten or more pages and being too long to put into a memo or letter) also have a table of contents. Within an English-language book, the table of contents usually appears after the title page, copyright notices, and, in technical journals, the abstract; and before any lists of tables or figures, the foreword, and the preface. Question: does the foreword go before the table of contents?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The Houston Texans joined the league at the 2002 NFL season, playing at the newly founded Reliant Stadium. With their opening game victory over the Dallas Cowboys that season, the team became the first expansion team to win its opening game since the Minnesota Vikings beat the Chicago Bears in 1961. While the team struggled in early seasons, results began to improve once native Houstonian Gary Kubiak became the head coach in 2006. The Texans finished with a .500 season (8-8) in both 2007 and 2008, and nearly qualified for the 2009--10 NFL playoffs with a 9--7 result in 2009. In 2010, the team started the season on a 4--2 record going into a Week 7 bye week, but promptly collapsed 2--8 in the second part of the season, finishing 6--10. In the 2011 NFL Draft, the Texans acquired Wisconsin star defensive end J.J. Watt eleventh overall. The following season, former Cowboys head coach Wade Phillips was hired as the defensive coordinator of the Texans, and the improved defense led to the Texans finishing 10--6, winning their first AFC South title. The Texans then beat wild card Cincinnati Bengals 31--10 in the first round of the 2011--12 NFL playoffs, before a 20--13 defeat by the Ravens in the semifinals. Question: have the houston texans ever won a playoff game?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In basketball, goaltending is the violation of interfering with the ball while it is on its way to the basket and it is (a) in a downward flight, (b) entirely above the rim and has the possibility of entering the basket, and (c) not touching the rim. In NCAA basketball, WNBA and NBA basketball, goaltending is also called if the ball has already touched the backboard while being above the height of the rim in its flight, regardless of whether it being in an upward or downward flight or whether it is directly above the rim. Goaltending in this context defines by exclusion what is considered a legal block of a field goal. In high school and NCAA basketball, goaltending is also called when a player interferes with a free throw at any time in its flight towards the basket. If goaltending is called for interference with a field goal, the shooting team is awarded the points for the field goal as if it had been made. In high school and NCAA basketball, if goaltending is called on a free throw, the shooting team is awarded one point and a technical foul is called against the offending player. Question: is it goaltending if the ball hits the backboard below the rim?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Shape of Water is a 2017 American romantic dark fantasy drama film directed by Guillermo del Toro and written by del Toro and Vanessa Taylor. It stars Sally Hawkins, Michael Shannon, Richard Jenkins, Doug Jones, Michael Stuhlbarg, and Octavia Spencer. Set in Baltimore, Maryland in 1962, the story follows a mute custodian at a high-security government laboratory who falls in love with a captured humanoid amphibian creature. Filming took place in Ontario, Canada, between August and November 2016. Question: is the shape of water movie based on a book?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In optics, a virtual image is an image formed when the outgoing rays from a point on an object always diverge. The image appears to be located at the point of apparent divergence. Because the rays never really converge, a virtual image cannot be projected onto a screen. In diagrams of optical systems, virtual rays are conventionally represented by dotted lines. Virtual images are located by tracing the real rays that emerge from an optical device (lens, mirror, or some combination) backward to a perceived point of origin. Question: can an image be formed without a screen?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: RACI is an acronym derived from the four key responsibilities most typically used: Responsible, Accountable, Consulted, and Informed. Question: can you be responsible and accountable in a raci?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Fear the Walking Dead is an American post-apocalyptic horror drama television series created by Robert Kirkman and Dave Erickson, that premiered on AMC on August 23, 2015. It is a companion series and prequel to The Walking Dead, which is based on the comic book series of the same name by Robert Kirkman, Tony Moore, and Charlie Adlard. Question: is fear the walking dead the same as the walking dead?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Citadel, The Military College of South Carolina, commonly referred to simply as The Citadel, is a state-supported, comprehensive college located in Charleston, South Carolina, United States. Established in 1842, it is one of six United States senior military colleges. It has 18 academic departments divided into five schools offering 29 majors and 38 minors. The military program consists of cadets pursuing bachelor's degrees who live on campus, while civilian degrees are offered through 8 undergraduate and 24 graduate programs. Question: do you have to go into the military if you go to the citadel?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a misconception in the student's answer. The Citadel is a senior military college, which means that it offers a military program for cadets pursuing bachelor's degrees who live on campus. However, it is not compulsory for all students to join the military. The Citadel also offers civilian degrees through undergraduate and graduate programs.\\n\\nTherefore, the score is: 70"
    },
    {
        "question": "Passage: Many reef aquarium keepers use reverse osmosis systems for their artificial mixture of seawater. Ordinary tap water can contain excessive chlorine, chloramines, copper, nitrates, nitrites, phosphates, silicates, or many other chemicals detrimental to the sensitive organisms in a reef environment. Contaminants such as nitrogen compounds and phosphates can lead to excessive and unwanted algae growth. An effective combination of both reverse osmosis and deionization is the most popular among reef aquarium keepers, and is preferred above other water purification processes due to the low cost of ownership and minimal operating costs. Where chlorine and chloramines are found in the water, carbon filtration is needed before the membrane, as the common residential membrane used by reef keepers does not cope with these compounds. Question: is reverse osmosis water the same as deionized?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Google Drive offers users 15 gigabytes of free storage, with 100 gigabytes, 1 terabyte, 2 terabytes, 10 terabytes, 20 terabytes, and 30 terabytes offered through optional paid plans. Files uploaded can be up to 5 terabytes in size. Users can change privacy settings for individual files and folders, including enabling sharing with other users or making content public. On the website, users can search for an image by describing its visuals, and use natural language to find specific files, such as ``find my budget spreadsheet from last December''. Question: is there a storage limit on google drive?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a storage limit on Google Drive. The passage states that users are offered different storage plans, such as 15 gigabytes for free, and larger storage options through paid plans. Additionally, the passage mentions that files uploaded can be up to 5 terabytes in size. Therefore, the score is: 100"
    },
    {
        "question": "Passage: After returning to Earth, Watney becomes a survival instructor for astronaut candidates. Five years later, on the occasion of the Ares V mission launch, those involved in Watney's rescue have begun new lives. Question: does he make it home in the martian?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is not enough information to determine if the student's answer is correct or not. Please provide the context or passage from the book \\\"The Martian\\\" to accurately evaluate the student's answer."
    },
    {
        "question": "Passage: The hair hang is an aerial circus act where performers (usually young women) are suspended by their hair, acrobatic poses and/or manipulation. Some believe the act originated in South America; others claim the act hails from China. Performers are literally hanging by their hair, which is tied into a hairhang rig; the techniques used to tie the performer's hair, and the acrobatic techniques involved in the act, are key. Question: can you pick someone up by their hair?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Federal law requires background checks for commercial gun sales, but an exemption exists for private-party sales whereby ``any person may sell a firearm to an unlicensed resident of the state where they reside, as long as they do not know or have reasonable cause to believe the person is prohibited from receiving or possessing firearms''. Under federal law, private-party sellers are not required to perform background checks on buyers, record the sale, or ask for identification, whether at a gun show or other venue. This is in contrast to sales by gun stores and other Federal Firearms License (FFL) holders, who are required to perform background checks and record all sales on almost all buyers, regardless of whether the venue is their business location or a gun show. Some states have passed laws to require background checks for private sales with limited exceptions. Access to the National Instant Criminal Background Check System (NICS) is limited to FFL holders. Question: can you buy a gun from a gun show without a background check?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Non-licensees and all users of long guns have much stricter rules for carrying firearms in their vehicles. Ohio statute O.R.C. 2923.16 allows for three ways for those not licensed to carry a concealed handgun to transport firearms in a motor vehicle. The firearm(s) must be unloaded and carried in one of the following ways: Question: can you carry a gun in your car in ohio?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Hi-C is a fruit juice-flavored drink made by the Minute Maid division of The Coca-Cola Company. It was created by Niles Foster in 1946 and released in 1947. The sole original flavor was orange. Question: does hi-c orange have caffeine in it?",
        "pred_ans": " Response: False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 2011, screenwriter Noxon told Collider.com that plans for an imminent sequel were shelved due to the disappointing performance of the first installment at the box office. Question: are there more movies after i am number four?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: This has led to categorizing trucks similarly, even if their payload is different. Therefore, the Toyota Tacoma, Dodge Dakota, Ford Ranger, Honda Ridgeline, Chevrolet S-10, GMC S-15 and Nissan Frontier are called quarter-tons (1\u20444-ton). The Ford F-150, Chevrolet C10/K10, Chevrolet/GMC 1500, Dodge 1500, Toyota Tundra, and Nissan Titan are half-tons (1\u20442-ton). The Ford F-250, Chevrolet C20/K20, Chevrolet/GMC 2500, and Dodge 2500 are three-quarter-tons (3\u20444-ton). Chevrolet/GMC's 3\u20444-ton suspension systems were further divided into light and heavy-duty, differentiated by 5-lug and 6 or 8-lug wheel hubs depending on year, respectively. The Ford F-350, Chevrolet C30/K30, Chevrolet/GMC 3500, and Dodge 3500 are one tons (1-ton). Question: is a nissan frontier a half ton truck?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Top Gear presenters go across Burma and Thailand in lorries with the goal of building a bridge over the river Kwai. After building a bridge over the Kok River, Clarkson is quoted as saying ``That is a proud moment, but there's a slope on it.'' as a native crosses the bridge, 'slope' being a pejorative for Asians. Question: did top gear build a bridge over the river kwai?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In the 1050s and early 1060s William became a contender for the throne of England, then held by the childless Edward the Confessor, his first cousin once removed. There were other potential claimants, including the powerful English earl Harold Godwinson, who was named the next king by Edward on the latter's deathbed in January 1066. William argued that Edward had previously promised the throne to him and that Harold had sworn to support William's claim. William built a large fleet and invaded England in September 1066, decisively defeating and killing Harold at the Battle of Hastings on 14 October 1066. After further military efforts William was crowned king on Christmas Day 1066, in London. He made arrangements for the governance of England in early 1067 before returning to Normandy. Several unsuccessful rebellions followed, but by 1075 William's hold on England was mostly secure, allowing him to spend the majority of the rest of his reign on the continent. Question: did william the conqueror have a legitimate claim to the english throne?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In English, the letter Q is usually followed by the letter U, but there are some exceptions. The majority of these are anglicised from Arabic, Chinese, Hebrew, Inuktitut, or other languages which do not use the English alphabet, with Q representing a sound not found in English. For example, in the Chinese pinyin alphabet, qi is pronounced /t\u0283i/ by an English speaker, as pinyin uses ``q'' to represent the sound (t\u0255h), which is approximated as (t\u0283) in English. In other examples, Q represents (q) in standard Arabic, such as in qat, faqir and Qur'\u0101n. In Arabic, the letter \u0642, traditionally romanised as Q, is quite distinct from \u0643, traditionally romanised as K; for example, \u0642\u0644\u0628 /qalb/ means ``heart'' but \u0643\u0644\u0628 /kalb/ means ``dog''. However, alternative spellings are sometimes accepted which use K (or sometimes C) in place of Q; for example, Koran (Qur'\u0101n) and Cairo (al-Q\u0101hira). Question: are there any words in the english language with a q and no u?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A polar route is an aircraft route across the uninhabited polar ice cap regions. The term ``polar route'' was originally applied to great circle routes between Europe and the west coast of North America in the 1950s. Question: do flight paths go over the north pole?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Holders of visas or residence permits of EU/EFTA/GCC countries, overseas territories of EU countries (except Anguilla, Montserrat, Pitcairn, Saint Helena, Ascension and Tristan da Cunha), Australia, Canada, Israel, Japan, New Zealand, South Korea or the United States do not require a visa for max 90 days in a 180-day period. The visa/residence permit must be valid on arrival to Georgia. However, there have been many cases where those holding valid residency of GCC countries have been denied access without assigning any reason, especially if they are citizens of India and Pakistan. Question: do american citizens need a visa for georgia?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: ``God Save the Queen'' (alternatively ``God Save the King'', depending on the gender of the reigning monarch) is the national or royal anthem in a number of Commonwealth realms, their territories, and the British Crown dependencies. The author of the tune is unknown, and it may originate in plainchant; but an attribution to the composer John Bull is sometimes made. Question: does god save the queen change when there is a king?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: After Margaret Thatcher became Prime Minister in May 1979, the legislation to implement the Right to Buy was passed in the Housing Act 1980. Michael Heseltine, in his role as Secretary of State for the Environment, was in charge of implementing the legislation. Some 6,000,000 people were affected; about one in three actually purchased their housing unit. Heseltine noted that ``no single piece of legislation has enabled the transfer of so much capital wealth from the state to the people''. He said the right to buy had two main objectives: to give people what they wanted, and to reverse the trend of ever-increasing dominance of the state over the life of the individual. Question: could you buy your council house before 1979?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In structural engineering, a shear wall is a structural system composed of braced panels (also known as shear panels) to counter the effects of lateral load acting on a structure. Wind and seismic loads are the most common building codes, including the International Building Code (where it is called a braced wall line) and Uniform Building Code, all exterior wall lines in wood or steel frame construction must be braced. Depending on the size of the building some interior walls must be braced as well. Question: is a shear wall a load bearing wall?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Almost all cup competitions worldwide operate a cup-tied rule, but leagues do not (as leagues do not eliminate teams during the season). Cup-tied players are only prevented from playing in that specific competition, so for example a player who is cup-tied in the FA Cup may still be eligible to play in the League Cup (or vice versa). UEFA competitions are an exception: because teams can switch between the UEFA Champions League and UEFA Europa League during the season, UEFA has a more complex system for determining whether a player is cup-tied in one or both of those competitions. Question: can a player play in the europa league and champions league?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The area was created when a chain of volcanic islands collided with the North American plate. The islands went over the North American plate and created the highlands of New Jersey. Then around 450 million years ago, a small continent collided with proto North America and created folding and faulting in western New Jersey and the southern Appalachians. When the African plate separated from North America, this created an aborted rift system or half-graben. The land lowered between the Ramapo fault in western Parsippany and the fault that was west of Paterson. The Wisconsin Glacier covered area from 21,000 to 13,000 BC. When the glacier melted due to climate change, Lake Passaic was formed, covering all of what is now Lake Hiawatha. Lake Passaic slowly drained and much of the area is swamps or low-lying meadows such as Troy Meadows. The Rockaway River flows over the Ramapo fault in Boonton and then flows along the northwestern edge of Lake Hiawatha. In this area, there are swamps near the river or in the area. Question: is there a lake in lake hiawatha nj?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Italian meal structure is typical of the Mediterranean region and different from meal structure of Northern Europe / Northwestern Europe and Germanic and Slavic Europe, though it still often consists of breakfast, lunch, and supper. However, much less emphasis is placed on breakfast, and breakfast itself is often skipped or involves lighter meal portions than are seen in other non-Mediterranean Western countries. Late-morning and mid-afternoon snacks, called merenda (plural merende), are also often included in this meal structure. Italians also commonly divide a celebratory meal into several different courses. Question: are breakfast lunch and dinner always served in italy?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The exploration of Neptune has only begun with one spacecraft, Voyager 2 in 1989. Currently there are no approved future missions to visit the Neptunian system. NASA, ESA and also independent academic groups have proposed future scientific missions to visit Neptune. Some mission plans are still active, while others have been abandoned or put on hold. Question: have any satellites or robots been to neptune?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Landon builds a telescope for Jamie to see a one-time comet in the springtime. Jamie's father helps him get it finished in time and it is brought to her on the balcony where she gets a beautiful view of the comet. It is then that Landon asks her to marry him. Jamie tearfully accepts, and they get married in the church where her mother got married. They spend their last summer together filled with strong love. Jamie's leukemia ends up killing her when summer ends. Question: does the girl die in a walk to remember?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In nine-ball, except when a push-out has been invoked, a legal shot consists of striking the cue ball into the lowest numbered object ball on the table and subsequently either pocketing an object ball, or driving any ball (including the cue ball) to any rail, otherwise the shot is a foul . Additional conditions apply for the break shot (see below). Object balls do not have to be pocketed in numerical order; Any ball may be pocketed at any time during the game, so long as the lowest-numbered ball is contacted first by the cue ball. Nine-ball is not a call shot game. The 9-ball itself can be legally pocketed for a win at any turn in the game, intentionally or by chance, including the break shot. Conversely, a player could potentially pocket all of the object balls numbered one through eight during the course of the game and lose after the other player pockets only the nine-ball. Question: do you have to call the pocket in 9 ball?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Air Canada Rouge, is a low-cost airline and subsidiary of Air Canada. Air Canada Rouge is fully integrated into the Air Canada mainline and Air Canada Express networks; flights are sold with AC flight numbers but are listed as ``operated by Air Canada Rouge'' (similar to regional flights operated under the Air Canada Express banner). Rouge means ``red'' in French. Question: is air canada rouge the same as air canada?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The wheel and axle can be viewed as a version of the lever, with a drive force applied tangentially to the perimeter of the wheel and a load force applied to the axle, respectively, that are balanced around the hinge which is the fulcrum. The mechanical advantage of the wheel and axle is the ratio of the distances from the fulcrum to the applied loads, or what is the same thing the ratio of the diameter of the wheel and axle. A major application is in wheeled vehicles, in which the wheel and axle are used to reduce friction of the moving vehicle with the ground. Other examples of devices which use the wheel and axle are capstans, belt drives and gears. Question: can a wheel and axle be called a lever?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The DeCavalcante crime family is an Italian-American organized crime family that operates in Elizabeth, New Jersey, and surrounding areas in the state and is part of the nationwide criminal phenomenon known as the American Mafia (or Cosa Nostra). It operates on the other side of the Hudson River from the Five Families of New York, but it maintains strong relations with many of them, as well as with the Philadelphia crime family and the Patriarca crime family of New England. Its illicit activities include bookmaking, building, cement, and construction violations, bootlegging, corruption, drug trafficking, extortion, fencing, fraud, hijacking, illegal gambling, loan-sharking, money laundering, murder, pier thefts, pornography, prostitution, racketeering, and waste management violations. The DeCavalcantes are, in part, the inspiration for the fictional DiMeo crime family of HBO's dramatic series The Sopranos. The DeCavalcante family was the subject of the CNBC program Mob Money, which aired on June 23, 2010, and The Real Sopranos TV documentary (first airdate April 26, 2006) directed by Thomas Viner for the UK production company Class Films. Question: is the mafia still active in new jersey?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Senate voted to acquit Chase of all charges on March 1, 1805. There were 34 Senators present (25 Republicans and 9 Federalists), and 23 votes were needed to reach the required two-thirds majority. Of the eight votes cast, the closest vote was 18 for impeachment and 16 for acquittal in regards to the Baltimore grand jury charge. He is the only U.S. Supreme Court justice to have been impeached. Judge Alexander Pope Humphrey recorded in the Virginia Law Register an account of the impeachment trial and acquittal of Chase. Question: have any supreme court justices ever been removed?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Xbox One gaming console has received updates from Microsoft since its launch in 2013 that enable it to play select games from its two predecessor consoles, Xbox and Xbox 360. On June 15, 2015, backward compatibility with supported Xbox 360 games became available to eligible Xbox Preview program users with a beta update to the Xbox One system software. The dashboard update containing backward compatibility was released publicly on November 12, 2015. On October 24, 2017, another such update added games from the original Xbox library. The following is a list of all backward compatible games on Xbox One under this functionality. Question: do all xbox 360 discs work on xbox one?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In December 2014, it was announced that Torment, the second installment in the Fallen book series, was in development. It is unknown whether the last two novels, Passion and Rapture, and the spin-off novel, Unforgiven, will be adapted as well. In 2017, producer Kevan Van Thompson asked the fans if they want an adaptation of ``Torment'', showing that the sequel still could be made. Question: will there be a fallen 2 movie 2017?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Goalkeepers are normally allowed to handle the ball within their own penalty area, and once they have control of the ball in their hands opposition players may not challenge them for it. However the back-pass rule prohibits goalkeepers from handling the ball after it has been deliberately kicked to them by a team-mate, or after receiving it directly from a throw-in taken by a team-mate. Back-passes with parts of the body other than the foot, such as headers, are not prohibited. Despite the popular name ``back-pass rule'', there is no requirement in the laws that the kick or throw-in must be backwards; handling by the goalkeeper is forbidden regardless of the direction the ball travels. Question: can a soccer goalie pick up a throw in?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Lampyridae are a family of insects in the beetle order Coleoptera. They are winged beetles, commonly called fireflies or lightning bugs for their conspicuous use of bioluminescence during twilight to attract mates or prey. Fireflies produce a ``cold light'', with no infrared or ultraviolet frequencies. This chemically produced light from the lower abdomen may be yellow, green, or pale red, with wavelengths from 510 to 670 nanometers. The eastern US is home to the species Phausis reticulata, which emits a steady blue light. Question: is there a difference between lightning bugs and fireflies?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Since 22 February 2017, New Hampshire is a constitutional carry state, requiring no license to open carry or concealed carry a firearm in public. Concealed carry permits are still issued for purposes of reciprocity with other states. Question: do you need a license to carry in new hampshire?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Stowers was grateful that Lani was much more integrated into the canvas upon her return. ``Everyone respects her. She's making great friendships. I think in the beginning (...), that wasn't there.'' Lani comes back to town with a secret. Though Lani's main reason for returning is to visit her ailing father, it is revealed that she is the woman JJ Deveraux (Casey Moss) had a drunken one-night-stand with whom he could not remember. In June 2018, Stowers renewed her deal with the soap; she will continue to appear as Lani into fall 2019. Question: did lonnie die on days of our lives?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Remaining sperm whale populations are large enough that the species' conservation status is rated as vulnerable rather than endangered. However, the recovery from centuries of commercial whaling is a slow process, particularly in the South Pacific, where the toll on breeding-age males was severe. Question: are sperm whales on the endangered species list?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: As of March 19, 2017, drivers pay $8.50 per car or $3.50 per motorcycle for tolls by mail. E\u2010ZPass users with transponders issued by the New York E\u2010ZPass Customer Service Center pay $5.76 per car or $2.51 per motorcycle. All E-ZPass users with transponders not issued by the New York E-ZPass CSC will be required to pay Toll-by-mail rates. Question: is there a toll on the battery tunnel?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Individuals possessing handgun carry permits may not carry handguns of greater than .45 caliber. Individuals with handgun carry permits may not carry in an establishment whose primary purpose is the serving of alcoholic beverages. Handgun carry permit holders may not consume alcoholic beverages while carrying. Doing so will result in revocation of carry permit and possible criminal charges. Carry with permit is allowed in an establishment that serves alcoholic beverages, (such as a restaurant that serves alcoholic beverages) as long as that is not the primary purpose of said establishment. Handgun carry permit holders cannot carry into any sports arena during a professional sporting event, in an area or building where pari-mutuel wagering is authorized (such as a casino), cannot carry in schools nor in any government building. Question: can you carry a gun in a bar in oklahoma?",
        "pred_ans": " False. According to the passage, individuals with handgun carry permits may not carry in an establishment whose primary purpose is the serving of alcoholic beverages.",
        "ground_truth": false,
        "score": " There are a few errors in the student's answer. The student correctly states that individuals with handgun carry permits may not carry in an establishment whose primary purpose is the serving of alcoholic beverages. However, the student's conclusion that you cannot carry a gun in a bar in Oklahoma is not accurate. The passage does not specify location-based restrictions, only restrictions related to the primary purpose of the establishment and the consumption of alcoholic beverages.\\n\\nTherefore, the score is: 70"
    },
    {
        "question": "Passage: It is allowed in Japan, though the incidence has declined in recent years. Question: is it okay for cousins to marry in japan?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On October 20, 1977 -- three days after the release of the band's fifth studio album Street Survivors -- a chartered plane on which the members and crew were travelling crashed in Gillsburg, Mississippi. Six people died in the accident, including band members Ronnie Van Zant, Steve Gaines and Cassie Gaines; many of the other passengers onboard were seriously injured, including Wilkeson who was left in a critical condition and reportedly declared dead three times. The group disbanded after the crash. In 1978, a collection of previously unreleased recordings from 1971 and 1972 was released as Skynyrd's First and... Last. The following year, the surviving members (with the exception of Wilkeson) reunited at Volunteer Jam for a performance of ``Free Bird'' with Charlie Daniels and his band. Question: are any of the original members of lynyrd skynyrd alive?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The ``Marcha Real'' (Spanish pronunciation: (\u02c8ma\u027et\u0283a re\u02c8al), ``Royal March'') is the national anthem of Spain. It is one of only four national anthems in the world (along with those of Bosnia and Herzegovina, Kosovo, and San Marino) that has no official lyrics. Although it had lyrics in the past, they are no longer used. Question: does the national anthem of spain have words?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The return address is not required on postal mail. However, lack of a return address prevents the postal service from being able to return the item if it proves undeliverable; such as from damage, postage due, or invalid destination. Such mail may otherwise become dead letter mail. Question: will a letter be delivered without a return address?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Yellowjacket or Yellow jacket is the common name in North America for predatory social wasps of the genera Vespula and Dolichovespula. Members of these genera are known simply as ``wasps'' in other English-speaking countries. Most of these are black and yellow like the eastern yellowjacket Vespula maculifrons and the aerial yellowjacket Dolichovespula arenaria; some are black and white like the bald-faced hornet, Dolichovespula maculata. Others may have the abdomen background color red instead of black. They can be identified by their distinctive markings, their occurrence only in colonies, and a characteristic, rapid, side-to-side flight pattern prior to landing. All females are capable of stinging. Yellowjackets are important predators of pest insects. Question: are yellow jackets and wasps the same thing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The trophy has the engraving ``FIFA World Cup'' on its base. After the 1994 FIFA World Cup a plate was added to the bottom side of the trophy on which the names of winning countries are engraved, names therefore not visible when the trophy is standing upright. The inscriptions state the year in figures and the name of the winning nation in its national language; for example, ``1974 Deutschland'' or ``1994 Brasil''. In 2010, however, the name of the winning nation was engraved as ``2010 Spain'', in English, not in Spanish. As of 2018, twelve winners have been engraved on the base. The plate is replaced each World Cup cycle and the names of the trophy winners are rearranged into a spiral to accommodate future winners, with Spain on later occasions written in Spanish (``Espa\u00f1a''). FIFA's regulations now state that the trophy, unlike its predecessor, cannot be won outright: the winners of the tournament receive a bronze replica which is gold-plated rather than solid gold. Germany became the first nation to win the new trophy for the third time when they won the 2014 FIFA World Cup. Question: is the world cup trophy new every time?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: One way the difference between CRV and a system in which the consumer pays a deposit or tax shows up is that sales tax applies to the CRV amount, if the item is subject to sales tax. If it were not part of the basic price of the product, sales tax would not apply to it. Accordingly, when the State of California raised the CRV from $.04 on 2 L bottles / $.02 cans to $.08 and $.04, respectively, then again to $.10 and $.05, respectively, it was also raising California's sales tax revenue gained on the imposed fee. Question: is there sales tax on crv in california?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The novel was followed by a sequel called The Queen's Fool, set during the reign of Henry's daughter, Queen Mary. The Queen's Fool was followed by The Virgin's Lover, set during the early days of Queen Elizabeth I's reign. Question: is the other boleyn girl part of a series?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Neutrality of money is the idea that a change in the stock of money affects only nominal variables in the economy such as prices, wages, and exchange rates, with no effect on real variables, like employment, real GDP, and real consumption. Neutrality of money is an important idea in classical economics and is related to the classical dichotomy. It implies that the central bank does not affect the real economy (e.g., the number of jobs, the size of real GDP, the amount of real investment) by creating money. Instead, any increase in the supply of money would be offset by a proportional rise in prices and wages. This assumption underlies some mainstream macroeconomic models (e.g., real business cycle models). Others like monetarism view money as being neutral only in the long-run. Question: does monetary neutrality mean that changes in the money supply can never affect real\u200b gdp?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Dalmatian puppies are born with plain white coats and their first spots usually appear within 3 to 4 weeks after birth, however spots are visible on their skin. After about a month, they have most of their spots, although they continue to develop throughout life at a much slower rate. Spots usually range in size from 30 to 60 mm, and are most commonly black or brown (liver) on a white background. Other, more rare colors, include blue (a blue-grayish color), brindle, mosaic, tricolor-ed (with tan spotting on the eyebrows, cheeks, legs, and chest), and orange or lemon (dark to pale yellow). Patches of color may appear anywhere on the body, mostly on the head or ears, and usually, consist of a solid color. Patches are visible at birth and are not a group of connected spots and are identifiable by the smooth edge of the patch. Question: do dalmatians have spots when they are born?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A Bachelor of Education (B.Ed.) is a graduate professional degree which prepares students for work as a teacher in schools, though in some countries additional work must be done in order for the student to be fully qualified to teach. Question: is a bachelor of education an undergraduate degree?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Elderflower cordial is a soft drink made largely from a refined sugar and water solution and uses the flowers of the European elderberry, (Sambucus nigra L.). Historically the cordial has been popular in North Western Europe where it has a strong Victorian heritage. However, versions of an elderflower cordial recipe can be traced back to Roman times. Nowadays it can be found in almost all of the former Roman Empire territory, predominantly in Central Europe, especially in England, Germany, Austria, Slovenia, Romania, Hungary and Slovakia, where people have acquired a special taste for it and still make it in the traditional way. In some countries the drink can be found as an aromatic syrup, sold as a concentrated squash that is mixed with still or sparkling water. Elderflower press\u00e9 is a premixed form of this. Question: is elderflower cordial the same as elderflower syrup?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: Typically, no license or advanced training beyond just firearm familiarization (for rentals) and range rules familiarization is usually required for using a shooting range in the United States; the only common requirement is that the shooter must be at least 18 or 21 years old (or have a legal guardian present), and must sign a waiver prior to shooting. Question: can a 14 year old shoot a gun at a shooting range?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Like Venus, Mercury orbits the Sun within Earth's orbit as an inferior planet, and never exceeds 28\u00b0 away from the Sun. When viewed from Earth, this proximity to the Sun means the planet can only be seen near the western or eastern horizon during the early evening or early morning. At this time it may appear as a bright star-like object, but is often far more difficult to observe than Venus. The planet telescopically displays the complete range of phases, similar to Venus and the Moon, as it moves in its inner orbit relative to Earth, which reoccurs over the so-called synodic period approximately every 116 days. Question: does mercury stay a constant distance from the sun year after year?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The pardon power of the President extends only to an offense recognizable under federal law. However, the governors of most of the 50 states have the power to grant pardons or reprieves for offenses under state criminal law. In other states, that power is committed to an appointed agency or board, or to a board and the governor in some hybrid arrangement (in some states the agency is merged with that of the parole board, as in the Oklahoma Pardon and Parole Board). Question: can the president pardon someone convicted of a state crime?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A MAC may be referred to as the burned-in address (BIA). It may also be known as an Ethernet hardware address (EHA), hardware address or physical address (not to be confused with a memory physical address). Question: is physical address and mac address the same?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Bees with barbed stingers can often sting other insects without harming themselves. Queen honeybees and bees of many other species, including bumblebees and many solitary bees, have smoother stingers with smaller barbs, and can sting mammals repeatedly. Question: does a bumble bee die after stinging someone?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: While the Laws of the Game continue to provide for competitive scrums, a convention exists that some scrum rules are not enforced. During the 1970s, scrum penalties for feeding the ball into the legs of the second row, packs moving off the ``mark'' or collapsing the scrum were seen as unattractive. The ability of teams to win a game purely on goals from scrum penalties was also seen as unfair. In an effort to improve this situation, changes to rules and their enforcement were made. The number of scrums was reduced with the introduction of the ``handover'' after a team has used a set of six tackles, the differential penalty, one which cannot be kicked at goal was brought in for offences at scrums and referees ceased enforcing some rules regarding feeding the ball into scrum. Aided by this change, it is common for professional teams not to fully contest scrums, according to their choice of tactics. Question: can you contest a scrum in rugby league?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Voting rights of United States citizens in Puerto Rico, like the voting rights of residents of other United States territories, differ from those of United States citizens in each of the fifty states and the District of Columbia. Residents of Puerto Rico and other U.S. territories do not have voting representation in the United States Congress, and are not entitled to electoral votes for President. The United States Constitution grants congressional voting representation to U.S. states, which Puerto Rico and other U.S. territories are not, specifying that members of Congress shall be elected by direct popular vote and that the President and the Vice President shall be elected by electors chosen by the States. Question: can puerto rico vote for the us president?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The correct answer should be \\\"True\\\" because the passage states that residents of Puerto Rico and other U.S. territories do not have voting representation in the United States Congress and are not entitled to electoral votes for President. Therefore, the score is: 50."
    },
    {
        "question": "Passage: Soy sauce (also called soya sauce in British English) is a liquid condiment of Chinese origin, made from a fermented paste of soybeans, roasted grain, brine, and Aspergillus oryzae or Aspergillus sojae molds. Soy sauce in its current form was created about 2,200 years ago during the Western Han dynasty of ancient China, and spread throughout East and Southeast Asia where it is used in cooking and as a condiment. Question: is there a difference between soy and soya sauce?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: This is a list of all penalty shoot-outs that have occurred in the Finals tournament of the FIFA World Cup. Penalty shoot-outs were introduced as tie-breakers in the 1978 World Cup but did not occur before 1982. The first time a World Cup title was won by penalty shoot-out was in 1994. The only other time was in 2006. By the end of the 2018 edition, 30 shoot-outs have taken place in the World Cup. Of these, only two reached the sudden death stage after still being tied at the end of ``best of five kicks''. Question: has the world cup been won on penalties?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The breed is strikingly similar to the English Mastiff, with which it shares a common ancestor known as the Alpine Mastiff. The modern St. Bernard breed is radically different than the original dogs kept at the St. Bernard hospice, most notably by being much larger in size and build. Since the late 1800s, the St. Bernard breed has been ever refined and improved using many different large Molosser breeds, including the Newfoundland, Great Pyrenees, Greater Swiss Mountain Dog, Bernese Mountain Dog, Great Dane, English Mastiff, and possibly the Tibetan Mastiff and Caucasian Ovcharka. Other breeds such as the Rottweiler, Boxer, and English Bulldog may have contributed to the St. Bernard's bloodline as well. It is suspected that many of these large breeds were used to redevelop each other to combat the threat of their extinction after World War II, which may explain why all of them played a part in the creation of the St. Bernard as seen today. Question: are st bernards and bernese mountain dogs related?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In 2008, statistics showed that of all Swedes aged 25--64, 15% have completed only compulsory education (as the highest level of attainment), 46% only upper secondary education, 14% only post-secondary education of less than three years, and 22% post-secondary education of three years or more. Women are more educated than men (26% of women vs. 19% of men have post-secondary education of three years or more). The level of education is highest among those aged 25--34, and it decreases with age. Both upper secondary school and university studies are financed by taxes. Some Swedes start working immediately after secondary school. Along with several other European countries, the government used to subsidize tuition of non-EU/EEA students pursuing a degree at Swedish institutions, but in 2010 they started charging non-EU/EEA students 80,000--100,000 SEK per year. Swedish fifteen year old pupils have the 22nd highest average score in the PISA assessments, being neither significantly higher nor lower than the OECD average. Question: does sweden pay you to go to school?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In Marbury v. Madison, 5 U.S. 137 (1803), the Supreme Court held that Congress cannot pass laws that are contrary to the Constitution, and it is the role of the Judicial system to interpret what the Constitution permits. Citing the Supremacy Clause, the Court found Section 13 of the Judiciary Act of 1789 to be unconstitutional to the extent it purported to enlarge the original jurisdiction of the Supreme Court beyond that permitted by the Constitution. Question: do the courts have the power to overrule the laws written by the legislative branch of government?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Despite the practice's name, in most countries, the arresting person is usually designated as a person with arrest powers, who need not be a citizen of the country in which they are acting. For example, in the British jurisdiction of England and Wales, the power comes from section 24A(2) of the Police and Criminal Evidence Act 1984, called ``any person arrest''. This legislation states ``any person'' has these powers, and does not state that they need to be a British citizen. Question: can you make a citizen's arrest in the uk?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Mamma Mia! Here We Go Again is a 2018 jukebox musical romantic comedy film written and directed by Ol Parker, from a story by Parker, Catherine Johnson, and Richard Curtis. It is a follow-up to the 2008 film Mamma Mia!, which in turn is based on the musical of the same name using the music of ABBA. The film features an ensemble cast, including Lily James, Amanda Seyfried, Christine Baranski, Julie Walters, Pierce Brosnan, Andy Garc\u00eda, Dominic Cooper, Colin Firth, Stellan Skarsg\u00e5rd, Jessica Keenan Wynn, Alexa Davies, Jeremy Irvine, Josh Dylan, Hugh Skinner, Cher, and Meryl Streep. Both a prequel and a sequel, the plot is set after the events of the first film, and also features flashbacks to 1979, telling the story of Donna Sheridan's arrival on the island of Kalokairi and her first meetings with her daughter Sophie's three possible fathers. Question: are all the songs in mama mia here we go again by abba?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Though the award rewards a player who performed particularly well over the entirety of the playoffs, it has never been given to a player whose team did not at least reach the Stanley Cup Finals. The trophy has been awarded to members of the team that lost the Finals five times, most recently Jean-Sebastien Giguere of the Mighty Ducks of Anaheim in 2003, who backstopped his team's surprise run to the Finals, where they pushed the New Jersey Devils to seven games. The only skater to win the award in a losing cause is Philadelphia's Reggie Leach, who won it in 1976 as he had set a league record for most goals in the playoffs (19), which included a five-goal game in the semifinals and four goals in the Finals, even though the Canadiens swept his Flyers. Question: has the conn smythe ever gone to the losing team?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Chinese and English are both official languages of Hong Kong under the Hong Kong Basic Law (article 9) and the Official Languages Ordinance (chapter 5 of the Laws of Hong Kong). No law stipulates choice of spoken Chinese dialect. Question: is english an official language in hong kong?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " \u9996\u5148\uff0c\u6211\u9700\u8981\u9605\u8bfb\u60a8\u63d0\u4f9b\u7684\u6587\u7ae0\u548c\u95ee\u9898\u3002\u7136\u540e\uff0c\u6211\u4f1a\u9605\u8bfb\u5b66\u751f\u7684\u7b54\u6848\u3002\u6700\u540e\uff0c\u6211\u4f1a\u6839\u636e\u6587\u7ae0\u5185\u5bb9\u5224\u65ad\u5b66\u751f\u7684\u7b54\u6848\u662f\u5426\u6b63\u786e\uff0c\u5e76\u7ed9\u51fa\u4e00\u4e2a 0 \u5230 100 \u4e4b\u95f4\u7684\u5206\u6570\u3002\\n\\n\u9605\u8bfb\u6587\u7ae0\u548c\u95ee\u9898\u540e\uff0c\u6211\u4e86\u89e3\u5230\u9999\u6e2f\u57fa\u672c\u6cd5\uff08\u7b2c 9 \u6761\uff09\u548c\u9999\u6e2f\u5b98\u65b9\u8bed\u8a00\u6761\u4f8b\uff08\u9999\u6e2f\u6cd5\u5f8b\u7b2c 5 \u7ae0\uff09\u89c4\u5b9a\uff0c\u4e2d\u6587\u548c\u82f1\u6587\u90fd\u662f\u9999\u6e2f\u7684\u5b98\u65b9\u8bed\u8a00\u3002\u6587\u7ae0\u4e2d\u6ca1\u6709\u63d0\u5230\u9009\u62e9\u666e\u901a\u8bdd\u65b9\u8a00\u7684\u89c4\u5b9a\u3002\\n\\n\u73b0\u5728\uff0c\u6211\u9605\u8bfb\u4e86\u5b66\u751f\u7684\u7b54\u6848\uff1a\u201cTrue\u201d\u3002\\n\\n\u56e0\u6b64\uff0c\u6839\u636e\u6587\u7ae0\u5185\u5bb9\uff0c\u5b66\u751f\u7684\u7b54\u6848\u662f\u6b63\u786e\u7684\u3002\\n\\n\u56e0\u6b64\uff0c\u5206\u6570\u662f\uff1a100"
    },
    {
        "question": "Passage: Missouri allows any person who has a valid concealed carry endorsement or permit and is lawfully carrying a firearm in a concealed manner to briefly and openly display the firearm, so long as the firearm is not displayed in an angry or threatening manner. Some localities prohibit open carry; however, concealed carry license holders are exempted from this restriction. Question: do i need a permit to open carry in missouri?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Season 1 was shot inside of an actual grocery store, Field's Market in West Hills, California. For Season 2, the market was built in a 15,500 square foot warehouse in Santa Rosa, CA. It was built over two weeks and stocked with over $700,000 of food. After each episode, the perishable items were donated to local food banks and local farmers. Question: is guys grocery games in real grocery store?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Superfecundation is the fertilization of two or more ova from the same cycle by sperm from separate acts of sexual intercourse, which can lead to twin babies from two separate biological fathers. The term superfecundation is derived from fecund, meaning the ability to produce offspring. Heteropaternal superfecundation refers to the fertilization of two separate ova by two different fathers. Homopaternal superfecundation refers to the fertilization of two separate ova from the same father, leading to fraternal twins. While heteropaternal superfecundation is referred to as a form of atypical twinning, genetically, the twins are half siblings. Superfecundation, while rare, can occur through either separate occurrences of sexual intercourse or through artificial insemination. Question: can a person have twins with different fathers?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There are three types of twins based on the passage:\\n\\n1. Fraternal twins: result from the fertilization of two separate ova from the same father.  \\n2. Heteropaternal superfecundation twins: result from the fertilization of two separate ova by two different fathers. These twins are genetically half siblings.  \\n3. Homopaternal superfecundation twins: result from the fertilization of two separate ova from the same father.\\n\\nThe student's answer is TRUE, which means they believe a person can have twins with different fathers. According to the passage, this is possible through heteropaternal superfecundation.\\n\\nTherefore the score is: 100"
    },
    {
        "question": "Passage: Flash blindness is a visual impairment during and following exposure to a light flash of extremely high intensity. The bright light overwhelms the eye and gradually fades, lasting anywhere from a few seconds to a few minutes. Question: can staring at a bright light cause blindness?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A stipulation that this kick must be towards the opponents' goal existed in the rules from 1883 until 2016. This resulted in kick-offs typically involving two people (as pictured), with one tapping the ball forward and the other passing it back to the rest of the team. Now a team may kick the ball backwards explaining why the kicker may be in the other half of the field when kicking the ball. Question: does a soccer kick off have to go forward?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A wisdom tooth or third molar is one of the three molars per quadrant of the human dentition. It is the most posterior of the three. Wisdom teeth generally erupt between the ages of 17 and 25. Most adults have four wisdom teeth, one in each of the four quadrants, but it is possible to have none, fewer, or more, in which case the extras are called supernumerary teeth. Wisdom teeth commonly affect other teeth as they develop, becoming impacted. They are often extracted when or even before this occurs. Question: is it rare to have 6 wisdom teeth?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There are four wisdom teeth in a normal human dentition, but it is possible to have none, fewer, or more, in which case the extras are called supernumerary teeth. Therefore, having 6 wisdom teeth is not rare.\\n\\nTherefore the score is: 80"
    },
    {
        "question": "Passage: In American football and Canadian football, a lateral pass or lateral (officially backward pass in American football and onside pass in Canadian football) occurs when the ball carrier throws the football to a teammate in a direction parallel to or away from the opponents' goal line. A lateral pass is distinguished from a forward pass, in which the ball is thrown forward, towards the opposition's end zone. In a lateral pass the ball is not advanced, but unlike a forward pass a lateral may be attempted from anywhere on the field by any player to any player at any time. Question: can you pass the ball more than once in american football?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Ogallala Aquifer is a shallow water table aquifer surrounded by sand, silt, clay and gravel located beneath the Great Plains in the United States. One of the world's largest aquifers, it underlies an area of approximately 174,000 sq mi (450,000 km) in portions of eight states (South Dakota, Nebraska, Wyoming, Colorado, Kansas, Oklahoma, New Mexico, and Texas). It was named in 1898 by geologist N.H. Darton from its type locality near the town of Ogallala, Nebraska. The aquifer is part of the High Plains Aquifer System, and rests on the Ogallala Formation, which is the principal geologic unit underlying 80% of the High Plains. Question: is the ogallala aquifer the largest in the world?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On 15 December 2014, the official music video for ``The Nights'' was released on YouTube and premiered on the front page of Yahoo Music. The video was produced, directed by, and stars ``professional life liver'' Rory Kramer, who filmed an exuberant action-packed recollection of his own life on roller coasters, surfing, snowboarding, skateboarding, balloon flying, making a four door convertible out of a Toyota, etc. -- living a life to be remembered. Question: is avicii the guy in the nights video?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The iPhone Special Edition (SE) is designed and sold by Apple Inc. as part of the iPhone series of devices. It was released on March 31, 2016 and serves as the successor of the iPhone 5S. Question: is iphone se and iphone 5 the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Tornadoes have been recorded on all continents except Antarctica and are most common in the middle latitudes where conditions are often favorable for convective storm development. The United States has the most tornadoes of any country, as well as the strongest and most violent tornadoes. A large portion of these tornadoes form in an area of the central United States popularly known as Tornado Alley. Other areas of the world that have frequent tornadoes include significant portions of Europe, South Africa, Philippines, Bangladesh, parts of Argentina, Uruguay, and southern and southeast Brazil, northern Mexico, New Zealand, and far eastern Asia. Question: are there tornadoes anywhere else in the world?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: Bees with barbed stingers can often sting other insects without harming themselves. Queen honeybees and bees of many other species, including bumblebees and many solitary bees, have smoother stingers with smaller barbs, and can sting mammals repeatedly. Question: does the queen bee die after she stings?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passage states that queen honeybees and bees of many other species have smoother stingers with smaller barbs, and can sting mammals repeatedly. It does not mention whether the queen bee dies after she stings or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The Hobbit is a film series consisting of three high fantasy adventure films directed by Peter Jackson. They are based on the 1937 novel The Hobbit by J.R.R. Tolkien, with large portions of the trilogy inspired by the appendices to The Return of the King, which expand on the story told in The Hobbit, as well as new material and characters written especially for the films. Together they act as a prequel to Jackson's The Lord of the Rings film trilogy. The films are subtitled An Unexpected Journey (2012), The Desolation of Smaug (2013), and The Battle of the Five Armies (2014). Question: is the hobbit part of the lord of the rings trilogy?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Economic development is the process by which a nation improves the economic, political, and social well-being of its people. The term has been used frequently by economists, politicians, and others in the 20th and 21st centuries. The concept, however, has been in existence in the West for centuries. ``Modernization, ``westernization'', and especially ``industrialization'' are other terms often used while discussing economic development. Economic development has a direct relationship with the environment and environmental issues. Economic development is very often confused with industrial development, even in some academic sources. Question: is there a direct link between economic development and social development?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Cyborg is a fictional superhero appearing in American comic books published by DC Comics . The character was created by writer Marv Wolfman and artist George P\u00e9rez and first appears in a special insert in DC Comics Presents #26 (October 1980). Originally known as a member of the Teen Titans, Cyborg was established as a founding member of the Justice League in DC's 2011 reboot of its comic book titles and subsequently in the 2016 relaunch of its continuity. However, he has since been re-established as a past member of the Teen Titans again. Question: is cyborg a member of the justice league?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The previous major redesign of the iPhone, the 4.7-inch iPhone 6 and 5.5-inch iPhone 6 Plus, resulted in larger screen sizes. However a significant number of customers still preferred the 4-inch screen size of the iPhone 5 and 5S. Apple stated in their event that they sold 30 million 4-inch iPhones in 2015. Question: is the iphone se before the iphone 6?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Victor Hugo began writing Notre-Dame de Paris in 1829, largely to make his contemporaries more aware of the value of the Gothic architecture, which was neglected and often destroyed to be replaced by new buildings or defaced by replacement of parts of buildings in a newer style. For instance, the medieval stained glass panels of Notre-Dame de Paris had been replaced by white glass to let more light into the church. This explains the large descriptive sections of the book, which far exceed the requirements of the story. A few years earlier, Hugo had already published a paper entitled Guerre aux D\u00e9molisseurs (War to the Demolishers) specifically aimed at saving Paris' medieval architecture. The agreement with his original publisher, Gosselin, was that the book would be finished that same year, but Hugo was constantly delayed due to the demands of other projects. In the summer of 1830, Gosselin demanded that Hugo complete the book by February 1831. Beginning in September 1830, Hugo worked nonstop on the project thereafter. The book was finished six months later. Question: is the hunchback of notre dame real story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: New Balance maintains a manufacturing presence in the United States, as well as in the United Kingdom for the European market, where they produce some of their most popular models such as the 990 model--in contrast to its competitors, which often manufacture exclusively outside the USA and Europe. As a result, New Balance shoes tend to be more expensive than those of many other manufacturers. To offset this pricing difference, New Balance claims to differentiate their products with technical features, such as blended gel inserts, heel counters and a greater selection of sizes, particularly for very narrow and/or very wide widths. The company has made total profits of approximately $69 billion since 1992. They are the second most-renown American sporting company, after Nike. Question: are new balance and nike the same company?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Historically, there have been two main types of vermouth: sweet and dry. Responding to demand and competition, vermouth manufacturers have created additional styles, including extra-dry white, sweet white (bianco), red, amber (ambre or rosso), and ros\u00e9. Question: is rosso vermouth the same as sweet vermouth?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Veterinary science helps human health through the monitoring and control of zoonotic disease (infectious disease transmitted from non-human animals to humans), food safety, and indirectly through human applications from basic medical research. They also help to maintain food supply through livestock health monitoring and treatment, and mental health by keeping pets healthy and long living. Veterinary scientists often collaborate with epidemiologists, and other health or natural scientists depending on type of work. Ethically, veterinarians are usually obliged to look after animal welfare. Question: is veterinary science the same as veterinary medicine?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Hornets have stings used to kill prey and defend hives. Hornet stings are more painful to humans than typical wasp stings because hornet venom contains a large amount (5%) of acetylcholine. Individual hornets can sting repeatedly; unlike honey bees, hornets and wasps do not die after stinging because their stingers are not barbed and are not pulled out of their bodies. Question: do bees and hornets have the same venom?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: For many years, C&A retail clothing stores were a major presence in town centres throughout the United Kingdom. C&A also opened stores in a number of out-of-town locations, most notably its store at the Merry Hill Shopping Centre in the West Midlands, which opened in November 1989. The company's strategy of selling budget clothes from high-rent city-centre retail stores made it vulnerable to a new breed of competitors operating in cheaper, out-of-town locations, including Matalan and the rapidly expanding clothing operations of supermarket food chains such as Tesco and Asda, and to expanding high street names such as H&M, Zara, and Topshop. C&A in the United Kingdom was a notable example of an incorporated private unlimited company, which meant that it was not required to publish its financial statements under United Kingdom company law. In 2000, C&A announced its intention to withdraw from the British market, where it had been operating since 1922, and the last UK retail stores closed in 2001. Primark bought 11 of the C&A stores. Question: are there any c&a stores in the uk?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Although it is widely believed that a worker honey bee can sting only once, this is a partial misconception: although the stinger is in fact barbed so that it lodges in the victim's skin, tearing loose from the bee's abdomen and leading to its death in minutes, this only happens if the skin of the victim is sufficiently thick, such as a mammal's. Honey bees are the only hymenoptera with a strongly barbed sting, though yellow jackets and some other wasps have small barbs. Question: do bee stingers fall out on their own?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a partial misconception about the belief that a worker honey bee can sting only once. The student's answer is `False`, which is incorrect according to the passage in the question. Therefore the score is: 50/100"
    },
    {
        "question": "Passage: Miles Dominic Heizer (born May 16, 1994) is an American actor and musician. He stars in the Netflix original series 13 Reasons Why as Alex Standall. His most notable film role was in the 2007 movie Rails & Ties, in which he played character Davey Danner. From 2010 until 2015, he starred in the NBC drama series Parenthood as Drew Holt, the son of Lauren Graham's character Sarah Braverman. Miles appears in the 2016 film Nerve as Tommy, alongside actors Emma Roberts and Dave Franco. He also played the recurring role of Joshua Lipnicki on four episodes of the NBC medical drama series ER, and co-starred in the 2018 dramedy film Love, Simon as Cal. Question: is alex from 13 reasons why in nerve?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Time in New Zealand, by law, is divided into two standard time zones. The main islands use New Zealand Standard Time (NZST), 12 hours in advance of Coordinated Universal Time (UTC) / military M (Mike), while the outlying Chatham Islands use Chatham Standard Time (CHAST), 12 hours 45 minutes in advance of UTC / military M^ (Mike-Three). Question: is all of new zealand in the same time zone?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Proteins are polymer chains made of amino acids linked together by peptide bonds. During human digestion, proteins are broken down in the stomach to smaller polypeptide chains via hydrochloric acid and protease actions. This is crucial for the absorption of the essential amino acids that cannot be biosynthesized by the body. Question: can the human body directly absorb animal protein?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a crucial detail missing in the student's answer. It is true that the human body cannot directly absorb animal protein in its original form, but the hydrochloric acid and protease actions in the stomach break down the proteins into smaller polypeptide chains, which can then be absorbed by the body. Therefore, the score is: 60/100"
    },
    {
        "question": "Passage: Cellophane noodles should not be confused with rice vermicelli, which are made from rice and are white in color rather than clear (after cooking in water). Question: are rice noodles and glass noodles the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a confusion in the question between \\\"rice noodles\\\" and \\\"glass noodles\\\". The passage states that cellophane noodles should not be confused with rice vermicelli, which are made from rice and are white in color rather than clear (after cooking in water). It does not mention anything about \\\"glass noodles\\\". Therefore, the score is: 50/100"
    },
    {
        "question": "Passage: The Indian cricket team are two times World Champions. In addition to winning the 1983 Cricket World Cup, they triumphed over Sri Lanka in the 2011 Cricket World Cup on home soil. They were also runners-up at the 2003 Cricket World Cup, and semifinalists thrice (1987, 1996 and 2015). They came last in the Super Six stage in the 1999 Cricket World Cup and have been knocked out 4 times in the Group stage (1975, 1979, 1992 and 2007). India's historical win-loss record at the cricket world cup is 46-27, with 1 match being tied and another one being abandoned due to rain. Question: has indian ever made it to the world cup?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Diary of a Wimpy Kid is a satirical realistic fiction comedy novel for children and teenagers written and illustrated by Jeff Kinney. It is the first book in the Diary of a Wimpy Kid series. The book is about a boy named Greg Heffley and his struggles to fit in as he begins middle school. Question: is diary of a wimpy kid considered a graphic novel?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Second Amendment (Amendment II) to the United States Constitution protects the right of the people to keep and bear arms and was adopted on December 15, 1791, as part of the first ten amendments contained in the Bill of Rights. The Supreme Court of the United States has ruled that the right belongs to individuals for self-defense, while also ruling that the right is not unlimited and does not prohibit all regulation of either firearms or similar devices. State and local governments are limited to the same extent as the federal government from infringing this right, per the incorporation of the Bill of Rights. Question: was the right to bear arms in the original constitution?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: To legally possess firearms or ammunition, Illinois residents must have a Firearm Owners Identification (FOID) card, which is issued by the Illinois State Police to any qualified applicant. Non-residents who may legally possess firearms in their home state are exempt from this requirement. Question: is it legal to own a gun in chicago?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: If a player has three cards of the same suit in a sequence (called a sequence or a run), they may meld by laying these cards, face up, in front of them. If they have at least three cards of the same value, they may meld a group (also called a set or a book). Aces can be played as high or low but not both, for example Q\u2660 K\u2660 A\u2660 and A\u2660 2\u2660 3\u2660 are legal, but not K\u2660 A\u2660 2\u2660 (some variations allow this type of run). Melding is optional. A player may choose, for reasons of strategy, not to meld on a particular turn. The most important reason is to be able to declare ``Rummy'' later in the game.If a run lays in the discard pile, ex: 2,3,and 4, you cannot call rummy without taking all cards below the top card of said run. Question: can you play ace 2 3 in rummy?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The most common symptom experienced due to Morton's toe is callusing and/or discomfort of the ball of the foot at the base of the second toe. The first metatarsal head would normally bear the majority of a person's body weight during the propulsive phases of gait, but because the second metatarsal head is farthest forward, the force is transferred there. Pain may also be felt in the arch of the foot, at the ankleward end of the first and second metatarsals. Question: is it normal for my second toe to be longer?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Green Lantern was released on June 17, 2011, and received generally negative reviews; most criticized the film for its screenplay, inconsistent tone, choice and portrayal of villains, and its use of CGI, while some praised Reynolds' performance. Reynolds would later voice his dissatisfaction with the film. The film underperformed at the box office, grossing $219 million against a production budget of $200 million. Due to the film's negative reception and disappointing box office performance, Warner Bros. canceled any plans for a sequel, instead opting to reboot the character in the DC Extended Universe line with the film Green Lantern Corps, set for release in 2020. Question: is there a green lantern movie coming out?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Bank of India (BoI) is commercial bank with headquarters at Bandra Kurla complex, Mumbai. Founded in 1906, it has been government-owned since nationalisation in 1969. Bank of India has 5100 branches as on 31 January 2017, including 56 offices outside India, which includes five subsidiaries, five representative offices, and one joint venture. BoI is a founder member of SWIFT (Society for Worldwide Inter Bank Financial Telecommunications), which facilitates provision of cost-effective financialprocessing and communication services. Question: is state bank of india and bank of india same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Stephen experiences a strange event that he cannot explain: he sees his parents as a young couple in a pub, before they married. The book also deals with his grief and eventually his painful acceptance of the loss of his child. Question: do they find kate in a child in time?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The iPhone 4 introduced a new hardware design to the iPhone family, which Apple's CEO Steve Jobs touted as the thinnest smartphone in the world at the time; it consisted of a stainless steel frame which doubles as an antenna, with internal components situated between aluminosilicate glass. The iPhone 4 also introduced Apple's new high-resolution ``Retina Display'' with a pixel density of 326 pixels per inch while maintaining the same physical size and aspect ratio as its precursors. The iPhone 4 also introduced Apple's A4 system-on-chip, along with iOS 4--which notably introduced multitasking functionality and Apple's new FaceTime video chat service. The iPhone 4 was also the first iPhone to include a front-facing camera, and the first to be released in a version for CDMA networks, ending AT&T's period as the exclusive carrier of iPhone products in the United States. Question: did the first iphone have a front camera?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A basis point (often denoted as bp, often pronounced as ``bip'' or ``beep'') is (a difference of) one hundredth of a percent or equivalently one ten thousandth. The related concept of a permyriad is literally one part per ten thousand. Figures are commonly quoted in basis points in finance, especially in fixed income markets. Question: is a pip the same as a basis point?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a difference between a pip and a basis point. A pip is a unit of measurement used in foreign exchange trading to describe the smallest change in the exchange rate of two currencies. A basis point is a unit of measurement used in finance to describe the smallest change in the value of a financial instrument. A basis point is often denoted as bp and is equivalent to one-hundredth of a percent or one ten-thousandth. While a pip is not the same as a basis point, they are both used to measure small changes in financial instruments, they are not interchangeable terms. Therefore the score is: 70"
    },
    {
        "question": "Passage: The top scorer(s) in each game retain the value of their winnings in cash, and return to play in the next match. Non-winners receive consolation prizes. Since May 16, 2002, consolation prizes have been $2,000 for the second-place contestant(s) and $1,000 for the third-place contestant. Since the show does not generally provide airfare or lodging for contestants, cash consolation prizes alleviate contestants' financial burden. An exception is provided for returning champions who must make several flights to Los Angeles. Question: do you get the money you win on jeopardy?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The first season premiered on Netflix on June 10, 2016, and consisted of 13 episodes. The series has a 78-episode commitment from Netflix. It has been released globally in United States, Canada, United Kingdom, Australia, New Zealand, Ireland, France, Germany, Austria, Switzerland, Scandinavia, Benelux Union and Latin America. The second season premiered on Netflix on January 20, 2017, and consisted of 13 episodes. The third season premiered on Netflix on August 4, 2017, and consisted of 7 episodes while the fourth season premiered on October 13, 2017, and consisted of 6 episodes. The fifth season premiered on March 2, 2018, and consists of six episodes. The sixth season premiered on June 15, 2018 and consists of seven episodes. A seventh season is scheduled to be released on August 10, 2018. The series' success has spawned several comics, action figures, and other toys. The series will come to an end after season 8. Question: is season 6 of voltron the last one?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: After taking a single point from their opening two games, Scotland had to defeat the Netherlands by three clear goals to progress. Despite the Dutch taking the lead, Scotland fought back to win 3--2 with a goal from Kenny Dalglish and two from Archie Gemmill, the second of which is considered one of the greatest World Cup goals ever; Gemmill beat three Dutch defenders before lifting the ball over goalkeeper Jan Jongbloed into the net. The victory was not sufficient to secure a place in the second round, however, as Scotland were eliminated on goal difference for the second successive World Cup. This performance against strong opponents only heightened the frustration at the poor results earlier in the competition. MacLeod initially retained his position, but resigned later that year. Question: did scotland win the world cup in 1978?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Texas expungement law allows expungement of arrests which did not lead to a finding of guilt, and class C misdemeanors if the defendant received deferred adjudication, and completed a community supervision. If the defendant was found guilty, pleaded guilty, or pleaded no contest to any offense other than a class ``C'' misdemeanor, it is not eligible for expungement; however, it may be eligible for non-disclosure if deferred adjudication was granted. Question: can you get a misdemeanor expunged in texas?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: England have qualified for the FIFA Women's World Cup four times, reaching the quarter final stage on the first three occasions in 1995, 2007, and 2011, and finishing third in 2015. They reached the final of the UEFA Women's Championship in 1984 and 2009. Question: have england ever won the women's world cup?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. England has not won the Women's World Cup. The correct answer should be \\\"False\\\". Therefore the score is: 80."
    },
    {
        "question": "Passage: In April 2015, Ruffalo said Universal holding the distribution rights to Hulk films may be an obstacle to releasing a future Hulk standalone film and reiterated this in October 2015, and July 2017. Marvel regained the film production rights for the character by February 2006, but Universal retained the distribution rights for The Incredible Hulk as well as the right of first refusal to distribute future Hulk films. According to The Hollywood Reporter, a potential reason why Marvel has not reacquired the film distribution rights to the Hulk as they did with Paramount Pictures for the Iron Man, Thor, and Captain America films is that Universal holds the theme park rights to several Marvel characters that Marvel's parent company, Disney, wants for its own theme parks. In December 2015, Ruffalo stated that the strained relationship between Marvel and Universal may be another obstacle to releasing a future standalone Hulk film. The following month, he indicated that the lack of a standalone Hulk film allowed the character to play a more prominent role in Thor: Ragnarok, Avengers: Infinity War, and the latter's untitled sequel, stating, ``We've worked a really interesting arc into Thor(: Ragnarok), Avengers(: Infinity War), and (the latter's sequel) for Banner that I think will -- when it's all added up -- will feel like a Hulk movie, a standalone movie.'' Question: is there a hulk movie with mark ruffalo?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: Spark plugs are specified by size, either thread or nut (often referred to as Euro), sealing type (taper or crush washer), and spark gap. Common thread (nut) sizes in Europe are 10 mm (16 mm), 14 mm (21 mm; sometimes, 16 mm), and 18 mm (24 mm, sometimes, 21 mm). In the United States, common thread (nut) sizes are 10mm (16mm), 12mm (14mm, 16mm or 17.5mm), 14mm (16mm, 20.63mm) and 18mm (20.63mm). Question: are all spark plugs the same size socket?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is not enough information in the passage to determine if all spark plugs are the same size socket. The passage lists the common thread (nut) sizes for spark plugs in Europe and the United States, but it does not provide information about the size of the socket. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The carbon-hydrogen bond (C--H bond) is a bond between carbon and hydrogen atoms that can be found in many organic compounds. This bond is a covalent bond meaning that carbon shares its outer valence electrons with up to four hydrogens. This completes both of their outer shells making them stable. Carbon--hydrogen bonds have a bond length of about 1.09 \u00c5 (1.09 \u00d7 10 m) and a bond energy of about 413 kJ/mol (see table below). Using Pauling's scale--C (2.55) and H (2.2)--the electronegativity difference between these two atoms is 0.35. Because of this small difference in electronegativities, the C\u2212H bond is generally regarded as being non-polar. In structural formulas of molecules, the hydrogen atoms are often omitted. Compound classes consisting solely of C--H bonds and C--C bonds are alkanes, alkenes, alkynes, and aromatic hydrocarbons. Collectively they are known as hydrocarbons. Question: can carbon form polar covalent bonds with hydrogen?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: KSI vs. Logan Paul is a two-part white-collar amateur boxing match between two YouTubers, KSI and Logan Paul, who are British and American, respectively. The first of the two parts was held on 25 August 2018 at 8:30 PM BST in the Manchester Arena, Manchester, England, and was streamed on YouTube's pay-per-view platform. The fight has been labeled ``the largest event in YouTube history'' and ``the largest ever amateur boxing fight''. Question: is the ksi vs logan paul fight pay per view?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Territories have always been a part of the U.S. According to federal law, the term ``United States'', when used in a geographical sense, means ``the continental United States, Alaska, Hawaii, Puerto Rico, Guam, and the United States Virgin Islands''. Since political union with the Northern Mariana Islands in 1986, they too are treated as a part of the U.S. An executive order adopted in 2007 includes American Samoa in the U.S. ``geographical extent'' as reflected in U.S. Department of State documents. Question: is guam part of the us virgin islands?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. According to the passage, Guam and the U.S. Virgin Islands are both territories of the U.S., but Guam is not part of the U.S. Virgin Islands. Therefore the score is: 60"
    },
    {
        "question": "Passage: The speed of sound in an ideal gas depends only on its temperature and composition. The speed has a weak dependence on frequency and pressure in ordinary air, deviating slightly from ideal behavior. Question: does the speed of sound vary with changes in frequency at constant temperature?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Marv survives, however, creating a problem for the Roark family and the corrupt police force as he possesses knowledge that would have the city implode. The police threaten to kill Marv's mother unless he confesses to the murders that Roark and Kevin committed; Marv agrees, but only after breaking an attorney's arm in three places. He is sentenced to die in the electric chair. Before his execution, Wendy visits him one last time to thank him for everything he has done. Marv goes to the chair, but survives the first jolt, defiantly saying to his executioners: ``Is that the best you can do, you pansies?'' They have to pull the switch again to finish him off, announcing ``He's gone''. Question: did marv die in the first sin city?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 30"
    },
    {
        "question": "Passage: In April 2013, the All England Club confirmed its intention to build a retractable roof over No.1 Court. The roof is expected to be in place for the 2019 Championships. Question: has no 1 court at wimbledon got a roof?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Founded as a general store in 1950 by Cyril Benson, Bensons for Beds opened the first dedicated bed centre in 1972. The company is now based in Accrington, Lancashire, and operates as an chain of concessions and stand alone stores. Bensons for Beds is now part of South African group Steinhoff International, which also owns United Kingdom furniture brands Harveys and Cargo. Question: are bensons and harvey's the same company?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Colt AR-15 is a lightweight, 5.56\u00d745mm, magazine-fed, gas-operated semi-automatic rifle. It was designed to be manufactured with extensive use of aluminum alloys and synthetic materials. It is a semi-automatic version of the United States military M16 rifle. Colt's Manufacturing Company currently uses the AR-15 trademark for its line of semi-automatic AR-15 rifles that are marketed to civilian and law-enforcement customers. Question: is ar-15 used in the military?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Sometimes a batsman, whilst backing up, leaves the popping crease before the bowler has actually delivered the ball. Where this has happened, the bowler may attempt to run the non-striking batsman out in accordance with the Laws of the game. If he fails, and the batsman has remained within the crease, the delivery is called a dead ball. Question: can the non-striker be run out by the bowler?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The American Revolution (1776--1783) had a significant impact on shaping Nova Scotia. At the beginning, there was ambivalence in Nova Scotia, ``the 14th American Colony'' as some called it, over whether the colony should join the Americans in the war against Britain. A small number of Nova Scotians went south to serve with the Continental Army against the British; upon the completion of the war these supporters were granted land in the Refugee Tract in Ohio. Question: was nova scotia part of the 13 colonies?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Capital punishment is not in force in the State of New York. The last execution took place in 1963, when Eddie Mays was electrocuted at Sing Sing Prison. The state was the first to adopt the electric chair as a method of execution, which replaced hanging. Following the U.S. Supreme Court's ruling declaring existing capital punishment statutes unconstitutional in Furman v. Georgia (1972), New York was without a death penalty until 1995, when then-Governor George Pataki signed a new statute into law, which provided for execution by lethal injection. Question: does new york state have the death penalty?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Milton Friedman and Anna Schwartz argued that steady withdrawals from banks by nervous depositors (``hoarding'') were inspired by news of the fall 1930 bank runs and forced banks to liquidate loans, which directly caused a decrease in the money supply, shrinking the economy. Bank runs continued to plague the United States for the next several years. Citywide runs hit Boston (Dec. 1931), Chicago (June 1931 and June 1932), Toledo (June 1931), and St. Louis (Jan. 1933), among others. Institutions put into place during the Depression have prevented runs on U.S. commercial banks since the 1930s, even under conditions such as the U.S. savings and loan crisis of the 1980s and 1990s. Question: are only commercial banks subject to\u200b runs?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There"
    },
    {
        "question": "Passage: The ``flavourings'' are believed to include cloves, soy sauce, lemons, pickles and peppers. Question: is soy sauce and worcester sauce the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information about Worcester sauce in the passage, so the student's answer cannot be accurately evaluated. The score cannot be determined without more information."
    },
    {
        "question": "Passage: In August 2014, it was announced that Carey Hayes and Chad Hayes are writing the script for a third film. In 2015, it was announced that Brad Peyton and Dwayne Johnson would return to direct and star in the sequel, respectively. It was later announced that there would be two sequels. In January 2018, Johnson stated that although a third Journey film, titled Journey from the Earth to the Moon, was intended, it's development had been cancelled due to a lack of immediate interest and troubles in adapting the novel. Question: is there a sequel to the journey to the mysterious island?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: 19-28. Overseas service bars a. Authorized wearers. Soldiers are authorized to wear one overseas service bars for each 6--month period of active Federal service as a member of a U.S. Service as indicated below. Periods of less than 6 months duration, which otherwise meets the requirements for the award of overseas service bars, may be combined by adding the number of months to determine creditable service toward the total number of overseas service bars authorized. Listed beginning dates and ending dates are inclusive. The months of arrival to, and departure from the designated area are counted as whole months. (1) Outside CONUS, between 7 December 1941 and 2 September 1946. In computing overseas service, Alaska is considered outside CONUS. An overseas service bar is not authorized for a fraction of a 6--month period. (2) Korea, between 27 June 1950 and 27 July 1954. Credit toward an overseas service bar is authorized for each month of active Federal service as a member of the U.S. Army serving in the designated hostile fire area in Korea between 1 April 1968 and 31 August 1973. The months of arrival to, and departure from the hostile fire pay area are counted as whole months. If a Soldier receives a month of hostile fire pay for a period(s) of service in Korea, then the Soldier may also receive credit for a corresponding month towards award of an overseas service bar. (3) Vietnam, between 1 July 1958 and 28 March 1973. The months of arrival to, and departure from Vietnam are counted as whole months for credit toward the overseas service bar. If a Soldier receives a month of hostile fire pay for a period(s) of TDY service in Vietnam, then the Soldier may also receive credit for a corresponding month towards award of an overseas service bar. (4) The Dominican Republic, between 29 April 1965 and 21 September 1966. The months of arrival to, and departure from the Dominican Republic are counted as whole months. (5) Laos, between 1 January 1966 and 28 March 1973. The months of arrival to, and departure from Laos are counted as whole months. (6) Cambodia between 1 January 1971 and 28 March 1973. Personnel must qualify for hostile fire pay to receive credit for an overseas service bar. The months of arrival to, and departure from the hostile fire pay area are counted as whole months. (7) Lebanon, between 6 August 1983 and 24 April 1984, for the two units listed in paragraph 19--17b(6). The months of arrival to, and departure from Lebanon are counted as whole months. (8) The Persian Gulf between 27 July 1987 and 1 August 1990, for Operation Earnest Will. The months of arrival to, and departure from the Persian Gulf are counted as whole months. (9) The Persian Gulf between 17 January 1991 and 31 August 1993, for Operation Desert Storm. The months of arrival to, and departure from the Persian Gulf are counted as whole months. (10) El Salvador, between 1 January 1981 and 1 February 1992. The months of arrival to, and departure from El Salvador are counted as whole months. (11) Somalia, between 5 December 1992 and 31 March 1995. The months of arrival to, and departure from Somalia are counted as whole months. (12) Participation in OEF, in the CENTCOM area of operations, and under the control of the Combatant Commander, CENTCOM, between 11 September 2001 and 31 December 2014; OEF-Philippines, in the Philippines, between 19 September 2001 and 31 December 2014; OEF-Horn of Africa, in Djibouti, between 1 January 2008 and 31 December 2014. The months of arrival to, and departure from the Philippines, Djibouti, or the CENTCOM area of operations are counted as whole months. (13) Participation in OIF, in the CENTCOM area of operations, and under the control of the Combatant Commander, CENTCOM, between 19 March 2003 and 31 August 2010. The months of arrival to, and departure from the CENTCOM area of operations are counted as whole months. (14) Participation in OND in the CENTCOM area of operations, and under the control of the Combatant Commander, CENTCOM, between 1 September 2010 and 31 December 2011. The months of arrival to, and departure from the CENTCOM area of operations are counted as whole months. (15) Participation in OIR, in the CENTCOM area of operations, and under the control of the Combatant Commander, CENTCOM, between 15 June 2014 and a date to be determined. The months of arrival to, and departure from the CENTCOM area of operations are counted as whole months. (16) Participation in OFS, in the CENTCOM area of operations, and under the control of the Combatant Commander, CENTCOM, or Djibouti, AFRICOM, between 1 January 2015 and a date to be determined. The months of arrival to, and departure from Djibouti or the CENTCOM area of operations are counted as whole months. b. How worn. See DA Pam 670--1. Question: do you get overseas service bars for korea?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 75"
    },
    {
        "question": "Passage: This is a list of people have won multiple Academy Awards in a single year in the standard competitive categories. To date, a total of 63 individuals have achieved this feat on 74 distinct occasions with the multiple winners having won more than two awards that year, the record belonging to Walt Disney, who won four academy awards in 1953. Of these, nine individuals have achieved this feat on more than one occasion. This list is current as of the 89th Academy Awards ceremony held on February 26, 2017. Question: has anyone won two oscars in one night?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Federal Reserve began taking high-denomination currency out of circulation (destroying large bills received by banks) in 1969. As of May 30, 2009, only 336 $10,000 bills were known to exist; 342 remaining $5,000 bills; and 165,372 remaining $1,000 bills. Due to their rarity, collectors often pay considerably more than the face value of the bills to acquire them. Some are in museums in other parts of the world. Question: can i get $1 000 bill from the bank?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: Although the alligator has a heavy body and a slow metabolism, it is capable of short bursts of speed, especially in very short lunges. Alligators' main prey are smaller animals they can kill and eat with a single bite. They may kill larger prey by grabbing it and dragging it into the water to drown. Alligators consume food that cannot be eaten in one bite by allowing it to rot, or by biting and then spinning or convulsing wildly until bite-sized chunks are torn off. This is referred to as a ``death roll''. Critical to the alligator's ability to initiate a death roll, the tail must flex to a significant angle relative to its body. An alligator with an immobilized tail cannot perform a death roll. Question: do alligators drown their prey and eat it later?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a part in the passage where it's mentioned that alligators may kill larger prey by grabbing it and dragging it into the water to drown. However, it is not mentioned that alligators drown their prey and eat it later. Therefore, the score is: 70"
    },
    {
        "question": "Passage: Wherever a gene exists on a DNA molecule, one strand is the coding strand (or sense strand), and the other is the noncoding strand (also called the antisense strand, anticoding strand, template strand or transcribed strand). Question: does it matter which dna strand is transcribed?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In the series pilot, Michael is overtly rude to Pam and at one point fakes her firing, leaving her in tears. He often makes suggestive if harmless remarks about her beauty and general appearance, and at one point lies to the camera that they used to date (inspiring a horrified ``WHAT???'' from Pam when an interviewer relays the message to her). However, his impulsive attempt to kiss her during Diwali is shot down and marked the end of any romantic dreams for Michael with Pam. Over time, the combination of Michael being supportive of her goals, her transition from a bad relationship with Roy to a great one with Jim as well as her finding a job she not only enjoys but is effective at in the office administrator position and Michael finding his own soulmate in Holly Flax made Pam soften her stance towards Michael, and the experience at the Michael Scott Paper Company further bonded them (as did Michael's decision to choose Pam instead of Ryan Howard as the only MSPC salesman to keep that job when Michael returned as Branch Manager). Pam was furious at Michael for dating her mom Helene, and excoriated him at length during ``The Lover'' before eventually slapping him in ``Double Date'', but they once again were able to be civil to each other afterward. Pam does set up boundaries around her personal life that Michael can't cross, like telling him that he wasn't Cece's godfather. By Season 7, Pam acts as something of a guardian angel for Michael, steering him away from (numerous) bad ideas and towards his (fewer but real) good ones, such as his successful efforts to propose to Holly. In Michael's finale ``Goodbye, Michael'', Pam spends the whole day looking for a shredder, believing that the next day Michael was leaving. As Michael takes off his microphone and heads down the airport concourse, Pam runs to him with no shoes and hugs him as he kisses her cheek. The two have a nice moment and he walks off, leaving her holding her shoes. She then tells the camera that he was happy, wanting to be an advanced rewards member, and was glad to be going home to see Holly. She then is there to watch Michael's plane take off. In a deleted scene from ``The Inner Circle'', we learn Pam is flattered that Michael named his new puppy ``Pamela Beagsley'', and in ``The List'' she playfully teases Jim by calling their second child ``Little Michael Scott'', further proving that the two have developed a genuine friendship. Question: did michael and pam date on the office?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In more recent times Auslan has seen a significant amount of lexical borrowing from American Sign Language (ASL), especially in signs for technical terms. Some of these arose from the signed English educational philosophies of the 1970s and 80s, when a committee looking for signs with direct equivalence to English words found them in ASL and/or in invented English-based signed systems used in North America and introduced them in the classroom. ASL contains many signs initialised from an alphabet which was also derived from LSF, and Auslan users, already familiar with the related ISL alphabet, accepted many of the new signs easily. Question: is american and australian sign language the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Even though Turkey is a candidate country for the membership in the European Union, it has a more complex visa policy than the visa policy of the Schengen Area. Turkey requires visas from citizens of certain EU member states and Schengen Annex II countries and territories -- Antigua and Barbuda, Australia, Austria, Belgium, Bahamas, Barbados, Canada, Croatia, Cyprus, Dominica, East Timor, Grenada, Ireland, Kiribati, Malta, Marshall Islands, Mauritius, Mexico, Micronesia, Norway, Netherlands, Palau, Poland, Portugal, Saint Lucia, Saint Vincent and the Grenadines, Samoa, Solomon Islands, Spain, Taiwan, Tonga, Tuvalu, United Arab Emirates, United Kingdom, United States, and Vanuatu. On the other hand, Turkey grants visa-free access to citizens of other countries and territories -- Azerbaijan, Belarus, Belize, Bolivia, Ecuador, Iran, Kosovo, Kyrgyzstan, Jordan, Lebanon, Mongolia, Morocco, Qatar, Russia, Tajikistan, Thailand, Tunisia, Turkmenistan and Uzbekistan. Question: do uk citizens need a tourist visa for turkey?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Strait of Gibraltar crossing is a hypothetical bridge or tunnel spanning the Strait of Gibraltar (about 14 km or 9 miles at its narrowest point) that would connect Europe and Africa. The governments of Spain and Morocco appointed a joint committee to investigate the feasibility of linking the two continents in 1979, which resulted in the much broader Euromed Transport project. Question: is there a bridge between morocco and spain?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: When a player has taken 3 or more steps without the ball being dribbled, a traveling violation is called. A travel can also be called via carrying or an unestablished pivot foot. If the pivot foot of a player changes or moves, it is considered a travel and the ball handler will be penalized. The only times traveling is acceptable is during the NBA Slam Dunk Contest, which isn't a game, or if the defender fouls the ball carrier. In the latter case, the ball carrier retains possession of the ball and the opposing team gains a foul. Question: do you get 3 steps in the nba?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The fastest land animal is the cheetah, which has a recorded speed of 109.4--120.7 km/h (68.0--75.0 mph). The peregrine falcon is the fastest bird and the fastest member of the animal kingdom with a diving speed of 389 km/h (242 mph). The fastest animal in the sea is the black marlin, which has a recorded speed of 129 km/h (80 mph). Question: is there a bird faster than a cheetah?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Special features of endocrine glands are, in general, their ductless nature, their vascularity, and commonly the presence of intracellular vacuoles or granules that store their hormones. In contrast, exocrine glands, such as salivary glands, sweat glands, and glands within the gastrointestinal tract, tend to be much less vascular and have ducts or a hollow lumen. A number of glands that signal each other in sequence are usually referred to as an axis, for example, the hypothalamic-pituitary-adrenal axis. Question: are sweat glands part of the lymphatic system?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to determine if sweat glands are part of the lymphatic system. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Canada has served in the UNSC for 12 years, thus ranking in the top ten of non-permanent members. As of 2015, it shares the fourth place in the list of non-permanent members serving on the Council by length with Italy. This places Canada behind Brazil and Japan (first place), Argentina (second place), and Colombia, India, and Pakistan (third place). Canada was elected for the following six terms: 1948--49, 1958--59, 1967--68, 1977--78, 1989--90, and 1999--2000 - once every decade. In 2010, it lost its bid for a seat in the 2010 Security Council elections to Germany and Portugal, marking the country's first failure to win a seat in the UNSC. In August 2016, Prime Minister Justin Trudeau announced that Canada would seek to return to the Council in 2021. In making the announcement, Trudeau referred to ``playing a positive and constructive role in the world'' and claimed that the UN is a ``principal forum for pursuing Canada's international objectives -- including the promotion of democracy, inclusive governance, human rights, development, and international peace and security.''. Question: is canada part of the un security council?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 30"
    },
    {
        "question": "Passage: Since the ROC lost its United Nations seat as ``China'' in 1971 (replaced by the PRC), most sovereign states have switched their diplomatic recognition to the PRC, recognizing or acknowledging the PRC to be the sole legitimate representative of all China, though the majority of countries deliberately avoid stating clearly what territories they believe China includes and maintain strategic ambiguity in order to associate with both the People's Republic of China (PRC) and the Republic of China (ROC) simultaneously. As of 24 May 2018, the ROC maintains official diplomatic relations with 17 UN member states and the Holy See, although informal relations are maintained with nearly all others. Agencies of foreign governments such as the American Institute in Taiwan operate as de facto embassies of their home countries in Taiwan, and Taiwan operates similar de facto embassies and consulates in most countries under such names as ``Taipei Representative Office'' (TRO) or ``Taipei Economic and Cultural (Representative) Office'' (TECO). In certain contexts, Taiwan is also referred to as Chinese Taipei. Question: does taiwan have a seat in the un?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Prior to the attacks against the United States on September 11, 2001, the National Guard's general policy regarding mobilization was that Guardsmen would be required to serve no more than one year cumulative on active duty (with no more than six months overseas) for each five years of regular drill. Due to strains placed on active duty units following the attacks, the possible mobilization time was increased to 18 months (with no more than one year overseas). Additional strains placed on military units as a result of the invasion of Iraq further increased the amount of time a Guardsman could be mobilized to 24 months. Current Department of Defense policy is that no Guardsman is involuntarily activated for more than 24 months (cumulative) in one six-year enlistment period. Question: does the national guard stay in the us?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Modified-release dosage is a mechanism that (in contrast to immediate-release dosage) delivers a drug with a delay after its administration (delayed-release dosage) or for a prolonged period of time (extended-release (ER, XR, XL) dosage) or to a specific target in the body (targeted-release dosage). Question: is modified release the same as prolonged release?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The 1998 Stanley Cup Finals was the championship series of the National Hockey League's (NHL) 1997--98 season, and the culmination of the 1998 Stanley Cup playoffs. It was contested by the Western Conference champion and defending Stanley Cup champion Detroit Red Wings and the Eastern Conference champion Washington Capitals. It was the 105th year of the Stanley Cup being contested. The series was the Capitals' first appearance in a Stanley Cup Final since the franchise's inception in 1974. The Red Wings won the series for the second year in a row, four games to none. It was the Wings' ninth Stanley Cup, and the most recent time when a Finals concluded with a sweep (as of 2018). This was also the last time until 2002 that a Stanley Cup Finals ended after an NBA Finals in the same season had concluded. Detroit coach Scotty Bowman won his eighth Stanley Cup in that capacity (having previously done so with the Montreal Canadiens in 1973, 1976, 1977, 1978, and 1979, the Pittsburgh Penguins in 1992, and the Wings the previous year), tying him with former Canadiens coach Toe Blake for the record of most Cups won by a coach (which he would break when he helped the Red Wings win the 2002 Cup). Question: have the washington capitals ever been in a stanley cup?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Illinois and Michigan Canal connected the Great Lakes to the Mississippi River and the Gulf of Mexico. In Illinois, it ran 96 miles (154 km) from the Chicago River in Bridgeport, Chicago to the Illinois River at LaSalle-Peru. The canal crossed the Chicago Portage, and helped establish Chicago as the transportation hub of the United States, before the railroad era. It was opened in 1848. Its function was largely replaced by the wider and shorter Chicago Sanitary and Ship Canal in 1900, and it ceased transportation operations with the completion of the Illinois Waterway in 1933. Question: can you get to the mississippi river from the great lakes?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: All trim levels will be available with either front-wheel drive or four-wheel drive, with the exception of the Trailhawk, which is only available in a 4WD configuration. More than 65 percent of the upper body structure and frame is made of high-strength steel. In the United States, the Compass comes equipped with a 2.4L Tigershark four-cylinder engine. Question: does the jeep compass come in a 6 cylinder?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage about whether the Jeep Compass comes in a 6-cylinder engine. The passage only mentions that the Compass comes equipped with a 2.4L Tigershark four-cylinder engine in the United States. So, the student's answer of \\\"False\\\" cannot be accurately determined from the given information. Therefore the score is: 50."
    },
    {
        "question": "Passage: Cross listing of shares is when a firm lists its equity shares on one or more foreign stock exchange in addition to its domestic exchange. This concept is distinctly different than examples such as: American Depositary Receipt (ADR), European Depositary Receipt (EDR), global depository receipt (GDR) (also referred to as international depository receipt), and Global Registered Shares (GRS). Question: can a company be listed in different countries?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: It is commonly known as a right turn on red (or simply right on red) in countries that drive on the right side of the road, or a left turn on red in countries which drive on the left side of the road. Question: can you take a right turn on a red light?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A sequel was planned but was later cancelled. Zack Snyder stated that he would only be producing the sequel instead of reprising his role as the director due to working on Watchmen when he announced the film. The script of Army of the Dead was written by Zack Snyder and Joby Harold. Filming for Army of the Dead was to start once they got a director as the producing studios had approved the script. Also according to Deborah Snyder, the film was set in Las Vegas, and the town had to be contained to stop the outbreak of zombies. The film's producing studios were Universal Studios (who released the first) and Warner Bros. Entertainment (who released most of Snyder's films since 300) and the film was set to be directed by Matthijs van Heijningen Jr., director of The Thing, the 2011 prequel to John Carpenter's 1982 cult classic of the same name. Question: is there a dawn of the dead 2?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Australia uses three main time zones: Australian Western Standard Time (AWST; UTC+08:00), Australian Central Standard Time (ACST; UTC+09:30), and Australian Eastern Standard Time (AEST; UTC+10:00). Time is regulated by the individual state governments, some of which observe daylight saving time (DST). Australia's external territories observe different time zones. Question: is all of australia in the same time zone?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There are three main time zones in Australia: Australian Western Standard Time (AWST; UTC+08:00), Australian Central Standard Time (ACST; UTC+09:30), and Australian Eastern Standard Time (AEST; UTC+10:00). Therefore, the student's answer of \\\"False\\\" is correct, as not all of Australia is in the same time zone.\\n\\nTherefore, the score is: 100"
    },
    {
        "question": "Passage: Kelly played Laurie Forman, the older sister of Eric Forman, on That '70s Show. She abruptly left the show midway through the third season, and her character was written out of the show to ``attend beauty school''. She returned to the show in the fifth season for four episodes but was replaced with Christina Moore in the sixth season. In an interview with ABC News, she admitted that ``with That '70s Show I was guilty of a drinking problem, and I ran'', blaming her alcoholism on the loss of a baby. Question: did they change laurie in that 70s show?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Six of the eight champions have won one of their titles while playing in their own homeland, the exceptions being Brazil, who finished as runners-up after losing the deciding match on home soil in 1950 and lost their semi-final against Germany in 2014, and Spain, which reached the second round on home soil in 1982. England (1966) won its only title while playing as a host nation. Uruguay (1930), Italy (1934), Argentina (1978) and France (1998) won their first titles as host nations but have gone on to win again, while Germany (1974) won their second title on home soil. Question: has a country won the world cup at home?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The ArmaLite AR-10 is a 7.62\u00d751mm NATO battle rifle developed by Eugene Stoner in the late 1950s and manufactured by ArmaLite, then a division of the Fairchild Aircraft Corporation. When first introduced in 1956, the AR-10 used an innovative straight-line barrel/stock design with phenolic composite and forged alloy parts resulting in a small arm significantly easier to control in automatic fire and over 1 lb (0.45 kg) lighter than other infantry rifles of the day. Over its production life, the original AR-10 was built in relatively small numbers, with fewer than 9,900 rifles assembled. However, the ArmaLite AR-10 would become the progenitor for a wide range of firearms. Question: is the ar-10 an assault rifle?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In marketing, a Corporate anniversary is a celebration of a firm's continued existence after a particular number of years. The celebration is a media event which can help a firm achieve diverse marketing goals, such as promoting its corporate identity, boosting employee morale, building greater investor confidence, and encouraging sales. As a public relations opportunity, it is a way for a firm to tout past accomplishments while strengthening relationships with employees and customers and investors. The duration of the celebration itself can vary considerably, from an hour or day to activities happening throughout the year. Many businesses use an anniversary to express gratitude for past success. Generally, larger corporations have the means to stage more elaborate celebrations. Question: does a business have a birthday or anniversary?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The others remotely guide Wednesday to safety but are interrupted when Adrian shows up at the sisters' apartment, concerned about ``Karen.'' They deduce that Adrian has been having a long-term sexual relationship with one of the sisters, and Thursday convinces Saturday to leave with him. Saturday, pretending to be Karen, has sex with Adrian at his apartment and covertly links their bracelets, allowing Friday to hack into the C.A.B. On a video feed, the sisters believe they have found Monday in a holding cell. Meanwhile, C.A.B. agents corner and kill Wednesday. After Adrian leaves his apartment, C.A.B. agents kill Saturday as she tells them Monday was dating Adrian. The sisters' apartment are raided simultaneously by a C.A.B. squad led by Joe. Admitting that she cannot survive on her own, Friday sacrifices herself to allow Thursday to escape and rescue Monday. Question: does wednesday die in what happened to monday?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no mention of a score in the passage you provided, so I cannot give a score to the student's answer. However, I can say that the student's answer seems to be correct based on the information provided in the passage."
    },
    {
        "question": "Passage: The principal bridesmaid, if one is so designated, may be called the chief bridesmaid or maid of honor if she is unmarried, or the matron of honor if she is married. A junior bridesmaid is a girl who is clearly too young to be married, but who is included as an honorary bridesmaid. In the United States, typically only the maid/matron of honor and the best man are the official witnesses for the wedding license. Question: do u have to be married to be a maid of honour?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passage states that the maid of honor can be either unmarried or married. Therefore, the score is: 50."
    },
    {
        "question": "Passage: Organs are composed of main tissue, parenchyma, and ``sporadic'' tissues, stroma. The main tissue is that which is unique for the specific organ, such as the myocardium, the main tissue of the heart, while sporadic tissues include the nerves, blood vessels, and connective tissues. The main tissues that make up an organ tend to have common embryologic origins, such as arising from the same germ layer. Functionally-related organs often cooperate to form whole organ systems. Organs exist in all organisms. In single-celled organisms such as bacteria, the functional analogue of an organ is known as an organelle. In plants there are three main organs. A hollow organ is an internal organ that forms a hollow tube, or pouch such as the stomach, intestine, or bladder. Question: is there overlap between the different organ systems in a vertebrate?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The bankruptcy of the chain in the United States has not affected various international operations, however, including Asia and Africa. Discussion then turned to the fate of the Canadian branches. The UK and U.S. stores closed on April 24 and June 29, 2018, respectively. However, the company still exists as the owner of the international operations, with the exception of the Canadian stores. Question: is there any toys r us still open in the world?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The following tables list the cardinal number names and symbols for the numbers 0 through 10 in various languages and scripts of the world. Where possible, each language's native writing system is used, along with transliterations in Latin script and other important writing systems where applicable. In some languages, numbers will be illustrated through to 20. Question: do numbers look the same in all languages?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The first installment, Divergent (2014), grossed over $288 million worldwide, while the second installment, The Divergent Series: Insurgent (2015), grossed over $297 million worldwide. Insurgent was also the first Divergent film to be released in IMAX 3D. The third installment, The Divergent Series: Allegiant (2016), grossed $179 million. Thus, the first three films of the series have grossed over $765 million worldwide. A fourth film, The Divergent Series: Ascendant was to be released theatrically, but due to Allegiant's poor showing at the box office, it was announced it would be released as a television film that could lead into a potential episodic spin-off series on Starz. However, Woodley, along with other cast members, expressed no interest in returning. Question: is there a 4th film in divergent series?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The following is the list of teams to overcome 3--1 series deficits by winning three straight games to win a best-of-seven playoff series. In the history of major North American pro sports, teams that were down 3--1 in the series came back and won the series 52 times, more than half of them were accomplished by National Hockey League (NHL) teams. Teams overcame 3--1 deficit in the final championship round eight times, six were accomplished by Major League Baseball (MLB) teams in the World Series. Teams overcoming 3--0 deficit by winning four straight games were accomplished five times, four times in the NHL and once in MLB. Question: has any team ever come back from 3-0 nhl?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: I Can Only Imagine is a 2018 American Christian drama film directed by the Erwin Brothers and written by Alex Cramer, Jon Erwin, and Brent McCorkle, based on the story behind the MercyMe song of the same name, the best-selling Christian single of all time. The film stars J. Michael Finley as Bart Millard, the lead singer who wrote the song about his relationship with his father (Dennis Quaid). Madeline Carroll, Priscilla Shirer, Cloris Leachman, and Trace Adkins also star. Question: is i can only imagine movie true story?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Rose quartz is a type of quartz which exhibits a pale pink to rose red hue. The color is usually considered as due to trace amounts of titanium, iron, or manganese, in the material. Some rose quartz contains microscopic rutile needles which produces an asterism in transmitted light. Recent X-ray diffraction studies suggest that the color is due to thin microscopic fibers of possibly dumortierite within the quartz. Question: is pink quartz and rose quartz the same?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A certified copy is a copy (often a photocopy) of a primary document that has on it an endorsement or certificate that it is a true copy of the primary document. It does not certify that the primary document is genuine, only that it is a true copy of the primary document. Question: is a certified copy the same as an original uk?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A driver's permit, learner's permit, learner's license or provisional license, is a restricted license that is given to a person who is learning to drive, but has not yet satisfied the requirement to obtain a driver's license. Having a driver's permit for a certain length of time is usually one of the requirements (along with driver's education and a road test) for applying for a full driver's license. To get a learner's permit, one must typically pass a written permit test, traffic, and rules of the road. Question: does learner's permit count as driver's license?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A harmonic damper is a device fitted to the free (accessory drive) end of the crankshaft of an internal combustion engine to counter torsional and resonance vibrations from the crankshaft. This device must be interference fit to the crankshaft in order to operate in an effective manner. An interference fit ensures the device moves in perfect step with crankshaft. It is essential on engines with long crankshafts (such as straight-8 engines) and V8 engines with cross plane cranks. Harmonics and torsional vibrations can greatly reduce crankshaft life, or cause instantaneous failure if the crankshaft runs at or through an amplified resonance. Dampers are designed with a specific weight (mass) which is dependent on the damping material/method used and the freedom it affords the mass move allowing to reduce mechanical Q factor, or damp, crankshaft resonances. A harmonic balancer (sometimes crankshaft damper, torsional damper, or vibration damper) is the same thing as a harmonic damper except that the balancer includes a counterweight to externally balance the rotating assembly. The harmonic balancer often serves as a pulley for the accessory drive belts turning the alternator, water pump and other crankshaft driven devices. Question: is crankshaft pulley and harmonic balancer the same thing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The flooding of the Nile is the result of the yearly monsoon between May and August causing enormous precipitations on the Ethiopian Highlands whose summits reach heights of up to 4550 m (14,928 ft). Most of this rainwater is taken by the Blue Nile and by the Atbarah River into the Nile, while a less important amount flows through the Sobat and the White Nile into the Nile. During this short period, those rivers contribute up to ninety percent of the water of the Nile and most of the sedimentation carried by it, but after the rainy season, dwindle to minor rivers. Question: does the nile river still flood every year?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Ohio is a traditional open-carry state. The open-carry of firearms by those who legally possess the firearm is a legal activity in Ohio with or without a license. One need not have a concealed handgun license (CHL, CCW) to transport an unloaded handgun in a motor vehicle but it must be secured/contained and located in the vehicle requiring an exit of said vehicle to access it. Ammunition and magazines must be in a separate compartment or holding device. Note: If you have any alcohol in your system it is illegal to possess a firearm in your vehicle or on your person. Question: can you open carry in the state of ohio?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Conversely, double jeopardy comes with a key exception. Under the dual sovereignty doctrine, multiple sovereigns can indict a defendant for the same crime. The federal and state governments can have overlapping criminal laws, so a criminal offender may be convicted in individual states and federal courts for exactly the same crime or for different crimes arising out of the same facts. However, in 2016, the Supreme Court held that Puerto Rico is not a separate sovereign for purposes of the Double Jeopardy Clause. The dual sovereignty doctrine has been the subject of substantial scholarly criticism. Question: can you be arrested for the same crime twice?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The top of the tower is currently known as the Blackpool Tower Eye. At a height of 380 feet (120 m), the Eye is the highest observation deck in North West England. It was previously known simply as the Tower Top, until it reopened on September 2011. Reopening after major renovation, new owner Blackpool Council brought in Merlin Entertainments to manage the attractions, with Merlin deciding to incorporate the tower into its range of ``Eye'' branded attractions. Question: is the ballroom at the top of the blackpool tower?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: When University of Wisconsin sociologist Michael Bell and his graduate students attempted to determine the origin of Thousand Island dressing in 2010, they found that the story differed among villages and islands in the Thousand Islands region. They discovered the existence of a third origin story in which the original recipe was based upon French dressing, which is supported by a recipe published in the 11th edition of The Fannie Farmer Cookbook (1965). All the claims appeared to be based upon oral traditions without supporting written records. Question: is thousand island the same as french dressing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Exclusions: The letters I, O, and Q are not used in the first or third alpha positions of the 7-digit alpha-numeric series. Question: does california use the letter o on license plates?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no information in the passage about California specifically using the letter \\\"o\\\" on license plates. The passage only states that the letters \\\"I\\\", \\\"O\\\", and \\\"Q\\\" are not used in the first or third alpha positions of the 7-digit alpha-numeric series. Therefore, the score is: 50/100"
    },
    {
        "question": "Passage: The blood--brain barrier (BBB) is a highly selective semipermeable membrane barrier that separates the circulating blood from the brain and extracellular fluid in the central nervous system (CNS). The blood--brain barrier is formed by brain endothelial cells and it allows the passage of water, some gases, and lipid-soluble molecules by passive diffusion, as well as the selective transport of molecules such as glucose and amino acids that are crucial to neural function. Furthermore, it prevents the entry of lipophilic potential neurotoxins by way of an active transport mechanism mediated by P-glycoprotein. Astrocytes have been claimed to be necessary to create the blood--brain barrier. A few regions in the brain, including the circumventricular organs, do not have a blood--brain barrier. Question: can protein pass through the blood brain barrier?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a possibility that the student's answer is correct, but it depends on the context of the question. If the question is asking about the general ability of proteins to cross the blood-brain barrier, then the answer would be False, as the blood-brain barrier is designed to be highly selective and prevent most large molecules like proteins from passing through. However, if the question is asking about specific types of proteins or under specific circumstances, there might be exceptions where proteins can pass through the blood-brain barrier. For example, some small peptides can cross the blood-brain barrier through specific transporter proteins.\\n\\nTherefore the score is: 60"
    },
    {
        "question": "Passage: Since the joule is also a watt-second and the common unit for electricity sales to homes is the kW\u22c5h (kilowatt-hour), a kW\u22c5h is thus 1000 W \u00d7 3600 s = 3.6 MJ (megajoules). Question: is a joule the same as a watt?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. According to the passage, a joule is indeed a watt-second, and a kW\u22c5h is equal to 3.6 MJ. One kilowatt-hour is equal to 1000 watts \u00d7 3600 seconds = 3,600,000 joules. Therefore, a joule is not the same as a watt, and the student's answer should be corrected.\\n\\nTherefore, the score is: 50/100"
    },
    {
        "question": "Passage: A jungle is land covered with dense vegetation dominated by trees. Application of the term has varied greatly during the past recent centuries. Before the 1970s, tropical rainforests were generally referred to as jungles but this terminology has fallen out of usage. Jungles in Western literature can represent a less civilised or unruly space outside the control of civilisation, attributed to the jungle's association in colonial discourse with places colonised by Europeans. Question: are the jungle and rainforest the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Capital punishment is a legal penalty in the U.S. state of Kansas but it is rarely used. Question: does the state of kansas have the death penalty?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Belgium's best finish in the World Cup is third, at the Russia 2018 tournament. Belgium previously finished fourth in the Mexico 1986 competition. Question: did belgium ever won the fifa world cup?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Washington, D.C., is located in the mid-Atlantic region of the U.S. East Coast. Due to the District of Columbia retrocession, the city has a total area of 68.34 square miles (177.0 km), of which 61.05 square miles (158.1 km) is land and 7.29 square miles (18.9 km) (10.67%) is water. The District is bordered by Montgomery County, Maryland, to the northwest; Prince George's County, Maryland, to the east; and Arlington and Alexandria, Virginia, to the south and west. Question: is washington dc a part of any state?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In mathematics, a cube root of a number x is a number y such that y = x. All real numbers (except zero) have exactly one real cube root and a pair of complex conjugate cube roots, and all nonzero complex numbers have three distinct complex cube roots. For example, the real cube root of 8, denoted \u221a8, is 2, because 2 = 8, while the other cube roots of 8 are \u22121 + \u221a3i and \u22121 \u2212 \u221a3i. The three cube roots of \u221227i are Question: does every real number have a cube root?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: An antagonist is a character, group of characters, institution or concept that stands in or represents opposition against which the protagonist(s) must contend. In other words, an antagonist is a person or a group of people who opposes a protagonist. Question: does the antagonist always have to be a person?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The untitled Avengers film is scheduled to be released in the United States on May 3, 2019, in IMAX and 3D. Question: is there a sequal to avengers infinity war?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: (c) The official scorer's judgment must determine whether a run batted in shall be credited for a run that scores when a fielder holds the ball or throws to a wrong base. Ordinarily, if the runner keeps going, the official scorer should credit a run batted in; if the runner stops and takes off again when the runner notices the misplay, the official scorer should credit the run as scored on a fielder's choice. Question: does a batter get an rbi on a fielder's choice?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In probability and statistics, Student's t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arises when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown. It was developed by William Sealy Gosset under the pseudonym Student. Question: does the t distribution have a standard deviation of 1?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A wisdom tooth or third molar is one of the three molars per quadrant of the human dentition. It is the most posterior of the three. Wisdom teeth generally erupt between the ages of 17 and 25. Most adults have four wisdom teeth, one in each of the four quadrants, but it is possible to have none, fewer, or more, in which case the extras are called supernumerary teeth. Wisdom teeth commonly affect other teeth as they develop, becoming impacted. They are often extracted when or even before this occurs. Question: is it possible to have a 5th wisdom tooth?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Production began in late February 2008 in Boston. Principal photography took place at the Burlington Mall in Burlington, Massachusetts after being denied a permit from Willowbrook Mall in Wayne, New Jersey. From late February until mid-April, the mall and its stores were decorated with Christmas decorations, and there was a large prop ball-pit in the main foyer of the mall near the Sears branch, and a Santa's Village at the opposite end near the Macy's branch where the mall usually puts its own Santa's Village. Interior filming took place mostly at night. Some of the aerial stunts, such as Blart being attacked in the scenic elevator, were performed at the South Shore Plaza in Braintree, MA, as the Burlington Mall's construction did not allow for some of these stunts. Question: was mall cop filmed at mall of america?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: If a landlord is found to be retaliating, he or she will not be able to evict the tenant, who may also be awarded damages from the landlord of one to three months' rent plus attorney's fees. The landlord also cannot willfully deprive the tenant of heat, hot water, gas, electricity, lights, water, or refrigeration service. Nor can the landlord lock out the tenant or remove him/her from their apartment without going through the proper court procedure. The tenant can ask the court to issue a restraining order, file a criminal complaint against the landlord, or sue him/her for money damages and attorney's fees. Because of these options for recourse, it may be to the tenant's advantage to complain about code violations in writing before the landlord issues a notice of an eviction or a rent increase. If a tenant attempts to claim retaliation, but did not complain about violations until after he or she received notice from the landlord, the tenant will be found to have no valid claim. The court will not find that the landlord was retaliating against the tenant for an action he or she had not yet taken. Question: can i put a restraining order on my landlord?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The T-bone and porterhouse are steaks of beef cut from the short loin (called the sirloin in Commonwealth countries and Ireland). Both steaks include a ``T''-shaped bone with meat on each side. Porterhouse steaks are cut from the rear end of the short loin and thus include more tenderloin steak, along with (on the other side of the bone) a large strip steak. T-bone steaks are cut closer to the front, and contain a smaller section of tenderloin. The smaller portion of a T-bone, when sold alone, is known as a filet mignon, especially if it's cut from the small forward end of the tenderloin. Question: is a t bone steak the same as a porterhouse?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Carrying a hip flask filled with alcohol in a public place is illegal in many locations in the United States due to open container laws. These laws prohibit possession of an unsealed container of alcohol in public or within the passenger compartment of a vehicle. Question: does a flask count as an open container?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Catholic Church sees the effects of the sacrament as follows: As the sacrament of Marriage gives grace for the married state, the sacrament of Anointing of the Sick gives grace for the state into which people enter through sickness. Through the sacrament a gift of the Holy Spirit is given, that renews confidence and faith in God and strengthens against temptations to discouragement, despair and anguish at the thought of death and the struggle of death; it prevents the believer from losing Christian hope in God's justice, truth and salvation. Because one of the effects of the sacrament is to absolve the recipient of any sins not previously absolved through the sacrament of penance, only an ordained priest or bishop may administer the sacrament. Question: can anyone give last rites in an emergency?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Princess Anna of Arendelle is a fictional character who appears in Walt Disney Animation Studios' 53rd animated film Frozen. She is voiced by Kristen Bell as an adult. At the beginning of the film, Livvy Stubenrauch and Katie Lopez provided her speaking and singing voice as a young child, respectively. Agatha Lee Monn portrayed her as a nine-year-old (singing). Question: did kristen bell sing all the voices of anna?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Super Bowl ring is an award in the National Football League given to the winners of the league's annual championship game, the Super Bowl. Since only one Vince Lombardi Trophy is awarded to the team (ownership) itself, the Super Bowl ring offers a collectable memento for the actual players and team members to keep for themselves to symbolise their victory. Question: does the super bowl loser get a ring?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In 2003, the United States withdrew remaining non-training troops or armament purchase support from Saudi Arabia, with 200 of these support personnel remaining, primarily at Eskan Village, a base which is owned by the Saudi Arabian government itself, in support of the US Military Training Mission (USMTM) in Saudi Arabia and the US Office of Program Management for the Saudi Arabian National Guard (OPM-SANG). Question: does us have military bases in saudi arabia?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Han's death is seen again in Furious 7 through archival footage from The Fast and the Furious: Tokyo Drift and Fast & Furious 6, occurring the same time a bomb package delivered to Dominic's house goes off. Han's death was the reason Dominic appeared in Tokyo at the end of Tokyo Drift - to retrieve his body back to Los Angeles for burial. After racing with Sean Boswell, Dominic receives several of Han's personal items, including a photo of Gisele. The crew attended Han's funeral in Los Angeles a few days later, with Dominic spying Han's killer Deckard Shaw watching from a distance, and giving chase, leading to their confrontation. Question: is han alive in fast and furious 7?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Both Indiana and Federal laws restrict purchase and possession under certain circumstances. Federal law prohibits possession of firearms for life by those convicted of felonies and for those convicted of misdemeanors involving domestic violence. Indiana law prohibits as follows: Question: can non violent felons own guns in indiana?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information given in the passage about non-violent felons and their ability to own guns in Indiana. The passage only mentions Federal laws that prohibit felons and those convicted of misdemeanors involving domestic violence from possessing firearms. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The Flint water crisis first started in 2014 when the drinking water source for the city of Flint, Michigan was changed from Lake Huron and the Detroit River to the cheaper Flint River. Due to insufficient water treatment, lead leached from the lead water pipes into the drinking water, exposing over 100,000 residents. After a pair of scientific studies proved lead contamination was present in the water supply, a federal state of emergency was declared in January 2016 and Flint residents were instructed to use only bottled or filtered water for drinking, cooking, cleaning, and bathing. As of early 2017, the water quality had returned to acceptable levels; however, residents were instructed to continue to use bottled or filtered water until all the lead pipes have been replaced, which is expected to be completed no sooner than 2020. Question: can you drink the water in flint now?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Turkey-- which was a secular state with an overwhelmingly Muslim population--signed the Declaration in 1948. However, the same year, Saudi Arabia abstained from the ratification vote on the Declaration, claiming that it violated Sharia law. Pakistan--which had signed the declaration--disagreed and critiqued the Saudi position. Pakistani minister Muhammad Zafarullah Khan strongly argued in favor of including freedom of religion. In 1982, the Iranian representative to the United Nations, Said Rajaie-Khorassani, said that the Declaration was ``a secular understanding of the Judeo-Christian tradition'' which could not be implemented by Muslims without conflict with Sharia. On 30 June 2000, members of the Organisation of the Islamic Conference (now the Organisation of Islamic Cooperation) officially resolved to support the Cairo Declaration on Human Rights in Islam, an alternative document that says people have ``freedom and right to a dignified life in accordance with the Islamic Shari'ah'', without any discrimination on grounds of ``race, colour, language, sex, religious belief, political affiliation, social status or other considerations''. Question: has pakistan signed the universal declaration of human rights?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The Battle of New Orleans was fought on Sunday, January 8, 1815, between the British Army under Major General Sir Edward Pakenham, and the United States Army under Brevet Major General Andrew Jackson. It took place approximately 5 miles (8.0 kilometres) south of the city of New Orleans, close to the present-day town of Chalmette, Louisiana, and was an American victory. The battle effectively marked the end of the War of 1812. Question: was the battle of new orleans fought after the war of 1812?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The battle of New Orleans was fought on January 8, 1815, and the War of 1812 officially ended on February 17, 1815, with the signing of the Treaty of Ghent. Therefore, the battle of New Orleans was fought after the War of 1812.\\n\\nTherefore the score is: 50"
    },
    {
        "question": "Passage: The Greatest Showman is a 2017 American musical film directed by Michael Gracey in his directorial debut, written by Jenny Bicks and Bill Condon and starring Hugh Jackman, Zac Efron, Michelle Williams, Rebecca Ferguson, and Zendaya. The film is inspired by the story of P.T. Barnum's creation of the Barnum & Bailey Circus and the lives of its star attractions. Question: is the greatest show on earth a musical?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Fetus in fetu (or foetus in foetu) is a developmental abnormality in which a mass of tissue resembling a fetus forms inside the body. An early example of the phenomenon was described in 1808 by George William Young. Question: can a twin be born inside the other?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Republic of the Congo (French: R\u00e9publique du Congo), also known as the Congo-Brazzaville, the Congo Republic or simply the Congo, is a country in Central Africa. It is bordered by five countries: Gabon and the Atlantic Ocean to the west; Cameroon to the northwest; the Central African Republic to the northeast; the Democratic Republic of the Congo to the east and south; and the Angolan exclave of Cabinda to the southwest. Question: is congo and democratic republic of congo the same country?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 2011, Philip Pullman remarked at the British Humanist Association annual conference that due to the first film's disappointing sales in the United States, there would not be any sequels made. Question: is there a second movie to golden compass?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The project was originally titled Oxygen while in development at Isla Producciones (in collaboration with Ol\u00e9). It was then adapted for the American market by Powwow before being acquired by The CW. Star-Crossed premiered on The CW on Monday, February 17, 2014, at 8:00 pm Eastern/7:00 pm Central. The series was picked up for a thirteen-episode season. Question: is the tv series star crossed based on a book?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Bengals are one of the 12 NFL teams to not have won a Super Bowl as of the 2017 season; however, they are also one of 8 NFL teams that have been to at least one Super Bowl, but have not won the game. Question: did the cincinnati bengals ever win a superbowl?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: On 19 October 2015, the Seven Network and South Pacific Pictures renewed the show for a second season. It premiered on 23 August 2016 in Australia. On January 24, 2017, the Seven Network announced that the series had been renewed for a third season. It screened from 12 September 2017 with a mid-season finale after 8 episodes. Question: is there going to be a 3rd series of 800 words?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In anatomy, the scapula (plural scapulae or scapulas; also known as shoulder bone, shoulder blade or wing bone) is the bone that connects the humerus (upper arm bone) with the clavicle (collar bone). Like their connected bones the scapulae are paired, with the scapula on either side of the body being roughly a mirror image of the other. The name derives from early Roman times when it was thought that the bone resembled a trowel or small shovel. Question: does the scapula form a joint with the ribs?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to determine whether the scapula forms a joint with the ribs or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: As of February 2017, the Convention has 177 contracting member countries, which makes it one of the most widely adopted treaties worldwide. Notably, Taiwan (officially the Republic of China or ROC) and Burma are not parties to the Convention. However, according to Article 27 of its Patent Act, Taiwan recognizes priority claims from contracting members. Question: is taiwan a member of the paris convention?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A goal cannot be scored directly from a throw-in; if a player throws the ball directly into their own goal without any other player touching it, the result is a corner kick to the opposing side. Likewise an offensive goal cannot be scored directly from a throw in; the result in this case is a goal kick for the defending team. Question: can you score off a throw in in soccer?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A Japanese national does not lose his or her nationality in situations where citizenship is acquired involuntarily such as when a Japanese woman marries an Iranian national. In this case she automatically acquires Iranian citizenship and is permitted to be an Iranian-Japanese dual national, since the acquisition of the Iranian citizenship was involuntary. Question: can you be a dual citizen in japan?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Display lag is a phenomenon associated with some types of liquid crystal displays (LCDs) like smartphones and computers, and nearly all types of high-definition televisions (HDTVs). It refers to latency, or lag measured by the difference between the time there is a signal input, and the time it takes the input to display on the screen. This lag time has been measured as high as 68 ms, or the equivalent of 3-4 frames on a 60 Hz display. Display lag is not to be confused with pixel response time. Currently the majority of manufacturers do not include any specification or information about display latency on the screens they produce. Question: is response time the same as input lag?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Ion-dipole and ion-induced dipole forces are similar to dipole-dipole and dipole-induced dipole interactions but involve ions, instead of only polar and non-polar molecules. Ion-dipole and ion-induced dipole forces are stronger than dipole-dipole interactions because the charge of any ion is much greater than the charge of a dipole moment. Ion-dipole bonding is stronger than hydrogen bonding. Question: are ion-dipole stronger than hydrogen bonds?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. The passage states that ion-dipole and ion-induced dipole forces are stronger than dipole-dipole interactions, but it does not explicitly state that they are stronger than hydrogen bonds. Therefore, the score is: 70"
    },
    {
        "question": "Passage: Tommy Mottola claimed that Dion recorded the song in one take, and that demo is what was released. As Cameron felt obligated to include a theme song to promote the movie, Glen Brunman also stated that the soundtrack album was supposed to be ``No song, no C\u00e9line.'' Question: was my heart will go on recorded in one take?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Additionally, several other actors reprise their MCU roles: Danai Gurira as Okoye, the head of the Dora Milaje; Letitia Wright as T'Challa's sister Shuri; William Hurt as Thaddeus Ross, the U.S. Secretary of State; Kerry Condon as the voice of Stark's A.I. F.R.I.D.A.Y.; Winston Duke as M'Baku, the leader of Wakanda's mountain tribe the Jabari; Florence Kasumba as Ayo, a member of the Dora Milaje; Jacob Batalon as Parker's friend Ned; Isabella Amara as Parker's classmate Sally; Tiffany Espensen as Parker's classmate Cindy; and Ethan Dizon as Parker's classmate Tiny. Samuel L. Jackson and Cobie Smulders make uncredited cameos as Nick Fury and Maria Hill, the former director and deputy director of S.H.I.E.L.D, respectively, in the film's post-credits scene. Question: is there something at the end of ifinity war?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Travel to each of these altitude regions can lead to medical problems, from the mild symptoms of acute mountain sickness to the potentially fatal high-altitude pulmonary edema (HAPE) and high-altitude cerebral edema (HACE). The higher the altitude, the greater the risk. Research also indicates elevated risk of permanent brain damage in people climbing to extreme altitudes. Expedition doctors commonly stock a supply of dexamethasone, or ``dex,'' to treat these conditions on site. Question: is living at high altitude good for you?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a risk of medical problems when traveling to high altitude regions, as mentioned in the passage. The higher the altitude, the greater the risk. The passage does not provide a clear statement on whether living at high altitude is good or bad for an individual's health. However, the student's answer of \\\"False\\\" could be interpreted as a generalization that living at high altitude is always bad, which is not supported by the passage.\\n\\nTherefore the score is: 70"
    },
    {
        "question": "Passage: Although the Constitution does not explicitly include the right to privacy, the Supreme Court has found that the Constitution implicitly grants a right to privacy against governmental intrusion from the First Amendment, Third Amendment, Fourth Amendment, and the Fifth Amendment. This right to privacy has been the justification for decisions involving a wide range of civil liberties cases, including Pierce v. Society of Sisters, which invalidated a successful 1922 Oregon initiative requiring compulsory public education, Griswold v. Connecticut, where a right to privacy was first established explicitly, Roe v. Wade, which struck down a Texas abortion law and thus restricted state powers to enforce laws against abortion, and Lawrence v. Texas, which struck down a Texas sodomy law and thus eliminated state powers to enforce laws against sodomy. Question: does the us constitution protect the right to privacy?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In most countries, grade retention has been banned or strongly discouraged. In the United States, grade retention can be used in kindergarten through twelfth grade, however, students in grades seven through twelve are usually only retained in the specific failed subject due to each subject having its own specific classroom rather than staying in one classroom with all subjects taught for the entire school day as it is in grades kindergarten through sixth grade. For example, in grades seven through twelve, a student can be promoted in a math class but retained in a language class. Single classroom grades kindergarten through sixth grade are confined to one room for the whole day, being taught all subjects in the same classroom usually by one teacher with the exception of art and gymnastics conducted in the art room and the gymnasium respectively. In these grades the student must generally fail or score well below the accepted level in most or all areas within the entire curriculum to be retained. The student will then again repeat the entire school year within a single classroom and repeating the same subject matter as the previous year. Question: can you get held back in freshman year?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A standard home blow dryer will kill 96.7% of eggs with proper technique. To be effective, the blow dryer must be used repeatedly (every 1 to 7 days since eggs hatch in 7 to 10 days) until the natural life cycle of the lice is over (about 4 weeks). Question: can heat from a hair dryer kill lice?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no information in the passage about the effectiveness of heat from a hair dryer in killing lice. Therefore, the score is: 0/100"
    },
    {
        "question": "Passage: Curry's first experience with the United States national team came at the 2007 FIBA Under-19 World Championship, where he helped Team USA capture the silver medal. In 2010, he was selected to the senior squad, playing limited minutes at the 2010 FIBA World Championship (known later as FIBA Basketball World Cup) as the United States won the gold medal in an undefeated tournament. In 2014, he took on a larger role with the team, helping them to another undefeated tournament at the 2014 World Cup and scoring 10 points in the final game. On June 6, 2016, Curry withdrew from consideration for the 2016 Olympics in Brazil, citing ankle and knee ailments as the major reason behind the decision. Question: does stephen curry have an olympic gold medal?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: War Room is a 2015 American Christian drama film directed by Alex Kendrick and written by he and Stephen Kendrick. It is the Kendrick brothers' fifth film and their first through their subsidiary, Kendrick Brothers Productions. Provident Films, Affirm Films and TriStar Pictures partnered with the Kendrick brothers to release the film. Question: is war room movie based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The term sixth borough is used to describe any of a number of places that are not politically within the borders of any of the five boroughs of New York City that have instead been referred to as a metaphorical part of the city by virtue of their geographic location, demographic composition, special affiliation with New York City, or cosmopolitan character. They include adjacent cities and counties in the New York metropolitan area as well as in other states, U.S. territories, and foreign countries. Question: was there a 6th burrow in new york?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Cows routinely lie down and can easily regain their footing unless sick or injured. Scientific studies have been conducted to determine if cow tipping is theoretically possible, with varying conclusions. All agree that cows are large animals that are difficult to surprise and will generally resist attempts to be tipped. Estimates suggest a force of between 3,000 and 4,000 newtons (670 and 900 pounds-force) is needed, and that at least four and possibly as many as fourteen people would be required to achieve this. In real-life situations where cattle have to be laid on the ground, or ``cast'', such as for branding, hoof care or veterinary treatment, either rope restraints are required or specialized mechanical equipment is used that confines the cow and then tips it over. On rare occasions, cattle can lie down or fall down in proximity to a ditch or hill that restricts their normal ability to rise without help. Cow tipping has many references in popular culture and is also used as a figure of speech. Question: can a cow get up after being tipped?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The gameplay is vastly different from the previous installments, as it was rebuilt from the ground up. Although the previous main installment, Ascension (2013), introduced multiplayer to the series, this installment is single-player-only. The game features a third-person, over-the-shoulder free camera, a departure from the previous installments, which featured a third-person, fixed cinematic camera (with the exception of 2007's 2D side-scroller Betrayal). Cinematographically, the game is presented in a continuous shot, with no camera cuts. The game is open, but it is not open-world. Due to it being open, players can fast travel to different locations. As the ability to swim was cut from the game, players instead use a boat to traverse bodies of water. Just like previous entries, there are puzzles for players to solve to progress through parts of the game. Enemies in the game stem from Norse mythology, such as variants of trolls, ogres, dark elves and their king, wolves, wulvers, nightmares, draugrs, tatzelwurms, as well as Gullveig and the revenants, beings warped by sei\u00f0r magic, among other original creatures. Valkyries appear as optional boss battles, and players can free the dragons F\u00e1fnir, Otr, and Reginn--dwarfs that were turned into dragons--in addition to battling a dragon called Hr\u00e6zlyr. Question: is the new god of war 2 players?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The Colorado Mines Orediggers football team represents the Colorado School of Mines in the sport of American football. Gregg Brandon has been the head coach since 2015, taking over for Bob Stitt, the winningest coach in school history. The football team has played in the Rocky Mountain Athletic Conference since 1909. They have claimed to have won 21 conference titles, with 10 of them occurring prior to joining the RMAC (1888, 1890, 1891, 1892, 1893, 1897, 1898, 1904, 1906, 1907). They have won 11 conference titles in the RMAC (1912, 1914, 1918, 1939, 1942, 1951, 1958, 2004, 2010, 2014, 2016), with co-championships in the latter three years. They have made the NCAA Tournament four times in this century. As of the end of the 2017 season, the Orediggers have an all-time record of 460-548-30. Question: does colorado school of mines have a football team?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A root hair, or absorbent hair, the rhizoid of a vascular plant, is a tubular outgrowth of a trichoblast, a hair-forming cell on the epidermis of a plant root. As they are lateral extensions of a single cell and only rarely branched, they are visible to the naked eye and light microscope. They are found only in the region of maturation of the root. Just prior to, and during, root hair cell development, there is elevated phosphorylase activity. Question: do root hairs occur along the entire length of root?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In the final episode, Eric returns to Point Place for the New Year and he and Donna kiss. It is presumed that they end up together again at the end of the series and the end of the 1970s. Question: do donna and eric end up getting married?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the 1966 World Cup Finals, England used their home advantage and, under Ramsey, won their first, and only, World Cup title. England played all their games at Wembley Stadium in London, which became the last time that the hosts were granted this privilege. After drawing 0--0 in the opening game against former champions Uruguay, which started a run of four games all ending goalless. England then beat both France and Mexico 2--0 and qualified for the quarter-finals. Question: have france and england ever met in the world cup?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: British interests had to overcome obstacles of geography and climate, and the competing imperial schemes of the French and Portuguese mentioned above and of the Germans. In 1891, Germany secured the strategically critical territory of German East Africa, which along with the mountainous rainforest of the Belgian Congo precluded the building of a Cape-to-Cairo railway. Question: was germany in a position to block a british dream of building a railroad?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There"
    },
    {
        "question": "Passage: The Ontario Human Rights Code was the first law of its kind in Canada. Before June 15, 1962, various laws dealt with different kinds of discrimination. The Code brought them together into one law and added some new protections. Question: was ontario one of the last province to introduce human rights legislation?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The effects of high altitude on humans are considerable. The percentage oxygen saturation of hemoglobin determines the content of oxygen in blood. After the human body reaches around 2,100 m (7,000 feet) above sea level, the saturation of oxyhemoglobin begins to decrease rapidly. However, the human body has both short-term and long-term adaptations to altitude that allow it to partially compensate for the lack of oxygen. Athletes use these adaptations to help their performance. There is a limit to the level of adaptation; mountaineers refer to the altitudes above 8,000 metres (26,000 ft) as the ``death zone'', where it is generally believed that no human body can acclimatize. Question: is it hard to breathe at 7000 feet?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The National Guard of the United States, part of the reserve components of the United States Armed Forces, is a reserve military force, composed of National Guard military members or units of each state and the territories of Guam, the Virgin Islands, Puerto Rico, and the District of Columbia, for a total of 54 separate organizations. All members of the National Guard of the United States are also members of the militia of the United States as defined by 10 U.S.C. \u00a7 246. National Guard units are under the dual control of the state and the federal government. Question: does each state have their own national guard?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Kim Garner, the senior vice president of marketing and artist development at Universal Republic Records, said that Brand and Universal Pictures ``felt very strongly about doing something like this as opposed to a traditional soundtrack,'' and that they ``wanted to release it like we would an actual rock band's album.'' Question: is russell brand singing in get him to the greek?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Bronx Zoo is a zoo located within Bronx Park in the Bronx, a borough of New York City. It is the largest metropolitan zoo in the United States and among the largest in the world. On average, the zoo has 2.15 million visitors each year, and it comprises 265 acres (107 ha) of park lands and naturalistic habitats, through which the Bronx River flows. Question: is the bronx zoo the largest zoo in the world?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no mention in the passage that the Bronx Zoo is the largest zoo in the world. The passage states that it is among the largest in the world, which is different from being the largest. Therefore, the score is: 80"
    },
    {
        "question": "Passage: A yardstick is a straightedge used to physically measure lengths of up to one yard (3.0 feet or 0.9144 meters long) high. Yardsticks are flat boards with markings at regular intervals. In the metric system, a similar device measuring up to one meter is called a meter-stick. Question: is a meter stick the same as a yardstick?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Mickey's House and Meet Mickey is a walk through and Meet & Greet attraction at Mickey's Toontown in Disneyland and Toontown at Tokyo Disneyland. Similar attractions formerly existed in the Magic Kingdom and Hong Kong Disneyland. Question: is mickey and minnie's house still in magic kingdom?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A probability of precipitation (POP), (also expressed as: ``chance of precipitation,'' ``chance of rain'') is a description of the likelihood of precipitation that is often published with weather forecasts. Generally it refers to the probability that at least some minimum quantity of precipitation will occur within a specified forecast period and location. Question: is precipitation the same as chance of rain?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Quarter Pounder is a hamburger sold by international fast food chain McDonald's, so named for containing a patty with a precooked weight of a quarter of a pound (113.4 g). It was first introduced in 1971. In 2013, the Quarter Pounder was expanded to represent a whole line of hamburgers that replaced the company's discontinued Angus hamburger. In 2015, McDonald's increased the precooked weight to 4.25 oz (120.5 g). Question: does a quarter pounder weight a quarter pound?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Production of the American version was canceled by ABC in 2003 because of low ratings, with already-produced episodes airing first-run into 2004. The ABC Family cable channel, which had been airing repeats of the show since 2002, also showed ``new'' episodes from 2005 to 2006, formed from previously filmed but unaired performances. Question: did they cancel whose line is it anyway?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On 19 May 1999, LanChile (known as LAN and from 2016 as LATAM Chile) became a member-elect, the alliance's first representative from Latin America. LanChile's two subsidiaries, LAN Express and LAN Per\u00fa, would also join the alliance. Irish carrier Aer Lingus was formally elected on board and confirmed as the ninth member of the alliance on 2 December 1999. As LanChile and Aer Lingus joined on 1 June 2000, Canadian Airlines left the alliance, following the airline's purchase by Air Canada, a member of the rival Star Alliance. Question: is air canada part of one world alliance?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Conscription in the United States, commonly known as the draft, has been employed by the federal government of the United States in five conflicts: the American Revolution, the American Civil War, World War I, World War II, and the Cold War (including both the Korean War and the Vietnam War). The third incarnation of the draft came into being in 1940 through the Selective Training and Service Act. It was the country's first peacetime draft. From 1940 until 1973, during both peacetime and periods of conflict, men were drafted to fill vacancies in the United States Armed Forces that could not be filled through voluntary means. The draft came to an end when the United States Armed Forces moved to an all-volunteer military force. However, the Selective Service System remains in place as a contingency plan; all male civilians between the ages of 18 and 25 are required to register so that a draft can be readily resumed if needed. United States Federal Law also provides for the compulsory conscription of men between the ages of 17 and 45 and certain women for militia service pursuant to Article I, Section 8 of the United States Constitution and 10 U.S. Code \u00a7 246. Question: was there a draft in the revolutionary war?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The following is a list of awards and nominations received by English actor Tom Hardy. He was nominated for the Academy Award for Best Supporting Actor for the 2015 film The Revenant. He also won the 2011 BAFTA Rising Star Award, and has twice won the British Independent Film Award for Best Actor, for Bronson (2009) and Legend (2015). Question: did tom hardy won an oscar for the revenant?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A federal government is the common or national government of a federation. The United States is considered the first modern federation. After declaring independence from Britain, the U.S. adopted its first constitution, the Articles of Confederation in 1781. This was the first step towards federalism by establishing the confederal Congress. However, Congress was limited as to its ability to pursue economic, military, and judiciary reform. In 1787, a Constitutional Convention drafted the United States Constitution during the Philadelphia Convention. After the ratification of the Constitution by nine states in 1788, the U.S. was officially a federation, putting the U.S. in a unique position where the central government exists by the sufferance of the individual states rather than the reverse. Question: is the national government the same as the federal government?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: It begins off the east coast of Luzon, Philippines, Taiwan and flows northeastward past Japan, where it merges with the easterly drift of the North Pacific Current. It is analogous to the Gulf Stream in the Atlantic Ocean, transporting warm, tropical water northward toward the polar region. It is sometimes known as the Black Stream -- the English translation of kuroshio and an allusion to the deep blue of its water -- and also as the ``Japan Current'' (\u65e5\u672c\u6d77\u6d41, Nihon Kairy\u016b). Question: is there a gulf stream in the pacific?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The League of Assassins (renamed the League of Shadows or Society of Shadows in adapted works) is a group of fictional villains appearing in American comic books published by DC Comics. The group is depicted as a collective of assassins who work for Ra's al Ghul, an enemy of the superhero Batman. Question: is the league of assassins the same as the league of shadows?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In mathematical analysis, the maxima and minima (the respective plurals of maximum and minimum) of a function, known collectively as extrema (the plural of extremum), are the largest and smallest value of the function, either within a given range (the local or relative extrema) or on the entire domain of a function (the global or absolute extrema). Pierre de Fermat was one of the first mathematicians to propose a general technique, adequality, for finding the maxima and minima of functions. Question: can a local maximum also be an absolute maximum?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Houston Astros are an American professional baseball team based in Houston, Texas. The Astros compete in Major League Baseball (MLB) as a member club of the American League (AL) West division, having moved to the division in 2013 after spending their first 51 seasons in the National League (NL). The Astros have played their home games at Minute Maid Park since 2000. Question: did the astros used to be in the national league?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Although Scottish Gaelic and Irish are closely related as Goidelic Celtic languages (or Gaelic languages), they are in fact starkly different in many ways. While most dialects are not immediately mutually comprehensible (although many individual words and phrases are), speakers of the two languages can rapidly develop mutual intelligibility. Question: is scottish gaelic and irish gaelic the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Governments had attempted to reduce sleep-deprived driving through education messages and by ingraining roads with dents, known as rumble strips in the US, which cause a noise when drivers wander out of their lane. The Government of Western Australia recently introduced a ``Driver Reviver'' program where drivers can receive free coffee to help them stay awake. Question: is it illegal to drive with no sleep?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no mention in the passage about whether driving with no sleep is illegal or not. The passage talks about attempts by governments to reduce sleep-deprived driving through education messages and rumble strips, and a program in Western Australia where drivers can receive free coffee to help them stay awake. Therefore the score is: 50"
    },
    {
        "question": "Passage: To promote the film prior to its release, Townsend, along with the other actors who portrayed the fictional musical quartet The Five Heartbeats (Leon Robinson, Michael Wright, Harry J. Lennix, and Tico Wells) performed in a concert with real-life Soul/R&B vocal group The Dells, one of many groups that inspired the film. The Dells sang and recorded the vocals as the actors lip synced. Question: is there a group called the five heartbeats?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Rescue from Gilligan's Island is a 1978 made-for-television comedy film that continues the adventures of the shipwrecked castaways from the 1964--67 sitcom Gilligan's Island, starring Bob Denver and Alan Hale, Jr., and featuring all the original cast except Tina Louise. The film first aired on NBC as a two-part special on October 14 and October 21, 1978. The film has the characters finally being rescued after 15 years on the island. The film was directed by Leslie H. Martinson. Question: did they ever make it off gilligan island?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In mathematics, and more specifically set theory, the empty set or null set is the unique set having no elements; its size or cardinality (count of elements in a set) is zero. Some axiomatic set theories ensure that the empty set exists by including an axiom of empty set; in other theories, its existence can be deduced. Many possible properties of sets are vacuously true for the empty set. Question: is the empty set an element of the set containing the empty set?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Dr. Bailey is in labor, and without her husband Tucker Jones by her side, she refuses to push so she can give birth. George works with Addison to convince Bailey to have the baby. He finally gets through to Bailey by giving her the motivation that she needs, and ultimately he holds her while she delivers the baby. Izzie and Alex have sex again. Chief Richard Webber is under a lot of stress from everything that's been going on, and it is believed that he is having a heart attack, which lures his wife Adele to the hospita(anxiety attack)). Dr. Bailey's husband goes into cardiac arrest. Meredith finally removes the explosive from the patient, and Dylan, the leader of the bomb squad, carries it away. Meredith steps out of the operating room into the hallway, curiously watching Dylan walk away with the explosive, and at that moment, the bomb explodes, killing Dylan and a second bomb squad member. Meredith is knocked unconscious by the explosion. There is a revival of the ``shower scene'' from the first part, but with a more serious tone: the fully clothed Izzie and Cristina wash blood off of a stunned Meredith as George looks on. Dr. Bailey's husband and the man who had the explosive embedded in his body both survive. At the end of the episode, Preston and Derek become friends, overcoming their initial rivalry in the series beginning, and call each other by their first names. Cristina says ``I love you, too'' to a sleeping Preston. Derek comes to visit Meredith and says, ``You almost died today,'' and Meredith tells him that she can't remember their last kiss. Derek recalls the kiss for her, telling her that she ``smelled like some kind of flower,'' which Meredith says was lavender, and then he leaves. Question: does the bomb go off in grey's anatomy?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Home equity is not liquid. Home equity management refers to the process of using equity extraction via loans, at favorable, and often tax-favored, interest rates, to invest otherwise illiquid equity in a target that offers higher returns. Question: is equity in your home a liquid asset?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information given in the passage to determine if the student's answer is correct or not, as the passage does not define what a liquid asset is or provide a comparison between liquid and non-liquid assets. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The lake was originally named the Andia-ta-roc-te by local Native Americans. James Fenimore Cooper in his narrative Last of the Mohicans called it the Horican, after a tribe which may have lived there, because he felt the original name was too hard to pronounce. Question: is lake george a man-made lake?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The gallbladder is shaped like a pear, with its tip opening into the cystic duct. The gallbladder is divided into three sections: the fundus, body, and neck. The fundus is the rounded base, angled so that it faces the abdominal wall. The body lies in a depression in the surface of the lower liver. The neck tapers and is continuous with the cystic duct, part of the biliary tree. The gallbladder fossa, against which the fundus and body of the gallbladder lie, is found beneath the junction of hepatic segments IVB and V. The cystic duct unites with the common hepatic duct to become the common bile duct. At the junction of the neck of the gallbladder and the cystic duct, there is an out-pouching of the gallbladder wall forming a mucosal fold known as ``Hartmann's pouch''. Question: is the fundus of the gallbladder located near the cystic duct?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Tobramycin/dexamethasone (INNs, trade name Tobradex, Tobrason in Jordan) is a prescription medication in the form of eye drops and eye ointment, marketed by Alcon. The active ingredients are tobramycin 0.3% (an antibiotic) and dexamethasone 0.1% (a corticosteroid). It is prescribed for a wide spectrum of bacterial eye infections. Tobradex can also be used to clear or contract styes that are also found in the eye. It is prescribed for the treatment of pink eye in combination with bacterial infections. Because it contains a steroid, careful use with gradual reduction of doses is required. Question: can tobramycin and dexamethasone be used for a stye?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no mention of styes in the passage, but Tobramycin/dexamethasone is prescribed for a wide spectrum of bacterial eye infections. Since a stye is a bacterial infection, it is possible that Tobramycin/dexamethasone can be used for a stye. Therefore, the score is: 80"
    },
    {
        "question": "Passage: The chickpea or chick pea (Cicer arietinum) is a legume of the family Fabaceae, subfamily Faboideae. Its different types are variously known as gram or Bengal gram, garbanzo or garbanzo bean, and Egyptian pea. Chickpea seeds are high in protein. It is one of the earliest cultivated legumes and 7500-year-old remains have been found in the Middle East. Question: are chickpeas and garbonzo beans the same thing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There"
    },
    {
        "question": "Passage: The Strangers: Prey at Night is a 2018 American slasher film directed by Johannes Roberts and starring Christina Hendricks, Martin Henderson, Bailee Madison and Lewis Pullman. A sequel to the 2008 film The Strangers, it is written by Bryan Bertino (who wrote and directed the first film) and Ben Ketai. Mike and his wife Cindy take their son and daughter on a road trip that becomes their worst nightmare. The family members soon find themselves in a desperate fight for survival when they arrive at a secluded mobile home park that's mysteriously deserted -- until three masked psychopaths show up to satisfy their thirst for blood. Question: is the strangers prey at night a remake?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The New York Times called the film a ``completely unbelievable but wholly captivating flight into fancy composed of unequal dollops of comedy, romance, poignancy, funny colloquialisms and Manhattan's swankiest East Side areas captured in the loveliest of colors''. In reviewing the performances, the newspaper said Holly Golightly is Question: is breakfast at tiffany's black and white?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: An income statement or profit and loss account (also referred to as a profit and loss statement (P&L), statement of profit or loss, revenue statement, statement of financial performance, earnings statement, operating statement, or statement of operations) is one of the financial statements of a company and shows the company's revenues and expenses during a particular period. It indicates how the revenues (money received from the sale of products and services before expenses are taken out, also known as the ``top line'') are transformed into the net income (the result after all revenues and expenses have been accounted for, also known as ``net profit'' or the ``bottom line''). The purpose of the income statement is to show managers and investors whether the company made or lost money during the period being reported. Question: are profit and loss and income statement the same thing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Montana has some of the most permissive gun laws in the United States. It is a ``shall issue'' state for concealed carry. The county sheriff shall issue a concealed weapons permit to a qualified applicant within 60 days. Concealed carry is not allowed in government buildings, financial institutions, or any place where alcoholic beverages are served. Carrying a concealed weapon while intoxicated is prohibited. No weapons, concealed or otherwise, are allowed in school buildings. Montana recognizes concealed carry permits issued by most but not all other states. Concealed carry without a permit is generally allowed outside city, town, or logging camp limits. Under Montana law a permit is necessary only when the weapon is `` wholly or partially covered by the clothing or wearing apparel ``, therefore it is legal to carry and/or keep a firearm inside a vehicle without a permit (as long as it is not concealed on the person). If you do not have a CWP it could be considered a violation of the law for you to conceal a gun in a purse or backpack, since the law defines a concealed weapon as one that is ``wholly or partially covered by the clothing or wearing apparel of the person carrying or bearing the weapon. As of 2017, the concealed weapons law applies only to firearms, excluding items such as knives, slingshots, billies, etc. from the permit requirement. Question: can you carry a gun in your car in montana?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There"
    },
    {
        "question": "Passage: Sling Blade is a 1996 American drama film written and directed by Billy Bob Thornton, who also stars in the lead role. Set in rural Arkansas, the film tells the story of a man named Karl Childers who has an intellectual disability and is released from a psychiatric hospital, where he has lived since killing his mother and her lover when he was 12 years old, and the friendship he develops with a young boy and his mother. In addition to Thornton, it stars Dwight Yoakam, J.T. Walsh, John Ritter, Lucas Black, Natalie Canerday, James Hampton, and Robert Duvall. Question: is the movie sling blade a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 2011, Philip Pullman remarked at the British Humanist Association annual conference that due to the first film's disappointing sales in the United States, there would not be any sequels made. Question: is there a follow up film to the golden compass?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The yard (abbreviation: yd) is an English unit of length, in both the British imperial and US customary systems of measurement, that comprises 3 feet or 36 inches. It is by international agreement in 1959 standardized as exactly 0.9144 meters. A metal yardstick originally formed the physical standard from which all other units of length were officially derived in both English systems. Question: are a meter and a yard the same length?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: An echocardiogram, often referred to as a cardiac echo or simply an echo, is a sonogram of the heart. (It is not abbreviated as ECG, because that is an abbreviation for an electrocardiogram.) Echocardiography uses standard two-dimensional, three-dimensional, and Doppler ultrasound to create images of the heart. Question: is an echocardiogram the same as a sonogram?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: There is no brachiocephalic artery for the left side of the body. The left common carotid, and the left subclavian artery, come directly off the aortic arch. However, there are two brachiocephalic veins. Question: is there a right and left brachiocephalic artery?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Twenty-eight states have some form of a ``three-strikes'' law. A person accused under such laws is referred to in a few states (notably Connecticut and Kansas) as a ``persistent offender'', while Missouri uses the unique term ``prior and persistent offender''. In most jurisdictions, only crimes at the felony level qualify as serious offenses; however, misdemeanor offenses can qualify for application of the three-strikes law in California, whose harsh application has been the subject of controversy. Question: is the 3 strikes law still in effect?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 75"
    },
    {
        "question": "Passage: A driver's permit, learner's permit, learner's license or provisional license, is a restricted license that is given to a person who is learning to drive, but has not yet satisfied the requirement to obtain a driver's license. Having a driver's permit for a certain length of time is usually one of the requirements (along with driver's education and a road test) for applying for a full driver's license. To get a learner's permit, one must typically pass a written permit test, traffic, and rules of the road. Question: is a learner's license the same as a permit?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Color Purple is a 1982 epistolary novel by American author Alice Walker which won the 1983 Pulitzer Prize for Fiction and the National Book Award for Fiction. It was later adapted into a film and musical of the same name. Question: was the color purple based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On 19 October 2015, the Seven Network and South Pacific Pictures renewed the show for a second season. It premiered on 23 August 2016 in Australia. On January 24, 2017, the Seven Network announced that the series had been renewed for a third season. It screened from 12 September 2017 with a mid-season finale after 8 episodes. Question: is there a 3rd season of 800 words?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: As the Confederation Congress attempted to govern the continually growing American states, delegates discovered that the limitations placed upon the central government rendered it ineffective at doing so. As the government's weaknesses became apparent, especially after Shays' Rebellion, individuals began asking for changes to the Articles. Their hope was to create a stronger national government. Initially, some states met to deal with their trade and economic problems. However, as more states became interested in meeting to change the Articles, a meeting was set in Philadelphia on May 25, 1787. This became the Constitutional Convention. It was quickly realized that changes would not work, and instead the entire Articles needed to be replaced. On March 4, 1789, the government under the Articles was replaced with the federal government under the Constitution. The new Constitution provided for a much stronger federal government by establishing a chief executive (the President), courts, and taxing powers. Question: do we still follow the articles of confederation?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Shortly before the release of On Stranger Tides, it was reported that Disney was planning to shoot the fifth and the sixth films back-to-back, although it was later revealed that only the fifth film was in development. On March 4, 2017, director Joachim R\u00f8nning stated that Dead Men Tell No Tales was only the beginning of the final adventure, implying that it would not be the last film of the franchise and that a sixth film could be released. Question: is there going to be any more pirates of the caribbean films?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In March 1500, Spanish conquistador Vicente Y\u00e1\u00f1ez Pinz\u00f3n was the first documented European to sail up the Amazon River. Pinz\u00f3n called the stream R\u00edo Santa Mar\u00eda del Mar Dulce, later shortened to Mar Dulce, literally, sweet sea, because of its fresh water pushing out into the ocean. Another Spanish explorer, Francisco de Orellana, was the first European to travel from the origins of the upstream river basins, situated in the Andes, to the mouth of the river. In this journey, Orellana baptised some of the affluents of the Amazonas like Rio Negro, Napo and Jurua. The name Amazonas is taken from the native warriors that attacked this expedition, mostly women, that reminded De Orellana of the mythical female Amazon warriors from the ancient Hellenic culture in Greece. Question: is the water in the amazon river fresh?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The neighborhood is bordered by Broadway to the east, the North River (part of the Hudson River) to the west, Houston Street to the south, and 14th Street to the north, and roughly centered on Washington Square Park and New York University. The neighborhoods surrounding it are the East Village and NoHo to the east, SoHo to the south, and Chelsea to the north. The East Village was formerly considered part of the Lower East Side and has never been considered a part of Greenwich Village. The western part of Greenwich Village is known as the West Village; the dividing line of its eastern border is debated. Some believe it starts at Seventh Avenue and its southern extension, a border to the west of which the neighborhood changes substantially in character and becomes heavily residential. Others say the West Village starts one avenue further east at Sixth Avenue, where the east-west streets in the city's grid plan start to orient themselves on an angle to the traditionally perpendicular grid plan occupying most of Manhattan. The Far West Village is another sub-neighborhood of Greenwich Village that is bordered on its west by the Hudson River and on its east by Hudson Street. Greenwich Village is located in New York's 10th congressional district, New York's 25th State Senate district, New York's 66th State Assembly district, and New York City Council's 3rd district. Question: are greenwich village and the west village the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: All widely cultivated bananas today descend from the two wild bananas Musa acuminata and Musa balbisiana. While the original wild bananas contained large seeds, diploid or polyploid cultivars (some being hybrids) with tiny seeds are preferred for human raw fruit consumption. These are propagated asexually from offshoots. The plant is allowed to produce two shoots at a time; a larger one for immediate fruiting and a smaller ``sucker'' or ``follower'' to produce fruit in 6--8 months. Question: can you grow banana trees from a banana?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On 7 April 1916, I Anzac Corps took up positions in a quiet sector south of Armenti\u00e8res, known as the ``Nursery''. The Australians were spared from participating in the disastrous first day on the Somme, nevertheless within three weeks of the beginning of the Allied offensive, four divisions of the AIF had been committed to the battle. Only the 3rd Division did not take part, having only just recently arrived in England from Australia. The 5th Division, positioned on the left flank of the salient, was the first to see action during the Battle of Fromelles on 19 July 1916, suffering a staggering 5,533 casualties in a single day. Question: did australia fight in the battle of the somme?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Although it is widely believed that a worker honey bee can sting only once, this is a partial misconception: although the stinger is in fact barbed so that it lodges in the victim's skin, tearing loose from the bee's abdomen and leading to its death in minutes, this only happens if the skin of the victim is sufficiently thick, such as a mammal's. Honey bees are the only hymenoptera with a strongly barbed sting, though yellow jackets and some other wasps have small barbs. Question: do bees die if they lose their stinger?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a misconception that a worker honey bee can only sting once. The passage explains that the stinger is in fact barbed, which lodges in the victim's skin, tearing loose from the bee's abdomen and leading to its death in minutes. However, this only happens if the skin of the victim is thick, such as a mammal's. Honey bees are the only hymenoptera with a strongly barbed sting, though yellow jackets and some other wasps have small barbs.\\n\\nTherefore the score is: 80"
    },
    {
        "question": "Passage: A dorsal root ganglion (or spinal ganglion) (also known as a posterior root ganglion), is a cluster of neurons (a ganglion) in a dorsal root of a spinal nerve. The cell bodies of sensory neurons known as first-order neurons are located in the dorsal root ganglia. Question: does the dorsal root ganglion carry sensory input?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a need to provide the student's answer before I can evaluate it."
    },
    {
        "question": "Passage: The attack commenced at 7:48 a.m. Hawaiian Time (18:18 GMT). The base was attacked by 353 Imperial Japanese aircraft (including fighters, level and dive bombers, and torpedo bombers) in two waves, launched from six aircraft carriers. All eight U.S. Navy battleships were damaged, with four sunk. All but the USS Arizona were later raised, and six were returned to service and went on to fight in the war. The Japanese also sank or damaged three cruisers, three destroyers, an anti-aircraft training ship, and one minelayer. One hundred eighty-eight U.S. aircraft were destroyed; 2,403 Americans were killed and 1,178 others were wounded. Important base installations such as the power station, dry dock, shipyard, maintenance, and fuel and torpedo storage facilities, as well as the submarine piers and headquarters building (also home of the intelligence section), were not attacked. Japanese losses were light: 29 aircraft and five midget submarines lost, and 64 servicemen killed. One Japanese sailor, Kazuo Sakamaki, was captured. Question: were any japanese planes shot down at pearl harbor?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Billy Thunderman (Diego Velazquez) is the third-born Thunderman child. He is an energetic little brother to Phoebe and Max and older brother to Nora and Chloe. His superpower is super-speed. In one episode, it was revealed that Barb gave birth to Billy in the air while her husband was transporting her to a hospital, implying that Billy likely hit his head after birth, which is probably why he is sometimes unintelligent. Question: are billy and nora from the thundermans twins?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In common parlance, the term jet engine loosely refers to an internal combustion airbreathing jet engine. These typically feature a rotating air compressor powered by a turbine, with the leftover power providing thrust via a propelling nozzle -- this process is known as the Brayton thermodynamic cycle. Jet aircraft use such engines for long-distance travel. Early jet aircraft used turbojet engines which were relatively inefficient for subsonic flight. Modern subsonic jet aircraft usually use more complex high-bypass turbofan engines. These engines offer high speed and greater fuel efficiency than piston and propeller aeroengines over long distances. Some jet engines optimized for high speed applications (ramjets and scramjets) use the ram effect of the vehicle's speed instead of a mechanical compressor. Question: is a jet engine an external combustion engine?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Under the REAL ID Act, the passport card is also accepted for federal purposes (such as domestic air travel or entering federal buildings), which may make it an attractive option for people living in states whose driver's licenses and ID cards are not REAL ID--compliant when those requirements go into effect. TSA regulations list the passport card as an acceptable identity document at airport security checkpoints. For the purposes of state voter photo identification initiatives, the passport card has been deemed acceptable. Question: is a us passport card a valid form of id?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no need to provide a score for this question as the student's answer is correct according to the passage in the question. The Passage clearly states that the passport card is accepted for federal purposes, including domestic air travel and entering federal buildings, and it is listed as an acceptable identity document at airport security checkpoints. Additionally, it is deemed acceptable for state voter photo identification initiatives. Therefore, the student's answer is correct."
    },
    {
        "question": "Passage: When entering into force in 1994, the EEA parties were 17 states and two European Communities: the European Community, which was later absorbed into the EU's wider framework, and the now defunct European Coal and Steel Community. Membership has grown to 31 states as of 2016: 28 EU member states, as well as three of the four member states of the EFTA (Iceland, Liechtenstein and Norway). The Agreement is applied provisionally with respect to Croatia--the remaining and most recent EU member state--pending ratification of its accession by all EEA parties. One EFTA member, Switzerland, has not joined the EEA, but has a series of bilateral agreements with the EU which allow it also to participate in the internal market. Question: is israel part of the european economic area?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no mention of Israel in the passage, so the student's answer cannot be accurately judged."
    },
    {
        "question": "Passage: 10 Downing Street, colloquially known in the United Kingdom as Number 10, is the headquarters of the Government of the United Kingdom and the official residence and office of the First Lord of the Treasury, a post which, for much of the 18th and 19th centuries and invariably since 1905, has been held by the Prime Minister. Question: does the prime minister live at number 10 downing street?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Castling consists of moving the king two squares towards a rook on the player's first rank , then moving the rook to the square over which the king crossed. Castling may only be done if the king has never moved, the rook involved has never moved, the squares between the king and the rook involved are unoccupied, the king is not in check, and the king does not cross over or end on a square in which it would be in check. Castling is one of the rules of chess and is technically a king move (Hooper & Whyld 1992:71). Question: can you castle after being checked in chess?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the United States House of Representatives, the filibuster (the right to unlimited debate) was used until 1842, when a permanent rule limiting the duration of debate was created. The disappearing quorum was a tactic used by the minority until Speaker Thomas Brackett Reed eliminated it in 1890. As the membership of the House grew much larger than the Senate, the House had acted earlier to control floor debate and the delay and blocking of floor votes. On February 7, 2018, Nancy Pelosi set a record for the longest speech on the House floor (8 hours and 7 minutes), in support of Deferred Action for Childhood Arrivals. Question: can a filibuster take place in the house?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Brexit (/\u02c8br\u025bks\u026at, \u02c8br\u025b\u0261z\u026at/) is the impending withdrawal of the United Kingdom (UK) from the European Union (EU). In a referendum on 23 June 2016, a majority of British voters supported leaving the EU. On 29 March 2017, the UK government invoked Article 50 of the Treaty on European Union. The United Kingdom is due to leave the EU on 29 March 2019 at 11 p.m. UTC (midnight Central European Time), when the period for negotiating a withdrawal agreement will end unless an extension is agreed. Question: did the united kingdom leave the european union?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: Rumors of this chemical's existence go back at least as far as 1958, and the story is commonly told to children by parents who do not wish them to urinate in the pool. A 1985 biography of Orson Welles describes him using such a dye as part of a prank in 1937, and references to the substance can be found through popular culture over quite a lengthy time period. However, such a dye does not exist, and although a chemical could be manufactured that would react with urine, it would be difficult to prevent it from reacting to other organic substances present in pool water. Question: can pool water change color if you pee in it?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In Japanese martial arts the further subdivisions of black belt ranks may be linked to dan grades and indicated by 'stripes' on the belt. Y\u016bdansha (roughly translating from Japanese to ``person who holds a dan grade'') is often used to describe those who hold a black belt rank. While the belt remains black, stripes or other insignia may be added to denote seniority, in some arts, very senior grades will wear differently colored belts. In judo and some forms of karate, a sixth dan will wear a red and white belt. The red and white belt is often reserved only for ceremonial occasions, and a regular black belt is still worn during training. At 9th or 10th dan some schools award red. In some schools of Jujutsu, the Shihan rank and higher wear purple belts. These other colors are often still referred to collectively as ``black belts''. Question: is there a belt above black in karate?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Leaning Tower of Pisa (Italian: Torre pendente di Pisa) or simply the Tower of Pisa (Torre di Pisa (\u02c8torre di \u02c8pi\u02d0za)) is the campanile, or freestanding bell tower, of the cathedral of the Italian city of Pisa, known worldwide for its unintended tilt. The tower is situated behind the Pisa Cathedral and is the third oldest structure in the city's Cathedral Square (Piazza del Duomo), after the cathedral and the Pisa Baptistry. Question: is the leaning tower of pisa in rome?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: As addressed within Rule 9.02(a)(1) of the Official Baseball Rules a sacrifice fly is not counted as a time at bat for the batter, though the batter is credited with a run batted in. Question: does sac fly count as an at bat?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a crucial detail missing from the student's answer. While it is true that a sacrifice fly is not counted as a time at bat for the batter, it is also true that the batter is credited with a run batted in. Therefore, the score is: 60/100"
    },
    {
        "question": "Passage: The film is set one year after the events of Avengers: Age of Ultron. Captain America: Civil War introduces Tom Holland as Peter Parker / Spider-Man and Chadwick Boseman as T'Challa / Black Panther to the MCU, who appear in solo films in 2017 and 2018, respectively. William Hurt reprises his role as Thunderbolt Ross from The Incredible Hulk, and is now the US Secretary of State. For the mid-credits scene, in which Black Panther offers Captain America and Bucky Barnes asylum in Wakanda, Joe and Anthony Russo received input from Black Panther director Ryan Coogler on the look and design of Wakanda. Question: did civil war come out before age of ultron?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Their first appearance in the finals was in Italy at the 1990 FIFA World Cup. 1990 was also their best performance in a major championship, where they reached the quarter finals. Question: did ireland ever win the soccer world cup?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the third season, Damon helps Elena in bringing his brother, Stefan, back to Mystic Falls after Stefan becomes Klaus' henchman. The arrangement transpired after a bargain for his blood that would cure Damon of the werewolf bite he had received from Tyler. At first, he is reluctant to involve Elena in the rescue attempts, employing Alaric Saltzman, Elena's guardian, instead as Klaus does not know that Elena is alive after the sacrifice which frees Klaus' hybrid side. However, Elena involves herself, desperate to find Stefan. Damon, though hesitant at first, is unable to refuse her because of his love for her. He also points out to her that she once turned back from finding Stefan since she knew Damon would be in danger, clearly showing that she also has feelings for him. He tells her that ``when (he) drag(s) (his) brother from the edge to deliver him back to (her), (he) wants her to remember the things (she) felt while he was gone.'' When Stefan finally returns to Mystic Falls, his attitude is different from that of the first and second seasons. This causes a rift between Elena and Stefan whereas the relationship between Damon and Elena becomes closer and more intimate. A still loyal Elena, however, refuses to admit her feelings for Damon. In 'Dangerous Liaisons', Elena, frustrated with her feelings for him, tells Damon that his love for her may be a problem, and that this could be causing all their troubles. This incenses Damon, causing him to revert to the uncaring and reckless Damon seen in the previous seasons. The rocky relationship between the two continues until the sexual tension hits the fan and in a moment of heated passion, Elena -- for the first time in the three seasons -- kisses Damon of her own accord. This kiss finally causes Elena to admit that she loves both brothers and realize that she must ultimately make her choice as her own ancestress, Katherine Pierce, who turned the brothers, once did. In assessment of her feelings for Damon, she states this: ``Damon just sort of snuck up on me. He got under my skin and no matter what I do, I can't shake him.'' In the season finale, a trip designed to get her to safety forces Elena to make her choice: to go to Damon and possibly see him one last time; or to go to Stefan and her friends and see them one last time. She chooses the latter when she calls Damon to tell him her decision. Damon, who is trying to stop Alaric, accepts what she says and she tells him that maybe if she had met Damon before she had met Stefan, her choice may have been different. This statement causes Damon to remember the first night he did meet Elena which was, in fact, the night her parents died - before she had met Stefan. Not wanting anyone to know he was in town and after giving her some advice about life and love, Damon compels her to forget. He remembers this as he fights Alaric and seems accepting of his death when Alaric, whose life line is tied to Elena's, suddenly collapses in his arms. Damon is grief-stricken, knowing that this means that Elena has also died and yells, ``No! You are not dead!'' A heartbroken Damon then goes to the hospital demanding to see Elena when the doctor, Meredith Fell, tells him that she gave Elena vampire blood. The last shot of the season finale episode shows Elena in transition. Question: does damon and elena get together in season 3?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Monozygotic twins are genetically nearly identical and they are always the same sex unless there has been a mutation during development. The children of monozygotic twins test genetically as half-siblings (or full siblings, if a pair of monozygotic twins reproduces with another pair or with the same person), rather than first cousins. Identical twins do not have the same fingerprints however, because even within the confines of the womb, the fetuses touch different parts of their environment, giving rise to small variations in their corresponding prints and thus making them unique. Question: can you have a boy and a girl identical twins?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The endocrine system is a chemical messenger system consisting of hormones, the group of glands of an organism that carry those hormones directly into the circulatory system to be carried towards distant target organs, and the feedback loops of homeostasis that the hormones drive. In humans, the major endocrine glands are the thyroid gland and the adrenal glands. In vertebrates, the hypothalamus is the neural control center for all endocrine systems. The field of study dealing with the endocrine system and its disorders is endocrinology, a branch of internal medicine. Question: is the prostate part of the endocrine system?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: All rounds are best-of-seven series. Series are played in a 2--2--1--1--1 format, meaning the team with home-court advantage hosts games 1, 2, 5, and 7, while their opponent hosts games 3, 4, and 6, with games 5--7 being played if needed. This format has been used since 2014, after NBA team owners unanimously voted to change from a 2--3--2 format on October 23, 2013. Question: is the first round of the nba playoffs a 5 game series?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Mamma Mia! is based on the songs of ABBA, a Swedish pop/dance group active from 1972 to 1982 and one of the most popular international pop groups of all time, topping the charts again and again in Europe, North and South America and Australia. Following the premiere of the musical in London in 1999, ABBA Gold topped the charts in the United Kingdom again. This musical was the brainchild of producer Judy Craymer. She met songwriters Bj\u00f6rn Ulvaeus and Benny Andersson in 1983 when they were working with Tim Rice on Chess. It was the song ``The Winner Takes It All'' that suggested to her the theatrical potential of their pop songs. The songwriters were not enthusiastic, but they were not completely opposed to the idea. Question: did abba write songs specifically for mamma mia?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The U.S. Passport Card is the de facto national identification card of the United States and a limited travel document issued by the federal government of the United States in the size of a credit card. Like a U.S. passport book, the passport card is only issued to U.S. citizens and U.S. nationals exclusively by the U.S. Department of State and is compliant to the standards for identity documents set by the REAL ID Act and can be used as proof of U.S. citizenship. The passport card's intended primary purpose is for identification and to allow cardholders to travel by domestic air flights within the United States and to enter and exit the United States via land and sea between member states of the Western Hemisphere Travel Initiative (WHTI). However, the passport card cannot be used for international air travel. Question: can you travel internationally with a passport card?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passport card can be used for land and sea travel between member states of the Western Hemisphere Travel Initiative (WHTI), but it cannot be used for international air travel. Therefore the score is: 50"
    },
    {
        "question": "Passage: In the United States of America, Canada, and other countries participating in the North American Numbering Plan, a toll-free telephone number has one of the area codes 800, 833, 844, 855, 866, 877, and 888. Question: is 866 a toll free number in canada?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: ``She's Like the Wind'' is a 1987 power ballad from the film Dirty Dancing, performed by Patrick Swayze. Though Swayze is the primary vocalist on the single, it was billed as being performed by ``Patrick Swayze featuring Wendy Fraser''; Fraser is heard throughout much of the song, specifically in the final chorus. The single reached number three on the Billboard Hot 100 and number one on the Adult Contemporary chart. Question: was she like the wind in dirty dancing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The queen can be moved any number of unoccupied squares in a straight line vertically, horizontally, or diagonally, thus combining the moves of the rook and bishop. The queen captures by occupying the square on which an enemy piece sits. Question: can the queen move like the knight in chess?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. According to the passage, the queen can move like the knight in chess. It can move any number of unoccupied squares in a straight line vertically, horizontally, or diagonally, just like the knight. Therefore the score is: 60"
    },
    {
        "question": "Passage: The Law Enforcement Officers Safety Act (LEOSA) is a United States federal law, enacted in 2004, that allows two classes of persons--the ``qualified law enforcement officer'' and the ``qualified retired or separated law enforcement officer''--to carry a concealed firearm in any jurisdiction in the United States, regardless of state or local laws, with certain exceptions. Question: can a police officer carry in any state?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Muay Thai (Thai: \u0e21\u0e27\u0e22\u0e44\u0e17\u0e22, RTGS: Muai Thai, pronounced (m\u016ba\u032fj th\u0101j) ( listen)) or Thai boxing is a combat sport of Thailand that uses stand-up striking along with various clinching techniques. This discipline is known as the ``Art of Eight Limbs'' because it is characterized by the combined use of fists, elbows, knees and shins. Muay Thai became widespread internationally in the twentieth century, when practitioners from Thailand began competing in Kickboxing, mixed rules matches, as well as matches under Muay Thai rules around the world. The professional league is governed by The Professional Boxing Association of Thailand (P.A.T) sanctioned by The Sports Authority of Thailand (S.A.T.), and World Professional Muaythai Federation (WMF) overseas. Question: is there a difference between muay thai and thai boxing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Flipper was a mascot for the Miami Dolphins from 1966 to 1968. She was situated in a fish tank in the open (east) end of the Orange Bowl, and was trained to jump in the tank when a touchdown or field goal was scored. Question: do the miami dolphins have a real dolphin in their stadium?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Generally, if a stopped school bus is displaying a flashing, alternating red lamp, a driver of a vehicle meeting or overtaking the stopped bus from either direction (front or back) must stop and wait until the bus moves again or the red light is off. Police officers, school crossing guards, and even school bus drivers themselves may have the power to wave traffic on, even when a red light is flashing. Question: do all lanes of traffic have to stop for a school bus?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. According to the passage, all lanes of traffic do not have to stop for a school bus. Only the driver of a vehicle meeting or overtaking the stopped bus from either direction (front or back) must stop and wait until the bus moves again or the red light is off. Therefore the score is: 50"
    },
    {
        "question": "Passage: Two-stroke oil (also referred to as two-cycle oil, 2-cycle oil, 2T oil, 2-stroke oil or petroil) is a special type of motor oil intended for use in crankcase compression two-stroke engines. Question: is there a difference between 2 cycle and 2 stroke?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Set and filmed in New York City and based on Candace Bushnell's 1997 book of the same name, the show follows the lives of a group of four women--three in their mid-thirties and one in her forties--who, despite their different natures and ever-changing sex lives, remain inseparable and confide in each other. Starring Sarah Jessica Parker (as Carrie Bradshaw), Kim Cattrall (as Samantha Jones), Kristin Davis (as Charlotte York), and Cynthia Nixon (as Miranda Hobbes), the quirky series had multiple continuing storylines that tackled relevant and modern social issues such as sexuality, safe sex, promiscuity, and femininity, while exploring the difference between friendships and romantic relationships. The deliberate omission of the better part of the early lives of the four women was the writers' way of exploring social life--from sex to relationships--through each of their four very different, individual perspectives. Question: is sex and the city filmed in new york?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no need to provide a score for the student's answer as it is already provided in the passage as True. The final sentence of your response can be \\\"According to the passage, the show 'Sex and the City' is filmed in New York City.\\"
    },
    {
        "question": "Passage: AMC announced the series' cancellation in July 2012, but picked it up for a third season after a renegotiation with Fox Television Studios and Netflix. The Killing was again cancelled by AMC in September 2013, but Netflix announced in November 2013 that it had ordered a fourth season consisting of six episodes to conclude the series. The complete fourth season was released on Netflix on August 1, 2014. Question: will there be a 5th season of the killing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Canadian applicants to the bar must obtain admission (referred to as the ``call to the bar'') to one of the provincial or territorial Law Societies in the various jurisdictions of Canada. As an example, in order to sit for the bar exam, the Law Society of British Columbia requires that a student complete an undergraduate degree in any discipline (B.A. of four years), and an undergraduate law degree (LL.B. and/or B.C.L., three to four years) or Juris Doctor (three years). The applicant must complete an apprenticeship referred to as ``articling'' (nine to fifteen months depending on the jurisdiction and nature of the articling process). Question: can anyone take the bar exam in canada?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: It is illegal to sell packaged liquor (off-premises sales) on Sundays. Sales also are prohibited on Memorial Day, Independence Day, Labor Day, Thanksgiving Day, and Christmas Day. Low-point beer for consumption off-premises may not be sold between 2:00 a.m. and 6:00 a.m. Question: are liquor stores open on july 4th oklahoma?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The requirement for a visa was removed by Indonesia and Ukraine in July 2017, Qatar in August 2017, Serbia in September 2017, Tunisia in October 2017. Visa free status was granted to parts of the Russian Far East: Primorye and the rest of Khabarovsk, Sakhalin, Chukotka and Kamchatka regions in 2018. Question: do indian passport holder need visa for indonesia?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: As state law on waiting periods and background checks do not apply to sales by non-licensed sellers, the Florida Constitution, Art VIII Sec. 5(b), permits counties to enact ordinances that require a criminal history records check and a 3 to 5-day waiting period for non-licensed sellers when any part of a firearm sale is conducted on property to which the public has the ``right of access'', such as at a gun show conducted on public property. These local option ordinances may not be applied to holders of a concealed weapons license. Only Miami-Dade, Broward, Palm Beach, Hillsborough, and Volusia counties had enacted such ordinances. Question: can you sell a gun privately in florida?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Tooth development is the complex process by which teeth form from embryonic cells, grow, and erupt into the mouth. Although many diverse species have teeth, their development is largely the same as in humans. For human teeth to have a healthy oral environment, enamel, dentin, cementum, and the periodontium must all develop during appropriate stages of fetal development. Primary teeth start to form in the development of the embryo between the sixth and eighth weeks, and permanent teeth begin to form in the twentieth week. If teeth do not start to develop at or near these times, they will not develop at all. Question: are babies born with 2 sets of teeth?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Shooter is an American television drama series based on the 2007 film of the same name and the novel Point of Impact by Stephen Hunter. The show stars Ryan Phillippe in the lead role of Bob Lee Swagger, an expert marksman living in exile who is coaxed back into action after learning of a plot to kill the president. USA Network picked up the pilot in August 2015 and ordered the pilot to series in February 2016. Question: is the shooter tv show the same as the movie?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The two equal sides are called the legs and the third side is called the base of the triangle. The other dimensions of the triangle, such as its height, area, and perimeter, can be calculated by simple formulas from the lengths of the legs and base. Every isosceles triangle has an axis of symmetry along the perpendicular bisector of its base. The two angles opposite the legs are equal and are always acute, so the classification of the triangle as acute, right, or obtuse depends only on the angle between its two legs. Question: are the base angles of an isosceles triangle equal?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. According to the passage, the two angles opposite the legs are equal and are always acute, but it does not explicitly state that the base angles of an isosceles triangle are equal. Therefore, the score is: 50/100"
    },
    {
        "question": "Passage: A Long Island Iced Tea is a type of alcoholic mixed drink typically made with vodka, tequila, light rum, triple sec, gin, and a splash of cola, which gives the drink the same amber hue as its namesake. A popular version mixes equal parts vodka, gin, rum, triple sec, with \u200b1 \u2044 parts sour mix and a splash of cola. Lastly, it is decorated with the lemon and straw, after stirring with bar spoon smoothly. Question: do long island iced teas have tea in them?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The United States men's national soccer team is controlled by the United States Soccer Federation and competes in the Confederation of North, Central American and Caribbean Association Football. The team has appeared in ten FIFA World Cups, including the first in 1930, where they reached the semi-finals. The U.S. participated in the 1934 and 1950 World Cups, winning 1--0 against England in the latter. After 1950, the U.S. did not qualify for the World Cup until 1990. The U.S. hosted the 1994 World Cup, where they lost to Brazil in the round of sixteen. They qualified for five more consecutive World Cups after 1990 (for a total of seven straight appearances, a feat shared with only seven other nations), becoming one of the tournament's regular competitors and often advancing to the knockout stage. The U.S. reached the quarter-finals of the 2002 World Cup, where they lost to Germany. In the 2009 Confederations Cup, they eliminated top-ranked Spain in the semi-finals before losing to Brazil in the final, their only appearance in a final. The team failed to qualify for the 2018 World Cup, having been eliminated in continental qualifying, ending the streak of consecutive World Cups at seven. Question: has the us ever won a world cup in soccer?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Although capable of living indoors with humans similarly to cats or dogs, pet skunks are relatively rare, partly due to restrictive laws and the complexity of their care. Pet skunks are mainly kept in the United States, Canada, Germany, the Netherlands, Poland, and Italy. Question: can u have a skunk as a pet?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: California is a ``may issue'' state for permits to carry concealed guns. The willingness of issuing authorities in California ranges from No Issue in most urban areas to Shall Issue in rural counties. Additionally, the issuing authority can also impose restrictions on the CCW permit-holder, such as limiting concealed carry only to the purposes listed on the approved CCW permit application. However, concealed carry permits are valid statewide, regardless of where they were issued. This creates a situation where residents in presumptively No Issue locations such as Los Angeles and San Francisco cannot lawfully carry a concealed firearm, but residents from other counties with more permissive CCW issuance policies can lawfully carry within these same jurisdictions. California does not recognize concealed carry permits issued by other states, and non-residents are generally forbidden from obtaining a California concealed carry permit. Those eligible to carry a rifle, shotgun, or handgun under the federal Law Enforcement Officers Safety Act are not subject to some California laws. Question: can you carry a gun on you in california?",
        "pred_ans": " True, with certain restrictions and permits",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: A neurotransmitter can influence the function of a neuron through a remarkable number of mechanisms. In its direct actions in influencing a neuron's electrical excitability, however, a neurotransmitter acts in only one of two ways: excitatory or inhibitory. A neurotransmitter influences trans-membrane ion flow either to increase (excitatory) or to decrease (inhibitory) the probability that the cell with which it comes in contact will produce an action potential. Thus, despite the wide variety of synapses, they all convey messages of only these two types, and they are labeled as such. Type I synapses are excitatory in their actions, whereas type II synapses are inhibitory. Each type has a different appearance and is located on different parts of the neurons under its influence. Each neuron receives thousands of excitatory and inhibitory signals every second. Question: can a neurotransmitter be both excitatory and inhibitory?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a possibility that a neurotransmitter can be both excitatory and inhibitory. According to the passage, a neurotransmitter can influence a neuron's function through various mechanisms, and it can act in either an excitatory or inhibitory manner. However, the passage does not explicitly state that a neurotransmitter cannot be both excitatory and inhibitory at the same time or in different contexts. Therefore, the student's answer of \\\"False\\\" is not entirely accurate, and I would give a score of 70 out of 100."
    },
    {
        "question": "Passage: The Frankenstein Chronicles is a British television period crime drama series that first aired on ITV Encore on 11 November 2015, designed as a re-imagining of Mary Shelley's 1818 novel Frankenstein. Lead actor Sean Bean also acted as an associate producer on the first series. It follows Inspector John Marlott (Sean Bean), a river police officer who uncovers a corpse made up of body parts from eight missing children and sets about to determine who is responsible. Question: is the frankenstein chronicles based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The term ``Cheddar cheese'' is widely used, but has no protected designation of origin within the European Union. However, in 2007 a Protected Designation of Origin, ``West Country Farmhouse Cheddar'', was created and only Cheddar produced from local milk within Somerset, Dorset, Devon and Cornwall and manufactured using traditional methods may use the name. Outside Europe, the style and quality of cheeses labelled as cheddar may vary greatly; furthermore, cheeses that are more similar in taste and appearance to Red Leicester are sometimes popularly marketed as ``Red Cheddar''. Question: does cheddar cheese have to be made in cheddar?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Royal Mail was separated into three divisions in 1986, and in August 1990, Royal Mail Parcels was rebranded as Parcelforce. Question: is parcel force the same as royal mail?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 60"
    },
    {
        "question": "Passage: Episodes were generally shot over a three-day period in the Los Angeles-based Soundstage Studio 22 and featured upwards of 50 scenes with quick transitions and flashbacks. However, the ``Pilot'' episode was filmed at CBS Radford. The laugh track was later created by recording an audience being shown the final edited episode. Thomas claimed that shooting before a live audience would have been impossible because of the structure of the show and the numerous flashforwards in each episode and because doing so ``would blur the line between 'audience' and 'hostage situation'''. Later seasons started filming in front of an audience on occasion when smaller sets were used. Question: was how i met your mother filmed in new york?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The meeting record provides firsthand insight into the debate. South Africa's position can be seen as an attempt to protect its system of apartheid, which clearly violated several articles in the Declaration. The Saudi Arabian delegation's abstention was prompted primarily by two of the Declaration's articles: Article 18, which states that everyone has the right ``to change his religion or belief''; and Article 16, on equal marriage rights. The six communist countries abstentions centred around the view that the Declaration did not go far enough in condemning fascism and Nazism. Eleanor Roosevelt attributed the abstention of Soviet bloc countries to Article 13, which provided the right of citizens to leave their countries. Question: has saudi arabia signed the universal declaration of human rights?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The chain of survival refers to a series of actions that, properly executed, reduce the mortality associated with cardiac arrest. Like any chain, the chain of survival is only as strong as its weakest link. The four interdependent links in the chain of survival are early access, early CPR, early defibrillation, and early advanced cardiac life support Question: is attending a first aid course part of the chain of survival?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A Snellen chart is an eye chart that can be used to measure visual acuity. Snellen charts are named after the Dutch ophthalmologist Herman Snellen, who developed the chart in 1862. Many ophthalmologists and vision scientists now use an improved chart known as the LogMAR chart. Question: do all eye doctors use the same eye chart?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a bit of a misconception in the student's answer. While it is true that not all eye doctors use the same eye chart, it is important to note that the LogMAR chart is an improved version of the Snellen chart and is widely used by ophthalmologists and vision scientists. The Snellen chart is still used by some eye doctors, especially in the United States.\\n\\nTherefore, the score is: 70"
    },
    {
        "question": "Passage: Batman Forever's tone is significantly different from the previous installments, becoming more family-friendly since Warner Bros. believed that the previous Batman film, Batman Returns (1992), failed to outgross its predecessor due to parent complaints about the film's violence and dark overtones. Schumacher eschewed the dark, dystopian atmosphere of Burton's films by drawing inspiration from the Batman comic book of the Dick Sprang era, as well as the 1960s television series. Keaton chose not to reprise the role due to failing to negotiate with studio executives Terry Semel and Bob Daly about the overall approach to the script. William Baldwin and Ethan Hawke were initially considered for Keaton's replacement before Kilmer joined the cast. Rene Russo was originally set to play Chase Meridian, based on her chemistry with Keaton in One Good Cop, but was replaced with the much younger Nicole Kidman after being deemed ``too old'' for Kilmer. Question: is batman forever a sequel to batman returns?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no mention of a score in the passage, so I cannot provide a score for the student's answer."
    },
    {
        "question": "Passage: This has led to categorizing trucks similarly, even if their payload is different. Therefore, the Toyota Tacoma, Dodge Dakota, Ford Ranger, Honda Ridgeline, Chevrolet S-10, GMC S-15 and Nissan Frontier are called quarter-tons (1\u20444-ton). The Ford F-150, Chevrolet C10/K10, Chevrolet/GMC 1500, Dodge 1500, Toyota Tundra, and Nissan Titan are half-tons (1\u20442-ton). The Ford F-250, Chevrolet C20/K20, Chevrolet/GMC 2500, and Dodge 2500 are three-quarter-tons (3\u20444-ton). Chevrolet/GMC's 3\u20444-ton suspension systems were further divided into light and heavy-duty, differentiated by 5-lug and 6 or 8-lug wheel hubs depending on year, respectively. The Ford F-350, Chevrolet C30/K30, Chevrolet/GMC 3500, and Dodge 3500 are one tons (1-ton). Question: is a dodge dakota a half ton truck?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The How to Train Your Dragon franchise from DreamWorks Animation consists of two feature films How to Train Your Dragon (2010) and How to Train Your Dragon 2 (2014), with a third feature film, How to Train Your Dragon: The Hidden World, set for a 2019 release. The franchise is inspired by the British book series of the same name by Cressida Cowell. The franchise also consists of four short films: Legend of the Boneknapper Dragon (2010), Book of Dragons (2011), Gift of the Night Fury (2011) and Dawn of the Dragon Racers (2014). A television series following the events of the first film, Dragons: Riders of Berk, began airing on Cartoon Network in September 2012. Its second season was renamed Dragons: Defenders of Berk. Set several years later, and as a more immediate prequel to the second film, a new television series, titled Dragons: Race to the Edge, aired on Netflix in June 2015. The second season of the show was added to Netflix in January 2016 and a third season in June 2016. A fourth season aired on Netflix in February 2017, a fifth season in August 2017, and a sixth and final season on February 16, 2018. Question: is how to train your dragon on now tv?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A public limited company (legally abbreviated to plc) is a type of public company under the United Kingdom company law, some Commonwealth jurisdictions, and the Republic of Ireland. It is a limited liability company whose shares may be freely sold and traded to the public (although a plc may also be privately held, often by another plc), with a minimum share capital of \u00a350,000 and usually with the letters PLC after its name. Similar companies in the United States are called publicly traded companies. Public limited companies will also have a separate legal identity. Question: is a plc the same as a limited company?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The untitled Avengers film, colloquially referred to as Avengers 4, is an upcoming American superhero film based on the Marvel Comics superhero team the Avengers, produced by Marvel Studios and distributed by Walt Disney Studios Motion Pictures. It is intended to be the direct sequel to 2018's Avengers: Infinity War, as well as the sequel to 2012's Marvel's The Avengers and 2015's Avengers: Age of Ultron and the twenty-second film in the Marvel Cinematic Universe (MCU). The film is directed by Anthony and Joe Russo, with a screenplay by the writing team of Christopher Markus and Stephen McFeely, and features an ensemble cast with many actors from previous MCU films. Question: is there going to be a second part of infinity war?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The principal bridesmaid, if one is so designated, may be called the chief bridesmaid or maid of honor if she is unmarried, or the matron of honor if she is married. A junior bridesmaid is a girl who is clearly too young to be married, but who is included as an honorary bridesmaid. In the United States, typically only the maid/matron of honor and the best man are the official witnesses for the wedding license. Question: can you have a maid of honor and a chief bridesmaid?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Part 608, section 3.2 of the DMM (U.S. Domestic Mail Manual) groups holidays into ``Widely Observed'' and ``Not Widely Observed''. Holidays ``Widely Observed'' include New Year's Day, Memorial Day, Independence Day, Labor Day, Thanksgiving Day, and Christmas Day. Holidays ``Not Widely Observed'' are Martin Luther King, Jr.'s Birthday; Presidents Day; Columbus Day; and Veterans Day. Question: does the post office run on memorial day?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passage states that the post office does observe Memorial Day, which is a \\\"Widely Observed\\\" holiday. Therefore, the score is: 50."
    },
    {
        "question": "Passage: Pirates of the Caribbean is a Disney franchise encompassing numerous theme park attractions and a media franchise consisting of a series of films, and spin-off novels, as well as a number of related video games and other media publications. The franchise originated with the Pirates of the Caribbean theme ride attraction, which opened at Disneyland in 1967 and was one of the last Disney theme park attractions overseen by Walt Disney. Disney based the ride on pirate legends and folklore. Question: was pirates of the caribbean based on the ride?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: South Sudan (/su\u02d0\u02c8d\u00e6n, -\u02c8d\u0251\u02d0n/ ( listen)), officially known as the Republic of South Sudan, is a landlocked country in East-Central Africa. The country gained its independence from the Republic of the Sudan in 2011, making it the newest country with widespread recognition. Its capital and largest city is Juba. Question: is sudan and south sudan the same country?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Bank of America Corporation (abbreviated as BofA) is an American multinational investment bank and financial services company based in Charlotte, North Carolina with central hubs in New York City, London, Hong Kong, and Toronto. Bank of America was formed through NationsBank's acquisition of BankAmerica in 1998. It is the second largest banking institution in the United States, after JP Morgan Chase. As a part of the Big Four, it services approximately 10.73% of all American bank deposits, in direct competition with Citigroup, Wells Fargo, and JPMorgan Chase. Its primary financial services revolve around commercial banking, wealth management, and investment banking. Question: is there a bank of america in north carolina?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Troy: Fall of a City is a British-American miniseries based on the Trojan War and the love affair between Paris and Helen. The show tells the story of the 10 year siege of Troy, set in the 13th century BC. The series was commissioned by BBC One and is a co-production between BBC One and Netflix, with BBC One airing the show on 17 February 2018 in the United Kingdom, and Netflix streaming the show internationally outside the UK. Question: is troy fall of a city on netflix?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The Constitution of India designates the official language of the Government of India as Hindi written in the Devanagari script, as well as English. There is no national language as declared by the Constitution of India. Hindi is used for official purposes such as parliamentary proceedings, judiciary, communications between the Central Government and a State Government. States within India have the liberty and powers to specify their own official language(s) through legislation and therefore there are 22 officially recognized languages in India of which Hindi is the most used. The number of native Hindi speakers is about 25% of the total Indian population; however, including dialects of Hindi termed as Hindi languages, the total is around 44% of Indians, mostly accounted from the states falling under the Hindi belt. Other Indian languages are each spoken by around 10% or less of the population. Question: is hindi is our national language of india?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The storyline was softly rebooted with a main narrative led by an adult Henry Mills, set several years after last season's events. In February 2018, it was announced the seventh season would serve as the final season of the series; the season and series concluded on May 18, 2018. Question: was season 7 the last season of once upon a time?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The Waterloo & City line (colloquially known as The Drain) is a London Underground line that runs between Waterloo and Bank with no intermediate stops. Its primary traffic consists of commuters from south-west London, Surrey and Hampshire arriving at Waterloo main line station and travelling forward to the City of London financial district, and for this reason the line does not normally operate on Sundays. Question: does the waterloo and city line run on a sunday?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to indicate that the Waterloo & City line does or does not run on Sundays. The passage only states that the line does not normally operate on Sundays, but it does not explicitly say that it never operates on Sundays. Therefore, the student's answer of \\\"False\\\" cannot be accurately determined to be correct or incorrect based on the information provided in the passage. The score cannot be given as there is not enough information to evaluate the student's answer."
    },
    {
        "question": "Passage: In the United States, burglary is prosecuted as a felony or misdemeanor and involves trespassing and theft, entering a building or automobile, or loitering unlawfully with intent to commit any crime, not necessarily a theft--for example, vandalism. Even if nothing is stolen in a burglary, the act is a statutory offense. Buildings can include hangars, sheds, barns, and coops; burglary of boats, aircraft, trucks, and railway cars is possible. Burglary may be an element in crimes involving rape, arson, kidnapping, identity theft, or violation of civil rights; indeed, the ``plumbers'' of the Watergate scandal were technically burglars. As with all legal definitions in the U.S., the foregoing description may not be applicable in every jurisdiction, since there are 50 separate state criminal codes, plus federal and territorial codes in force. Question: is breaking and entering into a car a felony?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In the United States a Doctor of Physical Therapy (DPT) degree is a post-baccalaureate clinical doctorate that takes 3 years to complete. A DPT is a practitioner who is educated in many areas of rehabilitation. However, a Doctor of Physical Therapy is not a medical doctor and can not prescribe medication. A Transitional Doctor of Physical Therapy Degree is also offered for those who already hold a professional Bachelor or Master of Physical Therapy (BPT or MPT) degree. As of 2015, all accredited and developing physical therapist programs are DPT programs. The DPT degree currently prepares students to be eligible for the PT license examination in all 50 states. As of March 2017, there are 222 accredited Doctor of Physical Therapy programs in the United States. After completing a DPT program the doctor of physical therapy may continue training in a residency and then fellowship. As of December 2013, there are 178 credentialed physical therapy residencies and 34 fellowships in the US with 63 additional developing residencies and fellowships. Credentialed residencies are between 9 and 36 months while credentialed fellowships are between 6 and 36 months. Question: is a doctor of physical therapy an md?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Due to their size and overall disposition, Netherlands Dwarfs often do not make good pets for children (although suitability will vary between individual rabbits). There is often a mismatch with small children, because they like to play with the pet or pick it up to cuddle with it. Dwarf rabbits do not like to be picked up or held tightly; and they can bite, scratch or struggle wildly if the child does so. This often leads to accidents if the child drops them out of fright, leading to major injuries because a rabbit has very fragile bones. Larger breeds of rabbits are recommended for children, because they have fewer issues with temperament. Question: do netherland dwarf rabbits like to be held?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: On October 9, 2015, Los Lobos was nominated for induction into the Rock and Roll Hall of Fame for the first time. Question: is los lobos in the rock and roll hall of fame?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Vice President of the United States (informally referred to as VPOTUS, or Veep) is a constitutional officer in the legislative branch of the federal government of the United States as the President of the Senate under Article I, Section 3, Clause 4, of the United States Constitution, as well as the second highest executive branch officer, after the President of the United States. In accordance with the 25th Amendment, he is the highest-ranking official in the presidential line of succession, and is a statutory member of the National Security Council under the National Security Act of 1947. Question: is the vice president the president of the senate?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: An American Dietetic Association study found that cast-iron cookware can leach significant amounts of dietary iron into food. The amounts of iron absorbed varied greatly depending on the food, its acidity, its water content, how long it was cooked, and how old the cookware is. The iron in spaghetti sauce increased 945 percent (from 0.61 mg/100g to 5.77 mg/100g), while other foods increased less dramatically; for example, the iron in cornbread increased 28 percent, from 0.67 to 0.86 mg/100g. Anemics, and those with iron deficiencies, may benefit from this effect, which was the basis for the development of the lucky iron fish, an iron ingot used during cooking to provide dietary iron to those with iron deficiency. People with hemochromatosis (iron overload, bronze disease) should avoid using cast-iron cookware because of the iron leaching effect into the food. Question: does enameled cast iron leach iron into food?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Reading Instruction Competence Assessment, or RICA, is a test required for two groups of California teaching credential candidates: those seeking a clear Multiple Subjects credential to teach elementary school and those seeking an Education Specialist credential, which is required to teach special education classes. Question: do all teachers have to take the rica?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A cable tie (also known as a wire tie, hose tie, steggel tie, zap strap or zip tie, and by the brand names Ty-Rap and Panduit strap) is a type of fastener, for holding items together, primarily electrical cables or wires. Because of their low cost and ease of use, cable ties are ubiquitous, finding use in a wide range of other applications. Stainless steel versions, either naked or coated with a rugged plastic, cater for exterior applications and hazardous environments. Question: are cable ties and zip ties the same thing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The number of holes varies by culture, health and taste. In the United States where excessive salt is considered unhealthy, salt is stored in the shaker with the fewest holes, but in parts of Europe where pepper was historically a rare spice, this is reversed. Question: does salt go in the shaker with less holes?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Ronnie soon hears the rumor that her father burned down the church from some locals. Distraught, she goes to Will and laments about the situation. Will, knowing that it was actually his friend Scott who while playing around set fire to the church, is overcome by guilt and goes to Steve to apologize. When Ronnie comes in hearing this, she walks out and Will follows where they have an argument and break up. Will leaves. Fall arrives and Jonah returns to New York for the school year. Ronnie stays behind to take care of their father, who revealed to Ronnie & Jonah during the summer that he is terminally ill. Leading a slow life, she tries to make up for the time with her father that she's lost. She continues work on a composition he had been writing (titled ``For Ronnie''), after losing the steadiness of his hands due to his illness. He dies just as she finishes it. Question: does her dad die in the last song?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: As of March 19, 2017, drivers pay $8.50 per car or $3.50 per motorcycle for tolls by mail. E\u2010ZPass users with transponders issued by the New York E\u2010ZPass Customer Service Center pay $5.76 per car or $2.51 per motorcycle. All E-ZPass users with transponders not issued by the New York E-ZPass CSC will be required to pay Toll-by-mail rates. Question: do you have to pay for the midtown tunnel?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Most seat belt laws in the United States are left to the states. However, the first seat belt law was a federal law, Title 49 of the United States Code, Chapter 301, Motor Vehicle Safety Standard, which took effect on January 1, 1968, that required all vehicles (except buses) to be fitted with seat belts in all designated seating positions. This law has since been modified to require three-point seat belts in outboard-seating positions, and finally three-point seat belts in all seating positions. Initially, seat belt use was voluntary. New York was the first state to pass a law which required vehicle occupants to wear seat belts, a law that came into effect on December 1, 1984. Officer Nicholas Cimmino of the Westchester County Department of Public Safety wrote the nation's first ticket for such violation. New Hampshire is the only state that has no enforceable laws for the wearing of seat belts in a vehicle. Question: are there any states where you don't have to wear a seatbelt?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Bobby Hull of the Chicago Black Hawks had led the league in scoring, but the well-oiled machine called the Montreal Canadiens managed to hold him to only six goals as the Canadiens swept the Black Hawks in four. The Toronto Maple Leafs, though, had a slightly tougher time against the Gordie Howe led Detroit Red Wings as it took the Leafs 6 games, including one in triple overtime, to win the series. Question: has any team ever swept the stanley cup final?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The groundwater may be from precipitation or from groundwater flowing into the aquifer. In areas with sufficient precipitation, water infiltrates through pore spaces in the soil, passing through the unsaturated zone. At increasing depths, water fills in more of the pore spaces in the soils, until a zone of saturation is reached. Below the water table, in the phreatic zone (zone of saturation), layers of permeable rock that yield groundwater are called aquifers. In less permeable soils, such as tight bedrock formations and historic lakebed deposits, the water table may be more difficult to define. Question: is the depth of the water table always the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a lack of information in the passage to determine if the depth of the water table is always the same. Therefore, the score is: 70"
    },
    {
        "question": "Passage: The number 1 is called a unit. It has no prime factors and is neither prime nor composite. Question: is 1 a prime factor of every number?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The venue for the All-Star Game is chosen by Major League Baseball. The criteria for the venue are subjective; generally, cities with new ballparks and those who have not hosted the game in a long time--or ever--tend to get selected. Over time, this has resulted in certain cities being selected more often at the expense of others, mainly due to timely circumstances: Cleveland Stadium and the original Yankee Stadium are tied for the most times a venue has hosted the All-Star game, both hosting four games. New York City has hosted more than any other city, having done so nine times in five different stadiums. At the same time, the New York Mets failed to host for 48 seasons (1965--2012), while the Los Angeles Dodgers have not hosted since 1980 (38). (The Dodgers hosted the second all star game on August 3rd, 1959.) Among current major league teams, the Washington Nationals and the Tampa Bay Rays have yet to host the All-Star game, but the Nationals are scheduled to host the game in 2018. Question: does mlb all star game determines home field?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Kingda Ka is a steel accelerator roller coaster located at Six Flags Great Adventure in Jackson, New Jersey. It is the world's tallest roller coaster, the world's second fastest roller coaster, and was the second strata coaster ever built. It was built by Stakotra, a subcontractor to Intamin. Riders have to be 54'' in order to be able to get on the roller coaster. Question: is kingda ka the biggest roller coaster in the world?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Puppies are born with a fully functional sense of smell but can't open their eyes. During their first two weeks, a puppy's senses all develop rapidly. During this stage the nose is the primary sense organ used by puppies to find their mother's teats, and to locate their littermates, if they become separated by a short distance. Puppies open their eyes about nine to eleven days following birth. At first, their retinas are poorly developed and their vision is poor. Puppies are not able to see as well as adult dogs. In addition, puppies' ears remain sealed until about thirteen to seventeen days after birth, after which they respond more actively to sounds. Between two and four weeks old, puppies usually begin to growl, bite, wag their tails, and bark. Question: can a puppy see when they first open their eyes?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: After laying down a Phase, players try to ``go out'' as soon as possible. To go out, a player must get rid of all of their cards by hitting and discarding. The player to go out first wins the hand. The winner of the hand, and any other players who also complete their Phase, will advance to the next Phase for the next hand, while any player not able to complete their Phase remain stuck on that Phase. Players count up the total value of cards left in their hands (the fewer cards left in their hand, the better) and score them as follows; Question: can you go out on the first round of phase 10?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Love Finds a Home is a Christian drama film, the eighth and final installment based on a series of books by Janette Oke. It aired on Hallmark Channel on September 5, 2009. The film is based on the book Love Finds a Home by Janette Oke. Sarah Jones, Haylie Duff, and Jordan Bridges reprise their roles from Love Takes Wing. Question: is there a sequel to love finds a home?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In commerce, the sizes of wire are estimated by devices, also called gauges, which consist of plates of circular or oblong form having notches of different widths around their edges to receive wire and sheet metals of different thicknesses. Each notch is stamped with a number, and the wire or sheet, which just fits a given notch, is stated to be of, say, No. 10, 11, 12, etc., of the wire gauge. Question: is sheet metal gauge the same as wire gauge?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Tennessee requires a permit to carry a firearm, whether openly or concealed. Additionally, per Tenn. Code Ann. 39-17-1351 r.(1) a facially valid handgun permit, firearms permit, weapons permit or license issued by another state shall be valid in this state (Tennessee) according to its terms and shall be treated as if it is a handgun permit issued by this state (Tennessee)). Question: can i open carry without a permit in tennessee?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Junctional rhythm can be diagnosed by looking at an ECG: it usually presents without a P wave or with an inverted P wave. Retrograde P waves refers to the depolarization from the AV node back towards the SA node. Question: does a junctional rhythm have a p wave?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a possibility that the student's answer is correct, but we need more information to determine that for sure. The passage states that junctional rhythm can present without a P wave or with an inverted P wave. It does not explicitly say that a junctional rhythm never has a P wave. Therefore, the score cannot be determined solely based on the information provided."
    },
    {
        "question": "Passage: The return address is not required on postal mail. However, lack of a return address prevents the postal service from being able to return the item if it proves undeliverable; such as from damage, postage due, or invalid destination. Such mail may otherwise become dead letter mail. Question: can you send out mail without a return address?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The tin whistle, also called the penny whistle, English flageolet, Scottish penny whistle, tin flageolet, Irish whistle, Belfast Hornpipe, fead\u00f3g st\u00e1in (or simply fead\u00f3g) and Clarke London Flageolet is a simple, six-holed woodwind instrument. It is a type of fipple flute, putting it in the same class as the recorder, Native American flute, and other woodwind instruments that meet such criteria. A tin whistle player is called a tin whistler or simply a whistler. The tin whistle is closely associated with Celtic music. Question: is a recorder the same as a tin whistle?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Costs are associated with particular goods using one of the several formulas, including specific identification, first-in first-out (FIFO), or average cost. Costs include all costs of purchase, costs of conversion and other costs that are incurred in bringing the inventories to their present location and condition. Costs of goods made by the businesses include material, labor, and allocated overhead. The costs of those goods which are not yet sold are deferred as costs of inventory until the inventory is sold or written down in value. Question: is salary included in cost of goods sold?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Lymphomas in the strict sense are any neoplasms of the lymphatic tissues (lympho- + -oma) . The main classes are malignant neoplasms (that is, cancers) of the lymphocytes, a type of white blood cell that belongs to both the lymph and the blood and pervades both. Thus, lymphomas and leukemias are both tumors of the hematopoietic and lymphoid tissues, and as lymphoproliferative disorders, lymphomas and lymphoid leukemias are closely related, to the point that some of them are unitary disease entities that can be called by either name (for example adult T-cell leukemia/lymphoma). Question: is lymphoma the same as lymph node cancer?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Australia national soccer team, nicknamed the Socceroos, has represented Australia at the FIFA World Cup finals on five occasions: in 1974, 2006, 2010, 2014 and 2018. Question: has australia ever been in a world cup final?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In November 2008, the show's post-election day telecast garnered the biggest audience in the show's history at 6.2 million in total viewers, becoming the week's most-watched program in daytime television. It was surpassed on July 29, 2010, during which former President Barack Obama first appeared as a guest on The View, which garnered a total of 6.6 million viewers. In 2013, the show was reported to be averaging 3.1 million daily viewers, which outpaced rival talk show The Talk. Question: is the view and the talk the same show?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Great Pacific garbage patch, also described as the Pacific trash vortex, is a gyre of marine debris particles in the central North Pacific Ocean discovered between 1985 and 1988. It is located roughly between 135\u00b0W to 155\u00b0W and 35\u00b0N to 42\u00b0N. The collection of plastic, floating trash halfway between Hawaii and California extends over an indeterminate area of widely varying range depending on the degree of plastic concentration used to define the affected area. Question: is there a floating mass of garbage in the ocean?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Just Cause 3 was released worldwide on December 1, 2015, for Microsoft Windows, PlayStation 4 and Xbox One. The game was published by Square Enix. A Collector's Edition of the game was announced on March 12, 2015. People were allowed to vote for the items included in the Edition. The result of the poll was revealed on July 9, 2015, and the Collector's Edition includes a grappling hook, the Weaponized Vehicle Pack content, a poster of Medici and an artbook. At Gamescom 2015, Square Enix announced that players who purchased the game on Xbox One would receive the backward-compatible version of Just Cause 2 for the Xbox 360. Console players who purchased the game's Day One Edition are eligible to enter a contest held by Avalanche and Square Enix, which tasks players to score Chaos Points to top the leaderboard. The winner of the contest would get a real-life island, or US$50,000 cash. Question: can you play just cause 3 on xbox 360?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The singular they had emerged by the 14th century. Though it is commonly employed in everyday English, it has been the target of criticism since the late 19th century. Its use in formal English has increased with the trend toward gender-inclusive language. Question: can you use them as a singular pronoun?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: It is illegal to sell packaged liquor (off-premises sales) on Sundays. Sales also are prohibited on Memorial Day, Independence Day, Labor Day, Thanksgiving Day, and Christmas Day. Low-point beer for consumption off-premises may not be sold between 2:00 a.m. and 6:00 a.m. Question: are liquor stores open on memorial day oklahoma?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Large denominations of United States currency greater than $100 were circulated by the United States Treasury until 1969. Since then, U.S. dollar banknotes have only been issued in seven denominations: $1, $2, $5, $10, $20, $50, and $100. Question: is there anything higher than a 100 dollar bill?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Throughout the World Cup history, Brazil became the team's historical rival. The two countries have met each other five times but the Czechs (always as Czechoslovakia) never managed to win, with three victories for the Brazilian side and two draws. Two other historical opponents in the finals were (West) Germany and Italy with three encounters each: Czechoslovakia won, drew and lost once against the Germans and the matches against Italy all ended in a defeat. Question: has czech republic ever won the world cup?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There are a few mistakes in the student's answer. First, the student incorrectly refers to the Czech Republic as \\\"Czechs.\\\" Second, the student answers \\\"False\\\" even though the passage does not explicitly state whether the Czech Republic has ever won the World Cup. However, based on the information provided, we can infer that the Czech Republic (as Czechoslovakia) has not won the World Cup, as they lost three matches and only won once.\\n\\nTherefore, the score is: 60"
    },
    {
        "question": "Passage: Carrier oil, also known as base oil or vegetable oil, is used to dilute essential oils and absolutes before they are applied to the skin in massage and aromatherapy. They are so named because they carry the essential oil onto the skin. Diluting essential oils is a critical safety practice when using essential oils. Oils alone are volatile because they begin to dissipate as soon as they are applied. The rate of dispersion will vary based on how light or heavy the carrier oil is. Carrier oils do not contain a concentrated aroma, unlike essential oils, though some, such as olive, have a mild distinctive smell. Neither do they evaporate like essential oils, which are more volatile. The carrier oils used should be as natural and unadulterated as possible. Many people feel organic oils are of higher quality. Cold-pressing and maceration are the two main methods of producing carrier oils. Question: can you use vegetable oil as carrier oil?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In March 2017, The CW renewed the series for a fifth season, which premiered on April 24, 2018. In May 2018, the series was renewed for a sixth season. Question: are they gonna make a season 5 of the 100?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Following a major overhaul in the summer of 2016, each episode is now a single 60 minute transmission, including advertisements. The series was also moved to 9:00pm on Mondays, to allow for grittier storylines, as the series is now post-watershed. A special-double episode was broadcast on 9 January 2017 as a single 120 minute transmission. This episode was co-written by actor Shaun Williamson. The second series was broadcast in the UK between 17 July 2017 and 8 September 2017, with Series 3 scheduled for this year. Question: is there a season four of red rock?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Goalkeepers are normally allowed to handle the ball within their own penalty area, and once they have control of the ball in their hands opposition players may not challenge them for it. However the back-pass rule prohibits goalkeepers from handling the ball after it has been deliberately kicked to them by a team-mate, or after receiving it directly from a throw-in taken by a team-mate. Back-passes with parts of the body other than the foot, such as headers, are not prohibited. Despite the popular name ``back-pass rule'', there is no requirement in the laws that the kick or throw-in must be backwards; handling by the goalkeeper is forbidden regardless of the direction the ball travels. Question: can the goalkeeper pick up the ball from a throw in?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a specific rule in the passage called the \\\"back-pass rule\\\" which prohibits goalkeepers from handling the ball after it has been deliberately kicked to them by a team-mate, or after receiving it directly from a throw-in taken by a team-mate. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The Xbox One gaming console has received updates from Microsoft since its launch in 2013 that enable it to play select games from its two predecessor consoles, Xbox and Xbox 360. On June 15, 2015, backward compatibility with supported Xbox 360 games became available to eligible Xbox Preview program users with a beta update to the Xbox One system software. The dashboard update containing backward compatibility was released publicly on November 12, 2015. On October 24, 2017, another such update added games from the original Xbox library. The following is a list of all backward compatible games on Xbox One under this functionality. Question: will all xbox 360 games work on xbox one?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In baseball, to tag up is for a baserunner to retouch or remain on their starting base (the time-of-pitch base) until (after) the ball either lands in fair territory or is first touched by a fielder. By rule, baserunners must tag up when a fly ball is caught in flight by a fielder. After a legal tag up, runners are free to attempt to advance, even if the ball was caught in foul territory. On long fly ball outs, runners can often gain a base; when a runner scores by these means, this is called a sacrifice fly. On short fly balls, runners seldom attempt to advance after tagging up, due to the high risk of being thrown out. Question: can you run on a caught foul ball?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Their fourth place finish in the 1986 was their best placement in their World Cup history, until 2018 when they finished third after beating England (2-0) in Saint Petersburg. Question: did england beat belgium in the world cup?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A bowl, when referred to in pipe smoking, is the part of a smoking pipe or bong that is used to hold tobacco, cannabis, or other substances. Question: is a bowl and a pipe the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Ozark is an American crime drama web television series created by Bill Dubuque and produced by Media Rights Capital. Jason Bateman stars in the series; he also directed the first two and last two episodes of season 1. The first season is composed of nine one-hour episodes and a final 80-minute episode; it was released on Netflix on July 21, 2017. Question: is netflix ozark based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In seasons 13 and 14, April faces a crisis of faith as she begins to believe that good people get punished and bad people get good things. As a result, she goes into a dark place and uses partying and sex to mask her deep-rooted pain. However, mid-Season 14, she encounters a terminal patient who helps April reaffirm her faith. April starts seeing Matthew again and their relationship is made public when the two are involved in a car accident, where April almost dies of hypothermia. In the season finale, April and Matthew get married. Question: does april get married in grey's anatomy?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " \u9996\u5148\uff0c\u6211\u9700\u8981\u9605\u8bfb\u60a8\u63d0\u4f9b\u7684\u95ee\u9898\u548c\u5b66\u751f\u7684\u7b54\u6848\u3002\u7136\u540e\uff0c\u6211\u4f1a\u6839\u636e\u95ee\u9898\u4e2d\u7684\u6587\u7ae0\u5224\u65ad\u5b66\u751f\u7684\u7b54\u6848\u662f\u5426\u6b63\u786e\uff0c\u5e76\u7ed9\u51fa\u4e00\u4e2a 0 \u5230 100 \u4e4b\u95f4\u7684\u5206\u6570\u3002\\n\\n\u9605\u8bfb\u5b8c\u6bd5\u3002\\n\\n\u5b66\u751f\u56de\u7b54\uff1a```True```\\n\\n\u6839\u636e\u6587\u7ae0\u5185\u5bb9\uff0c\u6211\u4eec\u53ef\u4ee5\u770b\u5230\uff0c\u5728\u300a\u5b9e\u4e60\u533b\u751f\u683c\u857e\u300b\u7684\u7b2c 14 \u5b63\u5b63\u672b\uff0cApril \u548c Matthew \u786e\u5b9e\u7ed3\u5a5a\u4e86\u3002\u56e0\u6b64\uff0c\u5b66\u751f\u7684\u7b54\u6848\u662f\u6b63\u786e\u7684\u3002\\n\\n\u56e0\u6b64\uff0c\u5206\u6570\u4e3a\uff1a100"
    },
    {
        "question": "Passage: GCE Advanced Levels are post-16 qualifications in the United Kingdom, and are graded on a letter grade scale, from highest to lowest: A*, A, B, C, D, E. As in GCSE, there is an 'Unclassified' (U) grade below the minimum standard required for a grade E. The A* grade was introduced in 2010. Previously an intermediate N (Nearly passed) grade was awarded for papers below grade E by a very small margin (not used since 2008). Question: is a grade d at a level a pass?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Temperature control (thermoregulation) is part of a homeostatic mechanism that keeps the organism at optimum operating temperature, as the temperature affects the rate of chemical reactions. In humans, the average internal temperature is 37.0 \u00b0C (98.6 \u00b0F), though it varies among individuals. However, no person always has exactly the same temperature at every moment of the day. Temperatures cycle regularly up and down through the day, as controlled by the person's circadian rhythm. The lowest temperature occurs about two hours before the person normally wakes up. Additionally, temperatures change according to activities and external factors. Question: is your body temp lower in the morning?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the final episode of the series Mac is selected to head a Joint Legal Service Center Southwest, in San Diego, California. It is not known if she takes this position since her future is left to the audience's imagination. She and Captain Harmon ``Harm'' Rabb become engaged during the show's final episode, flipping a coin to see who will resign their commission. Question: do mac and harm get together on jag?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Much of Elena's story revolves around her relationships with vampires Stefan Salvatore and his older brother, Damon. It is revealed that Elena is a Petrova Doppelg\u00e4nger, which is thus responsible for her being identical to her ancestor, Katherine Pierce (n\u00e9e Katerina Petrova). This also has the implication of making her a supernatural creature. Dobrev portrayed the ``conniving'' Katherine as well, who is opposite of Elena. The actress stated that it has been a challenge distinguishing the two, and enjoys playing them both. In the television series's fourth season, Elena becomes a vampire and deals with the struggles that come with her change. She took the cure and became human again towards the end of the sixth season. In the finale of the sixth season, Kai linked Elena to Bonnie's life by magic. Elena will only wake up when Bonnie dies in around 60 years. She was locked inside the Salvatore tomb, which was changed in the seventh season, and was relocated in Brooklyn, New York. In late 2016, when it was announced that the eighth season would be the final season, Dobrev was in talks about returning to the television series to reprise her role in the final episode. After much speculation. Dobrev's return was confirmed on January 26, 2017, via an Instagram post. Dobrev appeared in the final episode of the show as both Elena and her evil doppelg\u00e4nger Katherine Pierce. Question: does elena die for good in vampire diaries?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: All private sales are required to be registered through an FA-10 form with the Criminal History Board, Firearm Records division. The state has an assault weapons ban similar to the expired Federal ban. Massachusetts is a ``may issue'', as such the LTC-A is issued in a discretionary manner. Question: can you own an assault rifle in massachusetts?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a state assault weapons ban in Massachusetts, similar to the expired Federal ban. The passage also states that LTC-A is issued in a discretionary manner, indicating that ownership of assault rifles may not be allowed. Therefore the score is: 80"
    },
    {
        "question": "Passage: In many leagues (including the NHL for regular-season games since the 2005--06 season) and in international competitions, a failure to reach a decision in a single overtime may lead to a shootout. Some leagues may eschew overtime periods altogether and end games in shootout should teams be tied at the end of regulation. In the ECHL, regular season overtime periods are played four on four for one five-minute period. In the Southern Professional Hockey League, regular season overtime periods are played three on three for one five-minute period, with penalties resulting in the opponents skating one additional player on ice (up to two additional players) for the penalty for the first three minutes, and a penalty shot in the final two minutes. The AHL, since the 2014--15 season, extended the overtime to seven minutes, with the last three minutes reduced further to three men aside and teams getting an additional skater for each opponent's penalty. The idea of using 3-on-3 skaters for the entirety of a five-minute overtime period for a regular season game was adopted by the NHL on June 24, 2015, for use in the 2015--16 NHL season. Question: can a nhl playoff game end in a tie?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The transfer window of a given football association governs only international transfers into that football association. International transfers out of an association are always possible to those associations that have an open window. The transfer window of the association that the player is leaving does not have to be open. Question: can you sign free agents outside transfer window?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A king can move one square in any direction (horizontally, vertically, or diagonally) unless the square is already occupied by a friendly piece or the move would place the king in check. As a result, the opposing kings may never occupy adjacent squares (see opposition), but the king can give discovered check by unmasking a bishop, rook, or queen. The king is also involved in the special move of castling. Question: can you move a king backwards in chess?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Beginning in 2010, hand tools manufactured for Craftsman by Apex Tool Group (formerly known as Danaher) such as ratchets, sockets, and wrenches began to be sourced overseas (mainly in China, although some are produced in Taiwan), while tools produced for Craftsman by Western Forge such as adjustable wrenches, screwdrivers, pliers and larger mechanic tool sets remain made in the United States, although as of 2018, most if not all of the production for these products have moved over to Asia. Sears still has an Industrial line which is sold through various authorized distributors. These tools are US made, appearing identical to their previous non-industrial US made counterparts, save for the ``Industrial'' name stamped on them. They are manufactured by Apex on the US production lines that previously produced the USA made standard Craftsman product before production switched overseas to Asia. Question: are craftsman tool boxes made in the usa?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: From 1976 to 1983, several states voluntarily raised their purchase ages to 19 (or, less commonly, 20 or 21), in part to combat drunk driving fatalities. In 1984, Congress passed the National Minimum Drinking Age Act, which required states to raise their ages for purchase and public possession to 21 by October 1986 or lose 10% of their federal highway funds. By mid-1988, all 50 states and the District of Columbia had raised their purchase ages to 21 (but not Puerto Rico, Guam, or the Virgin Islands, see Additional Notes below). South Dakota and Wyoming were the final two states to comply with the age 21 mandate. The current drinking age of 21 remains a point of contention among many Americans, because of it being higher than the age of majority (18 in most states) and higher than the drinking ages of most other countries. The National Minimum Drinking Age Act is also seen as a congressional sidestep of the tenth amendment. Although debates have not been highly publicized, a few states have proposed legislation to lower their drinking age, while Guam has raised its drinking age to 21 in July 2010. Question: is the drinking age different in every state?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Better Call Saul is an American television crime drama series created by Vince Gilligan and Peter Gould. It is a spin-off prequel of Gilligan's prior series Breaking Bad. Set in the early 2000s, Better Call Saul follows the story of con-man turned small-time lawyer, Jimmy McGill (Bob Odenkirk), six years before the events of Breaking Bad, showing his transformation into the persona of criminal-for-hire Saul Goodman. Jimmy becomes the lawyer of former beat cop Mike Ehrmantraut (Jonathan Banks), whose relevant skill set allows him to enter the criminal underworld of drug trafficking in Albuquerque, New Mexico. The show premiered on AMC on February 8, 2015. The 10-episode fourth season is scheduled to air starting August 6, 2018, and the show has been renewed for a fifth season. Question: was better call saul filmed before breaking bad?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Status as a natural-born citizen of the United States is one of the eligibility requirements established in the United States Constitution for holding the office of President or Vice President. This requirement was intended to protect the nation from foreign influence. Question: can you be president if you were not born in the us?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In the United States, age of consent laws regarding sexual activity are made at the state level. There are several federal statutes related to protecting minors from sexual predators, but laws regarding specific age requirements for sexual consent are left to individual states, territories, and the District of Columbia. Depending on the jurisdiction, legal age of consent ranges from 16 to 18 years old. In some places, civil and criminal laws within the same state conflict with each other. Question: can an 18 year old sleep with a 15 year old?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The series is set in an unnamed world that, due to the cyclical nature of time as depicted in the series, is simultaneously the distant past and the distant future Earth. The series depicts fictional, ancient mythology that references modern Earth history, while events in the series prefigure real Earth myths. The series takes place about three thousand years after ``The Breaking of the World'', a global cataclysm that ended the ``Age of Legends'', a highly advanced era. Throughout most of the series, the world's technology and institutions are comparable to those of the Renaissance, but with greater equality for women; some cultures are matriarchal. Events later in the series prompt advances similar to the Industrial Revolution. Question: is the wheel of time set on earth?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In design of experiments, single-subject design or single-case research design is a research design most often used in applied fields of psychology, education, and human behavior in which the subject serves as his/her own control, rather than using another individual/group. Researchers use single-subject design because these designs are sensitive to individual organism differences vs group designs which are sensitive to averages of groups. Often there will be large numbers of subjects in a research study using single-subject design, however--because the subject serves as their own control, this is still a single-subject design. These designs are used primarily to evaluate the effect of a variety of interventions in applied research. Question: is single research design suitable in all research studies?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Panama has qualified once for the finals of a FIFA World Cup, the 2018 edition. They directly qualified after securing the third spot in the hexagonal on the final round. This meant that after 10 failed qualification campaigns, Panama would appear at the World Cup for the first time in their history. Question: has panama been in the world cup before?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Stimuli-responsive gels (hydrogels, when the swelling agent is an aqueous solution) are a special kind of swellable polymer networks with volume phase transition behaviour. These materials change reversibly their volume, optical, mechanical and other properties by very small alterations of certain physical (e.g. electric field, light, temperature) or chemical (concentrations) stimuli. The volume change of these materials occurs by swelling/shrinking and is diffusion-based. Gels provide the biggest change in volume of solid-state materials. Combined with an excellent compatibility with micro fabrication technologies, especially stimuli-responsive hydrogels are of strong increasing interest for microsystems with sensors and actuators. Current fields of research and application are chemical sensor systems, microfluidics and multimodal imaging systems. Question: is there a material that contracts with electricity?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Graham took The Washington Post Company public on June 15, 1971 in the midst of the Pentagon Papers controversy. A total of 1,294,000 shares were offered to the public at $26 per share. By the end of Graham's tenure as CEO in 1991, the stock was worth $888 per share, not counting the effect of an intermediate 4:1 stock split. Question: did the washington post go public in 1971?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The tongue generally heals in 1--2 weeks, during which time the person may have difficulty with speech or their normal dietary habits. Splitting is reversible but the reversal is even more painful than the tongue splitting procedure. Question: can you put a split tongue back together?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: For years, the epicenter of the quake was assumed to be near the town of Olema, in the Point Reyes area of Marin County, because of evidence of the degree of local earth displacement. In the 1960s, a seismologist at UC Berkeley proposed that the epicenter was more likely offshore of San Francisco, to the northwest of the Golden Gate. The most recent analyses support an offshore location for the epicenter, although significant uncertainty remains. An offshore epicenter is supported by the occurrence of a local tsunami recorded by a tide gauge at the San Francisco Presidio; the wave had an amplitude of approximately 3 in (8 cm) and an approximate period of 40--45 minutes. Question: was there a tsunami in the 1906 san francisco earthquake?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A solar cell, or photovoltaic cell, is an electrical device that converts the energy of light directly into electricity by the photovoltaic effect, which is a physical and chemical phenomenon. It is a form of photoelectric cell, defined as a device whose electrical characteristics, such as current, voltage, or resistance, vary when exposed to light. Individual solar cell devices can be combined to form modules, otherwise known as solar panels. In basic terms a single junction silicon solar cell can produce a maximum open-circuit voltage of approximately 0.5 to 0.6 volts. Question: are solar cells the same as solar panels?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The idea for The Shape of Water formed during del Toro's breakfast with Daniel Kraus in 2011, with whom he later co-wrote the novel Trollhunters. It shows similarities to the 2015 short film The Space Between Us. It was also primarily inspired by del Toro's childhood memories of seeing Creature from the Black Lagoon and wanting to see the Gill-man and Kay Lawrence (played by Julie Adams) succeed in their romance. When del Toro was in talks with Universal to direct a remake of Creature from the Black Lagoon, he tried pitching a version focused more on the creature's perspective, where the Creature ended up together with the female lead, but the studio executives rejected the concept. Question: is the movie shape of water based on a book?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Mal de debarquement (or mal de d\u00e9barquement) syndrome (MdDS, or common name disembarkment syndrome) is a neurological condition usually occurring after a cruise, aircraft flight, or other sustained motion event. The phrase ``mal de d\u00e9barquement'' is French and translates to ``illness of disembarkation''. MdDS is typically diagnosed by a neurologist or an ear nose & throat specialist when a person reports a persistent rocking, swaying, or bobbing feeling (though they are not necessarily rocking). This usually follows a cruise or other motion experience. Because most vestibular testing proves to be negative, doctors may be baffled as they attempt to diagnose the syndrome. A major diagnostic indicator is that most patients feel better while driving or riding in a car, i.e.: while in passive motion. MdDS is unexplained by structural brain or inner ear pathology and most often corresponds with a motion trigger, although it can occur spontaneously. This differs from the very common condition of ``land sickness'' that most people feel for a short time after a motion event such as a boat cruise, aircraft ride, or even a treadmill routine which may only last minutes to a few hours. The syndrome has recently received increased attention due to the number of people presenting with the condition and more scientific research has commenced to determine what triggers MdDS and how to cure it. Question: is it normal to be dizzy after a cruise?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The fluid ounce is distinct from the ounce as a unit of weight or mass, although it is sometimes referred to simply as an ``ounce'' where context makes the meaning clear, such as ounces in a bottle. Question: is an oz and a fl oz the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The tournament book of the London 1883 international chess tournament (originally published in 1883) contains a ``Revised International Chess Code'', which was ``published for the consideration of Chess Players, and especially of the managers of future International Tournaments''. Unlike the 1862 rule, which allowed the pawn to remain a pawn, it requires that: ``A Pawn reaching the eighth square must be named as a Queen or piece ... '' (Minchin 1973:iii--iv) Question: can you promote a pawn to a pawn?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The ``flavourings'' are believed to include cloves, soy sauce, lemons, pickles and peppers. Question: is soy sauce and worcestershire sauce the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information provided in the passage to determine if soy sauce and Worcestershire sauce are the same thing. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The Act provides a comprehensive code of company law for the United Kingdom, and made changes to almost every facet of the law in relation to companies. The key provisions are: Question: does companies act 2006 apply to all companies?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In November 2011, Warner Bros. hired Dan Mazeau and David Leslie Johnson to develop and write a treatment for a third installment, Revenge of the Titans. The pair had previously written Wrath of the Titans, which was still in post-production at the time. In the spring of 2013, Sam Worthington said he did not think a third film would be made. Question: is there a third clash of the titans?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Johnson & Wales University (JWU) is an American private, nonprofit, co-educational, career-oriented university with one main and three branch campuses located throughout the United States. Providence, Rhode Island, is home to JWU's first, largest, and main campus. Founded as a business school in 1914 by Gertrude I. Johnson and Mary T. Wales, JWU currently has 15,063 students enrolled in business, arts & sciences, culinary arts, education, engineering, equine management, hospitality, and engineering technology programs across its campuses. Question: is johnson and wales a for profit school?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a factual error in the student's answer. The passage clearly mentions that Johnson & Wales University is a \\\"private, nonprofit, co-educational, career-oriented university.\\\" Therefore, the student's answer should be corrected.\\n\\nTherefore the score is: 50"
    },
    {
        "question": "Passage: A third Ghostbusters film had been in various stages of development following the release of Ghostbusters II in 1989. As a result of original cast member Bill Murray's refusal to commit to the project and the death of fellow cast member Harold Ramis in 2014, Sony decided to reboot the series. Much of the original film's cast make cameo appearances in new roles. The announcement of the female-led cast in 2015 drew a polarized response from the public and Internet backlash, leading to the film's IMDb page and associated YouTube videos receiving low ratings prior to the film's release. Question: were all 4 ghostbusters in the new movie?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The following is a list of episodes from the Disney Channel Original Series Phineas and Ferb, which ran from August 17, 2007, to June 12, 2015. The show ended with a total of 222 segments (133 episodes). Question: are there 104 episodes of phineas and ferb?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The unions between descendants of Queen Victoria and of King Christian IX did not end with the First World War, despite the overthrows of both the German and Russian monarchies (along with House of Habsburg in Austria-Hungary). On the contrary, nearly all European reigning kings and queens today are most closely related through their descent from Victoria, Christian or both. Question: are all the royal families of europe related?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: Sale or serving of alcoholic beverages from 3 a.m. Christmas Day until 7 a.m. December 26 was banned until HB 1542 was passed in 2015. Question: can you buy alcohol on a holiday in indiana?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Predators is a 2010 American science-fiction action film directed by Nimr\u00f3d Antal and starring Adrien Brody, Topher Grace, Alice Braga, Walton Goggins, and Laurence Fishburne. It was distributed by 20th Century Fox. It is the third installment of the Predator franchise (the fifth counting the two Alien vs. Predator films), following Predator (1987) and Predator 2 (1990). Another film, The Predator, is set for release in 2018. Question: will there be a sequel to the movie predators?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Angular frequency (or angular speed) is the magnitude of the vector quantity angular velocity. The term angular frequency vector \u03c9 \u2192 (\\displaystyle (\\vec (\\omega ))) is sometimes used as a synonym for the vector quantity angular velocity. Question: is angular frequency and angular velocity the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Through most of Canada, a driver may turn right at a red light after coming to a complete stop unless a sign indicates otherwise. In the province of Quebec, turning right on a red was illegal until a pilot study carried out in 2003 showed that the right turn on red manoeuvre did not result in significantly more accidents. Subsequent to the study, the Province of Quebec now allows right turns on red except where prohibited by a sign. However, like in New York City, it remains illegal to turn right on a red anywhere on the Island of Montreal. Motorists are reminded of this by large signs posted at the entrance to all bridges. Question: can i turn right on a red light in quebec?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: There may be several meanings of ``solving an equation''. One may want to express the solutions as explicit numbers; for example, the unique solution of 2x -- 1 = 0 is 1/2. Unfortunately, this is, in general, impossible for equations of degree greater than one, and, since the ancient times, mathematicians have searched to express the solutions as algebraic expression; for example the golden ratio ( 1 + 5 ) / 2 (\\displaystyle (1+(\\sqrt (5)))/2) is the unique positive solution of x 2 \u2212 x \u2212 1 = 0. (\\displaystyle x^(2)-x-1=0.) In the ancient times, they succeeded only for degrees one and two. For quadratic equations, the quadratic formula provides such expressions of the solutions. Since the 16th century, similar formulas (using cube roots in addition to square roots), but much more complicated are known for equations of degree three and four (see cubic equation and quartic equation). But formulas for degree 5 and higher eluded researchers for several centuries. In 1824, Niels Henrik Abel proved the striking result that there are equations of degree 5 whose solutions cannot be expressed by a (finite) formula, involving only arithmetic operations and radicals (see Abel--Ruffini theorem). In 1830, \u00c9variste Galois proved that most equations of degree higher than four cannot be solved by radicals, and showed that for each equation, one may decide whether it is solvable by radicals, and, if it is, solve it. This result marked the start of Galois theory and group theory, two important branches of modern algebra. Galois himself noted that the computations implied by his method were impracticable. Nevertheless, formulas for solvable equations of degrees 5 and 6 have been published (see quintic function and sextic equation). Question: can a polynomial have a square root of a variable?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: By the spring of 1949, the airlift was clearly succeeding, and by April it was delivering more cargo than had previously been transported into the city by rail. On 12 May 1949, the USSR lifted the blockade of West Berlin. The Berlin Blockade served to highlight the competing ideological and economic visions for postwar Europe. Question: is the berlin wall the same as the berlin blockade?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Las Vegas Strip is a stretch of South Las Vegas Boulevard in Clark County, Nevada that is known for its concentration of resort hotels and casinos. The Strip is approximately 4.2 miles (6.8 km) in length, located immediately south of the Las Vegas city limits in the unincorporated towns of Paradise and Winchester. However, the Strip is often referred to as being in Las Vegas. Question: is the strip in the city of las vegas?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Letter or ANSI Letter is a paper size commonly used as home or office stationery in the United States, Canada, Chile, Mexico, the Dominican Republic and the Philippines. It measures 8.5 by 11 inches (215.9 by 279.4 mm). US Letter-size paper is a standard defined by the American National Standards Institute (ANSI, paper size A), in contrast to A4 paper used by most other countries, and adopted at varying dates, which is defined by the International Organization for Standardization, specifically in ISO 216. Question: is 8.5 x 11 the same as a4?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Admission to the bar in the United States is the granting of permission by a particular court system to a lawyer to practice law in that system. Each U.S state and similar jurisdiction (e.g., territories under federal control) has its own court system and sets its own rules for bar admission (or privilege to practice law), which can lead to different admission standards among states. In most cases, a person who is ``admitted'' to the bar is thereby a ``member'' of the particular bar. Question: do you have to take the bar exam to practice law?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The United States recognizes the right of asylum for individuals as specified by international and federal law. A specified number of legally defined refugees who either apply for asylum from inside the U.S. or apply for refugee status from outside the U.S., are admitted annually. Refugees compose about one-tenth of the total annual immigration to the United States, though some large refugee populations are very prominent. Since World War II, more refugees have found homes in the U.S. than any other nation and more than two million refugees have arrived in the U.S. since 1980. In the years 2005 through 2007, the number of asylum seekers accepted into the U.S. was about 40,000 per year. This compared with about 30,000 per year in the UK and 25,000 in Canada. The U.S. accounted for about 10% of all asylum-seeker acceptances in the OECD countries in 1998-2007. The United States is by far the most populous OECD country and receives fewer than the average number of refugees per capita: In 2010-14 (before the massive migrant surge in Europe in 2015) it ranked 28 of 43 industrialized countries reviewed by UNHCR. Question: do you need to be on us soil to claim asylum?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Until 1998, the Division Series rotated which of the three division champions would not have home-field advantage, with the wild card never having it. Now the two division winners with the best records in each league have home field, with the least-winning divisional winner and the wild card not having home field. The DS used a 2-3 format until 1998 and now uses a 2-2-1 format. This is seen as a more fair distribution of home-field advantage because previously under the 2-3 format, the team hosting the first two games had absolutely no chance of winning the series at home. With the current 2-2-1 format however, both teams have the home-field advantage in a sense. While one team gets to host three games (including the critical first and last game), the other team does get two chances out of three (games 3 and 4) of winning the series on its home field. Question: can a wild card team have home field advantage in mlb playoffs?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Powered speakers, also known as self-powered speakers and active speakers, are loudspeakers that have built-in amplifiers. Powered speakers are used in a range of settings, including in sound reinforcement systems (used at live music concerts), both for the main speakers facing the audience and the monitor speakers facing the performers; by DJs performing at dance events and raves; in private homes as part of hi-fi or home cinema audio systems and as computer speakers. They can be connected directly to a mixing console or other low-level audio signal source without the need for an external amplifier. Some active speakers designed for sound reinforcement system use have an onboard mixing console and microphone preamplifier, which enables microphones to be connected directly to the speaker. Question: do you need an amplifier for powered speakers?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a crucial detail missing in the student's answer. While it is true that powered speakers do not require an external amplifier, the student should have mentioned that they have a built-in amplifier. Therefore the score is: 70"
    },
    {
        "question": "Passage: Persons driving into Canada must have their vehicle's registration document and proof of insurance. Question: can you drive a us car into canada?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: The return address is not required on postal mail. However, lack of a return address prevents the postal service from being able to return the item if it proves undeliverable; such as from damage, postage due, or invalid destination. Such mail may otherwise become dead letter mail. Question: can i send something without a return address?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A license is required to carry a loaded handgun either openly or concealed. Such permits are issued through the Department of Safety to qualified residents 21 years or 18 years old if the applicant is active duty, reservist, guardsman, or honorably discharged from their branch of service, DD-214 must mention 'pistol qualification' in order to be exempt from 8 hour safety course must have a valid military ID. The length of the term for the initial license is determined by the age of the applicant. If renewed properly and on time, the license is renewed every 8 years. Tennessee recognizes any valid, out-of-state permit for carrying a handgun as long as the permittee is not a resident of Tennessee. Nonresidents are not issued permits unless they are regularly employed in the state. Such persons are then required to obtain Tennessee permits even if they have home state permits unless their home state has entered into a reciprocity agreement with Tennessee. Permittees may carry handguns in most areas except civic centers, public recreation buildings and colleges. Businesses or landowners posting ``no carry'' signs may prohibit gun carry on any portion of their properties. Question: can you carry a concealed weapon in tennessee?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: This is one of the lowest levels of leave in the industrialized world. In comparison to other countries, the United States is one of the only countries in the world, and the only OECD member, that has not passed laws requiring business and corporations to offer paid maternity leave to their employees. Question: is it law to get paid maternity leave?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Sometimes the inner phase itself can act as an emulsifier, and the result is a nanoemulsion, where the inner state disperses into ``nano-size'' droplets within the outer phase. A well-known example of this phenomenon, the ``Ouzo effect'', happens when water is poured into a strong alcoholic anise-based beverage, such as ouzo, pastis, absinthe, arak, or raki. The anisolic compounds, which are soluble in ethanol, then form nano-size droplets and emulsify within the water. The resulting color of the drink is opaque and milky white. Question: is it possible to make an emulsion with water ethanol and an emulsifier?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Cat is the fourth animal symbol in the 12-year cycle of the Vietnamese zodiac and Gurung zodiac, taking place of the Rabbit in the Chinese zodiac. As such, the traits associated with the Rabbit are attributed to the cat. Cats are in conflict with the Rat. Question: is the cat part of the chinese zodiac?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In addition to the Vehicle Type Approval, to be ridden on the road the minibike must be registered with the Driver and Vehicle Licensing Agency, display a valid road tax disc (vehicle excise duty), and the driver or rider must be aged 17 or over (or 16 if the vehicle meets the definition of a moped), have an appropriate driving license, and insurance. The only exceptions relate to electrically-assisted pedal cycles and mobility aids for disabled people. In the UK there are in most areas mini bike tracks for the public. Question: do you need a license to ride a mini bike?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In finance, the time value (TV) (extrinsic or instrumental value) of an option is the premium a rational investor would pay over its current exercise value (intrinsic value), based on the probability it will increase in value before expiry. For an American option this value is always greater than zero in a fair market, thus an option is always worth more than its current exercise value.. As an option can be thought of as 'price insurance' (e.g., an airline insuring against unexpected soaring fuel costs caused by a hurricane), TV can be thought of as the risk premium the option seller charges the buyer--the higher the expected risk (volatility \u22c5 (\\displaystyle \\cdot ) time), the higher the premium. Conversely, TV can be thought of as the price an investor is willing to pay for potential upside. Question: can the time value of an option be negative?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a possibility that the time value of an option can be negative. According to the passage, the time value (TV) of an option is the premium a rational investor would pay over its current exercise value (intrinsic value), based on the probability it will increase in value before expiry. For an American option, this value is always greater than zero in a fair market, but it is possible for the option to have a negative time value if the probability of it increasing in value is low. Therefore, the score is: 60"
    },
    {
        "question": "Passage: Pregnancy over age 50 has, over recent years, become possible for more women, and more easily achieved for many, due to recent advances in assisted reproductive technology, in particular egg donation. Typically, a woman's fecundity ends with menopause, which by definition is 12 consecutive months without having had any menstrual flow at all. During perimenopause, the menstrual cycle and the periods become irregular and eventually stop altogether, but even when periods are still regular, the egg quality of women in their forties is lower than in younger women, making the likelihood of conceiving a healthy baby also reduced, particularly after age 42. It is important to note, that the female biological clock can vary greatly from woman to woman. A woman's individual level of fertility can be tested through a variety of methods. Question: can i get pregnant at 52 years old?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Far Cry games, due to the history of their development, do not have any significant shared narrative elements, but instead share a theme of placing the player in a wilderness environment where they must help fight against one or more despots that control the region as well as surviving against wild animals that roam the open spaces. The Far Cry games feature a robust single-player campaign with later titles offering co-operative campaign support. The games also offer competitive multiplayer options and the ability for users to edit the games' maps for these matches. Question: are all of the far cry games connected?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Spontaneous glass breakage is a phenomenon by which toughened glass (or tempered) may spontaneously break without any apparent reason. The most common causes are: Question: can a glass window break on its own?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The castle stands upon the plug of an extinct volcano, which is estimated to have risen about 350 million years ago during the lower Carboniferous period. The Castle Rock is the remains of a volcanic pipe, which cut through the surrounding sedimentary rock before cooling to form very hard dolerite, a type of basalt. Subsequent glacial erosion was resisted by the dolerite, which protected the softer rock to the east, leaving a crag and tail formation. Question: is edinburgh castle built on top of a volcano?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The word electricity refers generally to the movement of electrons (or other charge carriers) through a conductor in the presence of potential and an electric field. The speed of this flow has multiple meanings. In everyday electrical and electronic devices, the signals or energy travel as electromagnetic waves typically on the order of 50%--99% of the speed of light, while the electrons themselves move (drift) much more slowly. Question: can electrons travel at the speed of light?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a bit of a misconception in the student's answer. While it is true that electrons do not travel at the speed of light, the student has not fully understood the passage. The passage states that in everyday electrical and electronic devices, signals or energy travel as electromagnetic waves, which is typically on the order of 50%--99% of the speed of light, while the electrons themselves move (drift) much more slowly. This means that while the electrons do not travel at the speed of light, the energy they carry can travel at close to the speed of light.\\n\\nTherefore, the score is: 70"
    },
    {
        "question": "Passage: Heat is a 1995 American crime film written, co-produced and directed by Michael Mann, and starring Robert De Niro, Al Pacino, and Val Kilmer. De Niro plays Neil McCauley, a professional thief, while Pacino plays Lt. Vincent Hanna, a LAPD robbery-homicide detective tracking down McCauley's crew. The story is based on the former Chicago police officer Chuck Adamson's pursuit during the 1960s of a criminal named McCauley, after whom De Niro's character is named. Heat is a remake by Mann of an unproduced television series he had worked on, the pilot of which was released as the TV movie L.A. Takedown in 1989. Question: was the movie heat based on a book?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Beef short ribs are the equivalent of spare ribs in pork, with beef short ribs usually larger and meatier than pork spare ribs. Question: are beef short ribs the same as spare ribs?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The drinking age in Wisconsin is 21. Those under the legal drinking age may be served, possess, or consume alcohol if they are with a parent, legal guardian, or spouse who is of legal drinking age. Those age 18-20 may also be served, possess or consumer alcohol if they are with a parent, legal guardian, or spouse who is of legal drinking age. Those age 18 to 20 may also possess (but not consume) alcohol as part of their employment. Question: is it legal to drink with parents in wisconsin?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Since their invention in 1900, reflector sights have come to be used as gun sights on all kinds of weapons. They were used on fighter aircraft, in a limited capacity in World War I, widely used in World War II, and still used as the base component in many types of modern head-up displays. They have been used in other types of (usually large) weapons as well, such as anti-aircraft gun sights, anti tank gun sights, and any other role where the operator had to engage fast moving targets over a wide field of view, and the sight itself could be supplied with sufficient electrical power to function. There was some limited use of the sight on small arms after World War II but it came into widespread use after the late 1970s with the invention of the red dot sight, with a red light-emitting diode (LED) as its reticle, making a dependable sight with durability and extremely long illumination run time. Question: did they have reflex sights in world war 2?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A peck is an imperial and United States customary unit of dry volume, equivalent to 2 dry gallons or 8 dry quarts or 16 dry pints (9.09 (UK) or 8.81 (US) liters). Two pecks make a kenning (obsolete), and four pecks make a bushel. Although the peck is no longer widely used, some produce, such as apples, is still often sold by the peck. Despite being referenced in the well-known Peter Piper tongue twister, pickled peppers are so rarely sold by the peck that any association between pickled peppers and the peck unit of measurement is considered humorous in nature. Question: is a peck the same as a bushel?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Typically, once spun, cotton candy is only marketed by color. Absent a clear name other than 'blue', the distinctive taste of the blue raspberry flavor mix has gone on to become a compound flavor that some other foods (gum, ice cream, rock candy, fluoride toothpaste) occasionally borrow (``cotton-candy flavored ice cream'') to invoke the nostalgia of cotton candy that people typically only get to experience on vacation or holidays. Pink bubble gum went through a similar transition from specific branded product to a generic flavor that transcended the original confection, and 'bubble gum flavor' often shows up in the same product categories as 'cotton candy flavor'. Question: do blue and pink cotton candy taste the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is not enough information in the passage to determine if blue and pink cotton candy taste the same. The passage only states that blue raspberry flavor mix has become a compound flavor that some other foods borrow, and pink bubble gum went through a similar transition. However, it does not provide any information on the taste of blue and pink cotton candy. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Development of methods to mass-produce spider silk has led to manufacturing of military, medical and consumer goods, such as ballistics armour, athletic footwear, personal care products, breast implant and catheter coatings, mechanical insulin pumps, fashion clothing, and outerwear. Question: can you make clothes out of spider silk?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Data visualization is both an art and a science. It is viewed as a branch of descriptive statistics by some, but also as a grounded theory development tool by others. Increased amounts of data created by Internet activity and an expanding number of sensors in the environment are referred to as ``big data'' or Internet of things. Processing, analyzing and communicating this data present ethical and analytical challenges for data visualization. The field of data science and practitioners called data scientists help address this challenge. Question: is data visualization a part of data science?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a possibility that the student's answer is correct, as the passage states that data visualization is a branch of descriptive statistics and is also used as a tool for grounded theory development. Additionally, the field of data science and data scientists help address the challenges presented by the increasing amount of data. However, the question is not explicitly answered in the passage whether data visualization is a part of data science. Therefore, the score is: 70/100"
    },
    {
        "question": "Passage: CeCe's condition worsens and Ivy is forced to take her to the hospital, where she has a run in with the Van der Woodsens and Carol. When Charlie arrives, Ivy and Carol finally tell the truth about Carol's scheme and Ivy's role in it. Ivy is turned away by the Van der Woodsens. Following this, Ivy turns to Georgina and the two later crash CeCe's wake, during which, through the reading of CeCe's will, Ivy is left everything under her legal name instead of her alias Charlie Rhodes, revealing that CeCe knew about the fact that Ivy isn't her real granddaughter. After this revelation, Ivy kicks Lily and Rufus out of the apartment as it had been paid for by CeCe and was now hers. Ivy throws a party to commemorate the Celia Rhodes Foundation and to gain acceptance in the Upper East Side. She believes she has found an ally in William van der Woodsen after he agrees to persuade people to attend the party in exchange for money, however, it is later revealed he is working with Lily to gather evidence against Ivy to take her to court and contest CeCe's will. Question: do they ever find out about charlie in gossip girl?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " I have read the passage and the student's answer. The question asks if they ever find out about Charlie in \\\"Gossip Girl.\\\" According to the passage, it is not explicitly stated whether they find out about Charlie. Therefore, the score is: 60/100."
    },
    {
        "question": "Passage: Since the 20th century, the word ``girdle'' also has been used to define an undergarment made of elasticized fabric that was worn by women. It is a form-fitting foundation garment that encircles the lower torso, perhaps extending below the hips, and worn often to shape or for support. It may be worn for aesthetic or medical reasons. In sports or medical treatment, a girdle may be worn as a compression garment. This form of women's foundation wear replaced the corset in popularity, and was in turn to a large extent surpassed by the pantyhose in the 1960s. Question: is a girdle the same as a corset?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: On February 9, 2006, Nick Baron announced that a version of Halo 2 would be released on PC, exclusively for the Windows Vista operating system. While this was a deliberate decision by Microsoft to push sales of Vista, the game could be enabled to play on Windows XP through an unauthorized third-party patch. The game was ported by a small team at Microsoft Game Studios (codenamed Hired Gun) who worked closely with Bungie. As one of the launch titles of Games for Windows -- Live, the game offered Live features not available in the Xbox version, such as guide support and achievements. The Windows port also added two exclusive multiplayer maps and a map editor. Question: does halo 2 have achievements on xbox 360?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Kiwis lack an egg tooth, instead using their legs and beak to break through a relatively thin eggshell. The superprecocial megapodes possess an egg tooth in their early embryonic development, but instead use their claws during hatching. Question: do all baby birds have an egg tooth?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no mention of whether all baby birds have an egg tooth in the passage. The passage only discusses kiwis and superprecocial megapodes, which suggests that not all baby birds have an egg tooth. However, the student's answer cannot be determined to be completely accurate based on the information provided in the passage. Therefore the score is: 50"
    },
    {
        "question": "Passage: A player sanctioned with a red card was sent off from the pitch and could not be replaced. Furthermore, the player was automatically banned from his team's next match. After a straight red card, FIFA conducted a hearing and considered extending this ban beyond one match. If the ban extended beyond the end of the World Cup finals (for example, a player was sent off in his team's last match), it had to be served in the team's next competitive international match(es). Question: do red cards carry over in world cup?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A new engine is broken in by following specific driving guidelines during the first few hours of its use. The focus of breaking in an engine is on the contact between the piston rings of the engine and the cylinder wall. There is no universal preparation or set of instructions for breaking in an engine. Most importantly, experts disagree on whether it is better to start engines on high or low power to break them in. While there are still consequences to an unsuccessful break-in, they are harder to quantify on modern engines than on older models. In general, people no longer break in the engines of their own vehicles after purchasing a car or motorcycle, because the process is done in production. It is still common, even today, to find that an owner's manual recommends gentle use at first (often specified as the first 500 or 1000 kilometres or miles). But it is usually only normal use without excessive demands that is specified, as opposed to light/limited use. For example, the manual will specify that the car be driven normally, but not in excess of the highway speed limit. Question: do you need to break in a car?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Among the taller players who have enjoyed success at the position is Ben Simmons, who at 6' 10'' won the 2018 National Basketball Association Rookie of the Year Award. Behind him is Magic Johnson, who at 6' 9'' (2.06 m) won the National Basketball Association Most Valuable Player Award three times in his career. Other point guards who have been named NBA MVP include Russell Westbrook, Bob Cousy, Oscar Robertson, Allen Iverson, Derrick Rose and two-time winners Steve Nash and Stephen Curry. In the NBA, point guards are usually about 6' 3'' (1.93 m) or shorter, and average about 6' 2'' (1.88 m) whereas in the WNBA, point guards are usually 5' 9'' (1.75 m) or shorter. Having above-average size (height, muscle) is considered advantageous, although size is secondary to situational awareness, speed, quickness, and ball handling skills. Shorter players tend to be better dribblers since they are closer to the floor, and thus have better control of the ball while dribbling. Question: does a point guard have to be tall?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The four-leaf clover is a rare variation of the common three-leaf clover. According to traditional superstition, such clovers bring good luck, though it is not clear when or how that superstition got started. The earliest mention of ``Fower-leafed or purple grasse'' is from 1640 and simply says that it was kept in gardens because it was ``good for the purples in children or others''. A description from 1869 says that four-leaf clovers were ``gathered at night-time during the full moon by sorceresses, who mixed it with vervain and other ingredients, while young girls in search of a token of perfect happiness made quest of the plant by day''. The first reference to luck might be from an 11-year-old girl, who wrote in an 1877 letter to St. Nicholas Magazine, ``Did the fairies ever whisper in your ear, that a four-leaf clover brought good luck to the finder?'' Question: is it lucky to find a four leaf clover?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Freepost is a postal service provided by various postal administrations, whereby a person sends mail without affixing postage, and the recipient pays the postage when collecting the mail. Freepost differs from self-addressed stamped envelopes, courtesy reply mail, and metered reply mail in that the recipient of the freepost pays only for those items that are actually received, rather than for all that are distributed. Question: do i need a stamp if it says freepost?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: During Microsoft's E3 2015 press conference on June 15, 2015, Microsoft announced plans to introduce Xbox 360 backward compatibility on the Xbox One at no additional cost. Supported Xbox 360 games will run within an emulator and have access to certain Xbox One features, such as recording and broadcasting gameplay. Games do not run directly from discs. A relicensed form of the game is downloaded automatically when a supported game is inserted, instead of having to make extensive modifications to the game in-order to port the original title. This means, that the only reason every single Xbox 360 title is not available, is a judicial issue, not an engineering one. All Xbox 360 games could run out-of-the-box on Xbox One, as they require no modifications or porting to run, other than a valid license. While digitally-purchased games will automatically appear for download in the user's library once available. As with Xbox One titles, if the game is installed using physical media, the disc is still required for validation purposes. Question: can the xbox one play xbox 360 discs?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Initially used for higher octane Super Unleaded petrol/gasoline (formerly known as Optimax in some regions), it is now additionally used for high specification diesel fuel. Question: is v power the same as super unleaded?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: When asked about developing a sequel series, King stated ``I'd love to revisit Jake and Sadie, and also revisit the rabbit hole that dumps people into the past, but sometimes it's best not to go back for a second helping.''. Question: will there be a season 2 of 11-22-63?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Except for case length, the .38 Special is identical to the .38 Short Colt, .38 Long Colt, and .357 Magnum. This allows the .38 Special round to be safely fired in revolvers chambered for the .357 Magnum, and the .38 Long Colt in revolvers chambered for .38 Special, increasing the versatility of this cartridge. However, the longer and more powerful .357 Magnum cartridge will usually not chamber and fire in weapons rated specifically for .38 Special (e.g. all versions of the Smith & Wesson Model 10), which are not designed for the greatly increased pressure of the magnum rounds. Both .38 Special and .357 Magnum will chamber in Colt New Army revolvers in .38 Long Colt, due to the straight walled chambers, but this should not be done under any circumstances, due to dangerous pressure levels, up to three times what the New Army is designed to withstand. Question: can you shoot .38 long colt in .38 special?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Stanley Cup playoffs consists of four rounds of best-of-seven series. Each series is played in a 2--2--1--1--1 format, meaning the team with home-ice advantage hosts games one, two, five, and seven, while their opponent hosts games three, four, and six. Games five, six, and seven are only played if needed. Question: is the first round of nhl playoffs 5 games?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: An income statement or profit and loss account (also referred to as a profit and loss statement (P&L), statement of profit or loss, revenue statement, statement of financial performance, earnings statement, operating statement, or statement of operations) is one of the financial statements of a company and shows the company's revenues and expenses during a particular period. It indicates how the revenues (money received from the sale of products and services before expenses are taken out, also known as the ``top line'') are transformed into the net income (the result after all revenues and expenses have been accounted for, also known as ``net profit'' or the ``bottom line''). The purpose of the income statement is to show managers and investors whether the company made or lost money during the period being reported. Question: is profit and loss account same as income statement?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Chinatown in St. Louis, Missouri, was a Chinatown near Downtown St. Louis that existed from 1869 until its demolition for Busch Memorial Stadium in 1966. Also called Hop Alley, it was bounded by Seventh, Tenth, Walnut and Chestnut streets. Question: is there a chinatown in st louis mo?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: As a general practice, discouraged workers, who are often classified as marginally attached to the labor force, on the margins of the labor force, or as part of hidden unemployment, are not considered part of the labor force, and are thus not counted in most official unemployment rates--which influences the appearance and interpretation of unemployment statistics. Although some countries offer alternative measures of unemployment rate, the existence of discouraged workers can be inferred from a low employment-to-population ratio. Question: are discouraged workers part of the unemployment rate?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: Agronomists describe the benefits to yield in rotated crops as ``The Rotation Effect''. There are many found benefits of rotation systems: however, there is no specific scientific basis for the sometimes 10-25% yield increase in a crop grown in rotation versus monoculture. The factors related to the increase are simply described as alleviation of the negative factors of monoculture cropping systems. Explanations due to improved nutrition; pest, pathogen, and weed stress reduction; and improved soil structure have been found in some cases to be correlated, but causation has not been determined for the majority of cropping systems. Question: is there a way one can grow more crops from the same land?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The most important power is found in section 58: ``When a proposed law passed by both Houses of Parliament is presented to the Governor-General for the Queen's assent, he shall declare ... that he assents in the Queen's name.'' The royal assent brings such laws into effect, as legislation, from the date of signing. Question: does the governor general have to sign all bills passed by federal parliament?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Superfecundation is the fertilization of two or more ova from the same cycle by sperm from separate acts of sexual intercourse, which can lead to twin babies from two separate biological fathers. The term superfecundation is derived from fecund, meaning the ability to produce offspring. Heteropaternal superfecundation refers to the fertilization of two separate ova by two different fathers. Homopaternal superfecundation refers to the fertilization of two separate ova from the same father, leading to fraternal twins. While heteropaternal superfecundation is referred to as a form of atypical twinning, genetically, the twins are half siblings. Superfecundation, while rare, can occur through either separate occurrences of sexual intercourse or through artificial insemination. Question: can you have twin with two different father?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Under the criteria generally, it is possible for a player to have a choice of representing several national teams. Montenegrin defender of mixed Serbian and Bulgarian descent Nikola Vujadinovi\u0107, for example, would be eligible to play for the senior teams of Serbia, Montenegro or Bulgaria if he invoked his father's Bulgarian descent and lived in Bulgaria for two years. It is not uncommon for national team managers and scouts to attempt to persuade players to change their FIFA nationality; in June 2011, for example, Scotland manager Craig Levein confirmed that his colleagues had started a dialogue with United States under-17 international Jack McBean in an attempt to persuade him to represent Scotland in the future. Gareth Bale was asked about a possibility to play for England, being of English descent through his grandmother, but ultimately opted to represent Wales, his country of birth. Question: can you play for two international football teams?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In-N-Out Burger is an American regional chain of fast food restaurants with locations primarily in the American Southwest and Pacific coast. It was founded in Baldwin Park, California in 1948 by Harry Snyder and Esther Snyder. The chain is currently headquartered in Irvine, California and has slowly expanded outside Southern California into the rest of California, as well as into Arizona, Nevada, Utah, Texas, and Oregon. The current owner is Lynsi Snyder, the Snyders' only grandchild. Question: is there an in and out burger in maryland?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The whale shark (Rhincodon typus) is a slow-moving, filter-feeding carpet shark and the largest known extant fish species. The largest confirmed individual had a length of 12.65 m (41.5 ft) and a weight of about 21.5 t (47,000 lb). The whale shark holds many records for size in the animal kingdom, most notably being by far the largest living nonmammalian vertebrate. It is the sole member of the genus Rhincodon and the only extant member of the family Rhincodontidae which belongs to the subclass Elasmobranchii in the class Chondrichthyes. Before 1984 it was classified as Rhiniodon into Rhinodontidae. Question: is there such thing as a whale shark?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Four-wheel drive, also called 4\u00d74 (``four by four'') or 4WD, refers to a two-axled vehicle drivetrain capable of providing torque to all of its wheels simultaneously. It may be full-time or on-demand, and is typically linked via a transfer case providing an additional output drive-shaft and, in many instances, additional gear ranges. Question: is 4x4 the same as 4 wheel drive?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The How to Train Your Dragon franchise from DreamWorks Animation consists of two feature films How to Train Your Dragon (2010) and How to Train Your Dragon 2 (2014), with a third feature film, How to Train Your Dragon: The Hidden World, set for a 2019 release. The franchise is inspired by the British book series of the same name by Cressida Cowell. The franchise also consists of four short films: Legend of the Boneknapper Dragon (2010), Book of Dragons (2011), Gift of the Night Fury (2011) and Dawn of the Dragon Racers (2014). A television series following the events of the first film, Dragons: Riders of Berk, began airing on Cartoon Network in September 2012. Its second season was renamed Dragons: Defenders of Berk. Set several years later, and as a more immediate prequel to the second film, a new television series, titled Dragons: Race to the Edge, aired on Netflix in June 2015. The second season of the show was added to Netflix in January 2016 and a third season in June 2016. A fourth season aired on Netflix in February 2017, a fifth season in August 2017, and a sixth and final season on February 16, 2018. Question: will there be a new how to train your dragon series?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Little Witch Academia (\u30ea\u30c8\u30eb\u30a6\u30a3\u30c3\u30c1\u30a2\u30ab\u30c7\u30df\u30a2, Ritoru Witchi Akademia) is a Japanese anime franchise created by Yoh Yoshinari and produced by Trigger. The original short film, directed by Yoshinari and written by Masahiko Otsuka, was released in theaters on March 2, 2013 as part of the Young Animator Training Project's Anime Mirai 2013 project, and was later streamed with English subtitles on YouTube from April 19, 2013. A second short film partially funded through Kickstarter, Little Witch Academia: The Enchanted Parade, was released on October 9, 2015. An anime television series aired in Japan between January and June 2017, with its first 13 episodes available on Netflix worldwide beginning on June 30, 2017. The remaining 12 episodes of its first season was labeled as the show's second season and was made available on the platform on August 15, 2017. Two manga series have been published by Shueisha. Question: is there going to be a season 2 of little witch academia?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: According to the current scientific theories, matter is required to travel at slower-than-light (also subluminal or STL) speed with respect to the locally distorted spacetime region. Apparent FTL is not excluded by general relativity; however, any apparent FTL physical plausibility is speculative. Examples of apparent FTL proposals are the Alcubierre drive and the traversable wormhole. Question: can we travel faster than speed of light?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The World Ocean is also collectively known as just ``the sea''. The International Hydrographic Organization lists over 70 distinct bodies of water called seas. Question: are the sea and the ocean the same thing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: While some 19th-century experiments suggested that the underlying premise is true if the heating is sufficiently gradual, according to contemporary biologists the premise is false: a frog that is gradually heated will jump out. Indeed, thermoregulation by changing location is a fundamentally necessary survival strategy for frogs and other ectotherms. Question: does a frog jump out of boiling water?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Netflix original series The Ranch, starring Ashton Kutcher, Danny Masterson, Sam Elliott and Debra Winger is set in the fictional town of Garrison, Colorado, but the opening shot of the town during the credit sequence is of Ouray, and the San Juan Valley just north of Ouray. Question: is there such a place as garrison colorado?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Xbox One gaming console has received updates from Microsoft since its launch in 2013 that enable it to play select games from its two predecessor consoles, Xbox and Xbox 360. On June 15, 2015, backward compatibility with supported Xbox 360 games became available to eligible Xbox Preview program users with a beta update to the Xbox One system software. The dashboard update containing backward compatibility was released publicly on November 12, 2015. On October 24, 2017, another such update added games from the original Xbox library. The following is a list of all backward compatible games on Xbox One under this functionality. Question: can all xbox 360 games be played on xbox one?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Goalkeepers are normally allowed to handle the ball within their own penalty area, and once they have control of the ball in their hands opposition players may not challenge them for it. However the back-pass rule prohibits goalkeepers from handling the ball after it has been deliberately kicked to them by a team-mate, or after receiving it directly from a throw-in taken by a team-mate. Back-passes with parts of the body other than the foot, such as headers, are not prohibited. Despite the popular name ``back-pass rule'', there is no requirement in the laws that the kick or throw-in must be backwards; handling by the goalkeeper is forbidden regardless of the direction the ball travels. Question: can a goalkeeper pick up a ball from a throw in?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The County Court Business Centre (CCBC) is a centre of the County Court of England and Wales created to deal with claims by the use of various electronic media. Unlike other County Court centres the CCBC does not physically hear cases. If any case might require a hearing it is transferred to another centre. Question: is the county court business centre a real court?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The lunar eclipse was completely visible over Eastern Africa, Southern Africa, Southern Asia and Central Asia, seen rising over South America, Western Africa, and Europe, and setting over Eastern Asia, and Australia. Question: is july 27 lunar eclipse visible in usa?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is not enough information in the passage to determine if the July 27 lunar eclipse is visible in the USA. The passage only mentions that the eclipse was visible in certain regions but does not specify if the USA is included in those regions. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Pyruvic acid (CHCOCOOH) is the simplest of the alpha-keto acids, with a carboxylic acid and a ketone functional group. Pyruvate (/pa\u026a\u02c8ru\u02d0ve\u026at/), the conjugate base, CHCOCOO, is a key intermediate in several metabolic pathways. Question: is pyruvic acid and pyruvate the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Reported values vary depending on how it is measured: oral (under the tongue): 7002309950000000000\u266036.8\u00b10.4 \u00b0C (7002309927777777777\u266098.2\u00b10.72 \u00b0F), internal (rectal, vaginal): 37.0 \u00b0C (98.6 \u00b0F). A rectal or vaginal measurement taken directly inside the body cavity is typically slightly higher than oral measurement, and oral measurement is somewhat higher than skin measurement. Other places, such as under the arm or in the ear, produce different typical temperatures. While some people think of these averages as representing normal or ideal measurements, a wide range of temperatures has been found in healthy people. The body temperature of a healthy person varies during the day by about 0.5 \u00b0C (0.9 \u00b0F) with lower temperatures in the morning and higher temperatures in the late afternoon and evening, as the body's needs and activities change. Other circumstances also affect the body's temperature. The core body temperature of an individual tends to have the lowest value in the second half of the sleep cycle; the lowest point, called the nadir, is one of the primary markers for circadian rhythms. The body temperature also changes when a person is hungry, sleepy, sick, or cold. Question: is your body temperature higher in the evening?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Coast Guard operates approximately 201 fixed and rotary wing aircraft from 24 Coast Guard Air Stations throughout the contiguous United States, Alaska, Hawaii, and Puerto Rico. Most of these air stations are tenant activities at civilian airports, several of which are former Air Force Bases and Naval Air Stations, although several are also independent military facilities. Coast Guard Air Stations are also located on active Naval Air Stations, Air National Guard bases, and Army Air Fields. Question: is the national guard part of the coast guard?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mention of Air National Guard bases in the passage, which is part of the National Guard. However, the question specifically asks if the National Guard is part of the Coast Guard. The passage does not provide any information to support that the National Guard is part of the Coast Guard. Therefore, the score is: 70"
    },
    {
        "question": "Passage: South Africa has a 'hybrid' or 'mixed' legal system, formed by the interweaving of a number of distinct legal traditions: a civil law system inherited from the Dutch, a common law system inherited from the British, and a customary law system inherited from indigenous Africans (often termed African Customary Law, of which there are many variations depending on the tribal origin). These traditions have had a complex interrelationship, with the English influence most apparent in procedural aspects of the legal system and methods of adjudication, and the Roman-Dutch influence most visible in its substantive private law. As a general rule, South Africa follows English law in both criminal and civil procedure, company law, constitutional law and the law of evidence; while Roman-Dutch common law is followed in the South African contract law, law of delict (tort), law of persons, law of things, family law, etc. With the commencement in 1994 of the interim Constitution, and in 1997 its replacement, the final Constitution, another strand has been added to this weave. Question: does roman-dutch law still apply to south african law?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In as different groups as animals and trypanosomes, the mitochondria contain both stabilising and destabilising poly(A) tails. Destabilising polyadenylation targets both mRNA and noncoding RNAs. The poly(A) tails are 43 nucleotides long on average. The stabilising ones start at the stop codon, and without them the stop codon (UAA) is not complete as the genome only encodes the U or UA part. Plant mitochondria have only destabilising polyadenylation, and yeast mitochondria have no polyadenylation at all. Question: is the poly a tail added immediately after the stop codon?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: With the release of Wake Up Call, the band decided to undergo a name change, shortening it to Theory. The group cited that discussions involving the name of their band with people who are unfamiliar with their music was challenging. Connolly equated it to rock band Red Hot Chili Peppers frequently abbreviating their name to Chili Peppers. Question: is theory of a deadman changing their name?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: After a long-running legal battle, Bates reunited the stadium freehold with the club in 1992 by doing a deal with the banks of the property developers, who had been bankrupted by a market crash. Chelsea's form in the new Premier League was unconvincing, although they did reach the 1994 FA Cup Final with Glenn Hoddle. It was not until the appointment of Ruud Gullit as player-manager in 1996 that their fortunes changed. He added several top international players to the side, as the club won the FA Cup in 1997 and established themselves as one of England's top sides again. Gullit was replaced by Gianluca Vialli, who led the team to victory in the League Cup Final, the UEFA Cup Winners' Cup Final and the UEFA Super Cup in 1998, the FA Cup in 2000 and their first appearance in the UEFA Champions League. Vialli was sacked in favour of Claudio Ranieri, who guided Chelsea to the 2002 FA Cup Final and Champions League qualification in 2002--03. Question: have chelsea always been in the premier league?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Patrick is questioned by the police about Remy's death and learns that the police never had a phone transcript like the one that Doyle had read to him prior to the botched exchange. The police dismiss his claims of Remy's corruption as conspiracy theory and Patrick does not press the issue further. Patrick and Angie drive to Doyle's home, where Patrick finds Amanda alive and well with Doyle and his wife. Doyle was part of the kidnapping all along and helped set up the fake exchange to frame Cheese and throw Patrick off the scent. Patrick threatens to call the authorities, but Doyle attempts to convince him that Amanda is better off living with them than with her neglectful mother, and that is reason enough to not get involved. Patrick leaves and discusses the choices with Angie, who says she will leave him if he calls the police, since she also believes Amanda is much better off with the Doyles. However, Patrick believes Amanda's mother can change and she shouldn't be denied her child, and calls the police; Doyle and Lionel are arrested, Amanda is returned to her mother amidst heavy publicity, and Patrick and Angie break up. Question: do they find the child in gone baby gone?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: While the film was a box office hit and raised Streisand's reputation as a director, its numerous changes from the original novel upset some Conroy purists. Conroy and Johnston eliminated most of the novel's flashback scenes. They describe Tom Wingo's relationship with his siblings in great detail. In the novel, these flashbacks form the main plot and take up more of the novel than the romance between Streisand's character, Dr. Lowenstein, and Tom Wingo. The removal of the flashbacks makes the relationship between Wingo and Lowenstein the central story in the film, whereas in the novel, it is not. Another character in the novel - the second Wingo brother, Luke, who appears only in flashbacks onscreen - is vitally important to the novel, and his death is a major plot point. In fact, the title of the book derives from a poem written by Savannah about Luke and his struggle against the government after the seizure of Colleton. In the film, ``The Prince of Tides'' is the title of a book of poetry written by Savannah and dedicated to Tom. Luke only appears intermittently, and only as a child, and his death is only vaguely described. Question: is prince of tides based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A perfectly inelastic collision occurs when the maximum amount of kinetic energy of a system is lost. In a perfectly inelastic collision, i.e., a zero coefficient of restitution, the colliding particles stick together. In such a collision, kinetic energy is lost by bonding the two bodies together. This bonding energy usually results in a maximum kinetic energy loss of the system. It is necessary to consider conservation of momentum: (Note: In the sliding block example above, momentum of the two body system is only conserved if the surface has zero friction. With friction, momentum of the two bodies is transferred to the surface that the two bodies are sliding upon. Similarly, if there is air resistance, the momentum of the bodies can be transferred to the air.) The equation below holds true for the two-body (Body A, Body B) system collision in the example above. In this example, momentum of the system is conserved because there is no friction between the sliding bodies and the surface. Question: is kinetic energy conserved in perfectly inelastic collisions?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: An alternative cut of the ending appeared on the season nine DVD and the box set. It did not contain any new footage, but is edited in a way that changes the fates of Ted, Tracy and Robin. The voiceover is performed by Saget, rather than Radnor, and is completely different. Question: is there an alternative ending to how i met your mother?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: When the monorail company first announced details of the extension in September 2008, the airport extension was to be built with private funds and was expected to be built by 2012. However, as of March 2011, the Las Vegas Monorail Company was still in the planning phases of the proposed extension to McCarran International Airport with a proposed stop on the UNLV campus. Question: does the monorail go to las vegas airport?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Canned beans, often containing pork, were among the first convenience foods, and it is in this form that they became exported and popularised by U.S. companies operating in the UK in the early 20th century. The U.S. Food and Drug Administration stated in 1996: ``It has for years been recognized by consumers generally that the designation 'beans with pork,' or 'pork and beans' is the common or usual name for an article of commerce that contains very little pork.'' The included pork is typically a piece of salt pork that adds fat to the dish. Question: are pork n beans the same as baked beans?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The ASU includes a midnight blue coat and low waist trousers for male soldiers; and a midnight blue coat, slacks and skirt for female soldiers. The fabric for the ASU is heavier and more wrinkle resistant than previously manufactured uniforms and will consist of 55% wool and 45% polyester material. The ASU coat has a tailored, athletic cut to improve uniform fit and appearance. The ASU includes an improved heavier and wrinkle resistant short and long-sleeved white shirt with permanent military creases and shoulder loops. The JROTC version replaces the white shirt with the prototype grey shirt and gold braid is not worn on the blue trousers or on the sleeves of the class A coat. Compared to the Army's previous uniforms, the ASU does not include a garrison cap; soldiers will continue to wear the Army's berets. Question: can you wear short sleeve shirt with asu jacket?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Instead, Zelena travels the time portal with Emma and Hook to change the past, where she takes the opportunity to kill Robin's wife, Marian, and takes her place instead using magic to conceal her true identity. At some point after turning into Marian, Zelena uses magic to learn everything she can about her. Question: does maid marian die in once upon a time?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Western Digital Corporation (abbreviated WDC, commonly shortened to Western Digital or WD) is an American computer data storage company and one of the largest computer hard disk drive manufacturers in the world, along with its main competitor Seagate Technology. Question: are seagate and western digital the same company?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a factual error in the student's answer. Western Digital and Seagate are not the same company. They are two separate companies that compete with each other in the data storage industry. Therefore the score is: 60"
    },
    {
        "question": "Passage: Display lag is a phenomenon associated with some types of liquid crystal displays (LCDs) like smartphones and computers, and nearly all types of high-definition televisions (HDTVs). It refers to latency, or lag measured by the difference between the time there is a signal input, and the time it takes the input to display on the screen. This lag time has been measured as high as 68 ms, or the equivalent of 3-4 frames on a 60 Hz display. Display lag is not to be confused with pixel response time. Currently the majority of manufacturers do not include any specification or information about display latency on the screens they produce. Question: is input lag and response time the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The return address is not required on postal mail. However, lack of a return address prevents the postal service from being able to return the item if it proves undeliverable; such as from damage, postage due, or invalid destination. Such mail may otherwise become dead letter mail. Question: can you mail something with no return address?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Conscription in the United States, commonly known as the draft, has been employed by the federal government of the United States in five conflicts: the American Revolution, the American Civil War, World War I, World War II, and the Cold War (including both the Korean War and the Vietnam War). The third incarnation of the draft came into being in 1940 through the Selective Training and Service Act. It was the country's first peacetime draft. From 1940 until 1973, during both peacetime and periods of conflict, men were drafted to fill vacancies in the United States Armed Forces that could not be filled through voluntary means. The draft came to an end when the United States Armed Forces moved to an all-volunteer military force. However, the Selective Service System remains in place as a contingency plan; all male civilians between the ages of 18 and 25 are required to register so that a draft can be readily resumed if needed. United States Federal Law also provides for the compulsory conscription of men between the ages of 17 and 45 and certain women for militia service pursuant to Article I, Section 8 of the United States Constitution and 10 U.S. Code \u00a7 246. Question: was there a military draft in world war 2?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: With the 2004 acquisition of Universal Studios by General Electric and merging with GE's NBC, Universal Music Group was cast under separate management from the eponymous film studio. This is the second time a music company has done so, the first being the separation of Time Warner and Warner Music Group. In February 2006, the label became 100% owned by French media conglomerate Vivendi when Vivendi purchased the last 20% from Matsushita. On June 25, 2007, Vivendi completed its \u20ac1.63 billion ($2.4 billion) purchase of BMG Music Publishing, after receiving European Union regulatory approval, having announced the acquisition on September 6, 2006. Question: is universal music group part of universal studios?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Fu\u00dfball-Club Bayern M\u00fcnchen e.V., commonly known as FC Bayern M\u00fcnchen (German pronunciation: (\u0294\u025bf tse\u02d0 \u02c8ba\u026a\u0250n \u02c8m\u028fn\u00e7n\u0329)), FCB, Bayern Munich, or FC Bayern, is a German sports club based in Munich, Bavaria (Bayern). It is best known for its professional football team, which plays in the Bundesliga, the top tier of the German football league system, and is the most successful club in German football history, having won a record 28 national titles and 18 national cups. Question: is bayern munich the same as bayern munchen?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The Virgo Supercluster (Virgo SC) or the Local Supercluster (LSC or LS) is a mass concentration of galaxies containing the Virgo Cluster and Local Group, which in turn contains the Milky Way and Andromeda galaxies. At least 100 galaxy groups and clusters are located within its diameter of 33 megaparsecs (110 million light-years). The Virgo SC is one of about 10 million superclusters in the observable universe and is in the Pisces--Cetus Supercluster Complex, a galaxy filament. Question: is the milky way in a galaxy cluster?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The show had been sold to the network using the pitch ``hip parents, square kids.'' Originally, Elyse and Steven were intended to be the main characters. However, the audience reacted so positively to Alex during the taping of the fourth episode that he became the focus on the show. Fox had received the role after Matthew Broderick turned it down. Question: was family ties filmed in front of a live audience?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Although the censorship affects the whole nation, it does not affect China's special administrative regions such as Hong Kong and Macau. This is because these regions enjoys a high degree of autonomy, as specified in local laws and the ``One country, two systems'' principle. Nevertheless, it was reported that the central government authorities has been closely monitoring the Internet use in these regions. Question: is hong kong inside the great firewall of china?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In Australia, all jurisdictions except the Northern Territory allow altruistic surrogacy; with commercial surrogacy being a criminal offense. The Northern Territory has no legislation governing surrogacy. In New South Wales, Queensland and the Australian Capital Territory it is an offence to enter into international commercial surrogacy arrangements with potential penalties extending to imprisonment for up to one year in Australian Capital Territory, up to two years imprisonment in New South Wales and up to three years imprisonment in Queensland. Question: can you have a surrogate mother in australia?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Eeyore (/\u02c8i\u02d0\u0254\u02d0r/ ( listen) EE-or) is a character in the Winnie-the-Pooh books by A.A. Milne. He is generally characterized as a pessimistic, gloomy, depressed, anhedonic, old grey stuffed donkey who is a friend of the title character, Winnie-the-Pooh. Question: is eeyore from winnie the pooh a donkey?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Major Crimes is an American television police procedural series starring Mary McDonnell. It is a continuation spin-off of The Closer, set in the same police division. It premiered on TNT August 13, 2012, following The Closer's finale. Question: is the closer and major crimes the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: As addressed within Rule 9.02(a)(1) of the Official Baseball Rules a sacrifice fly is not counted as a time at bat for the batter, though the batter is credited with a run batted in. Question: is a sacrifice fly count as an at bat?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a crucial detail missing from the student's answer. While it is true that a sacrifice fly is not counted as a time at bat for the batter, it is also true that the batter is credited with a run batted in. Therefore, the score is: 60/100"
    },
    {
        "question": "Passage: The .223 Remington (.223 Rem) is a rifle cartridge. It started as the .222 Special and was renamed .223 Remington. It is commercially loaded with 0.224 inch (5.7 mm) diameter jacketed bullets, with weights ranging from 40 to 85 grains (2.6 to 5.8 g), with the most common loading by far being 55 grains (3.6 g). Ninety and ninety-five grain Sierra Matchking bullets are available for reloaders. The .223 Rem was first offered to the civilian sporting market in December 1963 in the Remington 760 rifle. In 1964 the .223 Rem cartridge was adopted for use in the Colt M16 rifle which became an alternate standard rifle of the U.S. Army. The military version of the cartridge uses a 55 gr full metal jacket boat tail design and was designated M193. In 1980 NATO modified the .223 Remington into a new design which is designated 5.56\u00d745mm NATO type SS109. Question: can you use .224 bullets in a .223?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: ME to WE is a socially conscious lifestyle brand, with half of its annual net profits donated to Free The Children, now known as WE Charity, and the other half reinvested to keep the social enterprise sustainable. The enterprise has been noted in Canadian media for setting new standards of governance in the social enterprise field. Question: is me to we a non profit organization?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Bank of Nova Scotia (French: La Banque de Nouvelle-\u00c9cosse), operating as Scotiabank with the CEO of Brian J. Porter (French: Banque Scotia), is a Canadian multinational bank. It is the third largest bank in Canada by deposits and market capitalization. It serves more than 24 million customers in over 50 countries around the world and offers a range of products and services including personal and commercial banking, wealth management, corporate and investment banking. With a team of more than 88,000 employees and assets of $915 billion (as at October 31, 2017), Scotiabank trades on the Toronto (TSX: BNS) and New York Exchanges (NYSE: BNS). Question: is scotiabank and bank of nova scotia the same?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: According to the article 20.20 of the Offences Code of Russia, drinking in a place where it is forbidden by the federal law is punishable with a fine of 500 to 1500 rubles. The article 16 of the Federal Law #171-FZ ``About the State Regulation of Production and Trade of Ethanol, Alcoholic and Ethanol-containing Products and about Restriction of Alcoholic Products Consumption (Drinking)'' forbids drinking in almost all public places (including entrance halls, staircases and elevators of living buildings) except bars, restaurants or other similar establishments where it is permitted to sell alcoholic products for immediate consumption. Question: can you drink in the street in russia?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Remember the Titans is a 2000 American biographical sports drama film produced by Jerry Bruckheimer and directed by Boaz Yakin. The screenplay, written by Gregory Allen Howard, is based on the true story of African-American coach Herman Boone, portrayed by Denzel Washington, and his attempt to integrate the T.C. Williams High School football team in Alexandria, Virginia, in 1971. Will Patton portrays Bill Yoast, Boone's assistant coach. Real-life athletes Gerry Bertier and Julius Campbell are portrayed by Ryan Hurst and Wood Harris, respectively. Question: is remember the titans based on a book?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Encino is a neighborhood in the San Fernando Valley region of Los Angeles, California, United States. Question: is encino in the city of los angeles?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Most municipalities, including St. Louis and Kansas City have enacted local laws following the state law, which prohibit the retail sale of liquor between 1:30 AM and 6:00 AM Tuesday through Saturday, and between midnight on Sunday and 9:00 AM the following morning. Question: can you buy alcohol on sunday in mo?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A blighted ovum or anembryonic gestation is characterized by a normal-appearing gestational sac, but the absence of an embryo. It likely occurs as a result of early embryonic death with continued development of the trophoblast. When small, the sac cannot be distinguished from the early normal pregnancy, as there may be a yolk sac, though a fetal pole is not seen. For diagnosis, the sac must be of sufficient size that the absence of normal embryonic elements is established. The criteria depends on the type of ultrasound exam performed. A pregnancy is anembryonic if a transvaginal ultrasound reveals a sac with a mean gestational sac diameter (MGD) greater than 25 mm and no yolk sac, or an MGD >25 mm with no embryo. Transabdominal imaging without transvaginal scanning may be sufficient for diagnosing early pregnancy failure when an embryo whose crown--rump length is 15 mm or more has no visible cardiac activity. Question: can a blighted ovum have a yolk sac?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There\\n"
    },
    {
        "question": "Passage: As a performance shoe, the AF1 is still used for street play as well as for professional play. NBA players Jerry Stackhouse (who now wears Adidas) and Rasheed Wallace have worn AF1s on court. Question: are nike air force 1 good for basketball?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is not enough information in the passage to determine if the Nike Air Force 1 is good for basketball. The passage states that NBA players have worn AF1s on court, but it does not provide any information on the performance or quality of the shoes for basketball. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Artificial turf first gained substantial attention in the 1960s, when it was used in the newly constructed Astrodome. The specific product used was developed by Monsanto and called AstroTurf; this term since then became a generic trademark for any artificial turf throughout the late 20th century. AstroTurf remains a registered trademark but is no longer owned by Monsanto. The first generation turf systems (i.e., short-pile fibers without infill) of the 1960s have been largely replaced by the second generation and third generation turf systems. Second generation synthetic turf systems feature longer fibers and sand infills, and third generation systems, which are most widely used today, offer infills that are mixtures of sand and granules of recycled rubber. Question: is astro turf the same as artificial grass?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The rules of football state that a player running on the field with the ball must take a running bounce at least once every fifteen metres. If they run too far without taking a running bounce, the umpire pays a free kick for running too far to the opposition at the position where the player oversteps his limit. The umpire signals ``running too far'' by rolling their clenched fists around each other -- similar to false starts in American football or traveling in basketball. Question: do you have to bounce the ball in rugby?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Scratch hardness is the measure of how resistant a sample is to fracture or permanent plastic deformation due to friction from a sharp object. The principle is that an object made of a harder material will scratch an object made of a softer material. When testing coatings, scratch hardness refers to the force necessary to cut through the film to the substrate. The most common test is Mohs scale, which is used in mineralogy. One tool to make this measurement is the sclerometer. Question: can a soft material scratch a hard material?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a possibility that a soft material can scratch a hard material if the hard material is softened by heat or another factor, but it is less likely compared to a harder material scratching a softer material. Therefore the score is: 70"
    },
    {
        "question": "Passage: Transmission between humans is extremely rare, although it can happen through organ transplants, or through bites. Question: can a person transmit rabies to another person?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is not enough information to determine if the student's answer is correct or not. Please provide the context or passage from which the student's answer was derived."
    },
    {
        "question": "Passage: Wendy Torrance in the film is relatively meek, submissive, passive, gentle, and mousy; this is shown by the way she defends Jack even in his absence to the doctor examining Danny. It is implied she has perhaps been abused by him as well. In the novel, she is a far more self-reliant and independent personality who is tied to Jack in part by her poor relationship with her parents. In the novel, she never displays hysteria or collapses the way she does in the film, but remains cool and self-reliant. Writing in Hollywood's Stephen King, author Tony Magistrale writes about the mini-series remake: Question: did they do a remake of the shining?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In popular usage, the Mason--Dixon line symbolizes a cultural boundary between the North and the South (Dixie). Originally ``Mason and Dixon's Line'' referred to the border between Pennsylvania and Maryland. After Pennsylvania abolished slavery, it served as a demarcation line for the legality of slavery. That demarcation did not extend beyond Pennsylvania because Delaware, then a slave state, extended north and east of the boundary. Also lying north and east of the boundary was New Jersey, where slavery was formally abolished in 1846, but former slaves continued to be ``apprenticed'' to their masters until the passage of the Thirteenth Amendment to the United States Constitution in 1865. Question: does the mason dixon line go through nj?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The New Legends of Monkey is a television series inspired by Monkey, a Japanese production from the 1970s and 80s which garnered a cult following in New Zealand, Australia, the U.K. and South Africa. The Japanese production was based on the 16th century Chinese novel Journey to the West. The show is a co-production between ABC Me, TVNZ, and Netflix, and consists of ten episodes. The New Legends of Monkey premiered on 28 January 2018. Question: will the new legends of monkey have a season 2?",
        "pred_ans": " False\\n\\nAs of my knowledge is limited to my training data, The New Legends of Monkey has not been renewed for a second season. However, the series has been popular among its target audience, so there is a possibility that it could be renewed in the future.",
        "ground_truth": false,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: To date only two competitors, rock-climbers Isaac Caldiero and Geoff Britten, have finished the course and achieved ``Total Victory''. Caldiero is the only competitor to win the cash prize. The series premiered on December 12, 2009 on the now-defunct cable channel G4 and now airs on NBC with encore episodes airing on USA Network and NBCSN. Question: has there ever been a american ninja warrior?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Dominican Republic (Spanish: Rep\u00fablica Dominicana (re\u02c8pu\u03b2li\u02ccka \u00f0o\u02ccmini\u02c8kana)) is a sovereign state located in the island of Hispaniola, in the Greater Antilles archipelago of the Caribbean region. It occupies the eastern five-eighths of the island, which it shares with the nation of Haiti, making Hispaniola one of two Caribbean islands, along with Saint Martin, that are shared by two countries. The Dominican Republic is the second-largest Caribbean nation by area (after Cuba) at 48,445 square kilometers (18,705 sq mi), and third by population with approximately 10 million people, of which approximately three million live in the metropolitan area of Santo Domingo, the capital city. Question: is the dominican republic considered part of the united states?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The method was developed in the late 1940s by Willard Libby, who received the Nobel Prize in Chemistry for his work in 1960. It is based on the fact that radiocarbon ( C) is constantly being created in the atmosphere by the interaction of cosmic rays with atmospheric nitrogen. The resulting C combines with atmospheric oxygen to form radioactive carbon dioxide, which is incorporated into plants by photosynthesis; animals then acquire C by eating the plants. When the animal or plant dies, it stops exchanging carbon with its environment, and from that point onwards the amount of C it contains begins to decrease as the C undergoes radioactive decay. Measuring the amount of C in a sample from a dead plant or animal such as a piece of wood or a fragment of bone provides information that can be used to calculate when the animal or plant died. The older a sample is, the less C there is to be detected, and because the half-life of C (the period of time after which half of a given sample will have decayed) is about 5,730 years, the oldest dates that can be reliably measured by this process date to around 50,000 years ago, although special preparation methods occasionally permit accurate analysis of older samples. Question: can carbon 14 dating be used to detect the age of a live animal?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: ``I do not choose to run'' is President Calvin Coolidge's statement to the press on his decision not to run for the 1928 presidential election. The statement was ambiguous, and led to considerable debate as to the intentions of its language. Question: can a president choose not to run for reelection?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Whole brain emulation (WBE), mind upload or brain upload (sometimes called ``mind copying'' or ``mind transfer'') is the hypothetical futuristic process of scanning the mental state (including long-term memory and ``self'') of a particular brain substrate and copying it to a computer. The computer could then run a simulation model of the brain's information processing, such that it responds in essentially the same way as the original brain (i.e., indistinguishable from the brain for all relevant purposes) and experiences having a conscious mind. Question: is it possible to connect your brain to a computer?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Commonwealth government has its own tax laws and Puerto Ricans are also required to pay some US federal taxes, although most residents do not have to pay the federal personal income tax. In 2009, Puerto Rico paid $3.742 billion into the US Treasury. Residents of Puerto Rico pay into Social Security, and are thus eligible for Social Security benefits upon retirement. However, they are excluded from the Supplemental Security Income. Question: is federal income tax the same as social security?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There"
    },
    {
        "question": "Passage: Diffusion is the net movement of material from an area of high concentration to an area with lower concentration. The difference of concentration between the two areas is often termed as the concentration gradient, and diffusion will continue until this gradient has been eliminated. Since diffusion moves materials from an area of higher concentration to an area of lower concentration, it is described as moving solutes ``down the concentration gradient'' (compared with active transport, which often moves material from area of low concentration to area of higher concentration, and therefore referred to as moving the material ``against the concentration gradient''). However, in many cases (e.g. passive drug transport) the driving force of passive transport can not be simplified to the concentration gradient. If there are different solutions at the two sides of the membrane with different equilibrium solubility of the drug, the difference in degree of saturation is the driving force of passive membrane transport. It is also true for supersaturated solutions which are more and more important owing to the spreading of the application of amorphous solid dispersions for drug bioavailability enhancement. Question: does passive transport go against the concentration gradient?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Compared to patents, the advantages of trade secrets are that a trade secret is not limited in time (it ``continues indefinitely as long as the secret is not revealed to the public'', whereas a patent is only in force for a specified time, after which others may freely copy the invention), a trade secret does not imply any registration costs, has an immediate effect, does not require compliance with any formalities, and does not imply any disclosure of the invention to the public. The disadvantages of trade secrets include that ``others may be able to legally discover the secret and be thereafter entitled to use it'', ``others may obtain patent protection for legally discovered secrets'', and a trade secret is more difficult to enforce than a patent. Question: is there a time limit on trade secrets?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The First Amendment prohibits the making of any law respecting an establishment of religion, impeding the free exercise of religion, abridging the freedom of speech, infringing on the freedom of the press, interfering with the right to peaceably assemble or prohibiting the petitioning for a governmental redress of grievances. Initially, the First Amendment applied only to laws enacted by Congress, and many of its provisions were interpreted more narrowly than they are today. Question: is the first amendment in the bill of rights?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no need to provide a score for this question as the student's answer is correct according to the passage in the question. The First Amendment is indeed in the Bill of Rights."
    },
    {
        "question": "Passage: The rushing sound that one hears is in fact the noise of the surrounding environment, resonating within the cavity of the shell. The same effect can be produced with any resonant cavity, such as an empty cup or even by simply cupping one's hand over one's ear. The similarity of the noise produced by the resonator to that of the oceans is due to the resemblance between ocean movements and airflow. Question: can you really hear the ocean in a seashell?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Just as before the 'Mazda 626 was renamed to Mazda6 Atenza, Ford continues to use the Mazda's G-series platform for the basis of a number of its CD3 platform coded vehicles, including the Ford Fusion, Mercury Milan, Lincoln Zephyr/MKZ, Lincoln MKX, and a range of SUVs and minivans. Ford also plans to offer a hybrid powertrain on the platform. The official Mazda chassis codes are GG (saloon/hatch) and GY (estate) series -- following the 626/Capella in its GF/GW series. Question: is the ford fusion and mazda 6 the same car?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: Whether or not they were socially tolerated or accepted by any particular culture, the existence of intersex people was known to many ancient and pre-modern cultures. The Greek historian Diodorus Siculus wrote of ``hermaphroditus'' in the first century BCE that Hermaphroditus ``is born with a physical body which is a combination of that of a man and that of a woman'', and with supernatural properties. Question: can you be born with both female and male parts?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is not enough information in the passage to determine if the student's answer is correct. The passage only states that intersex people exist, and it mentions a mythological character, Hermaphroditus, who is described as having physical traits of both man and woman. However, it does not provide any concrete evidence or information about people being born with both female and male parts. Therefore, the score is: 50"
    },
    {
        "question": "Passage: One morning, Marley and Sarah sit in the park. Marley confesses to Sarah that she is no longer afraid to die, even though Sarah is terrified at the prospect of having to live without her. Marley drifts off and Sarah is unable to wake her. Knowing it's almost time, Marley is visited by everyone in the hospital who surround her bed and wait with her until the end. The exception is Renee, who is in labor with her son. Julian makes it in the nick of time to say goodbye and, in that final moment, Marley realizes her third wish was to fall in love. The film ends with Marley's funeral, a colorful celebration of her life with all of her friends and family. Question: does marley die in a little bit of heaven?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Ross Stores, Inc. is an American chain of off-price department stores headquartered in Dublin, California, officially operating under the brandname, Ross Dress for Less. It is the largest off-priced retailer in the U.S. As of August 2015, Ross operates 1,412 stores in 37 U.S. states, the District of Columbia and Guam, covering much of the country, but with no presence in New England, New York, northern New Jersey, Alaska, and areas of the Midwest. Question: is ross dress for less a thrift store?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: During oxidative phosphorylation, electrons are transferred from electron donors to electron acceptors such as oxygen, in redox reactions. These redox reactions release energy, which is used to form ATP. In eukaryotes, these redox reactions are carried out by a series of protein complexes within the inner membrane of the cell's mitochondria, whereas, in prokaryotes, these proteins are located in the cells' intermembrane space. These linked sets of proteins are called electron transport chains. In eukaryotes, five main protein complexes are involved, whereas in prokaryotes many different enzymes are present, using a variety of electron donors and acceptors. Question: is the electron transport chain and oxidative phosphorylation the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The 1988 team was the inspiration for the film Cool Runnings (1993). The characters in the film are fictional, although original footage of the crash during a qualifier is used during the film. The film's depiction of the post-crash rescue was changed to show the bobsledders carrying the sled over the line on their shoulders for dramatic effect. Question: did the jamaican bobsled team carry the sled?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Every team which had a player hit two grand slams won their milestone games. These games have resulted in other single-game MLB records being set due to the extreme offensive performance. Lazzeri, for example, proceeded to hit a third home run in the game and finished with a total of eleven runs batted in, an American League record. Fernando Tat\u00eds became the only player to hit two grand slams in the same inning, when he attained the milestone, slugging two in the third inning for the St. Louis Cardinals on April 23, 1999. In achieving the feat, he also set a new major league record with eight runs batted in in a single inning. Question: has anyone ever hit 2 grand slams in one inning?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Historical records date the introduction of cats to Australia at around 1804 and that cats first became feral around Sydney by 1820. In the early 1900s, concern was expressed at the pervasiveness of the cat problem Question: are domestic and feral cats native to australia?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: All burgers consist of zero (in the case of a 'grilled cheese') or more 2 oz (57 g) beef patties cooked to ``medium-well'', and served on a toasted bun. The standard style of burger includes tomato, hand-leafed lettuce and ``spread'', a sauce similar to Thousand Island dressing. Question: do in n out burgers come with pickles?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: 2016 was the first time in nearly 70 years that a full moon and the Northern Hemisphere's summer solstice occurred on the same day. The 2016 summer solstice's full moon rose just as the sun set. Question: is there always a full moon on summer solstice?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: ``Separation of church and state'' is paraphrased from Thomas Jefferson and used by others in expressing an understanding of the intent and function of the Establishment Clause and Free Exercise Clause of the First Amendment to the Constitution of the United States which reads: ``Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof...'' Question: does the first amendment separate church and state?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The United States Open Championship, commonly known as the U.S. Open, is the annual open national championship of golf in the United States. It is the second of the four major championships in golf, and is on the official schedule of both the PGA Tour and the European Tour. Since 1898 the competition has been 72 holes of stroke play (4 rounds on an 18-hole course), with the winner being the player with the lowest total number of strokes. It is staged by the United States Golf Association (USGA) in mid-June, scheduled so that, if there are no weather delays, the final round is played on the third Sunday, which is Father's Day. The U.S. Open is staged at a variety of courses, set up in such a way that scoring is very difficult, with a premium placed on accurate driving. Question: is the us open played on the same course every year?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: In some areas, lemonade stands are usually in technical violation of several laws, including operation without a business license, lack of adherence to health codes, and sometimes child labor laws. Question: do you have to have a license to have a lemonade stand?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In Florida, two bills were approved in January 2018 by House and Senate committees, to move most of the state permanently to Atlantic Standard Time (with the panhandle moving to year-round Eastern Standard Time) with no observation of daylight saving time. Question: is eastern standard time the same as atlantic standard time?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There are two time zones mentioned in the passage: Eastern Standard Time and Atlantic Standard Time. The passage states that Florida will be moving from Eastern Standard Time to Atlantic Standard Time, with the exception of the panhandle region which will remain on Eastern Standard Time year-round.\\n\\nTherefore the score is: 50"
    },
    {
        "question": "Passage: From the 1930 FIFA World Cup to the 1938 FIFA World Cup, Serbia was part of the Kingdom of Yugoslavia, and afterwards, from the 1950 FIFA World Cup to the 1990 FIFA World Cup, Serbia was part of the Socialist Federal Republic of Yugoslavia, both of which competed in the world cup. From the 1994 FIFA World Cup to the 2006 FIFA World Cup, Serbia played with Montenegro. From 2006 to the present, Serbia played as an independent country. After the dissolution of the SFRY, Serbia has not made it to the knockout stages. Question: has serbia ever won the fifa world cup?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Nigeria have appeared in the finals of the FIFA World Cup on six occasions, the first being in 1994 where they reached the second round. Their sixth and most recent appearance at the finals was the 2018 FIFA World Cup in Russia. Question: has nigeria ever won the fifa world cup before?",
        "pred_ans": " Answer: False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: ``God Save the Queen'' (alternatively ``God Save the King'', depending on the gender of the reigning monarch) is the national or royal anthem in a number of Commonwealth realms, their territories, and the British Crown dependencies. The author of the tune is unknown, and it may originate in plainchant; but an attribution to the composer John Bull is sometimes made. Question: does the uk national anthem change if there is a king?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A routing number consists of a five digit transit number (also called branch number) identifying the branch where an account is held and a three digit financial institution number corresponding to the financial institution. The number is given as one of the following forms, where XXXXX is the transit number and YYY is the financial institution number: Question: is the routing number the same as transit number?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Brown recluse spiders build asymmetrical (irregular) webs that frequently include a shelter consisting of disorderly thread. They frequently build their webs in woodpiles and sheds, closets, garages, plenum spaces, cellars, and other places that are dry and generally undisturbed. When dwelling in human residences they seem to favor cardboard, possibly because it mimics the rotting tree bark which they inhabit naturally. Human-recluse contact often occurs when such isolated spaces are disturbed and the spider feels threatened. Unlike most web weavers, they leave these lairs at night to hunt. Males move around more when hunting than the females, which tend to remain nearer to their webs. Question: do brown recluse spiders live in the woods?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a mention in the passage that brown recluse spiders frequently build their webs in woodpiles and sheds, closets, garages, plenum spaces, cellars, and other places that are dry and generally undisturbed. These can be found in the woods. Therefore the score is: 80"
    },
    {
        "question": "Passage: The U.S. Passport Card is the de facto national identification card of the United States and a limited travel document issued by the federal government of the United States in the size of a credit card. Like a passport book, the passport card is only issued to U.S. citizens and U.S. nationals exclusively by the U.S. Department of State and is compliant to the standards for identity documents set by the REAL ID Act and can be used as proof of U.S. citizenship. The passport card's intended primary purpose is for identification and to allow cardholders to travel by domestic air flights within the United States and to enter or exit the United States via land and sea between member states of the Western Hemisphere Travel Initiative (WHTI). However, the passport card cannot be used for international air travel. Question: can a passport card be used for domestic flights?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. The passage states that the passport card is compliant with the REAL ID Act and can be used as proof of U.S. citizenship, but it is not explicitly mentioned that the card can be used for domestic flights. However, the intended primary purpose of the passport card is for identification and to allow cardholders to travel by domestic air flights within the United States. Therefore, the student's answer should be considered partially correct, and I would give them a score of 70 out of 100."
    },
    {
        "question": "Passage: The optic nerve is the second of twelve paired cranial nerves and is technically part of the central nervous system, rather than the peripheral nervous system because it is derived from an out-pouching of the diencephalon (optic stalks) during embryonic development. As a consequence, the fibers of the optic nerve are covered with myelin produced by oligodendrocytes, rather than Schwann cells of the peripheral nervous system, and are encased within the meninges. Peripheral neuropathies like Guillain--Barr\u00e9 syndrome do not affect the optic nerve. However, most typically the optic nerve is grouped with the other eleven cranial nerves and considered to be part of the peripheral nervous system. Question: is the eyes part of the nervous system?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no mention in the passage about whether the eyes are part of the nervous system. The passage only discusses the optic nerve, which is a part of the central nervous system, and how it is different from the peripheral nervous system. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Make It or Break It was created by Holly Sorensen who, along with Paul Stupin and John Ziffren, served as the show's executive producers. The stunt doubles were former elite, Olympian or NCAA champion gymnasts. Question: are the make it or break it cast real gymnasts?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to determine if the Make It or Break It cast are real gymnasts. The passage only mentions that the stunt doubles were former elite, Olympian, or NCAA champion gymnasts. Therefore the score is: 50"
    },
    {
        "question": "Passage: The common bile duct, sometimes abbreviated CBD, is a duct in the gastrointestinal tract of organisms that have a gall bladder. It is formed by the union of the common hepatic duct and the cystic duct (from the gall bladder). It is later joined by the pancreatic duct to form the ampulla of Vater. There, the two ducts are surrounded by the muscular sphincter of Oddi. Question: is the cystic duct the same as the common bile duct?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Vena amoris is a Latin name meaning, literally, ``vein of love''. Traditional belief established that this vein ran directly from the fourth finger of the left hand to the heart. This theory has been cited in western cultures as one of the reasons the engagement ring and/or wedding ring was placed on the fourth finger, or ``ring finger''. This traditional belief is factually inaccurate as all the fingers in the hand have a similar vein structure. Question: is it true that the ring finger is connected to the heart?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a traditional belief that the ring finger is connected to the heart, but this belief is factually inaccurate as all the fingers in the hand have a similar vein structure. Therefore the score is: 80"
    },
    {
        "question": "Passage: American College of Education is a regionally accredited, online college based in Indianapolis, Indiana, delivering online master's, doctorate and specialist degree programs, a bachelor's degree completion program, and graduate-level certificates in Education, Healthcare, and Nursing. Question: is american college of education accredited in new york?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to determine if American College of Education is accredited in New York or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The cities considered the global ``Big Four'' fashion capitals of the 21st century are Milan, London, New York and Paris. Question: is paris still the fashion capital of the world?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Preakness is the second leg in American thoroughbred racing's Triple Crown series and almost always attracts the Kentucky Derby winner, some of the other horses that ran in the Derby, and often a few horses that did not start in the Derby. The Preakness is \u200b1 \u2044 miles, or \u200b9 \u2044 furlongs (1.88km), compared to the Kentucky Derby, which is \u200b1 \u2044 miles / 10 furlongs (2km). It is followed by the third leg, the Belmont Stakes, which is \u200b1 \u2044 miles / 12 furlongs (2.4km). Question: is the preakness the same distance as the kentucky derby?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The film was adapted by Wesley Strick from the original screenplay by James R. Webb, which was an adaptation from the novel The Executioners by John D. MacDonald. Question: is cape fear based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Ascension is a 2014 Canadian/American science fiction mystery drama television miniseries which aired on CBC in Canada and Syfy in the United States. It consists of six 43 minute episodes. The show was created by Philip Levens and Adrian A. Cruz. The pilot was written and executive produced by Philip Levens, who served as the showrunner. Question: is the tv show ascension based on a book?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The theme music that accompanies the title sequence was composed by Ramin Djawadi. The production team showed the title sequence they were working on to Djawadi, who was then inspired to create the music for the Game of Thrones Theme, and finished the theme music three days later. Djawadi said the show runners Benioff and Weiss wanted the theme music to be about a journey that reflects the variety of locations and characters in the show. Question: did the game of thrones theme song change?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Deep ocean water (DOW) is the name for cold, salty water found deep below the surface of Earth's oceans. Ocean water differs in temperature and salinity. Warm surface water is generally saltier than the cooler deep or polar waters; in polar regions, the upper layers of ocean water are cold and fresh. Deep ocean water makes up about 90% of the volume of the oceans. Deep ocean water has a very uniform temperature, around 0-3 \u00b0C, and a salinity of about 3.5% or as oceanographers state as 35 ppt (parts per thousand). Question: is it saltier at the bottom of the ocean?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to determine if the water at the bottom of the ocean is saltier than the water at the surface. The passage states that warm surface water is generally saltier than the cooler deep or polar waters, but it does not provide any comparison between the water at the bottom of the ocean and the water at the surface. Therefore the score is: 50"
    },
    {
        "question": "Passage: Bounty hunters may run into serious legal problems if they try to apprehend fugitives outside the United States, where laws treat the re-arrest of any fugitive by private persons as kidnapping, or the bail agent may incur the punishments of some other serious crime if local and international laws are broken by them. While the United States government generally allows the activities of bounty hunters within the United States, the governments in other sovereign nations consider them a felony. Question: is it legal to be a bounty hunter in the us?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Following a major overhaul in the summer of 2016, each episode is now a single 60 minute transmission, including advertisements. The series was also moved to 9:00pm on Mondays, to allow for grittier storylines, as the series is now post-watershed. A special-double episode was broadcast on 9 January 2017 as a single 120 minute transmission. This episode was co-written by actor Shaun Williamson. The second series was broadcast in the UK between 17 July 2017 and 8 September 2017, with Series 3 scheduled for this year. Question: is there a season 4 for red rock?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The casting of actors not known for their singing abilities led to some mixed reviews. Variety stated that ``some stars, especially the bouncy and rejuvenated Streep, seem better suited for musical comedy than others, including Brosnan and Skarsg\u00e5rd.'' Brosnan, especially, was savaged by many critics: his singing was compared to ``a water buffalo'' (New York Magazine), ``a donkey braying'' (The Philadelphia Inquirer) and ``a wounded raccoon'' (The Miami Herald), and Matt Brunson of Creative Loafing Charlotte said he ``looks physically pained choking out the lyrics, as if he's being subjected to a prostate exam just outside of the camera's eye.'' Question: does everyone do their own singing in mamma mia?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The section of I-30 between Dallas and Fort Worth is designated the Tom Landry Highway in honor of the long-time Dallas Cowboys coach. Though I-30 passed well south of Texas Stadium, the Cowboys' former home, their new stadium in Arlington, Texas is near I-30. However, the freeway designation was made before Arlington voted to build Cowboys Stadium. This section was previously known as the Dallas-Fort Worth Turnpike, which preceded the Interstate System. Although tolls had not been collected for many years, it was still known locally as the Dallas-Fort Worth Turnpike until receiving its present name. The section from downtown Dallas to Arlington was recently widened to over 16 lanes in some sections, by 2010. From June 15, 2010, through February 6, 2011, this 30-mile (48 km) section of I-30 was temporarily designated as the ``Tom Landry Super Bowl Highway'' in commemoration of Super Bowl XLV which was played at Cowboys Stadium. Question: is i 30 in dallas a toll road?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A martingale is any of a class of betting strategies that originated from and were popular in 18th century France. The simplest of these strategies was designed for a game in which the gambler wins his stake if a coin comes up heads and loses it if the coin comes up tails. The strategy had the gambler double his bet after every loss, so that the first win would recover all previous losses plus win a profit equal to the original stake. The martingale strategy has been applied to roulette as well, as the probability of hitting either red or black is close to 50%. Question: can you keep doubling your bet on roulette?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: It is important to note that mithridatism is not effective against all types of poison (immunity generally is only possible with biologically complex types which the immune system can respond to) and, depending on the toxin, the practice can lead to the lethal accumulation of a poison in the body. Results depend on how each poison is processed by the body, ie, on how the toxic compound is metabolized or passed out of the body. In some cases, it is possible to build up tolerance against specific non-biological poisons. For some poisons, this involves conditioning the liver to produce more of the particular enzymes that deal with these poisons (for example alcohol). Another mechanism involves conditioning the target tissues of the poisons. These methods do not work for all non-biological poisons. Exposure to certain toxic substances, such as hydrofluoric acid and heavy metals, is either lethal or has little to no effect, and thus cannot be used in this way at all. Arsenic is a notable exception with some people actually having a genetic adaptation granting them higher resistance which can be replicated with mithridatism. In addition, simple toxins that work through chemical processes that bypass the immune system cannot be dealt with (good example would be the variants of cyanide). Question: can you build up an immunity to arsenic?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The United States Post Office issued the Statue of Liberty Forever stamp on December 1, 2010. The stamp shows the replica of the Statue of Liberty (Liberty Enlightening the World) located at the New York-New York Hotel and Casino on the Las Vegas Strip rather than the original Statue of Liberty in New York. The error was not noticed until March 2011. The error was identified by Sunipix, a stock photo agency in Texas. Ten and a half billion of the error stamps were produced. The mistake is the largest run of an error on a postage stamp. Question: is the lady liberty stamp a forever stamp?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Pirates of the Caribbean is a Disney franchise encompassing numerous theme park attractions and a media franchise consisting of a series of films, and spin-off novels, as well as a number of related video games and other media publications. The franchise originated with the Pirates of the Caribbean theme ride attraction, which opened at Disneyland in 1967 and was one of the last Disney theme park attractions overseen by Walt Disney. Disney based the ride on pirate legends and folklore. Question: are the pirates of the caribbean movies based on books?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Xbox One gaming console has received updates from Microsoft since its launch in 2013 that enable it to play select games from its two predecessor consoles, Xbox and Xbox 360. On June 15, 2015, backward compatibility with supported Xbox 360 games became available to eligible Xbox Preview program users with a beta update to the Xbox One system software. The dashboard update containing backward compatibility was released publicly on November 12, 2015. On October 24, 2017, another such update added games from the original Xbox library. The following is a list of all backward compatible games on Xbox One under this functionality. Question: do xbox original games work on xbox one?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A conditional discharge is an order made by a criminal court whereby an offender will not be sentenced for an offence unless a further offence is committed within a stated period. Once the stated period has elapsed and no further offence is committed then the conviction may be removed from the defendant's record. Question: does a conditional discharge mean a criminal record?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A passphrase is a sequence of words or other text used to control access to a computer system, program or data. A passphrase is similar to a password in usage, but is generally longer for added security. Passphrases are often used to control both access to, and operation of, cryptographic programs and systems, especially those that derive an encryption key from a passphrase. The origin of the term is by analogy with password. The modern concept of passphrases is believed to have been invented by Sigmund N. Porter in 1982. Question: is a passphrase the same as a password?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 1969, the college separated from Baylor University and became an independent institution, which allowed it access to federal research funding, changing its name to Baylor College of Medicine. That same year, BCM negotiated with the Texas Legislature to double its class size in order to increase the number of physicians in Texas. Question: is baylor college of medicine related to baylor university?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Diamond cutting is the practice of changing a diamond from a rough stone into a faceted gem. Cutting diamond requires specialized knowledge, tools, equipment, and techniques because of its extreme difficulty. Question: can you cut a diamond into a different shape?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Situated on the continent of Antarctica, it is the site of the United States Amundsen--Scott South Pole Station, which was established in 1956 and has been permanently staffed since that year. The Geographic South Pole is distinct from the South Magnetic Pole, the position of which is defined based on the Earth's magnetic field. The South Pole is at the center of the Southern Hemisphere. Question: is antarctica the same as the south pole?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The team's only World Series appearance came in 1982. After winning the ALCS against the California Angels, the Brewers faced off against the St. Louis Cardinals in the World Series, losing 4--3. In 2011, the Brewers defeated the Arizona Diamondbacks to win the NLDS 3--2, but lost in the NLCS to the eventual World Series champion Cardinals 4--2. Question: did the brewers make it to the world series?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Celebratory gunfire (also called aerial firing or happy fire) is the shooting of a firearm into the air in celebration. It is culturally accepted in parts of the Balkans, the Middle East, the Central Asian region of Afghanistan, and the South Asian region of Pakistan. In regions such as Puerto Rico and continental areas of the United States it is practiced illegally, especially on holidays like New Year's Eve. Question: is it illegal to shoot in the air?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There are regions where celebratory gunfire is culturally accepted, but in regions such as Puerto Rico and continental areas of the United States, it is practiced illegally, especially on holidays like New Year's Eve. Therefore the score is: 70"
    },
    {
        "question": "Passage: At various times, it has been placed in the Procyonidae, Ursidae, with Ailuropoda (giant panda) in the Ailuropodinae (until this family was moved into the Ursidae), and into its own family, the Ailuridae. This uncertainty comes from difficulty in determining whether certain characteristics of Ailurus are phylogenetically conservative or are derived and convergent with species of similar ecological habits. Question: is the red panda is related to the giant panda?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Tax-Free Savings Account (TFSA, French: Compte d'\u00e9pargne libre d'imp\u00f4t or C\u00c9LI) is an account available in Canada and South Africa that provides tax benefits for saving. Investment income, including capital gains and dividends, earned in a TFSA is not taxed in most cases, even when withdrawn. Contributions to a TFSA are not deductible for income tax purposes, unlike contributions to a Registered Retirement Savings Plan (RRSP). Question: is a tax free savings account an investment?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Since its release, Pulp Fiction has been widely regarded as Tarantino's masterpiece, with particular praise singled out for its screenwriting. The film's self-reflexivity, unconventional structure, and extensive use of homage and pastiche have led critics to describe it as a touchstone of postmodern film. It is often considered a cultural watershed, with a strong influence felt not only in later movies that adopted various elements of its style, but in several other media as well. A 2008 Entertainment Weekly list named it the best film from 1983 to 2008 and the work has appeared on many critics' lists of the greatest films ever made. In 2013, Pulp Fiction was selected for preservation in the United States National Film Registry by the Library of Congress as being ``culturally, historically, or aesthetically significant''. Question: is pulp fiction based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Domestic sheep (Ovis aries) are quadrupedal, ruminant mammal typically kept as livestock. Like most ruminants, sheep are members of the order Artiodactyla, the even-toed ungulates. Although the name sheep applies to many species in the genus Ovis, in everyday usage it almost always refers to Ovis aries. Numbering a little over one billion, domestic sheep are also the most numerous species of sheep. An adult female sheep is referred to as a ewe (/ju\u02d0/), an intact male as a ram or occasionally a tup, a castrated male as a wether, and a younger sheep as a lamb. Question: is a ram and a lamb the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Along with his friend Alistair and Tess's coach Tink, Charlie takes a boat to find her. The following sunset, Charlie misses his game with Sam. As Charlie confesses his love for his departed sibling, Sam tells Charlie that he loves him back and moves on from the living world. He appears to Charlie as a shooting star in the sky to reveal Tess' location. The group finds Tess' wrecked boat along with her lying on the rocks. Charlie uses his body heat to keep Tess warm until they are found by the Coast Guard. Question: is tess carroll died in charlie st cloud?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Court of Appeal, the High Court, the Crown Court, the County Court, and the magistrates' courts are administered by Her Majesty's Courts and Tribunals Service, an executive agency of the Ministry of Justice. Question: is county court the same as magistrates court?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: NET10 Wireless also allows customers to bring AT&T, T-Mobile, Sprint, Verizon, or unlocked GSM phones to NET10 by buying a SIM card or activation kit and air time from the company. This program does not work with Blackberry and branded Straight Talk, SafeLink, TracFone, Total Wireless, and NET10 phones. Question: can i use a net10 card on a tracfone?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The ground squirrels are members of the squirrel family of rodents (Sciuridae) which generally live on or in the ground, rather than trees. The term is most often used for the medium-sized ground squirrels, as the larger ones are more commonly known as marmots (genus Marmota) or prairie dogs, while the smaller and less bushy-tailed ground squirrels tend to be known as chipmunks. Together, they make up the ``marmot tribe'' of squirrels, Marmotini, and the large and mainly ground squirrel subfamily Xerinae, and containing six living genera. Well-known members of this largely Holarctic group are the marmots (Marmota), including the American groundhog, the chipmunks, the susliks (Spermophilus), and the prairie dogs (Cynomys). They are highly variable in size and habitus, but most are remarkably able to rise up on their hind legs and stand fully erect comfortably for prolonged periods. They also tend to be far more gregarious than other squirrels, and many live in colonies with complex social structures. Most Marmotini are rather short-tailed and large squirrels, and the alpine marmot (Marmota marmota) is the largest living member of the Sciuridae, at 53--73 cm in length and weighing 5--8 kg. Question: is a chipmunk the same as a ground squirrel?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Thread seal tape (also known as PTFE tape or plumber's tape) is a polytetrafluoroethylene (PTFE) film tape commonly used in plumbing for sealing pipe threads. The tape is sold cut to specific widths and wound on a spool, making it easy to wind around pipe threads. It is also known by the genericized trademark Teflon tape; while Teflon is in fact identical to PTFE, Chemours (the trade-mark holders) consider this usage incorrect, especially as they no longer manufacture Teflon in tape form. Thread seal tape lubricates allowing for a deeper seating of the threads, and it helps prevent the threads from seizing when being unscrewed. The tape also works as a deformable filler and thread lubricant, helping to seal the joint without hardening or making it more difficult to tighten, and instead making it easier to tighten. Question: is thread seal tape and teflon tape the same?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Hypnagogia, also referred to as ``hypnagogic hallucinations'', is the experience of the transitional state from wakefulness to sleep: the hypnagogic state of consciousness, during the onset of sleep (for the transitional state from sleep to wakefulness see hypnopompic). Mental phenomena that may occur during this ``threshold consciousness'' phase include lucid thought, lucid dreaming, hallucinations, and sleep paralysis. However, sleep paralysis and lucid dreaming are separate sleep conditions that are sometimes experienced during the hypnagogic state. Question: can you start dreaming before you fall asleep?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The 1958 ill-fated Japanese expedition to Antarctica inspired the 1983 hit film Antarctica, of which Eight Below is a remake. Eight Below adapts the events of the 1958 incident, moved forward to 1993. In the 1958 event, fifteen Sakhalin Husky sled dogs were abandoned when the expedition team was unable to return to the base. When the team returned a year later, two dogs were still alive. Another seven were still chained up and dead, five were unaccounted for, and one died just outside Showa Station. Question: is the movie 8 below a true story?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: A property tax or millage rate is an ad valorem tax on the value of a property, usually levied on real estate. The tax is levied by the governing authority of the jurisdiction in which the property is located. This can be a national government, a federated state, a county or geographical region or a municipality. Multiple jurisdictions may tax the same property. This tax can be contrasted to a rent tax which is based on rental income or imputed rent, and a land value tax, which is a levy on the value of land, excluding the value of buildings and other improvements. Question: is land tax the same as property tax?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The Kentucky Derby /\u02c8d\u025c\u02d0rbi/, is a horse race that is held annually in Louisville, Kentucky, United States, on the first Saturday in May, capping the two-week-long Kentucky Derby Festival. The race is a Grade I stakes race for three-year-old Thoroughbreds at a distance of one and a quarter miles (2 km) at Churchill Downs. Colts and geldings carry 126 pounds (57 kilograms) and fillies 121 pounds (55 kilograms). Question: can a horse win the kentucky derby twice?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to indicate whether a horse can win the Kentucky Derby twice or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: After becoming the predominant defensive alignment in the late 1970s-early 1980s, the 3--4 defense declined in popularity over the next two decades, but experienced a resurgence in the 2000s among both professional and college football teams. As of 2017, NFL teams that regularly incorporate the 3--4 defensive alignment scheme as a base include the Green Bay Packers, Oakland Raiders, Los Angeles Rams, Pittsburgh Steelers, Baltimore Ravens, Arizona Cardinals, Indianapolis Colts, Kansas City Chiefs, New York Jets, Washington Redskins, Denver Broncos, Tennessee Titans, Houston Texans, and the Chicago Bears, who used the 3--4 as their base defense for the first time in 2015. Question: do the saints run a 3 4 defense?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: VictoriaPlum.com, a trading name of Victoria Plum Ltd, is an online bathroom retailer. The company traded under the name Victoria Plumb up until 21 July 2015, when it was rebranded as VictoriaPlum.com, in order to emphasise the exclusively online nature of the business. Question: is victoria plum the same as victorian plumbing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The season is set for release in 2019. Question: are they going to make a punisher season 2?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: On December 1, 2015, it was announced that the series was cancelled and would not be renewed for a fourth season. Question: will there be a cedar cove season 4?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: ``The Tortoise and the Hare'' is one of Aesop's Fables and is numbered 226 in the Perry Index. The account of a race between unequal partners has attracted conflicting interpretations. It is itself a variant of a common folktale theme in which ingenuity and trickery (rather than doggedness) are employed to overcome a stronger opponent. Question: is the tortoise and the hare a fairy tale?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Practice varies as to whether a vote can be considered unanimous if some voter abstains. In Robert's Rules of Order, a ``unanimous vote'' is not specifically defined, although an abstention is not counted as a vote regardless of the voting threshold. Also in this book, action could be taken by ``unanimous consent'', or ``general consent'', if there are no objections raised. However, unanimous consent may not necessarily be the same as a unanimous vote (see Not the same as unanimous vote). In either case, it does not take into account the members who were not present. Question: can you have a unanimous vote with an abstention?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no clear answer to the question \\\"can you have a unanimous vote with an abstention\\\" in the passage provided. Therefore the score is: 50"
    },
    {
        "question": "Passage: A drinking straw or drinking tube is a small pipe that allows its user to more conveniently consume a beverage. A thin tube of paper, plastic (such as polypropylene and polystyrene), or other material is used by placing one end in the mouth and the other in the beverage. A combination of muscular action of the tongue and cheeks reduces air pressure in the mouth and above the liquid in the straw, whereupon atmospheric pressure forces the beverage through the straw. Drinking straws can be straight or have an angle-adjustable bellows segment. Question: did straws used to be made of paper?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no mention of the history of drinking straws in the passage, so the student's answer cannot be accurately judged. The score cannot be determined based on the information provided."
    },
    {
        "question": "Passage: A live-action television adaptation was co-produced by Fuji TV (Japan) & Netflix (worldwide). Like the manga, the series is set in Tokyo and follows the relationships of the main characters from high school to university. Season one aired in 2016, and a second season aired in 2017 under the title Good Morning Call: Our Campus Days. According to the program's social media, there is currently discussion of a third season. Question: is there a third season of good morning call?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A soccer-specific stadium typically has amenities, dimensions and scale suitable for soccer in North America, including a scoreboard, video screen, luxury suites and possibly a roof. The field dimensions are within the range found optimal by FIFA: 110--120 yards (100--110 m) long by 70--80 yards (64--73 m) wide. These soccer field dimensions are wider than the regulation American football field width of 53 \u2044 yards (48.8 m), or the 65-yard (59 m) width of a Canadian football field. The playing surface typically consists of grass as opposed to artificial turf, as the latter is generally disfavored for soccer matches since players are more susceptible to injuries. However, some soccer specific stadiums, such as Portland's Providence Park and Creighton University's Morrison Stadium, do have artificial turf. Question: is a soccer stadium bigger than a football stadium?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Market economies range from minimally regulated ``free market'' and laissez-faire systems--where state activity is restricted to providing public goods and services and safeguarding private ownership--to interventionist forms where the government plays an active role in correcting market failures and promoting social welfare. State-directed or dirigist economies are those where the state plays a directive role in guiding the overall development of the market through industrial policies or indicative planning--which guides but does not substitute the market for economic planning--a form sometimes referred to as a mixed economy. Question: is free market and market economy the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: One scene that was cut from the film following test screenings was a post-credits scene featuring Deadpool travelling back in time to kill a baby Adolf Hitler. It was decided that the scene made audiences too ``squeamish'', which was not the feeling that the creative team wanted people to be leaving the film with. The film originally did not have any post- or mid- credits scenes, with the Hitler scene and the film's other time-traveling mid-credits scenes shot during additional photography. The latter came about when someone suggested the time travel device be used to fix real-world mistakes like Reynolds role in Green Lantern which the writers felt was ``the funniest idea ever, and what a great idea to end the movie''. Additional footage of the X-Force team was shot for the film's marketing to hide the fact that the majority of the X-Force are immediately killed as a joke in the film. Due to Deadpool's mask, the creative team was able to change the character's dialogue up to the film being officially completed; Reynolds took this opportunity to keep adding new jokes to the film as long as possible. Question: are there any post credit scenes in dead pool 2?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Bile is a digestive fluid made by the liver, stored in the gallbladder, and discharged into duodenum after food is ingested to aid in the digestion of fat. Normally, the pyloric sphincter prevents bile from entering the stomach. When the pyloric sphincter is damaged or fails to work correctly, bile can enter the stomach and then be transported into the esophagus as in gastric reflux. The presence of small amounts of bile in the stomach is relatively common and usually asymptomatic, but excessive refluxed bile causes irritation and inflammation. Question: is bile supposed to be in the stomach?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a small amount of bile in the stomach, but it is usually asymptomatic and does not cause any problems. However, excessive refluxed bile can cause irritation and inflammation. Therefore the score is: 80"
    },
    {
        "question": "Passage: Double jeopardy is a procedural defence that prevents an accused person from being tried again on the same (or similar) charges and on the same facts, following a valid acquittal or conviction. As described by the U.S. Supreme Court in its unanimous decision one of its earliest cases dealing with double jeopardy, ``the prohibition is not against being twice punished, but against being twice put in jeopardy; and the accused, whether convicted or acquitted, is equally put in jeopardy at the first trial.'' Question: can you be retried if new evidence is found?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: U.S. Highway 20 (US 20) is an east--west United States highway that stretches from the Pacific Northwest all the way to New England. The ``0'' in its route number indicates that US 20 is a coast-to-coast route. Spanning 3,365 miles (5,415 km), it is the longest road in the United States, and particularly from Idaho to Massachusetts, the route roughly parallels that of Interstate 90 (I-90), which is in turn the longest Interstate Highway in the U.S. There is a discontinuity in the official designation of US 20 through Yellowstone National Park, with unnumbered roads used to traverse the park. Question: is there an interstate that goes coast to coast?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The World Is Still Beautiful, also in subtitle as Still World Is Beautiful (Japanese: \u305d\u308c\u3067\u3082\u4e16\u754c\u306f\u7f8e\u3057\u3044, Hepburn: Soredemo Sekai wa Utsukushii, lit. ``Even So, The World is Beautiful'') is a Japanese manga series by Dai Shiina, serialized in Hakusensha's sh\u014djo manga magazine Hana to Yume from 2012. Soredemo Sekai wa Utsukushii was first published as a one-shot in the same magazine in 2009, with a second one-shot published in 2011. It has been collected in 14 tank\u014dbon volumes as of October 2016. An anime television series adaptation by Pierrot aired between April 6, 2014 and June 29, 2014 on NTV. Question: will there be a season 2 of soredemo sekai wa utsukushii?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Although not identical, the 7.62\u00d751mm NATO and the commercial .308 Winchester cartridges are similar enough that they can be loaded into rifles chambered for the other round, but the Winchester .308 cartridges are typically loaded to higher pressures than 7.62\u00d751mm NATO cartridges. Even though the Sporting Arms and Ammunition Manufacturers' Institute (SAAMI) does not consider it unsafe to fire the commercial round in weapons chambered for the NATO round, there is significant discussion about compatible chamber and muzzle pressures between the two cartridges based on powder loads and wall thicknesses on the military vs. commercial rounds. While the debate goes both ways, the ATF recommends checking the stamping on the barrel; if unsure, one can consult the maker of the firearm. Question: is 7.62 nato the same as 308 win?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Coldwater Creek is an American catalog and online retailer of women's apparel, accessories and home d\u00e9cor with one brick-and-mortar store as of spring 2018. Question: didn't coldwater creek go out of business?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The building opened in 1885 and was demolished 47 years later in 1931. Question: is the home insurance building still standing in chicago?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Charles Evans Hughes Sr. (April 11, 1862 -- August 27, 1948) was an American statesman, Republican politician, and the 11th Chief Justice of the United States. He was also the 36th Governor of New York, the Republican presidential nominee in the 1916 presidential election, and the 44th United States Secretary of State. Question: can a supreme court justice run for president?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Contrary to the popular belief that the lines and figures can only be seen from an aircraft, they are visible from the surrounding foothills and other high places. Question: can the nazca lines be seen from the ground?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no information given about the student's answer in the passage. Please provide the student's answer before I can evaluate it."
    },
    {
        "question": "Passage: The birth certificate is the initial identification document issued to parents shortly after the birth of their child. The birth certificate is typically issued by local governments, usually the city or county where a child is born. It is an important record, often called a ``feeder document,'' because it establishes U.S. citizenship through birthright citizenship, which is then used to obtain, or is the basis for, all other identity documents. By itself, the birth certificate is usually only considered proof of citizenship but not proof of identity, since it is issued without a photograph at birth, containing no identifying features. A birth certificate is normally produced along with proof of identity, such as a driver's license or the testimony of a third party (such as a parent), to establish identity or entitlement to a service. Question: does birth certificate count as a form of id?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: To avoid any future boycotts or controversy, FIFA began a pattern of alternation between the Americas and Europe, which continued until the 2002 FIFA World Cup in Asia. The system evolved so that the host country is now chosen in a vote by FIFA's Congress. This is done under an exhaustive ballot system. The decision is currently made roughly seven years in advance of the tournament, though the hosts for the 2022 tournament were chosen at the same time as those for the 2018 tournament. Question: does the winner of the world cup host it next time?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: While it is commonly done, removing one's letter from the letter jacket upon graduation is not firmly held as protocol. Many graduates keep the letter on the jacket after graduation as a symbol of accomplishment and school pride and commitment, especially with college lettermen. Question: can you wear your letterman after high school?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no clear answer to this question in the passage. While the passage does mention that many graduates keep the letter on the jacket after graduation, it does not explicitly state whether or not you can wear your letterman after high school. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Downregulation or upregulation of an RNA or protein may also arise by an epigenetic alteration. An epigenetic alteration can be permanent or semi-permanent in a somatic cell lineage. Such an epigenetic alteration can cause expression of the RNA or protein to no longer respond to an external stimulus. This occurs, for instance, during drug addiction or progression to cancer. Question: would drug tolerance be caused by upregulation and downregulation?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Color Wonder is a product made by Crayola, primarily intended for use by younger children, in which the special clear-ink marker only appears on the Color Wonder paper. Originally made with markers and paper, Color Wonder has also made specialty products including paints, etc. The Color Wonder products debuted in 1993. Color Wonder paints and fingerpaints, as well as Color Wonder coloring books of popular characters such as Disney Pixar's Cars and Disney Princess also exist. Question: do color wonder markers work on regular paper?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Archie Comics trademarked the term 'Bughead', the name created by fans of the relationship between Betty and Jughead in both comics and the CW Riverdale. Betty and Jughead are canon, romantically, so far only in the 'Riverdale' universe, though Archie Comics has introduced their sleuthing relationship and subsequent ship name (#bughead) into their run of Riverdale comics. Question: were betty and jughead together in the comics?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Last of Us is an action-adventure survival horror video game developed by Naughty Dog and published by Sony Computer Entertainment. It was released for the PlayStation 3 worldwide on June 14, 2013. Players control Joel, a smuggler tasked with escorting a teenage girl named Ellie across a post-apocalyptic United States. The Last of Us is played from a third-person perspective; players use firearms and improvised weapons, and can use stealth to defend against hostile humans and cannibalistic creatures infected by a mutated strain of the Cordyceps fungus. In the game's online multiplayer mode, up to eight players engage in cooperative and competitive gameplay. Question: is the last of us a ps4 exclusive?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A catch is legal if the ball is finally held by any fielder before it touches the ground. Runners may leave their bases the instant the first fielder touches the ball. A fielder may reach over a fence, a railing, a rope, or a line of demarcation to make a catch. He may jump on top of a railing or a canvas that may be in foul ground. Interference should not be called in cases where a spectator comes into contact with a fielder and a catch is not made if the fielder reaches over a fence, a railing, a rope. The fielder does so at his or her own risk. Question: can a baseball player catch a ball in the stands?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: ``Birthday Cake'' is a song by Barbadian recording artist Rihanna, from her sixth studio album, Talk That Talk (2011). After it leaked onto the internet, fans expressed interest in the track being included on Talk That Talk, but it was later revealed that the 1:18 (one minute, 18 seconds) length that leaked was in fact the final cut and was not being considered for inclusion on the album. Due to a high level of fan interest, the song was included on the album as an interlude. The full length version, also known as the official remix of the track, featuring Rihanna's ex-boyfriend Chris Brown, was premiered online on February 20, 2012, to coincide with Rihanna's 24th birthday. The song peaked in the top fifty. Question: is there a full version of birthday cake by rihanna?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In the series 11 finale Kepner tells Avery that she is leaving with Owen Hunt to serve as a trauma surgeon in the Army; it will help her grieve for their son. Avery lets her go and wonders how he can deal with his own grief. After discussions over the phone via Facetime, Kepner tells Avery that she is extending her service time. The sound of gunfire and explosions are heard at April's base camp, leaving her to quickly terminate the call. On Valentine's Day, Kepner returns to the hospital, where she and Avery embrace in the foyer. In season 12, their marriage begins to fall apart and they grow estranged. In episode 11, they file for a civil divorce. After their divorce is completed, Kepner reveals that she is pregnant with Avery's child. In Season 14 Jackson begins a relationship with Maggie Pierce. Question: does jackson avery die on grey's anatomy?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The dukedom was first conferred on 24 November 1801 upon Prince Augustus Frederick, the sixth son of King George III. He was made Baron Arklow and Earl of Inverness at the same time, also in the Peerage of the United Kingdom. The title became extinct upon Prince Augustus Frederick's death in 1843. Although he was survived by a son and daughter by Lady Augusta Murray, their marriage had been annulled for lack of royal permission under the Royal Marriages Act 1772, rendering the children illegitimate under English law and unable to inherit titles from their father. Both children by the annulled marriage died childless, rendering the issue of their inheritance moot. Question: has there been a duke of sussex before?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The first gameplay footage of the pre-alpha build for Need for Speed was revealed at EA's press conference at E3 on June 15, 2015. The E3 presentation shows a part of the story, followed by the customization of a Subaru BRZ which showed the new and improved customization system, and the 'action camera' which was later revealed to be one of the five different camera angles. There are five different gameplay types: Speed, Style, Crew, Build, and Outlaw where players can earn points for engaging in to progress in the game through five overlapping storylines. Need for Speed takes place in the fictional city of Ventura Bay and its surroundings which is based on Los Angeles. Question: does need for speed have a story mode?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: New York City is split up into five boroughs, which are the Bronx, Brooklyn, Manhattan, Queens, and Staten Island. Each borough has the same boundaries as a county of the state. The county governments were dissolved when the city consolidated in 1898, along with all city, town, and village governments within each county. The term borough was adopted to describe a unique form of governmental administration for each of the five fundamental constituent parts of the newly consolidated city. Question: is new york city and manhattan the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " \u9996\u5148\uff0c\u6211\u9700\u8981\u9605\u8bfb\u60a8\u63d0\u4f9b\u7684\u95ee\u9898\u548c\u5b66\u751f\u7684\u7b54\u6848\u3002\u7136\u540e\uff0c\u6211\u4f1a\u6839\u636e\u95ee\u9898\u4e2d\u7684\u6587\u7ae0\u5224\u65ad\u5b66\u751f\u7684\u7b54\u6848\u662f\u5426\u6b63\u786e\uff0c\u5e76\u7ed9\u51fa\u4e00\u4e2a 0 \u5230 100 \u4e4b\u95f4\u7684\u5206\u6570\u3002\\n\\n\u95ee\u9898\uff1a```Passage: New York City is split up into five boroughs, which are the Bronx, Brooklyn, Manhattan, Queens, and Staten Island. Each borough has the same boundaries as a county of the state. The county governments were dissolved when the city consolidated in 1898, along with all city, town, and village governments within each county. The term borough was adopted to describe a unique form of governmental administration for each of the five fundamental constituent parts of the newly consolidated city. Question: is new york city and manhattan the same thing?```\\n\\n\u5b66\u751f\u7684\u7b54\u6848\uff1a``` False```\\n\\n\u56e0\u6b64\uff0c\u6839\u636e\u6587\u7ae0\u5185\u5bb9\uff0c\u5b66\u751f\u7684\u7b54\u6848\u662f\u6b63\u786e\u7684\u3002\u56e0\u6b64\uff0c\u5206\u6570\u662f\uff1a100"
    },
    {
        "question": "Passage: A venomoid is a venomous snake that has undergone a surgical procedure to remove or inhibit the production of snake venom. This procedure has been used for venomous snakes kept as pets or used in public demonstrations in order to remove the risk of injury or death when handled. The removal of venom glands or fangs of exhibited animals may be by surgery or simple mutilation; some or all of these procedures have been considered illegal and unethical. Removal of fangs is uncommon, as snakes frequently regenerate teeth, and the more invasive procedure of removing the underlying maxillary bone would be fatal. Most venomoid procedures consist of either removing the venom gland itself, or severing the duct between the gland and the fang. However, the duct and gland have been known to regenerate, and supposedly ``safe'' snakes have killed mice and successfully envenomated humans. Question: can you remove the venom glands from a snake?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Red velvet cake is traditionally a red, red-brown, mahogany, maroon, crimson or scarlet colored chocolate layer cake, layered with white cream cheese or ermine icing. The cake is commonly served on Christmas or Valentine's Day. Common modern red velvet cake is made with red dye; the red color was originally due to non-Dutched, anthocyanin-rich cocoa. Question: is red velvet cake chocolate cake with red dye?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There are multiple parts to the student's answer. The first part, \\\"Red velvet cake is traditionally a red, red-brown, mahogany, maroon, crimson or scarlet colored chocolate layer cake,\\\" is correct according to the passage. However, the second part, \\\"Common modern red velvet cake is made with red dye; the red color was originally due to non-Dutched, anthocyanin-rich cocoa,\\\" is not accurate, as the passage states that the red color is due to non-Dutched, anthocyanin-rich cocoa, not red dye.\\n\\nTherefore the score is: 50/100"
    },
    {
        "question": "Passage: In the World Championship, O'Sullivan beat David Gilbert 10--7 in the first round. After the match, he refused to attend a mandatory press conference, and also refused to talk to the tournament broadcasters, the BBC. He received a formal warning from World Snooker, and was advised that further breaches of contract would lead to fines. In the second round, he lost 12--13 to Barry Hawkins, his first loss against Hawkins in 14 years and only the second time in 13 years that he had failed to reach the World Championship quarterfinals. Question: is ronnie o sullivan in the world championship 2016?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Czechoslovakia, or Czecho-Slovakia (/\u02cct\u0283\u025bko\u028aslo\u028a\u02c8v\u00e6ki\u0259, -k\u0259-, -sl\u0259-, -\u02c8v\u0251\u02d0-/; Czech and Slovak: \u010ceskoslovensko, \u010cesko-Slovensko), was a sovereign state in Central Europe that existed from October 1918, when it declared its independence from the Austro-Hungarian Empire, until its peaceful dissolution into the Czech Republic and Slovakia on 1 January 1993. Question: is the czech republic and czechoslovakia the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The pardon power of the President extends only to an offense recognizable under federal law. However, the governors of most of the 50 states have the power to grant pardons or reprieves for offenses under state criminal law. In other states, that power is committed to an appointed agency or board, or to a board and the governor in some hybrid arrangement (in some states the agency is merged with that of the parole board, as in the Oklahoma Pardon and Parole Board). Question: can the president pardon someone for a state crime?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: During Microsoft's E3 2015 press conference on June 15, 2015, Microsoft announced plans to introduce Xbox 360 backward compatibility on the Xbox One at no additional cost. Supported Xbox 360 games will run within an emulator and have access to certain Xbox One features, such as recording and broadcasting gameplay. Games do not run directly from discs. A ported form of the game is downloaded automatically when a supported game is inserted, while digitally-purchased games will automatically appear for download in the user's library once available. As with Xbox One titles, if the game is installed using physical media, the disc is still required for validation purposes. Question: do all xbox 360 games work on xbox one x?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: There were also Sam's Club locations in Canada, six located in Ontario, in which the last location closed in 2009. Question: are there any sam's clubs in canada?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A new engine is broken in by following specific driving guidelines during the first few hours of its use. The focus of breaking in an engine is on the contact between the piston rings of the engine and the cylinder wall. There is no universal preparation or set of instructions for breaking in an engine. Most importantly, experts disagree on whether it is better to start engines on high or low power to break them in. While there are still consequences to an unsuccessful break-in, they are harder to quantify on modern engines than on older models. In general, people no longer break in the engines of their own vehicles after purchasing a car or motorcycle, because the process is done in production. It is still common, even today, to find that an owner's manual recommends gentle use at first (often specified as the first 500 or 1000 kilometres or miles). But it is usually only normal use without excessive demands that is specified, as opposed to light/limited use. For example, the manual will specify that the car be driven normally, but not in excess of the highway speed limit. Question: do you have to break in new car engines?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 30"
    },
    {
        "question": "Passage: The Gold Award is often compared to the Eagle Scout in the Boy Scouts of America. Question: can a girl scout be an eagle scout?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is not enough information in the passage to determine if a girl scout can be an eagle scout. The passage only states that the Gold Award is often compared to the Eagle Scout in the Boy Scouts of America. It does not mention whether girls can be eagle scouts or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Gun show loophole is a political term in the United States referring to the sale of firearms by private sellers, including those done at gun shows, that are exempt from federal background check requirements. This is dubbed the private sale exemption or ``secondary market''. Question: do you get a background check at a gun show?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On 11 August 2015 the series was renewed for a third season by The CW. The third season premiered on 13 July 2016, now hosted by Alyson Hannigan. The fourth season premiered on 13 July 2017. The series was renewed for a fifth season. A special one-off April Fools episode was broadcast on 2 April 2018 as a precursor to the fifth season, and the fifth season premiered on 25 June 2018. Penn & Teller: Fool Us was renewed for a sixth season on 9 October 2018. Question: is penn and teller fool us still on?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Baby back ribs (also back ribs or loin ribs) are taken from the top of the rib cage between the spine and the spare ribs, below the loin muscle. They have meat between the bones and on top of the bones, and are shorter, curved, and sometimes meatier than spare ribs. The rack is shorter at one end, due to the natural tapering of a pig's rib cage. The shortest bones are typically only about 3 in (7.6 cm) and the longest is usually about 6 in (15 cm), depending on the size of the hog. A pig side has 15 to 16 ribs (depending on the breed), but usually two or three are left on the shoulder when it is separated from the loin. So, a rack of back ribs contains a minimum of eight ribs (some may be trimmed if damaged), but can include up to 13 ribs, depending on how it has been prepared by the butcher. A typical commercial rack has 10--13 bones. If fewer than 10 bones are present, butchers call them ``cheater racks''. Question: are back ribs same as baby back ribs?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In plane Euclidean geometry, a rhombus (plural rhombi or rhombuses) is a simple (non-self-intersecting) quadrilateral whose four sides all have the same length. Another name is equilateral quadrilateral, since equilateral means that all of its sides are equal in length. The rhombus is often called a diamond, after the diamonds suit in playing cards which resembles the projection of an octahedral diamond, or a lozenge, though the former sometimes refers specifically to a rhombus with a 60\u00b0 angle (see Polyiamond), and the latter sometimes refers specifically to a rhombus with a 45\u00b0 angle. Question: is a rhombus the same as a diamond?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: At the conclusion of the season 15 finale, Benson becomes the court-appointed custodial guardian of Noah Porter, an orphaned baby. The appointment is for a trial period of one year, with the option to apply for legal adoption at the end of that period. Although the year is rocky due to Noah's health issues and the demands of her job, Benson grows to love Noah and formally adopts him a year later. Question: did olivia from law and order have a baby?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In computer science, selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n) time complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and it has performance advantages over more complicated algorithms in certain situations, particularly where auxiliary memory is limited. Question: is selection sort more efficient than insertion sort?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. According to the passage, selection sort is generally performs worse than the similar insertion sort. Therefore the score is: 50"
    },
    {
        "question": "Passage: The changeover period during which the former currencies' notes and coins were exchanged for those of the euro lasted about two months, going from 1 January 2002 until 28 February 2002. The official date on which the national currencies ceased to be legal tender varied from member state to member state. The earliest date was in Germany, where the mark officially ceased to be legal tender on 31 December 2001, though the exchange period lasted for two months more. Even after the old currencies ceased to be legal tender, they continued to be accepted by national central banks for periods ranging from ten years to forever. Question: can you still use old 5 euro note?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The second part of the novel begins with Pi's family aboard the Tsimtsum, a Japanese freighter that is transporting animals from their zoo to North America. A few days out of port from Manila, the ship encounters a storm and sinks. Pi manages to escape in a small lifeboat, only to learn that the boat also holds a spotted hyena, an injured Grant's zebra, and an orangutan named Orange Juice. Much to the boy's distress, the hyena kills the zebra and then Orange Juice. A tiger has been hiding under the boat's tarpaulin: it's Richard Parker, who had boarded the lifeboat with ambivalent assistance from Pi himself some time before the hyena attack. Suddenly emerging from his hideaway, Richard Parker kills and eats the hyena. Question: was the tiger really on the boat in life of pi?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The distinction between ``doves'' and ``pigeons'' is not consistent. In modern everyday speech, as opposed to scientific usage or formal usage, ``dove'' frequently indicates a pigeon that is white or nearly white. However, some people use the terms ``dove'' and ``pigeon'' interchangeably. In contrast, in scientific and ornithological practice, ``dove'' tends to be used for smaller species and ``pigeon'' for larger ones, but this is in no way consistently applied. Historically, the common names for these birds involve a great deal of variation between the terms. The species most commonly referred to as ``pigeon'' is the species known by scientists as the rock dove, one subspecies of which, the domestic pigeon, is common in many cities as the feral pigeon. Question: is there a difference between doves and pigeons?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The format of the number is two prefix letters, six digits, and one suffix letter. The example used is typically QQ123456C. Often, the number is printed with spaces to pair off the digits, like this: QQ 12 34 56 C. Question: do all ni numbers have a letter at the end?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no information given in the passage to determine if all NI numbers have a letter at the end. The passage only provides an example of a number format, which includes a letter at the end, but it does not state that this is always the case for all NI numbers. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Due to the high profile of the last day as well as its setting, the stage is prestigious. The overall Tour placings are typically settled before the final stage so the racing is often for the glory of finishing the Tour and, at times, to settle the points classification. Question: is the last day of the tour de france a race?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In Saint Augustine, Louisiana, Josie is left at the altar by her fianc\u00e9 Liam. Eight years later, Liam is a successful country singer. The day after a concert in New Orleans, Liam learns that Mason, one of his groomsmen from the wedding, has been killed in a car accident. Liam returns to St. Augustine and attends Mason's funeral. Although Liam attempts to be discreet, Josie recognizes him. After Mason's burial, Josie approaches Liam and punches him in the stomach. Question: does anyone die in the movie forever my girl?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 2011, screenwriter Noxon told Collider.com that plans for an imminent sequel were shelved due to the disappointing performance of the first installment at the box office. Question: is there a movie after i am number 4?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: There were no further sequels, but the character of Alex Cross was rebooted with a 2012 film adaptation of the novel Cross under the title Alex Cross starring Tyler Perry in the titular role. Question: is there a sequel to along came a spider?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: Let There Be Light is a 2017 American Christian drama film directed by and starring Kevin Sorbo and written by Dan Gordon and Sam Sorbo. The plot follows an atheist who goes through a near-death experience in an auto accident and converts to Christianity. Sean Hannity executive produced and appears in the film. Dionne Warwick and Travis Tritt also have roles in the film. It was released in the United States on October 27, 2017. Question: is let there be light a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Unlike a four-stroke engine, whose crankcase is closed except for its ventilation system, a two-stroke engine uses the crankcase as part of the induction tract, and therefore, oil must be mixed with gasoline to be distributed throughout the engine for lubrication. The resultant mix is referred to as petroil. This oil is ultimately burned along with the fuel as a total-loss oiling system. This results in increased exhaust emissions, sometimes with excess smoke and/or a distinctive odor. Question: do i need to put oil in a 2 stroke engine?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a clear mention in the passage that oil must be mixed with gasoline in a two-stroke engine for lubrication. The student's answer of \\\"True\\\" correctly indicates that oil is needed in a 2-stroke engine.\\n\\nTherefore, the score is: 100"
    },
    {
        "question": "Passage: Broken heart (also known as a heartbreak or heartache) is a metaphor for the intense emotional--and sometimes physical--stress or pain one feels at experiencing great longing. The concept is cross-cultural, often cited with reference to a desired or lost lover, and dates back at least 3,000 years. Question: can you feel pain from a broken heart?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: South Africa have appeared in the FIFA World Cup on three occasions in 1998, 2002, and 2010. Question: has south africa qualified for the world cup?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Canadian law requires that all persons entering Canada must carry proof of both citizenship and identity. A valid U.S. passport or passport card is preferred, although a birth certificate, naturalization certificate, citizenship certificate, or another document proving U.S. nationality, together with a government-issued photo ID (such as a driver's license) are acceptable to establish identity and nationality. However, the documents required to return to the United States can be more restrictive (for example, a birth certificate and photo ID are insufficient) -- see the section below on Return entry into the U.S. Question: can i get into canada with a military id?",
        "pred_ans": " True. A military ID, together with a government-issued photo ID (such as a driver's license), can be used to establish identity and nationality when entering Canada. However, it is preferred to use a valid U.S. passport or passport card.",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: Nickelodeon signed Big Time Rush to a record deal in 2009 simultaneously with the television series, Big Time Rush. Then, Nickelodeon partnered with Columbia/Epic Label Group to produce the show and include the original music to the show. For the series, their debut single, ``Big Time Rush'', was released on November 29, 2009. Officially announced by Nickelodeon, the series was first broadcast in the U.S. in November 2009, until it was eventually released worldwide. It debuted during a one-hour special preview of the series and it is currently the show's opening theme. The series also saw the releases of other singles including ``City is Ours'' and ``Any Kind of Guy''. Big Time Rush also covered a Play song titled ``Famous''. The song was released on iTunes on June 29, 2010. Another song, ``Halfway There'', was released to iTunes on April 27, 2010, after its premiere on the series. The single soon became their first single to chart on the Billboard Hot 100, peaking at number 93 due to digital sales. Question: was big time rush a band before the show?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 2011, Philip Pullman remarked at the British Humanist Association annual conference that due to the first film's disappointing sales in the United States, there would not be any sequels made. Question: is there a sequel to the movie the golden compass?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Bull and Terrier is an extinct type of dog that was the progenitor of the American Pit Bull Terrier, American Staffordshire Terrier, English Bull Terrier and the Staffordshire Bull Terrier. Question: is an english bull terrier a pit bull?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The sentence should read \\\"True\\\" instead of \\\"False.\\\" Therefore the score is: 50/100"
    },
    {
        "question": "Passage: The idea for The Shape of Water formed during del Toro's breakfast with Daniel Kraus in 2011, with whom he later co-wrote the novel Trollhunters. It shows similarities to the 2015 short film The Space Between Us. It was also primarily inspired by del Toro's childhood memories of seeing Creature from the Black Lagoon and wanting to see the Gill-man and Kay Lawrence (played by Julie Adams) succeed in their romance. When del Toro was in talks with Universal to direct a remake of Creature from the Black Lagoon, he tried pitching a version focused more on the creature's perspective, where the Creature ended up together with the female lead, but the studio executives rejected the concept. Question: is the movie the shape of water based on a book?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The right to keep and bear arms is not legally or constitutionally protected in the United Kingdom. Most handguns, automatic, and centerfire semi-automatic weapons are illegal to possess without special proviso. Question: does britain have the right to bear arms?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Need for Speed: Carbon is a 2006 racing video game published by Electronic Arts. It is the tenth installment in the Need for Speed series and a sequel to Need for Speed: Most Wanted. It was followed by Need for Speed: Prostreet in the following year... Question: is nfs carbon a sequel to most wanted?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A characteristic property is a chemical or physical property that helps identify and classify substances. The characteristic properties of a substance are always the same whether the sample being observed is large or small. Examples of characteristic properties include freezing/melting point, boiling/condensing point, density, viscosity and solubility. Question: is boiling point a characteristic property of matter?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Relations between the European Union (EU) and Turkey were established in 1959 and the institutional framework is shaped formally since 1963 Ankara Agreement. Turkey is one of the EU's main partners in the Middle East and both are members of the European Union--Turkey Customs Union. The EU and Turkey have a common land border through the EU member states Bulgaria and Greece. Question: is turkey a part of the european union?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The corporate personhood aspect of the campaign finance debate turns on Buckley v. Valeo (1976) and Citizens United v. Federal Election Commission (2010): Buckley ruled that political spending is protected by the First Amendment right to free speech, while Citizens United ruled that corporate political spending is protected, holding that corporations have a First Amendment right to free speech. Question: do corporations have the same free speech rights as persons?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The John F. Kennedy Space Center (KSC, originally known as the NASA Launch Operations Center) is one of ten National Aeronautics and Space Administration field centers. Since December 1968, Kennedy Space Center has been NASA's primary launch center of human spaceflight. Launch operations for the Apollo, Skylab and Space Shuttle programs were carried out from Kennedy Space Center Launch Complex 39 and managed by KSC. Located on the east coast of Florida, KSC is adjacent to Cape Canaveral Air Force Station (CCAFS). The management of the two entities work very closely together, share resources, and even own facilities on each other's property. Question: is cape canaveral the same as the kennedy space center?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Pok\u00e9mon: Let's Go, Pikachu! and Pok\u00e9mon: Let's Go, Eevee! are upcoming role-playing video games (RPGs) developed by Game Freak and published by The Pok\u00e9mon Company and Nintendo for the Nintendo Switch. The games are the first installments of the main Pok\u00e9mon RPG series for the Nintendo Switch. They are enhanced remakes of the 1998 video game Pok\u00e9mon Yellow. They will also contain influences from Pok\u00e9mon Go, as well as integration with Go, and will support a new optional controller called the Pok\u00e9 Ball Plus. The games are scheduled to be released worldwide on November 16, 2018. Question: is pokemon lets go eevee a spin off?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: After Fox passed up the opportunity to add the series to its lineup, Game Show Network (GSN), in conjunction with ITV Studios America, picked up the series with an eight-episode order on April 9, 2013, and announced Brooke Burns as the show's host and Labbett as the chaser on May 29. Dan Patrick had originally been considered as the host. The first season premiered on August 6, 2013. Even though the show had not yet premiered at the time, the network ordered a second season of eight episodes on July 1, 2013, which premiered on November 5. Citing the series' status as a ``ratings phenom'', GSN eventually announced plans to renew it for a third season, which premiered in the summer of 2014. During the third season, the series also premiered its first celebrity edition with celebrity contestants playing for charity. GSN proceeded to renew the series for a fourth season before the end of season three; this new season began airing January 27, 2015. After the seventh episode of the season, the series went on another hiatus; new episodes from the fourth season resumed airing July 16, 2015. No new episodes have aired since the season four finale, which aired December 11, 2015. Question: is the game show the chase still on the air?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: RollerCoaster Tycoon World is a theme park construction and management simulation video game developed by Nvizzio Creations and published by Atari for Microsoft Windows. It is the fourth major installment in the RollerCoaster Tycoon series. The game was released on November 16, 2016. Question: is roller coaster tycoon world available for mac?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the United States, federal law exempts low-speed electric bicycles from Dept. of Transportation and NHTSA motor vehicle regulations, and they are regulated under federal law in the same manner as ordinary bicycles. The Consumer Product Safety Act defines the term low speed electric bicycle as a two- or three-wheeled vehicle with fully operable pedals and an electric motor of less than 750 watts (1 horsepower), whose maximum speed on a paved level surface, when powered solely by such a motor while ridden by an operator who weighs 170 pounds, is less than 20 mph (15 U.S.C. 2085(b)). At the present time, neither the DOT nor the NHTSA restrict the assembly of e-bikes for use on public roads, although commercially manufactured e-bikes capable of speeds greater than 20 mph are considered motor vehicles and thus subject to DOT and NHTSA safety requirements. Consequently, the laws of the individual state and/or local jurisdiction govern the type, motor wattage, and speed capability of e-bikes used on public roadways (see Electric bicycle laws). As long as the bicycle is capable of pedal propulsion, most U.S. states currently do not distinguish between designs that may be self-propelled by the electric motor versus pedal assist designs in which the electric motor assists pedal propulsion by the rider. Question: is it legal to put a motor on a bicycle?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In 1873, the head of the abbey came to an agreement with a Parisian cheese-seller granting exclusive rights of distribution, and the cheese soon became popular. The abbey sought trade protection, and eventually (in 1959), sold the rights to a major creamery. The cheese is now produced in a factory; the characteristic smooth rind the result of a plastic-coated wrapper. The rind is edible, but is made of wax and detracts from the flavour of the cheese. Question: can you eat the rind of port salut?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Aegopodium podagraria (commonly called ground elder, herb gerard, bishop's weed, goutweed, gout wort, and snow-in-the-mountain, and sometimes called English masterwort and wild masterwort) is a perennial plant in the carrot family (Apiaceae) that grows in shady places. The name ``ground elder'' comes from the superficial similarity of its leaves and flowers to those of elder (Sambucus), which is unrelated. It is the type species of the genus Aegopodium. This species is native to Eurasia, and has been introduced around the world as an ornamental plant, where it occasionally poses an ecological threat as an invasive exotic plant. Question: is snow on the mountain the same as bishops weed?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Heterochromia is a difference in coloration, usually of the iris but also of hair or skin. Heterochromia is determined by the production, delivery, and concentration of melanin (a pigment). It may be inherited, or caused by genetic mosaicism, chimerism, disease, or injury. It occurs in humans and certain breeds of dogs and cats. Question: can you have one green eye and one blue eye?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There are several types of heterochromia, including complete heterochromia, sectoral heterochromia, and central heterochromia. Complete heterochromia is when one iris is a different color than the other, sectoral heterochromia is when part of the iris is a different color than the rest of the iris, and central heterochromia is when the center of the iris is a different color than the rest of the iris. While it is possible to have one green eye and one blue eye, this would not be considered typical heterochromia, as it does not involve the iris being a different color. Therefore, the score is: 80"
    },
    {
        "question": "Passage: A 3-way lamp, also known as a tri-light, is a lamp that uses a 3-way light bulb to produce three levels of light in a low-medium-high configuration. A 3-way lamp requires a 3-way bulb and socket, and a 3-way switch. Unlike an incandescent lamp controlled by a dimmer, each of the filaments operates at full voltage, so the color of the light does not change between the three steps of light available. Certain compact fluorescent lamp bulbs are designed to replace 3-way incandescent bulbs, and have an extra contact and circuitry to bring about similar light level. In recent years, LED three way bulbs have become available as well. Question: can i put a regular bulb in a 3 way lamp?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: A season finale (British English: last in the season; Australian English: season final) is the final episode of a season of a television program. This is often the final episode to be produced for a few months or longer, and, as such, will try to attract viewers to continue watching when the series begins again. Question: does season finale mean the show is over?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Haley seeks help from Lucas (guest star Chad Michael Murray) as Nathan makes an escape attempt. Lucas takes Jamie and Lydia out of town to stay with him and Peyton until Haley can find Nathan and bring him home. Brooke comes face-to-face with Xavier who is up for parole. Julian uncovers evidence that assists Dan in his search for Nathan. Clay makes a connection with another patient in rehab. Question: does lucas and peyton come back to one tree hill?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Appointed members of the Board of Governors of the United States Postal Service select the Postmaster General and Deputy Postmaster General, who then join the Board. Question: is the postmaster general appointed by the president?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The United Kingdom European Communities membership referendum, also known as the Referendum on the European Community (Common Market), the Common Market referendum and EEC membership referendum took place on 5 June 1975 in the United Kingdom to gauge support for the country's continued membership of the European Communities (EC)--often known at the time as the ``European Community'' and the ``Common Market'' which it had entered on 1 January 1973 under the Conservative government of Edward Heath under the provisions of the Referendum Act 1975. Labour's manifesto for the October 1974 general election had promised that the people would decide ``through the ballot box'' whether to remain in the EC. Question: did the uk have a referendum to join the eu?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In the United States, there is no federal law regulating the practice of tattooing. However, all 50 states and the District of Columbia have statutory laws requiring a person receiving a tattoo be 18 years or older. This is partially based on the legal principle that a minor cannot enter into a legal contract or otherwise render informed consent for a procedure. Most states permit a person under the age of 18 to receive a tattoo with permission of a parent or guardian, but some states outright prohibit tattooing under a certain age regardless of permission, with the exception of medical necessity (such as markings placed for radiation therapy). Question: can you get a tattoo at any age?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: UN peacekeeping missions involving Pakistan cover about 40 operations in Afghanistan and Pakistan occupied Kashmir. Pakistan joined the United Nations on 30 September 1947, regardless of Afghan opposition against Pakistan's entrance to United Nations because of the Durand Line. Pakistan Army has the third most number of soldiers in UN Peacekeeping Missions. Question: has the un ever intervened in a conflict involving pakistan?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Since 22 February 2017, New Hampshire is a constitutional carry state, requiring no license to open carry or concealed carry a firearm in public. Concealed carry permits are still issued for purposes of reciprocity with other states. Question: do you need a permit to carry a gun in nh?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: In addition to Tinker Bell and the Legend of the NeverBeast, Disney also had plans for a seventh film. In 2014, The Hollywood Reporter stated that the seventh film was canceled due to story problems. Question: will there be any more tinker bell movies?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Comprehensive Economic and Trade Agreement (CETA) is a free-trade agreement between Canada, the European Union and its member states. It has been provisionally applied, so the treaty has eliminated 98% of the tariffs between Canada and the EU. Question: does canada have a free trade agreement with eu?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Pensacola station is a former train station in Pensacola, Florida. It was served by Amtrak, the national railroad passenger system. The station served as a replacement for the former Louisville and Nashville Passenger Station and Express Building. Service has been suspended since Hurricane Katrina struck Pensacola in 2005. However, service is proposed to return in the near future, bringing back the Sunset Limited to this station. Question: is there an amtrak station in pensacola florida?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In September 2017, producer Jerry Bruckheimer indicated that another Pirates of the Caribbean sequel is still possible if Dead Men Tell No Tales does well in its home release. In October 2017, Kaya Scodelario said that she was contracted to return for a sixth film. Shortly after, it was announced that Joachim R\u00f8nning is being eyed to direct the film. Question: will they make a pirates of the caribbean 6?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Radio waves are a type of electromagnetic radiation with wavelengths in the electromagnetic spectrum longer than infrared light. Radio waves have frequencies as high as 300 gigahertz (GHz) to as low as 30 hertz (Hz). At 300 GHz, the corresponding wavelength is 1 mm, and at 30 Hz is 10,000 km. Like all other electromagnetic waves, radio waves travel at the speed of light. They are generated by electric charges undergoing acceleration, such as time varying electric currents. Naturally occurring radio waves are emitted by lightning and astronomical objects. Question: do radio waves travel at the speed of light?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no need to provide a score for the student's answer, as the question is already clearly answered in the passage as \\\"Yes, radio waves travel at the speed of light.\\"
    },
    {
        "question": "Passage: A straight flush is a poker hand containing five cards of sequential rank, all of the same suit, such as Q\u2665 J\u2665 10\u2665 9\u2665 8\u2665 (a ``queen-high straight flush''). It ranks below five of a kind and above four of a kind. As part of a straight flush, an ace can rank either above a king or below a two, depending on the rules of the game. Under high rules, an ace can rank either high (e.g. A\u2665 K\u2665 Q\u2665 J\u2665 10\u2665 is an ace-high straight flush) or low (e.g. 5 4 3 2 A is a five-high straight flush), but cannot rank both high and low in the same hand (e.g. Q\u2663 K\u2663 A\u2663 2\u2663 3\u2663 is an ace-high flush, not a straight flush). Under deuce-to-seven low rules, aces can only rank high, so a hand such as 5\u2660 4\u2660 3\u2660 2\u2660 A\u2660 is actually an ace-high flush. Under ace-to-six low rules, aces can only rank low, so a hand such as A\u2665 K\u2665 Q\u2665 J\u2665 10\u2665 is actually a king-high flush. Under ace-to-five low rules, straight flushes are not recognized, and a hand that would be categorized as a straight flush is instead a high card hand. Question: is ace 2 3 4 5 a straight?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In most countries, grade retention has been banned or strongly discouraged. In the United States, grade retention can be used in kindergarten through twelfth grade. However, with older students, retention is usually restricted to the specific classes that the student failed, so that a student can be, for example, promoted in a math class but retained in a language class. Question: can u get held back in 8th grade?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Tower of Terror buildings are among the tallest structures found at their respective Disney resorts. At 199 feet (60.7 m), the Florida version is the second tallest attraction at the Walt Disney World Resort, with only Expedition Everest 199.5 feet (60.8 m) being taller. At the Disneyland Resort, the 183-foot (55.8 m) structure (which now houses Guardians of the Galaxy -- Mission: Breakout!) is the tallest building at the resort, as well as one of the tallest buildings in Anaheim. At Disneyland Paris, it is the second tallest attraction. Question: does disney world still have tower of terror?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A public limited company (legally abbreviated to plc) is a type of public company under the United Kingdom company law, some Commonwealth jurisdictions, and the Republic of Ireland. It is a limited liability company whose shares may be freely sold and traded to the public (although a plc may also be privately held, often by another plc), with a minimum share capital of \u00a350,000 and usually with the letters PLC after its name. Similar companies in the United States are called publicly traded companies. Public limited companies will also have a separate legal identity. Question: are public limited companies in the private sector?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: From the perspective of ground forces, apart from the occasional ``oil rain'' experienced by troops very close to spewing wells, one of the more commonly experienced effects of the oil field fires were the ensuing smoke plumes which rose into the atmosphere and then precipitated or fell out of the air via dry deposition and by rain. The pillar-like plumes frequently broadened and joined up with other smoke plumes at higher altitudes, producing a cloudy grey overcast effect, as only about 10% of all the fires corresponding with those that originated from ``oil lakes'' produced pure black soot filled plumes, 25% of the fires emitted white to grey plumes, while the remainder emitted plumes with colors between grey and black. For example, one Gulf War veteran stated: Question: did it rain oil in the gulf war?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In Canada, the right to silence is protected under section 7 and section 11(c) of the Canadian Charter of Rights and Freedoms. The accused may not be compelled as a witness against himself in criminal proceedings, and therefore only voluntary statements made to police are admissible as evidence. Prior to an accused being informed of their right to legal counsel, any statements they make to police are considered involuntarily compelled and are inadmissible as evidence. After being informed of the right to counsel, the accused may choose to voluntarily answer questions and those statements would be admissible. Question: do you have the right to remain silent in canada?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Following Dostum's departure, Nelson plans to continue operating against the Taliban with his Americans and the few Afghan fighters remaining with them. Encountering a large force of Al-Qaeda and Taliban fighters and armored vehicles, ODA 595, rejoined by Diller and his element, uses air support to eliminate many of the fighters and most of the armor, but are discovered and attacked. Spencer is critically injured by a suicide bomber, and the team is about to be overrun under heavy Taliban and Al-Qaeda pressure when Dostum returns with his forces. Together, the American and Northern Alliance forces disperse the Taliban and Al-Qaeda, and Dostum tracks down and kills Razzan. After Spencer is medevaced, Nelson and Dostum continue to Mazar-i-Sharif but find Atta Muhammad has beaten them there. Against expectations, Dostum and Muhammad meet peacefully and put aside their differences. Impressed by Nelson and the Americans' efforts, Dostum gives Nelson his prized riding crop and tells him that he will always consider Nelson a brother and fellow fighter. Spencer ultimately survives, and ODA 595 returns home after spending 23 days in Afghanistan. Question: does the main character in 12 strong die?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The return address is not required on postal mail. However, lack of a return address prevents the postal service from being able to return the item if it proves undeliverable; such as from damage, postage due, or invalid destination. Such mail may otherwise become dead letter mail. Question: do i have to put my name on a letter?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In computer text processing, a markup language is a system for annotating a document in a way that is syntactically distinguishable from the text. The idea and terminology evolved from the ``marking up'' of paper manuscripts, i.e., the revision instructions by editors, traditionally written with a blue pencil on authors' manuscripts. In digital media, this ``blue pencil instruction text'' was replaced by tags, that is, instructions are expressed directly by tags or ``instruction text encapsulated by tags.'' However the whole idea of a mark up language is to avoid the formatting work for the text, as the tags in the mark up language serve the purpose to format the appropriate text (like a header or beginning of a next para...etc.). Every tag used in a Markup language has a property to format the text we write. Question: is a markup language designed to describe data?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Each hand falls into a hand-ranking category determined by the patterns formed by its cards. Hands in a higher-ranking category always rank higher than hands in a lower-ranking category. Hands in the same category are ranked relative to each other by comparing the ranks of their respective cards. Suits are not ranked in poker, so hands in the same category that differ by suit alone are of equal rank. Cards in poker are ranked, from highest to lowest: A, K, Q, J, 10, 9, 8, 7, 6, 5, 4, 3 and 2. However, aces have the lowest rank under high rules when forming part of a five-high straight or straight flush, or when playing ace-to-five low or ace-to-six low rules. Question: can ace be used as a 1 in poker?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a crucial detail missing from the student's answer. While it is true that aces can be used as a 1 in some poker variations, such as when playing ace-to-five low or ace-to-six low rules, the student did not specify this condition. Therefore, the score is: 60."
    },
    {
        "question": "Passage: PVAc emulsions such as Elmer's Glue-All contain polyvinyl alcohol as a protective colloid. In alkaline conditions, boron compounds such as boric acid or borax cause the polyvinyl alcohol to cross-link, forming tackifying precipitates or toys, such as Slime and Flubber. Question: is elmer's glue all a pva glue?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: HBO broadcast the sixth season in two parts. The first twelve episodes ran from March to June 2006, and the remaining nine episodes ran from April to June 2007. HBO also released the two parts of the sixth season as separate DVD box sets. This effectively turns the second part into a short seventh season, though the show's producers and HBO don't describe it as such. All six seasons are available on DVD in Regions 1, 2, 3, and 4. Question: is there a season 7 of the sopranos?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The song has a prominent status as an iconic symbol of West Virginia, which it describes as ``almost Heaven'', and it was played at the funeral memorial of U.S. Senator Robert Byrd in July 2010. In March 2014, it became one of several official state anthems of West Virginia. Question: is take me home country roads about virginia?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In some areas, such as California, highway medians are sometimes no more than a demarcated section of the paved roadway, indicated by a space between two sets of double yellow lines. Such a double-double yellow line or painted median is legally similar to an island median: vehicles are not permitted to cross it, unlike a single set of double yellow lines which may in some cases permit turns across the line. This arrangement has been used to reduce costs, including narrower medians than are feasible with a planted strip, but research indicates that such narrow medians may have minimal safety benefit compared to no median at all. Question: is it illegal to drive over a median?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The dairy cow will produce large amounts of milk in its lifetime. Production levels peak at around 40 to 60 days after calving. Production declines steadily afterwards until milking is stopped at about 10 months. The cow is ``dried off'' for about sixty days before calving again. Within a 12 to 14-month inter-calving cycle, the milking period is about 305 days or 10 months long. Among many variables, certain breeds produce more milk than others within a range of around 6,800 to 17,000 kg (15,000 to 37,500 lb) of milk per year. Question: does a cow have to be pregnant to give milk?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passage states that a dairy cow will produce large amounts of milk in its lifetime, and it doesn't necessarily have to be pregnant to give milk. Therefore the score is: 50"
    },
    {
        "question": "Passage: TD Garden is the home arena for the Boston Bruins of the National Hockey League and the Boston Celtics of the National Basketball Association. It is owned by Delaware North, whose CEO, Jeremy Jacobs, also owns the Bruins. It is the site of the annual Beanpot college hockey tournament, and hosts the annual Hockey East Championships. The arena has also hosted many major national sporting events including the 1999 and 2003 NCAA Division I Men's Basketball regional first and second rounds, the 2009, 2012, and 2018 Sweet Sixteen and Elite Eight, the 1998 Frozen Four, the 2004 Frozen Four, the 2014 United States Figure Skating Championships, the 2006 Women's Final Four, and the 2015 Frozen Four. It hosted games 3, 4, and 6 of the 2011 Stanley Cup Finals and the 2013 Stanley Cup Finals for the Bruins, and games 1, 2, and 6 of the 2008 NBA Finals and games 3, 4, and 5 of the 2010 NBA Finals for the Celtics. Furthermore, it hosted the NA LCS 2017 Summer Split Finals. Question: do the bruins and celtics play in the same arena?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In botany, an evergreen is a plant that has leaves throughout the year, always green. This is true even if the plant retains its foliage only in warm climates, and contrasts with deciduous plants, which completely lose their foliage during the winter or dry season. There are many different kinds of evergreen plants, both trees and shrubs. Evergreens include: Question: do evergreen trees lose their leaves in the winter?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Environmentally, the river is in a poor condition and has been so for many years. In 2000, the Brisbane River estuary did not meet the national guidelines for environmental standards. The lower reaches received a very poor rating in the 2008 Healthy Waterways report, an annual assessment of river water quality. The major causes of pollution are excess nutrients, hydrocarbons, pesticides and bacteria which become concentrated in the river and its sediment after flowing off surrounding lands. The river is also considered too murky and it is not recommended to swim in its waters. Question: is it safe to swim in brisbane river?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The series was renewed for a third season by Disney Channel on August 31, 2017. On June 1, 2018, it was announced that Peyton List, Karan Brar, Skai Jackson, and Miranda May would be returning for the third season and that Raphael Alejandro, Will Buie Jr., and Mallory Mahoney would be joining the cast. The third season premiered on Disney Channel on June 18, 2018. In March 2018, actress Skai Jackson stated in an interview that she was leaving Disney and that Bunk'd would end with the third season. Question: is this the last season of bunk'd?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In December of 2017, the Seven network announced the show has been renewed for a fourth season. Question: will there be a 800 words season 4?",
        "pred_ans": " Response: True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A properly executed lock of this type does not apply torque to the wrist itself. In practice, the bones of the forearm and, eventually, the shoulder are the focus of the lock. If performed correctly, this technique will break the opponents wrist, elbow and dislocate the shoulder. In practice, uke will turn over his own arm in order to prevent his wrist from breaking. The goal of almost all throws executed via joint/bone manipulation, at least from the perspective of some classical (koryu) martial arts, is to break or dislocate a limb(s). Question: can you break someone's wrist by twisting it?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Fez's secret country of origin is one of the longest running gags on the show. Through all eight seasons, Fez's nationality remains a mystery, even to his closest friends, and the continual hints and clues Fez drops about his country only leave them more confused. In the episode ``Eric's Birthday,'' Kitty, fantasizing about Eric's friends causing trouble, imagines Fez saying, ``in my home country of...wherever it is I'm from; I can never tell...'' Much is revealed in the episode ``Love of My Life,'' where one of Fez's compatriots (played by Justin Long) comes for a visit. In the first teaser, when his friend suggests that he goes home, he says ``Yes, I will go to Brazil...and then catch a flight home.'' In the final teaser, when Hyde finally asks them, ``Where the hell are you guys from?'', his friend says that the name depends on whether you ask the British or the Dutch. But the British won't say it, Fez explains, because they hate the island, and no one understands a word the Dutch say. The friend has a heavy English accent; Fez's explanation to this is that his friend is from the west side of the island. We also see throughout the show that Fez almost says where he is from but then stops right before he says it. Question: do we find out where fez is from?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The First Wives Club is a 1996 American comedy film based on the best-selling 1992 novel of the same name by Olivia Goldsmith. Narrated by Diane Keaton, it stars Keaton, Goldie Hawn, and Bette Midler as three divorced women who seek revenge on their ex-husbands who left them for younger women. Stephen Collins, Victor Garber and Dan Hedaya co-star as the husbands, and Sarah Jessica Parker, Marcia Gay Harden and Elizabeth Berkley as their lovers, with Maggie Smith, Bronson Pinchot and Stockard Channing also starring. Scott Rudin produced and Hugh Wilson directed; the film was distributed by Paramount Pictures. Question: is the first wives club based on a true story?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The Korean language has changed between the two states due to the length of time that North and South Korea have been separated. Question: is the language the same in north and south korea?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Most bumblebees are social insects that form colonies with a single queen. The colonies are smaller than those of honey bees, growing to as few as 50 individuals in a nest. Female bumblebees can sting repeatedly, but generally ignore humans and other animals. Cuckoo bumblebees do not make nests; their queens aggressively invade the nests of other bumblebee species, kill the resident queens and then lay their own eggs, which are cared for by the resident workers. Question: are honey bees and bumble bees the same?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a difference between honey bees and bumble bees. Honey bees form larger colonies with thousands of individuals, while bumble bees form smaller colonies with as few as 50 individuals. Additionally, female bumblebees can sting repeatedly, but generally ignore humans and other animals, whereas honey bees can also sting repeatedly but are more likely to sting when provoked. Cuckoo bumblebees, on the other hand, do not make nests; their queens aggressively invade the nests of other bumblebee species, kill the resident queens and then lay their own eggs, which are cared for by the resident workers. Therefore the score is: 95/100"
    },
    {
        "question": "Passage: The age of consent in Florida is 18, but close-in-age exemptions exist. By law, the exception permits a person 23 years of age or younger to engage in legal sexual activity with a minor aged 16 or 17. Question: can a 16 year old be with an 18 year old in florida?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: After the first series aired, the show began airing in the United Kingdom on 5Star. Question: can i watch ireland's got talent in england?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Lake Ontario is one of the five Great Lakes of North America. It is surrounded on the north, west, and southwest by the Canadian province of Ontario, and on the south and east by the American state of New York, whose water boundaries meet in the middle of the lake. Ontario, Canada's most populous province, was named for the lake. Many of Ontario's most populous cities, including Toronto, Canada's most populous city, and Hamilton, are on the lake's northern or western shores. In the Huron language, the name Ontar\u00ed'io means ``Lake of Shining Waters''. Its primary inlet is the Niagara River from Lake Erie. The last in the Great Lakes chain, Lake Ontario serves as the outlet to the Atlantic Ocean via the Saint Lawrence River. Question: is lake ontario connected to the atlantic ocean?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Meanwhile, Andy is now in foster care, due to his mother being in a mental hospital for supporting his story about Chucky. Andy is adopted by Phil (Gerrit Graham) and Joanne Simpson (Jenny Agutter). In his new home, Andy meets his new foster sister Kyle (Christine Elise). Question: did andy's mom die in child's play?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: There is no offside offence if a player receives the ball directly from a goal kick, a corner kick, a throw-in, or a dropped-ball. It is also not an offence if the ball was last deliberately played by an opponent (except for a deliberate save). In this context, according to the IFAB, ``A 'save' is when a player stops, or attempts to stop, a ball which is going into or very close to the goal with any part of the body except the hands/arms (unless the goalkeeper within the penalty area).'' Question: can u be offside from a goal kick?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: One school of scholarship based mainly in the United States considers mixed government to be the central characteristic of a republic and holds that the United States has rule by the one (the President) (monarchy), the few (the Senate, which was originally supposed to represent the States) (aristocracy) and the many (House of Representatives) (democracy). Another school of thought in the United States says the Supreme Court has taken on the role of ``The Best'' in recent decades, ensuring a continuing separation of authority by offsetting the direct election of senators and preserving the mixing of democracy, aristocracy and monarchy. Question: can a country have more than one type of government?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: 5 A Day is any of various national campaigns in countries such as the United States, the United Kingdom, France, and Germany, to encourage the consumption of at least five portions of fruit and vegetables each day, following a recommendation by the World Health Organization that individuals consume ``a minimum of 400g of fruit and vegetables per day (excluding potatoes and other starchy tubers).'' A meta-analysis of the many studies of this issue was published in 2017 and found that consumption of double the minimum recommendation -- 800g or 10 a day -- provided an increased protection against all forms of mortality. Question: is it 5 portions of fruit and 5 of vegetables?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: It is a common urban legend that the Texas flag is the only state flag that is allowed to fly at the same height as the U.S. flag. Allegedly, Texas has this right inherently (as a former independent nation) or because it negotiated special provisions when it joined the Union (this version has been stated as fact on a PBS website). However, the legend is false. Neither the Joint Resolution for Annexing Texas to the United States nor the Ordinance of Annexation contain any provisions regarding flags. According to the United States Flag Code, any state flag can be flown at the same height as the U.S. flag, but the U.S. flag should be on its right (the viewer's left). Consistent with the U.S. Flag Code, the Texas Flag Code specifies that the state flag should either be flown below the U.S. flag if on the same pole or at the same height as the U.S. flag if on separate poles. Question: can the texas flag fly at the same height as the us flag?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Due to the ever-increasing numbers of observers, standardisation of the gauges became necessary. Symons began experimenting on new gauges in his own garden. He tried different models with variations in size, shape, and height. In 1863 he began collaboration with Colonel Michael Foster Ward from Calne, Wiltshire, who undertook more extensive investigations. By including Ward and various others around Britain, the investigations continued until 1890. The experiments were remarkable for their planning, execution, and drawing of conclusions. The results of these experiments led to the progressive adoption of the well known standard gauge, still used by the UK Meteorological Office today. Namely, one made of '... copper, with a five-inch funnel having its brass rim one foot above the ground ...' Question: does the size of a rain gauge matter?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In the law of the United States, diversity jurisdiction is a form of subject-matter jurisdiction in civil procedure in which a United States district court in the federal judiciary has the power to hear a civil case when the amount in controversy exceeds $75,000 and where the persons that are parties are ``diverse'' in citizenship or state of incorporation (for corporations being legal persons), which generally indicates that they differ in state and/or nationality. Diversity jurisdiction and federal-question jurisdiction (jurisdiction over issues arising under federal law) constitute the two primary categories of subject matter jurisdiction in U.S. federal courts. Question: can a federal court hear a state law case?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no information in the passage to indicate whether a federal court can hear a state law case. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Other international versions of the show have been produced out of Latin America, Europe, Israel, Australia, and Canada. Question: has the amazing race ever gone to israel?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no information in the passage to determine if the Amazing Race has ever gone to Israel. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Organs that have been successfully transplanted include the heart, kidneys, brain, liver, lungs, pancreas, intestine, and thymus. Tissues include bones, tendons (both referred to as musculoskeletal grafts), corneae, skin, heart valves, nerves and veins. Worldwide, the kidneys are the most commonly transplanted organs, followed by the liver and then the heart. Corneae and musculoskeletal grafts are the most commonly transplanted tissues; these outnumber organ transplants by more than tenfold. Question: can any organs in the digestive system be transplanted?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There are several organs in the digestive system, such as the stomach, intestines, and liver. The passage you provided mentions that intestines can be transplanted, so the student's answer is partially correct. However, the passage does not explicitly state that other organs in the digestive system can be transplanted. Therefore, the score is: 70/100."
    },
    {
        "question": "Passage: Although the daytime length at the Equator remains 12 hours in all seasons, the duration at all other latitudes varies with the seasons. During the winter, daytime lasts shorter than 12 hours; during the summer, it lasts longer than 12 hours. Northern winter and southern summer concur, while northern summer and southern winter concur. Question: do the days get longer and shorter at the equator?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A world war is a large-scale war involving many of the countries of the world or many of the most powerful and populous ones. World wars span multiple countries on multiple continents, with battles fought in many theaters. While a variety of global conflicts have been subjectively deemed ``world wars'', such as the Cold War and the War on Terror, the term is widely and generally accepted only as it is retrospectively applied to two major international conflicts that occurred during the 20th century: World War I (1914--1918) and World War II (1939--1945). Question: was there a world war before world war 1?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 75"
    },
    {
        "question": "Passage: A mule is the offspring of a male donkey (jack) and a female horse (mare). Horses and donkeys are different species, with different numbers of chromosomes. Of the two F1 hybrids (first generation hybrids) between these two species, a mule is easier to obtain than a hinny, which is the offspring of a female donkey (jenny) and a male horse (stallion). Question: can a horse and a donkey have a baby?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Poly(vinyl acetate) (PVA, PVAc, poly(ethenyl ethanoate): best known as wood glue, white glue, carpenter's glue, school glue, Elmer's glue in the US, or PVA glue) is an aliphatic rubbery synthetic polymer with the formula (CHO). It belongs to the polyvinyl esters family, with the general formula -(RCOOCHCH)-. It is a type of thermoplastic. There is considerable confusion between the glue as purchased, an aqueous emulsion of mostly vinyl acetate monomer, and the subsequent dried and polymerized PVAc that is the true thermoplastic polymer. Question: is pva glue and elmer's glue the same thing?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Cryptorchidism is the absence of one or both testes from the scrotum. The word is from the Greek \u03ba\u03c1\u03c5\u03c0\u03c4\u03cc\u03c2, kryptos, meaning hidden \u1f44\u03c1\u03c7\u03b9\u03c2, orchis, meaning testicle. It is the most common birth defect of the male genital. About 3% of full-term and 30% of premature infant boys are born with at least one undescended testis. However, about 80% of cryptorchid testes descend by the first year of life (the majority within three months), making the true incidence of cryptorchidism around 1% overall. Cryptorchidism may develop after infancy, sometimes as late as young adulthood, but that is exceptional. Question: can a person be born with one testicle?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a clear mention in the passage that about 3% of full-term and 30% of premature infant boys are born with at least one undescended testis. This implies that a person can be born with one testicle.\\n\\nTherefore, the score is: 100"
    },
    {
        "question": "Passage: Good Times is an American sitcom that aired on CBS from February 8, 1974, to August 1, 1979. Created by Eric Monte and Mike Evans, and developed by Norman Lear, the series' primary executive producer, it was television's first African American two-parent family sitcom. Good Times was billed as a spin-off of Maude, which was itself a spin-off of All in the Family. Question: was good times a spin off of the jeffersons?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There"
    },
    {
        "question": "Passage: Brazil is the most successful national team in the history of the World Cup, having won five titles, earning second-place, third-place and fourth-place finishes twice each. Brazil is one of the countries besides Argentina, Spain and Germany to win a FIFA World Cup away from its continent (Sweden 1958, Mexico 1970, USA 1994 and South Korea/Japan 2002). Brazil is the only national team to have played in all FIFA World Cup editions without any absence or need for playoffs. Brazil also has the best overall performance in World Cup history in both proportional and absolute terms with a record of 73 victories in 109 matches played, 124 goal difference, 237 points and only 18 losses. Question: has brazil ever been eliminated in the group stage?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information in the passage to determine if Brazil has ever been eliminated in the group stage. Therefore, the score is: 50"
    },
    {
        "question": "Passage: State law requires that bars be closed between 2:00 a.m. and 6:00 a.m. Monday through Friday and between 2:30 a.m. and 6:00 a.m. on Saturday and Sunday. Exceptions are made on New Year's Eve, when no closing is required, and for changes in Daylight Saving Time. State law does not permit municipalities to further restrict when bars must be closed. Municipalities may elect, however, to prohibit the issuance of liquor licenses, making the municipality effectively dry. Question: can you buy alcohol on sunday in wi?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Since 1927, Australia and Ireland have competed against each other in rugby union in 36 matches, Australia having won 22, Ireland 13, with 1 draw. Their first meeting was on 12 November 1927, and was won 5-3 by Australia. Their most recent meeting took place at Allianz Stadium, Sydney on 23 June 2018 and was won 20 pts to 16 by Ireland. Question: has ireland ever beaten the wallabies in australia?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: On June 16, 2018, The Try Guys announced that they had left BuzzFeed and started their own independent production company. Question: do the try guys no longer work for buzzfeed?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The system's routes span the eastern third of Massachusetts and the northern half of Rhode Island. They stretch from Newburyport in the north to North Kingstown, Rhode Island in the south, and reach as far west as Worcester and Fitchburg. There are plans to expand the area covered by the Commuter Rail further into Rhode Island to the south as well as into New Hampshire to the north. Question: does the commuter rail go to new hampshire?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In March 2015, it was confirmed that T.S. Nowlin, who co-wrote the first and wrote the second film, would adapt Maze Runner: The Death Cure. On September 16, 2015, it was confirmed that Ball would return to direct the final film. Question: is there going to be a sequel to the death cure?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: According to the current scientific theories, matter is required to travel at slower-than-light (also subluminal or STL) speed with respect to the locally distorted spacetime region. Apparent FTL is not excluded by general relativity; however, any apparent FTL physical plausibility is speculative. Examples of apparent FTL proposals are the Alcubierre drive and the traversable wormhole. Question: can one travel faster than the speed of light?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: As of April 2018 the Minister responsible for the Territories excluding the Falkland Islands, Gibraltar and the Sovereign Base Areas is Tariq Ahmad, Minister of State for the Commonwealth and the UN. The other three territories are the responsibility of Sir Alan Duncan MP, Minister of State for Europe and the Americas. Question: are the falkland islands part of the commonwealth?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Aphasia is an inability to comprehend and formulate language because of damage to specific brain regions. This damage is typically caused by a cerebral vascular accident (stroke), or head trauma; however, these are not the only possible causes. To be diagnosed with aphasia, a person's speech or language must be significantly impaired in one (or several) of the four communication modalities following acquired brain injury or have significant decline over a short time period (progressive aphasia). The four communication modalities are auditory comprehension, verbal expression, reading and writing, and functional communication. Question: is it possible to lose the ability to speak?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The actual record of birth is stored with a government agency. That agency will issue certified copies or representations of the original birth record upon request, which can be used to apply for government benefits, such as passports. The certification is signed and/or sealed by the registrar or other custodian of birth records, who is commissioned by the government. Question: is a certification of birth the same as a birth certificate?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: A private equity fund is a collective investment scheme used for making investments in various equity (and to a lesser extent debt) securities according to one of the investment strategies associated with private equity. Private equity funds are typically limited partnerships with a fixed term of 10 years (often with annual extensions). At inception, institutional investors make an unfunded commitment to the limited partnership, which is then drawn over the term of the fund. From the investors' point of view, funds can be traditional (where all the investors invest with equal terms) or asymmetric (where different investors have different terms). Question: is a private equity fund a financial institution?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: This article lists albatrosses that have been scored in important golf tournaments. An albatross, also called a double eagle, is a score of three-under-par on a single hole. This is most commonly achieved with two shots on a par-5, but can be done with a hole-in-one on a par-4 or three shots on a par-6. Question: is a double eagle the same as an albatross?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The first recorded rudimentary steam engine was the aeolipile described by Heron of Alexandria in 1st-century Roman Egypt. Several steam-powered devices were later experimented with or proposed, such as Taqi al-Din's steam jack, a steam turbine in 16th-century Ottoman Egypt, and Thomas Savery's steam pump in 17th-century England. In 1712, Thomas Newcomen's atmospheric engine became the first commercially successful engine using the principle of the piston and cylinder, which was the fundamental type steam engine used until the early 20th century. The steam engine was used to pump water out of coal mines Question: was the steam engine invented during the renaissance?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The final season revolves around Barney and Robin's wedding weekend. After some apprehension on both their parts, they get married in ``The End of the Aisle'' after he vows to always be honest with her. The series finale, ``Last Forever'', reveals that, three years after their wedding, they get divorced because Robin's hectic travel schedule prevents them from spending any time together. Barney returns to a lifestyle of meaningless sex with multiple women for several years afterward, until he gets one of his one-night stands pregnant. He hates the idea of being a father until the day his child -- a girl named Ellie -- is born. He falls in love with her at first sight and becomes a devoted father, turning away from his player lifestyle for good. Question: does barney die in how i met your mother?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no mention of whether Barney dies in the given passage. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Playing online games requires that users set up the system's network connection configuration, which is saved to a memory card. This can be done with the network Startup Disk that came with the network adapter or using one of the many games that had the utility built into them, such as Resident Evil Outbreak, to set up the network settings. The new slimline PlayStation 2 came with a disk in the box by default. The last version of the disk was network startup disk 5.0, which was included with the newer SCPH 90004 model released in 2009. However, as of December 31, 2012, the PlayStation 2 has been discontinued, and the servers for games have all since been shut down. Question: can you get on the internet with a playstation 2?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In any quantitative science, the terms relative change and relative difference are used to compare two quantities while taking into account the ``sizes'' of the things being compared. The comparison is expressed as a ratio and is a unitless number. By multiplying these ratios by 100 they can be expressed as percentages so the terms percentage change, percent(age) difference, or relative percentage difference are also commonly used. The distinction between ``change'' and ``difference'' depends on whether or not one of the quantities being compared is considered a standard or reference or starting value. When this occurs, the term relative change (with respect to the reference value) is used and otherwise the term relative difference is preferred. Relative difference is often used as a quantitative indicator of quality assurance and quality control for repeated measurements where the outcomes are expected to be the same. A special case of percent change (relative change expressed as a percentage) called percent error occurs in measuring situations where the reference value is the accepted or actual value (perhaps theoretically determined) and the value being compared to it is experimentally determined (by measurement). Question: is percent difference the same as percent change?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 1935, the Supreme Court overturned five of President Franklin Roosevelt's executive orders (6199, 6204, 6256, 6284, 6855). Executive Order 12954, issued by President Bill Clinton in 1995, attempted to prevent the federal government from contracting with organizations that had strike-breakers on the payroll; a federal appeals court subsequently ruled that the order conflicted with the National Labor Relations Act, and invalidated the order. Question: can the supreme court do anything to stop an executive order?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Since the late 2000s, the Canadian dollar has been valued at levels comparable to the years before the swift rise in 2007. A dollar in the mid 70 cent US range has been the usual rate for much of the 2010s. Question: is the canadian dollar worth more than usd?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Phantom pain sensations are described as perceptions that an individual experiences relating to a limb or an organ that is not physically part of the body. Limb loss is a result of either removal by amputation or congenital limb deficiency. However, phantom limb sensations can also occur following nerve avulsion or spinal cord injury. Question: is pain experienced in a missing body part or paralyzed area?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no mention of whether the pain is experienced in a missing body part or a paralyzed area in the passage. Therefore, the score is: 50"
    },
    {
        "question": "Passage: American Ninja Warrior (sometimes abbreviated as ANW) is an American sports entertainment competition that is a spin-off of the Japanese television series Sasuke. It features hundreds of competitors attempting to complete a series of obstacle courses of increasing difficulty, trying to make it to the national finals on the Las Vegas Strip, in hopes of becoming an ``American Ninja Warrior''. To date only two competitors, rock-climbers Isaac Caldiero and Geoff Britten, have finished the course and achieved ``Total Victory''. Caldiero is the only competitor to win the cash prize. The series premiered on December 12, 2009 on the now-defunct cable channel G4 and now airs on NBC with encore episodes airing on USA Network and NBCSN. Question: has anyone ever won the american ninja warrior?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Payola, in the music industry, is the illegal practice of payment or other inducement by record companies for the broadcast of recordings on commercial radio in which the song is presented as being part of the normal day's broadcast, without announcing this prior to broadcast. Under US law, a radio station can play a specific song in exchange for money, but this must be disclosed on the air as being sponsored airtime, and that play of the song should not be counted as a ``regular airplay''. Question: is payola legal in canada and the united states?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: In basketball, an assist is attributed to a player who passes the ball to a teammate in a way that leads to a score by field goal, meaning that he or she was ``assisting'' in the basket. There is some judgment involved in deciding whether a passer should be credited with an assist. An assist can be scored for the passer even if the player who receives the pass makes a basket after dribbling the ball. However, the original definition of an assist did not include such situations, so the comparison of assist statistics across eras is a complex matter. Question: can you give yourself an assist in basketball?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The newspaper has won 47 Pulitzer Prizes. This includes six separate Pulitzers awarded in 2008, second only to The New York Times' seven awards in 2002 for the highest number ever awarded to a single newspaper in one year. Post journalists have also received 18 Nieman Fellowships and 368 White House News Photographers Association awards. In the early 1970s, in the best-known episode in the newspaper's history, reporters Bob Woodward and Carl Bernstein led the American press' investigation into what became known as the Watergate scandal; reporting in the newspaper greatly contributed to the resignation of President Richard Nixon. In years since, its investigations have led to increased review of the Walter Reed Army Medical Center. Question: is the washington post a reputable news source?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: In electromagnetism, displacement current density is the quantity \u2202D/\u2202t appearing in Maxwell's equations that is defined in terms of the rate of change of D, the electric displacement field. Displacement current density has the same units as electric current density, and it is a source of the magnetic field just as actual current is. However it is not an electric current of moving charges, but a time-varying electric field. In physical materials (as opposed to vacuum), there is also a contribution from the slight motion of charges bound in atoms, called dielectric polarization. Question: is displacement current like conduction current a source of magnetic field?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Other states which have expressed an interest in joining the Commonwealth over the years or states which may be eligible to join the Commonwealth include Algeria, Bahrain, Cambodia, Egypt, Eritrea, Israel, Libya, Madagascar, Palestine, United States and Yemen. Question: is the united states part of the commonwealth of nations?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no information given in the passage to determine whether the United States is part of the Commonwealth of Nations or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: In all but two states, incest is criminalized between consenting adults. In New Jersey and Rhode Island, incest between consenting adults (16 or over for Rhode Island, 18 or over for New Jersey) is not a criminal offense, though marriage is not allowed in either state. New Jersey also increases the severity of underage sex offenses by a degree if they're also incestuous, and criminalizes incest with 16-17 year olds (the normal age of consent in New Jersey is 16). Question: are there any states where you can marry your sibling?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In 2011, Philip Pullman remarked at the British Humanist Association annual conference that due to the first film's disappointing sales in the United States, there would not be any sequels made. Question: was there a sequel to the golden compass movie?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Navient is a U.S. corporation based in Wilmington, Delaware, whose operations include servicing and collecting on student loans. Managing nearly $300 billion in student loans for more than 12 million customers, the company was formed in 2014 by the split of Sallie Mae into two distinct entities, Sallie Mae Bank and Navient. Navient employs 6,000 individuals at offices across the U.S. As of 2018, Navient services 25% of student loans in the United States. Question: is navient and sallie mae the same company?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In the 1995 live-action film Mortal Kombat, Johnny Cage is one of Raiden's three chosen warriors with Liu Kang and Sonya, and he takes part in the tournament to prove he is a legitimate fighter after Shang Tsung assumes the identity of Cage's sensei in order to trick him into participating. He defeats Scorpion and Goro, and is handpicked by Shang Tsung to fight him in final combat near the conclusion until Liu Kang accepts the challenge. Cage was played by actor Linden Ashby, who had practiced martial arts before he was cast in the role. He did not return for the 1997 sequel Mortal Kombat: Annihilation, in which Cage is played by Chris Conrad. During Shao Kahn's invasion of Earth in the beginning of the film, Cage is killed by the emperor in his attempt to save Sonya when she is taken hostage. Pat E. Johnson, the first film's stunt choreographer, recommended Conrad as Ashby's replacement to Annihilation's producers. Question: does johnny cage die in mortal kombat annihilation?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Geometric series are among the simplest examples of infinite series with finite sums, although not all of them have this property. Historically, geometric series played an important role in the early development of calculus, and they continue to be central in the study of convergence of series. Geometric series are used throughout mathematics, and they have important applications in physics, engineering, biology, economics, computer science, queueing theory, and finance. Question: can an infinite geometric series have a sum?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Although the name ``freshwater pearl mussel'' is often used for this species, other freshwater mussel species can also create pearls and some can also be used as a source of mother of pearl. In fact, most cultured pearls today come from Hyriopsis species in Asia, or Amblema species in North America, both members of the related family Unionidae; pearls are also found within species in the genus Unio. Question: can you get a pearl from a muscle?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a species of freshwater mussel, the \\\"freshwater pearl mussel,\\\" that can create pearls. However, other freshwater mussel species, including some in the genus Unio, can also create pearls. Therefore, the score is: 80/100"
    },
    {
        "question": "Passage: This punishment still has advocates today, even though its use has diminished considerably in recent years in favour of discipline methods that are not considered violent or humiliating. Additionally, ingestion of soaps and detergents can have potentially serious health consequences, and persons utilizing this form of punishment may face legal sanctions. Question: is it safe to wash your kid's mouth out with soap?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is no clear answer to the question in the passage, as it does not explicitly state whether it is safe or unsafe to wash a child's mouth out with soap. Therefore, the score is: 50"
    },
    {
        "question": "Passage: A straight flush is a poker hand containing five cards of sequential rank, all of the same suit, such as Q\u2665 J\u2665 10\u2665 9\u2665 8\u2665 (a ``queen-high straight flush''). It ranks below five of a kind and above four of a kind. As part of a straight flush, an ace can rank either above a king or below a two, depending on the rules of the game. Under high rules, an ace can rank either high (e.g. A\u2665 K\u2665 Q\u2665 J\u2665 10\u2665 is an ace-high straight flush) or low (e.g. 5 4 3 2 A is a five-high straight flush), but cannot rank both high and low in the same hand (e.g. Q\u2663 K\u2663 A\u2663 2\u2663 3\u2663 is an ace-high flush, not a straight flush). Under deuce-to-seven low rules, aces can only rank high, so a hand such as 5\u2660 4\u2660 3\u2660 2\u2660 A\u2660 is actually an ace-high flush. Under ace-to-six low rules, aces can only rank low, so a hand such as A\u2665 K\u2665 Q\u2665 J\u2665 10\u2665 is actually a king-high flush. Under ace-to-five low rules, straight flushes are not recognized, and a hand that would be categorized as a straight flush is instead a high card hand. Question: is an ace 2 3 4 5 a straight?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A sequel, Allegiant, was released on March 18, 2016. Question: is there going to be another insurgent movie?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The bank robber Clyde Barrow modified his Browning A-5 shotgun by cutting the barrel down to the same length as the magazine tube, and shortening the stock by 5 to 6 inches (125 to 150 mm) to make it more concealable. A small, 10--12-inch (250--300 mm) strap was attached to both ends of the butt of the gun, and was looped around his shoulder, concealing the gun between his arm and chest under his jacket in the manner of a shoulder holster. The gun was drawn up quickly and fired from the shoulder under which it was carried. Barrow dubbed it the ``Whippit'', as he was able to ``whip it'' out easily. Question: can you cut the stock off a shotgun?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: An orphan (from the Greek: \u03bf\u03c1\u03c6\u03b1\u03bd\u03cc\u03c2 orphan\u00f3s) is someone whose parents have died, are unknown, or have permanently abandoned them. Question: do your parents have to be dead to be an orphan?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is no single correct answer to this question because it is a matter of opinion. Some people may consider a person an orphan if their parents have died, some may consider it if their parents have permanently abandoned them, and some may consider it if their parents are unknown. Therefore the score is: 60"
    },
    {
        "question": "Passage: In the United States education system, social studies is the integrated study of multiple fields of social science and the humanities, including history, geography, and political science. The term was first coined by American educators around the turn of the twentieth century as a catch-all for these subjects, as well as others which did not fit into the traditional models of lower education in the United States, such as philosophy and psychology. Question: is social studies and geography the same thing?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Heraldry in the United States was first established by European settlers who brought with them the heraldic customs of their respective countries of origin. As the use of coats of arms may be seen as a custom of royalty and nobility, it had been debated whether the use of arms is reconcilable with American republican traditions. Families from English, Scottish, Irish, Welsh, German, and other European nations with a heraldic tradition have retained their familial coat of arms in the United States. Several ``founding fathers'' also employed personal arms and a great number of Americans continue to do so. Question: can an american get a coat of arms?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Battlefield 1 received positive reviews by critics and was seen as an improvement over previous installments, Battlefield 4 and Battlefield Hardline. Most of the praise was directed towards its World War I theme, single player campaign, multiplayer modes, visuals, and sound design. Question: does battlefield 1 have a single player campaign?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Gopher is a fictional grey anthropomorphic gopher character who first appeared in the 1966 Disney animated film Winnie the Pooh and the Honey Tree, introducing himself as Samuel J. Gopher. He has a habit of whistling out his sibilant consonants, one of various traits he has in common with the beaver in Lady and the Tramp, by whom he may have been inspired. While he never made appearances in any episodes of Welcome to Pooh Corner, Gopher was fleshed out a bit further in the television series The New Adventures of Winnie the Pooh. He is portrayed as generally hard-working, especially in his tunnels (which he inevitably falls into at least once). He does not appear in the original books Winnie the Pooh and The House at Pooh Corner by A.A. Milne until 1966 (a fact that is regularly pointed out in Winnie the Pooh and the Honey Tree, when he breaks the fourth wall by saying he's ``not in the book, y'know'', also trying to say that he would not be in a phone book). Gopher's voice was originally done by Howard Morris, who retired from the role and was replaced by Michael Gough. Question: is there a gopher in winnie the pooh?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The Czech Republic is bound to adopt the euro in the future and to join the eurozone once it has satisfied the euro convergence criteria by the Treaty of Accession since it joined the European Union (EU) in 2004. The Czech Republic is therefore a candidate for the enlargement of the eurozone and it uses the Czech koruna as its currency, regulated by the Czech National Bank, a member of the European System of Central Banks, and does not participate in European Exchange Rate Mechanism II (ERM II). Question: can you use the euro in the czech republic?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Following Arnie's 18th birthday party, Bonnie climbs the stairs to her bedroom for the first time since her husband's suicide. Arnie later tries to wake her but discovers that she has died. The children, not willing to let their mother become the joke of the town by having her corpse lifted from the house by crane, empty their family home of possessions and set it on fire. A year later, Gilbert describes what happened to his family after his mother's death, as Gilbert and his brother Arnie wait by the side of a road for Becky, who arrives with her grandmother, and picks them up. Question: does arnie die in what eating gilbert grape?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Outside the context of lawful detention or arrest, a person has no duty to answer any questions of the police. If judicial compulsion is sought by the State, the person can still invoke his or her Fifth Amendment right against compulsory self-incrimination, and refuse to testify if answers to questions posed are potentially self-incriminating. Only if granted immunity by the state, in a formal proceeding, from having any testimony or evidence derived from the testimony used against him or her, can a person be compelled to answer over an assertion of this right. If police detain (or arrest) a person, they must advise him or her that he or she has a right to remain silent, and the right to an attorney, among other rights. (This is known as the Miranda warning.) If the detained person invokes these rights, all interrogation must cease, and ordinarily nothing said by the defendant in violation of this rule may be admitted against him or her at trial. Question: do you have the right to remain silent before arrest?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. The passage states that a person has the right to remain silent outside the context of lawful detention or arrest. Therefore, the score is: 50"
    },
    {
        "question": "Passage: The Tampa Spartans football program was an intercollegiate American football team for the University of Tampa located in Tampa, Florida. The team competed in the NCAA Division I as an independent. The school's first football team was fielded in 1933. The football program was discontinued at the conclusion of the 1974 season. Question: does university of tampa have a football team?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Cement plaster is a mixture of suitable plaster, sand, portland cement and water which is normally applied to masonry interiors and exteriors to achieve a smooth surface. Interior surfaces sometimes receive a final layer of gypsum plaster. Walls constructed with stock bricks are normally plastered while face brick walls are not plastered. Various cement-based plasters are also used as proprietary spray fireproofing products. These usually use vermiculite as lightweight aggregate. Heavy versions of such plasters are also in use for exterior fireproofing, to protect LPG vessels, pipe bridges and vessel skirts. Question: can i mix cement and plaster of paris?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Articles of Confederation contain a preamble, thirteen articles, a conclusion, and a signatory section. The individual articles set the rules for current and future operations of the confederation's central government. Under the Articles, the states retained sovereignty over all governmental functions not specifically relinquished to the national Congress, which was empowered to make war and peace, negotiate diplomatic and commercial agreements with foreign countries, and to resolve disputes between the states. The document also stipulates that its provisions ``shall be inviolably observed by every state'' and that ``the Union shall be perpetual''. Question: did articles of confederation have an executive branch?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Per capita income is often used to measure an area's average income. This is used to see the wealth of the population with those of others. Per capita income is often used to measure a country's standard of living. It is usually expressed in terms of a commonly used international currency such as the euro or United States dollar, and is useful because it is widely known, is easily calculable from readily available gross domestic product (GDP) and population estimates, and produces a useful statistic for comparison of wealth between sovereign territories. This helps to ascertain a country's development status. It is one of the three measures for calculating the Human Development Index of a country. Question: is income per capita the same as gdp per capita?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The UNHRC was established by the UN General Assembly on March 15, 2006 (by resolution A/RES/60/251) to replace the UN Commission on Human Rights (UNCHR, herein CHR) that had been strongly criticised for allowing countries with poor human rights records to be members. UN Secretaries General Kofi Annan and Ban Ki-moon, former president of the council Doru Costea, the European Union, Canada, and the United States have accused the UNHRC of focusing disproportionately on the Israeli--Palestinian conflict, and many allege an anti-Israel bias- the Council has resolved more resolutions condemning Israel than the rest of the world combined. The UNHRC works closely with the Office of the High Commissioner for Human Rights (OHCHR) and engages the UN's special procedures. Question: is canada part of the human rights council?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A live-action film, with Reese Witherspoon playing Tinker Bell and Victoria Strouse writing the script, is in the works. Question: are there going to be more tinkerbell movies?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Admission to the bar in the United States is the granting of permission by a particular court system to a lawyer to practice law in that system. Each U.S state and similar jurisdiction (e.g., territories under federal control) has its own court system and sets its own rules for bar admission (or privilege to practice law), which can lead to different admission standards among states. In most cases, a person who is ``admitted'' to the bar is thereby a ``member'' of the particular bar. Question: can you be an attorney without taking the bar exam?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The movement is the tracing of the shape of a cross in the air or on one's own body, echoing the traditional shape of the cross of the Christian crucifixion narrative. There are two principal forms: one--three fingers, right to left--is exclusively used in the Eastern Orthodox Church, Church of the East and the Eastern Catholic Churches in the Byzantine, Assyrian and Chaldean traditions; the other--left to right to middle, other than three fingers--is the one used in the Latin (Catholic) Church, Anglicanism, Methodism, Presbyterianism, Lutheranism and Oriental Orthodoxy. The ritual is rare within other Christian traditions. Question: do church of england make the sign of the cross?",
        "pred_ans": " True",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Gulf of Mexico (Spanish: Golfo de M\u00e9xico) is an ocean basin and a marginal sea of the Atlantic Ocean, largely surrounded by the North American continent. It is bounded on the northeast, north and northwest by the Gulf Coast of the United States, on the southwest and south by Mexico, and on the southeast by Cuba. The U.S. states of Texas, Louisiana, Mississippi, Alabama, and Florida border the Gulf on the north, which are often referred to as the ``Third Coast'', in comparison with the U.S. Atlantic and Pacific coasts. Question: is the gulf of mexico considered an ocean?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Canada Dry is a brand of soft drinks owned since 2008 by the American Keurig Dr Pepper. For over a century, Canada Dry has been known for its ginger ale, though the company also manufactures a number of other soft drinks and mixers. Although Canada Dry originated in Canada, it is now produced in many countries around the globe, including the United States, Mexico, Colombia, Peru, Japan, & a number of countries of the Middle East & Europe. Question: is canada dry ginger ale made in canada?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passage states that Canada Dry originated in Canada, but it does not specify whether the ginger ale is still made in Canada. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Most modern libraries replace C strings with a structure containing a 32-bit or larger length value (far more than were ever considered for length-prefixed strings), and often add another pointer, a reference count, and even a NUL to speed up conversion back to a C string! Memory is far larger now, such that if the addition of 3 (or 16, or more) bytes to each string is a real problem the software will have to be dealing with so many small strings that some other storage method will save even more memory (for instance there may be so many duplicates that a hash table will use less memory). Examples include the C++ Standard Template Library std::string , the Qt QString , the MFC CString , and the C-based implementation CFString from Core Foundation as well as its Objective-C sibling NSString from Foundation, both by Apple. More complex structures may also be used to store strings such as the rope. Question: do c++ strings need to be null terminated?",
        "pred_ans": " False",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    }
]