[
    {
        "question": "Passage: The standard error (SE) of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution or an estimate of that standard deviation of estimate. If the parameter or the statistic is the mean, it is called the standard error of the mean (SEM). Question: is the standard error the same as standard deviation?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: In 1998 EchoStar purchased the broadcasting assets of a satellite broadcasting joint venture of News Corporation and MCI Worldcom, called ASkyB (for American Sky Broadcasting, named after News Corp's BSkyB service in Britain); the two companies had nearly merged (which called for Dish Network being renamed Sky) before it was called off due to Charlie Ergen's clashes with News Corp. executives. With this purchase EchoStar obtained 28 of the 32 transponder licenses in the 110\u00b0 West orbital slot, more than doubling existing continental United States broadcasting capacity at a value of $682.5 million; some of the other assets were picked up by rival PrimeStar, which was sold to DirecTV in 1999. The acquisition (which also included an uplink center in Gilbert, Arizona) inspired the company to introduce a multi satellite system called Dish 500, theoretically capable of receiving more than 500 channels on one Dish. In the same year, EchoStar, partnering with Bell Canada, launched Dish Network Canada. Question: is directv and dish network owned by the same company?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Fullmetal Alchemist was adapted into two anime series for television: a loose adaptation titled Fullmetal Alchemist in 2003--2004, and a more faithful 2009--2010 retelling titled Fullmetal Alchemist: Brotherhood. Question: is fullmetal alchemist brotherhood a continuation of the original?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: In mathematics, a divisor of an integer n (\\displaystyle n) , also called a factor of n (\\displaystyle n) , is an integer m (\\displaystyle m) that may be multiplied by some integer to produce n (\\displaystyle n) . In this case, one also says that n (\\displaystyle n) is a multiple of m . (\\displaystyle m.) An integer n (\\displaystyle n) is divisible by another integer m (\\displaystyle m) if m (\\displaystyle m) is a divisor of n (\\displaystyle n) ; this implies dividing n (\\displaystyle n) by m (\\displaystyle m) leaves no remainder. Question: is a divisor the same as a factor?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The United States men's national soccer team is controlled by the United States Soccer Federation and competes in the Confederation of North, Central American and Caribbean Association Football. The team has appeared in ten FIFA World Cups, including the first in 1930, where they reached the semi-finals. The U.S. participated in the 1934 and 1950 World Cups, winning 1--0 against England in the latter. After 1950, the U.S. did not qualify for the World Cup until 1990. The U.S. hosted the 1994 World Cup, where they lost to Brazil in the round of sixteen. They qualified for five more consecutive World Cups after 1990 (for a total of seven straight appearances, a feat shared with only seven other nations), becoming one of the tournament's regular competitors and often advancing to the knockout stage. The U.S. reached the quarter-finals of the 2002 World Cup, where they lost to Germany. In the 2009 Confederations Cup, they eliminated top-ranked Spain in the semi-finals before losing to Brazil in the final, their only appearance in a final. The team failed to qualify for the 2018 World Cup, having been eliminated in continental qualifying, ending the streak of consecutive World Cups at seven. Question: does america have a team in the world cup?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Greece in the Roman era describes the period of Greek history when it was dominated by the Roman republic, the Roman Empire and the Byzantine Empire (collectively, the Roman era). It began with the Roman victory over the Corinthians, at the Battle of Corinth (146 BC). It continued with the adoption of the city of Byzantium by the Emperor Constantine the Great as the capital of the Roman Empire (as Nova Roma, later Constantinople) in AD 330. After this date, the Eastern Empire became largely Greek speaking. Question: were greece and rome around at the same time?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: While the term was in use as early as 1933, it became official only after the formation of the NCAA Division I athletic conference in 1954. Seven of the eight schools were founded during the colonial period (Cornell was founded in 1865). Ivy League institutions account for seven of the nine Colonial Colleges chartered before the American Revolution; the other two are Rutgers University and the College of William & Mary. Question: is college of william and mary an ivy league school?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The United Kingdom European Union membership referendum, also known as the EU referendum and the Brexit referendum, took place on 23 June 2016 in the United Kingdom (UK) and Gibraltar to gauge support for the country either remaining a member of, or leaving, the European Union (EU) under the provisions of the European Union Referendum Act 2015 and also the Political Parties, Elections and Referendums Act 2000. The referendum resulted in a simple majority of 51.9% (of people who voted) being in favour of leaving the EU. Although legally the referendum was non-binding, the government of that time had promised to implement the result, and it initiated the official EU withdrawal process on 29 March 2017, which put the UK on course to leave the EU by 30 March 2019, after a period of Brexit negotiations. Question: is britain still a member of european union?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Ordinarily, a baseball game consists of nine innings (in softball and high school baseball games there are typically seven innings; in Little League, six), each of which is divided into halves: the visiting team bats first, after which the home team takes its turn at bat. However, if the score remains tied at the end of the regulation number of complete innings, the rules provide that ``play shall continue until (1) the visiting team has scored more total runs than the home team at the end of a completed inning; or (2) the home team scores the winning run in an uncompleted inning.'' Question: can an mlb game end in a tie?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Six episodes of the first season premiered on May 5, 2017. The series was renewed for a second season and it premiered on September 8, 2017. The series was renewed for a third season and it premiered on November 17, 2017. The series was renewed for a fourth season and it premiered on March 16, 2018. A fifth season of the show was released on Netflix on May 11, 2018. A sixth season of the show was released on Netflix on August 17, 2018. Question: will there be more episodes of spirit riding free?",
        "pred_ans": " Based on the passage, it is unknown if there will be more episodes of Spirit Riding Free, as the passage does not mention any further seasons or releases after the sixth season in August 2018. Therefore, the answer is False.",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: In Mexico, Belgium, Germany and Austria, the philosophy of the law holds that it is human nature to want to escape. In those countries, escapees who do not break any other laws are not charged for anything and no extra time is added to their sentence. However, in Mexico, officers are allowed to shoot prisoners attempting to escape and an escape is illegal if violence is used against prison personnel or property, or if prison inmates or officials aid the escape. Question: is it legal to escape from prison in germany?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: There is no offside offence if a player receives the ball directly from a goal kick, a corner kick, a throw-in, or a dropped-ball. It is also not an offence if the ball was last deliberately played by an opponent (except for a deliberate save). In this context, according to the IFAB, ``A 'save' is when a player stops, or attempts to stop, a ball which is going into or very close to the goal with any part of the body except the hands/arms (unless the goalkeeper within the penalty area).'' Question: can there be an offside offence if a player received the ball directly from a dropped ball?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: In order to recover construction and operating costs, the bridge was electronically tolled when originally built. The toll rates increased to $1.60 for motorcycle, $3.15 for cars, $6.30 for small trucks and $9.45 for large trucks on August 15, 2015. Through increased prices and greater traffic, Transportation Investment Corporation (TI Corp), the public Crown corporation responsible for toll operations on the Port Mann Bridge, forecast its revenue would grow by 85% between fiscal years 2014 and 2017. These fees were assessed using radio-frequency identification (RFID) decals or licence plate photos. A B.C. licensed driver who owes more than $25 in tolls outstanding 90 days is penalized $20 and is unable to purchase vehicle insurance or renew drivers permits without payment of the debt. Out-of-province drivers were also contacted for payment by a US-based contractor. A licence plate processing fee of $2.30 per trip was added to the toll rate for unregistered users who did not pay their toll within seven days of their passage. Monthly passes, which allowed unlimited crossing on the bridge, were available for purchase. Users may have set up an account for online payment of tolls. Users who opted for this method received a decal with an embedded RFID to place on their vehicle's windshield or headlight and avoid paying a processing fee. Tolls were expected to be removed by the year of 2050 or after collecting $3.3 billion. As announced by B.C. Premier John Horgan a few days earlier, all tolls on the Port Mann Bridge were removed on September 1, 2017. Debt service was transferred to the province of British Columbia at a cost of $135 million per year. Question: is the port mann bridge a toll bridge?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: The Becket controversy or Becket dispute was the quarrel between Thomas Becket, the Archbishop of Canterbury, and King Henry II of England, from 1163 to 1170. The controversy culminated with Becket's murder in 1170, and was followed by Becket's canonization in 1173 and Henry's public penance at Canterbury in July 1174. Question: has the long exile of the archbishop ended?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: This has led to categorizing trucks similarly, even if their payload is different. Therefore, the Toyota Tacoma, Dodge Dakota, Ford Ranger, Honda Ridgeline, Chevrolet S-10, GMC S-15 and Nissan Frontier are called quarter-tons (1\u20444-ton). The Ford F-150, Chevrolet C10/K10, Chevrolet/GMC 1500, Dodge 1500, Toyota Tundra, and Nissan Titan are half-tons (1\u20442-ton). The Ford F-250, Chevrolet C20/K20, Chevrolet/GMC 2500, and Dodge 2500 are three-quarter-tons (3\u20444-ton). Chevrolet/GMC's 3\u20444-ton suspension systems were further divided into light and heavy-duty, differentiated by 5-lug and 6 or 8-lug wheel hubs depending on year, respectively. The Ford F-350, Chevrolet C30/K30, Chevrolet/GMC 3500, and Dodge 3500 are one tons (1-ton). Question: is a nissan frontier a 1/2 ton truck?",
        "pred_ans": " No, the Nissan Frontier is a quarter-ton (1\u20444-ton) truck.",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The Chernobyl Nuclear Power Plant or Chernobyl Nuclear Power Station (Ukrainian: \u0427\u043e\u0440\u043d\u043e\u0431\u0438\u043b\u044c\u0441\u044c\u043a\u0430 \u0430\u0442\u043e\u043c\u043d\u0430 \u0435\u043b\u0435\u043a\u0442\u0440\u043e\u0441\u0442\u0430\u043d\u0446\u0456\u044f, Chornobyls'ka Atomna Elektrostantsiya, Russian: \u0427\u0435\u0440\u043d\u043e\u0431\u044b\u043b\u044c\u0441\u043a\u0430\u044f \u0410\u042d\u0421, Chernobyl'skaya AES) is a decommissioned nuclear power station near the city of Pripyat, Ukraine, 14.5 km (9.0 mi) northwest of the city of Chernobyl, 16 km (9.9 mi) from the Belarus--Ukraine border, and about 110 km (68 mi) north of Kiev. Reactor No. 4 was the site of the Chernobyl disaster in 1986 and the power plant is now within a large restricted area known as the Chernobyl Exclusion Zone. Both the zone and the former power plant are administered by the State Agency of Ukraine of the Exclusion Zone (Ministry of Ecology and Natural Resources). All four reactors have been shut down. Question: is the chernobyl nuclear power plant still operating?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: North America covers an area of about 24,709,000 square kilometers (9,540,000 square miles), about 16.5% of the earth's land area and about 4.8% of its total surface. North America is the third largest continent by area, following Asia and Africa, and the fourth by population after Asia, Africa, and Europe. In 2013, its population was estimated at nearly 579 million people in 23 independent states, or about 7.5% of the world's population, if nearby islands (most notably the Caribbean) are included. Question: is the caribbean in north america or south america?",
        "pred_ans": " False. The Caribbean is located in the North Atlantic Ocean, south of Florida in the United States, and it is considered part of North America.",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: In all criminal prosecutions, the accused shall enjoy the right to a speedy and public trial, by an impartial jury of the State and district wherein the crime shall have been committed, which district shall have been previously ascertained by law, and to be informed of the nature and cause of the accusation; to be confronted with the witnesses against him; to have compulsory process for obtaining witnesses in his favor, and to have the Assistance of Counsel for his defence. Question: is the right to a fair trial in the constitution?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: In the United Kingdom, there is not an equivalent of a vehicle title. Instead, there is a document known as the 'vehicle registration document', and is issued by the Driver and Vehicle Licensing Agency (DVLA). The current version has the reference number V5C. Prior to computerisation, the title document was the 'log book', and this term is sometimes still used to describe the V5C. The V5 document records who the Registered Keeper of the vehicle is; it does not establish legal ownership of the vehicle. These documents used to be blue on the front. However, they were changed to red in 2010/11 after approximately 2.2 million blank blue V5 documents were stolen, allowing thieves to clone stolen vehicles much more easily. Question: is a title and registration the same thing?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The hosts of the World Cup receive an automatic berth. Unlike many other sports, results of the previous World Cups or of the continental championships are not taken into account. Until 2002, the defending champions also received an automatic berth, but starting from the 2006 World Cup this is no longer the case. Question: does host country have to qualify for world cup?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: On December 2, 2011, Universal Orlando Resort announced that the Jaws attraction along with the entire Amity area of Universal Studios Florida would close permanently on January 2, 2012 to ``make room for an exciting, NEW, experience.'' (the second phase of The Wizarding World Of Harry Potter.) severe backlash followed after the announcement. The attraction officially closed on January 2, 2012 at 9:00 pm with Michael Skipper aka ``Skip'' giving the final voyage to the last lucky group of 48 guests. By the next morning, the entire Amity area was walled off and completely demolished in the following months. The hanging shark statue from the town square remains as a tribute to the ride and can be found in the Fisherman's Wharf area of the San Francisco section of the park. The attraction remains open at Universal Studios Japan as well as the original tram stop at Universal Studios Hollywood. Question: is the jaws ride at universal orlando closed?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The subsequent trophy, called the ``FIFA World Cup Trophy'', was introduced in 1974. Made of 18 carat gold with bands of malachite on its base, it stands 36.8 centimetres high and weighs 6.1 kilograms. The trophy was made by Stabilimento Artistico Bertoni company in Italy. It depicts two human figures holding up the Earth. The current holders of the trophy are France, winners of the 2018 World Cup. Question: does the world cup trophy have a name?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: There is no offside offence if a player receives the ball directly from a goal kick, a corner kick, a throw-in, or a dropped-ball. It is also not an offence if the ball was last deliberately played by an opponent (except for a deliberate save). In this context, according to the IFAB, ``A 'save' is when a player stops, or attempts to stop, a ball which is going into or very close to the goal with any part of the body except the hands/arms (unless the goalkeeper within the penalty area).'' Question: can you be offside on a corner kick?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Destin--Fort Walton Beach Airport (IATA: VPS, ICAO: KVPS, FAA LID: VPS) is an airport located within Eglin Air Force Base, near Destin and Fort Walton Beach in Okaloosa County, Florida. No private aircraft are allowed, so Destin Executive Airport is used instead for non-commercial operations by general aviation and business aircraft. The airport was previously named Northwest Florida Regional Airport until February 17, 2015 and Okaloosa Regional Airport until September 2008. Question: is there an airport in fort walton beach florida?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: On 28 September 2016, Nine renewed the program for a second season after just two episodes having been aired. On 11 October 2017, the series was renewed for a third season at Nine's upfronts. and premiered on Monday, 6 August 2018, instead of the previous Wednesday night slot. On 17 October 2018 the series was renewed for a fourth season. Question: is there a fourth season of doctor doctor?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Before the Civil War, President James Buchanan took a weak position amid a looming South secession crisis. Secretary of State Lewis Cass of Michigan, a 78-year-old elder statesman who has been Michigan's U.S. senator and governor of Michigan Territory, resigned from Buchanan's cabinet in protest, remarking that ``he had seen the Constitution born and now feared he was seeing it die''. Question: was michigan a state during the civil war?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: National Car Rental is an American rental car agency based in Clayton, Missouri, United States. National is owned by Enterprise Holdings, along with other agencies including Enterprise Rent-A-Car, and Alamo Rent a Car. Question: is national and enterprise car rental the same company?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The United States Economic Census is the U.S. federal government's official five-year measure of American business and the economy. It is conducted by the U.S. Census Bureau, and response is required by law. Forms go out to nearly 4 million businesses, including large, medium and small companies representing all U.S. locations and industries. Respondents are asked to provide a range of operational and performance data for their companies. Trade associations, chambers of commerce, and businesses use information from the economic census for economic development, business decisions, and strategic planning purposes. The next Economic Census will be conducted for the year ending December 2017. Question: are you required to complete the 2017 economic census?",
        "pred_ans": " False. As an AI language model, I am not a business or individual respondent required to complete the Economic Census. The passage states that the forms go out to nearly 4 million businesses, but I am not one of those businesses.",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: The NBA high school draftees are players who have been drafted to the National Basketball Association (NBA) straight out of high school without playing basketball at the collegiate level. The process of jumping directly from high school to the professional level is also known as going prep-to-pro. Since 2006, the practice of drafting high school players has been prohibited by the new collective bargaining agreement, which requires that players who entered the draft be 19 years of age and at least one year removed from high school. Contrary to popular belief, the player does not have to play at least a year in college basketball, as the player can choose to instead play in another professional league (like the NBA G League or especially somewhere overseas) like Brandon Jennings or Emmanuel Mudiay in Italy and China respectively, simply take the year off, such as the case with Mitchell Robinson, or even hold themselves back a year in high school before declaring for the draft, like with Satnam Singh Bhamara or Thon Maker. Question: can you go to nba out of high school?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: U.S. Marshals is a 1998 American action crime thriller film directed by Stuart Baird. The storyline was conceived from a screenplay written by Roy Huggins and John Pogue. The film is a spin-off to the 1993 motion picture The Fugitive, which in turn was based on the television series of the same name, created by Huggins. The story does not involve the character of Dr. Richard Kimble, portrayed by Harrison Ford in the initial film, but instead the plot centers on United States Deputy Marshal Sam Gerard, once again played by Tommy Lee Jones. The plot follows Gerard and his team as they pursue another fugitive, Mark Sheridan, played by Wesley Snipes, who attempts to escape government officials following an international conspiracy scandal. The cast features Robert Downey, Jr., Joe Pantoliano, Daniel Roebuck, Tom Wood, and LaTanya Richardson, several of whom portrayed Deputy Marshals in the previous film. Question: is us marshals a sequel to the fugitive?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: A cordon bleu or schnitzel cordon bleu is a dish of meat wrapped around cheese (or with cheese filling), then breaded and pan-fried or deep-fried. Veal or pork cordon bleu is made of veal or pork pounded thin and wrapped around a slice of ham and a slice of cheese, breaded, and then pan fried or baked. For chicken cordon bleu chicken breast is used instead of veal. Ham cordon bleu is ham stuffed with mushrooms and cheese. Question: is chicken cordon bleu made with blue cheese?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There is no mention of blue cheese in the passage, and the description of chicken cordon bleu does not include any blue cheese as an ingredient. Therefore, the score is: 100"
    },
    {
        "question": "Passage: The natural logarithm of a number is its logarithm to the base of the mathematical constant e, where e is an irrational and transcendental number approximately equal to 7000271828182845899\u26602.718281828459. The natural logarithm of x is generally written as ln x, log x, or sometimes, if the base e is implicit, simply log x. Parentheses are sometimes added for clarity, giving ln(x), log(x) or log(x). This is done in particular when the argument to the logarithm is not a single symbol, to prevent ambiguity. Question: is log x the same as ln x?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The group winners, Serbia, qualified directly for the 2018 FIFA World Cup. The group runners-up, Republic of Ireland, advanced to the play-offs as one of the best 8 runners-up, where they lost to Denmark and thus failed to qualify. Question: has ireland qualified for the world cup 2018?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: The groundhog (Marmota monax), also known as a woodchuck, is a rodent of the family Sciuridae, belonging to the group of large ground squirrels known as marmots. It was first scientifically described by Carl Linnaeus in 1758. The groundhog is also referred to as a chuck, wood-shock, groundpig, whistlepig, whistler, thickwood badger, Canada marmot, monax, moonack, weenusk, red monk and, among French Canadians in eastern Canada, siffleux. The name ``thickwood badger'' was given in the Northwest to distinguish the animal from the prairie badger. Monax was a Native American name of the woodchuck, which meant ``the digger''. Young groundhogs may be called chucklings. Other marmots, such as the yellow-bellied and hoary marmots, live in rocky and mountainous areas, but the groundhog is a lowland creature. It is found through much of the eastern United States across Canada and into Alaska Question: is a woodchuck and a ground hog the same thing?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The building used in the show for the firehouse exteriors is a working Chicago Fire Department firehouse, and is the headquarters of Engine 18, located at 1360 South Blue Island Avenue at Maxwell Street, between 13th & Racine. Housed here is ALS Engine 18, 2--2--1 (Deputy District Chief -- 1st District), 2--1--21 (1st District Chief), 6--4--16 (High-Rise Response Unit), and ALS Ambulance 65. The interiors of Firehouse 51 are filmed at Cinespace Chicago Film Studios. The station house used for exteriors in Chicago PD is just a few blocks away at 949 West Maxwell Street at Morgan Street (interiors likewise filmed at Cinespace). Question: is chicago fire filmed in a real firehouse?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: In addition to being one of the top six international marathons run over the distance of 26 mi 385 yd (42.195 km), the IAAF standard for the marathon established in 1921 and originally used for the 1908 London Olympics, the London Marathon is also a large, celebratory sporting festival, third in England only to the Great North Run in Newcastle upon Tyne and Great Manchester Run in Manchester in terms of the number of participants . The event has raised over \u00a3450 million for charity since 1981, and holds the Guinness world record as the largest annual fund raising event in the world, with the 2009 participants raising over \u00a347.2 million for charity. In 2007, 78% of all runners raised money. In 2011 the official charity of the London Marathon was Oxfam. In 2014, the official charity was Anthony Nolan, and in 2015, it was Cancer Research UK. Question: is the london marathon the largest in the world?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: There are eleven official languages of South Africa: Afrikaans, English, Ndebele, Northern Sotho, Sotho, SiSwati, Tsonga, Tswana, Venda, Xhosa and Zulu. Fewer than two percent of South Africans speak a first language other than an official one. Most South Africans can speak more than one language. Dutch and English were the first official languages of South Africa from 1910 to 1925. Afrikaans was added as a part of Dutch in 1925, although in practice, Afrikaans effectively replaced Dutch, which fell into disuse. When South Africa became a republic in 1961, the official relationship changed such that Afrikaans was considered to include Dutch, and Dutch was dropped in 1984, so between 1984 and 1994, South Africa had two official languages: English and Afrikaans. Question: is english the official language of south africa?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Eastbound vehicles must pay a toll to cross the bridge; as with all Hudson River crossings along the North River, westbound vehicles cross for free. As of December 6, 2015, the cash tolls going from New Jersey to New York are $15 for both cars and motorcycles. E-ZPass users are charged $10.50 for cars and $9.50 for motorcycles during off-peak hours, and $12.50 for cars and $11.50 for motorcycles during peak hours. Trucks are charged cash tolls of $20.00 per axle, with discounted peak, off-peak, and overnight E-ZPass tolls. A discounted carpool toll ($6.50) is available at all times for cars with three or more passengers using NY or NJ E-ZPass, who proceed through a staffed toll lane (provided they have registered with the free ``Carpool Plan''). There is an off-peak toll of $7.00 for qualified low-emission passenger vehicles, which have received a Green E-ZPass based on registering for the Port Authority Green Pass Discount Plan. Question: is george washington bridge a one way toll?",
        "pred_ans": " False. The passage states that eastbound vehicles must pay a toll, but westbound vehicles cross for free.",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Although it is widely believed that a worker honey bee can sting only once, this is a partial misconception: although the stinger is in fact barbed so that it lodges in the victim's skin, tearing loose from the bee's abdomen and leading to its death in minutes, this only happens if the skin of the victim is sufficiently thick, such as a mammal's. Honey bees are the only hymenoptera with a strongly barbed sting, though yellow jackets and some other wasps have small barbs. Question: is there always a stinger in a bee sting?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: It is possible to have extra, or ``supernumerary,'' teeth. This phenomenon is called hyperdontia and is often erroneously referred to as ``a third set of teeth.'' These teeth may erupt into the mouth or remain impacted in the bone. Hyperdontia is often associated with syndromes such as cleft lip and palate, trichorhinophalangeal syndrome, cleidocranial dysplasia, and Gardner's syndrome. Question: can you get a 3rd set of teeth?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The ``Little House'' Books is a series of American children's novels written by Laura Ingalls Wilder, based on her childhood and adolescence in the American Midwest (Wisconsin, Kansas, Minnesota, South Dakota, and Missouri) between 1870 and 1894. Eight of the novels were completed by Wilder, and published by Harper & Brothers. The appellation ``Little House'' books comes from the first and third novels in the series of eight published in her lifetime. The second novel was about her husband's childhood. The first draft of a ninth novel was published posthumously in 1971 and is commonly included in the series. Question: is the little house on the prairie fiction?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Turtle may either refer to the order as a whole, or to particular turtles that make up a form taxon that is not monophyletic, or may be limited to only aquatic species. Tortoise usually refers to any land-dwelling, non-swimming chelonian. Terrapin is used to describe several species of small, edible, hard-shell turtles, typically those found in brackish waters. Question: is a turtle the same as a tortoise?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: We Bought a Zoo is a 2011 American family comedy-drama film loosely based on the 2008 memoir of the same name by Benjamin Mee. It was written and directed by Cameron Crowe and stars Matt Damon as widowed father Benjamin Mee, who purchases a dilapidated zoo with his family and takes on the challenge of preparing the zoo for its reopening to the public. The film also stars Scarlett Johansson, Maggie Elizabeth Jones, Thomas Haden Church, Patrick Fugit, Elle Fanning, Colin Ford, and John Michael Higgins. The film was released in the United States on December 23, 2011 by 20th Century Fox. The film earned $120.1 million on a $50 million budget. We Bought a Zoo was released on DVD and Blu-ray on April 3, 2012 by 20th Century Fox Home Entertainment. Dartmoor Zoological Park (originally Dartmoor Wildlife Park), on which the film is based, is a 33-acre zoological garden located near the village of Sparkwell, Devon, England. Question: is we bought a zoo a true story?",
        "pred_ans": " Answer: False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Brazil is the most successful national team in the history of the World Cup, having won five titles, earning second-place, third-place and fourth-place finishes twice each. Brazil is one of the countries besides Argentina, Spain and Germany to win a FIFA World Cup away from its continent (Sweden 1958, Mexico 1970, USA 1994 and South Korea/Japan 2002). Brazil is the only national team to have played in all FIFA World Cup editions without any absence or need for playoffs. Brazil also has the best overall performance in World Cup history in both proportional and absolute terms with a record of 73 victories in 109 matches played, 124 goal difference, 237 points and only 18 losses. Question: has brazil ever won the world cup in europe?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There are a few mistakes in the student's answer. Brazil has won the World Cup in Europe, specifically in Sweden in 1958. Therefore, the score is: 50."
    },
    {
        "question": "Passage: Daisy Johnson, also known as Quake, is a fictional superhero appearing in American comic books published by Marvel Comics. Created by writer Brian Michael Bendis and artist Gabriele Dell'Otto, the character first appeared in Secret War #2 (July 2004). The daughter of the supervillain Mister Hyde, she is a secret agent of the intelligence organization S.H.I.E.L.D. with the power to generate earthquakes. Question: is daisy the director of shield in the comics?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: On July 28, 2017, the bill was returned to the calendar after the Senate rejected several amendments, including S.Amdt. 667, the ``Skinny Repeal'' package offered by Sen. Mitch McConnell, which failed on a 49--51 vote. Sens. John McCain, Susan Collins, and Lisa Murkowski were the only Republicans to vote against the measure. Question: did the american health care act pass the senate?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Although Karev and Stevens had an on-off romantic history in previous seasons, it was not until the fifth season that the two formed a lasting relationship. Despite Stevens' advanced skin cancer, the two marry. Justin Chambers commented on his character's inability to tell Stevens that he loves her in the beginning of their relationship, saying that Karev has difficulty expressing himself. Though Stevens departs in season 6, series creator Shonda Rhimes has said that she would like the chance to create closure for both Karev and Stevens. Rhimes later retracted her comments and stated that she has no plans to ever re-approach Izzie's storyline again. In the ninth season, Karev befriends intern Jo Wilson. They begin a relationship in the tenth season until their break-up in the twelfth season. However, they get back together again and marry in the fourteenth season. Question: does alex come back to grey sloan memorial?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: In chemistry and atomic physics, the main group is the group of elements whose lightest members are represented by helium, lithium, beryllium, boron, carbon, nitrogen, oxygen, and fluorine as arranged in the periodic table of the elements. The main group includes the elements (except hydrogen, which is sometimes not included) in groups 1 and 2 (s-block), and groups 13 to 18 (p-block). The s-block elements are primarily characterised by one main oxidation state, and the p-block elements, when they have multiple oxidation states, often have common oxidation states separated by two units. Question: is there a main group element in period 6?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The Nuggets finished the 2007--08 season with exactly 50 wins (50--32 overall record, tied for the third-best all-time Nuggets record since the team officially joined the NBA in 1976), following a 120--111 home victory over the Memphis Grizzlies in the last game of the season. It was the first time since the 1987--88 NBA season that the Nuggets finished with at least 50 wins in a season. Denver ended up as the 8th seed in the Western Conference of the 2008 Playoffs, and their 50 wins marked the highest win total for an 8th seed in NBA history. It also meant that for the first time in NBA history, all eight playoff seeds in a conference had at least 50 wins. The Nuggets faced the top-seeded Los Angeles Lakers (57--25 overall record) in the first round of the Playoffs. The seven games separating the Nuggets overall record and the Lakers overall record is the closest margin between an eighth seed and a top seed since the NBA went to a 16-team playoff format in 1983--84. The Lakers swept the Nuggets in four games, marking the second time in NBA history that a 50-win team was swept in a best-of-seven playoff series in the first round. For the series, Anthony averaged 22.5 ppg, 9.5 rpg (playoff career-high), 2.0 apg and 0.5 spg. Question: did carmelo anthony go to the western conference finals?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Nigella sativa (black caraway, also known as black cumin, nigella, and kalonji) is an annual flowering plant in the family Ranunculaceae, native to south and southwest Asia. Question: is black cumin seed same as black seed?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The cremaster muscle is a muscle that covers the testis and the spermatic cord. Question: is the cremaster muscle in the spermatic cord?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The game features a nonlinear story, that follows the unnamed player character as they go to become a racing icon in the United States by winning in all racing disciplines available in the game. There are four disciplines: Street Racing, Off Road, Freestyle and Pro Racing. In Street Racing, the player is assisted by Latrell. In Off Road, the player is assisted by Tucker ``Tuck'' Morgan. In Freestyle, the player is assisted by Sofia and her father. In Pro Racing, the player is assisted by Alexis. Question: does the crew 2 have a story line?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Mycosis fungoides was first described in 1806 by French dermatologist Jean-Louis-Marc Alibert. The name mycosis fungoides is very misleading--it loosely means ``mushroom-like fungal disease''. The disease, however, is not a fungal infection but rather a type of non-Hodgkin's lymphoma. It was so named because Alibert described the skin tumors of a severe case as having a mushroom-like appearance. Question: is mycosis fungoides caused by a fungal infection?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The mishap in The Guardian where Randall loses his crew is loosely based on an actual U.S. Coast Guard aviation mishap in Alaska. The aircraft was an HH-3F Pelican (USCG variant of the Jolly Green Giant) instead of the HH-60J Jayhawk (USCG variant of the Blackhawk/Seahawk) pictured in the movie. Question: is the movie the guardian based on a true story?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Carbon (from Latin: carbo ``coal'') is a chemical element with symbol C and atomic number 6. It is nonmetallic and tetravalent--making four electrons available to form covalent chemical bonds. It belongs to group 14 of the periodic table. Three isotopes occur naturally, C and C being stable, while C is a radionuclide, decaying with a half-life of about 5,730 years. Carbon is one of the few elements known since antiquity. Question: is carbon a metal on the periodic table?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: In Little League, in the Tee-Ball and Minor League divisions, the batter is out after the third strike regardless of whether the pitched ball is caught cleanly by the catcher. In Little League (or the Major Division), Junior, Senior, and Big League divisions, a batter may attempt to advance to first base on an uncaught third strike. Little League Major Division Softball and many other youth baseball leagues (such as the USSSA) also follow the rule. Question: can you run on a dropped third strike in little league?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Ursa Major is primarily known from the asterism of its main seven relatively bright stars comprising the ``Big Dipper'', ``the Wagon'', ``Charles's Wain'' or ``the Plough'' (among others), with its stellar configuration mimicking the shape of the ``Little Dipper''. Question: is the big dipper the same as ursa major?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: ``Eye of the Tiger'' is a song by American rock band Survivor. It was released as a single from their third album of the same name Eye of the Tiger and was also the theme song for the film Rocky III, which was released a day before the single. The song was written by Survivor guitarist Frankie Sullivan and keyboardist Jim Peterik, and was recorded at the request of Rocky III star, writer, and director Sylvester Stallone, after Queen denied him permission to use ``Another One Bites the Dust'', the song Stallone intended as the Rocky III theme. Originally, the song was made for the movie The Karate Kid. The director of both Rocky and The Karate Kid planned to use the song for a fighting montage towards the end of the feature. John G. Avildsen opted to using ``You're the Best'' by Joe Esposito. The version of the song that appears in the movie is the demo version of the song. The movie version also contained tiger growls, something that did not appear on the album version. It features original Survivor singer Dave Bickler on lead vocals. Question: was eye of the tiger written for rocky?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Continuous positive airway pressure (CPAP) is a form of positive airway pressure ventilator, which applies mild air pressure on a continuous basis to keep the airways continuously open in people who are able to breathe spontaneously on their own. It is an alternative to positive end-expiratory pressure (PEEP). Both modalities stent the lungs' alveoli open and thus recruit more of the lung's surface area for ventilation. But while PEEP refers to devices tvt impose positive pressure only at the end of the exhalation, CPAP devices apply continuous positive airway pressure throughout the breathing cycle. Thus, the ventilator itself does not cycle during CPAP, no additional pressure above the level of CPAP is provided, and patients must initiate all of their breaths. Question: is a cpap the same as a ventilator?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Princeton Theological Seminary (PTS) is a private, nonprofit, and independent graduate school of theology in Princeton, New Jersey. Founded in 1812 under the auspices of Archibald Alexander, the General Assembly of the Presbyterian Church, and the College of New Jersey (now Princeton University), it is the second-oldest seminary in the United States. It is also the largest of ten seminaries associated with the Presbyterian Church (USA). Question: is princeton theological seminary part of princeton university?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Peptidoglycan, also known as murein, is a polymer consisting of sugars and amino acids that forms a mesh-like layer outside the plasma membrane of most bacteria, forming the cell wall. The sugar component consists of alternating residues of \u03b2-(1,4) linked N-acetylglucosamine (NAG) and N-acetylmuramic acid (NAM) . Attached to the N-acetylmuramic acid is a peptide chain of three to five amino acids. The peptide chain can be cross-linked to the peptide chain of another strand forming the 3D mesh-like layer. Peptidoglycan serves a structural role in the bacterial cell wall, giving structural strength, as well as counteracting the osmotic pressure of the cytoplasm. A common misconception is that peptidoglycan gives the cell its shape; however, whereas peptidoglycan helps maintain the structural strength of the cell, it is actually the MreB protein that facilitates cell shape. Peptidoglycan is also involved in binary fission during bacterial cell reproduction. Question: do all bacteria have peptidoglycan in their cell walls?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: A hydraulic fluid or hydraulic liquid is the medium by which power is transferred in hydraulic machinery. Common hydraulic fluids are based on mineral oil or water. Examples of equipment that might use hydraulic fluids are excavators and backhoes, hydraulic brakes, power steering systems, transmissions, garbage trucks, aircraft flight control systems, lifts, and industrial machinery. Question: is mineral oil the same as hydraulic oil?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: A 12-episode second season of The Affair premiered on October 4, 2015. On December 9, 2015, the series was renewed for a third season, which debuted on November 20, 2016. On January 9, 2017, Showtime renewed the series for a fourth season, which premiered on June 17, 2018. On July 26, 2018, Showtime announced it had renewed the series for a fifth and final season to debut in 2019. Question: will there be another season of the affiar?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: The island is dominated by a maritime climate with quite narrow temperature differences between seasons. Politically, Great Britain is part of the United Kingdom of Great Britain and Northern Ireland, and constitutes most of its territory. Most of England, Scotland, and Wales are on the island. The term ``Great Britain'' is often used to include the whole of England, Scotland and Wales including their component adjoining islands; and is also occasionally but contentiously applied to the UK as a whole in some contexts. Question: is northern ireland part of the great britain?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: On 13 July 1967, British cyclist Tom Simpson died climbing Mont Ventoux after taking amphetamine. Question: has anyone ever died doing the tour de france?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: The passport card is a limited travel document, valid only for land and sea travel within North America (Canada, the United States, Mexico, the Caribbean, and Bermuda). It cannot be used for international air travel. The Department of State indicates that this is because ``designing a card format passport for wide use, including by air travelers, would inadvertently undercut the broad based international effort to strengthen civil aviation security and travel document specifications to address the post 9/11 threat environment''. Question: can a passport card be used to fly to canada?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The subsequent trophy, called the ``FIFA World Cup Trophy'', was introduced in 1974. Made of 18 carat gold with a malachite base, it stands 36.8 centimetres high and weighs 6.1 kilograms. The trophy was made by Stabilimento Artistico Bertoni company in Italy. It depicts two human figures holding up the Earth. The current holders of the trophy are France, winners of the 2018 World Cup. Question: is the world cup made out of solid gold?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: While some gold medals are solid gold, others are gold-plated or silver-gilt, like those of the Olympic Games, the Lorentz Medal, the United States Congressional Gold Medal and the Nobel Prize medal. Nobel Prize medals consist of 18 karat green gold plated with 24 karat gold. Before 1980 they were struck in 23 karat gold. Question: is the olympic gold medal made of gold?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The New York metropolitan area, also referred to as the Tri-State Area, is the largest metropolitan area in the world by urban landmass, at 4,495 sq mi (11,640 km). The metropolitan area includes New York City (the most populous city in the United States), Long Island, and the Mid and Lower Hudson Valley in the state of New York; the five largest cities in New Jersey: Newark, Jersey City, Paterson, Elizabeth, and Edison, and their vicinities; six of the seven largest cities in Connecticut: Bridgeport, New Haven, Stamford, Waterbury, Norwalk, and Danbury, and their vicinities. Question: is new jersey a suburb of new york city?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The abdomen (less formally called the belly, stomach, tummy or midriff) constitutes the part of the body between the thorax (chest) and pelvis, in humans and in other vertebrates. The region occupied by the abdomen is termed the abdominal cavity. In arthropods it is the posterior tagma of the body; it follows the thorax or cephalothorax. The abdomen stretches from the thorax at the thoracic diaphragm to the pelvis at the pelvic brim. The pelvic brim stretches from the lumbosacral joint (the intervertebral disc between L5 and S1) to the pubic symphysis and is the edge of the pelvic inlet. The space above this inlet and under the thoracic diaphragm is termed the abdominal cavity. The boundary of the abdominal cavity is the abdominal wall in the front and the peritoneal surface at the rear. Question: is the abdomen the same as the stomach?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The British Isles are a group of islands in the North Atlantic off the north-western coast of continental Europe that consist of the islands of Great Britain, Ireland, the Isle of Man and over six thousand smaller isles. They have a total area of about 315,159 km and a combined population of just under 70 million, and include two sovereign states, the Republic of Ireland (which covers roughly five-sixths of the island of Ireland) and the United Kingdom of Great Britain and Northern Ireland. The islands of Alderney, Jersey, Guernsey and Sark, and their neighbouring smaller islands, are sometimes also taken to be part of the British Isles. Question: is southern ireland part of the british isles?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: All other nations have some form of legislation meant to prevent the illegal trading of organs, whether by an outright ban or through legislation that limits how and by whom donations can be made. Many countries, including Belgium and France, use a system of presumed consent to increase the amount of legal organs available for transplant. . In the United States, federal law prohibits the sale of organs; however, the government has created initiatives to encourage organ gifting and to compensate those who freely donate their organs. In 2004, the state of Wisconsin began providing tax deductions to living donors. Question: is it legal to sell human body parts?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Greece were subsequently drawn against Croatia in the play-off round, where they were knocked out over two legs; a 4--1 away defeat set the tone for Greece's campaign, and in the second leg they drew a blank in a 0--0 stalemate against the Croats to signify the end of their World Cup hopes. Kostas Mitroglou finished as Greece's top scorer throughout their campaign, scoring six goals. Question: is greece not in the world cup 2018?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Pierce's home, ``Port Waldo'', is (real-life) Waldoboro, Maine, and the FinestKind Clinic is just up U.S. Route 1 in Rockland. ``Crabapple Cove'' is actually Broad Cove, in Bremen just down the Medomak River from Waldoboro Village. Author Richard Hooker (Hornberger) owned an old farmhouse on Heath Point. The reader will note Wreck Island, Thief Island, and other Muscongus Bay landmarks in the book. It is possible that the Pierce family is modeled after the (real-life) Spear family, who had a number of different branches in the area, in the 1950s. Question: is there such a place as crabapple cove maine?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Friction tape is a type of adhesive tape made from cloth impregnated with a rubber-based adhesive, mainly used to insulate splices in electric wires and cables. Because the adhesive is impregnated in the cloth, friction tape is sticky on both sides. The rubber-based adhesive makes it an electrical insulator and provides a degree of protection from liquids and corrosion. In the past, friction tape was widely used by electricians, but PVC electrical tape has replaced it in most applications today. The frictional properties of the tape come from the cloth material, which is usually made from cotton, while the fabric base protects electrical splices against punctures and abrasion. Question: is friction tape the same as electrical tape?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Second only to members of the family Proteaceae, melaleucas are an important food source for nectarivorous insects, birds, and mammals. Many are popular garden plants, either for their attractive flowers or as dense screens; and a few have economic value for producing fencing and oils such as ``tea tree'' oil. Most melaleucas are endemic to Australia, with a few also occurring in Malesia. Seven are endemic to New Caledonia, and one is found only on (Australia's) Lord Howe Island. Melaleucas are found in a wide variety of habitats. Many are adapted for life in swamps and boggy places, while others thrive in the poorest of sandy soils or on the edge of saltpans. Some have a wide distribution and are common, whilst others are rare and endangered. Land clearing, exotic myrtle rust, and especially draining and clearing of swamps threaten many species. Question: is tea tree oil and melaluca the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Sanders played football primarily at cornerback, but also as a kick returner, punt returner, and occasionally wide receiver. He played in the National Football League (NFL) for the Atlanta Falcons, the San Francisco 49ers, the Dallas Cowboys, the Washington Redskins and the Baltimore Ravens, winning the Super Bowl with both the 49ers and the Cowboys. An outfielder in baseball, he played professionally for the New York Yankees, the Atlanta Braves, the Cincinnati Reds and the San Francisco Giants, and participated in the 1992 World Series with the Braves. He attended Florida State University, where he was recognized as a two-time All-American in football, and also played baseball and ran track. Question: did deion sanders ever win a world series?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: In the United Kingdom and the Crown dependencies, any household watching or recording live television transmissions as they are being broadcast (terrestrial, satellite, cable, or Internet) is required to hold a television licence. Businesses, hospitals, schools and a range of other organisations are also required to hold television licences to watch and record live TV broadcasts. A television licence is also required to receive video on demand programme services provided by the BBC, on the iPlayer catch-up service. Question: do you have to have a license to own a tv in england?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Usually, one of the first items in an order of business or an agenda for a meeting is the reading and approval of the minutes from the previous meeting. If the members of the group agree (usually by unanimous consent) that the written minutes reflect what happened at the previous meeting, then they are approved, and the fact of their approval is recorded in the minutes of the current meeting. If there are significant errors or omissions, then the minutes may be redrafted and submitted again at a later date. Minor changes may be made immediately using the normal amendment procedures, and the amended minutes may be approved ``as amended''. It is normally appropriate to send a draft copy of the minutes to all the members in advance of the meeting so that the meeting is not delayed by a reading of the draft. Question: do minutes of a meeting have to be approved?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: The Tribute in Light is an art installation of 88 searchlights placed six blocks south of the World Trade Center on top of the Battery Parking Garage in New York City to create two vertical columns of light to represent the Twin Towers in remembrance of the September 11, 2001 attacks. Tribute in Light began initially as a temporary commemoration of the attacks in early 2002 but became an annual commemoration, currently produced on September 11th by the Municipal Art Society of New York. Question: do they light up the twin towers every night?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: If ``infield fly'' is called and the fly ball is caught, it is treated exactly as an ordinary caught fly ball; the batter is out, there is no force, and the runners must tag up. On the other hand, if ``infield fly'' is called and the ball lands fair without being caught, the batter is still out, there is still no force, but the runners are not required to tag up. In either case, the ball is live, and the runners may advance on the play, at their own peril. Question: do you have to tag up on an infield fly rule?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: In no-limit and pot-limit games, there is a minimum amount that is required to be bet in order to open the action. In games with blinds, this amount is usually the amount of the big blind. Standard poker rules require that raises must be at least equal to the amount of the previous bet or raise. For example, if an opponent bets $5, a player must raise by at least another $5, and they may not raise by only $2. If a player raises a bet of $5 by $7 (for a total of $12), the next re-raise would have to be by at least another $7 (the previous raise) more than the $12 (for a total of at least $19). The primary purpose of the minimum raise rule is to avoid game delays caused by ``nuisance'' raises (small raises of large bets, such as an extra $1 over a current bet of $50, that have little effect on the action but take time as all others must call). This rule is overridden by table stakes rules, so that a player may in fact raise a $5 bet by $2 if that $2 is his entire remaining stake. Question: do you have to raise double in poker?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Before 1973, the NCAA's smaller schools were grouped together in the College Division. In 1973, the College Division split in two when the NCAA began using numeric designations for its competitions. The College Division members who wanted to offer athletic scholarships or compete against those who did became Division II, while those who chose not to offer athletic scholarships became Division III. Question: do ncaa division 2 schools offer athletic scholarships?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Permission to publicly perform a song must be obtained from the copyright holder or a collective rights organization. Question: can i perform a copyrighted song in public?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Miniature pig (also micro-pig, teacup pig, Michelle Davila, etc.) is an erroneous term that is used to refer to small breeds of domestic pig, such as Pot-bellied pigs, G\u00f6ttingen minipigs, Juliana pigs, Choctaw Hogs, or Kunekune (and specimens derived by cross-breeding with these). Notable features of most miniature pigs distinguishing them from other pigs may be defined by their possession of small, perked-back ears, a potbelly, sway back, chubby figure, rounded head, short snout, legs, and neck, and a short tail with thick hair at the end. Typically, most breeds of mini pigs will range from the minimum weight of 75 pounds (34 kg) to 200 pounds (91 kg). Question: is there such a thing as a miniature pig?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Canada is one of the oldest continuing monarchies in the world. Initially established in the 16th century, monarchy in Canada has evolved through a continuous succession of French and British sovereigns into the independent Canadian sovereigns of today, whose institution is sometimes colloquially referred to as the Maple Crown. Question: is canada still part of the british monarchy?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The National Minimum Drinking Age Act of 1984 (23 U.S.C. \u00a7 158) was passed by the United States Congress on July 17, 1984. It was a controversial bill that punished every state that allowed persons below 21 years to purchase and publicly possess alcoholic beverages by reducing its annual federal highway apportionment by 10 percent. The law was later amended, lowering the penalty to 8 percent from fiscal year 2012 and beyond. Question: is the legal drinking age a federal law?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Termination of employment, is an employee's departure from a job and the end of an employee's duration with an employer. Termination may be voluntary on the employee's part, or it may be at the hands of the employer, often in the form of dismissal (firing) or a layoff. Dismissal or firing is generally thought to be the fault of the employee, whereas a layoff is generally done for business reasons (for instance a business slowdown or an economic downturn) outside the employee's performance. Question: is employment termination the same as being fired?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Historically, tin and copper as well as a few other metals (e.g. arsenic, silver, and zinc) have been mined in Cornwall and Devon. As of 2007 there are no active metalliferous mines remaining. However, tin deposits still exist in Cornwall, and there has been talk of reopening the South Crofty tin mine. In addition, work has begun on re-opening the Hemerdon tungsten and tin mine in south-west Devon. In view of the economic importance of mines and quarries, geological studies have been conducted: about forty distinct minerals have been identified from type localities in Cornwall (e.g. endellionite from St Endellion). Quarrying of the igneous and metamorphic rocks has also been a significant industry. In the 20th century the extraction of kaolin was important economically. Question: are there any tin mines left in cornwall?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Six Flags Great America is an amusement park located in Gurnee, Illinois. Part of the Six Flags chain, Great America was first opened in 1976 by the Marriott Corporation as Marriott's Great America. Six Flags has owned and operated the park since 1984, making it the seventh park in the chain. The park offers ten themed areas, as well as Hurricane Harbor, a 20-acre (81,000 m) water park, and three specially themed children's areas. Question: is great america the same as six flags?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " \u9996\u5148\uff0c\u6211\u9700\u8981\u9605\u8bfb\u60a8\u63d0\u4f9b\u7684\u95ee\u9898\u548c\u5b66\u751f\u7684\u7b54\u6848\u3002\u7136\u540e\uff0c\u6211\u5c06\u6839\u636e\u95ee\u9898\u4e2d\u7684\u6587\u7ae0\u5224\u65ad\u5b66\u751f\u7684\u7b54\u6848\u662f\u5426\u6b63\u786e\uff0c\u5e76\u7ed9\u51fa\u76f8\u5e94\u7684\u5206\u6570\u3002\\n\\n\u95ee\u9898\uff1a```Six Flags Great America is an amusement park located in Gurnee, Illinois. Part of the Six Flags chain, Great America was first opened in 1976 by the Marriott Corporation as Marriott's Great America. Six Flags has owned and operated the park since 1984, making it the seventh park in the chain. The park offers ten themed areas, as well as Hurricane Harbor, a 20-acre (81,000 m) water park, and three specially themed children's areas. Question: is great america the same as six flags?```\\n\\n\u5b66\u751f\u7684\u7b54\u6848\uff1a```True```\\n\\n\u6839\u636e\u6587\u7ae0\u5185\u5bb9\uff0c\u516d\u65d7\u5927\u7f8e\u56fd\u662f\u4e00\u4e2a\u6e38\u4e50\u56ed\uff0c\u4f4d\u4e8e\u4f0a\u5229\u8bfa\u4f0a\u5dde\u683c\u5c3c\u5e02\uff0c\u5b83\u662f\u516d\u65d7\u94fe\u7684\u4e00\u90e8\u5206\u3002\u6587\u7ae0\u6ca1\u6709\u660e\u786e\u8bf4\u660e\u516d\u65d7\u5927\u7f8e\u56fd\u662f\u5426\u548c\u516d\u65d7\u662f\u4e00\u6837\u7684\uff0c\u56e0\u6b64\u6211\u65e0\u6cd5\u5224\u65ad\u5b66\u751f\u7684\u7b54\u6848\u662f\u5426\u6b63\u786e\u3002\\n\\n\u56e0\u6b64\uff0c\u5206\u6570\u4e3a\uff1a70\u3002"
    },
    {
        "question": "Passage: The original script by Ilya Tilkin does not have any literary source. The screenwriter studied diaries of the participants of the Battle of Stalingrad. He also used museum archives, documents and recorded stories of its participants. Question: is the movie stalingrad based on a true story?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Alex's past love interest. Though they had a very rough start from the beginning they eventually started to soften towards each other. Sean has a hard time getting Alex to bring down her walls. Alex herself tries not to get emotional and Sean sometimes gets Alex to recognize that he likes her. Despite Sean giving many signs of attraction to Alex, she either ignored them or she was oblivious to them since she was emotionally not ready to commit to a relationship with all the things happening in her life. He and Alex share a first kiss in a car with Birkoff driving and Ryan in the passenger seat. In the second-season finale, he tries to ask her out on a date four times, but Alex never lets him finish due to being in action, criticizing that he said that she was a goal, Alex passing out due to a broken arm and being electrocuted, and Nikita interrupting Sean right before he was going to ask Alex while she was in a Division medical facility. At the start of season 3, it appears Alex and Sean are in a relationship. However, by episode 3 it is revealed that Sean is only at Division for Alex, because he loved her. But Alex is at Division because it is the only place she knows as home, where she can be herself, and where her ``family'' (Nikita) is. After a toxin is released in the lab, Sean returns to ask Alex why she's not returning his calls, and asks her again why she's still there. ``I can't tell you what to do, Alex, but I'm not going to stand by and watch this place destroy another person that I love,'' he says before he kisses her. ``I love you, but if that's not enough of a reason for you to leave, I've got no reason to stay.'' After he leaves, she pops a pill and heads for the operations floor, insisting that she ought to be included in the hunt for Amanda. They eventually make up after an emotional scene in a medical room, they then entered a storage closet and make love, for the first time. In ``Black Badge'', Amanda framed Sean for the death of the head of the CIA. As a result, they faked his death and Sean was officially welcomed into Division by Alex. Sean died in season 3 episode 18, as a result of a bullet nicking his artery. They shared their last moments together in where they met, operations. When Nikita enters OPS, she finds Birkoff near Sean and Alex gone. She presumably went to get revenge for the current events. Sean later died in Alex's arms towards the end of season 3. Question: do sean and alex get together on nikita?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: Sterol lipids, such as cholesterol and its derivatives, are an important component of membrane lipids, along with the glycerophospholipids and sphingomyelins. The steroids, all derived from the same fused four-ring core structure, have different biological roles as hormones and signaling molecules. The eighteen-carbon (C18) steroids include the estrogen family whereas the C19 steroids comprise the androgens such as testosterone and androsterone. The C21 subclass includes the progestogens as well as the glucocorticoids and mineralocorticoids. The secosteroids, comprising various forms of vitamin D, are characterized by cleavage of the B ring of the core structure. Other examples of sterols are the bile acids and their conjugates, which in mammals are oxidized derivatives of cholesterol and are synthesized in the liver. The plant equivalents are the phytosterols, such as \u03b2-sitosterol, stigmasterol, and brassicasterol; the latter compound is also used as a biomarker for algal growth. The predominant sterol in fungal cell membranes is ergosterol. Question: is cholesterol a partial breakdown product of lipids?",
        "pred_ans": " False. Cholesterol is a sterol lipid, which is an important component of membrane lipids. It is not a partial breakdown product of lipids.",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: In addition to a regular and 'light' spread, Unilever also uses the brand name to market a liquid butter substitute contained in a spray-bottle. This product is an emulsion of vegetable oil in water formulated with a 'hint' of butter flavor (derived from buttermilk) and is marketed as having zero calories and zero fat content. In 2017, Unilever announced two new varieties, ``It's Vegan'' and ``It's Organic''. Question: is i cant believe its not butter margarine?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The four-second fall from the Golden Gate Bridge sends a person plunging 245 feet (75 m) at 75 miles per hour (121 km/h) to hit the waters of the San Francisco Bay ``with the force of a speeding truck meeting a concrete building.'' Jumping off the bridge holds a 98 percent fatality rate; As of 2005, it is estimated that 26 people have survived after jumping. Some die instantly from internal injuries, while others drown or die of hypothermia. The Golden Gate bridge's death toll has since been surpassed only by the Nanjing Yangtze River Bridge in China. In 2013, 118 potential jumpers were talked down from their attempt and did not jump. Question: could you survive jumping off the golden gate bridge?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Vauxhall (/\u02c8v\u0252ks\u0254\u02d0l/, VOK-sawl) is a National Rail, London Underground and London Buses interchange station in central London. It is at the Vauxhall Cross road junction opposite the southern approach to Vauxhall Bridge over the River Thames in the district of Vauxhall. The station is on the boundary of zones 1 and 2 of the London Travelcard area and, although a through station, it is classed as a central London terminus for ticketing purposes. Question: is vauxhall station in zone 1 or 2?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The fluid ounce is distinct from the ounce as a unit of weight or mass, although it is sometimes referred to simply as an ``ounce'' where context makes the meaning clear, such as ounces in a bottle. Question: is 1 ounce the same as 1 fluid ounce?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: The United States Federal Food, Drug, and Cosmetic Act (abbreviated as FFDCA, FDCA, or FD&C), is a set of laws passed by Congress in 1938 giving authority to the U.S. Food and Drug Administration (FDA) to oversee the safety of food, drugs, and cosmetics. A principal author of this law was Royal S. Copeland, a three-term U.S. Senator from New York. In 1968, the Electronic Product Radiation Control provisions were added to the FD&C. Also in that year the FDA formed the Drug Efficacy Study Implementation (DESI) to incorporate into FD&C regulations the recommendations from a National Academy of Sciences investigation of effectiveness of previously marketed drugs. The act has been amended many times, most recently to add requirements about bioterrorism preparations. Question: was the food drug and cosmetic act successful?",
        "pred_ans": " It is difficult to provide a simple yes or no answer to the question of whether the Food, Drug, and Cosmetic Act (FFDCA) has been successful. The FFDCA has certainly had a significant impact on the regulation of food, drugs, and cosmetics in the United States, and it has been amended many times over the years to address new concerns and challenges. However, the success of the FFDCA depends on various factors, including its ability to protect the public from unsafe or ineffective products, its impact on innovation in the pharmaceutical and cosmetics industries, and its effectiveness in preventing bioterrorism. There are certainly areas where the FFDCA could be improved, and there have been instances where unsafe or ineffective products have slipped through the regulatory process. However, overall, the FFDCA has been a critical tool in ensuring the safety and effectiveness of the products that Americans consume and use on a daily basis.",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The first film was followed by two successful sequels, The Empire Strikes Back (1980) and Return of the Jedi (1983); these three films constitute the original Star Wars trilogy. A prequel trilogy was released between 1999 and 2005, albeit to mixed reactions from critics and fans. A sequel trilogy concluding the main story of the nine-episode saga began in 2015 with The Force Awakens. The first eight films were nominated for Academy Awards (with wins going to the first two released) and were commercially successful, with a combined box office revenue of over US$8.5 billion. Together with the theatrical spin-off films The Clone Wars (2008), Rogue One (2016) and Solo: A Star Wars Story (2018), Star Wars is the second highest-grossing film series ever. Question: is there a star wars movie this year?",
        "pred_ans": " False. According to the passage, the most recent Star Wars movie released was \\\"The Rise of Skywalker\\\" in 2019. There are no plans for a new Star Wars movie to be released in 2021.",
        "ground_truth": true,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Common-law marriage, also known as sui juris marriage, informal marriage, marriage by habit and repute, or marriage in fact is a legal framework in a limited number of jurisdictions where a couple is legally considered married, without that couple having formally registered their relation as a civil or religious marriage. The original concept of a ``common-law marriage'' is a marriage that is considered valid by both partners, but has not been formally recorded with a state or religious registry, or celebrated in a formal religious service. In effect, the act of the couple representing themselves to others as being married, and organizing their relation as if they were married, acts as the evidence that they are married. The requirements for a common-law marriage to be recognised differ from state to state. Question: is common law marriage legal in the united states?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 100. The student's answer is correct according to the passage in the question, as common-law marriage is not mentioned as being legal in the United States."
    },
    {
        "question": "Passage: The Book of Jasher (also, Jashar) or the Book of the Upright or the Book of the Just Man (Hebrew: \u05e1\u05b5\u05e4\u05b6\u05e8 \u05d4\u05b7\u05d9\u05c7\u05bc\u05e9\u05c7\u05c1\u05e8\u202c; transliteration: s\u0113fer hayy\u0101\u0161\u0101r) is an unknown book mentioned in the Hebrew Bible. The translation ``Book of the Just Man'' is the traditional Greek and Latin translation, while the transliterated form ``Jasher'' is found in the King James Bible, 1611. Question: is the book of jasher in the bible?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The United States Congress is the bicameral legislature of the Federal government of the United States. The legislature consists of two chambers: the Senate and the House of Representatives. Question: is the house of representatives the same as congress?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: The production of milk requires that the cow be in lactation, which is a result of the cow having given birth to a calf. The cycle of insemination, pregnancy, parturition, and lactation is followed by a ``dry'' period of about two months before calving, which allows udder tissue to regenerate. A dry period that falls outside this time frames can result in decreased milk production in subsequent lactation. Dairy operations therefore include both the production of milk and the production of calves. Bull calves are either castrated and raised as steers for beef production or used for veal. Question: do cows have to be pregnant to get milk?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " There is a part in the passage that says, \\\"The production of milk requires that the cow be in lactation, which is a result of the cow having given birth to a calf.\\\" This implies that cows have to be pregnant to get milk. Therefore the score is: 95"
    },
    {
        "question": "Passage: Eidetic memory (/a\u026a\u02c8d\u025bt\u026ak/; sometimes called photographic memory) is an ability to vividly recall images from memory after only a few instances of exposure, with high precision for a brief time after exposure, without using a mnemonic device. Although the terms eidetic memory and photographic memory are popularly used interchangeably, they are also distinguished, with eidetic memory referring to the ability to view memories like photographs for a few minutes, and photographic memory referring to the ability to recall pages of text or numbers, or similar, in great detail. When the concepts are distinguished, eidetic memory is reported to occur in a small number of children and as something generally not found in adults, while true photographic memory has never been demonstrated to exist. Question: are eidetic memory and photographic memory the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 90"
    },
    {
        "question": "Passage: To maintain lactation, a dairy cow must be bred and produce calves. Depending on market conditions, the cow may be bred with a ``dairy bull'' or a ``beef bull.'' Female calves (heifers) with dairy breeding may be kept as replacement cows for the dairy herd. If a replacement cow turns out to be a substandard producer of milk, she then goes to market and can be slaughtered for beef. Male calves can either be used later as a breeding bull or sold and used for veal or beef. Dairy farmers usually begin breeding or artificially inseminating heifers around 13 months of age. A cow's gestation period is approximately nine months. Newborn calves are removed from their mothers quickly, usually within three days, as the mother/calf bond intensifies over time and delayed separation can cause extreme stress on both cow and calf. Question: does a cow have to be pregnant to lactate?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 95"
    },
    {
        "question": "Passage: Orange is the colour between yellow and red on the spectrum of visible light. Human eyes perceive orange when observing light with a dominant wavelength between roughly 585 and 620 nanometres. In painting and traditional colour theory, it is a secondary colour of pigments, created by mixing yellow and red. It is named after the fruit of the same name. Question: is the fruit orange named after the color?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 100"
    },
    {
        "question": "Passage: Call of Duty: WWII is a first-person shooter video game developed by Sledgehammer Games and published by Activision. It was released worldwide on November 3, 2017 for Microsoft Windows, PlayStation 4 and Xbox One. It is the fourteenth main installment in the Call of Duty series and the first title in the series to be set primarily during World War II since Call of Duty: World at War in 2008. Question: was call of duty ww2 based on a true story?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 90"
    }
]