[
    {
        "question": "Passage: Tea tree oil, also known as melaleuca oil or ti tree oil, is an essential oil with a fresh camphoraceous odor and a colour that ranges from pale yellow to nearly colourless and clear. It is from the leaves of the tea tree, Melaleuca alternifolia, native to Southeast Queensland and the Northeast coast of New South Wales, Australia. Question: is melaleuca and tea tree the same thing?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In the U.S. in 2010, the bottling size was reduced from a typical 12 oz. per serving to 11.2 oz. per serving which is equivalent to the typical metric serving of 0.33L. Question: is red stripe light sold in the us?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There is no information in the passage about Red Stripe Light being sold in the US. The student's answer cannot be determined to be true or false based on the information provided in the passage. Therefore the score is: 50"
    },
    {
        "question": "Passage: A split-phase or single-phase three-wire system is a type of single-phase electric power distribution. It is the AC equivalent of the original Edison three-wire direct-current system. Its primary advantage is that it saves conductor material over a single-ended single-phase system, while only requiring a single phase on the supply side of the distribution transformer. Question: is split phase the same as single phase?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: While the term was in use as early as 1933, it became official only after the formation of the NCAA Division I athletic conference in 1954. Seven of the eight schools were founded during the colonial period (Cornell was founded in 1865). Ivy League institutions account for seven of the nine Colonial Colleges chartered before the American Revolution; the other two are Rutgers University and the College of William & Mary. Question: is college of william and mary an ivy league school?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is a factual error in the student's answer. The last sentence in your response can ONLY start with `Therefore the score is:` and followed by a score between 0 and 100."
    },
    {
        "question": "Passage: Delmonico steak (or steak Delmonico) is a particular preparation of one of several cuts of beef (typically the ribeye) originated by Delmonico's restaurant in New York City during the mid-19th century. Controversy exists about the specific cut of steak that Delmonico's originally used. Question: is a ribeye steak the same as a delmonico steak?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In 2003, the United States withdrew remaining non-training troops or armament purchase support from Saudi Arabia, with 200 of these support personnel remaining, primarily at Eskan Village, a base which is owned by the Saudi Arabian government itself, in support of the US Military Training Mission (USMTM) in Saudi Arabia and the US Office of Program Management for the Saudi Arabian National Guard (OPM-SANG). Question: does us have military bases in saudi arabia?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The New Legends of Monkey is a television series inspired by Monkey, a Japanese production from the 1970s and 80s which garnered a cult following in New Zealand, Australia, the U.K. and South Africa. The Japanese production was based on the 16th century Chinese novel Journey to the West. The show is a co-production between ABC Me, TVNZ, and Netflix, and consists of ten episodes. The New Legends of Monkey premiered on 28 January 2018. Question: will the new legends of monkey have a season 2?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A vehicle identification number (VIN) is a unique code, including a serial number, used by the automotive industry to identify individual motor vehicles, towed vehicles, motorcycles, scooters and mopeds, as defined in ISO 3779:2009. Question: is a car serial number the same as a vin?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The area was created when a chain of volcanic islands collided with the North American plate. The islands went over the North American plate and created the highlands of New Jersey. Then around 450 million years ago, a small continent collided with proto North America and created folding and faulting in western New Jersey and the southern Appalachians. When the African plate separated from North America, this created an aborted rift system or half-graben. The land lowered between the Ramapo fault in western Parsippany and the fault that was west of Paterson. The Wisconsin Glacier covered area from 21,000 to 13,000 BC. When the glacier melted due to climate change, Lake Passaic was formed, covering all of what is now Lake Hiawatha. Lake Passaic slowly drained and much of the area is swamps or low-lying meadows such as Troy Meadows. The Rockaway River flows over the Ramapo fault in Boonton and then flows along the northwestern edge of Lake Hiawatha. In this area, there are swamps near the river or in the area. Question: is there a lake in lake hiawatha nj?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: When the equilateral pentagon is dissected into triangles, two of them appear as isosceles (triangles in orange and blue) while the other one is more general (triangle in green). We assume that we are given the adjacent angles \u03b1 (\\displaystyle \\alpha ) and \u03b2 (\\displaystyle \\beta ) . Question: is a pentagon made of 5 equilateral triangles?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A batsman may not be given out bowled, leg before wicket, caught, stumped or hit wicket off a no-ball. A batsman may be given out run out, hit the ball twice, or obstructing the field. Thus the call of no-ball protects the batsman against losing his wicket in ways that are attributed to the bowler, but not in ways that are attributed to running, or to the batsman's own conduct. Question: can a batsman be run out on a no ball?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Area code 408 is a California telephone area code that was split from area code 415 as a flash-cut in 1959. Area code 669 is an overlay of 408 that became effective on November 20, 2012. It covers most of Santa Clara County and Northern Santa Cruz County and includes Gilroy, Morgan Hill, Saratoga, Los Gatos, Monte Sereno, Milpitas, Sunnyvale, Santa Clara, Cupertino, and San Jose. Question: is area code 669 a toll free number?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On November 19, 2013, producers announced that the show would close on January 4, 2014, citing falling ticket sales and no longer being able to get injury insurance for the production as reasons for closure. Having run on Broadway for over three years, the production failed to make back its $75 million cost, the largest in Broadway history, with investors reportedly losing $60 million. Question: is spider man turn off the dark still playing?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The common bile duct, sometimes abbreviated CBD, is a duct in the gastrointestinal tract of organisms that have a gall bladder. It is formed by the union of the common hepatic duct and the cystic duct (from the gall bladder). It is later joined by the pancreatic duct to form the ampulla of Vater. There, the two ducts are surrounded by the muscular sphincter of Oddi. Question: is the cystic duct the same as the common bile duct?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: All bilaterians have a gastrointestinal tract, also called a gut or an alimentary canal. This is a tube that transfers food to the organs of digestion. In large bilaterians, the gastrointestinal tract generally also has an exit, the anus, by which the animal disposes of feces (solid wastes). Some small bilaterians have no anus and dispose of solid wastes by other means (for example, through the mouth). The human gastrointestinal tract consists of the esophagus, stomach, and intestines, and is divided into the upper and lower gastrointestinal tracts. The GI tract includes all structures between the mouth and the anus, forming a continuous passageway that includes the main organs of digestion, namely, the stomach, small intestine, and large intestine. However, the complete human digestive system is made up of the gastrointestinal tract plus the accessory organs of digestion (the tongue, salivary glands, pancreas, liver and gallbladder). The tract may also be divided into foregut, midgut, and hindgut, reflecting the embryological origin of each segment. The whole human GI tract is about nine metres (30 feet) long at autopsy. It is considerably shorter in the living body because the intestines, which are tubes of smooth muscle tissue, maintain constant muscle tone in a halfway-tense state but can relax in spots to allow for local distention and peristalsis. Question: is the pancreas part of the gastrointestinal system?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Card counting is not illegal under British law, nor is it under federal, state, or local laws in the United States provided that no external card counting device or person assists the player in counting cards. Still, casinos object to the practice, and try to prevent it, banning players believed to be counters. In their pursuit to identify card counters, casinos sometimes misidentify and ban players suspected of counting cards even if they do not. Question: is it legal to count cards in las vegas?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: In the sport of athletics, a four-minute mile means completing a mile run (1,760 yards, or 1,609.344 metres) in less than four minutes. It was first achieved in 1954 by Roger Bannister in 3:59.4. The ``four-minute barrier'' has since been broken by over 1,400 male athletes, and is now the standard of all male professional middle distance runners. In the last 50 years the mile record has been lowered by almost 17 seconds, and currently stands at 3:43.13. Running a mile in four minutes translates to a speed of 15 miles per hour (24.14 km/h, or 2:29.13 per kilometre, or 14.91 seconds per 100 metres). It also equals 22 feet per second (1,320 feet per minute). Question: has anyone ever ran a 4 minute mile?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 0"
    },
    {
        "question": "Passage: The War Powers Resolution (also known as the War Powers Resolution of 1973 or the War Powers Act) (50 U.S.C. 1541--1548) is a federal law intended to check the president's power to commit the United States to an armed conflict without the consent of the U.S. Congress. The Resolution was adopted in the form of a United States Congress joint resolution. It provides that the U.S. President can send U.S. Armed Forces into action abroad only by declaration of war by Congress, ``statutory authorization,'' or in case of ``a national emergency created by attack upon the United States, its territories or possessions, or its armed forces.'' Question: can president go to war without congress approval?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The third season of the American television drama series Better Call Saul premiered on April 10, 2017, and concluded on June 19, 2017. The ten-episode season was broadcast on Monday nights in the United States on AMC. Better Call Saul is a spin-off of Breaking Bad created by Vince Gilligan and Peter Gould who also worked on Breaking Bad. Question: is there a third season of better call saul?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: A ganglionic blocker (or ganglioplegic) is a type of medication that inhibits transmission between preganglionic and postganglionic neurons in the Autonomic Nervous System, often by acting as a nicotinic receptor antagonist. Nicotinic acetylcholine receptors are found on skeletal muscle, but also within the route of transmission for the parasympathetic and sympathetic nervous system (which together comprise the autonomic nervous system). More specifically, nicotinic receptors are found within the ganglia of the autonomic nervous system, allowing outgoing signals to be transmitted from the presynaptic to the postsynaptic cells. Thus, for example, blocking nicotinic acetylcholine receptors blocks both sympathetic (excitatory) and parasympathetic (calming) stimulation of the heart. The nicotinic antagonist hexamethonium, for example, does this by blocking the transmission of outgoing signals across the autonomic ganglia at the postsynaptic nicotinic acetylcholine receptor. Question: can nicotine be classified as a ganglion blocker?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The 1916 New York Giants hold the record for the longest unbeaten streak in MLB history at 26, with a tie in-between the 14th and 15th win. The record for the longest winning streak by an American League team is held by the 2017 Cleveland Indians at 22. The Chicago Cubs franchise has won 21 games twice, once in 1880 when they were the Chicago White Stockings and once in 1935. Question: has any major league baseball team gone undefeated?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A wisdom tooth or third molar is one of the three molars per quadrant of the human dentition. It is the most posterior of the three. Wisdom teeth generally erupt between the ages of 17 and 25. Most adults have four wisdom teeth, one in each of the four quadrants, but it is possible to have none, fewer, or more, in which case the extras are called supernumerary teeth. Wisdom teeth commonly affect other teeth as they develop, becoming impacted. They are often extracted when or even before this occurs. Question: is it rare to have 6 wisdom teeth?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is no information in the passage to determine if having 6 wisdom teeth is rare or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: For a standard game of Klondike, drawing three cards at a time and placing no limit on the number of re-deals, the number of possible hands is over 7067800000000000000\u26608\u00d710, or an 8 followed by 67 zeros. About 79% of the games are theoretically winnable, but in practice, human players do not win 79% of games played, due to wrong moves that cause the game to become unwinnable. If one allows cards from the foundation to be moved back to the tableau, then between 82% and 91.5% are theoretically winnable. Note that these results depend on complete knowledge of the positions of all 52 cards, which a player does not possess. Another recent study has found the Draw 3, Re-Deal Infinite to have a 83.6% win rate after 1000 random games were solved by a computer solver. The issue is that a wrong move cannot be known in advance whenever more than one move is possible. The number of games a skilled player can probabilistically expect to win is at least 43%. In addition, some games are ``unplayable'' in which no cards can be moved to the foundations even at the start of the game; these occur in only 0.25% (1 in 400) of hands dealt. Question: is there always a way to win solitaire?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: The casting of actors not known for their singing abilities led to some mixed reviews. Variety stated that ``some stars, especially the bouncy and rejuvenated Streep, seem better suited for musical comedy than others, including Brosnan and Skarsg\u00e5rd.'' Brosnan, especially, was savaged by many critics: his singing was compared to ``a water buffalo'' (New York Magazine), ``a donkey braying'' (The Philadelphia Inquirer) and ``a wounded raccoon'' (The Miami Herald), and Matt Brunson of Creative Loafing Charlotte said he ``looks physically pained choking out the lyrics, as if he's being subjected to a prostate exam just outside of the camera's eye.'' Question: does everyone do their own singing in mamma mia?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Bank of Nova Scotia (French: La Banque de Nouvelle-\u00c9cosse), operating as Scotiabank with the CEO of Brian J. Porter (French: Banque Scotia), is a Canadian multinational bank. It is the third largest bank in Canada by deposits and market capitalization. It serves more than 24 million customers in over 50 countries around the world and offers a range of products and services including personal and commercial banking, wealth management, corporate and investment banking. With a team of more than 88,000 employees and assets of $915 billion (as at October 31, 2017), Scotiabank trades on the Toronto (TSX: BNS) and New York Exchanges (NYSE: BNS). Question: is scotiabank and bank of nova scotia the same?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: A player is in an 'offside position' if they are in the opposing team's half of the field and also ``nearer to the opponents' goal line than both the ball and the second-last opponent.'' The 2005 edition of the Laws of the Game included a new IFAB decision that stated, ``In the definition of offside position, 'nearer to his opponents' goal line' means that any part of their head, body or feet is nearer to their opponents' goal line than both the ball and the second last opponent. The arms are not included in this definition''. By 2017, the wording had changed to say that, in judging offside position, ``The hands and arms of all players, including the goalkeepers, are not considered.'' In other words, a player is in an offside position if two conditions are met: Question: does the goalkeeper count in the offside rule?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Although the minimum legal age to purchase alcohol is 21 in all states (see National Minimum Drinking Age Act), the legal details vary greatly. While a few states completely ban alcohol usage for people under 21, the majority have exceptions that permit consumption. Question: can you drink under the age of 21?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Designed originally to promote the Province of Ontario through exhibits and entertainment, its focus changed over time to be that of a theme park for families with a water park, a children's play area, and amusement rides. Exhibits in the pods were discontinued and the building became a venue for private events. The concert stage was turned over to a private concert operator and rebuilt as the Amphitheatre. After a long period of declining attendance, the Government of Ontario closed the facility except for its music venue and marina. It plans to re-open the facility after redevelopment into a year-round multiple-use facility. Question: is there a water park at ontario place?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: An orphan (from the Greek: \u03bf\u03c1\u03c6\u03b1\u03bd\u03cc\u03c2 orphan\u00f3s) is someone whose parents have died, are unknown, or have permanently abandoned them. Question: do your parents have to be dead to be an orphan?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The distinction between ``doves'' and ``pigeons'' is not consistent. In modern everyday speech, as opposed to scientific usage or formal usage, ``dove'' frequently indicates a pigeon that is white or nearly white. However, some people use the terms ``dove'' and ``pigeon'' interchangeably. In contrast, in scientific and ornithological practice, ``dove'' tends to be used for smaller species and ``pigeon'' for larger ones, but this is in no way consistently applied. Historically, the common names for these birds involve a great deal of variation between the terms. The species most commonly referred to as ``pigeon'' is the species known by scientists as the rock dove, one subspecies of which, the domestic pigeon, is common in many cities as the feral pigeon. Question: is there a difference between doves and pigeons?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A function f is said to be continuously differentiable if the derivative f\u2032(x) exists and is itself a continuous function. Though the derivative of a differentiable function never has a jump discontinuity, it is possible for the derivative to have an essential discontinuity. For example, the function Question: is the derivative of a continuous function always continuous?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: There are currently certain restrictions on the possession of airsoft replicas, which came in with the introduction of the ASBA (Anti-Social Behaviour Act 2003) Amendments, prohibiting the possession of any firearms replica in a public place without good cause (to be concealed in a gun case or container only, not to be left in view of public at any time). Question: do you need a license for airsoft guns uk?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Governments had attempted to reduce sleep-deprived driving through education messages and by ingraining roads with dents, known as rumble strips in the US, which cause a noise when drivers wander out of their lane. The Government of Western Australia recently introduced a ``Driver Reviver'' program where drivers can receive free coffee to help them stay awake. Question: is it illegal to drive with no sleep?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " There is no clear information in the passage to determine if it is illegal to drive with no sleep. The passage mainly discusses various attempts by governments to reduce sleep-deprived driving, such as through education messages and rumble strips, and a program in Western Australia that provides free coffee to help drivers stay awake. Therefore, the score is: 50"
    },
    {
        "question": "Passage: DMSI applied the brand to a new online and catalog-based retailing operation, with no physical stores, headquartered in Cedar Rapids, Iowa. DMSI then began operating under the Montgomery Ward branding and managed to get it up and running in three months. The new firm began operations in June 2004, selling essentially the same categories of products as the former brand, but as a new, smaller catalog. Question: are there any montgomery ward stores still open?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: King's is generally considered part of the 'golden triangle' of research-intensive English universities alongside the University of Oxford, University of Cambridge, University College London, Imperial College London, and The London School of Economics. It is a member of academic organisations including the Association of Commonwealth Universities, European University Association, and the Russell Group. King's is home to six Medical Research Council centres and is a founding member of the King's Health Partners academic health sciences centre, Francis Crick Institute and MedCity. It is the largest European centre for graduate and post-graduate medical teaching and biomedical research, by number of students, and includes the world's first nursing school, the Florence Nightingale Faculty of Nursing and Midwifery. Question: is king's college london part of cambridge university?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: While mentioned in passing throughout later seasons, Burke officially returns in the tenth season in order to conclude Cristina Yang's departure from the series. Question: does dr burke come back after season 3?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 30"
    },
    {
        "question": "Passage: Situated on the continent of Antarctica, it is the site of the United States Amundsen--Scott South Pole Station, which was established in 1956 and has been permanently staffed since that year. The Geographic South Pole is distinct from the South Magnetic Pole, the position of which is defined based on the Earth's magnetic field. The South Pole is at the center of the Southern Hemisphere. Question: is antarctica the same as the south pole?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: In 1963, the revolutionary government in Burma nationalized Central Bank of India's operations there, which became People's Bank No. 1. Question: is central bank of india a nationalised bank?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: When selecting the position for the static port, the aircraft designer's objective is to ensure the pressure in the aircraft's static pressure system is as close as possible to the atmospheric pressure at the altitude at which the aircraft is flying, across the operating range of weight and airspeed. Many authors describe the atmospheric pressure at the altitude at which the aircraft is flying as the freestream static pressure. At least one author takes a different approach in order to avoid a need for the expression freestream static pressure. Gracey has written ``The static pressure is the atmospheric pressure at the flight level of the aircraft''. Gracey then refers to the air pressure at any point close to the aircraft as the local static pressure. Question: is static pressure the same as atmospheric pressure?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The American military was entirely segregated during World War I. Although the military training of black Americans was opposed by white supremacist politicians such as Sen. James K. Vardaman (D-Mississippi) and Sen. Benjamin Tillman (D-South Carolina), the decision was made to include African-Americans in the 1917 draft. A total of 290,527 black Americans were ultimately registered for the draft. Question: were the us armed forces integrated in wwi?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The production of milk requires that the cow be in lactation, which is a result of the cow having given birth to a calf. The cycle of insemination, pregnancy, parturition, and lactation is followed by a ``dry'' period of about two months before calving, which allows udder tissue to regenerate. A dry period that falls outside this time frames can result in decreased milk production in subsequent lactation. Dairy operations therefore include both the production of milk and the production of calves. Bull calves are either castrated and raised as steers for beef production or used for veal. Question: do cows have to be pregnant to get milk?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. The passage states that the cow needs to be in lactation, which is a result of giving birth to a calf. Therefore, the correct answer should be \\\"False\\\" because the cow does not have to be pregnant to be in lactation, but it does need to have recently given birth to a calf.\\n\\nTherefore the score is: 50"
    },
    {
        "question": "Passage: In 1957, Frank Broyles was hired as the head football coach and served in that position for 19 years. Broyles' team was awarded the 1964 National Championship by the Football Writers Association of America and the Helms Athletic Foundation. At the time, The AP and UPI both awarded the designation before bowl games, and gave the award to Alabama. However, Alabama lost their bowl game to Texas, while Arkansas won theirs against Nebraska. The FWAA and HAF both awarded their national championship designations to Arkansas, who was the only team to go undefeated through bowl games that year. Both the University of Arkansas and the University of Alabama claimed national championships for the year 1964. Question: has arkansas ever won a national championship in any sport?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Though The Big Cypress is the largest growth of cypress swamps in South Florida, cypress swamps can be found near the Atlantic Coastal Ridge and between Lake Okeechobee and the Eastern flatwoods, as well as in sawgrass marshes. Cypresses are deciduous conifers that are uniquely adapted to thrive in flooded conditions, with buttressed trunks and root projections that protrude out of the water, called ``knees''. Bald cypress trees grow in formations with the tallest and thickest trunks in the center, rooted in the deepest peat. As the peat thins out, cypresses grow smaller and thinner, giving the small forest the appearance of a dome from the outside. They also grow in strands, slightly elevated on a ridge of limestone bordered on either side by sloughs. Other hardwood trees can be found in cypress domes, such as red maple, swamp bay, and pop ash. If cypresses are removed, the hardwoods take over, and the ecosystem is recategorized as a mixed swamp forest. Question: is the everglades the largest swamp in north america?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The gastrointestinal tract (digestive tract, digestional tract, GI tract, GIT, gut, or alimentary canal) is an organ system within humans and other animals which takes in food, digests it to extract and absorb energy and nutrients, and expels the remaining waste as feces. The mouth, esophagus, stomach and intestines are part of the gastrointestinal tract. Gastrointestinal is an adjective meaning of or pertaining to the stomach and intestines. A tract is a collection of related anatomic structures or a series of connected body organs. Question: is the gut the same as the stomach?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A conditional discharge is an order made by a criminal court whereby an offender will not be sentenced for an offence unless a further offence is committed within a stated period. Once the stated period has elapsed and no further offence is committed then the conviction may be removed from the defendant's record. Question: does a conditional discharge mean a criminal record?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Southern Nevada Zoological-Botanical Park, informally known as the Las Vegas Zoo, was a 3-acre (1.2 ha), nonprofit Zoological park and botanical garden located in Las Vegas, Nevada that closed in September 2013. It was located northwest of the Las Vegas Strip, about 15 minutes away. It focused primarily on the education of desert life and habitat protection. Its mission statement was to ``educate and entertain the public by displaying a variety of plants and animals''. An admission fee was charged. The park included a small gem exhibit area and a small gift shop at the main exit. The gift shop and admission fees helped support the zoo. Question: is there a zoo in las vegas nevada?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Chroma key compositing, or chroma keying, is a visual effects/post-production technique for compositing (layering) two images or video streams together based on color hues (chroma range). The technique has been used heavily in many fields to remove a background from the subject of a photo or video -- particularly the newscasting, motion picture and videogame industries. A color range in the foreground footage is made transparent, allowing separately filmed background footage or a static image to be inserted into the scene. The chroma keying technique is commonly used in video production and post-production. This technique is also referred to as color keying, colour-separation overlay (CSO; primarily by the BBC), or by various terms for specific color-related variants such as green screen, and blue screen -- chroma keying can be done with backgrounds of any color that are uniform and distinct, but green and blue backgrounds are more commonly used because they differ most distinctly in hue from most human skin colors. No part of the subject being filmed or photographed may duplicate the color used as the backing. Question: can you use a white background as a green screen?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On December 1, 2015, it was announced that the series was cancelled and would not be renewed for a fourth season. Question: will there be a cedar cove season 4?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In mathematics, and more specifically set theory, the empty set or null set is the unique set having no elements; its size or cardinality (count of elements in a set) is zero. Some axiomatic set theories ensure that the empty set exists by including an axiom of empty set; in other theories, its existence can be deduced. Many possible properties of sets are vacuously true for the empty set. Question: is an empty set an element of an empty set?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Apple Pencil is a digital stylus pen that works as an input device for the iPad Pro and the 2018 iPad tablet computer and was designed by Apple Inc. It was announced on September 9, 2015, alongside the iPad Pro and released in conjunction with it on November 11, 2015. Question: does the ipad pro come with the apple pencil?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The first recorded rudimentary steam engine was the aeolipile described by Heron of Alexandria in 1st-century Roman Egypt. Several steam-powered devices were later experimented with or proposed, such as Taqi al-Din's steam jack, a steam turbine in 16th-century Ottoman Egypt, and Thomas Savery's steam pump in 17th-century England. In 1712, Thomas Newcomen's atmospheric engine became the first commercially successful engine using the principle of the piston and cylinder, which was the fundamental type steam engine used until the early 20th century. The steam engine was used to pump water out of coal mines Question: was the steam engine invented during the renaissance?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Cheeks (Latin: buccae) constitute the area of the face below the eyes and between the nose and the left or right ear. ``Buccal'' means relating to the cheek. In humans, the region is innervated by the buccal nerve. The area between the inside of the cheek and the teeth and gums is called the vestibule or buccal pouch or buccal cavity and forms part of the mouth. In other animals the cheeks may also be referred to as jowls. Question: are the inside of your cheeks called gums?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: In addition to being one of the top six international marathons run over the distance of 26 mi 385 yd (42.195 km), the IAAF standard for the marathon established in 1921 and originally used for the 1908 London Olympics, the London Marathon is also a large, celebratory sporting festival, third in England only to the Great North Run in Newcastle upon Tyne and Great Manchester Run in Manchester in terms of the number of participants . The event has raised over \u00a3450 million for charity since 1981, and holds the Guinness world record as the largest annual fund raising event in the world, with the 2009 participants raising over \u00a347.2 million for charity. In 2007, 78% of all runners raised money. In 2011 the official charity of the London Marathon was Oxfam. In 2014, the official charity was Anthony Nolan, and in 2015, it was Cancer Research UK. Question: is the london marathon the largest in the world?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: In the United States, an honours degree (or honors degree in US spelling) requires a thesis or project work beyond that needed for the normal bachelor's program. Honours programs in the US are taken alongside the rest of the degree and often have a minimum GPA requirement for entry, which can vary between institutions. Some institutions do not have a separate honours program, but instead refer to bachelor's degrees awarded with Latin honours, which may be based either on GPA or class position, as honours degrees. Question: is a master's degree higher than an honours degree?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Leatherface is a 2017 American horror film directed by Julien Maury and Alexandre Bustillo, written by Seth M. Sherwood, and starring Stephen Dorff, Vanessa Grasse, Sam Strike, and Lili Taylor. It is the eighth film in the Texas Chainsaw Massacre franchise (TCM), and works as a prequel to 1974's The Texas Chain Saw Massacre, explaining the origin of the series' lead character. Question: is leatherface in texas chainsaw massacre the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Greece in the Roman era describes the period of Greek history when it was dominated by the Roman republic, the Roman Empire and the Byzantine Empire (collectively, the Roman era). It began with the Roman victory over the Corinthians, at the Battle of Corinth (146 BC). It continued with the adoption of the city of Byzantium by the Emperor Constantine the Great as the capital of the Roman Empire (as Nova Roma, later Constantinople) in AD 330. After this date, the Eastern Empire became largely Greek speaking. Question: were greece and rome around at the same time?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Retail sale of beer and wine is prohibited on Sundays between 2:00 a.m. and 1:00 p.m. and between 2:00 a.m. and 7:00 a.m. on weekdays and Saturdays. Retail sale of liquor is prohibited on Sundays, Christmas Day, and between 12:00 midnight and 8:00 a.m on all other days. Question: can i buy liquor on sunday in wv?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There are two main restrictions mentioned in the passage:\\n\\n1. Retail sale of beer and wine is prohibited on Sundays between 2:00 a.m. and 1:00 p.m. and between 2:00 a.m. and 7:00 a.m. on weekdays and Saturdays.  \\n2. Retail sale of liquor is prohibited on Sundays, Christmas Day, and between 12:00 midnight and 8:00 a.m on all other days.\\n\\nThe question asks, \\\"can i buy liquor on sunday in wv?\\\"\\n\\nThe student's answer is `False`.\\n\\nHowever, based on the passage, it is NOT False. The passage does not mention any prohibition on the retail sale of liquor on Sundays in West Virginia. It only mentions the restrictions on beer and wine sales on Sundays and liquor sales on specific days, including Sundays.\\n\\nTherefore, the score is: 50/100"
    },
    {
        "question": "Passage: A pimiento (Spanish pronunciation: (pi\u02c8mjento)), pimento, or cherry pepper is a variety of large, red, heart-shaped chili pepper (Capsicum annuum) that measures 3 to 4 in (7 to 10 cm) long and 2 to 3 in (5 to 7 cm) wide (medium, elongate). Question: are roasted red peppers and pimentos the same?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Humans have a four-chambered heart consisting of the right atrium, left atrium, right ventricle, and left ventricle. The atria are the two upper chambers. The right atrium receives and holds deoxygenated blood from the superior vena cava, inferior vena cava, anterior cardiac veins and smallest cardiac veins and the coronary sinus, which it then sends down to the right ventricle (through the tricuspid valve) which in turn sends it to the pulmonary artery for pulmonary circulation. The left atrium receives the oxygenated blood from the left and right pulmonary veins, which it pumps to the left ventricle (through the mitral valve) for pumping out through the aorta for systemic circulation. Question: does the right atrium receive blood from the lungs?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There are a few issues with the student's answer. The right atrium does receive blood from the lungs, but it is deoxygenated blood, not oxygenated blood. The student should have read the passage more carefully to understand the details of the heart's function.\\n\\nTherefore, the score is: 60"
    },
    {
        "question": "Passage: Developmental psychology is the scientific study of how and why human beings change over the course of their life. Originally concerned with infants and children, the field has expanded to include adolescence, adult development, aging, and the entire lifespan. Developmental psychologists aim to explain how thinking, feeling and behaviour change throughout life. This field examines change across three major dimensions: physical development, cognitive development, and socioemotional development. Within these three dimensions are a broad range of topics including motor skills, executive functions, moral understanding, language acquisition, social change, personality, emotional development, self-concept and identity formation. Question: is child psychology the same as developmental psychology?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Compared to similar technology in other sports, goal-line technology is a relatively recent addition to association football; its integration having been opposed by the sport's authorities. In July 2012, the International Football Association Board (IFAB) officially approved the use of goal line technology, amending the Laws of the Game to permit (but not require) its use. Due to its expense, goal-line technology is only used at the highest levels of the game. Goal-line technology is currently used in the top European domestic leagues, and at major international competitions such as the 2014 Men's, 2018 Men's and 2015 Women's FIFA World Cups. Question: is there goal line technology in the world cup?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Isle of Man (Manx: Ellan Vannin (\u02c8\u025blj\u0259n \u02c8van\u026an)), sometimes referred to simply as Mann (/m\u00e6n/; Manx: Mannin (\u02c8man\u026an)), is a self-governing British Crown dependency, an island in the Irish Sea between Great Britain and Ireland. The head of state is Queen Elizabeth II, who holds the title of Lord of Mann and is represented by a Lieutenant Governor. Defence is the responsibility of the United Kingdom. Question: is the isle of man part of the uk?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Often referred to as drywall taping mud, joint compound is the primary material used in the drywall industry by a tradesperson, or applicator, called a ``drywall mechanic,'' ``taper,'' or ``drywall taper.'' A similar compound is used in various ways as a sprayed-on textural finishing for gypsum panel walls and ceilings that have been pre-sealed and coated with joint compound. The flexibility and plastic qualities of joint compound make it a very versatile material both as sealer or finishing coat for wall surfaces, and also in decorative applications that range from machine sprayed texturing to hand-trowelled or even hand-crafted and sculptural finishes. In North America the application of joint mud and drywall tape sealer and trowelled joint compound on gypsum panels is a standard construction technique for painted wall and ceiling surfaces. Until more recently in North America, and through the world, several different plasters such as veneer plaster and ``plaster of Paris'' have been used in a similar ways to joint compounds as fillers or for decorative purposes since ancient times, and the actual make up and working properties of these compounds is much similar. Modern ready-mixes or powder and water mixes are available in a wide range of styles from slow-drying to quick-drying to suit specific demands for use by contractors or decorators. Question: is drywall mud the same as joint compound?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In 1965, because of rises in bullion prices, the Mint began to strike copper-nickel clad coins instead of silver. No dollar coins had been issued in thirty years, but beginning in 1969, legislators sought to reintroduce a dollar coin into commerce. After Eisenhower died that March, there were a number of proposals to honor him with the new coin. While these bills generally commanded wide support, enactment was delayed by a dispute over whether the new coin should be in base metal or 40% silver. In 1970, a compromise was reached to strike the Eisenhower dollar in base metal for circulation, and in 40% silver as a collectible. President Richard Nixon, who had served as vice president under Eisenhower, signed legislation authorizing mintage of the new coin on December 31, 1970. Question: is there silver in a 1971 silver dollar?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The United States Marine Corps (USMC), also referred to as the United States Marines, is a branch of the United States Armed Forces responsible for conducting amphibious operations with the United States Navy. The U.S. Marine Corps is one of the four armed service branches in the U.S. Department of Defense (DoD) and one of the seven uniformed services of the United States. Question: is the marines a part of the navy?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Sarah Ann Kennedy is a British voice actress best known for providing the voices of Miss Rabbit and Mummy Rabbit in the children's animated series Peppa Pig, Nanny Plum in the children's animated series Ben & Holly's Little Kingdom and Dolly Pond in Pond Life. She is also a writer and animation director and the creator of Crapston Villas, an animated soap opera for Channel 4 in 1996--1998. She has also written for Hit Entertainment and Peppa Pig, and is a lecturer at the University of Central Lancashire. Question: is nanny plum and miss rabbit the same voice?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The World Bank Group is part of the United Nations system and has a formal relationship agreement with the UN, but retains its independence. The WBG comprises a group of five legally separate but affiliated institutions: the International Bank for Reconstruction and Development (IBRD), the International Finance Corporation (IFC), the International Development Association (IDA), the Multilateral Investment Guarantee Agency (MIGA), and the International Centre for Settlement of Investment Disputes (ICSID). It is a vital source of financial and technical assistance to developing countries around the world. Its mission is to fight poverty with passion and professionalism for lasting results and to help people help themselves and their environment by providing resources, sharing knowledge, building capacity and forging partnerships in the public and private sectors. The WBG headquarters are located in Washington, D.C., United States of America. Question: is the world bank affiliated with the united nations?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A moving violation is any violation of the law committed by the driver of a vehicle while it is in motion. The term ``motion'' distinguishes it from other motor vehicle violations, such as paperwork violations (which include violations involving automobile insurance, registration and inspection), parking violations, or equipment violations. Question: is a no insurance ticket a moving violation?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Most competitions only allow each team to make a maximum of three substitutions during a game and a fourth substitute during extra time, although more substitutions are often permitted in non-competitive fixtures such as friendlies. A fourth substitution in extra time was first implemented in recent tournaments, including the 2016 Summer Olympic Games, the 2017 FIFA Confederations Cup and the 2017 CONCACAF Gold Cup final. A fourth substitute in extra time has been approved for use in the elimination rounds at the 2018 FIFA World Cup, the UEFA Champions League and the UEFA Europa League. Each team nominates a number of players (typically between five and seven, depending on the competition) who may be used as substitutes; these players typically sit in the technical area with the coaches, and are said to be ``on the bench''. When the substitute enters the field of play it is said they have come on or have been brought on, while the player they are substituting is coming off or being brought off. Question: can a player be substituted twice in football?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Most EMS providers operate on the principle of informed consent; that is, patients must know exactly what it is they are refusing, and what the possible consequences might be, in order to make a proper decision. This precludes parties who are intoxicated or otherwise incapable of making an informed decision, such as the mentally incompetent. Otherwise, agencies could release someone who was not able to understand what refusing might mean to their health. Question: can paramedics make you go to the hospital?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Empire State Building stood as the tallest building in the world from its completion until 1972, when the 110-story North Tower of the original World Trade Center was completed. At 1,368 feet (417 m), The World Trade Center briefly held the title as the world's tallest building until the completion of the 108-story Willis Tower (formerly known as the Sears Tower) in Chicago in 1974. The World Trade Center towers were destroyed by terrorist attacks in 2001, and the Empire State Building regained the title of tallest building in the City. It remained the tallest until April 2012, when the construction on One World Trade Center surpassed it. The fourth-tallest building in New York is the Bank of America Tower, which rises to 1,200 feet (366 m), including its spire. Tied for fifth-tallest are the 1,046-foot (319 m) Chrysler Building, which was the world's tallest building from 1930 until 1931, and the New York Times Building, which was completed in 2007. If the Twin Towers were still standing today, they would be the third and fourth tallest buildings in the city, or second and third assuming the new buildings would not have been built. Only 432 Park Avenue is taller. Question: were the twin towers taller than the empire state building?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Maple syrup is a syrup usually made from the xylem sap of sugar maple, red maple, or black maple trees, although it can also be made from other maple species. In cold climates, these trees store starch in their trunks and roots before winter; the starch is then converted to sugar that rises in the sap in late winter and early spring. Maple trees are tapped by drilling holes into their trunks and collecting the exuded sap, which is processed by heating to evaporate much of the water, leaving the concentrated syrup. Question: does maple syrup come straight from the tree?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The mishap in The Guardian where Randall loses his crew is loosely based on an actual U.S. Coast Guard aviation mishap in Alaska. The aircraft was an HH-3F Pelican (USCG variant of the Jolly Green Giant) instead of the HH-60J Jayhawk (USCG variant of the Blackhawk/Seahawk) pictured in the movie. Question: is the movie the guardian based on a true story?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In the 21st century, the currencies associated with large economies typically do not fix or peg exchange rates to other currencies. The last large economy to use a fixed exchange rate system was the People's Republic of China, which, in July 2005, adopted a slightly more flexible exchange rate system, called a managed exchange rate. The European Exchange Rate Mechanism is also used on a temporary basis to establish a final conversion rate against the euro from the local currencies of countries joining the Eurozone. Question: does the united states have a fixed exchange rate?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 2011, Sylvester Stallone was inducted into the International Boxing Hall of Fame for his work on the Rocky Balboa character, having ``entertained and inspired boxing fans from around the world''. Additionally, Stallone was awarded the Boxing Writers Association of America award for ``Lifetime Cinematic Achievement in Boxing.'' Question: is rocky balboa in the boxing hall of fame?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Impaired driving is the term used in Canada to describe the criminal offence of operating or having care or control of a motor vehicle while the person's ability to operate the motor vehicle is impaired by alcohol or a drug. Impaired driving is punishable under multiple offences in the Criminal Code, with greater penalties depending on the harm caused by the impaired driving. It can also result in various types of driver's licence suspensions. Question: is a dui an indictable offence in canada?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Ant-Man and the Wasp is a 2018 American superhero film based on the Marvel Comics characters Scott Lang / Ant-Man and Hope van Dyne / Wasp. Produced by Marvel Studios and distributed by Walt Disney Studios Motion Pictures, it is the sequel to 2015's Ant-Man, and the twentieth film in the Marvel Cinematic Universe (MCU). The film is directed by Peyton Reed and written by the writing teams of Chris McKenna and Erik Sommers, and Paul Rudd, Andrew Barrer, and Gabriel Ferrari. It stars Rudd as Lang and Evangeline Lilly as Van Dyne, alongside Michael Pe\u00f1a, Walton Goggins, Bobby Cannavale, Judy Greer, Tip ``T.I.'' Harris, David Dastmalchian, Hannah John-Kamen, Abby Ryder Fortson, Randall Park, Michelle Pfeiffer, Laurence Fishburne, and Michael Douglas. In Ant-Man and the Wasp, the titular pair work with Hank Pym to retrieve Janet van Dyne from the quantum realm. Question: do ant man and the wasp get together?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The group winners, Spain, qualified directly for the 2018 FIFA World Cup. The group runners-up, Italy, advanced to the play-offs as one of the best 8 runners-up, where they lost to Sweden and thus failed to qualify for the first time since 1958. Question: is italy going to the world cup 2018?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: A fried egg is a cooked dish made from one or more eggs which are removed from their shells and placed into a pan, usually without breaking the yolk, and fried with minimal accompaniment. Fried eggs are traditionally eaten for breakfast in many countries but may also be served at other times of the day. Question: does a fried egg have a runny yolk?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The dairy cow will produce large amounts of milk in its lifetime. Production levels peak at around 40 to 60 days after calving. Production declines steadily afterwards until milking is stopped at about 10 months. The cow is ``dried off'' for about sixty days before calving again. Within a 12 to 14-month inter-calving cycle, the milking period is about 305 days or 10 months long. Among many variables, certain breeds produce more milk than others within a range of around 6,800 to 17,000 kg (15,000 to 37,500 lbs) of milk per year. Question: does cows have to be pregnant to produce milk?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Magnesium citrate is a magnesium preparation in salt form with citric acid in a 1:1 ratio (1 magnesium atom per citrate molecule). The name ``magnesium citrate'' is ambiguous and sometimes may refer to other salts such as trimagnesium citrate which has a magnesium:citrate ratio of 3:2. Question: does magnesium citrate have citric acid in it?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Hypnagogia, also referred to as ``hypnagogic hallucinations'', is the experience of the transitional state from wakefulness to sleep: the hypnagogic state of consciousness, during the onset of sleep (for the transitional state from sleep to wakefulness see hypnopompic). Mental phenomena that may occur during this ``threshold consciousness'' phase include lucid thought, lucid dreaming, hallucinations, and sleep paralysis. However, sleep paralysis and lucid dreaming are separate sleep conditions that are sometimes experienced during the hypnagogic state. Question: can you start dreaming before you fall asleep?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Turtle may either refer to the order as a whole, or to particular turtles that make up a form taxon that is not monophyletic, or may be limited to only aquatic species. Tortoise usually refers to any land-dwelling, non-swimming chelonian. Terrapin is used to describe several species of small, edible, hard-shell turtles, typically those found in brackish waters. Question: is a turtle the same as a tortoise?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The New York metropolitan area, also referred to as the Tri-State Area, is the largest metropolitan area in the world by urban landmass, at 4,495 sq mi (11,640 km). The metropolitan area includes New York City (the most populous city in the United States), Long Island, and the Mid and Lower Hudson Valley in the state of New York; the five largest cities in New Jersey: Newark, Jersey City, Paterson, Elizabeth, and Edison, and their vicinities; six of the seven largest cities in Connecticut: Bridgeport, New Haven, Stamford, Waterbury, Norwalk, and Danbury, and their vicinities. Question: is new jersey a suburb of new york city?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The United States Federal Food, Drug, and Cosmetic Act (abbreviated as FFDCA, FDCA, or FD&C), is a set of laws passed by Congress in 1938 giving authority to the U.S. Food and Drug Administration (FDA) to oversee the safety of food, drugs, and cosmetics. A principal author of this law was Royal S. Copeland, a three-term U.S. Senator from New York. In 1968, the Electronic Product Radiation Control provisions were added to the FD&C. Also in that year the FDA formed the Drug Efficacy Study Implementation (DESI) to incorporate into FD&C regulations the recommendations from a National Academy of Sciences investigation of effectiveness of previously marketed drugs. The act has been amended many times, most recently to add requirements about bioterrorism preparations. Question: was the food drug and cosmetic act successful?",
        "pred_ans": " It is difficult to provide a yes or no answer to the question of whether the Food, Drug, and Cosmetic Act (FFDCA) has been successful. The FFDCA has been in effect for over 80 years and has been amended many times. It has played a significant role in ensuring the safety of food, drugs, and cosmetics in the United States. However, there have been instances where the FDA has been criticized for not acting quickly enough to remove unsafe products from the market or for being too lenient in its approval process for new drugs and devices. The success of the FFDCA can be considered a mixed bag, with some areas where it has been effective and others where it has fallen short.",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: The internal intercostal muscles have fibres that are angled obliquely downward and backward from rib to rib. These muscles can therefore assist in lowering the rib cage, adding force to exhalation. Question: do the internal intercostal muscles contract during inspiration?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is no information provided in the passage to determine if the internal intercostal muscles contract during inspiration or not. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Trinidad and Tobago entered qualification for the 2018 FIFA World Cup in the Fourth Round and was drawn into Group C with Guatemala, Saint Vincent and the Grenadines, and the United States. The team would finish second in Group C with a total of 11 points to qualify for the Hexagonal. However, they would finish in sixth place in the final round with only 6 points, even though they eliminated the United States from World Cup contention with a 2--1 victory in the final match. Question: is trinidad and tobago going to world cup 2018?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A battery may be modeled as a voltage source in series with a resistance. In practice, the internal resistance of a battery is dependent on its size, chemical properties, age, temperature, and the discharge current. It has an electronic component due to the resistivity of the component materials and an ionic component due to electrochemical factors such as electrolyte conductivity, ion mobility, and electrode surface area. Measurement of the internal resistance of a battery is a guide to its condition, but may not apply at other than the test conditions. Measurement with an alternating current, typically at a frequency of 7003100000000000000\u26601 kHz, may underestimate the resistance, as the frequency may be too high to take into account slower electrochemical processes. Internal resistance depends on temperature; for example, a fresh Energizer E91 AA alkaline primary battery drops from about 0.9 \u03a9 at -40 \u00b0C, when the low temperature reduces ion mobility, to about 0.15 \u03a9 at room temperature and about 0.1 \u03a9 at 40 \u00b0C. Question: does the internal resistance of a battery change?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The first installment in the Indiana Jones film franchise, it stars Harrison Ford as archaeologist Indiana Jones, who battles a group of Nazis searching for the Ark of the Covenant. It co-stars Karen Allen as Indiana's former lover, Marion Ravenwood; Paul Freeman as Indiana's rival, French archaeologist Ren\u00e9 Belloq; John Rhys-Davies as Indiana's sidekick, Sallah; Ronald Lacey as Gestapo agent Arnold Toht; and Denholm Elliott as Indiana's colleague, Marcus Brody. Question: does indiana jones play a role in raiders?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In chess, the king (\u2654,\u265a) is the most important piece. The object of the game is to threaten the opponent's king in such a way that escape is not possible (checkmate). If a player's king is threatened with capture, it is said to be in check, and the player must remove the threat of capture on the next move. If this cannot be done, the king is said to be in checkmate, resulting in a loss for that player. Although the king is the most important piece, it is usually the weakest piece in the game until a later phase, the endgame. Players cannot make any move that places their own king in check. Question: can you take out the king in chess?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A queen ant (formally known as a gyne) is an adult, reproducing female ant in an ant colony; generally she will be the mother of all the other ants in that colony. Some female ants, such as Cataglyphis cursor, do not need to mate to produce offspring, reproducing through asexual parthenogenesis or cloning, and all of those offspring will be female. Others, like those in the genus Crematogaster, mate in a nuptial flight. Queen offspring develop from larvae specially fed in order to become sexually mature among most species. Depending on the species, there can be either a single mother queen, or potentially, hundreds of fertile queens in some species. Queen ants have one of the longest life-spans of any known insect -- up to 30 years. A queen of Lasius niger was held in captivity by German entomologist Hermann Appel for 283\u20444 years; also a Pogonomyrmex owyheei has a maximum estimated longevity of 30 years in the field. Question: can an ant colony have more than one queen?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Per capita income is often used c measure an area's average income. This is used to see the wealth of the population with those of others. Per capita income is often used to measure a country's standard of living. It is usually expressed in terms of a commonly used international currency such as the euro or United States dollar, and is useful because it is widely known, is easily calculable from readily available gross domestic product (GDP) and population estimates, and produces a useful statistic for comparison of wealth between sovereign territories. This helps to ascertain a country's development status. It is one of the three measures for calculating the Human Development Index of a country. Question: is gdp per capita same as per capita income?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: Spain has two time zones and observes daylight saving time. Spain mainly uses Central European Time (GMT+01:00) and Central European Summer Time (GMT+02:00) in Peninsular Spain, the Balearic Islands, Ceuta, Melilla and plazas de soberan\u00eda. In the Canary Islands, the time zone is Western European Time (GMT\u00b100:00) and Western European Summer Time (GMT+01:00). Daylight saving time is observed from the last Sunday in March (01:00 GMT) to the last Sunday in October (01:00 GMT) throughout Spain. Question: is spain all in the same time zone?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Street Addressing will have the same street address of the post office, plus a ``unit number'' that matches the P.O. Box number. As an example, in El Centro, California, the post office is located at 1598 Main Street. Therefore, for P.O. Box 9975 (fictitious), the Street Addressing would be: 1598 Main Street Unit 9975, El Centro, CA. Nationally, the first five digits of the zip code may or may not be the same as the P.O. Box address, and the last four digits (Zip + 4) are virtually always different. Except for a few of the largest post offices in the U.S., the 'Street Addressing' (not the P.O. Box address) nine digit Zip + 4 is the same for all boxes at a given location. Question: does p o box come before street address?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Dodo is a fictional character appearing in Chapters 2 and 3 of the book Alice's Adventures in Wonderland by Lewis Carroll (Charles Lutwidge Dodgson). The Dodo is a caricature of the author. A popular but unsubstantiated belief is that Dodgson chose the particular animal to represent himself because of his stammer, and thus would accidentally introduce himself as ``Do-do-dodgson''. Historically, the Dodo was a non-flying bird that lived on the island of Mauritius, east of Madagascar in the Indian Ocean. It became extinct in the mid 17th century during the colonisation of the island by the Dutch. Question: is there a dodo in alice in wonderland?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The original series aired for 30 series and a total of 592 episodes, from 4 September 1998 to 11 February 2014, and was presented by Chris Tarrant. Over the course of its run, the original series had around five contestants walk away with the top cash prize of \u00a31 million, and faced a number of controversies during its run, including an attempt to defraud the show of its top prize by a contestant. The original format of the programme was tweaked in later years, changing the number of questions from fifteen to twelve and altering the payout structure as a result, and later incorporating a time limit. Four years after the original series ended, ITV unveiled a revived series, created by Stellify Media, to commemorate the 20th anniversary of the programme. The revived format, based upon the original design, was presented by Jeremy Clarkson, and broadcast in 2018, from 5--11 May. Question: has anyone ever won the million on millionaire?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Article II of the Constitution establishes the executive branch of the federal government. It vests the executive power of the United States in the president. The power includes the execution and enforcement of federal law, alongside the responsibility of appointing federal executive, diplomatic, regulatory and judicial officers, and concluding treaties with foreign powers with the advice and consent of the Senate. The president is further empowered to grant federal pardons and reprieves, and to convene and adjourn either or both houses of Congress under extraordinary circumstances. The president directs the foreign and domestic policies of the United States, and takes an active role in promoting his policy priorities to members of Congress. In addition, as part of the system of checks and balances, Article One of the United States Constitution gives the president the power to sign or veto federal legislation. Since the office of president was established in 1789, its power has grown substantially, as has the power of the federal government as a whole. Question: is the president the only member of the executive branch?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: This has led to categorizing trucks similarly, even if their payload is different. Therefore, the Toyota Tacoma, Dodge Dakota, Ford Ranger, Honda Ridgeline, Chevrolet S-10, GMC S-15 and Nissan Frontier are called quarter-tons (1\u20444-ton). The Ford F-150, Chevrolet C10/K10, Chevrolet/GMC 1500, Dodge 1500, Toyota Tundra, and Nissan Titan are half-tons (1\u20442-ton). The Ford F-250, Chevrolet C20/K20, Chevrolet/GMC 2500, and Dodge 2500 are three-quarter-tons (3\u20444-ton). Chevrolet/GMC's 3\u20444-ton suspension systems were further divided into light and heavy-duty, differentiated by 5-lug and 6 or 8-lug wheel hubs depending on year, respectively. The Ford F-350, Chevrolet C30/K30, Chevrolet/GMC 3500, and Dodge 3500 are one tons (1-ton). Question: is a dodge 3500 a 1 ton truck?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: ``Lord of all Hopefulness'' is a Christian hymn written by Jan Struther, which was published in the enlarged edition of Songs of Praise (Oxford University Press) in 1931. The hymn is used in liturgy, at weddings and at the beginning of funeral services. Question: is lord of all hopefulness a funeral hymn?",
        "pred_ans": " Answer: False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the 1930s, the show began hiring professionals and expanded to four hours. Broadcasting by then at 50,000 watts, WSM made the program a Saturday night musical tradition in nearly 30 states. In 1939, it debuted nationally on NBC Radio. The Opry moved to a permanent home, the Ryman Auditorium, in 1943. As it developed in importance, so did the city of Nashville, which became America's ``country music capital.'' The Grand Ole Opry holds such significance in Nashville that its name is included on the city/county line signs on all major roadways. The signs read ``Music City Metropolitan Nashville Davidson County Home of the Grand Ole Opry.'' Question: are the ryman and grand ole opry the same thing?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Daisy Johnson, also known as Quake, is a fictional superhero appearing in American comic books published by Marvel Comics. Created by writer Brian Michael Bendis and artist Gabriele Dell'Otto, the character first appeared in Secret War #2 (July 2004). The daughter of the supervillain Mister Hyde, she is a secret agent of the intelligence organization S.H.I.E.L.D. with the power to generate earthquakes. Question: is daisy the director of shield in the comics?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In recent years a lower level resolution of offences has often been used by police forces in England and Wales instead of a caution. This is usually called a 'community resolution' and invariably requires less police time as offenders are not arrested. A community resolution does not require any formal record but the offender should admit the offence and the victim should be happy with this method of informal resolution. Concerns have been expressed over the use of community resolution for violent offences, in particular 'domestic violence'. Question: can you get a caution without being arrested?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In addition to domestic units, industrial dishwashers are available for use in commercial establishments such as hotels and restaurants, where a large number of dishes must be cleaned. Washing is conducted with temperatures of 65--71 \u00b0C (149--160 \u00b0F) and sanitation is achieved by either the use of a booster heater that will provide an 82 \u00b0C (180 \u00b0F) ``final rinse'' temperature or through the use of a chemical sanitizer. Question: does the dishwasher make its own hot water?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Although not identical, the 7.62\u00d751mm NATO and the commercial .308 Winchester cartridges are similar enough that they can be loaded into rifles chambered for the other round, but the Winchester .308 cartridges are typically loaded to higher pressures than 7.62\u00d751mm NATO cartridges. Even though the Sporting Arms and Ammunition Manufacturers' Institute (SAAMI) does not consider it unsafe to fire the commercial round in weapons chambered for the NATO round, there is significant discussion about compatible chamber and muzzle pressures between the two cartridges based on powder loads and wall thicknesses on the military vs. commercial rounds. While the debate goes both ways, the ATF recommends checking the stamping on the barrel; if unsure, one can consult the maker of the firearm. Question: is 7.62 nato the same as 308 win?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: RMS Titanic was a British passenger liner that sank in the North Atlantic Ocean in the early hours of 15 April 1912, after colliding with an iceberg during its maiden voyage from Southampton to New York City. There were an estimated 2,224 passengers and crew aboard, and more than 1,500 died, making it one of the deadliest commercial peacetime maritime disasters in modern history. RMS Titanic was the largest ship afloat at the time it entered service and was the second of three Olympic-class ocean liners operated by the White Star Line. It was built by the Harland and Wolff shipyard in Belfast. Thomas Andrews, her architect, died in the disaster. Question: did the titanic sank on its maiden voyage?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The Train is a 1964 war film directed by John Frankenheimer from a story and screenplay by Franklin Coen and Frank Davis, inspired by the non-fiction book Le front de l'art by Rose Valland, who documented the works of art placed in storage that had been looted by the Germans from museums and private art collections. It stars Burt Lancaster, Paul Scofield and Jeanne Moreau. Question: is the movie the train a true story?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Pan-American Highway is a system of roads measuring about 30,000 km (19,000 mi) long that crosses through the entirety of North, Central, and South America, with the sole exception of the Dari\u00e9n Gap. On the South American side, the Highway terminates at Turbo, Colombia near 8\u00b06\u2032N 76\u00b040\u2032W\ufeff / \ufeff8.100\u00b0N 76.667\u00b0W\ufeff / 8.100; -76.667. On the Panamanian side, the road terminus is the town of Yaviza at 8\u00b09\u2032N 77\u00b041\u2032W\ufeff / \ufeff8.150\u00b0N 77.683\u00b0W\ufeff / 8.150; -77.683. This marks a straight-line separation of about 100 km (60 mi). In between are marshland and forest. Question: can you get to south america by car?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: English is New Zealand's official language. It is the primary language used for court proceedings, and statutes and other official pronouncements. English is spoken by 96.1 percent of the population. Question: is english an official language of new zealand?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A fossil fuel is a fuel formed by natural processes, such as anaerobic decomposition of buried dead organisms, containing energy originating in ancient photosynthesis. The age of the organisms and their resulting fossil fuels is typically millions of years, and sometimes exceeds 650 million years. Fossil fuels contain high percentages of carbon and include petroleum, coal, and natural gas. Other commonly used derivatives include kerosene and propane. Fossil fuels range from volatile materials with low carbon to hydrogen ratios like methane, to liquids like petroleum, to nonvolatile materials composed of almost pure carbon, like anthracite coal. Methane can be found in hydrocarbon fields either alone, associated with oil, or in the form of methane clathrates. Question: is petroleum an example of a fossil fuel?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Production of the American version was canceled by ABC in 2003 because of low ratings, with already-produced episodes airing first-run into 2004. The ABC Family cable channel, which had been airing repeats of the show since 2002, also showed ``new'' episodes from 2005 to 2006, formed from previously filmed but unaired performances. Question: did they cancel whose line is it anyway?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Until the early 1960s, most front turn signals worldwide emitted white light and most rear turn signals emitted red. The auto industry in the USA voluntarily adopted amber front-turn signals for most vehicles beginning in the 1963 model year, though the advent of amber signals was accompanied by legal stumbles in some states and front turn signals were still legally permitted to emit white light until FMVSS 108 took effect for the 1968 model year, whereupon amber became the only permissible front turn signal colour. Presently, most countries outside of the United States and Canada require that all front, side and rear turn signals produce amber light. Exceptions include Switzerland and New Zealand. Question: do front turn signals have to be amber?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: ``American statutes typically provide that, in absence of issue and subject to the share of a surviving spouse, intestate property passes to the parents or to the surviving parent of the decedent''. Under the civil law system of computation and its various modified forms that are widely adopted by statute in the United States, ``a claimant's degree of kinship is the total of (1) the number of the steps, counting one from each generation, from the decedent up to the nearest common ancestor of the decedent and the claimant, and (2) the number of steps from the common ancestor down to the claimant.'' ``The claimant having the lowest degree count (i.e., the nearest or next of kin) is entitled to the property.'' ``If there are two or more claimants who stand in equal degree of kinship to the decedent, they share per capita.'' Question: can you choose who is your next of kin?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Surgical treatments of ingrown toenails include a number of different options. If conservative treatment of a minor ingrown toenail does not succeed or if the ingrown toenail is severe, surgical management is recommended by a podiatrist. The initial surgical approach is typically a partial avulsion of the nail plate known as a wedge resection or a complete removal of the toenail. If the ingrown toenail reoccurs despite this treatment, destruction of the germinal matrix with phenol is recommended. Antibiotics are not needed if surgery is performed. Question: can ingrown toenails come back after being removed?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is a factual error in the student's answer. The passage states that if the ingrown toenail reoccurs despite the initial surgical treatment, destruction of the germinal matrix with phenol is recommended. This suggests that ingrown toenails can indeed come back after being removed. Therefore, the score is: 50/100"
    },
    {
        "question": "Passage: Access courses are generally tailored as pathways; that is, they prepare students with the necessary skills and imbue the appropriate knowledge required for a specific undergraduate career. For example, there are 'access to law', 'access to medicine' and 'access to nursing' pathways that prepare students to study law, medicine and nursing at undergraduate level, respectively. Question: is an access course classed as higher education?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: ``Separation of church and state'' is paraphrased from Thomas Jefferson and used by others in expressing an understanding of the intent and function of the Establishment Clause and Free Exercise Clause of the First Amendment to the Constitution of the United States which reads: ``Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof...'' Question: does the first amendment separate church and state?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is no clear answer to the question in the passage provided. The passage quotes the Establishment Clause and Free Exercise Clause of the First Amendment, but it does not explicitly state whether the first amendment separates church and state. The student's answer of \\\"NILL\\\" is a valid response given the information provided. Therefore the score is: 50/100"
    },
    {
        "question": "Passage: U.S. Marshals is a 1998 American action crime thriller film directed by Stuart Baird. The storyline was conceived from a screenplay written by Roy Huggins and John Pogue. The film is a spin-off to the 1993 motion picture The Fugitive, which in turn was based on the television series of the same name, created by Huggins. The story does not involve the character of Dr. Richard Kimble, portrayed by Harrison Ford in the initial film, but instead the plot centers on United States Deputy Marshal Sam Gerard, once again played by Tommy Lee Jones. The plot follows Gerard and his team as they pursue another fugitive, Mark Sheridan, played by Wesley Snipes, who attempts to escape government officials following an international conspiracy scandal. The cast features Robert Downey, Jr., Joe Pantoliano, Daniel Roebuck, Tom Wood, and LaTanya Richardson, several of whom portrayed Deputy Marshals in the previous film. Question: is us marshals a sequel to the fugitive?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A routing number consists of a five digit transit number (also called branch number) identifying the branch where an account is held and a three digit financial institution number corresponding to the financial institution. The number is given as one of the following forms, where XXXXX is the transit number and YYY is the financial institution number: Question: is the routing number the same as transit number?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: ``Eye of the Tiger'' is a song by American rock band Survivor. It was released as a single from their third album of the same name Eye of the Tiger and was also the theme song for the film Rocky III, which was released a day before the single. The song was written by Survivor guitarist Frankie Sullivan and keyboardist Jim Peterik, and was recorded at the request of Rocky III star, writer, and director Sylvester Stallone, after Queen denied him permission to use ``Another One Bites the Dust'', the song Stallone intended as the Rocky III theme. Originally, the song was made for the movie The Karate Kid. The director of both Rocky and The Karate Kid planned to use the song for a fighting montage towards the end of the feature. John G. Avildsen opted to using ``You're the Best'' by Joe Esposito. The version of the song that appears in the movie is the demo version of the song. The movie version also contained tiger growls, something that did not appear on the album version. It features original Survivor singer Dave Bickler on lead vocals. Question: was eye of the tiger written for rocky?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: An AC adapter, AC/DC adapter, or AC/DC converter is a type of external power supply, often enclosed in a case similar to an AC plug. Other common names include plug pack, plug-in adapter, adapter block, domestic mains adapter, line power adapter, wall wart, power brick, and power adapter. Adapters for battery-powered equipment may be described as chargers or rechargers (see also battery charger). AC adapters are used with electrical devices that require power but do not contain internal components to derive the required voltage and power from mains power. The internal circuitry of an external power supply is very similar to the design that would be used for a built-in or internal supply. Question: is a power adapter the same as a charger?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Some tax protesters such as Edward Brown and tax protester organizations such as the We the People Foundation have used the phrase ``show me the law'' to argue that the Internal Revenue Service refuses to disclose the laws that impose the legal obligation to file Federal income tax returns or pay Federal income taxes--and to argue that there must be no law imposing Federal income taxes. Question: is there a law that says we have to pay taxes?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: In Canada, contempt of court is an exception to the general principle that all criminal offences are set out in the federal Criminal Code. Contempt of court is the only remaining common law offence in Canada, besides the related offence of Contempt of Parliament. Question: is contempt of court a criminal offence in canada?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Avengers: Infinity War and its upcoming untitled sequel are American superhero films based on the Marvel Comics superhero team the Avengers, produced by Marvel Studios and distributed by Walt Disney Studios Motion Pictures. They are the sequels to Marvel's The Avengers (2012) and Avengers: Age of Ultron (2015), and serve as the nineteenth and twenty-second films of the Marvel Cinematic Universe (MCU), respectively. Both films are directed by Anthony and Joe Russo from screenplays by the writing team of Christopher Markus and Stephen McFeely, and feature an ensemble cast composed of many previous MCU actors. Question: is there a sequel to the avengers infinity war?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 0"
    },
    {
        "question": "Passage: Alexander Graham Bell originally suggested 'ahoy-hoy' be adopted as the standard greeting when answering a telephone, before 'hello' (suggested by Thomas Edison) became common. Question: did alexander graham bell propose answering the phone with ahoy?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Sweden have been one of the more successful national teams in the history of the World Cup, having reached 4 semi-finals, and becoming runners-up on home ground in 1958. They have been present at 11 out of 20 World Cups by 2014. Question: has sweden ever been in a world cup final?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The character appears in various Marvel Cinematic Universe films, including The Avengers (2012), portrayed by Damion Poitier, and Guardians of the Galaxy (2014), Avengers: Age of Ultron (2015), Avengers: Infinity War (2018), and the fourth Avengers film (2019), portrayed by Josh Brolin through voice and motion capture. The character has appeared in various comic adaptations, including animated television series, arcade, and video games. Question: was thanos in the first guardians of the galaxy?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Becket controversy or Becket dispute was the quarrel between Thomas Becket, the Archbishop of Canterbury, and King Henry II of England, from 1163 to 1170. The controversy culminated with Becket's murder in 1170, and was followed by Becket's canonization in 1173 and Henry's public penance at Canterbury in July 1174. Question: has the long exile of the archbishop ended?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The National Minimum Drinking Age Act of 1984 (23 U.S.C. \u00a7 158) was passed by the United States Congress on July 17, 1984. It was a controversial bill that punished every state that allowed persons below 21 years to purchase and publicly possess alcoholic beverages by reducing its annual federal highway apportionment by 10 percent. The law was later amended, lowering the penalty to 8 percent from fiscal year 2012 and beyond. Question: is the legal drinking age a federal law?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Top Gear presenters go across Burma and Thailand in lorries with the goal of building a bridge over the river Kwai. After building a bridge over the Kok River, Clarkson is quoted as saying ``That is a proud moment, but there's a slope on it.'' as a native crosses the bridge, 'slope' being a pejorative for Asians. Question: did top gear really build a bridge over the river?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Three Little Pigs is a fable about three pigs who build three houses of different materials. A big bad wolf blows down the first two pigs' houses, made of straw and sticks respectively, but is unable to destroy the third pig's house, made of bricks. Printed versions date back to the 1840s, but the story itself is thought to be much older. The phrases used in the story, and the various morals drawn from it, have become embedded in Western culture. Many versions of The Three Little Pigs have been recreated or have been modified over the years, sometimes making the wolf a kind character. It is a type 124 folktale in the Aarne--Thompson classification system. Question: is the three little pigs a nursery rhyme?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Fans of author J.R.R. Tolkien have drawn attention to the similarities between his novel The Lord of the Rings and the Harry Potter series; specifically Tolkien's Wormtongue and Rowling's Wormtail, Tolkien's Shelob and Rowling's Aragog, Tolkien's Gandalf and Rowling's Dumbledore, Tolkien's Nazg\u00fbl and Rowling's Dementors, Old Man Willow and the Whomping Willow and the similarities between both authors' antagonists, Tolkien's Dark Lord Sauron and Rowling's Lord Voldemort (both of whom are sometimes within their respective continuities unnamed due to intense fear surrounding their names; both often referred to as 'The Dark Lord'; and both of whom are, during the time when the main action takes place, seeking to recover their lost power after having been considered dead or at least no longer a threat). Several reviews of Harry Potter and the Deathly Hallows noted that the locket used as a horcrux by Voldemort bore comparison to Tolkien's One Ring, as it negatively affects the personality of the wearer. Rowling maintains that she had not read The Hobbit until after she completed the first Harry Potter novel (though she had read The Lord of the Rings as a teenager) and that any similarities between her books and Tolkien's are ``Fairly superficial. Tolkien created a whole new mythology, which I would never claim to have done. On the other hand, I think I have better jokes.'' Tolkienian scholar Tom Shippey has maintained that ``no modern writer of epic fantasy has managed to escape the mark of Tolkien, no matter how hard many of them have tried''. Question: was harry potter inspired by lord of the rings?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: On March 7, 2013, Variety confirmed that Disney has already approved plans for a sequel with Mitchell Kapner and Joe Roth returning as screenwriter and producer respectively. Mila Kunis said during an interview with E! News, ``We're all signed on for sequels.'' On March 8, 2013, Sam Raimi told Bleeding Cool that he has no plans to direct the sequel, saying, ``I did leave some loose ends for another director if they want to make the picture,'' and that ``I was attracted to this story but I don't think the second one would have the thing I would need to get me interested.'' On March 11, 2013, Kapner and Roth have said to the Los Angeles Times that the sequel will ``absolutely not'' involve Dorothy Gale, with Kapner pointing out that there are twenty years between the events of the first film and Dorothy's arrival, and ``a lot can happen in that time.'' Question: is there a sequel to oz the great and powerful?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Avengers: Infinity War held its world premiere on April 23, 2018 in Los Angeles and was released in the United States on April 27, 2018, in IMAX and 3D. The film received praise for the performances of the cast (particularly Brolin's) and the emotional weight of the story, as well as the visual effects and action sequences. It was the fourth film and the first superhero film to gross over $2 billion worldwide, breaking numerous records and becoming the highest-grossing film of 2018, as well as the fourth-highest-grossing film of all time and in the United States and Canada. The currently untitled sequel is set to be released on May 3, 2019. Question: is infinity war going to be a trilogy?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Colocasia esculenta is thought to be native to Southern India and Southeast Asia, but is widely naturalised. It is a perennial, tropical plant primarily grown as a root vegetable for its edible starchy corm, and as a leaf vegetable. It is a food staple in African, Oceanic and Indian cultures and is believed to have been one of the earliest cultivated plants. Colocasia is thought to have originated in the Indomalaya ecozone, perhaps in East India, Nepal, and Bangladesh, and spread by cultivation eastward into Southeast Asia, East Asia and the Pacific Islands; westward to Egypt and the eastern Mediterranean Basin; and then southward and westward from there into East Africa and West Africa, where it spread to the Caribbean and Americas. It is known by many local names and often referred to as ``elephant ears'' when grown as an ornamental plant. At around 3.3 million metric tons per year, Nigeria is the largest producer of taro in the world. Question: is taro root the same as elephant ears?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The game takes place in the same fictional world as the comic, with events occurring shortly after the onset of the zombie apocalypse in Georgia. However, most of the characters are original to the game, which centers on university professor and convicted criminal Lee Everett, who helps to rescue and subsequently care for a young girl named Clementine. Kirkman provided oversight for the game's story to ensure it corresponded to the tone of the comic, but allowed Telltale to handle the bulk of the developmental work and story specifics. Some characters from the original comic book series also make in-game appearances. Question: is the walking dead game the same as the show?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The relationship between House and Cuddy is known by the portmanteau term ``Huddy''. Cuddy has what USA Today's Peter Johnson terms a ``cat-and-mouse'' relationship with House. Edelstein has described it as ``a really complicated, adult relationship'', explaining: ``These are people who have very full lives and lots of responsibilities that perhaps conflict with their feelings for each other.'' The actress ``would love for them to have a (romantic) relationship, because it could be as complicated as the rest of their relationship'', however, she is unsure how it would affect the dynamics of the show. Jacobs commented at the end of the show's third season: ``I can't see them pairing them in a permanent fashion. But they are close; they have gone through a lot together. Might there be a moment of weakness in which the two might explore their chemistry? Maybe.'' Questioned at the end of the fourth season on whether Cuddy and House would ever consummate their relationship on-screen, Jacobs responded: ``there is heat and chemistry between them and I never want to see that go away because that is the essence of their relationship. (...) we'll never ignore (their chemistry) because, as I said, it's the very essence of them. She wouldn't forgive him over and over again if he wasn't so brilliant in her eyes, clearly she's got a soft spot for him. And he has one for her. You will continue to see that.'' Prior to the beginning of the fifth season, series creator David Shore discussed his intention to further the relationship between the two, as: ``If House is capable of any relationship with anyone, it's Cuddy. We can't have them dancing around forever.'' Following the fifth season revelation that House had hallucinated a physical relationship with Cuddy, Shore commented on the storyline's continuation into the sixth season: ``it would be dishonest to just let that disappear. Obviously House has feelings for her. Even though the love affair didn't happen, in House's mind it did.'' Edelstein does not know whether the two characters will eventually end up together, however believes that the combination of frustration and love Cuddy feels for House ``makes for a very interesting relationship'', as: ``there's a great deal of admiration and respect, and also an incredible amount of annoyance and frustration, which is like how most relationships are in your life.'' As of the very end of the Sixth Season finale, Help Me, House and Cuddy appear to have entered a romantic relationship. In the closing minutes of the episode, House came very close to relapsing and taking vicodin once again, at which point Cuddy entered to tell him that she had ended her relationship with Lucas. She professed her love for House, which led to them kissing briefly. A close-up shot of their clasped hands (House's left, Cuddy's right) was the closing shot of the episode, as well as the season. The relationship later ends in season 7; in the episode ``Bombshells''. Question: do house and cuddy get back together in season 8?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The following is the list of teams to overcome 3--1 series deficits by winning three straight games to win a best-of-seven playoff series. In the history of major North American pro sports, teams that were down 3--1 in the series came back and won the series 52 times, more than half of them were accomplished by National Hockey League (NHL) teams. Teams overcame 3--1 deficit in the final championship round eight times, six were accomplished by Major League Baseball (MLB) teams in the World Series. Teams overcoming 3--0 deficit by winning four straight games were accomplished five times, four times in the NHL and once in MLB. Question: has anyone ever came back from 3-0 in nba?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Superfecundation is the fertilization of two or more ova from the same cycle by sperm from separate acts of sexual intercourse, which can lead to twin babies from two separate biological fathers. The term superfecundation is derived from fecund, meaning the ability to produce offspring. Heteropaternal superfecundation refers to the fertilization of two separate ova by two different fathers. Homopaternal superfecundation refers to the fertilization of two separate ova from the same father, leading to fraternal twins. While heteropaternal superfecundation is referred to as a form of atypical twinning, genetically, the twins are half siblings. Superfecundation, while rare, can occur through either separate occurrences of sexual intercourse or through artificial insemination. Question: can you have two babies with different fathers?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is a chance to have two babies with different fathers, as explained in the passage. Heteropaternal superfecundation refers to the fertilization of two separate ova by two different fathers, and genetically, the twins are half siblings. Therefore the score is: 70"
    },
    {
        "question": "Passage: First aid treatment is pressure on the wound and artificial respiration once the paralysis has disabled the victim's respiratory muscles, which often occurs within minutes of being bitten. Because the venom primarily kills through paralysis, victims are frequently saved if artificial respiration is started and maintained before marked cyanosis and hypotension develop. Efforts should be continued even if the victim appears not to be responding. Respiratory support until medical assistance arrives ensures the victims will generally recover. Question: can you survive a blue ringed octopus bite?",
        "pred_ans": " Passage: First aid treatment is pressure on the wound and artificial respiration once the paralysis has disabled the victim's respiratory muscles, which often occurs within minutes of being bitten. Because the venom primarily kills through paralysis, victims are frequently saved if artificial respiration is started and maintained before marked cyanosis and hypotension develop. Efforts should be continued even if the victim appears not to be responding. Respiratory support until medical assistance arrives ensures the victims will generally recover.\\n\\nResponse: False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The most common symptom experienced due to Morton's toe is callusing and/or discomfort of the ball of the foot at the base of the second toe. The first metatarsal head would normally bear the majority of a person's body weight during the propulsive phases of gait, but because the second metatarsal head is farthest forward, the force is transferred there. Pain may also be felt in the arch of the foot, at the ankleward end of the first and second metatarsals. Question: is it normal for your second toe to be longer than your first?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In 1866, at the behest of Chief Justice Chase, Congress passed an act providing that the next three justices to retire would not be replaced, which would thin the bench to seven justices by attrition. Consequently, one seat was removed in 1866 and a second in 1867. In 1869, however, the Circuit Judges Act returned the number of justices to nine, where it has since remained. Question: can we have more than 9 supreme court justices?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Using the information provided by Creasy, The Voice's identity is revealed by Manzano to be Daniel S\u00e1nchez, who Mariana exposes on the front page of her newspaper. Creasy sneaks into the home of S\u00e1nchez's ex-wife and children, and is fatally shot by his brother Aurelio (Gero Camilo), whom Creasy then captures. Creasy finds out that he not only has The Voice's brother hostage, but his pregnant wife. Creasy calls Daniel S\u00e1nchez and threatens to kill all of his family unless he gives himself up (shooting Aurelio's fingers off with a shotgun as a warning while Sanchez listens on the phone), but S\u00e1nchez reveals that Pita is still alive, and offers her in exchange for Aurelio and Creasy. After Sanchez confirms Pita's identity (Creasy has him identify what Pita calls her teddy bear as proof), Creasy agrees to the demands. Creasy contacts Lisa to confirm that Pita is alive and to meet him at the exchange. He instructs her to hold the shotgun to Aurelio and to not let him go until Lisa has Pita. He surrenders himself to S\u00e1nchez' men, after Pita is released to him. After an embrace, Creasy instructs Pita to runs to her mother. Aurelio is then released and Creasy surrenders to the kidnappers as Lisa and Pita drive away. As Creasy drives off with the kidnappers, it is implied that Creasy, at peace with himself, succumbs to his wounds and dies. Daniel S\u00e1nchez is later shot and killed by Manzano during an AFI arrest raid. Question: does the little girl die in man on fire?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: A California roll or California maki is a makizushi sushi roll, usually made inside-out, containing cucumber, crab meat or imitation crab, and avocado. Sometimes crab salad is substituted for the crab stick, and often the outer layer of rice in an inside-out roll (uramaki) is sprinkled with toasted sesame seeds, tobiko or masago (capelin roe). Question: does a california roll have fish in it?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Pok\u00e9mon Gold Version and Silver Version are the second installments of the Pok\u00e9mon series of role-playing video games, developed by Game Freak and published by Nintendo for the Game Boy Color. They were released in Japan in 1999, Australia and North America in 2000, and Europe in 2001. Pok\u00e9mon Crystal, a special edition, was released roughly a year later in each region. In 2009, Game Freak remade Gold and Silver for the Nintendo DS as Pok\u00e9mon HeartGold and SoulSilver. Question: are pokemon gold silver and crystal the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The lunar eclipse was completely visible over Eastern Africa, Southern Africa, Southern Asia and Central Asia, seen rising over South America, Western Africa, and Europe, and setting over Eastern Asia, and Australia. Question: is july 27 lunar eclipse visible in usa?",
        "pred_ans": " Based on the passage, the lunar eclipse was visible over Eastern Africa, Southern Africa, Southern Asia, and Central Asia, as well as Europe, Eastern Asia, and Australia. However, it does not specifically mention whether the July 27 lunar eclipse was visible in the USA. Therefore, I cannot confirm whether it was visible in the USA with absolute certainty.",
        "ground_truth": false,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: Historically, sales of the Five-seven pistol were restricted by FN to military and law enforcement customers, but in 2004 the new Five-seven IOM model was introduced and offered to civilian shooters for use with 5.7\u00d728mm sporting ammunition. The IOM model incorporated several modifications to the weapon's design, such as the addition of an M1913 accessory rail, a magazine safety mechanism, and fully adjustable sights. Although offered only with sporting ammunition, the Five-seven's introduction to civilian shooters was met with strong opposition from gun control organizations such as the Brady Campaign. Question: can a civilian own a fn five seven?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The fourth season of the American television series The Flash, which is based on the DC Comics character Barry Allen / Flash, premiered on The CW on October 10, 2017, and ran for 23 episodes until May 22, 2018. The season follows a crime scene investigator with superhuman speed who fights criminals, including others who have also gained superhuman abilities. It is set in the Arrowverse, sharing continuity with the other television series of the universe, and is a spin-off of Arrow. The season is produced by Berlanti Productions, Warner Bros. Television, and DC Entertainment, with Andrew Kreisberg and Todd Helbing serving as showrunners. Question: are they making a season 4 of the flash?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In the knockout stage, if a match was level at the end of 90 minutes of normal playing time, extra time was played (two periods of 15 minutes each), where each team was allowed to make a fourth substitution. If still tied after extra time, the match was decided by a penalty shoot-out to determine the winners. Question: is there extra time in the world cup playoffs?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Fear the Walking Dead is an American post-apocalyptic horror drama television series created by Robert Kirkman and Dave Erickson, that premiered on AMC on August 23, 2015. It is a companion series and prequel to The Walking Dead, which is based on the comic book series of the same name by Robert Kirkman, Tony Moore, and Charlie Adlard. Question: is fear the walking dead based on the comics?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: F1 and F2 generations are usually the largest, due to the stronger genetic influence of the African serval ancestor. As with other hybrid cats such as the Chausie and Bengal cat, most first generation cats will possess many or all of the serval's exotic looking traits, while these traits often diminish in later generations. Male Savannahs tend to be larger than females. Question: is a savannah cat the same as a bengal?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Dame is an honorific title and the feminine form of address for the honour of knighthood in the British honours system and the systems of several other Commonwealth countries, such as Australia and New Zealand, with the masculine form of address being Sir. The word damehood is rarely used, but the official website of the British monarchy uses it as the correct term. Question: is a dame the same as a knight?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: There may be several meanings of ``solving an equation''. One may want to express the solutions as explicit numbers; for example, the unique solution of 2x -- 1 = 0 is 1/2. Unfortunately, this is, in general, impossible for equations of degree greater than one, and, since the ancient times, mathematicians have searched to express the solutions as algebraic expression; for example the golden ratio ( 1 + 5 ) / 2 (\\displaystyle (1+(\\sqrt (5)))/2) is the unique positive solution of x 2 \u2212 x \u2212 1 = 0. (\\displaystyle x^(2)-x-1=0.) In the ancient times, they succeeded only for degrees one and two. For quadratic equations, the quadratic formula provides such expressions of the solutions. Since the 16th century, similar formulas (using cube roots in addition to square roots), but much more complicated are known for equations of degree three and four (see cubic equation and quartic equation). But formulas for degree 5 and higher eluded researchers for several centuries. In 1824, Niels Henrik Abel proved the striking result that there are equations of degree 5 whose solutions cannot be expressed by a (finite) formula, involving only arithmetic operations and radicals (see Abel--Ruffini theorem). In 1830, \u00c9variste Galois proved that most equations of degree higher than four cannot be solved by radicals, and showed that for each equation, one may decide whether it is solvable by radicals, and, if it is, solve it. This result marked the start of Galois theory and group theory, two important branches of modern algebra. Galois himself noted that the computations implied by his method were impracticable. Nevertheless, formulas for solvable equations of degrees 5 and 6 have been published (see quintic function and sextic equation). Question: can a polynomial have a square root of a variable?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Charles B. McVay III (July 30, 1898 -- November 6, 1968) was an American naval officer and the commanding officer of USS Indianapolis (CA-35) when it was lost in action in 1945, resulting in a massive loss of life. Of all captains in the history of the United States Navy, he is the only one to have been subjected to court-martial for losing a ship sunk by an act of war, despite the fact that he was on a top secret mission maintaining radio silence (the testimony of the Japanese commander who sank his ship also seemed to exonerate McVay). After years of mental health problems, he committed suicide. Following years of efforts by some survivors and others to clear his name, McVay was posthumously exonerated by the 106th United States Congress and President Bill Clinton on October 30, 2000. Question: did the captain of the uss indianapolis live?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Diary of a Wimpy Kid is a satirical realistic fiction comedy novel for children and teenagers written and illustrated by Jeff Kinney. It is the first book in the Diary of a Wimpy Kid series. The book is about a boy named Greg Heffley and his struggles to fit in as he begins middle school. Question: are diary of a wimpy kid books graphic novels?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Indiana Toll Road, officially the Indiana East--West Toll Road, is a toll road that runs for 156.28 miles (251.51 km) east--west across northern Indiana from the Illinois state line to the Ohio state line. It has been advertised as the ``Main Street of the Midwest''. The entire toll road is designated as part of Interstate 90, and the segment from Lake Station east to the Ohio state line is a concurrency with Interstate 80. The toll road is owned by the Indiana Finance Authority and operated by the Indiana Toll Road Concession Company, which is owned by IFM Investors. Question: is i 80 in indiana a toll road?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The United States Coast Guard (USCG) is a branch of the United States Armed Forces and one of the country's seven uniformed services. The Coast Guard is a maritime, military, multi-mission service unique among the U.S. military branches for having a maritime law enforcement mission (with jurisdiction in both domestic and international waters) and a federal regulatory agency mission as part of its mission set. It operates under the U.S. Department of Homeland Security during peacetime, and can be transferred to the U.S. Department of the Navy by the U.S. President at any time, or by the U.S. Congress during times of war. This has happened twice, in 1917, during World War I, and in 1941, during World War II. Question: is coast guard part of the armed forces?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 0"
    },
    {
        "question": "Passage: In Australia, Remembrance Day is always observed on 11 November, regardless of the day of the week, and is not a public holiday; it is a time when people can pay their respects to the substantial number of soldiers who died in battle. Some institutions observe two-minute's silence at 11 am through a programme named Read 2 Remember, children read the Pledge of Remembrance by Rupert McCall and teachers deliver specially developed resources to help children understand the significance of the day and the resilience of those who have fought for their country and call on children to also be resilient when facing difficult times. Services are held at 11 am at war memorials and schools in suburbs and cities across the country, at which the ``Last Post'' is sounded by a bugler and a one-minute silence is observed. In recent decades, Remembrance Day has been largely eclipsed as the national day of war commemoration by ANZAC Day (25 April), which is a public holiday in all states. Question: is anzac day and remembrance day the same?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Travel to each of these altitude regions can lead to medical problems, from the mild symptoms of acute mountain sickness to the potentially fatal high-altitude pulmonary edema (HAPE) and high-altitude cerebral edema (HACE). The higher the altitude, the greater the risk. Research also indicates elevated risk of permanent brain damage in people climbing to extreme altitudes. Expedition doctors commonly stock a supply of dexamethasone, or ``dex,'' to treat these conditions on site. Question: is living at high altitude good for you?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " There is no clear answer to the question \\\"is living at high altitude good for you\\\" in the given passage. The passage mainly discusses the risks associated with traveling to high-altitude regions and the medical problems that can occur. It does not provide any information on the long-term effects of living at high altitude. Therefore, the score is: 50/100"
    },
    {
        "question": "Passage: Kenneth E. Gaspar (born February 3, 1953), more commonly known as Boom Gaspar, is an American musician who has performed with the American rock band Pearl Jam as a piano/keyboard/organ player since 2002. Question: is boom gaspar a member of pearl jam?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: An income statement or profit and loss account (also referred to as a profit and loss statement (P&L), statement of profit or loss, revenue statement, statement of financial performance, earnings statement, operating statement, or statement of operations) is one of the financial statements of a company and shows the company's revenues and expenses during a particular period. It indicates how the revenues (money received from the sale of products and services before expenses are taken out, also known as the ``top line'') are transformed into the net income (the result after all revenues and expenses have been accounted for, also known as ``net profit'' or the ``bottom line''). The purpose of the income statement is to show managers and investors whether the company made or lost money during the period being reported. Question: is profit and loss statement same as income statement?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Sterol lipids, such as cholesterol and its derivatives, are an important component of membrane lipids, along with the glycerophospholipids and sphingomyelins. The steroids, all derived from the same fused four-ring core structure, have different biological roles as hormones and signaling molecules. The eighteen-carbon (C18) steroids include the estrogen family whereas the C19 steroids comprise the androgens such as testosterone and androsterone. The C21 subclass includes the progestogens as well as the glucocorticoids and mineralocorticoids. The secosteroids, comprising various forms of vitamin D, are characterized by cleavage of the B ring of the core structure. Other examples of sterols are the bile acids and their conjugates, which in mammals are oxidized derivatives of cholesterol and are synthesized in the liver. The plant equivalents are the phytosterols, such as \u03b2-sitosterol, stigmasterol, and brassicasterol; the latter compound is also used as a biomarker for algal growth. The predominant sterol in fungal cell membranes is ergosterol. Question: is cholesterol a partial breakdown product of lipids?",
        "pred_ans": " False. Cholesterol is a sterol lipid, which is an important component of membrane lipids.",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Unlike people with Limited Leave to Remain (LTR) in the UK, ILR holders have access to public funds. ``No recourse to public funds'' is not written in ILR holders' visas. As a result, they are able to claim job seekers' allowances and other benefits that are usually available only to British, EU, and EEA citizens. Question: can you claim benefits if you have indefinite leave to remain?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. The correct answer should be \\\"Yes\\\".\\n\\nTherefore the score is: 50"
    },
    {
        "question": "Passage: Tanjore painting is an important form of classical South Indian painting native to the town of Tanjore in Tamil Nadu. The art form dates back to the early 9th century, a period dominated by the Chola rulers, who encouraged art and literature. These paintings are known for their elegance, rich colours, and attention to detail. The themes for most of these paintings are Hindu Gods and Goddesses and scenes from Hindu mythology. In modern times, these paintings have become a much sought-after souvenir during festive occasions in South India. Question: is tanjore a traditional indian folk art form?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Formed by five Charterhouse pupils including Banks, Rutherford, Gabriel, and Anthony Phillips, Genesis were named by former pupil Jonathan King, who arranged for them to record several unsuccessful singles and an album. After splitting with King, the group began touring professionally, signing with Charisma Records. Following the departure of Phillips, Genesis recruited Collins and Hackett and recorded several progressive rock style albums, with live shows centred around Gabriel's theatrical costumes and performances. The group were initially commercially successful in mainland Europe, before entering the UK charts with Foxtrot (1972). They followed this with Selling England by the Pound (1973) and The Lamb Lies Down on Broadway (1974) before Gabriel left the group. Question: were phil collins and peter gabriel in genesis at the same time?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: Eastbound vehicles must pay a toll to cross the bridge; as with all Hudson River crossings along the North River, westbound vehicles cross for free. As of December 6, 2015, the cash tolls going from New Jersey to New York are $15 for both cars and motorcycles. E-ZPass users are charged $10.50 for cars and $9.50 for motorcycles during off-peak hours, and $12.50 for cars and $11.50 for motorcycles during peak hours. Trucks are charged cash tolls of $20.00 per axle, with discounted peak, off-peak, and overnight E-ZPass tolls. A discounted carpool toll ($6.50) is available at all times for cars with three or more passengers using NY or NJ E-ZPass, who proceed through a staffed toll lane (provided they have registered with the free ``Carpool Plan''). There is an off-peak toll of $7.00 for qualified low-emission passenger vehicles, which have received a Green E-ZPass based on registering for the Port Authority Green Pass Discount Plan. Question: do you have to pay both ways on the george washington bridge?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The dairy cow will produce large amounts of milk in its lifetime. Production levels peak at around 40 to 60 days after calving. Production declines steadily afterwards until milking is stopped at about 10 months. The cow is ``dried off'' for about sixty days before calving again. Within a 12 to 14-month inter-calving cycle, the milking period is about 305 days or 10 months long. Among many variables, certain breeds produce more milk than others within a range of around 6,800 to 17,000 kg (15,000 to 37,500 lb) of milk per year. Question: does a cow have to be pregnant to give milk?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " There is a crucial detail missing in the student's answer. The passage states that the cow's production of milk declines steadily after the peak and milking is stopped at about 10 months. This suggests that a cow does not have to be pregnant to give milk, as the milk production continues even after calving. Therefore, the score is: 60"
    },
    {
        "question": "Passage: The Federal Reserve began taking high-denomination currency out of circulation (destroying large bills received by banks) in 1969. As of May 30, 2009, only 336 $10,000 bills were known to exist; 342 remaining $5,000 bills; and 165,372 remaining $1,000 bills. Due to their rarity, collectors often pay considerably more than the face value of the bills to acquire them. Some are in museums in other parts of the world. Question: are there any thousand dollar bills in circulation?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: In trigonometry, the law of sines, sine law, sine formula, or sine rule is an equation relating the lengths of the sides of a triangle (any shape) to the sines of its angles. According to the law, Question: can law of sines be used on any triangle?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The right of asylum (sometimes called right of political asylum, from the Ancient Greek word \u1f04\u03c3\u03c5\u03bb\u03bf\u03bd) is an ancient juridical concept, under which a person persecuted by his own country may be protected by another sovereign authority, such as another country or church official, who in medieval times could offer sanctuary. This right was already recognized by the Egyptians, the Greeks, and the Hebrews, from whom it was adopted into Western tradition. Ren\u00e9 Descartes fled to the Netherlands, Voltaire to England, and Thomas Hobbes to France, because each state offered protection to persecuted foreigners. Question: can you seek asylum from your home country?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A rooster, also known as a gamecock, cockerel or cock, is an adult male gallinaceous bird, usually a male chicken (Gallus gallus domesticus). Question: is a chicken and rooster the same thing?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Citizens of member nations of the Gulf Cooperation Council may travel to Oman without visa limits. Nationals of 71 other countries and territories can apply for visas online which are valid for a period of 30 days. All visitors must hold a passport valid for 6 months. Question: do you need a visa to visit oman?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The United States Congress is the bicameral legislature of the Federal government of the United States. The legislature consists of two chambers: the Senate and the House of Representatives. Question: is the house of representatives also called congress?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There are two parts to the question:\\n\\n1. \\\"is the house of representatives also called congress?\\\"  \\n2. The student's answer: \\\"False\\\"\\n\\nAccording to the passage, the House of Representatives is one of the two chambers of the United States Congress, which is the bicameral legislature of the Federal government of the United States. So, the House of Representatives is a part of Congress but not the same as Congress.\\n\\nTherefore the score is: 80"
    },
    {
        "question": "Passage: A spark plug (sometimes, in British English, a sparking plug, and, colloquially, a plug) is a device for delivering electric current from an ignition system to the combustion chamber of a spark-ignition engine to ignite the compressed fuel/air mixture by an electric spark, while containing combustion pressure within the engine. A spark plug has a metal threaded shell, electrically isolated from a central electrode by a porcelain insulator. The central electrode, which may contain a resistor, is connected by a heavily insulated wire to the output terminal of an ignition coil or magneto. The spark plug's metal shell is screwed into the engine's cylinder head and thus electrically grounded. The central electrode protrudes through the porcelain insulator into the combustion chamber, forming one or more spark gaps between the inner end of the central electrode and usually one or more protuberances or structures attached to the inner end of the threaded shell and designated the side, earth, or ground electrode(s). Question: does a spark plug keep an engine running?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The United States does not have a nationwide soda tax, but a few of its cities have passed their own tax and the U.S. has seen a growing debate around taxing soda in various cities, states and even in Congress in recent years. A few states impose excise taxes on bottled soft drinks or on wholesalers, manufacturers, or distributors of soft drinks. Question: is there a sugar tax in the us?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: In ice hockey, a goal is scored when the puck entirely crosses the goal line between the two goal posts and below the goal crossbar. A goal awards one point to the team attacking the goal scored upon, regardless of which team the player who actually deflected the puck into the goal belongs to (see also own goal). Typically, a player on the team attempting to score shoots the puck with his/her stick towards the goal net opening, and a player on the opposing team called a goaltender tries to block the shot to prevent a goal from being scored against his/her team. Question: does the hockey puck have to cross the line to be a goal?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Since the joule is also a watt-second and the common unit for electricity sales to homes is the kW\u22c5h (kilowatt-hour), a kW\u22c5h is thus 1000 W \u00d7 3600 s = 3.6 MJ (megajoules). Question: is a joule the same as a watt?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " There is no answer provided by the student. Therefore the score is: 0/100"
    },
    {
        "question": "Passage: The first FA Cup Final to go to extra time and a replay was the 1875 final, between the Royal Engineers and the Old Etonians. The initial tie finished 1--1 but the Royal Engineers won the replay 2--0 in normal time. The last replayed final was the 1993 FA Cup Final, when Arsenal and Sheffield Wednesday fought a 1--1 draw. The replay saw Arsenal win the FA Cup, 2--1 after extra time. Question: can the fa cup final end in a tie?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Although Scottish Gaelic and Irish are closely related as Goidelic Celtic languages (or Gaelic languages), they are in fact starkly different in many ways. While most dialects are not immediately mutually comprehensible (although many individual words and phrases are), speakers of the two languages can rapidly develop mutual intelligibility. Question: is scottish gaelic and irish gaelic the same?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Air Canada Rouge, is a low-cost airline and subsidiary of Air Canada. Air Canada Rouge is fully integrated into the Air Canada mainline and Air Canada Express networks; flights are sold with AC flight numbers but are listed as ``operated by Air Canada Rouge'' (similar to regional flights operated under the Air Canada Express banner). Rouge means ``red'' in French. Question: is air canada rouge the same as air canada?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: This is a list of all penalty shoot-outs that have occurred in the Finals tournament of the FIFA World Cup. Penalty shoot-outs were introduced as tie-breakers in the 1978 World Cup but did not occur before 1982. The first time a World Cup title was won by penalty shoot-out was in 1994. The only other time was in 2006. By the end of the 2018 edition, 30 shoot-outs have taken place in the World Cup. Of these, only two reached the sudden death stage after still being tied at the end of ``best of five kicks''. Question: can the world cup final be decided on penalties?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Juris Doctor degree (J.D. or JD), also known as the Doctor of Jurisprudence degree (J.D., JD, D.Jur. or DJur), is a graduate-entry professional degree in law and one of several Doctor of Law degrees. It is earned by completing law school in Australia, Canada and the United States, and some other common law countries. It has the academic standing of a professional doctorate in the United States, a master's degree in Australia, and a second-entry, baccalaureate degree in Canada, (in all three jurisdictions the same as other professional degrees such as M.D. or D.D.S., the degrees required to be a practicing physician or dentist, respectively). Question: is a jd the same as a doctorate?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The frequency of nocturnal emissions is highly variable. Some reported that it is due to being sexually inactive for a period of 1--2 weeks, with no engagement in either intercourse or masturbation. Some males have experienced large numbers of nocturnal emissions as teenagers, while others have never experienced one. In the U.S., 83% of men experience nocturnal emissions at some time in their life. For males who have experienced nocturnal emissions the mean frequency ranges from 0.36 times per week (about once every three weeks) for single 15-year-old males to 0.18 times per week (about once every five-and-a-half weeks) for 40-year-old single males. For married males the mean ranges from 0.23 times per week (about once per month) for 19-year-old married males to 0.15 times per week (about once every two months) for 50-year-old married males. In some parts of the world nocturnal emissions are more common. For example, in Indonesia surveys have shown that 97% of men experience nocturnal emissions by the age of 24. Question: is it normal for a 50 year old man to have wet dreams?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The deed in lieu of foreclosure offers several advantages to both the borrower and the lender. The principal advantage to the borrower is that it immediately releases him/her from most or all of the personal indebtedness associated with the defaulted loan. The borrower also avoids the public notoriety of a foreclosure proceeding and may receive more generous terms than he/she would in a formal foreclosure. Another benefit to the borrower is that it hurts his/her credit less than a foreclosure does. Advantages to a lender include a reduction in the time and cost of a repossession, lower risk of borrower revenge (metal theft and vandalism of the property before sheriff eviction), and additional advantages if the borrower subsequently files for bankruptcy. Question: is a deed in lieu considered a foreclosure?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: We Bought a Zoo is a 2011 American family comedy-drama film loosely based on the 2008 memoir of the same name by Benjamin Mee. It was written and directed by Cameron Crowe and stars Matt Damon as widowed father Benjamin Mee, who purchases a dilapidated zoo with his family and takes on the challenge of preparing the zoo for its reopening to the public. The film also stars Scarlett Johansson, Maggie Elizabeth Jones, Thomas Haden Church, Patrick Fugit, Elle Fanning, Colin Ford, and John Michael Higgins. The film was released in the United States on December 23, 2011 by 20th Century Fox. The film earned $120.1 million on a $50 million budget. We Bought a Zoo was released on DVD and Blu-ray on April 3, 2012 by 20th Century Fox Home Entertainment. Dartmoor Zoological Park (originally Dartmoor Wildlife Park), on which the film is based, is a 33-acre zoological garden located near the village of Sparkwell, Devon, England. Question: is we bought a zoo a true story?",
        "pred_ans": " Answer: False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the United States House of Representatives, the District is represented by a delegate, who is not allowed to vote on the House floor but can vote on procedural matters and in congressional committees. D.C. residents have no representation in the United States Senate. The Twenty-third Amendment to the United States Constitution, adopted in 1961, entitles the District to the same number of electoral votes as that of the least populous state in the election of the President and Vice President of the United States. Question: can you vote for president in washington dc?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In 2011, Department of Defense instructions in regard to the Medal of Honor were amended to read ``for each succeeding act that would otherwise justify award of the Medal of Honor, the individual receiving the subsequent award is authorized to wear an additional Medal of Honor ribbon and/or a 'V' device on the Medal of Honor suspension ribbon'' (the ``V'' device is a \u2044-inch-high (6.4 mm) bronze miniature letter ``V'' with serifs that denotes valor). The Medal of Honor was the only decoration authorized the use of the ``V'' device (none were ever issued) to designate subsequent awards in such fashion. Nineteen individuals, all now deceased, were double Medal of Honor recipients. In July 2014, DoD instructions were changed to read, ``A separate MOH is presented to an individual for each succeeding act that justified award.'' As of 2014, no attachments are authorized for the Medal of Honor. Question: has anyone ever been awarded two medal of honors?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The eurozone ( pronunciation (help info)), officially called the euro area, is a monetary union of 19 of the 28 European Union (EU) member states which have adopted the euro (\u20ac) as their common currency and sole legal tender. The monetary authority of the eurozone is the Eurosystem. The other nine members of the European Union continue to use their own national currencies, although most of them are obliged to adopt the euro in the future. Question: do european countries still have their own currency?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A new engine is broken in by following specific driving guidelines during the first few hours of its use. The focus of breaking in an engine is on the contact between the piston rings of the engine and the cylinder wall. There is no universal preparation or set of instructions for breaking in an engine. Most importantly, experts disagree on whether it is better to start engines on high or low power to break them in. While there are still consequences to an unsuccessful break-in, they are harder to quantify on modern engines than on older models. In general, people no longer break in the engines of their own vehicles after purchasing a car or motorcycle, because the process is done in production. It is still common, even today, to find that an owner's manual recommends gentle use at first (often specified as the first 500 or 1000 kilometres or miles). But it is usually only normal use without excessive demands that is specified, as opposed to light/limited use. For example, the manual will specify that the car be driven normally, but not in excess of the highway speed limit. Question: do you have to break in new car engines?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A key difference between the two sports is that in rugby union both sets of forwards try to push the opposition backwards whilst competing for the ball and thus the team that did not throw the ball into the scrum have some minimal chance of winning the possession. In practice, however, the team with the 'put-in' usually keeps possession (92% of the time with the feed) and put-ins are not straight. Forwards in rugby league do not usually push in the scrum, scrum-halves often feed the ball directly under the legs of their own front row rather than into the tunnel, and the team with the put-in usually retains possession (thereby making the 40/20 rule workable). Question: can you push in a rugby league scrum?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A fair catch is a feature of American football and several other codes of football, in which a player attempting to catch a ball kicked by the opposing team -- either on a kickoff or punt -- is entitled to catch the ball without interference from any member of the kicking team. A ball caught in this manner becomes dead once caught, i.e., the player catching the ball is not entitled to run with the ball in an attempt to gain yardage, and the receiving team begins its drive at the spot where the ball was caught. A player wishing to make a fair catch signals his intent by extending one arm above his head and waving it while the kicked ball is in flight. The kicking team must allow the player an opportunity to make the catch without interference. Question: can you call fair catch on a kickoff?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: In finance, the time value (TV) (extrinsic or instrumental value) of an option is the premium a rational investor would pay over its current exercise value (intrinsic value), based on the probability it will increase in value before expiry. For an American option this value is always greater than zero in a fair market, thus an option is always worth more than its current exercise value.. As an option can be thought of as 'price insurance' (e.g., an airline insuring against unexpected soaring fuel costs caused by a hurricane), TV can be thought of as the risk premium the option seller charges the buyer--the higher the expected risk (volatility \u22c5 (\\displaystyle \\cdot ) time), the higher the premium. Conversely, TV can be thought of as the price an investor is willing to pay for potential upside. Question: can the time value of an option be negative?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " There is a mistake in the student's answer. According to the passage, the time value of an option can be negative. It is mentioned that for a European option, the time value can be negative if the current exercise value is greater than the expected future value of the option. Therefore, the score is: 50"
    },
    {
        "question": "Passage: Scar makes a brief cameo appearance in the film in Simba's nightmare. In the nightmare, Simba runs down the cliff where his father died, attempting to rescue him. Scar intervenes, however, and then turns into Kovu and throws Simba off the cliff. Scar makes another cameo appearance in a pool of water, as a reflection, after Kovu is exiled from Pride Rock. Question: is scar alive in the lion king 2?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Jack Reacher: Never Go Back is a 2016 American action thriller film, directed by Edward Zwick and written by Zwick, Richard Wenk and Marshall Herskovitz, and based on the novel Never Go Back by Lee Child. A sequel to the 2012 film Jack Reacher, the film stars Tom Cruise, Cobie Smulders, Patrick Heusinger, Aldis Hodge, Danika Yarosh, Holt McCallany and Robert Knepper. The plot follows Reacher going on the run with an Army Major who has been framed for espionage, as the two reveal a dark conspiracy. Question: is there going to be a jack reacher 2?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Brake fluid is a type of hydraulic fluid used in hydraulic brake and hydraulic clutch applications in automobiles, motorcycles, light trucks, and some bicycles. It is used to transfer force into pressure, and to amplify braking force. It works because liquids are not appreciably compressible. Question: can i use hydraulic fluid for brake fluid?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The World Is Still Beautiful, also in subtitle as Still World Is Beautiful (Japanese: \u305d\u308c\u3067\u3082\u4e16\u754c\u306f\u7f8e\u3057\u3044, Hepburn: Soredemo Sekai wa Utsukushii, lit. ``Even So, The World is Beautiful'') is a Japanese manga series by Dai Shiina, serialized in Hakusensha's sh\u014djo manga magazine Hana to Yume from 2012. Soredemo Sekai wa Utsukushii was first published as a one-shot in the same magazine in 2009, with a second one-shot published in 2011. It has been collected in 14 tank\u014dbon volumes as of October 2016. An anime television series adaptation by Pierrot aired between April 6, 2014 and June 29, 2014 on NTV. Question: will there be a season 2 of soredemo sekai wa utsukushii?",
        "pred_ans": " According to the passage, there is no information about a season 2 of Soredemo Sekai wa Utsukushii. Therefore, I cannot confirm whether there will be a season 2 or not.",
        "ground_truth": false,
        "score": " There is no information in the passage about a season 2 of Soredemo Sekai wa Utsukushii. The student's answer accurately reflects this, so Therefore the score is: 100"
    },
    {
        "question": "Passage: The Munchkin or '''Sausage Cat''' is a new breed of cat characterized by its very short legs, which are caused by a genetic mutation. Much controversy erupted over the breed when it was recognized by The International Cat Association in 1995 with critics voicing concern over potential health and mobility issues. Question: is there such a thing as a munchkin cat?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Senate cloture rules historically required a two-thirds affirmative vote to advance nominations to a vote; this was changed to a three-fifths supermajority in 1975. In November 2013, the then-Democratic Senate majority eliminated the filibuster for executive branch nominees and judicial nominees except for Supreme Court nominees by invoking the so called nuclear option. In April 2017, the Republican Senate majority applied the nuclear option to Supreme Court nominations as well, enabling the nominations of Trump nominees Neil Gorsuch and Brett Kavanaugh to proceed to a vote. Question: can a filibuster stop a supreme court nominee?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Though not all of its rules represent law, the Highway Code states ``Only flash your headlights to let other road users know that you are there. Do not flash your headlights in an attempt to intimidate other road users''. Drivers warning others about speed traps have been fined in the past for ``misuse of headlights''. Question: is it illegal to flash your headlights to warn of police uk?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Elizabeth's only sibling, Princess Margaret, was born in 1930. The two princesses were educated at home under the supervision of their mother and their governess, Marion Crawford. Lessons concentrated on history, language, literature and music. Crawford published a biography of Elizabeth and Margaret's childhood years entitled The Little Princesses in 1950, much to the dismay of the royal family. The book describes Elizabeth's love of horses and dogs, her orderliness, and her attitude of responsibility. Others echoed such observations: Winston Churchill described Elizabeth when she was two as ``a character. She has an air of authority and reflectiveness astonishing in an infant.'' Her cousin Margaret Rhodes described her as ``a jolly little girl, but fundamentally sensible and well-behaved''. Question: did the queen have any brothers or sisters?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The University of Mary Hardin--Baylor (UMHB) is a Christian co-educational institution of higher learning located in Belton, Texas, United States. UMHB was chartered by the Republic of Texas in 1845 as Baylor Female College, the female department of what is now Baylor University. It has since become its own institution and grown to 3,914 students and awards degrees at the baccalaureate, master's, and doctoral levels. It is affiliated with the Baptist General Convention of Texas. Question: is baylor and mary hardin baylor the same school?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There are some key points to consider when evaluating the student's answer. First, the passage states that UMHB was chartered as Baylor Female College, which was the female department of what is now Baylor University. However, it has since become its own institution. Based on this information, we can determine that Baylor and Mary Hardin-Baylor are not the same school.\\n\\nTherefore, the score is: 85"
    },
    {
        "question": "Passage: Tomato-garlic sauce is prepared using tomatoes as a main ingredient, and is used in various cuisines and dishes. In Italian cuisine, alla pizzaiola refers to tomato-garlic sauce, which is used on pizza, pasta and meats. Question: is pizza sauce and tomato sauce the same thing?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The 1958 ill-fated Japanese expedition to Antarctica inspired the 1983 hit film Antarctica, of which Eight Below is a remake. Eight Below adapts the events of the 1958 incident, moved forward to 1993. In the 1958 event, fifteen Sakhalin Husky sled dogs were abandoned when the expedition team was unable to return to the base. When the team returned a year later, two dogs were still alive. Another seven were still chained up and dead, five were unaccounted for, and one died just outside Showa Station. Question: is the movie 8 below a true story?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: To maintain lactation, a dairy cow must be bred and produce calves. Depending on market conditions, the cow may be bred with a ``dairy bull'' or a ``beef bull.'' Female calves (heifers) with dairy breeding may be kept as replacement cows for the dairy herd. If a replacement cow turns out to be a substandard producer of milk, she then goes to market and can be slaughtered for beef. Male calves can either be used later as a breeding bull or sold and used for veal or beef. Dairy farmers usually begin breeding or artificially inseminating heifers around 13 months of age. A cow's gestation period is approximately nine months. Newborn calves are removed from their mothers quickly, usually within three days, as the mother/calf bond intensifies over time and delayed separation can cause extreme stress on both cow and calf. Question: does a cow have to be pregnant to lactate?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The name ``corpse flower'' applied to Rafflesia can be confusing because this common name also refers to the titan arum (Amorphophallus titanum) of the family Araceae. Moreover, because Amorphophallus has the world's largest unbranched inflorescence, it is sometimes mistakenly credited as having the world's largest flower. Both Rafflesia and Amorphophallus are flowering plants, but they are only distantly related. Rafflesia arnoldii has the largest single flower of any flowering plant, at least in terms of weight. Amorphophallus titanum has the largest unbranched inflorescence, while the talipot palm (Corypha umbraculifera) forms the largest branched inflorescence, containing thousands of flowers; the talipot is monocarpic, meaning the individual plants die after flowering. Question: is rafflesia the largest flower in the world?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In December of 2017, the Seven network announced the show has been renewed for a fourth season. Question: will there be a 800 words season 4?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: When the monorail company first announced details of the extension in September 2008, the airport extension was to be built with private funds and was expected to be built by 2012. However, as of March 2011, the Las Vegas Monorail Company was still in the planning phases of the proposed extension to McCarran International Airport with a proposed stop on the UNLV campus. Question: does the monorail go to las vegas airport?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: After the defeat in the 2016 Olympics, the USWNT underwent a year of experimentation which saw them losing 3 home games. If not for a comeback win against Brazil, the USWNT was on the brink of losing 4 home games in one year, a low never before seen by the USWNT. 2017 saw the USWNT play 12 games against teams ranked in the top-15 in the world. The USWNT heads into World Cup Qualifying in fall of 2018. Question: is the us womens soccer team in the world cup?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The 2001 World Series was the championship series of Major League Baseball's (MLB) 2001 season. The 97th edition of the World Series, it was a best-of-seven playoff between the National League (NL) champion Arizona Diamondbacks and the three-time defending World Series champions and American League (AL) champion New York Yankees. The Diamondbacks defeated the Yankees, four games to three to win the series. Considered one of the greatest World Series of all time, memorable aspects included two extra-inning games and three late-inning comebacks. Diamondbacks pitchers Randy Johnson and Curt Schilling were both named World Series Most Valuable Players. Question: did the yankees win the world series in 2001?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: The liver detoxifies and breaks down chemicals, poisons and other toxins that enter the body. For example, the liver transforms ammonia (which is poisonous) into urea in fish, amphibians and mammals, and into uric acid in birds and reptiles. Urea is filtered by the kidney into urine or through the gills in fish and tadpoles. Uric acid is paste-like and expelled as a semi-solid waste (the ``white'' in bird excrements). The liver also produces bile, and the body uses bile to break down fats into usable fats and unusable waste. Question: is the liver part of the excretory system?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Some legal scholars have argued that because countries have constantly invoked the Declaration for more than 50 years, it has become binding as a part of customary international law. However, in the United States, the Supreme Court in Sosa v. Alvarez-Machain (2004), concluded that the Declaration ``does not of its own force impose obligations as a matter of international law.'' Courts of other countries have also concluded that the Declaration is not in and of itself part of domestic law. Question: does the us follow the universal declaration of human rights?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In the Northern Hemisphere, the winter sun (December, January, February) rises in the southeast, transits the celestial meridian at a low angle in the south (more than 43\u00b0 above the southern horizon in the tropics), and then sets in the southwest. It is on the south (equator) side of the house all day long. A vertical window facing south (equator side) is effective for capturing solar thermal energy. For comparison, the winter sun in the Southern Hemisphere (June, July, August) rises in the northeast, peaks out at a low angle in the north (more than halfway up from the horizon in the tropics), and then sets in the northwest. There, the north-facing window would let in plenty of solar thermal energy to the house. Question: does the sun ever shine from the north?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Regan, who was not allowed in the basement previously, sees her father's notes on the creatures and on his experimentation with several different implants. When the creature returns to invade the basement, Regan places the boosted implant on a nearby microphone, magnifying the feedback to ward off the creature. Painfully disoriented, the creature exposes the flesh beneath its armored head, and Evelyn shoots the creature in the head with a shotgun, destroying its head and killing it. The family views a CCTV monitor, showing two creatures attracted by the noise of the shotgun blast approaching the house. With their newly acquired knowledge of the creatures' weakness, the members of the family arm themselves and prepare to fight back. Question: do they live at the end of a quiet place?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: As all executive authority is vested in the sovereign, their assent is required to allow for bills to become law and for letters patent and orders in council to have legal effect. While the power for these acts stems from the Canadian people through the constitutional conventions of democracy, executive authority remains vested in the Crown and is only entrusted by the sovereign to their government on behalf of the people, underlining the Crown's role in safeguarding the rights, freedoms, and democratic system of government of Canadians, and reinforcing the fact that ``governments are the servants of the people and not the reverse''. Thus, within a constitutional monarchy the sovereign's direct participation in any of these areas of governance is limited, with the sovereign normally exercising executive authority only on the advice of the executive committee of the Queen's Privy Council for Canada, with the sovereign's legislative and judicial responsibilities largely carried out through parliamentarians as well as judges and justices of the peace. The Crown today primarily functions as a guarantor of continuous and stable governance and a nonpartisan safeguard against abuse of power, the sovereign acting as a custodian of the Crown's democratic powers and a representation of the ``power of the people above government and political parties''. Question: does the british monarchy have any power in canada?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The pardon power of the President extends only to an offense recognizable under federal law. However, the governors of most of the 50 states have the power to grant pardons or reprieves for offenses under state criminal law. In other states, that power is committed to an appointed agency or board, or to a board and the governor in some hybrid arrangement (in some states the agency is merged with that of the parole board, as in the Oklahoma Pardon and Parole Board). Question: can the president pardon someone convicted of a state crime?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: All burgers consist of zero (in the case of a 'grilled cheese') or more 2 oz (57 g) beef patties cooked to ``medium-well'', and served on a toasted bun. The standard style of burger includes tomato, hand-leafed lettuce and ``spread'', a sauce similar to Thousand Island dressing. Question: do in n out burgers come with pickles?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " There is no information about whether In-N-Out Burgers come with pickles in the given passage. Therefore, the score is: 50"
    },
    {
        "question": "Passage: In more recent times Auslan has seen a significant amount of lexical borrowing from American Sign Language (ASL), especially in signs for technical terms. Some of these arose from the signed English educational philosophies of the 1970s and 80s, when a committee looking for signs with direct equivalence to English words found them in ASL and/or in invented English-based signed systems used in North America and introduced them in the classroom. ASL contains many signs initialised from an alphabet which was also derived from LSF, and Auslan users, already familiar with the related ISL alphabet, accepted many of the new signs easily. Question: is american and australian sign language the same?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The band originally formed on September 27, 2016, and announced it the next day via their YouTube account. Since then, the band has released three EPs and five singles. Question: is why don't we still a band?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The law of Vatican City State consists of many forms, the most important of which is the Fundamental Law of Vatican City State. The Code of Penal Procedure governs tribunals and the Lateran Treaty governs relations with the Republic of Italy. Question: does the vatican city have its own laws?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " \u9996\u5148\uff0c\u6211\u9700\u8981\u9605\u8bfb\u60a8\u63d0\u4f9b\u7684\u95ee\u9898\u548c\u5b66\u751f\u7b54\u6848\u3002\u7136\u540e\uff0c\u6211\u5c06\u6839\u636e\u95ee\u9898\u4e2d\u7684\u6587\u7ae0\u5224\u65ad\u5b66\u751f\u7684\u7b54\u6848\u662f\u5426\u6b63\u786e\uff0c\u5e76\u7ed9\u51fa\u4e00\u4e2a 0 \u5230 100 \u4e4b\u95f4\u7684\u5206\u6570\u3002\\n\\n\u95ee\u9898\uff1a```\u68b5\u8482\u5188\u57ce\u56fd\u662f\u5426\u6709\u81ea\u5df1\u7684\u6cd5\u5f8b\uff1f```\\n\\n\u5b66\u751f\u7b54\u6848\uff1a```NILL```\\n\\n\u56e0\u6b64\uff0c\u6839\u636e\u6587\u7ae0\u5185\u5bb9\uff0c\u68b5\u8482\u5188\u57ce\u56fd\u786e\u5b9e\u6709\u81ea\u5df1\u7684\u4e00\u7cfb\u5217\u6cd5\u5f8b\uff0c\u5305\u62ec\u300a\u68b5\u8482\u5188\u57ce\u56fd\u57fa\u672c\u6cd5\u300b\u7b49\u3002\u6240\u4ee5\uff0c\u5b66\u751f\u7684\u7b54\u6848\u662f\u9519\u8bef\u7684\u3002\\n\\n\u56e0\u6b64\uff0c\u5206\u6570\u662f\uff1a20"
    },
    {
        "question": "Passage: In 1998 EchoStar purchased the broadcasting assets of a satellite broadcasting joint venture of News Corporation and MCI Worldcom, called ASkyB (for American Sky Broadcasting, named after News Corp's BSkyB service in Britain); the two companies had nearly merged (which called for Dish Network being renamed Sky) before it was called off due to Charlie Ergen's clashes with News Corp. executives. With this purchase EchoStar obtained 28 of the 32 transponder licenses in the 110\u00b0 West orbital slot, more than doubling existing continental United States broadcasting capacity at a value of $682.5 million; some of the other assets were picked up by rival PrimeStar, which was sold to DirecTV in 1999. The acquisition (which also included an uplink center in Gilbert, Arizona) inspired the company to introduce a multi satellite system called Dish 500, theoretically capable of receiving more than 500 channels on one Dish. In the same year, EchoStar, partnering with Bell Canada, launched Dish Network Canada. Question: is directv and dish network owned by the same company?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Oslo Airport Station (Norwegian: Oslo lufthavn stasjon), also known as Gardermoen Station, is a railway station located in the airport terminal building of Oslo Airport, Gardermoen in Norway. Located on the Gardermoen Line, it is served by the Airport Express Trains, express trains to Trondheim and Oslo, regional trains to Lillehammer and Skien (via Oslo) and commuter trains to Eidsvoll and Kongsberg (via Oslo). Question: is there a train station at oslo airport?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The vagus nerve (/\u02c8ve\u026a\u0261\u0259s/ VAY-g\u0259s), historically cited as the pneumogastric nerve, is the tenth cranial nerve or CN X, and interfaces with parasympathetic control of the heart, lungs, and digestive tract. The vagus nerves are paired; however, they are normally referred to in the singular. It is the longest nerve of the autonomic nervous system in the human body. The vagus nerve also has a sympathetic function via the peripheral chemoreceptors. Question: is the vagus nerve part of the cns?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: In the Time of the Butterflies is a historical novel by Julia Alvarez, relating an account of the Mirabal sisters during the time of the Trujillo dictatorship in the Dominican Republic. The book is written in the first and third person, by and about the Mirabal sisters. First published in 1994, the story was adapted into a feature film in 2001. Question: is in the time of the butterflies a true story?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The U.S. Coast Guard reports directly to the Secretary of Homeland Security. However, under 14 U.S.C. \u00a7 3 as amended by section 211 of the Coast Guard and Maritime Transportation Act of 2006, upon the declaration of war and when Congress so directs in the declaration, or when the President directs, the Coast Guard operates under the Department of Defense as a service in the Department of the Navy. Question: is coast guard part of department of defense?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Church of England (C of E) is the state church of England. The Archbishop of Canterbury (currently Justin Welby) is the most senior cleric, although the monarch is the supreme governor. The Church of England is also the mother church of the international Anglican Communion. It traces its history to the Christian church recorded as existing in the Roman province of Britain by the third century, and to the 6th-century Gregorian mission to Kent led by Augustine of Canterbury. Question: are the church of england and the anglican church the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: There are eleven official languages of South Africa: Afrikaans, English, Ndebele, Northern Sotho, Sotho, SiSwati, Tsonga, Tswana, Venda, Xhosa and Zulu. Fewer than two percent of South Africans speak a first language other than an official one. Most South Africans can speak more than one language. Dutch and English were the first official languages of South Africa from 1910 to 1925. Afrikaans was added as a part of Dutch in 1925, although in practice, Afrikaans effectively replaced Dutch, which fell into disuse. When South Africa became a republic in 1961, the official relationship changed such that Afrikaans was considered to include Dutch, and Dutch was dropped in 1984, so between 1984 and 1994, South Africa had two official languages: English and Afrikaans. Question: is english the official language of south africa?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: On the film review aggregation website Rotten Tomatoes, films that have exclusively positive reviews and have been reviewed by at least five critics have a 100% approval rating. Many of these films, particularly those with a high number of positive reviews, have achieved wide critical acclaim and are often considered among the best. A number of these films also appear on the AFI's 100 Years...100 Movies lists, but there are many others and several entries with dozens of positive reviews, which are considered surprising to some experts. As of July 2018, Paddington 2 holds the site's record, with an approval rating of 100% and 199 positive reviews. Question: has any movie gotten 100 on rotten tomatoes?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: The following is the list of teams to overcome 3--1 series deficits by winning three straight games to win a best-of-seven playoff series. In the history of major North American pro sports, teams that were down 3--1 in the series came back and won the series 52 times, more than half of them were accomplished by National Hockey League (NHL) teams. Teams overcame 3--1 deficit in the final championship round eight times, six were accomplished by Major League Baseball (MLB) teams in the World Series. Teams overcoming 3--0 deficit by winning four straight games were accomplished five times, four times in the NHL and once in MLB. Question: has anyone come back from 3-0 in the nba finals?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In the United Kingdom in March 2008, 20,000 numbered packs of pink Blu Tack were made available, to help raise money for Breast Cancer Campaign, with 10 pence from each pack going to the charity. The formulation was slightly altered to retain complete consistency with its blue counterpart. Since then, many coloured variations have been made, including red and white, yellow and a green Halloween pack. Question: is white tack the same as blu tack?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In biology, tissue is a cellular organizational level between cells and a complete organ. A tissue is an ensemble of similar cells and their extracellular matrix from the same origin that together carry out a specific function. Organs are then formed by the functional grouping together of multiple tissues. Question: is tissue composed of one type of cell?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " There is a difference between a single cell and a tissue. A tissue is an ensemble of similar cells and their extracellular matrix from the same origin that together carry out a specific function. Therefore, the score is: 80"
    },
    {
        "question": "Passage: The Isle of Dogs, locally referred to as the island, is a geographic area made up of Millwall, Cubitt Town, Canary Wharf and parts of Blackwall, Limehouse and Poplar. It is in the East End of London and is bounded on three sides (east, south and west) by one of the largest meanders in the River Thames. The northern boundary has never been clearly or consistently defined but many accept it to be the (former) line of the West India South Dock. The name Isle of Dogs had no official status until 1987, with the creation of the Isle of Dogs Neighbourhood by Tower Hamlets London Borough Council. Question: is canary wharf on the isle of dogs?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The Ford Escape is a compact crossover vehicle sold by Ford since 2000 over three generations. Ford released the original model in 2000 for the 2001 model year--a model jointly developed and released with Mazda of Japan--who took a lead in the engineering of the two models and sold their version as the Mazda Tribute. Although the Escape and Tribute share the same underpinnings constructed from the Ford CD2 platform (based on Mazda GF underpinnings), the only panels common to the two vehicles are the roof and floor pressings. Powertrains were supplied by Mazda with respect to the base inline-four engine, with Ford providing the optional V6. At first, the twinned models were assembled by Ford in the US for North American consumption, with Mazda in Japan supplying cars for other markets. This followed a long history of Mazda-derived Fords, starting with the Ford Courier in the 1970s. Ford also sold the first generation Escape in Europe and China as the Ford Maverick, replacing the previous Nissan-sourced model. Then in 2004, for the 2005 model year, Ford's luxury Mercury division released a rebadged version called the Mercury Mariner, sold mainly in North America. The first iteration Escape remains notable as the first SUV to offer a hybrid drivetrain option, released in 2004 for the 2005 model year to North American markets only. Question: are mazda tribute and ford escape the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: International Blind Sports Federation rules require that any time during a game in which one team has scored ten (10) more goals than the other team that game is deemed completed. In US high school soccer, most states use a mercy rule that ends the game if one team is ahead by 10 or more goals at any point from halftime onward. Youth soccer leagues use variations on the rule. Question: is there a mercy rule in professional soccer?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Most seat belt laws in the United States are left to the states. However, the first seat belt law was a federal law, Title 49 of the United States Code, Chapter 301, Motor Vehicle Safety Standard, which took effect on January 1, 1968, that required all vehicles (except buses) to be fitted with seat belts in all designated seating positions. This law has since been modified to require three-point seat belts in outboard-seating positions, and finally three-point seat belts in all seating positions. Initially, seat belt use was voluntary. New York was the first state to pass a law which required vehicle occupants to wear seat belts, a law that came into effect on December 1, 1984. Officer Nicholas Cimmino of the Westchester County Department of Public Safety wrote the nation's first ticket for such violation. New Hampshire is the only state that has no enforceable laws for the wearing of seat belts in a vehicle. Question: are there any states that do not have a seat belt law?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Dalmatian puppies are born with plain white coats and their first spots usually appear within 3 to 4 weeks after birth, however spots are visible on their skin. After about a month, they have most of their spots, although they continue to develop throughout life at a much slower rate. Spots usually range in size from 30 to 60 mm, and are most commonly black or brown (liver) on a white background. Other, more rare colors, include blue (a blue-grayish color), brindle, mosaic, tricolor-ed (with tan spotting on the eyebrows, cheeks, legs, and chest), and orange or lemon (dark to pale yellow). Patches of color may appear anywhere on the body, mostly on the head or ears, and usually, consist of a solid color. Patches are visible at birth and are not a group of connected spots and are identifiable by the smooth edge of the patch. Question: do dalmatians get more spots as they grow?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: JPMorgan Chase Bank, N.A., doing business as Chase Bank, is a national bank headquartered in Manhattan, New York City, that constitutes the consumer and commercial banking subsidiary of the U.S. multinational banking and financial services holding company, JPMorgan Chase & Co. The bank was known as Chase Manhattan Bank until it merged with J.P. Morgan & Co. in 2000. Chase Manhattan Bank was formed by the merger of the Chase National Bank and The Manhattan Company in 1955. The bank has been headquartered in Columbus, Ohio since its merger with Bank One Corporation in 2004. The bank acquired the deposits and most assets of Washington Mutual. Question: is jpmorgan chase the same as chase bank?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Diary of a Wimpy Kid is a satirical realistic fiction comedy novel for children and teenagers written and illustrated by Jeff Kinney. It is the first book in the Diary of a Wimpy Kid series. The book is about a boy named Greg Heffley and his struggles to fit in as he begins middle school. Question: is diary of a wimpy kid considered a graphic novel?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Stand by Me is a 1986 American coming-of-age comedy-drama film directed by Rob Reiner and starring Wil Wheaton, River Phoenix, Corey Feldman, and Jerry O'Connell. The film, whose plot is based on Stephen King's novella The Body (1982) and title is derived from Ben E. King's eponymous song, which plays over the ending credits, tells the story of four boys in a small town in Maine who go on a hike to find the dead body of a missing child. Question: did river phoenix play in stand by me?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: There is no offside offence if a player receives the ball directly from a goal kick, a corner kick, a throw-in, or a dropped-ball. It is also not an offence if the ball was last deliberately played by an opponent (except for a deliberate save). In this context, according to the IFAB, ``A 'save' is when a player stops, or attempts to stop, a ball which is going into or very close to the goal with any part of the body except the hands/arms (unless the goalkeeper within the penalty area).'' Question: can you be offside on a corner kick?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: An American in Paris was an enormous success, garnering eight Academy Award nominations and winning six, as well as earning other industry honors. In 1993, it was selected for preservation by the United States Library of Congress in the National Film Registry for being ``culturally, historically, or aesthetically significant.'' It is ranked #9 among AFI's Greatest Movie Musicals. Question: did an american in paris win an oscar?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Temperatures at sea level generally range from highs of 85--90 \u00b0F (29--32 \u00b0C) during the summer months to 79--83 \u00b0F (26--28 \u00b0C) during the winter months. Rarely does the temperature rise above 90 \u00b0F (32 \u00b0C) or drop below 65 \u00b0F (18 \u00b0C) at lower elevations. Temperatures are lower at higher altitudes; in fact, the three highest mountains of Mauna Kea, Mauna Loa, and Haleakal\u0101 often receive snowfall during the winter. Question: does it get cold at night in hawaii?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The hosts of the World Cup receive an automatic berth. Unlike many other sports, results of the previous World Cups or of the continental championships are not taken into account. Until 2002, the defending champions also received an automatic berth, but starting from the 2006 World Cup this is no longer the case. Question: does the host country for the world cup get an automatic bid?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Black garlic can be eaten alone, on bread, or used in soups, sauces, crushed into a mayonnaise or simply tossed into a vegetable dish. A vinaigrette can be made with black garlic, sherry vinegar, soy, a neutral oil, and Dijon mustard. Its softness increases with water content. Question: is japanese black garlic supposed to be mushy?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There is no information in the passage to indicate whether Japanese black garlic is supposed to be mushy or not. The student's answer of \\\"False\\\" is not based on any information provided in the passage and is therefore not accurate. Therefore the score is: 50"
    },
    {
        "question": "Passage: Ladies may wear a long (over the shoulders or to ankles) cloak usually called a cape, or a full-length cloak. Gentlemen wear an ankle-length or full-length cloak. Formal cloaks often have expensive, colored linings and trimmings such as silk, satin, velvet and fur. Question: is a cape and a cloak the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There is a difference between a cape and a cloak, as described in the passage. A cape is usually over the shoulders or to the ankles, while a cloak can be ankle-length or full-length. Additionally, formal cloaks often have expensive, colored linings and trimmings such as silk, satin, velvet, and fur. Therefore the score is: 70"
    },
    {
        "question": "Passage: On 15 December 2014, the official music video for ``The Nights'' was released on YouTube and premiered on the front page of Yahoo Music. The video was produced, directed by, and stars ``professional life liver'' Rory Kramer, who filmed an exuberant action-packed recollection of his own life on roller coasters, surfing, snowboarding, skateboarding, balloon flying, making a four door convertible out of a Toyota, etc. -- living a life to be remembered. Question: is avicii the guy in the nights video?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Germany's Theo Albrecht (owner and CEO of Aldi Nord) bought the company in 1979 as a personal investment for his family. Coulombe was succeeded as CEO by John Shields in 1987. Under his leadership the company expanded beyond California, moving into Arizona in 1993 and into the Pacific Northwest two years later. In 1996, the company opened its first stores on the East Coast: in Brookline and Cambridge both outside Boston. Shields retired in 2001 when Dan Bane succeeded him as CEO after being the President of the Western Division. When Bane became CEO there were 156 stores in 15 states. Question: is aldi's affiliated with trader joe's?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: As the Confederation Congress attempted to govern the continually growing American states, delegates discovered that the limitations placed upon the central government rendered it ineffective at doing so. As the government's weaknesses became apparent, especially after Shays' Rebellion, individuals began asking for changes to the Articles. Their hope was to create a stronger national government. Initially, some states met to deal with their trade and economic problems. However, as more states became interested in meeting to change the Articles, a meeting was set in Philadelphia on May 25, 1787. This became the Constitutional Convention. It was quickly realized that changes would not work, and instead the entire Articles needed to be replaced. On March 4, 1789, the government under the Articles was replaced with the federal government under the Constitution. The new Constitution provided for a much stronger federal government by establishing a chief executive (the President), courts, and taxing powers. Question: do we still follow the articles of confederation?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Lynyrd Skynyrd is a Southern rock band from Jacksonville, Florida. Formed in 1964, the group originally included vocalist Ronnie Van Zant, guitarists Gary Rossington and Allen Collins, bassist Larry Junstrom and drummer Bob Burns. The current lineup features Rossington, guitarist and vocalist Rickey Medlocke (from 1971 to 1972, and since 1996), lead vocalist Johnny Van Zant (since 1987), drummer Michael Cartellone (since 1999), guitarist Mark Matejka (since 2006), keyboardist Peter Keys (since 2009) and bassist Keith Christopher (since 2017). The band also tours with two backing vocalists, currently Dale Krantz-Rossington (since 1987) and Carol Chase (since 1996). Question: is there any original members of lynyrd skynyrd?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Additionally, although fees for debit card ATM usage are very rare in countries such as the UK, where cashback originated , this is not the case in some other countries. In Canada and the United States, fees of $1~2 are typical when using an ATM from a different bank than the one with which the customer has an account. The fees in some other countries are even higher. In Germany, for instance, usual fees are \u20ac4~5 when using an ATM of another bank network than the one of his bank. This gives rise to another potential cashback advantage for the consumer: by making use of the cashback procedure, this ATM fee can be avoided for the cardholder. Question: does it cost money to get cash back?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Since the September 11 attacks in 2001, the island is guarded by patrols of the United States Park Police Marine Patrol Unit. Public access is by ferry from either Communipaw Terminal in Liberty State Park or from the Battery at the southern tip of Manhattan. The ferry operator, Hornblower Cruises and Events, also provides service to the nearby Statue of Liberty. A bridge built for transporting materials and personnel during restoration projects connects Ellis Island with Liberty State Park but is not open to the public. The city of New York and the private ferry operator at the time opposed proposals to use it or replace it with a pedestrian bridge. Question: is ellis island connected to the statue of liberty?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is no mention of a direct connection between Ellis Island and the Statue of Liberty in the passage. However, the passage does mention that the ferry operator, Hornblower Cruises and Events, provides service to both Ellis Island and the Statue of Liberty. So, while there isn't a direct connection, you can access both locations via the same ferry service.\\n\\nTherefore the score is: 70"
    },
    {
        "question": "Passage: Tyrannosaurus is a genus of coelurosaurian theropod dinosaur. The species Tyrannosaurus rex (rex meaning ``king'' in Latin), often colloquially called simply T. rex or T-Rex, is one of the most well-represented of the large theropods. Tyrannosaurus lived throughout what is now western North America, on what was then an island continent known as Laramidia. Tyrannosaurus had a much wider range than other tyrannosaurids. Fossils are found in a variety of rock formations dating to the Maastrichtian age of the upper Cretaceous Period, 68 to 66 million years ago. It was the last known member of the tyrannosaurids, and among the last non-avian dinosaurs to exist before the Cretaceous--Paleogene extinction event. Question: are t rex and tyrannosaurus rex the same?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Humans have a four-chambered heart consisting of the right atrium, left atrium, right ventricle, and left ventricle. The atria are the two upper chambers. The right atrium receives and holds deoxygenated blood from the superior vena cava, inferior vena cava, anterior cardiac veins and smallest cardiac veins and the coronary sinus, which it then sends down to the right ventricle (through the tricuspid valve) which in turn sends it to the pulmonary artery for pulmonary circulation. The left atrium receives the oxygenated blood from the left and right pulmonary veins, which it pumps to the left ventricle (through the mitral valve) for pumping out through the aorta for systemic circulation. Question: is there a difference in structure of the two atria?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The eighth season of the American legal drama Suits was ordered on January 30, 2018, and began airing on USA Network in the United States July 18, 2018. Question: are they making a season 8 of suits?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: At the time of the FCC vote, the Senate had the proper amount of backing to force its own vote on net neutrality. The vote was being forced under Senate rules that went into effect in 1996 called the Congressional Review Act. Senate Democrats expressed optimism at their level of support given the help of Republican member Susan Collins. The motion to restore net neutrality passed in the Senate on May 16, 2018. Collins was joined by Republicans John Kennedy and Lisa Murkowski. If the challenge is not passed by the House of Representatives and signed by the President within 60 legislative days from February 22, 2018 (the date of publication in the Federal Register), the measure will fail. Barring that, FCC Commissioner Rosenworcel said that ``Restoring Internet Freedom'' will become the official policy of the US June 11, 2018. FCC Chairman Ajit Pai responded to the Senate vote by saying ``It's disappointing that Senate Democrats forced this resolution through by a narrow margin, but ultimately, I'm confident that their effort to reinstate heavy-handed government regulation of the Internet will fail'' and cited The Washington Post's ``three-Pinnochio'' fact-check of Democratic claims regarding net neutrality. Question: do we still have net neutrality in the us?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Recent astronomical research suggests that planetary systems around some other stars may follow Titius--Bode-like laws. Bovaird and Lineweaver applied a generalized Titius--Bode relation to 68 exoplanet systems that contain four or more planets. They showed that 96% of these exoplanet systems adhere to a generalized Titius--Bode relation to a similar or greater extent than the Solar System does. The locations of potentially undetected exoplanets are predicted in each system. Question: do exoplanetary systems follow the titus bode rule?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Generally, there is an increase in biodiversity from the poles to the tropics. Thus localities at lower latitudes have more species than localities at higher latitudes. This is often referred to as the latitudinal gradient in species diversity. Several ecological mechanisms may contribute to the gradient, but the ultimate factor behind many of them is the greater mean temperature at the equator compared to that of the poles. Question: is there a relationship between our latitude position and high diversity of life form in the country?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is a relationship between latitude position and high diversity of life forms in the country, as the passage states that there is generally an increase in biodiversity from the poles to the tropics. Therefore, the score is: 50/100"
    },
    {
        "question": "Passage: Instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming instructions into a series of sequential steps (the eponymous ``pipeline'') performed by different processor units with different parts of instructions processed in parallel. It allows faster CPU throughput than would otherwise be possible at a given clock rate, but may increase latency due to the added overhead of the pipelining process itself. Question: can pipelining help latency of a single task?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The kilowatt hour (symbol kWh, kW\u22c5h or kW h) is a unit of energy equal to 3.6 megajoules. If energy is transmitted or used at a constant rate (power) over a period of time, the total energy in kilowatt hours is equal to the power in kilowatts multiplied by the time in hours. The kilowatt hour is commonly used as a billing unit for energy delivered to consumers by electric utilities. Question: is a kilowatt hour a unit of power?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Puppies are born with a fully functional sense of smell but can't open their eyes. During their first two weeks, a puppy's senses all develop rapidly. During this stage the nose is the primary sense organ used by puppies to find their mother's teats, and to locate their littermates, if they become separated by a short distance. Puppies open their eyes about nine to eleven days following birth. At first, their retinas are poorly developed and their vision is poor. Puppies are not able to see as well as adult dogs. In addition, puppies' ears remain sealed until about thirteen to seventeen days after birth, after which they respond more actively to sounds. Between two and four weeks old, puppies usually begin to growl, bite, wag their tails, and bark. Question: can a puppy see when they first open their eyes?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Aside from tidal flooding along rivers, the storm's effects were primarily concentrated along the coast. A buoy off the coast of Nova Scotia reported a wave height of 100.7 feet (30.7 m), the highest ever recorded in the province's offshore waters. In the middle of the storm, the fishing vessel Andrea Gail sank, killing her crew of six and inspiring the book, and later movie, The Perfect Storm. Off the shore of New York's Long Island, an Air National Guard helicopter ran out of fuel and crashed; four members of its crew were rescued and one was killed. Two people died after their boat sank off Staten Island. High waves swept two people to their deaths, one in Rhode Island and one in Puerto Rico, and another person was blown off a bridge to his death. The tropical cyclone that formed late in the storm's duration caused little impact, limited to power outages and slick roads; one person was killed in Newfoundland from a traffic accident related to the storm. Question: did a helicopter crash in the perfect storm?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: All biomass goes through at least some of these steps: it needs to be grown, collected, dried, fermented, distilled, and burned. All of these steps require resources and an infrastructure. The total amount of energy input into the process compared to the energy released by burning the resulting ethanol fuel is known as the energy balance (or ``energy returned on energy invested''). Figures compiled in a 2007 report by National Geographic Magazine point to modest results for corn ethanol produced in the US: one unit of fossil-fuel energy is required to create 1.3 energy units from the resulting ethanol. The energy balance for sugarcane ethanol produced in Brazil is more favorable, with one unit of fossil-fuel energy required to create 8 from the ethanol. Energy balance estimates are not easily produced, thus numerous such reports have been generated that are contradictory. For instance, a separate survey reports that production of ethanol from sugarcane, which requires a tropical climate to grow productively, returns from 8 to 9 units of energy for each unit expended, as compared to corn, which only returns about 1.34 units of fuel energy for each unit of energy expended. A 2006 University of California Berkeley study, after analyzing six separate studies, concluded that producing ethanol from corn uses much less petroleum than producing gasoline. Question: does ethanol take more energy make that produces?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: UPC (technically refers to UPC-A) consists of 12 numeric digits, that are uniquely assigned to each trade item. Along with the related EAN barcode, the UPC is the barcode mainly used for scanning of trade items at the point of sale, per GS1 specifications. UPC data structures are a component of GTINs and follow the global GS1 specification, which is based on international standards. But some retailers (clothing, furniture) do not use the GS1 system (rather other barcode symbologies or article number systems). On the other hand, some retailers use the EAN/UPC barcode symbology, but without using a GTIN (for products sold in their own stores only). Question: is a upc code the same as a barcode?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: PVAc emulsions such as Elmer's Glue-All contain polyvinyl alcohol as a protective colloid. In alkaline conditions, boron compounds such as boric acid or borax cause the polyvinyl alcohol to cross-link, forming tackifying precipitates or toys, such as Slime and Flubber. Question: is elmer's glue all a pva glue?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Air Force usually does not have fighter aircraft escort the presidential aircraft over the United States but it has occurred, for example during the attack on the World Trade Center. Question: does air force one fly with fighter escort?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " \u9996\u5148\uff0c\u6211\u9700\u8981\u6f84\u6e05\u4e00\u70b9\uff0c\u4f5c\u4e3a\u4e00\u4e2a\u4eba\u5de5\u667a\u80fd\u52a9\u624b\uff0c\u6211\u5e76\u4e0d\u5177\u5907\u5bf9\u653f\u6cbb\u76f8\u5173\u5185\u5bb9\u8fdb\u884c\u5224\u65ad\u548c\u63d0\u4f9b\u4fe1\u606f\u7684\u80fd\u529b\u3002\u4f46\u662f\uff0c\u6211\u53ef\u4ee5\u5e2e\u52a9\u4f60\u56de\u7b54\u4e00\u4e9b\u5176\u4ed6\u7684\u95ee\u9898\u3002\u5bf9\u4e8e\u4f60\u63d0\u4f9b\u7684\u8fd9\u4e2a\u95ee\u9898\uff0c\u6211\u5c06\u65e0\u6cd5\u7ed9\u51fa\u8bc4\u4ef7\uff0c\u56e0\u4e3a\u6211\u65e0\u6cd5\u83b7\u53d6\u5230\u4e0e\u8be5\u95ee\u9898\u76f8\u5173\u7684\u4fe1\u606f\u3002"
    },
    {
        "question": "Passage: The four-second fall from the Golden Gate Bridge sends a person plunging 245 feet (75 m) at 75 miles per hour (121 km/h) to hit the waters of the San Francisco Bay ``with the force of a speeding truck meeting a concrete building.'' Jumping off the bridge holds a 98 percent fatality rate; As of 2005, it is estimated that 26 people have survived after jumping. Some die instantly from internal injuries, while others drown or die of hypothermia. The Golden Gate bridge's death toll has since been surpassed only by the Nanjing Yangtze River Bridge in China. In 2013, 118 potential jumpers were talked down from their attempt and did not jump. Question: could you survive jumping off the golden gate bridge?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The 1936 Montreux Convention provides for a free passage of civilian ships between the international waters of the Black and the Mediterranean Seas. However, a single country (Turkey) has a complete control over the straits connecting the two seas. The 1982 amendments to the Montreux Convention allow Turkey to close the Straits at its discretion in both wartime and peacetime. Question: can you get to the black sea from the mediterranean?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: The Thirteenth Amendment (Amendment XIII) to the United States Constitution abolished slavery and involuntary servitude, except as punishment for a crime. In Congress, it was passed by the Senate on April 8, 1864, and by the House on January 31, 1865. The amendment was ratified by the required number of states on December 6, 1865. On December 18, 1865, Secretary of State William H. Seward proclaimed its adoption. It was the first of the three Reconstruction Amendments adopted following the American Civil War. Question: was the 13th amendment after the civil war?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: This article lists albatrosses that have been scored in important golf tournaments. An albatross, also called a double eagle, is a score of three-under-par on a single hole. This is most commonly achieved with two shots on a par-5, but can be done with a hole-in-one on a par-4 or three shots on a par-6. Question: is a double eagle the same as an albatross?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Mount Evans is the highest summit of the Chicago Peaks in the Front Range of the Rocky Mountains of North America. The prominent 14,271-foot (4350 m) fourteener is located in the Mount Evans Wilderness, 13.4 miles (21.6 km) southwest by south (bearing 214\u00b0) of the City of Idaho Springs in Clear Creek County, Colorado, United States, on the drainage divide between Arapaho National Forest and Pike National Forest. Question: is mt evans part of rocky mountain national park?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In no-limit and pot-limit games, there is a minimum amount that is required to be bet in order to open the action. In games with blinds, this amount is usually the amount of the big blind. Standard poker rules require that raises must be at least equal to the amount of the previous bet or raise. For example, if an opponent bets $5, a player must raise by at least another $5, and they may not raise by only $2. If a player raises a bet of $5 by $7 (for a total of $12), the next re-raise would have to be by at least another $7 (the previous raise) more than the $12 (for a total of at least $19). The primary purpose of the minimum raise rule is to avoid game delays caused by ``nuisance'' raises (small raises of large bets, such as an extra $1 over a current bet of $50, that have little effect on the action but take time as all others must call). This rule is overridden by table stakes rules, so that a player may in fact raise a $5 bet by $2 if that $2 is his entire remaining stake. Question: do you have to raise double in poker?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Orlando International Airport (IATA: MCO, ICAO: KMCO, FAA LID: MCO) is a major public airport located six miles (10 km) southeast of Downtown Orlando, Florida, United States. In 2017, MCO handled 44,611,265 passengers, making it the busiest airport in the state of Florida and the eleventh-busiest airport in the United States. Question: is orlando international airport the same as mco?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: In humans, a single transverse palmar crease is a single crease that extends across the palm of the hand, formed by the fusion of the two palmar creases (known in palmistry as the ``heart line'' and the ``head line'') and is found in people with Down syndrome. However, it is not an indication that a person with single transverse palmar crease has to have Down syndrome. It is also found in 1.5% of the general population in at least one hand. Question: do all down syndrome babies have simian crease?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: But, the hospital used for most other exterior and a few interior shots is not in Seattle; these scenes are shot at the VA Sepulveda Ambulatory Care Center in North Hills, California, and occasional shots from an interior walkway above the lobby show dry California mountains in the distance. The exterior of Meredith Grey's house, also known as the Intern House, is real. In the show, the address of Grey's home is 613 Harper Lane, but this is not an actual address. The physical house is located at 303 W. Comstock St., on Queen Anne Hill, Seattle, Washington. Most scenes are taped at Prospect Studios in Los Feliz, just east of Hollywood, where the Grey's Anatomy set occupies six sound stages. Some outside scenes are shot at the Warren G. Magnuson Park in Seattle. Several props used are working medical equipment, including the MRI machine. Question: is grey's anatomy filmed at a real hospital?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A soccer-specific stadium typically has amenities, dimensions and scale suitable for soccer in North America, including a scoreboard, video screen, luxury suites and possibly a roof. The field dimensions are within the range found optimal by FIFA: 110--120 yards (100--110 m) long by 70--80 yards (64--73 m) wide. These soccer field dimensions are wider than the regulation American football field width of 53 \u2044 yards (48.8 m), or the 65-yard (59 m) width of a Canadian football field. The playing surface typically consists of grass as opposed to artificial turf, as the latter is generally disfavored for soccer matches since players are more susceptible to injuries. However, some soccer specific stadiums, such as Portland's Providence Park and Creighton University's Morrison Stadium, do have artificial turf. Question: is a soccer stadium bigger than a football stadium?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The standard error (SE) of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution or an estimate of that standard deviation of estimate. If the parameter or the statistic is the mean, it is called the standard error of the mean (SEM). Question: is sampling error the same as standard deviation?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: A goal cannot be scored directly from a throw-in; if a player throws the ball directly into their own goal without any other player touching it, the result is a corner kick to the opposing side. Likewise an offensive goal cannot be scored directly from a throw in; the result in this case is a goal kick for the defending team. Question: can you score from a throw in in soccer?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Orange is the colour between yellow and red on the spectrum of visible light. Human eyes perceive orange when observing light with a dominant wavelength between roughly 585 and 620 nanometres. In painting and traditional colour theory, it is a secondary colour of pigments, created by mixing yellow and red. It is named after the fruit of the same name. Question: is the fruit orange named after the color?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Nuclear power is the use of nuclear reactions that release nuclear energy to generate heat, which most frequently is then used in steam turbines to produce electricity in a nuclear power plant. Nuclear power can be obtained from nuclear fission, nuclear decay and nuclear fusion. Presently, the vast majority of electricity from nuclear power is produced by nuclear fission of elements in the actinide series of the periodic table. Nuclear decay processes are used in niche applications such as radioisotope thermoelectric generators. The possibility of generating electricity from nuclear fusion is still at a research phase with no commercial applications. This article mostly deals with nuclear fission power for electricity generation. Question: is nuclear power the same as nuclear energy?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: An income statement or profit and loss account (also referred to as a profit and loss statement (P&L), statement of profit or loss, revenue statement, statement of financial performance, earnings statement, operating statement, or statement of operations) is one of the financial statements of a company and shows the company's revenues and expenses during a particular period. It indicates how the revenues (money received from the sale of products and services before expenses are taken out, also known as the ``top line'') are transformed into the net income (the result after all revenues and expenses have been accounted for, also known as ``net profit'' or the ``bottom line''). The purpose of the income statement is to show managers and investors whether the company made or lost money during the period being reported. Question: are profit and loss and income statement the same thing?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Bj\u00f6rlin left Days in June 2003 to concentrate on her singing career but returned later in December 2003. In September 2005, Bj\u00f6rlin left Days of our Lives again and joined the cast of the UPN series Sex, Love & Secrets. The show was canceled by the network, but Bj\u00f6rlin continued to make guest appearances on television series such as Jake in Progress and Out of Practice. Question: does chloe on days of our lives really sing?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There is no mention in the passage about whether Chloe on Days of Our Lives really sings. Therefore the score is: 50"
    },
    {
        "question": "Passage: According to data compiled by the National Highway Traffic Safety Administration, the Taconic was the second deadliest road in Dutchess County after US 9 between 1994 and 2008. New York State Police blamed travelers exceeding speed limits, wildlife crossings and trucks being directed onto the parkway by their GPS navigation devices. The state was planning to post more explicit signage making it clear that trucks are not allowed on parkways in New York. Question: can trucks go on the taconic state parkway?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Joseph William Namath (/\u02c8ne\u026am\u026a\u03b8/; born May 31, 1943), nicknamed ``Broadway Joe'', is a former American football quarterback and actor. He played college football for the University of Alabama under coach Paul ``Bear'' Bryant from 1962 to 1964, and professional football in the American Football League (AFL) and National Football League (NFL) during the 1960s and 1970s. Namath was an AFL icon and played for that league's New York Jets for most of his professional football career. He finished his career with the Los Angeles Rams. He was elected to the Pro Football Hall of Fame in 1985. Question: is joe namath in the nfl hall of fame?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The Territory of Hawaii or Hawaii Territory was an organized incorporated territory of the United States that existed from August 12, 1898, until August 21, 1959, when most of its territory, excluding Palmyra Island and the Stewart Islands, was admitted to the Union as the fiftieth U.S. state, the State of Hawaii. The Hawaii Admission Act specified that the State of Hawaii would not include the distant Palmyra Island, the Midway Islands, Kingman Reef, and Johnston Atoll, which includes Johnston (or Kalama) Island and Sand Island, and the Act was silent regarding the Stewart Islands. Question: is hawaii part of the united states territory?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: A pitaya (/p\u026a\u02c8ta\u026a.\u0259/) or pitahaya (/\u02ccp\u026at\u0259\u02c8ha\u026a.\u0259/) is the fruit of several different cactus species indigenous to the Americas. Pitaya usually refers to fruit of the genus Stenocereus, while pitahaya or dragon fruit refers to fruit of the genus Hylocereus, both in the Cactaceae family. The dragon fruit is cultivated in Southeast Asia, Florida, the Caribbean, Australia, and throughout tropical and subtropical world regions. Question: is dragon fruit and pitaya the same thing?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The project was originally titled Oxygen while in development at Isla Producciones (in collaboration with Ol\u00e9). It was then adapted for the American market by Powwow before being acquired by The CW. Star-Crossed premiered on The CW on Monday, February 17, 2014, at 8:00 pm Eastern/7:00 pm Central. The series was picked up for a thirteen-episode season. Question: is the tv series star crossed based on a book?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: The Fire Tablet, formerly called the Kindle Fire, is a tablet computer developed by Amazon.com. Built with Quanta Computer, the Kindle Fire was first released in November 2011, featuring a color 7-inch multi-touch display with IPS technology and running a custom version of Google's Android operating system called Fire OS. The Kindle Fire HD followed in September 2012, and the Kindle Fire HDX in September 2013. In September 2014, when the fourth generation was introduced, the name ``Kindle'' was dropped. In September 2015, the fifth generation Fire 7 was released, followed by the sixth generation Fire HD 8, in September 2016. The seventh generation Fire 7 was released in June 2017. Question: is a fire 7 the same as a kindle?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There is no mention in the passage whether a Fire 7 is the same as a Kindle, so the student's answer cannot be accurately judged. Therefore the score is: 50"
    },
    {
        "question": "Passage: The domain name org is a generic top-level domain (gTLD) of the Domain Name System (DNS) used in the Internet. The name is truncated from organization. It was one of the original domains established in 1985, and has been operated by the Public Interest Registry since 2003. The domain was originally intended for non-profit entities, but this restriction was not enforced and has been removed. The domain is commonly used by schools, open-source projects, and communities, but also by some for-profit entities. The number of registered domains in org has increased from fewer than one million in the 1990s, to ten million as of June 2012. Question: does a .org have to be a non profit?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Nigeria have appeared in the finals of the FIFA World Cup on six occasions, the first being in 1994 where they reached the second round. Their sixth and most recent appearance at the finals was the 2018 FIFA World Cup in Russia. Question: has nigeria reached quater final in world cup?",
        "pred_ans": " Response: False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Marble Falls is located in southern Burnet County at 30\u00b034\u2032N 98\u00b017\u2032W\ufeff / \ufeff30.567\u00b0N 98.283\u00b0W\ufeff / 30.567; -98.283 (30.5741, -98.2782), on the banks of Lake Marble Falls. According to the Handbook of Texas website, the former falls were flooded by the lake, which was created by a shelf of limestone running diagonally across the Colorado River from northeast to southwest. The upper layer of limestone, brownish on the exterior but a deep blue inside, was so hard and cherty it was mistaken for marble. The falls were actually three distinct formations at the head of a canyon 1.25 miles (2.01 km) long, with a drop of some 50 feet (15 m) through the limestone strata. The natural lake and waterfall were covered when the Colorado River was dammed with the completion of Max Starcke Dam in 1951. A photo of the falls as they once existed can be seen at the website for the Wallace Guest House, a local bed and breakfast. Lake Marble Falls sits between Lake Lyndon B. Johnson to the north and Lake Travis to the south. The falls for which the city is named are now underwater but are revealed every few years when the lake is lowered. Question: is there a waterfall in marble falls tx?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: In most jurisdictions, secondary education in the United States refers to the last four years of statutory formal education (grade nine through grade twelve) either at high school or split between a final year of 'junior high school' and three in high school. Question: is secondary school the same as high school in the united states?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Toys ``R'' Us expanded as a chain, becoming predominant in its niche field of toy retail. Represented by cartoon mascot Geoffrey the Giraffe from 1969, Toys ``R'' Us eventually branched out into launching the stores Babies ``R'' Us, Toys ``R'' Us Express, and the now-defunct Kids ``R'' Us. Question: is babies r us and toys r us the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Battle of the Alamo (February 23 -- March 6, 1836) was a pivotal event in the Texas Revolution. Following a 13-day siege, Mexican troops under President General Antonio L\u00f3pez de Santa Anna launched an assault on the Alamo Mission near San Antonio de B\u00e9xar (modern-day San Antonio, Texas, United States), killing the Texian defenders. Santa Anna's cruelty during the battle inspired many Texians--both Texas settlers and adventurers from the United States--to join the Texian Army. Buoyed by a desire for revenge, the Texians defeated the Mexican Army at the Battle of San Jacinto, on April 21, 1836, ending the revolution. Question: was the battle of the alamo part of the mexican american war?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " There is no mention of the Mexican American War in the passage, so the student's answer cannot be accurately judged."
    },
    {
        "question": "Passage: In 1962, the first appearance of a space-faring Robinson family occurred in a comic book published by Gold Key Comics. The Space Family Robinson, who were scientists aboard Earth's ``Space Station One'', are swept away in a cosmic storm in the comic's second issue. These Robinsons were scientist father Craig, scientist mother June, early teens Tim (son) and Tam (daughter), along with pets Clancy (dog) and Yakker (parrot). Space Station One also boasted two spacemobiles for ship-to-planet travel. Question: is lost in space based on a book?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The President's Guest House is one of several residences owned by the United States government for use by the President and Vice President of the United States; other such residences include the White House, Camp David, One Observatory Circle, the Presidential Townhouse, and Trowbridge House. The President's Guest House has been called ``the world's most exclusive hotel'' because it is primarily used to host visiting dignitaries and other guests of the president. It is larger than the White House and closed to the public. Question: do foreign dignitaries stay at the white house?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A 24-hour run is a form of ultramarathon, in which a competitor runs as far as they can in 24 hours. They are typically held on 1- to 2-mile loops or occasionally 400-meter tracks. Question: is it possible to run for 24 hours?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: The actual record of birth is stored with a government agency. That agency will issue certified copies or representations of the original birth record upon request, which can be used to apply for government benefits, such as passports. The certification is signed and/or sealed by the registrar or other custodian of birth records, who is commissioned by the government. Question: is a certification of birth the same as a birth certificate?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Extended-release dosage consists of sustained-release (SR) and controlled-release (CR) dosage. SR maintains drug release over a sustained period but not at a constant rate. CR maintains drug release over a sustained period at a nearly constant rate. Question: is extended release the same as sustained release?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The series is co-produced by Nickelodeon Productions and Nickelodeon Animation Studio. Dora the Explorer is one of the longest-running shows of Nick Jr. During the sixth season, the show became the Nick Jr. series with the most episodes, surpassing Blue's Clues with 143 episodes, having 144 after it had completed broadcasting on television. It ended on June 5, 2014 after 8 seasons and 172 episodes. Question: do they still make new episodes of dora?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Second-year player Michael Jordan put on a record-setting performance in Game 2 of the Bulls' first-round series against the Celtics, scoring 63 points in a 2-OT loss, which surpassed Elgin Baylor's 61-point performance from the 1962 NBA Finals and still stands as the NBA Playoff scoring record. Jordan averaged 43.7 points per game in the series, but was unable to prevent the Bulls from being swept by a more experienced, more talented Celtics team. The Bulls set a dubious mark by posting the second worst record for a playoff-qualifying team in history, going just 30--52 during the season. Game 2, where the record was set, was ranked by TV Guide as the 26th Most Memorable Moment in Television History, and is credited with boosting the NBA's popularity surge and eventual rise to near the top of the United States television sports market, trailing only football by the mid-90s. Question: did the bulls get swept by the celtics?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: XYY syndrome is a genetic condition in which a male has an extra Y chromosome. Symptoms are usually few. They may include being taller than average, acne, and an increased risk of learning problems. The person is generally otherwise normal, including normal fertility. Question: is it possible to have an extra y chromosome?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is a genetic condition called XYY syndrome in which a male has an extra Y chromosome. The symptoms are usually few and may include being taller than average, acne, and an increased risk of learning problems. The person is generally otherwise normal, including normal fertility. Therefore the score is: 50"
    },
    {
        "question": "Passage: Due to the use of contemporary music in each episode, none of the seasons are presently available on DVD, due to music licensing issues. However, the entire series, incorporating the contemporary music, was previously released on DVD as Cold Case: The Complete Edition, by CBS Productions (ISBN 8-5857-9659-6), on 44 dual-layer disks, in a single boxed set. This set is out of print. Question: will cold case ever be released on dvd?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Although initially happy in her relationship with Jackson, Lexie grows increasingly distraught and frustrated when she discovers that Mark has started dating an ophthalmologist named Julia. When she sees Mark and Julia flirting during a charity softball match, Lexie's jealousy gets the better of her and she throws a ball at Julia, injuring the latter's chest. Jackson senses that Lexie is still in love with Mark and ends their relationship. Lexie begins working under Derek's service and becomes increasingly proficient in neurosurgery, helping Derek with a set of ``hopeless cases'' - high risk surgeries for patients who had otherwise run out of options. During a surgery, Derek is called away on an emergency, leaving Lexie and Meredith to carry out the procedure on their own. Though Derek had instructed them to merely reduce the patient's brain tumor, Meredith allows Lexie to remove it completely, despite not being authorized by either the patient or Derek to do so. The sisters celebrate the successful surgery but Lexie is devastated when she discovers that the patient suffered severe brain damage, thus losing the ability to speak. Alex, Jackson and April move out of Meredith's house without inviting Lexie to join them, and with Derek and Meredith settling down with baby Zola, Lexie begins to feel lonely and isolated. After being left babysitting Zola on Valentine's Day, she contemplates confessing her true feelings to Mark. However, after plucking up the courage to visit his apartment, she finds Mark studying with Jackson and loses her nerve, instead claiming that she wanted to set up a play date for Zola and Sofia. When Mark confides in Derek that he and Julia have been discussing moving in together, Derek warns Lexie not to miss her chance again, resulting in her professing her love to a shell-shocked Mark, who merely thanks her for her candor. Mark later confesses to Derek that he feels the same way about Lexie, but is unsure of how to go about things. Days later, Lexie is named as part of a team of surgeons that will be sent to Boise to separate conjoined twins, along with Mark, Meredith, Derek, Cristina and Arizona Robbins (Jessica Capshaw). However, while flying to their destination, the doctors' plane crashes in the wilderness and Lexie is crushed under debris from the aircraft but manages to alert Mark and Cristina to help her. The pair try in vain to free Lexie, who realizes that she is suffering from a hemothorax and is unlikely to survive. While Cristina tries to find an oxygen tank and water to save Lexie, Mark holds Lexie's hand and professes his love for her, telling her that they will get married, have kids and live the best life together, as they are ``meant to be''. While fantasizing about the future that she and Mark could have had together, Lexie succumbs to her injuries and dies moments before Meredith arrives. The remaining doctors are left stranded in the woods waiting for rescue, with a devastated Meredith crying profusely and Mark refusing to let go of Lexie's hand. Question: do lexie and mark ever get back together?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: The song tells of ``Bennie and the Jets'', a fictional band of whom the song's narrator is a fan. The song is written in the key of G major. In interviews, Taupin has said that the song's lyrics are a satire on the music industry of the 1970s. The greed and glitz of the early 1970s music scene is portrayed by Taupin's words: Question: was bennie and the jets a real band?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: In Canada, the right to silence is protected under section 7 and section 11(c) of the Canadian Charter of Rights and Freedoms. The accused may not be compelled as a witness against himself in criminal proceedings, and therefore only voluntary statements made to police are admissible as evidence. Prior to an accused being informed of their right to legal counsel, any statements they make to police are considered involuntarily compelled and are inadmissible as evidence. After being informed of the right to counsel, the accused may choose to voluntarily answer questions and those statements would be admissible. Question: do you have the right to remain silent in canada?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: The first film in the series was Iron Man (2008), which was distributed by Paramount Pictures. Paramount also distributed Iron Man 2 (2010), Thor (2011) and Captain America: The First Avenger (2011), while Universal Pictures distributed The Incredible Hulk (2008). Walt Disney Studios Motion Pictures began distributing the films with the 2012 crossover film The Avengers, which concluded Phase One of the franchise. Phase Two includes Iron Man 3 (2013), Thor: The Dark World (2013), Captain America: The Winter Soldier (2014), Guardians of the Galaxy (2014), Avengers: Age of Ultron (2015), and Ant-Man (2015). Question: is age of ultron connected to guardians of the galaxy?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 0"
    },
    {
        "question": "Passage: The climate in the region is generally cool, owing to the southern latitude. There are no weather stations in the group of islands including Cape Horn; but a study in 1882--1883, found an annual rainfall of 1,357 millimetres (53.4 inches), with an average annual temperature of 5.2 \u00b0C (41.4 \u00b0F). Winds were reported to average 30 kilometres per hour (8.33 m/s; 18.64 mph), (5 Bf), with squalls of over 100 kilometres per hour (27.78 m/s; 62.14 mph), (10 Bf) occurring in all seasons. There are 278 days of rainfall (70 days snow) and 2,000 millimetres (79 inches) of annual rainfall Question: is the southern tip of south america cold?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Austria-Hungary was one of the Central Powers in World War I. It was already effectively dissolved by the time the military authorities signed the armistice of Villa Giusti on 3 November 1918. The Kingdom of Hungary and the First Austrian Republic were treated as its successors de jure, whereas the independence of the West Slavs and South Slavs of the Empire as the First Czechoslovak Republic, the Second Polish Republic and the Kingdom of Yugoslavia, respectively, and most of the territorial demands of the Kingdom of Romania were also recognized by the victorious powers in 1920. Question: was romania part of the austro hungarian empire?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The Lykan HyperSport is featured in the film Furious 7, and the video games Project CARS, Driveclub, Asphalt 8: Airborne, Asphalt Nitro, Forza Motorsport 6, Forza Horizon 3, Forza Motorsport 7, GT Racing 2: The Real Car Experience, CSR Racing and CSR Racing 2. The Lykan can also be briefly seen in the second Fate of the Furious trailer, however, the Lykan does not make an appearance, the footage is actually from the seventh instalment in the series, Fast and Furious 7. Question: was a real lykan used in furious 7?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Physically and functionally, the auditory system of an absolute listener does not appear to be different from that of a non-absolute listener. Rather, ``it reflects a particular ability to analyze frequency information, presumably involving high-level cortical processing.'' Absolute pitch is an act of cognition, needing memory of the frequency, a label for the frequency (such as ``B-flat''), and exposure to the range of sound encompassed by that categorical label. Absolute pitch may be directly analogous to recognizing colors, phonemes (speech sounds), or other categorical perception of sensory stimuli. Just as most people have learned to recognize and name the color blue by the range of frequencies of the electromagnetic radiation that are perceived as light, it is possible that those who have been exposed to musical notes together with their names early in life will be more likely to identify, for example, the note C. Absolute pitch may also be related to certain genes, possibly an autosomal dominant genetic trait, though it ``might be nothing more than a general human capacity whose expression is strongly biased by the level and type of exposure to music that people experience in a given culture.'' Question: do you have to be born with perfect pitch?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: An antagonist is a character, group of characters, institution or concept that stands in or represents opposition against which the protagonist(s) must contend. In other words, an antagonist is a person or a group of people who opposes a protagonist. Question: does the antagonist always have to be a person?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: As the governing body of association football, FIFA is responsible for maintaining and implementing the rules that determine whether an association football player is eligible to represent a particular country in officially recognised international competitions and friendly matches. In the 20th century, FIFA allowed a player to represent any national team, as long as the player held citizenship of that country. In 2004, in reaction to the growing trend towards naturalisation of foreign players in some countries, FIFA implemented a significant new ruling that requires a player to demonstrate a ``clear connection'' to any country they wish to represent. FIFA has used its authority to overturn results of competitive international matches that feature ineligible players. Question: do world cup players have to play for their home country?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The Hatch Act of 1939, officially An Act to Prevent Pernicious Political Activities, is a United States federal law whose main provision prohibits employees in the executive branch of the federal government, except the president, vice-president, and certain designated high-level officials, from engaging in some forms of political activity. It went into law on August 2, 1939. The law was named for Senator Carl Hatch of New Mexico. It was most recently amended in 2012. Question: does the hatch act apply to elected officials?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The Atlantic City Rail Terminal is Atlantic City, New Jersey's train station. It is the easternmost stop on the Atlantic City Line to and from Philadelphia. The station was also served by the Atlantic City Express Service (ACES) from 2009 until it was formally discontinued on March 9, 2012. The Atlantic City terminal is a 5-track, 3-platform terminal located inside of the Atlantic City Convention Center. The Atlantic City Line is a commuter train and runs daily all day. Question: is there a train station in atlantic city?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: For an adiabatic free expansion of an ideal gas, the gas is contained in an insulated container and then allowed to expand in a vacuum. Because there is no external pressure for the gas to expand against, the work done by or on the system is zero. Since this process does not involve any heat transfer or work, the first law of thermodynamics then implies that the net internal energy change of the system is zero. For an ideal gas, the temperature remains constant because the internal energy only depends on temperature in that case. Since at constant temperature, the entropy is proportional to the volume, the entropy increases in this case, therefore this process is irreversible. Question: does temperature remain constant in an adiabatic process?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Roxanne Roxanne is a 2017 American drama film written and directed by Michael Larnell. It stars Chant\u00e9 Adams, Mahershala Ali, Nia Long, Elvis Nolasco, Kevin Phillips and Shenell Edmonds. The film revolves around the life of rapper Roxanne Shant\u00e9. It was screened in the U.S. Dramatic Competition section of the 2017 Sundance Film Festival. Question: is the movie roxanne roxanne a true story?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Contour feathers are not uniformly distributed on the skin of the bird except in some groups such as the penguins, ratites and screamers. In most birds the feathers grow from specific tracts of skin called pterylae; between the pterylae there are regions which are free of feathers called apterylae (or apteria). Filoplumes and down may arise from the apterylae. The arrangement of these feather tracts, pterylosis or pterylography, varies across bird families and has been used in the past as a means for determining the evolutionary relationships of bird families. Question: do penguins have feathers arising from the epidermis?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: South Africa has a 'hybrid' or 'mixed' legal system, formed by the interweaving of a number of distinct legal traditions: a civil law system inherited from the Dutch, a common law system inherited from the British, and a customary law system inherited from indigenous Africans (often termed African Customary Law, of which there are many variations depending on the tribal origin). These traditions have had a complex interrelationship, with the English influence most apparent in procedural aspects of the legal system and methods of adjudication, and the Roman-Dutch influence most visible in its substantive private law. As a general rule, South Africa follows English law in both criminal and civil procedure, company law, constitutional law and the law of evidence; while Roman-Dutch common law is followed in the South African contract law, law of delict (tort), law of persons, law of things, family law, etc. With the commencement in 1994 of the interim Constitution, and in 1997 its replacement, the final Constitution, another strand has been added to this weave. Question: does roman-dutch law still apply to south african law?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: All other nations have some form of legislation meant to prevent the illegal trading of organs, whether by an outright ban or through legislation that limits how and by whom donations can be made. Many countries, including Belgium and France, use a system of presumed consent to increase the amount of legal organs available for transplant. . In the United States, federal law prohibits the sale of organs; however, the government has created initiatives to encourage organ gifting and to compensate those who freely donate their organs. In 2004, the state of Wisconsin began providing tax deductions to living donors. Question: is it legal to sell human body parts?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: When President Bush came to the end of his second term in 2009, a VC-25 was used to transport him to Texas. For this purpose the aircraft call sign was Special Air Mission 28000, as the aircraft did not carry the current President of the United States. Similar arrangements were made for former Presidents Ronald Reagan, Bill Clinton, and Barack Obama. Question: do ex presidents fly on air force one?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: In electromagnetism, displacement current density is the quantity \u2202D/\u2202t appearing in Maxwell's equations that is defined in terms of the rate of change of D, the electric displacement field. Displacement current density has the same units as electric current density, and it is a source of the magnetic field just as actual current is. However it is not an electric current of moving charges, but a time-varying electric field. In physical materials (as opposed to vacuum), there is also a contribution from the slight motion of charges bound in atoms, called dielectric polarization. Question: is displacement current like conduction current a source of magnetic field?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The British Isles are a group of islands in the North Atlantic off the north-western coast of continental Europe that consist of the islands of Great Britain, Ireland, the Isle of Man and over six thousand smaller isles. They have a total area of about 315,159 km and a combined population of just under 70 million, and include two sovereign states, the Republic of Ireland (which covers roughly five-sixths of the island of Ireland) and the United Kingdom of Great Britain and Northern Ireland. The islands of Alderney, Jersey, Guernsey and Sark, and their neighbouring smaller islands, are sometimes also taken to be part of the British Isles. Question: is southern ireland part of the british isles?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 75"
    },
    {
        "question": "Passage: The following is a list of association footballers who died while playing a game, died directly from injuries sustained while playing, or died after being taken ill on the pitch. Following an increase in deaths, both during matches and training, FIFA (Federation of International Football Associations) considered mandatory cardiac testing, already in place for years in some countries, such as Italy. Question: has anyone ever died during a football game?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Before 1973, the NCAA's smaller schools were grouped together in the College Division. In 1973, the College Division split in two when the NCAA began using numeric designations for its competitions. The College Division members who wanted to offer athletic scholarships or compete against those who did became Division II, while those who chose not to offer athletic scholarships became Division III. Question: do ncaa division 2 schools offer athletic scholarships?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Its plumage is lavender-blue to mid-blue in the crest, back, wings, and tail, and its face is white. The underside is off-white and the neck is collared with black which extends to the sides of the head. The wing primaries and tail are strongly barred with black, sky-blue and white. The bill, legs, and eyes are all black. Males and females are almost identical, but the male is slightly larger. Question: do female and male blue jays look the same?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " There is a lot of information in the passage about the plumage of the blue jay, but the specific question is whether female and male blue jays look the same. According to the passage, \\\"Males and females are almost identical, but the male is slightly larger.\\\" Therefore, the student's answer should have been more specific in addressing the difference between the male and female blue jays.\\n\\nTherefore the score is: 70"
    },
    {
        "question": "Passage: Miniature pig (also micro-pig, teacup pig, Michelle Davila, etc.) is an erroneous term that is used to refer to small breeds of domestic pig, such as Pot-bellied pigs, G\u00f6ttingen minipigs, Juliana pigs, Choctaw Hogs, or Kunekune (and specimens derived by cross-breeding with these). Notable features of most miniature pigs distinguishing them from other pigs may be defined by their possession of small, perked-back ears, a potbelly, sway back, chubby figure, rounded head, short snout, legs, and neck, and a short tail with thick hair at the end. Typically, most breeds of mini pigs will range from the minimum weight of 75 pounds (34 kg) to 200 pounds (91 kg). Question: is there such a thing as a miniature pig?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: WordPad is a basic word processor that is included with almost all versions of Microsoft Windows from Windows 95 onwards. It is more advanced than Microsoft Notepad but simpler than Microsoft Works Word Processor and Microsoft Word. It replaced Microsoft Write. Question: is wordpad the same thing as microsoft word?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: Ordinarily, a baseball game consists of nine innings (in softball and high school baseball games there are typically seven innings; in Little League Baseball, six), each of which is divided into halves: the visiting team bats first, after which the home team takes its turn at bat. However, if the score remains tied at the end of the regulation number of complete innings, the rules provide that ``play shall continue until (1) the visiting team has scored more total runs than the home team at the end of a completed inning; or (2) the home team scores the winning run in an uncompleted inning.'' (Since the home team bats second, condition (2) implies that the visiting team will not have the opportunity to score more runs before the end of the inning.) Question: can a home team win by 2 in extra innings?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: The Open Door Policy is a term in foreign affairs initially used to refer to the United States policy established in the late 19th century and the early 20th century that would allow for a system of trade in China open to all countries equally. It was used mainly to mediate the competing interests of different colonial powers in China. In more recent times, Open Door policy describes the economic policy initiated by Deng Xiaoping in 1978 to open up China to foreign businesses that wanted to invest in the country. This later policy set into motion the economic transformation of modern China. Question: is the open door policy still used today?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: Six Flags New Orleans (SFNO) is a 140-acre, abandoned theme park in New Orleans that has been closed since Hurricane Katrina struck the state in August 2005. It is owned by the Industrial Development Board (IDB) of New Orleans. Six Flags had leased the park from 2002 until 2009, when the lease was terminated during its bankruptcy proceedings. The former park is located in New Orleans East, off Interstate 10. Question: is there a six flags in new orleans?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In December of 2015, Bad Lip Reading simultaneously released three new videos, one for each of the three films in the original Star Wars trilogy. These videos found BLR using guest voices for the first time, featuring Jack Black as Darth Vader, Maya Rudolph as Princess Leia, and Bill Hader in multiple roles. The Empire Strikes Back BLR video featured a scene of Yoda singing to Luke about an unfortunate encounter with a seagull on the beach. BLR would later expand this scene into a full-length standalone song known as ``Seagulls! (Stop It Now)'', which was released in November 2016 (eventually hitting #1 on the Billboard Comedy Digital Tracks chart.) As of late 2017, the ``Seagulls!'' video is Bad Lip Reading's second most viewed YouTube upload, and most popular musical production. In the song, Yoda sings to Luke Skywalker about the dangers posed by vicious seagulls if one dares to go to the beach. Mark Hamill, who played Luke Skywalker in the Star Wars films, publicly praised ``Seagulls!'' (and Bad Lip Reading in general) while speaking at Star Wars Celebration in 2017: ``I love them, and I showed Carrie (Fisher) the Yoda one... we were dying. I showed it to her in her trailer. She loved it. I retweeted it... and (BLR) contacted me and said 'Do you want to do Bad Lip Reading?' And I said, 'I'd love to...'''. Hamill and Bad Lip Reading would go on to collaborate on Bad Lip Reading's version of The Force Awakens, with Hamill providing the voice of Han Solo. Question: is seagulls stop it now a real song?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: Data visualization is both an art and a science. It is viewed as a branch of descriptive statistics by some, but also as a grounded theory development tool by others. Increased amounts of data created by Internet activity and an expanding number of sensors in the environment are referred to as ``big data'' or Internet of things. Processing, analyzing and communicating this data present ethical and analytical challenges for data visualization. The field of data science and practitioners called data scientists help address this challenge. Question: is data visualization a part of data science?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Russia has participated in 4 FIFA World Cups since its independence in December 1991. The Russian Federation played their first international match against Mexico on 16 August 1992 winning 2-0. Their first participation in a World Cup was the United States of America in 1994 and they achieved 18th place. In 1946 the Soviet Union was accepted by FIFA and played their first World Cup in Sweden 1958. The Soviet Union represented 15 Socialist republics and various football federations, and the majority of players came from the Dynamo Kyiv team of the Ukrainian SSR. The Soviet Union national football team played in 7 World Cups. Their best performance was reaching 4th place in England 1966. However Soviet football was dissolved in 1991 when Belarus, Russia and Ukraine declared independence under the Belavezha Accords. The CIS national football team (Commonwealth of Independent States) was formed with other independent nations in 1992 but did not participate in any World Cups. Question: has russia ever made it to the world cup finals?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Eastbound vehicles must pay a toll to cross the bridge; as with all Hudson River crossings along the North River, westbound vehicles cross for free. As of December 6, 2015, the cash tolls going from New Jersey to New York are $15 for both cars and motorcycles. E-ZPass users are charged $10.50 for cars and $9.50 for motorcycles during off-peak hours, and $12.50 for cars and $11.50 for motorcycles during peak hours. Trucks are charged cash tolls of $20.00 per axle, with discounted peak, off-peak, and overnight E-ZPass tolls. A discounted carpool toll ($6.50) is available at all times for cars with three or more passengers using NY or NJ E-ZPass, who proceed through a staffed toll lane (provided they have registered with the free ``Carpool Plan''). There is an off-peak toll of $7.00 for qualified low-emission passenger vehicles, which have received a Green E-ZPass based on registering for the Port Authority Green Pass Discount Plan. Question: is george washington bridge a one way toll?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The square root of 3 is an irrational number. It is also known as Theodorus' constant, named after Theodorus of Cyrene, who proved its irrationality. Question: is the square root of 3 a rational number?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The crank sensor can be used in combination with a similar camshaft position sensor to monitor the relationship between the pistons and valves in the engine, which is particularly important in engines with variable valve timing. This method is also used to ``synchronise'' a four stroke engine upon starting, allowing the management system to know when to inject the fuel. It is also commonly used as the primary source for the measurement of engine speed in revolutions per minute. Question: is an engine speed sensor the same as a crankshaft sensor?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The How to Train Your Dragon franchise from DreamWorks Animation consists of two feature films How to Train Your Dragon (2010) and How to Train Your Dragon 2 (2014), with a third feature film, How to Train Your Dragon: The Hidden World, set for a 2019 release. The franchise is inspired by the British book series of the same name by Cressida Cowell. The franchise also consists of four short films: Legend of the Boneknapper Dragon (2010), Book of Dragons (2011), Gift of the Night Fury (2011) and Dawn of the Dragon Racers (2014). A television series following the events of the first film, Dragons: Riders of Berk, began airing on Cartoon Network in September 2012. Its second season was renamed Dragons: Defenders of Berk. Set several years later, and as a more immediate prequel to the second film, a new television series, titled Dragons: Race to the Edge, aired on Netflix in June 2015. The second season of the show was added to Netflix in January 2016 and a third season in June 2016. A fourth season aired on Netflix in February 2017, a fifth season in August 2017, and a sixth and final season on February 16, 2018. Question: is how to train your dragon on now tv?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: As originally enacted, the Voting Rights Act also suspended the use of literacy tests in all jurisdictions in which less than 50% of voting-age residents were registered as of November 1, 1964, or had voted in the 1964 presidential election. In 1970, Congress amended the Act and expanded the ban on literacy tests to the entire country. The Supreme Court then upheld the ban as constitutional in Oregon v. Mitchell (1970). The Court was deeply divided in this case, and a majority of justices did not agree on a rationale for the holding. Question: do you have to be literate to vote?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 70"
    },
    {
        "question": "Passage: The endodermis is the central, innermost layer of cortex in some land plants. It is made of compact living cells surrounded by an outer ring of endodermal cells that are impregnated with hydrophobic substances (Casparian Strip) to restrict apoplastic flow of water to the inside. The endodermis is the boundary between the cortex and the stele. Question: do plant cell walls restrict the entry of water?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There is a mistake in the student's answer. The passage states that the endodermis, which is the central, innermost layer of cortex in some land plants, restricts the apoplastic flow of water to the inside by having outer ring of endodermal cells impregnated with hydrophobic substances (Casparian Strip). This means that the plant cell walls do restrict the entry of water.\\n\\nTherefore, the score is: 50"
    },
    {
        "question": "Passage: Since April 1, 2014, Boss has been featured on the Ellen DeGeneres Show as a guest DJ. and on October 1, 2014 he announced he had been cast for Magic Mike XXL. Question: is twitch still on the ellen degeneres show?",
        "pred_ans": " As per the passage, Boss was featured on the Ellen DeGeneres Show as a guest DJ since April 1, 2014, and on October 1, 2014, he announced he had been cast for Magic Mike XXL. However, the passage does not provide information about whether Twitch is still on the Ellen DeGeneres Show. Therefore, I cannot provide a definitive answer to the question.",
        "ground_truth": true,
        "score": " Therefore the score is: 85"
    },
    {
        "question": "Passage: Corinthian leather is a term coined by the advertising agency Bozell to describe the upholstery used in certain Chrysler luxury vehicles. The term first appeared in advertising in 1974. Although the term suggests that the product has a relationship to or origination from Corinth, there is no relationship; the term is merely a marketing concept. Question: is there such a thing as corinthian leather?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: Six episodes of the first season premiered on May 5, 2017. The series was renewed for a second season and it premiered on September 8, 2017. The series was renewed for a third season and it premiered on November 17, 2017. The series was renewed for a fourth season and it premiered on March 16, 2018. A fifth season of the show was released on Netflix on May 11, 2018. A sixth season of the show was released on Netflix on August 17, 2018. Question: will there be more episodes of spirit riding free?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: In the seventh season premiere, ``The Day Will Come When You Won't Be'', Abraham is revealed to be Negan's chosen victim; Negan brutally beats him to death with Lucille as the rest of the group watches, horrified. When Daryl strikes Negan in the face, Negan declares that he will need to kill someone else as punishment. He then strikes Glenn with Lucille. After two blows to the head, Glenn sits up, severely brain damaged with a dislocated eye, and mutters ``Maggie, I'll find you'', before Negan repeatedly bludgeons Glenn's skull into a bloody pulp. Question: did glenn die in the walking dead season 6?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: If ``infield fly'' is called and the fly ball is caught, it is treated exactly as an ordinary caught fly ball; the batter is out, there is no force, and the runners must tag up. On the other hand, if ``infield fly'' is called and the ball lands fair without being caught, the batter is still out, there is still no force, but the runners are not required to tag up. In either case, the ball is live, and the runners may advance on the play, at their own peril. Question: do you have to tag up on an infield fly rule?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: As of June 2018, the following Our Gang kids are believed to be alive. Note that for several of the people listed below, information is insufficient: Question: are any members of our gang still alive?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Shrek Forever After (previously promoted as Shrek: The Final Chapter) is a computer-animated, 2010 American comedy film, produced in 3D by DreamWorks Animation. It is the fourth installment in the Shrek film franchise and the sequel to Shrek the Third (2007). The film was directed by Mike Mitchell from a script by Josh Klausner and Darren Lemke, and stars Mike Myers, Eddie Murphy, Cameron Diaz, Antonio Banderas, Julie Andrews, and John Cleese reprising their previous roles, with Walt Dohrn introduced in the role of Rumpelstiltskin. The plot follows Shrek struggling as a family man with no privacy, who yearns for the days when he was once feared. He's tricked by Rumpelstiltskin into signing a contract that leads to disastrous consequences. Question: is there going to be a shrek 4?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Who Wants to Be a Millionaire? is a British quiz show, created and produced by David Briggs, and made for the ITV network. The show's format, devised by Briggs, sees contestants taking on multiple-choice questions, based upon general knowledge, winning a cash prize for each question they answer correctly, with the amount offered increasing as they take on more difficult questions. To assist each contestant who takes part, they are given three lifelines to use, may walk away with the money they already have won if they wish not to risk answering a question, and are provided with a safety net that grants them a guaranteed cash prize if they give an incorrect answer, provided they reach a specific milestone in the quiz. Question: can you take the money in who wants to be a millionaire?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: At the 2010 Consumer Electronics Show, Boost Mobile announced it would begin to offer a new unlimited plan using Sprint's CDMA network, costing $50 a month. For $10 more, Boost also offered an unlimited plan for the BlackBerry Curve 8830. Sprint would also acquire fellow prepaid wireless provider Virgin Mobile USA in 2010--both Boost and Virgin Mobile would be re-organized into a new group within Sprint, encompassing the two brands and other no-contract phone services offered by the company. Question: is boost mobile and virgin mobile the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: On June 1, 2017, it was announced that Seyfried would return as Sophie. Later that month, Dominic Cooper confirmed that he would return for the sequel, along with Streep, Firth and Brosnan as Sky, Donna, Harry, and Sam, respectively. In July 2017, Baranski was also confirmed to return as Tanya. On July 12, 2017, Lily James was cast to play the role of young Donna. On August 3, 2017, Jeremy Irvine and Alexa Davies were also cast in the film, with Irvine playing Brosnan's character Sam in a past era, and Hugh Skinner to play Young Harry, Davies as a young Rosie, played by Julie Walters. On August 16, 2017, it was announced that Jessica Keenan Wynn had been cast as a young Tanya, who is played by Baranski. Julie Walters and Stellan Skarsg\u00e5rd also reprised their roles as Rosie and Bill, respectively. On October 16, 2017, it was announced that singer and actress Cher had joined the cast, in her first on-screen film role since 2010, and her first film with Streep since Silkwood. Question: is the cast of mama mia the same?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The second season, consisting of 18 episodes, aired from September 26, 2017, to March 13, 2018, on NBC. This Is Us served as the lead-out program for Super Bowl LII in February 2018 with the second season's fourteenth episode. Question: is this is us over for the season?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: This was the original use for FPNs, currently continuing in Great Britain under powers provided by the Road Traffic Act 1991 as well as in Northern Ireland; in many areas this style of enforcement has been taken over from police by local authorities. Some other motoring offences (other than parking) can also be dealt with by the issue of FPNs by police, VOSA or local authority personnel. FPNs issued by local authority parking attendants are backed with powers to obtain payment by civil action and are defined as ``penalty charge notices'', distinguishing them from other FPNs which are often backed with a power of criminal prosecution if the penalty is not paid; in the latter case the ``fixed penalty'' is sometimes designated as a ``mitigated penalty'' to indicate the avoidance of being prosecuted which it provides. Question: is a penalty charge notice the same as a fixed penalty?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 40"
    },
    {
        "question": "Passage: The New York Times called the film a ``completely unbelievable but wholly captivating flight into fancy composed of unequal dollops of comedy, romance, poignancy, funny colloquialisms and Manhattan's swankiest East Side areas captured in the loveliest of colors''. In reviewing the performances, the newspaper said Holly Golightly is Question: is breakfast at tiffany's black and white?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: The sweet potato (Ipomoea batatas) is a dicotyledonous plant that belongs to the bindweed or morning glory family, Convolvulaceae. Its large, starchy, sweet-tasting, tuberous roots are a root vegetable. The young leaves and shoots are sometimes eaten as greens. The sweet potato is only distantly related to the potato (Solanum tuberosum) and does not belong to the nightshade family, Solanaceae, but both families belong to the same taxonomic order, the Solanales. The sweet potato is botanically very distinct from a genuine yam (Dioscorea), which is native to Africa and Asia and belongs to the monocot family Dioscoreaceae. Question: is a sweet potato part of the potato family?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: After laying down a Phase, players try to ``go out'' as soon as possible. To go out, a player must get rid of all of their cards by hitting and discarding. The player to go out first wins the hand. The winner of the hand, and any other players who also complete their Phase, will advance to the next Phase for the next hand, while any player not able to complete their Phase remain stuck on that Phase. Players count up the total value of cards left in their hands (the fewer cards left in their hand, the better) and score them as follows; Question: can you go out on the first round of phase 10?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": "NILL"
    },
    {
        "question": "Passage: In the summer of 2016, the satellite network yes officially picked up season 2 of the show, stating it will focus more on real world events. During the fall of 2017 the initial trailer was released, and the official premier date was later announced to be 31 December. A few weeks prior to the airing of season 2, Fauda was renewed for a third season, to air in 2019. Question: is there going to be a third season of fauda?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: Garnet is a fictional character from the animated series Steven Universe, created by Rebecca Sugar. She is a ``gem'', a fictional alien being that exists as a magical gemstone projecting a holographic body. Garnet is a ``fusion gem'' -- i.e., two gemstones combining personalities and appearances as one shared holographic body -- formed by two Gems named Ruby and Sapphire; they chose to remain permanently fused out of love for each other. Question: does a ruby and a sapphire make a garnet?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 25"
    },
    {
        "question": "Passage: In the United States, a Cornish game hen, also sometimes called a Cornish hen, poussin, Rock Cornish hen, or simply Rock Cornish, is a hybrid chicken sold whole. Despite the name, it is not a game bird. Rather, it is a broiler chicken, the most common strain of commercially raised meat chickens. Though the bird is called a ``hen'', it can be either male or female. A Cornish hen typically commands a higher price per pound than typically sold chickens, despite a shorter growing span of 28 to 30 days, as opposed to 42 or more for regular chicken. Question: is a cornish game hen a baby chicken?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": " Therefore the score is: 80"
    },
    {
        "question": "Passage: Brown rice is whole-grain rice with the inedible outer hull removed; white rice is the same grain with the hull, bran layer, and cereal germ removed. Red rice, gold rice, and black rice (also called purple rice) are all whole rices, but with differently pigmented outer layers. Question: is whole wheat rice the same as brown rice?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " There are several types of rice, each with its own characteristics and nutritional value. Brown rice is a whole-grain rice with the inedible outer hull removed, while white rice is the same grain with the hull, bran layer, and cereal germ removed. Red rice, gold rice, and black rice (also called purple rice) are all whole rices, but with differently pigmented outer layers.\\n\\nThe student's answer is: False.\\n\\nTherefore the score is: 50."
    },
    {
        "question": "Passage: The breed is strikingly similar to the English Mastiff, with which it shares a common ancestor known as the Alpine Mastiff. The modern St. Bernard breed is radically different than the original dogs kept at the St. Bernard hospice, most notably by being much larger in size and build. Since the late 1800s, the St. Bernard breed has been ever refined and improved using many different large Molosser breeds, including the Newfoundland, Great Pyrenees, Greater Swiss Mountain Dog, Bernese Mountain Dog, Great Dane, English Mastiff, and possibly the Tibetan Mastiff and Caucasian Ovcharka. Other breeds such as the Rottweiler, Boxer, and English Bulldog may have contributed to the St. Bernard's bloodline as well. It is suspected that many of these large breeds were used to redevelop each other to combat the threat of their extinction after World War II, which may explain why all of them played a part in the creation of the St. Bernard as seen today. Question: are st bernards and bernese mountain dogs related?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Citizens of countries in the European Economic Area (other than British and Irish citizens) and Swiss citizens obtain permanent residence status automatically after five years' residence in the United Kingdom exercising Treaty rights rather than ILR. The rights of EEA citizens are not governed by UK Immigration Regulations but rather the EEA Regulations. Question: do eu citizens have indefinite leave to remain?",
        "pred_ans": " True",
        "ground_truth": false,
        "score": "NILL"
    },
    {
        "question": "Passage: This condition is caused by an infection of hair follicles due to the bacterium Pseudomonas aeruginosa. The bacterium is commonly found in hot tubs, water slides, and such places. Children are more prone to this because they usually stay in the water longer than adults. Hot tub folliculitis appears on the skin in the form of a rash, roughly resembling chicken pox and then develops further to appear as a pimple. Hot tub folliculitis can be extremely painful and/or itchy, and left alone without scratching will go away much more quickly. If the rash is aggravated, it can stay, worsen, and spread, lasting for months. By that time, it is much more difficult to treat. The dots usually go away after about 7 to 10 days but the condition leaves a hyperpigmented lesion that goes away after a few months. Question: can you get a rash from using a hot tub?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 20"
    },
    {
        "question": "Passage: The term remote keyless system (RKS), also called keyless entry or remote central locking, refers to a lock that uses an electronic remote control as a key which is activated by a handheld device or automatically by proximity. Question: is keyless entry the same as remote start?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: A person who has indefinite leave to remain, the right of abode or Irish citizenship has settled status if resident in the United Kingdom (all full British citizens have the right of abode). Question: is right of abode the same as indefinite leave to remain?",
        "pred_ans": " False",
        "ground_truth": true,
        "score": " Therefore the score is: 50"
    },
    {
        "question": "Passage: It returned to ITV, as Bruce's Price is Right, from 4 September 1995 to 16 December 2001 with Bruce Forsyth hosting for seven series, and again on the same channel from 8 May 2006 until 12 January 2007, this time hosted by Joe Pasquale. Two one-off specials aired as part of ITV's Gameshow Marathon in September 2005 and April 2007. Question: did bruce forsyth do the price is right?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: The United States two-dollar bill ($2) is a current denomination of U.S. currency. The third U.S. President (1801--09), Thomas Jefferson, is featured on the obverse of the note. The reverse features an engraving of the painting The Declaration of Independence by John Trumbull. Throughout the $2 bill's pre-1929 life as a large-sized note, it was issued as a United States Note, National Bank Note, silver certificate, Treasury or ``Coin'' Note and Federal Reserve Bank Note. When U.S. currency was changed to its current size, the $2 bill was issued only as a United States Note. Production went on until 1966, when the series was discontinued. Ten years passed before the $2 bill was reissued as a Federal Reserve Note with a new reverse design. Two-dollar bills are seldom seen in circulation as a result of banking policies with businesses which has resulted in low production numbers due to lack of demand. This comparative scarcity in circulation, coupled with a lack of public knowledge that the bill is still in production and circulation, has also inspired urban legends about its authenticity and value and has occasionally created problems for those trying to use the bill to make purchases. Question: is the us 2 dollar bill still in circulation?",
        "pred_ans": "NILL",
        "ground_truth": true,
        "score": " Therefore the score is: 10"
    },
    {
        "question": "Passage: Elderflower cordial is a soft drink made largely from a refined sugar and water solution and uses the flowers of the European elderberry, (Sambucus nigra L.). Historically the cordial has been popular in North Western Europe where it has a strong Victorian heritage. However, versions of an elderflower cordial recipe can be traced back to Roman times. Nowadays it can be found in almost all of the former Roman Empire territory, predominantly in Central Europe, especially in England, Germany, Austria, Slovenia, Romania, Hungary and Slovakia, where people have acquired a special taste for it and still make it in the traditional way. In some countries the drink can be found as an aromatic syrup, sold as a concentrated squash that is mixed with still or sparkling water. Elderflower press\u00e9 is a premixed form of this. Question: is elderflower cordial the same as elderflower syrup?",
        "pred_ans": "NILL",
        "ground_truth": false,
        "score": " Therefore the score is: 20"
    }
]