{"overall_metrics": {"CHAIRs": 0.3807615230460922, "CHAIRi": 0.12141127637495676, "Recall": 0.5452468007312614, "num_hallucinated_caps": 190.0, "num_caps": 499.0, "hallucinated_word_count": 351.0, "coco_word_count": 2891.0, "length_response": 72.78356713426854, "hallucinated_caps_ls": [144305, 329614, 324008, 247179, 39053, 493862, 181446, 401027, 161227, 25812, 195645, 155671, 412894, 32625, 18688, 370657, 481596, 254368, 459141, 209347, 523292, 12375, 13550, 146837, 209925, 104326, 233888, 300784, 183766, 169169, 409346, 344942, 300124, 490491, 175878, 12669, 133146, 498747, 227204, 185327, 348654, 482432, 556478, 37907, 406611, 278365, 432213, 368367, 21779, 148977, 467887, 352734, 557998, 401010, 313928, 346940, 400851, 388237, 24104, 183709, 59710, 527497, 285302, 265628, 499594, 255483, 453731, 87052, 357578, 418692, 259345, 120776, 545594, 378831, 449798, 375769, 426172, 180764, 295105, 443259, 249969, 146504, 240681, 542792, 14773, 298067, 563938, 526414, 505213, 441363, 188136, 191691, 89378, 557447, 561967, 198163, 217593, 301266, 475813, 558784, 539951, 230983, 564629, 53431, 381031, 9628, 275749, 163518, 525202, 118124, 288729, 372220, 302789, 30667, 532481, 291634, 88269, 98979, 536154, 374061, 353320, 256628, 158806, 424270, 416384, 364783, 531327, 502936, 51054, 1268, 436814, 63934, 516248, 45550, 444913, 303099, 205193, 272566, 14940, 408449, 354540, 191301, 124236, 344930, 381615, 111109, 226802, 317254, 425569, 35827, 283118, 429598, 490711, 163728, 277051, 145666, 304984, 352900, 153607, 227438, 202658, 18150, 491000, 268293, 474499, 358086, 239445, 105782, 138834, 492057, 388215, 111409, 397268, 444304, 305000, 154846, 301402, 415828, 495357, 184945, 42889, 401411, 246963, 124601, 581749, 227960, 315740, 456784, 33753, 99053]}, "results": {"sentences": [{"image_id": 144305, "caption": "There are a few computer devices displayed in the image, including a laptop sitting on top of another laptop and an external monitor next to it. A cell phone is also present on the desk, alongside two mice placed at different heights. A television is visible in the background, possibly indicating a setup for a multi-screen experience.", "mscoco_hallucinated_words": [["monitor", "tv"], ["cell phone", "cell phone"], ["mouse", "mouse"], ["television", "tv"]], "mscoco_gt_words": ["keyboard", "laptop", "dining table"], "hallucination_idxs": [23, 29, 40, 47], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.6666666666666666}}, {"image_id": 164091, "caption": "The image portrays a scene with two dolls, one inside a baby carriage and the other on a bed, near a chest of drawers. The baby carriage, with a crib inside, has a teddy bear placed next to it. Both dolls are dressed in various clothing and accessories. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["teddy bear", "bed", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 308003, "caption": " The image depicts a wooden bench with gray paint peeling off, revealing the natural wood beneath the paint. The bench is located in an outdoor area, perhaps a park or a picnic site. The seat of the bench occupies a significant portion of the image, with a part of the bench visible above it.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["chair", "bench", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 54870, "caption": " The image features a plate with food on it, including a hamburger and a salad. The hamburger is topped with cheese and pickles, and the salad is a typical salad with mixed greens. The plate is placed in a room setting, likely a restaurant or dining area. There is also a knife present on the plate, possibly used for cutting the food. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sandwich", "fork", "knife"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 95908, "caption": "The image shows a computer with a mouse on the keyboard, near a computer mouse. The mouse has a silver cup on it, with the cup partially covering the mouse's top surface. The mouse is positioned on the left side of the keyboard, creating a setup where the mouse is adjacent to the keyboard and cup.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["keyboard", "cup", "dining table", "laptop", "mouse"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8}}, {"image_id": 329614, "caption": "There is a city street scene with a school bus and a red bus in front of an old-style cathedral. The school bus is a popular yellow transit vehicle, and the red bus is positioned behind it on the street. The cathedral is situated nearby the buses, with its large clock tower prominently visible in the sky above it. ", "mscoco_hallucinated_words": [["clock", "clock"]], "mscoco_gt_words": ["bus", "person", "car"], "hallucination_idxs": [54], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.3333333333333333}}, {"image_id": 324008, "caption": "The image features a tray with various food items, including a banana, an orange, a bagel, and a donut. There are three bottles positioned throughout the tray, with two close to the banana and orange and the other placed near the top right corner of the tray. The tray is in an airplane setting, likely part of a meal service for a flight.", "mscoco_hallucinated_words": [["bagel", "donut"], ["donut", "donut"], ["bottle", "bottle"]], "mscoco_gt_words": ["cup", "dining table", "train", "orange", "airplane", "person", "banana"], "hallucination_idxs": [18, 22, 27], "metrics": {"CHAIRs": 1, "CHAIRi": 0.375, "Recall": 0.42857142857142855}}, {"image_id": 52700, "caption": " The image features a large dining table filled with many assorted cupcakes, cakes, and other desserts. The table is draped in white cloth, adding a formal touch to the gathering. A group of people surround the table, standing close to it, likely to observe or partake in the desserts. The people are all looking toward the middle of the table, where the majority of the desserts are displayed.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "dining table", "spoon", "cake", "fork", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 501311, "caption": "The scene takes place on a baseball field where a baseball game is in progress. Multiple players can be seen on the field, with one man standing on the pitcher's mound, getting ready to throw the ball. Another player is nearby, throwing a baseball glove onto the field. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sports ball", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 247179, "caption": "The image portrays a festive scene with a group of people riding two elephants, followed by several others walking along with them. There are eight people visible in the scene, with three of them sitting on the elephants, four walking alongside, and a fifth person located towards the center of the scene. All the people seem to be enjoying the parade-like atmosphere, with some of them wearing crowns and other ornate headwear. Apart from the human participants, there are also two cars and a tuk-tuk present in the image.", "mscoco_hallucinated_words": [["car", "car"]], "mscoco_gt_words": ["truck", "clock", "person", "elephant"], "hallucination_idxs": [90], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.5}}, {"image_id": 577912, "caption": "In the image, a small bathroom is neatly decorated and has several features, including a toilet, a sink, and a large whirlpool tub. The toilet is situated near the sink, which is positioned on the right side of the bathroom. A beautiful flower arrangement adorns the bathroom, creating a warm and inviting atmosphere. The whirlpool tub is prominently displayed, taking up a significant portion of the bathroom's space.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sink", "toothbrush", "toilet", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 196341, "caption": "In the image, there is a large clock with a prominent central dial, with the time displayed on the clock face. A group of people is standing in the middle of the scene, facing the clock, possibly at a bus stop or during a break in their daily routine. They are mostly arranged around the clock and some are farther back in the scene. The clock appears to be the central feature in the picture, drawing the attention of the people towards it.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "handbag", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 234433, "caption": "The image displays a left-right street sign, sitting on a wooden pole with a hand written sign below it, all of which are situated in the middle of a parking lot. A traffic signal can also be seen in the background, on the left side of the scene. Cars are dispersed around the parking lot, occupying various spots and creating a bustling atmosphere. One car is parked closer to the pole with the left-right sign, while another car is visible farther away, toward the right edge of the image.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "car", "person", "bicycle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 455610, "caption": " The image depicts a red and green motorcycle, along with a pink and purple car, parked in a parking lot. People are standing around and some appear to be interacting with each other, but it's unclear whether they are friends or simply bystanders.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "motorcycle", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 39053, "caption": "The image shows a table topped with a variety of food items. There are multiple plates on the table, some containing a dish made with bread, yogurt, and cucumber, another with a sandwich, and a third with a mixture of cucumber, yogurt, and sauce. A few cups are also present, including one on the left side, two more in the middle, and another one on the right side of the table. A cell phone can be seen on the left side, near the cups.", "mscoco_hallucinated_words": [["sandwich", "sandwich"]], "mscoco_gt_words": ["cup", "dining table", "bowl", "spoon", "cake", "cell phone", "apple"], "hallucination_idxs": [37], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.42857142857142855}}, {"image_id": 493862, "caption": "According to the picture, there is a man in a field, standing with a flying remote in his hand, likely watching a television show. There are several other people present in the field, but they are not as prominent as the man in the foreground.", "mscoco_hallucinated_words": [["remote", "remote"], ["television", "tv"]], "mscoco_gt_words": ["person", "tennis racket", "dog"], "hallucination_idxs": [17, 25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4, "Recall": 0.3333333333333333}}, {"image_id": 269873, "caption": "The image captures a black and white scene where numerous people are walking across a street. They are gathered around a bus stop, with some people standing near the bus stop, while others are walking by the side of the street. The street appears to be busy, with traffic both to the right and to the left of the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "person", "bus", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 524656, "caption": " It is a kitchen area filled with various items, including a small oven, a microwave, and a sink. There are bottles on the countertop in different parts of the kitchen, and a few knives can be seen lying around. A few bowls are scattered around the kitchen, with some placed near the sink.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["toaster", "cup", "knife", "oven", "refrigerator", "bowl", "spoon", "bottle", "wine glass", "sink", "microwave"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5454545454545454}}, {"image_id": 369547, "caption": " The image captures a scene on a beach where a small black and white dog is sitting on the rocky surface. Near the dog, a pair of birds \u2013 one crane and one bird \u2013 is present, flying close to the animal. Both birds appear to be in flight, capturing the attention of the dog. The beach is covered in pebbles and rocks, adding texture to the environment.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bird", "dog", "cat"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 181446, "caption": " In the image, there is a man wearing glasses standing in front of a refrigerator. The refrigerator is full of various items and is spread across two-thirds of the image's width. A bottle and a spoon can be seen in the scene, along with a few more bottles dispersed throughout the fridge. The kitchen appears to be well-stocked with a mix of food items and household items. The overall setting seems to be a typical kitchen with a refrigerator, stove, and cabinets.", "mscoco_hallucinated_words": [["spoon", "spoon"], ["stove", "oven"]], "mscoco_gt_words": ["refrigerator", "bowl", "bottle", "sink", "person"], "hallucination_idxs": [39, 87], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2222222222222222, "Recall": 0.6}}, {"image_id": 128119, "caption": " The scene is set on a city street, with a white bus traveling down the road. The bus has a significant amount of space left around it, indicating there is no congestion. There are two bicycles visible \u2013 one attached near the bus's rear, and the other one on the side. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "bus", "handbag", "cell phone", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.16666666666666666}}, {"image_id": 401027, "caption": "The image features a white plate, likely served on a dining table, that is topped with a sandwich. The sandwich is cut into two pieces, highlighting its inside components: a cheese and a tomato. Additionally, there are two bowls, one to the left of the sandwich and the other on the right side.", "mscoco_hallucinated_words": [["bowl", "bowl"]], "mscoco_gt_words": ["cup", "knife", "dining table", "sandwich", "fork"], "hallucination_idxs": [45], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.4}}, {"image_id": 253630, "caption": "The image depicts a modern style living room with a black square table and a flat cushion on the carpet in front of it. There is a television placed on the right side of the room, and another cushion is positioned near the left edge of the scene. A variety of plants are located in different parts of the room, including a small one close to the table and another towards the left side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["potted plant", "chair", "tv", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 326021, "caption": " In the image, a woman is sitting in front of a table with a teddy bear laid out on it. The table is set against a red background and the woman is holding a teddy bear in her arms. The scene appears to be in a park or outdoor setting. There are a few other people in the background, but they are farther away from the woman and the table.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["teddy bear", "person", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 317320, "caption": " In the image, there is a man on a bike and another person riding on it. The man riding the bike has his back to the viewer, while the other person is closer to the camera. The bike is located near several crossings, including one visible in the lower left, one in the lower right, and others more centrally in the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "motorcycle", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 507171, "caption": "The image features a person riding a surfboard on the water. The surfer is positioned on the right side of the scene, with the surfboard stretching across the entire image, from the right side to the left side. There is a significant water wave behind the surfer, likely created by the powerful currents or the oncoming storm.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["surfboard", "person", "cat"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 161227, "caption": "In the image, a large black leather chair is featured with two remotes sitting on it. One remote is placed on the right side of the chair and the other is positioned towards the center. A laptop is also in view, sitting close to the remotes. There are two cell phones placed on the chair, one near the top left corner and the other in the bottom left corner. A book rests at the edge of the couch, with some magazines nearby.", "mscoco_hallucinated_words": [["book", "book"]], "mscoco_gt_words": ["remote", "keyboard", "laptop", "cell phone", "chair", "couch"], "hallucination_idxs": [77], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1, "Recall": 0.8333333333333334}}, {"image_id": 25812, "caption": "In the image, a person is wearing a snow suit and riding down a snowy hill with ski equipment on, possibly snowboarding or skiing. The person is close to the end of the hill, approaching a large ski area, which could be Ski Hill. The scene also includes several benches located around the hill, where people can rest or watch the activities. There is another person in the picture, perhaps a passerby or someone who might be joining in the skiing fun.", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["skis", "person", "snowboard"], "hallucination_idxs": [55], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 0.6666666666666666}}, {"image_id": 267537, "caption": "The image captures a group of people, mostly women, sitting together in a seating area underneath umbrellas. The people are sitting on a bench and are dressed in black and white clothing. There are at least nine people in the picture, with one person in the front row holding an umbrella.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tie", "umbrella", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 195645, "caption": " The scene takes place in a backyard with several people standing and talking. There are several cups of wine, including a wine glass on the table near the man and a wine glass on the ground beside another person. A child is also present in the scene, likely being played by one of the adults.", "mscoco_hallucinated_words": [["cup", "cup"]], "mscoco_gt_words": ["handbag", "bench", "dining table", "bottle", "wine glass", "person"], "hallucination_idxs": [17], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 0.5}}, {"image_id": 155671, "caption": "The scene is an empty city street filled with cars, traffic lights, and fire hydrants. There are several cars parked or driving on the street, with some closer to the camera and others further away. Among the cars, a traffic light is prominently displayed, hanging high above the scene, while another traffic light can be seen in the background.", "mscoco_hallucinated_words": [["fire hydrant", "fire hydrant"]], "mscoco_gt_words": ["traffic light", "car", "truck"], "hallucination_idxs": [15], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.6666666666666666}}, {"image_id": 412894, "caption": "A group of pedestrians is gathered outside a subway station near a busy city street, with several people crossing the street in various directions. There are multiple streetlights on the left and right sides of the intersection, providing light and traffic signals. A traffic light is located on the left side of the street, controlling the flow of traffic on the busy city street. Among the pedestrians, there are a few people carrying handbags, likely to store items such as cell phones and wallets. A couple of umbrellas can be seen, one placed on the ground near the left side of the scene, and another one held by one of the people crossing the street. In the background, there are two benches, and at the top center of the scene, there is a clock tower visible.", "mscoco_hallucinated_words": [["cell phone", "cell phone"], ["umbrella", "umbrella"], ["bench", "bench"], ["clock", "clock"]], "mscoco_gt_words": ["traffic light", "person", "handbag", "backpack"], "hallucination_idxs": [88, 96, 134, 148], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.75}}, {"image_id": 32625, "caption": " The image features a white plate with food on a table, placed between two chairs. There is also a cup of tea on the table, possibly served alongside the food. The food appears to consist of vegetables and beans, which is a healthy meal choice. A knife is also present on the table.", "mscoco_hallucinated_words": [["chair", "chair"], ["knife", "knife"]], "mscoco_gt_words": ["cup", "dining table", "bowl", "fork", "book"], "hallucination_idxs": [15, 52], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.4}}, {"image_id": 18688, "caption": "The image depicts a baseball game on a TV screen. There are several people present on the field during the game, including a baseball player, an umpire, and other players. ", "mscoco_hallucinated_words": [["tv", "tv"]], "mscoco_gt_words": ["person", "baseball bat", "sports ball", "baseball glove"], "hallucination_idxs": [8], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.25}}, {"image_id": 343534, "caption": "According to the image, there is a small doll dressed in a knit cap and pink outfit, standing with two teddy bears. One of the teddy bears appears to be quite large, while the other is slightly smaller. The doll is positioned on the left side of the image, while the two teddy bears are on the right side. All three are of different sizes and seem to be part of the doll's collection.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["teddy bear", "potted plant", "dog", "book", "chair"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2}}, {"image_id": 370657, "caption": "The scene takes place in a busy street, as several people and a child ride motorbikes and a bicycle. Some of the riders wear masks, indicating the possibility of a COVID-19 outbreak in the area. The main group consists of four adults and one child, all riding motorbikes. One person is seated on the back of a motorbike, with another person standing nearby. The bicycle is also visible in the background, possibly belonging to someone in the group. ", "mscoco_hallucinated_words": [["bicycle", "bicycle"], ["bicycle", "bicycle"]], "mscoco_gt_words": ["motorcycle", "handbag", "person"], "hallucination_idxs": [19, 72], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.6666666666666666}}, {"image_id": 481596, "caption": "According to the picture, there is a man in the middle of a crowded stadium, sitting and eating a food item, likely a sandwich. He is surrounded by many other people in the stands, as well as in various areas throughout the stadium. The man seems to be focusing on his meal and not taking his attention away from it. The scene is filled with a large gathering of people enjoying the event at the stadium.", "mscoco_hallucinated_words": [["sandwich", "sandwich"]], "mscoco_gt_words": ["hot dog", "chair", "person"], "hallucination_idxs": [26], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.3333333333333333}}, {"image_id": 460347, "caption": "The image captures a roadway featuring a city bus and multiple cars driving in different directions. The bus is the main centerpiece of the scene, occupying a considerable part of the road, and multiple cars can be seen on either side of it, dispersed along the street. One car is located closest to the bus, while another is further behind, and the remaining vehicles are positioned in various parts of the image.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bus", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 398753, "caption": " It is a bright, sunny day on the baseball field as people gather to watch the game. A large crowd is standing or seated around the field, observing the baseball players on the field. Numerous baseball players can be seen on the field, participating in the game. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "baseball bat", "sports ball", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 67833, "caption": "The image showcases a dining table with a large number of items placed across it. Two large pizzas are prominently displayed on the table, with some olive slices and other toppings visible. Additionally, there are multiple glasses, two bottles, and two cups scattered around the table.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "cup", "knife", "dining table", "bowl", "bottle", "fork", "wine glass", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4444444444444444}}, {"image_id": 272880, "caption": "The image depicts a table with a plate of pizza on it. There is a small piece of pizza on the plate, with a large empty space around it. A person is standing in front of the plate, holding a card with the word \u201cmystic pizza\u201d. There is also a fork in the image, possibly used to eat the food. The table is wooden and appears to be dining or restaurant setting.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "cup", "knife", "dining table", "fork", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 254368, "caption": "According to the picture, there is a man sitting in bed eating something, and a woman is laying beside him. The man is holding something while eating, which might be a bottle or another item. There is a bowl on the bed, possibly containing a meal or dessert. In addition, there are a few pillows scattered around the bed to provide comfort.", "mscoco_hallucinated_words": [["bottle", "bottle"], ["bowl", "bowl"]], "mscoco_gt_words": ["pizza", "bed", "person"], "hallucination_idxs": [35, 43], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 193108, "caption": " A baseball game is underway as a batter is holding a baseball bat during an exciting moment in the game. The batter is wearing a baseball helmet and sporting a new batting helmet, as he stands with a baseball bat ready to swing at the upcoming pitch. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "baseball bat", "sports ball"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 459141, "caption": "According to the picture, there is a person wearing a hat while riding a bicycle. The person is in the foreground of the image, with a bike next to them. The bicycle has a basket on it. The overall scene is somewhat cluttered with several other items in the background, including other bikes, a bench, and some other structures. The scene is taking place in an outdoor setting.", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["bicycle", "motorcycle", "potted plant", "person"], "hallucination_idxs": [61], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.5}}, {"image_id": 231534, "caption": "The image captures an airport scene with various airplanes and transport vehicles. One of the airplanes, likely a passenger jet, is on a runway with a truck close to the front part of it, while a second truck is farther behind the first truck and another is located near the back side of the plane.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "suitcase", "handbag", "airplane", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 227031, "caption": "In the image, a man and a woman are standing in front of a white cake, presumably being cut into for a party. The man is wearing a suit and tie, while the woman is wearing a dress. Both individuals are positioned closer to the center of the image. The cake is placed on a table in the middle, with a pair of candles sitting next to it on the left side of the image. Additionally, there are a few other people scattered throughout the scene, likely attendees of the party.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["knife", "dining table", "tie", "potted plant", "cake", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 209347, "caption": "In the image, a woman is sitting on a bench inside a subway car and looking at her cell phone. She is surrounded by other people sitting on benches, including two in the middle of the car and one towards the right. There is another person standing nearby, closer to the right side of the image.", "mscoco_hallucinated_words": [["bench", "bench"], ["car", "car"], ["bench", "bench"], ["car", "car"]], "mscoco_gt_words": ["handbag", "tie", "train", "cell phone", "person"], "hallucination_idxs": [10, 14, 30, 39], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.4}}, {"image_id": 523292, "caption": " The scene features a cat sitting on top of a laptop, which is placed on a table in a living room setting. The cat appears to be looking intently at the laptop screen, as if trying to figure out what is displayed on it. The laptop is positioned on the left side of the image, with the cat on the right side, sitting next to the table. The living room setting appears to include chairs on the sides and a couch in the background. There are also two cups placed on the table, one on the left side and the other near the center.", "mscoco_hallucinated_words": [["couch", "couch"], ["cup", "cup"]], "mscoco_gt_words": ["tv", "dining table", "laptop", "cat", "chair"], "hallucination_idxs": [87, 96], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.8}}, {"image_id": 12375, "caption": "The image displays a plate full of food, including a variety of meats such as ham and a meatball. It also includes carrots, which are both in a pile and scattered throughout the dish. There is also a cup of potatoes in the dish, likely mashed. The plate appears to be a meal that includes meat and vegetables.", "mscoco_hallucinated_words": [["cup", "cup"]], "mscoco_gt_words": ["carrot", "broccoli", "dining table"], "hallucination_idxs": [42], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.3333333333333333}}, {"image_id": 13550, "caption": " In the image, a woman is crouching down next to a large gray elephant, who appears to be the focus of the scene. She is holding a water bottle and petting the elephant, seemingly enjoying the experience. The elephant is quite large, taking up a significant portion of the image. There are also some small trees in the background, adding to the overall scene. ", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["person", "handbag", "elephant"], "hallucination_idxs": [31], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 146837, "caption": "In the image, a person is in motion while riding a skateboard on the sidewalk. They are positioned parallel to a bench, which is behind them, as they lean against the railing while performing a trick. The skateboard can be seen underneath the person, in the process of riding down the rail.", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["motorcycle", "skateboard", "person"], "hallucination_idxs": [23], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 67042, "caption": " In the image, a man is standing in a narrow hallway, looking at his cell phone. He is wearing a tie and is located near the middle of the hallway. Around him, there are various other people, walking and passing through the area, with a few people standing in the background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tie", "person", "cell phone"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 209925, "caption": "The image captures a baseball game happening on a field, with a baseball player on the mound and a crowd of people surrounding him. The player is standing on the mound, wearing an orange jersey, and holding a baseball glove.", "mscoco_hallucinated_words": [["orange", "orange"]], "mscoco_gt_words": ["sports ball", "person", "baseball glove"], "hallucination_idxs": [36], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 2154, "caption": " There is a shopping cart sitting on the sidewalk near a parked car, close to a pedestrian path near a street corner. The cart is empty, seemingly waiting to be filled with items during shopping. In the background, there is a bench placed close to the street corner, and multiple trees can be seen throughout the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["car", "umbrella", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 274931, "caption": "The image depicts a row of parked cars next to a building, which could be a street or a sidewalk. The cars are parked on the side of the street, with one being parked a little closer to the edge than the others. The street appears to be an old, black and white photograph. There are several vehicles parked in the scene, including one car that is further from the camera than the others, and the others are parked closer to the left side of the image.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "fire hydrant", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 104326, "caption": "The image portrays a clean and green kitchen, accompanied by a dining area, inside a large room. There are a couple of chairs and a long dining table with chairs, forming a gathering spot for residents. The kitchen area has a counter, a sink, and several cabinets for storage, while the dining area is home to some potted plants. There are a few bottles placed around the kitchen and dining area, likely containing liquid items such as beverages.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["clock", "knife", "oven", "dining table", "potted plant", "sink", "microwave", "chair"], "hallucination_idxs": [72], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.5}}, {"image_id": 233888, "caption": "According to the picture, a young boy is playing baseball, holding a bat and standing in front of a large flat-screen television displaying a live baseball game. The television is prominently displayed in the room and appears to be an important part of the scene. There are also several vases placed in various areas around the room, including one in the background, one in the middle of the scene, and another closer to the television. A bottle is also visible, positioned in the middle of the room.", "mscoco_hallucinated_words": [["vase", "vase"], ["bottle", "bottle"]], "mscoco_gt_words": ["tv", "potted plant", "baseball bat", "person", "baseball glove"], "hallucination_idxs": [53, 84], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.4}}, {"image_id": 213905, "caption": "The scene is a tennis court where two women are standing on opposite sides of the net, ready to play. The woman on the left is holding a tennis racket, and the one on the right has a racket as well. They are both wearing matching outfits, and their movement makes them appear to be walking and dancing on the court. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "tennis racket", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 277479, "caption": "In this image, a small child is standing near an open refrigerator. The child is holding a bottle and appears to be reaching for it, possibly to grab a drink or investigate what is inside. There are two other people in the scene, standing in different parts of the room. One person is closer to the left side of the room, while the other person is closer to the right side. Both of the people in the room seem to be observing the child by the refrigerator.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "refrigerator", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 300784, "caption": "The image depicts a group of kids sitting and talking together on a couch. There are four children sitting on the couch, engaged in a conversation or perhaps watching something together. In the background, there is another chair, as well as various items scattered around the area, including a vase, a bottle, a purse, a clock, and a backpack.", "mscoco_hallucinated_words": [["chair", "chair"], ["vase", "vase"], ["bottle", "bottle"], ["purse", "handbag"], ["clock", "clock"], ["backpack", "backpack"]], "mscoco_gt_words": ["person", "bench", "dog", "couch"], "hallucination_idxs": [41, 55, 58, 61, 64, 68], "metrics": {"CHAIRs": 1, "CHAIRi": 0.6, "Recall": 0.5}}, {"image_id": 183766, "caption": "The image features an assortment of cakes decorated with various fruits and berries. The cakes are placed on a table, accompanied by a cup and a vase with multiple cups. The fruits and berries can be seen on top of the cakes and also in the background, creating a colorful and delicious display. There are two knives on the table, one near the cup and another near the vase. The scene suggests a celebration or a special occasion where the cakes might be served as desserts.", "mscoco_hallucinated_words": [["cup", "cup"], ["vase", "vase"], ["cup", "cup"], ["cup", "cup"], ["vase", "vase"]], "mscoco_gt_words": ["cake", "fork", "knife", "dining table"], "hallucination_idxs": [25, 28, 31, 69, 74], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4166666666666667, "Recall": 0.75}}, {"image_id": 169169, "caption": " The image captures a city street, with a sidewalk bustling with people. There are multiple individuals walking down the street, some carrying umbrellas, as they move in different directions. Two signs on the sidewalk are visible, promoting events and the city's attractions. ", "mscoco_hallucinated_words": [["umbrella", "umbrella"]], "mscoco_gt_words": ["traffic light", "person", "potted plant", "car"], "hallucination_idxs": [25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.25}}, {"image_id": 409346, "caption": " The scene is a social gathering with several people in attendance. Two women are the central focus of the event, as they stand side by side near a table filled with various food items. One of the women is reaching into the table to grab a cake, while the other person has a bottle of alcohol, likely wine, in her hand.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["dining table", "bowl", "spoon", "cake", "person"], "hallucination_idxs": [57], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 0.6}}, {"image_id": 522262, "caption": "A long stretch of sidewalk or a street is lined with multiple bicycles, locked to the railing with a bracket attached. The bikes are parked alongside each other in a row, making the scene appear like a bike rack. Several people are walking or milling around the area, with at least two individuals visible in the immediate vicinity.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "car", "person", "bicycle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 79014, "caption": "According to the picture, a man is standing in front of a mirror, holding a lemon under his chin. There is another person reflected in the glass, likely the man's reflection from a different angle. The lemon appears to be placed in a way that makes the man appear to have a second head, as if he's making a funny face with the fruit. The scene is a bit unusual, yet somehow the lemon helps to bring the reflection of the man into focus.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["orange", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 344942, "caption": " The image features a computer screen with a large keyboard on top, and a single chair near the desk. There is a stuffed toy with a box head placed on top of the keyboard, possibly providing an interesting distraction from the screen. There are multiple objects around the screen, such as two books, one on the left side and another near the top-left corner, and a cell phone placed near the right side. The screen displays various buttons and features, including the search bar on the top, various icons in the top left corner, and a clock towards the lower-left corner. ", "mscoco_hallucinated_words": [["keyboard", "keyboard"], ["chair", "chair"], ["desk", "dining table"], ["keyboard", "keyboard"], ["book", "book"], ["cell phone", "cell phone"]], "mscoco_gt_words": ["clock", "laptop", "tv"], "hallucination_idxs": [9, 16, 19, 35, 57, 73], "metrics": {"CHAIRs": 1, "CHAIRi": 0.75, "Recall": 0.6666666666666666}}, {"image_id": 275929, "caption": "The image features several white toilets sitting on various steps, both above and below a staircase. These toilets have different sizes and positions, with one being larger than the others. The toilets appear to be discarded, adding a sense of decay to the setting. The presence of these toilets creates an unusual scene that draws attention to the issue of proper waste disposal.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sink", "bowl", "toilet"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 465601, "caption": "According to the picture, a group of people with umbrellas is walking in the rain outside a large house. They are positioned near the entrance and appear to be walking down a path or sidewalk, possibly in the direction of the street. Several of the people are holding and wearing umbrellas, which provide them with shelter from the rain.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "umbrella", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 300124, "caption": "The image features a couch placed on the sidewalk outside a house. The couch appears to be old and damaged, with multiple holes on its surface. A small television is sitting on top of the couch, resting on a wooden piece placed atop the couch. There is a dining chair located beside the couch, also outside the house. The overall scene suggests a makeshift outdoor living area or perhaps a dumping area for old furniture.", "mscoco_hallucinated_words": [["television", "tv"]], "mscoco_gt_words": ["microwave", "couch", "chair", "dining table"], "hallucination_idxs": [31], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.5}}, {"image_id": 46919, "caption": "According to the image, a person in a snowshoeing trip is traveling up a snow-covered slope. There are also two other people in the image, one standing and another further down the slope. The image captures a moment in a snow-covered winter setting where people are enjoying a snowshoeing trip together.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 434022, "caption": "There is an image of a train travelling down a train track in the picture, with a gray color. The train is on the lower train track, with the engine pulling a long, black and gray car train on the upper train track. The car train is moving from left to right along the train track. A few other cars can be seen scattered around the train track, but they are not attached to it. The scene appears to be on a cloudy day or evening with the train traveling in the distance.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["train", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 490491, "caption": "The image shows a road intersection with a sidewalk, streetlight, and a sign post in the foreground. The sign is positioned behind the wall, with a letter \"I\" and an upside-down \"V\" written on the stone. The streetlight and the sign are next to a cobblestone road. Additionally, there are four people visible in the scene, two of them standing near the sidewalk and the other two walking further away.", "mscoco_hallucinated_words": [["streetlight", "traffic light"], ["streetlight", "traffic light"]], "mscoco_gt_words": ["stop sign", "car", "person"], "hallucination_idxs": [10, 46], "metrics": {"CHAIRs": 1, "CHAIRi": 0.6666666666666666, "Recall": 0.3333333333333333}}, {"image_id": 175878, "caption": "According to the picture, a group of four people is standing together in what appears to be an outdoor space, such as a patio or balcony. They are wearing casual attire, with a couple of them wearing blue shirts. There are two cell phones visible in the image, which are likely to be the center of attention or conversation between the group. A bottle can be seen near one of the men, and a cup is positioned towards the bottom left of the scene. Some cups are placed closer to the bottle, while others are farther away, and some of them appear to be empty or unused.", "mscoco_hallucinated_words": [["bottle", "bottle"], ["cup", "cup"], ["cup", "cup"], ["bottle", "bottle"]], "mscoco_gt_words": ["backpack", "cell phone", "person"], "hallucination_idxs": [70, 82, 94, 100], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5714285714285714, "Recall": 0.6666666666666666}}, {"image_id": 12669, "caption": "According to the image, several people are standing in a room or space, possibly at an art show or exhibit. There are three women, one holding a glass, another standing near the glass, and another standing to the side of the first woman. There are also some chairs and a table in the room or space, possibly for visitors to view and interact with the artwork. ", "mscoco_hallucinated_words": [["chair", "chair"], ["table", "dining table"]], "mscoco_gt_words": ["wine glass", "handbag", "vase", "person"], "hallucination_idxs": [54, 57], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4, "Recall": 0.25}}, {"image_id": 192496, "caption": "The image depicts two trains, both of them on a train track. The first train is a blue and white train, while the second one is a black train. Both trains are moving along the track, one more slowly than the other. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["train", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 133146, "caption": " It is a scene of a spacious living room with comfortable seating, a fireplace, and a bookcase. The room is furnished with leather sofas placed against the walls, and several chairs scattered throughout the space. There is also a fireplace, and a large bookcase located near one of the walls. A potted plant can be seen near the fireplace. Various books, including two dictionaries, are present on the bookcase, while a Bible is on the table. The living room is well lit, allowing the occupants to relax and enjoy their surroundings.", "mscoco_hallucinated_words": [["potted plant", "potted plant"], ["table", "dining table"]], "mscoco_gt_words": ["tv", "vase", "book", "chair", "couch"], "hallucination_idxs": [58, 86], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4, "Recall": 0.6}}, {"image_id": 53744, "caption": "The image depicts a cutting board filled with different types of vegetables and meat, with an assortment of sliced tomatoes arranged around it. A knife, likely used for cutting the vegetables, can be seen in the scene, positioned to the left of the cutting board. A sandwich is also visible in the background, possibly as a meal being prepared or as an item being eaten. This setting showcases various ingredients being prepared or enjoyed during a meal.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sandwich", "knife", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 375902, "caption": "In the scene, there is a person on skis who is holding a sign with an immense amount of snow. Another person is located near the main focus, also on skis. The two skiers have their skis prominently displayed, indicating their excitement and achievement. The image captures the unique atmosphere of this snowy event.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 498747, "caption": "According to the picture, there are two people in the scene, a man and a woman. The man is holding a cell phone and talking on it, likely taking a picture of the woman. A wine glass can be seen in the background, as well as several bottles and another cell phone near the right edge of the image. There are also multiple other people in the room, some of whom can be seen towards the back and sides. The setting appears to be a social event or gathering, with people interacting and engaging in various activities.", "mscoco_hallucinated_words": [["wine glass", "wine glass"]], "mscoco_gt_words": ["cell phone", "bird", "person", "bottle"], "hallucination_idxs": [40], "metrics": {"CHAIRs": 1, "CHAIRi": 0.09090909090909091, "Recall": 0.75}}, {"image_id": 52435, "caption": "In the image, a group of people are enjoying a day at the beach, with some standing or sitting along the shoreline and others in the water. There is a sandy beach with waves lapping against the shoreline, creating a picturesque scene. Some people are actively surfing, while others are standing or sitting nearby. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["surfboard", "person", "umbrella", "bird"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 301977, "caption": "The image depicts a cow standing near a produce stand or market. It appears to be looking out of a large box filled with sodas. The cow is the main focus of the scene, with its attention drawn towards the soda box. There are several bottles of sodas placed around the cow, with some on the right side of the image and others in the background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cow", "refrigerator", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 261097, "caption": " The image features a tennis court where a tennis player, likely the focal point of the scene, is walking down the court with his tennis racket. There are several people in the background watching him, either fellow players or fans in the stands. A chair can be spotted in the middle of the court, possibly designated for a chair umpire.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["chair", "tennis racket", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 227204, "caption": "The image shows three women sitting around a table at a kitchen dinette, each talking on their cell phones. They are seated close together and have food on the table in front of them. The table contains a number of bowls, some filled with beverages, and some with food such as rice, eggs, and bananas. There are also cups and a knife on the table. Apart from the women, there are a few other items in the room, including a bottle and a spoon, as well as a chair placed near the wall.", "mscoco_hallucinated_words": [["chair", "chair"]], "mscoco_gt_words": ["cup", "knife", "dining table", "bowl", "spoon", "bottle", "cell phone", "person", "banana"], "hallucination_idxs": [100], "metrics": {"CHAIRs": 1, "CHAIRi": 0.07142857142857142, "Recall": 1.0}}, {"image_id": 353981, "caption": "The image is an old photo of an airplane parked on an airfield. The airplane is sitting on the runway, and several people are standing next to it, including one person to the left and one to the right. There are three other people in the background, one with her back to the airplane and the other two further away, one closer to the left and the other closer to the right.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "airplane", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 185327, "caption": " In this image, a herd of animals is walking down the street across from a car and bus, with several people also in the scene. The animals are comprised of various types, including cows, dogs, and horses, walking on a city street or sidewalk.", "mscoco_hallucinated_words": [["dog", "dog"]], "mscoco_gt_words": ["horse", "bus", "cow", "car", "person"], "hallucination_idxs": [39], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.8}}, {"image_id": 348654, "caption": "There is a cell phone placed on top of a speaker in an entertainment system, such as a TV or a computer monitor. The cell phone appears to be quite large and is positioned near the center of the image. In the background, there is a keyboard and a mouse present, which likely belong to the computer system.", "mscoco_hallucinated_words": [["keyboard", "keyboard"], ["mouse", "mouse"]], "mscoco_gt_words": ["cell phone", "laptop", "tv", "dining table"], "hallucination_idxs": [50, 53], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.75}}, {"image_id": 482432, "caption": "The image displays a close-up view of a sink with two toothbrushes, one being a child's toothbrush, on the right side. The sink is positioned towards the left side of the image. There is also a bottle with a white top on the left side of the sink, likely a skin care product. The scene conveys a sense of hygiene and grooming.", "mscoco_hallucinated_words": [["child", "person"], ["bottle", "bottle"]], "mscoco_gt_words": ["toothbrush", "sink", "bowl"], "hallucination_idxs": [16, 41], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2857142857142857, "Recall": 0.6666666666666666}}, {"image_id": 454161, "caption": " The image features a piece of pizza sliced and placed on top of a small, white plate. The slice of pizza is off to the side of the plate, partially covered by a napkin. Near the pizza on the plate, there is a wine glass filled with red wine, possibly as a beverage to complement the meal.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["wine glass", "pizza", "fork", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 144723, "caption": " The image shows a dog leaping into the air while catching a red frisbee in its mouth. The dog is mid-jump and appears to be excited about its catch. The background is filled with a number of other dogs, some scattered throughout the field and others closer to the main action. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "frisbee", "car", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 129707, "caption": "According to the picture, it is an image of a person holding a cordless telephone. The person is in the center of the scene, with their hand partially covering the top part of the phone. The phone appears to have buttons on its front, and its screen displays some content, possibly indicating a message or call. In the background, there is another person, likely sitting or standing in the same room.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["book", "cell phone", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 379842, "caption": " The image depicts two Nintendo Wiis, with one on top of the other. A wireless controller is held by one of the Wiis, and an unopened game lies nearby. The second Wii is positioned on top of the first, creating a stack. The overall setting suggests a game night or entertainment event involving these gaming devices.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["book", "remote", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.0}}, {"image_id": 203110, "caption": "The image captures a large orange streetcar driving down the street. There is a mix of pedestrians, some closer to the trolley and others more distant from it, in various positions, creating a bustling atmosphere around the streetcar. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "bus", "train", "orange", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 556478, "caption": "In the image, there are two pizzas sitting side-by-side on a tray in the kitchen. One pizza has a topping of vegetables, likely consisting of green peas, and the other pizza has a different type of topping. The oven is visible on the left side of the picture, indicating that the pizzas may have been cooked or finished in this space. There are several kitchen utensils, such as a spoon, located near the two pizzas, suggesting that they may be prepared for consumption.", "mscoco_hallucinated_words": [["spoon", "spoon"]], "mscoco_gt_words": ["pizza", "cup", "oven", "bottle"], "hallucination_idxs": [77], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.5}}, {"image_id": 103815, "caption": "The image captures a baseball game in progress, with a pitcher throwing a ball on the mound and a batter swinging a baseball bat. Several players are present, some standing in different positions around the field, while others are in action. There are a few baseballs visible in the scene, with one nearby the bat during the current game, another closer to the background, and a third ball in the far background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "baseball bat", "sports ball", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 451943, "caption": "The image portrays a bedroom with a luxurious bed, featuring a comfortable-looking headboard, pillow and a pair of lamps on the bedside tables. There are two lamps on the bed, one closer to the center and the other towards the right side. The bed is positioned close to a window, allowing natural light to enter the room. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "suitcase", "bed", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 268854, "caption": "A group of people, including an older woman and a younger man, are walking around and chatting by the shore. They are near a few boats, which are situated next to one another. Some of the boats appear to be small, and others have larger proportions.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "person", "boat"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 34464, "caption": "According to the picture, there are several people enjoying the waves in the beach area. A small surfboard is located near the shore, partially submerged in the water. The ocean water is quite choppy with several sets of waves moving in different directions. Several people are walking around, some closer to the waves, while others are standing further away or walking on the sand. The main group of people appears to be having fun together, making it an ideal spot for beach activities.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["surfboard", "boat", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 495513, "caption": "The image depicts a large crowd gathered for a party at night. Several people are scattered throughout the scene, some standing together in groups while others are standing on their own. The main focus of the image is a couple standing near the center, holding up hot dogs with skewers. They are likely enjoying themselves while partying with their peers. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["fork", "hot dog", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 529256, "caption": "The image features two dogs sleeping comfortably on a bed. The bed occupies a significant portion of the picture, with one dog occupying the left side and the other occupying the right side. They lay on top of a red blanket, cuddling up for a peaceful sleep. The image gives a sense of a relaxing and cozy setting for the dogs.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["book", "bed", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 37907, "caption": "In the image, a cat is sitting on top of a television in front of a potted plant. The cat is the main subject in the scene, occupying a significant portion of the television surface. The television is situated on a dining table or entertainment center, with a few other items placed in front of it. A second cat is also visible, though it is much smaller and sitting further back in the image. The combination of the two cats creates an interesting and unique scene, with the potted plant adding visual interest to the background.", "mscoco_hallucinated_words": [["table", "dining table"]], "mscoco_gt_words": ["cat", "potted plant", "tv"], "hallucination_idxs": [46], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1, "Recall": 1.0}}, {"image_id": 506401, "caption": "The image captures a scene on a city street with several statues, a small pond, and other figurines. In the foreground, there is a street light located at the center of the intersection, providing light to the area.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "traffic light", "cat", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2}}, {"image_id": 406611, "caption": " The image captures a busy ski slope with people of various ages walking and standing in groups, ready to start skiing or snowboarding. The snow-covered hill is filled with people enjoying the skiing season. Several people in the scene can be seen carrying skis, and some are carrying snowboards. There are a few clusters of people dispersed across the slope, likely waiting for their friends or ski instructors before embarking on their ski experience.", "mscoco_hallucinated_words": [["snowboard", "snowboard"]], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [52], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 0.6666666666666666}}, {"image_id": 267225, "caption": "In the image, there is a white truck parked next to a smaller vehicle, possibly a pickup truck. The two trucks have multiple satellite dishes attached to their roofs, giving the impression that they might be parked at an outdoor event. The location seems to be a sidewalk, as there are people walking by the trucks. The people appear to be in a group, gathered near the trucks or moving along the sidewalk.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "remote", "tv", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 278365, "caption": "According to the picture, there are two people sitting at a table. One person is wearing a blue sweater, and the other person has a cup in their hand. There is pizza on the table, as well as a bottle in front of the two people. Additionally, there are two more cups on the table, one towards the left side and the other closer to the right side.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["pizza", "cup", "dining table", "cell phone", "chair", "person"], "hallucination_idxs": [44], "metrics": {"CHAIRs": 1, "CHAIRi": 0.09090909090909091, "Recall": 0.6666666666666666}}, {"image_id": 69757, "caption": " The image depicts a group of people sitting around a dining table, enjoying a meal together. There are three main people in the scene, with a man standing behind a woman, and a woman sitting across from another man. They appear to be having a conversation as they eat.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "knife", "dining table", "cake", "wine glass", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2857142857142857}}, {"image_id": 432213, "caption": " The image features a meal consisting of a sandwich and a couple of French fries on a plate, surrounded by a variety of side dishes. Some fries are located on the left side of the plate, while a few are on the right side. The sandwich appears to be the main item on the plate. There are also a few oranges, placed in various positions, one towards the left side and two towards the right side. ", "mscoco_hallucinated_words": [["orange", "orange"]], "mscoco_gt_words": ["knife", "dining table", "bowl", "sandwich", "wine glass", "person"], "hallucination_idxs": [65], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.16666666666666666}}, {"image_id": 368367, "caption": "In the image, a woman is talking on a cell phone, covering part of her face with the handset. She is the main subject of the scene. There are two chairs present, one at the right side of the room, and the other slightly to the left. A bottle can be seen placed near the right side of the room, while a cup is located towards the center of the image. The scene captures the moment when the woman is talking on her phone while being surrounded by furniture and people.", "mscoco_hallucinated_words": [["chair", "chair"], ["bottle", "bottle"]], "mscoco_gt_words": ["cell phone", "cup", "person"], "hallucination_idxs": [34, 55], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 1.0}}, {"image_id": 201555, "caption": "In the image, there is a section of dirt on a highway next to a sign. The sign is on the left side of the dirt, indicating the state road is ending and giving directions to Royal National Boulevard.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "car", "truck"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.0}}, {"image_id": 92131, "caption": "In the image, a green and white public bus is parked in front of a large building, with people in the vicinity of the bus. A group of individuals, likely patrons, are seen at the bus stop, waiting to board or getting off the bus. One person is standing close to the back of the bus, while others are standing further away in various positions.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bus", "person", "handbag"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 21779, "caption": "The image features a woman lying down on the floor, likely taking a break or resting after a busy day. There is a handbag, or suitcase, beside her, while a few other bags can be seen in the background. The woman seems to be playing dead, pretending to be unconscious or sleeping, and making a silly or entertaining gesture for the photograph.", "mscoco_hallucinated_words": [["handbag", "handbag"]], "mscoco_gt_words": ["backpack", "suitcase", "person"], "hallucination_idxs": [25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 565148, "caption": " This image captures a baseball game taking place on a field. There are multiple people in the scene, including several players, a batter, an umpire, and a catcher. The batter has a baseball bat in his hand, while the catcher and umpire are nearby, ready to assist or observe. The players are strategically placed around the field, and some are positioned closer to the batter and the catcher, while others are further away. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "baseball bat", "sports ball", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 160183, "caption": "The scene features a red and white fire truck parked in a large parking lot. The truck has a white ladder on its back, extending out of the top of the vehicle. The truck appears to be stationary, possibly awaiting further instructions or action.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 555456, "caption": "The image features snow skiers in the snow, with one person skiing down a slope and several others around the area. There is a skier skiing to the right, while three other skiers are scattered throughout the scene, some in close proximity to the person descending the hill. The scene appears to be taking place during a cold, overcast day, with a few trees visible in the background. The presence of multiple skiers highlights the popularity of the sport and the various skill levels in the group.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skis", "person", "snowboard"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 235595, "caption": " The image showcases several motorcycles parked next to each other on the street, creating a unique sight. The motorcycles appear to be in line with one another, positioned close to each other on the sidewalk.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "motorcycle", "potted plant", "car", "parking meter", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.16666666666666666}}, {"image_id": 489909, "caption": "The image features a basket full of fruits, including several oranges and bananas. The oranges are placed towards the front and middle of the basket, while the bananas are located in the background. There are multiple oranges throughout the basket, with some positioned closer to the front and others closer to the middle or back. A banana can also be found in the background, located to the right side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "orange", "banana"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 501080, "caption": "According to the picture, there is a large, beautiful arrangement of flowers, with a mix of pinks and greens. The arrangement is primarily located on the right side of the image and consists of several flowers in the foreground, middle, and background. One flower is prominently displayed, taking up a significant part of the arrangement. The flowers are vibrant and attractively arranged, adding a lively touch to the space they occupy.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "vase", "potted plant", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.0}}, {"image_id": 405183, "caption": "According to the picture, a woman is standing in a living room holding a remote control for a Nintendo Wii video game, while a man is seated nearby. There are two other people in the room - one is seated towards the back of the image and another is standing near the left side of the room.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "handbag", "tv", "potted plant", "book", "chair", "person", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 409138, "caption": "The image features a large, white bus parked near a sidewalk near a building with several other buses in the area. It appears to be stopping at or near a parking lot, which can be seen in the background. There is also a car parked to the right of the bus, while other cars can be seen on the left, both near and further away from the bus. The scene is situated in front of a parking area with several parking spots, making the bus and other vehicles visible on the street.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bus", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 411840, "caption": "The image depicts two women tennis players walking across a tennis court together during an event. They are both holding tennis rackets and are closely positioned, with one of the players to the left of the other. Another person is standing near the left edge of the court, likely a fellow competitor or a spectator. The scene showcases the women's tennis event taking place on the tennis court.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "chair", "backpack", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 333712, "caption": "The image features a cheese and spinach pizza, partially consumed and with some slices missing from the dish. The pizza appears to have been served on a table, with some crumbs on the surface as well. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "spoon", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 148977, "caption": "The image features a teddy bear wearing headphones on its ear, sitting on a table. The teddy bear is situated in the middle of the scene, with books and other objects surrounding it. There are various books displayed on the table, ranging from the bottom left corner to the top right corner of the scene. Some of the books are placed closer to the teddy bear, while others are situated in different positions. Additionally, a cup is visible in the top-right corner of the scene.", "mscoco_hallucinated_words": [["cup", "cup"]], "mscoco_gt_words": ["keyboard", "teddy bear", "bear", "dining table", "bottle", "book"], "hallucination_idxs": [84], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 0.5}}, {"image_id": 399058, "caption": "The image features an elephant walking down the street with several people standing around it. The main focus of the scene is the elephant, which is taking up most of the road. Several other people are present on the street, some standing close to the elephant and others further away. Some of the people appear to be riding on the elephant, while others are simply watching it pass by.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "bicycle", "elephant", "orange", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 183571, "caption": "According to the picture, there is a man standing on the boat with a dog. The man is standing on the right side of the boat, near the middle, while the dog is located at the very back of the boat. There are also several other boats visible in the scene, including a large boat that appears to be connected to the main boat. Some of these boats have people on them, suggesting that they may be interacting with each other. Overall, the scene showcases a man and his dog spending time together on a small boat on a body of water.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "boat", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 325836, "caption": "There is a showcase of various motorcycles at a convention with people walking around. One motorcycle, possibly a custom one, is the centerpiece of the exhibit. The motorcycle is sitting on the red carpet and attracting attention from the other motorcycles surrounding it and visitors in the room.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "motorcycle", "handbag", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 437910, "caption": "The image features a small bathroom with two toilets side by side. One of them is situated closer to the left side of the room, while the other toilet is closer to the center. Both toilets have their seats in the open position. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sink", "toilet", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 155885, "caption": "This image shows a busy harbour scene with many boats of different sizes scattered around. Some of the boats are located closer to the shore while others are further out into the water. The scene is set in a black and white, which gives it a historical feel. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bird", "boat", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 467887, "caption": "A man is sitting at a table, working on his laptop. There is a dining table, two chairs, a vase, and a couch positioned around the room. A cup is placed on one of the table surfaces, while another is found on the dining table. There are two additional chairs in the room, one on each side of the couch.", "mscoco_hallucinated_words": [["vase", "vase"], ["cup", "cup"]], "mscoco_gt_words": ["dining table", "laptop", "mouse", "chair", "person", "couch"], "hallucination_idxs": [23, 34], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.8333333333333334}}, {"image_id": 174402, "caption": " The image features a man standing on a tennis court, holding a tennis racquet and ready to play the game. The man is wearing white clothes and seems to be getting ready to serve the ball. In the background, there are other tennis balls nearby, possibly waiting to be used in the match. The court is located in the center of the scene, with two more tennis racquets visible at the sides of the court.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 352734, "caption": " The image depicts a young boy eating ice cream at a dining table. He is holding a spoon in his hand and has a piece of cake in his mouth. In the background, there are three chairs, one on the right side, another on the left side, and the last one close to the boy at the dining table. A second person, likely a parent or guardian, is also present in the scene, but they are not sitting in any of the chairs.", "mscoco_hallucinated_words": [["chair", "chair"], ["chair", "chair"]], "mscoco_gt_words": ["dining table", "spoon", "cake", "fork", "person"], "hallucination_idxs": [39, 92], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2222222222222222, "Recall": 0.8}}, {"image_id": 346716, "caption": "There are two double-decker buses driving down a city street, surrounded by other vehicles and people. The buses are heading in opposite directions, with one to the left and another to the right of the street. On the other side of the street, a few cars are parked and waiting for their chance to pass by. A handful of people are also walking around, some closer to the buses, while others are further away. One person is walking near the front of one of the buses, while another individual is close to the back of the same bus. The street appears to be a busy one with the presence of both buses and a variety of people.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bus", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 451144, "caption": " The image captures two people standing on a snow-covered slope, looking at skis in the mountains. They are hanging their backpacks on a strap while they take turns to admire the scenery. The people are standing close to each other, with one person further ahead and the other closer to the backpack. The mountains are visible in the background, indicating a beautiful winter scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 295656, "caption": "According to the picture, the main focus is on three people, one woman and two men, standing next to each other and posing in a group. All three individuals are wearing formal attire, with ties and a suit, and seem to be enjoying themselves at a gathering. They are eating and drinking, holding dishes and glasses.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "clock", "tv", "tie", "car", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.14285714285714285}}, {"image_id": 129886, "caption": "In the image, there is a ski slope, possibly at a resort, with two main groups of people visible, both of which appear to be in the middle of skiing activities. The first group is near the ski lift, with four skiers spread across the scene. One person is on the left side, one is in the middle, another is on the right side, and the fourth person is located a bit further back. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skis", "person", "snowboard"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 557998, "caption": "According to the picture, a young man is sitting at a wooden table or bench, looking at a glass of wine. He appears to be wearing a white shirt in the image. Surrounding the scene are bottles and a cup, which could be other alcoholic beverages or other items for the event. There are also a few chairs visible in the background.", "mscoco_hallucinated_words": [["bench", "bench"], ["cup", "cup"]], "mscoco_gt_words": ["dining table", "bottle", "wine glass", "chair", "person"], "hallucination_idxs": [15, 43], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.8}}, {"image_id": 401010, "caption": "In the image, there is a bedroom with a bed situated in the center. The bed appears to be well-made and features multiple pillows. The bed is surrounded by various items placed throughout the room, including a bottle, an apple, a book, and a cup. The objects are positioned near the bed, creating a sense of clutter in the small space. ", "mscoco_hallucinated_words": [["bottle", "bottle"], ["apple", "apple"], ["book", "book"], ["cup", "cup"]], "mscoco_gt_words": ["sink", "chair", "bed", "tv"], "hallucination_idxs": [41, 44, 47, 51], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.25}}, {"image_id": 567562, "caption": "The image depicts a woman sitting at a dining table surrounded by four children, including two young girls and two boys. There are various pizzas on the table, some partially eaten and some still sealed. A glass bottle can be seen sitting near the center of the table.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "cup", "dining table", "bowl", "spoon", "bottle", "fork", "broccoli", "book", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.36363636363636365}}, {"image_id": 313928, "caption": "The image depicts a balcony overlooking the ocean with a table, chairs, and a surfboard. There are three chairs placed around the table, two closer to the surfboard and one more distant from it. The table, which has an umbrella and a bottle, has various items on it, including a glass, a vase, a bowl, and a cup. There is also a bucket in the scene.", "mscoco_hallucinated_words": [["umbrella", "umbrella"], ["vase", "vase"], ["bowl", "bowl"], ["cup", "cup"]], "mscoco_gt_words": ["surfboard", "dining table", "bottle", "wine glass", "chair"], "hallucination_idxs": [45, 61, 64, 68], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.8}}, {"image_id": 490284, "caption": " The image shows a dog, possibly a black and brown breed, lying on top of a person's lap on the couch. The dog is peacefully sleeping, taking up much of the visible portion of the image. Behind the dog, another person is visible, likely seated on the couch next to the dog. They may be resting on the same or a different part of the couch. The image captures a cozy, intimate scene where the dog is being comforted by a human and the relaxation of both parties is evident.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["couch", "person", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 346940, "caption": " The scene features a stove with various cooking utensils and kitchenware. A black kettle is located on the stove, and there are two spoons in the vicinity. In addition to the kettle, there is a bowl near the stove as well. ", "mscoco_hallucinated_words": [["spoon", "spoon"]], "mscoco_gt_words": ["bowl", "knife", "oven"], "hallucination_idxs": [25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 222782, "caption": "This is an image of a vintage sign in black and white. The sign reads \" american amoco gas\" and is displayed outside a store or gas station. The scene is relatively calm, with only a few people walking on the sidewalk.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "cell phone", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 255934, "caption": " The image showcases two elephants standing close to each other in a large enclosure, possibly a zoo or a park. The two elephants are roughly the same size and are positioned together, walking side by side. There is another elephant further away from them, but still within the enclosure.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "handbag", "elephant", "cow"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 278627, "caption": "The image is a collage of pictures featuring food items. There are several images that depict people eating various types of dishes, such as seafood, fruit, and vegetables. One image shows a man and a woman dining together in a restaurant. Another image shows a man eating food in a separate picture. Another image shows people enjoying food and drinks at a table.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "dining table", "spoon", "potted plant", "bottle", "fork", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 400851, "caption": "A large group of people is seen in the image standing on the beach, enjoying their day and making use of various seating areas, such as benches and lawn chairs. There are several people scattered across the sand, and some are standing on top of a table or a chair. The beach scene appears to be a popular gathering spot as there are multiple tents nearby, likely belonging to different groups.", "mscoco_hallucinated_words": [["table", "dining table"]], "mscoco_gt_words": ["bench", "umbrella", "chair", "backpack", "boat", "sports ball", "person"], "hallucination_idxs": [50], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.42857142857142855}}, {"image_id": 472610, "caption": "The image shows a person, likely a woman, standing outside with her dog on a leash. The dog is a large, possibly a white and blue cow, and is the center of attention. There are a few cows in the background, including one close to the dog and the woman. The image also depicts a couple of barns, further highlighting the farm setting.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cow", "person", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 78565, "caption": " The image portrays a group of people gathered on the beach near some small sailboats. There are several people visible in the image, both standing and sitting by the water. The sailboats are spread out across the beach, with some closer to the shore and others further away. The people are likely enjoying the scene or perhaps participating in the activities in the water.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "boat", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 388237, "caption": "The scene features a young person riding a skateboard in a parking lot, surrounded by people at various locations. There are three other skateboards in the area, one to the left of the main skateboard and the others on either side of the person. A bench and a purse can also be seen in the parking lot. The individuals are standing in various positions, with one person close to the left side, two closer to the middle, and two others near the right side. The people appear to be enjoying the parking lot atmosphere or watching the skateboarder. The scene appears to be leisurely and casual.", "mscoco_hallucinated_words": [["bench", "bench"], ["purse", "handbag"]], "mscoco_gt_words": ["skateboard", "chair", "person", "bottle"], "hallucination_idxs": [49, 52], "metrics": {"CHAIRs": 1, "CHAIRi": 0.18181818181818182, "Recall": 0.5}}, {"image_id": 289640, "caption": "In the image, a man holding an orange frisbee over his head while another person stands nearby. There are chairs at various locations around the room, some arranged in circular seating areas, and others positioned towards the sides. Several dining tables can be seen, with a dining table being occupied by a person and another one towards the right side. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "dining table", "frisbee", "orange", "wine glass", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.7142857142857143}}, {"image_id": 122602, "caption": "There is a variety of craft supplies and tools placed on a wooden table in this image. The table is filled with a mix of materials, including a bottle, paper, and scissors. The bottle is situated on the right side of the table, and paper is dispersed throughout the table. The scissors are mostly located on the left side, with one on the upper half and another closer to the bottom.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "scissors", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 24104, "caption": " The image portrays a typical office desk with a multi-monitor setup, a single computer, and various office items placed on it. The computer is centered, surrounded by three monitors on both sides, and two mice positioned nearby. A coffee cup and a book are also visible on the desk, adding a homely atmosphere to the workspace. A cell phone can be found at the top right of the image.", "mscoco_hallucinated_words": [["cell phone", "cell phone"]], "mscoco_gt_words": ["keyboard", "cup", "tv", "dining table", "laptop", "mouse", "book", "bed"], "hallucination_idxs": [65], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 0.75}}, {"image_id": 183709, "caption": "The image depicts a group of people standing on the sidewalk, in front of an intersection where a crosswalk and two traffic lights are located. The group includes multiple individuals wearing backpacks and hats, indicating a social gathering or day out for some of them. A man is in the center of the scene, and two cars can be seen in the background, parked on the street. There is a person on the left, and another person standing on the right side of the group.", "mscoco_hallucinated_words": [["backpack", "backpack"]], "mscoco_gt_words": ["traffic light", "handbag", "person", "car"], "hallucination_idxs": [33], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.75}}, {"image_id": 59710, "caption": "The image depicts a black and white refrigerator in a bedroom, likely situated near the bed. It is surrounded by several pieces of furniture, including a bed, two dressers, and two chairs. The bed and the first dresser are located closer to the refrigerator, while the second dresser and the two chairs can be seen further in the background. The setup suggests the bedroom serves as a place to eat and relax, as there is a refrigerator right by the bed and a dining area.", "mscoco_hallucinated_words": [["bed", "bed"], ["bed", "bed"], ["bed", "bed"], ["bed", "bed"]], "mscoco_gt_words": ["chair", "refrigerator", "dining table"], "hallucination_idxs": [16, 29, 39, 89], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4444444444444444, "Recall": 0.6666666666666666}}, {"image_id": 279407, "caption": "The image captures a little league baseball game with multiple players, including a young player catching a ball and another player in the background. The young player has a glove on his hand, holding the ball, and wears a baseball jersey. Another baseball is nearby, but it is not being caught or handled by anyone in the scene. The scene also includes a bench on the field, which might be used by players or coaches.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bench", "bottle", "sports ball", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 314729, "caption": "The image is a baseball card, likely dating back to the early 20th century, featuring a baseball player holding a bat and wearing an old-fashioned uniform. There is a placard or writing on the card, which might contain information about the player. In the scene, several other baseball cards can be seen, arranged in the background and positioned around the baseball player. The focus is on the main baseball player, showcasing the vintage look of the baseball card.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "baseball bat", "sports ball"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 527497, "caption": "According to the image, there are two computer monitors placed on a table. One is positioned on the left side, while the other one is located on the right. There is a black cell phone placed in the middle of the table, closer to the left monitor. The scene is in a computer lab or office cubicle, with chairs and various equipment placed around the desk.", "mscoco_hallucinated_words": [["cell phone", "cell phone"], ["chair", "chair"]], "mscoco_gt_words": ["keyboard", "tv", "dining table", "laptop", "mouse"], "hallucination_idxs": [37, 65], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2222222222222222, "Recall": 0.6}}, {"image_id": 411564, "caption": " The image shows a man sitting on a windowsill in a kitchen. He is wearing a hat and appears to be holding a cell phone while sitting outside of the room. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["refrigerator", "dining table", "bottle", "cell phone", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 320350, "caption": "The image features a man and a woman, both wearing skis, standing on a snowy plain with two dogs. They are positioned side by side, slightly spaced apart, while the dogs are standing in the middle of the pair. The man and woman appear to be smiling and enjoying their time together.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "skis", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 401857, "caption": " There is a baseball game happening in a baseball field, with numerous players and baseball equipment spread throughout the area. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "bench", "baseball bat", "sports ball", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.16666666666666666}}, {"image_id": 285302, "caption": " The image depicts a young boy sitting at a dining table and cutting paper with scissors. He is holding the paper with scissors on it, while the scissors are in close proximity to the paper. Surrounding the child are various chairs, a cupboard, and a couch. The child is the focal point of the scene, engaging in a creative and fun activity.", "mscoco_hallucinated_words": [["couch", "couch"]], "mscoco_gt_words": ["person", "chair", "scissors", "dining table"], "hallucination_idxs": [50], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 1.0}}, {"image_id": 61959, "caption": "The image features a large white bus driving down the street in an urban area. There are a few cars parked along the street, with one located towards the front of the scene, another further back, and a third on the left side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "bus", "truck", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 229659, "caption": " In the image, there is a large open space filled with chairs, some of which are at tables. A few people are seated in the dining area, such as the two main individuals at the tables on the right side of the room, and a person standing on the left side. They appear to be enjoying their meals and the social atmosphere of the restaurant.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "tv", "dining table", "laptop", "bottle", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.42857142857142855}}, {"image_id": 473118, "caption": " The image shows a person riding a skateboard on top of a half-pipe or a ramp, skillfully balancing on a skateboard ramp. Another person is seen in the background, also enjoying the ramp, but they are closer to the left side of the scene. Several bicycles are scattered around the area, including two bicycles at the lower left corner of the image and another one on the right. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "skateboard", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 391222, "caption": "The image depicts two people standing on skis in the snow, one of them wearing blue and the other wearing red. They appear to be standing next to each other and are in the process of discussing something. The skiers are standing in front of a large information sign that is located nearby, indicating the trail or information for skiers and visitors. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 416104, "caption": "According to the picture, there is a group of people seated at outdoor tables under multiple blue umbrellas, creating a relaxed atmosphere. Several people are present, including two individuals towards the right side, one seated in the middle, and at least four people on the left side of the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["umbrella", "dining table", "bottle", "wine glass", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 511852, "caption": "The image shows two people skiing down a snow-covered slope. Both skiers have their backpacks with them. One person is located near the top of the hill, with their skis pointing towards the right side of the image, while the other person is skiing a bit lower down the slope. The pair seems to be enjoying their skiing excursion and making their way towards the base of the mountain.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 59386, "caption": " The image features a group of giraffes standing in a field, likely eating grass. The giraffes are scattered throughout the field, some closer to the left side and others towards the center or right side. They are all different sizes, some smaller and others bigger. The scene presents a herd of giraffes, and it is quite a sight to see these majestic animals together.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["zebra", "giraffe", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 540180, "caption": "A picture captures a busy city street scene with traffic lights mounted on a pole. There is a green traffic light sitting next to other traffic lights, occupying a prominent position. The street has numerous cars parked near the traffic lights, contributing to a busy atmosphere. The street and traffic lights are located near some buildings, adding to the urban environment. In addition to the traffic lights and vehicles, there are several pedestrians walking around the area.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 153692, "caption": "According to the image, there is a man riding a bicycle and crossing the street with his dog. The man is at the center of the image, surrounded by a few traffic lights. Additionally, there are at least five traffic lights in the scene, with one placed above the man and dog, another above the middle of the image, and two others near the top of the image, on the right-hand side and in the background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "traffic light", "dog", "car", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 265628, "caption": "The image showcases a unique public restroom facility, featuring a toilet and sink arranged side-by-side. The toilet is positioned at the center of the scene, with a sink adjacent to it on the right side. A pair of legs is visible near the bottom left corner, likely belonging to a person standing or sitting near the toilet.", "mscoco_hallucinated_words": [["sink", "sink"], ["sink", "sink"]], "mscoco_gt_words": ["bowl", "person", "toilet"], "hallucination_idxs": [13, 30], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.6666666666666666}}, {"image_id": 499594, "caption": " The image features a meal consisting of various food items placed on a table. The main dish appears to be meat, possibly steak or pork, served with a mix of vegetables, including broccoli and cheese. There are several bowls of vegetables and cheese on the table, as well as some bottles and glasses, likely containing drinks. A bottle is situated in the top right corner of the image, and a couple of forks are also visible, placed near the meat dish and at the bottom of the table.", "mscoco_hallucinated_words": [["bowl", "bowl"]], "mscoco_gt_words": ["cup", "knife", "dining table", "bottle", "fork", "broccoli"], "hallucination_idxs": [43], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 0.6666666666666666}}, {"image_id": 5934, "caption": "According to the picture, there is a baseball and a baseball glove sitting in a dining bench, likely on a sports field. The baseball glove appears to be an official ball glove, possibly of a popular brand. The baseball, meanwhile, is positioned near the glove, as if it has just been placed inside or removed from the mitt. The bench provides a seating area or a place to rest during play.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bench", "sports ball", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 47828, "caption": "According to the image, it is night time and there is a large river filled with water, possibly a canal, near several bridges. Numerous people are walking alongside the river, some of them walking on the sidewalk, while others are near the water. Overall, there are at least 13 people visible in the scene, spread across the walkway and the water.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "boat", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 47916, "caption": " There are several dogs, including three main dogs, running around a grassy field in the image. One of the main dogs is holding a frisbee in its mouth, with the other two dogs nearby. There is another smaller dog that is also visible, possibly a puppy or a younger dog.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "bench", "dog", "frisbee", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 255483, "caption": "The image depicts a person sitting at a table, surrounded by several books and papers. The person appears to be reading or studying the documents. There is a pair of scissors and a spoon near the table, which are likely related to the person's activity. ", "mscoco_hallucinated_words": [["spoon", "spoon"]], "mscoco_gt_words": ["book", "person", "scissors", "dining table"], "hallucination_idxs": [36], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 1.0}}, {"image_id": 131138, "caption": "A well-set office desk is showcasing various computer parts, including a laptop and a monitor. The laptop is placed near the center of the desk, and a mouse can be found close by, positioned near the bottom right side. Two keyboards are also visible, one on the left side of the laptop and another closer to the monitor on the right. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["keyboard", "cup", "tv", "dining table", "spoon", "laptop", "potted plant", "mouse", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5555555555555556}}, {"image_id": 258209, "caption": " The image depicts a man sitting on a bench in front of a large building, possibly a brown building or a building with a brown color. The bench appears to be near the front of the building and the man is positioned close to the edge of the sidewalk. Several other people can be seen in the scene, walking or standing in the vicinity of the man and the bench.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "book", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 347453, "caption": "In this scene, there are two red double-decker buses, one parked and the other moving down the street. The street is lined with tall buildings and is relatively empty, possibly during off-peak hours.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bus", "bench", "traffic light", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2}}, {"image_id": 285820, "caption": "The image shows a living room filled with furniture, including a couch in the center of the room and another smaller couch along the wall. There are two large windows on one wall, providing a natural source of light and creating a comfortable and inviting atmosphere. A dining table is positioned near the wall opposite the couches, and a television is located on the opposite side of the room.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bench", "tv", "dining table", "bowl", "cell phone", "book", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.42857142857142855}}, {"image_id": 245664, "caption": "The image features a table with a laptop sitting on it, which is on and connected to a monitor. The laptop appears to be a personal computer set up for work and a keyboard can be seen on the table. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["keyboard", "cup", "tv", "dining table", "laptop", "mouse", "book", "wine glass"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 453731, "caption": "According to the image, there is a dog sleeping on the floor next to a wooden bench or counter. The bench or counter is covered in glass bottles and there are a few other bottles scattered around the area. The bottles are likely used for storing wine or other beverages.", "mscoco_hallucinated_words": [["bench", "bench"], ["bench", "bench"]], "mscoco_gt_words": ["cup", "dog", "bottle"], "hallucination_idxs": [17, 22], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.6666666666666666}}, {"image_id": 524601, "caption": "In the image, there is a living room setting with a group of five people standing together, playing a video game. The people are holding Nintendo Wii remotes, and they appear to be enjoying themselves while interacting with each other.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "refrigerator", "book", "chair", "person", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 253262, "caption": " The image shows several people standing around an airplane, which appears to be parked at an airport or airfield. One person is in the center, possibly waiting for an airport employee to assist them. There are four other individuals surrounding this person, two on the left side and two on the right.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "suitcase", "tie", "airplane", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 87052, "caption": " The image features a couple standing in a kitchen, possibly getting ready to eat. They are positioned near a microwave, with a spoon on top of it. The microwave is placed on the left side of the kitchen, in the vicinity of an oven. The couple is standing near the oven, with the man standing more to the left side of the kitchen and the woman closer to the right. They appear to be concentrating on their task, possibly preparing food. ", "mscoco_hallucinated_words": [["microwave", "microwave"], ["spoon", "spoon"], ["microwave", "microwave"]], "mscoco_gt_words": ["cup", "oven", "refrigerator", "bowl", "bottle", "vase", "person"], "hallucination_idxs": [21, 25, 32], "metrics": {"CHAIRs": 1, "CHAIRi": 0.42857142857142855, "Recall": 0.2857142857142857}}, {"image_id": 357578, "caption": "There is a red fire truck parked in front of a red car on the street, with several other vehicles present including a blue car, a black truck, and a green truck. These vehicles are driving on the road as the fire truck is parked and a traffic light is present in the area. A police car can also be seen further down the street, adding an extra element to the scene. Various traffic signs can be noticed, such as a stop sign, an indication sign, and a hand-held traffic light. The atmosphere seems to be a typical city street, with people and cars moving along the road.", "mscoco_hallucinated_words": [["stop sign", "stop sign"]], "mscoco_gt_words": ["traffic light", "car", "truck", "person"], "hallucination_idxs": [89], "metrics": {"CHAIRs": 1, "CHAIRi": 0.08333333333333333, "Recall": 1.0}}, {"image_id": 418692, "caption": " The image depicts a woman sitting in a living room on a couch. The room features a large dining table in the center, adorned with several decorative items and a vase. There are two chairs placed near the table, and a clock can be seen on the wall.", "mscoco_hallucinated_words": [["table", "dining table"], ["vase", "vase"], ["table", "dining table"], ["clock", "clock"]], "mscoco_gt_words": ["remote", "cell phone", "book", "chair", "person", "couch"], "hallucination_idxs": [20, 32, 41, 45], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5714285714285714, "Recall": 0.5}}, {"image_id": 259345, "caption": "According to the picture, a man is sitting on a bench while holding an umbrella. He has a long beard and appears to be wearing a multi-colored dress. The umbrella he's holding is large and spans most of the image. The bench he is sitting on can be seen in the right side of the picture. There is also another person visible in the image, sitting closer to the right side. The background appears to be mostly empty, giving focus to the man and his umbrella.", "mscoco_hallucinated_words": [["bench", "bench"], ["bench", "bench"]], "mscoco_gt_words": ["umbrella", "chair", "person"], "hallucination_idxs": [11, 46], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 119709, "caption": " The image depicts a woman standing in a kitchen, focused on a task in the vicinity of a table. She appears to be busy in the process of baking cookies. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["chair", "donut", "person", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 443299, "caption": " The image features a man and a woman sitting on a bench, possibly waiting for something or someone. The bench appears to be a public seating area in a building. The two individuals are sitting on the left side of the picture, with the man closer to the center and the woman on the right.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "bench", "cell phone", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 120776, "caption": "In the image, there are several outdoor umbrellas, each with their unique design and colors. Some people are seated on benches placed around the umbrellas, enjoying the shaded area and the company of others. A woman can be seen standing to the side, possibly waiting or enjoying the scene. Various handbags are positioned near the benches, possibly belonging to the seated people.", "mscoco_hallucinated_words": [["bench", "bench"], ["handbag", "handbag"], ["bench", "bench"]], "mscoco_gt_words": ["tie", "umbrella", "person"], "hallucination_idxs": [23, 57, 62], "metrics": {"CHAIRs": 1, "CHAIRi": 0.375, "Recall": 0.6666666666666666}}, {"image_id": 411765, "caption": "In the image, a man is playing a game of tennis on a tennis court, holding a tennis racket. He is in the middle of hitting the tennis ball, swinging his racket with precision.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 447991, "caption": "", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "bicycle", "handbag", "traffic light", "cell phone", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.0}}, {"image_id": 276719, "caption": " The scene takes place in a city street with multiple cars parked on both sides of the street. The car closest to the viewer has a blue color, and another car parked to the right of it has a brown color. Among the cars, there is a dog standing on the sidewalk, which appears to be the focus of the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "handbag", "motorcycle", "dog", "car", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2857142857142857}}, {"image_id": 422328, "caption": "In the image, a horse is tied to a snow plow, making it an unusual sight. The snow plow is located next to the horse, with the horse and plow placed right outside a building. A person is also present nearby, standing or walking around the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["horse", "snowboard", "car", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 354559, "caption": " The image features a young woman wearing a black and white striped dress, standing against a wall and posing with a stuffed animal. The woman appears to be smiling and having fun with the prop.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["teddy bear", "bear", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 438788, "caption": "The image depicts a dining table with a slice of cake placed on it. There are two cakes sitting on the table, one with an uneven piece taken out of it and the other cake untouched. A knife is also visible, potentially used to cut the cake. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cake", "knife", "person", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 468005, "caption": " The image shows a large elephant with tusks, walking down a path. The elephant appears to be on its way to the ocean as there is a person standing close to the animal. There are other people, some standing and others sitting, near the water. The scene captures a unique moment, as the elephant walks in the company of humans, both on the beach and on a path nearby. The people are scattered around the scene, watching the elephant or engaging in their own activities.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "person", "elephant"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 545594, "caption": "According to the picture, a group of people is sitting on a bus as they go about their daily activities. Some are sitting close together, while others are positioned more apart. The bus has two rows of seats: the first row with three seats and the second row with two. There is a backpack on the first seat on the right and a cell phone can be seen on the second seat on the right.", "mscoco_hallucinated_words": [["cell phone", "cell phone"]], "mscoco_gt_words": ["backpack", "chair", "orange", "person"], "hallucination_idxs": [69], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.75}}, {"image_id": 219546, "caption": "The image depicts a table filled with various foods and a large number of bowls arranged around it. The table is covered with a tablecloth and is adorned with a few plates.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["knife", "dining table", "bowl", "spoon", "bottle", "fork", "chair"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2857142857142857}}, {"image_id": 386645, "caption": " The image captures a street scene with an old looking building, some cars, and a small truck parked in front of the building. The car in the foreground is an older model, and there are also a couple of other vehicles parked near the building on the right side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "motorcycle", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 155131, "caption": "The image features a woman dressed in white performing a serve in a tennis match. She is the central figure in the scene, and her movement is captured mid-air as she raises her arm to hit the ball. A number of people are scattered throughout the scene, some of them closer to the tennis player and others farther away.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bench", "tie", "tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 188752, "caption": " In the image, there are several hanging bunches of bananas arranged vertically from top to bottom, creating an appealing visual display. There are two people in the scene: a man standing near the left edge of the image and a woman standing at the right edge, both seemingly admiring the banana display.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["apple", "person", "banana"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 228350, "caption": "According to the picture, there is a large group of people sitting outside of a restaurant, with multiple tables and chairs set up in front of red umbrellas. There are a total of 12 people scattered across the dining area, and a couple of additional people are standing further back. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "umbrella", "dining table", "bowl", "tie", "wine glass", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 437720, "caption": "The image is a painting that depicts a nude woman sitting at a table while working on a computer. The woman appears to be sitting in front of a red chair and working on her laptop. The painting is composed of various objects such as chairs, a table, and a chair, as well as artwork hanging on the walls.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "laptop", "vase", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8}}, {"image_id": 378831, "caption": "A woman is walking down the street, carrying grocery bags, and wearing a backpack. She appears to be in a parking lot or next to a fire hydrant.", "mscoco_hallucinated_words": [["backpack", "backpack"]], "mscoco_gt_words": ["fire hydrant", "handbag", "person"], "hallucination_idxs": [15], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.6666666666666666}}, {"image_id": 375503, "caption": "The image shows a stairwell leading to a residence. The stairwell has a set of stairs with two railings and a wooden floor. There are two doors at the top of the stairs, one on the left and one on the right. Both doors have a clock on the outside. The image gives a sense of being in an upstairs area of a building or house. The atmosphere is somewhat elegant, with wooden features and the clocks adding a bit of personality.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "tv", "dining table", "laptop", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.16666666666666666}}, {"image_id": 330369, "caption": "According to the picture, a large red and white boat is traveling on the water, occupying a significant portion of the scene. The boat has several other boats nearby, likely sharing the same body of water. The ocean and the boats are spread across the canvas, creating an interesting scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bird", "boat", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 288576, "caption": " According to the picture, a man in his 40s is playing frisbee with his son outside in a field filled with grass. The man is standing in the field, with the frisbee in the air as he throws it. He is about to catch the frisbee while his son tries to stop it. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "frisbee", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 32190, "caption": "The image features a toy horse on the sidewalk next to a street. The horse is on the right side of the scene, and a few vehicles are parked nearby or driving down the street. A car is parked at the back of the scene, while two more vehicles are positioned closer to the front of the image. The scene is surrounded by tall buildings in the background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["horse", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 535403, "caption": "The image displays a small, tiled bathroom with white and blue vanities and mirrors. There are two sinks, one on the left side and the other on the right side, located under the mirrors. The sink on the left has a comb visible in its reflection, as well as a bottle visible near the sink. Several bottles are scattered throughout the room, likely holding personal grooming products. A vacuum cleaner is also visible on the right side of the room, likely to be used for cleaning purposes. A hair dryer is situated in the center of the room. The bathroom appears neat and well-organized, with personal grooming tools and products placed around it.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "potted plant", "bottle", "toilet", "sink", "hair drier"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 449798, "caption": "The image features a desk or table topped with various items. Among the collection of toys, a keyboard sits on the left side of the table. A computer keyboard with a cup and a mouse on it is positioned further towards the center. The display is dominated by various small figures and figurines, each displayed differently. There are a few figurines on the left side of the table, a few in the middle, and others on the right. Additionally, a television can be seen in the background of the scene.", "mscoco_hallucinated_words": [["cup", "cup"], ["television", "tv"]], "mscoco_gt_words": ["keyboard", "dining table", "laptop", "mouse", "cell phone", "book"], "hallucination_idxs": [34, 90], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 375769, "caption": "In the image, a young boy wearing a helmet is sitting at a table in front of a cafe with several other people in the background. There are multiple tables, chairs, and a bench positioned around the cafe, providing various seating options for customers. The boy is eating a hot dog, while some bikes are placed around the cafe, one at the far right and another closer to the left side.", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["bicycle", "cup", "umbrella", "dining table", "hot dog", "car", "chair", "person"], "hallucination_idxs": [37], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 0.625}}, {"image_id": 426172, "caption": "A photo display stands with a bunch of signs in front of it creates an eye-catching setup. The main focal point of the image is the dining table, which is filled with various products such as cake, wine bottles, and knives. A brochure stand is also positioned close to the table, displaying different flyers. Two hand bags and a cell phone can be seen on the table as well, emphasizing the store display nature of the scene. Additionally, a few bottles can be seen scattered around the table and on the brochure stand, further emphasizing the store setting. A dining chair is positioned to the side of the table, potentially providing a place for customers to sit while shopping.", "mscoco_hallucinated_words": [["cake", "cake"], ["cell phone", "cell phone"]], "mscoco_gt_words": ["knife", "dining table", "bottle", "book", "chair"], "hallucination_idxs": [38, 66], "metrics": {"CHAIRs": 1, "CHAIRi": 0.18181818181818182, "Recall": 0.8}}, {"image_id": 20650, "caption": " The image features a plate with various food items, including a delicious sandwich and a side of vegetables, specifically broccoli. There are multiple pieces of broccoli, some of them cut into smaller pieces, spread across the plate. A fork is positioned close to the food, suggesting that someone is eating or serving the meal. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "bowl", "sandwich", "fork", "broccoli", "orange"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 138078, "caption": "In the image, there is a baseball field where two baseball players are standing in the dirt. One of the players is wearing a baseball glove and is on the left side of the field. The other player is on the right side of the field and wears a baseball cap. They appear to be waiting to make a play on the field.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sports ball", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 152499, "caption": "According to the image, a group of people and kids are enjoying a party by a pool. There are various pool floats, boards, and surfboards in the water, indicating a fun and festive atmosphere. A couple of kids are playing and others are sitting around the pool, creating a sense of family and friendship. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["surfboard", "umbrella", "dining table", "potted plant", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 180764, "caption": "According to the picture, a man in glasses is biting a pink glazed donut, and another person is visible in the background, likely standing or sitting nearby. The scene takes place in a kitchen area with some utensils and a sink nearby.", "mscoco_hallucinated_words": [["sink", "sink"]], "mscoco_gt_words": ["microwave", "donut", "person", "bottle"], "hallucination_idxs": [44], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.5}}, {"image_id": 147762, "caption": "The image features a small white plate, sitting on a table, adorned with a piece of cake, ice cream, and a lemon on top. The cake appears to have a square shape with a creamy center and is covered in whipped cream. The ice cream is placed atop the cake. The lemon seems to be floating in the center of the dessert.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["fork", "cake", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 113860, "caption": "The image features a canal in which a boat is moving from the back. It appears to be traveling along the canal with a passenger or passengers on board. The boat is the main focus of the scene, with a large portion of it visible in the foreground. The water surrounding the boat and the canal itself can be seen in the background. There are a few other boats also present, some closer to the camera, while others are farther away. Overall, it is a picturesque scene with a boat sailing in a peaceful and scenic location.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["boat", "umbrella", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 460866, "caption": "The image shows a cup filled with multiple toothbrushes of various sizes. Some toothbrushes are stacked on top of each other and some are inside the cup. The toothbrushes appear to be a mix of different brands and designs, with some appearing more dominant than others. There is also a cup, likely used to hold the toothbrushes, visible in the scene. The overall scene suggests a collection of toothbrushes in a cup, with some of the toothbrushes protruding from the cup and others inside it.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["toothbrush", "cup", "dining table", "bowl", "sink"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 177015, "caption": " The image captures a man sitting on a couch, holding a cat on his lap as he uses a laptop computer. The cat is either sleeping or calmly sitting on the man's lap as he interacts with the laptop. There are also two other chairs in the scene, both located to the left and right of the man.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["refrigerator", "laptop", "cat", "apple", "chair", "person", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.7142857142857143}}, {"image_id": 333998, "caption": "A family of four is preparing meals in a small kitchen filled with wooden cabinets. There are two people in the kitchen, one in the center and the other on the right side. The kitchen has a white refrigerator, black microwave, and a recessed sink. The refrigerator has a few bottles stored on the shelves, with one bottle placed closer to the refrigerator's left side and a larger bottle placed near the middle. On the right side, a bowl can be seen next to the sink.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "suitcase", "oven", "refrigerator", "bowl", "bottle", "sink", "microwave", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 47131, "caption": "The image shows a black and white photo of a train that has derailed and fallen off the rails. Many people are gathered around the wrecked train, with some of them standing on the wrecked car and others looking at it. The image captures the moment of the train derailment, and the people are curious or possibly working to restore the train. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["train", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 425447, "caption": "The scene features a white pickup truck parked on the side of the street, sitting in the middle of the roadway. The truck has a hose spraying the street, with the water flowing towards the front and back of the truck. There is a fire hydrant located near the truck on the left side of the image. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "fire hydrant", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 295105, "caption": "In the image, a man is standing inside a kitchen, looking out of a refrigerator toward a microwave located on the left side of the kitchen. The refrigerator is one of the two located in the room, with the other situated on the right side. The man is the only person present in the kitchen. There are multiple chairs throughout the space, with three in the right corner, one in the center, and two more in the left corner. Various bottles are placed in the room as well, with two located near the refrigerator and three towards the center of the kitchen. A couple of knives are also present, one closer to the man and the other near the microwave.", "mscoco_hallucinated_words": [["microwave", "microwave"], ["chair", "chair"], ["knife", "knife"], ["microwave", "microwave"]], "mscoco_gt_words": ["oven", "refrigerator", "bowl", "bottle", "apple", "sink", "person"], "hallucination_idxs": [19, 64, 117, 132], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.42857142857142855}}, {"image_id": 443259, "caption": "This image showcases a modern and minimalist kitchen equipped with all stainless steel appliances and surrounded by oak wood flooring. The kitchen features an island in the middle, which contains a sink, a knife, and a bottle of soda arranged on top. A refrigerator and a two-stool counter are located nearby. The scene is lit by a potted plant in one corner of the kitchen.", "mscoco_hallucinated_words": [["knife", "knife"]], "mscoco_gt_words": ["oven", "refrigerator", "dining table", "bowl", "potted plant", "bottle", "orange", "sink", "chair"], "hallucination_idxs": [36], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.4444444444444444}}, {"image_id": 537861, "caption": "Scene) and a person walking on the sidewalk nearby, with a small group standing to the right. The train appears to be yellow and gray in color, and is pulling alongside the station, next to the sidewalk. The group is dispersed throughout the scene, with some standing closer to the train and others further away. The person walking on the sidewalk is in the upper left corner of the scene, while the group is positioned near the left and right edges. There are several benches visible in the scene, placed on both sides of the station and next to the sidewalk.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["train", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 145020, "caption": "There is an image of a disneyland, likely donald duck, in the center of the scene, with a bunch of toys on the left side and a few more to the right. There is also a christmas decorated tree near the disney character.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["teddy bear", "bird", "mouse", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 249969, "caption": "The image showcases a plate filled with various types of meat, vegetables, and potatoes, all arranged beautifully and artistically. The meat consists of chicken pieces and beef slices, surrounded by numerous vegetables, including broccoli, carrots, and a few pieces of bell peppers. There are also a few potatoes placed in the dish. The plate is surrounded by various bottles, some containing sauces or other condiments, while others hold beverages like wine. There are three knives on the dining table: one placed vertically in the center, one on the right side, and another on the left side.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["knife", "dining table", "spoon", "fork", "broccoli", "wine glass", "carrot", "bird"], "hallucination_idxs": [68], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.5}}, {"image_id": 146504, "caption": " The image shows a person sitting in front of a keyboard and mouse, with his hand on the mouse. The mouse is located in the center of the image, while the keyboard is situated towards the left side. There are also a few additional objects on the table, including a cup and a cell phone. The background is not visible in this particular scene.", "mscoco_hallucinated_words": [["cell phone", "cell phone"]], "mscoco_gt_words": ["keyboard", "cup", "dining table", "laptop", "mouse", "person"], "hallucination_idxs": [58], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 0.8333333333333334}}, {"image_id": 40037, "caption": "There is an image of a large building with a clock tower situated on the top. The clock tower has a cross on top of it. The building is positioned on the right side of the street, in front of several people walking around. Some of these people are farther away, while others are closer to the building.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 343218, "caption": " The image shows a tennis court where two men are playing tennis. One of the players is holding a tennis racket and is in the process of serving the tennis ball, while the other player is positioned behind the server, ready to return the ball. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 301837, "caption": "The image features a long, narrow kitchen with white appliances and a sink located towards the back. There are several pots placed around the kitchen, including one in the foreground and four more along the left side of the room.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "oven", "refrigerator", "apple", "sink"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2}}, {"image_id": 309071, "caption": " In the image, an old man is standing next to a bicycle that is leaning against a fence. The man appears to be resting, possibly tired, or engaged in a conversation, as he's leaning on the fence near the bike. A bench can be seen behind the man, providing seating for those passing by.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "bench", "potted plant", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 240681, "caption": "The scene is set on a street with a sidewalk, and a man is standing near his red motorcycle. He is holding a cellphone, possibly on a call. The man has a backpack on his back, and there are multiple cars in the background, parked along the street. The background is a mixture of buildings and green areas, indicating a more urban setting.", "mscoco_hallucinated_words": [["cellphone", "cell phone"]], "mscoco_gt_words": ["backpack", "motorcycle", "person", "car"], "hallucination_idxs": [25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 1.0}}, {"image_id": 178753, "caption": "The image shows a young girl, possibly a teenager, sitting in a chair with a cell phone in her hand. She is wearing braces and smiling while holding the phone. She is surrounded by several chairs, likely in a seating area or at a venue.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cell phone", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 278890, "caption": "In the image, a man is placing a cell phone in his pocket while wearing a blue shirt. There are two other individuals in the background, one to the right and the other towards the center. The focus is on the man who is taking care of his personal belongings.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cell phone", "person", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 47781, "caption": "There is an image of a bedroom that is very modern in style. The room has a bed in the center, with a pillow on the bed. There is also a nightstand in the room.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tv", "vase", "book", "bed", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2}}, {"image_id": 542792, "caption": "The image shows a city street scene with a large baseball bat sticking out from the side of a building. The bat is prominently displayed, making a bold statement. It stands next to a red brick building with several tall windows and a few cars parked nearby. In addition to the bat, there are a few other vehicles in the scene, including a couple of trucks and a car parked a little further away. The picture conveys the sense of playfulness and sportsmanship associated with baseball, drawing attention to the iconic object.", "mscoco_hallucinated_words": [["car", "car"], ["truck", "truck"], ["car", "car"]], "mscoco_gt_words": ["person", "baseball bat", "sports ball"], "hallucination_idxs": [47, 71, 74], "metrics": {"CHAIRs": 1, "CHAIRi": 0.75, "Recall": 0.3333333333333333}}, {"image_id": 293703, "caption": " The image portrays a group of three men sitting at a dining table in a restaurant, smiling as they enjoy their meal. They are surrounded by pizza slices placed on two plates, and each man has a glass of water. There are two more chairs in the restaurant, one placed closer to the left side of the table, and the other near the right side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "cup", "knife", "dining table", "bowl", "bottle", "fork", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4444444444444444}}, {"image_id": 318777, "caption": " The scene takes place in a large sports stadium with several baseball players on the field during a game. Many baseball players are on the field, spread out across various positions. A baseball catcher is positioned behind the home plate, while the pitcher can be seen at the middle of the field. There are multiple other players in the background, some more closely and some more distantly placed from the action on the field.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["chair", "sports ball", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 14773, "caption": " The image features a table topped with a piece of food resembling a car, with hot dogs on it. The hot dogs are arranged in an artful presentation, creating a unique dining experience. Various beverages and cups are positioned around the table, including two cups placed closer to the food and a bottle or cup further away. It appears that the table is occupied by several plates and beverage containers, suggesting that the area is a busy and active dining location.", "mscoco_hallucinated_words": [["container", "bowl"]], "mscoco_gt_words": ["cup", "dining table", "spoon", "bottle", "fork", "hot dog", "car", "person"], "hallucination_idxs": [75], "metrics": {"CHAIRs": 1, "CHAIRi": 0.09090909090909091, "Recall": 0.625}}, {"image_id": 521863, "caption": "According to the picture, a person on a skateboard is skating in a public area where there are people all around him, walking and socializing. The scene shows a typical city street with various people standing and interacting with one another. Some are closer to the skateboarder, while others are farther away.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "skateboard", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 250540, "caption": " In the scene, there are five people riding three motorcycles down a street. They are lined up from left to right, with two motorcycles positioned closely together, followed by a third motorcycle to the right of them. One of the motorcycles appears to be on the left side of the group, while the others are on the right side. All three motorcycles have riders on them.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "motorcycle", "cell phone", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 211604, "caption": "According to the picture, there is a female tennis player lying on the floor, resting after play. The woman is wearing a white shirt and shorts. She is holding a tennis racquet, which can be seen close to her right side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 298067, "caption": "A plate topped with several pieces of food is next to a bottle, creating a combination of a meal with a beer. The plate contains chicken and beer bottle holds a beer, likely a beer from a well-known brand like Kingfisher. The beer seems to be a popular beer brand in the region where the dish was prepared.", "mscoco_hallucinated_words": [["kingfisher", "bird"]], "mscoco_gt_words": ["bowl", "cup", "dining table", "bottle"], "hallucination_idxs": [43], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.25}}, {"image_id": 563938, "caption": "According to the image, there are three people sitting on a couch in a living room-style space. One man and one woman are sitting together, while the other man is seated a little further back from them. There are some dining chairs in the scene, with one towards the left side of the image, two on the right side, and one towards the middle of the room. The woman appears to be wearing sunglasses, while the other two individuals do not have sunglasses on.", "mscoco_hallucinated_words": [["chair", "chair"]], "mscoco_gt_words": ["bench", "tie", "bottle", "person", "couch"], "hallucination_idxs": [45], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.4}}, {"image_id": 461311, "caption": " The image features a large field filled with people enjoying themselves during the day. Several kites are flying in the sky, giving an impression of freedom and enjoyment. The people are scattered throughout the field, engaged in various activities or simply enjoying the surroundings.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["kite", "handbag", "car", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 47940, "caption": "The image displays a person sitting on a bed, surrounded by various books, including three on the bed and one on the floor next to them. The person's feet are also visible, resting on a pair of blue jeans.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["book", "laptop", "bed", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 261318, "caption": "In the scene, a man is sitting on a chair while talking on his cell phone. He appears to be outside, possibly on a patio or a courtyard area. A bench can be seen close to the man, and multiple suitcases are placed around the area, indicating a travel or travel-related scenario. There are two additional chairs nearby, one of which is located to the right of the man. The chairs and bench suggest a seating arrangement for a group of people, perhaps while waiting for transportation.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["suitcase", "handbag", "bench", "cell phone", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8333333333333334}}, {"image_id": 134768, "caption": "The image depicts a combination of a keyboard and speakers, which could be considered a set-up for a computer or sound system. The keyboard is positioned to the left of the speakers and is placed on top of a wooden table or desk. The speakers are positioned closer to the right side of the image, taking up a significant portion of the scene. The computer's keyboard has a mouse nearby it.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["keyboard", "laptop", "mouse", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 526414, "caption": "A woman and a cat are shown working in a sewing workshop, which appears to be situated in the living room of a house. The woman is seated at the desk and using a sewing machine, while the cat is in the background. The desk and surrounding areas are filled with various sewing tools, including a few scissors, a couple of pins, and several books. The entire room is full of various items such as a cell phone, a television set, and numerous books, which add to the clutter and create a cozy living space. There is also another laptop nearby, contributing to the workshop environment. This photo captures a typical crafting workshop inside a living room, filled with creative tools and everyday items.", "mscoco_hallucinated_words": [["woman", "person"], ["woman", "person"], ["cell phone", "cell phone"]], "mscoco_gt_words": ["remote", "cup", "tv", "dining table", "bowl", "laptop", "cat", "book", "scissors", "couch"], "hallucination_idxs": [1, 27, 84], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6}}, {"image_id": 273132, "caption": " The image features an old woman sitting on a park bench, likely in a park or a public area. The bench appears to be placed outside in a setting with a chain link fence visible in the background. The woman appears to be dressed in a hat and a jacket, possibly to protect herself from the weather or shade from the sun. A hand can be seen in the picture, possibly belonging to the woman or someone else in the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 505213, "caption": " The scene depicts a man wearing a jacket, sitting outside near a bench, surrounded by a few buildings in the background. He is holding up a hot dog, which is a large sausage. There are several other hot dogs, including one in his hand, and one partially visible on the right side of the scene. Some hot dogs have ketchup or other condiments on them, adding flavor to the meal. ", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["sandwich", "hot dog", "person"], "hallucination_idxs": [13], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 192656, "caption": "The image depicts a baseball field with several people standing in various positions. The main focus is a man wearing a baseball glove, throwing his hands up in the air as he appears to be making an exciting catch during the game. There are at least eight individuals in the scene, most likely players, coaches, or fans.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sports ball", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 272728, "caption": "The image features a vase filled with various kinds of flowers, creating a visually appealing arrangement. The flowers include pink and purple ones, arranged in different parts of the vase. Additionally, there are two small pink flowers, with one located at the center of the arrangement and the other at the top. The vase is situated on a table, enhancing the aesthetic appeal of the bouquet. The table is placed close to a window, providing ample sunlight to the flowers, thereby emphasizing their vibrant colors.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["vase", "potted plant", "chair", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 509811, "caption": " The image depicts a large, open room filled with numerous tables, all with multiple laptops placed on them. There are several laptops neatly arranged on the left side of the room, creating a setup for presentations or workshops.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "laptop", "apple", "book", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 412407, "caption": "According to the picture, there are multiple people in the scene, each one holding a cell phone in their hands. One cell phone is positioned in the foreground, and it appears to be open, showing a black screen with a dark area on the upper-left corner. Another cell phone is further back, also open and displaying the same dark area. The remaining cell phones seem to be in various positions, some smaller and others larger.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cell phone", "clock", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 353658, "caption": "The scene takes place on a wet and chilly day as a woman is walking down the sidewalk under a large, colorful umbrella. She appears to be holding a handbag in one hand, and the other hand holds the umbrella. The woman is surrounded by other people on the sidewalk, creating an active and bustling atmosphere. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "suitcase", "handbag", "umbrella", "carrot", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 441363, "caption": "The image depicts a tennis player on a tennis court, leaping up in the air while striking the ball. There are several people around the court, with one sitting near the center, another in the background on the right side, and two more towards the left edge. A clock, possibly indicating game time, can be seen on the right side of the image. ", "mscoco_hallucinated_words": [["clock", "clock"]], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [54], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 331084, "caption": " In the image, there is a dining table laden with various pizzas, several onions, and a knife. There are at least three people in the picture, likely family members, who are busy preparing the pizzas for a meal. Some of the pizzas are topped with onions and are spread across the table, while others are placed closer to the edge. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "knife", "dining table", "bowl", "spoon", "carrot", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5714285714285714}}, {"image_id": 282553, "caption": "According to the picture, a dirt road leads to a hill, with a couple of people walking along the trail at dusk. The woman in front seems to be looking in the direction of the camera, while the other person appears to be closer to the right of the image. There is a green pasture and some trees visible behind the dirt road, creating a rustic setting. The grassy area is quite long, stretching all the way to the horizon.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cell phone", "handbag", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 328818, "caption": " In the image, a woman is leaning on a park bench and putting on her shoes. There are two bicycles in the scene, one towards the left side and another one towards the right side of the picture. The woman appears to be the main focus of the image as she is standing in the center. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "bench", "person", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 3103, "caption": "In the image, a man rides his skateboard on top of a large wooden ramp, appearing to be in the middle of his trick. He wears knee pads for protection while skating. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "skateboard", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 188136, "caption": "This image features a spacious kitchen area with a table, two sinks, and several metal buckets of various sizes. The metal buckets are placed on the left and right sides of the kitchen counter, while the table is situated at the center of the scene. There are multiple apples and a knife on the table, which appear to be fruit that has been sliced or chopped. Additionally, there is a refrigerator present in the room, but it is difficult to determine its location without more context. Overall, the space gives off a clean and organized appearance with plenty of fruit for preparation.", "mscoco_hallucinated_words": [["apple", "apple"], ["knife", "knife"], ["refrigerator", "refrigerator"]], "mscoco_gt_words": ["dining table", "bowl", "orange", "sink", "carrot", "person"], "hallucination_idxs": [53, 56, 78], "metrics": {"CHAIRs": 1, "CHAIRi": 0.42857142857142855, "Recall": 0.3333333333333333}}, {"image_id": 449675, "caption": "According to the image, there is a white plate topped with a sandwich, sitting on a table in a dining area. There is also a knife and a fork positioned next to the sandwich on the plate. Surrounding the plate are various other tableware items such as cups and a bottle, further emphasizing the dining setting. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "knife", "dining table", "spoon", "bottle", "sandwich", "fork", "carrot"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 191691, "caption": "According to the picture, a man is shaving his face in the bathroom while standing in front of a sink. The sink is a tall, white one with a bottle placed close to it on the left side.", "mscoco_hallucinated_words": [["sink", "sink"], ["sink", "sink"]], "mscoco_gt_words": ["toothbrush", "person", "bottle"], "hallucination_idxs": [20, 23], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.6666666666666666}}, {"image_id": 89378, "caption": " The image depicts a man standing next to a motorcycle with a large building behind him. The motorcycle appears to be red, and it is parked near a red building with large doors. The building is located at the left side of the image, while the motorcycle is positioned closer to the middle. There are no other people in the scene, and there is only one motorcycle visible. The scene appears to be an outdoor setting with several outbuildings and a truck parked nearby.", "mscoco_hallucinated_words": [["truck", "truck"]], "mscoco_gt_words": ["bicycle", "motorcycle", "backpack", "person"], "hallucination_idxs": [88], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.5}}, {"image_id": 99874, "caption": " In the image, a woman with sunglasses on is sitting in a dining area or restaurant, holding a cell phone to her ear. She is the main focus of the scene. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "dining table", "bottle", "cell phone", "cat", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 437895, "caption": "The image features a colorful blanket on a bed, giving it a cozy and cheerful look. There is a single pillow on the bed, and an assortment of pillows is placed beneath the blanket. The bed is located in a corner of the room, making it a central feature. The pillows, along with the blanket, provide a comfortable and decorative setting for the bed.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["vase", "potted plant", "bed"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 557447, "caption": " The image shows a man in a business setting sitting down at a table and placing his hand on a document. He is wearing a suit and tie, and the focus of the scene is the hand gesture he is making. There are two books in the background, one on the left side of the room and the other on the right. The man seems to be giving a presentation or sharing information as he leans over the document.", "mscoco_hallucinated_words": [["book", "book"]], "mscoco_gt_words": ["knife", "dining table", "tie", "spoon", "fork", "person"], "hallucination_idxs": [47], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.5}}, {"image_id": 159979, "caption": " In the image, there is a group of people gathered around a row of small boats floating on the water. The boats appear to be for sale, with several market umbrellas placed above them, providing shade from the sun. One large umbrella covers a significant portion of the scene, while smaller umbrellas are positioned on the boats themselves. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "boat", "umbrella", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 357, "caption": " A professional baseball game is taking place on a field in the middle of an intense competition. The image captures the baseball game from above, showcasing a series of players standing on the field during the game.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["baseball bat", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 538891, "caption": "The image features a woman standing on a tennis court, with a tennis racket in her hand. She is looking forward and appears to be ready for a serve. The ball is placed closer to the left side of the court, indicating the likely location of the serve. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 561967, "caption": " Sighting a train on a railway track, it has multiple wagons and a colorful combination of orange, blue, and black paint on the cars. The train is traveling from the right side of the picture towards the left, pulling numerous cars behind it. There is another train visible at the bottom left side of the image, traveling along a different set of railway tracks. The scene showcases the train moving on the tracks, carrying a variety of products to their destinations.", "mscoco_hallucinated_words": [["orange", "orange"]], "mscoco_gt_words": ["traffic light", "train", "car"], "hallucination_idxs": [17], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.6666666666666666}}, {"image_id": 565778, "caption": "In the scene, a blue train is seen on the train tracks coming towards the viewer. The train appears to be passing through a tunnel, which is situated at the back of the image. This train is the main focus, with several cars positioned behind it.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "train", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 7125, "caption": " The image features a person riding a bicycle on a paved bike path with a couple of other bicycles nearby. There is a bridge in the scene, possibly indicating a path over water. The path is lined by several trees, with at least four trees positioned along the route. Additionally, there are other people walking or riding bicycles on the path as well. Some people can be seen further down the trail, while others are closer to the main rider.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "dog", "bottle", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 221350, "caption": " In the image, an older man is holding a red tie and appears to be adjusting it. The tie is positioned over his white shirt, and his hands can be seen as he makes the adjustment. Behind the man, there is a large chair placed in the center of the room. Additional chairs are placed in the back right and left corners of the room.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "dining table", "tie", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 280930, "caption": " A young girl, possibly around six years old, is standing in a kitchen, holding a tray of food. She is excited, smiling and laughing while the tray of food, possibly consisting of cookies, is on the oven. The girl is wearing blue and seems to be delighted while her mom stands near the fridge.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "oven", "refrigerator", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 151081, "caption": "The image depicts an open field with a large cloud of smoke hanging overhead. There are two vehicles, an aircraft and a tractor, that are visible. The aircraft is an airplane, which appears to be flying in the sky, while the tractor is stationary, traveling alongside the road in the same direction. The landscape is spacious, with the open field occupying the majority of the scene. The vehicles and smoke cloud are in the center of the field, positioned along the road.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "airplane", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 122232, "caption": "In a tennis court setting, a person is standing and holding a tennis racket, possibly playing the sport. There are a number of tennis balls on the court, scattered around the area.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 134223, "caption": "The image features a hot dog and fries on a plate, which is surrounded by several condiments and dipping sauces. There are multiple dip bowls, including ketchup, mustard, and relish. Among the food items on the plate, a drink can also be seen in the background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "hot dog", "broccoli", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 198163, "caption": " In the image, there is a man sitting in front of a display of numerous cups filled with various colors of paint. The cups are arranged in a seemingly random way, covering the majority of the space in the picture. The man appears to be focused on the task at hand, painting the cups. ", "mscoco_hallucinated_words": [["cup", "cup"], ["cup", "cup"], ["cup", "cup"]], "mscoco_gt_words": ["bowl", "person", "dining table"], "hallucination_idxs": [16, 25, 58], "metrics": {"CHAIRs": 1, "CHAIRi": 0.6, "Recall": 0.3333333333333333}}, {"image_id": 3742, "caption": "In the image, there is a vase filled with multiple apples arranged in a decorative way on the surface. The vase is placed near a window where there is a pink bowl full of red roses. There is another bowl placed on the windowsill filled with multiple apples, creating a visually appealing display.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "vase", "apple", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 217593, "caption": "According to the picture, there are multiple people, possibly ball players, gathered in a sand pit area. The main child wearing a baseball uniform appears to be playing baseball while another young player in the same uniform stands nearby. A baseball glove and a baseball are also visible on the field. The presence of several individuals in the image suggests a team event or a fun gathering.", "mscoco_hallucinated_words": [["ball", "sports ball"]], "mscoco_gt_words": ["handbag", "chair", "baseball bat", "person", "baseball glove"], "hallucination_idxs": [11], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.4}}, {"image_id": 279078, "caption": "The image depicts a young boy standing in front of an open refrigerator filled with numerous bottles of various types and sizes. There are multiple bottles of different colors placed throughout the refrigerator, creating a visually striking arrangement. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "refrigerator", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 572081, "caption": " The image captures a tennis tournament taking place on a blue court. Two men are standing in the middle of the court holding tennis rackets, awaiting their opponents in a competitive match. Another person is standing closer to the left side of the court, possibly as a ball boy or referee. Several other people can be seen in the scene, possibly as spectators or participants in other games. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "chair", "sports ball", "person", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 477016, "caption": "In the image, there is a room with two women and a young girl standing in it. All three individuals seem to be enjoying their time together. One of the women is wearing a pink shirt while the other has a handbag.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "oven", "backpack", "potted plant", "sink", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 301266, "caption": " The image shows a group of people, likely a family or a group of friends, gathered around a kitchen counter in a store. They are engaged in a conversation while they work together, picking and placing various items in different baskets, including bowls and cups. Some of the items are being added into baskets, while others are being removed or put back. There are multiple baskets in the scene, some containing food items and others containing different kitchen utensils.", "mscoco_hallucinated_words": [["bowl", "bowl"]], "mscoco_gt_words": ["cup", "dining table", "spoon", "book", "fork", "sink", "person"], "hallucination_idxs": [47], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.2857142857142857}}, {"image_id": 475813, "caption": "The image features a young child in a room, dressed in an outfit with a top and bottom. The child is standing in the center of the room, holding an object in their hands and looking up towards the top of the image. The room appears to have various items scattered around, including a couch placed near the center, chairs in the background, and a bottle towards the right side. There are also a few other objects placed at different locations within the room.", "mscoco_hallucinated_words": [["couch", "couch"], ["chair", "chair"], ["bottle", "bottle"]], "mscoco_gt_words": ["remote", "handbag", "person"], "hallucination_idxs": [59, 65, 72], "metrics": {"CHAIRs": 1, "CHAIRi": 0.6, "Recall": 0.3333333333333333}}, {"image_id": 558784, "caption": "According to the picture, there is a man standing near a river, holding a baseball bat. He appears to be standing in a grassy area next to the water. There are several other people scattered around the area, some closer to the river, and some further away. The man might be on a baseball field or a park, possibly about to engage in a game or just enjoying a day outdoors.", "mscoco_hallucinated_words": [["baseball bat", "baseball bat"]], "mscoco_gt_words": ["backpack", "frisbee", "person"], "hallucination_idxs": [16], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.3333333333333333}}, {"image_id": 308543, "caption": " In the image, a person is holding two silver spoons in their hand. The spoons appear to be symmetrical, with one spoon slightly to the left and the other one slightly to the right. The person is likely grasping onto the spoons with their hands.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sink", "spoon", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 359164, "caption": " In the image, there are several small dogs playing in a grassy area. The dogs are placed around a backyard and seem to be walking and exploring the outdoor space. There are two dogs on the right side of the image, one closer to the middle of the grassy area, and the other near the left edge. A third dog is located near the middle of the scene, closer to the right side. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["car", "chair", "dog", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 539951, "caption": "The image showcases a lego kitchen with checkered flooring. The kitchen is decorated with a refrigerator, an oven, and a microwave, which are placed near each other. There are several bottles on the countertop, creating a chic, sleek look. The countertop appears to have black and white checkers throughout, which gives the kitchen a modern, chic appeal. The kitchen is set against a backdrop of a checkered floor, enhancing the overall visual appeal of the lego kitchen setup.", "mscoco_hallucinated_words": [["microwave", "microwave"], ["bottle", "bottle"]], "mscoco_gt_words": ["sink", "clock", "oven", "refrigerator"], "hallucination_idxs": [23, 35], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.5}}, {"image_id": 143479, "caption": " The scene depicts a large gathering of people, possibly attending an outdoor party or event. Several people are spread out throughout the area, with some standing and others sitting on the grass. The focus of the scene is an elderly man, who appears to be an important figure among the attendees, holding a tray with a variety of food, including several hot dogs.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["hot dog", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 475024, "caption": "The image depicts a vase placed on a wooden surface under a mirror. The vase contains a variety of flowers arranged in the arrangement, showcasing a mix of orange and other colors. The mirror is prominently displayed behind the vase, reflecting the arrangement and the room's lighting.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["vase", "potted plant", "orange", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 238065, "caption": "The image shows a young man riding on a skateboard down a ramp while performing tricks in the air. He is balancing on the top edge of the ramp while riding the skateboard. There are two other skateboards visible in the scene, one closer to the man and another at a further distance, both parked near the base of the ramp.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skateboard", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 314016, "caption": "According to the picture, there is a bathroom sink with toilet paper on a shelf above it. The sink is accompanied by various items such as books, wine, and other items. A book is positioned on the sink and another book is placed near it, while other items are scattered around the bathroom. The bathroom itself appears to be of a decent size, providing ample space for the sink and the other items present.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sink", "book", "toilet", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 322163, "caption": " The image depicts a man in a large kitchen, actively preparing food in the middle of the room. He appears to be placing something on a pizza as he is in the process of making it. In the background, there are two other people standing near the kitchen's edge, with one person close to the left side and the other closer to the right side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "pizza", "oven", "dining table", "bowl", "potted plant", "bottle", "microwave", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2222222222222222}}, {"image_id": 230983, "caption": "The image features a person on a skateboard, captured in mid-air while jumping over an object. The person appears to be leaping over a green plastic bottle.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["skateboard", "car", "person"], "hallucination_idxs": [28], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 564629, "caption": "According to the image, there is a kitchen area that has wooden cabinets with different sizes throughout. There are several appliances, including a stove, a microwave, and a sink. Some knobs can be found on the cabinets and appliances. The image showcases a clean and well-organized kitchen with modern appliances. There are no people or other furniture in the scene, giving it a sense of organization and neatness.", "mscoco_hallucinated_words": [["person", "person"]], "mscoco_gt_words": ["sink", "microwave", "oven", "refrigerator"], "hallucination_idxs": [61], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.75}}, {"image_id": 53431, "caption": "In the image, a man is sitting outside a building, possibly waiting for someone or enjoying the day. He is wearing a tie and a white shirt. He has a laptop open in front of him on the ground, using it. The image suggests a moment of relaxation, with the man engaging in an outdoor activity.", "mscoco_hallucinated_words": [["tie", "tie"]], "mscoco_gt_words": ["laptop", "chair", "person"], "hallucination_idxs": [25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 441840, "caption": "The image captures a typical baseball game, featuring a young boy in a baseball uniform who is swinging at a baseball, held in his hand. There are other people present in the scene, such as an adult who is standing near the boy, likely an instructor or a teammate. The background is filled with grass, creating a lush environment for the game to take place. Overall, the scene portrays an active and lively atmosphere, as the young player attempts to hit the baseball successfully.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "car", "baseball bat", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2}}, {"image_id": 53465, "caption": " The image features a large sheet cake decorated with pink frosting that is prominently displayed on a table. The cake features a message written in the frosting that says, \"welcome to Singapore.\" There is a fork and a spoon situated close to the cake, possibly for serving.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "dining table", "bowl", "spoon", "cake", "fork", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5714285714285714}}, {"image_id": 381031, "caption": " An image portrays a food scene laid out on a table top, containing a variety of food items. The table is covered in a mixture of ingredients, including an apple, oranges, a bagel with cream cheese, a knife, and a few bottles. The different bottles may contain beverages like juice or milk. Some of these items appear to be part of a healthy breakfast, like the apple and oranges. The knife might be used to cut up the apple and the bagel with cream cheese. ", "mscoco_hallucinated_words": [["orange", "orange"], ["knife", "knife"], ["orange", "orange"], ["knife", "knife"]], "mscoco_gt_words": ["cup", "dining table", "spoon", "bottle", "apple", "microwave", "donut"], "hallucination_idxs": [34, 43, 78, 81], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3076923076923077, "Recall": 0.5714285714285714}}, {"image_id": 530520, "caption": " In the image, a group of four men, including two seated and two standing, sit outside on the grass, relaxing and enjoying their time together. One man is using a laptop as he sits, while the other man in the middle is leaning back on his hands. All the men are wearing different styles of pants.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "motorcycle", "laptop", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 539555, "caption": "According to the picture, a man in the center is catching a frisbee and performing a trick while holding it in his hand. Around him, there are several other people, some of whom are watching the performance. In the background, there are two frisbees, one placed higher up and the other lower, showcasing the frisbee that the man is holding.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skateboard", "frisbee", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 279521, "caption": " In the image, there are two people standing near a public transit train. The train appears to be a red, white, and black train, with a side view of the train in the picture. The two people are positioned on the sidewalk next to the train, observing or waiting for it. There are other pedestrians in the scene as well, with two individuals to the right of the train and another one closer to the train's back end. A backpack and a purse can be seen in the scene, located near the people and the train.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bus", "handbag", "train", "car", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 560439, "caption": " The image captures a snowy winter scene where a group of three people, all wearing snowboards, are traveling down a snow-covered hill. They are in the middle of a path, enjoying the snowy adventure together. One person is closer to the right side of the path, another person is in the middle, and the third person is on the left side of the path.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "snowboard", "backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 248353, "caption": " The scene is set at a ski slope where people are enjoying their skiing activities. Several people can be seen wearing snow skis, either standing or skiing down the slope. There are a total of nine people in the image, spread out across the slope, with some skis closer to the bottom of the slope.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skis", "person", "snowboard"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 9628, "caption": " The image depicts a person, possibly a man, riding a water ski behind a boat. They are being pulled by the boat, with the person being suspended in the air. The water ski appears to be in a high position, suggesting they are performing a jump or trick while being towed.", "mscoco_hallucinated_words": [["ski", "skis"], ["ski", "skis"]], "mscoco_gt_words": ["surfboard", "boat", "person"], "hallucination_idxs": [13, 37], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2857142857142857, "Recall": 0.6666666666666666}}, {"image_id": 330050, "caption": "According to the picture, a man and woman are standing next to each other on a tennis court. They are both holding tennis racquets and appear to be enjoying the moment together. The woman is wearing a black top, and the man is wearing a white shirt and a black hat. There is a net in the background of the image.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 254161, "caption": " In this scene, a woman is flying a large kite outdoors, possibly in a city park. The kite is prominently displayed in the sky, with its string running along the entire park area. The park is filled with people, some of whom are engaged in outdoor activities while others are just milling about, enjoying the day.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "kite", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 275749, "caption": "According to the picture, the scene involves a group of people gathered in a room, likely sitting on a couch or chairs, near several beverage bottles and a TV monitor. A man and two women can be seen conversing and interacting, with one person on the left wearing a black top and the others standing on the right side. The TV monitor is placed on the left side of the room. There are three computer screens visible, two in the center of the room and one further back on the right side. A chair is positioned to the right of the back left monitor, and another chair is located on the left side of the image, close to the bottles.", "mscoco_hallucinated_words": [["couch", "couch"]], "mscoco_gt_words": ["cup", "tv", "dining table", "laptop", "mouse", "bottle", "chair", "person"], "hallucination_idxs": [21], "metrics": {"CHAIRs": 1, "CHAIRi": 0.0625, "Recall": 0.625}}, {"image_id": 163518, "caption": " The image depicts a woman working in a kitchen, standing in front of a table filled with assorted salads. The table has numerous small cups, each holding a salad in it. The salads are likely prepared meals, and the woman appears to be placing them into the cups. ", "mscoco_hallucinated_words": [["cup", "cup"], ["cup", "cup"]], "mscoco_gt_words": ["oven", "dining table", "bowl", "chair", "person"], "hallucination_idxs": [26, 52], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.4}}, {"image_id": 156696, "caption": "The image features a large train at a train station, positioned to the left of the platform. There are two smaller train cars on the right side, partially covered by the larger train.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "train", "handbag", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 525202, "caption": "The image features a pizza placed on top of a red and white tablecloth, likely in a restaurant or dining area. There are several people present in the scene, spread out throughout the table and nearby chairs. A few of them are holding food, such as a bottle of wine and an apple. ", "mscoco_hallucinated_words": [["apple", "apple"]], "mscoco_gt_words": ["pizza", "cup", "knife", "dining table", "spoon", "bottle", "fork", "chair", "person"], "hallucination_idxs": [57], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.5555555555555556}}, {"image_id": 423984, "caption": " The image features a plate of food topped with carrots, broccoli, and bread. Among the vegetables, several carrots can be seen, dispersed around the plate, with some located closer to the edges. The broccoli is also dispersed, with a few smaller florets closer to the center of the plate. In addition to the vegetables, the meal includes several small-sized potatoes on the side, scattered throughout the plate. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["carrot", "broccoli", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 499486, "caption": "The scene features two cats sitting on top of a unmade bed. They appear to be relaxing, resting on the sheets and blankets covering the bed. One of the cats is closer to the left edge of the bed while the other is slightly more towards the center of the image. The cats are placed in the middle of the bed and occupy a significant portion of it. There is another cat visible in the background, but it's not sitting on the bed.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["book", "cat", "bed"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 118124, "caption": "The image features a man wearing glasses, a blue tie, and a plaid shirt. He is sitting in a car with his left hand raised, possibly taking a selfie or holding something. The car's interior is relatively empty, with the exception of a few items like a cell phone, placed near the right edge of the scene. The man appears to be the only person present in the car.", "mscoco_hallucinated_words": [["cell phone", "cell phone"]], "mscoco_gt_words": ["tie", "person", "car"], "hallucination_idxs": [54], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 1.0}}, {"image_id": 288729, "caption": " The image features a large cake that resembles a circus tent. The cake is adorned with colorful frosting and decorations, with a red and white theme. A toy teddy bear is positioned at the entrance of the cake, surrounded by smaller items, such as a toy car and a toy train. The entire display seems to be set up as if it was inside a circus tent.", "mscoco_hallucinated_words": [["car", "car"], ["train", "train"]], "mscoco_gt_words": ["teddy bear", "bear", "elephant", "cake", "person"], "hallucination_idxs": [51, 55], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.4}}, {"image_id": 177721, "caption": " In the image, a person is performing a trick on a skateboard. They are balancing themselves as they perform the trick, which captures the exciting moment on the ramp. There are multiple people in the scene, some of them standing close to the ramp while others can be seen further away. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skateboard", "person", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 372220, "caption": " The image captures a man with gray hair and a beard wearing a black and white suit, black hat, and a tie. He is using a cell phone, likely texting or browsing the internet. The surrounding area is occupied by a few other people, including one person near the man and the others more in the background. ", "mscoco_hallucinated_words": [["tie", "tie"]], "mscoco_gt_words": ["cell phone", "horse", "handbag", "person"], "hallucination_idxs": [23], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.5}}, {"image_id": 252625, "caption": "This is a photo capturing a small street or alley lined with many shops on the right side and a sidewalk on the left. The street has a few parked bicycles, one near the sidewalk and another further down the street. The sidewalk has a few small plants and a handful of people walking, two closer to the middle and two more dispersed near the edges. Some of the shops have signs and advertisements visible in the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "potted plant", "person", "bicycle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 493610, "caption": "The scene features a large truck or van with a flat top, occupied by a group of people. They are riding inside the vehicle, which appears to be an old school bus, possibly en route to their destination. Some people are seated, while others are standing or walking around the bus, which has an open top, giving them a chance to feel the air and enjoy the outdoors. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "person", "bus", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 302789, "caption": "A small room is depicted with a bed sitting against the wall. A suitcase is placed beside the bed, with an empty bottle next to it. A sink is located nearby, close to the bed and to the right of the room. The room also includes a small kitchen, with a refrigerator and an oven to the left of the bed. The refrigerator is at the top-right corner of the image, while the oven is situated at the lower-left portion. The kitchen also has a microwave, with a pane on the top-left corner and another pane below it.", "mscoco_hallucinated_words": [["oven", "oven"], ["oven", "oven"], ["microwave", "microwave"]], "mscoco_gt_words": ["suitcase", "handbag", "refrigerator", "backpack", "bottle", "sink", "bed"], "hallucination_idxs": [60, 81, 94], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.7142857142857143}}, {"image_id": 30667, "caption": "According to the picture, there are two people sitting in the green booth, with a dog occupying one of the seating areas. The dog is positioned on the left side of the booth, while the people are seated on the right side. Among the group, there are two books being read, one on the right side and the other on the left side. ", "mscoco_hallucinated_words": [["book", "book"]], "mscoco_gt_words": ["person", "boat", "dog"], "hallucination_idxs": [54], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 114108, "caption": " The image features several people seated around a table, some of them focused intently on the two slices of pizza placed on the table. They appear to be enjoying their meals, possibly discussing something while dining.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "cup", "knife", "dining table", "spoon", "fork", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.375}}, {"image_id": 328110, "caption": "According to the picture, there is a woman sitting on top of a park bench with a baby in her arms. The baby is held tightly, creating a sense of comfort and love. There are several other people in the background, walking around or milling about the area. They are dressed in various clothing styles, adding some variety and contrast to the scene. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "handbag", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 366111, "caption": " This image appears to be a vintage or black and white photo featuring several people riding bikes. There are six bicycles present, with at least one person on each one. They are spread across the scene, with two riders on the left side, two in the middle, and one on the right side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "car", "person", "bicycle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 532481, "caption": "The image features people enjoying a day on the water, with a man being pulled by a parasail in the middle of the ocean. The parasail is flying over a group of people standing on boats, watching the exciting scene.", "mscoco_hallucinated_words": [["boat", "boat"]], "mscoco_gt_words": ["person", "surfboard", "kite", "car"], "hallucination_idxs": [37], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.25}}, {"image_id": 291634, "caption": "The scene is a sidewalk with a row of motorcycles parked in a row next to the curb. There are also bicycles and a person standing nearby. Some of the motorcycles are parked close to one another, creating a bit of a \"parking lot\" atmosphere. Additionally, there is a handbag visible in the image, possibly belonging to the person nearby. A few bicycles can be seen parked along the curb in the scene, with one close to the middle and the other on the left.", "mscoco_hallucinated_words": [["handbag", "handbag"]], "mscoco_gt_words": ["bicycle", "motorcycle", "person"], "hallucination_idxs": [56], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 1.0}}, {"image_id": 463678, "caption": "The image shows a man on a tennis court swinging his tennis racket in preparation to return a serve, which is the main focus of the picture. The tennis ball is mid-air during the serve, and the man is leaning over, preparing to strike the ball. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 70815, "caption": " There is a dog sitting on the sidewalk near a bicycle, which is the main object in the scene. The bicycle is located in the middle of the street, with a couple of bikes nearby. The main dog has a leash tied around its neck, and there is another dog nearby, but it's not connected to the leash.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "person", "truck", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 367872, "caption": " The scene captures a moment of excitement during a baseball game as a batter swings at a ball on the field. The batter is in mid-swing, and the baseball glove, which has a baseball player, can be seen in the foreground. In total, there are 10 people present on the field, including the batter and the players, with two of them on the field, and the others likely spectators or teammates. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["umbrella", "person", "baseball bat", "sports ball", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 353898, "caption": "The scene shows a person sitting on a cement walkway surrounded by a large group of birds. The birds are scattered throughout the area, with some located close to the person, and others further away. The birds appear to be a mix of pigeons and other species, and they seem to be attracted to the area. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bench", "cell phone", "backpack", "bird", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 88269, "caption": "The image displays a plate with a couple of sandwiches and a bowl of tomato soup next to them. There's a spoon on the plate with the soup, suggesting that it's the main dish in the meal. The soup bowl is prominently placed at the top center of the scene, while the sandwiches take up a lower position. The food appears to be on a dining table or countertop, making it an appropriate meal setting.", "mscoco_hallucinated_words": [["table", "dining table"]], "mscoco_gt_words": ["sandwich", "spoon", "bowl"], "hallucination_idxs": [73], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 1.0}}, {"image_id": 463211, "caption": "There is a close-up view of a young girl in a backpack walking in front of a school bus. Two other children are walking next to her, and they all appear to be on their way to school. The children are in the middle of the image, with the school bus at the top of the picture and the sidewalk on the bottom.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "bus", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 396269, "caption": "In a large and ornate room, there is a vase holding a bouquet of flowers and several chairs surrounding a circular table. The room features elaborate decorative tapestries on the walls, a chair positioned behind the vase, and several other chairs scattered around the room. A beautiful, hand-painted mural can also be seen on the wall.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["vase", "chair", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 98979, "caption": "The image shows a silver microwave oven sitting on a countertop, placed in the center of the scene. The microwave has a button in front, ready to be pressed. Behind it, there are two sinks, one located on the left and the other one on the right.", "mscoco_hallucinated_words": [["sink", "sink"]], "mscoco_gt_words": ["microwave", "knife", "oven"], "hallucination_idxs": [39], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 244455, "caption": " The image features two motorcycles riding on a street, with the riders wearing protective gear such as helmets, vests, and gloves. The motorcycles are positioned side by side, with the one on the right being a little bit closer to the curb. The street appears to be a bit crowded, as there are several cars scattered around the area.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "bicycle", "motorcycle", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 379443, "caption": " According to the picture, it is a close-up of a black material, with a purple plastic cutting shears held in a pair of scissors close to the material. The scissors appears to be held horizontally, creating a fun and playful atmosphere in the image.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["orange", "scissors", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 536154, "caption": "According to the image, there is a small child sitting on a couch or chair amidst a large pile of stuffed animals. The couch is in the background, and the child is the focus of the scene. The stuffed animals are of various sizes and types, spread across the scene, providing a playful and inviting atmosphere. There are many stuffed animals within the pile, with some on the right, left, and middle parts of the image, as well as some slightly off to the right side. The child is sitting in the middle of the pile, which covers most of the image. The stuffed animals are the highlight of the scene, showcasing the vast variety available.", "mscoco_hallucinated_words": [["couch", "couch"], ["chair", "chair"], ["couch", "couch"]], "mscoco_gt_words": ["teddy bear", "cat", "person"], "hallucination_idxs": [13, 15, 25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.3333333333333333}}, {"image_id": 374061, "caption": "The image features a black cat sleeping and resting with its head against a stuffed animal. The stuffed animal resembles a teddy bear. The cat is accompanied by a hand in the image. It appears to be holding a remote control or interacting with the cat. The teddy bear-like stuffed animal is in close proximity to the cat.", "mscoco_hallucinated_words": [["remote", "remote"]], "mscoco_gt_words": ["teddy bear", "bear", "cat", "bed", "couch"], "hallucination_idxs": [42], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.4}}, {"image_id": 560018, "caption": "The image captures a lively ocean scene with a group of people enjoying water sports, such as surfing, in the waves. The water is quite rough with a strong current, causing many waves to form in the area.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["surfboard", "kite", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 353320, "caption": " The image captures a guitar, placed on the ground on top of a bench, as well as a backpack and two handbags lying nearby. The bench is a large wooden surface that extends along the sidewalk and has several suitcases on it, with one on top of the guitar, and the others scattered near it. There are a couple of backpacks to the right of the main suitcase and several handbags, including one close to the guitar and another one a bit further down on the left side.", "mscoco_hallucinated_words": [["backpack", "backpack"], ["backpack", "backpack"]], "mscoco_gt_words": ["handbag", "bench", "suitcase", "person"], "hallucination_idxs": [20, 66], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.75}}, {"image_id": 256628, "caption": " In the image, a large, beautiful golden-colored dog is laying on the wooden floor of a patio, possibly enjoying some rest. The dog's mouth is open, exposing its tongue. Near the dog, there are several chairs and tables, indicating this may be a patio area where people can gather and enjoy their time outside.", "mscoco_hallucinated_words": [["table", "dining table"], ["person", "person"]], "mscoco_gt_words": ["bowl", "chair", "dog"], "hallucination_idxs": [45, 55], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.6666666666666666}}, {"image_id": 158806, "caption": " The image features a person outside, likely in a backyard, with a large dog by their side. The dog is about to eat a sandwich, consisting of two slices of bread on a blue plate. The person seems to be handing the sandwich to the dog, offering it a treat. In the background, there is a bottle that can be seen resting on the ground. A hand is also visible, probably belonging to the person who is interacting with the dog. There are no other people or objects in the scene, except the dog, the person, and the blue plate with the sandwich.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["sandwich", "person", "dog"], "hallucination_idxs": [64], "metrics": {"CHAIRs": 1, "CHAIRi": 0.07142857142857142, "Recall": 1.0}}, {"image_id": 424270, "caption": " The scene features a little girl sitting on a bed in the middle of the image. She is wearing a white outfit and sitting on top of the bed sheets, which are spread out in a relaxed manner. The girl is looking to the right of the image, creating a sense of movement and presence. There are two different beds within the image. The first is a larger one, occupying most of the space in the scene, while the second one is closer to the right side of the image. A dining table is placed on the left side of the image, offering some additional seating options. The scene showcases a cozy, comfortable, and somewhat whimsical atmosphere for a young girl.", "mscoco_hallucinated_words": [["table", "dining table"]], "mscoco_gt_words": ["person", "bed", "tv"], "hallucination_idxs": [101], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.6666666666666666}}, {"image_id": 368335, "caption": "The scene features a large parking lot filled with multiple cars. A horse is positioned in the middle of the parking lot, surrounded by the vehicles. There are several cars parked around the horse, including a black car on the left, a white car on the right, a brown car closer to the left side of the image, and a brown car in the back of the other cars.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "horse", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 483227, "caption": " There is a group of people standing on a tennis court in the image, including at least nine people of various ages. All of them have tennis rackets in hand, ready to play a game of tennis. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 228306, "caption": "The image features a red bus traveling down a street near the water, possibly in a city. The bus is occupying a significant portion of the street, and appears to be positioned in the center of the scene. In the background, there are several other buses parked, with a few closer to the foreground and others farther in the distance.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bus", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 416384, "caption": "According to the picture, there is a woman sitting in a chair and using a cell phone. She is positioned on a large outdoor chair, likely located in a public area like a park. The cell phone appears to be a prominent focus of her attention. Apart from the woman, there are other people present in the scene, some closer to the right side of the image and others more towards the left. A couple of bottles can be seen close to the chairs. There are also two dining chairs placed beside the larger chair the woman is sitting in.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["cell phone", "chair", "person"], "hallucination_idxs": [84], "metrics": {"CHAIRs": 1, "CHAIRi": 0.08333333333333333, "Recall": 1.0}}, {"image_id": 364783, "caption": "The image is a picture of a red bathroom with a toilet, sink, and a tub or shower. The toilet is situated in the left corner, and the sink is in the middle, with a bowl and a spout. The tub or shower is located in the right corner. The bathroom is small and appears to have little space for movement. Various items can be seen around the bathroom, including a cup placed near the toilet and a vase placed near the sink. There is also a book visible near the bathroom's entrance. The bathroom is well lit and clean, featuring the popular color of red.", "mscoco_hallucinated_words": [["bowl", "bowl"], ["vase", "vase"], ["book", "book"]], "mscoco_gt_words": ["toothbrush", "cup", "potted plant", "bottle", "toilet", "sink", "chair"], "hallucination_idxs": [40, 87, 97], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3, "Recall": 0.42857142857142855}}, {"image_id": 335041, "caption": " The image features a woman with skis standing on a snow-filled street, skiing past multiple cars parked alongside it. Some of the cars are visible on the left side of the image, while others are on the right. The street appears to be in a residential area with trees and buildings visible in the background. The woman, who is the focus of the scene, is enjoying the snowy conditions. There is another person in the scene, standing further down the street near the right side of the image. The surrounding cars appear to be parked along the sidewalk, as they are positioned relatively close to the curb. Overall, the scene depicts a snowy winter day with people enjoying the outdoors.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["car", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 531327, "caption": "The image features a white bowl placed on a dining table that is filled with a variety of mixed vegetables, consisting of broccoli, carrots, and other colored vegetables. The bowl is placed near the edge of the table, with a utensil \u2013 a spoon \u2013 positioned next to it. This spoon likely served as a utensil for serving or mixing the vegetables inside the bowl. The table is on a dining room floor, and there are two other silver spoons close to the table, possibly used as utensils as well. Overall, the image portrays a pleasant scene of a colorful, well-assorted plate of mixed vegetables.", "mscoco_hallucinated_words": [["bowl", "bowl"], ["bowl", "bowl"], ["bowl", "bowl"]], "mscoco_gt_words": ["knife", "dining table", "spoon", "fork", "broccoli", "wine glass", "carrot"], "hallucination_idxs": [5, 33, 70], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.5714285714285714}}, {"image_id": 451228, "caption": "The image features a busy street lined with holiday lights and decorations. There are two double-decker buses parked on the street, while a third is passing by. There are several people riding bicycles down the street, and several more people walking or standing on the sidewalk. The scene captures a festive atmosphere on the city street.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "bus", "person", "bicycle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 256365, "caption": " The image features a large plate covered in various blue- and white-frosted cupcakes. The cupcakes are arranged in rows on the plate, with some near the center and others scattered towards the edges. The frosting decoration creates an artistic design, giving the cupcakes a unique appearance.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cake", "cup", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 502936, "caption": "The scene is set at a tennis court in a large outdoor area, likely a park or a public recreational space. The focal point of the image is a young man playing tennis, holding a tennis racket as he is getting ready to hit the ball. The ball, which is located near the left side of the court, appears to be an orange one, representing the heat of the day during this leisurely activity.", "mscoco_hallucinated_words": [["orange", "orange"]], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [68], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 1.0}}, {"image_id": 451449, "caption": "According to the picture, there is a large dog lying in a dog bed, taking up most of the room. The bed is positioned near the corner of a room, underneath some books and other items.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["book", "bed", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 51054, "caption": " The image features an elderly woman sitting on a bed in her bedroom. She appears to be holding a cellphone and talking on it. There are several items in the room, including a bottle on the left, a clock near the middle, and a couple of bottles on the right side. A chair can be seen in the background, with two cushions placed on it. The scene is in black and white, creating an eerie feeling.", "mscoco_hallucinated_words": [["clock", "clock"], ["chair", "chair"]], "mscoco_gt_words": ["cup", "bottle", "cell phone", "bed", "person"], "hallucination_idxs": [42, 58], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2857142857142857, "Recall": 0.8}}, {"image_id": 1268, "caption": "According to the picture, there is a woman standing near the water, possibly taking a photograph or checking her phone. Several people are visible in the scene, likely enjoying their surroundings or socializing. A bench is situated along the water, providing a place for people to sit and relax. A few handbags can also be seen scattered throughout the area, indicating that some of the people might be carrying them.", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["handbag", "cell phone", "backpack", "bird", "boat", "person"], "hallucination_idxs": [39], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.5}}, {"image_id": 62692, "caption": "In the image, a woman is playing tennis on a sports court. She is in action, swinging a tennis racket as she goes for the ball during a game of tennis. The ball is located towards the top of the scene, with multiple other balls visible in various positions. There is a pane of glass on the wall, located on the right side of the image. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 542771, "caption": "The image depicts a spacious room filled with a large array of chairs, dining tables, and a vase. The arrangement features various chairs situated around the room, some of them arranged in a circular manner while others are placed linearly along the walls. There are also two chairs near the middle of the room, and a vase placed prominently on a table. The dining tables are dispersed throughout the room, with one located in the lower right corner and another in the upper left part of the scene. The overall arrangement can be considered an old black and white photograph.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "potted plant", "vase", "book", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 575406, "caption": " In the image, a red bus with a blue top and a white roof is driving along a street next to several other parked vehicles. The street seems to be a parking lot or a place where buses are regularly parked. The bus appears to be the primary focus of the scene. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["motorcycle", "bus", "potted plant", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 244834, "caption": "The scene features a group of people walking down a rainy street. Some of them are holding umbrellas, which offer them protection from the rain. Some umbrellas are plain white, while others have colorful designs.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "clock", "handbag", "umbrella", "potted plant", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2857142857142857}}, {"image_id": 151159, "caption": " There are several people present in the image, with some scattered throughout the scene and others grouped around skateboards. One man is wearing gray pants and has a pair of sneakers, suggesting a skateboarding activity might be taking place. Two benches can be seen in the park, positioned near the edge of the image. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skateboard", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 277073, "caption": "The scene takes place on a street with multiple vehicles moving in different directions. Two men are sitting on the back of a motorcycle, likely enjoying a leisurely ride. One of them, the driver, is wearing a backpack. They are surrounded by cars, with one on the left side of the image, two on the right side, and two others behind the main scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "motorcycle", "car", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8}}, {"image_id": 377879, "caption": "The image features two military tanks, sitting on a large open area, likely a parking lot or a playground. The tanks are parked close together, with one placed at the left side of the picture and the other at the right side. They appear to be the main focus of the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "train", "bed"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.0}}, {"image_id": 436814, "caption": "The image displays a room with various types of technology set up in a small area. There is a computer sitting on top of a table along with a laptop on another table. Several monitors are placed in different locations throughout the room. One monitor can be seen near the left side of the image, while another monitor is situated near the center of the room and a third monitor is located on the right side. ", "mscoco_hallucinated_words": [["monitor", "tv"], ["monitor", "tv"], ["monitor", "tv"], ["monitor", "tv"]], "mscoco_gt_words": ["laptop", "mouse", "chair", "dining table"], "hallucination_idxs": [36, 47, 61, 73], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.5}}, {"image_id": 63934, "caption": "The scene features a person on a skateboard, dressed in black, navigating their way down a sidewalk in a suburban area. They are approaching a parking lot filled with cars, including a few in the foreground and several others scattered throughout the parking lot. The person on the skateboard is the focus of the scene, with their head facing in the direction they are traveling. Additionally, a traffic light can be seen in the background.", "mscoco_hallucinated_words": [["traffic light", "traffic light"]], "mscoco_gt_words": ["truck", "skateboard", "car", "person"], "hallucination_idxs": [75], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.75}}, {"image_id": 275611, "caption": " The image features a small pizza sitting on top of a plate on a dining table. The pizza appears to be a typical Italian-style cheese pizza with a crispy crust and a variety of toppings, including mozzarella, spices, and sauce. The pizza is located in the center of the plate, and its size suggests that it might be served as an individual serving.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "knife", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 516248, "caption": "In the image, a person is sitting at a dining table or desk, placing their hands near a computer keyboard and a mouse. They appear to be interacting with the computer keyboard and mouse while sitting in the dining room. The focus is on the hands of the person, with a couple of other items on the table, such as a cell phone on the right side, and a cup on the left side.", "mscoco_hallucinated_words": [["cell phone", "cell phone"], ["cup", "cup"]], "mscoco_gt_words": ["keyboard", "tv", "dining table", "laptop", "mouse", "person"], "hallucination_idxs": [67, 76], "metrics": {"CHAIRs": 1, "CHAIRi": 0.15384615384615385, "Recall": 0.8333333333333334}}, {"image_id": 478380, "caption": "The image depicts a person sitting in a van surrounded by a large flock of sheep. The person is driving, and the sheep are spread throughout the scene. Some of the sheep are near the van's back side while others are further away, moving down the road. There is a car nearby the van, adding to the scene's context. Overall, the scene appears to be capturing a unique moment of a person driving a van while being followed by a large herd of sheep.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "sheep", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 422326, "caption": "The image depicts a parking lot with a variety of motorcycles parked along the side. There is a three-wheeled motorcycle that stands out, positioned at the left side of the parking lot. Two other motorcycles can be seen further back, as well as a sports car on the right side. There are also a few vehicles parked in the parking lot, including a pickup truck, a car, and a semi-truck, all located towards the right side.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "bicycle", "motorcycle", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 110449, "caption": " The image shows two people standing behind a counter, each holding a remote. They seem to be posing together, with their hands and arms outstretched. The person on the left is holding a remote in their right hand and placing their left hand on the right arm of the other man, while the person on the right is holding a remote in their right hand, leaning towards the counter. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "tv", "dining table", "laptop", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 45550, "caption": " In the image, there is a man standing in a room filled with other people, taking a selfie. He is holding a piece of fried food, most likely fried chicken, and there is a spoon resting near the food. The atmosphere seems relaxed, with a variety of food items on display, including a dish of soup and a fork nearby. The man appears to be the center of attention, either posing for a photo or simply enjoying the food and the company.", "mscoco_hallucinated_words": [["spoon", "spoon"], ["fork", "fork"]], "mscoco_gt_words": ["sandwich", "clock", "person", "bowl"], "hallucination_idxs": [39, 66], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4, "Recall": 0.25}}, {"image_id": 444913, "caption": "According to the picture, there is a man standing in a grassy area, possibly near some buildings, holding a bottle. The bottle appears to be green, and the man is wearing a cap.", "mscoco_hallucinated_words": [["bottle", "bottle"], ["bottle", "bottle"]], "mscoco_gt_words": ["frisbee", "person", "car"], "hallucination_idxs": [22, 25], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.3333333333333333}}, {"image_id": 82709, "caption": "The image features a large field filled with a number of people scattered around, some seated, others standing or walking. In the foreground, a black and white dog is in the process of catching a pink frisbee while the sunlight shines on the scene. The people appear to be enjoying their time in the field, likely participating in various activities or simply watching the dog's playful moment.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "dog", "bottle", "frisbee", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 132796, "caption": "In the image, a group of three people are riding on the back of an elephant, enjoying their time together. Two of the individuals are seated on the elephant, with one person positioned on the left side and the other person on the right side. The third person is walking alongside the elephant. The elephant appears to be occupying the middle of the image, with the people on its back riding together.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bench", "elephant", "backpack", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 303099, "caption": "According to the picture, there are two cats sitting on a toilet in the bathroom. They are placed on the toilet bowl and appear to be looking at something outside of the frame, possibly a TV. One cat is situated on the left side, while the other cat is closer to the right side of the toilet. A few bowls are also placed in the bathroom, one behind the toilet, another near the right side, and the third closer to the left side.", "mscoco_hallucinated_words": [["bowl", "bowl"], ["tv", "tv"], ["bowl", "bowl"]], "mscoco_gt_words": ["cat", "toilet", "bottle"], "hallucination_idxs": [23, 38, 65], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3, "Recall": 0.6666666666666666}}, {"image_id": 271565, "caption": "According to the image, a group of people are riding motorcycles down a paved street. There are multiple motorcycles visible, including some that are parked in a line along the side of the street. Several people are on the motorcycles, including the lead rider. There are some parked motorcycles in front of the group of riders, and other motorcycles are spread throughout the scene. Some of the motorcycles appear to be in motion, while others are stationary.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "motorcycle", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 205193, "caption": " In the image, there are two bowls on a kitchen counter next to each other. One bowl is filled with black tea and the other has green tea. A coffee press can also be seen in the picture. There is a bottle and an opened bottle in the foreground of the image. Near the black tea bowl, there is a spoon lying on the counter.", "mscoco_hallucinated_words": [["bowl", "bowl"], ["bowl", "bowl"], ["bottle", "bottle"], ["bottle", "bottle"], ["bowl", "bowl"], ["spoon", "spoon"]], "mscoco_gt_words": ["toaster", "microwave", "oven", "person"], "hallucination_idxs": [7, 18, 45, 49, 61, 66], "metrics": {"CHAIRs": 1, "CHAIRi": 1.0, "Recall": 0.0}}, {"image_id": 272566, "caption": " The image shows a man sitting on the ground in front of a produce market. In front of him, there is a large pile of fruits, mainly consisting of bananas. There are numerous bananas stacked up, covering most of the pile. In addition, there are some oranges on the left side of the image.", "mscoco_hallucinated_words": [["orange", "orange"]], "mscoco_gt_words": ["cup", "apple", "person", "banana"], "hallucination_idxs": [53], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.5}}, {"image_id": 170099, "caption": " The image features a man standing in front of an orange couch, with his hands out and open, as if he is making a speech or presentation. He is wearing a tie and a red shirt underneath. The couch sits off to the side, and there is a chandelier hanging above it. The room appears to be a conference area or a stage set.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tie", "orange", "person", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 14940, "caption": "According to the picture, there is an orange and white cat sitting next to a stuffed animal or toy, perhaps hugging it. The cat is positioned in the center of the scene and the stuffed animal is right next to it. A pair of teddy bears can be found in the background, one closer to the left side and another closer to the right side. It is likely that these stuffed animals are also part of the playful interaction with the cat and the stuffed animal it is hugging.", "mscoco_hallucinated_words": [["teddy bear", "teddy bear"]], "mscoco_gt_words": ["dog", "cat", "orange", "chair", "couch"], "hallucination_idxs": [48], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.4}}, {"image_id": 408449, "caption": "According to the picture, several people are present in the scene, with one man riding a skateboard and another walking behind him. There is a traffic light in the area, which might indicate a road or crosswalk. Another person can be seen walking to the left side of the scene. A few more individuals are spread throughout the area, some closer to the skateboarder and others further away. Overall, it appears to be a busy city street with various people going about their daily business.", "mscoco_hallucinated_words": [["traffic light", "traffic light"]], "mscoco_gt_words": ["truck", "skateboard", "bus", "handbag", "car", "backpack", "person"], "hallucination_idxs": [28], "metrics": {"CHAIRs": 1, "CHAIRi": 0.14285714285714285, "Recall": 0.2857142857142857}}, {"image_id": 140702, "caption": " The scene features two elephants, an adult and a baby, walking together on an open grassy field. The baby elephant is smaller in size compared to the adult one, but the two are standing close together, with the adult closer to the camera. A bird is also present near the adult elephant, slightly to the left. The grassy area extends across most of the image, except for a small section on the left side. It appears that the adult elephant is guiding or watching over the baby, creating an endearing image of the two creatures walking together.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bird", "person", "elephant"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 354540, "caption": "The image shows a person, likely a woman, sitting at a kitchen countertop with a laptop and a smartphone. The person appears to be using the kitchen island as a workstation, typing on the laptop while also using the phone. A few household items are scattered around the countertop: a toaster can be seen on the right side, a bowl resides in the middle, and a bottle lies on the left.", "mscoco_hallucinated_words": [["bowl", "bowl"], ["bottle", "bottle"]], "mscoco_gt_words": ["toaster", "oven", "laptop", "cell phone", "microwave", "chair", "person"], "hallucination_idxs": [66, 74], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.5714285714285714}}, {"image_id": 19763, "caption": " The image showcases a piece of pizza placed on a white plate on a dining table, accompanied by a fork. The pizza slice appears to be quite large and is partially cut in two. There are two different beverages on the table, likely beer, placed at the edges of the dining table.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "cup", "knife", "dining table", "fork", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.42857142857142855}}, {"image_id": 191301, "caption": "The image features a small, white tray containing two pieces of food \u2013 one dish, possibly consisting of chocolate cake, and the other a square piece of cake, perhaps vanilla. There is also a bottle and spoon nearby on the table. The plate, bottle, and spoon are placed in the center of the table.", "mscoco_hallucinated_words": [["bottle", "bottle"], ["bottle", "bottle"]], "mscoco_gt_words": ["cup", "dining table", "spoon", "cake", "person"], "hallucination_idxs": [39, 50], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6}}, {"image_id": 124236, "caption": " The image showcases a kitchen filled with various items, including a toaster oven sitting on the counter. A green table occupies the room, with multiple bowls and a few books stacked on it. There are a few pots and pans placed around the kitchen, some on the counter and others on the floor. ", "mscoco_hallucinated_words": [["book", "book"]], "mscoco_gt_words": ["toaster", "oven", "dining table", "bowl", "spoon"], "hallucination_idxs": [32], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.8}}, {"image_id": 344930, "caption": "According to the picture, there is a red fire hydrant in the scene on the sidewalk, which is located at the edge of a residential street. The street is lined with cars parked, and a blue cup is visible near the fire hydrant. One of the cars is a truck, which is situated near the left side of the street, and another vehicle is seen further down on the right side of the scene. The fire hydrant stands out in the image, creating a unique scene on the street.", "mscoco_hallucinated_words": [["cup", "cup"], ["truck", "truck"]], "mscoco_gt_words": ["fire hydrant", "orange", "car"], "hallucination_idxs": [40, 54], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2857142857142857, "Recall": 0.6666666666666666}}, {"image_id": 14285, "caption": "In the image, a large, fluffy cat is laying on a bed, resting comfortably in a room. The cat is surrounded by various items in the room, including a TV, which is on and visible. There are three books near the cat, placed strategically in the background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "tv", "cat", "chair", "book", "bed"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 82488, "caption": "There is a hotel room or hotel bedroom setting with two twin beds in it, as indicated by the furniture, with a desk or nightstand on the left and another one on the right. The bed sheets and pillows have an orange and yellow color scheme. The walls feature two paintings with vibrant colors. There are no other items in the scene such as lamps or mirrors.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "clock", "dining table", "orange", "book", "bed"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 348251, "caption": "The image captures a man standing on top of a snow-covered slope, wearing skis and enjoying the winter scenery. In the distance, several people can be seen skiing down the mountain. The skier is wearing a backpack and has a smile on his face as he stands in the middle of the snow-covered field.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 150703, "caption": "The image features a sandwich made with a hot dog and topped with various toppings, such as pickles. The sandwich is accompanied by a beverage, likely a cup of beer, on the left side of the image. There is another cup near the right side of the image, but it seems to be empty. The dining table is the main background for the sandwich and beverage, creating a setting where they are being enjoyed.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sandwich", "cup", "hot dog", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 381615, "caption": "The image showcases a warm, cozy room with a bright window, where a variety of plants are displayed prominently throughout. Plants are placed in several vases, including two near a chair and a couch, and one in the center of the room under the window. There is also a small potted plant placed next to the window. ", "mscoco_hallucinated_words": [["couch", "couch"]], "mscoco_gt_words": ["cup", "potted plant", "vase", "book", "chair"], "hallucination_idxs": [37], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6}}, {"image_id": 352703, "caption": " The scene features a woman preparing food at a dinner party, possibly in an outdoor setting. She is standing in front of a counter, where a bowl of food and a pan are placed. She appears to be stirring the contents with a spoon. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "handbag", "oven", "dining table", "bowl", "spoon", "bottle", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.375}}, {"image_id": 111109, "caption": " The image features a young boy wearing shorts standing on a chair in a room. There is another chair in the background, as well as a couch or sofa nearby. A television is located in the room as well, possibly behind the boy or further in the background. The young boy stands on a chair, likely capturing attention and posing for a photo. The scene is both black-and-white and black and white.", "mscoco_hallucinated_words": [["couch", "couch"], ["sofa", "couch"]], "mscoco_gt_words": ["tv", "tie", "motorcycle", "chair", "person"], "hallucination_idxs": [28, 30], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2222222222222222, "Recall": 0.6}}, {"image_id": 226802, "caption": "A group of people, many of whom appear to be tourists, are gathered around a red street near several buildings, with a bus visible in the scene. The street appears to be located in a large city where several streets converge, creating a busy, urban atmosphere. The city surrounds the people, who are standing around various areas of the street, waiting or walking around. A traffic sign can be seen near the top-left edge of the street, guiding the flow of the busy traffic. A pedicab is also visible in the scene, helping people to navigate the area. Several buses are also on the street, including one near the bottom-left edge, one near the bottom-right edge, and one occupying the center. A truck can be seen at the top-right edge of the image, adding to the diverse traffic mix.", "mscoco_hallucinated_words": [["truck", "truck"]], "mscoco_gt_words": ["motorcycle", "bus", "person", "car"], "hallucination_idxs": [140], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.5}}, {"image_id": 49005, "caption": " The scene depicts a group of baseball players on a field during a game. Several baseball players are visible in various locations on the field, including a pitcher, catcher, and several outfielders. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["baseball bat", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 410114, "caption": " The scene is a road or highway with a large herd of cows and a few people walking down it. The cows are spread across the road, some on the left and right sides, with a few more in the background. The people are walking with the cows, including one person on the left side and a group towards the center. They are possibly herding the cattle down the street, perhaps to move them to a different location or to bring them closer together.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "horse", "person", "cow"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 100274, "caption": "According to the picture, there is a train on some tracks at a rail yard, sitting near other trains on the railways. Several cars are parked or sitting next to the train, occupying both sides of the tracks. The cars are placed both in front of and behind the train, indicating a busy scene with multiple vehicles around. The train and cars are arranged across the entire rail yard, making it an extensive transportation hub for various purposes.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "train", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 317254, "caption": "In the scene, a person is sitting on the back of a motorcycle, riding down a street. The motorcycle appears to be in the middle of the street, while a truck is behind the motorcyclist, driving down the road in the same direction. The blur effect is likely caused by the movement of the motorcycle and the truck. ", "mscoco_hallucinated_words": [["truck", "truck"], ["truck", "truck"]], "mscoco_gt_words": ["motorcycle", "car", "person"], "hallucination_idxs": [34, 63], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.6666666666666666}}, {"image_id": 425569, "caption": " The image showcases a large kitchen featuring a variety of shiny silver cooking equipment. The kitchen countertops have multiple silver bowls and cups, as well as a few metal utensils such as spoons and knives scattered around. ", "mscoco_hallucinated_words": [["cup", "cup"], ["spoon", "spoon"], ["knife", "knife"]], "mscoco_gt_words": ["bowl", "sink", "oven"], "hallucination_idxs": [23, 34, 36], "metrics": {"CHAIRs": 1, "CHAIRi": 0.75, "Recall": 0.3333333333333333}}, {"image_id": 28842, "caption": "The image features two young men in a park, possibly having fun on their skateboards. One of the men is riding a skateboard down a slope in the park, while the other is standing nearby watching and taking the photo. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skateboard", "person", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 319607, "caption": " The image captures an intersection scene where there is a traffic light at the top of the frame. Nearby, a large red building stands tall, possibly a school or office complex. There are a few other street lights visible, and traffic can be spotted in various locations around the intersection, including two bicycles. In addition, a couple is seen walking along the sidewalk. The combination of the traffic light and the street lights in the scene creates an orderly traffic flow.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "bus", "traffic light", "car", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 210372, "caption": "In the image, there is a small kitchen featuring red cupboards and drawers underneath the countertops. The kitchen also has a sink, likely a small one, in the middle of the room. A window can be seen in the corner of the image, letting in natural light.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["sink", "refrigerator", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 156915, "caption": " The image features a vase filled with a variety of flowers, including red roses, white flowers, and greenery. There are several blooming stems visible in the arrangement, including one close to the bottom center of the vase, one in the upper right, and another in the upper left corner. The red rose is placed at the middle of the vase, with white flowers filling out the rest of the arrangement. Greenery is visible around the blooming stems and throughout the arrangement, providing a natural touch to the floral display.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["vase", "potted plant", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 349642, "caption": "The image shows a man sitting on a wooden bench while a dog is laying underneath it, giving the impression that they are enjoying each other's company. The bench is placed in the center of the scene, surrounded by a variety of brick paths. There are other people visible in the background, walking along the paths or standing in the vicinity. The setting appears to be a park or a public space where people can gather and spend time with their dogs.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "bench", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 423008, "caption": " The image depicts a body of water, likely a canal, with a group of small boats lined up near the shore. Several people are standing on or near the boats, enjoying the waterfront setting. The area has a few umbrellas, providing shade, and several handbags can be seen hanging from the boats or being carried by the people.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["suitcase", "handbag", "umbrella", "boat", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8}}, {"image_id": 45435, "caption": " The image features a person holding up their cell phone, highlighting its camera. The phone is in focus, with the camera situated near the top of the device. The individual seems to be holding the phone with both hands, possibly showing off the camera feature to someone or taking a picture. The scene is captured in a room, where various items and papers can be observed, adding some context to the image.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cell phone", "cup", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 35827, "caption": " The image captures a man sitting in front of his desk, using his laptop and a computer screen as he is on his desktop computer working. There are multiple chairs in the room, including a chair in the back and another chair placed to the left of the man. In addition to the laptop, the man is using a mouse as he navigates through his work on the computer screen. A book is visible near the desk, possibly related to the man's occupation or interests.", "mscoco_hallucinated_words": [["mouse", "mouse"], ["book", "book"]], "mscoco_gt_words": ["cup", "tv", "dining table", "laptop", "bottle", "chair", "person"], "hallucination_idxs": [64, 77], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 0.5714285714285714}}, {"image_id": 72656, "caption": " The image features a group of children seated at a table outdoors. There are three kids, with one of them sitting on a chair to the left, another closer to the table, and the last child seated at the dining table.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "suitcase", "dining table", "bowl", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 372087, "caption": " The image features a long, green and yellow train traveling down a train track as it approaches a train station. The train occupies a significant portion of the picture, extending from the left edge to the center, and further into the right edge.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["train", "suitcase", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 336209, "caption": "The image captures a young man in the process of jumping off a skateboard on a ramp, or off a ledge. He is in the center of the image, balancing while performing the trick.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skateboard", "bench", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 46031, "caption": "The image displays two laptops side by side on a table. One is slightly smaller and placed on the left, while the other, larger laptop, is situated to the right. A pair of computer mice is also visible near the mouses, one placed further towards the front and the other closer to the laptops.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["keyboard", "tv", "dining table", "laptop", "mouse", "cell phone"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 283118, "caption": "The image features a family sitting at a table in a restaurant or a dining area, consisting of a man, a woman, and their two children. The table is adorned with a dish of food, which the family is gathered around to share. There are a couple of spoons in the vicinity, likely for sharing the food.", "mscoco_hallucinated_words": [["spoon", "spoon"]], "mscoco_gt_words": ["pizza", "handbag", "dining table", "bottle", "cell phone", "car", "backpack", "chair", "person"], "hallucination_idxs": [54], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.2222222222222222}}, {"image_id": 429598, "caption": "The image depicts a kitchen with stained wood cabinets. There is a sink underneath the cabinets along with a small refrigerator beside the sink. An oven is located on top of the sink, further back into the kitchen area. The sink and refrigerator are positioned in the center of the kitchen, while the oven is on the right side. A microwave is placed near the left side, further back in the kitchen space. ", "mscoco_hallucinated_words": [["microwave", "microwave"]], "mscoco_gt_words": ["sink", "oven", "refrigerator", "bottle"], "hallucination_idxs": [66], "metrics": {"CHAIRs": 1, "CHAIRi": 0.1111111111111111, "Recall": 0.75}}, {"image_id": 333365, "caption": "The image shows an airport scene with several planes parked on the tarmac. There are a total of four airplanes in the scene, two of which appear to be white and red. These planes are parked on the tarmac in various locations. One of the white and red airplanes is positioned towards the front of the image, while the other white plane is closer to the back of the image. There are three cars parked nearby, positioned in different spots.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "car", "airplane", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 424174, "caption": "The image features a large bowl full of various fruits, including bananas, oranges, and apples. The bananas are spread throughout the bowl, along with apples and oranges in different areas. Apart from the apples and oranges, there are also three dishes in the scene, each containing different assortments of fruits. The dishes are situated at the top, left, and right of the bowl, creating a visually appealing display of fruits. The scene offers a tantalizing display of sweet, juicy fruits in a bowl.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "bowl", "apple", "orange", "banana"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8}}, {"image_id": 3035, "caption": " The image captures a scene at an airport, with two airplanes positioned at different locations on a runway. One airplane is in the foreground and the other is farther away towards the back. They are both parked near the terminal building, close to a ramp leading to it. The airplanes appear to be large and white, indicating their size and possibly their white exterior color. The airplanes are parked alongside other planes, creating an airport scene with multiple aircraft.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "car", "airplane", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 177213, "caption": " The image showcases a pizza placed on a white dish in the center of the scene. It appears to be a medium-sized pizza with cheese, possibly mozzarella, and other toppings.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["pizza", "cup", "knife", "dining table", "fork", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.16666666666666666}}, {"image_id": 438535, "caption": " The image depicts a city street scene with a prominent clock in the center, surrounded by various buildings in the background. The black and white image is set among different streets, making it appear to be a busy intersection. The street sign, situated near the bottom left, adds to the urban scene. The clock, situated closer to the right, is the focal point of the image, standing out against the various buildings in the area.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "clock", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 490711, "caption": " The image features a white kitchen with clean and modern appliances. There is a refrigerator located towards the left side of the room and an oven situated towards the right side. A microwave is also present in the kitchen, situated closer to the oven. A stove is positioned near the oven, with a sink located near the center of the kitchen. Various cooking utensils are scattered around the kitchen, such as a knife and a fork, placed on the left side of the room. An island or counter is also visible, possibly for additional cooking space or as a breakfast bar. The kitchen appears to be well-stocked and organized.", "mscoco_hallucinated_words": [["oven", "oven"], ["microwave", "microwave"], ["oven", "oven"], ["stove", "oven"], ["oven", "oven"], ["knife", "knife"], ["fork", "fork"]], "mscoco_gt_words": ["sink", "refrigerator", "banana"], "hallucination_idxs": [26, 34, 46, 49, 54, 79, 82], "metrics": {"CHAIRs": 1, "CHAIRi": 0.7777777777777778, "Recall": 0.6666666666666666}}, {"image_id": 163728, "caption": "According to the picture, there is a gray and white cat sleeping on a keyboard surface, curled up next to an empty glass. The cat appears to have been sleeping peacefully on top of the keyboard, with its head and paws resting on it. Near the cat, there is a mouse lying close by, likely a keyboard mouse or a computer mouse, but not in use. In addition, there is a handbag located near the top of the scene on the right side.", "mscoco_hallucinated_words": [["keyboard", "keyboard"], ["keyboard", "keyboard"], ["keyboard", "keyboard"], ["handbag", "handbag"]], "mscoco_gt_words": ["cup", "dining table", "laptop", "mouse", "cat"], "hallucination_idxs": [15, 38, 63, 81], "metrics": {"CHAIRs": 1, "CHAIRi": 0.36363636363636365, "Recall": 0.6}}, {"image_id": 456756, "caption": " The image features a group of animals including a giraffe and several zebras. The giraffe is standing at the front of the scene while the zebras are scattered behind it. There are three zebras on the left side, one close to the center of the picture and another further to the right. The remaining zebra is located closer to the back of the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["zebra", "giraffe", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 281028, "caption": "The image depicts a room with multiple chairs arranged throughout the space. There are 11 chairs in total, with 6 positioned on the right side of the room and 5 on the left side. Some of the chairs are placed against a window or close to another chair.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["train", "person", "chair", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 343937, "caption": " In the snow, a young woman wearing all white clothing is snowboarding down a slope. She is positioned at the left side of the image, with the snowboard in front of her, indicating she's about to slide down the mountain. She is wearing a hat and a snowboard on top of the snow.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["skis", "person", "snowboard"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 277051, "caption": " The image showcases two little birds, sitting on a dining table near a pizza slice that's partially eaten. The birds appear to be close friends, as they are positioned side by side, just a few inches apart. The pizza is placed on a plate on the dining table, with the pizza slice near the edge. A bottle of wine is also visible in the background, adding a touch of luxury to the scene. A few dining chairs can be seen behind the table, indicating that people have eaten or are about to eat the partially consumed pizza.", "mscoco_hallucinated_words": [["pizza", "pizza"], ["pizza", "pizza"], ["pizza", "pizza"], ["person", "person"], ["pizza", "pizza"]], "mscoco_gt_words": ["knife", "dining table", "bottle", "bird", "chair"], "hallucination_idxs": [14, 44, 57, 96, 107], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4166666666666667, "Recall": 0.8}}, {"image_id": 145666, "caption": " The image features a large metal pan filled with a variety of cooked broccoli and carrots, ready to be consumed. The broccoli is spread out across the pan and is situated in the middle of the image. Some carrots can be seen in the lower-left corner of the pan, while others are dispersed around the broccoli. A knife can be seen in the background on the right side, possibly used in preparing the vegetables. The dish seems to be a delicious and well-prepared one, showcasing an abundance of fresh, nutritious vegetables.", "mscoco_hallucinated_words": [["knife", "knife"]], "mscoco_gt_words": ["bowl", "carrot", "broccoli", "dining table"], "hallucination_idxs": [62], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.5}}, {"image_id": 304984, "caption": "The image features a white plate topped with two sandwiches. The sandwiches appear to be French toast, served with a side of relish. The sandwiches are placed in the center of the plate, surrounded by various accompaniments, including carrots and cucumbers, likely served as a salad or garnish. The dish is accompanied by a fork on the left side and a spoon on the right side. The scene is set against a blue-and-yellow background, making the dish visually appealing.", "mscoco_hallucinated_words": [["carrot", "carrot"], ["spoon", "spoon"]], "mscoco_gt_words": ["bowl", "sandwich", "cake", "fork", "orange", "wine glass"], "hallucination_idxs": [43, 68], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.3333333333333333}}, {"image_id": 122467, "caption": "In the image, there is a person holding a video game controller in their right hand, indicating they are playing a game. The game controller has a cut on its face, possibly indicating that the person had injured themselves while using it. The controller appears to be the one for the Nintendo Wii, as there is another controller visible in the image, but not as prominent as the one the person is holding.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "dining table", "person", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 518287, "caption": "According to the image, it features a toothbrush with a mouth full of teeth inside of a holder. The image was taken on an orange background, showcasing the humorous concept of a toothbrush with a mouth full of teeth.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["toothbrush", "cup", "orange", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 145448, "caption": " In the scene, there is a gray cat laying on top of a dining table near a tire and motorcycle. Two people can be seen in the background, with one sitting near the right edge and another in the center. A car is also present, parked to the left of the motorcycle.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "motorcycle", "cat", "car", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8333333333333334}}, {"image_id": 352900, "caption": " There is a white plate on the table filled with a pasta dish with pieces of cooked broccoli. The broccoli is mixed with the pasta, creating a delicious and visually appealing meal. The dish seems to be the centerpiece of the meal on the plate. A fork can be seen close to the plate, likely for use in consuming the food. Overall, the image appears to be a satisfying, freshly prepared pasta dish.", "mscoco_hallucinated_words": [["table", "dining table"]], "mscoco_gt_words": ["bowl", "fork", "knife", "broccoli"], "hallucination_idxs": [7], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.5}}, {"image_id": 153607, "caption": " The image is a collage of different vases filled with water. One vase has a flower floating in it, creating a beautiful arrangement. There are a total of four vases arranged together in the display, with two vases positioned side by side, and the other two on top of them. The flower in the vase is likely a single orange or yellow flower, adding a pop of color to the overall look. The vases can be seen as unique and interesting pieces of art, as well as functional vessels for display or decoration purposes.", "mscoco_hallucinated_words": [["vessel", "boat"]], "mscoco_gt_words": ["bowl", "vase", "orange", "dining table"], "hallucination_idxs": [97], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 0.5}}, {"image_id": 157581, "caption": " There is an image showcasing a table filled with various bottles of wine placed side by side. These bottles vary in their color, size, and shape. On top of the table, there are several dining chairs, some arranged in a row.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "dining table", "bottle", "wine glass", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 227438, "caption": "The image is of a hotel room with a double bed, a television, and a desk. The bed is positioned towards the left side of the room and the television is located near the center of the room. The desk is positioned towards the right side of the room. The hotel room has a minimalistic decor, as there is very little in the way of decorations or other furnishings. There is also a chair located close to the desk.", "mscoco_hallucinated_words": [["television", "tv"], ["desk", "dining table"], ["television", "tv"], ["desk", "dining table"], ["desk", "dining table"]], "mscoco_gt_words": ["clock", "couch", "bed", "chair"], "hallucination_idxs": [13, 17, 32, 43, 85], "metrics": {"CHAIRs": 1, "CHAIRi": 0.625, "Recall": 0.5}}, {"image_id": 116037, "caption": "The image captures a moment during a baseball game, with several people in the scene. In the foreground, there is a batter at the plate, holding a baseball bat while swinging at the pitch. Behind the batter, two other players can be seen \u2013 one near the left side of the image and the other one on the right.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "baseball bat", "sports ball", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 202658, "caption": " In this image, a white toilet with a pink seat is sitting on the ground in a backyard. The toilet appears to have a unique appearance, with the paint on the seat creating a colorful and creative design. The surroundings suggest a backyard or outdoor area. There is a backpack located nearby, perhaps in a storage area, and a chair can be spotted in the background of the scene. Overall, the image presents a unique and artistic touch to a common household item.", "mscoco_hallucinated_words": [["backpack", "backpack"]], "mscoco_gt_words": ["bowl", "chair", "toilet"], "hallucination_idxs": [52], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 18150, "caption": "According to the picture, a man and a little boy are sitting on the floor next to each other, with the boy giving the man a slice of pizza. The boy is touching the pizza with his hands as he feeds the man. There are some other objects, such as a cup on the left side and a bottle on the right side of the image. A couple of chairs can be seen in the background, one towards the upper-left and the other near the upper-right corner of the picture.", "mscoco_hallucinated_words": [["cup", "cup"], ["chair", "chair"]], "mscoco_gt_words": ["pizza", "bottle", "backpack", "person", "couch"], "hallucination_idxs": [56, 75], "metrics": {"CHAIRs": 1, "CHAIRi": 0.18181818181818182, "Recall": 0.6}}, {"image_id": 418711, "caption": "In this image, a person on a bike is plowing through a long dirt road filled with bananas and multiple vehicles. There are three vehicles in the scene, including a pickup truck near the front, a bus farther down the road, and another car on the right side. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "bicycle", "car", "person", "banana"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 92107, "caption": "According to the image, there is a bowl filled with soup on a table. The bowl has a white base and the soup inside it is green, which appears to have broccoli and corn in it. There are several other food items on the table that seem to be bread. All of the food items appear to be served and ready for consumption.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "spoon", "broccoli", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 491000, "caption": "According to the image, a group of people is walking down a street, engaging in conversation while carrying various bags. Among the group of people, a handbag can be seen placed near one of the individuals. The scene is reminiscent of a typical city street, with some buildings and trees visible in the background. There are also a few benches along the street, possibly providing a place for the group to sit and rest while they are walking.", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["handbag", "motorcycle", "dog", "car", "person"], "hallucination_idxs": [66], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.4}}, {"image_id": 401163, "caption": "The image depicts a park, with a couple sitting on a bench near the center. The man appears to be standing to the left side of the bench while the woman is sitting on the right side. There is another person sitting further away, on the left side of the scene. There is a dog near the bench on the right side, and another dog can be found on the left side of the image, close to the edge of the park.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "bench", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 39663, "caption": "The image displays a variety of assorted items including various books, a laptop, and a school backpack laid out on a floor. Among the books, there is a book placed near the center of the scene, a second book on the left side of the image, and a third book on the far right. The laptop is placed towards the upper right portion of the image.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["laptop", "book", "cell phone", "backpack", "scissors"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 268293, "caption": "The image features a man standing on a tennis court, holding a tennis racket as he prepares to swing and hit the tennis ball. A few chairs are present in the background, possibly for spectators or other people to sit during the match.", "mscoco_hallucinated_words": [["chair", "chair"]], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [28], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 1.0}}, {"image_id": 474499, "caption": "The image displays a well-equipped kitchen with various appliances such as a microwave, a sink, and a refrigerator. There is a small oven in the kitchen, situated close to the left wall. A few knives can be seen placed on the countertop.", "mscoco_hallucinated_words": [["microwave", "microwave"], ["refrigerator", "refrigerator"], ["knife", "knife"]], "mscoco_gt_words": ["clock", "oven", "spoon", "bottle", "vase", "apple", "sink"], "hallucination_idxs": [12, 19, 39], "metrics": {"CHAIRs": 1, "CHAIRi": 0.6, "Recall": 0.2857142857142857}}, {"image_id": 358086, "caption": "The image depicts a group of people playing frisbee in a field. They are jumping up and reaching out to catch the frisbee, while one person in particular is making a significant leap, capturing the attention of others. There are a few benches around the field, allowing people to sit and rest during breaks in the game.", "mscoco_hallucinated_words": [["bench", "bench"]], "mscoco_gt_words": ["bottle", "frisbee", "backpack", "chair", "person"], "hallucination_idxs": [46], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.4}}, {"image_id": 89541, "caption": "In the image, there is a plastic food container filled with a variety of different food items. These include a bunch of bananas, some muffins, a few strawberries, and another food item. The bananas are placed at the top-left corner, while the muffins are located in the middle and top-right side of the container. The strawberries are situated towards the bottom of the container. The food items appear to be in various states of preparation, offering a satisfying selection for a meal.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "dining table", "cake", "banana"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 239445, "caption": "According to the picture, the image features a man standing near a car, holding an empty frisbee in one hand and another, full frisbee in the other. There are other people in the background, likely part of a group or in different locations within the scene. The setting appears to be a public area or park, with greenery and benches in the background. There is also a car visible in the image.", "mscoco_hallucinated_words": [["car", "car"], ["bench", "bench"], ["car", "car"]], "mscoco_gt_words": ["frisbee", "sports ball", "person"], "hallucination_idxs": [13, 66, 75], "metrics": {"CHAIRs": 1, "CHAIRi": 0.42857142857142855, "Recall": 0.6666666666666666}}, {"image_id": 105782, "caption": "According to the picture, a zebra and an airplane are flying in the field, creating an unusual scene. The zebra is prominently displayed on the left side of the image, and the airplane is hovering in the center. In the background, several cars can be seen parked along a street. The scene is set in a grassy field, providing a unique and possibly fun environment for both animals to fly around.", "mscoco_hallucinated_words": [["zebra", "zebra"], ["airplane", "airplane"], ["zebra", "zebra"], ["airplane", "airplane"]], "mscoco_gt_words": ["cat", "kite", "dog", "car"], "hallucination_idxs": [6, 9, 22, 36], "metrics": {"CHAIRs": 1, "CHAIRi": 0.8, "Recall": 0.25}}, {"image_id": 487990, "caption": "There is an image being displayed, with a small guitar on top of a table full of food. The scene appears to be a table set up for a meal or a gathering. The food is scattered throughout the table, consisting of various cups, a plate, a bowl, and multiple other dishes. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["cup", "knife", "dining table", "bowl", "spoon", "bottle", "cake", "fork", "donut", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3}}, {"image_id": 138834, "caption": "In the picture, a person is carrying skis on their back, as they are either going skiing or returning from a skiing trip. The person is in the middle of a snow-covered area, and there are snow skiers nearby. The vehicle in the background appears to be a truck or an SUV.", "mscoco_hallucinated_words": [["truck", "truck"]], "mscoco_gt_words": ["car", "skis", "person"], "hallucination_idxs": [53], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 1.0}}, {"image_id": 492057, "caption": "The image shows a man taking a selfie in a public bathroom, with him standing in front of the mirror. The bathroom mirror reflects the image of the man as he takes a photo of himself. There are multiple water bottles placed around the bathroom, with one in the right top corner, another in the top center, and the third near the right side.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["tie", "person", "cell phone"], "hallucination_idxs": [43], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3333333333333333, "Recall": 0.3333333333333333}}, {"image_id": 388215, "caption": "The image depicts a man standing on a tennis court, getting ready to play a game of tennis. He is holding a tennis racquet, preparing to hit the ball, which is situated on the left side of the court. There are various objects around the scene, such as a ball, a clock, and a traffic cone, which may be related to the sporting event or court setup.", "mscoco_hallucinated_words": [["clock", "clock"]], "mscoco_gt_words": ["tennis racket", "sports ball", "person"], "hallucination_idxs": [58], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 111409, "caption": "The image features a man riding a motorcycle, wearing a helmet, and carrying a backpack. He is in the process of driving and is visible from the side view. The motorcycle has the shape of a large circle, with the man riding along the circumference of the circle. There are several other people in the scene, but they are situated at various distances from the motorcyclist.", "mscoco_hallucinated_words": [["backpack", "backpack"]], "mscoco_gt_words": ["bicycle", "motorcycle", "person"], "hallucination_idxs": [16], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.6666666666666666}}, {"image_id": 451391, "caption": "There is a person holding a large bouquet of green broccoli at a grocery store, showcasing their artistic arranging skills. The broccoli bouquet is placed at the center of the image, surrounded by other flowers, fruits, and vegetables on various tables. Some of the flowers are located towards the left side of the picture, while the broccoli bouquet is in the middle, and other fruits are on the right side of the scene. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["broccoli", "person", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 92462, "caption": " The image features a herd of cows moving along a red dirt road, accompanied by two cars on either side. The cows are walking in a single line, with a car following closely behind them from the left, and another car following at a farther distance on the right side of the road. The road appears narrow, with the cows and vehicles taking up most of the available space.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "cow", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 397268, "caption": " In the image, there is a shop or store with two bicycles displayed in the window. The bicycles are the main focus of the scene and are positioned towards the front of the store, making them easily noticeable from outside. The store is located on the right side of the image and two chairs can be seen sitting out front on either side of the bicycles. The chairs are placed against the building, near the bicycles. There is also a handbag visible in the store's front window. The bicycles, chairs, and the handbag create an interesting and colorful display in front of the store.", "mscoco_hallucinated_words": [["handbag", "handbag"], ["handbag", "handbag"]], "mscoco_gt_words": ["bicycle", "chair", "potted plant", "bench"], "hallucination_idxs": [87, 103], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.5}}, {"image_id": 127659, "caption": " The scene depicts a train traveling along a railroad track, with people waiting for it at a train station. The train is stretched across the scene from left to right, appearing quite long. People are standing on the train platform, with some closer to the front of the train and others near the back.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["train", "person", "bench", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 251857, "caption": "In the image, there are two people skateboarding on a cement surface. One person is skateboarding on the left side of the area while the other is on the right side. The cement surface appears to be an outdoor park or area, with trees and other vegetation visible in the background. There are several cars scattered around the area, with one closer to the left side of the image, another near the center, and a few more toward the right. Apart from the skateboarders, there are some other people in the park, but their presence is not as prominent as the two individuals actively skateboarding.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "skateboard", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 243212, "caption": "According to the picture, a small child is standing on a grass-filled field while holding a blue Frisbee in each hand. The child is positioned at the center of the image, with the Frisbee in the foreground. There are various vehicles seen in the background, including cars and trucks, spread across the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["truck", "dog", "frisbee", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.8}}, {"image_id": 444304, "caption": "According to the image, there is a young man squatting on his bed, possibly playing a video game. He is wearing a baseball cap and has two video game remotes in his hands, one closer to the television and the other closer to him. There is also a bottle positioned near the edge of the bed.", "mscoco_hallucinated_words": [["bottle", "bottle"]], "mscoco_gt_words": ["remote", "person", "bed", "tv"], "hallucination_idxs": [53], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 1.0}}, {"image_id": 305000, "caption": " The image shows an outdoor market with a woman standing in front of a fruit stand and two other people nearby. The woman appears to be the main figure in the scene, with multiple fruits on display around her, including apples, oranges, and a bowl of mixed fruits. There are two baskets, one towards the right side of the market and the other at the middle. The other two people in the scene are standing to the right of the fruit stand, facing in different directions. They seem to be browsing the fruits and possibly discussing the options available. Overall, the image captures the lively atmosphere of a fruit market with people in it.", "mscoco_hallucinated_words": [["orange", "orange"], ["bowl", "bowl"]], "mscoco_gt_words": ["apple", "person", "banana"], "hallucination_idxs": [45, 49], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 44136, "caption": "In the image, two men wearing tuxedos are sitting at a table while a party is going on. They both have wine glasses in their hands, posing for the picture. There are three other people in the background, all at different distances from the main subjects. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "tie", "bottle", "wine glass", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6}}, {"image_id": 154846, "caption": "This image captures a delicious pepperoni pizza with mushrooms on it sitting on a counter. There are several toppings, including sliced mushrooms spread throughout the pizza. At the bottom center of the pan, there is a spoon, which may be used to scoop up the pizza toppings. Two people are present in the background, likely enjoying the pizza together. A cellphone is visible to the left, possibly belonging to one of the people in the scene. The dining table extends towards the right side of the image, and there is a chair positioned close to it.", "mscoco_hallucinated_words": [["spoon", "spoon"], ["cellphone", "cell phone"], ["chair", "chair"]], "mscoco_gt_words": ["pizza", "person", "dining table"], "hallucination_idxs": [40, 68, 102], "metrics": {"CHAIRs": 1, "CHAIRi": 0.3, "Recall": 1.0}}, {"image_id": 567254, "caption": "According to the picture, there is a young child standing in front of a television, with blonde hair, seemingly watching something on the screen. The image is relatively small, showcasing the child from their head down to their legs.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "vase", "tv", "bottle"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 79261, "caption": " According to the picture, there is a man standing in the middle of a shallow water feature with a broom, sweeping the water. The man is wearing blue clothes and is wading through the water in the public area. There are several people around the man, some closer and others a little further away.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "clock", "handbag", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 301402, "caption": " The image shows a man sitting in a chair, possibly in a living room or dining room, with a laptop placed on his lap. He is holding a remote in his hand as he looks at the laptop screen. Another person is also present in the room, sitting closer to the right side.", "mscoco_hallucinated_words": [["remote", "remote"]], "mscoco_gt_words": ["dining table", "laptop", "potted plant", "bottle", "chair", "person"], "hallucination_idxs": [31], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.5}}, {"image_id": 415828, "caption": " There is a large group of people standing on a city street next to an alley, possibly waiting for a bus or train. A few people are wearing a suit jacket and one person has a handbag. A car is parked on the street, providing an opportunity for the group to stand in this location. The street appears to be a city street or alley.", "mscoco_hallucinated_words": [["train", "train"]], "mscoco_gt_words": ["handbag", "umbrella", "traffic light", "stop sign", "car", "backpack", "person"], "hallucination_idxs": [23], "metrics": {"CHAIRs": 1, "CHAIRi": 0.16666666666666666, "Recall": 0.42857142857142855}}, {"image_id": 543231, "caption": " The scene depicts a woman in her living room, taking a picture. The main subject is holding a video game controller in her hand, about to capture a moment in a photograph. The other people in the scene appear to be watching the woman take the photo or enjoying themselves in the room. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["remote", "bowl", "book", "bed", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.2}}, {"image_id": 495357, "caption": "In the image, a small motorcycle or toy car is being ridden by a small child. The child is in the middle of the street, with cars on either side of the child, creating an obstacle course-like atmosphere. The image captures the scene with a sense of danger, as it depicts a small person riding a small toy on a busy street.", "mscoco_hallucinated_words": [["car", "car"], ["car", "car"]], "mscoco_gt_words": ["motorcycle", "person", "handbag", "dog"], "hallucination_idxs": [9, 29], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2857142857142857, "Recall": 0.5}}, {"image_id": 523175, "caption": " The image features a plate with a bowl of soup, filled with various vegetables such as carrots. There is a spoon in the soup that appears to be still stirring. The plate is on top of a table, emphasizing the soup as the focal point of the dish. The scene suggests a tasty and nutritious meal, or a healthy bowl of soup.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "carrot", "spoon", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 184945, "caption": "The image features several motorcycles parked along a street next to a train station. The motorcycles are parked in line with each other, with some of them positioned closer to the sidewalk and others further back. A couple of bicycles are also present, one placed closer to the motorcycles and the other towards the side of the street.", "mscoco_hallucinated_words": [["train", "train"]], "mscoco_gt_words": ["bicycle", "motorcycle", "potted plant"], "hallucination_idxs": [12], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2, "Recall": 0.6666666666666666}}, {"image_id": 49942, "caption": "The image portrays a man, likely the focal point of the photograph, holding a single open umbrella, while wearing a tie. He is capturing the shot, looking thoughtful or staring into the distance. The scene is set against a dark background, with a slightly lighter area in the center, and several other people in the background.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["tie", "umbrella", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 223019, "caption": "There is a blue train stationed at a platform in a city or train station setting. The train is pulling into the station or has arrived at the platform. There are multiple people in the scene, with one person closer to the front of the train and the other individuals positioned further away near the back. Some of the people are standing while others are seated.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["suitcase", "handbag", "train", "car", "backpack", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 42889, "caption": " In the image, a white teddy bear is sitting on top of a keyboard and keyboard cover. The teddy bear appears to be placed close to a computer. On the table next to the teddy bear, there are multiple devices, including a computer monitor to the right and a pair of glasses near the bottom right. Overall, the scene depicts a collection of electronics and a stuffed animal in close proximity.", "mscoco_hallucinated_words": [["computer", "laptop"], ["table", "dining table"], ["computer", "laptop"], ["monitor", "tv"]], "mscoco_gt_words": ["keyboard", "teddy bear", "bear"], "hallucination_idxs": [29, 33, 47, 48], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4444444444444444, "Recall": 0.6666666666666666}}, {"image_id": 22112, "caption": "The image showcases a collection of figurines and a pot with grass inside. The figurines are spread across the tabletop, including some that are smaller and others that are bigger. The pot with grass inside is also positioned on the table. The scene provides a unique and interesting display, featuring a diverse set of items.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "vase", "potted plant", "dining table"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.25}}, {"image_id": 247587, "caption": "According to the picture, there is a man standing on a surfboard in the middle of the beach scene. He appears to be in the midst of a surfing experience as he is balanced on the board. The beach is surrounded by various trees, including some closer to the shoreline and others standing at different distances further back.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["surfboard", "car", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 171962, "caption": "In this scene, there is a large bus driving down a city street, with a car on the left side of the bus. There are several bicycles in front of the bus, positioned close to each other and some are leaning against the bus. Additionally, there are several people present, with one person standing next to the bus, another near the far side of the bus, and another individual located further back from the scene.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bicycle", "bus", "person", "car"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}, {"image_id": 545056, "caption": "The image depicts a large field full of black cows and a few scattered sheep, all in various positions within the field. The cows are spread out, some lying down on the grass while others are standing or walking around. There are at least 14 cows on the scene, with a few closer to the camera and the rest further away. The sheep are also dispersed across the field, with some closer to the cows and others further away. Overall, the scene captures a herd of cows and sheep enjoying a grassy, pasture-like field.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["car", "sheep", "cow", "dog"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 401411, "caption": "In the image, there is a man standing in front of a table full of food, likely in a restaurant or kitchen setting. The man is looking towards the front of the image, possibly at another person or a specific dish. Around the man, there are various food items, such as an apple, bananas, onions, and a bowl.", "mscoco_hallucinated_words": [["apple", "apple"], ["banana", "banana"]], "mscoco_gt_words": ["cup", "oven", "dining table", "bowl", "book", "carrot", "person"], "hallucination_idxs": [59, 61], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.42857142857142855}}, {"image_id": 2867, "caption": " In the snow-covered scene, a group of skiers is gathered, with people standing close to each other, and some are skiing. There are at least four pairs of skis clearly visible, with the people using them spread throughout the image. Two people appear to be wearing backpacks, which could be for carrying essentials or for holding items like gloves or snacks. ", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["backpack", "skis", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 246963, "caption": "According to the picture, a group of motorcyclists are driving on a downward slope of a road, travelling south. There are several motorcycles in the image, each with their respective riders. A few motorcyclists can be seen on the left side, while others are closer to the center of the picture, and one or two are on the right side. There is a stop sign in the scene, located to the left of the motorcycles. Additionally, several cars are visible in the image, with a car located near the top of the downward slope, another in the left-hand area of the scene, and a third car on the right-hand side.", "mscoco_hallucinated_words": [["car", "car"], ["car", "car"], ["car", "car"]], "mscoco_gt_words": ["motorcycle", "stop sign", "person"], "hallucination_idxs": [88, 97, 119], "metrics": {"CHAIRs": 1, "CHAIRi": 0.42857142857142855, "Recall": 1.0}}, {"image_id": 498225, "caption": " In the image, a group of people, including a man and three women, are standing in front of a store window displaying numerous cakes. They appear to be walking or standing near the window, examining the baked goods. Some of the cakes can be seen on the top shelf, while others are closer to the viewer.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["handbag", "cake", "cell phone", "donut", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.4}}, {"image_id": 124601, "caption": "The image depicts a small, compact kitchen filled with various items, including a laptop and other electronic devices, ovens, and other cooking appliances. The room appears to have been well-maintained and cleaned, with all the surfaces, including the counter, free of any dirt or clutter. ", "mscoco_hallucinated_words": [["oven", "oven"]], "mscoco_gt_words": ["keyboard", "cup", "dining table", "laptop", "bottle", "wine glass", "sink", "microwave"], "hallucination_idxs": [21], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.125}}, {"image_id": 581749, "caption": " The image displays a variety of fruits and vegetables placed inside a container. The fruits consist of oranges placed together in the middle and oranges on the side, while the vegetables consist of broccoli scattered across the scene. The produce appears fresh, and the variety and arrangement provide an appealing arrangement for a salad or meal.", "mscoco_hallucinated_words": [["container", "bowl"]], "mscoco_gt_words": ["broccoli", "orange", "dining table"], "hallucination_idxs": [12], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.6666666666666666}}, {"image_id": 227960, "caption": "In the image, a bowl of food sits on a table filled with a variety of vegetables, including carrots and broccoli. There are several carrots placed around the table, including one in the bowl, one to the left and two more to the right. A few carrots are located closer to the bottom of the table.", "mscoco_hallucinated_words": [["table", "dining table"], ["broccoli", "broccoli"], ["table", "dining table"], ["table", "dining table"]], "mscoco_gt_words": ["cup", "bowl", "spoon", "wine glass", "carrot"], "hallucination_idxs": [11, 22, 31, 61], "metrics": {"CHAIRs": 1, "CHAIRi": 0.4444444444444444, "Recall": 0.4}}, {"image_id": 286503, "caption": "The scene takes place in a large enclosure, with a couple of people standing in the background. An elephant is the focus of the image, with one person standing close to it while holding a bag. The person is handing the bag to the elephant as it swings in the air, which is captured in the action.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["person", "sports ball", "elephant"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 315740, "caption": "This image depicts a spacious and modern kitchen, featuring wood cabinets and stainless steel appliances. The kitchen has a large sink surrounded by various utensils and a refrigerator positioned against one of the walls. ", "mscoco_hallucinated_words": [["refrigerator", "refrigerator"]], "mscoco_gt_words": ["oven", "tv", "bowl", "potted plant", "sink", "microwave", "chair", "couch"], "hallucination_idxs": [29], "metrics": {"CHAIRs": 1, "CHAIRi": 0.5, "Recall": 0.125}}, {"image_id": 142108, "caption": "The image depicts a group of people in a kitchen, blindfolded while they are in line, ready to eat a banana. They are standing on a long table, with the people on the right side of the table and the people on the left side of the table. There are several bananas spread throughout the scene, placed on the table and arranged on the ground as well. Some of the bananas are close to each other, while others are farther away.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["dining table", "person", "banana"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 1.0}}, {"image_id": 542933, "caption": "The image is a photograph taken at a baseball game, where a batter is getting ready to hit the ball and an umpire is standing nearby. In the scene, several people can be seen sitting and standing around, including the batter, the catcher, and the umpire. The players are in various poses, some focused on the game, while others are looking around the field.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bench", "chair", "baseball bat", "sports ball", "person", "baseball glove"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.3333333333333333}}, {"image_id": 475043, "caption": "A group of three people is sitting on couches in a room, using laptops to communicate and socialize with one another. One person is seated on the left couch, another person is on the right couch, and a third person is on the far right couch. They are drinking, talking, and enjoying their company.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["bowl", "spoon", "laptop", "bottle", "person", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.5}}, {"image_id": 456784, "caption": "The image depicts a person cutting a white, chocolate-iced cake using a knife. The knife has a distinct shape and is positioned in a downward direction, slicing the cake. The person appears to be standing close to the table and in the center of the image, possibly taking a bite after cutting the cake.", "mscoco_hallucinated_words": [["table", "dining table"]], "mscoco_gt_words": ["cake", "knife", "person"], "hallucination_idxs": [42], "metrics": {"CHAIRs": 1, "CHAIRi": 0.125, "Recall": 1.0}}, {"image_id": 33753, "caption": " The image depicts a wooden outdoor bench with a remote control sitting on it. The remote control is placed near the bottom of the bench, occupying the majority of the lower portion. The bench is positioned in a backyard area with a few potted plants nearby. There are no other people or items in the scene, and the focus is on the wooden bench and remote control.", "mscoco_hallucinated_words": [["potted plant", "potted plant"], ["person", "person"]], "mscoco_gt_words": ["remote", "chair", "bench", "tv"], "hallucination_idxs": [46, 54], "metrics": {"CHAIRs": 1, "CHAIRi": 0.2222222222222222, "Recall": 0.5}}, {"image_id": 448694, "caption": " In the image, a group of young men is playing a friendly game of frisbee while a crowd of onlookers watches the action. The main event is focused on one of the participants, who appears to be positioned close to the front of the group. A frisbee is being thrown by another man in the group, trying to catch it while being surrounded by his friends.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["frisbee", "chair", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.6666666666666666}}, {"image_id": 99053, "caption": "The image features a bowl filled with a variety of ingredients, including noodles and vegetables. There is a significant amount of noodles in the bowl, mixed with vegetables, and the dish appears to be a mix of foods. A metal spoon can be seen in the bowl, indicating the dish is stirred or served. ", "mscoco_hallucinated_words": [["spoon", "spoon"]], "mscoco_gt_words": ["dining table", "bowl", "fork", "broccoli", "carrot"], "hallucination_idxs": [45], "metrics": {"CHAIRs": 1, "CHAIRi": 0.25, "Recall": 0.2}}, {"image_id": 503210, "caption": " The image depicts a living room filled with various furniture. There are a number of seating arrangements throughout the space, including two couches on the left side, a loveseat in the center, and a chair on the right side. The couches are surrounded by a large number of pillows, with some pillows located on the back, and others dispersed across the seating areas. A few potted plants are also visible throughout the living room, placed on different surfaces.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["clock", "potted plant", "vase", "orange", "book", "chair", "couch"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.42857142857142855}}, {"image_id": 142257, "caption": " This scene captures a busy intersection filled with traffic, with several cars, a truck, and a taxi in the middle of the road. Several traffic lights can be seen, including two at the top of the image, one in the upper left corner, and another closer to the bottom left.", "mscoco_hallucinated_words": [], "mscoco_gt_words": ["traffic light", "car", "truck", "person"], "hallucination_idxs": [], "metrics": {"CHAIRs": 0, "CHAIRi": 0.0, "Recall": 0.75}}], "overall_metrics": {"CHAIRs": 0.3807615230460922, "CHAIRi": 0.12141127637495676, "Recall": 0.5452468007312614, "num_hallucinated_caps": 190.0, "num_caps": 499.0, "hallucinated_word_count": 351.0, "coco_word_count": 2891.0, "length_response": 72.78356713426854, "hallucinated_caps_ls": [144305, 329614, 324008, 247179, 39053, 493862, 181446, 401027, 161227, 25812, 195645, 155671, 412894, 32625, 18688, 370657, 481596, 254368, 459141, 209347, 523292, 12375, 13550, 146837, 209925, 104326, 233888, 300784, 183766, 169169, 409346, 344942, 300124, 490491, 175878, 12669, 133146, 498747, 227204, 185327, 348654, 482432, 556478, 37907, 406611, 278365, 432213, 368367, 21779, 148977, 467887, 352734, 557998, 401010, 313928, 346940, 400851, 388237, 24104, 183709, 59710, 527497, 285302, 265628, 499594, 255483, 453731, 87052, 357578, 418692, 259345, 120776, 545594, 378831, 449798, 375769, 426172, 180764, 295105, 443259, 249969, 146504, 240681, 542792, 14773, 298067, 563938, 526414, 505213, 441363, 188136, 191691, 89378, 557447, 561967, 198163, 217593, 301266, 475813, 558784, 539951, 230983, 564629, 53431, 381031, 9628, 275749, 163518, 525202, 118124, 288729, 372220, 302789, 30667, 532481, 291634, 88269, 98979, 536154, 374061, 353320, 256628, 158806, 424270, 416384, 364783, 531327, 502936, 51054, 1268, 436814, 63934, 516248, 45550, 444913, 303099, 205193, 272566, 14940, 408449, 354540, 191301, 124236, 344930, 381615, 111109, 226802, 317254, 425569, 35827, 283118, 429598, 490711, 163728, 277051, 145666, 304984, 352900, 153607, 227438, 202658, 18150, 491000, 268293, 474499, 358086, 239445, 105782, 138834, 492057, 388215, 111409, 397268, 444304, 305000, 154846, 301402, 415828, 495357, 184945, 42889, 401411, 246963, 124601, 581749, 227960, 315740, 456784, 33753, 99053]}}}