Configuration 1: xlnet on fasttext with cosine distance
Configuration 2: bert on fasttext with cosine distance
Testing on: hard queries for the flight_delay schema
Entries below have greater num_guesses with a threshold of 3
--------------------------------------------------

Query (#1): predict the average delay caused by weather for each airline where flight will start in next two days
Ground Truth (filter): DATE

Annotation 1: []
Annotation 2: [{'text': 'will start', 'confidence': 0.8114347755908966}]

Configuration 1 took 13 attempt(s) to get the correct answer
Configuration 2 took 9 attempt(s) to get the correct answer

--------------------------------------------------

Query (#2): predict the average delay caused by weather for each flight of Emirates airline where flight will start in next two days
Ground Truth (filter): DATE

Annotation 1: []
Annotation 2: [{'text': 'in', 'confidence': 0.3591545820236206}]

Configuration 1 took 10 attempt(s) to get the correct answer
Configuration 2 took 5 attempt(s) to get the correct answer

--------------------------------------------------

Query (#3): predict the average delay caused by the tornedo for each flight of Quatar airline where flight will start in next two days
Ground Truth (filter): DATE

Annotation 1: [{'text': 'flight', 'confidence': 0.9499793648719788}]
Annotation 2: [{'text': 'of Qua', 'confidence': 0.9406005342801412}, {'text': 'will start', 'confidence': 0.7854796648025513}]

Configuration 1 took 3 attempt(s) to get the correct answer
Configuration 2 took 9 attempt(s) to get the correct answer

--------------------------------------------------

Query (#4): predict the average delay due to bad weather for each flight of Quatar Airlines where flight will start in next two days
Ground Truth (filter): DATE

Annotation 1: [{'text': 'flight', 'confidence': 0.9999962449073792}]
Annotation 2: [{'text': 'of Qua', 'confidence': 0.9567640821139017}, {'text': 'start', 'confidence': 0.9605323076248169}]

Configuration 1 took 3 attempt(s) to get the correct answer
Configuration 2 took 9 attempt(s) to get the correct answer

--------------------------------------------------

Query (#5): I want to know the average delay for aircraft for the flights of Air Emirates which will start next week
Ground Truth (filter): DATE

Annotation 1: []
Annotation 2: [{'text': 'flights', 'confidence': 0.9564255475997925}]

Configuration 1 took 11 attempt(s) to get the correct answer
Configuration 2 took 2 attempt(s) to get the correct answer

--------------------------------------------------

Query (#6): can you tell me the average aircraft delay for Quatar Airlines where tail number starts from 4500 and will start within next week
Ground Truth (filter): TAIL NUMBER

Annotation 1: [{'text': 'tail number starts', 'confidence': 0.9999991854031881}, {'text': '4', 'confidence': 0.621338427066803}]
Annotation 2: [{'text': 'number', 'confidence': 0.9999996423721313}, {'text': 'from 4500', 'confidence': 0.9349560340245565}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 2 attempt(s) to get the correct answer

--------------------------------------------------

Query (#7): Predict the average elapsed time for all flights which will start from Dallas Airport and expected elapsed time is less than six hours
Ground Truth (filter): ELAPSED TIME

Annotation 1: [{'text': 'elapsed time', 'confidence': 1.0}]
Annotation 2: [{'text': 'start', 'confidence': 0.9993060827255249}, {'text': 'elapsed time', 'confidence': 0.9010847955942154}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#8): I want to predict the expected elapsed time of all Air Emirates Airlines flights where flight number is in between 3400 and 3500 and they will start tomorrow
Ground Truth (filter): FLIGHT NUMBER

Annotation 1: [{'text': 'flight number', 'confidence': 1.0}]
Annotation 2: [{'text': 'number', 'confidence': 0.9999986290931702}, {'text': 'in between 3400', 'confidence': 0.7628612369298935}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#9): predict how many flights will get cancelled which was supposed to start from New York for next week
Ground Truth (filter): NONE

Annotation 1: [{'text': 'get cancelled which', 'confidence': 0.9994409680366516}]
Annotation 2: []

Configuration 1 took 17 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#10): predict how many flights of Quatar Airlines will get cancelled for covid-19 for next month
Ground Truth (filter): NONE

Annotation 1: []
Annotation 2: [{'text': '- 19', 'confidence': 0.536309227347374}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 13 attempt(s) to get the correct answer

--------------------------------------------------

Query (#11): I wanna know how many flights will be cancelled for each airline for the tornedo that is coming within tomorrow
Ground Truth (filter): NONE

Annotation 1: [{'text': 'tornedo that', 'confidence': 0.9454457461833954}]
Annotation 2: [{'text': 'tornedo that', 'confidence': 0.9999784976243973}, {'text': 'coming', 'confidence': 0.9651370048522949}]

Configuration 1 took 17 attempt(s) to get the correct answer
Configuration 2 took 17 attempt(s) to get the correct answer

--------------------------------------------------

Query (#12): predict the average change is duration of flights for the Quatar Airlines which will start within next week
Ground Truth (filter): NONE

Annotation 1: []
Annotation 2: [{'text': 'Qua', 'confidence': 0.7557871639728546}, {'text': 'start', 'confidence': 0.997248113155365}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 17 attempt(s) to get the correct answer

--------------------------------------------------

Query (#13): predict the average change is duration of flights for the Emirates Airlines which will have tail number greater than 1k and will start within next week
Ground Truth (filter): TAIL NUMBER

Annotation 1: [{'text': 'tail number', 'confidence': 0.9999988377094269}]
Annotation 2: [{'text': 'tail number', 'confidence': 0.9996995329856873}, {'text': '1k', 'confidence': 0.8588913083076477}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 2 attempt(s) to get the correct answer

--------------------------------------------------

Query (#14): predict the average departure delay for the Emirates Airlines which will have tail number greater than 1k and will start within next week
Ground Truth (filter): TAIL NUMBER

Annotation 1: [{'text': 'tail number', 'confidence': 0.9999982714653015}]
Annotation 2: [{'text': 'tail number', 'confidence': 0.996773898601532}, {'text': '1k', 'confidence': 0.948462963104248}, {'text': 'start', 'confidence': 0.6746978163719177}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#15): predict the average arrival delay for the Emirates Airlines which will have tail number greater than 1k and will start within next week
Ground Truth (filter): TAIL NUMBER

Annotation 1: [{'text': 'tail number', 'confidence': 0.9999987185001373}]
Annotation 2: [{'text': 'tail number', 'confidence': 0.9967749118804932}, {'text': '1k', 'confidence': 0.9442611634731293}, {'text': 'start', 'confidence': 0.7025882005691528}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#16): predict the average air system delay for the Emirates Airlines which will have tail number greater than 1k and will start within next week
Ground Truth (filter): TAIL NUMBER

Annotation 1: [{'text': 'tail number', 'confidence': 0.9999985992908478}]
Annotation 2: [{'text': 'tail number', 'confidence': 0.9995978772640228}, {'text': '1k', 'confidence': 0.9647915959358215}, {'text': 'start', 'confidence': 0.5494731664657593}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#17): predict the average security delay for the Emirates Airlines which will have tail number greater than 1k and will start within next week
Ground Truth (filter): TAIL NUMBER

Annotation 1: [{'text': 'tail number', 'confidence': 0.9999984800815582}]
Annotation 2: [{'text': 'tail number', 'confidence': 0.99642413854599}, {'text': '1k', 'confidence': 0.9402155876159668}, {'text': 'start', 'confidence': 0.6693040132522583}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

