Configuration 1: xlnet on fasttext with cosine distance
Configuration 2: bert on fasttext with cosine distance
Testing on: combined queries for the online_delivary schema
Entries below have greater num_guesses with a threshold of 3
--------------------------------------------------
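
Note: the header names cosine distance over fasttext vectors, and each entry reports a number of attempts. The log does not say how attempts are counted. The following is a minimal sketch, assuming (this is an assumption, not something the log confirms) that the count is the 1-based rank of the ground-truth column when candidate columns are sorted by cosine distance to the annotated phrase. The function names and the toy 3-dimensional vectors are hypothetical stand-ins for real embeddings.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)

def attempts_to_match(phrase_vec, column_vecs, ground_truth):
    """Assumed metric: 1-based rank of ground_truth among columns sorted by distance."""
    ranked = sorted(
        column_vecs,
        key=lambda name: cosine_distance(phrase_vec, column_vecs[name]),
    )
    return ranked.index(ground_truth) + 1

# Toy vectors (hypothetical); real fasttext vectors have far more dimensions.
columns = {
    "AGE": [1.0, 0.0, 0.0],
    "GENDER": [0.0, 1.0, 0.0],
    "FAMILY SIZE": [0.0, 0.6, 0.8],
}
phrase = [0.1, 0.2, 0.9]  # stand-in for the embedding of an annotated phrase

print(attempts_to_match(phrase, columns, "FAMILY SIZE"))
```

Under this reading, a ground-truth column whose name embeds far from the annotated phrase is reached only after many closer candidates have been tried.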

Query (#1): predict the total order where the cuisine is chinese for the next week
Ground Truth (filter): PERFERENCE(P1)

Annotation 1: [{'text': 'cuisine', 'confidence': 0.8491058945655823}, {'text': 'chinese', 'confidence': 0.9999929964542389}]
Annotation 2: [{'text': 'chinese', 'confidence': 0.6809283196926117}]

Configuration 1 took 15 attempt(s) to get the correct answer
Configuration 2 took 15 attempt(s) to get the correct answer

--------------------------------------------------

Query (#2): predict the total order where the preferred medium of order is online within tomorrow
Ground Truth (filter): MEDIUM (P1)

Annotation 1: [{'text': 'preferred medium of order is online within', 'confidence': 0.8779428601264954}]
Annotation 2: [{'text': 'preferred medium of order', 'confidence': 0.876037523150444}, {'text': 'online', 'confidence': 0.7616453766822815}]

Configuration 1 took 11 attempt(s) to get the correct answer
Configuration 2 took 11 attempt(s) to get the correct answer

--------------------------------------------------

Query (#3): predict the total order where the customer is married and monthly income is greater than two thousand within next two weeks
Ground Truth (filter): MARITAL STATUS

Annotation 1: [{'text': 'married and monthly', 'confidence': 0.9964938958485922}]
Annotation 2: [{'text': 'married', 'confidence': 0.9998403191566467}, {'text': 'monthly income', 'confidence': 0.9934128522872925}]

Configuration 1 took 3 attempt(s) to get the correct answer
Configuration 2 took 3 attempt(s) to get the correct answer

--------------------------------------------------

Query (#4): predict the average reduction of order because of poor hygiene and health concern
Ground Truth (filter): POOR HYGIENE

Annotation 1: [{'text': 'because of poor hygiene and health concern', 'confidence': 0.8663384573800224}]
Annotation 2: [{'text': 'poor hygiene', 'confidence': 0.9991397460301717}, {'text': 'health concern', 'confidence': 0.9695583283901215}]

Configuration 1 took 27 attempt(s) to get the correct answer
Configuration 2 took 27 attempt(s) to get the correct answer

--------------------------------------------------

Query (#5): predict the total order placed by mistake when the customer lives in a busy location for the next week
Ground Truth (filter): RESIDENCE IN BUSY LOCATION

Annotation 1: [{'text': 'customer lives in a busy location', 'confidence': 0.9999967018763224}]
Annotation 2: [{'text': 'lives in', 'confidence': 0.7846168577671051}, {'text': 'busy location', 'confidence': 0.962617427110672}]

Configuration 1 took 40 attempt(s) to get the correct answer
Configuration 2 took 40 attempt(s) to get the correct answer

--------------------------------------------------

Query (#6): predict the maximum order placed by mistake where the age of a customer is less than 21 for the next month
Ground Truth (filter): AGE

Annotation 1: [{'text': 'age of a customer', 'confidence': 0.9999995976686478}]
Annotation 2: [{'text': 'age of a customer', 'confidence': 0.9833278208971024}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#7): predict the total missing item where the gender of a customer is female for the next month
Ground Truth (filter): GENDER

Annotation 1: [{'text': 'gender of a customer is female', 'confidence': 0.9999458690484365}]
Annotation 2: [{'text': 'gender of a customer', 'confidence': 0.9746774286031723}, {'text': 'female', 'confidence': 0.9190409779548645}]

Configuration 1 took 2 attempt(s) to get the correct answer
Configuration 2 took 2 attempt(s) to get the correct answer

--------------------------------------------------

Query (#8): predict the total order placed by mistake where the occupation of the customer is job holder for tomorrow
Ground Truth (filter): OCCUPATION

Annotation 1: [{'text': 'occupation of the customer is job holder for', 'confidence': 0.9979017600417137}]
Annotation 2: [{'text': 'occupation of the customer', 'confidence': 0.9995979815721512}, {'text': 'job holder', 'confidence': 0.9996568560600281}]

Configuration 1 took 4 attempt(s) to get the correct answer
Configuration 2 took 4 attempt(s) to get the correct answer

--------------------------------------------------

Query (#9): predict the average accurately located on google maps order where the customers education qualification is high within next month
Ground Truth (filter): EDUCATIONAL QUALIFICATIONS

Annotation 1: [{'text': 'education qualification', 'confidence': 0.9999984800815582}, {'text': 'high', 'confidence': 0.9901666641235352}]
Annotation 2: [{'text': 'education qualification', 'confidence': 0.9684075117111206}, {'text': 'high', 'confidence': 0.7101011872291565}]

Configuration 1 took 6 attempt(s) to get the correct answer
Configuration 2 took 6 attempt(s) to get the correct answer

--------------------------------------------------

Query (#10): predict the total order with good food quality where the family size is more than four
Ground Truth (filter): FAMILY SIZE

Annotation 1: [{'text': 'family size', 'confidence': 0.9999759793281555}]
Annotation 2: [{'text': 'family size', 'confidence': 0.9984731376171112}]

Configuration 1 took 7 attempt(s) to get the correct answer
Configuration 2 took 7 attempt(s) to get the correct answer

--------------------------------------------------

Query (#11): predict the total order with medium preference is online where age of customer is more than 40 within next month
Ground Truth (filter): AGE

Annotation 1: [{'text': 'age of customer', 'confidence': 0.9999996821085612}]
Annotation 2: [{'text': 'age of customer', 'confidence': 0.9979022741317749}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#12): predict the total order with medium preference is online where type of meal preferred is family size within next month
Ground Truth (filter): MEAL(P1)

Annotation 1: [{'text': 'type of meal', 'confidence': 0.9999966025352478}, {'text': 'family size', 'confidence': 0.9982550144195557}]
Annotation 2: [{'text': 'type of meal', 'confidence': 0.9998167753219604}, {'text': 'family size', 'confidence': 0.9999593198299408}]

Configuration 1 took 13 attempt(s) to get the correct answer
Configuration 2 took 13 attempt(s) to get the correct answer

--------------------------------------------------

Query (#13): predict the total order where the preferred cuisine is Indian for the next week
Ground Truth (filter): PERFERENCE(P1)

Annotation 1: [{'text': 'preferred cuisine', 'confidence': 0.8306872248649597}, {'text': 'Indian', 'confidence': 0.9996103644371033}]
Annotation 2: [{'text': 'cuisine', 'confidence': 0.9963275790214539}, {'text': 'Indian', 'confidence': 0.7237054705619812}]

Configuration 1 took 15 attempt(s) to get the correct answer
Configuration 2 took 15 attempt(s) to get the correct answer

--------------------------------------------------

Query (#14): predict the total order where there is ease of online order for the next week
Ground Truth (filter): EASE AND CONVENIENT

Annotation 1: [{'text': 'ease of online order', 'confidence': 0.9999985843896866}]
Annotation 2: [{'text': 'ease of online order', 'confidence': 0.9887582212686539}]

Configuration 1 took 17 attempt(s) to get the correct answer
Configuration 2 took 17 attempt(s) to get the correct answer

--------------------------------------------------

Query (#15): predict the total order where the number of choice of restaurant is more than three for the next week
Ground Truth (filter): MORE RESTAURANT CHOICES

Annotation 1: [{'text': 'number of choice of restaurant', 'confidence': 0.9999985098838806}]
Annotation 2: [{'text': 'number of choice of restaurant', 'confidence': 0.9661942839622497}]

Configuration 1 took 19 attempt(s) to get the correct answer
Configuration 2 took 19 attempt(s) to get the correct answer

--------------------------------------------------

Query (#16): predict the maximum order with easy payment option where the preferred cuisine is Chinese for the next week
Ground Truth (filter): PERFERENCE(P1)

Annotation 1: [{'text': 'preferred cuisine', 'confidence': 0.8989270627498627}, {'text': 'Chinese', 'confidence': 0.9999204874038696}]
Annotation 2: [{'text': 'preferred cuisine', 'confidence': 0.7719434499740601}]

Configuration 1 took 15 attempt(s) to get the correct answer
Configuration 2 took 15 attempt(s) to get the correct answer

--------------------------------------------------

Query (#17): predict the total order with offers and discounts where the age of a customer is less than 21 for the next month
Ground Truth (filter): AGE

Annotation 1: [{'text': 'age of a customer', 'confidence': 0.9999996423721313}]
Annotation 2: [{'text': 'age of a customer', 'confidence': 0.9999962747097015}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#18): predict the total order with good food quality where age of customer is more than 40 within next month
Ground Truth (filter): AGE

Annotation 1: [{'text': 'age of customer is', 'confidence': 0.999992623925209}]
Annotation 2: [{'text': 'age of customer', 'confidence': 0.9996738831202189}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#19): predict the total order delivered with good tracking system where age of customer is more than 30 within next month
Ground Truth (filter): AGE

Annotation 1: [{'text': 'age of customer is', 'confidence': 0.9999961405992508}]
Annotation 2: [{'text': 'age of customer', 'confidence': 0.9982140858968099}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#20): predict the maximum reduction of order for self cooking where the family size is more than four within next month
Ground Truth (filter): FAMILY SIZE

Annotation 1: [{'text': 'family size', 'confidence': 0.9999961853027344}]
Annotation 2: [{'text': 'family size', 'confidence': 0.9999457001686096}]

Configuration 1 took 7 attempt(s) to get the correct answer
Configuration 2 took 7 attempt(s) to get the correct answer

--------------------------------------------------

Query (#21): predict the maximum reduction of order for health concern where the family size is more than four within next month
Ground Truth (filter): FAMILY SIZE

Annotation 1: [{'text': 'family size', 'confidence': 0.9999942183494568}]
Annotation 2: [{'text': 'family size', 'confidence': 0.9999627768993378}]

Configuration 1 took 7 attempt(s) to get the correct answer
Configuration 2 took 7 attempt(s) to get the correct answer

--------------------------------------------------

Query (#22): predict the average cancelled order for late delivery where the age of a customer is less than 21 for the next month
Ground Truth (filter): AGE

Annotation 1: [{'text': 'age of a customer', 'confidence': 0.9999995827674866}]
Annotation 2: [{'text': 'age of a customer', 'confidence': 0.9981803148984909}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#23): predict the total order reduced for poor hygiene where the occupation of the customer is job holder for tomorrow
Ground Truth (filter): OCCUPATION

Annotation 1: [{'text': 'occupation of the customer is job holder for', 'confidence': 0.9930545762181282}]
Annotation 2: [{'text': 'occupation of the customer', 'confidence': 0.999997079372406}, {'text': 'job holder', 'confidence': 0.9999291896820068}]

Configuration 1 took 4 attempt(s) to get the correct answer
Configuration 2 took 4 attempt(s) to get the correct answer

--------------------------------------------------

Query (#24): predict the total order reduced for bad past experience where the occupation of the customer is job holder for tomorrow
Ground Truth (filter): OCCUPATION

Annotation 1: [{'text': 'occupation of the customer is job holder for', 'confidence': 0.9967217370867729}]
Annotation 2: [{'text': 'occupation of the customer', 'confidence': 0.9999972283840179}, {'text': 'job holder', 'confidence': 0.9999347627162933}]

Configuration 1 took 4 attempt(s) to get the correct answer
Configuration 2 took 4 attempt(s) to get the correct answer

--------------------------------------------------

Query (#25): predict the total order reduced for unavailability where the family size is more than four within next month
Ground Truth (filter): FAMILY SIZE

Annotation 1: [{'text': 'family size', 'confidence': 0.9999986886978149}]
Annotation 2: [{'text': 'family size', 'confidence': 0.9897099137306213}]

Configuration 1 took 7 attempt(s) to get the correct answer
Configuration 2 took 7 attempt(s) to get the correct answer

--------------------------------------------------

Query (#26): predict the total order reduced for being unaffordable where the age of a customer is less than 21 for the next month
Ground Truth (filter): AGE

Annotation 1: [{'text': 'age of a customer', 'confidence': 0.9999997019767761}]
Annotation 2: [{'text': 'age of a customer', 'confidence': 0.9999983757734299}]

Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer

--------------------------------------------------

Query (#27): predict the total delay of delivery person getting assigned where the customers residence is in busy location for the next month
Ground Truth (filter): RESIDENCE IN BUSY LOCATION

Annotation 1: [{'text': 'residence is in busy location', 'confidence': 0.9999995946884155}]
Annotation 2: [{'text': 'residence', 'confidence': 0.9192526340484619}, {'text': 'in busy location', 'confidence': 0.9998436172803243}]

Configuration 1 took 40 attempt(s) to get the correct answer
Configuration 2 took 40 attempt(s) to get the correct answer

--------------------------------------------------

Query (#28): predict the total delay of delivery person picking up food where the customers residence is in busy location for the next month
Ground Truth (filter): RESIDENCE IN BUSY LOCATION

Annotation 1: [{'text': 'residence is in busy location', 'confidence': 0.9999996066093445}]
Annotation 2: [{'text': 'residence', 'confidence': 0.9871556162834167}, {'text': 'in busy location', 'confidence': 0.9999651312828064}]

Configuration 1 took 40 attempt(s) to get the correct answer
Configuration 2 took 40 attempt(s) to get the correct answer

--------------------------------------------------

Query (#29): predict the total delay of delivery person getting assigned where the customers location on google map is not accurate for the next month
Ground Truth (filter): GOOGLE MAPS ACCURACY

Annotation 1: [{'text': 'customers location', 'confidence': 0.9999672770500183}, {'text': 'not accurate', 'confidence': 0.9999375939369202}]
Annotation 2: [{'text': 'customers location', 'confidence': 0.9555194079875946}, {'text': 'google map', 'confidence': 0.9941354840993881}, {'text': 'not accurate', 'confidence': 0.9972830712795258}]

Configuration 1 took 41 attempt(s) to get the correct answer
Configuration 2 took 41 attempt(s) to get the correct answer

--------------------------------------------------

Query (#30): predict the total delay of delivery person picking up food where the delivery persons ability is not good for the next month
Ground Truth (filter): DELIVERY PERSON ABILITY

Annotation 1: [{'text': 'delivery persons ability is not good', 'confidence': 0.9999995231628418}]
Annotation 2: [{'text': 'delivery persons ability', 'confidence': 0.989648719628652}, {'text': 'not good', 'confidence': 0.9900850355625153}]

Configuration 1 took 44 attempt(s) to get the correct answer
Configuration 2 took 44 attempt(s) to get the correct answer

--------------------------------------------------

Query (#31): predict the total order with good reviews where the road condition is good for the next month
Ground Truth (filter): GOOD ROAD CONDITION

Annotation 1: [{'text': 'road condition is good', 'confidence': 0.9999995827674866}]
Annotation 2: [{'text': 'road condition', 'confidence': 0.9999964237213135}, {'text': 'good', 'confidence': 0.9897237420082092}]

Configuration 1 took 42 attempt(s) to get the correct answer
Configuration 2 took 42 attempt(s) to get the correct answer

--------------------------------------------------

Query (#32): predict the total order with good quantity where the customers residence is in busy location for the next month
Ground Truth (filter): RESIDENCE IN BUSY LOCATION

Annotation 1: [{'text': 'customers residence is in busy location', 'confidence': 0.9742420415083567}]
Annotation 2: [{'text': 'residence', 'confidence': 0.9663198590278625}, {'text': 'in busy location', 'confidence': 0.997384786605835}]

Configuration 1 took 40 attempt(s) to get the correct answer
Configuration 2 took 40 attempt(s) to get the correct answer

--------------------------------------------------

Query (#33): predict the total order with good quantity where the family size is more than four within next month
Ground Truth (filter): FAMILY SIZE

Annotation 1: [{'text': 'family size', 'confidence': 0.9999886751174927}]
Annotation 2: [{'text': 'family size', 'confidence': 0.9995880722999573}]

Configuration 1 took 7 attempt(s) to get the correct answer
Configuration 2 took 7 attempt(s) to get the correct answer

--------------------------------------------------

Query (#34): predict the maximum order time where the road condition is good for the next month
Ground Truth (filter): GOOD ROAD CONDITION

Annotation 1: [{'text': 'road condition is good', 'confidence': 0.9999996274709702}]
Annotation 2: [{'text': 'road condition', 'confidence': 0.9999909996986389}, {'text': 'good', 'confidence': 0.9857608675956726}]

Configuration 1 took 42 attempt(s) to get the correct answer
Configuration 2 took 42 attempt(s) to get the correct answer

--------------------------------------------------

Query (#35): predict the maximum order time where the road condition is not good for the next month
Ground Truth (filter): GOOD ROAD CONDITION

Annotation 1: [{'text': 'road condition is not good', 'confidence': 0.9999996542930603}]
Annotation 2: [{'text': 'road condition', 'confidence': 0.9999905228614807}, {'text': 'not good', 'confidence': 0.9299094378948212}]

Configuration 1 took 42 attempt(s) to get the correct answer
Configuration 2 took 42 attempt(s) to get the correct answer

--------------------------------------------------

Query (#36): predict the maximum order time where the quantity is low and delivery time is low for the next month
Ground Truth (filter): LOW QUANTITY LOW TIME

Annotation 1: [{'text': 'low and delivery time is low', 'confidence': 0.9993075331052145}]
Annotation 2: [{'text': 'quantity', 'confidence': 0.9998922348022461}, {'text': 'low', 'confidence': 0.9996652007102966}, {'text': 'delivery time', 'confidence': 0.9993226230144501}]

Configuration 1 took 43 attempt(s) to get the correct answer
Configuration 2 took 43 attempt(s) to get the correct answer

--------------------------------------------------

Query (#37): predict the average number of calls made by delivery captain where the delivery persons ability is not good for the next month
Ground Truth (filter): DELIVERY PERSON ABILITY

Annotation 1: [{'text': 'delivery persons ability is not good', 'confidence': 0.9999994238217672}]
Annotation 2: [{'text': 'delivery persons ability', 'confidence': 0.9947405060132345}, {'text': 'not good', 'confidence': 0.9882488250732422}]

Configuration 1 took 44 attempt(s) to get the correct answer
Configuration 2 took 44 attempt(s) to get the correct answer

--------------------------------------------------

Query (#38): predict the average number of calls made by delivery captain where the customers location on google map is not accurate for the next month
Ground Truth (filter): GOOGLE MAPS ACCURACY

Annotation 1: [{'text': 'customers location', 'confidence': 0.9999978244304657}, {'text': 'not accurate', 'confidence': 0.9999674260616302}]
Annotation 2: [{'text': 'location', 'confidence': 0.9983050227165222}, {'text': 'google map', 'confidence': 0.9963788092136383}, {'text': 'not accurate', 'confidence': 0.9952478408813477}]

Configuration 1 took 41 attempt(s) to get the correct answer
Configuration 2 took 41 attempt(s) to get the correct answer

--------------------------------------------------

Query (#39): predict the average number of calls made by delivery captain where the customers residence is in busy location for the next month
Ground Truth (filter): RESIDENCE IN BUSY LOCATION

Annotation 1: [{'text': 'residence is in busy location', 'confidence': 0.9999995708465577}]
Annotation 2: [{'text': 'residence', 'confidence': 0.9432263374328613}, {'text': 'in busy location', 'confidence': 0.999949594338735}]

Configuration 1 took 40 attempt(s) to get the correct answer
Configuration 2 took 40 attempt(s) to get the correct answer

--------------------------------------------------

Query (#40): I want to know the average temperature of food where the quantity is low and delivery time is low for the next month
Ground Truth (filter): LOW QUANTITY LOW TIME

Annotation 1: [{'text': 'quantity', 'confidence': 0.7386093139648438}, {'text': 'low and delivery time is low', 'confidence': 0.9992154240608215}]
Annotation 2: [{'text': 'quantity', 'confidence': 0.9999420046806335}, {'text': 'low', 'confidence': 0.9999933838844299}, {'text': 'delivery time', 'confidence': 0.9999749958515167}, {'text': 'low', 'confidence': 0.9952304363250732}]

Configuration 1 took 43 attempt(s) to get the correct answer
Configuration 2 took 43 attempt(s) to get the correct answer

--------------------------------------------------

Query (#41): I want to know the average temperature of food where the customers residence area road condition is good for the next month
Ground Truth (filter): GOOD ROAD CONDITION

Annotation 1: [{'text': 'residence area road condition is good', 'confidence': 0.9999995132287344}]
Annotation 2: [{'text': 'customers residence area road condition', 'confidence': 0.9811298847198486}, {'text': 'good', 'confidence': 0.9994962215423584}]

Configuration 1 took 42 attempt(s) to get the correct answer
Configuration 2 took 42 attempt(s) to get the correct answer

--------------------------------------------------

Query (#42): predict the total order increased for politeness where the occupation of the customer is job holder for tomorrow
Ground Truth (filter): OCCUPATION

Annotation 1: [{'text': 'occupation of the customer is job holder for', 'confidence': 0.9980372935533524}]
Annotation 2: [{'text': 'occupation of the customer', 'confidence': 0.9997764825820923}, {'text': 'job holder', 'confidence': 0.9998636245727539}]

Configuration 1 took 4 attempt(s) to get the correct answer
Configuration 2 took 4 attempt(s) to get the correct answer

--------------------------------------------------

Query (#43): predict the total order increased for freshness of food where the occupation of the customer is job holder for tomorrow
Ground Truth (filter): OCCUPATION

Annotation 1: [{'text': 'occupation of the customer is job holder for', 'confidence': 0.9992124512791634}]
Annotation 2: [{'text': 'occupation of the customer', 'confidence': 0.9999977946281433}, {'text': 'job holder', 'confidence': 0.9999924302101135}]

Configuration 1 took 4 attempt(s) to get the correct answer
Configuration 2 took 4 attempt(s) to get the correct answer

--------------------------------------------------

Query (#44): predict the total order increased for freshness of food where the customers monthly income is greater than two thousand for tomorrow
Ground Truth (filter): MONTHLY INCOME

Annotation 1: [{'text': 'monthly income', 'confidence': 0.7964674234390259}]
Annotation 2: [{'text': 'monthly income', 'confidence': 0.9999953508377075}]

Configuration 1 took 5 attempt(s) to get the correct answer
Configuration 2 took 5 attempt(s) to get the correct answer

--------------------------------------------------

Query (#45): predict the total order increased for good taste of food where the customers monthly income is greater than two thousand for tomorrow
Ground Truth (filter): MONTHLY INCOME

Annotation 1: [{'text': 'monthly income', 'confidence': 0.850880891084671}]
Annotation 2: [{'text': 'monthly income', 'confidence': 0.9999956786632538}]

Configuration 1 took 5 attempt(s) to get the correct answer
Configuration 2 took 5 attempt(s) to get the correct answer

--------------------------------------------------
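
Note: the entries above follow a fixed line layout (Query, Ground Truth, Configuration N took M attempt(s)), so they can be summarized mechanically. The following is a minimal sketch of a parser for that layout; the function names `parse_log` and `over_threshold` and the embedded sample are illustrative and not part of the tool that produced this log.

```python
import re

QUERY_RE = re.compile(r"^Query \(#(\d+)\): (.*)$")
TRUTH_RE = re.compile(r"^Ground Truth \(filter\): (.*)$")
ATTEMPT_RE = re.compile(r"^Configuration (\d+) took (\d+) attempt\(s\)")

def parse_log(text):
    """Collect one dict per query: id, query text, ground truth, attempts per configuration."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if m := QUERY_RE.match(line):
            current = {"id": int(m.group(1)), "query": m.group(2),
                       "truth": None, "attempts": {}}
            entries.append(current)
        elif current is not None and (m := TRUTH_RE.match(line)):
            current["truth"] = m.group(1).strip()
        elif current is not None and (m := ATTEMPT_RE.match(line)):
            current["attempts"][int(m.group(1))] = int(m.group(2))
    return entries

def over_threshold(entries, threshold=3):
    """Keep entries where any configuration needed more than `threshold` attempts."""
    return [e for e in entries if any(n > threshold for n in e["attempts"].values())]

# Two entries copied from this log, with annotation lines omitted.
sample = """\
Query (#6): predict the maximum order placed by mistake where the age of a customer is less than 21 for the next month
Ground Truth (filter): AGE
Configuration 1 took 1 attempt(s) to get the correct answer
Configuration 2 took 1 attempt(s) to get the correct answer
--------------------------------------------------
Query (#10): predict the total order with good food quality where the family size is more than four
Ground Truth (filter): FAMILY SIZE
Configuration 1 took 7 attempt(s) to get the correct answer
Configuration 2 took 7 attempt(s) to get the correct answer
"""

entries = parse_log(sample)
for e in over_threshold(entries):
    print(e["id"], e["truth"], e["attempts"])
```

With the default threshold of 3, query #6 (1 attempt) is filtered out and query #10 (7 attempts) is kept.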

