﻿paper_name,data_description,natural_language_query,answer,std_error,is_significant,method,treatment,outcome,covariates,running_var,temporal_var,instrument_var,state_var,interacting_variable,multirct_treatment,data_files
QRData,"The CSV file ihdp_0.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",Does home visit from specialist doctors lead to an improvement in cognitive scores?,4.02,0.11,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_0.csv
QRData,"The CSV file ihdp_1.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,4.05,0.123,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_1.csv
QRData,"The CSV file ihdp_2.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,4.1,0.13,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_2.csv
QRData,"The CSV file ihdp_3.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,4.27,0.162,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_3.csv
QRData,"The CSV file ihdp_4.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,4.16,0.156,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_4.csv
QRData,"The CSV file ihdp_5.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,4,0.131,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_5.csv
QRData,"The CSV file ihdp_6.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,3.99,0.13,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_6.csv
QRData,"The CSV file ihdp_7.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,3.85,0.153,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_7.csv
QRData,"The CSV file ihdp_8.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,10.47,1.234,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_8.csv
QRData,"The CSV file ihdp_9.csv contains data obtained from the Infant Health and Development Program (IHDP). The randomized study is designed to evaluate the effect of home visit from specialist doctors on the cognitive test scores of premature infants. The confounders x (x1-x25) correspond to collected measurements of the children and their mothers, including measurements on the child (birth weight, head circumference, weeks born preterm, birth order, first born, neonatal health index, sex, twin status), as well as behaviors engaged in during the pregnancy (smoked cigarettes, drank alcohol, took drugs) and measurements on the mother at the time she gave birth (age, marital status, educational attainment, whether she worked during pregnancy, whether she received prenatal care) and the site (8 total) in which the family resided at the start of the intervention. There are 6 continuous covariates and 19 binary covariates.",What is the effect of home visits on the cognitive test scores of children who actually received the intervention?,4.59,0.4,1,ols,treatment,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16, x17, x18, x19, x20, x21, x22, x23, x24, x25",,,,,,,ihdp_9.csv
QRData,"To estimate the impacts of online class format on exam outcomes, a randomized experiment was conducted. The dataset online_classroom.csv contains exam scores of students in online classes versus face-to-face classes. Each row of the dataset contains a student's exam outcome (the variable falsexam), their classroom format (online, face-to-face, or blended), and other variables like gender and ethnicity.",Is there any advantage to taking classes online in comparison to face-to-face or blended format in terms of exam performance? ,-4.91,1.68,1,ols,format_ol,false_exam,,,,,,,,online_classroom.csv
QRData,"To estimate the impact of an additional year of education on hourly wage, we look at a sample of individuals with varying levels of education and their hourly wages. The dataset wage.csv contains the necessary data for our analysis. The columns in this dataset include 'wage', representing the total income; 'hours', representing the total hours worked; 'educ', representing the years of education; and 'lhwage', representing the logarithm of the hourly wage. Other variables in the dataset are:
feduc: father's education, meduc: mother's education, brthord: birth order in the family, sibs: number of siblings, urban: whether the person lives in an urban or rural area, south: whether the person is from the south, black: if the person's race is black, married: whether the person is married or not, age: age of the person, tenure: the tenure of the person in the company where they are working, exper: number of years of work experience, IQ: the person's IQ",Do more years of education lead to higher log hourly wages?,0.0529,0.007,1,ols,educ,lhwage,,,,,,,,wage.csv
QRData,"To estimate the impact of an additional year of education on hourly wage, we look at a sample of individuals with varying levels of education and their hourly wages. The dataset wage.csv contains the necessary data for our analysis. The columns in this dataset include 'wage', representing the total income; 'hours', representing the total hours worked; 'educ', representing the years of education; and 'lhwage', representing the logarithm of the hourly wage. Other variables in the dataset are:
feduc: father's education, meduc: mother's education, brthord: birth order in the family, sibs: number of siblings, urban: whether the person lives in an urban or rural area, south: whether the person is from the south, black: if the person's race is black, married: whether the person is married or not, age: age of the person, tenure: the tenure of the person in the company where they are working, exper: number of years of work experience, IQ: the person's IQ",Does an increase in years of education help increase log hourly wages when also considering other variables of interest?,0.0411,0.01,1,ols,educ,lhwage,"IQ, exper, tenure, age, married, black, south, urban, sibs, brthord, meduc, feduc",,,,,,,wage.csv
QRData,"To estimate the impact of an additional year of education on hourly wage, we look at a sample of individuals with varying levels of education and their hourly wages. The dataset wage.csv contains the necessary data for our analysis. The columns in this dataset include 'wage', representing the total income; 'hours', representing the total hours worked; 'educ', representing the years of education; and other variables like the parents' education and the person's IQ score.",What's the effect of graduating 12th grade on hourly wage?,4.9,0.626,1,ols,passed_12,wage,,,,,,,,wage.csv
QRData,"In a study to determine the causal effect of sending an email reminder on the repayment of debts, a fintech company conducted a random test involving 5000 customers who were late on their payments. Each customer was randomly assigned to either receive an email about negotiating their debt or to be part of a control group that did not receive the email. Data was collected on the amounts paid by the late customers after this intervention. The dataset collections_email.csv contains variables including the amount paid (`payments`), whether the email was sent (`email`), whether the email was opened (`opened`), whether the customer contacted the collections department to negotiate their debt after having received the email (`agreement`), the customer's credit limit before being late (`credit_limit`), and the customer's risk score prior to the delivery of the email (`risk_score`).",Does sending reminder emails have an impact on the repayment of debts while including suitable controls?,4.43,2.13,1,ols,email,payments,"credit_limit, risk_score",,,,,,,collections_email.csv
QRData,"The dataset hospital_treatment.csv includes data from a randomized trial of a new drug to treat a certain illness. The outcome of interest is days hospitalized. If the treatment is effective, it will lower the number of days the patient stays in the hospital. The CSV file contains columns for `hospital` indicating the hospital a patient belongs to, `treatment` signifying if the patient received the new drug or placebo, `severity` reflecting the severity of the illness, and `days` representing the number of days the patient was hospitalized.",Is the new drug effective in reducing the hospital stay?,-7.59,2.269,1,ols,treatment,days,severity,,,,,,,hospital_treatment.csv
QRData,"The dataset provided, ak91.csv, contains information on individuals' log wages, years of schooling, year of birth, quarter of birth, and state of birth. The purpose of the analysis is to estimate the effect of education on wage by taking advantage of US compulsory attendance laws. These laws usually state that a child must have turned 6 by January 1 of the year they enter school. For this reason, children born at the beginning of the year enter school at an older age. Compulsory attendance laws also require students to be in school until they turn 16, at which point they are legally allowed to drop out. The result is that people born later in the year have, on average, more years of education than those born at the beginning of the year.",How much more does a person earn for each extra year of education?,0.0853,0.0255,1,iv,years_of_schooling,log_wage,,,,quarter_of_birth,,,,ak91.csv
QRData,"A study is conducted to measure the effect of a marketing push on user engagement, specifically in-app purchases. Some customers who were assigned to receive the push are not receiving it, because they probably have an older phone that doesn't support the kind of push the marketing team designed. The dataset app_engagement_push.csv contains records for 10,000 random customers. Each record includes whether an in-app purchase was made (in_app_purchase), if a marketing push was assigned to the user (push_assigned), and if the marketing push was successfully delivered (push_delivered).",Did the marketing push help increase in-app purchases?,3.29,0.7165,1,iv,push_delivered,in_app_purchase,,,,push_assigned,,,,app_engagement_push.csv
QRData,"To investigate the effect of a medication on the number of days it takes for a patient to recover from an illness, we have a dataset that includes several confounding variables like severity, sex, and age. The dataset medicine_impact_recovery.csv contains data on patients who have been prescribed medication and those who haven't. The variables include sex (0 or 1), age, the severity of the condition, whether the patient was on medication (0 or 1), and the number of days it took for each patient to recover.",What is the effect of the medication on the recovery time?,-7.7,0.606,1,matching,medication,recovery,"severity, age, sex",,,,,,,medicine_impact_recovery.csv
QRData,"The National Study of Learning Mindsets is a study conducted in U.S. public high schools which aims at finding the impact of a growth mindset. The way it works is that students receive from the school a seminar to instill in them a growth mindset. Then, they follow up the students in their college years to measure how well they’ve performed academically. This measurement was compiled into an achievement score and standardized. The CSV file learning_mindset.csv contains data of this observational study.

Variable Description
intervention: the intervention of the growth mindset;
achievement_score: the standardized academic achievement score;
schoolid: identifier of the student’s school;
success_expect: self-reported expectations for success in the future, a proxy for prior achievement, measured prior to random assignment;
ethnicity: categorical variable for student race/ethnicity;
gender: categorical variable for student identified gender;
frst_in_family: categorical variable for student first-generation status, i.e. first in family to go to college;
school_urbanicity: school-level categorical variable for urbanicity of the school, i.e. rural, suburban, etc;
school_mindset: school-level mean of students’ fixed mindsets, reported prior to random assignment, standardized;
school_achievement: school achievement level, as measured by test scores and college preparation for the previous 4 cohorts of students, standardized;
school_ethnic_minority: school racial/ethnic minority composition, i.e., percentage of student body that is Black, Latino, or Native American, standardized;
school_poverty: school poverty concentration, i.e., percentage of students who are from families whose incomes fall below the federal poverty line, standardized;
school_size: total number of students in all four grade levels in the school, standardized.",Does participating in seminars meant for boosting growth mindset lead to better academic achievement?,0.389,0.017,1,matching,intervention,achievement_score,"ethnicity, gender, school_urbanicity",,,,,,,learning_mindset.csv
QRData,"The dataset billboard_impact.csv details information from a quasi-experiment assessing the influence of billboards on bank deposits in two cities: Porto Alegre (treatment group) and Florianopolis (control group). The csv file contains records with three variables: deposits (average bank deposits in Brazilian Reais), poa (A dummy indicator for the city of Porto Alegre. When it is zero, it means the samples are from Florianopolis.), and jul (A dummy for the month of July, or for the post intervention period. When it is zero it refers to samples from May, the pre-intervention period)",Does using billboards lead to higher bank deposits?,6.52,5.73,0,did,poa,deposits,,,jul,,,,,billboard_impact.csv
QRData,"To estimate the impacts of alcohol on death, we could use legal drinking age laws. In the US, those just under 21 years don't drink (or drink much less) while those just older than 21 can drink. The csv file drinking.csv contains mortality data aggregated by age. The dataset contains the following variables:

agecell: the average age across the given cohort 
all: the average mortality across all causes 
mva: average mortality by motor vehicle accidents 
suicide: average mortality by suicide 
homicide: average mortality by homicide 
drugs: average mortality by drugs ","By how much does turning 21, the legal drinking age, affect the risk of death from any cause?",7.66,1.319,1,rdd,NA,all,,age,,,,,,drinking.csv
QRData,"To estimate the effect of cigarette taxation on its consumption, data from cigarette sales were collected and analyzed across 39 states in the United States from the years 1970 to 2000. Proposition 99, a Tobacco Tax and Health Protection Act passed in California in 1988, imposed a 25-cent per pack state excise tax on tobacco cigarettes and implemented additional restrictions, including the ban on cigarette vending machines in public areas accessible by juveniles and a ban on the individual sale of single cigarettes. Revenue generated was allocated for environmental and health care programs along with anti-tobacco advertising. We aim to determine if the imposition of this tax and the subsequent regulations led to a reduction in cigarette sales. The data is in the CSV file smoking2.csv. The columns represent:

state: ID of the state
year: the year
cigsale: the total cigarette sales
lnincome: the logarithm of the average income
beer: the per capita beer consumption in that state
age15to24: proportion of people in the age group 15 - 24
retprice: the retail price of the cigarettes
california: whether or not the state is California, where the Act was passed
after_treatment: whether the data is from the years after 1988, the year the Act was passed
",Did Proposition 99 help reduce cigarette sales?,-27.35,10.91,1,did,california,cigsale,,,after_treatment,,,,,smoking2.csv
QRData,"We are trying to estimate the effect of a trainee program on earnings. Data in the CSV file trainee_unique_on_age.csv contains the following columns:

unit: trainee unit 
trainee: trainee status i.e. whether or not the unit participated 
age: the age of the unit 
earnings: earnings of the unit",How much more did people who joined the trainee program earn compared to similar people who didn’t?,2457.9,2945.745,0,matching,trainee,earnings,age,,,,,,,trainee_unique_on_age.csv
QRData,"We consider how much politicians can increase their personal wealth due to holding office. Scholars investigated this question by analyzing members of Parliament (MPs) in the United Kingdom. The authors of the original study collected information about personal wealth at the time of death for several hundred competitive candidates who ran for office in general elections between 1950 and 1970. The data are contained in the CSV file MPs.csv.

Variable Description
surname: surname of the candidate
firstname: first name of the candidate
party: party of the candidate (labour or tory)
gross: log gross wealth at the time of death
net: log net wealth at the time of death
margin.pre: margin of the candidate's party in the previous election
region: electoral region
margin: margin of victory centered around 0 (vote share)  
nyob: year of birth of the candidate
yod: year of death of the candidate",What is the effect of becoming a member of Parliament on the net (log) wealth for Tory candidates?,1.37,0.288,1,rdd,NA,net,,margin,,,,,,MPs.csv
QRData,"Researchers conducted a randomized policy experiment in India where, since the mid-1990s, one-third of village council heads have been randomly reserved for female politicians. The CSV data set women.csv contains a subset of this data from West Bengal. The policy was implemented at the level of government called Gram Panchayat or GP. Each GP contains many villages. For this study, two villages were selected at random within each GP for detailed data collection. Each observation in the data set represents a village and there are two villages associated with each GP.

Variable Description
GP: identifier for the Gram Panchayat (GP)
village: identifier for each village
reserved: binary variable indicating whether the GP was reserved for women leaders or not
female: binary variable indicating whether the GP had a female leader or not
irrigation: variable measuring the number of new or repaired irrigation facilities in the village since the reservation policy started
water: variable measuring the number of new or repaired drinking water facilities in the village since the reservation policy started",Did the reservation policy lead to more new or repaired drinking water facilities in villages since it was implemented?,9.25,3.948,1,ols,reserved,water,,,,,,,,women.csv
QRData,"Three social scientists conducted an RCT in which they investigated whether social pressure within neighborhoods increases participation. Specifically, during a primary election in the state of Michigan, they randomly assigned registered voters to receive different get-out-the-vote (GOTV) messages and examined whether sending postcards with these messages increased turnout. The researchers exploited the fact that the turnout of individual voters is public information in the United States. The GOTV message of particular interest was designed to induce social pressure by telling voters that after the election, their neighbors would be informed about whether they voted in the election or not. The researchers hypothesized that such a naming and shaming GOTV strategy would increase participation.

There are three treatment groups: Civic Duty, Hawthorne, and Neighbors. The Civic Duty group gets a message saying voting is their civic duty. The Hawthorne effect refers to the phenomenon where study subjects behave differently because they know they are being observed by researchers. Finally, the Neighbors group receives a message saying that the neighbors can become aware of whether they voted or not. The experiment also has a control group (Control), which consists of voters receiving no message. The researchers randomly assigned each voter to one of the four groups (3 types of treatment and 1 control) and examined whether the voter turnout was different across the groups.

The data is in the file social.csv.
Variable Description
hhsize: household size of the voter
messages: GOTV messages the voter received. It can be one of the three treatment groups: Civic Duty, Neighbors, Hawthorne, or the control group, Control. 
sex: sex of the voter (female or male)
yearofbirth: year of birth of the voter
primary2004: whether the voter voted in the 2004 primary election (1=voted, 0=abstained)
primary2006: whether the voter voted in the 2006 primary election or not (1=voted, 0=abstained)",What is the difference in the effect of the Neighbors message on whether the voter voted in the 2006 primary election between those who voted in the 2004 primary election and those who did not?,0.027,0.005,1,ols,treatment,primary2006,,,,,,primary2004,Neighbors,social.csv
QRData,"Three social scientists conducted an RCT in which they investigated whether social pressure within neighborhoods increases participation. Specifically, during a primary election in the state of Michigan, they randomly assigned registered voters to receive different get-out-the-vote (GOTV) messages and examined whether sending postcards with these messages increased turnout. The researchers exploited the fact that the turnout of individual voters is public information in the United States. The GOTV message of particular interest was designed to induce social pressure by telling voters that after the election, their neighbors would be informed about whether they voted in the election or not. The researchers hypothesized that such a naming and shaming GOTV strategy would increase participation.

There are three treatment groups: Civic Duty, Hawthorne, and Neighbors. The Civic Duty group gets a message saying voting is their civic duty. The Hawthorne effect refers to the phenomenon where study subjects behave differently because they know they are being observed by researchers. Finally, the Neighbors group receives a message saying that the neighbors can become aware of whether they voted or not. The experiment also has a control group (Control), which consists of voters receiving no message. The researchers randomly assigned each voter to one of the four groups (3 types of treatment and 1 control) and examined whether the voter turnout was different across the groups.

The data is in the file social.csv.
Variable Description
hhsize: household size of the voter
messages: GOTV messages the voter received. It can be one of the three treatment groups: Civic Duty, Neighbors, Hawthorne, or the control group, Control. 
sex: sex of the voter (female or male)
yearofbirth: year of birth of the voter
primary2004: whether the voter voted in the 2004 primary election (1=voted, 0=abstained)
primary2006: whether the voter voted in the 2006 primary election or not (1=voted, 0=abstained)",What is the effect of the Neighbors message on whether the voter voted in the 2006 primary election if the voter's age was 25 in 2006?,0.055,0.032,0,ols,treatment,primary2006,,,,,,age,Neighbors,social.csv
QRData,"Three social scientists conducted an RCT in which they investigated whether social pressure within neighborhoods increases participation. Specifically, during a primary election in the state of Michigan, they randomly assigned registered voters to receive different get-out-the-vote (GOTV) messages and examined whether sending postcards with these messages increased turnout. The researchers exploited the fact that the turnout of individual voters is public information in the United States. The GOTV message of particular interest was designed to induce social pressure by telling voters that after the election, their neighbors would be informed about whether they voted in the election or not. The researchers hypothesized that such a naming and shaming GOTV strategy would increase participation.

There are three treatment groups: Civic Duty, Hawthorne, and Neighbors. The Civic Duty group gets a message saying voting is their civic duty. The Hawthorne effect refers to the phenomenon where study subjects behave differently because they know they are being observed by researchers. Finally, the Neighbors group receives a message saying that the neighbors can become aware of whether they voted or not. The experiment also has a control group (Control), which consists of voters receiving no message. The researchers randomly assigned each voter to one of the four groups (3 types of treatment and 1 control) and examined whether the voter turnout was different across the groups.

The data is in the file social.csv.
Variable Description
hhsize: household size of the voter
messages: GOTV messages the voter received. It can be one of the three treatment groups: Civic Duty, Neighbors, Hawthorne, or the control group, Control.
sex: sex of the voter (female or male)
yearofbirth: year of birth of the voter
primary2004: whether the voter voted in the 2004 primary election (1=voted, 0=abstained)
primary2006: whether the voter voted in the 2006 primary election or not (1=voted, 0=abstained)",What is the effect of the Neighbors message on whether the voter voted in the 2006 primary election if the voter's age was 65 in 2006?,0.079,0.029,0,ols,treatment,primary2006,,,,,,age,Neighbors,social.csv
QRData,"The CSV file jobs_0.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.074,0.052,0,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_0.csv
QRData,"The CSV file jobs_1.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.025,0.052,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_1.csv
QRData,"The CSV file jobs_2.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.081,0.051,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_2.csv
QRData,"The CSV file jobs_3.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.11,0.052,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_3.csv
QRData,"The CSV file jobs_4.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.075,0.051,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_4.csv
QRData,"The CSV file jobs_5.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.097,0.054,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_5.csv
QRData,"The CSV file jobs_6.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.1,0.056,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_6.csv
QRData,"The CSV file jobs_7.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.075,0.052,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_7.csv
QRData,"The CSV file jobs_8.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.094,0.052,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_8.csv
QRData,"The CSV file jobs_9.csv contains observational data from the National Supported Work (NSW) program. The study is designed to evaluate the effect of job training (t) on post-training income and employment status (y).
The confounders (x0–x16) include baseline characteristics such as age, years of education, race/ethnicity (Black and Hispanic indicators), marital status (binary), and high school degree attainment (binary). They also include pre-treatment earnings from 1974 and 1975. In addition, several covariates represent transformations or interactions of these base variables, such as standardized earnings, squared terms, and zero-earnings indicators.",Does the job training program improve employment prospects?,0.073,0.049,,matching,t,y,"x1, x2, x3, x4, x5, x6, x7, x8, x9, x10, x11, x12, x13, x14, x15, x16",,,,,,,jobs_9.csv