Abstract: The proliferation of educational videos on the Internet has changed the educational landscape by enabling students to learn complex concepts at their own pace. Our work outlines the vision of an automated tutor: a multimodal question answering (QA) system that answers questions from students watching a video. Such a system can make doubt resolution faster and further improve the learning experience. In this work, we take first steps towards building such a QA system. We curate and release a dataset named EduVidQA, with 3,158 videos and 18,474 QA pairs. However, building and evaluating an educational QA system is challenging because (1) existing evaluation metrics do not correlate with human judgments, and (2) a student question can be answered in many different ways, so training on a single gold answer could confuse the model and degrade its performance. We conclude with important research questions to develop this research area further.
Paper Type: Short
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: vision question answering, video processing, multimodality
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Previous URL: https://openreview.net/forum?id=lT9xK9pfLp
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: No, I want the same area chair from our previous submission (subject to their availability).
Reassignment Request Reviewers: No, I want the same set of reviewers from our previous submission (subject to their availability).
Software: zip
Data: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Section 8
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Section 8
B2 Discuss The License For Artifacts: Yes
B2 Elaboration: Section 8
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Section 8
B4 Data Contains Personally Identifying Info Or Offensive Content: Yes
B4 Elaboration: Section 8
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Section 3
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 5.1
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 5.1
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 5.2
C4 Parameters For Packages: Yes
C4 Elaboration: Section 4
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: No
D1 Elaboration: Exactly the same as prompts discussed in Appendix A
D2 Recruitment And Payment: N/A
D3 Data Consent: Yes
D3 Elaboration: Section 8
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: Yes
E1 Elaboration: Appendix A
Author Submission Checklist: yes
Submission Number: 119