Abstract: The intersection of AI and legal systems creates a growing need for tools that support legal education, particularly in under-resourced languages such as Romanian. In this work, we evaluate the capabilities of Large Language Models (LLMs) and Vision-Language Models (VLMs) in understanding and reasoning about Romanian driving law through textual and visual question-answering tasks. To facilitate this, we introduce RoD-TAL, a novel multimodal dataset comprising text-based and image-based Romanian driving test questions, alongside annotated legal references and human explanations. We implement and assess retrieval-augmented generation (RAG) pipelines, dense retrievers, and reasoning-optimized models across tasks including Information Retrieval (IR), Question Answering (QA), Visual IR, and Visual QA. Our experiments demonstrate that domain-specific fine-tuning significantly enhances retrieval performance, while chain-of-thought prompting and specialized reasoning models improve QA accuracy, surpassing the minimum grades required to pass driving exams. However, visual reasoning remains challenging, highlighting both the potential and the limitations of applying LLMs and VLMs to legal education.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: language resources, datasets for low resource languages, multimodal QA
Contribution Types: Approaches to low-resource settings, Data resources
Languages Studied: Romanian
Reassignment Request Area Chair: This is not a resubmission
Reassignment Request Reviewers: This is not a resubmission
Software: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: Yes
A2 Elaboration: Risks and Ethical Considerations
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: 3.2, 4.2, B.1, B.4
B2 Discuss The License For Artifacts: No
B2 Elaboration: The code will be released under MIT. The data used in this work is publicly accessible (e.g., the Romanian legislation).
B3 Artifact Use Consistent With Intended Use: Yes
B3 Elaboration: Existing artifacts: 4.1, 4.2, B.4; created artifacts: 3.1, 3.2
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B4 Elaboration: The data does not contain personally identifiable information.
B5 Documentation Of Artifacts: Yes
B5 Elaboration: 3, A
B6 Statistics For Data: Yes
B6 Elaboration: 3, A
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: B.1
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: 4, B
C3 Descriptive Statistics: Yes
C3 Elaboration: F
C4 Parameters For Packages: Yes
C4 Elaboration: B, supplemental materials
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 817