Track: Security and privacy
Keywords: CAPTCHA, Visual Illusion, Large Language Model
Abstract: CAPTCHAs have long been essential tools for protecting applications from automated bots. Initially designed as simple questions to distinguish humans from bots, they have become increasingly complex to keep pace with the proliferation of CAPTCHA-cracking techniques employed by malicious actors. However, with the advent of advanced large language models (LLMs), the effectiveness of existing CAPTCHAs is now being undermined.
To address this issue, we have conducted an empirical study to evaluate the performance of multimodal LLMs in solving CAPTCHAs and to assess how many attempts human users typically need to pass them. Our findings reveal that while LLMs can solve most CAPTCHAs, they struggle with those requiring complex reasoning—a type of CAPTCHA that also presents significant challenges for human users. Interestingly, our user study showed that the majority of human participants required a second attempt to pass these reasoning CAPTCHAs, a finding not previously reported in existing research.
Based on the findings of our empirical study, we introduce IllusionCAPTCHA, an innovative approach designed to be "Human-Easy but AI-Hard". This new CAPTCHA employs visual illusions to create tasks that are intuitive for humans but highly confusing for AI models. Furthermore, we developed a structured, step-by-step method that guides LLMs toward making specific incorrect choices, thereby reducing their ability to bypass CAPTCHA systems successfully. Our evaluation shows that IllusionCAPTCHA can effectively deceive LLMs 100% of the time. Moreover, our structured design significantly increases the likelihood of AI errors when attempting to solve these challenges. Results from our user study indicate that 86.95% of participants successfully passed the CAPTCHA on their first attempt, outperforming other CAPTCHA systems.
Submission Number: 2317