MemoPhishAgent: Memory-Augmented Multi-Modal LLM Agent for Phishing URL Detection

Published: 18 Apr 2026, Last Modified: 22 Apr 2026, Venue: ACL 2026 Industry Track Poster, Readers: Everyone, License: CC BY 4.0
Keywords: Phishing detection; LLM agent; Multimodal; Cybersecurity
Abstract: Traditional phishing website detection relies on static heuristics or reference lists, which lag behind rapidly evolving attacks. While recent systems incorporate large language models (LLMs), they are still prompt-based, deterministic pipelines that underutilize the models' reasoning capability. We present MemoPhishAgent (MPA), a memory-augmented multi-modal LLM agent that dynamically orchestrates phishing-specific tools and leverages episodic memories of past reasoning trajectories to guide decisions on recurring and novel threats. On two public datasets, MPA outperforms three state-of-the-art (SOTA) baselines, improving recall by 13.6%. To better reflect realistic, user-facing phishing detection performance, we further evaluate MPA on a benchmark of real-world suspicious URLs actively crawled from five social media platforms, where it improves recall by 20%. Detailed analysis shows that episodic memory contributes up to a 27% recall gain without introducing additional computational overhead. The ablation study confirms the necessity of the agent-based approach compared to prompt-based baselines and validates the effectiveness of our tool design. Finally, MPA is deployed in production, processing ~60K targeted high-risk URLs weekly and achieving 91.44% recall, providing proactive protection for millions of customers. Together, our results show that combining multi-modal reasoning with episodic memory yields robust, adaptable phishing detection in realistic user-exposure settings. Our implementation is available at https://github.com/XuanChen-xc/MemoPhishAgent.git.
Submission Type: Deployed
Copyright Form: pdf
Submission Number: 288