Abstract: Current interactive retrieval systems mostly rely on collecting users' positive and negative feedback and updating the retrieved results accordingly. However, such feedback alone is not always sufficient to express a user's retrieval intent accurately. Inspired by the strong language understanding capability of large language models (LLMs), we propose TalkSee, an interactive video retrieval engine that uses an LLM for interaction in order to better capture users' latent retrieval intentions. We use the LLM to turn positive and negative feedback into natural language interactions. Specifically, given the feedback, we leverage the LLM to generate questions, update the queries, and conduct re-ranking. Finally, we design a tailored interactive user interface (UI) around this method for more efficient and effective video retrieval.
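As an illustration only, one round of the interaction described above (feedback, then an LLM-generated question, then a query update, then re-ranking) might be sketched as follows. This is a minimal sketch under stated assumptions, not the TalkSee implementation: the names `interaction_round`, `rank`, `score`, `llm`, and `answer_fn` are hypothetical, the LLM is any callable that maps a prompt string to a text string, and the word-overlap scorer is a toy stand-in for a real text-to-video retrieval model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical LLM interface: prompt text in, generated text out.
LLM = Callable[[str], str]


def score(query: str, caption: str) -> float:
    """Toy relevance: fraction of query words that appear in the caption."""
    q_words = set(query.lower().split())
    c_words = set(caption.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0


def rank(query: str, videos: Dict[str, str]) -> List[str]:
    """Return video ids ordered from most to least relevant to the query."""
    return sorted(videos, key=lambda vid: score(query, videos[vid]), reverse=True)


def interaction_round(
    query: str,
    videos: Dict[str, str],      # video id -> caption (assumed representation)
    liked: List[str],            # ids the user marked as positive
    disliked: List[str],         # ids the user marked as negative
    llm: LLM,
    answer_fn: Callable[[str], str],  # obtains the user's answer to a question
) -> Tuple[str, List[str]]:
    """One round: feedback -> question -> updated query -> re-ranked results."""
    # Describe the positive and negative feedback in natural language.
    context = (
        f"Query: {query}\n"
        f"Liked: {[videos[v] for v in liked]}\n"
        f"Disliked: {[videos[v] for v in disliked]}"
    )
    # Step 1: the LLM generates a clarifying question from the feedback.
    question = llm("Ask one clarifying question about this search.\n" + context)
    answer = answer_fn(question)
    # Step 2: the LLM rewrites the query using the user's answer.
    new_query = llm(
        "Rewrite the query using the answer.\n"
        f"{context}\nQuestion: {question}\nAnswer: {answer}"
    ).strip()
    # Step 3: re-rank the candidates against the updated query.
    return new_query, rank(new_query, videos)
```

In practice `llm` would wrap a call to a hosted or local model and `answer_fn` would be wired to the user interface; both are kept as plain callables here so the loop can be exercised with stubs.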