Urban planning in the age of large language models: Assessing OpenAI o1's performance and capabilities across 556 tasks
Abstract: Highlights
• First comprehensive LLM evaluation in urban planning using a 556-task benchmark.
• OpenAI o1 excels, scoring 84.08 on average, outperforming both GPT-3.5 and GPT-4o.
• Identifies OpenAI o1's key strengths and limitations for professional practice.
• Informs and guides future LLM advancements for urban planning applications.
External IDs: dblp:journals/urban/ZhaoHYLZWLZL25