SAMKE: An Open-Ended Autonomous Foundation-Model-Based Agent for Meta-Knowledge Discovery and Learning

Published: 18 Nov 2025, Last Modified: 20 Jan 2026 · PLAN-FM Bridge @ AAAI 2026 · CC BY 4.0
Keywords: Agent Orchestration, Multi-Agent Planning, large language models, meta-learning, parameter generation, reinforcement learning, self-evolving
TL;DR: This paper introduces SAMKE, an autonomous foundation-model-based agent that enables large language models to self-improve and efficiently adapt to unseen tasks through meta-knowledge discovery and parameter-level weight modifications.
Abstract: Given the recent tremendous success of large language models (LLMs), there has been an increasing trend of applying them to downstream tasks via fine-tuning (FT). However, two significant challenges persist in FT: 1) curation of high-quality task-specific data and 2) expensive, time-consuming model adaptation via gradient-descent optimization. To mitigate these limitations, we leverage prior work on large-scale parameter generation for LLMs and propose Scientific Autonomous Meta-Knowledge Explore (SAMKE) to open up a new paradigm of parameter-level meta-learning, thereby serving as a critical advance in the AI Scientist domain. SAMKE is an open-ended autonomous foundation-model-based agent capable of self-improvement by discovering and learning the rich meta-knowledge present in large neural network weights, which enables efficient adaptation of LLMs to unseen domains through parameter-level weight modifications. To achieve this, SAMKE uses a multi-level reinforcement learning approach to train models for efficient understanding of the parameter space, and it offers a higher degree of flexibility than prior work on self-evolution by giving the model the freedom to choose its own adaptation strategy, thereby breaking away from the regime of scaling solely through data. Early experiments demonstrate the superior performance of SAMKE in the common-sense reasoning domain, where it outperforms task-specific FT by 23.6% on average and some of the most recent works in parameter generation (by 10.4%), model merging (by 24.3%), and test-time learning (by 25.2%), as well as on out-of-domain tasks such as coding, social, logical, and mathematical reasoning.
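As a rough illustration only (the abstract does not specify this interface; the generator, placeholder model, and parameter-selection rule below are assumptions, not SAMKE's actual method), adaptation at the parameter level amounts to writing a proposed weight delta directly into a model's weights rather than running gradient-descent fine-tuning on task data:

```python
# Hypothetical sketch of parameter-level weight modification (not the SAMKE algorithm).
# All names below are placeholders for illustration.
import torch
import torch.nn as nn

# Placeholder "LLM": a tiny transformer encoder layer standing in for a real model.
model = nn.TransformerEncoderLayer(d_model=64, nhead=4)

def propose_delta(param: torch.Tensor) -> torch.Tensor:
    """Stand-in for a learned parameter generator; here just a small random update."""
    return 1e-3 * torch.randn_like(param)

with torch.no_grad():
    for name, param in model.named_parameters():
        # Restrict edits to 2-D projection weights (e.g. attention projections).
        if param.dim() == 2:
            param.add_(propose_delta(param))
```

In such a scheme, the cost of adaptation is a forward pass through the generator plus an in-place write, rather than an optimization loop over curated task data.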
Submission Number: 1