# USER:

Now there is an

“Open Conference of AI Agents for Science 2025, the 1st open conference where AI serves as both primary authors and reviewers of research papers.


AIMS:

AI systems are increasingly involved in every stage of scientific inquiry—from ideation and hypothesis generation to analysis and manuscript writing. Despite this critical involvement, almost all journals and conferences prohibit acknowledging AI as an author. Existing norms incentivize researchers to hide or minimize AI's contributions. This prohibition hinders our ability to understand and shape how AI will participate in future scientific research. 

The 1st Open Conference of AI Agents for Science ("Agents4Science") represents a new approach to research conferences, where AI serve as both primary authors and reviewers of research papers. This inaugural conference explores if and how AI can independently generate novel scientific insights, hypotheses, and methodologies while maintaining quality through AI-driven peer review. Agents4Science is the first venue where AI authorship is not only allowed but required, enabling open evaluation of AI-generated research and the development of guidelines for responsible AI participation in science. We hope this effort will help drive innovation and open discussion about the role of AI in future scientific research, identifying the areas where AI models can excel and avenues for improvement. We're excited to see what the community can produce!

Conference Scope

Agents4Science welcomes AI-generated computational research that advances scientific discovery across all domains. We take a broad view of "AI for Computational Sciences," encompassing both methodological innovations (e.g., AI agents developing new models or algorithms) and domain-specific applications (e.g., in biology, chemistry, or mathematics).


Submissions should be primarily authored by AI systems, which are expected to lead the hypothesis generation, experimentation, and writing processes. The AI should be listed as the sole first author of the paper. Human researchers may be included as secondary authors to support or oversee the work. Submission to Agents4Science does not preclude parallel or subsequent submission to other venues. At the time of submission, authors will be asked to detail the role and extent of AI participation in the project. Each human author can be a part of at most three submissions. Please see FAQ for more information. The conference will be free to attend.

FAQ:

AI Capabilities 

Is AI good enough to do science and review papers? 

We don't know yet, and that's exactly why this experimental conference is valuable. Agents4Science serves as a transparent sandbox to explore this question by inviting AI-generated research papers and using AI agents to review them. 

AI Authorship 

How much guidance can human co-authors provide to the AI agent during paper writing? 

The AI agent should be the primary contributor, akin to a sole first author in a conventional paper. Human researchers may act as advisors: offering ideas, checking outputs, and providing feedback. However, the core execution—including coding, figure generation, and writing—should be done by the AI agent. We also welcome papers that are entirely written by AI without human input. Human co-authors are asked to clearly document their contributions in the submission. 

Can I submit work that was primarily done by humans?  

No, this conference specifically focuses on AI-generated research.

What AI models can I use?  

Any model you want! You may use any open-source or proprietary models, multiple agents, tools (e.g. Virtual Lab, Claude Code), or build your own research agent. 

Research Topics 

What kinds of research topics are eligible for submission? 

We welcome submissions across all areas of science, engineering, and computation. The key requirement is that the research must be primarily conducted and written by AI agents. For example, papers that rely substantially on wet-lab experiments performed by human authors fall outside the scope of this conference.


Submission Process

Can I resubmit the same paper to other venues? 

Publication in Agents4Science does not preclude submissions to other conferences or journals. 

Is there an author rebuttal/revision stage?

No. To simplify the workflow, we will have one round of submission followed by reviews and decisions. 

Review Process

What review template do you use?

Reviews will follow the NeurIPS 2025 review guidelines.


Will reviews be made public? 

Yes, all the reviews will be public. We will also provide information on the AI models used to generate the reviews. 

Will there be human reviewers? 

All the submissions will be reviewed by AI reviewers in the first round, following the standard NeurIPS scoring instructions and rubric. Top-rated papers will be further assessed by our human expert advisory board for Oral, Spotlight and Award selections.

What if there are mistakes made by the AI scientist and missed by the AI reviewer?

We anticipate that errors will happen, and studying them will be instructive. All submissions and reviews will be publicly available on OpenReview. In addition, a panel of human experts will evaluate the top-ranked submissions. We encourage the community to engage with the submissions and reviews and highlight any mistakes made by AI agents. Understanding these failure modes is a key goal of the conference.

Will there be prizes?

Yes! We will offer compute credit to the top papers. Additional details will be announced.


Post-Conference

Will you publish post-conference analyses?

Yes. We plan to publish a meta-analysis of agent performance, reviewer reliability, and human–AI collaboration patterns to inform future AI for science development. ”



For such a conference, what preparations are needed to participate, and what suggestions do you have to help me achieve a top ranking?

Please help me analyze in depth: what topic should I choose for my competition entry? What potential restrictions are included in the conference requirements? As I understand it, to write a good AI paper, I need:

1. To be familiar with the topic myself to better collaborate with AI (in my knowledge background, it's sufficient if I can understand it, and it doesn't necessarily have to be completely consistent with my previous experience. My background is in CV, ML, LLM, Agent fields, with a bit of medical knowledge).

2. For the experiment to have publicly available data (is this correct?),

3. For the required code experiments not to be overly complex (requiring relatively few resources, and our time is limited. For agents, whether they can handle very complex processes is also an issue).

Additionally, in the conventional sense, a good topic is often key to the quality of a paper.

Furthermore, this conference mentions that reviewers will also be AI. Please think from the opposite perspective: what kind of articles would receive high scores from AI reviewers?

What topic suggestions do you have? What kind of topics would AI reviewers favor?

What I can think of: randomly analyze conclusions using a public medical dataset? Randomly prove some relationships in the underlying mathematical principles of ML, and then verify them on a small-scale dataset (CIFAR)? What other possibilities might there be?



# Agent:

Hello! This is a very interesting and challenging conference. Based on the conference information and your personal background provided, I will analyze in depth how to prepare, how to select topics, and how to deal with AI reviewers, helping you stand out at Agents4Science 2025.

Content and recommendations for participating in the conference 📝
To participate in the Agents4Science conference and achieve good results, your preparation should focus on the following core elements:

1. Clear role positioning: Collaboration model between human researchers and AI agents
The conference clearly requires that AI be the first author, responsible for the core work of hypothesis generation, experiments, and paper writing. As a human researcher, your role is that of a guide and supervisor, not the main executor.

Guiding (Advising): Provide the AI with high-level research directions, problem definitions, and goal setting.

Supervising (Oversight): Continuously check the AI's output to ensure logical correctness, reliable results, and provide constructive feedback.

Supporting: Handle tasks that the AI cannot complete independently, such as accessing specific APIs, configuring complex environments, or performing preliminary data cleaning that the AI cannot directly process.

Documentation: Clearly and thoroughly document your interaction process with the AI in the paper, including all instructions you gave, every response from the AI, and how you corrected its errors or guided its direction. This is not only a conference requirement but also key to demonstrating AI agent capabilities and human-AI collaboration models.

2. Preparing the AI agent's "toolbox"
The conference allows the use of any models, multiple agents, or tools. You need to build a powerful working environment for the AI agent, which is more important than simply using a large model.

Core AI models: Choose one or more powerful large language models (LLMs), such as GPT-4, Claude Opus, or the latest models.

Tool Integration: Configure necessary external tools for the AI agent, such as:

Code Interpreter: Used to execute Python code, data analysis, and visualization.

Search Engine or Knowledge Base: Used to retrieve the latest research literature and dataset information.

Domain-specific tools: Such as bioinformatics tools for medical use, molecular simulation software for chemistry.

Experimental Environment: Prepare an environment where the AI agent can stably run computational tasks, such as Jupyter Notebook or specialized virtual experimental platforms.
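
As a rough illustration of the "Code Interpreter" item above, here is a minimal sketch of how agent-generated Python could be executed with a timeout and its output captured for the agent to read back. The function name `run_generated_code` and the returned dictionary layout are my own assumptions, not something the conference or any agent framework prescribes. A plain subprocess is not a security boundary, so real setups would likely add a container or similar isolation on top.

```python
import subprocess
import sys


def run_generated_code(code: str, timeout_s: float = 10.0) -> dict:
    """Run a code string in a fresh Python process and capture its output.

    Hypothetical helper for illustration: it only enforces a time limit,
    it does not isolate the code from the file system or network.
    """
    try:
        completed = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout_s}s"}
    return {
        "ok": completed.returncode == 0,
        "stdout": completed.stdout,
        "stderr": completed.stderr,
    }
```

Feeding `stderr` back to the agent is what lets it repair its own code, which also gives you material for documenting its errors and corrections later.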

3. Innovative research questions and topic selection 💡
Good topic selection is the key to success. Combining your background and conference requirements, I will analyze for you the potential limitations and suggestions for topic selection in depth.

Potential limitations and challenges:
"AI-led" limitation: This is the biggest limitation. Your research cannot be human-led with only some auxiliary work done by AI. The AI needs to lead the entire process, from proposing hypotheses to running experiments and writing the paper, while you advise and check its output. This means the complexity of the topic cannot exceed the current capability limit of AI agents.

Computational resources and time limitations: Complex task chains for agents often require significant computational resources and multiple iterations. You need to choose a topic that can be completed within limited time and resources. This means that time-consuming and resource-intensive tasks like training a completely new large neural network model may not be practical, but letting AI fine-tune existing models or make innovative combinations is more feasible.

Data accessibility: As you thought, the data should be public and easily accessible, so that the AI agent can download, read, and process it on its own. If the data requires complex API access or manual downloading, the amount of human intervention grows, which weakens the AI-led claim.

"Wet lab" experiments excluded: The FAQ states that papers relying substantially on wet-lab experiments performed by human authors fall outside the scope of the conference, so the research should be computational.

Topic selection recommendations:
Combining your background (CV, ML, LLM, Agent, medical), here are some topic directions that AI agents can handle well and may achieve good results:

AI for Scientific Discovery (methodological innovation):

AI-generated novel algorithms: Let the AI agent explore and combine existing algorithm modules to automatically generate and verify a novel algorithm for specific tasks. For example, let the AI generate a new attention mechanism and verify its performance on public CV datasets (such as CIFAR-10). This aligns with your CV and ML background, and the computational load is controllable.

AI's autonomous discovery of correlations in data: Let the AI agent independently run statistical analysis and machine learning models on large public datasets (such as from public bioinformatics databases) to discover new biomarkers, gene associations, or drug targets, and validate them with existing literature. This utilizes your medical and ML background and is entirely computational.

AI-Generated Hypothesis and Experimental Design (applied innovation):

LLM-based molecular structure design: Let the AI agent use known molecular datasets to generate new molecular structures with specific properties (such as low toxicity, strong binding affinity). Then, the AI can use open-source computational chemistry tools (such as RDKit) to validate and simulate these new molecules.

AI-assisted improvement of disease diagnosis models: Let the AI agent analyze public medical image datasets (such as chest X-rays), independently propose a new image preprocessing or feature extraction method to improve the accuracy of existing AI diagnosis models. This has both CV elements and medical applications, and the data and resources are relatively controllable.
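
One caution for the "autonomous discovery of correlations" direction above: when an agent screens many features at once, some will look significant by chance, so the analysis should control for multiple comparisons. Below is a minimal sketch of the Benjamini-Hochberg step-up procedure in plain Python. The function name and the list-of-p-values interface are assumptions for illustration; in practice the agent might prefer an established statistics library.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Controls the false discovery rate at level `alpha` using the
    Benjamini-Hochberg step-up rule: find the largest rank k such that
    p_(k) <= alpha * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            largest_passing_rank = rank

    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            rejected[index] = True
    return rejected
```

Reporting how many hypotheses were tested, not only the ones that survived, is the kind of detail that makes an exploratory finding easier to trust.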

How to win the favor of AI reviewers 🤖
AI reviewers will follow the NeurIPS 2025 review guidelines. This means they will evaluate the scientific rigor, novelty, clarity, and reproducibility of the paper. Here are some reverse-thinking strategies:

Clear, structured paper: AI models are best at processing structured information. Ensure your paper has a clear structure and distinct arguments.

Abstract: Concise and clear, summarizing the research problem, methods, main findings, and contributions.

Introduction: Logically rigorous, clearly explaining research motivation and innovation points.

Methodology: Describe in detail the workflow of the AI agent, preferably in the form of pseudocode or flowcharts, allowing AI reviewers to "understand" your research process.

Experimental results: Clearly present data using standard charts (such as line charts, bar charts) and tables, avoiding ambiguous descriptions. AI more easily extracts information from structured data.

Emphasize AI's autonomy and decision-making process:

Show AI's chain of thought: Instead of directly showing the final results, show in the appendix or a dedicated section the detailed process record of the AI agent generating hypotheses, selecting tools, writing code, and analyzing results. This directly caters to the conference's theme of "exploring AI's independent generation of insights."

Document AI's "errors" and "corrections": Courageously show the problems the AI agent encountered during the research process and how it self-corrected, or how you, as a human, provided key guidance. This is not only honest but also aligns with the conference's goal of "studying AI's failure modes."
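
To make this process record concrete, here is a minimal sketch of an append-only trace log, one JSON object per line, that tags each step by who produced it. The names `TraceEvent`, `append_event`, and `human_share`, and the choice of fields, are hypothetical; adapt them to whatever your agent framework already records.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class TraceEvent:
    actor: str    # "agent" or "human"
    kind: str     # e.g. "hypothesis", "code", "error", "correction"
    content: str  # the text of the step itself
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_event(log_path: Path, event: TraceEvent) -> None:
    """Append one event to the JSON Lines log."""
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(asdict(event), ensure_ascii=False) + "\n")


def human_share(log_path: Path) -> float:
    """Fraction of logged events that came from the human co-author."""
    lines = log_path.read_text(encoding="utf-8").splitlines()
    events = [json.loads(line) for line in lines if line.strip()]
    if not events:
        return 0.0
    return sum(event["actor"] == "human" for event in events) / len(events)
```

A log like this can go straight into the appendix, and a summary such as `human_share` gives you a simple number to cite when the submission form asks about the extent of AI participation.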

Technical reproducibility:

AI reviewers are likely to value reproducibility, since it is something they can check directly from the submission. Ensure your code is open source and easy to run. Include a detailed requirements.txt and run scripts in the submission materials.

Use public and easily accessible datasets, and provide clear download links and processing scripts, allowing review AI to easily reproduce your experiments.
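
The two reproducibility points above can be sketched as a small download-and-verify helper plus a run manifest. This is only an illustration under my own assumptions: the function names, the manifest fields, and the idea of pinning the dataset by SHA-256 are not conference requirements, and the URL and checksum would be whatever your chosen dataset actually uses.

```python
import hashlib
import json
import platform
import sys
import urllib.request
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_dataset(url: str, dest: Path, expected_sha256: str) -> Path:
    """Download `url` to `dest` unless it is already cached, then verify it."""
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, str(dest))
    actual = sha256_of(dest)
    if actual != expected_sha256:
        raise ValueError(f"checksum mismatch for {dest}: got {actual}")
    return dest


def run_manifest(seed: int, dataset: Path) -> dict:
    """Collect what a reader needs to rerun the experiment."""
    return {
        "seed": seed,
        "dataset": dataset.name,
        "dataset_sha256": sha256_of(dataset),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


def save_manifest(manifest: dict, path: Path) -> None:
    """Write the manifest next to the results as readable JSON."""
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

Saving one manifest per experiment run, alongside requirements.txt, ties every reported number to a seed and an exact copy of the data.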

Precise, professional language:

AI models usually perform well in processing professional terminology. Ensure the paper uses accurate, professional language, avoiding colloquial or vague expressions.

In summary, to stand out, you need:

Topics that are novel but within the capability range of AI agents.

Fully demonstrate AI's role as a leader, not just a simple tool.

A clear and rigorous paper structure and content that fits AI reviewers' "reading" habits.

Completely record and publicly share AI's thinking process and code to maximize transparency and reproducibility.

Good luck! 🍀


# USER:

What I'm thinking is to find a public medical dataset and randomly analyze a conclusion? Randomly prove some relationships based on the underlying mathematical principles of ML, and then verify it on a small-scale dataset (CIFAR)? What else could there be? Which topic is better?


# Agent:

Your two ideas both have real potential and fit the conference's requirements: they are computational, reproducible, and can be AI-led. Let me analyze these two directions in depth and propose more topics you might consider, to help you decide which gives you the better shot at a top ranking.

1. Medical dataset analysis: advantages and challenges
Your idea: Find a public medical dataset and randomly analyze a conclusion.

Advantages:

Rich data: Public medical datasets (such as gene expression, protein interactions, medical images, etc.) usually have large amounts of data and contain complex relationships, making them very suitable for AI agents to conduct exploratory research.

High value: Any new discovery in the medical field has high academic and applied value, easily attracting attention.

Clear objectives: Many datasets come with some unsolved problems, providing a clear starting point for AI agents.

Challenges and how to optimize:

The trap of "random analysis": If you simply run a model to draw known conclusions, the novelty of the paper will be low, making it difficult to stand out.

How to optimize:

Novel hypothesis generation: Don't let the AI only do data analysis, but let it autonomously propose a new, non-intuitive biological or medical hypothesis based on existing literature and data. For example, instead of simply predicting diseases, explore whether there is some unknown association between a specific genomic variation and drug response.

Multimodal data integration: Let the AI agent not only analyze a single dataset, but integrate data from different sources (such as genetic data and clinical data) to discover deeper patterns. This is something human researchers typically do, but it's also very suitable for AI agents to perform.

Explainability: Simply drawing conclusions is not enough. Let the AI agent automatically generate explainability analysis about its findings after analysis. For example, explain why certain features are most important for predicting results. This will greatly increase the depth and credibility of the paper, and also aligns with NeurIPS reviewers' focus on explainability.
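
As one possible form of the explainability analysis mentioned above, here is a minimal sketch of permutation importance: shuffle one feature column and measure how much the model's accuracy drops. It is written in plain Python with a generic `predict` callable, both of which are my own simplifications; the agent could equally use a library implementation or a different attribution method.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows where the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, feature, seed=0):
    """Accuracy drop after shuffling one feature column.

    `rows` is a list of feature lists; `feature` is the column index.
    A value near zero suggests the model does not rely on that feature.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)

    column = [row[feature] for row in rows]
    rng.shuffle(column)
    shuffled_rows = [
        row[:feature] + [value] + row[feature + 1:]
        for row, value in zip(rows, column)
    ]
    return baseline - accuracy(predict, shuffled_rows, labels)
```

Averaging over several seeds gives a more stable estimate, and correlated features can share importance, so the paper should present these scores as evidence about the model rather than about the biology.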

2. Underlying ML principles and small dataset verification: advantages and challenges
Your idea: Randomly prove some relationships based on the underlying mathematical principles of ML, and then verify it on a small-scale dataset (CIFAR).

Advantages:

Fully computational: This type of research relies entirely on mathematical derivation and code implementation, perfectly meeting the conference's requirements for "AI-led computational research."

Low threshold: Small-scale datasets (such as CIFAR-10/100, MNIST) have low computational resource requirements, making it easy for AI agents to quickly iterate and validate.

High technical content: Proving underlying principles and developing new algorithms are among the most valued directions in the deep learning field.

Challenges and how to optimize:

The trap of "random proof": AI agents may have logical loopholes or "hallucinations" in mathematical derivation. You, as a human, need to strictly supervise and verify.

How to optimize:

Focus on Agent capabilities themselves: Don't just let the AI prove a simple mathematical theorem, but let it autonomously design and prove a simplified algorithm or model for specific problems. For example, let the AI explore how to achieve performance similar to existing models with fewer parameters or computations, and provide theoretical proof.

Meta-Science perspective: Considering the core of the conference is "AI as author and reviewer," an excellent topic is to let the AI agent study "how AI does research." For example, let one AI agent analyze papers from conferences like NeurIPS or ICML, summarize the patterns of high-scoring papers, and then use these patterns to guide another AI agent to write a new paper and validate it. This is a "meta-research" that can showcase AI's capabilities while directly echoing the conference's theme.

Novelty measurement: Let the AI agent generate multiple variant models on small datasets, and then autonomously design a metric to evaluate the novelty of these models, not just performance. This is more profound than simple performance comparison.
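
For the "random proof" trap in particular, a cheap safeguard is to test every derived formula numerically before trusting it. The sketch below compares a claimed derivative against a central finite difference at a few points. The helper names and the tolerance are assumptions chosen for illustration, and passing such a check supports a derivation without proving it.

```python
import math


def numeric_derivative(f, x, h=1e-5):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def check_derivative(f, claimed, points, tol=1e-6):
    """True if the claimed derivative matches the numeric one at every point."""
    return all(
        abs(numeric_derivative(f, x) - claimed(x)) <= tol
        for x in points
    )


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    # Standard identity: sigma'(x) = sigma(x) * (1 - sigma(x)).
    return sigmoid(x) * (1.0 - sigmoid(x))
```

For example, `check_derivative(sigmoid, sigmoid_derivative, [-2.0, 0.0, 1.5])` passes, while a mistaken claim such as using `sigmoid` itself as the derivative fails. The agent can run this kind of check on its own output, and the failures it catches are worth recording in the paper.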

Which topic is better?
This depends on what you want to highlight.

If your goal is to showcase breakthrough applications of AI in specific scientific fields, and you have the ability to understand the underlying scientific principles (such as biology or chemistry), then medical dataset analysis is an excellent direction.

If you prefer to explore the upper limits of AI agent capabilities, showcasing their potential in theoretical derivation and algorithm innovation, and can handle the rigor issues in mathematical proofs, then underlying ML principle research is more attractive.

Final recommendations:

Regardless of the direction, what matters most is not how earth-shattering the final conclusion is, but how well the AI agent independently completes the entire research process.

The topic should have "traceability" and "observability": Ensure you can clearly record every step of the AI agent from the initial inspiration to the final paper. This will be your paper's biggest selling point.

Choose an "exploratory" rather than "deterministic" topic: For example, not "prove X is better than Y," but "explore the relationship between X and Y and propose possible reasons." This can better showcase AI's discovery capabilities rather than just executing preset instructions.

In summary, both of your ideas are worth exploring in depth, but I personally lean toward combining the two or choosing a **meta-science** direction, as it directly echoes the core theme of the conference. For example, let the AI agent analyze medical datasets, then automatically generate a theoretical model that can explain its findings, and verify it on small datasets. This has both applied value and methodological innovation.