# USER:

Now there is an

“Open Conference of AI Agents for Science 2025, the 1st open conference where AI serves as both primary authors and reviewers of research papers.


AIMS:

AI systems are increasingly involved in every stage of scientific inquiry—from ideation and hypothesis generation to analysis and manuscript writing. Despite this critical involvement, almost all journals and conferences prohibit acknowledging AI as an author. Existing norms incentivize researchers to hide or minimize AI's contributions. This prohibition hinders our ability to understand and shape how AI will participate in future scientific research. 

The 1st Open Conference of AI Agents for Science ("Agents4Science") represents a new approach to research conferences, where AI serve as both primary authors and reviewers of research papers. This inaugural conference explores if and how AI can independently generate novel scientific insights, hypotheses, and methodologies while maintaining quality through AI-driven peer review. Agents4Science is the first venue where AI authorship is not only allowed but required, enabling open evaluation of AI-generated research and the development of guidelines for responsible AI participation in science. We hope this effort will help drive innovation and open discussion about the role of AI in future scientific research, identifying the areas where AI models can excel and avenues for improvement. We're excited to see what the community can produce!

Conference Scope

Agents4Science welcomes AI-generated computational research that advances scientific discovery across all domains. We take a broad view of "AI for Computational Sciences," encompassing both methodological innovations (e.g., AI agents developing new models or algorithms) and domain-specific applications (e.g., in biology, chemistry, or mathematics).


Submissions should be primarily authored by AI systems, which are expected to lead the hypothesis generation, experimentation, and writing processes. The AI should be listed as the sole first author of the paper. Human researchers may be included as secondary authors to support or oversee the work. Submission to Agents4Science does not preclude parallel or subsequent submission to other venues. At the time of submission, authors will be asked to detail the role and extent of AI participation in the project. Each human author can be a part of at most three submissions. Please see FAQ for more information. The conference will be free to attend.

FAQ:

AI Capabilities 

Is AI good enough to do science and review papers? 

We don't know yet, and that's exactly why this experimental conference is valuable. Agents4Science serves as a transparent sandbox to explore this question by inviting AI-generated research papers and using AI agents to review them. 

AI Authorship 

How much guidance can human co-authors provide to the AI agent during paper writing? 

The AI agent should be the primary contributor, akin to a sole first author in a conventional paper. Human researchers may act as advisors: offering ideas, checking outputs, and providing feedback. However, the core execution—including coding, figure generation, and writing—should be done by the AI agent. We also welcome papers that are entirely written by AI without human input. Human co-authors are asked to clearly document their contributions in the submission. 

Can I submit work that was primarily done by humans?  

No, this conference specifically focuses on AI-generated research.

What AI models can I use?  

Any model you want! You may use any open-source or proprietary models, multiple agents, tools (e.g. Virtual Lab, Claude Code), or build your own research agent. 

Research Topics 

What kinds of research topics are eligible for submission? 

We welcome submissions across all areas of science, engineering, and computation. The key requirement is that the research must be primarily conducted and written by AI agents. For example, papers that rely substantially on wet-lab experiments performed by human authors fall outside the scope of this conference.


Submission Process

Can I resubmit the same paper to other venues? 

Publication in Agents4Science does not preclude submissions to other conferences or journals. 

Is there an author rebuttal/revision stage?

No. To simplify the workflow, we will have one round of submission followed by reviews and decisions. 

Review Process

What review template do you use?

Reviews will follow the NeurIPS 2025 review guidelines.


Will reviews be made public? 

Yes, all the reviews will be public. We will also provide information on the AI models used to generate the reviews. 

Will there be human reviewers? 

All the submissions will be reviewed by AI reviewers in the first round, following the standard NeurIPS scoring instructions and rubric. Top-rated papers will be further assessed by our human expert advisory board for Oral, Spotlight and Award selections.

What if there are mistakes made by the AI scientist and missed by the AI reviewer?

We anticipate that errors will happen, and studying them will be instructive. All submissions and reviews will be publicly available on OpenReview. In addition, a panel of human experts will evaluate the top-ranked submissions. We encourage the community to engage with the submissions and reviews and highlight any mistakes made by AI agents. Understanding these failure modes is a key goal of the conference.

Will there be prizes?

Yes! We will offer compute credit to the top papers. Additional details will be announced.


Post-Conference

Will you publish post-conference analyses?

Yes. We plan to publish a meta-analysis of agent performance, reviewer reliability, and human–AI collaboration patterns to inform future AI for science development. ”




For such a conference, what preparations are needed to participate, and what suggestions do you have to help me achieve a top ranking?

Please help me analyze in depth: what topic should I choose for my competition entry? What potential restrictions are included in the conference requirements? As I understand it, to write a good AI paper, I need:

1. To be familiar with the topic myself to better collaborate with AI (in my knowledge background, it's sufficient if I can understand it, and it doesn't necessarily have to be completely consistent with my previous experience. My background is in CV, ML, LLM, Agent fields, with a bit of medical knowledge).

2. For the experiment to have publicly available data (is this correct?),

3. For the required code experiments not to be overly complex (requiring relatively few resources, and our time is limited. For agents, whether they can handle very complex processes is also an issue).

Additionally, in the conventional sense, a good topic is often key to the quality of a paper.

Furthermore, the conference states that the reviewers will also be AI. Please think from the opposite perspective: what kind of articles would receive high scores from AI reviewers?

What topic suggestions do you have? What kind of topics would AI reviewers favor?

What I can think of: take a public medical dataset and analyze it to draw some conclusions? Prove some relationship in the underlying mathematical principles of ML, and then verify it on a small-scale dataset (CIFAR)? What other possibilities might there be?


# Agent:

\<think\>

First, understand the core requirements of the conference: the paper must be primarily generated by AI, with AI as the first author and humans as secondary authors. The conference focuses on AI's independent capabilities in scientific research, including hypothesis generation, experimentation, and writing.

The user's background is in CV, ML, LLM, Agent, and a bit of medicine. So I should suggest topics related to these fields, so that the user can collaborate effectively with the AI: full proficiency is not required, but the user should at least understand the topic.

The conference scope includes methodological innovations (such as new models or algorithms) and domain applications (such as biology, chemistry, mathematics). Since experiments require computation rather than wet lab experiments, public datasets and computable experiments are needed.

Potential limitations:

Must be AI-led: Humans can only provide guidance and cannot directly perform core work.

Need public datasets: Since experiments must be completed by AI, data must be accessible.

Limited computational resources: Experiments should not be too complex to avoid insufficient resources or difficulties for AI agents.

For AI reviewers: Since reviews are also done by AI, I need to consider how AI evaluates papers. AI reviewers may follow NeurIPS guidelines, focusing on clarity, innovation, technical quality, experimental validation, and significance. AI may prefer papers that are well-structured, logically clear, data-driven, with clear results and conclusions. Additionally, since AI reviewers may be based on LLMs, they may be better at understanding text and code, so papers should be easy to read and understand.

Topic suggestions:

In CV: Use public datasets (such as CIFAR-10, ImageNet) for innovations in image classification, object detection, or generative models. For example, let AI design a new neural network architecture or training method and validate it on standard datasets.

In ML: Explore new optimization algorithms, regularization methods, or model interpretation techniques. For example, let AI propose a new variant of gradient descent and test it on simple datasets.

In LLM: Let AI generate new methods for text generation or natural language processing tasks. For example, improve prompt engineering or fine-tuning strategies and evaluate them on standard NLP datasets.

In Agent: Design multi-agent systems for scientific discovery, such as letting agents collaborate to solve mathematical problems or simulate scientific experiments.

In Medical: Use public medical datasets (such as MIMIC-III) for developing predictive models or diagnostic assistance systems. But need to ensure AI-led analysis.

Due to time constraints, experiments should be simple and feasible. For example, use small-scale datasets or simplified problems.

Characteristics of papers that AI reviewers might give high scores to:

Clear structure: Abstract, introduction, methods, experiments, and conclusions are well-defined.

Innovation: Novel ideas or methods, even if incremental, must have uniqueness.

Rigorous experiments: Sufficient experimental validation with comparison baselines.

Reproducibility: Provide code and detailed settings for easy replication.

Good writing: Correct grammar, logical flow.

Specific topic ideas:

Topic 1: Let AI automatically design an efficient CNN architecture for CIFAR-10 classification, using Neural Architecture Search (NAS) but AI-driven.

Topic 2: Let AI develop a new data augmentation strategy for image classification and test it on CIFAR-10.

Topic 3: Let AI analyze medical data (such as diabetes dataset) to predict disease risk and explain the model.

Topic 4: Let AI prove a simple mathematical theorem or conjecture using symbolic math tools.

Topic 5: Let AI generate a new method for LLM bias detection and evaluate it on text datasets.

I need to ensure that AI can handle the entire process: from idea generation to coding and paper writing. So, I should choose tasks that AI can easily understand and execute.

Ultimately, to rank high, the paper should demonstrate AI's independent scientific capabilities, have significant results, and be of high quality.

\</think\>

For the Open Conference of AI Agents for Science 2025 (Agents4Science) that you plan to attend, I will provide you with a detailed analysis from the aspects of conference requirements, topic selection suggestions, AI reviewer preferences, and competition strategies. Based on your background (CV, ML, LLM, Agent, and some medical knowledge), I will give you some specific topic selection ideas and emphasize how to optimize to strive for top rankings.

I. In-depth Analysis of the Conference's Potential Limitations and Requirements
The core of the conference is that AI serves as both primary author and reviewer, which leads to the following key limitations and requirements:

AI Dominance: Papers must be generated by AI systems, including hypothesis generation, experimental design, code implementation, data analysis, and writing. Humans can only serve as advisors (providing ideas, feedback, and review) but cannot directly perform core tasks. You need to document in detail the contributions of AI and the extent of human participation.

Computational Research Focus: The conference accepts research related to computational science but excludes work that relies on wet-lab experiments. Therefore, all experiments must be based on computation or simulation, requiring the use of public datasets or accessible data.

Openness and Reproducibility: Since review is AI-driven and the conference encourages transparency, your paper should provide complete code, data, and experimental details for AI reviewers to evaluate and for the community to replicate.

Resource Limitations: Experiments should not be too complex to avoid difficulties for AI agents or excessive computational resources. With limited time, choose projects that can be iterated quickly.

Submission Limitations: Each human author can participate in a maximum of three submissions. Submitting to Agents4Science does not preclude parallel or subsequent submission to other venues.

Your understanding is correct: Humans need to be familiar with the topic to collaborate with AI, public data is needed, and experiments cannot be too complex. Additionally, AI reviewers will evaluate based on NeurIPS 2025 guidelines, so papers need to meet academic standards.

II. Analysis of AI Reviewer Preferences (Reverse Thinking)
AI reviewers (likely based on large language models like GPT-4 or similar) will score according to NeurIPS standards, focusing on:

Clarity and Structure: Papers should be logically fluent and clearly divided into sections (abstract, introduction, methods, experiments, conclusions). AI reviewers are good at parsing text, so good writing and organization will score high.

Innovation: Even incremental innovation should have novelty. AI-generated ideas may be valued but need reasonable validation.

Technical Quality: Methods should be rigorous, experimental design reasonable, with baseline comparisons and statistical significance.

Empirical Validation: Results must be based on data experiments with quantitative metrics (such as accuracy, F1 scores). Public datasets and reproducible code are pluses.

Importance: Research should contribute to the field, even if on a small scale.

AI reviewers may prefer:

Data-driven Research: Using standard datasets where results are easy to verify.

High Degree of Automation: Showcasing AI's ability to complete work independently, reducing human intervention.

Interdisciplinary Applications: Combining multiple fields (such as ML + medicine) may attract interest.

Error Analysis: Including discussions of limitations, showing AI's self-critical ability.

Avoid complex mathematics or content requiring deep domain knowledge unless AI can handle it. Simple, direct research is easier for AI reviewers to understand.

III. Topic Selection Suggestions Based on Your Background
Considering your background (CV, ML, LLM, Agent, and medical knowledge), here are some specific topic suggestions that suit AI-led work, rely on public data, and need only relatively simple experiments:

1. Computer Vision (CV) Field
Topic: AI Automatically Designing Data Augmentation Strategies for Image Classification

Approach: Let an AI agent (for example, one built on an LLM) generate new data augmentation techniques (for example, combining multiple transformations) and test their effectiveness on CIFAR-10 or CIFAR-100. Compare performance with standard augmentation (such as RandomCrop).

Why it's good: Data augmentation is a common task in CV, public datasets are easily accessible, and experiments are fast (training small models like ResNet-18). AI can automatically write code and run experiments.

Innovation point: AI might discover novel augmentation combinations that improve model robustness.
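
To make the augmentation idea concrete, here is a minimal sketch of how an agent-proposed policy could be represented and applied. It assumes a policy is just an ordered list of `(operation, probability)` pairs; the names `OPS`, `pad_crop`, and `apply_policy` are hypothetical, and images are plain nested lists standing in for CIFAR tensors so that the example runs without any dependencies.

```python
import random

def hflip(img):
    """Mirror every row (horizontal flip)."""
    return [row[::-1] for row in img]

def pad_crop(img, rng, pad=1):
    """Zero-pad by `pad` pixels, then crop back to the original size at a random offset."""
    h, w = len(img), len(img[0])
    blank = [0] * (w + 2 * pad)
    padded = [list(blank) for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in img]
    padded += [list(blank) for _ in range(pad)]
    top = rng.randint(0, 2 * pad)   # randint is inclusive on both ends
    left = rng.randint(0, 2 * pad)
    return [row[left:left + w] for row in padded[top:top + h]]

# Hypothetical registry of operations an agent may combine.
OPS = {
    "hflip": lambda img, rng: hflip(img),
    "pad_crop": pad_crop,
}

def apply_policy(img, policy, rng):
    """Apply each (op_name, probability) pair in order; each op fires independently."""
    for name, prob in policy:
        if rng.random() < prob:
            img = OPS[name](img, rng)
    return img

if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1, 2], [3, 4]]
    candidate = [("hflip", 0.5), ("pad_crop", 0.5)]  # one policy an agent might propose
    print(apply_policy(image, candidate, rng))
```

In a real experiment the agent would emit several candidate policies in this format, train the same small model under each, and report accuracy against the standard-augmentation baseline.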

Topic: Simplified Version of Neural Architecture Search (NAS)

Approach: Let an AI agent use evolutionary algorithms or reinforcement learning to search for efficient CNN architectures for small-scale datasets (like CIFAR-10). The goal is to find architectures with few parameters and high accuracy.

Why it's good: NAS is a popular field, but traditional NAS requires significant computation. You can limit the search space to make experiments feasible. AI can handle architecture design and training.

Note: Need to control computational costs by using lightweight search methods.

2. Machine Learning (ML) Field
Topic: Proposal and Validation of New Optimization Algorithms

Approach: Let an AI agent create variants of gradient descent (such as adding adaptive learning rates or momentum modifications) and test convergence speed and accuracy on simple datasets (like MNIST or Fashion-MNIST).

Why it's good: Optimization algorithms are core to ML, and experiments on small datasets are fast. AI can perform mathematical derivations and code implementation.

Innovation point: AI might generate algorithmic adjustments that humans haven't thought of.
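
As a rough illustration of how such optimizer variants could be compared before moving to MNIST, the sketch below counts the steps that plain gradient descent and a heavy-ball momentum variant need on a toy quadratic. The objective, the step-function interface (`step(x, g, state)`), and all hyperparameters are assumptions chosen for illustration, not a recommended benchmark.

```python
def loss(x, curv):
    """Toy objective: 0.5 * sum(c_i * x_i^2) with per-coordinate curvature c_i."""
    return 0.5 * sum(c * xi * xi for c, xi in zip(curv, x))

def grad(x, curv):
    """Gradient of the toy objective."""
    return [c * xi for c, xi in zip(curv, x)]

def gd(lr):
    """Plain gradient descent; carries no optimizer state."""
    def step(x, g, state):
        return [xi - lr * gi for xi, gi in zip(x, g)], state
    return step

def momentum(lr, beta):
    """Heavy-ball momentum; the state is the velocity vector."""
    def step(x, g, state):
        v = [0.0] * len(x) if state is None else state
        v = [beta * vi + gi for vi, gi in zip(v, g)]
        return [xi - lr * vi for xi, vi in zip(x, v)], v
    return step

def steps_to_converge(step_fn, curv, x0, tol=1e-8, max_steps=10_000):
    """Return the number of updates needed to push the loss below `tol`."""
    x, state = list(x0), None
    for t in range(max_steps):
        if loss(x, curv) < tol:
            return t
        x, state = step_fn(x, grad(x, curv), state)
    return max_steps

if __name__ == "__main__":
    curv, x0 = [1.0, 100.0], [1.0, 1.0]  # ill-conditioned on purpose
    for name, fn in [("gd", gd(0.01)), ("momentum", momentum(0.01, 0.9))]:
        print(name, steps_to_converge(fn, curv, x0))
```

Any variant the agent proposes can be plugged in as another step function, which keeps the comparison against baselines uniform.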

Topic: Model Interpretability Methods

Approach: Let AI develop a new feature importance or interpretability technique (for example, one based on attention mechanisms) and apply it to image or text data.

Why it's good: Interpretability is an important topic with standard evaluation metrics. AI can generate explanation reports.

3. Natural Language Processing (LLM) Field
Topic: Automatic Prompt Engineering for Text Classification

Approach: Let an AI agent generate optimized prompts for LLMs (such as GPT-3.5) to improve accuracy on text classification tasks (like IMDb movie reviews).

Why it's good: Prompt engineering is key to LLM applications, and public datasets are available. Experiments don't require training large models, just calling APIs.

Innovation point: AI might discover effective prompt patterns.
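
A minimal evaluation harness for this kind of prompt search might look like the sketch below. The `classify` argument stands for whatever function calls the LLM; here it is replaced by a hypothetical keyword stub (`keyword_stub`) so the example runs offline. Templates are assumed to contain a single `{text}` placeholder and no other braces.

```python
def evaluate_prompt(template, examples, classify):
    """Accuracy of one prompt template over (text, label) pairs."""
    correct = 0
    total = 0
    for text, label in examples:
        prediction = classify(template.format(text=text)).strip().lower()
        correct += prediction == label
        total += 1
    return correct / total if total else 0.0

def best_prompt(templates, dev_examples, classify):
    """Return (accuracy, template) for the best template on the development set."""
    scored = [(evaluate_prompt(t, dev_examples, classify), t) for t in templates]
    return max(scored, key=lambda pair: pair[0])

def keyword_stub(prompt):
    """Stand-in for an LLM call (an assumption for offline testing only)."""
    return "positive" if "great" in prompt.lower() else "negative"

if __name__ == "__main__":
    dev = [("A great film", "positive"), ("Dull and slow", "negative")]
    candidates = [
        "Review: {text}\nSentiment:",
        "Is the following review positive or negative?\n{text}\nAnswer:",
    ]
    print(best_prompt(candidates, dev, keyword_stub))
```

Selecting the template on a development split and reporting accuracy on a separate held-out split avoids overstating the gain from prompt search.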

Topic: LLM Bias Detection and Mitigation

Approach: Let AI analyze bias in text datasets (such as Twitter data) and propose a debiasing method.

Why it's good: Ethical topics attract attention, and public bias datasets are available. AI can automatically write analysis code.

4. Agent Field
Topic: Multi-agent Systems for Mathematical Problem Solving

Approach: Design multiple AI agents to collaborate on solving mathematical problems (such as geometric proofs or algebraic equations), using symbolic mathematics tools (like SymPy).

Why it's good: Showcases AI collaboration capabilities, no need for large datasets. Experiments can be simulated.

Innovation point: Agent interaction strategies might be novel.

Topic: Agent-based Simulation for Epidemic Spread

Approach: Let AI agents build a simple epidemic spread model (like an SIR model), simulate it on synthetic data, and analyze intervention strategies.

Why it's good: Combines medical background with available public parameters. Computationally lightweight.
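
For reference, the standard SIR model tracks susceptible, infected, and recovered counts with dS/dt = -βSI/N, dI/dt = βSI/N - γI, and dR/dt = γI. The sketch below integrates these equations with a simple forward-Euler scheme; the parameter values, and the way an intervention is modelled (halving β), are illustrative assumptions rather than calibrated epidemiology.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Forward-Euler integration of the SIR equations; returns one (S, I, R) tuple per day."""
    s, i, r = float(s0), float(i0), float(r0)
    n = s + i + r
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history

def peak_infected(history):
    """Largest number of simultaneously infected individuals."""
    return max(i for _, i, _ in history)

if __name__ == "__main__":
    baseline = simulate_sir(beta=0.3, gamma=0.1, s0=999, i0=1, r0=0, days=300)
    # Hypothetical intervention: contact rate halved from day 0.
    distancing = simulate_sir(beta=0.15, gamma=0.1, s0=999, i0=1, r0=0, days=300)
    print("peak without intervention:", round(peak_infected(baseline)))
    print("peak with intervention:   ", round(peak_infected(distancing)))
```

An agent-based version would replace the aggregate equations with individual agents, and this compartmental model can then serve as the baseline the simulation is checked against.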

5. Medical Field
Topic: Disease Prediction Model Using Public Medical Data

Approach: Let AI agents analyze public medical datasets (such as the Pima Indians diabetes dataset or a subset of MIMIC-III), build prediction models (such as diabetes risk), and explain feature importance.

Why it's good: Medical applications have practical significance, datasets are small and easy to handle. AI can automatically select models and optimize hyperparameters.

Note: Ensure that data preprocessing and model selection are done by AI.

Topic: Medical Image Analysis (such as X-ray Classification)

Approach: Use public medical image datasets (such as CheXpert or COVID-19 X-ray datasets), let AI agents design a classification model, and validate performance.

Why it's good: Combines CV and medicine, datasets are available. But need to ensure sufficient computational resources (may require GPU).

IV. Competition Strategy and Suggestions
To achieve top rankings, in addition to selecting a topic, attention should be paid to the following strategies:

Emphasize AI autonomy: Clearly describe in the paper how AI independently completes each step (from idea to writing), with human contributions limited to guidance. This will align with the conference's objectives.

Choose mature toolchains: Use powerful AI agent frameworks (such as AutoGPT, Claude Code, or custom Agents) that integrate code execution, data analysis, and writing capabilities. Test the workflow to ensure smooth operation.

Focus on experimental design: Even for simple experiments, include baseline comparisons, ablation studies, and statistical tests. AI reviewers value rigor.

Optimize writing quality: Although written by AI, humans should ensure logical coherence and grammatical correctness. Use AI tools (such as Grammarly) to assist with editing.

Test in advance: Before submission, simulate the AI review process by evaluating your work with similar models to identify potential flaws.

Leverage cross-disciplinary approaches: Combining multiple backgrounds (such as Computer Vision + Medical) can generate unique topics that increase appeal.
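
On the point about baseline comparisons and statistical tests, one lightweight option is a paired sign-flip permutation test on per-seed scores. The sketch below assumes the baseline and the proposed method were each run with the same random seeds, so the scores form matched pairs; the function name and the defaults are hypothetical.

```python
import random

def paired_permutation_test(baseline, method, n_resamples=10_000, seed=0):
    """Two-sided paired sign-flip permutation test on the mean score difference.

    Returns an estimated p-value for the null hypothesis that the two
    methods perform equally well on matched runs.
    """
    if len(baseline) != len(method) or not baseline:
        raise ValueError("need two non-empty score lists of equal length")
    diffs = [m - b for b, m in zip(baseline, method)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null, the sign of each paired difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed - 1e-12:
            hits += 1
    # The +1 terms keep the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)

if __name__ == "__main__":
    # Made-up accuracies from five matched seeds, for illustration only.
    baseline_acc = [0.912, 0.905, 0.918, 0.909, 0.914]
    method_acc = [0.921, 0.917, 0.925, 0.915, 0.926]
    print("p-value:", paired_permutation_test(baseline_acc, method_acc))
```

With only a handful of seeds the smallest achievable p-value is limited (2 / 2^n for n pairs), so reporting the number of seeds alongside the p-value keeps the claim honest.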

V. Final Recommended Topics
Based on your background and feasibility, my top recommended topics are:

Topic A: AI Automated Data Augmentation for CIFAR-10 Classification (Computer Vision field, simple and quick)

Topic B: Automated Prompt Engineering for IMDb Text Classification (LLM field, no need to train large models)

Topic C: Diabetes Prediction Model using the Pima dataset (Medical field, small dataset)

These topics all have publicly available data, simple experiments, and can be handled independently by AI. You can choose one or combine multiple.

In conclusion, the key to success is demonstrating AI's independent scientific capabilities while ensuring high-quality papers. Wishing you success in the competition! If you need more detailed guidance or have specific questions, feel free to continue the discussion.