SciTextures: Collecting and Connecting Visual Patterns, Models, and Code Across Science and Art
Keywords: Procedural generation, emergent, Vision language models, VLM, image to code, visual program synthesis, pattern recognition, textures, generative code, generative models
TL;DR: SciTextures is a large-scale dataset that pairs visual patterns with the generative code and simulations behind them, across science, technology, and art. It tests whether VLMs can link visual patterns to their underlying code and mechanisms, and whether they can infer and recreate those mechanisms.
Abstract: Textures of clouds and waves, the growth of cities and forests, and the formation of materials all exemplify visual patterns that emerge from underlying mechanisms.
We present the SciTextures dataset, a large-scale collection of textures and visual patterns from across science, technology, and art, along with the simulation code that generated these images. Spanning over 1,270 models and 100,000 images drawn from physics, chemistry, biology, sociology, mathematics, and art, SciTextures offers a unified resource for exploring the connection between visual patterns and the generative processes that produce them.
The dataset is constructed through an agentic AI pipeline that autonomously collects, implements, and standardizes scientific simulations and generative models, while also inventing and implementing novel methods.
SciTextures provides a large-scale, highly diverse collection of seamless, emergent textures and patterns paired with generative code, enabling image synthesis at arbitrary scale and resolution.
SciTextures also enables systematic evaluation of the ability of vision-language models (VLMs) to link visual patterns to the models and code that generate them, and to identify different patterns that emerge from the same underlying process. We also test the ability of VLMs to infer and recreate the mechanisms behind visual patterns by providing a natural image of a real-world phenomenon and asking the VLM to identify and code a model of the process that formed it. The results reveal that VLMs can go beyond the image itself to understand and simulate the underlying system at multiple levels of abstraction.
SciTextures is freely available together with the agentic pipeline used to construct it.
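To make the idea of a seamless, emergent texture paired with generative code concrete, here is a minimal sketch. It is not taken from the SciTextures dataset or its pipeline: the majority-vote cellular automaton and the names `majority_texture`, `tile`, and `to_pgm` are all illustrative assumptions. The rule runs on a grid with wrap-around (toroidal) edges, so the output tiles without visible seams, and changing `width` and `height` regenerates the pattern at another size.

```python
import random


def majority_texture(width, height, steps=10, seed=0):
    """Binary texture from a majority-vote cellular automaton on a torus.

    Hypothetical example, not a SciTextures model. Wrap-around indexing
    makes the left/right and top/bottom edges neighbors, so the result
    tiles seamlessly.
    """
    rng = random.Random(seed)
    grid = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    for _ in range(steps):
        new = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                # Sum the 3x3 neighborhood (cell included); the modulo
                # wraps indices around the grid edges.
                total = sum(
                    grid[(y + dy) % height][(x + dx) % width]
                    for dy in (-1, 0, 1)
                    for dx in (-1, 0, 1)
                )
                # A cell becomes 1 when at least 5 of the 9 cells are 1.
                new[y][x] = 1 if total >= 5 else 0
        grid = new
    return grid


def tile(grid, nx, ny):
    """Repeat the texture nx times horizontally and ny times vertically."""
    return [row * nx for _ in range(ny) for row in grid]


def to_pgm(grid):
    """Serialize a 0/1 grid as a plain-text PGM (P2) image string."""
    header = f"P2\n{len(grid[0])} {len(grid)}\n1\n"
    body = "\n".join(" ".join(str(v) for v in row) for row in grid)
    return header + body + "\n"
```

For example, `to_pgm(tile(majority_texture(64, 64), 2, 2))` returns a 128x128 image string built from four copies of one 64x64 patch; because the automaton treats opposite edges as adjacent, the joins between copies follow the same local rule as the interior.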
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 13