VeriTrans: Fine-Tuned LLM-Assisted NL→PL Translation via a Deterministic Neuro-Symbolic Pipeline

Xuan Liu; Dheeraj Kodakandla; Kushagra Srivastva; Mahfuza Farooque

Published: 28 Mar 2026, Last Modified: 28 Mar 2026 | Venue: AIware 2026 | Readers: Everyone | License: CC BY 4.0

Keywords: Formal Methods for LLM reasoning, SAT based validation, CNF transformation and equivalence checking, Neural Symbolic Reasoning, LLM

Abstract: VeriTrans is a reliability-first ML system that compiles natural-language requirements into solver-ready logic with validator-gated reliability. The pipeline integrates an instruction-tuned NL→PL translator, round-trip reconstruction (PL→NL) used as a high-precision acceptance gate, and canonical PL→CNF compilation, all executed via fixed API configuration (temperature = 0; fine-tuning runs use seed = 42) and per-item artifact logging (prompts, outputs, hashes) to support auditability and replay-driven debugging. On SatBench (2100 specifications), VeriTrans achieves 94.46% SAT/UNSAT correctness and 87.73% median round-trip similarity. Compact fine-tuning on 100-150 curated examples improves fidelity by about 1-1.5 pp without increasing latency (mean 25.8 s/spec on our 201-spec runtime subset). A thresholded acceptance policy on the round-trip score exposes a reliability-coverage knob: at τ = 75, roughly 68% of items are retained with ~94% correctness on the accepted set. Validator overhead contributes <15% of end-to-end runtime, and all prompts/responses and timing metadata are logged to enable replay-driven debugging and regression testing. By separating learned translation from symbolic verification and enforcing deterministic, validator-gated acceptance, VeriTrans turns NL→logic front-ends into auditable, reproducible components for reliability-critical workflows.
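
The thresholded acceptance policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Item` record, the `gate` function, the 0-100 score scale, and the choice of `>=` for the comparison are all assumptions made for illustration, and the sample data is invented. The sketch only shows how a threshold τ on the round-trip score trades coverage (fraction of items retained) against correctness on the accepted set.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Item:
    """Hypothetical per-specification record (field names are assumptions)."""
    spec_id: str
    round_trip: float  # round-trip similarity, assumed to lie on a 0-100 scale
    correct: bool      # whether the SAT/UNSAT verdict matched the reference


def gate(items: List[Item], tau: float) -> Tuple[float, float]:
    """Return (coverage, accuracy on the accepted set) for threshold tau.

    An item is accepted when its round-trip score is at least tau
    (the inclusive comparison is an assumption).
    """
    accepted = [it for it in items if it.round_trip >= tau]
    coverage = len(accepted) / len(items) if items else 0.0
    accuracy = (
        sum(it.correct for it in accepted) / len(accepted) if accepted else 0.0
    )
    return coverage, accuracy


# Invented toy data, purely to exercise the function.
toy = [
    Item("a", 90.0, True),
    Item("b", 80.0, True),
    Item("c", 70.0, False),
    Item("d", 60.0, True),
]

for tau in (0, 75):
    cov, acc = gate(toy, tau)
    print(f"tau={tau}: coverage={cov:.2f}, accepted-set accuracy={acc:.2f}")
```

Raising τ in this sketch shrinks the accepted set and, when low-scoring items are more often wrong, raises accuracy on what remains, which is the reliability-coverage trade-off the abstract refers to.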

Paper Type: Full-length papers (i.e. case studies, theoretical, applied research papers). 8 pages

Submission Number: 32
