Keywords: Formal Methods for LLM reasoning, SAT based validation, CNF transformation and equivalence checking, Neural Symbolic Reasoning, LLM
Abstract: VeriTrans is a reliability-first ML system that compiles natural-language requirements into solver-ready logic with validator-gated reliability. The pipeline integrates an instruction-tuned NL$\\!\\to\\!$PL translator, round-trip reconstruction (PL$\\!\\to\\!$NL) used as a high-precision acceptance gate, and canonical PL$\\!\\to\\!$CNF compilation, all executed via fixed API configuration (temperature$=0$; fine-tuning runs use seed$=42$) and per-item artifact logging (prompts, outputs, hashes) to support auditability and replay-driven debugging. On SatBench (2100 specifications), VeriTrans achieves 94.46\% SAT/UNSAT correctness and 87.73\% median round-trip similarity. Compact fine-tuning on 100-150 curated examples improves fidelity by about 1-1.5,pp without increasing latency (mean 25.8 s/spec on our 201-spec runtime subset). A thresholded acceptance policy on the round-trip score exposes a reliability--coverage knob: at $\\tau{=}75$, roughly 68\% of items are retained with $\\sim$94\% correctness on the accepted set. Validator overhead contributes $<15\\%$ of end-to-end runtime, and all prompts/responses and timing metadata are logged to enable replay-driven debugging and regression testing. By separating learned translation from symbolic verification and enforcing deterministic, validator-gated acceptance, VeriTrans turns NL$\\!\\to\\!$logic front-ends into auditable, reproducible components for reliability-critical workflows.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public.
Paper Type: Full-length papers (i.e. case studies, theoretical, applied research papers). 8 pages
Reroute: false
Submission Number: 32
Loading