Abstract: Large language models can draft ontologies, but unverified extraction yields hallucinated triples: facts that are plausible yet incorrect. EchoLLM is a text-only, evidence-grounded pipeline for ontology construction. Candidate triples are first extracted with an instruction-following LLM. A hybrid retriever (BM25 + dense) then gathers sentence-level evidence for each triple. Natural language inference tests whether the evidence entails the triple; only entailed, lexically consistent hypotheses are accepted, and all decisions are logged. Accepted entities are embedded and clustered to induce classes and a lightweight hierarchy, and rdfs:comment annotations are generated from the supporting text. The result is a validated triple set and an initial ontology suitable for bootstrapping domain knowledge graphs. The design favors high precision, requires no domain-specific rules, and surfaces failure modes at each stage (extraction, retrieval, verification). This enables authors and subject-matter experts to build trustworthy knowledge graphs quickly while keeping model and cost choices flexible.
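The acceptance step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Decision` record, the `verify` and `is_accepted` helpers, the substring-based lexical check, and the `"entailment"` label string are all assumptions made for illustration, and the NLI model is passed in as a plain callable rather than tied to any specific library.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class Decision:
    """One logged verification decision (hypothetical record layout)."""
    triple: Triple
    evidence: str
    label: str
    lexically_consistent: bool
    accepted: bool


def lexically_consistent(triple: Triple, evidence: str) -> bool:
    # Assumed check: subject and object must both occur in the evidence sentence.
    text = evidence.lower()
    subj, _, obj = triple
    return subj.lower() in text and obj.lower() in text


def verify(
    triple: Triple,
    evidence_sentences: List[str],
    nli: Callable[[str, str], str],
) -> List[Decision]:
    """Test a triple against each evidence sentence and log every decision."""
    hypothesis = " ".join(triple)  # naive verbalization of the triple
    log = []
    for sentence in evidence_sentences:
        label = nli(sentence, hypothesis)  # premise, hypothesis -> label
        lex = lexically_consistent(triple, sentence)
        accepted = label == "entailment" and lex
        log.append(Decision(triple, sentence, label, lex, accepted))
    return log


def is_accepted(log: List[Decision]) -> bool:
    # A triple is kept if at least one evidence sentence supports it.
    return any(d.accepted for d in log)
```

Keeping the full decision log, rather than only the final verdict, is what makes it possible to attribute a rejected triple to a retrieval miss (no relevant sentence), a verification failure (no entailment), or a lexical mismatch.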
External IDs:doi:10.52825/ocp.v8i