\begin{table*}[btp!]
\centering
\renewcommand{\arraystretch}{1.2}
\begin{tabular}{p{4in}|cc|cc}
\toprule
 & SC & MVSC & CQR & GCCQR \\
\toprule
Anaximander believed that the universe is infinite. &  & X &  & X \\
Anaximander came from a noble family. &  & X &  & X \\
Anaximander was born in Miletus, a city in the ancient Greek world. &  & X &  & X \\
Anaximander's work has survived to the present day. &  & X &  & X \\
Despite his contributions to philosophy, Anaximander's life remains somewhat shrouded in mystery. &  & X &  & X \\
\midrule
Merritt Butrick is best known for his roles in the Star Trek franchise. &  & X &  & X \\
Merritt Butrick was an American actor. &  & X &  & X \\
Merritt Butrick contributed to the Star Trek franchise. &  &  &  & X \\
\bottomrule
\end{tabular}
\caption{Using outputs from \textbf{Llama 2 7B Chat} on \textsc{Bio-NQ}, we present examples in which every conformal method (at $90\%$ target coverage) retains a subset of claims that is entirely correct. An X indicates that the method retains the claim. In these examples, the multivalid methods (MVSC, GCCQR) output nonempty sets, while the standard conformal methods (SC, CQR) output empty sets.}
\label{tab:examples_llama2_nonempty}
\renewcommand{\arraystretch}{1}
\end{table*}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{table*}[btp!]
\centering
\renewcommand{\arraystretch}{1.2}
\begin{tabular}{p{4in}|cc|cc}
\toprule
 & SC & MVSC & CQR & GCCQR \\
\toprule

Bessel van der Kolk has written extensively on the connection between the brain, mind, and body in the healing of trauma. &  & X & X & X \\
Bessel van der Kolk is a world-renowned Dutch-American psychiatrist. &  & X & X & X \\
Bessel van der Kolk's work has had a significant impact on the understanding and treatment of trauma. &  & X &  & X \\
\midrule
Richard Chamberlain continues to act. &  &  &  & X \\
Richard Chamberlain has received numerous awards and accolades throughout his career. &  &  &  & X \\
Richard Chamberlain was born on March 31, 1934, in Beverly Hills, California. &  &  &  & X \\
\bottomrule
\end{tabular}
\caption{Using outputs from \textbf{Mistral 7B Instruct} on \textsc{Bio-NQ}, we present examples in which every conformal method using \textit{self-consistency} (at $90\%$ target coverage) retains a subset of claims that is entirely correct. An X indicates that the method retains the claim. In each example, either SC or CQR produces an empty set, while its multivalid counterpart (MVSC or GCCQR, respectively) does not.}
\label{tab:examples_mistral_nonempty}
\renewcommand{\arraystretch}{1}
\end{table*}