The problem of verifying whether a textual hypothesis holds based on the
given evidence, also known as fact verification, plays an important role in the study of
natural language understanding and semantic representation. However, existing studies
are restricted to dealing with unstructured textual evidence (e.g., sentences, passages,
or a pool of passages), while verification using structured forms of evidence, such as tables, graphs, and databases, remains largely unexplored.
This paper specifically aims to study fact verification with semi-structured evidence. We construct a large-scale dataset called TABFACT with 16k Wikipedia tables as evidence for 118k human-annotated statements. The
statements are labeled as either ENTAILED or REFUTED. TABFACT is challenging since
it involves both soft linguistic reasoning and hard symbolic reasoning. To address these
challenges, we design two different models,
Table-BERT and the Latent Program Algorithm (LPA), as baselines for the proposed task.