# ViText2SQL: A dataset for Vietnamese Text-to-SQL semantic parsing

ViText2SQL is a dataset for the Vietnamese Text-to-SQL semantic parsing task, consisting of about 10K question and SQL query pairs. The construction of ViText2SQL is detailed in our EMNLP-2020 Findings [paper](https://www.aclweb.org/anthology/2020.findings-emnlp.364/):

	@inproceedings{vitext2sql,
	    title     = {{A Pilot Study of Text-to-SQL Semantic Parsing for Vietnamese}},
	    author    = {Anh Tuan Nguyen and Mai Hoang Dao and Dat Quoc Nguyen},
	    booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2020},
	    year      = {2020},
	    pages     = {4079--4085}
	}

By downloading the ViText2SQL dataset, the user agrees:

- to use ViText2SQL for research or educational purposes only.
- to **not** distribute ViText2SQL or part of ViText2SQL in any original or modified form.
- and to cite our EMNLP-2020 Findings paper above whenever ViText2SQL is employed to help produce published results.

The pre-trained Word2Vec syllable and word embeddings for Vietnamese that we used in our experiments are also available for download [**here**](https://github.com/datquocnguyen/PhoW2V).

#### Copyright (c) 2020 VinAI

	THE DATA IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
	IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
	AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
	LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
	OUT OF OR IN CONNECTION WITH THE DATA OR THE USE OR OTHER DEALINGS IN THE
	DATA.


