A Database-based Rather Than a Language Model-based Natural Language Processing Method

22 Sept 2023 (modified: 25 Mar 2024), ICLR 2024 Conference Withdrawn Submission
Keywords: NLP, NLG, NLU, Learning, database, TGHM
TL;DR: We propose a new paradigm for NLP in which we redefine natural language and set a new object of study. The proposed method is closer to the way humans process information.
Abstract: Pre-trained language models for NLP tasks take natural language as the direct modeling object. However, we believe that natural language is essentially a way of encoding information (knowledge). Therefore, the object of study should be the information encoded in language, together with the organizational and compositional structure of that information. Based on this understanding, we propose a database-based NLP method that changes the modeling object from natural language to the information encoded in natural language. On this basis, 1) the sentence generation task is transformed into read operations on the database, subject to a set of sentence encoding rules; 2) the sentence understanding task is transformed into sentence decoding rules and a series of Boolean operations on the database; 3) the learning task is achieved by write operations. Our method is closer to how the human brain processes information and has excellent interpretability and scalability.
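The mapping described in the abstract (generation as reads, understanding as decoding plus Boolean checks, learning as writes) can be illustrated with a toy sketch. This is not the authors' system: the `facts` table, the three-token sentence format, and the function names `make_store`, `learn`, `generate`, and `understand` are all assumptions made here for illustration, using Python's standard `sqlite3` module.

```python
import sqlite3


def make_store():
    """Create an in-memory store of (subject, relation, object) facts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts ("
        "subject TEXT, relation TEXT, object TEXT, "
        "UNIQUE(subject, relation, object))"
    )
    return conn


def learn(conn, subject, relation, obj):
    """Learning as a write operation: store a fact, ignoring exact duplicates."""
    conn.execute(
        "INSERT OR IGNORE INTO facts VALUES (?, ?, ?)",
        (subject, relation, obj),
    )


def generate(conn, subject):
    """Generation as a read operation followed by a fixed encoding rule."""
    rows = conn.execute(
        "SELECT subject, relation, object FROM facts "
        "WHERE subject = ? ORDER BY rowid",
        (subject,),
    ).fetchall()
    # Assumed encoding rule: "<subject> <relation> <object>."
    return [f"{s} {r} {o}." for s, r, o in rows]


def understand(conn, sentence):
    """Understanding as a decoding rule followed by a Boolean check on the store."""
    # Assumed decoding rule: exactly three whitespace-separated tokens.
    parts = sentence.rstrip(".").split()
    if len(parts) != 3:
        raise ValueError("expected '<subject> <relation> <object>.'")
    s, r, o = parts
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM facts "
        "WHERE subject = ? AND relation = ? AND object = ?)",
        (s, r, o),
    ).fetchone()
    return bool(row[0])
```

In this sketch, calling `learn` twice with the same triple leaves a single row, `generate` returns one sentence per stored fact about a subject, and `understand` returns whether the decoded triple is present. A real system would need far richer encoding and decoding rules than the single fixed pattern assumed here.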
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4434