Explainable, Steerable Models with Natural Language Parameters and Constraints

21 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Large Language Model; Explainability
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a new formulation for explaining datasets with structured modalities: statistical modeling with natural language parameters and constraints.
Abstract: Statistical modeling can uncover patterns in large datasets, but these patterns may not be explainable or relevant to our specific interest. For example, Gaussian mixture models are commonly used to explore text corpora, but it is hard to explain what each cluster means or to steer them toward specific attributes (e.g. clustering based on style but not topic). To improve explainability and steerability, we introduce models whose parameters are represented as natural language strings. For example, instead of using a Gaussian to represent a cluster, we represent it with a natural language predicate such as “*has a casual style*”. By leveraging the denotational semantics of natural language, we interpret these predicates as binary feature extractors and use them as building blocks for classical statistical models such as clustering, topic modeling, and regression. Language semantics also lets us specify constraints on the learned string parameters, such as “*the parameters should be style-related*”. To learn in our framework, we propose an algorithm that optimizes the log-likelihood of these models by iteratively optimizing continuous relaxations of the string parameters and then discretizing them, i.e., explaining the continuous parameters with a language model. Evaluating our algorithm across three real corpora and four statistical models, we find both the continuous relaxation and the iterative refinement to be crucial. Finally, we show proof-of-concept applications in controllably generating explainable image clusters and in describing major topic variations across news from different months.
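To make the core idea concrete, here is a minimal sketch of a natural language predicate acting as a binary feature extractor, as the abstract describes. The `denotation` function is a stand-in: it checks the predicate via keyword matching, whereas the paper's framework would evaluate the predicate's denotation with a language model; the predicate strings and keyword lists below are illustrative assumptions, not from the paper.

```python
def denotation(predicate, text):
    """Return 1 if `text` satisfies `predicate`.

    Mock implementation via keyword matching; in the paper's framework
    a language model would judge whether the predicate holds.
    """
    keywords = {
        "has a casual style": ["lol", "gonna", "hey"],
        "mentions sports": ["game", "team", "score"],
    }
    return int(any(k in text.lower() for k in keywords.get(predicate, [])))


def featurize(predicates, corpus):
    """Map each document to a binary feature vector in {0,1}^K,
    one dimension per natural language predicate."""
    return [[denotation(p, doc) for p in predicates] for doc in corpus]


predicates = ["has a casual style", "mentions sports"]
corpus = [
    "hey, gonna watch the game tonight lol",
    "The committee approved the annual budget.",
]
features = featurize(predicates, corpus)
# features[0] == [1, 1]; features[1] == [0, 0]
```

These binary feature vectors can then be fed into ordinary statistical models (clustering, topic models, regression), which is what makes the learned parameters readable: each feature dimension is a human-interpretable predicate string.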
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4116