With Measured Words: Simple Sentence Selection for Black-Box Optimization of Sentence Compression Algorithms
Abstract: Sentence compression is the task of generating a shorter yet grammatical version
of a given sentence that preserves the essence
of the original. This paper proposes a Black-Box Optimizer for Compression
(B-BOC): given a black-box compression algorithm, and assuming that not all sentences need
to be compressed, B-BOC finds the best candidates for
compression so as to maximize both compression rate and quality. Given a required
compression ratio, we consider two scenarios: (i) single-sentence compression, and (ii)
sentence-sequence compression. In the first
scenario, our optimizer is trained to predict
how well each sentence could be compressed
while meeting the specified ratio requirement.
In the second scenario, the desired compression ratio is
applied to a sequence of sentences (e.g., a
paragraph) as a whole, rather than to each individual sentence. To achieve that, we use
B-BOC to assign an optimal compression ratio to each sentence, then cast the resulting selection as a Knapsack
problem, which we solve using bounded dynamic programming. We evaluate B-BOC on
both scenarios on three datasets, demonstrating that our optimizer improves both accuracy
and ROUGE F1 score compared to direct application of other compression algorithms.
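
The sequence-level scenario can be illustrated with a small sketch. The abstract does not spell out the exact Knapsack formulation, so the following assumes a multiple-choice variant: each sentence has a few candidate versions (the original plus one or more compressions), each with a token length and a predicted quality score, and exactly one version is picked per sentence so that the total length stays within the budget implied by the required ratio. The function name `select_compressions`, the `(length, quality)` option format, and all numbers are hypothetical, not taken from the paper.

```python
from typing import List, Tuple

# One candidate version of a sentence: (length_in_tokens, predicted_quality).
# Both the format and the values used below are illustrative assumptions.
Option = Tuple[int, float]


def select_compressions(options: List[List[Option]], budget: int) -> Tuple[float, List[int]]:
    """Pick one option per sentence, maximizing summed quality with total length <= budget.

    Returns (best_total_quality, chosen_option_index_per_sentence).
    Raises ValueError if no combination of options fits the budget.
    """
    neg_inf = float("-inf")
    # best[b]: best summed quality over the sentences processed so far
    # when their chosen versions have total length exactly b.
    best = [neg_inf] * (budget + 1)
    best[0] = 0.0
    picks: List[List[int]] = []  # picks[i][b]: option chosen for sentence i to reach length b

    for sentence_options in options:
        new_best = [neg_inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for b in range(budget + 1):
            if best[b] == neg_inf:
                continue  # length b is not reachable with the previous sentences
            for k, (length, quality) in enumerate(sentence_options):
                nb = b + length
                if nb <= budget and best[b] + quality > new_best[nb]:
                    new_best[nb] = best[b] + quality
                    pick[nb] = k
        best = new_best
        picks.append(pick)

    end = max(range(budget + 1), key=lambda b: best[b])
    if best[end] == neg_inf:
        raise ValueError("no selection of options fits the budget")

    # Walk back through the tables to recover the chosen option per sentence.
    chosen: List[int] = []
    b = end
    for i in range(len(options) - 1, -1, -1):
        k = picks[i][b]
        chosen.append(k)
        b -= options[i][k][0]
    chosen.reverse()
    return best[end], chosen


# Three sentences; option 0 is the original, option 1 a hypothetical compression.
example = [
    [(10, 1.0), (6, 0.7)],
    [(8, 1.0), (5, 0.9)],
    [(12, 1.0), (7, 0.4)],
]
total_quality, chosen = select_compressions(example, budget=24)  # 80% of 30 tokens
print(total_quality, chosen)
```

In this toy input the third sentence compresses poorly, so the selection compresses the first two sentences and leaves the third untouched, which mirrors the premise that not every sentence needs to be compressed. The table has one entry per token of budget, so the running time grows with the budget times the total number of options.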