Abstract: This paper introduces Newsjam, a multilingual summarization tool for COVID-19 news articles. To this end, two extractive summarization methods were implemented for French and English data: Latent Semantic Indexing, and K-means clustering on contextual word embeddings. The tool was then evaluated using three evaluation metrics on four corpora: two existing ones and two custom-built ones. Finally, the best-performing methods were integrated into a complete pipeline that goes from text scraping and classification to summarization and, ultimately, automatic posting of the summaries to Twitter.