Abstract: In this work, we present an application of the t-Distributed Stochastic Neighbor Embedding (t-SNE) method to the visualization of judgments from Polish courts. Using the t-SNE dimensionality reduction technique, each document is mapped onto a point in two-dimensional space based on its vocabulary (i.e., its bag-of-words representation). The whole collection is then presented as a scatter plot. We analyze the obtained maps for various document groups, differentiated by issuing institution, division, or keywords, or selected on the basis of citations. The obtained visualizations are encouraging: we demonstrate that the generated t-SNE maps are interpretable and capable of providing synthetic knowledge about document collections that is difficult to obtain without time-consuming manual analysis.
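
As an illustration of the bag-of-words step mentioned in the abstract, the following minimal sketch turns a few documents into count vectors over a shared vocabulary. The function name `bag_of_words`, the whitespace tokenization, and the toy sentences are assumptions made for this example; the abstract does not describe the preprocessing actually used.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, count_matrix) for a list of text documents."""
    # Naive tokenization (an assumption): lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    # Shared, sorted vocabulary so every document uses the same columns.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)  # missing terms count as 0
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Toy stand-ins for judgment texts (hypothetical, not real data).
docs = [
    "court dismissed the appeal",
    "the court upheld the appeal",
    "tax office issued a decision",
]
vocabulary, matrix = bag_of_words(docs)
```

Each row of `matrix` is one document's vector. In a pipeline like the one described, such vectors would be passed to a t-SNE implementation (for example, scikit-learn's `sklearn.manifold.TSNE` with `n_components=2`, which is an assumption here rather than the authors' stated tool) to obtain the two-dimensional points shown in the scatter plot.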