Abstract: Information extraction is a well-studied task that plays a critical role in many NLP applications, as its outputs can serve as an
entry point for digital transformation. However, gaps remain when applying research results to actual business cases. This
paper introduces AURORA, an information extraction system for domain-specific business documents. The intuition behind AURORA is to use
transfer learning for extraction. To do so, it draws on transformers to cope with the limited training data available in
business cases and stacks additional layers on top for domain adaptation. We demonstrate AURORA in actual scenarios in which
users are invited to try two functions: fine-grained and whole-paragraph extraction from Japanese business documents. A video of the system is available at http://y2u.be/xHQpYE41Tqw.