TextBox: A Unified, Modularized, and Extensible Framework for Text Generation

Many natural language generation models have been developed recently; however, most are suited to only a few specific tasks and focus on one particular technique. A recent paper on arXiv.org addresses this limitation by proposing a unified text generation framework.

Image: pxhere.com, CC0 Public Domain

The framework includes a range of text generation models: variational auto-encoders, generative adversarial networks, recurrent neural networks, Transformer-based models, and pre-trained language models. Flexible mechanisms make it possible to test and compare different algorithms, and individual modules and functionalities can be easily plugged in or swapped out.

Moreover, the proposed framework can seamlessly integrate user-customized modules and external components. It supports both unconditional text generation and conditional tasks such as text summarization and machine translation.
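To make the plug-in idea concrete, here is a minimal sketch of one common way to make models swappable: a name-to-class registry behind a shared interface. This is an illustration only, not TextBox's actual API. The names `MODEL_REGISTRY`, `Generator`, `register`, `run`, and the two toy models are hypothetical and exist only for this example.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping a model name to a factory.
# This is an illustrative pattern, NOT part of TextBox.
MODEL_REGISTRY: Dict[str, Callable[[], "Generator"]] = {}


class Generator:
    """Minimal interface that every pluggable model implements."""

    def generate(self, prompt: List[str], max_len: int) -> List[str]:
        raise NotImplementedError


def register(name: str):
    """Class decorator that adds a model class to the registry under `name`."""
    def wrap(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return wrap


@register("repeat_last")
class RepeatLast(Generator):
    """Toy model: repeats the last prompt token max_len times."""

    def generate(self, prompt, max_len):
        return [prompt[-1]] * max_len


@register("reverse")
class Reverse(Generator):
    """Toy model: emits the prompt tokens in reverse order, truncated to max_len."""

    def generate(self, prompt, max_len):
        return list(reversed(prompt))[:max_len]


def run(model_name: str, prompt: List[str], max_len: int = 3) -> List[str]:
    """Look up a model by name and run it; the caller never names a class."""
    model = MODEL_REGISTRY[model_name]()
    return model.generate(prompt, max_len)
```

With this layout, comparing two algorithms means changing a single string, for example `run("repeat_last", tokens)` versus `run("reverse", tokens)`, and a user-defined model joins the comparison by adding one decorated class.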

We release an open library, called TextBox, which provides a unified, modularized, and extensible text generation framework. TextBox aims to support a broad set of text generation tasks and models. In TextBox, we implement several text generation models on benchmark datasets, covering the categories of VAE, GAN, pre-trained language models, etc. Meanwhile, our library maintains sufficient modularity and extensibility by properly decomposing the model architecture, inference, and learning process into highly reusable modules, which makes it easy to incorporate new models into our framework. It is especially suitable for researchers and practitioners who want to efficiently reproduce baseline models and develop new ones. TextBox is implemented in PyTorch and released under the Apache License 2.0.

Link: https://arxiv.org/abs/2101.02046

