OpenAI GPT
----------------------------------------------------

Overview
~~~~~~~~~~~~~~~~~~~~~

The OpenAI GPT model was proposed in `Improving Language Understanding by Generative Pre-Training <https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf>`__
by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever. It's a causal (unidirectional)
transformer pre-trained using language modeling on a large corpus with long range dependencies, the Toronto Book Corpus.

The abstract from the paper is the following:

*Natural language understanding comprises a wide range of diverse tasks such
as textual entailment, question answering, semantic similarity assessment, and
document classification. Although large unlabeled text corpora are abundant,
labeled data for learning these specific tasks is scarce, making it challenging for
discriminatively trained models to perform adequately. We demonstrate that large
gains on these tasks can be realized by generative pre-training of a language model
on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each
specific task. In contrast to previous approaches, we make use of task-aware input
transformations during fine-tuning to achieve effective transfer while requiring
minimal changes to the model architecture. We demonstrate the effectiveness of
our approach on a wide range of benchmarks for natural language understanding.
Our general task-agnostic model outperforms discriminatively trained models that
use architectures specifically crafted for each task, significantly improving upon the
state of the art in 9 out of the 12 tasks studied.*

Tips:

- GPT is a model with absolute position embeddings so it's usually advised to pad the inputs on
  the right rather than the left.
- GPT was trained with a causal language modeling (CLM) objective and is therefore powerful at predicting the next
  token in a sequence. Leveraging this feature allows GPT to generate syntactically coherent text, as
  can be observed in the `run_generation.py` example script (see the sketch after this list).
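
As a quick illustration of the CLM objective in practice, the minimal sketch below samples a continuation with
`OpenAIGPTLMHeadModel`. It assumes the `openai-gpt` checkpoint and the generic `generate` API; the sampling
arguments are illustrative, not prescriptive.

.. code-block:: python

    import torch
    from transformers import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel

    # Load the pre-trained tokenizer and language-modeling head (assumes the "openai-gpt" checkpoint).
    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = OpenAIGPTLMHeadModel.from_pretrained("openai-gpt")
    model.eval()

    # Encode a prompt and sample a continuation, token by token (causal, left-to-right).
    input_ids = tokenizer.encode("The book is about", return_tensors="pt")
    output_ids = model.generate(input_ids, max_length=40, do_sample=True, top_k=50)
    print(tokenizer.decode(output_ids[0]))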

`Write With Transformer <https://transformer.huggingface.co/doc/gpt>`__ is a webapp created and hosted by
Hugging Face showcasing the generative capabilities of several models. GPT is one of them.

OpenAIGPTConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.OpenAIGPTConfig
    :members:
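
A minimal sketch of how a configuration object can be used: instantiating a default `OpenAIGPTConfig` and
building a randomly initialized model from it. The overridden hyperparameter values below are only examples.

.. code-block:: python

    from transformers import OpenAIGPTConfig, OpenAIGPTModel

    # Default configuration (matches the hyperparameters of the original openai-gpt architecture).
    config = OpenAIGPTConfig()

    # Configurations are plain containers, so individual hyperparameters can be overridden.
    small_config = OpenAIGPTConfig(n_layer=6, n_head=6, n_embd=384)

    # Build a model with randomly initialized weights from a configuration.
    model = OpenAIGPTModel(small_config)
    print(model.config.n_layer)  # 6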

OpenAIGPTTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.OpenAIGPTTokenizer
    :members:
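
A minimal sketch of basic tokenizer usage, assuming the `openai-gpt` checkpoint; the tokenizer applies
lower-casing and BPE (byte-pair encoding), so the exact sub-words and ids depend on the downloaded vocabulary.

.. code-block:: python

    from transformers import OpenAIGPTTokenizer

    # Load the BPE vocabulary and merges shipped with the pre-trained checkpoint.
    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")

    tokens = tokenizer.tokenize("Hello, my dog is cute")  # list of BPE sub-word strings
    ids = tokenizer.convert_tokens_to_ids(tokens)         # vocabulary indices
    text = tokenizer.decode(ids)                          # back to a (lower-cased) string
    print(tokens, ids, text, sep="\n")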

OpenAIGPTModel
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.OpenAIGPTModel
    :members:
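
The sketch below runs the bare model and reads out the hidden states, assuming the `openai-gpt` checkpoint and
PyTorch tensors; the first element of the output is the sequence of last-layer hidden states.

.. code-block:: python

    import torch
    from transformers import OpenAIGPTTokenizer, OpenAIGPTModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = OpenAIGPTModel.from_pretrained("openai-gpt")

    # Single sequence; when batching, pad on the right as recommended in the tips above.
    input_ids = torch.tensor([tokenizer.encode("Hello, my dog is cute")])  # batch size 1

    with torch.no_grad():
        outputs = model(input_ids)

    last_hidden_states = outputs[0]  # shape (batch_size, sequence_length, hidden_size)
    print(last_hidden_states.shape)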

OpenAIGPTLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.OpenAIGPTLMHeadModel
    :members:
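
A minimal sketch of computing the language-modeling loss with this head, assuming the `openai-gpt` checkpoint;
passing the input ids as `labels` makes the model return the causal LM loss as the first output.

.. code-block:: python

    import torch
    from transformers import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = OpenAIGPTLMHeadModel.from_pretrained("openai-gpt")

    input_ids = torch.tensor([tokenizer.encode("Hello, my dog is cute")])  # batch size 1

    # The shift needed for next-token prediction is handled inside the model when labels are given.
    outputs = model(input_ids, labels=input_ids)
    loss, lm_logits = outputs[:2]
    print(loss.item(), lm_logits.shape)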

OpenAIGPTDoubleHeadsModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.OpenAIGPTDoubleHeadsModel
    :members:
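
The double-heads model adds a multiple-choice classification head on top of the language model. The sketch below
follows the common pattern of appending a classification token to each choice; the added `[CLS]` token is an
assumption for illustration, and the `openai-gpt` checkpoint is assumed.

.. code-block:: python

    import torch
    from transformers import OpenAIGPTTokenizer, OpenAIGPTDoubleHeadsModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = OpenAIGPTDoubleHeadsModel.from_pretrained("openai-gpt")

    # Add a classification token and resize the embeddings so the new token gets a vector.
    tokenizer.add_special_tokens({"cls_token": "[CLS]"})
    model.resize_token_embeddings(len(tokenizer))

    choices = ["Hello, my dog is cute [CLS]", "Hello, my cat is cute [CLS]"]
    input_ids = torch.tensor([tokenizer.encode(c) for c in choices]).unsqueeze(0)  # (1, 2, seq_len)
    mc_token_ids = torch.tensor([input_ids.size(-1) - 1] * 2).unsqueeze(0)         # position of [CLS] in each choice

    outputs = model(input_ids, mc_token_ids=mc_token_ids)
    lm_logits, mc_logits = outputs[:2]
    print(mc_logits.shape)  # (1, 2): one score per choice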

TFOpenAIGPTModel
~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFOpenAIGPTModel
    :members:
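
A minimal TensorFlow counterpart of the PyTorch example above, assuming the `openai-gpt` checkpoint; the TF
classes share the `from_pretrained` API and return outputs in the same order.

.. code-block:: python

    import tensorflow as tf
    from transformers import OpenAIGPTTokenizer, TFOpenAIGPTModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = TFOpenAIGPTModel.from_pretrained("openai-gpt")

    input_ids = tf.constant([tokenizer.encode("Hello, my dog is cute")])  # batch size 1
    outputs = model(input_ids)

    last_hidden_states = outputs[0]  # shape (batch_size, sequence_length, hidden_size)
    print(last_hidden_states.shape)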

TFOpenAIGPTLMHeadModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFOpenAIGPTLMHeadModel
    :members:
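
A short sketch of obtaining next-token logits from the TensorFlow language-modeling head, again assuming the
`openai-gpt` checkpoint; the first output holds the logits over the vocabulary at each position.

.. code-block:: python

    import tensorflow as tf
    from transformers import OpenAIGPTTokenizer, TFOpenAIGPTLMHeadModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = TFOpenAIGPTLMHeadModel.from_pretrained("openai-gpt")

    input_ids = tf.constant([tokenizer.encode("Hello, my dog is cute")])  # batch size 1
    outputs = model(input_ids)

    lm_logits = outputs[0]  # shape (batch_size, sequence_length, vocab_size)
    print(lm_logits.shape)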

TFOpenAIGPTDoubleHeadsModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFOpenAIGPTDoubleHeadsModel
    :members:
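
The TensorFlow double-heads model mirrors the PyTorch version above; a minimal sketch, with the same assumed
`[CLS]` token and `openai-gpt` checkpoint.

.. code-block:: python

    import tensorflow as tf
    from transformers import OpenAIGPTTokenizer, TFOpenAIGPTDoubleHeadsModel

    tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
    model = TFOpenAIGPTDoubleHeadsModel.from_pretrained("openai-gpt")

    # Add a classification token and resize the embeddings, as in the PyTorch example.
    tokenizer.add_special_tokens({"cls_token": "[CLS]"})
    model.resize_token_embeddings(len(tokenizer))

    choices = ["Hello, my dog is cute [CLS]", "Hello, my cat is cute [CLS]"]
    input_ids = tf.constant([[tokenizer.encode(c) for c in choices]])  # (1, 2, seq_len)
    mc_token_ids = tf.constant([[input_ids.shape[-1] - 1] * 2])        # position of [CLS] in each choice

    outputs = model(input_ids, mc_token_ids=mc_token_ids)
    lm_logits, mc_logits = outputs[:2]
    print(mc_logits.shape)  # (1, 2): one score per choice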