Unverified Commit 52679fbc authored by Patrick von Platen's avatar Patrick von Platen Committed by GitHub
Browse files

add dialogpt training tips (#3996)

parent b5c6d3d4
...@@ -23,6 +23,16 @@ Tips: ...@@ -23,6 +23,16 @@ Tips:
- DialoGPT was trained with a causal language modeling (CLM) objective on conversational data and is therefore powerful at response generation in open-domain dialogue systems. - DialoGPT was trained with a causal language modeling (CLM) objective on conversational data and is therefore powerful at response generation in open-domain dialogue systems.
- DialoGPT enables the user to create a chat bot in just 10 lines of code as shown on `DialoGPT's model card <https://huggingface.co/microsoft/DialoGPT-medium>`_. - DialoGPT enables the user to create a chat bot in just 10 lines of code as shown on `DialoGPT's model card <https://huggingface.co/microsoft/DialoGPT-medium>`_.
Training:
In order to train or fine-tune DialoGPT, one can use causal language modeling training.
To cite the official paper:
*We follow the OpenAI GPT-2 to model a multiturn dialogue session
as a long text and frame the generation task as language modeling. We first
concatenate all dialog turns within a dialogue session into a long text
x_1,..., x_N (N is the sequence length), ended by the end-of-text token.*
For more information please confer to the original paper.
DialoGPT's architecture is based on the GPT2 model, so one can refer to GPT2's `docstring <https://huggingface.co/transformers/model_doc/gpt2.html>`_. DialoGPT's architecture is based on the GPT2 model, so one can refer to GPT2's `docstring <https://huggingface.co/transformers/model_doc/gpt2.html>`_.
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment