meliksahturker committed (verified)
Commit fa390f5
1 Parent(s): 1208b01

Update README.md

Files changed (1): README.md +4 -22
README.md CHANGED
@@ -4,14 +4,12 @@ language:
  arXiv: 2403.01308
  library_name: transformers
  pipeline_tag: text2text-generation
- inference:
-   parameters:
-     max_new_tokens: 128
  widget:
- - text: >-
-     Ben buraya bazı <MASK> istiyorum.
+ - text: Ben buraya bazı <MASK> istiyorum.
    example_title: Masked Language Modeling
  license: cc-by-nc-sa-4.0
+ datasets:
+ - vngrs-ai/vngrs-web-corpus
  ---
  # VBART Model Card

@@ -29,23 +27,7 @@ This repository contains pre-trained TensorFlow and Safetensors weights of VBART
  - **License:** CC BY-NC-SA 4.0
  - **Finetuned from:** VBART-Large
  - **Paper:** [arXiv](https://arxiv.org/abs/2403.01308)
- ## How to Get Started with the Model
- ```python
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
-
- tokenizer = AutoTokenizer.from_pretrained("vngrs-ai/VBART-Medium-Base",
-                                           model_input_names=['input_ids', 'attention_mask'])
- # Uncomment the device_map kwarg and delete the closing bracket to use the model for inference on GPU
- model = AutoModelForSeq2SeqLM.from_pretrained("vngrs-ai/VBART-Medium-Base")#, device_map="auto")
-
- # Input text
- input_text = "Ben buraya bazı <MASK> istiyorum."
-
- token_input = tokenizer(input_text, return_tensors="pt")#.to('cuda')
- outputs = model.generate(**token_input)
- print(tokenizer.decode(outputs[0]))
- ```

  ## Training Details

  ### Training Data
@@ -64,7 +46,7 @@ Pre-trained for a total of 63B tokens.
  #### Hyperparameters
  ##### Pretraining
  - **Training regime:** fp16 mixed precision
- - **Training objective**: Sentence permutation and span masking (using mask lengths sampled from Poisson distribution λ=3.5, masking 30% of tokens)
+ - **Training objective**: Span masking (using mask lengths sampled from Poisson distribution λ=3.5, masking 30% of tokens)
  - **Optimizer**: Adam optimizer (β1 = 0.9, β2 = 0.98, ε = 1e-6)
  - **Scheduler**: Custom scheduler from the original Transformers paper (20,000 warm-up steps)
  - **Dropout**: 0.1
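
The second hunk removes the README's quickstart snippet. For reference, here is the same transformers usage from the removed lines in cleaned-up form; the model id and tokenizer kwargs are copied from the diff, and `max_new_tokens=128` mirrors the inference widget setting dropped in the first hunk (a sketch, not re-verified against the current model card):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Model id and tokenizer kwargs as in the removed README snippet
tokenizer = AutoTokenizer.from_pretrained(
    "vngrs-ai/VBART-Medium-Base",
    model_input_names=["input_ids", "attention_mask"],
)
# Pass device_map="auto" here to run inference on GPU instead
model = AutoModelForSeq2SeqLM.from_pretrained("vngrs-ai/VBART-Medium-Base")

input_text = "Ben buraya bazı <MASK> istiyorum."  # "I want some <MASK> here."

token_input = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**token_input, max_new_tokens=128)  # cap from the dropped widget config
print(tokenizer.decode(outputs[0]))
```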
 
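The training-objective bullet in the last hunk is terse. As an illustration only, here is a minimal sketch of BART-style span masking with Poisson-sampled span lengths (λ=3.5) covering roughly 30% of tokens, each span collapsing to a single `<MASK>` token; `span_mask` and the toy sentence are illustrative, not from the VBART codebase:

```python
import numpy as np

def span_mask(tokens, mask_token="<MASK>", lam=3.5, mask_ratio=0.3, seed=None):
    """BART-style text infilling: replace token spans with a single mask
    token until roughly mask_ratio of the original tokens are masked."""
    rng = np.random.default_rng(seed)
    tokens = list(tokens)
    budget = int(len(tokens) * mask_ratio)  # total tokens to mask
    while budget > 0:
        # Span length ~ Poisson(lam), clipped to [1, remaining budget]
        length = int(min(max(rng.poisson(lam), 1), budget))
        start = int(rng.integers(0, max(len(tokens) - length, 0) + 1))
        tokens[start:start + length] = [mask_token]  # whole span -> one <MASK>
        budget -= length
    return tokens

print(span_mask("Ben buraya bazı kitaplar istiyorum".split(), seed=0))
# e.g. ['Ben', 'buraya', 'bazı', '<MASK>', 'istiyorum']
```

The "custom scheduler" bullet presumably refers to the inverse-square-root schedule from the original Transformer paper, lr = d_model^(-0.5) · min(step^(-0.5), step · warmup_steps^(-1.5)), with warmup_steps = 20,000 here.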