zerofata qpqpqpqpqpqp committed on
Commit 9c6c94c · verified · 1 Parent(s): 2fa8aa4

Update README.md (Minor Fix) (#2)

- Update README.md (Minor Fix) (e5cdbcd53156f85fd2916a5d64d7000f84a04eb8)


Co-authored-by: 🎬cinema_anon <qpqpqpqpqpqp@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -540,7 +540,7 @@ base_model:
  <div class="section-content">
  <p>Creation Process: SFT</p>
  <p>SFT on approx 10 million tokens, SFW / NSFW RP, stories, creative instruct & chat data.</p>
- <p>MoE's are brutal to train even with a small dataset like mine, so I took a different approach from usual. I used a very low LR in an effort to avoid having to apply DPO / KTO training afterwards.</p>
+ <p>MoE are brutal to train even with a small dataset like mine, so I took a different approach from usual. I used a very low LR in an effort to avoid having to apply DPO / KTO training afterwards.</p>
  <p>I think there's likely a better config to be found, but experimentation with the model to find it is quite draining.</p>
  <div class="dropdown-container">
  <details>