Update README.md
README.md

```diff
@@ -18,7 +18,8 @@ Simply put, I'm making my methodology to evaluate RP models public. While none o
 - All models are loaded in Q8_0 (GGUF) with all layers on the GPU (NVidia RTX3060 12GB)
 - Backend is the latest version of KoboldCPP for Windows using CUDA 12.
 - Using **CuBLAS** but **not using QuantMatMul (mmq)**.
-- All models are extended to **16K context length** (auto rope from KCPP)
+- All models are extended to **16K context length** (auto rope from KCPP)
+- **Flash Attention** and **ContextShift** enabled.
 - Frontend is staging version of Silly Tavern.
 - Response size set to 1024 tokens max.
 - Fixed Seed for all tests: **123**
```
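For reference, the backend settings in the list above roughly correspond to a KoboldCPP launch line like the sketch below. This is an assumption, not the author's actual command: the model filename is a placeholder, and flag behavior varies between KoboldCPP releases (ContextShift is on by default and MMQ is toggled via the `--usecublas` arguments), so check `koboldcpp.exe --help` for the current CLI.

```shell
:: Hypothetical KoboldCPP launch matching the methodology above:
:: Q8_0 GGUF model, all layers offloaded to the GPU, CuBLAS without
:: passing mmq, 16K context, Flash Attention on, ContextShift on (default).
koboldcpp.exe ^
  --model model-Q8_0.gguf ^
  --usecublas ^
  --gpulayers 999 ^
  --contextsize 16384 ^
  --flashattention
```

On the 12GB RTX3060 described above, `--gpulayers 999` simply requests that every layer be offloaded; smaller models at Q8_0 fit fully, which matches the "all layers on the GPU" condition.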