Commit 8573fa5 by zengxianyu
1 parent: bc20114
upload submodules
This view is limited to 50 files because it contains too many changes.
- .gitattributes +1 -0
- .gitignore +1 -1
- .gitmodules +6 -0
- ComfyUI/custom_nodes/ComfyUI-GGUF +1 -0
- ComfyUI/custom_nodes/ComfyUI-GGUF/LICENSE +0 -201
- ComfyUI/custom_nodes/ComfyUI-GGUF/README.md +0 -49
- ComfyUI/custom_nodes/ComfyUI-GGUF/__init__.py +0 -9
- ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py +0 -248
- ComfyUI/custom_nodes/ComfyUI-GGUF/loader.py +0 -353
- ComfyUI/custom_nodes/ComfyUI-GGUF/nodes.py +0 -305
- ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py +0 -281
- ComfyUI/custom_nodes/ComfyUI-GGUF/pyproject.toml +0 -14
- ComfyUI/custom_nodes/ComfyUI-GGUF/requirements.txt +0 -5
- ComfyUI/custom_nodes/ComfyUI-GGUF/tools/README.md +0 -93
- ComfyUI/custom_nodes/ComfyUI-GGUF/tools/convert.py +0 -365
- ComfyUI/custom_nodes/ComfyUI-GGUF/tools/fix_5d_tensors.py +0 -82
- ComfyUI/custom_nodes/ComfyUI-GGUF/tools/fix_lines_ending.py +0 -31
- ComfyUI/custom_nodes/ComfyUI-GGUF/tools/lcpp.patch +0 -451
- ComfyUI/custom_nodes/ComfyUI-GGUF/tools/read_tensors.py +0 -21
- ComfyUI/custom_nodes/cg-image-filter +1 -0
- ComfyUI/models/audio_encoders/put_audio_encoder_models_here +0 -0
- ComfyUI/models/checkpoints/put_checkpoints_here +0 -0
- ComfyUI/models/clip/put_clip_or_text_encoder_models_here +0 -0
- ComfyUI/models/clip_vision/put_clip_vision_models_here +0 -0
- ComfyUI/models/configs/anything_v3.yaml +0 -73
- ComfyUI/models/configs/v1-inference.yaml +0 -70
- ComfyUI/models/configs/v1-inference_clip_skip_2.yaml +0 -73
- ComfyUI/models/configs/v1-inference_clip_skip_2_fp16.yaml +0 -74
- ComfyUI/models/configs/v1-inference_fp16.yaml +0 -71
- ComfyUI/models/configs/v1-inpainting-inference.yaml +0 -71
- ComfyUI/models/configs/v2-inference-v.yaml +0 -68
- ComfyUI/models/configs/v2-inference-v_fp32.yaml +0 -68
- ComfyUI/models/configs/v2-inference.yaml +0 -67
- ComfyUI/models/configs/v2-inference_fp32.yaml +0 -67
- ComfyUI/models/configs/v2-inpainting-inference.yaml +0 -158
- ComfyUI/models/controlnet/put_controlnets_and_t2i_here +0 -0
- ComfyUI/models/diffusers/put_diffusers_models_here +0 -0
- ComfyUI/models/diffusion_models/put_diffusion_model_files_here +0 -0
- ComfyUI/models/embeddings/put_embeddings_or_textual_inversion_concepts_here +0 -0
- ComfyUI/models/gligen/put_gligen_models_here +0 -0
- ComfyUI/models/hypernetworks/put_hypernetworks_here +0 -0
- ComfyUI/models/loras/put_loras_here +0 -0
- ComfyUI/models/model_patches/put_model_patches_here +0 -0
- ComfyUI/models/photomaker/put_photomaker_models_here +0 -0
- ComfyUI/models/style_models/put_t2i_style_model_here +0 -0
- ComfyUI/models/text_encoders/put_text_encoder_files_here +0 -0
- ComfyUI/models/unet/put_unet_files_here +0 -0
- ComfyUI/models/upscale_models/put_esrgan_and_other_upscale_models_here +0 -0
- ComfyUI/models/vae/put_vae_here +0 -0
- ComfyUI/models/vae_approx/put_taesd_encoder_pth_and_taesd_decoder_pth_here +0 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+*.mp4 filter=lfs diff=lfs merge=lfs -text
.gitignore
CHANGED
@@ -1,3 +1,3 @@
 *.pyc
 *.gguf
-*.safetensors
+*.safetensors
.gitmodules
ADDED
@@ -0,0 +1,6 @@
+[submodule "ComfyUI/custom_nodes/ComfyUI-GGUF"]
+path = ComfyUI/custom_nodes/ComfyUI-GGUF
+url = https://github.com/city96/ComfyUI-GGUF
+[submodule "ComfyUI/custom_nodes/cg-image-filter"]
+path = ComfyUI/custom_nodes/cg-image-filter
+url = https://github.com/chrisgoringe/cg-image-filter
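The `.gitmodules` file added here uses git's INI-like config syntax, so the two submodule entries can be read back with Python's standard `configparser`. The following is a minimal sketch for checking which paths map to which upstream repositories; the helper name `list_submodules` is hypothetical, and the file contents are inlined from the hunk above rather than read from disk.

```python
import configparser

# Contents of the .gitmodules file added in this commit (copied from the hunk above).
GITMODULES = """\
[submodule "ComfyUI/custom_nodes/ComfyUI-GGUF"]
path = ComfyUI/custom_nodes/ComfyUI-GGUF
url = https://github.com/city96/ComfyUI-GGUF
[submodule "ComfyUI/custom_nodes/cg-image-filter"]
path = ComfyUI/custom_nodes/cg-image-filter
url = https://github.com/chrisgoringe/cg-image-filter
"""


def list_submodules(text):
    """Return {path: url} for every [submodule "..."] section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {parser[name]["path"]: parser[name]["url"] for name in parser.sections()}


submodules = list_submodules(GITMODULES)
for path, url in submodules.items():
    print(f"{path} <- {url}")
```

After cloning, the directories at these paths stay empty until the submodules are initialised, which is why the vendored files appear as deletions in the rest of this diff.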
ComfyUI/custom_nodes/ComfyUI-GGUF
ADDED
@@ -0,0 +1 @@
+Subproject commit be2a08330d7ec232d684e50ab938870d7529471e
ComfyUI/custom_nodes/ComfyUI-GGUF/LICENSE
DELETED
@@ -1,201 +0,0 @@
-Apache License
-Version 2.0, January 2004
-http://www.apache.org/licenses/
-
-TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-1. Definitions.
-
-"License" shall mean the terms and conditions for use, reproduction,
-and distribution as defined by Sections 1 through 9 of this document.
-
-"Licensor" shall mean the copyright owner or entity authorized by
-the copyright owner that is granting the License.
-
-"Legal Entity" shall mean the union of the acting entity and all
-other entities that control, are controlled by, or are under common
-control with that entity. For the purposes of this definition,
-"control" means (i) the power, direct or indirect, to cause the
-direction or management of such entity, whether by contract or
-otherwise, or (ii) ownership of fifty percent (50%) or more of the
-outstanding shares, or (iii) beneficial ownership of such entity.
-
-"You" (or "Your") shall mean an individual or Legal Entity
-exercising permissions granted by this License.
-
-"Source" form shall mean the preferred form for making modifications,
-including but not limited to software source code, documentation
-source, and configuration files.
-
-"Object" form shall mean any form resulting from mechanical
-transformation or translation of a Source form, including but
-not limited to compiled object code, generated documentation,
-and conversions to other media types.
-
-"Work" shall mean the work of authorship, whether in Source or
-Object form, made available under the License, as indicated by a
-copyright notice that is included in or attached to the work
-(an example is provided in the Appendix below).
-
-"Derivative Works" shall mean any work, whether in Source or Object
-form, that is based on (or derived from) the Work and for which the
-editorial revisions, annotations, elaborations, or other modifications
-represent, as a whole, an original work of authorship. For the purposes
-of this License, Derivative Works shall not include works that remain
-separable from, or merely link (or bind by name) to the interfaces of,
-the Work and Derivative Works thereof.
-
-"Contribution" shall mean any work of authorship, including
-the original version of the Work and any modifications or additions
-to that Work or Derivative Works thereof, that is intentionally
-submitted to Licensor for inclusion in the Work by the copyright owner
-or by an individual or Legal Entity authorized to submit on behalf of
-the copyright owner. For the purposes of this definition, "submitted"
-means any form of electronic, verbal, or written communication sent
-to the Licensor or its representatives, including but not limited to
-communication on electronic mailing lists, source code control systems,
-and issue tracking systems that are managed by, or on behalf of, the
-Licensor for the purpose of discussing and improving the Work, but
-excluding communication that is conspicuously marked or otherwise
-designated in writing by the copyright owner as "Not a Contribution."
-
-"Contributor" shall mean Licensor and any individual or Legal Entity
-on behalf of whom a Contribution has been received by Licensor and
-subsequently incorporated within the Work.
-
-2. Grant of Copyright License. Subject to the terms and conditions of
-this License, each Contributor hereby grants to You a perpetual,
-worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-copyright license to reproduce, prepare Derivative Works of,
-publicly display, publicly perform, sublicense, and distribute the
-Work and such Derivative Works in Source or Object form.
-
-3. Grant of Patent License. Subject to the terms and conditions of
-this License, each Contributor hereby grants to You a perpetual,
-worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-(except as stated in this section) patent license to make, have made,
-use, offer to sell, sell, import, and otherwise transfer the Work,
-where such license applies only to those patent claims licensable
-by such Contributor that are necessarily infringed by their
-Contribution(s) alone or by combination of their Contribution(s)
-with the Work to which such Contribution(s) was submitted. If You
-institute patent litigation against any entity (including a
-cross-claim or counterclaim in a lawsuit) alleging that the Work
-or a Contribution incorporated within the Work constitutes direct
-or contributory patent infringement, then any patent licenses
-granted to You under this License for that Work shall terminate
-as of the date such litigation is filed.
-
-4. Redistribution. You may reproduce and distribute copies of the
-Work or Derivative Works thereof in any medium, with or without
-modifications, and in Source or Object form, provided that You
-meet the following conditions:
-
-(a) You must give any other recipients of the Work or
-Derivative Works a copy of this License; and
-
-(b) You must cause any modified files to carry prominent notices
-stating that You changed the files; and
-
-(c) You must retain, in the Source form of any Derivative Works
-that You distribute, all copyright, patent, trademark, and
-attribution notices from the Source form of the Work,
-excluding those notices that do not pertain to any part of
-the Derivative Works; and
-
-(d) If the Work includes a "NOTICE" text file as part of its
-distribution, then any Derivative Works that You distribute must
-include a readable copy of the attribution notices contained
-within such NOTICE file, excluding those notices that do not
-pertain to any part of the Derivative Works, in at least one
-of the following places: within a NOTICE text file distributed
-as part of the Derivative Works; within the Source form or
-documentation, if provided along with the Derivative Works; or,
-within a display generated by the Derivative Works, if and
-wherever such third-party notices normally appear. The contents
-of the NOTICE file are for informational purposes only and
-do not modify the License. You may add Your own attribution
-notices within Derivative Works that You distribute, alongside
-or as an addendum to the NOTICE text from the Work, provided
-that such additional attribution notices cannot be construed
-as modifying the License.
-
-You may add Your own copyright statement to Your modifications and
-may provide additional or different license terms and conditions
-for use, reproduction, or distribution of Your modifications, or
-for any such Derivative Works as a whole, provided Your use,
-reproduction, and distribution of the Work otherwise complies with
-the conditions stated in this License.
-
-5. Submission of Contributions. Unless You explicitly state otherwise,
-any Contribution intentionally submitted for inclusion in the Work
-by You to the Licensor shall be under the terms and conditions of
-this License, without any additional terms or conditions.
-Notwithstanding the above, nothing herein shall supersede or modify
-the terms of any separate license agreement you may have executed
-with Licensor regarding such Contributions.
-
-6. Trademarks. This License does not grant permission to use the trade
-names, trademarks, service marks, or product names of the Licensor,
-except as required for reasonable and customary use in describing the
-origin of the Work and reproducing the content of the NOTICE file.
-
-7. Disclaimer of Warranty. Unless required by applicable law or
-agreed to in writing, Licensor provides the Work (and each
-Contributor provides its Contributions) on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
-implied, including, without limitation, any warranties or conditions
-of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
-PARTICULAR PURPOSE. You are solely responsible for determining the
-appropriateness of using or redistributing the Work and assume any
-risks associated with Your exercise of permissions under this License.
-
-8. Limitation of Liability. In no event and under no legal theory,
-whether in tort (including negligence), contract, or otherwise,
-unless required by applicable law (such as deliberate and grossly
-negligent acts) or agreed to in writing, shall any Contributor be
-liable to You for damages, including any direct, indirect, special,
-incidental, or consequential damages of any character arising as a
-result of this License or out of the use or inability to use the
-Work (including but not limited to damages for loss of goodwill,
-work stoppage, computer failure or malfunction, or any and all
-other commercial damages or losses), even if such Contributor
-has been advised of the possibility of such damages.
-
-9. Accepting Warranty or Additional Liability. While redistributing
-the Work or Derivative Works thereof, You may choose to offer,
-and charge a fee for, acceptance of support, warranty, indemnity,
-or other liability obligations and/or rights consistent with this
-License. However, in accepting such obligations, You may act only
-on Your own behalf and on Your sole responsibility, not on behalf
-of any other Contributor, and only if You agree to indemnify,
-defend, and hold each Contributor harmless for any liability
-incurred by, or claims asserted against, such Contributor by reason
-of your accepting any such warranty or additional liability.
-
-END OF TERMS AND CONDITIONS
-
-APPENDIX: How to apply the Apache License to your work.
-
-To apply the Apache License to your work, attach the following
-boilerplate notice, with the fields enclosed by brackets "[]"
-replaced with your own identifying information. (Don't include
-the brackets!) The text should be enclosed in the appropriate
-comment syntax for the file format. We also recommend that a
-file or class name and description of purpose be included on the
-same "printed page" as the copyright notice for easier
-identification within third-party archives.
-
-Copyright [yyyy] [name of copyright owner]
-
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-
-http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
ComfyUI/custom_nodes/ComfyUI-GGUF/README.md
DELETED
@@ -1,49 +0,0 @@
-# ComfyUI-GGUF
-GGUF Quantization support for native ComfyUI models
-
-This is currently very much WIP. These custom nodes provide support for model files stored in the GGUF format popularized by [llama.cpp](https://github.com/ggerganov/llama.cpp).
-
-While quantization wasn't feasible for regular UNET models (conv2d), transformer/DiT models such as flux seem less affected by quantization. This allows running it in much lower bits per weight variable bitrate quants on low-end GPUs. For further VRAM savings, a node to load a quantized version of the T5 text encoder is also included.
-
-
-
-Note: The "Force/Set CLIP Device" is **NOT** part of this node pack. Do not install it if you only have one GPU. Do not set it to cuda:0 then complain about OOM errors if you do not undestand what it is for. There is not need to copy the workflow above, just use your own workflow and replace the stock "Load Diffusion Model" with the "Unet Loader (GGUF)" node.
-
-## Installation
-
-> [!IMPORTANT]
-> Make sure your ComfyUI is on a recent-enough version to support custom ops when loading the UNET-only.
-
-To install the custom node normally, git clone this repository into your custom nodes folder (`ComfyUI/custom_nodes`) and install the only dependency for inference (`pip install --upgrade gguf`)
-
-```
-git clone https://github.com/city96/ComfyUI-GGUF
-```
-
-To install the custom node on a standalone ComfyUI release, open a CMD inside the "ComfyUI_windows_portable" folder (where your `run_nvidia_gpu.bat` file is) and use the following commands:
-
-```
-git clone https://github.com/city96/ComfyUI-GGUF ComfyUI/custom_nodes/ComfyUI-GGUF
-.\python_embeded\python.exe -s -m pip install -r .\ComfyUI\custom_nodes\ComfyUI-GGUF\requirements.txt
-```
-
-On MacOS sequoia, torch 2.4.1 seems to be required, as 2.6.X nightly versions cause a "M1 buffer is not large enough" error. See [this issue](https://github.com/city96/ComfyUI-GGUF/issues/107) for more information/workarounds.
-
-## Usage
-
-Simply use the GGUF Unet loader found under the `bootleg` category. Place the .gguf model files in your `ComfyUI/models/unet` folder.
-
-LoRA loading is experimental but it should work with just the built-in LoRA loader node(s).
-
-Pre-quantized models:
-
-- [flux1-dev GGUF](https://huggingface.co/city96/FLUX.1-dev-gguf)
-- [flux1-schnell GGUF](https://huggingface.co/city96/FLUX.1-schnell-gguf)
-- [stable-diffusion-3.5-large GGUF](https://huggingface.co/city96/stable-diffusion-3.5-large-gguf)
-- [stable-diffusion-3.5-large-turbo GGUF](https://huggingface.co/city96/stable-diffusion-3.5-large-turbo-gguf)
-
-Initial support for quantizing T5 has also been added recently, these can be used using the various `*CLIPLoader (gguf)` nodes which can be used inplace of the regular ones. For the CLIP model, use whatever model you were using before for CLIP. The loader can handle both types of files - `gguf` and regular `safetensors`/`bin`.
-
-- [t5_v1.1-xxl GGUF](https://huggingface.co/city96/t5-v1_1-xxl-encoder-gguf)
-
-See the instructions in the [tools](https://github.com/city96/ComfyUI-GGUF/tree/main/tools) folder for how to create your own quants.
ComfyUI/custom_nodes/ComfyUI-GGUF/__init__.py
DELETED
@@ -1,9 +0,0 @@
-# only import if running as a custom node
-try:
-    import comfy.utils
-except ImportError:
-    pass
-else:
-    from .nodes import NODE_CLASS_MAPPINGS
-    NODE_DISPLAY_NAME_MAPPINGS = {k:v.TITLE for k,v in NODE_CLASS_MAPPINGS.items()}
-    __all__ = ['NODE_CLASS_MAPPINGS', 'NODE_DISPLAY_NAME_MAPPINGS']
ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py
DELETED
@@ -1,248 +0,0 @@
-# (c) City96 || Apache-2.0 (apache.org/licenses/LICENSE-2.0)
-import gguf
-import torch
-from tqdm import tqdm
-
-
-TORCH_COMPATIBLE_QTYPES = (None, gguf.GGMLQuantizationType.F32, gguf.GGMLQuantizationType.F16)
-
-def is_torch_compatible(tensor):
-    return tensor is None or getattr(tensor, "tensor_type", None) in TORCH_COMPATIBLE_QTYPES
-
-def is_quantized(tensor):
-    return not is_torch_compatible(tensor)
-
-def dequantize_tensor(tensor, dtype=None, dequant_dtype=None):
-    qtype = getattr(tensor, "tensor_type", None)
-    oshape = getattr(tensor, "tensor_shape", tensor.shape)
-
-    if qtype in TORCH_COMPATIBLE_QTYPES:
-        return tensor.to(dtype)
-    elif qtype in dequantize_functions:
-        dequant_dtype = dtype if dequant_dtype == "target" else dequant_dtype
-        return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
-    else:
-        # this is incredibly slow
-        tqdm.write(f"Falling back to numpy dequant for qtype: {qtype}")
-        new = gguf.quants.dequantize(tensor.cpu().numpy(), qtype)
-        return torch.from_numpy(new).to(tensor.device, dtype=dtype)
-
-def dequantize(data, qtype, oshape, dtype=None):
-    """
-    Dequantize tensor back to usable shape/dtype
-    """
-    block_size, type_size = gguf.GGML_QUANT_SIZES[qtype]
-    dequantize_blocks = dequantize_functions[qtype]
-
-    rows = data.reshape(
-        (-1, data.shape[-1])
-    ).view(torch.uint8)
-
-    n_blocks = rows.numel() // type_size
-    blocks = rows.reshape((n_blocks, type_size))
-    blocks = dequantize_blocks(blocks, block_size, type_size, dtype)
-    return blocks.reshape(oshape)
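The removed `dequantize()` splits the raw byte buffer into fixed-size blocks, where each block of `type_size` bytes encodes `block_size` weights. The arithmetic can be sketched in plain Python as below. The `QUANT_SIZES` table is an assumption: it is hard-coded from the block layouts that the split sizes in this file imply, whereas the original code looks the pairs up in `gguf.GGML_QUANT_SIZES`. The helper name `block_stats` is hypothetical.

```python
# (block_size, type_size): weights per block and bytes per block.
# Hard-coded here as an assumption; the removed code reads gguf.GGML_QUANT_SIZES.
QUANT_SIZES = {
    "Q8_0": (32, 34),    # 2-byte fp16 scale + 32 int8 values
    "Q4_0": (32, 18),    # 2-byte fp16 scale + 16 bytes of packed 4-bit values
    "Q4_K": (256, 144),  # 2 + 2 + 12 scale bytes + 128 bytes of packed 4-bit values
    "Q6_K": (256, 210),  # 128 + 64 + 16 + 2 bytes
}


def block_stats(name, n_weights):
    """Return (n_blocks, n_bytes, bits_per_weight) for a tensor of n_weights values."""
    block_size, type_size = QUANT_SIZES[name]
    n_blocks = n_weights // block_size
    n_bytes = n_blocks * type_size
    bits_per_weight = type_size * 8 / block_size
    return n_blocks, n_bytes, bits_per_weight


for name in QUANT_SIZES:
    print(name, block_stats(name, 4096))
```

The fractional bits-per-weight figures (for example 4.5 rather than 4 for `Q4_0`) come from the per-block scale bytes stored alongside the packed values.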
-
-def to_uint32(x):
-    # no uint32 :(
-    x = x.view(torch.uint8).to(torch.int32)
-    return (x[:, 0] | x[:, 1] << 8 | x[:, 2] << 16 | x[:, 3] << 24).unsqueeze(1)
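`to_uint32` assembles four bytes into one 32-bit word, least significant byte first, so that the Q5 routines can later shift out one bit per weight. A plain-Python sketch of the same byte arithmetic follows; both function names are hypothetical. One difference is worth noting: Python integers are unbounded, so the result here is always non-negative, while the torch version stores the word in a signed `int32`.

```python
def le_bytes_to_uint32(b):
    """Combine four bytes (least significant first) into one integer."""
    assert len(b) == 4
    # << binds tighter than |, so this matches the expression in to_uint32.
    return b[0] | b[1] << 8 | b[2] << 16 | b[3] << 24


def unpack_bits(word, n=32):
    """Return the n lowest bits of word, bit 0 first."""
    return [(word >> i) & 1 for i in range(n)]


raw = bytes([0x78, 0x56, 0x34, 0x12])
word = le_bytes_to_uint32(raw)
print(hex(word))
print(unpack_bits(0b101, 4))
```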
-
-def split_block_dims(blocks, *args):
-    n_max = blocks.shape[1]
-    dims = list(args) + [n_max - sum(args)]
-    return torch.split(blocks, dims, dim=1)
-
-# Full weights #
-def dequantize_blocks_BF16(blocks, block_size, type_size, dtype=None):
-    return (blocks.view(torch.int16).to(torch.int32) << 16).view(torch.float32)
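The BF16 path relies on bfloat16 being the upper 16 bits of an IEEE float32, so shifting the pattern left by 16 and reinterpreting the result yields the float value. A minimal standard-library sketch of that bit trick is shown below; `bf16_bits_to_float` is a hypothetical name and it handles one value at a time rather than a tensor.

```python
import struct


def bf16_bits_to_float(bits):
    """Expand a 16-bit bfloat16 pattern to a Python float via float32."""
    # Place the 16 bits in the top half of a 32-bit word, then reinterpret as float32.
    word = (bits & 0xFFFF) << 16
    return struct.unpack("<f", struct.pack("<I", word))[0]


for pattern in (0x3F80, 0xC000, 0x4049):
    print(hex(pattern), bf16_bits_to_float(pattern))
```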
-
-# Legacy Quants #
-def dequantize_blocks_Q8_0(blocks, block_size, type_size, dtype=None):
-    d, x = split_block_dims(blocks, 2)
-    d = d.view(torch.float16).to(dtype)
-    x = x.view(torch.int8)
-    return (d * x)
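In `Q8_0`, each block is a 2-byte fp16 scale `d` followed by 32 signed bytes, and every weight is `d * x`. The sketch below decodes a single block with the standard `struct` module. The function name `dequantize_q8_0_block` is hypothetical, and little-endian byte order is assumed.

```python
import struct


def dequantize_q8_0_block(block):
    """Decode one 34-byte Q8_0 block into 32 floats (little-endian assumed)."""
    assert len(block) == 34
    # "e" is an IEEE half-precision float, "32b" is 32 signed bytes.
    d, *q = struct.unpack("<e32b", block)
    return [d * x for x in q]


# Build a synthetic block: scale 0.5 and quantized values -16..15.
block = struct.pack("<e32b", 0.5, *range(-16, 16))
values = dequantize_q8_0_block(block)
print(values[:4], values[-1])
```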
-
-def dequantize_blocks_Q5_1(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    d, m, qh, qs = split_block_dims(blocks, 2, 2, 4)
-    d = d.view(torch.float16).to(dtype)
-    m = m.view(torch.float16).to(dtype)
-    qh = to_uint32(qh)
-
-    qh = qh.reshape((n_blocks, 1)) >> torch.arange(32, device=d.device, dtype=torch.int32).reshape(1, 32)
-    ql = qs.reshape((n_blocks, -1, 1, block_size // 2)) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape(1, 1, 2, 1)
-    qh = (qh & 1).to(torch.uint8)
-    ql = (ql & 0x0F).reshape((n_blocks, -1))
-
-    qs = (ql | (qh << 4))
-    return (d * qs) + m
-
-def dequantize_blocks_Q5_0(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    d, qh, qs = split_block_dims(blocks, 2, 4)
-    d = d.view(torch.float16).to(dtype)
-    qh = to_uint32(qh)
-
-    qh = qh.reshape(n_blocks, 1) >> torch.arange(32, device=d.device, dtype=torch.int32).reshape(1, 32)
-    ql = qs.reshape(n_blocks, -1, 1, block_size // 2) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape(1, 1, 2, 1)
-
-    qh = (qh & 1).to(torch.uint8)
-    ql = (ql & 0x0F).reshape(n_blocks, -1)
-
-    qs = (ql | (qh << 4)).to(torch.int8) - 16
-    return (d * qs)
-
-def dequantize_blocks_Q4_1(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    d, m, qs = split_block_dims(blocks, 2, 2)
-    d = d.view(torch.float16).to(dtype)
-    m = m.view(torch.float16).to(dtype)
-
-    qs = qs.reshape((n_blocks, -1, 1, block_size // 2)) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape(1, 1, 2, 1)
-    qs = (qs & 0x0F).reshape(n_blocks, -1)
-
-    return (d * qs) + m
-
-def dequantize_blocks_Q4_0(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    d, qs = split_block_dims(blocks, 2)
-    d = d.view(torch.float16).to(dtype)
-
-    qs = qs.reshape((n_blocks, -1, 1, block_size // 2)) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape((1, 1, 2, 1))
-    qs = (qs & 0x0F).reshape((n_blocks, -1)).to(torch.int8) - 8
-    return (d * qs)
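`Q4_0` packs two 4-bit values into each byte. The shift by `[0, 4]` followed by the reshape in the removed code yields the 16 low nibbles first and then the 16 high nibbles, each offset by 8 before scaling. The sketch below reproduces that ordering for one block in plain Python; `dequantize_q4_0_block` is a hypothetical name and little-endian byte order is assumed.

```python
import struct


def dequantize_q4_0_block(block):
    """Decode one 18-byte Q4_0 block into 32 floats (little-endian assumed)."""
    assert len(block) == 18
    (d,) = struct.unpack("<e", block[:2])  # fp16 scale
    qs = block[2:]                         # 16 bytes, two 4-bit values each
    low = [b & 0x0F for b in qs]           # weights 0..15
    high = [b >> 4 for b in qs]            # weights 16..31
    return [d * (q - 8) for q in low + high]


# Synthetic block: scale 2.0, every byte 0xF0 (low nibble 0, high nibble 15).
block = struct.pack("<e", 2.0) + bytes([0xF0] * 16)
values = dequantize_q4_0_block(block)
print(values[0], values[16])
```

`Q4_1` differs only in storing a second fp16 value `m` that is added to each weight instead of subtracting the fixed offset of 8.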
-
-# K Quants #
-QK_K = 256
-K_SCALE_SIZE = 12
-
-def get_scale_min(scales):
-    n_blocks = scales.shape[0]
-    scales = scales.view(torch.uint8)
-    scales = scales.reshape((n_blocks, 3, 4))
-
-    d, m, m_d = torch.split(scales, scales.shape[-2] // 3, dim=-2)
-
-    sc = torch.cat([d & 0x3F, (m_d & 0x0F) | ((d >> 2) & 0x30)], dim=-1)
-    min = torch.cat([m & 0x3F, (m_d >> 4) | ((m >> 2) & 0x30)], dim=-1)
-
-    return (sc.reshape((n_blocks, 8)), min.reshape((n_blocks, 8)))
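`get_scale_min` unpacks eight 6-bit scales and eight 6-bit minimums from the 12 scale bytes of a K-quant block. The first four of each sit in the low 6 bits of bytes 0..3 and 4..7. The last four are split between the nibbles of bytes 8..11 and the top two bits of the earlier bytes. The sketch below applies the same masks to a single 12-byte group in plain Python; `unpack_scale_min` is a hypothetical name.

```python
def unpack_scale_min(scales):
    """Unpack 12 bytes into (8 scales, 8 mins), each a 6-bit integer."""
    assert len(scales) == 12
    d, m, m_d = scales[0:4], scales[4:8], scales[8:12]
    # Low 6 bits of d, then low nibble of m_d combined with the top 2 bits of d.
    sc = [x & 0x3F for x in d] + [(y & 0x0F) | ((x >> 2) & 0x30) for x, y in zip(d, m_d)]
    # Low 6 bits of m, then high nibble of m_d combined with the top 2 bits of m.
    mn = [x & 0x3F for x in m] + [(y >> 4) | ((x >> 2) & 0x30) for x, y in zip(m, m_d)]
    return sc, mn


sc, mn = unpack_scale_min(bytes([1, 2, 3, 4, 5, 6, 7, 8, 0x21, 0x43, 0x65, 0x87]))
print(sc, mn)
```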
-
-def dequantize_blocks_Q6_K(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    ql, qh, scales, d, = split_block_dims(blocks, QK_K // 2, QK_K // 4, QK_K // 16)
-
-    scales = scales.view(torch.int8).to(dtype)
-    d = d.view(torch.float16).to(dtype)
-    d = (d * scales).reshape((n_blocks, QK_K // 16, 1))
-
-    ql = ql.reshape((n_blocks, -1, 1, 64)) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape((1, 1, 2, 1))
-    ql = (ql & 0x0F).reshape((n_blocks, -1, 32))
-    qh = qh.reshape((n_blocks, -1, 1, 32)) >> torch.tensor([0, 2, 4, 6], device=d.device, dtype=torch.uint8).reshape((1, 1, 4, 1))
-    qh = (qh & 0x03).reshape((n_blocks, -1, 32))
-    q = (ql | (qh << 4)).to(torch.int8) - 32
-    q = q.reshape((n_blocks, QK_K // 16, -1))
-
-    return (d * q).reshape((n_blocks, QK_K))
-
-def dequantize_blocks_Q5_K(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    d, dmin, scales, qh, qs = split_block_dims(blocks, 2, 2, K_SCALE_SIZE, QK_K // 8)
-
-    d = d.view(torch.float16).to(dtype)
-    dmin = dmin.view(torch.float16).to(dtype)
-
-    sc, m = get_scale_min(scales)
-
-    d = (d * sc).reshape((n_blocks, -1, 1))
-    dm = (dmin * m).reshape((n_blocks, -1, 1))
-
-    ql = qs.reshape((n_blocks, -1, 1, 32)) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape((1, 1, 2, 1))
-    qh = qh.reshape((n_blocks, -1, 1, 32)) >> torch.tensor([i for i in range(8)], device=d.device, dtype=torch.uint8).reshape((1, 1, 8, 1))
-    ql = (ql & 0x0F).reshape((n_blocks, -1, 32))
-    qh = (qh & 0x01).reshape((n_blocks, -1, 32))
-    q = (ql | (qh << 4))
-
-    return (d * q - dm).reshape((n_blocks, QK_K))
-
-def dequantize_blocks_Q4_K(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    d, dmin, scales, qs = split_block_dims(blocks, 2, 2, K_SCALE_SIZE)
-    d = d.view(torch.float16).to(dtype)
-    dmin = dmin.view(torch.float16).to(dtype)
-
-    sc, m = get_scale_min(scales)
-
-    d = (d * sc).reshape((n_blocks, -1, 1))
-    dm = (dmin * m).reshape((n_blocks, -1, 1))
-
-    qs = qs.reshape((n_blocks, -1, 1, 32)) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape((1, 1, 2, 1))
-    qs = (qs & 0x0F).reshape((n_blocks, -1, 32))
-
-    return (d * qs - dm).reshape((n_blocks, QK_K))
-
-def dequantize_blocks_Q3_K(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    hmask, qs, scales, d = split_block_dims(blocks, QK_K // 8, QK_K // 4, 12)
-    d = d.view(torch.float16).to(dtype)
-
-    lscales, hscales = scales[:, :8], scales[:, 8:]
-    lscales = lscales.reshape((n_blocks, 1, 8)) >> torch.tensor([0, 4], device=d.device, dtype=torch.uint8).reshape((1, 2, 1))
-    lscales = lscales.reshape((n_blocks, 16))
-    hscales = hscales.reshape((n_blocks, 1, 4)) >> torch.tensor([0, 2, 4, 6], device=d.device, dtype=torch.uint8).reshape((1, 4, 1))
-    hscales = hscales.reshape((n_blocks, 16))
-    scales = (lscales & 0x0F) | ((hscales & 0x03) << 4)
-    scales = (scales.to(torch.int8) - 32)
-
-    dl = (d * scales).reshape((n_blocks, 16, 1))
-
-    ql = qs.reshape((n_blocks, -1, 1, 32)) >> torch.tensor([0, 2, 4, 6], device=d.device, dtype=torch.uint8).reshape((1, 1, 4, 1))
-    qh = hmask.reshape(n_blocks, -1, 1, 32) >> torch.tensor([i for i in range(8)], device=d.device, dtype=torch.uint8).reshape((1, 1, 8, 1))
-    ql = ql.reshape((n_blocks, 16, QK_K // 16)) & 3
-    qh = (qh.reshape((n_blocks, 16, QK_K // 16)) & 1) ^ 1
-    q = (ql.to(torch.int8) - (qh << 2).to(torch.int8))
-
-    return (dl * q).reshape((n_blocks, QK_K))
-
-def dequantize_blocks_Q2_K(blocks, block_size, type_size, dtype=None):
-    n_blocks = blocks.shape[0]
-
-    scales, qs, d, dmin = split_block_dims(blocks, QK_K // 16, QK_K // 4, 2)
-    d = d.view(torch.float16).to(dtype)
-    dmin = dmin.view(torch.float16).to(dtype)
-
-    # (n_blocks, 16, 1)
-    dl = (d * (scales & 0xF)).reshape((n_blocks, QK_K // 16, 1))
-    ml = (dmin * (scales >> 4)).reshape((n_blocks, QK_K // 16, 1))
-
-    shift = torch.tensor([0, 2, 4, 6], device=d.device, dtype=torch.uint8).reshape((1, 1, 4, 1))
-
-    qs = (qs.reshape((n_blocks, -1, 1, 32)) >> shift) & 3
-    qs = qs.reshape((n_blocks, QK_K // 16, 16))
-    qs = dl * qs - ml
-
-    return qs.reshape((n_blocks, -1))
-
-dequantize_functions = {
-    gguf.GGMLQuantizationType.BF16: dequantize_blocks_BF16,
-    gguf.GGMLQuantizationType.Q8_0: dequantize_blocks_Q8_0,
-    gguf.GGMLQuantizationType.Q5_1: dequantize_blocks_Q5_1,
-    gguf.GGMLQuantizationType.Q5_0: dequantize_blocks_Q5_0,
-    gguf.GGMLQuantizationType.Q4_1: dequantize_blocks_Q4_1,
|
| 242 |
-
gguf.GGMLQuantizationType.Q4_0: dequantize_blocks_Q4_0,
|
| 243 |
-
gguf.GGMLQuantizationType.Q6_K: dequantize_blocks_Q6_K,
|
| 244 |
-
gguf.GGMLQuantizationType.Q5_K: dequantize_blocks_Q5_K,
|
| 245 |
-
gguf.GGMLQuantizationType.Q4_K: dequantize_blocks_Q4_K,
|
| 246 |
-
gguf.GGMLQuantizationType.Q3_K: dequantize_blocks_Q3_K,
|
| 247 |
-
gguf.GGMLQuantizationType.Q2_K: dequantize_blocks_Q2_K,
|
| 248 |
-
}
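The Q4_K routine in the listing above packs two 4-bit values into each byte of `qs` and recovers them with a shift by `[0, 4]` followed by `& 0x0F`. The following is a minimal pure-Python sketch of that step for a single chunk, assuming the per-sub-block scales and mins have already been decoded (the job of `get_scale_min`, which is not shown in this part of the diff). The names `unpack_nibbles` and `dequant_chunk` are hypothetical and are not part of ComfyUI-GGUF.

```python
def unpack_nibbles(chunk):
    """Split each byte into its low and high 4-bit values.

    Mirrors `qs >> [0, 4]` followed by `& 0x0F`: all low nibbles of the
    chunk form one sub-block, all high nibbles form the next one.
    """
    low = [b & 0x0F for b in chunk]
    high = [(b >> 4) & 0x0F for b in chunk]
    return low, high


def dequant_chunk(chunk, d, dmin, sc, m):
    """Dequantize one chunk as value = (d * sc) * q - (dmin * m).

    `sc` and `m` hold the (low, high) sub-block scale and min; they are
    passed in already decoded, which is an assumption of this sketch.
    """
    low, high = unpack_nibbles(chunk)
    out = [d * sc[0] * q - dmin * m[0] for q in low]
    out += [d * sc[1] * q - dmin * m[1] for q in high]
    return out


# Two packed bytes yield four 4-bit values (a real chunk has 32 bytes).
chunk = bytes([0x21, 0xF0])
values = dequant_chunk(chunk, d=0.5, dmin=0.25, sc=(2, 4), m=(1, 2))
```

The vectorized version in the listing does the same thing for all blocks at once by broadcasting the shift tensor over a `(n_blocks, -1, 1, 32)` view of `qs`.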
ComfyUI/custom_nodes/ComfyUI-GGUF/loader.py
DELETED
|
@@ -1,353 +0,0 @@
|
|
| 1 |
-
# (c) City96 || Apache-2.0 (apache.org/licenses/LICENSE-2.0)
|
| 2 |
-
import warnings
|
| 3 |
-
import logging
|
| 4 |
-
import torch
|
| 5 |
-
import gguf
|
| 6 |
-
import re
|
| 7 |
-
import os
|
| 8 |
-
|
| 9 |
-
from .ops import GGMLTensor
|
| 10 |
-
from .dequant import is_quantized, dequantize_tensor
|
| 11 |
-
|
| 12 |
-
IMG_ARCH_LIST = {"flux", "sd1", "sdxl", "sd3", "aura", "hidream", "cosmos", "ltxv", "hyvid", "wan", "lumina2", "qwen_image"}
|
| 13 |
-
TXT_ARCH_LIST = {"t5", "t5encoder", "llama", "qwen2vl"}
|
| 14 |
-
VIS_TYPE_LIST = {"clip-vision"}
|
| 15 |
-
|
| 16 |
-
def get_orig_shape(reader, tensor_name):
|
| 17 |
-
field_key = f"comfy.gguf.orig_shape.{tensor_name}"
|
| 18 |
-
field = reader.get_field(field_key)
|
| 19 |
-
if field is None:
|
| 20 |
-
return None
|
| 21 |
-
# Has original shape metadata, so we try to decode it.
|
| 22 |
-
if len(field.types) != 2 or field.types[0] != gguf.GGUFValueType.ARRAY or field.types[1] != gguf.GGUFValueType.INT32:
|
| 23 |
-
raise TypeError(f"Bad original shape metadata for {field_key}: Expected ARRAY of INT32, got {field.types}")
|
| 24 |
-
return torch.Size(tuple(int(field.parts[part_idx][0]) for part_idx in field.data))
|
| 25 |
-
|
| 26 |
-
def get_field(reader, field_name, field_type):
|
| 27 |
-
field = reader.get_field(field_name)
|
| 28 |
-
if field is None:
|
| 29 |
-
return None
|
| 30 |
-
elif field_type == str:
|
| 31 |
-
# extra check here as this is used for checking arch string
|
| 32 |
-
if len(field.types) != 1 or field.types[0] != gguf.GGUFValueType.STRING:
|
| 33 |
-
raise TypeError(f"Bad type for GGUF {field_name} key: expected string, got {field.types!r}")
|
| 34 |
-
return str(field.parts[field.data[-1]], encoding="utf-8")
|
| 35 |
-
elif field_type in [int, float, bool]:
|
| 36 |
-
return field_type(field.parts[field.data[-1]])
|
| 37 |
-
else:
|
| 38 |
-
raise TypeError(f"Unknown field type {field_type}")
|
| 39 |
-
|
| 40 |
-
def get_list_field(reader, field_name, field_type):
|
| 41 |
-
field = reader.get_field(field_name)
|
| 42 |
-
if field is None:
|
| 43 |
-
return None
|
| 44 |
-
elif field_type == str:
|
| 45 |
-
return tuple(str(field.parts[part_idx], encoding="utf-8") for part_idx in field.data)
|
| 46 |
-
elif field_type in [int, float, bool]:
|
| 47 |
-
return tuple(field_type(field.parts[part_idx][0]) for part_idx in field.data)
|
| 48 |
-
else:
|
| 49 |
-
raise TypeError(f"Unknown field type {field_type}")
|
| 50 |
-
|
| 51 |
-
def gguf_sd_loader(path, handle_prefix="model.diffusion_model.", return_arch=False, is_text_model=False):
|
| 52 |
-
"""
|
| 53 |
-
Read state dict as fake tensors
|
| 54 |
-
"""
|
| 55 |
-
reader = gguf.GGUFReader(path)
|
| 56 |
-
|
| 57 |
-
# filter and strip prefix
|
| 58 |
-
has_prefix = False
|
| 59 |
-
if handle_prefix is not None:
|
| 60 |
-
prefix_len = len(handle_prefix)
|
| 61 |
-
tensor_names = set(tensor.name for tensor in reader.tensors)
|
| 62 |
-
has_prefix = any(s.startswith(handle_prefix) for s in tensor_names)
|
| 63 |
-
|
| 64 |
-
tensors = []
|
| 65 |
-
for tensor in reader.tensors:
|
| 66 |
-
sd_key = tensor_name = tensor.name
|
| 67 |
-
if has_prefix:
|
| 68 |
-
if not tensor_name.startswith(handle_prefix):
|
| 69 |
-
continue
|
| 70 |
-
sd_key = tensor_name[prefix_len:]
|
| 71 |
-
tensors.append((sd_key, tensor))
|
| 72 |
-
|
| 73 |
-
# detect and verify architecture
|
| 74 |
-
compat = None
|
| 75 |
-
arch_str = get_field(reader, "general.architecture", str)
|
| 76 |
-
type_str = get_field(reader, "general.type", str)
|
| 77 |
-
if arch_str in [None, "pig"]:
|
| 78 |
-
if is_text_model:
|
| 79 |
-
raise ValueError(f"This text model is incompatible with llama.cpp!\nConsider using the safetensors version\n({path})")
|
| 80 |
-
compat = "sd.cpp" if arch_str is None else arch_str
|
| 81 |
-
# import here to avoid changes to convert.py breaking regular models
|
| 82 |
-
from .tools.convert import detect_arch
|
| 83 |
-
try:
|
| 84 |
-
arch_str = detect_arch(set(val[0] for val in tensors)).arch
|
| 85 |
-
except Exception as e:
|
| 86 |
-
raise ValueError(f"This model is not currently supported - ({e})")
|
| 87 |
-
elif arch_str not in TXT_ARCH_LIST and is_text_model:
|
| 88 |
-
if type_str not in VIS_TYPE_LIST:
|
| 89 |
-
raise ValueError(f"Unexpected text model architecture type in GGUF file: {arch_str!r}")
|
| 90 |
-
elif arch_str not in IMG_ARCH_LIST and not is_text_model:
|
| 91 |
-
raise ValueError(f"Unexpected architecture type in GGUF file: {arch_str!r}")
|
| 92 |
-
|
| 93 |
-
if compat:
|
| 94 |
-
logging.warning(f"Warning: This gguf model file is loaded in compatibility mode '{compat}' [arch:{arch_str}]")
|
| 95 |
-
|
| 96 |
-
# main loading loop
|
| 97 |
-
state_dict = {}
|
| 98 |
-
qtype_dict = {}
|
| 99 |
-
for sd_key, tensor in tensors:
|
| 100 |
-
tensor_name = tensor.name
|
| 101 |
-
# torch_tensor = torch.from_numpy(tensor.data) # mmap
|
| 102 |
-
|
| 103 |
-
# NOTE: line above replaced with this block to avoid persistent numpy warning about mmap
|
| 104 |
-
with warnings.catch_warnings():
|
| 105 |
-
warnings.filterwarnings("ignore", message="The given NumPy array is not writable")
|
| 106 |
-
torch_tensor = torch.from_numpy(tensor.data) # mmap
|
| 107 |
-
|
| 108 |
-
shape = get_orig_shape(reader, tensor_name)
|
| 109 |
-
if shape is None:
|
| 110 |
-
shape = torch.Size(tuple(int(v) for v in reversed(tensor.shape)))
|
| 111 |
-
# Workaround for stable-diffusion.cpp SDXL detection.
|
| 112 |
-
if compat == "sd.cpp" and arch_str == "sdxl":
|
| 113 |
-
if any([tensor_name.endswith(x) for x in (".proj_in.weight", ".proj_out.weight")]):
|
| 114 |
-
while len(shape) > 2 and shape[-1] == 1:
|
| 115 |
-
shape = shape[:-1]
|
| 116 |
-
|
| 117 |
-
# add to state dict
|
| 118 |
-
if tensor.tensor_type in {gguf.GGMLQuantizationType.F32, gguf.GGMLQuantizationType.F16}:
|
| 119 |
-
torch_tensor = torch_tensor.view(*shape)
|
| 120 |
-
state_dict[sd_key] = GGMLTensor(torch_tensor, tensor_type=tensor.tensor_type, tensor_shape=shape)
|
| 121 |
-
|
| 122 |
-
# keep track of loaded tensor types
|
| 123 |
-
tensor_type_str = getattr(tensor.tensor_type, "name", repr(tensor.tensor_type))
|
| 124 |
-
qtype_dict[tensor_type_str] = qtype_dict.get(tensor_type_str, 0) + 1
|
| 125 |
-
|
| 126 |
-
# print loaded tensor type counts
|
| 127 |
-
logging.info("gguf qtypes: " + ", ".join(f"{k} ({v})" for k, v in qtype_dict.items()))
|
| 128 |
-
|
| 129 |
-
# mark largest tensor for vram estimation
|
| 130 |
-
qsd = {k:v for k,v in state_dict.items() if is_quantized(v)}
|
| 131 |
-
if len(qsd) > 0:
|
| 132 |
-
max_key = max(qsd.keys(), key=lambda k: qsd[k].numel())
|
| 133 |
-
state_dict[max_key].is_largest_weight = True
|
| 134 |
-
|
| 135 |
-
if return_arch:
|
| 136 |
-
return (state_dict, arch_str)
|
| 137 |
-
return state_dict
|
| 138 |
-
|
| 139 |
-
# for remapping llama.cpp -> original key names
|
| 140 |
-
T5_SD_MAP = {
|
| 141 |
-
"enc.": "encoder.",
|
| 142 |
-
".blk.": ".block.",
|
| 143 |
-
"token_embd": "shared",
|
| 144 |
-
"output_norm": "final_layer_norm",
|
| 145 |
-
"attn_q": "layer.0.SelfAttention.q",
|
| 146 |
-
"attn_k": "layer.0.SelfAttention.k",
|
| 147 |
-
"attn_v": "layer.0.SelfAttention.v",
|
| 148 |
-
"attn_o": "layer.0.SelfAttention.o",
|
| 149 |
-
"attn_norm": "layer.0.layer_norm",
|
| 150 |
-
"attn_rel_b": "layer.0.SelfAttention.relative_attention_bias",
|
| 151 |
-
"ffn_up": "layer.1.DenseReluDense.wi_1",
|
| 152 |
-
"ffn_down": "layer.1.DenseReluDense.wo",
|
| 153 |
-
"ffn_gate": "layer.1.DenseReluDense.wi_0",
|
| 154 |
-
"ffn_norm": "layer.1.layer_norm",
|
| 155 |
-
}
|
| 156 |
-
|
| 157 |
-
LLAMA_SD_MAP = {
|
| 158 |
-
"blk.": "model.layers.",
|
| 159 |
-
"attn_norm": "input_layernorm",
|
| 160 |
-
"attn_q": "self_attn.q_proj",
|
| 161 |
-
"attn_k": "self_attn.k_proj",
|
| 162 |
-
"attn_v": "self_attn.v_proj",
|
| 163 |
-
"attn_output": "self_attn.o_proj",
|
| 164 |
-
"ffn_up": "mlp.up_proj",
|
| 165 |
-
"ffn_down": "mlp.down_proj",
|
| 166 |
-
"ffn_gate": "mlp.gate_proj",
|
| 167 |
-
"ffn_norm": "post_attention_layernorm",
|
| 168 |
-
"token_embd": "model.embed_tokens",
|
| 169 |
-
"output_norm": "model.norm",
|
| 170 |
-
"output.weight": "lm_head.weight",
|
| 171 |
-
}
|
| 172 |
-
|
| 173 |
-
CLIP_VISION_SD_MAP = {
|
| 174 |
-
"mm.": "visual.merger.mlp.",
|
| 175 |
-
"v.post_ln.": "visual.merger.ln_q.",
|
| 176 |
-
"v.patch_embd": "visual.patch_embed.proj",
|
| 177 |
-
"v.blk.": "visual.blocks.",
|
| 178 |
-
"ffn_up": "mlp.up_proj",
|
| 179 |
-
"ffn_down": "mlp.down_proj",
|
| 180 |
-
"ffn_gate": "mlp.gate_proj",
|
| 181 |
-
"attn_out.": "attn.proj.",
|
| 182 |
-
"ln1.": "norm1.",
|
| 183 |
-
"ln2.": "norm2.",
|
| 184 |
-
}
|
| 185 |
-
|
| 186 |
-
def sd_map_replace(raw_sd, key_map):
|
| 187 |
-
sd = {}
|
| 188 |
-
for k,v in raw_sd.items():
|
| 189 |
-
for s,d in key_map.items():
|
| 190 |
-
k = k.replace(s,d)
|
| 191 |
-
sd[k] = v
|
| 192 |
-
return sd
|
| 193 |
-
|
| 194 |
-
def llama_permute(raw_sd, n_head, n_head_kv):
|
| 195 |
-
# Reverse version of LlamaModel.permute in llama.cpp convert script
|
| 196 |
-
sd = {}
|
| 197 |
-
permute = lambda x,h: x.reshape(h, x.shape[0] // h // 2, 2, *x.shape[1:]).swapaxes(1, 2).reshape(x.shape)
|
| 198 |
-
for k,v in raw_sd.items():
|
| 199 |
-
if k.endswith(("q_proj.weight", "q_proj.bias")):
|
| 200 |
-
v.data = permute(v.data, n_head)
|
| 201 |
-
if k.endswith(("k_proj.weight", "k_proj.bias")):
|
| 202 |
-
v.data = permute(v.data, n_head_kv)
|
| 203 |
-
sd[k] = v
|
| 204 |
-
return sd
|
| 205 |
-
|
| 206 |
-
def strip_quant_suffix(name):
|
| 207 |
-
pattern = r"[-_]?(?:ud-)?i?q[0-9]_[a-z0-9_\-]{1,8}$"
|
| 208 |
-
match = re.search(pattern, name, re.IGNORECASE)
|
| 209 |
-
if match:
|
| 210 |
-
name = name[:match.start()]
|
| 211 |
-
return name
|
| 212 |
-
|
| 213 |
-
def gguf_mmproj_loader(path):
|
| 214 |
-
# Reverse version of Qwen2VLVisionModel.modify_tensors
|
| 215 |
-
logging.info("Attempting to find mmproj file for text encoder...")
|
| 216 |
-
|
| 217 |
-
# get name to match w/o quant suffix
|
| 218 |
-
tenc_fname = os.path.basename(path)
|
| 219 |
-
tenc = os.path.splitext(tenc_fname)[0].lower()
|
| 220 |
-
tenc = strip_quant_suffix(tenc)
|
| 221 |
-
|
| 222 |
-
# try and find matching mmproj
|
| 223 |
-
target = []
|
| 224 |
-
root = os.path.dirname(path)
|
| 225 |
-
for fname in os.listdir(root):
|
| 226 |
-
name, ext = os.path.splitext(fname)
|
| 227 |
-
if ext.lower() != ".gguf":
|
| 228 |
-
continue
|
| 229 |
-
if "mmproj" not in name.lower():
|
| 230 |
-
continue
|
| 231 |
-
if tenc in name.lower():
|
| 232 |
-
target.append(fname)
|
| 233 |
-
|
| 234 |
-
if len(target) == 0:
|
| 235 |
-
logging.error(f"Error: Can't find mmproj file for '{tenc_fname}' (matching:'{tenc}')! Qwen-Image-Edit will be broken!")
|
| 236 |
-
return {}
|
| 237 |
-
if len(target) > 1:
|
| 238 |
-
logging.error(f"Ambiguous mmproj for text encoder '{tenc_fname}', will use first match.")
|
| 239 |
-
|
| 240 |
-
logging.info(f"Using mmproj '{target[0]}' for text encoder '{tenc_fname}'.")
|
| 241 |
-
target = os.path.join(root, target[0])
|
| 242 |
-
vsd = gguf_sd_loader(target, is_text_model=True)
|
| 243 |
-
|
| 244 |
-
# concat 4D to 5D
|
| 245 |
-
if "v.patch_embd.weight.1" in vsd:
|
| 246 |
-
w1 = dequantize_tensor(vsd.pop("v.patch_embd.weight"), dtype=torch.float32)
|
| 247 |
-
w2 = dequantize_tensor(vsd.pop("v.patch_embd.weight.1"), dtype=torch.float32)
|
| 248 |
-
vsd["v.patch_embd.weight"] = torch.stack([w1, w2], dim=2)
|
| 249 |
-
|
| 250 |
-
# run main replacement
|
| 251 |
-
vsd = sd_map_replace(vsd, CLIP_VISION_SD_MAP)
|
| 252 |
-
|
| 253 |
-
# handle split Q/K/V
|
| 254 |
-
if "visual.blocks.0.attn_q.weight" in vsd:
|
| 255 |
-
attns = {}
|
| 256 |
-
# filter out attentions + group
|
| 257 |
-
for k,v in vsd.items():
|
| 258 |
-
if any(x in k for x in ["attn_q", "attn_k", "attn_v"]):
|
| 259 |
-
k_attn, k_name = k.rsplit(".attn_", 1)
|
| 260 |
-
k_attn += ".attn.qkv." + k_name.split(".")[-1]
|
| 261 |
-
if k_attn not in attns:
|
| 262 |
-
attns[k_attn] = {}
|
| 263 |
-
attns[k_attn][k_name] = dequantize_tensor(
|
| 264 |
-
v, dtype=(torch.bfloat16 if is_quantized(v) else torch.float16)
|
| 265 |
-
)
|
| 266 |
-
|
| 267 |
-
# recombine
|
| 268 |
-
for k,v in attns.items():
|
| 269 |
-
suffix = k.split(".")[-1]
|
| 270 |
-
vsd[k] = torch.cat([
|
| 271 |
-
v[f"q.{suffix}"],
|
| 272 |
-
v[f"k.{suffix}"],
|
| 273 |
-
v[f"v.{suffix}"],
|
| 274 |
-
], dim=0)
|
| 275 |
-
del attns
|
| 276 |
-
|
| 277 |
-
return vsd
|
| 278 |
-
|
| 279 |
-
def gguf_tokenizer_loader(path, temb_shape):
|
| 280 |
-
# convert gguf tokenizer to spiece
|
| 281 |
-
logging.info("Attempting to recreate sentencepiece tokenizer from GGUF file metadata...")
|
| 282 |
-
try:
|
| 283 |
-
from sentencepiece import sentencepiece_model_pb2 as model
|
| 284 |
-
except ImportError:
|
| 285 |
-
raise ImportError("Please make sure sentencepiece and protobuf are installed.\npip install sentencepiece protobuf")
|
| 286 |
-
spm = model.ModelProto()
|
| 287 |
-
|
| 288 |
-
reader = gguf.GGUFReader(path)
|
| 289 |
-
|
| 290 |
-
if get_field(reader, "tokenizer.ggml.model", str) == "t5":
|
| 291 |
-
if temb_shape == (256384, 4096): # probably UMT5
|
| 292 |
-
spm.trainer_spec.model_type = 1 # Unigram (do we have a T5 w/ BPE?)
|
| 293 |
-
else:
|
| 294 |
-
raise NotImplementedError("Unknown model, can't set tokenizer!")
|
| 295 |
-
else:
|
| 296 |
-
raise NotImplementedError("Unknown model, can't set tokenizer!")
|
| 297 |
-
|
| 298 |
-
spm.normalizer_spec.add_dummy_prefix = get_field(reader, "tokenizer.ggml.add_space_prefix", bool)
|
| 299 |
-
spm.normalizer_spec.remove_extra_whitespaces = get_field(reader, "tokenizer.ggml.remove_extra_whitespaces", bool)
|
| 300 |
-
|
| 301 |
-
tokens = get_list_field(reader, "tokenizer.ggml.tokens", str)
|
| 302 |
-
scores = get_list_field(reader, "tokenizer.ggml.scores", float)
|
| 303 |
-
toktypes = get_list_field(reader, "tokenizer.ggml.token_type", int)
|
| 304 |
-
|
| 305 |
-
for idx, (token, score, toktype) in enumerate(zip(tokens, scores, toktypes)):
|
| 306 |
-
# # These aren't present in the original?
|
| 307 |
-
# if toktype == 5 and idx >= temb_shape[0]%1000):
|
| 308 |
-
# continue
|
| 309 |
-
|
| 310 |
-
piece = spm.SentencePiece()
|
| 311 |
-
piece.piece = token
|
| 312 |
-
piece.score = score
|
| 313 |
-
piece.type = toktype
|
| 314 |
-
spm.pieces.append(piece)
|
| 315 |
-
|
| 316 |
-
# unsure if any of these are correct
|
| 317 |
-
spm.trainer_spec.byte_fallback = True
|
| 318 |
-
spm.trainer_spec.vocab_size = len(tokens) # split off unused?
|
| 319 |
-
spm.trainer_spec.max_sentence_length = 4096
|
| 320 |
-
spm.trainer_spec.eos_id = get_field(reader, "tokenizer.ggml.eos_token_id", int)
|
| 321 |
-
spm.trainer_spec.pad_id = get_field(reader, "tokenizer.ggml.padding_token_id", int)
|
| 322 |
-
|
| 323 |
-
logging.info(f"Created tokenizer with vocab size of {len(spm.pieces)}")
|
| 324 |
-
del reader
|
| 325 |
-
return torch.ByteTensor(list(spm.SerializeToString()))
|
| 326 |
-
|
| 327 |
-
def gguf_clip_loader(path):
|
| 328 |
-
sd, arch = gguf_sd_loader(path, return_arch=True, is_text_model=True)
|
| 329 |
-
if arch in {"t5", "t5encoder"}:
|
| 330 |
-
temb_key = "token_embd.weight"
|
| 331 |
-
if temb_key in sd and sd[temb_key].shape == (256384, 4096):
|
| 332 |
-
# non-standard Comfy-Org tokenizer
|
| 333 |
-
sd["spiece_model"] = gguf_tokenizer_loader(path, sd[temb_key].shape)
|
| 334 |
-
# TODO: dequantizing token embed here is janky but otherwise we OOM due to tensor being massive.
|
| 335 |
-
logging.warning(f"Dequantizing {temb_key} to prevent runtime OOM.")
|
| 336 |
-
sd[temb_key] = dequantize_tensor(sd[temb_key], dtype=torch.float16)
|
| 337 |
-
sd = sd_map_replace(sd, T5_SD_MAP)
|
| 338 |
-
elif arch in {"llama", "qwen2vl"}:
|
| 339 |
-
# TODO: pass model_options["vocab_size"] to loader somehow
|
| 340 |
-
temb_key = "token_embd.weight"
|
| 341 |
-
if temb_key in sd and sd[temb_key].shape[0] >= (64 * 1024):
|
| 342 |
-
# See note above for T5.
|
| 343 |
-
logging.warning(f"Dequantizing {temb_key} to prevent runtime OOM.")
|
| 344 |
-
sd[temb_key] = dequantize_tensor(sd[temb_key], dtype=torch.float16)
|
| 345 |
-
sd = sd_map_replace(sd, LLAMA_SD_MAP)
|
| 346 |
-
if arch == "llama":
|
| 347 |
-
sd = llama_permute(sd, 32, 8) # L3
|
| 348 |
-
if arch == "qwen2vl":
|
| 349 |
-
vsd = gguf_mmproj_loader(path)
|
| 350 |
-
sd.update(vsd)
|
| 351 |
-
else:
|
| 352 |
-
pass
|
| 353 |
-
return sd
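Two helpers in the `loader.py` listing above, `strip_quant_suffix` and `sd_map_replace`, operate on plain strings and can be tried without any GGUF file. The sketch below restates both as they appear in the listing and runs them on made-up inputs. The file names are illustrative only, and `KEY_MAP` is a three-entry excerpt of `LLAMA_SD_MAP`, not the full table.

```python
import re


def strip_quant_suffix(name):
    # Same pattern as in the listing: optional separator, optional "ud-",
    # optional "i", then "q<digit>_" and 1 to 8 trailing characters.
    pattern = r"[-_]?(?:ud-)?i?q[0-9]_[a-z0-9_\-]{1,8}$"
    match = re.search(pattern, name, re.IGNORECASE)
    return name[:match.start()] if match else name


def sd_map_replace(raw_sd, key_map):
    # Apply every substring replacement, in dict order, to every key.
    sd = {}
    for k, v in raw_sd.items():
        for s, d in key_map.items():
            k = k.replace(s, d)
        sd[k] = v
    return sd


# Excerpt of LLAMA_SD_MAP, shortened for illustration.
KEY_MAP = {
    "blk.": "model.layers.",
    "attn_q": "self_attn.q_proj",
    "ffn_up": "mlp.up_proj",
}

# Hypothetical file stems; the last one has no quantization suffix.
names = [
    strip_quant_suffix(n)
    for n in ("encoder-q4_k_m", "encoder-ud-q4_k_xl", "encoder-iq4_xs", "encoder-f16")
]

# Placeholder values stand in for tensors.
remapped = sd_map_replace(
    {"blk.0.attn_q.weight": 1, "blk.0.ffn_up.weight": 2}, KEY_MAP
)
```

Because `sd_map_replace` applies plain substring replacement in dictionary order, a map entry whose replacement text contains another entry's search string would be rewritten twice, so the order of the map entries matters.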
ComfyUI/custom_nodes/ComfyUI-GGUF/nodes.py
DELETED
|
@@ -1,305 +0,0 @@
|
|
| 1 |
-
# (c) City96 || Apache-2.0 (apache.org/licenses/LICENSE-2.0)
|
| 2 |
-
import torch
|
| 3 |
-
import logging
|
| 4 |
-
import collections
|
| 5 |
-
|
| 6 |
-
import nodes
|
| 7 |
-
import comfy.sd
|
| 8 |
-
import comfy.lora
|
| 9 |
-
import comfy.float
|
| 10 |
-
import comfy.utils
|
| 11 |
-
import comfy.model_patcher
|
| 12 |
-
import comfy.model_management
|
| 13 |
-
import folder_paths
|
| 14 |
-
|
| 15 |
-
from .ops import GGMLOps, move_patch_to_device
|
| 16 |
-
from .loader import gguf_sd_loader, gguf_clip_loader
|
| 17 |
-
from .dequant import is_quantized, is_torch_compatible
|
| 18 |
-
|
| 19 |
-
def update_folder_names_and_paths(key, targets=[]):
|
| 20 |
-
# check for existing key
|
| 21 |
-
base = folder_paths.folder_names_and_paths.get(key, ([], {}))
|
| 22 |
-
base = base[0] if isinstance(base[0], (list, set, tuple)) else []
|
| 23 |
-
# find base key & add w/ fallback, sanity check + warning
|
| 24 |
-
target = next((x for x in targets if x in folder_paths.folder_names_and_paths), targets[0])
|
| 25 |
-
orig, _ = folder_paths.folder_names_and_paths.get(target, ([], {}))
|
| 26 |
-
folder_paths.folder_names_and_paths[key] = (orig or base, {".gguf"})
|
| 27 |
-
if base and base != orig:
|
| 28 |
-
logging.warning(f"Unknown file list already present on key {key}: {base}")
|
| 29 |
-
|
| 30 |
-
# Add custom keys for files ending in .gguf
|
| 31 |
-
update_folder_names_and_paths("unet_gguf", ["diffusion_models", "unet"])
|
| 32 |
-
update_folder_names_and_paths("clip_gguf", ["text_encoders", "clip"])
|
| 33 |
-
|
| 34 |
-
class GGUFModelPatcher(comfy.model_patcher.ModelPatcher):
|
| 35 |
-
patch_on_device = False
|
| 36 |
-
|
| 37 |
-
def patch_weight_to_device(self, key, device_to=None, inplace_update=False):
|
| 38 |
-
if key not in self.patches:
|
| 39 |
-
return
|
| 40 |
-
weight = comfy.utils.get_attr(self.model, key)
|
| 41 |
-
|
| 42 |
-
patches = self.patches[key]
|
| 43 |
-
if is_quantized(weight):
|
| 44 |
-
out_weight = weight.to(device_to)
|
| 45 |
-
patches = move_patch_to_device(patches, self.load_device if self.patch_on_device else self.offload_device)
|
| 46 |
-
# TODO: do we ever have legitimate duplicate patches? (i.e. patch on top of patched weight)
|
| 47 |
-
out_weight.patches = [(patches, key)]
|
| 48 |
-
else:
|
| 49 |
-
inplace_update = self.weight_inplace_update or inplace_update
|
| 50 |
-
if key not in self.backup:
|
| 51 |
-
self.backup[key] = collections.namedtuple('Dimension', ['weight', 'inplace_update'])(
|
| 52 |
-
weight.to(device=self.offload_device, copy=inplace_update), inplace_update
|
| 53 |
-
)
|
| 54 |
-
|
| 55 |
-
if device_to is not None:
|
| 56 |
-
temp_weight = comfy.model_management.cast_to_device(weight, device_to, torch.float32, copy=True)
|
| 57 |
-
else:
|
| 58 |
-
temp_weight = weight.to(torch.float32, copy=True)
|
| 59 |
-
|
| 60 |
-
out_weight = comfy.lora.calculate_weight(patches, temp_weight, key)
|
| 61 |
-
out_weight = comfy.float.stochastic_rounding(out_weight, weight.dtype)
|
| 62 |
-
|
| 63 |
-
if inplace_update:
|
| 64 |
-
comfy.utils.copy_to_param(self.model, key, out_weight)
|
| 65 |
-
else:
|
| 66 |
-
comfy.utils.set_attr_param(self.model, key, out_weight)
|
| 67 |
-
|
| 68 |
-
def unpatch_model(self, device_to=None, unpatch_weights=True):
|
| 69 |
-
if unpatch_weights:
|
| 70 |
-
for p in self.model.parameters():
|
| 71 |
-
if is_torch_compatible(p):
|
| 72 |
-
continue
|
| 73 |
-
patches = getattr(p, "patches", [])
|
| 74 |
-
if len(patches) > 0:
|
| 75 |
-
p.patches = []
|
| 76 |
-
# TODO: Find another way to not unload after patches
|
| 77 |
-
return super().unpatch_model(device_to=device_to, unpatch_weights=unpatch_weights)
|
| 78 |
-
|
| 79 |
-
mmap_released = False
|
| 80 |
-
def load(self, *args, force_patch_weights=False, **kwargs):
|
| 81 |
-
# always call `patch_weight_to_device` even for lowvram
|
| 82 |
-
super().load(*args, force_patch_weights=True, **kwargs)
|
| 83 |
-
|
| 84 |
-
# make sure nothing stays linked to mmap after first load
|
| 85 |
-
if not self.mmap_released:
|
| 86 |
-
linked = []
|
| 87 |
-
if kwargs.get("lowvram_model_memory", 0) > 0:
|
| 88 |
-
for n, m in self.model.named_modules():
|
| 89 |
-
if hasattr(m, "weight"):
|
| 90 |
-
device = getattr(m.weight, "device", None)
|
| 91 |
-
if device == self.offload_device:
|
| 92 |
-
linked.append((n, m))
|
| 93 |
-
continue
|
| 94 |
-
if hasattr(m, "bias"):
|
| 95 |
-
device = getattr(m.bias, "device", None)
|
| 96 |
-
if device == self.offload_device:
|
| 97 |
-
linked.append((n, m))
|
| 98 |
-
continue
|
| 99 |
-
if linked and self.load_device != self.offload_device:
|
| 100 |
-
logging.info(f"Attempting to release mmap ({len(linked)})")
|
| 101 |
-
for n, m in linked:
|
| 102 |
-
# TODO: possible to OOM, find better way to detach
|
| 103 |
-
m.to(self.load_device).to(self.offload_device)
|
| 104 |
-
self.mmap_released = True
|
| 105 |
-
|
| 106 |
-
def clone(self, *args, **kwargs):
|
| 107 |
-
src_cls = self.__class__
|
| 108 |
-
self.__class__ = GGUFModelPatcher
|
| 109 |
-
n = super().clone(*args, **kwargs)
|
| 110 |
-
n.__class__ = GGUFModelPatcher
|
| 111 |
-
self.__class__ = src_cls
|
| 112 |
-
# GGUF specific clone values below
|
| 113 |
-
n.patch_on_device = getattr(self, "patch_on_device", False)
|
| 114 |
-
if src_cls != GGUFModelPatcher:
|
| 115 |
-
n.size = 0 # force recalc
|
| 116 |
-
return n
|
| 117 |
-
|
| 118 |
-
class UnetLoaderGGUF:
|
| 119 |
-
@classmethod
|
| 120 |
-
def INPUT_TYPES(s):
|
| 121 |
-
unet_names = [x for x in folder_paths.get_filename_list("unet_gguf")]
|
| 122 |
-
return {
|
| 123 |
-
"required": {
|
| 124 |
-
"unet_name": (unet_names,),
|
| 125 |
-
}
|
| 126 |
-
}
|
| 127 |
-
|
| 128 |
-
RETURN_TYPES = ("MODEL",)
|
| 129 |
-
FUNCTION = "load_unet"
|
| 130 |
-
CATEGORY = "bootleg"
|
| 131 |
-
TITLE = "Unet Loader (GGUF)"
|
| 132 |
-
|
| 133 |
-
def load_unet(self, unet_name, dequant_dtype=None, patch_dtype=None, patch_on_device=None):
|
| 134 |
-
ops = GGMLOps()
|
| 135 |
-
|
| 136 |
-
if dequant_dtype in ("default", None):
|
| 137 |
-
ops.Linear.dequant_dtype = None
|
| 138 |
-
elif dequant_dtype in ["target"]:
|
| 139 |
-
ops.Linear.dequant_dtype = dequant_dtype
|
| 140 |
-
else:
|
| 141 |
-
ops.Linear.dequant_dtype = getattr(torch, dequant_dtype)
|
| 142 |
-
|
| 143 |
-
if patch_dtype in ("default", None):
|
| 144 |
-
ops.Linear.patch_dtype = None
|
| 145 |
-
elif patch_dtype in ["target"]:
|
| 146 |
-
ops.Linear.patch_dtype = patch_dtype
|
| 147 |
-
else:
|
| 148 |
-
ops.Linear.patch_dtype = getattr(torch, patch_dtype)
|
| 149 |
-
|
| 150 |
-
# init model
|
| 151 |
-
unet_path = folder_paths.get_full_path("unet", unet_name)
|
| 152 |
-
sd = gguf_sd_loader(unet_path)
|
| 153 |
-
model = comfy.sd.load_diffusion_model_state_dict(
|
| 154 |
-
sd, model_options={"custom_operations": ops}
|
| 155 |
-
)
|
| 156 |
-
if model is None:
|
| 157 |
-
logging.error("ERROR UNSUPPORTED UNET {}".format(unet_path))
|
| 158 |
-
raise RuntimeError("ERROR: Could not detect model type of: {}".format(unet_path))
|
| 159 |
-
model = GGUFModelPatcher.clone(model)
|
| 160 |
-
model.patch_on_device = patch_on_device
|
| 161 |
-
return (model,)
|
| 162 |
-
|
| 163 |
-
class UnetLoaderGGUFAdvanced(UnetLoaderGGUF):
|
| 164 |
-
@classmethod
|
| 165 |
-
def INPUT_TYPES(s):
|
| 166 |
-
unet_names = [x for x in folder_paths.get_filename_list("unet_gguf")]
|
| 167 |
-
return {
|
| 168 |
-
"required": {
|
| 169 |
-
"unet_name": (unet_names,),
|
| 170 |
-
"dequant_dtype": (["default", "target", "float32", "float16", "bfloat16"], {"default": "default"}),
|
| 171 |
-
"patch_dtype": (["default", "target", "float32", "float16", "bfloat16"], {"default": "default"}),
|
| 172 |
-
"patch_on_device": ("BOOLEAN", {"default": False}),
|
| 173 |
-
}
|
| 174 |
-
}
|
| 175 |
-
TITLE = "Unet Loader (GGUF/Advanced)"
|
| 176 |
-
|
| 177 |
-
class CLIPLoaderGGUF:
|
| 178 |
-
@classmethod
|
| 179 |
-
def INPUT_TYPES(s):
|
| 180 |
-
base = nodes.CLIPLoader.INPUT_TYPES()
|
| 181 |
-
return {
|
| 182 |
-
"required": {
|
| 183 |
-
"clip_name": (s.get_filename_list(),),
|
| 184 |
-
"type": base["required"]["type"],
|
| 185 |
-
}
|
| 186 |
-
}
|
| 187 |
-
|
| 188 |
-
RETURN_TYPES = ("CLIP",)
|
| 189 |
-
FUNCTION = "load_clip"
|
| 190 |
-
CATEGORY = "bootleg"
|
| 191 |
-
TITLE = "CLIPLoader (GGUF)"
|
| 192 |
-
|
| 193 |
-
@classmethod
|
| 194 |
-
def get_filename_list(s):
|
| 195 |
-
files = []
|
| 196 |
-
files += folder_paths.get_filename_list("clip")
|
| 197 |
-
files += folder_paths.get_filename_list("clip_gguf")
|
| 198 |
-
return sorted(files)
|
| 199 |
-
|
| 200 |
-
def load_data(self, ckpt_paths):
|
| 201 |
-
clip_data = []
|
| 202 |
-
for p in ckpt_paths:
|
| 203 |
-
if p.endswith(".gguf"):
|
| 204 |
-
sd = gguf_clip_loader(p)
|
| 205 |
-
else:
|
| 206 |
-
sd = comfy.utils.load_torch_file(p, safe_load=True)
|
| 207 |
-
if "scaled_fp8" in sd: # NOTE: Scaled FP8 would require different custom ops, but only one can be active
|
| 208 |
-
raise NotImplementedError(f"Mixing scaled FP8 with GGUF is not supported! Use regular CLIP loader or switch model(s)\n({p})")
|
| 209 |
-
clip_data.append(sd)
|
| 210 |
-
return clip_data
|
| 211 |
-
|
| 212 |
-
def load_patcher(self, clip_paths, clip_type, clip_data):
|
| 213 |
-
clip = comfy.sd.load_text_encoder_state_dicts(
|
| 214 |
-
clip_type = clip_type,
|
| 215 |
-
state_dicts = clip_data,
|
| 216 |
-
model_options = {
|
| 217 |
-
"custom_operations": GGMLOps,
|
| 218 |
-
"initial_device": comfy.model_management.text_encoder_offload_device()
|
| 219 |
-
},
|
| 220 |
-
embedding_directory = folder_paths.get_folder_paths("embeddings"),
|
| 221 |
-
)
|
| 222 |
-
clip.patcher = GGUFModelPatcher.clone(clip.patcher)
|
| 223 |
-
return clip
|
| 224 |
-
|
| 225 |
-
def load_clip(self, clip_name, type="stable_diffusion"):
|
| 226 |
-
clip_path = folder_paths.get_full_path("clip", clip_name)
|
| 227 |
-
clip_type = getattr(comfy.sd.CLIPType, type.upper(), comfy.sd.CLIPType.STABLE_DIFFUSION)
|
| 228 |
-
return (self.load_patcher([clip_path], clip_type, self.load_data([clip_path])),)
|
| 229 |
-
|
| 230 |
-
class DualCLIPLoaderGGUF(CLIPLoaderGGUF):
|
| 231 |
-
@classmethod
|
| 232 |
-
def INPUT_TYPES(s):
|
| 233 |
-
base = nodes.DualCLIPLoader.INPUT_TYPES()
|
| 234 |
-
file_options = (s.get_filename_list(), )
|
| 235 |
-
return {
|
| 236 |
-
"required": {
|
| 237 |
-
"clip_name1": file_options,
|
| 238 |
-
"clip_name2": file_options,
|
| 239 |
-
"type": base["required"]["type"],
|
| 240 |
-
}
|
| 241 |
-
}
|
| 242 |
-
|
| 243 |
-
TITLE = "DualCLIPLoader (GGUF)"
|
| 244 |
-
|
| 245 |
-
def load_clip(self, clip_name1, clip_name2, type):
|
| 246 |
-
clip_path1 = folder_paths.get_full_path("clip", clip_name1)
|
| 247 |
-
clip_path2 = folder_paths.get_full_path("clip", clip_name2)
|
| 248 |
-
clip_paths = (clip_path1, clip_path2)
|
| 249 |
-
clip_type = getattr(comfy.sd.CLIPType, type.upper(), comfy.sd.CLIPType.STABLE_DIFFUSION)
|
| 250 |
-
return (self.load_patcher(clip_paths, clip_type, self.load_data(clip_paths)),)
|
| 251 |
-
|
| 252 |
-
class TripleCLIPLoaderGGUF(CLIPLoaderGGUF):
|
| 253 |
-
@classmethod
|
| 254 |
-
def INPUT_TYPES(s):
|
| 255 |
-
file_options = (s.get_filename_list(), )
|
| 256 |
-
return {
|
| 257 |
-
"required": {
|
| 258 |
-
"clip_name1": file_options,
|
| 259 |
-
"clip_name2": file_options,
|
| 260 |
-
"clip_name3": file_options,
|
| 261 |
-
}
|
| 262 |
-
}
|
| 263 |
-
|
| 264 |
-
TITLE = "TripleCLIPLoader (GGUF)"
|
| 265 |
-
|
| 266 |
-
def load_clip(self, clip_name1, clip_name2, clip_name3, type="sd3"):
|
| 267 |
-
clip_path1 = folder_paths.get_full_path("clip", clip_name1)
|
| 268 |
-
clip_path2 = folder_paths.get_full_path("clip", clip_name2)
|
| 269 |
-
clip_path3 = folder_paths.get_full_path("clip", clip_name3)
|
| 270 |
-
clip_paths = (clip_path1, clip_path2, clip_path3)
|
| 271 |
-
clip_type = getattr(comfy.sd.CLIPType, type.upper(), comfy.sd.CLIPType.STABLE_DIFFUSION)
|
| 272 |
-
return (self.load_patcher(clip_paths, clip_type, self.load_data(clip_paths)),)
|
| 273 |
-
|
| 274 |
-
class QuadrupleCLIPLoaderGGUF(CLIPLoaderGGUF):
|
| 275 |
-
@classmethod
|
| 276 |
-
def INPUT_TYPES(s):
|
| 277 |
-
file_options = (s.get_filename_list(), )
|
| 278 |
-
return {
|
| 279 |
-
"required": {
|
| 280 |
-
"clip_name1": file_options,
|
| 281 |
-
"clip_name2": file_options,
|
| 282 |
-
"clip_name3": file_options,
|
| 283 |
-
"clip_name4": file_options,
|
| 284 |
-
}
|
| 285 |
-
}
|
| 286 |
-
|
| 287 |
-
TITLE = "QuadrupleCLIPLoader (GGUF)"
|
| 288 |
-
|
| 289 |
-
def load_clip(self, clip_name1, clip_name2, clip_name3, clip_name4, type="stable_diffusion"):
|
| 290 |
-
clip_path1 = folder_paths.get_full_path("clip", clip_name1)
|
| 291 |
-
clip_path2 = folder_paths.get_full_path("clip", clip_name2)
|
| 292 |
-
clip_path3 = folder_paths.get_full_path("clip", clip_name3)
|
| 293 |
-
clip_path4 = folder_paths.get_full_path("clip", clip_name4)
|
| 294 |
-
clip_paths = (clip_path1, clip_path2, clip_path3, clip_path4)
|
| 295 |
-
clip_type = getattr(comfy.sd.CLIPType, type.upper(), comfy.sd.CLIPType.STABLE_DIFFUSION)
|
| 296 |
-
return (self.load_patcher(clip_paths, clip_type, self.load_data(clip_paths)),)
|
| 297 |
-
|
| 298 |
-
NODE_CLASS_MAPPINGS = {
|
| 299 |
-
"UnetLoaderGGUF": UnetLoaderGGUF,
|
| 300 |
-
"CLIPLoaderGGUF": CLIPLoaderGGUF,
|
| 301 |
-
"DualCLIPLoaderGGUF": DualCLIPLoaderGGUF,
|
| 302 |
-
"TripleCLIPLoaderGGUF": TripleCLIPLoaderGGUF,
|
| 303 |
-
"QuadrupleCLIPLoaderGGUF": QuadrupleCLIPLoaderGGUF,
|
| 304 |
-
"UnetLoaderGGUFAdvanced": UnetLoaderGGUFAdvanced,
|
| 305 |
-
}
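Each `load_clip` method above turns the `type` string into a `comfy.sd.CLIPType` member with `getattr(..., type.upper(), default)`. The following is a minimal sketch of that lookup pattern only. `FakeCLIPType` and `resolve_clip_type` are hypothetical names introduced for illustration, and the enum members are not the real ComfyUI list.

```python
from enum import Enum


class FakeCLIPType(Enum):
    # Hypothetical stand-in for comfy.sd.CLIPType; members are illustrative only.
    STABLE_DIFFUSION = 1
    SD3 = 2
    FLUX = 3


def resolve_clip_type(type_name, enum_cls=FakeCLIPType):
    # Upper-case the node's string option and look it up on the enum class,
    # falling back to STABLE_DIFFUSION when no member has that name.
    return getattr(enum_cls, type_name.upper(), enum_cls.STABLE_DIFFUSION)


if __name__ == "__main__":
    print(resolve_clip_type("sd3"))
    print(resolve_clip_type("not_a_real_type"))
```

One consequence of this pattern is that an unknown type name falls back to the default silently instead of raising an error.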
ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py
DELETED
|
@@ -1,281 +0,0 @@
|
|
| 1 |
-
# (c) City96 || Apache-2.0 (apache.org/licenses/LICENSE-2.0)
|
| 2 |
-
import gguf
|
| 3 |
-
import torch
|
| 4 |
-
import logging
|
| 5 |
-
|
| 6 |
-
import comfy.ops
|
| 7 |
-
import comfy.lora
|
| 8 |
-
import comfy.model_management
|
| 9 |
-
from .dequant import dequantize_tensor, is_quantized
|
| 10 |
-
|
| 11 |
-
def chained_hasattr(obj, chained_attr):
|
| 12 |
-
probe = obj
|
| 13 |
-
for attr in chained_attr.split('.'):
|
| 14 |
-
if hasattr(probe, attr):
|
| 15 |
-
probe = getattr(probe, attr)
|
| 16 |
-
else:
|
| 17 |
-
return False
|
| 18 |
-
return True
|
| 19 |
-
|
| 20 |
-
# A backward and forward compatible way to get `torch.compiler.disable`.
|
| 21 |
-
def get_torch_compiler_disable_decorator():
|
| 22 |
-
def dummy_decorator(*args, **kwargs):
|
| 23 |
-
def noop(x):
|
| 24 |
-
return x
|
| 25 |
-
return noop
|
| 26 |
-
|
| 27 |
-
from packaging import version
|
| 28 |
-
|
| 29 |
-
if not chained_hasattr(torch, "compiler.disable"):
|
| 30 |
-
logging.info("ComfyUI-GGUF: Torch too old for torch.compile - bypassing")
|
| 31 |
-
return dummy_decorator # torch too old
|
| 32 |
-
elif version.parse(torch.__version__) >= version.parse("2.8"):
|
| 33 |
-
logging.info("ComfyUI-GGUF: Allowing full torch compile")
|
| 34 |
-
return dummy_decorator # torch compile works
|
| 35 |
-
if chained_hasattr(torch, "_dynamo.config.nontraceable_tensor_subclasses"):
|
| 36 |
-
logging.info("ComfyUI-GGUF: Allowing full torch compile (nightly)")
|
| 37 |
-
return dummy_decorator # torch compile works, nightly before 2.8 release
|
| 38 |
-
else:
|
| 39 |
-
logging.info("ComfyUI-GGUF: Partial torch compile only, consider updating pytorch")
|
| 40 |
-
return torch.compiler.disable
|
| 41 |
-
|
| 42 |
-
torch_compiler_disable = get_torch_compiler_disable_decorator()
|
| 43 |
-
|
| 44 |
-
class GGMLTensor(torch.Tensor):
|
| 45 |
-
"""
|
| 46 |
-
Main tensor-like class for storing quantized weights
|
| 47 |
-
"""
|
| 48 |
-
def __init__(self, *args, tensor_type, tensor_shape, patches=[], **kwargs):
|
| 49 |
-
super().__init__()
|
| 50 |
-
self.tensor_type = tensor_type
|
| 51 |
-
self.tensor_shape = tensor_shape
|
| 52 |
-
self.patches = patches
|
| 53 |
-
|
| 54 |
-
def __new__(cls, *args, tensor_type, tensor_shape, patches=[], **kwargs):
|
| 55 |
-
return super().__new__(cls, *args, **kwargs)
|
| 56 |
-
|
| 57 |
-
def to(self, *args, **kwargs):
|
| 58 |
-
new = super().to(*args, **kwargs)
|
| 59 |
-
new.tensor_type = getattr(self, "tensor_type", None)
|
| 60 |
-
new.tensor_shape = getattr(self, "tensor_shape", new.data.shape)
|
| 61 |
-
new.patches = getattr(self, "patches", []).copy()
|
| 62 |
-
return new
|
| 63 |
-
|
| 64 |
-
def clone(self, *args, **kwargs):
|
| 65 |
-
return self
|
| 66 |
-
|
| 67 |
-
def detach(self, *args, **kwargs):
|
| 68 |
-
return self
|
| 69 |
-
|
| 70 |
-
def copy_(self, *args, **kwargs):
|
| 71 |
-
# fixes .weight.copy_ in comfy/clip_model/CLIPTextModel
|
| 72 |
-
try:
|
| 73 |
-
return super().copy_(*args, **kwargs)
|
| 74 |
-
except Exception as e:
|
| 75 |
-
logging.warning(f"ignoring 'copy_' on tensor: {e}")
|
| 76 |
-
|
| 77 |
-
def new_empty(self, size, *args, **kwargs):
|
| 78 |
-
# Intel Arc fix, ref#50
|
| 79 |
-
new_tensor = super().new_empty(size, *args, **kwargs)
|
| 80 |
-
return GGMLTensor(
|
| 81 |
-
new_tensor,
|
| 82 |
-
tensor_type = getattr(self, "tensor_type", None),
|
| 83 |
-
tensor_shape = size,
|
| 84 |
-
patches = getattr(self, "patches", []).copy()
|
| 85 |
-
)
|
| 86 |
-
|
| 87 |
-
@property
|
| 88 |
-
def shape(self):
|
| 89 |
-
if not hasattr(self, "tensor_shape"):
|
| 90 |
-
self.tensor_shape = self.size()
|
| 91 |
-
return self.tensor_shape
|
| 92 |
-
|
| 93 |
-
class GGMLLayer(torch.nn.Module):
|
| 94 |
-
"""
|
| 95 |
-
This (should) be responsible for de-quantizing on the fly
|
| 96 |
-
"""
|
| 97 |
-
comfy_cast_weights = True
|
| 98 |
-
dequant_dtype = None
|
| 99 |
-
patch_dtype = None
|
| 100 |
-
largest_layer = False
|
| 101 |
-
torch_compatible_tensor_types = {None, gguf.GGMLQuantizationType.F32, gguf.GGMLQuantizationType.F16}
|
| 102 |
-
|
| 103 |
-
def is_ggml_quantized(self, *, weight=None, bias=None):
|
| 104 |
-
if weight is None:
|
| 105 |
-
weight = self.weight
|
| 106 |
-
if bias is None:
|
| 107 |
-
bias = self.bias
|
| 108 |
-
return is_quantized(weight) or is_quantized(bias)
|
| 109 |
-
|
| 110 |
-
def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
|
| 111 |
-
weight, bias = state_dict.get(f"{prefix}weight"), state_dict.get(f"{prefix}bias")
|
| 112 |
-
# NOTE: using modified load for linear due to not initializing on creation, see GGMLOps todo
|
| 113 |
-
if self.is_ggml_quantized(weight=weight, bias=bias) or isinstance(self, torch.nn.Linear):
|
| 114 |
-
return self.ggml_load_from_state_dict(state_dict, prefix, *args, **kwargs)
|
| 115 |
-
# Not strictly required, but fixes embedding shape mismatch. Threshold set in loader.py
|
| 116 |
-
if isinstance(self, torch.nn.Embedding) and self.weight.shape[0] >= (64 * 1024):
|
| 117 |
-
return self.ggml_load_from_state_dict(state_dict, prefix, *args, **kwargs)
|
| 118 |
-
return super()._load_from_state_dict(state_dict, prefix, *args, **kwargs)
|
| 119 |
-
|
| 120 |
-
def ggml_load_from_state_dict(self, state_dict, prefix, local_metadata, strict, missing_keys, unexpected_keys, error_msgs):
|
| 121 |
-
prefix_len = len(prefix)
|
| 122 |
-
for k,v in state_dict.items():
|
| 123 |
-
if k[prefix_len:] == "weight":
|
| 124 |
-
self.weight = torch.nn.Parameter(v, requires_grad=False)
|
| 125 |
-
elif k[prefix_len:] == "bias" and v is not None:
|
| 126 |
-
self.bias = torch.nn.Parameter(v, requires_grad=False)
|
| 127 |
-
else:
|
| 128 |
-
unexpected_keys.append(k)
|
| 129 |
-
|
| 130 |
-
# For Linear layer with missing weight
|
| 131 |
-
if self.weight is None and isinstance(self, torch.nn.Linear):
|
| 132 |
-
v = torch.zeros(self.in_features, self.out_features)
|
| 133 |
-
self.weight = torch.nn.Parameter(v, requires_grad=False)
|
| 134 |
-
missing_keys.append(prefix+"weight")
|
| 135 |
-
|
| 136 |
-
# for vram estimation (TODO: less fragile logic?)
|
| 137 |
-
if getattr(self.weight, "is_largest_weight", False):
|
| 138 |
-
self.largest_layer = True
|
| 139 |
-
|
| 140 |
-
def _save_to_state_dict(self, *args, **kwargs):
|
| 141 |
-
if self.is_ggml_quantized():
|
| 142 |
-
return self.ggml_save_to_state_dict(*args, **kwargs)
|
| 143 |
-
return super()._save_to_state_dict(*args, **kwargs)
|
| 144 |
-
|
| 145 |
-
def ggml_save_to_state_dict(self, destination, prefix, keep_vars):
|
| 146 |
-
# This is a fake state dict for vram estimation
|
| 147 |
-
weight = torch.zeros_like(self.weight, device=torch.device("meta"))
|
| 148 |
-
destination[prefix + "weight"] = weight
|
| 149 |
-
if self.bias is not None:
|
| 150 |
-
bias = torch.zeros_like(self.bias, device=torch.device("meta"))
|
| 151 |
-
destination[prefix + "bias"] = bias
|
| 152 |
-
|
| 153 |
-
# Take into account space required for dequantizing the largest tensor
|
| 154 |
-
if self.largest_layer:
|
| 155 |
-
shape = getattr(self.weight, "tensor_shape", self.weight.shape)
|
| 156 |
-
dtype = self.dequant_dtype if self.dequant_dtype and self.dequant_dtype != "target" else torch.float16
|
| 157 |
-
temp = torch.empty(*shape, device=torch.device("meta"), dtype=dtype)
|
| 158 |
-
destination[prefix + "temp.weight"] = temp
|
| 159 |
-
|
| 160 |
-
return
|
| 161 |
-
# This would return the dequantized state dict
|
| 162 |
-
destination[prefix + "weight"] = self.get_weight(self.weight)
|
| 163 |
-
if bias is not None:
|
| 164 |
-
destination[prefix + "bias"] = self.get_weight(self.bias)
|
| 165 |
-
|
| 166 |
-
def get_weight(self, tensor, dtype):
|
| 167 |
-
if tensor is None:
|
| 168 |
-
return
|
| 169 |
-
|
| 170 |
-
# consolidate and load patches to GPU in async
|
| 171 |
-
patch_list = []
|
| 172 |
-
device = tensor.device
|
| 173 |
-
for patches, key in getattr(tensor, "patches", []):
|
| 174 |
-
patch_list += move_patch_to_device(patches, device)
|
| 175 |
-
|
| 176 |
-
# dequantize tensor while patches load
|
| 177 |
-
weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
|
| 178 |
-
|
| 179 |
-
# prevent propagating custom tensor class
|
| 180 |
-
if isinstance(weight, GGMLTensor):
|
| 181 |
-
weight = torch.Tensor(weight)
|
| 182 |
-
|
| 183 |
-
# apply patches
|
| 184 |
-
if len(patch_list) > 0:
|
| 185 |
-
if self.patch_dtype is None:
|
| 186 |
-
weight = comfy.lora.calculate_weight(patch_list, weight, key)
|
| 187 |
-
else:
|
| 188 |
-
# for testing, may degrade image quality
|
| 189 |
-
patch_dtype = dtype if self.patch_dtype == "target" else self.patch_dtype
|
| 190 |
-
weight = comfy.lora.calculate_weight(patch_list, weight, key, patch_dtype)
|
| 191 |
-
return weight
|
| 192 |
-
|
| 193 |
-
@torch_compiler_disable()
|
| 194 |
-
def cast_bias_weight(s, input=None, dtype=None, device=None, bias_dtype=None):
|
| 195 |
-
if input is not None:
|
| 196 |
-
if dtype is None:
|
| 197 |
-
dtype = getattr(input, "dtype", torch.float32)
|
| 198 |
-
if bias_dtype is None:
|
| 199 |
-
bias_dtype = dtype
|
| 200 |
-
if device is None:
|
| 201 |
-
device = input.device
|
| 202 |
-
|
| 203 |
-
bias = None
|
| 204 |
-
non_blocking = comfy.model_management.device_supports_non_blocking(device)
|
| 205 |
-
if s.bias is not None:
|
| 206 |
-
bias = s.get_weight(s.bias.to(device), dtype)
|
| 207 |
-
bias = comfy.ops.cast_to(bias, bias_dtype, device, non_blocking=non_blocking, copy=False)
|
| 208 |
-
|
| 209 |
-
weight = s.get_weight(s.weight.to(device), dtype)
|
| 210 |
-
weight = comfy.ops.cast_to(weight, dtype, device, non_blocking=non_blocking, copy=False)
|
| 211 |
-
return weight, bias
|
| 212 |
-
|
| 213 |
-
def forward_comfy_cast_weights(self, input, *args, **kwargs):
|
| 214 |
-
if self.is_ggml_quantized():
|
| 215 |
-
out = self.forward_ggml_cast_weights(input, *args, **kwargs)
|
| 216 |
-
else:
|
| 217 |
-
out = super().forward_comfy_cast_weights(input, *args, **kwargs)
|
| 218 |
-
|
| 219 |
-
# non-ggml forward might still propagate custom tensor class
|
| 220 |
-
if isinstance(out, GGMLTensor):
|
| 221 |
-
out = torch.Tensor(out)
|
| 222 |
-
return out
|
| 223 |
-
|
| 224 |
-
def forward_ggml_cast_weights(self, input):
|
| 225 |
-
raise NotImplementedError
|
| 226 |
-
|
| 227 |
-
class GGMLOps(comfy.ops.manual_cast):
|
| 228 |
-
"""
|
| 229 |
-
Dequantize weights on the fly before doing the compute
|
| 230 |
-
"""
|
| 231 |
-
class Linear(GGMLLayer, comfy.ops.manual_cast.Linear):
|
| 232 |
-
def __init__(self, in_features, out_features, bias=True, device=None, dtype=None):
|
| 233 |
-
torch.nn.Module.__init__(self)
|
| 234 |
-
# TODO: better workaround for reserved memory spike on windows
|
| 235 |
-
# Issue is with `torch.empty` still reserving the full memory for the layer
|
| 236 |
-
# Windows doesn't over-commit memory so without this 24GB+ of pagefile is used
|
| 237 |
-
self.in_features = in_features
|
| 238 |
-
self.out_features = out_features
|
| 239 |
-
self.weight = None
|
| 240 |
-
self.bias = None
|
| 241 |
-
|
| 242 |
-
def forward_ggml_cast_weights(self, input):
|
| 243 |
-
weight, bias = self.cast_bias_weight(input)
|
| 244 |
-
return torch.nn.functional.linear(input, weight, bias)
|
| 245 |
-
|
| 246 |
-
class Conv2d(GGMLLayer, comfy.ops.manual_cast.Conv2d):
|
| 247 |
-
def forward_ggml_cast_weights(self, input):
|
| 248 |
-
weight, bias = self.cast_bias_weight(input)
|
| 249 |
-
return self._conv_forward(input, weight, bias)
|
| 250 |
-
|
| 251 |
-
class Embedding(GGMLLayer, comfy.ops.manual_cast.Embedding):
|
| 252 |
-
def forward_ggml_cast_weights(self, input, out_dtype=None):
|
| 253 |
-
output_dtype = out_dtype
|
| 254 |
-
if self.weight.dtype == torch.float16 or self.weight.dtype == torch.bfloat16:
|
| 255 |
-
out_dtype = None
|
| 256 |
-
weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
|
| 257 |
-
return torch.nn.functional.embedding(
|
| 258 |
-
input, weight, self.padding_idx, self.max_norm, self.norm_type, self.scale_grad_by_freq, self.sparse
|
| 259 |
-
).to(dtype=output_dtype)
|
| 260 |
-
|
| 261 |
-
class LayerNorm(GGMLLayer, comfy.ops.manual_cast.LayerNorm):
|
| 262 |
-
def forward_ggml_cast_weights(self, input):
|
| 263 |
-
if self.weight is None:
|
| 264 |
-
return super().forward_comfy_cast_weights(input)
|
| 265 |
-
weight, bias = self.cast_bias_weight(input)
|
| 266 |
-
return torch.nn.functional.layer_norm(input, self.normalized_shape, weight, bias, self.eps)
|
| 267 |
-
|
| 268 |
-
class GroupNorm(GGMLLayer, comfy.ops.manual_cast.GroupNorm):
|
| 269 |
-
def forward_ggml_cast_weights(self, input):
|
| 270 |
-
weight, bias = self.cast_bias_weight(input)
|
| 271 |
-
return torch.nn.functional.group_norm(input, self.num_groups, weight, bias, self.eps)
|
| 272 |
-
|
| 273 |
-
def move_patch_to_device(item, device):
|
| 274 |
-
if isinstance(item, torch.Tensor):
|
| 275 |
-
return item.to(device, non_blocking=True)
|
| 276 |
-
elif isinstance(item, tuple):
|
| 277 |
-
return tuple(move_patch_to_device(x, device) for x in item)
|
| 278 |
-
elif isinstance(item, list):
|
| 279 |
-
return [move_patch_to_device(x, device) for x in item]
|
| 280 |
-
else:
|
| 281 |
-
return item
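The version gate near the top of `ops.py` relies on `chained_hasattr` to probe dotted attribute paths such as `compiler.disable` without raising. Here is a small sketch of that probing, run against a stand-in object so it works without PyTorch installed. `fake_torch` is a hypothetical placeholder and says nothing about any real torch build.

```python
from types import SimpleNamespace


def chained_hasattr(obj, chained_attr):
    # Walk the dotted path one attribute at a time; stop at the first miss.
    probe = obj
    for attr in chained_attr.split("."):
        if not hasattr(probe, attr):
            return False
        probe = getattr(probe, attr)
    return True


# Hypothetical stand-in for a torch module that exposes `compiler.disable`
# but has no `_dynamo` attribute at all.
fake_torch = SimpleNamespace(compiler=SimpleNamespace(disable=lambda fn=None: fn))

if __name__ == "__main__":
    print(chained_hasattr(fake_torch, "compiler.disable"))
    print(chained_hasattr(fake_torch, "_dynamo.config.nontraceable_tensor_subclasses"))
```

On this stand-in the first probe succeeds and the second fails at its first segment, so the rest of the path is never evaluated.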
ComfyUI/custom_nodes/ComfyUI-GGUF/pyproject.toml
DELETED
|
@@ -1,14 +0,0 @@
|
|
| 1 |
-
[project]
|
| 2 |
-
name = "comfyui-gguf"
|
| 3 |
-
description = "GGUF Quantization support for native ComfyUI models."
|
| 4 |
-
version = "2.0.0" # 2.0.0 = GitHub main, 1.X.X = ComfyUI Registry
|
| 5 |
-
license = { file = "LICENSE" }
|
| 6 |
-
dependencies = ["gguf>=0.13.0", "sentencepiece", "protobuf"]
|
| 7 |
-
|
| 8 |
-
[project.urls]
|
| 9 |
-
Repository = "https://github.com/city96/ComfyUI-GGUF"
|
| 10 |
-
|
| 11 |
-
[tool.comfy]
|
| 12 |
-
PublisherId = "city96"
|
| 13 |
-
DisplayName = "ComfyUI-GGUF"
|
| 14 |
-
Icon = ""
ComfyUI/custom_nodes/ComfyUI-GGUF/requirements.txt
DELETED
|
@@ -1,5 +0,0 @@
|
|
| 1 |
-
# main
|
| 2 |
-
gguf>=0.13.0
|
| 3 |
-
# optional - tokenizer
|
| 4 |
-
sentencepiece
|
| 5 |
-
protobuf
ComfyUI/custom_nodes/ComfyUI-GGUF/tools/README.md
DELETED
|
@@ -1,93 +0,0 @@
|
|
| 1 |
-
## Converting initial model
|
| 2 |
-
|
| 3 |
-
To convert your initial safetensors/ckpt model to FP16/BF16 GGUF, run the following command:
|
| 4 |
-
|
| 5 |
-
```
|
| 6 |
-
python convert.py --src E:\models\unet\flux1-dev.safetensors
|
| 7 |
-
```
|
| 8 |
-
Make sure `gguf>=0.13.0` is installed for this step. Optionally, specify the output gguf file with the `--dst` arg.
|
| 9 |
-
|
| 10 |
-
> [!NOTE]
|
| 11 |
-
> Do not use the diffusers UNET format for flux; it won't work. Use the default/reference checkpoint key format instead. This is due to q/k/v being merged into one qkv key.
|
| 12 |
-
> You can convert it by loading it in ComfyUI and saving it using the built-in "ModelSave" node.
|
| 13 |
-
|
| 14 |
-
> [!WARNING]
|
| 15 |
-
> For hunyuan video/wan 2.1, you will see a warning about 5D tensors. This means the script will save a **non-functional** model to disk first, which you can then quantize. I recommend saving these in a separate `raw` folder to avoid confusion.
|
| 16 |
-
>
|
| 17 |
-
> After quantization, you will have to run `fix_5d_tensors.py` manually to add back the missing key that was saved by the conversion code.
|
| 18 |
-
|
| 19 |
-
## Quantizing using custom llama.cpp
|
| 20 |
-
|
| 21 |
-
Depending on your git settings, you may need to run the following script first in order to make sure the patch file is valid. It will convert Windows (CRLF) line endings to Unix (LF) ones.
|
| 22 |
-
|
| 23 |
-
```
|
| 24 |
-
python fix_lines_ending.py
|
| 25 |
-
```
|
| 26 |
-
|
| 27 |
-
Git clone llama.cpp into the current folder:
|
| 28 |
-
|
| 29 |
-
```
|
| 30 |
-
git clone https://github.com/ggerganov/llama.cpp
|
| 31 |
-
```
|
| 32 |
-
|
| 33 |
-
Check out the correct branch, then apply the custom patch needed to add image model support to the repo you just cloned.
|
| 34 |
-
|
| 35 |
-
```
|
| 36 |
-
cd llama.cpp
|
| 37 |
-
git checkout tags/b3962
|
| 38 |
-
git apply ..\lcpp.patch
|
| 39 |
-
```
|
| 40 |
-
|
| 41 |
-
Compile the llama-quantize binary. This example uses cmake; on Linux you can just use make.
|
| 42 |
-
|
| 43 |
-
### Visual Studio 2019, Linux, etc...
|
| 44 |
-
|
| 45 |
-
```
|
| 46 |
-
mkdir build
|
| 47 |
-
cmake -B build
|
| 48 |
-
cmake --build build --config Debug -j10 --target llama-quantize
|
| 49 |
-
cd ..
|
| 50 |
-
```
|
| 51 |
-
|
| 52 |
-
### Visual Studio 2022
|
| 53 |
-
|
| 54 |
-
```
|
| 55 |
-
mkdir build
|
| 56 |
-
cmake -B build -DCMAKE_CXX_STANDARD=17 -DCMAKE_CXX_STANDARD_REQUIRED=ON -DCMAKE_CXX_FLAGS="-std=c++17"
|
| 57 |
-
```
|
| 58 |
-
|
| 59 |
-
Edit the `llama.cpp\common\log.cpp` file and insert two lines after the existing first line:
|
| 60 |
-
|
| 61 |
-
```
|
| 62 |
-
#include "log.h"
|
| 63 |
-
|
| 64 |
-
#define _SILENCE_CXX23_CHRONO_DEPRECATION_WARNING
|
| 65 |
-
#include <chrono>
|
| 66 |
-
```
|
| 67 |
-
|
| 68 |
-
Then you can build the project:
|
| 69 |
-
```
|
| 70 |
-
cmake --build build --config Debug -j10 --target llama-quantize
|
| 71 |
-
cd ..
|
| 72 |
-
```
|
| 73 |
-
|
| 74 |
-
### Quantize your model
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
Now you can use the newly built binary to quantize your model to the desired format:
|
| 78 |
-
```
|
| 79 |
-
llama.cpp\build\bin\Debug\llama-quantize.exe E:\models\unet\flux1-dev-BF16.gguf E:\models\unet\flux1-dev-Q4_K_S.gguf Q4_K_S
|
| 80 |
-
```
|
| 81 |
-
|
| 82 |
-
You can extract the patch again with `git diff src\llama.cpp > lcpp.patch` if you wish to change something and contribute back.
|
| 83 |
-
|
| 84 |
-
> [!WARNING]
|
| 85 |
-
> For hunyuan video/wan 2.1, you will have to run `fix_5d_tensors.py` after the quantization step is done.
|
| 86 |
-
>
|
| 87 |
-
> Example usage: `fix_5d_tensors.py --src E:\models\video\raw\wan2.1-t2v-1.3b-Q8_0.gguf --dst E:\models\video\wan2.1-t2v-1.3b-Q8_0.gguf`
|
| 88 |
-
>
|
| 89 |
-
> By default, this also saves a `fix_5d_tensors_[arch].safetensors` file in the `ComfyUI-GGUF/tools` folder. It's recommended to delete this after all models have been converted.
|
| 90 |
-
|
| 91 |
-
> [!NOTE]
|
| 92 |
-
> Do not quantize SDXL / SD1 / other Conv2D heavy models. If you do, make sure to **extract the UNET model first**.
|
| 93 |
-
> This should be obvious, but also don't use the resulting llama-quantize binary with LLMs.
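The "Quantizing using custom llama.cpp" section above says the patch file may need its Windows (CRLF) line endings converted to Unix (LF) ones before `git apply` accepts it. The following is a rough sketch of that kind of conversion. It is not the contents of `fix_lines_ending.py`, and `crlf_to_lf` and `convert_file` are hypothetical helper names.

```python
def crlf_to_lf(data: bytes) -> bytes:
    # Replace each CRLF pair with a single LF; lone LF bytes are left untouched.
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> bool:
    # Rewrite the file in place only when it contains CRLF.
    # Returns True if the file was changed, False otherwise.
    with open(path, "rb") as f:
        original = f.read()
    converted = crlf_to_lf(original)
    if converted == original:
        return False
    with open(path, "wb") as f:
        f.write(converted)
    return True
```

Working on raw bytes rather than decoded text avoids guessing the patch file's encoding and leaves every other byte unchanged.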
ComfyUI/custom_nodes/ComfyUI-GGUF/tools/convert.py
DELETED
|
@@ -1,365 +0,0 @@
|
|
| 1 |
-
# (c) City96 || Apache-2.0 (apache.org/licenses/LICENSE-2.0)
|
| 2 |
-
import os
|
| 3 |
-
import gguf
|
| 4 |
-
import torch
|
| 5 |
-
import logging
|
| 6 |
-
import argparse
|
| 7 |
-
from tqdm import tqdm
|
| 8 |
-
from safetensors.torch import load_file, save_file
|
| 9 |
-
|
| 10 |
-
QUANTIZATION_THRESHOLD = 1024
|
| 11 |
-
REARRANGE_THRESHOLD = 512
|
| 12 |
-
MAX_TENSOR_NAME_LENGTH = 127
|
| 13 |
-
MAX_TENSOR_DIMS = 4
|
| 14 |
-
|
| 15 |
-
class ModelTemplate:
|
| 16 |
-
arch = "invalid" # string describing architecture
|
| 17 |
-
shape_fix = False # whether to reshape tensors
|
| 18 |
-
keys_detect = [] # list of lists to match in state dict
|
| 19 |
-
keys_banned = [] # list of keys that should mark model as invalid for conversion
|
| 20 |
-
keys_hiprec = [] # list of keys that need to be kept in fp32 for some reason
|
| 21 |
-
keys_ignore = [] # list of strings to ignore keys by when found
|
| 22 |
-
|
| 23 |
-
def handle_nd_tensor(self, key, data):
|
| 24 |
-
raise NotImplementedError(f"Tensor detected that exceeds dims supported by C++ code! ({key} @ {data.shape})")
|
| 25 |
-
|
| 26 |
-
class ModelFlux(ModelTemplate):
|
| 27 |
-
arch = "flux"
|
| 28 |
-
keys_detect = [
|
| 29 |
-
("transformer_blocks.0.attn.norm_added_k.weight",),
|
| 30 |
-
("double_blocks.0.img_attn.proj.weight",),
|
| 31 |
-
]
|
| 32 |
-
keys_banned = ["transformer_blocks.0.attn.norm_added_k.weight",]
|
| 33 |
-
|
| 34 |
-
class ModelSD3(ModelTemplate):
|
| 35 |
-
arch = "sd3"
|
| 36 |
-
keys_detect = [
|
| 37 |
-
("transformer_blocks.0.attn.add_q_proj.weight",),
|
| 38 |
-
("joint_blocks.0.x_block.attn.qkv.weight",),
|
| 39 |
-
]
|
| 40 |
-
keys_banned = ["transformer_blocks.0.attn.add_q_proj.weight",]
|
| 41 |
-
|
| 42 |
-
class ModelAura(ModelTemplate):
|
| 43 |
-
arch = "aura"
|
| 44 |
-
keys_detect = [
|
| 45 |
-
("double_layers.3.modX.1.weight",),
|
| 46 |
-
("joint_transformer_blocks.3.ff_context.out_projection.weight",),
|
| 47 |
-
]
|
| 48 |
-
keys_banned = ["joint_transformer_blocks.3.ff_context.out_projection.weight",]
|
| 49 |
-
|
| 50 |
-
class ModelHiDream(ModelTemplate):
|
| 51 |
-
arch = "hidream"
|
| 52 |
-
keys_detect = [
|
| 53 |
-
(
|
| 54 |
-
"caption_projection.0.linear.weight",
|
| 55 |
-
"double_stream_blocks.0.block.ff_i.shared_experts.w3.weight"
|
| 56 |
-
)
|
| 57 |
-
]
|
| 58 |
-
keys_hiprec = [
|
| 59 |
-
# nn.parameter, can't load from BF16 ver
|
| 60 |
-
".ff_i.gate.weight",
|
| 61 |
-
"img_emb.emb_pos"
|
| 62 |
-
]
|
| 63 |
-
|
| 64 |
-
class CosmosPredict2(ModelTemplate):
|
| 65 |
-
arch = "cosmos"
|
| 66 |
-
keys_detect = [
|
| 67 |
-
(
|
| 68 |
-
"blocks.0.mlp.layer1.weight",
|
| 69 |
-
"blocks.0.adaln_modulation_cross_attn.1.weight",
|
| 70 |
-
)
|
| 71 |
-
]
|
| 72 |
-
keys_hiprec = ["pos_embedder"]
|
| 73 |
-
keys_ignore = ["_extra_state", "accum_"]
|
| 74 |
-
|
| 75 |
-
class ModelHyVid(ModelTemplate):
|
| 76 |
-
arch = "hyvid"
|
| 77 |
-
keys_detect = [
|
| 78 |
-
(
|
| 79 |
-
"double_blocks.0.img_attn_proj.weight",
|
| 80 |
-
"txt_in.individual_token_refiner.blocks.1.self_attn_qkv.weight",
|
| 81 |
-
)
|
| 82 |
-
]
|
| 83 |
-
|
| 84 |
-
def handle_nd_tensor(self, key, data):
|
| 85 |
-
# hacky but don't have any better ideas
|
| 86 |
-
path = f"./fix_5d_tensors_{self.arch}.safetensors" # TODO: somehow get a path here??
|
| 87 |
-
if os.path.isfile(path):
|
| 88 |
-
raise RuntimeError(f"5D tensor fix file already exists! {path}")
|
| 89 |
-
fsd = {key: torch.from_numpy(data)}
|
| 90 |
-
tqdm.write(f"5D key found in state dict! Manual fix required! - {key} {data.shape}")
|
| 91 |
-
save_file(fsd, path)
|
| 92 |
-
-class ModelWan(ModelHyVid):
-    arch = "wan"
-    keys_detect = [
-        (
-            "blocks.0.self_attn.norm_q.weight",
-            "text_embedding.2.weight",
-            "head.modulation",
-        )
-    ]
-    keys_hiprec = [
-        ".modulation" # nn.parameter, can't load from BF16 ver
-    ]
-
-class ModelLTXV(ModelTemplate):
-    arch = "ltxv"
-    keys_detect = [
-        (
-            "adaln_single.emb.timestep_embedder.linear_2.weight",
-            "transformer_blocks.27.scale_shift_table",
-            "caption_projection.linear_2.weight",
-        )
-    ]
-    keys_hiprec = [
-        "scale_shift_table" # nn.parameter, can't load from BF16 base quant
-    ]
-
-class ModelSDXL(ModelTemplate):
-    arch = "sdxl"
-    shape_fix = True
-    keys_detect = [
-        ("down_blocks.0.downsamplers.0.conv.weight", "add_embedding.linear_1.weight",),
-        (
-            "input_blocks.3.0.op.weight", "input_blocks.6.0.op.weight",
-            "output_blocks.2.2.conv.weight", "output_blocks.5.2.conv.weight",
-        ), # Non-diffusers
-        ("label_emb.0.0.weight",),
-    ]
-
-class ModelSD1(ModelTemplate):
-    arch = "sd1"
-    shape_fix = True
-    keys_detect = [
-        ("down_blocks.0.downsamplers.0.conv.weight",),
-        (
-            "input_blocks.3.0.op.weight", "input_blocks.6.0.op.weight", "input_blocks.9.0.op.weight",
-            "output_blocks.2.1.conv.weight", "output_blocks.5.2.conv.weight", "output_blocks.8.2.conv.weight"
-        ), # Non-diffusers
-    ]
-
-class ModelLumina2(ModelTemplate):
-    arch = "lumina2"
-    keys_detect = [
-        ("cap_embedder.1.weight", "context_refiner.0.attention.qkv.weight")
-    ]
-
-arch_list = [ModelFlux, ModelSD3, ModelAura, ModelHiDream, CosmosPredict2,
-            ModelLTXV, ModelHyVid, ModelWan, ModelSDXL, ModelSD1, ModelLumina2]
-
-def is_model_arch(model, state_dict):
-    # check if model is correct
-    matched = False
-    invalid = False
-    for match_list in model.keys_detect:
-        if all(key in state_dict for key in match_list):
-            matched = True
-            invalid = any(key in state_dict for key in model.keys_banned)
-            break
-    assert not invalid, "Model architecture not allowed for conversion! (i.e. reference VS diffusers format)"
-    return matched
-
-def detect_arch(state_dict):
-    model_arch = None
-    for arch in arch_list:
-        if is_model_arch(arch, state_dict):
-            model_arch = arch()
-            break
-    assert model_arch is not None, "Unknown model architecture!"
-    return model_arch
-
-def parse_args():
-    parser = argparse.ArgumentParser(description="Generate F16 GGUF files from single UNET")
-    parser.add_argument("--src", required=True, help="Source model ckpt file.")
-    parser.add_argument("--dst", help="Output unet gguf file.")
-    args = parser.parse_args()
-
-    if not os.path.isfile(args.src):
-        parser.error("No input provided!")
-
-    return args
-
-def strip_prefix(state_dict):
-    # prefix for mixed state dict
-    prefix = None
-    for pfx in ["model.diffusion_model.", "model."]:
-        if any([x.startswith(pfx) for x in state_dict.keys()]):
-            prefix = pfx
-            break
-
-    # prefix for uniform state dict
-    if prefix is None:
-        for pfx in ["net."]:
-            if all([x.startswith(pfx) for x in state_dict.keys()]):
-                prefix = pfx
-                break
-
-    # strip prefix if found
-    if prefix is not None:
-        logging.info(f"State dict prefix found: '{prefix}'")
-        sd = {}
-        for k, v in state_dict.items():
-            if prefix not in k:
-                continue
-            k = k.replace(prefix, "")
-            sd[k] = v
-    else:
-        logging.debug("State dict has no prefix")
-        sd = state_dict
-
-    return sd
-
-def load_state_dict(path):
-    if any(path.endswith(x) for x in [".ckpt", ".pt", ".bin", ".pth"]):
-        state_dict = torch.load(path, map_location="cpu", weights_only=True)
-        for subkey in ["model", "module"]:
-            if subkey in state_dict:
-                state_dict = state_dict[subkey]
-                break
-        if len(state_dict) < 20:
-            raise RuntimeError(f"pt subkey load failed: {state_dict.keys()}")
-    else:
-        state_dict = load_file(path)
-
-    return strip_prefix(state_dict)
-
-def handle_tensors(writer, state_dict, model_arch):
-    name_lengths = tuple(sorted(
-        ((key, len(key)) for key in state_dict.keys()),
-        key=lambda item: item[1],
-        reverse=True,
-    ))
-    if not name_lengths:
-        return
-    max_name_len = name_lengths[0][1]
-    if max_name_len > MAX_TENSOR_NAME_LENGTH:
-        bad_list = ", ".join(f"{key!r} ({namelen})" for key, namelen in name_lengths if namelen > MAX_TENSOR_NAME_LENGTH)
-        raise ValueError(f"Can only handle tensor names up to {MAX_TENSOR_NAME_LENGTH} characters. Tensors exceeding the limit: {bad_list}")
-    for key, data in tqdm(state_dict.items()):
-        old_dtype = data.dtype
-
-        if any(x in key for x in model_arch.keys_ignore):
-            tqdm.write(f"Filtering ignored key: '{key}'")
-            continue
-
-        if data.dtype == torch.bfloat16:
-            data = data.to(torch.float32).numpy()
-        # this is so we don't break torch 2.0.X
-        elif data.dtype in [getattr(torch, "float8_e4m3fn", "_invalid"), getattr(torch, "float8_e5m2", "_invalid")]:
-            data = data.to(torch.float16).numpy()
-        else:
-            data = data.numpy()
-
-        n_dims = len(data.shape)
-        data_shape = data.shape
-        if old_dtype == torch.bfloat16:
-            data_qtype = gguf.GGMLQuantizationType.BF16
-        # elif old_dtype == torch.float32:
-        # data_qtype = gguf.GGMLQuantizationType.F32
-        else:
-            data_qtype = gguf.GGMLQuantizationType.F16
-
-        # The max no. of dimensions that can be handled by the quantization code is 4
-        if len(data.shape) > MAX_TENSOR_DIMS:
-            model_arch.handle_nd_tensor(key, data)
-            continue # needs to be added back later
-
-        # get number of parameters (AKA elements) in this tensor
-        n_params = 1
-        for dim_size in data_shape:
-            n_params *= dim_size
-
-        if old_dtype in (torch.float32, torch.bfloat16):
-            if n_dims == 1:
-                # one-dimensional tensors should be kept in F32
-                # also speeds up inference due to not dequantizing
-                data_qtype = gguf.GGMLQuantizationType.F32
-
-            elif n_params <= QUANTIZATION_THRESHOLD:
-                # very small tensors
-                data_qtype = gguf.GGMLQuantizationType.F32
-
-            elif any(x in key for x in model_arch.keys_hiprec):
-                # tensors that require max precision
-                data_qtype = gguf.GGMLQuantizationType.F32
-
-        if (model_arch.shape_fix # NEVER reshape for models such as flux
-            and n_dims > 1 # Skip one-dimensional tensors
-            and n_params >= REARRANGE_THRESHOLD # Only rearrange tensors meeting the size requirement
-            and (n_params / 256).is_integer() # Rearranging only makes sense if total elements is divisible by 256
-            and not (data.shape[-1] / 256).is_integer() # Only need to rearrange if the last dimension is not divisible by 256
-        ):
-            orig_shape = data.shape
-            data = data.reshape(n_params // 256, 256)
-            writer.add_array(f"comfy.gguf.orig_shape.{key}", tuple(int(dim) for dim in orig_shape))
-
-        try:
-            data = gguf.quants.quantize(data, data_qtype)
-        except (AttributeError, gguf.QuantError) as e:
-            tqdm.write(f"falling back to F16: {e}")
-            data_qtype = gguf.GGMLQuantizationType.F16
-            data = gguf.quants.quantize(data, data_qtype)
-
-        new_name = key # do we need to rename?
-
-        shape_str = f"{{{', '.join(str(n) for n in reversed(data.shape))}}}"
-        tqdm.write(f"{f'%-{max_name_len + 4}s' % f'{new_name}'} {old_dtype} --> {data_qtype.name}, shape = {shape_str}")
-
-        writer.add_tensor(new_name, data, raw_dtype=data_qtype)
-
-def convert_file(path, dst_path=None, interact=True, overwrite=False):
-    # load & run model detection logic
-    state_dict = load_state_dict(path)
-    model_arch = detect_arch(state_dict)
-    logging.info(f"* Architecture detected from input: {model_arch.arch}")
-
-    # detect & set dtype for output file
-    dtypes = [x.dtype for x in state_dict.values()]
-    dtypes = {x:dtypes.count(x) for x in set(dtypes)}
-    main_dtype = max(dtypes, key=dtypes.get)
-
-    if main_dtype == torch.bfloat16:
-        ftype_name = "BF16"
-        ftype_gguf = gguf.LlamaFileType.MOSTLY_BF16
-    # elif main_dtype == torch.float32:
-    # ftype_name = "F32"
-    # ftype_gguf = None
-    else:
-        ftype_name = "F16"
-        ftype_gguf = gguf.LlamaFileType.MOSTLY_F16
-
-    if dst_path is None:
-        dst_path = f"{os.path.splitext(path)[0]}-{ftype_name}.gguf"
-    elif "{ftype}" in dst_path: # lcpp logic
-        dst_path = dst_path.replace("{ftype}", ftype_name)
-
-    if os.path.isfile(dst_path) and not overwrite:
-        if interact:
-            input("Output exists enter to continue or ctrl+c to abort!")
-        else:
-            raise OSError("Output exists and overwriting is disabled!")
-
-    # handle actual file
-    writer = gguf.GGUFWriter(path=None, arch=model_arch.arch)
-    writer.add_quantization_version(gguf.GGML_QUANT_VERSION)
-    if ftype_gguf is not None:
-        writer.add_file_type(ftype_gguf)
-
-    handle_tensors(writer, state_dict, model_arch)
-    writer.write_header_to_file(path=dst_path)
-    writer.write_kv_data_to_file()
-    writer.write_tensors_to_file(progress=True)
-    writer.close()
-
-    fix = f"./fix_5d_tensors_{model_arch.arch}.safetensors"
-    if os.path.isfile(fix):
-        logging.warning(f"\n### Warning! Fix file found at '{fix}'")
-        logging.warning(" you most likely need to run 'fix_5d_tensors.py' after quantization.")
-
-    return dst_path, model_arch
-
-if __name__ == "__main__":
-    args = parse_args()
-    convert_file(args.src, args.dst)
-
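Reviewer note on the removed convert.py: the reshape condition in handle_tensors is the least obvious part of the file, so here is a minimal standalone sketch of it. The function names and the threshold value below are assumptions made for illustration; the real REARRANGE_THRESHOLD is defined earlier in convert.py and is not visible in this part of the diff.

```python
from math import prod

# Assumed value for illustration only; the real constant lives earlier in convert.py.
REARRANGE_THRESHOLD = 512
BLOCK = 256  # row width used by the reshape in handle_tensors


def should_rearrange(shape, shape_fix=True, threshold=REARRANGE_THRESHOLD):
    """Mirror the five-part condition that guards the reshape in handle_tensors."""
    n_params = prod(shape)
    return (
        shape_fix                      # only architectures that opt in (sd1, sdxl)
        and len(shape) > 1             # skip one-dimensional tensors
        and n_params >= threshold      # skip small tensors
        and n_params % BLOCK == 0      # total element count must split into rows of 256
        and shape[-1] % BLOCK != 0     # last dimension is not already a multiple of 256
    )


def rearranged_shape(shape):
    """Shape the tensor is flattened to when should_rearrange() is true."""
    return (prod(shape) // BLOCK, BLOCK)
```

For a 3x3 convolution weight of shape (320, 320, 3, 3) the last dimension is 3, so the tensor would be flattened to rows of 256 elements, and the original shape is what the converter records under the comfy.gguf.orig_shape.<key> metadata entry.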
ComfyUI/custom_nodes/ComfyUI-GGUF/tools/fix_5d_tensors.py
DELETED
@@ -1,82 +0,0 @@
-# (c) City96 || Apache-2.0 (apache.org/licenses/LICENSE-2.0)
-import os
-import gguf
-import torch
-import argparse
-from tqdm import tqdm
-from safetensors.torch import load_file
-
-def get_args():
-    parser = argparse.ArgumentParser()
-    parser.add_argument("--src", required=True)
-    parser.add_argument("--dst", required=True)
-    parser.add_argument("--fix", required=False, help="Defaults to ./fix_5d_tensors_[arch].pt")
-    parser.add_argument("--overwrite", action="store_true")
-    args = parser.parse_args()
-
-    if not os.path.isfile(args.src):
-        parser.error(f"Invalid source file '{args.src}'")
-    if not args.overwrite and os.path.exists(args.dst):
-        parser.error(f"Output exists, use '--overwrite' ({args.dst})")
-
-    return args
-
-def get_arch_str(reader):
-    field = reader.get_field("general.architecture")
-    return str(field.parts[field.data[-1]], encoding="utf-8")
-
-def get_file_type(reader):
-    field = reader.get_field("general.file_type")
-    ft = int(field.parts[field.data[-1]])
-    return gguf.LlamaFileType(ft)
-
-if __name__ == "__main__":
-    args = get_args()
-
-    # read existing
-    reader = gguf.GGUFReader(args.src)
-    arch = get_arch_str(reader)
-    file_type = get_file_type(reader)
-    print(f"Detected arch: '{arch}' (ftype: {str(file_type)})")
-
-    # prep fix
-    if args.fix is None:
-        args.fix = f"./fix_5d_tensors_{arch}.safetensors"
-
-    if not os.path.isfile(args.fix):
-        raise OSError(f"No 5D tensor fix file: {args.fix}")
-
-    sd5d = load_file(args.fix)
-    sd5d = {k:v.numpy() for k,v in sd5d.items()}
-    print("5D tensors:", sd5d.keys())
-
-    # prep output
-    writer = gguf.GGUFWriter(path=None, arch=arch)
-    writer.add_quantization_version(gguf.GGML_QUANT_VERSION)
-    writer.add_file_type(file_type)
-
-    added = []
-    def add_extra_key(writer, key, data):
-        global added
-        data_qtype = gguf.GGMLQuantizationType.F32
-        data = gguf.quants.quantize(data, data_qtype)
-        tqdm.write(f"Adding key {key} ({data.shape})")
-        writer.add_tensor(key, data, raw_dtype=data_qtype)
-        added.append(key)
-
-    # main loop to add missing 5D tensor(s)
-    for tensor in tqdm(reader.tensors):
-        writer.add_tensor(tensor.name, tensor.data, raw_dtype=tensor.tensor_type)
-        key5d = tensor.name.replace(".bias", ".weight")
-        if key5d in sd5d.keys():
-            add_extra_key(writer, key5d, sd5d[key5d])
-
-    # brute force for any missed
-    for key, data in sd5d.items():
-        if key not in added:
-            add_extra_key(writer, key, data)
-
-    writer.write_header_to_file(path=args.dst)
-    writer.write_kv_data_to_file()
-    writer.write_tensors_to_file(progress=True)
-    writer.close()
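Reviewer note on the removed fix_5d_tensors.py: the script re-inserts each saved 5D weight right after the tensor whose name matches once ".bias" is swapped for ".weight", then appends anything still missing. The sketch below reproduces only that ordering on plain name lists; merge_order is a hypothetical helper written for this note and is not part of the original tool.

```python
def merge_order(existing_names, extra_names):
    """Return the tensor order the script would write.

    existing_names: names already present in the quantized GGUF file.
    extra_names: names of the 5D tensors stored in the fix file.
    """
    extra = list(extra_names)
    out, added = [], []
    for name in existing_names:
        out.append(name)
        # A saved weight is placed directly after its matching bias tensor.
        key = name.replace(".bias", ".weight")
        if key in extra:
            out.append(key)
            added.append(key)
    # Anything that found no matching neighbour is appended at the end.
    for key in extra:
        if key not in added:
            out.append(key)
            added.append(key)
    return out
```

With existing names ["a.weight", "img_in.proj.bias", "b.weight"] and one saved tensor "img_in.proj.weight", the saved tensor lands between the bias and "b.weight"; a saved tensor with no matching bias ends up last.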
ComfyUI/custom_nodes/ComfyUI-GGUF/tools/fix_lines_ending.py
DELETED
@@ -1,31 +0,0 @@
-import os
-
-files = ["lcpp.patch", "lcpp_sd3.patch"]
-
-def has_unix_line_endings(file_path):
-    try:
-        with open(file_path, 'rb') as file:
-            content = file.read()
-            return b'\r\n' not in content
-    except Exception as e:
-        print(f"Error checking '{file_path}': {e}")
-        return False
-
-def convert_to_linux_format(file_path):
-    try:
-        with open(file_path, 'rb') as file:
-            content = file.read().replace(b'\r\n', b'\n')
-        with open(file_path, 'wb') as file:
-            file.write(content)
-        print(f"'{file_path}' converted to Linux line endings (LF).")
-    except Exception as e:
-        print(f"Error processing '{file_path}': {e}")
-
-for file in files:
-    if os.path.exists(file):
-        if has_unix_line_endings(file):
-            print(f"'{file}' already has Unix line endings (LF). No conversion needed.")
-        else:
-            convert_to_linux_format(file)
-    else:
-        print(f"File '{file}' does not exist.")
ComfyUI/custom_nodes/ComfyUI-GGUF/tools/lcpp.patch
DELETED
@@ -1,451 +0,0 @@
-diff --git a/ggml/include/ggml.h b/ggml/include/ggml.h
-index de3c706f..0267c1fa 100644
---- a/ggml/include/ggml.h
-+++ b/ggml/include/ggml.h
-@@ -223,7 +223,7 @@
- #define GGML_MAX_OP_PARAMS 64
-
- #ifndef GGML_MAX_NAME
--# define GGML_MAX_NAME 64
-+# define GGML_MAX_NAME 128
- #endif
-
- #define GGML_DEFAULT_N_THREADS 4
-@@ -2449,6 +2449,7 @@ extern "C" {
-
-     // manage tensor info
-     GGML_API void gguf_add_tensor(struct gguf_context * ctx, const struct ggml_tensor * tensor);
-+    GGML_API void gguf_set_tensor_ndim(struct gguf_context * ctx, const char * name, int n_dim);
-     GGML_API void gguf_set_tensor_type(struct gguf_context * ctx, const char * name, enum ggml_type type);
-     GGML_API void gguf_set_tensor_data(struct gguf_context * ctx, const char * name, const void * data, size_t size);
-
-diff --git a/ggml/src/ggml.c b/ggml/src/ggml.c
-index b16c462f..6d1568f1 100644
---- a/ggml/src/ggml.c
-+++ b/ggml/src/ggml.c
-@@ -22960,6 +22960,14 @@ void gguf_add_tensor(
-     ctx->header.n_tensors++;
- }
-
-+void gguf_set_tensor_ndim(struct gguf_context * ctx, const char * name, const int n_dim) {
-+    const int idx = gguf_find_tensor(ctx, name);
-+    if (idx < 0) {
-+        GGML_ABORT("tensor not found");
-+    }
-+    ctx->infos[idx].n_dims = n_dim;
-+}
-+
- void gguf_set_tensor_type(struct gguf_context * ctx, const char * name, enum ggml_type type) {
-     const int idx = gguf_find_tensor(ctx, name);
-     if (idx < 0) {
-diff --git a/src/llama.cpp b/src/llama.cpp
-index 24e1f1f0..25db4c69 100644
---- a/src/llama.cpp
-+++ b/src/llama.cpp
-@@ -205,6 +205,17 @@ enum llm_arch {
-     LLM_ARCH_GRANITE,
-     LLM_ARCH_GRANITE_MOE,
-     LLM_ARCH_CHAMELEON,
-+    LLM_ARCH_FLUX,
-+    LLM_ARCH_SD1,
-+    LLM_ARCH_SDXL,
-+    LLM_ARCH_SD3,
-+    LLM_ARCH_AURA,
-+    LLM_ARCH_LTXV,
-+    LLM_ARCH_HYVID,
-+    LLM_ARCH_WAN,
-+    LLM_ARCH_HIDREAM,
-+    LLM_ARCH_COSMOS,
-+    LLM_ARCH_LUMINA2,
-     LLM_ARCH_UNKNOWN,
- };
-
-@@ -258,6 +269,17 @@ static const std::map<llm_arch, const char *> LLM_ARCH_NAMES = {
-     { LLM_ARCH_GRANITE, "granite" },
-     { LLM_ARCH_GRANITE_MOE, "granitemoe" },
-     { LLM_ARCH_CHAMELEON, "chameleon" },
-+    { LLM_ARCH_FLUX, "flux" },
-+    { LLM_ARCH_SD1, "sd1" },
-+    { LLM_ARCH_SDXL, "sdxl" },
-+    { LLM_ARCH_SD3, "sd3" },
-+    { LLM_ARCH_AURA, "aura" },
-+    { LLM_ARCH_LTXV, "ltxv" },
-+    { LLM_ARCH_HYVID, "hyvid" },
-+    { LLM_ARCH_WAN, "wan" },
-+    { LLM_ARCH_HIDREAM, "hidream" },
-+    { LLM_ARCH_COSMOS, "cosmos" },
-+    { LLM_ARCH_LUMINA2, "lumina2" },
-     { LLM_ARCH_UNKNOWN, "(unknown)" },
- };
-
-@@ -1531,6 +1553,17 @@ static const std::map<llm_arch, std::map<llm_tensor, const char *>> LLM_TENSOR_N
-             { LLM_TENSOR_ATTN_K_NORM, "blk.%d.attn_k_norm" },
-         },
-     },
-+    { LLM_ARCH_FLUX, {}},
-+    { LLM_ARCH_SD1, {}},
-+    { LLM_ARCH_SDXL, {}},
-+    { LLM_ARCH_SD3, {}},
-+    { LLM_ARCH_AURA, {}},
-+    { LLM_ARCH_LTXV, {}},
-+    { LLM_ARCH_HYVID, {}},
-+    { LLM_ARCH_WAN, {}},
-+    { LLM_ARCH_HIDREAM, {}},
-+    { LLM_ARCH_COSMOS, {}},
-+    { LLM_ARCH_LUMINA2, {}},
-     {
-         LLM_ARCH_UNKNOWN,
-         {
-@@ -5403,6 +5436,25 @@ static void llm_load_hparams(
-     // get general kv
-     ml.get_key(LLM_KV_GENERAL_NAME, model.name, false);
-
-+    // Disable LLM metadata for image models
-+    switch (model.arch) {
-+        case LLM_ARCH_FLUX:
-+        case LLM_ARCH_SD1:
-+        case LLM_ARCH_SDXL:
-+        case LLM_ARCH_SD3:
-+        case LLM_ARCH_AURA:
-+        case LLM_ARCH_LTXV:
-+        case LLM_ARCH_HYVID:
-+        case LLM_ARCH_WAN:
-+        case LLM_ARCH_HIDREAM:
-+        case LLM_ARCH_COSMOS:
-+        case LLM_ARCH_LUMINA2:
-+            model.ftype = ml.ftype;
-+            return;
-+        default:
-+            break;
-+    }
-+
-     // get hparams kv
-     ml.get_key(LLM_KV_VOCAB_SIZE, hparams.n_vocab, false) || ml.get_arr_n(LLM_KV_TOKENIZER_LIST, hparams.n_vocab);
-
-@@ -18016,6 +18068,134 @@ static void llama_tensor_dequantize_internal(
-     workers.clear();
- }
-
-+static ggml_type img_tensor_get_type(quantize_state_internal & qs, ggml_type new_type, const ggml_tensor * tensor, llama_ftype ftype) {
-+    // Special function for quantizing image model tensors
-+    const std::string name = ggml_get_name(tensor);
-+    const llm_arch arch = qs.model.arch;
-+
-+    // Sanity check
-+    if (
-+        (name.find("model.diffusion_model.") != std::string::npos) ||
-+        (name.find("first_stage_model.") != std::string::npos) ||
-+        (name.find("single_transformer_blocks.") != std::string::npos) ||
-+        (name.find("joint_transformer_blocks.") != std::string::npos)
-+    ) {
-+        throw std::runtime_error("Invalid input GGUF file. This is not a supported UNET model");
-+    }
-+
-+    // Unsupported quant types - exclude all IQ quants for now
-+    if (ftype == LLAMA_FTYPE_MOSTLY_IQ2_XXS || ftype == LLAMA_FTYPE_MOSTLY_IQ2_XS ||
-+        ftype == LLAMA_FTYPE_MOSTLY_IQ2_S || ftype == LLAMA_FTYPE_MOSTLY_IQ2_M ||
-+        ftype == LLAMA_FTYPE_MOSTLY_IQ3_XXS || ftype == LLAMA_FTYPE_MOSTLY_IQ1_S ||
-+        ftype == LLAMA_FTYPE_MOSTLY_IQ1_M || ftype == LLAMA_FTYPE_MOSTLY_IQ4_NL ||
-+        ftype == LLAMA_FTYPE_MOSTLY_IQ4_XS || ftype == LLAMA_FTYPE_MOSTLY_IQ3_S ||
-+        ftype == LLAMA_FTYPE_MOSTLY_IQ3_M || ftype == LLAMA_FTYPE_MOSTLY_Q4_0_4_4 ||
-+        ftype == LLAMA_FTYPE_MOSTLY_Q4_0_4_8 || ftype == LLAMA_FTYPE_MOSTLY_Q4_0_8_8) {
-+        throw std::runtime_error("Invalid quantization type for image model (Not supported)");
-+    }
-+
-+    if ( // Rules for to_v attention
-+        (name.find("attn_v.weight") != std::string::npos) ||
-+        (name.find(".to_v.weight") != std::string::npos) ||
-+        (name.find(".v.weight") != std::string::npos) ||
-+        (name.find(".attn.w1v.weight") != std::string::npos) ||
-+        (name.find(".attn.w2v.weight") != std::string::npos) ||
-+        (name.find("_attn.v_proj.weight") != std::string::npos)
-+    ){
-+        if (ftype == LLAMA_FTYPE_MOSTLY_Q2_K) {
-+            new_type = GGML_TYPE_Q3_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q3_K_M) {
-+            new_type = qs.i_attention_wv < 2 ? GGML_TYPE_Q5_K : GGML_TYPE_Q4_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q3_K_L) {
-+            new_type = GGML_TYPE_Q5_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q4_K_M || ftype == LLAMA_FTYPE_MOSTLY_Q5_K_M) {
-+            new_type = GGML_TYPE_Q6_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q4_K_S && qs.i_attention_wv < 4) {
-+            new_type = GGML_TYPE_Q5_K;
-+        }
-+        ++qs.i_attention_wv;
-+    } else if ( // Rules for fused qkv attention
-+        (name.find("attn_qkv.weight") != std::string::npos) ||
-+        (name.find("attn.qkv.weight") != std::string::npos) ||
-+        (name.find("attention.qkv.weight") != std::string::npos)
-+    ) {
-+        if (ftype == LLAMA_FTYPE_MOSTLY_Q3_K_M || ftype == LLAMA_FTYPE_MOSTLY_Q3_K_L) {
-+            new_type = GGML_TYPE_Q4_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q4_K_M) {
-+            new_type = GGML_TYPE_Q5_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q5_K_M) {
-+            new_type = GGML_TYPE_Q6_K;
-+        }
-+    } else if ( // Rules for ffn
-+        (name.find("ffn_down") != std::string::npos) ||
-+        ((name.find("experts.") != std::string::npos) && (name.find(".w2.weight") != std::string::npos)) ||
-+        (name.find(".ffn.2.weight") != std::string::npos) || // is this even the right way around?
-+        (name.find(".ff.net.2.weight") != std::string::npos) ||
-+        (name.find(".mlp.layer2.weight") != std::string::npos) ||
-+        (name.find(".adaln_modulation_mlp.2.weight") != std::string::npos) ||
-+        (name.find(".feed_forward.w2.weight") != std::string::npos)
-+    ) {
-+        // TODO: add back `layer_info` with some model specific logic + logic further down
-+        if (ftype == LLAMA_FTYPE_MOSTLY_Q3_K_M) {
-+            new_type = GGML_TYPE_Q4_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q3_K_L) {
-+            new_type = GGML_TYPE_Q5_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q4_K_S) {
-+            new_type = GGML_TYPE_Q5_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q4_K_M) {
-+            new_type = GGML_TYPE_Q6_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q5_K_M) {
-+            new_type = GGML_TYPE_Q6_K;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q4_0) {
-+            new_type = GGML_TYPE_Q4_1;
-+        }
-+        else if (ftype == LLAMA_FTYPE_MOSTLY_Q5_0) {
-+            new_type = GGML_TYPE_Q5_1;
-+        }
-+        ++qs.i_ffn_down;
-+    }
-+
-+    // Sanity check for row shape
-+    bool convert_incompatible_tensor = false;
-+    if (new_type == GGML_TYPE_Q2_K || new_type == GGML_TYPE_Q3_K || new_type == GGML_TYPE_Q4_K ||
-+        new_type == GGML_TYPE_Q5_K || new_type == GGML_TYPE_Q6_K) {
-+        int nx = tensor->ne[0];
-+        int ny = tensor->ne[1];
-+        if (nx % QK_K != 0) {
-+            LLAMA_LOG_WARN("\n\n%s : tensor cols %d x %d are not divisible by %d, required for %s", __func__, nx, ny, QK_K, ggml_type_name(new_type));
-+            convert_incompatible_tensor = true;
-+        } else {
-+            ++qs.n_k_quantized;
-+        }
-+    }
-+    if (convert_incompatible_tensor) {
-+        // TODO: Possibly reenable this in the future
-+        // switch (new_type) {
-+        // case GGML_TYPE_Q2_K:
-+        // case GGML_TYPE_Q3_K:
-+        // case GGML_TYPE_Q4_K: new_type = GGML_TYPE_Q5_0; break;
-+        // case GGML_TYPE_Q5_K: new_type = GGML_TYPE_Q5_1; break;
-+        // case GGML_TYPE_Q6_K: new_type = GGML_TYPE_Q8_0; break;
-+        // default: throw std::runtime_error("\nUnsupported tensor size encountered\n");
-+        // }
-+        new_type = GGML_TYPE_F16;
-+        LLAMA_LOG_WARN(" - using fallback quantization %s\n", ggml_type_name(new_type));
-+        ++qs.n_fallback;
-+    }
-+    return new_type;
-+}
-+
- static ggml_type llama_tensor_get_type(quantize_state_internal & qs, ggml_type new_type, const ggml_tensor * tensor, llama_ftype ftype) {
-     const std::string name = ggml_get_name(tensor);
-
-@@ -18513,7 +18693,9 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
-         if (llama_model_has_encoder(&model)) {
-             n_attn_layer *= 3;
-         }
--        GGML_ASSERT((qs.n_attention_wv == n_attn_layer) && "n_attention_wv is unexpected");
-+        if (model.arch != LLM_ARCH_HYVID) { // TODO: Check why this fails
-+            GGML_ASSERT((qs.n_attention_wv == n_attn_layer) && "n_attention_wv is unexpected");
-+        }
-     }
-
-     size_t total_size_org = 0;
-@@ -18547,6 +18729,51 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
-             ctx_outs[i_split] = gguf_init_empty();
-         }
-         gguf_add_tensor(ctx_outs[i_split], tensor);
-+        // SD3 pos_embed needs special fix as first dim is 1, which gets truncated here
-+        if (model.arch == LLM_ARCH_SD3) {
-+            const std::string name = ggml_get_name(tensor);
-+            if (name == "pos_embed" && tensor->ne[2] == 1) {
-+                const int n_dim = 3;
-+                gguf_set_tensor_ndim(ctx_outs[i_split], "pos_embed", n_dim);
-+                LLAMA_LOG_INFO("\n%s: Correcting pos_embed shape for SD3: [key:%s]\n", __func__, tensor->name);
-+            }
-+        }
-+        // same goes for auraflow
-+        if (model.arch == LLM_ARCH_AURA) {
-+            const std::string name = ggml_get_name(tensor);
-+            if (name == "positional_encoding" && tensor->ne[2] == 1) {
-+                const int n_dim = 3;
-+                gguf_set_tensor_ndim(ctx_outs[i_split], "positional_encoding", n_dim);
-+                LLAMA_LOG_INFO("\n%s: Correcting positional_encoding shape for AuraFlow: [key:%s]\n", __func__, tensor->name);
-+            }
-+            if (name == "register_tokens" && tensor->ne[2] == 1) {
-+                const int n_dim = 3;
-+                gguf_set_tensor_ndim(ctx_outs[i_split], "register_tokens", n_dim);
-+                LLAMA_LOG_INFO("\n%s: Correcting register_tokens shape for AuraFlow: [key:%s]\n", __func__, tensor->name);
-+            }
-+        }
-+        // conv3d fails due to max dims - unsure what to do here as we never even reach this check
-+        if (model.arch == LLM_ARCH_HYVID) {
-+            const std::string name = ggml_get_name(tensor);
-+            if (name == "img_in.proj.weight" && tensor->ne[5] != 1 ) {
-+                throw std::runtime_error("img_in.proj.weight size failed for HyVid");
-+            }
-+        }
-+        // All the modulation layers also have dim1, and I think conv3d fails here too but we segfaul way before that...
-+        if (model.arch == LLM_ARCH_WAN) {
-+            const std::string name = ggml_get_name(tensor);
-+            if (name.find(".modulation") != std::string::npos && tensor->ne[2] == 1) {
-+                const int n_dim = 3;
-+                gguf_set_tensor_ndim(ctx_outs[i_split], tensor->name, n_dim);
-+                LLAMA_LOG_INFO("\n%s: Correcting shape for Wan: [key:%s]\n", __func__, tensor->name);
-+            }
-+            // FLF2V model only
-+            if (name == "img_emb.emb_pos") {
-+                const int n_dim = 3;
-+                gguf_set_tensor_ndim(ctx_outs[i_split], tensor->name, n_dim);
-+                LLAMA_LOG_INFO("\n%s: Correcting shape for Wan FLF2V: [key:%s]\n", __func__, tensor->name);
-+            }
-+        }
-     }
-
-     // Set split info if needed
|
| 323 |
-
@@ -18647,6 +18874,110 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
|
| 324 |
-
// do not quantize relative position bias (T5)
|
| 325 |
-
quantize &= name.find("attn_rel_b.weight") == std::string::npos;
|
| 326 |
-
|
| 327 |
-
+ // rules for image models
|
| 328 |
-
+ bool image_model = false;
|
| 329 |
-
+ if (model.arch == LLM_ARCH_FLUX) {
|
| 330 |
-
+ image_model = true;
|
| 331 |
-
+ quantize &= name.find("txt_in.") == std::string::npos;
|
| 332 |
-
+ quantize &= name.find("img_in.") == std::string::npos;
|
| 333 |
-
+ quantize &= name.find("time_in.") == std::string::npos;
|
| 334 |
-
+ quantize &= name.find("vector_in.") == std::string::npos;
|
| 335 |
-
+ quantize &= name.find("guidance_in.") == std::string::npos;
|
| 336 |
-
+ quantize &= name.find("final_layer.") == std::string::npos;
|
| 337 |
-
+ }
|
| 338 |
-
+ if (model.arch == LLM_ARCH_SD1 || model.arch == LLM_ARCH_SDXL) {
|
| 339 |
-
+ image_model = true;
|
| 340 |
-
+ quantize &= name.find("class_embedding.") == std::string::npos;
|
| 341 |
-
+ quantize &= name.find("time_embedding.") == std::string::npos;
|
| 342 |
-
+ quantize &= name.find("add_embedding.") == std::string::npos;
|
| 343 |
-
+ quantize &= name.find("time_embed.") == std::string::npos;
|
| 344 |
-
+ quantize &= name.find("label_emb.") == std::string::npos;
|
| 345 |
-
+ quantize &= name.find("conv_in.") == std::string::npos;
|
| 346 |
-
+ quantize &= name.find("conv_out.") == std::string::npos;
|
| 347 |
-
+ quantize &= name != "input_blocks.0.0.weight";
|
| 348 |
-
+ quantize &= name != "out.2.weight";
|
| 349 |
-
+ }
|
| 350 |
-
+ if (model.arch == LLM_ARCH_SD3) {
|
| 351 |
-
+ image_model = true;
|
| 352 |
-
+ quantize &= name.find("final_layer.") == std::string::npos;
|
| 353 |
-
+ quantize &= name.find("time_text_embed.") == std::string::npos;
|
| 354 |
-
+ quantize &= name.find("context_embedder.") == std::string::npos;
|
| 355 |
-
+ quantize &= name.find("t_embedder.") == std::string::npos;
|
| 356 |
-
+ quantize &= name.find("y_embedder.") == std::string::npos;
|
| 357 |
-
+ quantize &= name.find("x_embedder.") == std::string::npos;
|
| 358 |
-
+ quantize &= name != "proj_out.weight";
|
| 359 |
-
+ quantize &= name != "pos_embed";
|
| 360 |
-
+ }
|
| 361 |
-
+ if (model.arch == LLM_ARCH_AURA) {
|
| 362 |
-
+ image_model = true;
|
| 363 |
-
+ quantize &= name.find("t_embedder.") == std::string::npos;
|
| 364 |
-
+ quantize &= name.find("init_x_linear.") == std::string::npos;
|
| 365 |
-
+ quantize &= name != "modF.1.weight";
|
| 366 |
-
+ quantize &= name != "cond_seq_linear.weight";
|
| 367 |
-
+ quantize &= name != "final_linear.weight";
|
| 368 |
-
+ quantize &= name != "final_linear.weight";
|
| 369 |
-
+ quantize &= name != "positional_encoding";
|
| 370 |
-
+ quantize &= name != "register_tokens";
|
| 371 |
-
+ }
|
| 372 |
-
+ if (model.arch == LLM_ARCH_LTXV) {
|
| 373 |
-
+ image_model = true;
|
| 374 |
-
+ quantize &= name.find("adaln_single.") == std::string::npos;
|
| 375 |
-
+ quantize &= name.find("caption_projection.") == std::string::npos;
|
| 376 |
-
+ quantize &= name.find("patchify_proj.") == std::string::npos;
|
| 377 |
-
+ quantize &= name.find("proj_out.") == std::string::npos;
|
| 378 |
-
+ quantize &= name.find("scale_shift_table") == std::string::npos; // last block too
|
| 379 |
-
+ }
|
| 380 |
-
+ if (model.arch == LLM_ARCH_HYVID) {
|
| 381 |
-
+ image_model = true;
|
| 382 |
-
+ quantize &= name.find("txt_in.") == std::string::npos;
|
| 383 |
-
+ quantize &= name.find("img_in.") == std::string::npos;
|
| 384 |
-
+ quantize &= name.find("time_in.") == std::string::npos;
|
| 385 |
-
+ quantize &= name.find("vector_in.") == std::string::npos;
|
| 386 |
-
+ quantize &= name.find("guidance_in.") == std::string::npos;
|
| 387 |
-
+ quantize &= name.find("final_layer.") == std::string::npos;
|
| 388 |
-
+ }
|
| 389 |
-
+ if (model.arch == LLM_ARCH_WAN) {
|
| 390 |
-
+ image_model = true;
|
| 391 |
-
+ quantize &= name.find("modulation.") == std::string::npos;
|
| 392 |
-
+ quantize &= name.find("patch_embedding.") == std::string::npos;
|
| 393 |
-
+ quantize &= name.find("text_embedding.") == std::string::npos;
|
| 394 |
-
+ quantize &= name.find("time_projection.") == std::string::npos;
|
| 395 |
-
+ quantize &= name.find("time_embedding.") == std::string::npos;
|
| 396 |
-
+ quantize &= name.find("img_emb.") == std::string::npos;
|
| 397 |
-
+ quantize &= name.find("head.") == std::string::npos;
|
| 398 |
-
+ }
|
| 399 |
-
+ if (model.arch == LLM_ARCH_HIDREAM) {
|
| 400 |
-
+ image_model = true;
|
| 401 |
-
+ quantize &= name.find("p_embedder.") == std::string::npos;
|
| 402 |
-
+ quantize &= name.find("t_embedder.") == std::string::npos;
|
| 403 |
-
+ quantize &= name.find("x_embedder.") == std::string::npos;
|
| 404 |
-
+ quantize &= name.find("final_layer.") == std::string::npos;
|
| 405 |
-
+ quantize &= name.find(".ff_i.gate.weight") == std::string::npos;
|
| 406 |
-
+ quantize &= name.find("caption_projection.") == std::string::npos;
|
| 407 |
-
+ }
|
| 408 |
-
+ if (model.arch == LLM_ARCH_COSMOS) {
|
| 409 |
-
+ image_model = true;
|
| 410 |
-
+ quantize &= name.find("p_embedder.") == std::string::npos;
|
| 411 |
-
+ quantize &= name.find("t_embedder.") == std::string::npos;
|
| 412 |
-
+ quantize &= name.find("t_embedding_norm.") == std::string::npos;
|
| 413 |
-
+ quantize &= name.find("x_embedder.") == std::string::npos;
|
| 414 |
-
+ quantize &= name.find("pos_embedder.") == std::string::npos;
|
| 415 |
-
+ quantize &= name.find("final_layer.") == std::string::npos;
|
| 416 |
-
+ }
|
| 417 |
-
+ if (model.arch == LLM_ARCH_LUMINA2) {
|
| 418 |
-
+ image_model = true;
|
| 419 |
-
+ quantize &= name.find("t_embedder.") == std::string::npos;
|
| 420 |
-
+ quantize &= name.find("x_embedder.") == std::string::npos;
|
| 421 |
-
+ quantize &= name.find("final_layer.") == std::string::npos;
|
| 422 |
-
+ quantize &= name.find("cap_embedder.") == std::string::npos;
|
| 423 |
-
+ quantize &= name.find("context_refiner.") == std::string::npos;
|
| 424 |
-
+ quantize &= name.find("noise_refiner.") == std::string::npos;
|
| 425 |
-
+ }
|
| 426 |
-
+ // ignore 3D/4D tensors for image models as the code was never meant to handle these
|
| 427 |
-
+ if (image_model) {
|
| 428 |
-
+ quantize &= ggml_n_dims(tensor) == 2;
|
| 429 |
-
+ }
|
| 430 |
-
+
|
| 431 |
-
enum ggml_type new_type;
|
| 432 |
-
void * new_data;
|
| 433 |
-
size_t new_size;
|
| 434 |
-
@@ -18655,6 +18986,9 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
|
| 435 |
-
new_type = default_type;
|
| 436 |
-
|
| 437 |
-
// get more optimal quantization type based on the tensor shape, layer, etc.
|
| 438 |
-
+ if (image_model) {
|
| 439 |
-
+ new_type = img_tensor_get_type(qs, new_type, tensor, ftype);
|
| 440 |
-
+ } else {
|
| 441 |
-
if (!params->pure && ggml_is_quantized(default_type)) {
|
| 442 |
-
new_type = llama_tensor_get_type(qs, new_type, tensor, ftype);
|
| 443 |
-
}
|
| 444 |
-
@@ -18664,6 +18998,7 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
|
| 445 |
-
if (params->output_tensor_type < GGML_TYPE_COUNT && strcmp(tensor->name, "output.weight") == 0) {
|
| 446 |
-
new_type = params->output_tensor_type;
|
| 447 |
-
}
|
| 448 |
-
+ }
|
| 449 |
-
|
| 450 |
-
// If we've decided to quantize to the same type the tensor is already
|
| 451 |
-
// in then there's nothing to do.
|
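The hunks above add two ideas to the quantizer: a trailing dimension of size 1 is not counted when a tensor's dimensionality is computed (hence the `gguf_set_tensor_ndim` corrections), and image-model tensors are quantized only when their name matches none of the per-architecture skip patterns and they are 2D. A minimal Python sketch of that logic, assuming shapes are given in ggml order (`ne[0]` is the fastest-varying dimension). The helper names `ggml_n_dims_py` and `should_quantize` are hypothetical and written only for illustration; the skip list is copied from the FLUX branch of the patch, and the other checks that the surrounding C++ function applies are not reproduced here.

```python
# Skip patterns copied from the LLM_ARCH_FLUX branch of the patch above.
FLUX_SKIP = (
    "txt_in.", "img_in.", "time_in.",
    "vector_in.", "guidance_in.", "final_layer.",
)

def ggml_n_dims_py(ne):
    """Count dimensions while ignoring trailing size-1 entries.

    Illustrative stand-in for ggml's dimension count: a shape such as
    [D, N, 1, 1] is reported as 2D, which is why the patch forces
    n_dim = 3 for tensors like pos_embed.
    """
    for i in range(len(ne) - 1, 0, -1):
        if ne[i] > 1:
            return i + 1
    return 1

def should_quantize(name, ne, skip=FLUX_SKIP):
    """Apply only the image-model rules added in this hunk (hypothetical helper)."""
    # Substring match, mirroring name.find(pattern) == std::string::npos.
    if any(pattern in name for pattern in skip):
        return False
    # Image models: only plain 2D tensors are quantized.
    return ggml_n_dims_py(ne) == 2
```

For example, `should_quantize("img_in.weight", [64, 3072, 1, 1])` is rejected by the name rule, while a 1D bias is rejected by the dimensionality rule.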
ComfyUI/custom_nodes/ComfyUI-GGUF/tools/read_tensors.py
DELETED
|
@@ -1,21 +0,0 @@
|
|
| 1 |
-
#!/usr/bin/python3
|
| 2 |
-
import os
|
| 3 |
-
import sys
|
| 4 |
-
import gguf
|
| 5 |
-
|
| 6 |
-
def read_tensors(path):
|
| 7 |
-
reader = gguf.GGUFReader(path)
|
| 8 |
-
for tensor in reader.tensors:
|
| 9 |
-
if tensor.tensor_type == gguf.GGMLQuantizationType.F32:
|
| 10 |
-
continue
|
| 11 |
-
print(f"{str(tensor.tensor_type):32}: {tensor.name}")
|
| 12 |
-
|
| 13 |
-
try:
|
| 14 |
-
path = sys.argv[1]
|
| 15 |
-
assert os.path.isfile(path), "Invalid path"
|
| 16 |
-
print(f"input: {path}")
|
| 17 |
-
except Exception as e:
|
| 18 |
-
input(f"failed: {e}")
|
| 19 |
-
else:
|
| 20 |
-
read_tensors(path)
|
| 21 |
-
input()
|
ComfyUI/custom_nodes/cg-image-filter
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
Subproject commit 2cd49e79af81f91e7758c5174b4ade7b168f9c85
|
ComfyUI/models/audio_encoders/put_audio_encoder_models_here
DELETED
|
File without changes
|
ComfyUI/models/checkpoints/put_checkpoints_here
DELETED
|
File without changes
|
ComfyUI/models/clip/put_clip_or_text_encoder_models_here
DELETED
|
File without changes
|
ComfyUI/models/clip_vision/put_clip_vision_models_here
DELETED
|
File without changes
|
ComfyUI/models/configs/anything_v3.yaml
DELETED
|
@@ -1,73 +0,0 @@
|
|
| 1 |
-
model:
|
| 2 |
-
base_learning_rate: 1.0e-04
|
| 3 |
-
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
| 4 |
-
params:
|
| 5 |
-
linear_start: 0.00085
|
| 6 |
-
linear_end: 0.0120
|
| 7 |
-
num_timesteps_cond: 1
|
| 8 |
-
log_every_t: 200
|
| 9 |
-
timesteps: 1000
|
| 10 |
-
first_stage_key: "jpg"
|
| 11 |
-
cond_stage_key: "txt"
|
| 12 |
-
image_size: 64
|
| 13 |
-
channels: 4
|
| 14 |
-
cond_stage_trainable: false # Note: different from the one we trained before
|
| 15 |
-
conditioning_key: crossattn
|
| 16 |
-
monitor: val/loss_simple_ema
|
| 17 |
-
scale_factor: 0.18215
|
| 18 |
-
use_ema: False
|
| 19 |
-
|
| 20 |
-
scheduler_config: # 10000 warmup steps
|
| 21 |
-
target: ldm.lr_scheduler.LambdaLinearScheduler
|
| 22 |
-
params:
|
| 23 |
-
warm_up_steps: [ 10000 ]
|
| 24 |
-
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
|
| 25 |
-
f_start: [ 1.e-6 ]
|
| 26 |
-
f_max: [ 1. ]
|
| 27 |
-
f_min: [ 1. ]
|
| 28 |
-
|
| 29 |
-
unet_config:
|
| 30 |
-
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
| 31 |
-
params:
|
| 32 |
-
image_size: 32 # unused
|
| 33 |
-
in_channels: 4
|
| 34 |
-
out_channels: 4
|
| 35 |
-
model_channels: 320
|
| 36 |
-
attention_resolutions: [ 4, 2, 1 ]
|
| 37 |
-
num_res_blocks: 2
|
| 38 |
-
channel_mult: [ 1, 2, 4, 4 ]
|
| 39 |
-
num_heads: 8
|
| 40 |
-
use_spatial_transformer: True
|
| 41 |
-
transformer_depth: 1
|
| 42 |
-
context_dim: 768
|
| 43 |
-
use_checkpoint: True
|
| 44 |
-
legacy: False
|
| 45 |
-
|
| 46 |
-
first_stage_config:
|
| 47 |
-
target: ldm.models.autoencoder.AutoencoderKL
|
| 48 |
-
params:
|
| 49 |
-
embed_dim: 4
|
| 50 |
-
monitor: val/rec_loss
|
| 51 |
-
ddconfig:
|
| 52 |
-
double_z: true
|
| 53 |
-
z_channels: 4
|
| 54 |
-
resolution: 256
|
| 55 |
-
in_channels: 3
|
| 56 |
-
out_ch: 3
|
| 57 |
-
ch: 128
|
| 58 |
-
ch_mult:
|
| 59 |
-
- 1
|
| 60 |
-
- 2
|
| 61 |
-
- 4
|
| 62 |
-
- 4
|
| 63 |
-
num_res_blocks: 2
|
| 64 |
-
attn_resolutions: []
|
| 65 |
-
dropout: 0.0
|
| 66 |
-
lossconfig:
|
| 67 |
-
target: torch.nn.Identity
|
| 68 |
-
|
| 69 |
-
cond_stage_config:
|
| 70 |
-
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
|
| 71 |
-
params:
|
| 72 |
-
layer: "hidden"
|
| 73 |
-
layer_idx: -2
|
ComfyUI/models/configs/v1-inference.yaml
DELETED
|
@@ -1,70 +0,0 @@
|
|
| 1 |
-
model:
|
| 2 |
-
base_learning_rate: 1.0e-04
|
| 3 |
-
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
| 4 |
-
params:
|
| 5 |
-
linear_start: 0.00085
|
| 6 |
-
linear_end: 0.0120
|
| 7 |
-
num_timesteps_cond: 1
|
| 8 |
-
log_every_t: 200
|
| 9 |
-
timesteps: 1000
|
| 10 |
-
first_stage_key: "jpg"
|
| 11 |
-
cond_stage_key: "txt"
|
| 12 |
-
image_size: 64
|
| 13 |
-
channels: 4
|
| 14 |
-
cond_stage_trainable: false # Note: different from the one we trained before
|
| 15 |
-
conditioning_key: crossattn
|
| 16 |
-
monitor: val/loss_simple_ema
|
| 17 |
-
scale_factor: 0.18215
|
| 18 |
-
use_ema: False
|
| 19 |
-
|
| 20 |
-
scheduler_config: # 10000 warmup steps
|
| 21 |
-
target: ldm.lr_scheduler.LambdaLinearScheduler
|
| 22 |
-
params:
|
| 23 |
-
warm_up_steps: [ 10000 ]
|
| 24 |
-
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
|
| 25 |
-
f_start: [ 1.e-6 ]
|
| 26 |
-
f_max: [ 1. ]
|
| 27 |
-
f_min: [ 1. ]
|
| 28 |
-
|
| 29 |
-
unet_config:
|
| 30 |
-
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
| 31 |
-
params:
|
| 32 |
-
image_size: 32 # unused
|
| 33 |
-
in_channels: 4
|
| 34 |
-
out_channels: 4
|
| 35 |
-
model_channels: 320
|
| 36 |
-
attention_resolutions: [ 4, 2, 1 ]
|
| 37 |
-
num_res_blocks: 2
|
| 38 |
-
channel_mult: [ 1, 2, 4, 4 ]
|
| 39 |
-
num_heads: 8
|
| 40 |
-
use_spatial_transformer: True
|
| 41 |
-
transformer_depth: 1
|
| 42 |
-
context_dim: 768
|
| 43 |
-
use_checkpoint: True
|
| 44 |
-
legacy: False
|
| 45 |
-
|
| 46 |
-
first_stage_config:
|
| 47 |
-
target: ldm.models.autoencoder.AutoencoderKL
|
| 48 |
-
params:
|
| 49 |
-
embed_dim: 4
|
| 50 |
-
monitor: val/rec_loss
|
| 51 |
-
ddconfig:
|
| 52 |
-
double_z: true
|
| 53 |
-
z_channels: 4
|
| 54 |
-
resolution: 256
|
| 55 |
-
in_channels: 3
|
| 56 |
-
out_ch: 3
|
| 57 |
-
ch: 128
|
| 58 |
-
ch_mult:
|
| 59 |
-
- 1
|
| 60 |
-
- 2
|
| 61 |
-
- 4
|
| 62 |
-
- 4
|
| 63 |
-
num_res_blocks: 2
|
| 64 |
-
attn_resolutions: []
|
| 65 |
-
dropout: 0.0
|
| 66 |
-
lossconfig:
|
| 67 |
-
target: torch.nn.Identity
|
| 68 |
-
|
| 69 |
-
cond_stage_config:
|
| 70 |
-
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
|
ComfyUI/models/configs/v1-inference_clip_skip_2.yaml
DELETED
|
@@ -1,73 +0,0 @@
|
|
| 1 |
-
model:
|
| 2 |
-
base_learning_rate: 1.0e-04
|
| 3 |
-
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
| 4 |
-
params:
|
| 5 |
-
linear_start: 0.00085
|
| 6 |
-
linear_end: 0.0120
|
| 7 |
-
num_timesteps_cond: 1
|
| 8 |
-
log_every_t: 200
|
| 9 |
-
timesteps: 1000
|
| 10 |
-
first_stage_key: "jpg"
|
| 11 |
-
cond_stage_key: "txt"
|
| 12 |
-
image_size: 64
|
| 13 |
-
channels: 4
|
| 14 |
-
cond_stage_trainable: false # Note: different from the one we trained before
|
| 15 |
-
conditioning_key: crossattn
|
| 16 |
-
monitor: val/loss_simple_ema
|
| 17 |
-
scale_factor: 0.18215
|
| 18 |
-
use_ema: False
|
| 19 |
-
|
| 20 |
-
scheduler_config: # 10000 warmup steps
|
| 21 |
-
target: ldm.lr_scheduler.LambdaLinearScheduler
|
| 22 |
-
params:
|
| 23 |
-
warm_up_steps: [ 10000 ]
|
| 24 |
-
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
|
| 25 |
-
f_start: [ 1.e-6 ]
|
| 26 |
-
f_max: [ 1. ]
|
| 27 |
-
f_min: [ 1. ]
|
| 28 |
-
|
| 29 |
-
unet_config:
|
| 30 |
-
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
| 31 |
-
params:
|
| 32 |
-
image_size: 32 # unused
|
| 33 |
-
in_channels: 4
|
| 34 |
-
out_channels: 4
|
| 35 |
-
model_channels: 320
|
| 36 |
-
attention_resolutions: [ 4, 2, 1 ]
|
| 37 |
-
num_res_blocks: 2
|
| 38 |
-
channel_mult: [ 1, 2, 4, 4 ]
|
| 39 |
-
num_heads: 8
|
| 40 |
-
use_spatial_transformer: True
|
| 41 |
-
transformer_depth: 1
|
| 42 |
-
context_dim: 768
|
| 43 |
-
use_checkpoint: True
|
| 44 |
-
legacy: False
|
| 45 |
-
|
| 46 |
-
first_stage_config:
|
| 47 |
-
target: ldm.models.autoencoder.AutoencoderKL
|
| 48 |
-
params:
|
| 49 |
-
embed_dim: 4
|
| 50 |
-
monitor: val/rec_loss
|
| 51 |
-
ddconfig:
|
| 52 |
-
double_z: true
|
| 53 |
-
z_channels: 4
|
| 54 |
-
resolution: 256
|
| 55 |
-
in_channels: 3
|
| 56 |
-
out_ch: 3
|
| 57 |
-
ch: 128
|
| 58 |
-
ch_mult:
|
| 59 |
-
- 1
|
| 60 |
-
- 2
|
| 61 |
-
- 4
|
| 62 |
-
- 4
|
| 63 |
-
num_res_blocks: 2
|
| 64 |
-
attn_resolutions: []
|
| 65 |
-
dropout: 0.0
|
| 66 |
-
lossconfig:
|
| 67 |
-
target: torch.nn.Identity
|
| 68 |
-
|
| 69 |
-
cond_stage_config:
|
| 70 |
-
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
|
| 71 |
-
params:
|
| 72 |
-
layer: "hidden"
|
| 73 |
-
layer_idx: -2
|
ComfyUI/models/configs/v1-inference_clip_skip_2_fp16.yaml
DELETED
|
@@ -1,74 +0,0 @@
|
|
| 1 |
-
model:
|
| 2 |
-
base_learning_rate: 1.0e-04
|
| 3 |
-
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
| 4 |
-
params:
|
| 5 |
-
linear_start: 0.00085
|
| 6 |
-
linear_end: 0.0120
|
| 7 |
-
num_timesteps_cond: 1
|
| 8 |
-
log_every_t: 200
|
| 9 |
-
timesteps: 1000
|
| 10 |
-
first_stage_key: "jpg"
|
| 11 |
-
cond_stage_key: "txt"
|
| 12 |
-
image_size: 64
|
| 13 |
-
channels: 4
|
| 14 |
-
cond_stage_trainable: false # Note: different from the one we trained before
|
| 15 |
-
conditioning_key: crossattn
|
| 16 |
-
monitor: val/loss_simple_ema
|
| 17 |
-
scale_factor: 0.18215
|
| 18 |
-
use_ema: False
|
| 19 |
-
|
| 20 |
-
scheduler_config: # 10000 warmup steps
|
| 21 |
-
target: ldm.lr_scheduler.LambdaLinearScheduler
|
| 22 |
-
params:
|
| 23 |
-
warm_up_steps: [ 10000 ]
|
| 24 |
-
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
|
| 25 |
-
f_start: [ 1.e-6 ]
|
| 26 |
-
f_max: [ 1. ]
|
| 27 |
-
f_min: [ 1. ]
|
| 28 |
-
|
| 29 |
-
unet_config:
|
| 30 |
-
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
| 31 |
-
params:
|
| 32 |
-
use_fp16: True
|
| 33 |
-
image_size: 32 # unused
|
| 34 |
-
in_channels: 4
|
| 35 |
-
out_channels: 4
|
| 36 |
-
model_channels: 320
|
| 37 |
-
attention_resolutions: [ 4, 2, 1 ]
|
| 38 |
-
num_res_blocks: 2
|
| 39 |
-
channel_mult: [ 1, 2, 4, 4 ]
|
| 40 |
-
num_heads: 8
|
| 41 |
-
use_spatial_transformer: True
|
| 42 |
-
transformer_depth: 1
|
| 43 |
-
context_dim: 768
|
| 44 |
-
use_checkpoint: True
|
| 45 |
-
legacy: False
|
| 46 |
-
|
| 47 |
-
first_stage_config:
|
| 48 |
-
target: ldm.models.autoencoder.AutoencoderKL
|
| 49 |
-
params:
|
| 50 |
-
embed_dim: 4
|
| 51 |
-
monitor: val/rec_loss
|
| 52 |
-
ddconfig:
|
| 53 |
-
double_z: true
|
| 54 |
-
z_channels: 4
|
| 55 |
-
resolution: 256
|
| 56 |
-
in_channels: 3
|
| 57 |
-
out_ch: 3
|
| 58 |
-
ch: 128
|
| 59 |
-
ch_mult:
|
| 60 |
-
- 1
|
| 61 |
-
- 2
|
| 62 |
-
- 4
|
| 63 |
-
- 4
|
| 64 |
-
num_res_blocks: 2
|
| 65 |
-
attn_resolutions: []
|
| 66 |
-
dropout: 0.0
|
| 67 |
-
lossconfig:
|
| 68 |
-
target: torch.nn.Identity
|
| 69 |
-
|
| 70 |
-
cond_stage_config:
|
| 71 |
-
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
|
| 72 |
-
params:
|
| 73 |
-
layer: "hidden"
|
| 74 |
-
layer_idx: -2
|
ComfyUI/models/configs/v1-inference_fp16.yaml
DELETED
|
@@ -1,71 +0,0 @@
|
|
| 1 |
-
model:
|
| 2 |
-
base_learning_rate: 1.0e-04
|
| 3 |
-
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
| 4 |
-
params:
|
| 5 |
-
linear_start: 0.00085
|
| 6 |
-
linear_end: 0.0120
|
| 7 |
-
num_timesteps_cond: 1
|
| 8 |
-
log_every_t: 200
|
| 9 |
-
timesteps: 1000
|
| 10 |
-
first_stage_key: "jpg"
|
| 11 |
-
cond_stage_key: "txt"
|
| 12 |
-
image_size: 64
|
| 13 |
-
channels: 4
|
| 14 |
-
cond_stage_trainable: false # Note: different from the one we trained before
|
| 15 |
-
conditioning_key: crossattn
|
| 16 |
-
monitor: val/loss_simple_ema
|
| 17 |
-
scale_factor: 0.18215
|
| 18 |
-
use_ema: False
|
| 19 |
-
|
| 20 |
-
scheduler_config: # 10000 warmup steps
|
| 21 |
-
target: ldm.lr_scheduler.LambdaLinearScheduler
|
| 22 |
-
params:
|
| 23 |
-
warm_up_steps: [ 10000 ]
|
| 24 |
-
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
|
| 25 |
-
f_start: [ 1.e-6 ]
|
| 26 |
-
f_max: [ 1. ]
|
| 27 |
-
f_min: [ 1. ]
|
| 28 |
-
|
| 29 |
-
unet_config:
|
| 30 |
-
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
| 31 |
-
params:
|
| 32 |
-
use_fp16: True
|
| 33 |
-
image_size: 32 # unused
|
| 34 |
-
in_channels: 4
|
| 35 |
-
out_channels: 4
|
| 36 |
-
model_channels: 320
|
| 37 |
-
attention_resolutions: [ 4, 2, 1 ]
|
| 38 |
-
num_res_blocks: 2
|
| 39 |
-
channel_mult: [ 1, 2, 4, 4 ]
|
| 40 |
-
num_heads: 8
|
| 41 |
-
use_spatial_transformer: True
|
| 42 |
-
transformer_depth: 1
|
| 43 |
-
context_dim: 768
|
| 44 |
-
use_checkpoint: True
|
| 45 |
-
legacy: False
|
| 46 |
-
|
| 47 |
-
first_stage_config:
|
| 48 |
-
target: ldm.models.autoencoder.AutoencoderKL
|
| 49 |
-
params:
|
| 50 |
-
embed_dim: 4
|
| 51 |
-
monitor: val/rec_loss
|
| 52 |
-
ddconfig:
|
| 53 |
-
double_z: true
|
| 54 |
-
z_channels: 4
|
| 55 |
-
resolution: 256
|
| 56 |
-
in_channels: 3
|
| 57 |
-
out_ch: 3
|
| 58 |
-
ch: 128
|
| 59 |
-
ch_mult:
|
| 60 |
-
- 1
|
| 61 |
-
- 2
|
| 62 |
-
- 4
|
| 63 |
-
- 4
|
| 64 |
-
num_res_blocks: 2
|
| 65 |
-
attn_resolutions: []
|
| 66 |
-
dropout: 0.0
|
| 67 |
-
lossconfig:
|
| 68 |
-
target: torch.nn.Identity
|
| 69 |
-
|
| 70 |
-
cond_stage_config:
|
| 71 |
-
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
|
ComfyUI/models/configs/v1-inpainting-inference.yaml
DELETED
|
@@ -1,71 +0,0 @@
|
|
| 1 |
-
model:
|
| 2 |
-
base_learning_rate: 7.5e-05
|
| 3 |
-
target: ldm.models.diffusion.ddpm.LatentInpaintDiffusion
|
| 4 |
-
params:
|
| 5 |
-
linear_start: 0.00085
|
| 6 |
-
linear_end: 0.0120
|
| 7 |
-
num_timesteps_cond: 1
|
| 8 |
-
log_every_t: 200
|
| 9 |
-
timesteps: 1000
|
| 10 |
-
first_stage_key: "jpg"
|
| 11 |
-
cond_stage_key: "txt"
|
| 12 |
-
image_size: 64
|
| 13 |
-
channels: 4
|
| 14 |
-
cond_stage_trainable: false # Note: different from the one we trained before
|
| 15 |
-
conditioning_key: hybrid # important
|
| 16 |
-
monitor: val/loss_simple_ema
|
| 17 |
-
scale_factor: 0.18215
|
| 18 |
-
finetune_keys: null
|
| 19 |
-
|
| 20 |
-
scheduler_config: # 10000 warmup steps
|
| 21 |
-
target: ldm.lr_scheduler.LambdaLinearScheduler
|
| 22 |
-
params:
|
| 23 |
-
warm_up_steps: [ 2500 ] # NOTE for resuming. use 10000 if starting from scratch
|
| 24 |
-
cycle_lengths: [ 10000000000000 ] # incredibly large number to prevent corner cases
|
| 25 |
-
f_start: [ 1.e-6 ]
|
| 26 |
-
f_max: [ 1. ]
|
| 27 |
-
f_min: [ 1. ]
|
| 28 |
-
|
| 29 |
-
unet_config:
|
| 30 |
-
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
| 31 |
-
params:
|
| 32 |
-
image_size: 32 # unused
|
| 33 |
-
in_channels: 9 # 4 data + 4 downscaled image + 1 mask
|
| 34 |
-
out_channels: 4
|
| 35 |
-
model_channels: 320
|
| 36 |
-
attention_resolutions: [ 4, 2, 1 ]
|
| 37 |
-
num_res_blocks: 2
|
| 38 |
-
channel_mult: [ 1, 2, 4, 4 ]
|
| 39 |
-
num_heads: 8
|
| 40 |
-
use_spatial_transformer: True
|
| 41 |
-
transformer_depth: 1
|
| 42 |
-
context_dim: 768
|
| 43 |
-
use_checkpoint: True
|
| 44 |
-
legacy: False
|
| 45 |
-
|
| 46 |
-
first_stage_config:
|
| 47 |
-
target: ldm.models.autoencoder.AutoencoderKL
|
| 48 |
-
params:
|
| 49 |
-
embed_dim: 4
|
| 50 |
-
monitor: val/rec_loss
|
| 51 |
-
ddconfig:
|
| 52 |
-
double_z: true
|
| 53 |
-
z_channels: 4
|
| 54 |
-
resolution: 256
|
| 55 |
-
in_channels: 3
|
| 56 |
-
out_ch: 3
|
| 57 |
-
ch: 128
|
| 58 |
-
ch_mult:
|
| 59 |
-
- 1
|
| 60 |
-
- 2
|
| 61 |
-
- 4
|
| 62 |
-
- 4
|
| 63 |
-
num_res_blocks: 2
|
| 64 |
-
attn_resolutions: []
|
| 65 |
-
dropout: 0.0
|
| 66 |
-
lossconfig:
|
| 67 |
-
target: torch.nn.Identity
|
| 68 |
-
|
| 69 |
-
cond_stage_config:
|
| 70 |
-
target: ldm.modules.encoders.modules.FrozenCLIPEmbedder
|
| 71 |
-
|
ComfyUI/models/configs/v2-inference-v.yaml
DELETED
|
@@ -1,68 +0,0 @@
|
|
| 1 |
-
model:
|
| 2 |
-
base_learning_rate: 1.0e-4
|
| 3 |
-
target: ldm.models.diffusion.ddpm.LatentDiffusion
|
| 4 |
-
params:
|
| 5 |
-
parameterization: "v"
|
| 6 |
-
linear_start: 0.00085
|
| 7 |
-
linear_end: 0.0120
|
| 8 |
-
num_timesteps_cond: 1
|
| 9 |
-
log_every_t: 200
|
| 10 |
-
timesteps: 1000
|
| 11 |
-
first_stage_key: "jpg"
|
| 12 |
-
cond_stage_key: "txt"
|
| 13 |
-
image_size: 64
|
| 14 |
-
channels: 4
|
| 15 |
-
cond_stage_trainable: false
|
| 16 |
-
conditioning_key: crossattn
|
| 17 |
-
monitor: val/loss_simple_ema
|
| 18 |
-
scale_factor: 0.18215
|
| 19 |
-
use_ema: False # we set this to false because this is an inference only config
|
| 20 |
-
|
| 21 |
-
unet_config:
|
| 22 |
-
target: ldm.modules.diffusionmodules.openaimodel.UNetModel
|
| 23 |
-
params:
|
| 24 |
-
use_checkpoint: True
|
| 25 |
-
use_fp16: True
|
| 26 |
-
image_size: 32 # unused
|
| 27 |
-
in_channels: 4
|
| 28 |
-
out_channels: 4
|
| 29 |
-
model_channels: 320
|
| 30 |
-
attention_resolutions: [ 4, 2, 1 ]
|
| 31 |
-
num_res_blocks: 2
|
| 32 |
-
channel_mult: [ 1, 2, 4, 4 ]
|
| 33 |
-
num_head_channels: 64 # need to fix for flash-attn
|
| 34 |
-
use_spatial_transformer: True
|
| 35 |
-
use_linear_in_transformer: True
|
| 36 |
-
transformer_depth: 1
|
| 37 |
-
context_dim: 1024
|
| 38 |
-
legacy: False
|
| 39 |
-
|
| 40 |
-
first_stage_config:
|
| 41 |
-
target: ldm.models.autoencoder.AutoencoderKL
|
| 42 |
-
params:
|
| 43 |
-
embed_dim: 4
|
| 44 |
-
monitor: val/rec_loss
|
| 45 |
-
ddconfig:
|
| 46 |
-
#attn_type: "vanilla-xformers"
|
| 47 |
-
double_z: true
|
| 48 |
-
z_channels: 4
|
| 49 |
-
resolution: 256
|
| 50 |
-
in_channels: 3
|
| 51 |
-
out_ch: 3
|
| 52 |
-
ch: 128
|
| 53 |
-
ch_mult:
|
| 54 |
-
- 1
|
| 55 |
-
- 2
|
| 56 |
-
- 4
|
| 57 |
-
- 4
|
| 58 |
-
num_res_blocks: 2
|
| 59 |
-
attn_resolutions: []
|
| 60 |
-
dropout: 0.0
|
| 61 |
-
lossconfig:
|
| 62 |
-
target: torch.nn.Identity
|
| 63 |
-
|
| 64 |
-
cond_stage_config:
|
| 65 |
-
target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
|
| 66 |
-
params:
|
| 67 |
-
freeze: True
|
| 68 |
-
layer: "penultimate"
|
ComfyUI/models/configs/v2-inference-v_fp32.yaml
DELETED
|
@@ -1,68 +0,0 @@
-model:
-  base_learning_rate: 1.0e-4
-  target: ldm.models.diffusion.ddpm.LatentDiffusion
-  params:
-    parameterization: "v"
-    linear_start: 0.00085
-    linear_end: 0.0120
-    num_timesteps_cond: 1
-    log_every_t: 200
-    timesteps: 1000
-    first_stage_key: "jpg"
-    cond_stage_key: "txt"
-    image_size: 64
-    channels: 4
-    cond_stage_trainable: false
-    conditioning_key: crossattn
-    monitor: val/loss_simple_ema
-    scale_factor: 0.18215
-    use_ema: False # we set this to false because this is an inference only config
-
-    unet_config:
-      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
-      params:
-        use_checkpoint: True
-        use_fp16: False
-        image_size: 32 # unused
-        in_channels: 4
-        out_channels: 4
-        model_channels: 320
-        attention_resolutions: [ 4, 2, 1 ]
-        num_res_blocks: 2
-        channel_mult: [ 1, 2, 4, 4 ]
-        num_head_channels: 64 # need to fix for flash-attn
-        use_spatial_transformer: True
-        use_linear_in_transformer: True
-        transformer_depth: 1
-        context_dim: 1024
-        legacy: False
-
-    first_stage_config:
-      target: ldm.models.autoencoder.AutoencoderKL
-      params:
-        embed_dim: 4
-        monitor: val/rec_loss
-        ddconfig:
-          #attn_type: "vanilla-xformers"
-          double_z: true
-          z_channels: 4
-          resolution: 256
-          in_channels: 3
-          out_ch: 3
-          ch: 128
-          ch_mult:
-          - 1
-          - 2
-          - 4
-          - 4
-          num_res_blocks: 2
-          attn_resolutions: []
-          dropout: 0.0
-        lossconfig:
-          target: torch.nn.Identity
-
-    cond_stage_config:
-      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
-      params:
-        freeze: True
-        layer: "penultimate"
|
ComfyUI/models/configs/v2-inference.yaml
DELETED
|
@@ -1,67 +0,0 @@
-model:
-  base_learning_rate: 1.0e-4
-  target: ldm.models.diffusion.ddpm.LatentDiffusion
-  params:
-    linear_start: 0.00085
-    linear_end: 0.0120
-    num_timesteps_cond: 1
-    log_every_t: 200
-    timesteps: 1000
-    first_stage_key: "jpg"
-    cond_stage_key: "txt"
-    image_size: 64
-    channels: 4
-    cond_stage_trainable: false
-    conditioning_key: crossattn
-    monitor: val/loss_simple_ema
-    scale_factor: 0.18215
-    use_ema: False # we set this to false because this is an inference only config
-
-    unet_config:
-      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
-      params:
-        use_checkpoint: True
-        use_fp16: True
-        image_size: 32 # unused
-        in_channels: 4
-        out_channels: 4
-        model_channels: 320
-        attention_resolutions: [ 4, 2, 1 ]
-        num_res_blocks: 2
-        channel_mult: [ 1, 2, 4, 4 ]
-        num_head_channels: 64 # need to fix for flash-attn
-        use_spatial_transformer: True
-        use_linear_in_transformer: True
-        transformer_depth: 1
-        context_dim: 1024
-        legacy: False
-
-    first_stage_config:
-      target: ldm.models.autoencoder.AutoencoderKL
-      params:
-        embed_dim: 4
-        monitor: val/rec_loss
-        ddconfig:
-          #attn_type: "vanilla-xformers"
-          double_z: true
-          z_channels: 4
-          resolution: 256
-          in_channels: 3
-          out_ch: 3
-          ch: 128
-          ch_mult:
-          - 1
-          - 2
-          - 4
-          - 4
-          num_res_blocks: 2
-          attn_resolutions: []
-          dropout: 0.0
-        lossconfig:
-          target: torch.nn.Identity
-
-    cond_stage_config:
-      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
-      params:
-        freeze: True
-        layer: "penultimate"
|
ComfyUI/models/configs/v2-inference_fp32.yaml
DELETED
|
@@ -1,67 +0,0 @@
-model:
-  base_learning_rate: 1.0e-4
-  target: ldm.models.diffusion.ddpm.LatentDiffusion
-  params:
-    linear_start: 0.00085
-    linear_end: 0.0120
-    num_timesteps_cond: 1
-    log_every_t: 200
-    timesteps: 1000
-    first_stage_key: "jpg"
-    cond_stage_key: "txt"
-    image_size: 64
-    channels: 4
-    cond_stage_trainable: false
-    conditioning_key: crossattn
-    monitor: val/loss_simple_ema
-    scale_factor: 0.18215
-    use_ema: False # we set this to false because this is an inference only config
-
-    unet_config:
-      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
-      params:
-        use_checkpoint: True
-        use_fp16: False
-        image_size: 32 # unused
-        in_channels: 4
-        out_channels: 4
-        model_channels: 320
-        attention_resolutions: [ 4, 2, 1 ]
-        num_res_blocks: 2
-        channel_mult: [ 1, 2, 4, 4 ]
-        num_head_channels: 64 # need to fix for flash-attn
-        use_spatial_transformer: True
-        use_linear_in_transformer: True
-        transformer_depth: 1
-        context_dim: 1024
-        legacy: False
-
-    first_stage_config:
-      target: ldm.models.autoencoder.AutoencoderKL
-      params:
-        embed_dim: 4
-        monitor: val/rec_loss
-        ddconfig:
-          #attn_type: "vanilla-xformers"
-          double_z: true
-          z_channels: 4
-          resolution: 256
-          in_channels: 3
-          out_ch: 3
-          ch: 128
-          ch_mult:
-          - 1
-          - 2
-          - 4
-          - 4
-          num_res_blocks: 2
-          attn_resolutions: []
-          dropout: 0.0
-        lossconfig:
-          target: torch.nn.Identity
-
-    cond_stage_config:
-      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
-      params:
-        freeze: True
-        layer: "penultimate"
|
ComfyUI/models/configs/v2-inpainting-inference.yaml
DELETED
|
@@ -1,158 +0,0 @@
-model:
-  base_learning_rate: 5.0e-05
-  target: ldm.models.diffusion.ddpm.LatentInpaintDiffusion
-  params:
-    linear_start: 0.00085
-    linear_end: 0.0120
-    num_timesteps_cond: 1
-    log_every_t: 200
-    timesteps: 1000
-    first_stage_key: "jpg"
-    cond_stage_key: "txt"
-    image_size: 64
-    channels: 4
-    cond_stage_trainable: false
-    conditioning_key: hybrid
-    scale_factor: 0.18215
-    monitor: val/loss_simple_ema
-    finetune_keys: null
-    use_ema: False
-
-    unet_config:
-      target: ldm.modules.diffusionmodules.openaimodel.UNetModel
-      params:
-        use_checkpoint: True
-        image_size: 32 # unused
-        in_channels: 9
-        out_channels: 4
-        model_channels: 320
-        attention_resolutions: [ 4, 2, 1 ]
-        num_res_blocks: 2
-        channel_mult: [ 1, 2, 4, 4 ]
-        num_head_channels: 64 # need to fix for flash-attn
-        use_spatial_transformer: True
-        use_linear_in_transformer: True
-        transformer_depth: 1
-        context_dim: 1024
-        legacy: False
-
-    first_stage_config:
-      target: ldm.models.autoencoder.AutoencoderKL
-      params:
-        embed_dim: 4
-        monitor: val/rec_loss
-        ddconfig:
-          #attn_type: "vanilla-xformers"
-          double_z: true
-          z_channels: 4
-          resolution: 256
-          in_channels: 3
-          out_ch: 3
-          ch: 128
-          ch_mult:
-          - 1
-          - 2
-          - 4
-          - 4
-          num_res_blocks: 2
-          attn_resolutions: [ ]
-          dropout: 0.0
-        lossconfig:
-          target: torch.nn.Identity
-
-    cond_stage_config:
-      target: ldm.modules.encoders.modules.FrozenOpenCLIPEmbedder
-      params:
-        freeze: True
-        layer: "penultimate"
-
-
-data:
-  target: ldm.data.laion.WebDataModuleFromConfig
-  params:
-    tar_base: null # for concat as in LAION-A
-    p_unsafe_threshold: 0.1
-    filter_word_list: "data/filters.yaml"
-    max_pwatermark: 0.45
-    batch_size: 8
-    num_workers: 6
-    multinode: True
-    min_size: 512
-    train:
-      shards:
-        - "pipe:aws s3 cp s3://stability-aws/laion-a-native/part-0/{00000..18699}.tar -"
-        - "pipe:aws s3 cp s3://stability-aws/laion-a-native/part-1/{00000..18699}.tar -"
-        - "pipe:aws s3 cp s3://stability-aws/laion-a-native/part-2/{00000..18699}.tar -"
-        - "pipe:aws s3 cp s3://stability-aws/laion-a-native/part-3/{00000..18699}.tar -"
-        - "pipe:aws s3 cp s3://stability-aws/laion-a-native/part-4/{00000..18699}.tar -" #{00000-94333}.tar"
-      shuffle: 10000
-      image_key: jpg
-      image_transforms:
-      - target: torchvision.transforms.Resize
-        params:
-          size: 512
-          interpolation: 3
-      - target: torchvision.transforms.RandomCrop
-        params:
-          size: 512
-      postprocess:
-        target: ldm.data.laion.AddMask
-        params:
-          mode: "512train-large"
-          p_drop: 0.25
-    # NOTE use enough shards to avoid empty validation loops in workers
-    validation:
-      shards:
-        - "pipe:aws s3 cp s3://deep-floyd-s3/datasets/laion_cleaned-part5/{93001..94333}.tar - "
-      shuffle: 0
-      image_key: jpg
-      image_transforms:
-      - target: torchvision.transforms.Resize
-        params:
-          size: 512
-          interpolation: 3
-      - target: torchvision.transforms.CenterCrop
-        params:
-          size: 512
-      postprocess:
-        target: ldm.data.laion.AddMask
-        params:
-          mode: "512train-large"
-          p_drop: 0.25
-
-lightning:
-  find_unused_parameters: True
-  modelcheckpoint:
-    params:
-      every_n_train_steps: 5000
-
-  callbacks:
-    metrics_over_trainsteps_checkpoint:
-      params:
-        every_n_train_steps: 10000
-
-    image_logger:
-      target: main.ImageLogger
-      params:
-        enable_autocast: False
-        disabled: False
-        batch_frequency: 1000
-        max_images: 4
-        increase_log_steps: False
-        log_first_step: False
-        log_images_kwargs:
-          use_ema_scope: False
-          inpaint: False
-          plot_progressive_rows: False
-          plot_diffusion_rows: False
-          N: 4
-          unconditional_guidance_scale: 5.0
-          unconditional_guidance_label: [""]
-          ddim_steps: 50 # todo check these out for depth2img,
-          ddim_eta: 0.0 # todo check these out for depth2img,
-
-  trainer:
-    benchmark: True
-    val_check_interval: 5000000
-    num_sanity_val_steps: 0
-    accumulate_grad_batches: 1
|
ComfyUI/models/controlnet/put_controlnets_and_t2i_here
DELETED
|
File without changes
|
ComfyUI/models/diffusers/put_diffusers_models_here
DELETED
|
File without changes
|
ComfyUI/models/diffusion_models/put_diffusion_model_files_here
DELETED
|
File without changes
|
ComfyUI/models/embeddings/put_embeddings_or_textual_inversion_concepts_here
DELETED
|
File without changes
|
ComfyUI/models/gligen/put_gligen_models_here
DELETED
|
File without changes
|
ComfyUI/models/hypernetworks/put_hypernetworks_here
DELETED
|
File without changes
|
ComfyUI/models/loras/put_loras_here
DELETED
|
File without changes
|
ComfyUI/models/model_patches/put_model_patches_here
DELETED
|
File without changes
|
ComfyUI/models/photomaker/put_photomaker_models_here
DELETED
|
File without changes
|
ComfyUI/models/style_models/put_t2i_style_model_here
DELETED
|
File without changes
|
ComfyUI/models/text_encoders/put_text_encoder_files_here
DELETED
|
File without changes
|
ComfyUI/models/unet/put_unet_files_here
DELETED
|
File without changes
|
ComfyUI/models/upscale_models/put_esrgan_and_other_upscale_models_here
DELETED
|
File without changes
|
ComfyUI/models/vae/put_vae_here
DELETED
|
File without changes
|
ComfyUI/models/vae_approx/put_taesd_encoder_pth_and_taesd_decoder_pth_here
DELETED
|
File without changes
|