Flash attention support

#4
by jiosephlee - opened

Hi, I seem to be getting different performance depending on the attention implementation. Is flash attention 2 also supported, or just flash attention 1?

Intern Large Models org
edited Oct 27

InternS1 follows the attention interface introduced in Hugging Face Transformers: https://huggingface.co/docs/transformers/attention_interface. You can set `attn_implementation` to control which attention implementation is used.
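
For example, a minimal sketch assuming the model loads through the standard `AutoModelForCausalLM` path and that `internlm/Intern-S1` is the checkpoint id (adjust to the checkpoint you actually use):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "internlm/Intern-S1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    # Pick the attention backend via the Transformers attention interface:
    # "flash_attention_2" (requires flash-attn installed), "sdpa", or "eager".
    attn_implementation="flash_attention_2",
    trust_remote_code=True,
)
```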
