google/vit-large-patch16-224-in21k


nielsr HF Staff committed on Mar 31, 2021

Commit e1a3478 · 1 Parent(s): be0d844

Fix typo

Files changed (1)

README.md +3 -4


@@ -22,21 +22,20 @@ By pre-training the model, it learns an inner representation of images that can
 ## Intended uses & limitations
-You can use the raw model for image classification. See the [model hub](https://huggingface.co/models?search=google/vit) to look for
-fine-tuned versions on a task that interests you.
 ### How to use
 Here is how to use this model:
 ```python
-from transformers import ViTFeatureExtractor, ViTForImageClassification
 from PIL import Image
 import requests
 url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
 image = Image.open(requests.get(url, stream=True).raw)
 feature_extractor = ViTFeatureExtractor.from_pretrained('google/vit-large-patch16-224-in21k')
-model = ViTForImageClassification.from_pretrained('google/vit-large-patch16-224-in21k')
 inputs = feature_extractor(images=image, return_tensors="pt")
 outputs = model(**inputs)
 last_hidden_state = outputs.last_hidden_state

 ## Intended uses & limitations
+You can use the raw model to embed images, but it's mostly intended to be fine-tuned on a downstream task.
 ### How to use
 Here is how to use this model:
 ```python
+from transformers import ViTFeatureExtractor, ViTModel
 from PIL import Image
 import requests
 url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
 image = Image.open(requests.get(url, stream=True).raw)
 feature_extractor = ViTFeatureExtractor.from_pretrained('google/vit-large-patch16-224-in21k')
+model = ViTModel.from_pretrained('google/vit-large-patch16-224-in21k')
 inputs = feature_extractor(images=image, return_tensors="pt")
 outputs = model(**inputs)
 last_hidden_state = outputs.last_hidden_state
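
A note on why the swap to `ViTModel` matters: the snippet reads `outputs.last_hidden_state`, which the base model returns but the image-classification head's output does not (that output carries `logits` instead). As a rough sanity check, the shape to expect from `last_hidden_state` can be worked out from the checkpoint name alone. The sketch below is plain Python with no model download; the helper names are hypothetical, and it assumes the standard ViT-Large hidden size of 1024 plus one prepended `[CLS]` token.

```python
# Hypothetical helpers (not part of transformers): estimate the shape of
# `last_hidden_state` for a ViT checkpoint from its image and patch size.

VIT_LARGE_HIDDEN_SIZE = 1024  # assumption: standard ViT-Large width


def vit_sequence_length(image_size: int = 224, patch_size: int = 16) -> int:
    """Number of tokens: one per image patch, plus one [CLS] token."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1


def expected_last_hidden_state_shape(batch_size: int = 1) -> tuple:
    """Expected (batch, tokens, hidden) shape for patch16-224 ViT-Large."""
    return (batch_size, vit_sequence_length(), VIT_LARGE_HIDDEN_SIZE)


if __name__ == "__main__":
    print(expected_last_hidden_state_shape())
```

With the snippet from the diff, `tuple(last_hidden_state.shape)` should match this estimate for a single image, and `last_hidden_state[:, 0]` would then be the `[CLS]` token embedding.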