# Run DeepLab2 on the COCO dataset

This page walks through the steps required to generate
[COCO](https://cocodataset.org/) panoptic segmentation data for DeepLab2.
DeepLab2 uses sharded TFRecords for efficient processing of the data.
## Prework

Before running any DeepLab2 scripts, users should (1) visit the
[COCO dataset website](https://cocodataset.org/) to download the dataset,
including [2017 Train images](http://images.cocodataset.org/zips/train2017.zip),
[2017 Val images](http://images.cocodataset.org/zips/val2017.zip),
[2017 Test images](http://images.cocodataset.org/zips/test2017.zip), and
[2017 Panoptic Train/Val annotations](http://images.cocodataset.org/annotations/panoptic_annotations_trainval2017.zip),
and (2) unzip the downloaded files.

After finishing the above steps, the expected directory structure is as
follows:
```
.(COCO_ROOT)
+-- train2017
|   |
|   +-- *.jpg
|
|-- val2017
|   |
|   +-- *.jpg
|
|-- test2017
|   |
|   +-- *.jpg
|
+-- annotations
    |
    +-- panoptic_{train|val}2017.json
    +-- panoptic_{train|val}2017
```
## Convert prepared dataset to TFRecord

Use the following commandline to generate COCO TFRecords:

```bash
# For generating data for the panoptic segmentation task
python deeplab2/data/build_coco_data.py \
  --coco_root=${COCO_ROOT} \
  --output_dir=${OUTPUT_DIR}
```
The commandline above will output three sharded TFRecord files:
`{train|val|test}@1000.tfrecord`. For the `train` and `val` sets, the TFRecords
contain the RGB image pixels as well as the corresponding annotations. For the
`test` set, they contain RGB images only. These files will be used as the input
for model training and evaluation.

Note that we map the class IDs to contiguous IDs. Specifically, we map the
original label IDs, which range from 1 to 200, to contiguous ones ranging
from 1 to 133.
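As a sketch, such a remapping can be built by sorting the original category IDs and enumerating them. The helper below is hypothetical, not part of `build_coco_data.py`; see that script for the actual mapping logic:

```python
def build_contiguous_mapping(original_ids):
    """Map sparse original category IDs to contiguous IDs starting at 1."""
    return {orig: new for new, orig in enumerate(sorted(original_ids), start=1)}

# Toy example with three sparse IDs; the real COCO panoptic label set has
# 133 categories whose original IDs are scattered over 1..200.
mapping = build_contiguous_mapping([200, 1, 5])
print(mapping)  # {1: 1, 5: 2, 200: 3}
```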
### TFExample proto format for COCO

The Example proto contains the following fields:

* `image/encoded`: encoded image content.
* `image/filename`: image filename.
* `image/format`: image file format.
* `image/height`: image height.
* `image/width`: image width.
* `image/channels`: image channels.
* `image/segmentation/class/encoded`: encoded segmentation content.
* `image/segmentation/class/format`: segmentation encoding format.
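A minimal sketch of the per-image record those fields describe, before it is serialized into a TFExample. The helper name and sample values are hypothetical, and the `'raw'` format string is an assumption; check `build_coco_data.py` for the exact values written:

```python
def make_record(filename, encoded_jpeg, encoded_panoptic, height, width):
    """Assemble a per-image feature dict mirroring the field list above."""
    return {
        'image/encoded': encoded_jpeg,
        'image/filename': filename,
        'image/format': 'jpeg',
        'image/height': height,
        'image/width': width,
        'image/channels': 3,
        'image/segmentation/class/encoded': encoded_panoptic,
        'image/segmentation/class/format': 'raw',  # assumed format string
    }

record = make_record('000000000139.jpg', b'<jpeg bytes>', b'<raw int32 bytes>',
                     height=426, width=640)
```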
For panoptic segmentation, the encoded segmentation map will be the raw bytes of
an int32 panoptic map, where each pixel is assigned a panoptic ID computed by:

```
panoptic ID = semantic ID * label divisor + instance ID
```

where the semantic ID will be:

* the ignore label (0) for pixels not belonging to any segment
* for segments associated with the `iscrowd` label:
    * (default): the ignore label (0)
    * (if `--treat_crowd_as_ignore=false` is set when running
      `build_coco_data.py`): `category_id`
* `category_id` for other segments

The instance ID will be 0 for pixels belonging to

* the `stuff` class
* the `thing` class with the `iscrowd` label
* pixels with the ignore label

and in `[1, label divisor)` otherwise.
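Putting the rules above together, a panoptic ID round-trips as in the minimal sketch below. The label divisor of 256 is an assumed value for illustration; use whatever divisor your dataset config specifies:

```python
def encode_panoptic(semantic_id, instance_id, label_divisor=256):
    """panoptic ID = semantic ID * label divisor + instance ID."""
    assert 0 <= instance_id < label_divisor
    return semantic_id * label_divisor + instance_id

def decode_panoptic(panoptic_id, label_divisor=256):
    """Recover (semantic ID, instance ID) from a panoptic ID."""
    return panoptic_id // label_divisor, panoptic_id % label_divisor

# A 'thing' pixel: semantic class 17, instance 3.
pid = encode_panoptic(17, 3)      # 17 * 256 + 3 == 4355
# 'stuff', crowd, and ignore pixels always carry instance ID 0.
stuff = encode_panoptic(120, 0)
```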