diff --git a/docs/zh_cn/migration/migrate_transform.md b/docs/zh_cn/migration/migrate_transform.md
new file mode 100644
index 0000000000000000000000000000000000000000..b51dc1bdd12a1786e6a3c2a26d3adeb04fc31724
--- /dev/null
+++ b/docs/zh_cn/migration/migrate_transform.md
@@ -0,0 +1,153 @@
+# Migrating Data Transform Classes
+
+## Introduction
+
+Under TorchVision's interface convention, a data transform class must implement the `__call__` method. The OpenMMLab 1.0 convention further requires
+that `__call__` return a dictionary, which the transforms in a pipeline add to, remove from, query, and modify. In OpenMMLab 2.0, to improve future
+extensibility, the original `__call__` method has been migrated to a `transform` method, and data transform classes are required to inherit from
+[`mmcv.transforms.BaseTransform`](https://mmcv.readthedocs.io/en/dev-2.x/api.html#TODO). For details on how to implement a data
+transform class, see the [documentation](../tutorials/data_transform.md).
+
+Because this update moves a number of shared data transform classes into MMCV, this document uses [MMClassification v0.23.2](https://github.com/open-mmlab/mmclassification/tree/v0.23.2), [MMDetection v2.25.1](https://github.com/open-mmlab/mmdetection/tree/v2.25.1), and [MMCV v2.0.0rc0](https://github.com/open-mmlab/mmcv/tree/dev-2.x) as examples to compare these transforms between the old and new versions in terms of functionality, usage, and implementation.
+
+## Functional Differences
+
+<table class="docutils">
+<thead>
+  <tr>
+    <th></th>
+    <th>MMClassification (old)</th>
+    <th>MMDetection (old)</th>
+    <th>MMCV (new)</th>
+  </tr>
+</thead>
+<tbody>
+  <tr>
+    <td><code>LoadImageFromFile</code></td>
+    <td>Builds the file path from the 'img_prefix' and 'img_info.filename' fields and reads the image</td>
+    <td>Builds the file path from the 'img_prefix' and 'img_info.filename' fields and reads the image; supports specifying the channel order</td>
+    <td>Reads the image from the file path in 'img_path'; can be set not to raise an error when loading fails; supports specifying the decoding backend</td>
+  </tr>
+  <tr>
+    <td><code>LoadAnnotations</code></td>
+    <td>N/A</td>
+    <td>Supports reading bbox, label, mask (including polygon format), and seg map; supports converting the bbox coordinate system</td>
+    <td>Supports reading bbox, label, mask (excluding polygon format), and seg map</td>
+  </tr>
+  <tr>
+    <td><code>Pad</code></td>
+    <td>Pads all fields in "img_fields"; does not support padding to a multiple of a given integer</td>
+    <td>Pads all fields in "img_fields"; supports padding to a multiple of a given integer</td>
+    <td>Pads the "img" field; supports padding to a multiple of a given integer</td>
+  </tr>
+  <tr>
+    <td><code>CenterCrop</code></td>
+    <td>Crops all fields in "img_fields"; supports EfficientNet-style cropping</td>
+    <td>N/A</td>
+    <td>Crops the image in "img", the bboxes in "gt_bboxes", the segmentation map in "gt_seg_map", and the keypoints in "gt_keypoints"; supports automatically padding the crop margin</td>
+  </tr>
+  <tr>
+    <td><code>Normalize</code></td>
+    <td>Normalizes the image</td>
+    <td>No difference</td>
+    <td>No difference, but MMEngine recommends performing normalization in the <a href="TODO">data preprocessor</a></td>
+  </tr>
+  <tr>
+    <td><code>Resize</code></td>
+    <td>Resizes all fields in "img_fields"; allows aspect-ratio-preserving resizing based on a specified edge length</td>
+    <td>Implemented by <code>Resize</code>. Requires <code>ratio_range</code> to be None, <code>img_scale</code> to specify a single size, and <code>multiscale_mode</code> to be "value".</td>
+    <td>Resizes the image in "img", the bboxes in "gt_bboxes", the segmentation map in "gt_seg_map", and the keypoints in "gt_keypoints"; supports specifying a scale factor; supports aspect-ratio-preserving resizing to fit within a given size</td>
+  </tr>
+  <tr>
+    <td><code>RandomResize</code></td>
+    <td>N/A</td>
+    <td>Implemented by <code>Resize</code>. Requires either <code>ratio_range</code> to be None, <code>img_scale</code> to specify two sizes, and <code>multiscale_mode</code> to be "range", or <code>ratio_range</code> to be not None.
+    <pre>Resize(
+    img_scale=[(640, 480), (960, 720)],
+    multiscale_mode="range",
+)</pre>
+    </td>
+    <td>Resizes in the same way as <code>Resize</code>; supports randomly sampling the target size from a given size range or a given ratio range.
+    <pre>RandomResize(scale=[(640, 480), (960, 720)])</pre>
+    </td>
+  </tr>
+  <tr>
+    <td><code>RandomChoiceResize</code></td>
+    <td>N/A</td>
+    <td>Implemented by <code>Resize</code>. Requires <code>ratio_range</code> to be None, <code>img_scale</code> to specify multiple sizes, and <code>multiscale_mode</code> to be "value".
+    <pre>Resize(
+    img_scale=[(640, 480), (960, 720)],
+    multiscale_mode="value",
+)</pre>
+    </td>
+    <td>Resizes in the same way as <code>Resize</code>; supports randomly choosing the target size from several given sizes.
+    <pre>RandomChoiceResize(scales=[(640, 480), (960, 720)])</pre>
+    </td>
+  </tr>
+  <tr>
+    <td><code>RandomGrayscale</code></td>
+    <td>Converts all fields in "img_fields" to grayscale; the number of channels is kept after conversion.</td>
+    <td>N/A</td>
+    <td>Converts the "img" field to grayscale; supports specifying the grayscale weights and whether to keep the number of channels after conversion (not kept by default).</td>
+  </tr>
+  <tr>
+    <td><code>RandomFlip</code></td>
+    <td>Flips all fields in "img_fields"; supports horizontal or vertical flipping.</td>
+    <td>Flips all fields in "img_fields", "bbox_fields", "mask_fields", and "seg_fields"; supports horizontal, vertical, or diagonal flipping; supports specifying a probability for each flip type.</td>
+    <td>Flips the "img", "gt_bboxes", "gt_seg_map", and "gt_keypoints" fields; supports horizontal, vertical, or diagonal flipping; supports specifying a probability for each flip type.</td>
+  </tr>
+  <tr>
+    <td><code>MultiScaleFlipAug</code></td>
+    <td>N/A</td>
+    <td>Used for test-time augmentation</td>
+    <td>TODO</td>
+  </tr>
+  <tr>
+    <td><code>ToTensor</code></td>
+    <td>Converts the specified fields to <code>torch.Tensor</code></td>
+    <td>No difference</td>
+    <td>No difference</td>
+  </tr>
+  <tr>
+    <td><code>ImageToTensor</code></td>
+    <td>Converts the specified fields to <code>torch.Tensor</code> and reorders the channels to CHW.</td>
+    <td>No difference</td>
+    <td>No difference</td>
+  </tr>
+</tbody>
+</table>
+
+## Implementation Differences
+
+Taking `RandomFlip` as an example, compared with [RandomFlip](https://github.com/open-mmlab/mmdetection/blob/3b72b12fe9b14de906d1363982b9fba05e7d47c1/mmdet/datasets/pipelines/transforms.py#L333) in the old MMDetection, MMCV's [RandomFlip](https://github.com/open-mmlab/mmcv/blob/5947178e855c23eea6103b1d70e1f8027f7b2ca8/mmcv/transforms/processing.py#L985) must inherit from `BaseTransform`, implement its functionality in the `transform` method, and place the part that generates random results in a separate method wrapped with `cache_randomness`. For more on wrapping random methods, see the [related documentation](TODO).
+
+- MMDetection (old)
+
+```python
+class RandomFlip:
+    def __call__(self, results):
+        """Apply a random flip when called."""
+        ...
+        # Randomly choose the flip direction
+        cur_dir = np.random.choice(direction_list, p=flip_ratio_list)
+        ...
+        return results
+```
+
+- MMCV
+
+```python
+class RandomFlip(BaseTransform):
+    def transform(self, results):
+        """Apply a random flip when called."""
+        ...
+        cur_dir = self._random_direction()
+        ...
+        return results
+
+    @cache_randomness
+    def _random_direction(self):
+        """Randomly choose the flip direction."""
+        ...
+        return np.random.choice(direction_list, p=flip_ratio_list)
+```
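
The migration pattern in the diff above (inherit from a base class, move the logic from `__call__` to `transform`, and isolate the random decision in its own method) can be illustrated without installing MMCV. The following is a minimal sketch under stated assumptions: `BaseTransformSketch` and `RandomHorizontalFlipSketch` are hypothetical stand-ins written for this example, the image is a plain nested list, and the `"flip"` key is only illustrative. It does not reproduce MMCV's `RandomFlip` or the caching behaviour of `cache_randomness`.

```python
import random
from abc import ABC, abstractmethod


class BaseTransformSketch(ABC):
    """Hypothetical, simplified stand-in for mmcv.transforms.BaseTransform."""

    def __call__(self, results: dict) -> dict:
        # Calling the object forwards to ``transform``, so pipelines that
        # call transforms like functions keep working after the migration.
        return self.transform(results)

    @abstractmethod
    def transform(self, results: dict) -> dict:
        """Process the results dict and return it."""


class RandomHorizontalFlipSketch(BaseTransformSketch):
    """Toy horizontal flip over a nested-list image (not the MMCV code)."""

    def __init__(self, prob: float = 0.5, seed=None):
        self.prob = prob
        self._rng = random.Random(seed)

    def _random_flip(self) -> bool:
        # All randomness lives in this one method. In MMCV, this is the
        # kind of method that would be wrapped with ``cache_randomness``.
        return self._rng.random() < self.prob

    def transform(self, results: dict) -> dict:
        flip = self._random_flip()
        if flip:
            # Reverse every row to mirror the image left to right.
            results["img"] = [row[::-1] for row in results["img"]]
        results["flip"] = flip
        return results


# prob=1.0 always flips and prob=0.0 never flips, which keeps the demo
# deterministic without relying on a particular random seed.
always = RandomHorizontalFlipSketch(prob=1.0)
never = RandomHorizontalFlipSketch(prob=0.0)
flipped = always({"img": [[1, 2, 3], [4, 5, 6]]})
untouched = never({"img": [[1, 2, 3], [4, 5, 6]]})
print(flipped["img"], flipped["flip"])
print(untouched["img"], untouched["flip"])
```

Keeping the random draw in a single small method is what makes it possible to wrap only that step later, while `transform` itself stays deterministic given the draw.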