- Feb 03, 2023
-
-
Alexander Pacha authored
* Adding missing pre-commit requirement to tests.txt * Added support for setting a timeout for distributed learning * Adding documentation about how to change the runtime timeout into the distributed manual. * Fixed type in documentation to correctly specify an integer * Removing type-cast after checking the correct type already before * Update mmengine/dist/utils.py Adding an explicit `is not None` to the check Co-authored-by:
Mashiro <57566630+HAOCHENYE@users.noreply.github.com> * Removing explicit type check and replacing it with more pythonic way of assuming it is the right type and handling the exception if the type doesn't match. * Removing pre-commit from test requirements again * Simplified the code according to suggestions from PR * Update distributed.md --------- Co-authored-by:
Mashiro <57566630+HAOCHENYE@users.noreply.github.com> Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com>
-
- Oct 24, 2022
-
-
wangjiangben-hw authored
* init npu * Update mmengine/optim/optimizer/amp_optimizer_wrapper.py Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> * Update mmengine/dist/dist.py Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> * change to is_hccl_backend * Update mmengine/optim/optimizer/amp_optimizer_wrapper.py Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> * add comment with AmpOptimWrapper * Update mmengine/runner/amp.py Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> * Update mmengine/runner/amp.py Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> * add npu fn in base_model * Update mmengine/optim/optimizer/amp_optimizer_wrapper.py Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> * clean lint * Update mmengine/optim/optimizer/amp_optimizer_wrapper.py Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> * Update mmengine/model/base_model/base_model.py Co-authored-by:
Mashiro <57566630+HAOCHENYE@users.noreply.github.com> * add is_npu_available * try to fix * Add comments * Refine grammar Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> Co-authored-by:
Mashiro <57566630+HAOCHENYE@users.noreply.github.com> Co-authored-by:
HAOCHENYE <21724054@zju.edu.cn>
-
- Oct 08, 2022
-
-
Austin Welch authored
* Add smddp dist backend option * [Dev]: Upgrade pre commit hooks (#576) * Upgrade the versions of pre-commit-hooks * update zh-cn.yaml * [Docs] Fix the docstring of model sub-package (#573) * [Doc]: Update config.md (#562) * Update config.md * Update config.md * [Doc] delete the error comment in docs (#514) Co-authored-by:
Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> Co-authored-by:
Zhengfei-0311 <78833899+Zhengfei-0311@users.noreply.github.com> Co-authored-by:
vansin <msnode@163.com>
-
- Aug 29, 2022
-
-
Tong Gao authored
* [CI] Full tests * Add github tests * fix * fix typo Co-authored-by:
zhouzaida <zhouzaida@163.com>
-
- Aug 24, 2022
-
-
Zaida Zhou authored
* Rename data to structure * adjust the way to import module * adjust the way to import module * rename Structure to Data Structures in docs api * rename structure to structures * support using some modules of mmengine without torch * fix circleci config * fix circleci config * fix registry ut * minor fix * move init method from model/utils to model/weight_init.py * move init method from model/utils to model/weight_init.py * move sync_bn to model * move functions depending on torch to dl_utils * format import * fix logging ut * add weight init in model/__init__.py * move get_config and get_model to mmengine/hub * move log_processor.py to mmengine/runner * fix ut * Add TimeCounter in dl_utils/__init__.py
-
- Aug 15, 2022
-
-
Kai Hu authored
-
Zaida Zhou authored
-
- Jun 22, 2022
-
-
Haian Huang(深度眸) authored
* fix RuntimeError of SyncBuffersHook * add UT
-
- Jun 16, 2022
-
-
Jiazhen Wang authored
* support mlu * add ut and refine docstring
-
- May 25, 2022
-
-
Jiazhen Wang authored
* refine sync random seed * cancel seed param in batch-sampler
-
Haian Huang(深度眸) authored
* Add profiling tools * fix docstr * fix docstr * update * fix bug * update * update * fix error * fix mypy * update * merge main * fix UT
-
- May 19, 2022
-
-
Zaida Zhou authored
* [Fix] Replace torch distributed with mmengine dist module * minor refinement * move all_reduce_params to dist.py * add unit tests * update unit tests * fix test_logger.py * add examples
-
- May 10, 2022
-
-
Yining Li authored
-
- May 05, 2022
-
-
Zaida Zhou authored
-
- Apr 27, 2022
-
-
Zaida Zhou authored
* [Enhancement] Handle the device type of inputs in functions * rename and move three functions to dist/utils.py * minor refinement * rename dist to torch_dist in utils.py * update unit tests * refine unit tests * add unit tests * fix unit tests * replace Sequence with list and tuple * rename get_backend_device to get_comm_device * fix unit tests * fix unit tests * refactor and add more unit tests * cast_data_device does not support set type
-
- Mar 13, 2022
-
-
Zaida Zhou authored
-
- Mar 05, 2022
-
-
Zaida Zhou authored
* [Feature] Add distributed module * fix IS_DIST error * all_reduce_dict does operations in-place * support 'mean' operation * provide local group process * add tmpdir argument for collect_results * add unit tests * refactor unit tests * simplify steps to create multiple processes * minor fix * describe the difference between *gather* in mmengine and pytorch * minor fix * add unit tests for nccl * test nccl backend in multiple gpu * add get_default_group function to handle different torch versions * minor fix * minor fix * handle torch1.5 * handle torch1.5 * minor fix * fix typo * refactor unit tests * nccl does not support gather and gather_object * fix gather * fix collect_results_cpu * fix collect_results and refactor unit tests * fix collect_results unit tests * handle torch.cat in torch1.5 * refine docstring * refine docstring * fix comments * fix comments
-