Commit · df2b142
1 Parent(s): 3c336c3
style(nyz): add status emoji and env link
README.md CHANGED
@@ -25,22 +25,19 @@ If you want to contact us & join us, you can ✉️ to our team : <opendilab@p
 
 
 # Overview of Model Zoo
-
-<sup>(
-<sup>(2): "W" means that the corresponding model is in the upload waitinglist.</sup>
-
+<sup>(1): "๐" means that this algorithm doesn't support this environment.</sup>
+<sup>(2): "๐ฎ" means that the corresponding model is in the upload waitinglist.</sup>
 ### Deep Reinforcement Learning
-
-|
-|
-| [
-| [
-| [
-| [
-| [
-| [
-| [
-| [SAC](https://arxiv.org/pdf/1801.01290.pdf) | | | | - | - | - | | | |
+| Algo.\Env. | [LunarLander](https://di-engine-docs.readthedocs.io/en/latest/13_envs/lunarlander.html) | [BipedalWalker](https://di-engine-docs.readthedocs.io/en/latest/13_envs/bipedalwalker.html) | [Pendulum](https://di-engine-docs.readthedocs.io/en/latest/13_envs/pendulum.html) | [Pong](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [SpaceInvaders](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Qbert](https://di-engine-docs.readthedocs.io/en/latest/13_envs/atari.html) | [Hopper](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Halfcheetah](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) | [Walker2d](https://di-engine-docs.readthedocs.io/en/latest/13_envs/mujoco.html) |
+| :-------------: | :-------------: | :------------------------: | :------------: | :--------------: | :------------: | :------------------: | :---------: | :---------: | :---------: |
+| [PPO](https://arxiv.org/pdf/1707.06347.pdf) | [✅](https://huggingface.co/OpenDILabCommunity/LunarLander-v2-ppo) | | | | | | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-PPO) | | |
+| [PG](https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf) | ๐ฎ | | | | | |๐ฎ | | |
+| [A2C](https://arxiv.org/pdf/1602.01783.pdf) | ๐ฎ | | | | | | ๐ฎ | | |
+| [IMPALA](https://arxiv.org/pdf/1802.01561.pdf) |๐ฎ | | | | | | ๐ฎ | | |
+| [DQN](https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf) | ๐ฎ | | | | | | ๐ | ๐ | ๐ |
+| [DDPG](https://arxiv.org/pdf/1509.02971.pdf) | ๐ฎ | | | ๐ | ๐ | ๐ | ๐ฎ | | |
+| [TD3](https://arxiv.org/pdf/1802.09477.pdf) | ๐ฎ | | | ๐ | ๐ | ๐ |[✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-TD3) | | |
+| [SAC](https://arxiv.org/pdf/1801.01290.pdf) |๐ฎ | | | ๐ | ๐ | ๐ | [✅](https://huggingface.co/OpenDILabCommunity/Hopper-v4-SAC) | | |
 
 
 ### Multi-Agent Reinforcement Learning