github.com/JuliaReinforcementLearning/ReinforcementLearningZoo.jl commits

Update README.md

3600ff82f3f6926dd3015747d4a7735dc9758771 authored over 3 years ago by Jun Tian <[email protected]>

Bump version

17c9d195f583582ca918e4e185e86f0526f248d4 authored over 3 years ago by Jun Tian <[email protected]>

Fix #251, ppo multidim action eval (#177)

d78f327e30bb35208ca03e317de962df69d89844 authored over 3 years ago by Albin Heimerson <[email protected]>

Bump version

52a9c8572a23c4197293ff12ed9cb21949beca4e authored over 3 years ago by Jun Tian <[email protected]>

QRDQN implementation (#176)

* QRDQN implementation

Initial implementation with a CartPole experiment with a few bugs.

...

b2a27f3189ad9bc41a79df8207301dd217b096cb authored over 3 years ago by Prasidh Srikumar <[email protected]>

SAC multidimensional actions (#173)

* Switch sigma to log_sigma

* Replace SAC network with gaussian network

* Missed a logsigm...

022c1fd433911dcaedf3ff38b4bfb6351c544497 authored over 3 years ago by Albin Heimerson <[email protected]>

Add dueling network (#171)

* Add dueling network

* Add docs

* Some adjustment

4a2417bb5f3349867182bba928d10e4cec69821e authored over 3 years ago by Guoyu Yang <[email protected]>

Fix GaussianNetwork stddev and replace SACPolicyNetwork (#172)

* Switch sigma to log_sigma

* Replace SAC network with gaussian network

* Missed a logsigm...

ca8f3474ba239bde70a3b11071d98befa586ff7c authored over 3 years ago by Albin Heimerson <[email protected]>

Fix a bug (#174)

ec06a82a0b8ff59a721e40de26008fabbbde7ce3 authored over 3 years ago by Guoyu Yang <[email protected]>

Use clamp instead of min max, update lower limit (#170)

d53a017827d346a0cbfbda8e2ea9080aec54c086 authored over 3 years ago by Albin Heimerson <[email protected]>

Fix bug in multi action ppo (#169)

* Remove dimension in log_pa, fix entropy for multi

* Update src/algorithms/policy_gradient/p...

2f28cbc443b1935f3b13005b35629043aeb46b40 authored over 3 years ago by Albin Heimerson <[email protected]>

Fix ppo pendulum example (#165)

* fix action_space name conflict problem

* add ppo pendulum to tests

bc64e422fe720b5040fbeb8f27cc62f45b1e894a authored almost 4 years ago by Albin Heimerson <[email protected]>

Add REM-DQN(Random Ensemble Mixture) method (#160)

* add some explanations

* Add REM DQN

* Add docs

* Modified implementatio...

8668f3c80761c54307f7f483e581f84e693af52d authored almost 4 years ago by Guoyu Yang <[email protected]>

Bump version

29083386cf29036c9dd79d22bead07203496fa69 authored almost 4 years ago by Jun Tian <[email protected]>

fix error with iqn (#163)

98af458996d0fdbb25ad8f38f056ee160c4e7562 authored almost 4 years ago by Jun Tian <[email protected]>

DQN: take into account the update_horizon to know if we can update (#162)

69b77b42eb883b05de03b92ce1d15a5dd653b211 authored almost 4 years ago by Ilan Coulon <[email protected]>

Typo fix. (#157)

* Typo fix.

e4d2df9f3de8bb1ae182d85cc7993b4f484f9002 authored almost 4 years ago by Prasidh Srikumar <[email protected]>

Implemented double DQN (#156)

* Implemented double DQN

Double DQN with an optional argument to disable it.

* Implemented...

3663eee9f0c5cf10a1033c19a4ad078382a36055 authored almost 4 years ago by Prasidh Srikumar <[email protected]>

add some explanations (#155)

5e1db96a69c07d4023aa12ed1777d4fdeab00d49 authored almost 4 years ago by GuoYu Yang <[email protected]>

Format .jl files (#153)

Co-authored-by: norci <[email protected]>

c5fa62828dc39ba3824a55cf259ffdf187289251 authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Bump version

333a48156b25aca34ac91611000453d782798dd8 authored almost 4 years ago by Jun Tian <[email protected]>

add GridWorlds environments (#152)

* ignore vim generated temp files

* add JuliaRL_BasicDQN_EmptyRoom experiment

* add per-st...

c5439e6dc029a36f783198653446fd82b874e366 authored almost 4 years ago by Sid-Bhatia-0 <[email protected]>

Allow multidimensional actions in ppo (#151)

* Hack to allow multidim actions in ppo

* Fix for single dim envs

* Handle single and mult...

7913db6cc54314dd76e12b079dd630a03d363562 authored almost 4 years ago by Albin Heimerson <[email protected]>

Format .jl files (#142)

Co-authored-by: norci <[email protected]>

a9ad73153d42a8c1aac595fdfffd227b830bdc9a authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

CompatHelper: bump compat for "BSON" to "0.3" (#149)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

4cb26f4a656047f0b6a96f3604ba7a327d5b6cc0 authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update StructArrays dependency (#150)

a14f7f009ca6355e3fbb2cb83855fc791a40881f authored almost 4 years ago by Nerd <[email protected]>

Add behavior cloning (#146)

* add behavior cloning

* add TODO

* add experiment for bc

* fix test error

* update ...

96de0674d3d452fa0ed38055d2585f9892eb738b authored almost 4 years ago by Jun Tian <[email protected]>

Bump version

f3da80b10f1f657ac6f7e96eabd09d23f482af37 authored almost 4 years ago by Jun Tian <[email protected]>

fix atari related experiments (#145)

ef712d049743338b4715e6c23e55d1765c1e206e authored almost 4 years ago by Jun Tian <[email protected]>

Update TagBot.yml

0f13747da4ed14d8680653cd138f70ca969ca27a authored almost 4 years ago by Jun Tian <[email protected]>

Add compat for DataStructures

2b4387d5177d03cdd27a8d73612bf4ad69e125a9 authored almost 4 years ago by Jun Tian <[email protected]>

CompatHelper: bump compat for "Zygote" to "0.6" (#138)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

734b21f16c6cf18662246207b96be72fabfb311e authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Support rlintro (#144)

* rename TabularLearner to TabularRandomPolicy

* sync chapter01

* sync changes related to ...

2ddf949800ac26409c20cf3d95b228acd011e53a authored almost 4 years ago by Jun Tian <[email protected]>

Bump version

335662aedc944146bba188f9f74edc4602c1e580 authored about 4 years ago by Jun Tian <[email protected]>

Enable cfr tests (#141)

* decouple PreActStage

* fix cfr related tests

* rename

* enable all tests

* resolve...

52424bf205189fabd8b0d8d52b4cf26a4332f415 authored about 4 years ago by Jun Tian <[email protected]>

Merge pull request #131 from JuliaReinforcementLearning/auto-juliaformatter-pr

Automatic JuliaFormatter.jl run

d7077e8e969e8a2969a21b7b2e6ffa190ce04fb9 authored about 4 years ago by norci <[email protected]>

Format .jl files

79b6a60f831682666425e164b57a568d5da13ad8 authored about 4 years ago by norci <[email protected]>

Merge pull request #139 from norci/patch

added .JuliaFormatter.toml

eb967cadc140b401a94a1eda21977ec7ef827934 authored about 4 years ago by norci <[email protected]>

added .JuliaFormatter.toml

198baeb4ff13f56afb383fdd5b4d3d849bcae73d authored about 4 years ago by norci <[email protected]>

Drop dependency of RLEnvs (#136)

* drop dependency of RLEnvs

* remove dependency on RLEnvs

* minor fix

* fix warnings

e8374632e214232d6dc9e3a48a4aa63d30621a4b authored about 4 years ago by Jun Tian <[email protected]>

Merge pull request #137 from JuliaReinforcementLearning/norci-patch-1

Update format_pr.yml

46324659a480ba8151cbad617200acc1954f81a0 authored about 4 years ago by norci <[email protected]>

Update format_pr.yml

copied from https://github.com/JuliaReinforcementLearning/ReinforcementLearningCore.jl/blob/mast...

02a139ecc94f46eb38682db45aae862e362d03a7 authored about 4 years ago by norci <[email protected]>

Update dependency (#135)

* moved imports

* removed repeated imports

* move using into RLZoo.jl

* remove other duplicat...

2c0e248f2765c9377afb4146e2c572a9015ce910 authored about 4 years ago by Jun Tian <[email protected]>

Update Project.toml

f55c858f53fdc0baa59800df0497d59871915631 authored about 4 years ago by Jun Tian <[email protected]>

Removed repetitive imports from algorithms. (#130)

* moved imports

* removed repeated imports

* move using into RLZoo.jl

* remove other du...

f60880338b395046f9119c8ed62e85599d9de908 authored about 4 years ago by Rishabh Varshney <[email protected]>

CompatHelper: add new compat entry for "CircularArrayBuffers" at version "0.1" (#132)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

08b94187667fd9427ef7e51ef3ebba9f412f775e authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Prepare for the next release of RLCore (#129)

* sync

* fix experiments in rl_env

* fix experiments of Atari

* bugfix with ppo

* fix tests

...

d4ead5e1c657e2d153e5d64a72039469a3f8af00 authored about 4 years ago by Jun Tian <[email protected]>

updated patch.jl, for ignore function. (#127)

2f6c1e2cff2324ed08c043737ad4e4acca3313a6 authored about 4 years ago by norci <[email protected]>

Update build status badge

806933161d1c787b565d0d2ce5cb52bd5ec8f77b authored about 4 years ago by Jun Tian <[email protected]>

Delete .travis.yml

3c7ce0d49cb14e2086540b3aff03e993c7e71c6d authored about 4 years ago by Jun Tian <[email protected]>

Create ci.yml

9b0ff2c0ecb93b2b090b58221c5aaf1d9f64396e authored about 4 years ago by Jun Tian <[email protected]>

Bump version

76f2f5118bc3eec567f9f27af3915541f1b748a7 authored about 4 years ago by Jun Tian <[email protected]>

improve basicdqn (#117)

5d0780b99c94834adfad1eb7d0e1cfe97318c27c authored about 4 years ago by Jun Tian <[email protected]>

Automated commit made by MassInstallAction.jl (#125)

ff95729026b48e08f0ce818d8b11809d6fd4134f authored about 4 years ago by Jun Tian <[email protected]>

Export PPOActionMaskTrajectory (#123)

801833a1b53f19cce4d6a522befd215ddf40581f authored about 4 years ago by Jun Tian <[email protected]>

Format .jl files (#121)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

6bd18ef30d4bce549b694a581fe5f365576bf639 authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

CompatHelper: bump compat for "CUDA" to "2.1" (#110)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

182a22c14a0a9d241326b33d158ad7f64a3b29e8 authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Fix #118 (#119)

7bc3013f5f29ff924adcb96622ce08d9218732ec authored about 4 years ago by Jun Tian <[email protected]>

Format .jl files (#98)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

096535aa0eadc9844a2cdc3179ed926115644660 authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update README.md (#114)

58e3044cb52b8d2ec627a9428712f697815f8b1a authored about 4 years ago by Jun Tian <[email protected]>

Update README.md

e28836194fcd8256d2bb28d64f8bee0a788f2e69 authored about 4 years ago by Jun Tian <[email protected]>

Update README.md

5f55b8c0597de8a27b7406abc13c5eedc214a6fc authored about 4 years ago by Jun Tian <[email protected]>

Adding Mean Actor Critic (#108)

* Update policy_gradient.jl

* CartPole MAC experiment

* MAC.jl

* Adding Test for MAC

...

1780ac6b118452d4baaa24d5f1737540f198cd03 authored about 4 years ago by Raj Ghugare <[email protected]>

Revert auto format related changes (#107)

2e5a43f4af3bce933c4a617952276706b04785b9 authored about 4 years ago by Jun Tian <[email protected]>

Bump Version

7c2490db2252e538892b2f23ac408130aca43f7f authored about 4 years ago by Jun Tian <[email protected]>

Improve CFR (#99)

* add an AbstractCFRPolicy

* improve tabular cfr

* add best response policy

* add nash_...

59a2475ace909776b62bfcbcc37ec518515df071 authored about 4 years ago by Jun Tian <[email protected]>

Update format_pr.yml

61836091c392488a246ec1e8fe48c8730aba6d38 authored about 4 years ago by Jun Tian <[email protected]>

Update format_pr.yml

c1412e5739ecd6a41e2e96634ff44738c4b30692 authored about 4 years ago by Jun Tian <[email protected]>

Update README.md (#105)

6ac4170755df7197c2df82688192a869fd35a6b3 authored about 4 years ago by Jun Tian <[email protected]>

CompatHelper: bump compat for "Distributions" to "0.24" (#101)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

22d500f51964caa482879a174c2afce9fd5e4fb0 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

fine-tune the logo slightly

89bf707efe8936ff7fd15a82e07bd146c3554b90 authored over 4 years ago by Jun Tian <[email protected]>

fine-tune the logo slightly

6ae064d8d95afd2741c3fadd49ad9d65419e9ef0 authored over 4 years ago by Jun Tian <[email protected]>

update logo

6f458f13337fc4d64244a316a4c91a9b424a0f13 authored over 4 years ago by Jun Tian <[email protected]>

Format .jl files (#91)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

cf4352317ef40b6fcbffe0c832555e3179e5d94f authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

CompatHelper: add new compat entry for "StructArrays" at version "0.4" (#95)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

c2f09f1ab08b1a681c50799de73e05f5edbb551a authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Rename PPOLearner to PPOPolicy and make it support continuous space (#93)

* initial changes

* added experiment of PPO with Pendulum

* fix test errors

* update RE...

2674ba187797506e0838be545ac45389cd00ade9 authored over 4 years ago by Jun Tian <[email protected]>

Update README.md

73d11f6450473e7873e59b1f63e54453190a3596 authored over 4 years ago by Jun Tian <[email protected]>

PG policy (#87)

* Implemented Reinforce policy gradient.

added a experiment with CartPole.

* refactor

*...

0e2dd9c225edd6a7dbe754d343b809d4616b2225 authored over 4 years ago by norci <[email protected]>

Add MCCFR (#90)

* add outcome_sampling_mccfr

* add esmccfr

* update README.md

34aea9030577ed5cea6630ad02dc60e6c4cfd9a6 authored over 4 years ago by Jun Tian <[email protected]>

Format .jl files (#88)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

7829fa526b88da8a0c08773c6fac6d285293685f authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Twin Delayed DDPG (TD3) (#89)

* add TD3

* adapt README

44e3358b23de71b07e8ed3fc458a0e020a11689f authored over 4 years ago by Roman Bange <[email protected]>

Fix (#85)

* fixed bug in reward logging.

due to multithread env does not have POST_EPISODE_STAGE.
See:...

b7bb7a0796903b7fe393beac985af2ab5dbf2d5a authored over 4 years ago by norci <[email protected]>

Format .jl files (#84)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

d21b82df8953ec9e192deadc20ccd0c4f3798a5b authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Multi agent related changes (#83)

* add experiment of snake game

* sync

* add Experiment for Minimax

* add Experiment for...

28fcb998348eafc78a519fe85ee6020d1f2e1970 authored over 4 years ago by Jun Tian <[email protected]>

Format .jl files (#82)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

39159f4fef674f49fae2b62ed4a8b7e44e12521e authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

added more tensorboard logs in rl experiments ... (#81)

* added Loss values for DDPG policy

* added more tensorboard logs in rl experiments.

adjus...

87a63f009b82c6e4c5eeca3e74b22538c0cb431a authored over 4 years ago by norci <[email protected]>

Format .jl files (#78)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

df55b8833bf75061885511d406776ad3f49e7fb9 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

add experiment of snake game (#77)

be4bb82402d622310d009ac7ec81f25a6ffdfcb6 authored over 4 years ago by Jun Tian <[email protected]>

fix legal_actions_mask errors (#74)

74c6e5a0933b8168f1a60f8ce95641d24491723a authored over 4 years ago by Jun Tian <[email protected]>

CompatHelper: add new compat entry for "Distributions" at version "0.23" (#72)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

bf79285c2499d94d3e7a7023be39a8d894c85106 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Soft Actor Critic (#71)

* inital SAC implementation

* PR review fixes

cf9bf197bc2b0493c329112cbdf41abe9523403e authored over 4 years ago by Roman Bange <[email protected]>

Format .jl files (#70)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

6409d3a3f9b90bd83854ef789aeab5cc0509113e authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

support legal_actions_mask (#69)

e3f9375a7b32c6d346591a00617fcb7f634eb0fd authored over 4 years ago by Jun Tian <[email protected]>

Update README.md

1f0cc22b2785cb0fd457c8e483e52a6b4e452d75 authored over 4 years ago by Jun Tian <[email protected]>

Format .jl files (#68)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

9b1057e320161d409bc71708086419d4d1e5a61e authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update Project.toml

fc159d1ac81b9cd200523d314de157c3005dc9f3 authored over 4 years ago by Jun Tian <[email protected]>

update dependency of RLCore (#67)

* update dependency of RLCore

* remove unnecessary copy due to upstream change

* correct s...

f1790c43b4c18165bcb22a5e80e998667acab306 authored over 4 years ago by Jun Tian <[email protected]>

Update README.md

816761c16a44c91ac619f02e9afbe94e0a45c3a7 authored over 4 years ago by Jun Tian <[email protected]>

Format .jl files (#66)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

79f91a39d1cdd74e830cd461fedda43fa0066612 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Update dependency (#63)

* bump version

* fix dependencies

* fix experiments in rl_env

* minor changes of seeds

...

a076f4ec50327e8faedaf0a13200b9309562f190 authored over 4 years ago by Jun Tian <[email protected]>
