Ecosyste.ms: OpenCollective
An open API service for software projects hosted on Open Collective.
github.com/JuliaReinforcementLearning/ReinforcementLearningZoo.jl
https://github.com/JuliaReinforcementLearning/ReinforcementLearningZoo.jl
3600ff82f3f6926dd3015747d4a7735dc9758771 authored over 3 years ago by Jun Tian <[email protected]>
17c9d195f583582ca918e4e185e86f0526f248d4 authored over 3 years ago by Jun Tian <[email protected]>
d78f327e30bb35208ca03e317de962df69d89844 authored over 3 years ago by Albin Heimerson <[email protected]>
52a9c8572a23c4197293ff12ed9cb21949beca4e authored over 3 years ago by Jun Tian <[email protected]>
* QRDQN implementation
Initial implementation with a CartPole experiment with a few bugs.
...
b2a27f3189ad9bc41a79df8207301dd217b096cb authored over 3 years ago by Prasidh Srikumar <[email protected]>
* Switch sigma to log_sigma
* Replace SAC network with gaussian network
* Missed a logsigm...
022c1fd433911dcaedf3ff38b4bfb6351c544497 authored over 3 years ago by Albin Heimerson <[email protected]>
* Add dueling network
* Add docs
* Some adjustment
4a2417bb5f3349867182bba928d10e4cec69821e authored over 3 years ago by Guoyu Yang <[email protected]>
* Switch sigma to log_sigma
* Replace SAC network with gaussian network
* Missed a logsigm...
ca8f3474ba239bde70a3b11071d98befa586ff7c authored over 3 years ago by Albin Heimerson <[email protected]>
ec06a82a0b8ff59a721e40de26008fabbbde7ce3 authored over 3 years ago by Guoyu Yang <[email protected]>
d53a017827d346a0cbfbda8e2ea9080aec54c086 authored over 3 years ago by Albin Heimerson <[email protected]>
* Remove dimension in log_pa, fix entropy for multi
* Update src/algorithms/policy_gradient/p...
2f28cbc443b1935f3b13005b35629043aeb46b40 authored over 3 years ago by Albin Heimerson <[email protected]>
* fix action_space name conflict problem
* add ppo pendulum to tests
bc64e422fe720b5040fbeb8f27cc62f45b1e894a authored almost 4 years ago by Albin Heimerson <[email protected]>
* add some explanations
* Add REM DQN
* Add docs
* Add docs
* Modified implementatio...
8668f3c80761c54307f7f483e581f84e693af52d authored almost 4 years ago by Guoyu Yang <[email protected]>
29083386cf29036c9dd79d22bead07203496fa69 authored almost 4 years ago by Jun Tian <[email protected]>
98af458996d0fdbb25ad8f38f056ee160c4e7562 authored almost 4 years ago by Jun Tian <[email protected]>
69b77b42eb883b05de03b92ce1d15a5dd653b211 authored almost 4 years ago by Ilan Coulon <[email protected]>
* Typo fix.
e4d2df9f3de8bb1ae182d85cc7993b4f484f9002 authored almost 4 years ago by Prasidh Srikumar <[email protected]>
* Implemented double DQN
Double DQN with an optional argument to disable it.
* Implemented...
3663eee9f0c5cf10a1033c19a4ad078382a36055 authored almost 4 years ago by Prasidh Srikumar <[email protected]>
5e1db96a69c07d4023aa12ed1777d4fdeab00d49 authored almost 4 years ago by GuoYu Yang <[email protected]>
Co-authored-by: norci <[email protected]>
c5fa62828dc39ba3824a55cf259ffdf187289251 authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
333a48156b25aca34ac91611000453d782798dd8 authored almost 4 years ago by Jun Tian <[email protected]>
* ignore vim generated temp files
* add JuliaRL_BasicDQN_EmptyRoom experiment
* add per-st...
c5439e6dc029a36f783198653446fd82b874e366 authored almost 4 years ago by Sid-Bhatia-0 <[email protected]>
* Hack to allow multidim actions in ppo
* Fix for single dim envs
* Handle single and mult...
7913db6cc54314dd76e12b079dd630a03d363562 authored almost 4 years ago by Albin Heimerson <[email protected]>
Co-authored-by: norci <[email protected]>
a9ad73153d42a8c1aac595fdfffd227b830bdc9a authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
4cb26f4a656047f0b6a96f3604ba7a327d5b6cc0 authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
a14f7f009ca6355e3fbb2cb83855fc791a40881f authored almost 4 years ago by Nerd <[email protected]>
* add behavior cloning
* add TODO
* add experiment for bc
* fix test error
* update ...
96de0674d3d452fa0ed38055d2585f9892eb738b authored almost 4 years ago by Jun Tian <[email protected]>
f3da80b10f1f657ac6f7e96eabd09d23f482af37 authored almost 4 years ago by Jun Tian <[email protected]>
ef712d049743338b4715e6c23e55d1765c1e206e authored almost 4 years ago by Jun Tian <[email protected]>
0f13747da4ed14d8680653cd138f70ca969ca27a authored almost 4 years ago by Jun Tian <[email protected]>
2b4387d5177d03cdd27a8d73612bf4ad69e125a9 authored almost 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
734b21f16c6cf18662246207b96be72fabfb311e authored almost 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* rename TabularLearner to TabularRandomPolicy
* sync chapter01
* sync changes related to ...
2ddf949800ac26409c20cf3d95b228acd011e53a authored almost 4 years ago by Jun Tian <[email protected]>
335662aedc944146bba188f9f74edc4602c1e580 authored about 4 years ago by Jun Tian <[email protected]>
* decouple PreActStage
* fix cfr related tests
* rename
* enable all tests
* resolve...
52424bf205189fabd8b0d8d52b4cf26a4332f415 authored about 4 years ago by Jun Tian <[email protected]>
Automatic JuliaFormatter.jl run
d7077e8e969e8a2969a21b7b2e6ffa190ce04fb9 authored about 4 years ago by norci <[email protected]>
79b6a60f831682666425e164b57a568d5da13ad8 authored about 4 years ago by norci <[email protected]>
added .JuliaFormatter.toml
eb967cadc140b401a94a1eda21977ec7ef827934 authored about 4 years ago by norci <[email protected]>
198baeb4ff13f56afb383fdd5b4d3d849bcae73d authored about 4 years ago by norci <[email protected]>
* drop dependency of RLEnvs
* remove dependency on RLEnvs
* minor fix
* fix warnings
e8374632e214232d6dc9e3a48a4aa63d30621a4b authored about 4 years ago by Jun Tian <[email protected]>
Update format_pr.yml
46324659a480ba8151cbad617200acc1954f81a0 authored about 4 years ago by norci <[email protected]>
copied from https://github.com/JuliaReinforcementLearning/ReinforcementLearningCore.jl/blob/mast...
02a139ecc94f46eb38682db45aae862e362d03a7 authored about 4 years ago by norci <[email protected]>
* moved imports
* removed repeated imports
* move using into RLZoo.jl
* remove other duplicat...
2c0e248f2765c9377afb4146e2c572a9015ce910 authored about 4 years ago by Jun Tian <[email protected]>
f55c858f53fdc0baa59800df0497d59871915631 authored about 4 years ago by Jun Tian <[email protected]>
* moved imports
* removed repeated imports
* move using into RLZoo.jl
* remove other du...
f60880338b395046f9119c8ed62e85599d9de908 authored about 4 years ago by Rishabh Varshney <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
08b94187667fd9427ef7e51ef3ebba9f412f775e authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* sync
* fix experiments in rl_env
* fix experiments of Atari
* bugfix with ppo
* fix tests
...
2f6c1e2cff2324ed08c043737ad4e4acca3313a6 authored about 4 years ago by norci <[email protected]>
806933161d1c787b565d0d2ce5cb52bd5ec8f77b authored about 4 years ago by Jun Tian <[email protected]>
3c7ce0d49cb14e2086540b3aff03e993c7e71c6d authored about 4 years ago by Jun Tian <[email protected]>
9b0ff2c0ecb93b2b090b58221c5aaf1d9f64396e authored about 4 years ago by Jun Tian <[email protected]>
76f2f5118bc3eec567f9f27af3915541f1b748a7 authored about 4 years ago by Jun Tian <[email protected]>
5d0780b99c94834adfad1eb7d0e1cfe97318c27c authored about 4 years ago by Jun Tian <[email protected]>
ff95729026b48e08f0ce818d8b11809d6fd4134f authored about 4 years ago by Jun Tian <[email protected]>
801833a1b53f19cce4d6a522befd215ddf40581f authored about 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
6bd18ef30d4bce549b694a581fe5f365576bf639 authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
182a22c14a0a9d241326b33d158ad7f64a3b29e8 authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
7bc3013f5f29ff924adcb96622ce08d9218732ec authored about 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
096535aa0eadc9844a2cdc3179ed926115644660 authored about 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
58e3044cb52b8d2ec627a9428712f697815f8b1a authored about 4 years ago by Jun Tian <[email protected]>
e28836194fcd8256d2bb28d64f8bee0a788f2e69 authored about 4 years ago by Jun Tian <[email protected]>
5f55b8c0597de8a27b7406abc13c5eedc214a6fc authored about 4 years ago by Jun Tian <[email protected]>
* Update policy_gradient.jl
* CartPole MAC experiment
* MAC.jl
* Adding Test for MAC
...
1780ac6b118452d4baaa24d5f1737540f198cd03 authored about 4 years ago by Raj Ghugare <[email protected]>
2e5a43f4af3bce933c4a617952276706b04785b9 authored about 4 years ago by Jun Tian <[email protected]>
7c2490db2252e538892b2f23ac408130aca43f7f authored about 4 years ago by Jun Tian <[email protected]>
* add an AbstractCFRPolicy
* improve tabular cfr
* add best response policy
* add nash_...
59a2475ace909776b62bfcbcc37ec518515df071 authored about 4 years ago by Jun Tian <[email protected]>
61836091c392488a246ec1e8fe48c8730aba6d38 authored about 4 years ago by Jun Tian <[email protected]>
c1412e5739ecd6a41e2e96634ff44738c4b30692 authored about 4 years ago by Jun Tian <[email protected]>
6ac4170755df7197c2df82688192a869fd35a6b3 authored about 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
22d500f51964caa482879a174c2afce9fd5e4fb0 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
89bf707efe8936ff7fd15a82e07bd146c3554b90 authored over 4 years ago by Jun Tian <[email protected]>
6ae064d8d95afd2741c3fadd49ad9d65419e9ef0 authored over 4 years ago by Jun Tian <[email protected]>
6f458f13337fc4d64244a316a4c91a9b424a0f13 authored over 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
cf4352317ef40b6fcbffe0c832555e3179e5d94f authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
c2f09f1ab08b1a681c50799de73e05f5edbb551a authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* initial changes
* added experiment of PPO with Pendulum
* fix test errors
* update RE...
2674ba187797506e0838be545ac45389cd00ade9 authored over 4 years ago by Jun Tian <[email protected]>
73d11f6450473e7873e59b1f63e54453190a3596 authored over 4 years ago by Jun Tian <[email protected]>
* Implemented Reinforce policy gradient.
added a experiment with CartPole.
* refactor
*...
0e2dd9c225edd6a7dbe754d343b809d4616b2225 authored over 4 years ago by norci <[email protected]>
* add outcome_sampling_mccfr
* add esmccfr
* update README.md
34aea9030577ed5cea6630ad02dc60e6c4cfd9a6 authored over 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
7829fa526b88da8a0c08773c6fac6d285293685f authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* add TD3
* adapt README
44e3358b23de71b07e8ed3fc458a0e020a11689f authored over 4 years ago by Roman Bange <[email protected]>
* fixed bug in reward logging.
due to multithread env does not have POST_EPISODE_STAGE.
See:...
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
d21b82df8953ec9e192deadc20ccd0c4f3798a5b authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* add experiment of snake game
* sync
* add Experiment for Minimax
* add Experiment for...
28fcb998348eafc78a519fe85ee6020d1f2e1970 authored over 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
39159f4fef674f49fae2b62ed4a8b7e44e12521e authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* added Loss values for DDPG policy
* added more tensorboard logs in rl experiments.
adjus...
87a63f009b82c6e4c5eeca3e74b22538c0cb431a authored over 4 years ago by norci <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
df55b8833bf75061885511d406776ad3f49e7fb9 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
be4bb82402d622310d009ac7ec81f25a6ffdfcb6 authored over 4 years ago by Jun Tian <[email protected]>
74c6e5a0933b8168f1a60f8ce95641d24491723a authored over 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
bf79285c2499d94d3e7a7023be39a8d894c85106 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* inital SAC implementation
* PR review fixes
cf9bf197bc2b0493c329112cbdf41abe9523403e authored over 4 years ago by Roman Bange <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
6409d3a3f9b90bd83854ef789aeab5cc0509113e authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
e3f9375a7b32c6d346591a00617fcb7f634eb0fd authored over 4 years ago by Jun Tian <[email protected]>
1f0cc22b2785cb0fd457c8e483e52a6b4e452d75 authored over 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
9b1057e320161d409bc71708086419d4d1e5a61e authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
fc159d1ac81b9cd200523d314de157c3005dc9f3 authored over 4 years ago by Jun Tian <[email protected]>
* update dependency of RLCore
* remove unnecessary copy due to upstream change
* correct s...
f1790c43b4c18165bcb22a5e80e998667acab306 authored over 4 years ago by Jun Tian <[email protected]>
816761c16a44c91ac619f02e9afbe94e0a45c3a7 authored over 4 years ago by Jun Tian <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
79f91a39d1cdd74e830cd461fedda43fa0066612 authored over 4 years ago by github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* bump version
* fix dependencies
* fix experiments in rl_env
* minor changes of seeds
...