causality links
people say (hope) we're invariant

personal channel of @vkurenkov
Direct Advantage Estimation
https://arxiv.org/abs/2109.06093

links the advantage function to the causal effect of an action, as in the Rubin causal model
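
a rough sketch of that connection, assuming the standard RL definitions; the notation here is illustrative and not copied from the paper. the advantage compares the expected return G after taking action a in state s against the expected return when the policy π picks the action itself, which mirrors the "treated minus baseline" contrast of potential outcomes:

```latex
% Advantage as a difference of expected returns (standard definition)
A^{\pi}(s, a) = Q^{\pi}(s, a) - V^{\pi}(s)
             = \mathbb{E}_{\pi}\left[ G \mid s_t = s,\ a_t = a \right]
             - \mathbb{E}_{\pi}\left[ G \mid s_t = s \right]

% Rubin-style average causal effect of a treatment on an outcome Y,
% shown for comparison (Y(1): outcome under treatment, Y(0): under control)
\tau = \mathbb{E}\left[ Y(1) \right] - \mathbb{E}\left[ Y(0) \right]
```

read this way, "treatment" roughly corresponds to forcing action a in state s, and "control" to letting π act as usual; the return G plays the role of the outcome Y.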
Shaking the foundations: delusions in sequence models for interaction and control
https://arxiv.org/abs/2110.10819
Forwarded from Just links
Learning to Induce Causal Structure https://arxiv.org/abs/2204.04875
#causality
hi there,

not a causality link, but still! we've got a paper accepted at ICML 2022 (Spotlight), so if you're interested in offline RL, check it out
https://twitter.com/vladkurenkov/status/1534235675725381632

retweet appreciated 😈
On Calibration and Out-of-domain Generalization
https://arxiv.org/abs/2102.10395
👋

we finally released our offline RL library with SOTA algorithms, so if you're into this stuff, check it out

- single-file implementations
- benchmarked on D4RL datasets
- wandb reports with full metric logs (so that you don't need to rely on final performance tables)

https://github.com/corl-team/CORL
A Survey on Causal Reinforcement Learning
https://arxiv.org/abs/2302.05209
10 Feb 2023
---
While Reinforcement Learning (RL) achieves tremendous success in sequential decision-making problems of many domains, it still faces key challenges of data inefficiency and the lack of interpretability. Interestingly, many researchers have leveraged insights from the causality literature recently, bringing forth flourishing works to unify the merits of causality and address well the challenges from RL. As such, it is of great necessity and significance to collate these Causal Reinforcement Learning (CRL) works, offer a review of CRL methods, and investigate the potential functionality from causality toward RL. In particular, we divide existing CRL approaches into two categories according to whether their causality-based information is given in advance or not. We further analyze each category in terms of the formalization of different models, ranging from the Markov Decision Process (MDP), Partially Observed Markov Decision Process (POMDP), Multi-Arm Bandits (MAB), and Dynamic Treatment Regime (DTR). Moreover, we summarize the evaluation matrices and open sources while we discuss emerging applications, along with promising prospects for the future development of CRL.
Dutch Rudder as an Acyclic Causal Model
Reinforcement Learning from Passive Data via Latent Intentions
https://arxiv.org/abs/2304.04782
Survival Instinct in Offline Reinforcement Learning
https://arxiv.org/abs/2306.03286
New major CORL update!

🍏 Added offline benchmarks for 30 datasets covering Gym-MuJoCo, Maze2D, AntMaze, and Adroit

🍎 Implemented and benchmarked 5 offline-to-online algorithms on 10 datasets

Key takeaways:

🍒 IQL is the strongest on average, and works pretty well in the offline-to-online setup

🍒 AWAC is often overlooked in the literature, but performs strongly in the offline setup

🍒 CQL is a nightmare, but if tuned and tweaked (for a couple of months) it works well


https://github.com/corl-team/CORL
Supervised Pretraining Can Learn In-Context Reinforcement Learning
https://arxiv.org/abs/2306.14892