Our new work, which extends the earlier companion teaching framework for on-line dialogue policy learning, has been accepted to the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), to be held in Copenhagen, Denmark, in September 2017.
[Read More]
Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning
Our new paper, which integrates hand-crafted rules with reinforcement learning, has been accepted to the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). The paper proposes an Agent-Aware Dropout Deep Q-Network (AAD-DQN) to estimate the uncertainty of the learning process.
[Read More]
Welcome to the 8th ACM-Class Student Academic Festival
The SJTU ACM honored class invites you to participate in the 8th ACM-Class Student Academic Festival. ASAF 2017 will take place on June 3-4, 2017, in East Middle Hall #1 on the campus of Shanghai Jiao Tong University.
[Read More]
Policy Optimization with Monotonic Improvement Guarantee
This article covers the theoretical derivation of the policy improvement bound and the practical policy optimization algorithms discussed in the paper Trust Region Policy Optimization (Schulman et al., 2015). TRPO is an interesting idea: it optimizes policies with guaranteed monotonic improvement. In theory, its algorithm design looks elegant and justified. In...
[Read More]
On-line Dialogue Policy Learning with Companion Teaching
This is my first paper, published in the proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017). EACL 2017 is also my first international conference experience!
[Read More]