Crowdsourcing has become an economical means of leveraging human wisdom for large-scale data annotation. However, when annotation tasks require specific domain knowledge that most people lack, as is common in citizen science projects, crowd workers' integrity and proficiency problems significantly impair the quality of crowdsourced data. In this...
[Read More]
Affordable On-line Dialogue Learning
Our new work, which extends our previous companion teaching framework for on-line dialogue policy learning, has been accepted by the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), to be held in Copenhagen, Denmark, in September 2017.
[Read More]
Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning
Our new paper, which integrates hand-crafted rules with reinforcement learning approaches, has been accepted by the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). The paper proposes an Agent-Aware Dropout Deep Q-Network (AAD-DQN) to estimate the uncertainty of the learning process.
[Read More]
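For intuition only, here is a minimal sketch of the general idea of dropout-based uncertainty estimation for a Q-network: keep dropout active at inference time and measure the spread of Q-values across stochastic forward passes. This is not the paper's AAD-DQN; the network architecture, state/action dimensions, and thresholding idea in the comments are illustrative assumptions.

```python
import torch
import torch.nn as nn

class DropoutQNetwork(nn.Module):
    """A small Q-network with dropout layers (sizes are illustrative)."""
    def __init__(self, state_dim=10, num_actions=4, hidden=64, p=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, num_actions),
        )

    def forward(self, state):
        return self.net(state)

def q_value_uncertainty(q_net, state, num_samples=20):
    """Mean and std of Q(s, .) over stochastic forward passes with dropout kept on."""
    q_net.train()  # keep dropout active at inference time
    with torch.no_grad():
        samples = torch.stack([q_net(state) for _ in range(num_samples)])
    return samples.mean(dim=0), samples.std(dim=0)

q_net = DropoutQNetwork()
state = torch.randn(1, 10)
q_mean, q_std = q_value_uncertainty(q_net, state)
# A large q_std indicates the learner is still uncertain about this state;
# a safety mechanism could then fall back to hand-crafted rules.
```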
Welcome to the 8th ACM-Class Student Academic Festival
The SJTU ACM Honored Class invites you to participate in the 8th ACM-Class Student Academic Festival. ASAF 2017 will take place on June 3-4, 2017, in East Middle Hall #1 on the campus of Shanghai Jiao Tong University.
[Read More]
Policy Optimization with Monotonic Improvement Guarantee
This article covers the theoretical derivation of the policy improvement bound and the practical policy optimization algorithms discussed in the paper Trust Region Policy Optimization (Schulman et al., 2015). TRPO is an interesting idea that optimizes policies with a guaranteed monotonic improvement. In theory, its algorithm design looks elegant and well justified. In...
[Read More]
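For reference, the bound the article discusses is stated in Schulman et al. (2015) roughly as follows, where $\eta$ is the expected discounted return, $L_{\pi}$ is the local surrogate objective, and $A_{\pi}$ is the advantage function; notation here follows that paper.

```latex
% Policy improvement bound (Schulman et al., 2015):
% maximizing the right-hand side guarantees monotonic improvement of \eta.
\eta(\tilde{\pi}) \;\ge\; L_{\pi}(\tilde{\pi}) \;-\; C\, D_{\mathrm{KL}}^{\max}(\pi, \tilde{\pi}),
\qquad
C = \frac{4 \epsilon \gamma}{(1-\gamma)^{2}},
\quad
\epsilon = \max_{s,a} \bigl| A_{\pi}(s,a) \bigr|
```

In practice, TRPO replaces the penalty with a constraint on the average KL divergence, which is what makes the algorithm usable with large neural-network policies.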