Pedagogical Value-Aligned Crowdsourcing

Inspiring the Wisdom of Crowds via Interactive Teaching

Crowdsourcing has become an economical means of leveraging human wisdom for large-scale data annotation. However, when annotation tasks require specific domain knowledge that most people lack, as is common in citizen science projects, problems with crowd workers’ integrity and proficiency can significantly impair the quality of crowdsourced data. In this... [Read More]
Human Computing, Interactive Teaching, Machine Teaching, Original Research

Affordable On-line Dialogue Learning

Our new work, which extends the previous companion teaching framework for on-line dialogue policy learning, has been accepted at the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), which will be held in Copenhagen, Denmark, in September 2017. [Read More]
Dialogue Systems, Human-in-the-Loop, Reinforcement Learning, Original Research

Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning

Our new paper, which integrates hand-crafted rules with reinforcement learning approaches, has been accepted at the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017). The paper proposes an Agent-Aware Dropout Deep Q-Network (AAD-DQN) to estimate the uncertainty of the learning process. [Read More]
Dialogue Systems, Reinforcement Learning, Original Research

Policy Optimization with Monotonic Improvement Guarantee

This article covers the theoretical derivation of the Policy Improvement Bound and the practical policy optimization algorithms discussed in the paper Trust Region Policy Optimization (Schulman et al., 2015). TRPO is an interesting approach that optimizes policies with guaranteed monotonic improvement. In theory, its algorithm design looks elegant and justified. In... [Read More]
Reinforcement Learning, Original Research