RL

Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning

Shang-Yu Su, Xiujun Li, Jianfeng Gao, Jingjing Liu, and Yun-Nung Chen

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning

Baolin Peng, Xiujun Li, Jianfeng Gao, Jingjing Liu, Kam-Fai Wong, and Shang-Yu Su