Abstract: Non-autoregressive machine translation (NAT) models generate the entire target sentence in parallel, removing the dependencies between target tokens to speed up inference. However, this strong independence assumption also introduces several problems and makes the task harder to learn, so a performance gap remains between NAT models and state-of-the-art autoregressive models. In this talk, we start with the background of NAT and then focus on its learning paradigm, covering objective functions and learning strategies. The former address the limitations of the cross-entropy loss in NAT, and the latter reduce the difficulty of learning the NAT task.