Tuan Anh Le

making neural programming architectures generalize via recursion

10 February 2017

notes on (Cai et al., 2017).


understanding: 8/10
code: N/A

pretty cool idea: make neural nets learn recursive programs by changing only the training traces for the neural programmer-interpreter (npi) architecture. the resulting programs can be proven to generalize perfectly.


first, how does npi work? the main architecture is an lstm core with inputs and outputs attached at each step.

input to the lstm unit is (e, p, a), where e is an observation of the environment, p is the embedding of the current program, and a is the current program's arguments.

output from the lstm unit is (r, p2, a2), where r is the probability of returning from the current program, p2 is the embedding (key) of the next subprogram to call, and a2 is its arguments.

this architecture lets the neural net choose the subprograms and arguments that drive the program in the right direction (similar to inference compilation). the program itself doesn't need to be differentiable.
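to make the shapes concrete, here is a minimal numpy sketch of one npi step. all sizes, weight names, and the single-layer lstm cell are my assumptions for illustration, not the paper's actual hyperparameters; in the real model the weights are trained and p2 is decoded against a program-key memory.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 32               # lstm hidden size (assumed)
E, P, A = 16, 8, 4   # sizes of the e, p, a encodings (assumed)

# randomly initialised weights stand in for trained parameters
W_in = rng.normal(size=(E + P + A, 4 * D)) * 0.1
W_h = rng.normal(size=(D, 4 * D)) * 0.1
W_r = rng.normal(size=(D, 1)) * 0.1
W_p = rng.normal(size=(D, P)) * 0.1
W_a = rng.normal(size=(D, A)) * 0.1

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def npi_step(e, p, a, h, c):
    """one lstm step: (e, p, a) in, (r, p2, a2) out, plus new state."""
    x = np.concatenate([e, p, a])
    gates = x @ W_in + h @ W_h
    i, f, o, g = np.split(gates, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    r = sigmoid(h @ W_r)[0]  # probability of returning from this program
    p2 = h @ W_p             # embedding of the next subprogram to call
    a2 = h @ W_a             # arguments for that subprogram
    return r, p2, a2, h, c

# one step with dummy inputs
e = rng.normal(size=E)
p = rng.normal(size=P)
a = rng.normal(size=A)
h, c = np.zeros(D), np.zeros(D)
r, p2, a2, h, c = npi_step(e, p, a, h, c)
print(f"r={float(r):.3f}, p2 shape {p2.shape}, a2 shape {a2.shape}")
```

at execution time the controller loops: call the subprogram selected via p2 with arguments a2, observe the new environment, and stop when r crosses a threshold.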

changes from npi

the architecture stays exactly the same; only the training traces change. subprograms are allowed to call themselves, so each call handles a bounded-size piece of the problem before recursing instead of looping over arbitrarily long inputs.

because of this, we just need to prove correctness on the base cases and reduction rules, which is a finite check. proven for the paper's 4 tasks: grade-school addition, bubble sort, topological sort, and quicksort.
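a toy illustration of why this finite check suffices (my own sketch of grade-school addition, not the paper's trace format): a recursive program's behavior is fully determined by its base case and its reduction rule, so verifying those two handles inputs of any length.

```python
def add_digits(a, b, carry=0):
    """add two digit lists (most significant digit first), recursively."""
    # base case: no digits left, emit the final carry if any
    if not a and not b:
        return [carry] if carry else []
    # reduction rule: add the last digits, recurse on the shorter prefixes
    s = (a[-1] if a else 0) + (b[-1] if b else 0) + carry
    return add_digits(a[:-1], b[:-1], s // 10) + [s % 10]

# once the base case and the single reduction step are verified,
# correctness holds for arbitrarily long inputs
print(add_digits([9, 9], [1]))  # 99 + 1 -> [1, 0, 0]
```

the non-recursive npi traces, by contrast, must loop over the whole input, so the controller sees longer and longer hidden-state trajectories on longer inputs and can drift off-distribution.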


  1. Cai, J., Shin, R., & Song, D. (2017). Making Neural Programming Architectures Generalize via Recursion. International Conference on Learning Representations (ICLR).