Reinforcement Learning to Train Large Language Models to Explain Human Decisions