Supervised Fine Tuning on Curated Data is Reinforcement Learning