A human metaphor for evaluating AI capability