By default, an AI system will have an incentive to prevent humans from switching it off, or otherwise interfering in its operation, as this would prevent it from maximising its reward. An AI system is ‘corrigible’ if it has an incentive to accept human corrections. Inverse Reinforcement Learning (IRL) can help mitigate this problem in some cases, but there is disagreement as to whether IRL can guarantee corrigibility in all cases.
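The intuition behind the IRL approach can be sketched with a toy expected-value calculation, in the spirit of the "off-switch game" model: an agent that is uncertain about the true utility of its plan, and that treats the human's decision to intervene as evidence about that utility, can prefer to defer to the human rather than disable the off switch. The function and numbers below are purely illustrative, not any standard implementation:

```python
def expected_value(utilities, defer):
    """Toy model: each entry in `utilities` is an equally likely true
    utility of the agent's planned action.

    If the agent acts unilaterally, it receives the true utility.
    If it defers, an informed human permits the action only when the
    true utility is positive, otherwise switches the agent off
    (payoff 0). So deferring yields E[max(U, 0)] instead of E[U].
    """
    if defer:
        return sum(max(u, 0) for u in utilities) / len(utilities)
    return sum(utilities) / len(utilities)

# The agent is unsure whether its plan helps (+1) or harms (-2):
plan = [-2.0, 1.0]
acting = expected_value(plan, defer=False)    # -0.5
deferring = expected_value(plan, defer=True)  # 0.5: deferring wins
```

Under uncertainty, E[max(U, 0)] is at least max(E[U], 0), so the uncertain agent weakly prefers to leave the off switch alone. When the agent is certain of its plan's utility, the two options coincide, which is one reason it is disputed whether this mechanism guarantees corrigibility in general.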