Background and hypothesis. In kidney transplantation, early information about the risk of graft failure helps physicians and patients anticipate a potential return to dialysis or retransplantation. Clinical prediction models are commonly used to obtain such risk estimates, but their performance needs to be continuously evaluated in various contexts. We propose an external validation study of the Kidney Transplant Failure Score (KTFS) in a pooled sample of 3,144 patients transplanted between 2010 and 2015 in France, Belgium, Norway and Canada. Methods. The KTFS is used at the first transplantation anniversary to predict the probability of graft failure over the following seven years. The target population was defined as adult recipients of a kidney from a neurologically deceased donor without graft failure in the first year post-transplantation. Graft failure was defined as a return to dialysis. The KTFS authors fitted a Cox model and then adjusted its coefficients to maximize discrimination, yielding the final version of the KTFS. We evaluated the performance of the initial and final versions of the KTFS, as well as that of another model we developed to account for death as a competing event. Results. Around 10% of patients returned to dialysis, and 12.6% died during the seven-year follow-up. All KTFS versions yielded similarly good discrimination (area under the time-dependent receiver operating characteristic curve ranging from 0.79 [0.76-0.82] to 0.80 [0.77-0.84]), while the discrimination-optimized version was substantially miscalibrated. Clinical utility, assessed through net benefit, was also lowest for the discrimination-optimized version. Conclusion. Our results caution against using the current KTFS version; we recommend using either the initial coefficients or the competing risk-based ones instead.
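To make the net benefit criterion concrete, the following is a minimal sketch of how it can be computed at a single risk threshold. It is not the analysis code of this study: it assumes a binary outcome fully observed at the prediction horizon, so it ignores the censoring and competing death that a time-to-event validation must handle (for example through Kaplan-Meier or cumulative incidence estimates). The function name `net_benefit` and the small data set are hypothetical and purely illustrative.

```python
def net_benefit(risks, events, threshold):
    """Net benefit of labelling patients with predicted risk >= threshold as high risk.

    Assumption: `events` are binary outcomes fully observed at the horizon
    (no censoring, no competing risks), unlike the setting of the study.
    """
    n = len(risks)
    # True positives: flagged as high risk and experienced the event.
    tp = sum(1 for r, e in zip(risks, events) if r >= threshold and e)
    # False positives: flagged as high risk but did not experience the event.
    fp = sum(1 for r, e in zip(risks, events) if r >= threshold and not e)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Illustrative (made-up) predicted risks and observed outcomes.
risks = [0.05, 0.10, 0.30, 0.60]
events = [0, 0, 1, 1]
threshold = 0.20

nb_model = net_benefit(risks, events, threshold)
# "Treat all" strategy: every patient is considered high risk.
nb_treat_all = net_benefit([1.0] * len(events), events, threshold)
```

A model is clinically useful at a given threshold when its net benefit exceeds both the "treat all" strategy and the "treat none" strategy, whose net benefit is zero by definition. A miscalibrated model can lose net benefit even when its discrimination is unchanged, because the threshold is then applied to distorted risk estimates.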