Is there a photo of the failed winding region? And is there a description of the driven process (fan, compressor, pump, extruder, conveyor, etc.) and the motor orientation (vertical vs. horizontal)?
Two things about your testing. A hipot is a "coil to ground" test that checks the insulating ability of the ground wall insulation. It can tell you nothing about a phase-to-phase or turn-to-turn fault unless that fault occurs in conjunction with a ground fault. A surge test, if performed properly, compares the response of the different phases, but it does not test "phase to phase"; it tests turn-to-turn. How strenuous either test is depends critically on the rise time and the peak value of the applied waveform.
I see this failure mode occasionally, and for a number of reasons, but the primary ones related to motor design / winding are:
A) Phase paper (and/or dedicated turn insulation) is not seated close enough to the end of the slot to prevent contact between turns. Over time, the coils will move relative to each other as a result of process-induced vibration and/or starting transients. If you are unlucky, the insulation will break down, allowing the energy from the drive's switching transients to catastrophically fail the winding at the specific point mentioned. Most often, this is the result of improperly trained winders (they don't know why having the paper there is important) or lazy ones (it takes work to get that insulation in there early in the coil-laying process, and it gets harder as more coils are added).
B) The exposed winding is acquiring some sort of contaminant, leading to a tracking path over the outside of the insulation, which will eventually cause grief as a short-to-something somewhere. The abruptness of the rise time and the amplitude of the voltage peak during the switching transients will make things appear worse. Be aware of possible reflected wave effects due to the cable length, as well as the effect of common mode voltage.
C) A "bad batch" of wire (cracked / chipped / scratched enamel) and/or a slot burr right at the edge of the slot creates an abrasive region, and the relative movement of the coil eventually fails the insulation. Note that some OEMs are more prone to this type of failure mechanism than others. This can be fixed with the same approach as A): time, effort, and training of the individuals winding the machines.
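For the cable-length point in B), a rough back-of-envelope check can help decide whether reflected waves are worth chasing. The sketch below uses the usual transmission-line approximations: the reflection at the motor terminals is (Zm - Zc) / (Zm + Zc), and the reflection reaches full amplitude once the cable is longer than about v * tr / 2. The function names, the 150 m/us propagation velocity, and the impedance and bus-voltage numbers are all illustrative assumptions, not values from this thread or from any drive or cable datasheet.

```python
# Rough reflected-wave estimate for a drive-fed motor cable.
# All default values are illustrative assumptions; use measured or
# datasheet values for a real installation.

def critical_cable_length_m(rise_time_us, velocity_m_per_us=150.0):
    """Cable length beyond which the reflection reaches full amplitude.

    velocity_m_per_us is the pulse propagation velocity in the cable
    (assumed here to be about half the speed of light).
    """
    return velocity_m_per_us * rise_time_us / 2.0


def reflection_coefficient(z_motor_ohm, z_cable_ohm):
    """Voltage reflection coefficient at the motor terminals."""
    return (z_motor_ohm - z_cable_ohm) / (z_motor_ohm + z_cable_ohm)


def peak_motor_voltage(v_dc_bus, z_motor_ohm, z_cable_ohm):
    """Approximate peak terminal voltage for a cable at or beyond the
    critical length (single reflection, no cable losses)."""
    return v_dc_bus * (1.0 + reflection_coefficient(z_motor_ohm, z_cable_ohm))


if __name__ == "__main__":
    # Hypothetical example: 0.1 us rise time, 650 V DC bus,
    # 100 ohm cable surge impedance, 1000 ohm motor surge impedance.
    print("critical length (m):", critical_cable_length_m(0.1))
    print("peak at motor (V):", peak_motor_voltage(650.0, 1000.0, 100.0))
```

This ignores cable losses and pulses that arrive before the previous reflection has died out, both of which change the real peak, so treat the result as a screening number only.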
Another thought: are there multiple drives (of the same or different rating) on the same bus? Are some (or all) of them having the same issue? It might not be a "bad drive", it might be "bad input waveform".
Converting energy to motion for more than half a century