Overview of the Issue
When the keyspace durability policy is changed from none to any other setting that requires semi-sync to be enabled, vtorc notices the change and attempts to "fix" the semi-sync settings on the primary by executing an UndoDemotePrimary call against the primary vttablet.
This call will enable semi-sync on the primary, but potentially do so before any of the replicas have semi-sync enabled, causing the primary to lock up waiting for acknowledgements. Even when semi-sync is later enabled on the replicas, they can't ack any of the changes that are still pending on the primary, so the whole shard locks up.
I believe vtorc notices that the primary is locked up and might attempt to run an EmergencyReparentShard (ERS), but ERS does not handle semi-sync lockups well right now, and it seems that we can't get out of this broken state.
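The ordering problem described above can be illustrated with a toy model. This Python sketch is not Vitess code: the `Shard` class, its fields, and the `write` method are hypothetical names, and the rule that a stuck write stays stuck after the replicas enable semi-sync simply encodes the behavior reported in this issue, not anything verified against MySQL or vttablet.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shard:
    """Toy model of one shard; all names here are hypothetical."""
    primary_semi_sync: bool = False
    replica_semi_sync: List[bool] = field(default_factory=lambda: [False, False])
    stuck_writes: int = 0

    def write(self) -> bool:
        """Attempt a commit on the primary; return True if it completes."""
        if self.stuck_writes:
            # Assumption taken from the report: once writes are pending,
            # replicas that enable semi-sync later cannot ack them, so
            # every subsequent write queues up behind the stuck ones.
            self.stuck_writes += 1
            return False
        if self.primary_semi_sync and not any(self.replica_semi_sync):
            # The primary waits for an ack that no replica can send.
            self.stuck_writes += 1
            return False
        return True


def enable_primary_first() -> bool:
    """Order described in the issue: primary first, replicas later."""
    shard = Shard()
    shard.primary_semi_sync = True           # e.g. via UndoDemotePrimary
    shard.write()                            # arrives before replicas are ready
    shard.replica_semi_sync = [True, True]   # too late in this model
    return shard.write()


def enable_replicas_first() -> bool:
    """Alternative order: replicas first, then the primary."""
    shard = Shard()
    shard.replica_semi_sync = [True, True]
    shard.primary_semi_sync = True
    return shard.write()
```

In this model `enable_primary_first()` leaves the shard unable to complete writes, while `enable_replicas_first()` does not, which suggests that a fix would need to enable semi-sync on enough replicas before touching the primary. Whether that ordering is sufficient in a real cluster is an assumption of the sketch.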
Reproduction Steps
N/A
Binary Version
Operating System and Environment details
Log Fragments