TV audio timing is a pig even in the studio these days. A significant portion of my time is taken up trying to re-synchronise audio and video signals after their various (separate) paths are brought back together at the output side of the studio. We frequently have to do 'clap tests', which are exactly what they sound like - some poor sod has to sit in front of the camera and clap on command while we all huddle round the monitors trying to work out if it was early or late. Late sound is just about listenable. Early sound isn't. The hi-tech version of the clap test is SmartLips. We get a small smirk by calling it HotLips. There's not much to laugh at in telly.
Digital audio in TV is embedded into the HD-SDI video signal, and each audio embedder has delays built in so that the sound can be tweaked appropriately. If you shoot in progressive/PsF, and I think 'Later' is, you have to alter the delays further, because the pictures take longer to process in the camera. If you use augmented reality (Mastermind, MOTD), you need different delays in each clean camera feed to match the delays caused by the AR processing of pictures from the same cameras. In some cases the echo between source and programme sound is so distracting that people switch the sound off.
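To put rough numbers on the embedder delays, here is a minimal Python sketch of the arithmetic. Everything specific in it is an assumption for illustration: the 25 frames-per-second rate, the feed names and the per-feed picture latencies are invented, and real embedders will have their own units and limits.

```python
# Sketch only: how much to delay each feed's audio so it lands with its picture.
FPS = 25  # assumption: 25 frames per second


def frames_to_ms(frames, fps=FPS):
    """Convert a delay in video frames to milliseconds."""
    return frames * 1000 / fps


def audio_delay_ms(video_frames, audio_ms=0.0, fps=FPS):
    """Delay to dial into the embedder: picture latency minus existing audio latency."""
    delay = frames_to_ms(video_frames, fps) - audio_ms
    if delay < 0:
        # Audio cannot be advanced; the picture would have to be delayed instead.
        raise ValueError("audio is already later than the picture")
    return delay


# Hypothetical picture latencies, in frames, for three feeds.
feeds = {"cam1_clean": 1, "cam1_ar": 4, "cam2_psf": 2}
delays = {name: audio_delay_ms(frames) for name, frames in feeds.items()}
print(delays)
```

With those made-up figures, the AR feed needs four frames (160 ms) of audio delay against one frame (40 ms) for the clean feed, which is why each path ends up with its own setting.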
Then you have the problem of live monitors on the studio floor. The audience are hearing live sound, but any picture monitors a) come from the downstream side of the studio, which is late, and b) are probably LCDs, which can have anything up to 10 frames of delay in them due to picture processing. A good example of this can be seen on HIGNFY, when they cut to the last wide shot for the credits. If the slung monitor is in shot, you can see it cut to the wide shot late.
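For a sense of scale, the lag on a floor monitor is just the studio chain delay plus the display's own processing delay. A quick sketch, assuming 25 frames per second and an invented three-frame studio chain (only the 10-frame LCD figure comes from the above):

```python
def monitor_lag_ms(chain_frames, lcd_frames, fps=25):
    """Total picture lag on a floor monitor, in milliseconds."""
    return (chain_frames + lcd_frames) * 1000 / fps


# chain_frames=3 is a made-up figure; lcd_frames=10 is the worst case quoted above.
print(monitor_lag_ms(chain_frames=3, lcd_frames=10))  # 520.0
```

On those assumptions the monitor is running about half a second behind the live sound in the room.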
For a music show, there's usually a front-of-house sound mix for the floor, and a separate mix in the sound gallery which is recorded with the video. A show like Jools will record multitrack outputs to Pro Tools or similar for a music mix after editing.
All of the above is complicated further by the habit of recording 'ISO' feeds of individual cameras, which take a completely different path through the studio. Those recordings will also take isolated sound tracks which need to be synchronous, and those embedders will need their own settings.
This doesn't even begin to take into account what happens to the signal when it's transmitted.
It's TV. Even when it's live, nothing is real.