I've written code that calculates elevation gain from GPX tracks, using its own topographic database. The problem you quickly run into is that if you add up all the tiny ups and downs between every pair of points, you end up with a number that's far too high for flat-ish routes.
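To make that concrete, here's a minimal sketch of the naive approach (not my actual code; the function name and the sample profile are made up for illustration). It adds up every positive step between consecutive points:

```python
def naive_gain(elevations):
    """Sum every positive step between consecutive points, however small."""
    return sum(max(b - a, 0.0) for a, b in zip(elevations, elevations[1:]))


# Hypothetical "flat" road whose sampled elevation wobbles by 1 m per point.
flat_ish = [100.0 + (i % 2) for i in range(201)]

print(naive_gain(flat_ish))  # 100.0
```

The road never gets more than 1 m above where it started, yet the total comes out at 100 m of climbing.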
So you then have to write some code that filters out small undulations* and applies some heuristic for what counts as a "climb" (minimum gradient, minimum total height, minimum length, etc.). On some routes, tiny changes to the chosen values lead to big differences in the total. So a spread of +/- 50% between estimates is totally reasonable, especially allowing for inaccuracies or lack of detail in the GPX track and the topographic data.
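As a rough illustration of how sensitive this is, here's one very simple filter: a dead-band threshold that only counts a rise once it reaches a minimum height above the last reference point. This is just one possible heuristic, not the one I use, and the names and numbers are invented for the example:

```python
def filtered_gain(elevations, threshold=5.0):
    """Count a rise only once it reaches `threshold` metres above the
    last reference elevation; smaller wiggles are ignored."""
    if not elevations:
        return 0.0
    gain = 0.0
    ref = elevations[0]
    for e in elevations[1:]:
        diff = e - ref
        if diff >= threshold:
            gain += diff  # a big enough rise: count it and move the reference up
            ref = e
        elif diff <= -threshold:
            ref = e  # a big enough drop: move the reference down, count nothing
    return gain


# Hypothetical rolling road made of 4 m rollers.
rollers = [100.0, 104.0, 100.0, 104.0, 100.0, 104.0, 100.0]

print(filtered_gain(rollers, threshold=4.0))  # 12.0
print(filtered_gain(rollers, threshold=5.0))  # 0.0
```

Moving the threshold by a single metre takes this route from 12 m of climbing to none at all. Real routes are less extreme, but the effect is the same kind of thing.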
And at some point it comes down to defining what elevation gain even is. Is riding along a road that constantly undulates slightly the same as riding one that goes up for a long time and then down for a long time?
(* Note that the traditional contour-counting method has a similar filter built in, since small undulations don't cross a contour.)
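For comparison, contour counting can be sketched like this, assuming a 10 m contour interval and counting only contours crossed while going uphill (again, the names and sample values are hypothetical):

```python
import math


def contour_gain(elevations, interval=10.0):
    """Count contour lines crossed while climbing, times the contour interval."""
    crossings = 0
    for a, b in zip(elevations, elevations[1:]):
        if b > a:
            # Number of contour levels c with a < c <= b.
            crossings += math.floor(b / interval) - math.floor(a / interval)
    return crossings * interval


wobble = [101.0, 104.0, 101.0, 104.0, 101.0]  # stays between two contours
climb = [95.0, 105.0, 115.0, 125.0]           # crosses 100, 110 and 120

print(contour_gain(wobble))  # 0.0
print(contour_gain(climb))   # 30.0
```

The filter isn't perfect: a road that wobbles back and forth across a single contour line still gets counted every time it crosses on the way up.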