Originally Posted by
Philsam123
Hi
I would like to thank the creator of this great tool for making XP management simpler, especially since I’m still learning how XP is earned.
Thanks!
Quick status on the PDF import saga.
Why is this so hard?
Flying Blue PDFs look standardized to the human eye: a clean table with dates, descriptions and numbers. But machines see something different. Layouts differ by language, region and sometimes even by month. A single flight can appear as one row or five (SAF bonuses, upgrades, etc). Date formats vary wildly. “My trip to Bangkok” summary rows mix with individual segments. What looks consistent to you is chaos for a parser.
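To make the date problem concrete, here is a minimal sketch of one common workaround: try a short list of candidate formats in order and take the first that parses. The formats listed and the function name `parse_statement_date` are assumptions for illustration, not the formats Flying Blue actually uses or the tool's real code.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical candidate formats; the real statements may use others.
CANDIDATE_FORMATS = ("%d/%m/%Y", "%Y-%m-%d", "%d-%m-%Y", "%d.%m.%Y")


def parse_statement_date(text: str) -> Optional[date]:
    """Return the first successful parse, or None if no format matches."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue  # try the next candidate format
    return None


for raw in ("05/03/2024", "2024-03-05", "05.03.2024", "sometime in March"):
    print(raw, "->", parse_statement_date(raw))
```

Note the limitation: a string like 05/03/2024 is ambiguous between day-first and month-first layouts, so the order of the list silently decides the answer. That is one reason a fixed rule set keeps breaking across regions.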
What I’ve tried
The local parser has been through countless iterations: direct PDF extraction, copy-paste input, step-by-step wizards, trip totals vs segment parsing. Each fix breaks something else. Works for Dutch PDFs, fails on English or French ones.
AI parsing with my own data via OpenAI showed better accuracy, but it hallucinates XP values, guesses the wrong airlines and produces different results on repeated runs.
Where I’m heading
Azure Document Intelligence is supposed to show 93%+ accuracy on visual table recognition. I’ve just implemented this on my test server. The plan is to combine this with a verification wizard: Azure extracts the data, you step through the results to confirm or correct before import.
The PDF header serves as the sanity check. Every statement shows your status, total miles and XP/UXP at the top. Parser finds 195 XP but header says 205? The wizard flags the gap and helps you locate or add the missing transaction. I already have this working on the test server.
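The header check boils down to simple arithmetic. A rough sketch, assuming each extracted row has already been reduced to a date, description and XP value; the `Transaction` fields, the `reconcile_xp` name and the sample rows are made up for illustration and are not the tool's actual code:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Transaction:
    # Hypothetical shape of one extracted statement row.
    date: str
    description: str
    xp: int


def reconcile_xp(transactions: List[Transaction], header_xp: int) -> Dict[str, object]:
    """Compare the summed XP of the parsed rows with the total in the PDF header."""
    parsed_xp = sum(t.xp for t in transactions)
    return {
        "parsed_xp": parsed_xp,
        "header_xp": header_xp,
        "gap": header_xp - parsed_xp,  # positive: rows are missing XP
        "matches": parsed_xp == header_xp,
    }


# Invented sample rows that add up to 195 XP, checked against a header of 205.
rows = [
    Transaction("2024-03-01", "AMS-CDG", 15),
    Transaction("2024-03-02", "CDG-BKK", 90),
    Transaction("2024-03-16", "BKK-AMS", 90),
]
result = reconcile_xp(rows, header_xp=205)
print(result)
```

With these rows the gap is 10 XP, which is what the wizard would flag for review. A positive gap suggests a missing or under-counted row; a negative one suggests a duplicate or a summary row counted alongside its segments.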
The reality
Perfect automated parsing is not happening. But Azure plus human verification gets the best of both worlds: fast bulk extraction with guaranteed accuracy. Your totals will match Flying Blue’s records because you verified them.
Still iterating. Hopefully Flying Blue implements a proper export one day.