Hah.
So the testing program isn't currently designed well enough to evaluate individual airports? What a surprise. Not statistically significant = Security Theatre.
The statistics of low-rate event detection make such systems very difficult to design or even evaluate, and TSA's ad hoc covert testing program looks like yet another layer of smoke and mirrors in its security theatre.
From the article:
• Determine whether TSA's existing testing framework and its resulting data can achieve airport-level -- or only national-level -- statistical significance.
If your measurement system isn't designed to provide statistically significant results, it isn't useful for measuring.
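To put rough numbers on that, here is a quick sketch of how wide the uncertainty on a detection rate gets when you only have a handful of covert tests per airport. The test counts and the 80% detection rate are made up for illustration (nothing here comes from the article or from TSA data), and the Wilson score interval is just one reasonable choice of interval for a binomial proportion.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)
    ) / denom
    return center - half, center + half

# Hypothetical test counts: one airport, a region, the whole country.
# The 80% observed detection rate is an assumption for illustration only.
for trials in (10, 100, 5000):
    detected = round(0.8 * trials)
    low, high = wilson_interval(detected, trials)
    print(f"{trials:5d} tests: observed {detected / trials:.0%}, "
          f"95% interval {low:.1%} to {high:.1%}")
```

With only ten tests the interval spans a large chunk of the possible range, so two airports with very different real detection rates could easily look identical. Only the pooled, national-scale count gives a tight estimate.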
Deming would be embarrassed for TSA.
If the covert testing process isn't statistically significant for national, airport, (or individual) measurement, what is TSA using it for? Training and motivating (torturing?) its workers?
(A real red team would set off its fake bombs in the checkpoint line, rather than use testers with metal knees to guarantee secondaries.)