EU Science Hub
News article, 12 December 2018, 2 min read

How a couple of digits can make the difference and help prevent fraud

Histogram of the occurrence frequencies of the first significant digit for a serial fraudster detected in customs data by our two-stage Newcomb-Benford analysis. The frequencies deviate considerably from the theoretical expectations (red curve).
© European Union, 2018

JRC scientists propose a novel application of statistical theory (the Newcomb-Benford law) to detect fraud in international trade.

Have you ever noticed how the first pages of big reference books and tables are more worn and smudged than later pages?

Or how, in certain large sets of data, some numbers start with a certain digit (other than zero) more often than others?

The probability of these events is explained by the Newcomb-Benford Law, or first digit law.

Fraud detection is one of the most prominent applications of this statistical theory.

To assess whether the law also provides a valid model for genuine (i.e. non-fraudulent) empirical observations, whose generating process cannot be known with certainty, the JRC, in collaboration with the Universities of Parma and Siena, established conditions for the validity of the Newcomb–Benford law in international trade data, a field where fraud typically involves huge amounts of money and constitutes a major threat to national budgets.

The scientists also provided approximations to the distribution of test statistics when the Newcomb–Benford law does not hold, thus opening the door to the development of statistical procedures with good inferential properties and wide applicability.

The results show that this may lead to an automatic method for flagging suspicious patterns of transactions in the international trade data collected by national customs services.

In practical terms, the EU customs officer now has an instrument to verify whether a declaration made by an importer is likely to have been manipulated.

The “manipulation flag” is generated by the JRC's procedure on the basis of the extent to which the frequency of the first significant digits for a given trader differs from the theoretical expectations, after appropriate statistical corrections have been made to match the specific features of customs data.
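
The JRC procedure itself is a two-stage analysis with corrections specific to customs data, and it is not reproduced here. Purely as an illustration of the underlying idea, the sketch below measures how far a trader's observed first-digit frequencies lie from the Newcomb-Benford expectations, using a textbook Pearson chi-square statistic. The function names and the choice of statistic are assumptions made for this example, not the published method.

```python
import math
from collections import Counter

# Newcomb-Benford probability that the first significant digit is d.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_significant_digit(value):
    """Return the first non-zero digit of a non-zero number (hypothetical helper)."""
    x = abs(value)
    if x == 0:
        raise ValueError("zero has no significant digit")
    # In scientific notation the first character is the leading digit,
    # e.g. 0.042 -> '4.2000000000000000e-02'. 17 significant digits are
    # enough to avoid rounding a value such as 9.9999999 up to 10.
    return int(f"{x:.16e}"[0])


def benford_chi_square(values):
    """Pearson chi-square distance between observed first-digit counts
    and the counts expected under the Newcomb-Benford law."""
    digits = [first_significant_digit(v) for v in values if v != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("no non-zero values to analyse")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in BENFORD.items()
    )


# Illustrative data only: powers of two are a classic Benford-like sequence,
# while evenly spread three-digit values are not.
benford_like = [2.0 ** k for k in range(1000)]
evenly_spread = list(range(100, 1000))

print(benford_chi_square(benford_like))
print(benford_chi_square(evenly_spread))
```

With nine digit classes the statistic has 8 degrees of freedom, so under the textbook approximation a value above roughly 15.5 would be unusual at the 5% level for data that truly follow the law. The statistical corrections mentioned above would change such a threshold for real customs data.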

The results are now published in the Proceedings of the National Academy of Sciences of the United States of America (PNAS).

Read more at:

https://www.pnas.org/content/early/2018/12/05/1806617115


Newcomb-Benford Law in a nutshell

The law states that in large sets of naturally occurring numbers that are related to one another (e.g. income tax data, mathematical tables, scientific data and now also customs declarations), small leading digits are the most frequent. For example, approximately 30% of the numbers in such a set have 1 as their first digit, about 17.6% start with 2, and so on down to less than 5% starting with 9. Intuition would instead suggest a uniform distribution, with each of the nine digits leading approximately 11.1% of the numbers. The law usually does not apply if the numbers are assigned or have a stated minimum and maximum.
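
In its standard formulation, the law gives the probability that the first significant digit equals d as log10(1 + 1/d), for d from 1 to 9. The short sketch below evaluates this formula for each digit; the function name is chosen for this example only.

```python
import math


def benford_probability(d):
    """Newcomb-Benford probability that the first significant digit is d (1 to 9)."""
    if d not in range(1, 10):
        raise ValueError("d must be an integer from 1 to 9")
    return math.log10(1 + 1 / d)


# Print the expected share of numbers starting with each digit.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The printed shares fall from 30.1% for the digit 1 and 17.6% for the digit 2 to 4.6% for the digit 9, and the nine probabilities sum to 1.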

Details

Publication date
12 December 2018