Fraud and Deception Detection: Five Language Fingerprints
Posted by Jason Apollo Voss on Mar 16, 2021 in Blog
Last month, I described how computer-aided text-based analysis can help uncover fraud and deception in company communications. But what other insights can we glean from this research into scandal companies?
We used Deception And Truth Analysis (D.A.T.A.) to examine 10 of the largest corporate scandals in recent history and found that the average lead time between our textual identification of deception and the public recognition of possible scandal was more than six years.
The obvious question is why. Why does it take regulators and markets so long to recognize these scandals? And a follow-up question: What insights from text-based analysis can we use to better identify these scandals earlier? Let’s take these in turn.
Theory: It’s the Behavior
Why does D.A.T.A. detect deception faster than acutely interested investors and regulators? After thinking about this for a while, we developed a theory, and it boils down to 86.5%. That is the percentage of financial information that is expressed in text, not in numbers, in annual reports. Text communications reveal the behavior of corporate management teams, and that behavior leads to the outcome that is expressed in numerical performance.
So that average of 6.6 years between the initial indication of deception and the moment the scandal breaks is how long a poorly behaving firm can fake it before it can no longer massage the numbers.
What is interesting is that the two scandals that took over a decade to recognize both involved financial companies: AIG and Lehman Brothers. Their annual reports ran in the hundreds of pages, and the velocity of money cycling through their balance sheets and income and cash flow statements was very, very high. Thus, it took considerable time for their poor behaviors and choices — the inputs — to eventually show up in the numbers, or the outputs.
If this theory is a valid explanation for that lead time, then scandal ought to have language fingerprints that investors can dust for as either an early warning system or as a second opinion on the normal fundamental work that investment research teams conduct.
Language that Reveals Possible Scandal
After examining the 10 scandals above as well as Wirecard and other more recent controversies, we identified five textual fingerprints that differ from those of more truthful companies by more than 50%.
In addition to text-based analysis, we also conducted one-on-one conversations to better distinguish deception from truth and to identify some of the more pan-cultural deceptive behaviors people engage in. Our findings aligned with what previous lie detection researchers had uncovered: each of the five potential deception indicators that surface in text-based analysis also occurs in person-to-person interviews.
So let’s drill a bit deeper into each of them.
1. Words Indicating Friendship
Lie detection researchers have shown that deceivers often employ obfuscation to create confusion. One way they do this is by using words that imply friendship more often than is the norm in business communications. Deceptive companies employ such terms 56.1% more often than the average, according to our analysis. So if an annual report includes a number of ingratiating terms, it may be evidence of obfuscation and deception.
But a distinction is crucial here: Words that indicate friendship — “friend,” “pal,” “neighbor,” and “gang,” for example — are different from friendly words.
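The distinction can be made concrete with a small sketch. It assumes simple lowercase tokenization and exact word matching; the function names and the per-1,000-words scaling are illustrative choices, not D.A.T.A.'s actual method, and the word list contains only the four examples quoted above.

```python
import re

# Illustrative word list: only the four examples named in the text.
FRIENDSHIP_WORDS = {"friend", "pal", "neighbor", "gang"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def category_rate(text, category_words):
    """Return occurrences of category words per 1,000 tokens (0.0 for empty text)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in category_words)
    return 1000.0 * hits / len(tokens)

# Hypothetical sentence, not taken from any real annual report.
sample = "Our friend and neighbor in the region remains a valued partner."
rate = category_rate(sample, FRIENDSHIP_WORDS)
```

Because matching is exact, a friendly word such as “friendly” is not counted, while “friend” is.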
2. Risky Words
Scandal firms favor words that indicate risk at a much higher proportion than the average company. These include such terms as “averse,” “avoid,” “concern,” “difficulty,” “prevent,” “stopped,” and so on. These types of words already tend to raise securities researchers’ hackles, and as we pointed out in the last piece, firms are proactively excising these kinds of “red flag” words from their annual reports.
3. Impersonal Pronouns
“Another,” “everybody,” “someone,” and “whichever” are the sort of impersonal pronouns that dishonest firms employ to a much greater extent — 54.1% more often — than their truthful peers. Why do they prefer to be impersonal in their communications? Researchers theorize that they are trying to create emotional space between themselves and those they wish to mislead.
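A figure like “54.1% more often” is a relative difference between two usage rates. The sketch below shows the arithmetic only; the function name and both input rates are hypothetical, chosen so that the result reproduces the 54.1% gap, and do not come from the underlying study.

```python
def percent_more(firm_rate, baseline_rate):
    """Percent by which firm_rate exceeds baseline_rate (negative if lower)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return 100.0 * (firm_rate - baseline_rate) / baseline_rate

# Hypothetical rates per 1,000 words, picked only to illustrate the formula.
baseline = 10.0   # assumed rate for truthful peers
firm = 15.41      # assumed rate for a scandal firm
gap = percent_more(firm, baseline)  # roughly 54.1
```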
4. Words That Indicate Difference
Lying is cognitively demanding. One manifestation of this is that during the act of deception, liars are often unable to make distinctions among competing points of view in their communications and so are less likely to draw comparisons. So the use of words that suggest difference is actually an indication of truthfulness. Constructions that present contrasting viewpoints, such as “as compared with other years . . . ,” are examples of this.
Deceivers also have an agenda: to convince their target to believe their preferred narrative. They are unlikely to draw distinctions between that narrative and others and will tend to focus on their preferred one.
5. Words That Negate a Statement
Research also indicates that liars often employ more negative terms than truth tellers. This is why we drew the distinction between words indicating friendship and words that are friendly.
But researchers do not always find that the deceivers are more negative than the truthful. Our analysis of dishonest firm communications suggests, however, that they use such words as “not,” “never,” “should not,” “does not,” and “must not” 50.4% more often than the average.
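Counting negations is slightly trickier than counting single words, because phrases such as “should not” contain the shorter term “not.” A minimal sketch, assuming a regular expression that tries longer phrases first so each negation is counted once; the term list is limited to the five examples quoted above, and the function name is illustrative.

```python
import re

# Negation terms quoted in the text; longer phrases come first so that
# "should not" is matched as one phrase rather than as a bare "not".
NEGATIONS = ["should not", "does not", "must not", "never", "not"]
PATTERN = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in NEGATIONS) + r")\b")

def count_negations(text):
    """Count non-overlapping negation terms in the text, case-insensitively."""
    return len(PATTERN.findall(text.lower()))

# Hypothetical sentences, not taken from any real annual report.
sample = "The company does not expect losses and has never restated. It is not at risk."
n = count_negations(sample)
```

The word boundaries keep “cannot” from being counted as “not”; whether it should count is a design choice this sketch leaves open.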
Bonus
So what is by far the strongest indicator of deception? The number of swear words in an annual report. Though they are rarities, swear words occur in scandal company annual reports a whopping 277.1% more frequently than the mean.
Originally published on Enterprising Investor.