{"id":8924,"date":"2021-02-16T00:01:03","date_gmt":"2021-02-16T05:01:03","guid":{"rendered":"http:\/\/www.jasonapollovoss.local\/?p=8924"},"modified":"2021-03-15T09:34:59","modified_gmt":"2021-03-15T13:34:59","slug":"fraud-and-deception-detection-text-based-analysis","status":"publish","type":"post","link":"https:\/\/jasonapollovoss.com\/web\/2021\/02\/16\/fraud-and-deception-detection-text-based-analysis\/","title":{"rendered":"Fraud and Deception Detection: Text-Based Analysis"},"content":{"rendered":"<h3><span style=\"font-family: futural; font-size: 20px;\">Research analysis relies on our trust.<\/span><\/h3>\n<p><span style=\"font-family: futural; font-size: 20px;\">Among the many factors we consider as fundamental investors are assessments of a company\u2019s strategy, products, supply chain, employees, financing, operating environment, competition, management, adaptability, and so on. Investment professionals conduct these assessments to increase our understanding, yes, but also to increase our trust in the data and the people whose activities the data measure. If we cannot trust the data and the people who created it, then we will not invest. In short, we must\u00a0<em>trust\u00a0<\/em>management.<\/span><\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-family: futural; font-size: 20px;\">Our fraud and deception detection methods are only okay.<\/span><\/h3>\n<p><span style=\"font-family: futural; font-size: 20px;\">But by what repeatable method can we evaluate the trustworthiness of companies and their people? Usually the answer is some combination of financial statement analysis and \u201ctrust your gut.\u201d Here is the problem with that:<\/span><\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-family: futural; font-size: 20px;\">1. Time and Resource Constraints<\/span><\/h3>\n<p><span style=\"font-family: futural; font-size: 20px;\">Companies communicate information through words more than numbers. 
For example, from 2009 to 2019, the annual reports of the Dow Jones Industrial Average\u2019s component companies tallied just over 31.8 million words and numbers combined, according to AIM Consulting. Numbers only made up 13.5% of the total.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">Now, JP Morgan\u2019s 2012 annual report is 237,894 words. Let\u2019s say an average reader can read and comprehend about 125 words per minute. At this rate, it would take a research analyst approximately 31 hours and 43 minutes to thoroughly read the report.\u00a0<a href=\"https:\/\/www.wallstreetmojo.com\/mutual-fund-analyst-complete-guide\/\">The average mutual fund research analyst in the United States makes around $70,000 per year,<\/a>\u00a0according to\u00a0<em>WallStreetMojo<\/em>. So that one JP Morgan report costs a firm more than $1,100 to assess. If we are already invested in JP Morgan, we\u2019d perform much of this work just to ensure our trust in the company.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">Moreover, quantitative data is always publicly released with a significant time lag. Since a company\u2019s performance is usually disclosed quarterly and annually, the average time lag for such data is slightly less than 90 days. And once the data becomes public, whatever advantage it offers is quickly traded away. Most investment research teams lack the resources to assess every company in their universe or portfolio in near real time, or just after a quarterly or annual report is released.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\"><strong>Conclusion: What is that old line? Oh, yeah: Time is money.<\/strong><\/span><\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-family: futural; font-size: 20px;\">2. 
Trusting our gut does not work.<\/span><\/h3>\n<p><span style=\"font-family: futural; font-size: 20px;\">Despite the pan-cultural fiction to the contrary, research demonstrates we cannot detect deception through body language or gut instinct.\u00a0<a href=\"https:\/\/journals.sagepub.com\/doi\/10.1207\/s15327957pspr1003_2\">In fact, a meta-analysis of our deception-spotting abilities found a global success rate just 4% better than chance<\/a>. We might believe that as finance pros we are exceptional. We would be wrong.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\"><a href=\"https:\/\/www.tandfonline.com\/doi\/abs\/10.1080\/15427560.2017.1276069\">In 2017, we measured deception detection skills among finance professionals. It was the first time our industry\u2019s lie detection prowess had ever been put to the test<\/a>. In short: ouch! Our overall success rate is actually worse than that of the general population: We did not score 54%, we earned an even-worse-than-a-coin-toss 49.4%.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">But maybe our strengths are in our own sector. Put us in a finance setting, say on an earnings call, and we\u2019ll do much better, right? Nope, not really. In investment settings, we could detect deception just 51.8% of the time.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">There is more bad news here (sorry): Finance pros have a strong truth bias. We tend to trust other finance pros way more than we should. Our research found that we only catch a lie in finance 39.4% of the time. So that 51.8% accuracy rate is due to our tendency to believe our fellow finance pros.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">One other tidbit: When assessing statements outside of our domain, we have a strong 64.9% deceptiveness bias. 
Again, this speaks to our industry\u2019s innate sense of exceptionalism.\u00a0<a href=\"https:\/\/www.tandfonline.com\/doi\/abs\/10.1080\/15427560.2015.1034862\">In an earlier study<\/a>, our researchers found that we believe we are told 2.14 lies per day\u00a0<em>outside of work<\/em>\u00a0settings, and just 1.62 lies per day\u00a0<em>in work<\/em>\u00a0settings. This again speaks to the truth bias within finance.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">Finally, we believe we can detect lies within finance at a 68% accuracy rate, not the actual 51.8% measured. Folks, this is the very definition of overconfidence bias and is delusion by another name.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\"><strong>Conclusion: We cannot trust our guts.<\/strong><\/span><\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-family: futural; font-size: 20px;\">3. Auditors\u2019 techniques audit numbers.<\/span><\/h3>\n<p><span style=\"font-family: futural; font-size: 20px;\">But what about auditors? Can they accurately evaluate company truthfulness and save us both time and money? Yes, company reports are audited. But auditors can only conduct their analyses through a micro-sampling of transactions data. Worse still, auditors\u2019 techniques, like ours, are largely focused on that very small 13.5% of information that is captured numerically. That leaves out the 86.5% of text-based content.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">Further, because financial statement analysis \u2014 our industry\u2019s fraud detection technique \u2014 is one step removed from what the auditors see, it is hardly reliable. Indeed, financial statement analyses are just table stakes: Ours probably won\u2019t differ much from those of our competitors. 
Just looking at the same numbers as everybody else is unlikely to prevent fraud or generate alpha.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">And what about private markets? The investment research community has spent an awful lot of time looking for investment opportunities in that space in recent years. But while private market data are sometimes audited, they lack the additional enforcement mechanism of public market participants\u2019 due-diligence and trading activities. These can sometimes signal fraud and deception.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\"><strong>Conclusion: There has to be another tool to help us fight deception.<\/strong><\/span><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-family: futural; font-size: 20px;\">Scientifically Based Text Analyses to the Rescue<\/span><\/h3>\n<p><span style=\"font-family: futural; font-size: 20px;\"><a href=\"https:\/\/www.bloomsbury.com\/us\/the-secret-life-of-pronouns-9781608194964\">Starting with James W. Pennebaker\u2019s pioneering work<\/a>, researchers have applied natural language processing (NLP) to analyze verbal content and estimate a transcript\u2019s or written document\u2019s credibility. Computers extract language features from the text, such as word frequencies, psycholinguistic details, or negative financial terms, in effect, dusting for language fingerprints. How do these automated techniques perform?\u00a0<a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0001691820305746\">Their success rates are between 64% and 80%<\/a>.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">In personal interactions, as we noted, people can detect lies approximately 54% of the time. 
But their performance worsens when assessing the veracity of text.\u00a0<a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0001691820305746\">Research published in 2021 found that people have about a 50%, or coin-flip, chance of identifying deception in text. A computer-based algorithm, however, had a 69% chance.<\/a><\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">But surely adding people to the mix improves the accuracy? Not at all. Our overconfidence as investors sabotages our ability to catch deception even in human-machine hybrid models. The same researchers explored how human subjects evaluated computer judgments of deception that they could then overrule or tweak. When humans could overrule, the computer\u2019s accuracy dropped to a mere 51%.\u00a0<a href=\"https:\/\/www.sciencedirect.com\/science\/article\/pii\/S0001691820305746\">When human subjects could tweak the computer judgments in a narrow range around the algorithms\u2019 evaluation, the hybrid success rate fell to 67%<\/a>.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">Computers can give investment pros a huge advantage in evaluating the truthfulness of company communications, but deception detection methods are not one size fits all.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">One computer-driven text-based analysis,\u00a0<a href=\"https:\/\/onlinelibrary.wiley.com\/doi\/10.1111\/j.1540-6261.2010.01625.x\">published in 2011<\/a>, could predict negative stock price performance for companies whose 10-Ks included a higher percentage of negative words. 
By scanning documents for words and phrases associated with the tone of financial communications, this method searched for elements that may indicate deception, fraud, or poor future financial performance.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">Of course, those businesses whose stock prices were hurt by this technique adapted. They removed the offending words from their communications altogether.\u00a0<a href=\"https:\/\/papers.ssrn.com\/sol3\/papers.cfm?abstract_id=3683802\">Some executives even hired speech coaches to avoid ever uttering them<\/a>. So word-list analyses have lost some of their luster.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h3><span style=\"font-family: futural; font-size: 20px;\">Where Do We Go from Here?<\/span><\/h3>\n<p><span style=\"font-family: futural; font-size: 20px;\">It may be tempting to dismiss all text-based analyses. But that would be a mistake. After all, we have not thrown away financial statement analysis, right? No, instead we should seek out and apply the text-based analyses that work. That means methods that are not easily spoofed, that assess\u00a0<em>how language is used<\/em>\u00a0\u2014 its structure, for example \u2014 not\u00a0<em>what language is used<\/em>.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">With these issues in mind, we developed\u00a0<a href=\"http:\/\/www.deceptionandtruthanalysis.com\/\">Deception And Truth Analysis (D.A.T.A.)<\/a>\u00a0with\u00a0<a href=\"http:\/\/vision.orbitfin.ai\/\">Orbit Financial<\/a>. Based on a\u00a0<a href=\"https:\/\/jasonapollovoss.com\/webwp-content\/uploads\/2018\/08\/Lie-Detection-Guide-Theory-and-Practice-for-Investment-Professionals.pdf\">10-year investigation of those deception technologies that work in and out of sample<\/a>\u00a0\u2014 hint: not reading body language \u2014 D.A.T.A. 
examines more than 30 language fingerprints in five separate scientifically proven algorithms to determine how these speech elements and language fingerprints interact with one another.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">The process is similar to that of a standard stock screener. That screener identifies the performance fingerprints we want and then applies these quantitative fingerprints to screen an entire universe of stocks and produce a list on which we can unleash our financial analysis. D.A.T.A. works in the same way.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">One key language fingerprint, for example, is the use of articles such as a, an, and the. An excess of these is associated more with deceptive than with truthful speech. But article frequency is only one component: How the articles are used is what really matters. And since articles are directly connected to nouns, D.A.T.A. is hard to outmaneuver. A potential dissembler would have to alter how they communicate, changing how they use their nouns and how often they use them. This is not an easy task and, even if successful, would counteract only a single D.A.T.A. language fingerprint.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\">The other key findings from recent D.A.T.A. tests include the following:<\/span><\/p>\n<ul>\n<li><span style=\"font-family: futural; font-size: 20px;\"><strong>Time and Resource Savings<\/strong>: D.A.T.A. assesses over 70,400 words per second, or the equivalent of a 286-page book. That is a 99.997% time savings over people and a cost savings of more than 90%.<\/span><\/li>\n<\/ul>\n<ul>\n<li><span style=\"font-family: futural; font-size: 20px;\"><strong>Deception Accuracy:<\/strong>\u00a0Each of the five algorithms has a measured deception detection accuracy rate far above what people can achieve in text-based analyses. Moreover, the five-algorithm combination makes D.A.T.A. 
difficult to work around. We estimate its accuracy exceeds 70%.<\/span><\/li>\n<\/ul>\n<ul>\n<li><span style=\"font-family: futural; font-size: 20px;\"><strong>Fraud Prevention:<\/strong>\u00a0D.A.T.A. could identify the 10 largest corporate scandals of all time \u2014 think Satyam, Enron \u2014 with an average lead time in excess of six years.<\/span><\/li>\n<\/ul>\n<ul>\n<li><span style=\"font-family: futural; font-size: 20px;\"><strong>Outperformance:<\/strong>\u00a0In one D.A.T.A. test, we measured the deceptiveness of each component of the Dow Jones Industrial Average each year. In the following year, we bought all but the five most deceptive Dow companies. From 2009 through 2019, we repeated the exercise at the start of each year. This strategy produced an average annual excess return of 1.04% despite the sometimes nine-month lag in implementing it.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-family: futural; font-size: 20px;\">The writing is on the wall. Text-based analysis that leverages computer technology to detect fraud and deception results in significant savings in both time and resources. Future articles in this series will detail more D.A.T.A. test results and the fundamental analysis wins that this kind of technology makes possible.<\/span><\/p>\n<p><span style=\"font-family: futural; font-size: 20px;\"><strong>Originally published on <em><a href=\"http:\/\/blogs.cfainstitute.org\/investor\/follow-the-enterprising-investor\/\">Enterprising Investor<\/a><\/em>.<\/strong><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Research analysis relies on our trust. Among the many factors we consider as fundamental investors are assessments of a company\u2019s strategy, products, supply chain, employees, financing, operating environment, competition, management, adaptability, and so on. 
Investment professionals conduct these assessments to increase our understanding, yes, but also to increase our trust in the data and the [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":8923,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[3],"tags":[421,423,38],"class_list":["post-8924","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-the-blog","tag-deception-and-truth-analysis","tag-deception-detection","tag-lie-detection"],"_links":{"self":[{"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/posts\/8924","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/comments?post=8924"}],"version-history":[{"count":0,"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/posts\/8924\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/media\/8923"}],"wp:attachment":[{"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/media?parent=8924"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/categories?post=8924"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jasonapollovoss.com\/web\/wp-json\/wp\/v2\/tags?post=8924"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}