Lesson 06 of 6

Data-Driven Fraud Detection: Chapter 6

Overview

This episode explores how technology is changing the way organizations detect and investigate fraud. We'll break down proactive, data-driven methods, look at practical tools like Benford's Law and z-scores, and discuss how to use analytics on financial statements. Whether you’re new to fraud examination or looking for deeper insights, join Maya and David for a practical guide to catching the 'needle in the haystack.'

Transcript

Fraud Examination: Detect, Prevent, and Investigate Fraud: Data-Driven Fraud Detection: Chapter 6 — full transcript

The Shift to Data-Driven Fraud Detection

Maya Collins: Hey everyone, welcome back to Fraud Examination! I’m Maya Collins, and, as always, I’m here with David Miller. Today, we’re jumping into something I’m genuinely excited about—how technology is totally changing the way we detect and investigate fraud. Ready for this, David?

David Miller: Absolutely, Maya. You know, compared to the old days of fraud detection—manually flipping through ledgers and doing the odd spot-check—what we have now with data-driven analysis is, well, night and day. But before we get all excited, let’s clear something up right at the start: people sometimes lump errors and fraud together, but they’re really different beasts.

Maya Collins: Totally. So, errors—think accidental stuff, like a double payment because the printer jammed or maybe someone misunderstood a form. There’s no criminal intent. But fraud… fraud is deliberate. That’s someone finding a loophole or tricking the system for personal gain. And, unlike errors, fraud hides itself in just a few spots in all that data, not evenly sprinkled everywhere.

David Miller: Right. From the old-school audit side, sampling and spot-checking aren’t very useful for fraud. I mean, if you’re looking for mistakes scattered throughout, sure—sampling can work for stuff like errors. But fraud? You need to dig through the whole haystack, not just pick out a handful of straws. That’s where technology, and full-population analysis, really starts pulling its weight.

Maya Collins: Which is also why data-driven fraud detection is more proactive, right? Instead of waiting for someone to send in a tip or stumble onto an anomaly, you start by identifying where fraud could happen. You build this hypothesis: “Okay, if someone is gonna cheat the system, what would that look like?” And then you use technology to go look for those symptoms, even before anything obvious pops up.

David Miller: Yeah, and as a compliance officer, that kind of thinking’s saved my hide more than once. Actually, there was this case—let me try to remember the year, maybe 2014?—where we had this whole vendor fraud scheme ticking away in the background. Our manual checks kept missing it because these guys were smart. Just small invoices, each just under our approval limits. But once we switched to a tech-based audit—running everything through a digital analysis tool—it flagged all those nearly identical invoices under the approval threshold. Wouldn’t have caught that with a handful of samples. I’m still kind of embarrassed how long those spot-checks let it slip by!
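
The pattern David describes, many invoices clustered just below an approval limit, can be screened for across the full population rather than a sample. The following is a minimal Python sketch, not the tool used in that case: the function name `flag_near_threshold`, the 5% band, the minimum count of three, and the sample data are all illustrative assumptions.

```python
from collections import defaultdict


def flag_near_threshold(invoices, limit, band=0.05, min_count=3):
    """Return vendors with at least `min_count` invoices just under `limit`.

    invoices: iterable of (vendor, amount) pairs.
    band: how far below the limit still counts as "just under" (assumed 5%).
    """
    lower = limit * (1 - band)
    hits = defaultdict(list)
    for vendor, amount in invoices:
        if lower <= amount < limit:
            hits[vendor].append(amount)
    return {v: amts for v, amts in hits.items() if len(amts) >= min_count}


# Hypothetical data with an assumed approval limit of 5,000.
sample = [
    ("Acme", 4950.00), ("Acme", 4980.00), ("Acme", 4990.00),
    ("Bolt", 1200.00), ("Bolt", 4900.00), ("Cole", 310.00),
]
flagged = flag_near_threshold(sample, limit=5000)
print(flagged)  # {'Acme': [4950.0, 4980.0, 4990.0]}
```

A single invoice near the limit is unremarkable, which is why the sketch only reports vendors with repeated hits. The band and count would need tuning to the organization’s actual approval rules.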

Maya Collins: That’s such a classic example! And it really drives home why understanding the business and proactively posing “what-if” scenarios is so key. It’s not just waiting for red flags—it’s anticipating where they might crop up. So, if you’re listening and think your organization’s got a solid handle on fraud just because you run a few checks, maybe time to rethink that toolbox.

Key Tools and Techniques for Fraud Analytics

Maya Collins: Alright, so we’ve set the scene for thinking proactively. Let’s get into the good stuff—actual, practical tools. David, you wanna kick off with Benford’s Law? It sounds so technical but it’s honestly kinda fun.

David Miller: Yeah, I’ll give it a shot. So, Benford’s Law is this quirky property of numbers (maybe “quirky” isn’t the right word, but go with me) where smaller digits show up more often as the first digit of naturally occurring numbers in real-world data sets. Like, “1” as a first digit pops up about 30% of the time, while “9” shows up less than 5% of the time. Totally counterintuitive. So, if you’re looking at, say, invoice amounts, and the distribution of those first digits veers off what Benford’s Law predicts… well, you might wanna dig deeper.

Maya Collins: Yeah, and this isn’t just theoretical. There’s that case with MBO Corporation—they analyzed hundreds of thousands of supplier invoices and checked if digit distributions matched what Benford’s Law expected. With their pooled supplier data, it all looked normal. But a couple of individual suppliers? Suddenly, those first digits weren’t lining up with the prediction. Like, way too many “9s” showing up as the first digit. Total red flag!

David Miller: Exactly. And Benford’s works because fraudsters, when they make up numbers, tend to break that natural digit pattern without realizing it. They might think they’re being clever, but the math rats them out.
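
For readers who want to try the first-digit check, here is a minimal Python sketch. Benford’s Law gives the expected share of leading digit d as log10(1 + 1/d), which is about 30.1% for “1” and about 4.6% for “9”. The function names and the supplier data below are hypothetical and are not taken from the MBO case.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of `digit` (1-9) as a leading digit under Benford's Law."""
    return math.log10(1 + 1 / digit)


def first_digit(amount):
    """Leading nonzero digit of an amount, or None if the amount is zero."""
    digits = str(abs(amount)).lstrip("0.")
    return int(digits[0]) if digits else None


def first_digit_report(amounts):
    """Map each digit 1-9 to a pair (observed share, expected share)."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    counts = Counter(digits)
    total = len(digits)
    return {
        d: (counts[d] / total if total else 0.0, benford_expected(d))
        for d in range(1, 10)
    }


# Hypothetical invoices for one supplier, with suspiciously many leading 9s.
supplier_invoices = [9100, 9450, 9800, 9200, 1200, 9700, 2300, 9900, 9300, 1500]
report = first_digit_report(supplier_invoices)
for digit, (observed, expected) in report.items():
    print(f"{digit}: observed {observed:.1%}, expected {expected:.1%}")
```

Ten invoices are far too few for a meaningful comparison; the sample only shows the mechanics. On a real population you would likely run this per supplier, as in Maya’s example, and use a formal goodness-of-fit test such as chi-square before treating a deviation as a lead.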

Maya Collins: That’s only one approach. Another one I love is outlier detection: using z-scores to spot transactions that just don’t belong. A z-score tells you how many standard deviations a value sits from the average. Say you’re combing through invoice amounts from Vendor X, and one is (what was it?) $1,325 when most others are a few hundred bucks. You run the z-scores and, boom, as a rule of thumb anything beyond plus or minus 2 is considered statistically unusual. That one sticks out and needs a closer look.
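
The z-score check Maya mentions can be sketched in a few lines of Python. A z-score is (value - mean) / standard deviation. The invoice amounts below are invented for illustration, apart from the $1,325 figure from the conversation, and the cutoff of 2 is the rule of thumb she cites rather than a fixed standard.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z-score of each value: (value - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values identical: nothing deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def flag_outliers(values, cutoff=2.0):
    """Return the values whose absolute z-score exceeds `cutoff`."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > cutoff]


# Hypothetical invoice amounts for one vendor.
amounts = [210, 305, 280, 190, 340, 260, 1325, 295, 230, 315]
print(flag_outliers(amounts))  # [1325]
```

Here the mean is 375 and the standard deviation is roughly 320, so the $1,325 invoice scores close to 3 while every other invoice stays under 1. One caveat: in a small group a single extreme value inflates the standard deviation, so outliers can hide themselves. Computing z-scores per vendor over a reasonably large history helps.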

David Miller: You can combine that with stratification and summarization, too. It’s just about slicing your data into groups—by vendor, by department, by whatever makes sense—so you can see what’s ordinary, and what’s… well, not. Summarization’s when you take those groups and crunch them down, like pivot tables in Excel, to surface trends fast.
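
Stratification and summarization, as David describes them, amount to grouping records by a key and reducing each group to a few statistics, which is what a pivot table does. Below is a minimal standard-library Python sketch; the record layout (`vendor`, `department`, `amount`) and the function name `summarize_by` are assumptions made for the example.

```python
from collections import defaultdict


def summarize_by(records, key):
    """Group records by `key` and summarize the `amount` field of each group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["amount"])
    return {
        group: {
            "count": len(amounts),
            "total": sum(amounts),
            "average": sum(amounts) / len(amounts),
            "largest": max(amounts),
        }
        for group, amounts in groups.items()
    }


# Hypothetical transaction records.
records = [
    {"vendor": "Acme", "department": "IT", "amount": 250.0},
    {"vendor": "Acme", "department": "IT", "amount": 300.0},
    {"vendor": "Bolt", "department": "Ops", "amount": 1200.0},
    {"vendor": "Acme", "department": "Ops", "amount": 350.0},
]
by_vendor = summarize_by(records, "vendor")
by_department = summarize_by(records, "department")
print(by_vendor["Acme"])
```

The same data sliced two ways can tell different stories, so it is usually worth summarizing by several keys before deciding which groups deserve a closer look.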

Maya Collins: Oh, don’t get me started on pivot tables. I’m obsessed! Let me give a quick example. So on a consulting project, we suspected that employees were gaming the system through fake vendor accounts. I imported the employee and vendor lists into Excel and used fuzzy matching (think Soundex or n-grams, for my fellow nerds) to compare names and addresses even if they weren’t 100% identical. Plugged the matches into a pivot table, and bam, it turned up a textbook “kickback” setup. The employee and vendor had nearly matching addresses. Hard to call that a coincidence.
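
Maya did her matching in Excel; the n-gram idea itself can be illustrated with a short Python sketch. It breaks each address into overlapping two-character chunks and scores two addresses by how many chunks they share (Jaccard similarity). The 0.6 threshold, the function names, and the names and addresses are all hypothetical.

```python
def ngrams(text, n=2):
    """Set of overlapping n-character chunks, ignoring case and punctuation."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum())
    return {cleaned[i:i + n] for i in range(len(cleaned) - n + 1)}


def similarity(a, b, n=2):
    """Jaccard similarity of the n-gram sets: shared chunks / all chunks."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def match_addresses(employees, vendors, threshold=0.6):
    """Return (employee, vendor, score) for address pairs scoring >= threshold."""
    matches = []
    for emp_name, emp_addr in employees:
        for ven_name, ven_addr in vendors:
            score = similarity(emp_addr, ven_addr)
            if score >= threshold:
                matches.append((emp_name, ven_name, score))
    return matches


# Hypothetical employee and vendor master data.
employees = [("J. Rivera", "12 Oak Street Apt 4"), ("K. Lund", "980 Pine Road")]
vendors = [("Rivera Supply Co", "12 Oak St. Apt 4"), ("Northwind LLC", "77 Harbor Blvd")]
matches = match_addresses(employees, vendors)
for emp, ven, score in matches:
    print(f"{emp} <-> {ven}: {score:.2f}")
```

“12 Oak Street Apt 4” and “12 Oak St. Apt 4” would fail an exact comparison but score about 0.71 here. The nested loop compares every employee with every vendor, which is fine for small lists but would need some pre-grouping, for example by postal code, on large ones.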

David Miller: It’s such a leap ahead from what we discussed in earlier episodes, like just looking for red flags or relying on whistleblower tips. Now you can actually run digital analysis and outlier checks on every transaction, not just waiting for someone to notice something weird in the break room.

Maya Collins: Right. And don’t forget, all these techniques—Benford’s Law, z-scores, pivot tables, fuzzy matching—they’re not just neat tricks. They’re really practical for catching fraud that old-school methods probably never would. Especially as more businesses are sitting on piles of data and finally figuring out what to do with all of it.

From Data Access to Financial Statement Analysis

Maya Collins: But all this tech talk brings up a question that always hits home for me—how do you even get the right data in the first place? Like, it sounds simple: just pull the data, do the analysis, profit. But, no, the technical side of accessing and prepping data can be the hardest part.

David Miller: Yeah, I couldn’t agree more. In fact, getting “the right data, at the right time, in the right format” is practically the motto of modern fraud investigation. You’ve got your shiny tools like ACL, IDEA, and even Microsoft Office with ActiveData. But none of these work without a good data pipeline. Sometimes you get real-time access through things like ODBC, Open Database Connectivity, which lets you query data straight from the source system, nice and clean. Other times it’s a headache of pulling CSVs, text files, or even wrangling data from an awkward data warehouse.

Maya Collins: And getting all that data isn’t just about dumping it into a spreadsheet. You still gotta prep it—convert columns, clean up values, check for missing stuff, standardize dates. Otherwise, you’re gonna get garbage results and chase ghosts. Data prep is so much more painful than people realize!
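
The prep steps Maya lists (converting columns, cleaning values, checking for missing data, standardizing dates) might look like the following minimal Python sketch. The accepted date formats, the field names, and the decision to set aside unparseable rows instead of guessing are assumptions for illustration. In particular, a date such as 03/04/2024 is ambiguous between month-first and day-first, and the sketch assumes month-first.

```python
from datetime import datetime

# Assumed input formats; extend to match the actual source system.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def parse_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def parse_amount(text):
    """Convert strings like '$1,325.00' or '(250.50)' to float, else None."""
    cleaned = text.strip().replace("$", "").replace(",", "")
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]  # accounting-style negative
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_rows(rows):
    """Split rows into (clean, rejected); rejected rows need manual review."""
    clean, rejected = [], []
    for row in rows:
        date = parse_date(row.get("date", ""))
        amount = parse_amount(row.get("amount", ""))
        if date is None or amount is None:
            rejected.append(row)
        else:
            clean.append({**row, "date": date, "amount": amount})
    return clean, rejected


# Hypothetical raw export with mixed formats and one missing date.
raw = [
    {"vendor": "Acme", "date": "03/15/2024", "amount": "$1,325.00"},
    {"vendor": "Bolt", "date": "2024-03-16", "amount": "(250.50)"},
    {"vendor": "Cole", "date": "", "amount": "310"},
]
clean, rejected = clean_rows(raw)
print(len(clean), "clean,", len(rejected), "rejected")  # 2 clean, 1 rejected
```

Keeping the rejected rows visible matters in a fraud context: silently dropping records with missing fields could discard exactly the transactions someone took care to obscure.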

David Miller: Once you finally have your data set up, there’s the analysis itself. When it comes to financial statements (which, let’s be real, is probably what most auditors are looking at), vertical analysis and horizontal analysis are key. Vertical is about expressing each line item as a percentage of a base figure, say total assets on the balance sheet or revenue on the income statement, so anomalies pop right out. Horizontal is comparing the same line items across periods to see what’s really moving around year to year.
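
To make the two analyses concrete, here is a minimal Python sketch. Vertical analysis divides every line item by a base item in the same period; horizontal analysis computes the percentage change of each item from one period to the next. The income statement figures and function names are hypothetical.

```python
def vertical_analysis(statement, base_item):
    """Each line item as a percentage of `base_item` in the same period."""
    base = statement[base_item]
    return {item: amount / base * 100 for item, amount in statement.items()}


def horizontal_analysis(prior, current):
    """Percentage change per line item from the prior to the current period."""
    return {
        item: (current[item] - prior[item]) / prior[item] * 100
        for item in current
        if item in prior and prior[item] != 0
    }


# Hypothetical income statement figures for two years.
prior_year = {"Revenue": 1000.0, "Cost of goods sold": 600.0, "Operating expenses": 250.0}
current_year = {"Revenue": 1100.0, "Cost of goods sold": 720.0, "Operating expenses": 260.0}

vertical = vertical_analysis(current_year, "Revenue")
horizontal = horizontal_analysis(prior_year, current_year)
for item in current_year:
    print(f"{item}: {vertical[item]:.1f}% of revenue, {horizontal[item]:+.1f}% vs prior year")
```

In this made-up data, revenue grew 10% while cost of goods sold grew 20%, pushing it from 60% to about 65% of revenue. That kind of divergence is not proof of anything, but it is the sort of question the analysis is meant to raise.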

Maya Collins: And ratios. Can’t forget the ratios—current ratio, quick ratio for liquidity, accounts receivable turnover, inventory turnover, debt-to-equity for solvency, profit margin… I mean, I could keep going, but listeners know I geek out over these things. These ratios make it easier to ask, “Hey, why is this company’s debt suddenly spiking, or why is profit margin plunging?” Sometimes the story you uncover is innocent, but sometimes… that’s where you find some pretty creative fraud.
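
The ratios Maya names can be computed directly from a handful of statement figures. The sketch below uses common textbook definitions, which vary somewhat between sources: the quick ratio is taken as current assets less inventory over current liabilities, and receivables turnover assumes all revenue is credit sales. All input figures and field names are hypothetical.

```python
def financial_ratios(f):
    """Compute common liquidity, activity, solvency, and profitability ratios.

    `f` is a dict of statement figures; the key names are assumptions
    made for this example.
    """
    return {
        # Liquidity
        "current_ratio": f["current_assets"] / f["current_liabilities"],
        "quick_ratio": (f["current_assets"] - f["inventory"]) / f["current_liabilities"],
        # Activity (assumes all revenue is sold on credit)
        "receivables_turnover": f["revenue"] / f["avg_receivables"],
        "inventory_turnover": f["cogs"] / f["avg_inventory"],
        # Solvency
        "debt_to_equity": f["total_liabilities"] / f["total_equity"],
        # Profitability
        "profit_margin": f["net_income"] / f["revenue"],
    }


# Hypothetical figures for one company and one year.
figures = {
    "current_assets": 500.0, "inventory": 200.0, "current_liabilities": 250.0,
    "total_liabilities": 600.0, "total_equity": 400.0,
    "revenue": 2000.0, "cogs": 1200.0, "net_income": 150.0,
    "avg_receivables": 250.0, "avg_inventory": 200.0,
}
ratios = financial_ratios(figures)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

A single year of ratios says little on its own. The questions Maya raises, such as why debt is suddenly spiking, come from comparing the same ratios across periods or against similar companies.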

David Miller: And that’s where I always circle back to: it’s not just about having tools, it’s about getting the right data and knowing what to do with it. Sure, the tech can pull numbers, but you have to know the business, understand its context, and ask the right questions. Otherwise, you’re just hoping for luck instead of actually uncovering the ‘needle in the haystack.’

Maya Collins: Exactly! If I had a dollar for every time I ran into a data set that was incomplete, months out of date, or weirdly formatted, I could… well, I probably wouldn’t need to do fraud analytics anymore. For everyone listening—think about your own organization. Who controls the data? How easy is it to get what you need? Maybe imagine your first step if a fraud tip turned up tomorrow.

David Miller: And if you’re sitting there thinking, “That’s too much work, we’ll wait for the whistleblowers,” just remember what we talked about in episodes four and five. Prevention’s great, but being able to spot fraud quickly makes all the difference. Alright, Maya, I think we covered a ton today. Any closing thoughts?

Maya Collins: Not much except—don’t be afraid of the data! Use the tools, learn the basics, and jump in. We’ll keep bringing you practical tips next time, so don’t go anywhere. Thanks for joining us!

David Miller: Appreciate you as always, Maya. Thanks folks—catch you in the next episode!