Sunday, November 6, 2016

How forensics really works (a post for my mom)

Dear Mom,

Forensics doesn't work like you see on NCIS, where Abby does the impossible, doing chip-off and data recovery and wrapping up the case just in time to make an arrest.  Malware reversing doesn't work like you saw in CSI: Cyber, where you just drop the code in a hex editor and the exploits and malware are conveniently colored red (on a green background).  Terrorists can't really take down the whole air traffic control network, but if they could, I wouldn't drive down the runway in a Porsche to download the software.

Forensics is technical and hard.  It doesn't happen in the span of a single hour-long show.  While you've probably heard that nothing is EVER deleted, that's not really true.  The more time that goes by between a deletion and the investigation, the less likely I am to recover anything.  Think about that "The First 48" show you like so much.  Sure, most of the time police get called right after the murder and get a fresh crime scene.  But remember that episode where the police get called to a vacant lot months after the murder happened?  That murder scene was fresh once too, but not when the police got there.

Forensics is (usually) a lot like that episode.  Instead of getting a fresh crime scene, there was a thunderstorm that washed away some blood evidence. Some drunk college kid came by and urinated on the ground near the body.  A young girl saw something shiny in the lot while walking home and removed a shell casing because it was "pretty" and she wanted it for her dolly.  Three hoboes having an orgy came through and moved the body to use it as a mattress.  The crime scene is a mess.  Yeah, forensics is a lot like this crime scene - it's just a mess.

But for all the bad, forensics isn't all manual either.  I'm not individually opening each file on the system and looking for keywords in your documents.  I don't manually search through your browser history either.  There are a number of sites I don't ever want to see your password for.  What investigative value could your saved password for Amazon have?  The only value I see is you accusing me of purchasing something on your behalf.  I have filters to remove this sort of private information.

And if you have tens or hundreds of thousands of files (say for instance emails) I'm going to use automated tools to reduce this number to something I can manually examine.  All the interns in the world won't be able to cull through a hundred thousand emails in a timely fashion.  I'm going to sort files into four bins:
  • Files that match a certain keyword of interest (blacklist)
  • Files that match a certain keyword you DON'T want to see (whitelist)
  • Files that contain words from both the whitelist and the blacklist
  • Files that don't match anything
Depending on the parameters of my investigation, I might only look at those files that match the blacklist words.  In other cases, I'll also examine those that match both the blacklist and the whitelist. Often, this overlapping category is investigated by another independent investigator since items on the whitelist are often very sensitive.  Even among documents that match my search terms, there may be many that are already well known (their cryptographic hashes match known files or files I've already examined), and I can set those aside.  Using this method, I can get through tens of thousands of files quickly, provided my search terms are correctly defined.
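For the curious, here is a minimal Python sketch of the sorting described above.  It is only an illustration, not how any particular forensic tool or agency does it: the function names, the bin labels, and the sample keywords are all made up for this example, and real tools work on parsed mail stores rather than plain strings.  Files whose hash is already known are set aside first (in an extra "known" pile), then everything else lands in one of the four bins.

```python
import hashlib
import re


def tokenize(text):
    """Lower-case the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def triage(documents, blacklist, whitelist, known_hashes=frozenset()):
    """Sort documents into the four bins, after setting aside known files.

    documents:    dict mapping a file name to its text (hypothetical input shape)
    blacklist:    keywords of interest to the investigation
    whitelist:    keywords marking sensitive material the examiner should not see
    known_hashes: SHA-256 hex digests of files already examined or well known
    """
    blacklist = {word.lower() for word in blacklist}
    whitelist = {word.lower() for word in whitelist}
    bins = {"blacklist": [], "whitelist": [], "both": [], "neither": [], "known": []}

    for name, text in documents.items():
        # Identical content always produces the same digest, so a file that
        # was already reviewed can be skipped without being opened again.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in known_hashes:
            bins["known"].append(name)
            continue

        words = tokenize(text)
        hits_blacklist = bool(words & blacklist)
        hits_whitelist = bool(words & whitelist)

        if hits_blacklist and hits_whitelist:
            bins["both"].append(name)
        elif hits_blacklist:
            bins["blacklist"].append(name)
        elif hits_whitelist:
            bins["whitelist"].append(name)
        else:
            bins["neither"].append(name)

    return bins


# Tiny made-up mailbox: five messages, one of which was reviewed previously.
sample = {
    "a.eml": "Meeting about the classified briefing",
    "b.eml": "My amazon password is hunter2",
    "c.eml": "classified memo, password attached",
    "d.eml": "Lunch on Friday?",
    "e.eml": "Already reviewed classified note",
}
already_seen = {hashlib.sha256(sample["e.eml"].encode("utf-8")).hexdigest()}

bins = triage(sample, blacklist={"classified"}, whitelist={"password"},
              known_hashes=already_seen)
for label, names in bins.items():
    print(label, names)
```

The point of the sketch is that the computer does the reading.  Only the (usually small) "blacklist" pile, and sometimes the "both" pile, ever needs a human.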

So Mom, I opened this post by telling you about forensics works of fiction like CSI: Cyber, NCIS, and Scorpion.  I'd like to help clear up another fictional forensics story.  We both know the FBI has recently undertaken a very high profile forensic investigation involving large quantities of email.  The FBI claims to have investigated all of the email and cleared the suspect.  Some people have trouble understanding how this can happen so quickly (even though they apparently believe the timelines on NCIS; it's still on the air, after all).  You mentioned this to me today in a phone call and I'm going to set the record straight.  You also know this isn't politically motivated since you know I'm nauseated by both candidates.


Let's examine a few facts about the recent FBI case:
  • The computer was never used by the subject of the investigation
  • There are 650k emails on the suspect machine
  • The number of emails in the 650k that match the blacklist is probably very low compared to the overall number of emails
  • Some number of these emails on the blacklist may have already been examined in a previous investigation
Considering these facts and knowing how email and files are processed in forensics investigations, it's completely possible that the FBI processed 650k emails for the context of this investigation in this timeframe.

Mom, it's fine if you want to distrust the FBI.  A little distrust of authority is actually probably healthy.  Say that the FBI is actively covering something up.  Say that if they release the emails, Putin will start WW III.  Say whatever you want, but don't subscribe to the "FBI couldn't have done it this quickly" narrative.  It's not only wrong, it's provably wrong.

Also: Mom, I love you.

* For the person who hit me up on Twitter assuming that I was picking on all moms with this post, let me set the record straight.  My mom, a great professional nurse, nursing administrator, etc., has a complete lack of understanding about technology.  I honestly wrote this post in response to one of her shares on Facebook that highlighted some mistruths re: forensics.  
