PAVE: Write-print Creation with MapReduce
Network Science Center, United States Military Academy, West Point, United States
Cyber-crime committed through anonymous e-mail is becoming alarmingly common. Author attribution helps digital forensics investigators filter a large set of possible authors and focus traditional investigative techniques on the most probable culprits. A recent, promising technique is to construct a write-print for each known author and compare it to the write-print extracted from the anonymous messages. A write-print is a unique digital fingerprint created by mining frequent patterns from a particular author's writing style. Parallel computing enables us to leverage multiple cores in the creation of author write-prints. We introduce Parallel Author Verification of E-mail (PAVE), a MapReduce algorithm for generating author write-prints in parallel. Our algorithm achieves up to 90% accuracy when tested on a subset of the Enron dataset. We believe the community will find the PAVE system useful for expediting author identification in time-sensitive situations.
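To make the idea of building write-prints with a map step and a reduce step concrete, here is a minimal Python sketch. It is not the PAVE implementation. The feature extractor, the `min_support` threshold, and the rule that discards patterns shared by more than one author are illustrative assumptions, and every function name is hypothetical. Patterns are simplified to single stylometric features rather than mined itemsets, and the phases run sequentially here, although each `map_message` call is independent and could be distributed across cores.

```python
from collections import Counter, defaultdict


def extract_features(text):
    """Hypothetical stylometric features, discretized into named items."""
    words = text.split()
    avg_len = sum(len(w) for w in words) / len(words) if words else 0.0
    items = {
        "avg_word_len=" + ("short" if avg_len < 4.5 else "long"),
        "msg_len=" + ("brief" if len(words) < 10 else "extended"),
    }
    if "!" in text:
        items.add("uses_exclamation")
    if text[:1].islower():
        items.add("lowercase_start")
    return items


def map_message(record):
    """Map phase: emit ((author, item), 1) for each feature in one message."""
    author, text = record
    return [((author, item), 1) for item in extract_features(text)]


def shuffle(mapped):
    """Group emitted values by key, as a MapReduce framework would."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce phase: count the messages that support each (author, item)."""
    return {key: sum(values) for key, values in groups.items()}


def build_write_prints(records, min_support=0.6):
    """Return {author: set of items} under the assumptions stated above."""
    records = list(records)
    totals = Counter(author for author, _ in records)
    counts = reduce_counts(shuffle(map(map_message, records)))

    # An item is frequent for an author if enough of their messages contain it.
    frequent = defaultdict(set)
    for (author, item), count in counts.items():
        if count / totals[author] >= min_support:
            frequent[author].add(item)

    # Assumption: keep only items that are frequent for exactly one author,
    # so that each write-print distinguishes its author from the others.
    owners = Counter(item for items in frequent.values() for item in items)
    return {a: {i for i in frequent[a] if owners[i] == 1} for a in totals}
```

An anonymous message could then be scored against each write-print by extracting its features with the same `extract_features` function and measuring the overlap. The scoring rule itself is left open here because the abstract does not specify one.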