Clustering Similarity Digest Bloom Filters in Self-Organizing Maps
NAVAL POSTGRADUATE SCHOOL MONTEREY CA
As the number of cases involving digital media grows, along with the size and number of media items in each case, forensic investigators increasingly rely on triage techniques to prioritize which media to review. This thesis describes a framework for clustering documents acquired during a digital forensics investigation on a self-organizing map (SOM), also known as a Kohonen map, so that new documents can be categorized relative to existing documents. Furthermore, the presented algorithm works with sdhash fingerprints rather than the source documents themselves, allowing a fifty-fold reduction in the data required. To test the methodology, document fingerprints are regenerated from the SOM and compared.
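The idea in the abstract can be illustrated with a minimal sketch; it is not the thesis implementation. It assumes fingerprints are short fixed-length bit vectors (toy 16-bit values standing in for real sdhash Bloom filters), and it assumes Euclidean distance, a Gaussian neighborhood, and linearly decaying learning rate and radius. The names `TinySOM`, `bits`, `sq_dist`, and `regenerate` are hypothetical and chosen only for this illustration.

```python
import math
import random


def bits(value, width):
    """Expand an integer into a list of 0.0/1.0 floats, least significant bit first."""
    return [float((value >> i) & 1) for i in range(width)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class TinySOM:
    """Toy self-organizing map over bit vectors (assumed stand-ins for Bloom filters)."""

    def __init__(self, rows, cols, width, seed=0):
        rng = random.Random(seed)
        self.rows, self.cols = rows, cols
        # One weight vector per grid node, initialized uniformly in [0, 1).
        self.weights = {
            (r, c): [rng.random() for _ in range(width)]
            for r in range(rows)
            for c in range(cols)
        }

    def bmu(self, vec):
        """Return the grid coordinate of the best-matching unit for vec."""
        return min(self.weights, key=lambda rc: sq_dist(self.weights[rc], vec))

    def train(self, data, epochs=50, lr0=0.5, radius0=None):
        """Pull the BMU and its grid neighbors toward each input vector."""
        radius0 = radius0 or max(self.rows, self.cols) / 2
        for epoch in range(epochs):
            decay = 1.0 - epoch / epochs          # linear decay (an assumption)
            lr = lr0 * decay
            radius = max(radius0 * decay, 0.5)    # keep the neighborhood non-degenerate
            for vec in data:
                br, bc = self.bmu(vec)
                for (r, c), w in self.weights.items():
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2 * radius ** 2))
                    for i, x in enumerate(vec):
                        w[i] += lr * h * (x - w[i])

    def regenerate(self, node):
        """Threshold a node's weights at 0.5 to recover a bit-vector fingerprint."""
        return [1 if w >= 0.5 else 0 for w in self.weights[node]]
```

In this sketch, a new document would be categorized by calling `bmu` on its fingerprint and comparing the resulting grid position with those of already-mapped documents, while `regenerate` gives one plausible way to read a fingerprint back out of a node for comparison with the original.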
- Numerical Mathematics
- Computer Programming and Software