Automated Metadata Extraction
NAVAL POSTGRADUATE SCHOOL MONTEREY CA
Metadata is data that describes data. Metadata has many uses in computer forensics, and the ability to extract it automatically benefits forensic investigations. This thesis presents a new technique for batch processing disk images and automatically extracting metadata from files and file contents. The technique is embodied in a program called fiwalk, which has a plug-in architecture that allows new metadata extractors to be readily incorporated. Output from fiwalk can be provided in multiple formats, such as ARFF and text. The plug-ins created for this thesis include one written by Simson Garfinkel for extracting metadata from .jpeg files; two for Microsoft Office documents, one for formats prior to the Office 2007 release and one for the Office 2007 format; and a default plug-in for extracting metadata from .gif, .pdf, and .mp3 files. To better understand the metadata available in common file formats such as .doc, .docx, .odt, .pdf, .mp3, .mp4, .jpeg, .tiff, and .gif, an examination of these formats is provided.
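The plug-in idea described in the abstract can be illustrated with a minimal sketch. This is not fiwalk's actual interface: the registry `PLUGINS`, the `register` decorator, and the functions `gif_metadata`, `default_metadata`, `extract`, `to_text`, and `to_arff` are all hypothetical names chosen for illustration. The sketch assumes that an extractor is selected by file extension, that it returns name/value pairs, and that every ARFF attribute is written as a string.

```python
import os
from typing import Callable, Dict, List

# Hypothetical plug-in registry: file extension -> extractor function.
# Illustration only; this is not fiwalk's real plug-in mechanism.
Extractor = Callable[[bytes], Dict[str, str]]
PLUGINS: Dict[str, Extractor] = {}


def register(*extensions: str):
    """Associate an extractor with one or more file extensions."""
    def decorator(func: Extractor) -> Extractor:
        for ext in extensions:
            PLUGINS[ext.lower()] = func
        return func
    return decorator


@register(".gif")
def gif_metadata(data: bytes) -> Dict[str, str]:
    # A GIF file starts with a 6-byte signature, followed by the logical
    # screen width and height as little-endian 16-bit integers.
    if len(data) < 10 or data[:6] not in (b"GIF87a", b"GIF89a"):
        return {}
    return {
        "version": data[3:6].decode("ascii"),
        "width": str(int.from_bytes(data[6:8], "little")),
        "height": str(int.from_bytes(data[8:10], "little")),
    }


def default_metadata(data: bytes) -> Dict[str, str]:
    # Fallback used when no plug-in is registered for the extension.
    return {"size": str(len(data))}


def extract(filename: str, data: bytes) -> Dict[str, str]:
    """Pick an extractor by extension and return its name/value pairs."""
    ext = os.path.splitext(filename)[1].lower()
    record = {"filename": filename}
    record.update(PLUGINS.get(ext, default_metadata)(data))
    return record


def to_text(record: Dict[str, str]) -> str:
    """Render one record as 'name: value' lines."""
    return "\n".join(f"{name}: {value}" for name, value in record.items())


def to_arff(records: List[Dict[str, str]], attributes: List[str],
            relation: str = "metadata") -> str:
    """Render records as ARFF, treating every attribute as a string."""
    def quote(value: str) -> str:
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"

    lines = [f"@relation {relation}"]
    lines += [f"@attribute {name} string" for name in attributes]
    lines.append("@data")
    for record in records:
        # '?' marks a missing value in ARFF.
        lines.append(",".join(
            quote(record[name]) if name in record else "?"
            for name in attributes))
    return "\n".join(lines)
```

Under these assumptions, supporting a new file type only requires registering another function; the dispatch and output code stay unchanged.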
- Computer Programming and Software