I ended my last blog with a question: what process should a company put in place, and how, to make sure this type of metadata is removed as a matter of routine before litigation arises?
Before we answer that question, let’s review the point of the last posting. The point was that Microsoft Office (and other applications) can be a rich source of discoverable information (not just metadata, thanks Leonid) for the plaintiff, especially if the author of the document didn’t take the time to scrub all hidden and personal information before finalizing it. Obviously this unseen data can be a goldmine for the opposing counsel if they know to look for it.
To address this risk, you should get your employees into the habit of “finalizing” documents by running the Microsoft Office “Document Inspector” as well as deleting all previous revisions of the document. To many, this seems like a lot of trouble, but in the long run it can significantly reduce your eDiscovery risk.
Figure 1: Accessing the “Document Inspector” in Word 2007
The Document Inspector provides a central location for you to examine documents for personal, hidden, or sensitive information. You can then use the built-in Document Inspector modules to remove unwanted information more easily. To open it, click the Office Button, then “Prepare”, and then the “Inspect Document” menu item. The Document Inspector is reached the same way in Word, Excel, and PowerPoint 2007.
Figure 2: The Document Inspector
As you can see above, the Document Inspector allows you to check for comments, revisions, versions, annotations, document properties, personal information, custom XML data, headers, footers, watermarks and hidden text. This type of data can be damaging in litigation if the party producing the documents is not aware of its existence.
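For readers who want to spot-check files outside of Word, here is a minimal Python sketch of the same idea. It assumes the file is in the Office 2007 `.docx` format (a ZIP archive of XML parts), not the older binary `.doc` format. The function name `scan_docx` and the keys of the dictionary it returns are hypothetical names chosen for illustration. The sketch only reads and counts a few of the categories listed above (comments, tracked changes, hidden text, and author properties). It does not remove anything and is not a substitute for the Document Inspector.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the Office Open XML (.docx) format.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
CP_NS = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_NS = "http://purl.org/dc/elements/1.1/"


def scan_docx(path_or_file):
    """Count some kinds of hidden data in one .docx file (read-only).

    scan_docx and its result keys are illustrative names, not part of
    any Microsoft tool.
    """
    findings = {"comments": 0, "tracked_insertions": 0,
                "tracked_deletions": 0, "hidden_text_runs": 0,
                "creator": None, "last_modified_by": None}

    with zipfile.ZipFile(path_or_file) as z:
        names = set(z.namelist())

        # Reviewer comments are stored in their own part.
        if "word/comments.xml" in names:
            root = ET.fromstring(z.read("word/comments.xml"))
            findings["comments"] = len(root.findall(f"{{{W_NS}}}comment"))

        # Tracked changes and hidden text are marked inside the main body.
        if "word/document.xml" in names:
            body = ET.fromstring(z.read("word/document.xml"))
            findings["tracked_insertions"] = len(list(body.iter(f"{{{W_NS}}}ins")))
            findings["tracked_deletions"] = len(list(body.iter(f"{{{W_NS}}}del")))
            # <w:vanish/> marks hidden text unless explicitly switched off.
            findings["hidden_text_runs"] = sum(
                1 for el in body.iter(f"{{{W_NS}}}vanish")
                if el.get(f"{{{W_NS}}}val") not in ("0", "false", "off")
            )

        # Author names live in the core document properties.
        if "docProps/core.xml" in names:
            props = ET.fromstring(z.read("docProps/core.xml"))
            findings["creator"] = props.findtext(f"{{{DC_NS}}}creator")
            findings["last_modified_by"] = props.findtext(f"{{{CP_NS}}}lastModifiedBy")

    return findings
```

Calling `scan_docx("contract.docx")` (the file name is only an example) returns a dictionary of counts and author names. Any non-zero count or unexpected name is a signal to open the file in Word and run the Document Inspector on it.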
Once the Document Inspector is run, you will get an indication of potential threats as seen in the window below:
Figure 3: The Document Inspector will alert you to possible threats
From this point, you decide category by category: for each type of hidden data found, you can either click “Remove All” or leave that data in place. Within a category it is all or nothing.
But that’s the point, right? You can’t remove this data after a litigation hold has been applied, so removing it beforehand as part of a consistent process removes the risk of having to review and turn over this data in discovery.
There are two points to remember. First, this hidden data and metadata could exist in your employees’ potentially responsive files, which means you had better review it before you turn it over. Second, there are ways for employees to easily remove it themselves as part of a documented process.