eDiscovery101

The Intersection of eDiscovery, Technology and Common Sense

metadata

Part 2: Finding hidden data and metadata in office documents for eDiscovery

June 21, 2011 Bill Tolson eDiscovery, Technology e-discovery, eDiscovery, electronic discovery, ESI, FRCP, litigation, metadata, Microsoft, Office, spoliation

I ended my last post with a question: what process should a company put in place to make sure this type of metadata is removed as a matter of routine before litigation arises?

Before we answer that question, let’s review the point of the last post: Microsoft Office (and other applications) can be a rich source of discoverable information (not just metadata, thanks Leonid) for the plaintiff, especially if the author of the document didn’t take the time to scrub all hidden and personal information before finalizing it. This unseen data can be a goldmine for opposing counsel if they know to look for it.

To address this risk, you should get your employees into the habit of “finalizing” documents by running the Microsoft Office “Document Inspector” and deleting all previous revisions of the document. To many this seems like a lot of trouble, but in the long run it can significantly reduce your eDiscovery risk.

Figure 1: Accessing the “Document Inspector” in Word 2007

The Document Inspector provides a central location for you to examine documents for personal, hidden, or sensitive information. You can then use the built-in Document Inspector modules to remove unwanted information more easily. In Word 2007, open it by clicking the Microsoft Office Button, pointing to “Prepare”, and then clicking “Inspect Document”. The Document Inspector is reached the same way in Excel 2007 and PowerPoint 2007.

Figure 2: The Document Inspector

As you can see above, the Document Inspector lets you check for comments, revisions, versions, annotations, document properties, personal information, custom XML data, headers, footers, watermarks and hidden text. This type of data can be damaging in litigation if the party producing the documents is not aware of its existence.

Once the Document Inspector is run, you will get an indication of potential threats as seen in the window below:

Figure 3: The Document Inspector will alert you to possible threats

From this point the choice is all or nothing for each category searched: click “Remove All” to strip that category’s hidden data, or leave it in place.

But that’s the point, right? You can’t remove this data after a litigation hold has been applied, so removing it beforehand as part of a consistent process removes the risk of having to review and turn it over in discovery.

There are two points to remember. First, this hidden data and metadata could exist in your employees’ potentially responsive files, which means you had better review it before you turn it over. Second, there are easy ways for employees to remove it themselves as part of a documented process.
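
For teams that want to build this habit into an automated step, the Python sketch below shows roughly what one small piece of the cleanup looks like under the hood. It assumes a .docx file in the Office Open XML format, where the author and last-modified-by names are stored in the docProps/core.xml part of the package. The function name strip_author_properties is my own invention, not anything Microsoft ships. It blanks only those two fields, so treat it as an illustration and not as a replacement for the Document Inspector, which covers far more.

```python
import zipfile
import xml.etree.ElementTree as ET

CORE_PART = "docProps/core.xml"
DC = "http://purl.org/dc/elements/1.1/"
CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"

# The two core properties that carry people's names.
PERSONAL_TAGS = ("{%s}creator" % DC, "{%s}lastModifiedBy" % CP)

# Keep the familiar prefixes when the XML is written back out.
ET.register_namespace("dc", DC)
ET.register_namespace("cp", CP)
ET.register_namespace("dcterms", "http://purl.org/dc/terms/")
ET.register_namespace("xsi", "http://www.w3.org/2001/XMLSchema-instance")


def strip_author_properties(src, dst):
    """Copy the .docx package at src to dst with the author names blanked.

    src and dst may be paths or binary file-like objects. Hypothetical
    helper: comments, tracked changes and hidden text are NOT touched.
    """
    with zipfile.ZipFile(src) as original, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as cleaned:
        for item in original.infolist():
            data = original.read(item.filename)
            if item.filename == CORE_PART:
                root = ET.fromstring(data)
                for element in root.iter():
                    if element.tag in PERSONAL_TAGS:
                        element.text = ""
                data = ET.tostring(root, encoding="utf-8",
                                   xml_declaration=True)
            # Every other part of the package is copied through unchanged.
            cleaned.writestr(item, data)
```

Calling strip_author_properties("draft.docx", "final.docx") (hypothetical file names) writes a cleaned copy and leaves the original untouched. As with the Document Inspector, a step like this belongs in a routine that runs before any litigation hold applies, never after.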

Accessing hidden metadata in Office documents for eDiscovery

June 17, 2011 Bill Tolson eDiscovery, Technology defendant, e-discovery, eDiscovery, electronic discovery, FRCP, litigation, metadata, Microsoft, Office, plaintiff, spoliation

Microsoft Office documents, including PowerPoint, Word, and Excel files among others, can be a rich source of discoverable information for the plaintiff, especially if the author of the document didn’t take the time to scrub all hidden and personal information before finalizing it. Be aware that none of this hidden and personal metadata may be altered once a litigation hold has been, or should have been, applied.

Several types of hidden and personal data can and will be saved by default in an Office document if the correct precautions are not taken. The first point to remember is that Office applications by default capture data as the document in question is created, reviewed and revised, including:

  • Comments, revision marks from tracked changes, versions, and ink annotations
  • Document properties and personal information. Document properties, also known as metadata (data that describes other data; for example, the words in a document are data, and the word count is metadata), include details about your document such as author, subject, and title. Document properties also include information that is automatically maintained by Office programs, such as the name of the person who most recently saved a document and the date when a document was created. If you used specific features, your document might also contain additional kinds of personally identifiable information, or PII (any information that can be used to identify a person, such as a name, address, e-mail address, government ID, IP address, or any unique identifier associated with PII in another program), such as e-mail headers, send-for-review information, routing slips, printer paths, and file path information for publishing Web pages.
  • Headers, footers, and watermarks
  • Hidden text, including reviewers’ notes
  • Hidden rows, columns, and worksheets
  • Invisible content. PowerPoint presentations and Excel workbooks can contain objects that are not visible because they are formatted as invisible.
  • Off-slide content
  • Presentation notes
  • Document server properties. If your document was saved to a location on a document management server, such as a Document Workspace site or a library based on Microsoft Windows SharePoint Services, the document might contain additional document properties or information related to this server location.
  • Custom XML data
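
To make a few of these items concrete, here is a minimal Python sketch of how a reviewer might look for them in a Word file. It assumes the .docx (Office Open XML) format, which is a ZIP package: document properties sit in docProps/core.xml, comments in word/comments.xml, and tracked insertions, deletions and hidden-text formatting appear in word/document.xml as w:ins, w:del and w:vanish elements. The function name find_hidden_data and the shape of the report are my own, for illustration only, and the sketch covers just a handful of the categories listed above.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the Office Open XML parts read below.
NS = {
    "w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
    "dc": "http://purl.org/dc/elements/1.1/",
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
}


def find_hidden_data(docx_file):
    """Return a dict describing some hidden data in a .docx package.

    docx_file may be a path or a binary file-like object. Hypothetical
    helper: it reads the package and never modifies it.
    """
    report = {"author": None, "last_modified_by": None,
              "comments": 0, "tracked_changes": 0, "hidden_text_markers": 0}
    with zipfile.ZipFile(docx_file) as package:
        names = set(package.namelist())

        if "docProps/core.xml" in names:
            core = ET.fromstring(package.read("docProps/core.xml"))
            report["author"] = core.findtext("dc:creator", namespaces=NS)
            report["last_modified_by"] = core.findtext(
                "cp:lastModifiedBy", namespaces=NS)

        if "word/comments.xml" in names:
            comments = ET.fromstring(package.read("word/comments.xml"))
            report["comments"] = len(comments.findall("w:comment", NS))

        if "word/document.xml" in names:
            body = ET.fromstring(package.read("word/document.xml"))
            # Tracked insertions and deletions left in the file.
            report["tracked_changes"] = (len(body.findall(".//w:ins", NS))
                                         + len(body.findall(".//w:del", NS)))
            # Simplification: counts every w:vanish marker, even one that
            # is explicitly switched off with a false value.
            report["hidden_text_markers"] = len(
                body.findall(".//w:vanish", NS))
    return report
```

Because the function only reads the package, it can be pointed at a working copy of a collected file to decide whether a closer review is warranted.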

In my experience, few companies train their employees to remove this metadata before their documents, spreadsheets and presentations are finalized and distributed. For the attorney asking for ESI, a telltale sign of spoliation would be the total absence of this hidden data. The attorney could do a quick sampling of the produced documents, and if the majority of them are “clean,” a discussion with the defendant’s counsel, and possibly the judge, would be in order. Unless the defendant could show evidence of employee processes that include the regular removal of this type of data as a standard practice, a spoliation ruling could well be in the cards.

So what process should a company put in place to make sure this type of metadata is removed as a matter of routine before litigation arises?

My next blog entry will answer this question.

Metadata should be included in federal FOIA response

March 1, 2011 Bill Tolson eDiscovery e-discovery, eDiscovery, electronic discovery, ESI, federal government, FOIA, freedom of information act, information management, metadata, NARA, production

A recent federal court decision, Nat. Day Laborer Org. Network v. United States Immigration and Customs Enforcement Agency, 2011 WL 381625 (S.D.N.Y. Feb. 7, 2011), has set the expectation that certain metadata is considered “responsive” in federal FOIA requests.

From ediscoverylaw.com:

In this Freedom of Information Act (“FOIA”) action, a dispute arose regarding the proper format of production and, in particular, whether metadata was subject to production pursuant to plaintiffs’ FOIA requests. Following substantial analysis of the issue, the court held that “certain metadata is an integral or intrinsic part of an electronic record” and, as such, is “‘readily reproducible’ in the FOIA context.” (FOIA provides that “in making any record available to a person under this paragraph, an agency shall provide the record in any form or format requested by the person if the record is readily reproducible by the agency in that form or format.”)

The court went on to describe the minimum metadata fields that should be considered reproducible and therefore should accompany any production of electronically stored information (ESI) by the government:

1. Identifier: A unique production identifier (“UPI”) of the item.
2. File Name: The original name of the item or file when collected from the source custodian or system.
3. Custodian: The name of the custodian or source system from which the item was collected.
4. Source Device: The device from which the item was collected.
5. Source Path: The file path from the location from which the item was collected.
6. Production Path: The file path to the item produced from the production media.
7. Modified Date: The last modified date of the item when collected from the source custodian or system.
8. Modified Time: The last modified time of the item when collected from the source custodian or system.
9. Time Offset Value: The universal time offset of the item’s modified date and time based on the source system’s time zone and daylight savings time settings.
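
As a rough illustration of what capturing these nine fields might look like at collection time, here is a small Python sketch. The class name ProductionRecord, the function build_record, and the idea of passing the identifier, custodian, device and production path in by hand are all my own assumptions; a real collection tool would get them from its own workflow. The sketch also assumes the collecting machine runs in the same time zone as the source system, since it derives the offset from local settings.

```python
import os
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProductionRecord:
    """One row of load-file metadata, one attribute per field listed above."""
    identifier: str        # 1. unique production identifier (UPI)
    file_name: str         # 2. original file name
    custodian: str         # 3. custodian or source system
    source_device: str     # 4. device the item was collected from
    source_path: str       # 5. path at the collection location
    production_path: str   # 6. path on the production media
    modified_date: str     # 7. last modified date
    modified_time: str     # 8. last modified time
    time_offset: str       # 9. UTC offset for the date and time above


def build_record(path, identifier, custodian, source_device, production_path):
    """Collect the nine fields for one file without opening or changing it."""
    stat = os.stat(path)  # reads file-system metadata only
    # Attach the local time zone (including daylight saving time) that
    # applied at the moment the file was last modified.
    modified = datetime.fromtimestamp(stat.st_mtime).astimezone()
    return ProductionRecord(
        identifier=identifier,
        file_name=os.path.basename(path),
        custodian=custodian,
        source_device=source_device,
        source_path=os.path.abspath(path),
        production_path=production_path,
        modified_date=modified.strftime("%Y-%m-%d"),
        modified_time=modified.strftime("%H:%M:%S"),
        time_offset=modified.strftime("%z"),
    )
```

The design choice worth noting is that os.stat reads the last modified time from the file system and never opens the file, so recording the fields does not itself change them.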

Why is this metadata decision important? Because it effectively requires federal government agencies to capture and store their ESI in ways that, first, capture the metadata fields listed above and, second, ensure that metadata is not inadvertently modified, for example when a record is reviewed to determine its responsiveness to a FOIA request.

Federal government agencies deal with huge amounts of ESI on a daily basis and should already have archiving systems in place to ensure ESI is captured and managed correctly per the National Archives and Records Administration (NARA) regulations. It now seems they need to take another look at their ESI archiving systems to ensure they can meet this new metadata requirement.
