You Don’t Know What You Don’t Know


The Akron Legal News this week published an interesting editorial on information governance. The story by Richard Weiner discussed how law firms are dealing with the transition from rooms filled with hard copy records to electronically stored information (ESI), which includes firm business records as well as huge amounts of client eDiscovery content. The story pointed out that ESI flows into the law firm so quickly and in such huge quantities that no one can track it, much less know what it contains. Law firms are now facing an inflection point: change the way all information is managed or suffer client dissatisfaction and client loss.

The story pointed out that “in order to function as a business, somebody is going to have to, at least, track all of your data before it gets even more out of control – Enter information governance.”

There are many definitions of information governance (IG) floating around but the story presented one specifically targeted at law firms: IG is “the rules and framework for managing all of a law firm’s electronic data and documents, including material produced in discovery, as well as legal files and correspondence.” Richard went on to point out that there are four main tasks to accomplish through the IG process. They are:

  • Map where the data is stored;
  • Determine how the data is being managed;
  • Determine data preservation methodology;
  • Create forensically sound data collection methods.

I would add several more to this list:

  • Create a process to account for and classify inbound client data such as eDiscovery and regulatory collections.
  • Determine those areas where client information governance practices differ from firm information governance practices.
  • Reconcile those differences with client(s).

As law firms transition to mostly ESI for both firm business and client data, they will need to adopt IG practices and processes to account for and manage these different requirements. Many believe this transition will eventually lead to the incorporation of machine learning techniques into IG, giving law firm IG processes a much more granular understanding of the actual meaning of the data, not just that it’s a firm business record or part of a client eDiscovery response. This will in turn enable more granular categorization of all firm information.

Iron Mountain has hosted the annual Law Firm Information Governance Symposium, which has directly addressed many of these topics around law firm IG. The symposium has produced “A Proposed Law Firm Information Governance Framework,” a detailed description of the processes to consider as law firms look at adopting an information governance program.


Discoverable versus Admissible; aren’t they the same?


This question comes up a lot, especially from non-attorneys. The thought is that if something is discoverable, then it must be admissible; the assumption being that a Judge will not allow something to be discovered if it can’t be used in court. The other thought is that everything is discoverable if it pertains to the case and therefore everything is admissible.

Let’s first address what’s discoverable. For good cause, the court may order discovery of any matter (content) that’s not privileged and is relevant to the subject matter involved in the action. In layman’s terms, if it is potentially relevant to the case, you may have to produce it in discovery. In other words, anything and everything is potentially discoverable. All discovery is subject to the limitations imposed by FRCP Rule 26(b)(2)(C).

With that in mind, let’s look at the subject of admissibility.

In Lorraine v. Markel Am. Ins. Co., 241 F.R.D. 534, 538 (D. Md. 2007), the court started with the premise that the admissibility of ESI is determined by a collection of evidence rules “that present themselves like a series of hurdles to be cleared by the proponent of the evidence”. “Failure to clear any of these evidentiary hurdles means that the evidence will not be admissible”. Whenever ESI is offered as evidence, five evidentiary questions need to be considered. They are whether the ESI:

  • is relevant to the case
  • is authentic
  • is not hearsay pursuant to Federal Rule of Evidence 801
  • is an original or duplicate under the original writing rule
  • has probative value that is substantially outweighed by the danger of unfair prejudice or one of the other factors identified by Federal Rule of Evidence 403, such that it should be excluded despite its relevance.

Hearsay is defined as a statement made out of court that is offered in court as evidence to prove the truth of the matter asserted. Hearsay comes in many forms including written or oral statements or even gestures.

It is the Judge’s job to determine if evidence is hearsay or credible. There are three evidentiary rules that help the Judge make this determination:

  1. Before being allowed to testify, a witness generally must swear or affirm that his or her testimony will be truthful.
  2. The witness must be personally present at the trial or proceeding in order to allow the judge or jury to observe the testimony firsthand.
  3. The witness is subject to cross-examination at the option of any party who did not call the witness to testify.

The Federal Rules of Evidence Hearsay Rule prohibits most statements made outside of court from being used as evidence in court. Looking at the three evidentiary rules mentioned above, a statement made outside of the courtroom is usually not made under oath, the person making the statement outside of court is not present to be observed by the Judge, and the opposing party is not able to cross-examine the statement maker. This is not to say all statements made outside of court are inadmissible. Federal Rule of Evidence 801 does provide for several exclusions to the Hearsay rule.

All content is discoverable if it potentially is relevant to the case and not deemed privileged, but discovered content may be ruled inadmissible if it is deemed privileged (doctor/patient communications), unreliable or hearsay. You may be wondering how an electronic document can be considered hearsay. The hearsay rule refers to “statements”, which can be either written or oral. So, as with paper documents, in order to determine whether the content of an electronic document is hearsay or fact, the author of the document must testify under oath and submit to cross-examination in order to determine whether the content is fact and can stand as evidence.

This legal argument between fact and hearsay does not relieve the discoveree from finding, collecting and producing all content that could be relevant to the case.

Next Generation Technologies Reduce FOIA Bottlenecks


Federal agencies are under more scrutiny to resolve issues with responding to Freedom of Information Act (FOIA) requests.

The Freedom of Information Act provides for the full disclosure of agency records and information to the public unless that information is exempted under clearly delineated statutory language. In conjunction with FOIA, the Privacy Act serves to safeguard public interest in informational privacy by delineating the duties and responsibilities of federal agencies that collect, store, and disseminate personal information about individuals. The procedures established ensure that the Department of Homeland Security fully satisfies its responsibility to the public to disclose departmental information while simultaneously safeguarding individual privacy.

In February of this year, the House Oversight and Government Reform Committee opened a congressional review of executive branch compliance with the Freedom of Information Act.

The committee sent a six-page letter to the Director of Information Policy at the Department of Justice (DOJ), Melanie Ann Pustay. In the letter, the committee questions why, based on a December 2012 survey, 62 of 99 government agencies have not updated their FOIA regulations and processes, as required by Attorney General Eric Holder in a 2009 memorandum. In fact, the Attorney General’s own agency has not updated its regulations and processes since 2003.

The committee also pointed out that there were 83,000 FOIA requests still outstanding as of the writing of the letter.

In fairness to the federal agencies, responding to a FOIA request can be time-consuming and expensive if technology and processes are not keeping up with increasing demands. Electronic content can be anywhere, including email systems, SharePoint servers, file systems, and individual workstations. Because content is spread around and not usually centrally indexed, enterprise-wide searches do not turn up all potentially responsive content. This means a much more manual, time-consuming process is used to find relevant content.

There must be a better way…

New technology can address the collection problem of searching for relevant content across the many storage locations where electronically stored information (ESI) can reside. For example, an enterprise-wide search capability with “connectors” into every data repository (email, SharePoint, file systems, ECM systems, records management systems) allows all content to be centrally indexed so that an enterprise-wide keyword search will find all instances of content with those keywords present. A more powerful capability to look for is the ability to search on concepts, a far more accurate way to search for specific content. Searching for conceptually comparable content can speed up the collection process and drastically reduce the number of false positives in the results set while finding many more of the keyword-deficient but conceptually responsive records. In conjunction with concept search, automated classification/categorization of data can reduce search time and raise accuracy.
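As a rough illustration of the central index idea, here is a minimal sketch of a keyword index built across several repositories. The repository names, document IDs and function names are hypothetical, and a real enterprise search product would use its own connectors and index format. The sketch covers only keyword matching, not concept search.

```python
import re
from collections import defaultdict


def build_index(repositories):
    """Build one keyword index over every repository.

    `repositories` maps a repository name to {document_id: text}.
    Both the structure and the names are assumptions for illustration.
    """
    index = defaultdict(set)
    for repo_name, documents in repositories.items():
        for doc_id, text in documents.items():
            # Index each distinct lowercase word once per document.
            for term in set(re.findall(r"[a-z0-9]+", text.lower())):
                index[term].add((repo_name, doc_id))
    return index


def search(index, *keywords):
    """Return the (repository, document_id) pairs containing every keyword."""
    hits = [index.get(keyword.lower(), set()) for keyword in keywords]
    return set.intersection(*hits) if hits else set()


# Hypothetical sample content standing in for real connectors.
repositories = {
    "email": {"msg-1": "Budget request for the FOIA office"},
    "sharepoint": {
        "doc-7": "FOIA backlog report",
        "doc-8": "Cafeteria menu",
    },
}
index = build_index(repositories)
print(search(index, "foia"))
print(search(index, "foia", "backlog"))
```

Because every repository feeds the same index, a single query reaches email and SharePoint content alike, which is the point of the connector approach described above.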

The largest cost in responding to a FOIA request is the review of all potentially relevant ESI found during collection. Predictive Coding, a technology attorneys currently use for eDiscovery, can drastically reduce the burden of reviewing thousands, hundreds of thousands or millions of documents for relevancy and privacy.

Predictive Coding is the process of applying machine learning and iterative supervised learning technology to automate document coding and prioritize review. This functionality dramatically expedites the actual review process while dramatically improving accuracy and reducing the risk of missing key documents. According to a RAND Institute for Civil Justice report published in 2012, document review cost savings of 80% can be expected using Predictive Coding technology.
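To make the idea of supervised review prioritization concrete, here is a minimal sketch. It is not how any particular Predictive Coding product works. It swaps in a simple word log-odds scorer (in the spirit of naive Bayes) as a stand-in, and all function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def train_word_weights(coded_documents):
    """Learn a weight per word from reviewer-coded examples.

    `coded_documents` is a list of (text, is_responsive) pairs, where
    is_responsive is True or False as decided by a human reviewer.
    """
    counts = {True: Counter(), False: Counter()}
    for text, is_responsive in coded_documents:
        counts[is_responsive].update(tokenize(text))
    vocabulary = set(counts[True]) | set(counts[False])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    weights = {}
    for word in vocabulary:
        # Add-one smoothing keeps unseen words from producing log(0).
        p_resp = (counts[True][word] + 1) / (totals[True] + len(vocabulary))
        p_non = (counts[False][word] + 1) / (totals[False] + len(vocabulary))
        weights[word] = math.log(p_resp / p_non)
    return weights


def prioritize_for_review(weights, unreviewed_documents):
    """Sort unreviewed documents so likely responsive ones come first."""
    def score(text):
        return sum(weights.get(word, 0.0) for word in tokenize(text))
    return sorted(unreviewed_documents, key=score, reverse=True)


# Hypothetical reviewer decisions on a small seed set.
coded = [
    ("pipeline safety inspection report", True),
    ("inspection findings pipeline leak", True),
    ("holiday party schedule", False),
    ("parking lot schedule", False),
]
weights = train_word_weights(coded)
queue = prioritize_for_review(weights, ["lunch schedule", "pipeline leak memo"])
print(queue)
```

The iterative part would come from feeding each new batch of reviewer decisions back into `train_word_weights` and re-sorting the remaining documents.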

With the increasing number of FOIA requests swamping agencies, agencies are hard pressed to catch up to their backlogs. The next generation technologies mentioned above can help agencies reduce their FOIA related costs while decreasing their response time.

Coming to Terms with Defensible Disposal; Part 1


Last week at LegalTech New York 2013 I had the opportunity to moderate a panel titled “Defensible Disposal: If it doesn’t exist, I don’t have to review it…right?” with an impressive roster of panelists. They included: Bennett Borden, Partner, Chair eDiscovery & Information Governance Section, Williams Mullen; Clifton C. Dutton, Senior Vice President, Director of Strategy and eDiscovery, American International Group; John Rosenthal, Chair, eDiscovery and Information Management Practice, Winston & Strawn; and Dean Gonsowski, Associate General Counsel, Recommind Inc.

During the panel session it was agreed that organizations have been over-retaining ESI (which accounts for at least 95% of all data in organizations) even if it’s no longer needed for business or legal reasons. Another factor driving this over-retention of ESI was the fear of inadvertently deleting evidence, otherwise called spoliation. In fact, an ESG survey published in December of 2012 showed that the “fear of the inability to furnish data requested as part of a legal or regulatory matter” was the highest ranked reason organizations chose not to dispose of ESI.

Other reasons cited included not having defined policies for managing and disposing of electronic information and, conversely, having defined retention policies that actually keep all data indefinitely (usually because of the fear of spoliation).

One of the principal information governance gaps most organizations haven’t yet addressed is the difference between “records” and “information”. Many organizations have “records” retention/disposition policies to manage those official company records required to be retained under regulatory or legal requirements. But those documents and files that fall under legal hold and regulatory requirements amount to approximately 6% of an organization’s retained electronic data (1% legal hold and 5% regulatory).

Another interesting survey published by Kahn Consulting in 2012 showed levels of employee understanding of their information governance-related responsibilities. In this survey only 21% of respondents had a good idea of what information needed to be retained/deleted and only 19% knew how  information should be retained or disposed of. In that same survey, only 15% of respondents had a general idea of their legal hold and eDiscovery responsibilities.

The above surveys highlight the fact that organizations aren’t disposing of information in a systematic process mainly because they aren’t managing their information, especially their electronic information and therefore don’t know what information to keep and what to dispose of.

An effective defensible disposal process is dependent on an effective information governance process. To know what can be deleted and when, an organization has to know what information needs to be kept and for how long based on regulatory, legal and business value reasons.

Over the coming weeks, I will address those defensible disposal questions and responses the LegalTech panel discussed. Stay tuned…

eDiscovery Cost Reduction Strategies


In these still questionable economic times, most legal departments are still looking for ways to reduce, or at least stop the growth of, their legal budgets. One of the most obvious targets for cost reduction in any legal department is the cost of responding to eDiscovery, including the cost of finding all potentially responsive ESI, culling it down and then having in-house or external attorneys review it for relevance and privilege. Per a CGOC survey, the average GC spends approximately $3 million per discovery to gather and prepare information for opposing counsel in litigation.

Most organizations are looking for ways to reduce these growing costs of eDiscovery. The top four cost reduction strategies legal departments are considering are:

  • Bring more evidence analysis and do more ESI processing internally
  • Keep more of the review of ESI in house rather than utilize outside law firms
  • Look at off-shore review
  • Pressure external law firms for lower rates

I don’t believe these strategies address the real problem, the huge and growing amount of ESI.

Several eDiscovery experts have told me that the average eDiscovery matter can include between 2 and 3 GB of potentially responsive ESI per employee. Now, to put that in context, 1 GB of data can contain between 10,000 and 75,000 pages of content. Multiply that by 3 and you are potentially looking at between 30,000 and 225,000 pages of content that should be reviewed for relevancy and privilege per employee. Now consider that litigation and eDiscovery usually includes more than one employee…ranging from two to hundreds.
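The arithmetic above can be sketched in a few lines. The per-GB page range comes from the figures quoted in this post, while the function name and the 20-custodian example are hypothetical.

```python
# Page-count range per GB, as quoted above.
PAGES_PER_GB_LOW = 10_000
PAGES_PER_GB_HIGH = 75_000


def estimate_review_pages(gb_per_custodian, custodians=1):
    """Return the (low, high) range of pages that may need review."""
    total_gb = gb_per_custodian * custodians
    return total_gb * PAGES_PER_GB_LOW, total_gb * PAGES_PER_GB_HIGH


print(estimate_review_pages(3))      # one employee with 3 GB: (30000, 225000)
print(estimate_review_pages(3, 20))  # hypothetical 20 custodians: (600000, 4500000)
```

Even a modest matter with 20 custodians lands somewhere between 600,000 and 4.5 million pages under these assumptions.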

It seems to me the most straightforward and common-sense way to reduce eDiscovery costs is to proactively manage the information that could be pulled into an eDiscovery matter.

To illustrate this proactive information management strategy for eDiscovery, we can look at the overused but still appropriate DuPont case study from several years ago.

DuPont re-examined nine cases. They determined that they had reviewed a total of 75,450,000 pages of content in those nine cases. A total of 11,040,000 pages turned out to be responsive to the cases. DuPont also looked at where these 75 million pages of content stood in their records management process. They found that approximately 50% of those 75 million pages were beyond their documented retention period and should have been destroyed and never reviewed for any of the nine cases. They also calculated they spent $11,961,000 reviewing this content. In other words, they spent nearly $12 million reviewing documents that should not have existed if their records retention schedule and policy had been followed.
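A quick back-of-the-envelope check on the DuPont figures shows how few of the reviewed pages mattered. The 50% share is the approximate figure quoted above, so the derived page count and per-page cost are rough estimates, not numbers reported in the case study.

```python
# Figures quoted in the case study above.
TOTAL_PAGES_REVIEWED = 75_450_000
RESPONSIVE_PAGES = 11_040_000
PAST_RETENTION_SHARE = 0.50           # "approximately 50%", an approximation
PAST_RETENTION_REVIEW_COST = 11_961_000

responsive_share = RESPONSIVE_PAGES / TOTAL_PAGES_REVIEWED
past_retention_pages = TOTAL_PAGES_REVIEWED * PAST_RETENTION_SHARE
# Derived estimate only: cost divided by the approximate page count.
cost_per_page = PAST_RETENTION_REVIEW_COST / past_retention_pages

print(f"Responsive share of reviewed pages: {responsive_share:.1%}")  # 14.6%
print(f"Pages past retention: {past_retention_pages:,.0f}")           # 37,725,000
print(f"Implied review cost per page: ${cost_per_page:.2f}")          # $0.32
```

Under these assumptions, less than 15% of the reviewed pages were responsive, and each page that should already have been destroyed cost roughly 32 cents to review.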

An information management program, besides capturing and making ESI available for use, includes the defensible deletion of ESI that has reached the end of its retention period and therefore is valueless to the organization.

Corporate counsel should be the biggest proponents of information governance in their organizations simply due to the fact that it affects their budgets directly.

Can you wipe your twitter ramblings, and should you?


In December of 2011, the Library of Congress and Twitter signed an agreement that will eventually make available every public Tweet ever sent as an archive to the Library of Congress.


While writing a blog post last week, I began to wonder how long all my twitter postings would be available and who could look at them. For the fun of it, I went back through approximately 6 months of my old twitter postings, re-tweets and replies (yes you can do it, it’s relatively easy and you can look at anyone’s).

I’ve been pretty good about keeping my twitter posts “business-like” and have steered away from personal stuff like “I just checked in to the Ramada Inn on route 11…can’t wait for the evening to begin!”, or “does anyone know how to setup an off-shore bank account?” or “those jerks over at Company ABC are a bunch of losers”. But many tweeters aren’t so disciplined and have posted stuff that could come back to haunt them later. I could imagine a prospective employer reviewing a candidate’s twitter history or, even worse, an attorney conducting research for a case using the public twitter archives to create a timeline.

With that in mind, could you delete your twitter postings and should you? Twitter does allow you to delete specific tweets one at a time but as far as I can determine, Twitter does not give you the ability to delete your entire twitter history short of deactivating your account. From the Twitter website:

How To Delete a Tweet

If you’ve posted something that you’d rather take back, you can remove it easily. When you hover over your Tweet while viewing your home or profile page, you’ll see a few options appear below the message.

To delete one of your Twitter updates:

  1. Log in to Twitter.com
  2. Visit your Profile page
  3. Locate the Tweet you want to delete
  4. Hover your mouse over the message (as shown below), and click the “Delete” option that appears

Voila! Gone forever… almost. Deleted updates sometimes hang out in Twitter search. They will clear with time.

We do not provide a way to bulk delete Tweets. If you’re looking to get a “fresh start” on your Twitter account without losing your username, the best way to do this is to create a temporary account with a temporary username, and then switch the username between your current account and the temporary account. Please see our article on How to Change Your Username for more info. 

On December 30, 2011, CNET published a story titled “How to delete all your tweets” which highlighted a product called TwitWipe. TwitWipe is a free tool that allows you to delete ALL your past tweets in one fell swoop. This may be handy because you can clean out your twitter account and start fresh without changing your username and dumping all your hard won followers.

This is an interesting capability, but I think the more important question is why you would take such a drastic step. The four most obvious reasons one would want to delete all their twitter postings and start fresh would be:

  1. You went through an unfortunate period in your life that you would rather forget
  2. You were regularly conducting criminal activities through your Twitter account
  3. You are considering a run for the presidency
  4. For whatever reason, you don’t want your twitter postings archived and available at the Library of Congress

The ability to delete ESI can be dangerous if done at the wrong time, especially if civil litigation is anticipated. Deleting a single tweet or every tweet you have ever posted can be construed as destruction of evidence if those tweets could have been relevant in litigation. ESI, no matter its format or where it’s stored, is potentially evidence and should at least be considered when protecting ESI for litigation hold. Attorneys on both sides need to include social media content like twitter postings in their eDiscovery plans and be sure to warn all custodians against deleting or editing social media content once litigation is anticipated.

Part 2, Steganography; Hiding from eDiscovery in plain sight


In my last blog I described a unique way of hiding incriminating data from eDiscovery queries in plain sight. In the example, I was able to hide obviously responsive information in a QR code attached as part of the signature to an email message.  The point was to show that ESI, especially email, can still be used to communicate with others and remain under the radar of the best eDiscovery search applications.

Now let’s look at another way to hide incriminating ESI from eDiscovery search applications.

The technique is called Steganography. Steganography is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message; a form of security through obscurity. The best known Steganography technique hides information in standard graphic images.

Graphic #1: Tree

The above image of a tree includes a steganographically hidden image. The hidden image (the image of the cat below) is revealed by removing all but the two least significant bits of each color component and a subsequent normalization. The hidden image is shown below.

Graphic #2: Cat
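The extraction step described above (keep only the two least significant bits of each color component, then normalize) can be sketched as follows. This is a minimal illustration that assumes raw, uncompressed 8-bit RGB pixel values. The embedding function is one plausible way such an image could be produced, not the method of any specific tool, and all names are hypothetical.

```python
BITS = 2
MASK = (1 << BITS) - 1   # 0b11: selects the two least significant bits
SCALE = 255 // MASK      # 85: stretches the values 0..3 back to 0..255


def hide_pixel(carrier, secret):
    """Store the top two bits of each secret component in the carrier's low bits."""
    return tuple(
        (c & ~MASK & 0xFF) | (s >> (8 - BITS))
        for c, s in zip(carrier, secret)
    )


def reveal_pixel(pixel):
    """Drop all but the two low bits of each component, then normalize."""
    return tuple((c & MASK) * SCALE for c in pixel)


carrier_pixel = (200, 100, 50)   # a pixel from the visible picture
secret_pixel = (255, 0, 128)     # a pixel from the picture being hidden

stego_pixel = hide_pixel(carrier_pixel, secret_pixel)
print(stego_pixel)                # (203, 100, 50): nearly identical to the carrier
print(reveal_pixel(stego_pixel))  # (255, 0, 170): a coarse copy of the secret
```

Each color component changes by at most 3 out of 255, which is why the carrier looks untouched to the eye. Lossy recompression would scramble these low bits, so a sketch like this only applies to lossless pixel data.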

You can hide any electronically stored data in any graphic image. As in the example above, a picture can be hidden in another picture. But the technique is not limited to hiding pictures in pictures. A word document, a schematic, even a sound file can be embedded and hidden in any graphic.

There are several free steganography applications available on the internet. I found and tested two: Invisible Secrets 2.1 and Xiao Steganography. Both use JPEG images as the “carrier” device.

How can this technique be used to pass incriminating information to someone else? Using the email example from my previous blog, let’s look at the example email message below from Bill to Ken.

Email example #1

There is absolutely nothing out of the ordinary in this email, and it would not trigger an eDiscovery search application to flag it as suspicious. Look closely at the email signature, especially the eDiscovery101 graphic. Now look at the email below:

Email example #2

The second email looks exactly the same. Again there would be no reason for an eDiscovery search application to flag it as suspicious. But, hidden in the second email’s eDiscovery101 graphic is the very incriminating Word document shown below:

Graphic #3: Incriminating letter

This raises the question: if you were conducting an eDiscovery investigation, how would you ever suspect that there is additional responsive data included in the “eDiscovery101” email signature graphic, and if you did suspect hidden data, how could you prove it?

To answer the first question, we need to understand how steganography applications work. For this example I will use the Invisible Secrets 2.1 application.

The application includes a helpful wizard to quickly walk you through the process.

The first step is to decide which graphic file you will use as the “carrier” for the incriminating data. In this case I will use my standard JPEG file for my blog, eDiscovery101.

The next step is to select the source file or in this example the incriminating letter from above.

Next, a password for encryption of the incriminating letter is requested. This will ensure the incriminating (hidden) data in the eDiscovery101 graphic cannot be accessed, even if suspected.

Lastly, you need to give the application a destination file name. In this case I named it something obvious and familiar, eDiscovery101s.jpg, so as not to draw attention to it. At this point, after the “Next” button is pressed, the new graphic file is created and can be inserted into the email signature.

Detecting hidden data via automation is tough if not impossible. As far as I know, there is no eDiscovery application that can recognize and flag steganography. To have a chance, you must already suspect a custodian and then manually look for inconsistencies. For this example, the only way to tell if a given graphic contains hidden data is to compare the file sizes of the images. The two eDiscovery101 images have different sizes: the original eDiscovery101 image is a 52KB JPEG file, while the second eDiscovery101 image is a 78KB JPEG file. Another clue to hidden data would be to search for known steganography applications on the custodian’s desktop or laptop (if they didn’t delete them after creating the hidden data). But remember, even if you find a suspicious image, without the encryption password you will never be able to open it.
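The manual size comparison could be scripted along these lines. This is only a sketch: it assumes you already hold a known-clean copy of the graphic, and the function names and file names are hypothetical. A size or hash mismatch shows that the file was altered, not that it necessarily contains hidden data.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path):
    """Return (size in bytes, SHA-256 hex digest) for a file."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


def compare_to_known_original(original_path, suspect_path):
    """Compare a suspect image against a known-clean copy of the same graphic."""
    original_size, original_hash = file_fingerprint(original_path)
    suspect_size, suspect_hash = file_fingerprint(suspect_path)
    return {
        "size_difference_bytes": suspect_size - original_size,
        "identical_content": original_hash == suspect_hash,
    }


# Hypothetical usage with the two signature graphics from this post:
# result = compare_to_known_original("eDiscovery101.jpg", "eDiscovery101s.jpg")
# if not result["identical_content"]:
#     print("Altered copy, size change:", result["size_difference_bytes"], "bytes")
```

The hash check catches an altered copy even when the file size happens to match, which a size-only comparison would miss.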

To protect themselves from this type of eDiscovery liability, organizations can put some basic measures in place. Most importantly, include in your email system use policy a definitive statement about using these types of encryption applications on any organization-owned assets, and audit custodians for enforcement. You could also forbid placing graphic images within the body of an email, but this is not realistic. For example, you could insert the same incriminating letter mentioned above into a table within a spreadsheet and convert that table to a JPEG. Below is a spreadsheet converted into a JPEG image file with the same incriminating letter embedded in it.

Spreadsheet #1

Would the above spreadsheet embedded into an email raise suspicions? Probably not… If custodians are determined to hide data in plain sight, they can with little chance of being caught.