January 12, 2011

Using Unique “Personalities” to Cull eDiscovery Data

Have you ever been told you look like someone? Sometimes you can see it; sometimes you can’t. For example, I’ve been told a few times that I look like Derek Jeter of the Yankees. Hmmm. I think there may be a couple of similar features, but overall I don’t agree. However, I’ll gladly trade him for his job and paycheck! I’ve also been told I look like Eddie Vedder of the band Pearl Jam. Don’t really see that either … maybe if my hair was long and I was screaming into a microphone … but I’ll take that too since I’m a huge fan (and will trade for his job as well).

OUR UNIQUE PERSONALITIES

For starters, there’s the physical aspect of who we are. Sure, you may look like someone famous or someone somebody knows, but when it comes down to it, nobody looks EXACTLY like you (unless, of course, you are an identical twin). Each of us is unique in various aspects of our appearance such as complexion, eye color, shape of our nose, height, build, ethnicity, and many other physical features.

Then there are even more non-physical things that make us distinct from others, including: where we’re from, our childhood, religion, hobbies, interests, education, skills, talents, career path, friends, family, taste in music and so on. In essence, there are so many attributes that make us who we are, it is impossible to find an exact match anywhere out there in the world.

Despite our uniqueness, we also share many things in common with others. Picture yourself in a public place like a restaurant, movie theater, or a major sporting event. It’s likely that just about everyone around you is a perfect stranger. However, if you started talking to these people you’d quickly find many things in common. How many share similar interests? How many are from your hometown or region of the world? How many are in similar industries? Went to the same college? Some may even live right down the street from you. And certainly, you’d be surprised how many people you and these strangers both know or have some connection to: the proverbial six degrees of separation in action.

It is the wide range of attributes we all possess that makes up our unique personality. And it’s the certain aspects of our personalities — these common threads — we share with others that enable us to connect, cultivate friendships and build relationships. Think about your circle of friends, business associates or acquaintances. For each person you think of, there are certain attributes that connect you to each other.

In short, the potential to connect with others exists everywhere we go and with everyone we meet. Yet we take for granted just how many common threads we actually share with most people around us, even with those who seem unlikely on the surface. We would see them if only we took a moment to discover those connections.

Now, this isn’t an article about love, unity and world peace. There’s no “Kum Ba Yah” moment here, and I promise not to channel John Lennon and start singing “Give Peace a Chance.”

THE "PERSONALITIES" OF ELECTRONIC DATA

I’ve spent time talking about this because, as silly a segue as it may seem, Electronically Stored Information (ESI) that gets collected for discovery has unique “personalities” as well. How so? Well, if you take an average document, it will possess a combination of attributes that distinguishes it from every other document in the collection (unless, of course, it’s been collected twice).

Just like people, each document’s unique set of attributes gives it its own “personality”. Taking it a step further — just like how we have connections to random people in a public place, there are common threads within these document personalities that share connections to other files in the database. And with proper analysis, these personalities and their connections can help legal teams streamline the eDiscovery process.

To illustrate, let’s look at an e-mail for example. Its unique attributes would include: file type, sent date, custodian, author, recipients, attachment information, content, file location, header information and more. But despite its uniqueness, these attributes, along with others, can also represent relevant connections — or relationships — to other documents in the database that share some of these same elements. These possible common threads can include: concepts contained within, its position within an entire email thread, changes in recipients, relevant dates, as well as similarities (and differences) with other documents or related e-mail threads ... just to name a few.
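To make the idea concrete, here is a minimal sketch of how an e-mail’s “personality” and its common threads with another e-mail might be represented. The class name, field names, and sample values are all assumptions made for illustration; they do not describe any particular eDiscovery product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EmailDoc:
    """Hypothetical record holding a few of an e-mail's attributes."""
    doc_id: str
    custodian: str
    author: str
    recipients: frozenset
    sent: date
    thread_id: str
    concepts: frozenset


def common_threads(a, b):
    """Return the attributes that two e-mails share (illustrative only)."""
    shared = {}
    if a.custodian == b.custodian:
        shared["custodian"] = a.custodian
    if a.thread_id == b.thread_id:
        shared["thread_id"] = a.thread_id
    # Everyone who sent or received either message.
    people = (a.recipients | {a.author}) & (b.recipients | {b.author})
    if people:
        shared["people"] = people
    concepts = a.concepts & b.concepts
    if concepts:
        shared["concepts"] = concepts
    return shared


# Made-up sample data.
first = EmailDoc("E1", "Bob Smith", "Bob Smith",
                 frozenset({"Susan Johnson"}), date(2007, 5, 14), "T1",
                 frozenset({"contract negotiation"}))
reply = EmailDoc("E2", "Susan Johnson", "Susan Johnson",
                 frozenset({"Bob Smith", "Stephen Davis"}), date(2007, 5, 15), "T1",
                 frozenset({"contract negotiation", "revised pricing levels"}))

links = common_threads(first, reply)
```

Each e-mail remains unique as a whole, yet `links` shows the thread, people, and concept the two messages have in common.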

Clearly ESI is significantly different from paper documents because of all these “moving parts” contained within the data — from content, to attributes, to metadata. And with the right technologies, we should be able to leverage all the data points inherent to ESI in order to discover all the stories contained in these collections. It’s a process I call “Relationship Mining.”

The problem with existing tools commonly used for culling, processing and reviewing ESI today is that most do not fully use these unique personalities to their advantage. Instead, most legal teams take a limited and linear view of this data in order to find responsive documents. In other words, only basic criteria are typically used to flush out important documents in most cases. But given the wide range of information built into every electronic file, there’s so much more intelligence available that gets left out of the mix.

For example, let’s look at the steps involved in your average ESI processing and review project. When electronic collections are initially delivered, they are typically in the form of fairly unstructured data residing on a hard drive. The next step in the process is to load the data into an eDiscovery processing tool in order to normalize the set. Once that step is completed, the data is usually filtered, or culled, using limited criteria such as keywords, file types, dates, custodian information, and other high-level attributes. And although this standard process DOES reduce the set somewhat in most situations, it still promotes the idea of casting a fairly wide net to ensure nothing important falls through the cracks. As such, many irrelevant documents make it to the next step: review. And therefore, tremendous amounts of time and money are invested in an already expensive process.

More irrelevant data means more to process, more to load, more reviewers to assign, more data to host, and more time required to get through the review. And the costs rack up every step of the way.

USING DOCUMENT PERSONALITIES TO CULL ELECTRONIC DATA

However, with the right relationship mining technologies, electronic collections could be dramatically culled to the smallest, most relevant set by analyzing all the common threads that exist within these document personalities. For example, a typical personality-based culling process could include the following steps:

1. Select relevant custodians:

By sub-dividing the collection by various attributes, you can easily start with that which is most obviously relevant. In this case, having the ability to choose only each relevant custodian’s sub-set of documents is a great way to “trim the fat,” so to speak. Let’s assume that out of nine custodians that produced documents, there are only three that we know were primarily involved in the issues of the case: Bob Smith, Susan Johnson and Stephen Davis.

2. Drill down into relevant concepts:

Depending on the issues pertinent to the case, it may be a good idea to analyze documents by subject matter using concept analysis tools. Doing so allows you to choose what data is potentially relevant based on what documents say. So, if the issues at play relate to a contract dispute, concept categories such as: “contract negotiation,” “contract edits,” “agreement status,” “revised pricing levels,” and “project XYZ engagement” could be chosen to further narrow your focus. As you see, there are technologies out there that can read and understand what documents are all about and group them together for you. These concepts are simply another attribute (out of many) that can be used to determine responsiveness.

3. See a listing of all file types:

This would allow you to quickly see the various formats of the filtered data at this point. Such information can illustrate many things, including: What types of files represent the lion’s share of communication? Are there any unusual file types that need to be dealt with (CAD drawings, proprietary formats, etc.)? Are there certain file types that you would expect but aren’t seeing (a possible sign of incomplete data harvesting)? For our example, let’s say that since we are dealing with contract negotiations, we will choose to focus only on e-mails, Word documents, and PDFs, the standard formats found in such situations.

4. DataMap selected file types along a timeline:

This is where visual data mapping technologies are useful (a future TechTalk topic). In short, having the ability to see data in an illustrative format allows you to quickly identify trends in activity. So let’s say you mapped the filtered collection along a timeline. Doing so allows you to quickly see which date ranges show spikes in activity, which usually flushes out relevant communications. For this example, let’s say that based on what we see, we decide to further cull the set by choosing the following date range: May 2007 – August 2007.

5. DataMap all communication threads between relevant entities:

Here’s where the full benefit of both relationship mining and data mapping can be realized. Mapping communication threads simply charts out all conversations between two or more people. However, if presented properly, the technology could very well flush out communications involving players beyond those originally selected. The obvious benefit here is that by analyzing these common threads, you’ve ensured that important activity didn’t fall through the cracks based on initial assumptions. From here, you can either go back and edit some of your filtering criteria, or go ahead and select the conversations you want to further analyze. Let’s say that the technology has culled the set to a total of 75 e-mail threads, yet we’ve decided to narrow our focus to the 10 that are likely most relevant based on everything above.

6. Find other documents that share relevant connections:

So here’s where we are at this point in our culling process:
  • We started by selecting three custodians out of nine
  • Within that sub-set, we’ve selected five concepts out of hundreds
  • Within that, we’ve identified three file types that typically deal with contract negotiations — by following only the conceptual connections that were relevant
  • Within that, we’ve focused only on a narrow range of dates that share certain selected attributes in common: concepts, file types, and specific spikes in activity
  • And then uncovered all communications (and then some) within those common threads
... all made possible by the ability to analyze the unique personalities of individual documents, then following selected common attributes that are shared with others within the collection. And now that we have our short list of relevant documents, we’ve essentially created a unique “personality” of responsiveness. At this point, relationship mining can be taken a step further by asking the technology to “show me other documents from the entire collection that share these personalities.”
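The six steps above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the tiny collection, the field names, and the extra person “Pat Lee” are invented, and the concept tags are assumed to have been assigned already by a concept analysis tool. It is not a description of how any real product works.

```python
from collections import Counter, defaultdict
from datetime import date


def doc(doc_id, custodian, kind, sent, concepts, thread, people):
    """Build one hypothetical document record."""
    return {"id": doc_id, "custodian": custodian, "type": kind, "sent": sent,
            "concepts": set(concepts), "thread": thread, "people": set(people)}


# Invented sample collection.
COLLECTION = [
    doc("D1", "Bob Smith", "email", date(2007, 5, 14), ["contract negotiation"], "T1", ["Bob Smith", "Susan Johnson"]),
    doc("D2", "Susan Johnson", "email", date(2007, 6, 2), ["revised pricing levels"], "T1", ["Susan Johnson", "Pat Lee"]),
    doc("D3", "Stephen Davis", "pdf", date(2007, 7, 20), ["contract edits"], "T2", ["Stephen Davis"]),
    doc("D4", "Bob Smith", "cad", date(2007, 6, 9), ["contract edits"], "T3", ["Bob Smith"]),
    doc("D5", "Bob Smith", "email", date(2006, 1, 3), ["contract negotiation"], "T4", ["Bob Smith"]),
    doc("D6", "Susan Johnson", "email", date(2007, 6, 5), ["holiday party"], "T5", ["Susan Johnson"]),
    doc("D7", "Pat Lee", "email", date(2007, 6, 3), ["revised pricing levels"], "T6", ["Pat Lee", "Bob Smith"]),
]

CUSTODIANS = {"Bob Smith", "Susan Johnson", "Stephen Davis"}
CONCEPTS = {"contract negotiation", "contract edits", "agreement status",
            "revised pricing levels", "project XYZ engagement"}
FILE_TYPES = {"email", "word", "pdf"}
START, END = date(2007, 5, 1), date(2007, 8, 31)

# Step 1: keep only the relevant custodians.
step1 = [d for d in COLLECTION if d["custodian"] in CUSTODIANS]

# Step 2: keep documents tagged with at least one chosen concept.
step2 = [d for d in step1 if d["concepts"] & CONCEPTS]

# Step 3: list the file types, then keep the expected formats.
type_counts = Counter(d["type"] for d in step2)
step3 = [d for d in step2 if d["type"] in FILE_TYPES]

# Step 4: count documents per month, then keep the chosen date range.
timeline = Counter(d["sent"].strftime("%Y-%m") for d in step3)
step4 = [d for d in step3 if START <= d["sent"] <= END]

# Step 5: group participants by thread and look for players
# who were not among the custodians originally selected.
threads = defaultdict(set)
for d in step4:
    threads[d["thread"]] |= d["people"]
extra_players = set().union(*threads.values()) - CUSTODIANS

# Step 6: search the ENTIRE collection for other documents that
# share the same "personality" as the short list.
def shares_personality(d):
    return (bool(d["concepts"] & CONCEPTS)
            and d["type"] in FILE_TYPES
            and START <= d["sent"] <= END
            and bool(d["people"] & (CUSTODIANS | extra_players)))

selected = {d["id"] for d in step4}
related = [d["id"] for d in COLLECTION
           if d["id"] not in selected and shares_personality(d)]
```

In this toy data, step 5 surfaces a player who was not an original custodian, and step 6 then pulls in a document from that person’s files that the custodian filter in step 1 had excluded.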

Summary

As you can see, by having access to the right technologies that index all attributes in a collection, we can quickly uncover special relationships. However, the key to making this work is that you need to have hands-on access to such culling tools. It is important to be able to control the process so that you can immediately react to the results you see to help you drill down or widen your focus, change your criteria, follow tangents and make decisions on-the-fly.

You’ll also notice that many of the common steps of eDiscovery culling and processing are missing from the above example. Specifically, no keyword filtering was performed behind the scenes. This is not to say that keyword filtering does not have its place in the process; however, many “false positives” can come from this standard method of culling.

Additionally, keyword filtering requires you to cast a fairly wide net. And this means more irrelevant information makes its way into the mix. But as you can see, the above scenario easily goes above and beyond the standard success rate of keyword filtering. In essence, given the nature of ESI, it makes little sense to not use ALL the moving parts contained within these collections to your advantage ... unless of course, you like spending excess time and money on eDiscovery! When dealing with electronic data, it’s all there, so why not use it?

Hopefully, I’ve helped you see ESI collections and the eDiscovery process in a slightly new way. Just remember that each document, like each person, is unique in its own way. And within those unique personalities lies important information that connects all relevant facts together. We’re just at the forefront of understanding how to effectively use all these unique document attributes to our fullest advantage.

However, now that you know all documents have unique personalities, please refrain from trying to strike up conversations with them. If you find yourself doing that, well ... that means this business is getting to you and some serious time off is needed!

January 11, 2011

The Changing Landscape of eDiscovery

The eDiscovery landscape is a never-ending, constantly evolving environment. Each day, more challenges arise from the sheer breadth and scope of its intricacies – made even more complex by the introduction of new communications technologies and practices. 

One of the most quickly evolving areas of eDiscovery, however, is centered around the relationship of in-house counsel and law firms.   The best way to understand where things are going is to first look at where we’ve been. 

Categorically speaking, it used to be that corporations would essentially rely on outside counsel to lead all things related to discovery. This included collection management, data processing, culling, review and production – and all the technology decisions around this process. However, to execute the functions related to eDiscovery, law firms would (in most cases) outsource the work to litigation support companies. Law firms, through whoever was assigned the task (either the litigation support department [if it existed], litigators, partners or paralegals), would instruct the corporation on the process while vetting various vendor technologies and services that would be used in the matter. So, we had a situation where the level of influence on the process rested almost solely on the shoulders of outside counsel.

But through the years things started to change on two fronts: corporations started taking more control of the discovery process, and many law firms began bringing discovery technologies in-house. What caused these changes? There are a number of reasons.

The overriding driver of these changes is education. Just like with any new industry’s technologies or services, there is a maturation process. Gone are the days when this process was “magic,” understood only by a small community of specialists. Case by case, legal teams became more sophisticated. They climbed the learning curve by attending tradeshows, reading whitepapers, watching webinars, participating in CLE, doing research, joining professional organizations and experimenting with various technologies.

Through this process, just like with any new technology, users migrated to the top of the bell curve where industry-wide adoption occurs. And with this maturation, legal teams could increasingly navigate the previously murky waters of eDiscovery on their own.

Outside of education, there are drivers specific to each entity that led these changes.

CHANGES WITHIN THE CORPORATION

First, let’s look at the corporation and why more of the discovery process is “moving upstream:”

The enterprise is the entity that has the most to gain or lose in litigation, sometimes even facing a “bet the company” situation.  Given this, inside counsel and C-Level executives necessarily wanted to be more involved in the process.

Corporations foot the bill for discovery.  And, as the volume and complexity of electronic document collections continued to rise, so did discovery costs.  With any escalating cost line item, corporations pushed back, sought changes, and/or brought the right technologies in-house to streamline processes and reduce costs. 

It’s the corporation’s highly sensitive data that is released to the outside world - with no strong assurances as to how many hands would touch it, how many eyes would see it, and how many copies existed.  Due to prevalent  scatter-shot collection practices, this information can easily include strategic communications, technology blueprints, unrelated financial and human resource documentation, embarrassing employee activity and communication, and various “secret sauce” data that is otherwise highly protected from the outside world.  Coupled with the reality that upwards of 95% of collected data are not relevant or responsive, corporations found they were unnecessarily sharing sensitive information that should have never left the enterprise.

The final driver is related to new technologies and business processes.  Through the years, discovery activity – particularly collection – has matured.  The old fashioned way of collection used to involve IT teams going from desktop to desktop and/or server to server, making copies of all possibly responsive data (or, oftentimes, for the sake of ease, entire hard drives would be collected).  Today, although some of that still takes place, discovery is becoming more and more ingrained in normal business processes.  Technologies now exist that sit on top of data archives, which enable inside legal teams to query the enterprise to collect forensically sound data without ever disturbing the IT department or custodians. An additional measure to protect data, reduce costs and streamline the process is that many corporations are even hosting discovery data in-house and granting access to outside counsel during the review process.  This way, no sensitive documents leave the enterprise unless they are deemed responsive to discovery.

CHANGES WITHIN THE LAW FIRM

Next, let’s evaluate the changes within law firms and their ever-increasing trend of insourcing.

Law firms are becoming experts in eDiscovery.  Litigation support professionals are increasingly regarded as go-to people and are serving more of a consultative role to their firms.  As such, law firms are not relying on outside vendors for guidance as much as they used to. 

They can also position this part of the litigation process as a value-add to their corporate clients.  Having the ability to leverage a savvy team of discovery professionals is a great differentiator when corporations evaluate which firms to use on certain matters.

It is becoming more and more acceptable for law firms to bill the client for discovery services.  Adding this line item to the services that law firms provide is even more critical today given that they are being pressured by their clients to reduce costs in other areas.

Insourcing, if done correctly, has proven to help corporate clients reduce discovery costs.  By having the right technology and processes in-house, law firms can eliminate the margins that are paid when outsourcing to vendors.  Of course, this greatly depends on the quality of the technology that is brought in-house.  Many firms still see value in outsourcing this work to litigation support companies in order to leverage those companies’ infrastructure, expertise, project management and data analysts.  However, as collected data becomes more structured and better technologies are offered, more and more firms are finding insourcing a viable option.

And, lastly, by insourcing, law firms can oftentimes greatly streamline the process.  By having the right technologies and resources in-house, they eliminate the RFP process, the need to send data back and forth to vendors, and other inefficient steps inherent to outsourcing.  Today, many firms can receive data from their clients, and immediately load it into their internal discovery processing, culling and review technologies.  Oftentimes, this enables legal teams to complete the discovery activity in the time it would have typically taken just to begin the outsourcing process. 

WHERE WE’RE HEADED

These changes within each entity, corporations and law firms, are creating a final change to discovery: improved collaboration and communication between the two. The proverbial “line in the sand” is eroding. The process of issuing a discovery request to the corporation and, in turn, the corporation returning ready-to-review-for-production data will continue to blend into a single, efficient business process. Ultimately, data will be indexed on-the-fly as it is created, instantly conceptualized and data mapped, with all communication threads identified, and all departments of the organization will have intelligent access to that data, including records management, compliance, legal, and more. It may take years for these changes to affect most corporations worldwide.

However, a time will come – after major technology investments and changes to business processes – when document creation all the way through discovery will be streamlined in such a manner that costs are reduced while productivity and collaboration increase among all entities along the discovery life cycle.

In many situations, the old adage that “the only thing constant is change” holds true.  The discovery industry is no different.  Just like with any new technology or process, things move from early adopters to wide-spread maturation and change.  The main drivers of this phenomenon are almost always education, adoption, and commoditization. 

And in the case of discovery, all have shaped the industry as we know it today.

January 5, 2011

InterLegis Launches Discovery360™ Desktop


Solution Gives Corporate Legal Departments and Law Firms the Ability to Efficiently Process eDiscovery In-House

InterLegis, an innovator of litigation and electronic discovery technologies, today announced the release of Discovery360™ Desktop, an in-house, end-to-end discovery solution that allows corporations and law firms to protect data internally and perform eDiscovery processing, culling and/or review without the need to work with outside vendors.

“The introduction of Discovery360 Desktop underscores our commitment to our clients by addressing the need to quickly perform early case assessment, data processing, reporting, and culling as soon as collected data lands on their desk,” said Kevin Carr, president of InterLegis. “With Discovery360 Desktop, the complete discovery life cycle can oftentimes be completed in less time than it normally takes to outsource the work.  In addition, the performance of Discovery360 Desktop is matched by our cost-effective pricing, which includes benefits such as free culling.”

Features

Discovery360 Desktop provides multiple document analytic capabilities, robust data reporting, and advanced culling technologies in a single solution – eliminating the need to rely on expensive eDiscovery solutions to accomplish the same tasks. With its easy-to-use, intuitive interface, Discovery360 Desktop also offers:

  • The ability for legal teams to quickly understand the contents of collected data without the need for the time-consuming process of outsourcing.
  • Streamlined processing of electronic data with deduplication capabilities.
  • A built-in, powerful culling tool that includes data mapping, text search, keyword import, concept analysis, email threading, parent/child relationships and complete attribute filtering of all metadata.
  • Robust reporting capabilities that include data inventory, keyword term hit lists, deduplication reports, exception reports, production reports and metadata reporting.  
  • The ability to export into Discovery360 Reviewer or any common load file format such as Concordance, Summation and EDRM.
  • The ability for data to reside locally, on a network or pushed to InterLegis’ hosted environment.
  • Scalability across an entire corporate or law firm network, which translates into the ability to efficiently handle both small- and large-scale projects.

Pricing

Pricing for Discovery360 Desktop includes complimentary electronic discovery processing, deduplication, reporting, early case assessment and culling – meaning clients only pay for the relevant data they need.

Immediate Availability

Discovery360 Desktop is available immediately. For more information, contact info@interlegis.com or visit www.interlegis.com.