Data Quality, what does it mean to you?
To me, it means that I can implicitly trust what I see, whether it is a well ticket or a boat’s gas gauge (and E does not mean “Enjoy your ski, we have lots of gas and YOU won’t get stuck out in the middle of the lake”, but that is another story…). It also means I can trust the result set from a query for wells drilled year over year by producing zone when it shows an uptick in Triassic-based plays. So that is my definition of data quality, but how do we get there? Or maybe the question is how do we avoid becoming a data quality folk tale? This website (http://www.iqtrainwrecks.com) has some amusing stories, and this one (http://www.iqtrainwrecks.com/2011/03/17/gas-byproducts-give-pain-gut) strikes very close to home.
In my years of working with oil and gas data, it seems to me that data degrades very easily, and it is only through hard work and consistent effort that you can get data to stay the same or even improve, year after year. Part of this comes from our own built-in avoidance of change; part of it is based on company culture. I found this article (http://www.forbes.com/sites/forbeswomanfiles/2011/09/15/the-remarkable-edge-a-breakthrough-environment-will-give-your-company) most interesting because it states that there are basic inputs that will help your company grow and thrive in this new and challenging century. To take it a step further, the basic tenets of data quality (Full, Accurate, Consistent and Timely data) can be tied to four inputs, plus one to sum it up.
- Speed –> Timely data; it’s no good to learn that a competitor’s well was abandoned after you have started drilling your own. You need to have that information when it comes off confidentiality and you need to trust that it will be there.
- Reliability –> Consistent data; now you see me, now you don’t. Great game if you are playing it as a 3-year-old; not so much if you are an oil & gas knowledge worker.
- Quality –> Accurate data; we have all heard the folk tale of G&G staff spending anywhere from 1/3 to 2/3 of their time validating or finding data. Well, when you have a built-in level of quality or trust, you move mentally from challenging the data to incorporating the data into the play.
- Engagement –> Full data; when a play is being worked on, the G&G staff must know that all components are available: logs, cores, analysis, tops, tests, IP, reserves, seismic, pipelines, etc.
- Innovation –> FACT-based data; having all of the facts and then being able to challenge the status quo with sound information that shows opportunity.
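To make the four tenets a little more concrete, here is a minimal sketch of what a FACT check on a single well record might look like. Everything in it is an assumption made up for illustration: the `WellRecord` fields, the required components, the list of known zones, the depth limit and the one-day lag are hypothetical, not taken from any real system or standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical reference values, chosen only for illustration.
REQUIRED_COMPONENTS = {"logs", "tops", "tests"}
KNOWN_ZONES = {"TRIASSIC", "JURASSIC", "CRETACEOUS"}
MAX_PLAUSIBLE_DEPTH_M = 10_000.0


@dataclass
class WellRecord:
    """A made-up well record; a real one would carry far more fields."""
    well_id: str
    producing_zone: Optional[str]
    total_depth_m: Optional[float]
    components: frozenset  # which data components are on file
    released_on: date      # date the data came off confidentiality
    loaded_on: date        # date the record became available to users


def fact_check(record: WellRecord, max_lag_days: int = 1) -> dict:
    """Return one pass/fail flag per tenet: Full, Accurate, Consistent, Timely."""
    lag_days = (record.loaded_on - record.released_on).days
    depth = record.total_depth_m
    return {
        # Full: every required component is present.
        "full": REQUIRED_COMPONENTS <= record.components,
        # Accurate: a crude plausibility test on depth, not real validation.
        "accurate": depth is not None and 0 < depth <= MAX_PLAUSIBLE_DEPTH_M,
        # Consistent: the zone name comes from a controlled list.
        "consistent": record.producing_zone in KNOWN_ZONES,
        # Timely: the record was loaded soon after release.
        "timely": 0 <= lag_days <= max_lag_days,
    }


record = WellRecord(
    well_id="WELL-001",
    producing_zone="TRIASSIC",
    total_depth_m=2450.0,
    components=frozenset({"logs", "tops", "tests", "cores"}),
    released_on=date(2011, 9, 1),
    loaded_on=date(2011, 9, 1),
)
print(fact_check(record))
```

A record that fails any one flag is a record a G&G user has to stop and challenge, which is exactly the time sink described above.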
Okay, we have defined a basic tenet around building up data quality. What’s the next step? Well, some review of existing websites is never a bad thing. It’s always easier to stand on the shoulders of others than to start from scratch. Sites or blogs that I like are:
- http://dataroundtable.com –> a series of blog postings from IQ (information quality) thought leaders
- http://tensteps.gfalls.com –> a great site that has digital examples of what’s in Danette McGilvray’s book, Executing Data Quality Projects. I have a copy of this book and it’s getting dog-eared. ‘Nuff said.
- http://data-governance.blogspot.com –> Anything by Steve Sarsfield is gold. His latest articles on the root causes of data quality are great. As always, it’s easy to identify a problem, it’s harder to suggest a solution. And Steve does not shy away from providing solutions.
- http://bardess.com/blog/?p=202 –> Yet another article on what bad data is and what it is costing your organization. What’s interesting about this one is the five key data standards it mentions: Completeness, Accuracy, Timeliness, Uniqueness and Consistency. They match up with what I put forward earlier (Full, Accurate, Consistent & Timely), except that I am missing Uniqueness. Why? Well, being a data vendor, I know that we create that single record of the truth (well), and why would you go anywhere else for public data?
Data Quality, what does it mean to you?