February 17, 2014

When Does ICD-10 Testing Start to Resemble Something Other than Testing?


We hear the term “testing” on a daily basis and see hundreds of articles on how important testing is, and yet these warnings often go largely ignored. This results in installing defective software and business processes into production. Why does this happen, time and time again?

For the most part, it is a matter of competing priorities in addition to the simple fact that there never seems to be enough time or money to do it right. The key lesson to learn here is that if you don’t have time to do it right the first time, by default you must have time to do it over again. We’ll touch upon this recurring theme again, because you can’t skip testing – you can only put it off until later, when it costs many times more to fix the problem.

Unit tests, system tests, integration tests, smoke tests, whitebox, blackbox, load, stress, user acceptance testing (UAT), end-to-end (E2E) testing: we tend to use these testing terms interchangeably, as if everyone understands what each one means and what specific tasks are supposed to be completed during each test phase.

Phase containment is a term not heard too often, but it is important for any test effort. It means finding and keeping defects within the appropriate phase, and making sure none of them slip into the next testing phase. An easy-to-understand example of lost containment is finding a unit test error in UAT or E2E testing, such as a missing NPI number or diagnosis pointer. Such an issue stems from test cases and scripts that were not designed effectively from the start. Each testing phase must be designed to build upon the previous one, and very specific tasks are to be accomplished during each test phase; in other words, testing cannot be random, haphazard, or accomplished via dumb luck. For instance, you don’t test full-scale business processes in unit testing, and likewise, you don’t send defective claims during UAT or end-to-end testing; those phases are for positive business events only.
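For the techies in the crowd, phase containment can be measured with very little code. What follows is only a minimal sketch: the phase list, the Defect record, and the sample defects are illustrative assumptions, not part of any particular test tool. A defect has "escaped" when it is found in a later phase than the one whose test cases should have caught it.

```python
from dataclasses import dataclass

# Ordered test phases. This particular list is an illustrative assumption.
PHASES = ["unit", "system", "integration", "uat", "e2e"]


@dataclass
class Defect:
    description: str
    belongs_to: str  # phase whose test cases should have caught the defect
    found_in: str    # phase in which the defect was actually found


def escaped(defect: Defect) -> bool:
    """True when the defect slipped past the phase it belongs to."""
    return PHASES.index(defect.found_in) > PHASES.index(defect.belongs_to)


def containment_rate(defects: list[Defect]) -> float:
    """Fraction of defects that were found in the phase they belong to."""
    if not defects:
        return 1.0
    contained = sum(1 for d in defects if not escaped(d))
    return contained / len(defects)


# Hypothetical sample data for illustration only.
defects = [
    Defect("missing NPI number", "unit", "uat"),
    Defect("missing diagnosis pointer", "unit", "unit"),
    Defect("remittance amount mismatch", "integration", "integration"),
    Defect("claim rejected by trading partner", "e2e", "e2e"),
]

escapes = [d.description for d in defects if escaped(d)]
rate = containment_rate(defects)
print(escapes, rate)
```

In this sample, the missing NPI number is a unit-level defect that surfaced in UAT, so it counts against containment.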

One of the key goals of any test involves determining what it is you want to know prior to beginning the test. Are you just testing a claim, or the contents of the claim? If it is the contents, then all the critical elements of the claim must be known ahead of time. This means the right patient information, the right diagnosis and procedure codes, the right billing amounts and revenue codes, etc. The same holds true for remittances, specifically in terms of the payment amounts for the member and the health plan along with contracted amounts for the provider. Too often, this testing is instead summarized as “just test and make sure nothing is broken” or “the claim paid what it has paid historically, right, wrong, or indifferent.”
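The idea of knowing the answers before the test begins can be sketched in a few lines. This is an illustration under stated assumptions: the field names and sample values are hypothetical and do not come from any real claim or remittance format. The point is that the expected values are written down first, and the test only reports where the actual result differs.

```python
def compare_to_expected(expected: dict, actual: dict) -> dict:
    """Return {field: (expected_value, actual_value)} for every mismatch."""
    return {
        field: (value, actual.get(field))
        for field, value in expected.items()
        if actual.get(field) != value
    }


# Expected results, defined BEFORE the test is run (hypothetical values).
expected = {
    "diagnosis_code": "E11.9",
    "procedure_code": "99213",
    "billed_amount": 150.00,
    "plan_paid": 100.00,
    "member_paid": 20.00,
}

# What came back from the test run (hypothetical values).
actual = {
    "diagnosis_code": "E11.9",
    "procedure_code": "99213",
    "billed_amount": 150.00,
    "plan_paid": 90.00,
    "member_paid": 20.00,
}

mismatches = compare_to_expected(expected, actual)
print(mismatches)
```

Without the expected dictionary there is nothing to compare against, and the test can only confirm that something came back.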

Since the focus of this article is UAT and end-to-end testing, we will focus on the critical aspects in these two areas and where we see the most common mistakes. Providers cannot and should not send incorrect ICD-10 codes on claims in UAT or E2E, because incorrect codes should be tested and corrected in previous phases. What does this mean? It means that if you are not sure what the correct answers are and you are just sending in claims, then the test is not valid and your organization has lost phase containment. Health plans are trying to unit-test hospital coders in E2E testing, which is the wrong test in the wrong phase.

Contrived clinical scenarios are another head-scratcher because they are a technical solution to a business problem; they don’t resemble what happens in production. They certainly don’t mimic any part of the actual production process, and if you don’t test using the exact method that is going to be used in production, then what are you testing? You tested whether your testing shortcut worked, and that’s about it. I don’t know of any encounter in which a patient walks in and selects his or her ailments based on some list of contrived clinical scenarios; instead, the coding for each patient must be based on the exact clinical documentation at each hospital or physician practice. That is why we test with providers’ actual records – so we can determine if the documentation practices currently in use will be effective for ICD-10 or if modifications are required. I can find lots of technical IT professionals who think clinical scenarios are the way to go, but when you talk to clinicians and coders, they are confused about how such scenarios have anything to do with what actually occurs at the point of care. Since ICD-10 is a clinical documentation effort, I am going to side with the coders, clinicians, and health information management (HIM) professionals who want to test with their own medical records, because they know that testing using what they do every day is a much better way to test.

If you are looking for an “out” in testing, there are two more ways you can skip testing ICD-10. The first is to select only a handful of trading partners for testing and leave the other 99 percent of your trading partners in the dark, without communication of every transaction that was tested or the results of those tests. The second option is even a little better: how about just taking the word of organizations that say that, because they tested so awesomely internally, they don’t ever need to test with any submitters, since everything works just fine? They apparently have advanced premonition skills, through which their testing teams can predict what ICD-10 codes will be sent in production without ever having to see a single E2E test case – and that, my friends, is a truly amazing skill set to have! That would be like skipping UAT because the software developer said he fully tested his code and it’s good to go. We test to verify; we don’t test because we don’t believe him.

These examples remind me of several common phrases, perhaps the most obvious of which is the saying “familiarity breeds contempt.” Since we are all in healthcare, there’s another that quickly comes to mind: “an ounce of prevention is worth a pound of cure.” Since this phrase is used so often and highlights the benefits of preventative medicine, it is a perfect parallel to ICD-10 testing. If the industry fully understands how important preventing illness is for the wellness of an individual, it should be obvious that testing is equally important for the wellness of the software and business processes that drive revenue for every provider in the country. Testing is preventative medicine, and not testing is ignoring the problem and electing for surgical procedures to fix the problem later. For the techies in the crowd, here is a golden oldie: GIGO (garbage in, garbage out), which means if you arbitrarily test stuff with no rhyme or reason, then don’t expect anything to come out of your efforts – it’s pretty much throwaway work at that point. For those doing this type of testing, don’t bother – save your money and just throw the stuff in production to find all the errors there.

Of course, I saved the best saying for last, because it clearly states the obvious, even if most organizations completely overlook it: “you can never avoid testing – you either do it in the test phase or you do it in production, but you can never escape it.” What does this mean? It means that you can spend a dollar today to test before the new coding set goes live, or you can spend a hundred dollars to fix that same thing later in production. It is a really simple concept, and Healthcare.gov is a perfect example of throwing something into production without proper testing. The U.S. Department of Health and Human Services (HHS) will spend millions more dollars fixing it, at 10 to 50 times the cost, because shortcuts were taken in the testing phase. For whatever reason, money wasted on production support fixes doesn’t have the same visibility as money saved by testing properly. Here is a common IT scenario that illustrates the point: a development manager puts untested code in production, but does so on time and on budget, so to the untrained eye, the effort looks successful. Then all the bad code causes production errors, but there is no traceability for determining where the defect came from in the first place. The production support manager then looks like a hero for fixing a defect that never should have been there. The easiest way to remedy this is to have the development and test teams own the product for six months after it goes live in order to determine the true project costs and how many test cases were missed so that lessons can be learned.
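The arithmetic behind that saying is easy to work through. The sketch below is purely illustrative: the defect count and the cost per defect are hypothetical inputs, and the multipliers are simply the 10x, 50x, and 100x ratios quoted above rather than measured figures.

```python
def fix_cost(defect_count: int, cost_per_defect_in_test: float, multiplier: float) -> float:
    """Total cost of fixing the defects, scaled by how late they are found."""
    return defect_count * cost_per_defect_in_test * multiplier


# Hypothetical inputs: 40 defects at $500 each when caught in the test phase.
DEFECTS = 40
COST_IN_TEST = 500.0

cost_in_test = fix_cost(DEFECTS, COST_IN_TEST, 1)
costs_in_production = {m: fix_cost(DEFECTS, COST_IN_TEST, m) for m in (10, 50, 100)}

print(f"caught in test: ${cost_in_test:,.0f}")
for multiplier, cost in costs_in_production.items():
    print(f"caught in production at {multiplier}x: ${cost:,.0f}")
```

Whatever the real multiplier turns out to be for a given organization, the testing bill is the smallest number on the page.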

How else does the healthcare industry waste time and money in testing? Silo-based testing efforts create massive duplication of effort across the country, and they waste precious resources and time, especially when compared to completing tasks collaboratively. A quick example is every provider calling every health plan to inquire about testing plans and schedules while every clearinghouse is doing exactly the same thing. All of that information is kept in silos, but a better scenario would involve a central repository for vendor testing information, accessible to clearinghouses, health plans, and providers. For those entities that are testing, the following needs to be answered: What codes were tested? What DRGs? What CPTs? Was the remittance correct for each test case? What was the coding accuracy? Which specialties require more comprehension? What kind of feedback mechanisms are in place to improve coding and clinical documentation? How well are the computer-assisted coding (CAC) tools working? What are the lessons learned?
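Two of those questions, coding accuracy and which specialties need more attention, can be answered directly from shared test results. The following is a minimal sketch that assumes each result is recorded as a hypothetical (specialty, expected code, submitted code) tuple; the codes are sample values for illustration, not real test data.

```python
from collections import defaultdict


def accuracy_by_specialty(results):
    """Map each specialty to the fraction of test cases coded as expected."""
    totals = defaultdict(lambda: [0, 0])  # specialty -> [correct, total]
    for specialty, expected_code, submitted_code in results:
        totals[specialty][1] += 1
        if expected_code == submitted_code:
            totals[specialty][0] += 1
    return {s: correct / total for s, (correct, total) in totals.items()}


# Hypothetical test results: (specialty, expected code, submitted code).
results = [
    ("cardiology", "I10", "I10"),
    ("cardiology", "I50.9", "I50.9"),
    ("orthopedics", "M17.11", "M17.11"),
    ("orthopedics", "S72.001A", "S72.009A"),
]

accuracy = accuracy_by_specialty(results)
print(accuracy)
```

A shared repository holding records like these would let every trading partner see the same accuracy figures instead of each rebuilding them in a silo.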

Testing between a provider and a clearinghouse involves trying to get the technical layout correct and the right content in the files. Testing with health plans is testing business functionality almost exclusively. Testing eligibility, remittance, etc., can and should be done early and often. Separating these two distinct testing functions will remove the waterfall dependency through which providers think they have to wait to test, allowing health plans to test earlier in their remediation processes. The goal is to get providers and health plans on the same page earlier in the testing process, which is something that only effective collaboration can provide.

Providers must be able to test much earlier with health plans and clearinghouses than they do today, and this can be accomplished quickly – but again, not under an antiquated waterfall testing methodology. If you want to use the waterfall method, you’d better start a year or two earlier than everyone else. The industry must move away from waterfall testing and toward agile and asynchronous testing methods, which are the foundation of the National Testing Platform.

This will be the first in a series of articles addressing how we can increase knowledge of proper and effective quality assurance and testing procedures. So the next time you hear someone discussing testing, inquire first about their formal quality assurance training, testing certifications, and testing experience. Watching or being involved in a previous testing effort from afar does not make someone a testing professional; understanding why and how to test requires analytical and logical approaches that follow a methodology, and any methodology that doesn’t include expected results prior to testing is not a good methodology at all. ICD-10 is not a good area for testing the whims of the untrained eye. Next month’s article will address some of the many lessons learned and the early coding metrics we are seeing among the stakeholders currently involved with testing.

About the Author

Mark Lott is the CEO for the Lott QA Group. Mark has more than 27 years of extensive quality assurance expertise in healthcare, pharmaceutical and financial services in his role as a testing evangelist. Mark has developed the industry’s first end-to-end testing framework using dual-coded medical records available through the nation’s only social collaborative testing – the National Testing Platform. He is active in many industry workgroups and associations pushing for the advancement of testing initiatives and educating the industry on testing best practices for large-scale trading partner networks.
