| Authors | S. Sen, D. Marijan, C. Ieva, A. Grime and A. Sander |
| Title | Modelling and Verifying Combinatorial Interactions to Test Data Intensive Systems: Experience with Optimal Archiving at the Norwegian Customs and Excise Directorate |
| Affiliation | Software Engineering |
| Project(s) | The Certus Centre (SFI) |
| Status | Published |
| Publication Type | Journal Article |
| Year of Publication | 2016 |
| Journal | IEEE Transactions on Reliability |
| Issue | 99 |
| Pagination | 1-14 |
| Publisher | IEEE |
| Abstract | Testing data-intensive systems is paramount to increase our reliance on information processed in e-governance, scientific/medical research, and social networks. Data accrued in these systems often go through several manual and computational steps involving human inputs in interactive media and complex batch applications that aim to ensure high quality of data in terms of validity, correctness, and adherence to business rules. A common industrial practice in testing data-intensive systems is to extract test databases from live production streams and verify the data in them through a checklist of requirements either by tedious manual observation or by executing complex SQL queries composed and understood by very few domain experts. We elevate the specification of such requirements on data by modelling data interactions between fields cross-cutting the test database’s schema. These interactions are modelled as test cases in a classification tree model. The model documents intuitive expert knowledge about what to expect in the test database and is given executable semantics using our human-in-the-loop tool DEPICT. DEPICT verifies whether interactions occurred in systematically extracted test databases. Non-occurrence of expected interactions or occurrence of unexpected interactions indicates faults in the data. We present experiences on how our model-driven approach has been successfully applied to verify test databases in the Norwegian Public Sector. In particular, we present case studies at (1) the Norwegian Customs and Excise Directorate for verifying the adherence to customs regulations and (2) the Cancer Registry of Norway to verify its data quality management process involving both human coders and complex legacy batches. |
| Citation Key | 23816 |