Abstract
Joining relational data can jeopardize patient confidentiality when data disseminated for research can be linked with publicly available data that contain, for example, explicit identifiers. Ambiguity in the data hinders the construction of primary keys, which are important when joining data tables. We define two values to be indiscernible if they are the same or if at least one of them is a special value. Two rows in a data table are indiscernible if their corresponding entries are indiscernible. We further define a table to be k-ambiguous if each row is indiscernible from at least k rows in the same table. We present two simple heuristics for making a table k-ambiguous by cell suppression, and compare them on example data.
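The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the special value is represented by `None` (standing in for a suppressed cell) and that a row counts itself among the rows it is indiscernible from; the abstract fixes neither choice. The names `SUPPRESSED`, `values_indiscernible`, `rows_indiscernible`, and `is_k_ambiguous` are hypothetical, and the example table is invented purely for illustration.

```python
# Assumption: None plays the role of the "special value" (a suppressed cell).
SUPPRESSED = None


def values_indiscernible(a, b):
    """Two values are indiscernible if they are equal or either one is the special value."""
    return a == b or a is SUPPRESSED or b is SUPPRESSED


def rows_indiscernible(row_a, row_b):
    """Two rows are indiscernible if all corresponding entries are indiscernible."""
    return len(row_a) == len(row_b) and all(
        values_indiscernible(a, b) for a, b in zip(row_a, row_b)
    )


def is_k_ambiguous(table, k):
    """Check that every row is indiscernible from at least k rows of the table.

    Assumption: the count includes the row itself.
    """
    return all(
        sum(rows_indiscernible(row, other) for other in table) >= k
        for row in table
    )


# Invented example data: (sex, age, diagnosis), with two cells suppressed.
table = [
    ("F", 34, "flu"),
    ("F", SUPPRESSED, "flu"),
    ("M", 34, "asthma"),
    ("M", 34, SUPPRESSED),
]

print(is_k_ambiguous(table, 2))  # True: each row matches itself and one other row
print(is_k_ambiguous(table, 3))  # False: no row matches three rows
```

Under these assumptions, the two suppressed cells are enough to make the example table 2-ambiguous, but not 3-ambiguous.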
| Original language | English |
|---|---|
| Pages (from-to) | 726-730 |
| Number of pages | 5 |
| Journal | Proceedings / AMIA ... Annual Symposium. AMIA Symposium |
| Publication status | Published - 2001 |
| Externally published | Yes |
Keywords
- Algorithms
- Confidentiality
- Medical Records Systems, Computerized