Rainer Lenz (Federal Statistical Office of Germany)
Disclosure of confidential information by means of multi objective optimization
To assess the effectiveness of an anonymization method with respect to data protection, the disclosure risk associated with the protected data must be evaluated. We consider the scenario in which a potential data intruder matches an outside database against the protected data (the target data); for example, to improve his outside database, he may try to assign as many correct pairs of records (that is, records corresponding to the same individual) as possible. The problem of maximizing the number of correctly assigned pairs is translated into a multi-objective linear assignment problem (MOLP).
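As a rough illustration of the matching scenario, the following minimal sketch scalarizes several per-objective cost matrices with a weight vector and then solves the resulting single-objective assignment problem. Everything in it is an assumption made for illustration: the abstract does not specify the objectives, the cost values, or the solver, and the names `scalarized_costs`, `best_assignment`, and `correct_pairs` are hypothetical. The brute-force search over permutations is only feasible for toy sizes and stands in for a proper assignment solver.

```python
from itertools import permutations

def scalarized_costs(cost_matrices, weights):
    """Combine several n x n cost matrices into one via a weighted sum.

    cost_matrices[k][i][j] is the (assumed) cost under objective k of pairing
    outside record i with target record j.
    """
    n = len(cost_matrices[0])
    return [
        [sum(w * c[i][j] for w, c in zip(weights, cost_matrices)) for j in range(n)]
        for i in range(n)
    ]

def best_assignment(cost):
    """Exhaustively find the one-to-one assignment with minimal total cost.

    Brute force over all n! permutations: illustration only, toy sizes.
    """
    n = len(cost)
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[i][p[i]] for i in range(n)))
    return list(best)

def correct_pairs(assignment):
    """Count correct pairs, assuming record i truly corresponds to record i."""
    return sum(1 for i, j in enumerate(assignment) if i == j)

# Two made-up objectives over three records.
c1 = [[0, 2, 3], [2, 0, 4], [3, 4, 1]]
c2 = [[1, 0, 5], [0, 1, 5], [5, 5, 0]]

only_first = best_assignment(scalarized_costs([c1, c2], (1, 0)))
only_second = best_assignment(scalarized_costs([c1, c2], (0, 1)))
```

In this toy example the two weight vectors lead to different assignments, and `correct_pairs` reports a different number of re-identified records for each, which shows how the choice of weights can change the measured risk.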
We calculate the solutions to the MOLP obtained by applying it to the German structure of costs survey (SCS), restricted to the processing industry. For specific anonymization methods, we obtain an upper bound for the disclosure risk by assuming the worst-case scenario, in which the outside database equals the original data. Since combining all objectives into a single value, as is typically done in a linear program formulation, generally leads to a considerable loss of useful information, we compare the results obtained by applying different weight vectors to the system of multiple objectives. Solving the MOLP with greedy heuristics does not guarantee optimality. Nevertheless, those approaches are also discussed, since their undoubted advantage is that they run in acceptable time, namely in quadratic time with respect to the number of individuals.
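The trade-off between speed and optimality can be sketched with one possible greedy rule. This is an assumption for illustration, not necessarily the heuristic used in the paper: each outside record, taken in order, is paired with the cheapest target record that is still free. The function name `greedy_assignment` and the cost values are hypothetical.

```python
def greedy_assignment(cost):
    """Pair each row with its cheapest still-unassigned column.

    Runs in O(n^2) time for an n x n cost matrix, but the result may be
    far from the optimal assignment.
    """
    n = len(cost)
    free = [True] * n          # free[j] is True while column j is unassigned
    assignment = []
    for i in range(n):
        best_j = min((j for j in range(n) if free[j]),
                     key=lambda j: cost[i][j])
        free[best_j] = False
        assignment.append(best_j)
    return assignment

def total_cost(cost, assignment):
    """Sum the costs of the chosen pairs."""
    return sum(cost[i][j] for i, j in enumerate(assignment))

# Made-up 2 x 2 example where the greedy choice for row 0 blocks row 1.
cost = [[1, 2],
        [1, 10]]

greedy = greedy_assignment(cost)        # row 0 takes column 0 first
greedy_total = total_cost(cost, greedy)
optimal_total = total_cost(cost, [1, 0])
```

Here the greedy rule commits row 0 to column 0 and forces row 1 onto an expensive pairing, so its total cost exceeds that of the swapped assignment. This is the kind of suboptimality the abstract refers to.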
Session: 3a Auditorium Category: Data access and disclosure control