Which company sells more milk? How about coffee or cheese? If a country’s supermarket chains wanted to know the answer and compare results, they would find it virtually impossible without spying on competitors’ accounting data.
This problem has a solution. Dan Bogdanov, a doctoral student at the University of Tartu’s Institute of Computer Science and a researcher at Cybernetica, leads the Sharemind project, which allows computations to be made on input data without compromising privacy.
Sharemind would allow companies to compare their sales statistics with those of their competitors without anyone seeing the actual figures. So how does it work?
“Data from the different supermarket chains is split into three parts, the programme algorithm encrypts the data, and our three data miners each receive a part,” explains Bogdanov. “This is followed by computations which are carried out so that nobody has access to the original data. Finally, the data processor collects the computation results and puts together the final outcome. However, if you look at any one of the three parts of the data, it looks like random white noise.”
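One way to picture the splitting Bogdanov describes is additive secret sharing, where each figure is broken into three random-looking numbers that add up to the original only when all three are combined. The article does not say which scheme Sharemind actually uses, so the sketch below is an illustration under that assumption; the modulus, the function names and the sales figures are all made up for the example.

```python
import secrets

MODULUS = 2**32  # assumed number range for this sketch; values wrap around at this size


def split_into_shares(value, parties=3):
    """Split value into additive shares that sum to value modulo MODULUS."""
    # All shares but the last are uniformly random numbers.
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    # The last share is chosen so that the total comes out right.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def combine(shares):
    """Recombine shares (or partial results) into the value they encode."""
    return sum(shares) % MODULUS


def secure_total(values, parties=3):
    """Add up the owners' values while each miner sees only random-looking shares."""
    held_by_miner = [[] for _ in range(parties)]
    for value in values:
        # Every data owner splits its own figure and sends one share to each miner.
        for miner, share in enumerate(split_into_shares(value, parties)):
            held_by_miner[miner].append(share)
    # Each miner adds up only the shares it holds.
    partial_results = [sum(held) % MODULUS for held in held_by_miner]
    # Only the combined partial results reveal the total.
    return combine(partial_results)


# Hypothetical litres of milk sold by three chains.
litres_sold = {"chain_a": 120_000, "chain_b": 95_500, "chain_c": 143_250}
total = secure_total(list(litres_sold.values()))
print(total)  # 358750
```

In this sketch the miners only ever add numbers, which is why a sum works so simply; comparisons such as "who sells more" need more elaborate protocols that are not shown here.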
Basically, Sharemind is able to process data without seeing it and still generate accurate results. A prerequisite for confidentiality is that the data must come from at least three different sources and that the owners of the data do not exchange information among themselves.
Bogdanov explains that the owners of the data may be curious but must be honest. “Suppose one of the database owners is attacked and the data is stolen. In this case, the thief would have no use for the stolen data as it has no value by itself,” says Bogdanov.
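The claim that a single stolen part has no value by itself can be checked by brute force in a toy setting. The sketch below again assumes additive sharing into three parts, which the article does not confirm, and uses a deliberately tiny number range (16 values) so that every case can be enumerated; the function name is hypothetical.

```python
MODULUS = 16  # tiny number range so every case can be enumerated; illustrative only


def secrets_consistent_with(stolen_share):
    """Return every secret that could lie behind one stolen share.

    The secret is assumed to be the sum of three shares modulo MODULUS,
    and the thief knows only the first of them.
    """
    possible = set()
    for second in range(MODULUS):
        for third in range(MODULUS):
            possible.add((stolen_share + second + third) % MODULUS)
    return possible


# Whatever share is stolen, every secret in the range remains possible,
# so the stolen share alone rules nothing out.
for stolen in range(MODULUS):
    assert secrets_consistent_with(stolen) == set(range(MODULUS))
```

The same reasoning explains why the parts must not be pooled: once all three are known, the sum is fixed and the original figure is exposed.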
Who could benefit from this?
A confidential calculation system of this nature could be used by hospitals and gene banks, which hold similar, comparable data that they are not allowed to share because of privacy requirements. But if these institutions cooperated, they could answer questions such as “How many sex partners has the average HIV-positive person had?” or “How many 55-year-old males smoke?”
“I’m sure that Estonian IT companies are interested in what the average salaries for various positions in the sector are. Obviously, Skype or Webmedia wouldn’t reveal their figures to competitors. However, if IT companies would allow computations based on their anonymous data, it would be possible to come up with some objective figures,” adds Bogdanov.
Sharemind is a joint product of the University of Tartu, the Software Technology and Applications Competence Centre and Cybernetica. In addition to the inter-database computations, Sharemind allows users to create web-based questionnaires that split data into three parts as they are being filled in, thus protecting the privacy of the respondent.
A system similar to Sharemind was used in Denmark for negotiating the prices of sugar beet producers. No single producer saw the prices of other producers, but in the end, the system found a balance between production and demand.
“We started with only a few lines of ideas that have now evolved into several tens of thousands of lines of code,” says Bogdanov. “The project contains more than ten years of human work.”
Could this kind of computation system interest many potential users?
“At the moment, it is still a scientific project, but as the technology is nearing completion, we are actively searching for opportunities to put it into practice,” Bogdanov affirms.