A blockchain-based tool to make data science safer
SecureKG, a joint project between two labs in EPFL’s School of Computer and Communication Sciences (IC), aims to improve the security and traceability of RENKU, the data science platform being developed by the Swiss Data Science Center (SDSC).
RENKU enables reproducible data science via its knowledge graph (KG), which provides a framework for facilitating collaboration; analyzing data and capturing its traceability; and organizing datasets and software for later reference, re-use, or refining. But since some of these datasets contain sensitive information, there’s a need for secure management, controlled access, and a guarantee of tamper-proof traceability.
Thanks to the combined efforts of Bryan Ford’s Decentralized and Distributed Systems Lab (DEDIS) and Jean-Pierre Hubaux’s Computer Communications and Applications Laboratory (LCA1), SecureKG will provide this protection in two layers.
Security from two sides
The DEDIS lab’s Calypso architecture will ensure secure data storage and access using a novel blockchain technology that, unlike conventional blockchains, can safely manage private data.
“If you want anything secret on a conventional blockchain, you have to encrypt it…but then someone has to hold the encryption key,” Ford explains. “Calypso gives you the ability to entrust secrets to the blockchain itself in a decentralized fashion. All participants share fragments of the key, none of which reveal any information about the secret data on their own.” The beauty of integrating a decentralized tool like Calypso with the SDSC’s platform, Ford adds, is that the level of privacy increases along with the size of the system.
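The key-fragment idea Ford describes can be illustrated with Shamir's secret sharing, a classic threshold scheme. The sketch below is only an illustration under stated assumptions: the article does not say which scheme Calypso uses, and the names `split_secret`, `recover_secret`, and `PRIME` are hypothetical. A key is split into five fragments, any three of which recover it, while fewer than three reveal nothing about it.

```python
import secrets

# Field modulus: the Mersenne prime 2^127 - 1 (an illustrative choice).
PRIME = 2**127 - 1


def split_secret(secret, n_shares, threshold):
    """Split `secret` into `n_shares` fragments; any `threshold` recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= n_shares:
        raise ValueError("need 1 <= threshold <= n_shares")
    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, n_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = 123456789
fragments = split_secret(key, n_shares=5, threshold=3)
assert recover_secret(fragments[:3]) == key
assert recover_secret([fragments[0], fragments[2], fragments[4]]) == key
```

In this toy setting each fragment is just a point on a random polynomial, so any set of fragments smaller than the threshold is consistent with every possible key.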
While Calypso does the “data care-taking”, LCA1’s technology will provide the crucial second layer of protection, allowing private RENKU data to be analyzed safely. Hubaux’s lab specializes in homomorphic encryption technologies, which enable computations to be performed on encrypted data without decrypting it. With this approach, only the statistical results of computations are revealed, not the data itself, which protects the knowledge graph in terms of both data privacy and security.
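To make the idea of computing on encrypted data concrete, here is a minimal sketch using the Paillier cryptosystem, which supports addition of encrypted values. This is an assumption chosen for illustration: the article does not name the schemes LCA1 uses, the primes are toy-sized and insecure, and the function names and sample values are hypothetical. Three private values are encrypted, summed while still encrypted, and only the total is decrypted.

```python
import math
import secrets

# Toy key material: two known Mersenne primes, far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q                      # public modulus
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1), private
MU = pow(LAM, -1, N)           # private; valid because the generator is N + 1


def encrypt(m):
    """Encrypt an integer 0 <= m < N under the public key N."""
    if not 0 <= m < N:
        raise ValueError("message out of range")
    while True:
        r = secrets.randbelow(N - 1) + 1   # fresh randomness per ciphertext
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2


def decrypt(c):
    """Decrypt a ciphertext with the private values LAM and MU."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2


# Hypothetical private values, e.g. one measurement per patient.
values = [120, 135, 128]
ciphertexts = [encrypt(v) for v in values]

encrypted_total = ciphertexts[0]
for c in ciphertexts[1:]:
    encrypted_total = add_encrypted(encrypted_total, c)

# Only the aggregate is ever decrypted.
assert decrypt(encrypted_total) == sum(values)
```

The party that multiplies the ciphertexts never sees an individual value, and whoever holds the private key learns only the sum, from which a statistic such as the mean can be derived.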
“For a platform to be used, one has to reach a level of mutual trust between those who provide the platform and those who use it.”
SecureKG was launched on October 1, 2018, as a two-year collaboration between IC and the SDSC.
Hubaux says that the project is important because it addresses a unique aspect of the SDSC’s work. “SecureKG is about the tools that will enable research when there are issues related to sensitive data, and these issues are going to arise, especially as the SDSC has ambitions in the area of personalized medicine.”
Ford says he has high hopes for the impact of SecureKG on data science research carried out at the SDSC – which was founded in 2017 as a joint venture between EPFL and ETH Zurich – in the coming year. “We’d like the SecureKG project to produce extremely strong, provable decentralized security for sensitive data and computations that scientists entrust the RENKU platform with,” he says.