ENQUIRE PROJECT DETAILS BY GENERAL PUBLIC

Project Details

Funding Scheme :

General Research Fund

Project Number :

17200418

Project Title(English) :

Private and I/O-Efficient Concurrent Data Structures and Parallel Algorithms

Project Title(Chinese) :

隱私及輸入/輸出高效的並行數據結構與算法

Principal Investigator(English) :

Dr Chan, Hubert Tsz Hong

Principal Investigator(Chinese) :

Department :

School of Computing and Data Science

Institution :

The University of Hong Kong

E-mail Address :

hubert@cs.hku.hk

Tel :

28578461

Co - Investigator(s) :

Panel :

Engineering

Subject Area :

Computing Science & Information Technology

Exercise Year :

2018 / 19

Fund Approved :

693,000

Project Status :

Completed

Completion Date :

9-11-2022

Project Objectives :

Explore concurrency and parallelism in private data structures and algorithms. We will design some basic oblivious parallel algorithms as fundamental building blocks for storage schemes, and study the tradeoffs between depth and total work.

Achieve IO-efficiency for private mechanisms. For instance, we will consider cache-agnostic subroutines in our schemes that can achieve optimal IO-efficiency without explicitly knowing the parameters of the underlying architecture.

Study more general notions of security in storage schemes. Going beyond conventional ORAMs, we will consider differential privacy as a more relaxed security requirement to allow better performance in practice.

Abstract as per original application
(English/Chinese):

This project will explore the theoretical foundations and applications of private data structures and algorithms, whose goal is to prevent the leakage of sensitive information through the data access pattern, which cannot be obscured by just using encryption. We will extend existing private storage models to capture practical aspects such as concurrency and I/O (input/output)-efficiency. Since the design of modern architectures ubiquitously employs parallelism and memory hierarchy, our models will have impacts on real-world applications such as multi-core processors and outsourced cloud platforms. Moreover, we will develop more relaxed security notions to achieve tradeoffs between performance and privacy levels. The classical oblivious RAM (ORAM) proposed by Goldreich and Ostrovsky allows data to be accessed obliviously such that the access pattern does not leak any information other than the length of the operation sequence. Even though the performance of ORAM has been improved in the past decade, the model does not readily support parallel architectures. To rectify the situation, the oblivious parallel RAM (OPRAM) has been proposed in a recent TCC 2016 paper. In our preliminary results that will appear in the upcoming TCC 2017 and ASIACRYPT 2017, we have achieved OPRAMs whose performance matches the state-of-the-art ORAMs. Our preliminary results inspire us to go beyond OPRAMs and further investigate private concurrent data structures supporting multiple requests and their parallel implementations. Besides parallelism, memory hierarchy plays a major role in practice. Our preliminary results on I/O-efficient cache-agnostic oblivious sorting algorithms, which will appear in the upcoming SODA 2018, equip us with powerful techniques to further explore the I/O-efficiency of our private mechanisms. Furthermore, we will also explore new notions of privacy in storage schemes that allow users to balance between their privacy preference and performance guarantee. Specifically, our recently developed techniques enable us to tackle the following challenging areas: (1) Explore concurrency and parallelism in private data structures and algorithms. We will design some basic oblivious parallel algorithms as fundamental building blocks for storage schemes, and study the tradeoff between depth and total work. (2) Achieve I/O-efficiency for private mechanisms. For instance, we will consider cache-agnostic subroutines in our private schemes that can achieve optimal I/O-efficiency without explicitly knowing the parameters of the underlying architecture. (3) Study more general notions of security in storage schemes. Going beyond conventional ORAMs, we will consider differential privacy as a more relaxed security requirement to allow better performance in practice.

此項目將探索隱私數據結構與算法的理論基礎與應用，旨在通過數據存取模式防止敏感信息的洩露，該目標不能僅通過加密完成。我們將延展已有的隱私存儲模型來刻畫實際應用中的特徵，例如並行性與輸入/輸出高效性。由於現代結構設計普適採用了並行性與分級存儲器體系，我們的模型將對包括多核處理器、外包雲平台在內的多項實際應用中產生重要的影響。此外，我們發展更鬆弛的安全概念，以在性能與隱私級別中進行平衡。由Goldreich和Ostrovsky提出的經典不經意隨機存取器(ORAM)使得數據可以被不經意的存取，其中存儲模式不會洩露除運算序列長度以外的任何信息。儘管在過去的十年中，ORAM的性能被提升，該模型並不能直接的支持並行結構。為了改變以上情況，不經意並行隨機存取器（OPRAM）在最近的TCC2016論文中被提出。在我們即將發表的TCC2017及ASIACRYPT2017論文中，我們得到了與最先進ORAM性能相同的OPRAM。這些初步成果啓發我們在OPRAM之外進一步探索支持多任務的隱私並行數據結構與他們的並行實現。除了並行性，分級存儲器體系在實際應用中也發揮著重要的作用。在即將發表的SODA2018論文中，我們提出了輸入/輸出高效、緩存不可知不經意排序算法，該成果給與我們強大的技術以繼續探索隱私機制中的輸入/輸出高效性。進一步，我們將探索在存儲方案中的新隱私概念，允許用戶在隱私偏好與性能保障中進行平衡。特別的，我們最近發展的技術使得我們可以克服以下挑戰：（1）在隱私數據結構與算法中探索並行性。我們將設計基礎的不經意並行算法作為存儲方案的基石，並研究深度與總工作之間的平衡。（2）在隱私機制中達到輸入/輸出高效性。例如，我們將考慮隱私方案中緩存不可知子程序，使得算法可以在沒有底層結構變量信息的情況下達到最優輸入/輸出高效性。（3）研究存儲方案中更一般的安全概念。超越傳統ORAM，我們將考慮可微分隱私作為更鬆弛的安全要求，以在實際應用中取得更好的性能。

Realisation of objectives:

Objective 1 We have explored concurrency and parallelism in private data structures and algorithms. One notable example is our study of Oblivious Parallel RAM (OPRAM). Our scheme, which offers perfect security, has demonstrated improvements in total work overhead and parallel depth compared to previous works. This advancement can be attributed to the use of special expander graphs for derandomizing earlier statistically secure OPRAM techniques, presenting a novel approach in the construction of ORAMs/OPRAMs. Our findings were published in TCC 2018. Moreover, we have developed distributed algorithms to compute approximate solutions for several related graph optimization problems. Our algorithms feature round complexity that is logarithmic in the number of nodes in the underlying graph, independent of the graph diameter. By utilizing a primal-dual approach, our algorithm can compute coreness values of nodes with approximation ratios close to 2. Additionally, our distributed algorithm can solve the min-max edge orientation problem with a similarly close approximation ratio. We have provided lower bounds that demonstrate the tightness of our approximation guarantee and round complexity. This research, which explores a weaker version of the densest subset problem, was published in IPDPS 2019 and the Journal of Parallel Distributed Computing 2021. Our work on Byzantine Agreement (BA) has also contributed to a better understanding of its round complexity. Despite previous research, significant gaps remained in comprehending BA under a corrupt majority. We have made progress in this area by developing a corrupt-majority BA protocol with sublinear round complexity, consistent with a high probability, and capable of tolerating a substantial fraction of corrupt nodes. Our protocol is secure against adaptive adversaries and offers an upper bound that is optimal up to a logarithmic factor. These results were published in PKC 2020. Objective 2 We have focused on achieving I/O efficiency for private mechanisms. To address this challenge, we have explored various approaches, including multi-server settings in the context of Oblivious RAM (ORAM). Traditionally, ORAM has been studied in a single-server setting with tree-based approaches that offer good bandwidth overhead but lack I/O efficiency. By considering a multi-server setting, we have developed a 3-server ORAM scheme that not only achieves I/O efficiency but also provides perfect security and O(log^2 N) total work overhead. This performance surpasses the best-known single-server scheme by a logarithmic factor. Our study also demonstrates that specific problems, such as obliviously merging two sorted arrays, can benefit from multiple servers in overcoming known lower bounds in the single-server setting. Our findings were published in ASIACRYPT 2018. We have also investigated locality-preserving ORAMs to address the issue of conventional ORAM schemes disrupting the locality of data accesses. Our research has led to the development of a locality-preserving ORAM with poly-logarithmic overhead in terms of both bandwidth and locality. By examining the trade-off between locality, bandwidth, and leakage, we have discovered that schemes preserving locality without revealing the lengths of contiguous memory regions accessed suffer from prohibitive bandwidth. Our approach utilizes a locality-friendly oblivious sorting algorithm, and the results were published in EUROCRYPT 2019 and the Journal of Cryptology 2022. We have revisited the problem of low-memory robust simulation of interactive protocols over noisy channels, particularly in the context of client-server interactive protocols and multi-server Private Information Retrieval (PIR) schemes. Our proposed information-theoretic technique converts any correct PIR protocol that assumes reliable channels into a protocol that is both correct and private in the presence of a noisy channel while minimizing space complexity. Our technique addresses the challenges of previous approaches by saving hashes of prefixes of past transcripts, resulting in a more robust simulation. These results were published in SODA 2020. Furthermore, our research has led to the development of bucket oblivious sort and bucket oblivious random permutation algorithms, which offer a conceptually simple and efficient approach to oblivious sorting. The bucket oblivious sort algorithm demonstrates a runtime that is only three times slower than a non-oblivious merge sort baseline, making it significantly faster than the bitonic sort. Our results were published in SOSA 2020. Objective 3 We have investigated more general notions of security in storage schemes. Moving beyond conventional ORAMs, we have explored differential privacy as a more relaxed security requirement, which allows for improved performance in practice. Our work considers the notion of oblivious simulation, which prevents the access pattern from revealing any information about the input. However, due to the well-known logarithmic ORAM lower bound by Goldreich and Ostrovsky, achieving this strict notion presents certain limitations. Inspired by differential privacy, we have introduced the concept of differential obliviousness, which ensures that the distribution of the access pattern does not change significantly when the input is slightly altered. By examining classical problems such as sorting small-length keys, merging two sorted lists, and range query data structures, we have demonstrated that adopting differential obliviousness with reasonable parameters can circumvent several impossibilities related to full obliviousness. Conversely, we have shown that for more stringent parameter choices, the same lower bounds for oblivious algorithms still apply. Our findings have been published in SODA 2019 and JACM 2022. In addition to differential obliviousness, we have investigated Massively Parallel Computation (MPC) as a model of computation that captures realistic parallel computing architectures, such as large-scale MapReduce and Hadoop clusters. Given that many data analytics tasks on these platforms involve sensitive user data, our research has explored how to leverage MPC architectures for efficient, privacy-preserving computation over massive data sets. We have established that any task that can be efficiently computed on MPC can also be securely computed with comparable efficiency. Specifically, we have shown that any MPC algorithm can be compiled into a communication-oblivious counterpart, preserving its round and space complexity. Communication-obliviousness ensures that any network intermediary observing the communication patterns learns no information about the secret inputs. Furthermore, we have demonstrated that, assuming the existence of Fully Homomorphic Encryption with a suitable notion of compactness and other standard cryptographic assumptions, any MPC algorithm can be compiled to a secure counterpart that defends against adversaries controlling not only intermediate network routers but also up to a less than 1/3 fraction of machines. This compilation preserves the round complexity tightly and the space complexity up to a multiplicative security parameter-related blowup. These results have been published in ITCS 2020.

Summary of objectives addressed:

	Objectives	Addressed	Percentage achieved
1.	as per 5.1 above	Yes	100%
2.	as per 5.1 above	Yes	100%
3.	as per 5.1 above	Yes	100%

Research Outcome

Major findings and research outcome:

For Objective 1, we have investigated concurrency and parallelism within private data structures and algorithms, including Oblivious Parallel RAM (OPRAM) with perfect security, showcasing improvements in overhead and parallel depth (TCC 2018). We have also devised distributed algorithms for graph optimization issues featuring logarithmic round complexity and tight approximation guarantees (IPDPS 2019, Journal of Parallel Distributed Computing 2021). Our research on Byzantine Agreement (BA) has enriched our comprehension of its round complexity by formulating a corrupt-majority BA protocol with sublinear round complexity and optimal upper bounds (PKC 2020). Addressing Objective 2, we have attained I/O efficiency for private mechanisms by examining multi-server settings in Oblivious RAM (ORAM) and creating a 3-server ORAM scheme with perfect security and enhanced performance (ASIACRYPT 2018). Furthermore, we have designed locality-preserving ORAMs, generating a scheme with poly-logarithmic overhead in both bandwidth and locality (EUROCRYPT 2019, Journal of Cryptology 2022). We also revisited low-memory robust simulation of interactive protocols over noisy channels, introducing an information-theoretic technique for multi-server Private Information Retrieval (PIR) schemes (SODA 2020). Finally, we designed bucket oblivious sort and bucket oblivious random permutation algorithms for a more efficient oblivious sorting approach (SOSA 2020). Regarding Objective 3, we have delved into differential privacy in storage schemes and utilized differential obliviousness, illustrating its potential to overcome full obliviousness limitations (SODA 2019, JACM 2022). We have also explored Massively Parallel Computation (MPC) for privacy-preserving computation on vast data sets, demonstrating that any MPC algorithm can be transformed into a communication-oblivious counterpart while maintaining similar efficiency and security (ITCS 2020). This project has also mentored two PhD students. One thesis examined concurrency and parallelism in combinatorial problems in graphs and hypergraphs, emphasizing well-connected parts detection and algorithm development for various problems, including k-core decomposition and the k-clique densest subgraph problem. The other thesis concentrated on communication-efficient low-memory robust simulation of client-server interactive protocols over noisy channels. Lastly, this project has fostered collaboration with global academic institutions in the US, Europe, Israel, and Asia, as well as industry research labs.

Potential for further development of the research
and the proposed course of action:

Future research will continue to refine Oblivious Parallel RAM (OPRAM) and other private data structures and algorithms for enhanced security and efficiency, potentially incorporating machine learning techniques. Expanding distributed algorithms for graph optimization, such as community detection and network flow, will facilitate applications in real-world scenarios like social networks and transportation systems. Investigating the scalability and adaptability of Byzantine Agreement (BA) protocols in various network conditions and applications, including blockchain and secure multiparty computation, should continue to be investigated. For instance, optimizing I/O efficiency for private mechanisms can be achieved by exploring server configurations and access strategies in Oblivious RAM (ORAM). Developing novel privacy-preserving techniques and tools for handling sensitive data can be accomplished by applying concepts of differential privacy and differential obliviousness to domains like machine learning and data analytics. Creating a comprehensive framework for Massively Parallel Computation (MPC) algorithms that integrates with existing data processing systems can ensure privacy-preserving computation without sacrificing performance and scalability. Moreover, establishing collaborations with research groups and industry partners can facilitate real-world applications, such as finance, healthcare, and smart cities. For example, engaging with regulatory bodies and policymakers is vital for promoting privacy-preserving computation and developing implementation standards and guidelines across sectors.

Layman's Summary of
Completion Report:

This project aimed to protect sensitive information in computer systems by researching and developing advanced data storage methods and algorithms. The primary focus was on enhancing the way data is accessed and processed while maintaining privacy and security. We have designed innovative techniques for handling data, allowing multiple tasks to be performed simultaneously, thereby increasing efficiency without compromising privacy. This has improved our understanding of securely processing data in complex systems such as cloud platforms. We have achieved enhanced efficiency in managing data input and output, which is crucial for modern computer systems. By creating new, more efficient algorithms, sensitive data remains secure even when accessed by multiple users. We have explored novel ways of securing data that balance privacy and performance. By considering alternative privacy models, we have been able to develop more efficient and flexible solutions for real-world applications. In summary, this project has made significant advancements in the field of data security, with wide-ranging applications in modern computing systems, such as multi-core processors and cloud platforms. The research has provided valuable insights and practical solutions for maintaining privacy and security while optimizing the performance of data storage and processing systems.

Research Output

Peer-reviewed journal publication(s)
arising directly from this research project :
(* denotes the corresponding author)

Year of Publication	Author(s)	Title and Journal/Book	Accessible from Institution Repository
2021	T.-H. Hubert Chan*, Mauro Sozio, Bintao Sun	Distributed approximate k-core decomposition and min-max edge orientation: Breaking the diameter barrier	No
2022	T.-H. Hubert Chan*, Kai-Min Chung, Bruce M. Maggs, Elaine Shi	Foundations of Differentially Oblivious Algorithms	No
2022	Gilad Asharov*, T.-H. Hubert Chan, Kartik Nayak, Rafael Pass, Ling Ren, Elaine Shi	Locality-Preserving Oblivious RAM	No

Recognized international conference(s)
in which paper(s) related to this research
project was/were delivered :

Month/Year/City	Title	Conference Name
Panaji, India	Perfectly Secure Oblivious Parallel RAM	Theory of Cryptography - 16th International Conference
Brisbane, Australia	More is Less: Perfectly Secure Oblivious Algorithms in the Multi-server Setting	Advances in Cryptology - ASIACRYPT 2018 - 24th International Conference on the Theory and Application of Cryptology and Information Security
San Diego, USA	Foundations of Differentially Oblivious Algorithms	Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms
Rio de Janeiro	Distributed Approximate k-Core Decomposition and Min-Max Edge Orientation: Breaking the Diameter Barrier	IEEE International Parallel and Distributed Processing Symposium
Darmstadt	Locality-Preserving Oblivious RAM	Advances in Cryptology - EUROCRYPT 2019 - 38th Annual International Conference on the Theory and Applications of Cryptographic Techniques
Seattle	MPC for MPC: Secure Computation on a Massively Parallel Computing Architecture	Innovations in Theoretical Computer Science Conference
Salt Lake City	Small Memory Robust Simulation of Client-Server Interactive Protocols over Oblivious Noisy Channels	ACM-SIAM Symposium on Discrete Algorithms
Salt Lake City	Bucket Oblivious Sort: An Extremely Simple Oblivious Sort	Symposium on Simplicity in Algorithms
Edinburgh	Sublinear-Round Byzantine Agreement Under Corrupt Majority	IACR International Conference on Practice and Theory of Public-Key Cryptography

Other impact
(e.g. award of patents or prizes,
collaboration with other research institutions,
technology transfer, etc.):

SCREEN ID: SCRRM00542