Client-Efficient Online-Offline Private Information Retrieval
https://iacr.org/news/item/23111

Hoang-Dung Nguyen, Jorge Guajardo, Thang Hoang
Private Information Retrieval (PIR) permits clients to query entries from a public database hosted on untrusted servers in a privacy-preserving manner. The traditional PIR model suffers from high computation and/or bandwidth costs because the entire database must be processed to preserve privacy. Recently, Online-Offline PIR (OO-PIR) has been suggested to improve the practicality of PIR: query-independent material is precomputed beforehand to accelerate online access. While state-of-the-art OO-PIR schemes (e.g., S&P’24, CRYPTO’23) successfully reduce the online processing overhead to sublinear, they still impose substantial bandwidth and storage burdens on the client, especially when operating on large databases.
In this paper, we propose Pirex, a new OO-PIR scheme with excellent client performance that maintains sublinear server processing. Specifically, Pirex offers clients sublinear processing, minimal inbound bandwidth, and low storage requirements. The Pirex design is fairly simple yet efficient: the majority of operations are naturally low-cost and streamlined (e.g., XOR, PRF, modular arithmetic).
We have fully implemented Pirex and evaluated its real-world performance on commodity hardware. Our experimental results demonstrate that Pirex outperforms existing OO-PIR schemes by at least two orders of magnitude. Concretely, with a 1 TB database, Pirex takes only 0.8 s to query a 256-KB entry, compared with 30-220 s for the state of the art.
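To make the online-offline split concrete, below is a minimal two-server, hint-based OO-PIR toy in Python: the offline phase streams the database once to precompute XOR parities of random index sets, and the online phase answers a query with a handful of XORs. This is a sketch of the generic paradigm only (set sizes, hint counts, and the missing hint-refresh step are simplifications), not the Pirex protocol itself.

```python
# Toy two-server offline/online PIR over a bit database (illustration only; not Pirex).
import math
import random

random.seed(1)

def offline_setup(db):
    """Offline phase: the client stores random index sets and their XOR parities."""
    n = len(db)
    set_size = int(math.isqrt(n))
    hints = []
    for _ in range(6 * set_size):          # enough hints that most indices are covered
        s = set(random.sample(range(n), set_size))
        parity = 0
        for j in s:
            parity ^= db[j]                # the offline server streams the whole DB once
        hints.append((s, parity))
    return hints

def online_query(db, hints, i):
    """Online phase: spend one hint containing i; the online server sees a set without i."""
    for s, parity in hints:
        if i in s:
            punctured = s - {i}            # what the online server learns
            answer = 0
            for j in punctured:            # online work: |s|-1 XORs, sublinear in n
                answer ^= db[j]
            return parity ^ answer         # recover db[i]
    raise RuntimeError("no hint covers index i; rerun the offline phase")

db = [random.randint(0, 1) for _ in range(1024)]
hints = offline_setup(db)
i = next(iter(hints[0][0]))                # pick an index that is certainly covered
assert online_query(db, hints, i) == db[i]
```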
Private Computations on Streaming Data
https://eprint.iacr.org/2024/698

We present a framework for privacy-preserving streaming algorithms, which combine the memory efficiency of streaming algorithms with strong privacy guarantees. These algorithms enable a set of servers to compute aggregate statistics efficiently over large quantities of user data without learning the users' inputs. While limited prior work fits within our model, ours is the first to formally define a general framework, interpret existing methods within it, and develop new tools broadly applicable to this model. To highlight our model, we designed and implemented a new privacy-preserving streaming algorithm to compute heavy hitters, the most frequent elements in a data stream. We benchmarked our system against Poplar, the only other private statistics system that supports heavy hitters, providing direct performance comparisons on the same hardware platform. Of note, Poplar requires linear space compared to our poly-logarithmic space, making our system the first to compute heavy hitters within the privacy-preserving streaming model. A small memory footprint allows our algorithm (among other benefits) to run efficiently on very large input sizes without running out of memory or crashing.
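For reference on the streaming side, the sketch below is a standard (non-private) Misra-Gries heavy-hitters summary, which keeps only O(k) counters instead of space linear in the stream length; the paper's contribution is achieving this kind of small-space computation while the servers learn nothing about individual inputs. Parameters here are illustrative.

```python
# Misra-Gries sketch: every element with frequency > n/k survives in the counter set.
def misra_gries(stream, k):
    counters = {}
    for x in stream:
        if x in counters:
            counters[x] += 1
        elif len(counters) < k - 1:
            counters[x] = 1
        else:
            for key in list(counters):      # decrement all counters; drop those at zero
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters

stream = ["a"] * 60 + ["b"] * 30 + list("cdefghij") * 2   # n = 106
print(misra_gries(stream, k=4))   # "a" and "b" (frequency > n/k) are guaranteed to survive
```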
An Improved Threshold Homomorphic Cryptosystem Based on Class Groups
https://iacr.org/news/item/23109

Lennart Braun, Guilhem Castagnos, Ivan Damgård, Fabien Laguillaumie, Kelsey Melissaris, Claudio Orlandi, Ida Tucker
We present distributed key generation and decryption protocols for an additively homomorphic cryptosystem based on class groups, improving on a similar system proposed by Braun, Damgård, and Orlandi at CRYPTO '23. Our key generation is similarly constant-round but achieves lower communication complexity than the previous work. This improvement is in part the result of relaxing the reconstruction property required of the underlying integer verifiable secret sharing scheme, which eliminates the reliance on potentially costly proofs of knowledge in unknown-order groups. We present a new method to batch zero-knowledge proofs in unknown-order groups which strengthens these improvements. We also present a protocol which is proven secure against adaptive adversaries in the single inconsistent player (SIP) model. Our protocols are secure in the universal composability (UC) framework and provide guaranteed output delivery. We demonstrate the relative efficiency of our techniques by presenting the running times and communication costs of our implementation of the statically secure protocol, along with a direct comparison with alternative state-of-the-art constructions.
An Efficient All-to-All GCD Algorithm for Low Entropy RSA Key Factorization
https://eprint.iacr.org/2024/699

RSA is an incredibly successful and widely used asymmetric encryption algorithm. One class of implementation flaws in RSA is low entropy in key generation, specifically in the prime number creation stage. This can occur due to flawed usage of random prime number generator libraries, or on computers that lack a source of external entropy. These implementation flaws result in some RSA keys sharing prime factors, which means that the full factorization of the public modulus can be recovered extremely efficiently by computing the GCD of the two public moduli that share a prime factor. However, since one does not know a priori which composite moduli share a prime factor, an all-to-all GCD attack (also known as a batch GCD or bulk GCD attack) can be performed on the available public keys to recover any shared prime factors. This study describes a novel all-to-all batch GCD algorithm, referred to as the binary tree batch GCD algorithm, that is more efficient than the current best batch GCD algorithm (the remainder tree batch GCD algorithm). A comparison against the best existing batch GCD method (a product tree followed by a remainder tree computation) is given using a dataset of random RSA moduli constructed such that some of the moduli share prime factors. The proposed binary tree batch GCD algorithm has better runtime than the existing remainder tree batch GCD algorithm, although asymptotically the two have nearly identical scaling, and its complexity depends on how many shared prime factors exist in the set of RSA keys. In practice, our implementation of the proposed binary tree batch GCD algorithm achieves roughly a 6x speedup over the standard remainder tree batch GCD approach.
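For reference, the Python sketch below implements the baseline the paper compares against: the standard product-tree/remainder-tree batch GCD, which surfaces primes shared across many RSA moduli without computing all pairwise GCDs. It is not the proposed binary tree variant, and the toy moduli are chosen only so that a shared factor exists.

```python
# Product-tree / remainder-tree batch GCD (the baseline remainder tree approach).
from math import gcd

def product_tree(moduli):
    tree = [list(moduli)]
    while len(tree[-1]) > 1:
        level = tree[-1]
        tree.append([level[i] * level[i + 1] if i + 1 < len(level) else level[i]
                     for i in range(0, len(level), 2)])
    return tree

def batch_gcd(moduli):
    tree = product_tree(moduli)
    remainders = [tree[-1][0]]                     # root = product of all moduli
    for level in reversed(tree[:-1]):              # descend, reducing modulo node^2
        remainders = [remainders[i // 2] % (n * n) for i, n in enumerate(level)]
    # P mod N^2 = N * ((P/N) mod N), so gcd(r // N, N) reveals any prime N shares with another modulus
    return [gcd(r // n, n) for r, n in zip(remainders, moduli)]

# Moduli 0 and 2 share the prime 10007.
moduli = [10007 * 10009, 10037 * 10039, 10007 * 10061]
print(batch_gcd(moduli))   # [10007, 1, 10007]
```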
Unclonable Secret Sharing
https://iacr.org/news/item/23108

Prabhanjan Ananth, Vipul Goyal, Jiahui Liu, Qipeng Liu
Unclonable cryptography utilizes the principles of quantum mechanics to address cryptographic tasks that are impossible classically. We introduce a novel unclonable primitive in the context of secret sharing, called unclonable secret sharing (USS). In a USS scheme, there are $n$ shareholders, each holding a share of a classical secret represented as a quantum state. They can recover the secret once all parties (or at least $t$ parties) come together with their shares. Importantly, it should be infeasible for a shareholder to copy its own share and send the copies to two non-communicating parties such that both of them can recover the secret.


Our work initiates a formal investigation into the realm of unclonable secret sharing, shedding light on its implications, constructions, and inherent limitations.
**Connections: We explore the connections between USS and other quantum cryptographic primitives such as unclonable encryption and position verification, showing the difficulty of achieving USS in different scenarios.

**Limited Entanglement: In the case where the adversarial shareholders share no entanglement, or only limited entanglement, we demonstrate information-theoretic constructions for USS.


**Large Entanglement: If we allow the adversarial shareholders to have unbounded entanglement resources (and unbounded computation), we prove that unclonable secret sharing is impossible. On the other hand, in the quantum random oracle model where the adversary can only make a bounded polynomial number of queries, we show a construction secure even with unbounded entanglement.

Furthermore, even when these adversaries possess only a polynomial amount of entanglement resources, we establish that any unclonable secret sharing scheme whose reconstruction function can be implemented with Clifford gates and logarithmically many T-gates is also unattainable.
Sublinear Distributed Product Checks on Replicated Secret-Shared Data over $\mathbb{Z}_{2^k}$ without Ring Extensions
https://eprint.iacr.org/2024/700

Multiple works have designed or used maliciously secure honest-majority MPC protocols over $\mathbb{Z}_{2^k}$ using replicated secret sharing (e.g., Koti et al., USENIX’21, and the references therein). A recent trend in the design of such MPC protocols is to first execute a semi-honest protocol, and then run a check that verifies the correctness of the computation while requiring only a sublinear amount of communication in terms of the circuit size. So-called Galois ring extensions are needed in order to execute such checks over $\mathbb{Z}_{2^k}$, but these rings incur very high computation overheads, which completely undermine the potential benefits of working over $\mathbb{Z}_{2^k}$ in the first place.
In this work we revisit the task of designing sublinear distributed product checks on replicated secret-shared data over $\mathbb{Z}_{2^k}$ among three parties with an honest majority. We present a novel technique for verifying the correctness of a set of multiplication (in fact, inner product) triples at a cost sublinear in the number of multiplications. Most importantly, unlike previous works, our tools entirely avoid Galois ring extensions and only require computation over rings of the form $\mathbb{Z}_{2^l}$. In terms of communication, our checks are 3–5× lighter than existing checks using ring extensions, which is already quite remarkable. However, our most noticeable improvement is in terms of computation: avoiding extensions allows our checks to be 17.7–44.2× faster than previous approaches for many parameter regimes of interest. Our experimental results show that checking a 10-million-gate circuit with the 3PC protocol from (Boyle et al., CCS’19) takes about two minutes, while our approach takes only 2.82 seconds.
Finally, our techniques are not restricted to the three-party case, and we generalize them to replicated secret-sharing with an arbitrary number of parties n. Even though the share size in this scheme grows exponentially with n, prior works have used it for n = 4 or n = 5—or even general n for feasibility results—and our distributed checks also represent improvements in these contexts.
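For readers unfamiliar with the setting, the sketch below shows plain three-party replicated secret sharing over $\mathbb{Z}_{2^k}$, the data representation the product checks operate on: sharing, reconstruction, and non-interactive addition. The sublinear verification technique itself is not reproduced here; the modulus and the party views are illustrative.

```python
# Minimal 3-party replicated secret sharing over Z_{2^k} (toy; no networking, no malicious checks).
import random

K = 64
MOD = 1 << K

def share(x):
    """x = x1 + x2 + x3 (mod 2^k); party i holds the pair of shares *other* than x_i."""
    x1, x2 = random.randrange(MOD), random.randrange(MOD)
    x3 = (x - x1 - x2) % MOD
    return [(x2, x3), (x3, x1), (x1, x2)]      # views of parties 1, 2, 3

def reconstruct(views):
    (x2, x3), (_, x1) = views[0], views[1]     # any two parties hold all three shares
    return (x1 + x2 + x3) % MOD

def add_local(views_a, views_b):
    """Addition is non-interactive: each party adds its two shares componentwise."""
    return [((a0 + b0) % MOD, (a1 + b1) % MOD)
            for (a0, a1), (b0, b1) in zip(views_a, views_b)]

a, b = 1234567, 7654321
assert reconstruct(add_local(share(a), share(b))) == (a + b) % MOD
```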
Quantum Unpredictability
https://eprint.iacr.org/2024/701

Unpredictable functions (UPFs) play essential roles in classical cryptography, including message authentication codes (MACs) and digital signatures. In this paper, we introduce a quantum analog of UPFs, which we call unpredictable state generators (UPSGs). UPSGs are implied by pseudorandom function-like state generators (PRFSs), a quantum analog of pseudorandom functions (PRFs), and therefore UPSGs could exist even if one-way functions do not, similar to other recently introduced primitives like pseudorandom state generators (PRSGs), one-way state generators (OWSGs), and EFIs. In classical cryptography, UPFs are equivalent to PRFs, but in the quantum case the equivalence is not clear, and UPSGs could be weaker than PRFSs. Despite this, we demonstrate that all known applications of PRFSs are also achievable with UPSGs, including IND-CPA-secure secret-key encryption and EUF-CMA-secure MACs with unclonable tags. Our findings suggest that, for many applications, quantum unpredictability, rather than quantum pseudorandomness, is sufficient.
Security Analysis of Signal's PQXDH Handshake
https://eprint.iacr.org/2024/702

Signal recently deployed a new handshake protocol named PQXDH to protect against "harvest-now-decrypt-later" attacks of a future quantum computer. To this end, PQXDH adds a post-quantum KEM to the Diffie-Hellman combinations of the prior X3DH handshake.

In this work, we give a reductionist security analysis of Signal's PQXDH handshake in a game-based security model that captures the targeted "maximum-exposure" security, allowing fine-grained compromise of users' long-term, semi-static, and ephemeral key material. We augment prior such models to capture not only the added KEM component but also the signing of public keys, which prior analyses omitted and which adds an additional flavor of post-quantum security to PQXDH. We then establish a fully parameterized, concrete security bound for the session key security of PQXDH, in particular shedding light on a KEM binding property we require for PQXDH's security, and how the need for it can be avoided.

Our discussion of KEM binding complements the tool-based analysis of PQXDH by Bhargavan, Jacomme, Kiefer, and Schmidt, which pointed out a potential re-encapsulation attack if the KEM shared secret does not bind the public key. We show that both Kyber (used in PQXDH) and its current NIST draft standard ML-KEM (foreseen to replace Kyber once standardized) satisfy a novel binding notion we introduce and rely on for our PQXDH analysis, which may be of independent interest.
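As a rough illustration of the binding issue, the hedged sketch below combines the DH shared secrets and the KEM shared secret in one KDF and additionally hashes in the KEM public key and ciphertext, so the derived session key commits to them and a re-encapsulation under a different key yields a different session key. The function name, labels, and layout are assumptions for exposition, not Signal's PQXDH specification.

```python
# Sketch of a PQXDH-style key combination with explicit KEM binding (illustrative only).
import hashlib
import hmac

def derive_session_key(dh_secrets: list, kem_shared: bytes,
                       kem_public_key: bytes, kem_ciphertext: bytes,
                       transcript: bytes) -> bytes:
    ikm = b"".join(dh_secrets) + kem_shared
    # Binding: fold the KEM public key and ciphertext (and the handshake transcript)
    # into the derivation so the session key commits to them.
    context = hashlib.sha256(kem_public_key + kem_ciphertext + transcript).digest()
    prk = hmac.new(context, ikm, hashlib.sha256).digest()                     # HKDF-Extract
    return hmac.new(prk, b"pqxdh-session-key\x01", hashlib.sha256).digest()   # HKDF-Expand, 1 block
```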
An Efficient and Extensible Zero-knowledge Proof Framework for Neural Networks
https://eprint.iacr.org/2024/703

In recent years, cloud vendors have started to supply paid services for data analysis by providing interfaces of their well-trained neural network models. However, customers lack tools to verify whether outcomes supplied by cloud vendors are correct inferences from particular models, in the face of lazy or malicious vendors. The cryptographic primitive called zero-knowledge proof (ZKP) addresses this problem. It enables the outcomes to be verifiable without leaking information about the models. Unfortunately, existing ZKP schemes for neural networks have high computational overheads, especially for the non-linear layers in neural networks.

In this paper, we propose an efficient and extensible ZKP framework for neural networks. Our work improves the performance of the proofs for non-linear layers. Compared to previous works relying on bit decomposition, we convert complex non-linear relations into range and exponent relations, which significantly reduces the number of constraints required to prove non-linear layers. Moreover, we adopt a modular design to make our framework compatible with more neural networks. Specifically, we propose two enhanced range and lookup proofs as basic building blocks, which are efficient in proving the satisfaction of range and exponent relations. We then constrain the correct computation of primitive non-linear operations using a small number of range and exponent relations. Finally, we build our ZKP framework from the primitive operations up to entire neural networks, offering the flexibility to extend to various neural networks.
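As a toy example of the kind of constraint this enables, a quantized ReLU can be expressed with two range assertions and one product equation rather than a full bit decomposition; the plain-Python check below illustrates that arithmetization. It is a simplified illustration of the idea, not the paper's constraint system, and the bound B is an assumed quantization range.

```python
# ReLU(x) = y is captured by: y in [0, B), y - x in [0, B), and y * (y - x) = 0,
# assuming the quantized input satisfies -B < x < B.
B = 1 << 16

def relu_constraints_hold(x: int, y: int) -> bool:
    in_range_y = 0 <= y < B            # range assertion 1
    in_range_diff = 0 <= y - x < B     # range assertion 2
    product_zero = y * (y - x) == 0    # forces y = 0 or y = x
    return in_range_y and in_range_diff and product_zero

assert relu_constraints_hold(1234, 1234)     # positive input: y = x
assert relu_constraints_hold(-777, 0)        # negative input: y = 0
assert not relu_constraints_hold(-777, 42)   # a wrong claimed output is rejected
```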

We implement our ZKPs for convolutional and transformer neural networks. The evaluation results show that our work achieves over $168.6\times$ (up to $477.2\times$) speedup for separated non-linear layers and $41.4\times$ speedup for the entire ResNet-101 convolutional neural network, when compared with the state-of-the-art work, Mystique. In addition, our work can prove GPT-2, a transformer neural network with $117$ million parameters, in $287.1$ seconds, achieving $35.7\times$ speedup over ZKML, which is a state-of-the-art work supporting transformer neural networks.
Fully Automated Selfish Mining Analysis in Efficient Proof Systems Blockchains
https://eprint.iacr.org/2024/704

We study selfish mining attacks in longest-chain blockchains like Bitcoin, but where the proof of work is replaced with efficient proof systems -- like proofs of stake or proofs of space -- and consider the problem of computing an optimal selfish mining attack which maximizes expected relative revenue of the adversary, thus minimizing the chain quality. To this end, we propose a novel selfish mining attack that aims to maximize this objective and formally model the attack as a Markov decision process (MDP). We then present a formal analysis procedure which computes an $\epsilon$-tight lower bound on the optimal expected relative revenue in the MDP and a strategy that achieves this $\epsilon$-tight lower bound, where $\epsilon>0$ may be any specified precision. Our analysis is fully automated and provides formal guarantees on the correctness. We evaluate our selfish mining attack and observe that it achieves superior expected relative revenue compared to two considered baselines.

Concurrent work [Sarenche FC'24] performs an automated analysis of selfish mining in predictable longest-chain blockchains based on efficient proof systems. Predictable means that the randomness for the challenges is fixed for many blocks (as used, e.g., in Ouroboros), while we consider unpredictable (Bitcoin-like) chains where the challenge is derived from the previous block.
Large-Scale MPC: Scaling Private Iris Code Uniqueness Checks to Millions of Users
https://eprint.iacr.org/2024/705

In this work we tackle privacy concerns in biometric verification systems that typically require server-side processing of sensitive data (e.g., fingerprints and Iris Codes). Concretely, we design a solution that allows us to query whether a given Iris Code is similar to one contained in a given database, while all queries and datasets are being protected using secure multiparty computation (MPC). Addressing the substantial performance demands of operational systems like World ID and aid distributions by the Red Cross, we propose new protocols to improve performance by more than three orders of magnitude compared to the recent state-of-the-art system Janus (S&P 24). Our final protocol can achieve a throughput of over a million Iris Code comparisons per second on a single CPU core, while protecting the privacy of both the query and database Iris Codes. We additionally investigate GPU acceleration for some building blocks of our protocol, which results in further speedups of over 38x compared to the respective multi-threaded CPU implementation.
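For context, the plaintext computation that the MPC protocol evaluates privately is essentially a masked fractional Hamming distance between binary iris codes compared against a threshold, as in the non-private baseline below. Code length, noise rate, and threshold are illustrative assumptions.

```python
# Plaintext iris-code comparison (the MPC protocol performs this without revealing inputs).
import numpy as np

def iris_match(code_a, code_b, mask_a, mask_b, threshold=0.36):
    valid = mask_a & mask_b                        # bits usable in both codes
    disagreements = (code_a ^ code_b) & valid
    distance = disagreements.sum() / max(int(valid.sum()), 1)
    return distance < threshold

rng = np.random.default_rng(0)
code = rng.integers(0, 2, size=12800, dtype=np.uint8)
noise = (rng.random(12800) < 0.05).astype(np.uint8)   # same eye, ~5% bit noise
mask = np.ones(12800, dtype=np.uint8)
assert iris_match(code, code ^ noise, mask, mask)      # noisy re-reading still matches
```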
Linicrypt in the Ideal Cipher Model
https://eprint.iacr.org/2024/706

We extend the Linicrypt framework for characterizing hash function security as proposed by McQuoid, Swope, and Rosulek (TCC 2018) to support constructions in the ideal cipher model.
In this setting, we give a characterization of collision- and second-preimage-resistance in terms of a linear-algebraic condition on Linicrypt programs, and present an efficient algorithm for determining whether a program satisfies the condition. As an application, we consider the block cipher-based hash functions proposed by Preneel, Govaerts, and Vandewalle (Crypto 1993), and show that the semantic analysis of PGV given by Black et al. (J. Cryptology 2010) can be captured as a special case of our characterization. In addition, we model hash functions constructed through the Merkle-Damgård transformation within the Linicrypt framework. Finally, we apply this model to an analysis of how various attacks on the underlying compression functions can compromise the collision resistance of the resulting hash function.
Towards a Polynomial Instruction Based Compiler for Fully Homomorphic Encryption Accelerators
https://eprint.iacr.org/2024/707

Fully Homomorphic Encryption (FHE) is a transformative technology that enables computations on encrypted data without requiring decryption, promising enhanced data privacy. However, its adoption has been limited due to significant performance overheads. Recent advances include the proposal of domain-specific, highly-parallel hardware accelerators designed to overcome these limitations.
This paper introduces PICA, a comprehensive compiler framework designed to simplify the programming of these specialized FHE accelerators and their integration with existing FHE libraries. PICA leverages a novel polynomial Instruction Set Architecture (p-ISA), which abstracts polynomial rings and their arithmetic operations and serves as the fundamental data type for generating compact, efficient code for high-level operations on polynomial rings, referred to as kernels (e.g., FHE primitives such as arithmetic and ciphertext management). We detail a kernel generation framework that translates high-level FHE operations into pseudo-code using p-ISA, and a subsequent tracing framework that incorporates p-ISA functionalities and kernels into established FHE libraries. Additionally, we introduce a mapper that coordinates multiple FHE kernels for optimal application performance on targeted hardware accelerators. Our evaluations demonstrate PICA's efficacy in generating compact and efficient code compared with an x64 architecture, particularly for complex FHE operations such as relinearization, where we observe a 25.24x instruction count reduction even with a large batch size (8192).
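To give a feel for what a p-ISA kernel abstracts, the toy below performs arithmetic on elements of $R_q = \mathbb{Z}_q[X]/(X^N+1)$, the basic data type of RLWE-based FHE. A real accelerator would run NTT-based multiplication; the schoolbook loop and parameters here are only to show the ring semantics.

```python
# Toy negacyclic polynomial arithmetic in Z_q[X]/(X^N + 1) (illustrative parameters).
Q = 8380417
N = 8

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def poly_mul(a, b):
    """Schoolbook multiplication modulo X^N + 1 (coefficients wrap with a sign flip)."""
    res = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < N:
                res[k] = (res[k] + ai * bj) % Q
            else:
                res[k - N] = (res[k - N] - ai * bj) % Q
    return res

a = [1, 2, 0, 0, 0, 0, 0, 0]   # 1 + 2X
b = [0, 0, 0, 0, 0, 0, 0, 3]   # 3X^7
print(poly_mul(a, b))          # 3X^7 + 6X^8 = -6 + 3X^7  (mod X^8 + 1)
```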
Automated Generation of Fault-Resistant Circuits
https://eprint.iacr.org/2024/708

Fault Injection (FI) attacks, which involve intentionally introducing faults into a system to cause it to behave in an unintended manner, are widely recognized and pose a significant threat to the security of cryptographic primitives implemented in hardware, making fault tolerance an increasingly critical concern. However, protecting cryptographic hardware primitives securely and efficiently, even with well-established and documented methods such as redundant computation, can be a time-consuming, error-prone, and expertise-demanding task.
In this research, we present a comprehensive and fully-automated software solution for the Automated Generation of Fault-Resistant Circuits (AGEFA). Our application employs a generic and extensively researched methodology for the secure integration of countermeasures based on Error-Correcting Codes (ECCs) into cryptographic hardware circuits. Our software tool allows designers without hardware security expertise to develop fault-tolerant hardware circuits with pre-defined correction capabilities under a comprehensive fault adversary model. Moreover, our tool applies to masked designs without violating the masking security requirements, in particular to designs generated by the tool AGEMA. We evaluate the effectiveness of our approach through experiments on various block ciphers and demonstrate its ability to produce fault-tolerant circuits. Additionally, we assess the security of examples generated by AGEFA against Side-Channel Analysis (SCA) and FI using state-of-the-art leakage and fault evaluation tools.
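As a software illustration of the ECC-based redundancy such a tool weaves into circuits, the toy below encodes four data bits with a single-error-correcting Hamming(7,4) code and corrects an injected single-bit fault. It is a plain Python model of the principle, not AGEFA output.

```python
# Hamming(7,4): any single flipped bit is located by the syndrome and corrected.
def encode(d):                        # d = [d1, d2, d3, d4]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]   # codeword positions 1..7

def correct(c):
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]    # parity over positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]    # parity over positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]    # parity over positions 4,5,6,7
    pos = s1 + 2 * s2 + 4 * s3        # syndrome = 1-based index of the flipped bit (0 = none)
    if pos:
        c = list(c)
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]   # recovered data bits

word = [1, 0, 1, 1]
codeword = encode(word)
codeword[5] ^= 1                      # inject a single-bit fault
assert correct(codeword) == word
```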
Masked Computation of the Floor Function and its Application to the FALCON Signature
https://eprint.iacr.org/2024/709

FALCON is a candidate in the National Institute of Standards and Technology (NIST) standardization of new Post-Quantum Cryptography (PQC) primitives. However, it remains a challenge to define efficient countermeasures against side-channel attacks (SCA) for this algorithm. FALCON is a lattice-based signature scheme that relies on rational numbers, which is unusual in the cryptography field. While recent work has proposed a solution to mask the addition and the multiplication, some roadblocks remain, most noticeably how to protect the floor function. In this work we propose to complete the existing first attempts at hardening FALCON against SCA. We give mathematical proofs of our methods as well as formal security proofs in the probing model using the Non-Interference concepts.
BUFFing FALCON without Increasing the Signature Size
https://eprint.iacr.org/2024/710

This work shows how FALCON can achieve the Beyond UnForgeability Features (BUFF) introduced by Cremers et al. (S&P'21) more efficiently than by applying the generic BUFF transform. Specifically, we show that applying a transform of Pornin and Stern (ACNS'05), dubbed PS-3 transform, already suffices for FALCON to achieve BUFF security. For FALCON, this merely means to include the public key in the hashing step in signature generation and verification, instead of hashing only the nonce and the message; the other signature computation steps and the signature output remain untouched. In comparison to the BUFF transform, which appends a hash value to the final signature, the PS-3 transform therefore achieves shorter signature sizes, without incurring additional computations.
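A hedged sketch of the described tweak: the public key is hashed together with the nonce and the message, and everything else is left as in plain hash-then-sign. The inner signing and verification routines are passed in as callables because FALCON's lattice operations are untouched by the transform; this is not a real FALCON API.

```python
# PS-3-style binding of the public key into the message hash (illustrative sketch).
import hashlib
import os

def buffed_sign(inner_sign, sk, pk, message):
    nonce = os.urandom(40)
    digest = hashlib.shake_256(pk + nonce + message).digest(64)   # pk bound into the hash
    return nonce, inner_sign(sk, digest)                          # signature size unchanged

def buffed_verify(inner_verify, pk, message, nonce, sig):
    digest = hashlib.shake_256(pk + nonce + message).digest(64)
    return inner_verify(pk, digest, sig)
```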
Non-Transferable Anonymous Tokens by Secret Binding
https://eprint.iacr.org/2024/711

Non-transferability (NT) is a security notion which ensures that credentials are only used by their intended owners. Despite its importance, it has not been formally treated in the context of anonymous tokens (AT) which are lightweight anonymous credentials. In this work, we consider a client who "buys" access tokens which are forbidden to be transferred although anonymously redeemed. We extensively study the trade-offs between privacy (obtained through anonymity) and security in AT through the notion of non-transferability. We formalise new security notions, design a suite of protocols with various flavors of NT, prove their security, and implement the protocols to assess their efficiency. Finally, we study the existing anonymous credentials which offer NT, and show that they cannot automatically be used as AT without security and complexity implications.
Quantum NV Sieve on Grover for Solving Shortest Vector Problem
https://eprint.iacr.org/2024/712

Quantum computers can efficiently model and solve several problems that are challenging for classical computers, raising concerns about potential security reductions in cryptography. NIST is already considering potential quantum attacks in the development of post-quantum cryptography by estimating the quantum resources required for such attacks. In this paper, we present quantum circuits for the NV sieve algorithm to solve the Shortest Vector Problem (SVP), which underpins the security of lattice-based cryptography, achieving a square-root quantum speedup. Although there has been extensive research on applying quantum algorithms to lattice problems at the theoretical level, concrete quantum circuit implementations for them have not been presented before.
Notably, this work demonstrates that the required quantum complexity for the SVP in the lattice of rank 70 and dimension 70 is $2^{43}$ (a product of the total gate count and the total depth) with our optimized quantum implementation of the NV sieve algorithm.
This complexity is significantly lower than the NIST post-quantum security standard, where level 1 is $2^{157}$, corresponding to the complexity of Grover's key search for AES-128.
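For intuition about the sieving step whose search is being accelerated, the toy below runs a tiny classical list-sieve-style reduction: it keeps a pool of lattice vectors and repeatedly replaces vectors by shorter pairwise differences. It is purely illustrative of the search structure (and far from an optimized NV sieve); dimensions and parameters are arbitrary.

```python
# Toy sieving loop: reduce each vector against shorter ones already kept.
import numpy as np

rng = np.random.default_rng(1)
dim, pool_size = 6, 200
basis = rng.integers(-20, 21, size=(dim, dim))                 # random integer basis
pool = [rng.integers(-5, 6, size=dim) @ basis for _ in range(pool_size)]

for _ in range(10):                                            # a few sieving passes
    pool.sort(key=lambda v: int(v @ v))
    kept = []
    for v in pool:
        for w in kept:                                         # try to shorten v using kept vectors
            if (v - w) @ (v - w) < v @ v:
                v = v - w
        if v @ v > 0:                                          # drop zero vectors
            kept.append(v)
    pool = kept

shortest = min(pool, key=lambda v: int(v @ v))
print(shortest, int(shortest @ shortest))
```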
Analyzing Pump and jump BKZ algorithm using dynamical systems
https://eprint.iacr.org/2024/713

The analysis of the reduction effort of lattice reduction algorithms is important in estimating the hardness of lattice-based cryptographic schemes. Recently, many lattice challenge records, such as the TU Darmstadt LWE and SVP Challenges, have been broken using the Pnj-BKZ algorithm, the default lattice reduction algorithm in G6K. However, previous estimations of the Pnj-BKZ algorithm are simulator algorithms rather than theoretical upper bound analyses. In this work, we present the first dynamic analysis of the Pnj-BKZ algorithm. More precisely, let $L$ be the lattice spanned by $(\mathbf{a}_i)_{i\leq d}$. Our analysis shows that the shortest vector $\mathbf{b}_1$ output after running $\Omega\left(\frac{2Jd^2}{\beta(\beta-J)}\left(\ln d + \ln\ln\max_{i}\frac{\left\|\mathbf{a}_i^{*}\right\|}{(\det L)^{1/d}}\right)\right)$ tours of pnj-BKZ$(\beta,J)$ satisfies $\left\|\mathbf{b}_1\right\| \le \gamma_{\beta}^{\frac{d-1}{2(\beta-J)}+2} \cdot \left(\det L\right)^{\frac{1}{d}}$.
Learning with Quantization, Polar Quantizer, and Secure Source Coding
https://eprint.iacr.org/2024/714

This paper presents a generalization of the Learning With Rounding (LWR) problem, initially introduced by Banerjee, Peikert, and Rosen, by applying the perspective of vector quantization. In LWR, noise is induced by rounding each coordinate to the nearest multiple of a fraction, a process inherently tied to scalar quantization. By considering a new variant termed Learning With Quantization (LWQ), we explore large-dimensional fast-decodable lattices with superior quantization properties, aiming to enhance the compression performance over conventional scalar quantization. We identify polar lattices as exemplary structures, effectively transforming LWQ into a problem akin to Learning With Errors (LWE), where the distribution of quantization noise is statistically close to discrete Gaussian. Furthermore, we develop a novel ``quancryption'' scheme for secure source coding. Notably, the scheme achieves near-optimal rate-distortion ratios for bounded rational signal sources, and can be implemented efficiently with quasi-linear time complexity. Python code of the polar-lattice quantizer is available at https://github.com/shx-lyu/PolarQuantizer.
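For concreteness, the sketch below generates samples in the scalar-quantization view of LWR that LWQ generalizes: each inner product over $\mathbb{Z}_q$ is deterministically rounded into $\mathbb{Z}_p$, and the rounding error plays the role of the LWE noise. Parameters are illustrative, not taken from the paper.

```python
# LWR sample: b = round((p/q) * <a, s>) mod p; the rounding error acts as the noise term.
import numpy as np

q, p, n = 3329, 256, 16
rng = np.random.default_rng(0)
s = rng.integers(0, q, size=n)                        # secret vector

def lwr_sample():
    a = rng.integers(0, q, size=n)
    inner = int(a @ s) % q
    b = round(inner * p / q) % p                      # coordinate-wise (scalar) quantization
    return a, b, inner

a, b, inner = lwr_sample()
noise = b * q / p - inner          # rounding error (ignoring wrap-around), |noise| <= q/(2p)
print(b, round(noise, 2))
```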