Pindex: Private multi-linked index for encrypted document retrieval

A. John Prakash; B. Lydia Elizabeth

doi:10.1371/journal.pone.0256223

Abstract

Cryptographic cloud storage is used to make optimal use of the cloud storage infrastructure to outsource sensitive and mission-critical data. The continuous growth of encrypted data outsourced to cloud storage requires continuous updating. Attacks like file-injection are reported to compromise confidentiality of the user as a consequence of information leakage during update. It is required that dynamic schemes provide forward privacy guarantees. Updates should not leak information to the untrusted server regarding the previously issued queries. Therefore, the challenge is to design an efficient searchable encryption scheme with dynamic updates and forward privacy guarantees. In this paper, a novel private multi-linked dynamic index for encrypted document retrieval namely Pindex is proposed. The multi-linked dynamic index is constructed using probabilistic homomorphic encryption mechanism and secret orthogonal vectors. Full security proofs for correctness and forward privacy in the random oracle model is provided. Experiments on real world Enron dataset demonstrates that our construction is practical and efficient. The security and performance analysis of Pindex shows that the dynamic multi-linked index guarantees forward privacy without significant loss of efficiency.

Citation: Prakash AJ, Elizabeth BL (2021) Pindex: Private multi-linked index for encrypted document retrieval. PLoS ONE 16(8): e0256223. https://doi.org/10.1371/journal.pone.0256223

Editor: Anandakumar Haldorai, Sri Eshwar College of Engineering, INDIA

Received: March 11, 2021; Accepted: August 2, 2021; Published: August 20, 2021

Copyright: © 2021 Prakash, Elizabeth. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

1 Introduction

Cloud computing has revolutionized data storage by offering effortless data storage for personal, enterprises, governments and institutions. Data owners have flexible, on-demand access to data with simplified data management and significant cost benefits with increased performance. Data owners outsource sensitive data such as the Electronic Health Records (EHR), citizens’ personal information, emails, credit cards data, government information and critical business data to the public cloud [1]. Despite the numerous benefits of cloud storage fuelled by high speed networking technologies and although the cloud storage providers (CSPs) claim to adopt strong security measures governments, organizations and businesses are slow to fully embrace the public cloud storage due to privacy and security concerns. When data owners outsource sensitive data to the cloud; they lose control over their data leading to a number of security issues [2].

Data owners encrypt their sensitive information to protect privacy of data stored with the honest but curious cloud storage and to defend against unauthorized access. Encryption imposes restriction on searching which is the only way to access the data. Unless they can easily be indexed, shared, retrieved and utilized, encrypted data serves no purpose. Searchable encryption enables a user to perform efficient keyword search while preserving privacy. Indexes are used to improve the search performance by significantly reducing the search complexity. A number of searchable encryption schemes [3–6] have been proposed but most of the schemes provide a static mechanism which hinders its application in reality.

Dynamic searchable encryption allows clients to dynamically update files or keywords without rebuilding the encrypted index. For any searchable encryption scheme to be practical the scheme must be secure and should allow efficient updates with optimal search time. A few reported works [7–11] support dynamic index. Nevertheless, most of the solutions leaks critical information since the adversary can observe and learn more information from the interaction between data user and CSP. This leakage can be from search pattern, access pattern, update pattern, size pattern, file identifiers containing a specific keyword, trace and trapdoor linkability. As a result of the leakages, forward privacy was introduced by Stefanov et al. [12]. Zhang et al. [13] presented a file injection attack on dynamic constructions by adding files to the database. There is a trade-off between security and practicality. In addition to the data privacy, it is imperative that the computational correctness should be guaranteed by design. This is because the malicious server can return incorrect results due to hardware or software malfunction or to save resources. In order to fully utilize the services offered by the cloud server, there is a need for a provably secure dynamic searchable encryption scheme with efficient search, forward privacy and support for parallelism, which is a challenging problem.

From the above discussions, the significant research problems in designing protocols for privacy-preserving search over encrypted outsourced cloud data can be summarized as: a) an efficient and secure index construction to improve search without reconstructing the index. b) a privacy preserving search over encrypted data with allowable minimum leakage of information to the CSP. c) providing support for efficient dynamic updates to encrypted index with allowable minimum leakage and without requirement of trusted platform at the CSP. d) Support parallel execution of multi-keyword search and update operations. e) a construction with minimum communicational and computational overhead. Motivated by the challenges summarized above the main contributions and results of the proposed method, namely, Pindex are:

Secure Dynamic Index: A novel private multi-linked dynamic index construction using probabilistic homomorphic encryption and a secret orthogonal vector as building blocks. Support to add/delete keywords or documents without reconstructing the outsourced encrypted index using the hash table that contains the sum of inner product of the rows that are linked by orthogonal vectors.
Forward Privacy: The key used in construction of the index and trapdoor or token generation are indistinguishable from random functions. This is because Pindex uses probabilistic homomorphic encryption and secret orthogonal vectors. Thus, the server cannot learn information from search or access patterns from previous queries.
Efficient and Parallelizable: The server performs and functions evaluations for each search and update operation where |F_q| is the number of documents that contain the query keyword w_q and |W_f| is the number of keywords contained in a file f to be updated. The client and server storage overhead is reduced to . The multi-linked index structure allows multi-keywords search and update operations to run independently over p processors.
The results obtained from security analysis shows that the proposed design is secure in the real-ideal security model, efficient with sub-linear time complexity , efficiently parallelized and exhibits relatively better performance in empirical analysis.

The rest of the paper is organized in the order of research methodology adopted which is similar to the standard works [5, 8, 9, 12, 14, 15]. First, we define the security requirements of the protocol under design in the form of definitions such that it captures thoroughly the adversarial environment and adversary. Such definitions stated as a security model for the protocol to serve as the basis for further design is described in section 2. Second, we describe in section 3 the design of the cryptographic protocol adhering to the security model devised in section 2. Third, we present in section 4, a thorough theoretical correctness and security analysis using standard provable security attack or adversary model to investigate vulnerabilities in the design. Fourth in section 5, we support the theoretical correctness and efficiency with results obtained from empirical simulations. Lastly, we present in section 6 the related works, followed by discussion in section 7 and conclusion in section 8.

2 Model

This section describes the preliminaries and security model which captures the requirements of the protocol under design. The model captures the notions of security and adversary through precise definitions to be used later as the basis of design.

2.1 Notations and system model

Let D = {d₁, … d_n} be the set of document collection, C = {C₁, … C_n} be the encrypted collection of documents and W = {w₁, …, w_m} be the set of keywords. Let F = {f₁, …, f_n} denote the file identifiers of documents given by f_i = id(C_i). A collection of document D is to be outsourced to the Cloud Service Provider (CSP). The data owner encrypts the document set D to obtain a encrypted document collection C. To enable privacy preserving searching over the encrypted document collection C, a secure encrypted index denoted by γ is built using BuildIndex. The encrypted document collection C, the secure encrypted Index γ are outsourced to the CSP. An authorised Data User (DU) acquires a search token denoted by τ_q generated using SrchToken corresponding to a given query keyword w_q from the Data Owner (DO). The search token τ_q is then sent to the CSP using a standard communication protocol. The CSP searches for the matching documents for the query keyword using Search over the encrypted index γ preserving privacy. The Search algorithm outputs a set of file identifiers relevant to the query denoted by F_q that contain the query keyword. The DO updates the index using Update whenever the document changes or files need to be added/deleted. The overall system working mechanism of the search is shown in Fig 1. We use the notation = to denote a deterministic assignment, ← a probabilistically computed assignment and ←_$ an uniformly sampled assignment.

Download:

Fig 1. System model.

https://doi.org/10.1371/journal.pone.0256223.g001

2.2 Security definitions

The construction of the Pindex scheme aims to securely perform computations at the CSP on the encrypted data while preserving privacy and is defined as in [9]:

Definition 1 (Pindex). Pindex is a dynamic searchable encryption scheme consisting of a tuple {KGen, Enc₁, Dec₁, Enc₂, Dec₂, BuildIndex, SrchToken, Search, UpdToken, Update} of ten algorithms such that:

K ← KGen(1^λ) is a probabilistic polynomial time algorithm that outputs a secret key K ← {pk, sk, k} when given as input a security parameter 1^λ
c ← Enc₁(k, f) is a CPA-secure deterministic polynomial time algorithm that outputs a sequence of encrypted ciphertexts c when given as input a secret key k and a file f.
f = Dec₁(k, c) is a deterministic polynomial time algorithm that outputs a file f when given as input a secret key k and a ciphertext c.
c ← Enc₂(pk, m, r) is a CPA-secure probabilistic polynomial time additive homomorphic algorithm that outputs an encrypted ciphertext c when given as input a secret key pk, a message m and randomness r.
m = Dec₂(sk, c) is a deterministic polynomial time algorithm that outputs m when given as input a secret key sk and a ciphertext c.
(γ, Γ) ← BuildIndex(K, δ, V, D, Enc₂) is a probabilistic polynomial time algorithm that outputs a encrypted index γ and a parameter hash table Γ when given as input a secret K, vector V, primitive Enc₂, document collection D and a index δ.
τ_q ← SrchToken(pk, w_q, Γ) is a probabilistic polynomial time algorithm that outputs a search token τ_q when given as input a secret key pk, query keyword w_q and a parameter hash table Γ.
F_q ← Search(γ, τ_q) is a deterministic polynomial time algorithm that outputs a set of identifiers F_q ⊆ D when given as input an encrypted index γ and a search token τ_q.
τ_u ← UpdToken(pk, f_i, u, w_u) is a probabilistic polynomial time algorithm. Given a secret key K, update keyword w_u and a file f_i as input, the algorithm outputs an update token τ_u for the update type u ∈ {add, delete}.
γ′ ← Update(γ, τ_u) is a deterministic polynomial time algorithm that outputs an updated encrypted index γ′ given an encrypted index γ and an update token τ_u.

A scheme as defined above is secure when it is designed based on the real/ideal security paradigm. The paradigm requires defining correctness and privacy. The correctness captures the notion that the specified polynomial time algorithms in Definition 1 outputs correctly and it is given as:

Definition 2 (Correctness). Let π be a dynamic searchable encryption scheme as given in Definition 1. Such π is said to be correct if , ∀K: K ← KGen(1^λ), ∀(δ, f), ∀γ: γ ← BuildIndex(K, δ, V, D, Enc₂), ∀c: c ← Enc₁(k, f) and for all update operations of Update(γ, τ_u), where τ_u is the update token obtained by ∀(f, τ_u): τ_u ← UpdToken(pk, f_i, w) and all u ∈ {add, delete}, ∀(w, τ_q: τ_q ← SrchToken(pk, w, Γ), ∀F_q: F_q ← Search(γ, τ_q), the plaintext f_w = {Dec₁(k, c_i):i ∈ F_q} are all plaintexts in f containing keyword w.

Most of the dynamic searchable encryption schemes leak information which could be exploited to launch attacks such as file injection attacks to recover the encryption key. For example, information leak such as file identifiers returned for a respective search or update query cannot be prevented. The leakage function captures the notion of allowable leakages in a dynamic searchable encryption defined as in [9]:

Definition 3 (Leakage). The leakage functions are defined as follows:

Search Pattern : It is defined by a one dimensional binary matrix of length m × n. Given a search keyword w at t, if the search at time i ≤ t is for the keyword w then it is marked as 1 at i and 0 otherwise.
Access Pattern Δ(δ, f, w, t): It is defined as the set of identifiers f_w at time t given a search query for keyword w at time t.
: The function outputs the number of keywords m, the number of files n, the identifiers i of the file and the size of each file when given the index δ and set of file identifiers.
: The function outputs and Δ(δ, f, w, t) for a search query at time t, given the index δ, set of files f and keyword w as input.

The definition of privacy or security captures the notion that the view of each party can be separately simulated given the leakage function and it is defined as:

Definition 4 (Security). Let π be a dynamic searchable encryption scheme as defined in Definition 1. Let be a stateful adversary, Sim be a stateful simulator and are stateful leakage algorithms in the following experiments:

: The challenger runs K ← KGen. The adversary outputs and receives (γ, Γ) ← BuildIndex(K, δ, V, D, Enc₁) from . Each time the adversary computes q ∈ {w, f_i} and makes polynomial number of adaptive queries. obtains from a search token τ_q ← SrchToken(pk, w, Γ) whenever it wants to perform a search query q = w. If wants to make a update query q = f_i it receives a update token τ_u ← UpdToken(pk, f_i, w_u) from . returns an output bit b at the end of the experiment.
: Sim outputs (γ, c) to when given (δ, f) and . then can make polynomial number of adaptive queries q ∈ {w, f_i}. makes a search query q = w then the simulator is given and when q is update query, the Sim is given a new including the adaptive query history by . Also, the Sim returns a token τ to appropriately. then returns an output bit b at the end of the experiment.

The scheme π is said to be –secure against adaptive dynamic chosen-keyword attacks if for all PPT adversaries , there exists a PPT simulator Sim such that (1)

A dynamic searchable scheme is considered secure if no information is revealed during execution of all its operations. However, designing such schemes which do not leak any information is a challenge. For example, when a search or update is performed for the keyword w, the CSP can correlate the file identifiers with it’s query and learn about the outsourced data. Therefore, there is a need for stronger security guarantees for dynamic searchable schemes.

Definition 5 (Forward private). A dynamic searchable encryption scheme which is -adaptively secure is said to be forward private if the following holds: (2) where is the update leakage function, {f_i, μ_i} is the set of document-number of keywords updated pair, f_i is the document updated, μ_i is the number of keywords updated on respective f_i and a stateless function.

The forward private definition 5 states that the server should be able to learn only about the file identifier(s), size of file(s) and number of keywords contained in file(s) after observing searches before and after a dynamic update operation op ∈ {Add, Delete} over encrypted index is performed.

2.3 Homomorphic and orthogonality

The design of dynamic searchable schemes satisfying definitions 4 and 5 more often requires the use of encryption mechanisms called homomorphic cryptosystems as a building block.

Definition 6 (Homomorphic). A public key encryption scheme E = (KGen, Enc, Dec, M, C) is said to be homomorphic if it holds that Dec(sk, c₁ ⋄ c₂) = m₁ ⋄ m₂ for all m₁, m₂ ∈ M and c₁, c₂ ∈ C with m₁ = Dec(sk, c₁) and m₂ = Dec(sk, c₂) such that for all c obtained from Enc(pk, m), c belongs to C.

The proposed Pindex uses Paillier cryptosystem to instantiate the encryption which exhibits homomorphic addition property. The group operation ⋄ for Paillier cryptosystem has homomorphic property such that Dec(sk, c₁ ⋄ c₂) = m₁ + m₂ [16]. Let V = v₁, v₂ …, v_|W| be a matrix of mutually orthogonal vectors where v_i is the i^th row vector and |W| is the number of keywords in the keyword collection W. A square n × n matrix V with elements ±1 that satisfies V × V^T = nI_n is called a Hadamard matrix of order n [17, 18]. Such matrices make cross correlation values to be zero and this property is used in the design of the proposed search and update mechanism of Pindex.

3 Construction

The cloud data storage service involves three entities namely, the Data Owner (DO), Cloud Server or the Cloud Service Provider (CSP) and the Data User (DU). The index is constructed using a novel multi-linked hash map structure. The index construction based on the security model presented in section 2 is given below:

3.1 Multi-linked hash map construction

The two dimensional matrix M_mn whose elements are stored in a multi-linked hash map δ, where each element {m_ij: m_ij = [next, prev], m_ij ∈ M_mn,1 ≤ i ≤ m,1 ≤ j ≤ n} has link to the next non-empty element in the respective row i and column j. The row links are implemented by orthogonal vectors and respective column links using pointers as shown in Fig 2. The hash table value δ(i) is orthogonally linked row values for all 1 ≤ i ≤ m as given below: (3)

Download:

Fig 2. Multi-linked hash map δ.

https://doi.org/10.1371/journal.pone.0256223.g002

Eq 3 shows that each value is a sum of inner product of the orthogonal vector v_j and a pointer to the next non-empty value in the column j. Each v_j represents a column and all v_j’s are mutually orthogonal 4. (4)

Hence, assuming y such that y = j, (5) where m_ij by definition is a pointer to the next non-empty element in the column j and <.,.> denotes vector inner product. Therefore, all the non-zero elements in the column j can be retrieved using the recursive function: (6)

The hash table value δ can be used as an index for retrieving documents containing a given keyword. Let the rows and columns in matrix M represent documents and keywords respectively. Each element in a column j is a pointer to next document containing the keyword w_j. Each row i is encoded as an orthogonal sum of product of keyword vector v_j and the next document f_i containing keyword w_j and stored in a keyword multi-linked hash map as δ(f_i). Each element m_ij = [f_next, f_prev] represents the next and previous file of f_i containing the keyword w_j and an empty value m_ij = ∅ represents that w_j ∉ f_i. Therefore, using Eq 6 set of all documents F_q containing query keyword w_q can be retrieved in F_q ⊆ F in sub-linear time as for all practical queries |F_q| ⋘ |F|. The multi-linked hash map data structure is designed as described above such that it supports addition of new documents which includes encoding a row, adding encoded value and updating either the last or first row pointers accordingly. The multi-linked hash map data structure design improves search to sub-linear time and supports parallel execution of search and updates.

3.2 Algorithm K ← KGen(1^λ)

It is a probabilistic polynomial time algorithm which generates the secret parameters used for building index and encryption. The Pindex algorithm uses Paillier Cryptosystem for building the index and generation of search and update tokens. The key (pk, sk) generation setup is described in [18] where pk denotes the public key and sk denotes the private key of Paillier cryptosystem. The KGen also generates a secret key k to be used by Enc₁ for encrypting the document collection D. The design choice for Enc₁ is the CPA-secure AES while the design choice for Enc₂ is the CPA-secure Paillier cryptosystem which also exhibits additive homomorphic properties.

3.3 Algorithm c ← Enc₂(K, m, r)

It is a probabilistic algorithm, given a message m ∈ P and a public key pk = (n, g), Enc₂(pk, m, r) returns the ciphertext c = g^mrⁿ mod n², where given r is a random value chosen such that .

3.4 Algorithm m = Dec₂(sk, c)

It is a probabilistic algorithm, given a ciphertext c ∈ C and a private key sk = (p, q, λ), Dec₂(sk, c) returns the message m. (7)

3.5 Algorithm (γ, Γ) ← BuildIndex(K, δ, V, D, Enc₂)

BuildIndex algorithm constructed by the data owner takes as input five parameters namely the document set F = {f₁, f₂,…, f_|D|}, keyword set W = {w₁, w₂,…w_|W|}, a pre-built multi-linked hash map δ as described in section 3.1 for F, mutually orthogonal vector V and the key K as input. The algorithm outputs multi-linked hash map index γ and keyword parameter hash table Γ. The algorithm computes:

For each row i or file f_i in δ:
For each column j or word w_j in f_i:
if m_ij ≠ ∅ and Γ(w_j) = ∅ then compute Γ(w_j) = (v ←_$ V_w, r ←_$ Z_n)
Compute s ← Enc₂(pk, w_j, r)
Compute γ(0) = γ(0) + v.s.f and γ(f_i) = γ(f_i) + v.s.⊥
if m_ij ≠ ∅ and Γ(w_j) ≠ ∅ then retrieve (v, r) = Γ(w_j)
Compute s ← Enc₂(pk, w_j, r)
if f_next ≠ ∅ then compute γ(f_i) = γ(f_i) + v.s.f_next otherwise γ(f_i) = γ(f_i) + v.s.⊥
Compute v_r ←_$ V_r and r′ ←_$ Z_m
Compute γ(f_i) = γ(f_i) + v_r.r′
Return (γ, Γ)
Send γ to CSP and Γ is kept secret at the DO

The pictorial representation of index structure is given in Fig 3. The encrypted index (γ, Γ) can be observed to be designed such that the link can be traversed vertical through the pointers and traversed horizontal through the orthogonal property without leaking critical information.

Download:

Fig 3. Index structure.

https://doi.org/10.1371/journal.pone.0256223.g003

3.6 Algorithm τ_s ← SrchToken(w_q, pk, V)

The SrchToken algorithm is computed by the Data Owner. It takes a query keyword w_q ∈ W, key pk, set of mutually orthogonal vectors V as input and outputs a search token τ_q. The algorithm is given below:

Get the keyword secret (v, r) ← Γ(w_q) from the keyword parameter hash table Γ and compute such that rr⁻¹ ≡ 1 mod n²
Compute s⁻¹ ← Enc₂(pk, −w_q, r⁻¹) and returns to DU the search token τ_q by computing: (8) where and v′ ←_$ V_r

The trapdoor τ_q is designed to contain the encrypted keyword parameters embedded with the corresponding orthogonal vector v and random v′.r′ are added at each instance to make the token appear indistinguishable even when the same query is repeated.

3.7 Algorithm F_q ← Search(γ, τ_q)

The Search algorithm is computed at CSP and it takes as input the encrypted index γ, search token τ_q, key pk and outputs F_q = {f₁, f₂,‥f_n}. F_q is the set of document identifiers such that w_q satisfies w_q ∈ W and ⊥ otherwise, where ⊥ denotes null. The search algorithm is given below:

Compute 〈γ(0), τ_q〉 to obtain the first file pointer f_q = f₁ of document containing the keyword w_q.
If f_q = ∅ then there are no documents matching the query keyword w_q or the search token is invalid. The algorithm terminates returning ⊥.
Otherwise, obtain all f_q computing recursively as given below with base condition f_q = f₁ until f_q = ⊥ (9)
Return all file identifiers {f₁, …, f_q, …, |F_q|} obtained.

Thus, it can be observed that the design of the search algorithm is simple and dependent on the search token design. Since, the search token is designed to be created with the orthogonally embedded encrypted keyword parameters (v.s⁻¹) and the index contains (v, s), a inner product of the two will result in retrieving the corresponding document identifiers. The proof of correctness for which is presented in section 4.

3.8 Algorithm τ_q ← UpdToken(pk, f, u, w)

The UpdToken probabilistic algorithm is computed by the DO. The algorithm returns an update token to the CSP for securely updating the index. We define a header node to be the node pointed by γ(0) or the row when i = 1 in δ, a terminal node to be the last file containing ⊥ for a w in δ and any other node as intermediate nodes containing a keyword w. The algorithm computes:

When the update operation is DeleteKeyword w ∈ W from document f then compute:
1. Get (v, r) ← Γ(w), compute s ← Enc₂(pk, w, r)
2. When f is the header node, compute τ₁ ← v.s.f_next, τ₂ ← v.s.f and return τ ← (τ₁, τ₂, f₁ = 0, f₂ = f)
3. When f is the terminal node, compute τ₁ ← v.s.⊥, τ₂ ← v.s.f and return τ ← (τ₁, τ₂, f₁ = f_prev, f₂ = f)
4. When f is an intermediate node, determine f_next, f_prev, compute τ₁ ← v.s.f_next, τ₂ ← v.s.f and return τ_u ← (τ₁, τ₂, f₁ = f_prev, f₂ = f)
When the update operation is DeleteFile for a document f then compute:
1. Get (v, r) ← Γ(w), compute s ← Enc₂(pk, w, r),
2. When f is the header node, compute and return τ ← (τ₁, τ₂, f₁, f₂ = f)
3. When f is the terminal node, compute and return τ ← (τ₁, τ₂, f₁, f₂ = f)
4. When f is an intermediate node, determine f_next, f_prev, compute and return τ_u ← (τ₁, τ₂, f₁, f₂ = f)
When the update operation is AddKeyword w ∈ W to document f then compute:
1. Get , compute s ← Enc₂(pk, w, r)
2. When f is the header node, compute τ₁ ← v.s.f, τ₂ ← v.s.f_next and return τ ← (τ₁, τ₂, f₁ = 0, f₂ = f)
3. When f is the terminal node, compute τ₁ ← v.s.f, τ₂ ← v.s.⊥ and return τ ← (τ₁, τ₂, f₁ = f_prev, f₂ = f)
4. When f is an intermediate node, determine f_next, f_prev in respect of f, compute τ₁ ← v.s.f, τ₂ ← v.s.f_next and return τ_u ← (τ₁, τ₂, f₁ = f_prev, f₂ = f)
When the update operation is AddFile for a document f then compute:
1. Get
2. Compute s ← Enc₂(pk, w, r)
3. Add f as the header node by computing and return τ_u = (τ₁, τ₂, f₁ = 0, f₂ = f)

3.9 Algorithm γ′ ← Update(γ, τ_u, f₁, f₂)

This algorithm takes an index γ, an update token τ_u = (τ₁, τ₂) and the public key pk as input and updates the encrypted index. The updated index γ′ is computed by Update algorithm is given below:

When the update operation u ∈ {DeleteKeyword, DeleteFile}
When the update operation u ∈ {AddKeyword, AddFile}

The rationale behind the design of Update described above and depicted in Fig 4 is based on exploiting the orthogonal property of V and homomorphic property of s used in creating the index (γ, Γ). This enables addition and deletion over encrypted index such that the tokens are indistinguishable guaranteed by CPA-security.

Download:

Fig 4. Update working.

https://doi.org/10.1371/journal.pone.0256223.g004

4 Security analysis

The privacy guarantees and correctness of the proposed design, Pindex with respect to the security model given in section 2 and particularly in definition 4 is analysed in this section. The first formal security model given by Bellare et al. [19] is a game-based definition in which the adversary is allowed to interact with a set of oracles (encryption and decryption). The definition models communicating parties in a network where the adversary’s goal is to distinguish between a correct shared secret or random value for the given challenge. Similar approach is used to prove the indistinguishability of the allowed information leakage of the protocol Pindex. Firstly, the correctness of the proposed protocol is proved independently. Secondly, the privacy and correctness with respect to the security model is proved. The Search can handle only two cases: w_q ∈ W and w_q ∉ W during its operation. To prove the correctness in each case, the proof follows two lemmas:

Lemma 1. When given a search token for a keyword (w_q ∉ W) and the index γ, the returns ⊥ such that

Proof. The concise representation of the Search algorithm for the above case is given below:

When i ≠ q, v_i.v_q = 0 and ∀(v_i, v_q) ∉ V, v_i.v_q = 0. It can be observed from the Search algorithm that ⊥ is returned whenever f_l = 0. Therefore, whenever a query keyword w_q ∉ W is searched over the document set the Search algorithm returns ⊥ with probability 1.

Lemma 2. When given a search token for a keyword (w_q ∈ W) and the index γ, the returns such that

Proof. Similar to lemma 1, the Search algorithm can be generally represented for w_q ∈ W as:

When i = q, v_i.v_q = 1 for all (v_i, v_q)∈V and and . Thus, a file identifier f_i will be obtained by the Search algorithm. Therefore, when the computation is recursively repeated until f_l == ⊥, the algorithm returns the set of file identifiers with probability 1.

Lemma 3. The algorithm Update provides correct dynamic index update for a given update token τ_u.

Proof. The Update algorithm updates the encrypted index using an update token generated for both the cases addition or deletion of files/keywords.

Case DeleteKeyword: The process of deleting a keyword is as follows:

When f is the header node file pointer, the UpdateToken computes τ₁ ← v.s.f_next, τ₂ ← v.s.f and returns τ_u ← (τ₁, τ₂, f₁ = 0, f₂ = ⊥). The Update is represented as: (10)
When f is the terminal node, compute τ₁ = v.s.⊥, τ₂ = v.s.f and return (11)
When f is a neither a header nor a terminal node pointer, determine f_next, f_prev, compute τ₁ ← v.s.f_next, τ₂ ← v.s.f and return (12)

Eqs 10, 11 and 12 demonstrates how the file pointers are adjusted when the keyword w_u has to be added or removed. The old pointers are removed and new pointers are updated from/to previous or next appropriate file depending on the case.

Case AddKeyword: The process of adding a keyword can be represented as follows:

When f is the first node, compute τ₁ ← v.s.f, τ₂ ← v.s.f_next and return τ_u ← (τ₁, τ₂, f₁ = 0, f₂ = f) (13)
When f is the terminal node, compute τ₁ ← v.s.f, τ₂ ← v.s.⊥ and return τ_u ← (τ₁, τ₂, f₁ = f_prev, f₂ = ⊥) (14)
When f is a neither a header nor a terminal node pointer, determine f_next, f_prev, compute τ₁ ← v.s.f, τ₂ ← v.s.f_next and return τ_u ← (τ₁, τ₂, f₁ = f_prev, f₂ = f) (15)

Eqs 13, 14 and 15 shows how the file pointers are modified while adding the keyword w_u. File addition AddFile and deletion DeleteFile follow similar modification of file pointers. Hence, the dynamic index update is performed correctly.

To prove the indistinguishability between the real game and the ideal game by any PPT distinguisher Dist, we construct a PPT simulator Sim which uses the leakage functions and to simulate functionality correctly of the algorithms described in Definition 1 indistinguishablly.

Theorem 1. Given the Definition(CKA-2 security) 4, the Pindex scheme given above is secure in the random oracle model, where leaks the number of keywords, the number of documents, the identifiers of the documents and the size of each document; and leaks the search pattern and the access pattern.

Proof. Let Sim be the simulator which interacts with an adversary in an execution of an as in Definition 4. We construct (γ, c) given the leakage as follows:

Given the number of files n and size of each file l through the leakage function, the Sim obtains c = (c₁,…, c_n) by simulating the encryption of files using which exists by definition of CPA-Secure Enc₁.
Given the file identifiers F = (f₁,…, f_n) through the leakage function, the Sim simulates the index γ:
1. The Sim builds the multi-linked list δ by generating a order m hadamard matrix, V where m is the number of keywords in the universal keyword set. If no such order m exists then the least greatest order to m is chosen.
2. Construct (n + 1) × m state matrix S where each entry S_ij ∈ {0,1} is determined by flipping a coin.
3. Compute such that S_ij ≠ 0 and x = sup(X) where and r is obtained from random oracle H given j. Store δ(j) ← (v_j, r).
4. Compute for all 1 ≤ i ≤ n + 1 such that S_ij ≠ 0 and if sup(X) ≠ ∅ then x = sup(X) otherwise x = ⊥ where .
Sim then gives adversary the encrypted index γ.
When the Sim receives an adaptive query q for keyword w_j, it computes τ_q ← v_j.Enc₂(pk, −j, r⁻¹) + v_r.r′ and returns τ_q to the adversary . The may use the Search function for the obtained token τ_q and it will obtain exactly the same file identifiers as in the leakage function every time correctly although the search token are different for same keyword w_j. It repeats only after making |V_r| queries for the same keyword w_j.
When the Sim receives an adaptive update query q = u = f_i and if u is for add, the simulator uses to simulate c_i assuming that the leakage function reveals the identifier and performs:
1. Updates the state by adding a row to the state matrix S where q = n + 2 and each entry S_qj ∈ {0,1} is determined by flipping a coin.
2. When f is first row and j is the keyword, compute τ₁ ← v.Enc₂(pk, j, r).f, τ₂ ← v.Enc₂(pk, j, r).f_next where next = sup(X) and X = {next: S_yj ≠ 0 ∧ 1 < y ≤ (n + 1) } and (v, r) ← δ(j). Returns τ ← (τ₁, τ₂, f₁ = 0, f₂ = f).
3. When f is the terminal row and j is the keyword, compute τ₁ ← v.Enc₂(pk, j, r).f, τ₂ ← v.Enc₂(pk, j, r).⊥. Returns τ ← (τ₁, τ₂, f₁ = f_prev, f₂ = ⊥) where prev = inf(X) and X = {prev: S_yj ≠ 0 ∧ i > y > 0} and (v, r) ← δ(j).
4. When f is a neither a header nor a terminal node pointer, determine f_next, f_prev, compute τ₁ ← v.s.f, τ₂ ← v.s.f_next where prev = inf(X) and X = {prev: S_yj ≠ 0 ∧ i > y > 0} while next = sup(X) and X = {next: S_yj ≠ 0 ∧ i < y ≤(n + 1)} and (v, r) ← δ(j). Returns τ_u ← (τ₁, τ₂, f₁ = f_prev, f₂ = f).
5. can then use τ_u on Update function and obtain the updated encrypted index γ′.
Similarly, Sim can simulate operation Add and Delete for keywords and file.

The results returned to the random oracle queries by Sim are consistent. The keys used in the tokens and keys used in the construction of index γ are indistinguishable since (Enc₁,Enc₂) used to encrypt is by definition indistinguishable from random. Hence, the CPA-Security of (Enc₁,Enc₂) guarantee indistinguishability between real and simulated encryptions of the files and index by .

5 Performance analysis

This section presents the performance analysis of the proposed design for its efficiency. Pindex is implemented in Java and the experimental results are evaluated on a Windows server with Intel Xeon Processor running at 2.30 GHz and 16 GB RAM. The experiment uses real world Enron email dataset [20]. The performance of Pindex is evaluated and compared with MRSE [21]. The experimental results are shown in Fig 5.

Download:

Fig 5. Experimental results.

https://doi.org/10.1371/journal.pone.0256223.g005

5.1 Multi-link index

Building the multi-linked index is a one-time process carried out by the data owner. The multi-linked dynamic searchable index γ can be constructed using two steps. First, encrypting the keyword w using Paillier cryptosystem if it belongs to document f and secondly computing the encrypted index γ which is a multi-link between the document f and keywords w_i ∈ W. Therefore the time cost of index generation is proportional to the number of files and number of keywords contained in each file. It can be observed from Fig 5a that the time cost for building the index with varying number of documents of the proposed system performs relatively better than MRSE. This is due to use of relatively small matrix dimensions in the design and computationally less intensive operations like scalar multiplications (|W| operations) and vector additions (|W| + 1 operations).

5.2 Search token generation

The search token τ_q is constructed for the keywords in the query. The process involves retrieving the pair (v, r) from the parameter hash table Γ and encrypting the query keyword w_q. The time cost for search token depends on the number of query keywords. Fig 5b shows the time cost for generating the search token in logarithmic scale for document size n = |D| = 1000 and varying the size of the query keyword. It is observed that Pindex performance is better than MRSE. This is because our scheme use less computationally intensive operations and the number of operations involved in building a trapdoor is minimum. The operations include two scalar multiplications, one vector addition for indistinguishability and one lookup.

5.3 Search

Search operation executed by the cloud server consists of multiplying the search token τ_q with the encrypted index γ. If w_q ∈ W, the first pointer is computed and recursively the multi linked index is traversed until f_q = ⊥. For a multi-keyword search with |w_q| keywords in a query q and run on p processors parallel which returned |F_q| files then time complexity would be . In MRSE similarity scores for the document set is computed and therefore the time complexity is . The time cost for varying the number of documents for m = |W| = 1000 is shown in Fig 5c and 5d. Search time of Pindex is relatively better than MRSE. Also, it can be observed that as the number of keywords per query is increased, the corresponding increase in time of query to be sub-linear for the proposed design. It is due to optimal number of operations used in the design of search operation which involves just one vector inner product.

5.4 Update

The update (add and delete) of both keywords and document involves adding the update token τ₁ and deleting τ₂ from f_prev. The worst case scenario for update delete file operation is when the number of f_prev equals the number of keywords contained in the file to be deleted. Updates are not compared as MRSE does not support dynamic index, the index has to be rebuilt for updates and the results are shown in Fig 5e and 5f. The time cost for update i.e to add/ delete a keyword is and to add/delete a file is . The proposed method supports dynamic updates unlike MRSE and it can be observed that dynamic updates comprises of building update trapdoor and executing update algorithm. The time to build update trapdoor is significantly less due to the same reasoning as that of building trapdoor and execution of update algorithm involves only vector additions and deletions as given in section 3.

6 Related works

The confidentiality of sensitive data like Electronic Health Records (EHR) outsourced to CSP is protected from external and internal attacks by deploying a searchable encryption scheme specially designed to support operations like search and updates over encrypted data [22].

6.1 Searchable encryption

The searchable encryption schemes are primarily classified into Searchable Symmetric Encryption and Public Key Encryption with Keyword Search [2] depending upon the type of encryption primitive used and notion of provable security. The searchable encryption systems is further classified based on the type of search operation into keyword based and semantic based searchable encryption system [23]. The keyword based searchable encryption system uses either specially designed secure encrypted indexes to support search operations over encrypted data or sequential scanning of encrypted documents to support search [4]. The keyword based searchable encryption system are exact search keyword based while the semantic keyword search matches even closest query keywords. The proposed design is based on keyword based searchable encryption schemes using secure encrypted index. The first practical solution to symmetric searchable encryption was proposed by Song et al. [3]. Followed by a number of searchable encryption schemes especially with secure index construction, improved efficiency, security definitions and formalizations [4, 5, 8, 9, 24–26].

6.2 Secure index

To improve performance and decrease the search complexity, clients build keyword-based indexes and outsource it to the cloud. Searchable encryption schemes in literature use a forward, inverted or tree-based index [4, 24, 27, 28]. However, the scalability of the index based schemes can be achieved by either rebuilding the index or by using expensive techniques [5]. The index contains information such as document identifiers, document length, number of keywords in the file. The adversary should not be able to determine the query keyword using statistical analysis. Therefore, the index must not leak information which could aide honest-but-curious CSP to link tokens, keywords and document identifiers. The indexes and the tokens or trapdoors are inherently linked. Deterministic trapdoors are known to leak information therefore probabilistic trapdoors are used. Goh [4] uses forward index using bloom filters per document. The search complexity is proportional to the number of files and bloom filters have an inherent problem of false positives. Chang and Mitzenmacher [24] use prebuilt dictionary of distinct keywords to build the forward index. The search complexity is proportional to the number of files. Inverted index or index per keyword was introduced by [5] and thus reduced the search time to a sub-linear scheme with optimal search time. [26] proposed a scheme based on inverted index using hash tables.

6.3 Dynamic searchable encryption

Dynamism is a main requirement for any searchable encryption scheme to be practical. The first dynamic searchable encryption was proposed by Kamara et al. [8] using inverted index and dynamic addition and deletion of documents. The update complexity is linear to the updated document/keyword pair but the update leaks some information about the documents. Kamara and Pamamanthou [9] proposed a red-black tree index based on an encrypted linked-list multi-map. To insert a new keyword, they added as a new entry to the right list and deletion using a dual representation of the index but update leaks information. Cash et al. [10] proposed a dynamic index using a separate update database to add and delete tuples stored in the revocation list. However, the update is inefficient and leaks significant information. Naveed et al. [11] presented a dynamic searchable encryption with blind storage. Instead of encrypting the index, the stored blocks are scattered using hashing. This scheme leaks the information while adding keyword. Other dynamic searchable encryption schemes are [7, 14, 29–31].

6.4 Forward privacy

There is a trade-off between practicality and security as most of the dynamic searchable encryption schemes leak significant amount of information about the document. Attacker can leverage this leakage to reveal information on the client’s queries. Deterministic encryption schemes leak information regarding size, search, access patterns from the repetition of the queried keyword. File injection attack on dynamic searchable encryption by Zhang et al. [13] triggered much attention on forward privacy schemes. By injecting few carefully selected files, the adversary can recover keywords queried in the past from the information leaked from the search token. Thus, the prior search token needs to be invalidated to achieve forward security. Therefore, a stronger property called forward privacy and an informal definition was introduced by Stefanov et al. [12] based on Oblivious RAM. However, the scheme leads to a search cost and update cost. Existing schemes that achieve forward privacy are generally based on trapdoor permutation or using pseudorandom functions. Bost proposed the first formal definition of forward security and a trapdoor permutation based scheme Sophos [32]. Later proposed Diana [15] a constrained PRF based scheme. A dual dictionary Dual [33] was proposed that simultaneously uses both forward with inverted index and achieves forward privacy using fresh keys. To realize forward security, a symmetric punctured encryption Janus ++ [34] was proposed. Etemad et al. [35] achieve forward privacy by replacing the keys revealed to the server. Sun et al. Aura [36] use a non-interative DSSE using bloom filters and multi-puncturable PRF. Wei et al. [37] proposed an index structure with keyed block chain. Khons [38] uses inverted index and achieves weak forward security by exploitation hidden pointer technique.

7 Discussion

A comparison of our work with existing dynamic schemes is given in Table 1. In our work, there is a relative improvement in client and server storage. The client storage is as the client needs to store the δ, which is the sum of the inner product of orthogonal vectors for documents and Γ the keyword parameter hash table. The server storage is as the server stores the encrypted multi-linked index γ that contains the number of document and the keywords w_i ∈ W. Efficient updates are performed without rebuilding the index. The search and update tokens are indistinguishable from the previous queries. The proposed multi-linked index is parallelizable and provides relatively efficient search and update with and respectively. The efficient time complexity of search is achieved by exploiting the vertical pointers and orthogonal horizontal linking property of the proposed multi-linked encrypted index structure which enables retrieval of the encrypted keyword index with a vector inner product. It can be observed from the SrchToken algorithm that for each search, the token generated consists of the orthogonal sum of and (v_r.r) in which (v_r, r) is chosen uniformly random for each query to make the token indistinguishable for even the same query. Similarly, the UpdToken algorithm also generates each token which are indistinguishable. This token generation strategy makes the tokens, file identifiers and keywords unlinkable which eventually provides forward privacy as in Definition 5. Moreover, the search can be performed in parallel for multiple tokens simultaneously in parallel on a shared index. This makes the proposed scheme more efficient in exploiting the available parallel processors and improves the execution speed of search operations.

Download:

Table 1. Complexity and feature comparison.

https://doi.org/10.1371/journal.pone.0256223.t001

8 Conclusion

This paper proposes a novel private multi-linked dynamic index construction for efficient retrieval of encrypted documents with forward privacy guarantees. The multi-linked index is constructed using probabilistic homomorphic encryption and secret orthogonal vectors. Experimental evaluations on Enron dataset proves that our construction achieves optimal search and update cost. Pindex supports parallelism using the multi-link structure. Thus, the server can distribute the load to its available p processors with search and update cost as and respectively. Forward privacy guarantees is achieved using probabilistic algorithms for search and update token generation. Further, the client and server storage is reduced to . Security analysis of Pindex using random oracle models proves the privacy and correctness of search and update operations. Our work Pindex can be used on critical cloud infrastructre for secure and efficient encrypted document retrieval with forward privacy guarantees. As a future work, Pindex can be adapted to provide verifiablilty and backward privacy.

References

1. Kamara S, Lauter K. Cryptographic Cloud Storage. In: Financial Cryptography and Data Security. Springer Berlin Heidelberg; 2010. p. 136–149. Available from: https://doi.org/10.1007%2F978-3-642-14992-4_13.
2. Bösch C, Hartel P, Jonker W, Peter A. A Survey of Provably Secure Searchable Encryption. ACM Computing Surveys. 2015;47(2):1–51.
- View Article
- Google Scholar
3. Song DX, Wagner D, Perrig A. Practical techniques for searches on encrypted data. In: Proceeding 2000 IEEE Symposium on Security and Privacy. S&P 2000. IEEE Comput. Soc;. Available from: https://doi.org/10.1109%2Fsecpri.2000.848445.
4. Goh EJ, et al. Secure indexes. IACR Cryptol ePrint Arch. 2003;2003:216.
- View Article
- Google Scholar
5. Curtmola R, Garay J, Kamara S, Ostrovsky R. Searchable symmetric encryption: Improved definitions and efficient constructions. Journal of Computer Security. 2011;19(5):895–934.
- View Article
- Google Scholar
6. Boneh D, Waters B. Conjunctive, Subset, and Range Queries on Encrypted Data. In: Theory of Cryptography. Springer Berlin Heidelberg;. p. 535–554. Available from: https://doi.org/10.1007%2F978-3-540-70936-7_29.
7. Hahn F, Kerschbaum F. Searchable Encryption with Secure and Efficient Updates. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2014. Available from: https://doi.org/10.1145%2F2660267.2660297.
8. Kamara S, Papamanthou C, Roeder T. Dynamic searchable symmetric encryption. In: Proceedings of the 2012 ACM conference on Computer and communications security—CCS’12. ACM Press; 2012. Available from: https://doi.org/10.1145%2F2382196.2382298.
9. Kamara S, Papamanthou C. Parallel and Dynamic Searchable Symmetric Encryption. In: Financial Cryptography and Data Security. Springer Berlin Heidelberg; 2013. p. 258–274. Available from: https://doi.org/10.1007%2F978-3-642-39884-1_22.
10. Cash D, Jaeger J, Jarecki S, Jutla C, Krawczyk H, Roşu MC, et al. Dynamic Searchable Encryption in Very-Large Databases: Data Structures and Implementation. In: Proceedings 2014 Network and Distributed System Security Symposium. Internet Society; 2014. Available from: https://doi.org/10.14722%2Fndss.2014.23264.
11. Naveed M, Prabhakaran M, Gunter CA. Dynamic Searchable Encryption via Blind Storage. In: 2014 IEEE Symposium on Security and Privacy. IEEE; 2014. Available from: https://doi.org/10.1109%2Fsp.2014.47.
12. Stefanov E, Papamanthou C, Shi E. Practical Dynamic Searchable Encryption with Small Leakage. In: Proceedings 2014 Network and Distributed System Security Symposium. Internet Society; 2014. Available from: https://doi.org/10.14722%2Fndss.2014.23298.
13. Zhang Y, Katz J, Papamanthou C. All your queries are belong to us: The power of file-injection attacks on searchable encryption. In: 25th {USENIX} Security Symposium ({USENIX} Security 16); 2016. p. 707–720.
14. Elizabeth BL, Prakash AJ. Verifiable top-k searchable encryption for cloud data. Sādhanā. 2019;45(1).
- View Article
- Google Scholar
15. Bost R, Minaud B, Ohrimenko O. Forward and Backward Private Searchable Encryption from Constrained Cryptographic Primitives. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2017. Available from: https://doi.org/10.1145%2F3133956.3133980.
16. Paillier P. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In: Advances in Cryptology—EUROCRYPT’99. Springer Berlin Heidelberg;. p. 223–238. Available from: https://doi.org/10.1007%2F3-540-48910-x_16.
17. Agaian SS. Construction of generalized Hadamard matrices. In: Lecture Notes in Mathematics. Springer Berlin Heidelberg; 1985. p. 103–133. Available from: https://doi.org/10.1007%2Fbfb0101076.
18. Casazza PG. THE ART OF FRAME THEORY. Taiwanese Journal of Mathematics. 2000;4(2).
- View Article
- Google Scholar
19. Bellare M, Rogaway P. Random oracles are practical. In: Proceedings of the 1st ACM conference on Computer and communications security—CCS’93. ACM Press; 1993. Available from: https://doi.org/10.1145%2F168588.168596.
20. Cohen WW. Enron email dataset. 2009.
21. Cao N, Wang C, Li M, Ren K, Lou W. Privacy-Preserving Multi-Keyword Ranked Search over Encrypted Cloud Data. 2014;25(1):222–233.
- View Article
- Google Scholar
22. Pham H, Woodworth J, Salehi MA. Survey on secure search over encrypted data on the cloud. Concurrency and Computation: Practice and Experience. 2019; p. e5284.
- View Article
- Google Scholar
23. Liu G, Yang G, Bai S, Zhou Q, Dai H. FSSE: An Effective Fuzzy Semantic Searchable Encryption Scheme Over Encrypted Cloud Data. IEEE Access. 2020;8:71893–71906.
- View Article
- Google Scholar
24. Chang YC, Mitzenmacher M. Privacy Preserving Keyword Searches on Remote Encrypted Data. In: Applied Cryptography and Network Security. Springer Berlin Heidelberg; 2005. p. 442–455. Available from: https://doi.org/10.1007%2F11496137_30.
25. Amanatidis G, Boldyreva A, O’Neill A. Provably-Secure Schemes for Basic Query Support in Outsourced Databases. In: Data and Applications Security XXI. Springer Berlin Heidelberg; 2007. p. 14–30. Available from: https://doi.org/10.1007%2F978-3-540-73538-0_2.
26. Chase M, Kamara S. Structured Encryption and Controlled Disclosure. In: Advances in Cryptology—ASIACRYPT 2010. Springer Berlin Heidelberg; 2010. p. 577–594. Available from: https://doi.org/10.1007%2F978-3-642-17373-8_33.
27. Golle P, Staddon J, Waters B. Secure Conjunctive Keyword Search over Encrypted Data. In: Applied Cryptography and Network Security. Springer Berlin Heidelberg; 2004. p. 31–45. Available from: https://doi.org/10.1007%2F978-3-540-24852-1_3.
28. Liu Q, Wang G, Wu J. An Efficient Privacy Preserving Keyword Search Scheme in Cloud Computing. In: 2009 International Conference on Computational Science and Engineering. IEEE; 2009. Available from: https://doi.org/10.1109%2Fcse.2009.66.
29. Li H, Yang Y, Dai Y, Yu S, Xiang Y. Achieving Secure and Efficient Dynamic Searchable Symmetric Encryption over Medical Cloud Data. IEEE Transactions on Cloud Computing. 2020;8(2):484–494.
- View Article
- Google Scholar
30. Zuo C, Sun SF, Liu JK, Shao J, Pieprzyk J. Dynamic Searchable Symmetric Encryption Schemes Supporting Range Queries with Forward (and Backward) Security. In: Computer Security. Springer International Publishing; 2018. p. 228–246. Available from: https://doi.org/10.1007%2F978-3-319-98989-1_12.
31. Xu L, Xu C, Liu JK, Zuo C, Zhang P. Building a dynamic searchable encrypted medical database for multi-client. Information Sciences. 2020;527:394–405.
- View Article
- Google Scholar
32. Bost R. Forward Secure Searchable Encryption. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2016. Available from: https://doi.org/10.1145%2F2976749.2978303.
33. Kim KS, Kim M, Lee D, Park JH, Kim WH. Forward Secure Dynamic Searchable Symmetric Encryption with Efficient Updates. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2017. Available from: https://doi.org/10.1145%2F3133956.3133970.
34. Sun SF, Yuan X, Liu JK, Steinfeld R, Sakzad A, Vo V, et al. Practical Backward-Secure Searchable Encryption from Symmetric Puncturable Encryption; 2018. Available from: https://doi.org/10.1145%2F3243734.3243782.
35. Etemad M, Küpçü A, Papamanthou C, Evans D. Efficient Dynamic Searchable Encryption with Forward Privacy. Proceedings on Privacy Enhancing Technologies. 2018;2018(1):5–20.
- View Article
- Google Scholar
36. Sun SF, Steinfeld R, Lai S, Yuan X, Sakzad A, Liu J, et al. Practical Non-Interactive Searchable Encryption with Forward and Backward Privacy. In: Proceedings 2021 Network and Distributed System Security Symposium. Internet Society; 2021. Available from: https://doi.org/10.14722%2Fndss.2021.24162.
37. Wei Y, Lv S, Guo X, Liu Z, Huang Y, Li B. FSSE: Forward secure searchable encryption with keyed-block chains. Information Sciences. 2019;500:113–126.
- View Article
- Google Scholar
38. Li J, Huang Y, Wei Y, Lv S, Liu Z, Dong C, et al. Searchable Symmetric Encryption with Forward Search Privacy. IEEE Transactions on Dependable and Secure Computing. 2021;18(1):460–474.
- View Article
- Google Scholar

[ref1] 1. Kamara S, Lauter K. Cryptographic Cloud Storage. In: Financial Cryptography and Data Security. Springer Berlin Heidelberg; 2010. p. 136–149. Available from: https://doi.org/10.1007%2F978-3-642-14992-4_13.

[ref2] 2. Bösch C, Hartel P, Jonker W, Peter A. A Survey of Provably Secure Searchable Encryption. ACM Computing Surveys. 2015;47(2):1–51.
View Article
Google Scholar

[3] View Article

[4] Google Scholar

[ref3] 3. Song DX, Wagner D, Perrig A. Practical techniques for searches on encrypted data. In: Proceeding 2000 IEEE Symposium on Security and Privacy. S&P 2000. IEEE Comput. Soc;. Available from: https://doi.org/10.1109%2Fsecpri.2000.848445.

[ref4] 4. Goh EJ, et al. Secure indexes. IACR Cryptol ePrint Arch. 2003;2003:216.
View Article
Google Scholar

[7] View Article

[8] Google Scholar

[ref5] 5. Curtmola R, Garay J, Kamara S, Ostrovsky R. Searchable symmetric encryption: Improved definitions and efficient constructions. Journal of Computer Security. 2011;19(5):895–934.
View Article
Google Scholar

[10] View Article

[11] Google Scholar

[ref6] 6. Boneh D, Waters B. Conjunctive, Subset, and Range Queries on Encrypted Data. In: Theory of Cryptography. Springer Berlin Heidelberg;. p. 535–554. Available from: https://doi.org/10.1007%2F978-3-540-70936-7_29.

[ref7] 7. Hahn F, Kerschbaum F. Searchable Encryption with Secure and Efficient Updates. In: Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2014. Available from: https://doi.org/10.1145%2F2660267.2660297.

[ref8] 8. Kamara S, Papamanthou C, Roeder T. Dynamic searchable symmetric encryption. In: Proceedings of the 2012 ACM conference on Computer and communications security—CCS’12. ACM Press; 2012. Available from: https://doi.org/10.1145%2F2382196.2382298.

[ref9] 9. Kamara S, Papamanthou C. Parallel and Dynamic Searchable Symmetric Encryption. In: Financial Cryptography and Data Security. Springer Berlin Heidelberg; 2013. p. 258–274. Available from: https://doi.org/10.1007%2F978-3-642-39884-1_22.

[ref10] 10. Cash D, Jaeger J, Jarecki S, Jutla C, Krawczyk H, Roşu MC, et al. Dynamic Searchable Encryption in Very-Large Databases: Data Structures and Implementation. In: Proceedings 2014 Network and Distributed System Security Symposium. Internet Society; 2014. Available from: https://doi.org/10.14722%2Fndss.2014.23264.

[ref11] 11. Naveed M, Prabhakaran M, Gunter CA. Dynamic Searchable Encryption via Blind Storage. In: 2014 IEEE Symposium on Security and Privacy. IEEE; 2014. Available from: https://doi.org/10.1109%2Fsp.2014.47.

[ref12] 12. Stefanov E, Papamanthou C, Shi E. Practical Dynamic Searchable Encryption with Small Leakage. In: Proceedings 2014 Network and Distributed System Security Symposium. Internet Society; 2014. Available from: https://doi.org/10.14722%2Fndss.2014.23298.

[ref13] 13. Zhang Y, Katz J, Papamanthou C. All your queries are belong to us: The power of file-injection attacks on searchable encryption. In: 25th {USENIX} Security Symposium ({USENIX} Security 16); 2016. p. 707–720.

[ref14] 14. Elizabeth BL, Prakash AJ. Verifiable top-k searchable encryption for cloud data. Sādhanā. 2019;45(1).
View Article
Google Scholar

[21] View Article

[22] Google Scholar

[ref15] 15. Bost R, Minaud B, Ohrimenko O. Forward and Backward Private Searchable Encryption from Constrained Cryptographic Primitives. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2017. Available from: https://doi.org/10.1145%2F3133956.3133980.

[ref16] 16. Paillier P. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In: Advances in Cryptology—EUROCRYPT’99. Springer Berlin Heidelberg;. p. 223–238. Available from: https://doi.org/10.1007%2F3-540-48910-x_16.

[ref17] 17. Agaian SS. Construction of generalized Hadamard matrices. In: Lecture Notes in Mathematics. Springer Berlin Heidelberg; 1985. p. 103–133. Available from: https://doi.org/10.1007%2Fbfb0101076.

[ref18] 18. Casazza PG. THE ART OF FRAME THEORY. Taiwanese Journal of Mathematics. 2000;4(2).
View Article
Google Scholar

[27] View Article

[28] Google Scholar

[ref19] 19. Bellare M, Rogaway P. Random oracles are practical. In: Proceedings of the 1st ACM conference on Computer and communications security—CCS’93. ACM Press; 1993. Available from: https://doi.org/10.1145%2F168588.168596.

[ref20] 20. Cohen WW. Enron email dataset. 2009.

[ref21] 21. Cao N, Wang C, Li M, Ren K, Lou W. Privacy-Preserving Multi-Keyword Ranked Search over Encrypted Cloud Data. 2014;25(1):222–233.
View Article
Google Scholar

[32] View Article

[33] Google Scholar

[ref22] 22. Pham H, Woodworth J, Salehi MA. Survey on secure search over encrypted data on the cloud. Concurrency and Computation: Practice and Experience. 2019; p. e5284.
View Article
Google Scholar

[35] View Article

[36] Google Scholar

[ref23] 23. Liu G, Yang G, Bai S, Zhou Q, Dai H. FSSE: An Effective Fuzzy Semantic Searchable Encryption Scheme Over Encrypted Cloud Data. IEEE Access. 2020;8:71893–71906.
View Article
Google Scholar

[38] View Article

[39] Google Scholar

[ref24] 24. Chang YC, Mitzenmacher M. Privacy Preserving Keyword Searches on Remote Encrypted Data. In: Applied Cryptography and Network Security. Springer Berlin Heidelberg; 2005. p. 442–455. Available from: https://doi.org/10.1007%2F11496137_30.

[ref25] 25. Amanatidis G, Boldyreva A, O’Neill A. Provably-Secure Schemes for Basic Query Support in Outsourced Databases. In: Data and Applications Security XXI. Springer Berlin Heidelberg; 2007. p. 14–30. Available from: https://doi.org/10.1007%2F978-3-540-73538-0_2.

[ref26] 26. Chase M, Kamara S. Structured Encryption and Controlled Disclosure. In: Advances in Cryptology—ASIACRYPT 2010. Springer Berlin Heidelberg; 2010. p. 577–594. Available from: https://doi.org/10.1007%2F978-3-642-17373-8_33.

[ref27] 27. Golle P, Staddon J, Waters B. Secure Conjunctive Keyword Search over Encrypted Data. In: Applied Cryptography and Network Security. Springer Berlin Heidelberg; 2004. p. 31–45. Available from: https://doi.org/10.1007%2F978-3-540-24852-1_3.

[ref28] 28. Liu Q, Wang G, Wu J. An Efficient Privacy Preserving Keyword Search Scheme in Cloud Computing. In: 2009 International Conference on Computational Science and Engineering. IEEE; 2009. Available from: https://doi.org/10.1109%2Fcse.2009.66.

[ref29] 29. Li H, Yang Y, Dai Y, Yu S, Xiang Y. Achieving Secure and Efficient Dynamic Searchable Symmetric Encryption over Medical Cloud Data. IEEE Transactions on Cloud Computing. 2020;8(2):484–494.
View Article
Google Scholar

[46] View Article

[47] Google Scholar

[ref30] 30. Zuo C, Sun SF, Liu JK, Shao J, Pieprzyk J. Dynamic Searchable Symmetric Encryption Schemes Supporting Range Queries with Forward (and Backward) Security. In: Computer Security. Springer International Publishing; 2018. p. 228–246. Available from: https://doi.org/10.1007%2F978-3-319-98989-1_12.

[ref31] 31. Xu L, Xu C, Liu JK, Zuo C, Zhang P. Building a dynamic searchable encrypted medical database for multi-client. Information Sciences. 2020;527:394–405.
View Article
Google Scholar

[50] View Article

[51] Google Scholar

[ref32] 32. Bost R. Forward Secure Searchable Encryption. In: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2016. Available from: https://doi.org/10.1145%2F2976749.2978303.

[ref33] 33. Kim KS, Kim M, Lee D, Park JH, Kim WH. Forward Secure Dynamic Searchable Symmetric Encryption with Efficient Updates. In: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM; 2017. Available from: https://doi.org/10.1145%2F3133956.3133970.

[ref34] 34. Sun SF, Yuan X, Liu JK, Steinfeld R, Sakzad A, Vo V, et al. Practical Backward-Secure Searchable Encryption from Symmetric Puncturable Encryption; 2018. Available from: https://doi.org/10.1145%2F3243734.3243782.

[ref35] 35. Etemad M, Küpçü A, Papamanthou C, Evans D. Efficient Dynamic Searchable Encryption with Forward Privacy. Proceedings on Privacy Enhancing Technologies. 2018;2018(1):5–20.
View Article
Google Scholar

[56] View Article

[57] Google Scholar

[ref36] 36. Sun SF, Steinfeld R, Lai S, Yuan X, Sakzad A, Liu J, et al. Practical Non-Interactive Searchable Encryption with Forward and Backward Privacy. In: Proceedings 2021 Network and Distributed System Security Symposium. Internet Society; 2021. Available from: https://doi.org/10.14722%2Fndss.2021.24162.

[ref37] 37. Wei Y, Lv S, Guo X, Liu Z, Huang Y, Li B. FSSE: Forward secure searchable encryption with keyed-block chains. Information Sciences. 2019;500:113–126.
View Article
Google Scholar

[60] View Article

[61] Google Scholar

[ref38] 38. Li J, Huang Y, Wei Y, Lv S, Liu Z, Dong C, et al. Searchable Symmetric Encryption with Forward Search Privacy. IEEE Transactions on Dependable and Secure Computing. 2021;18(1):460–474.
View Article
Google Scholar

[63] View Article

[64] Google Scholar

Figures

Abstract

1 Introduction

2 Model

2.1 Notations and system model

2.2 Security definitions

2.3 Homomorphic and orthogonality

3 Construction

3.1 Multi-linked hash map construction

3.2 Algorithm K ← KGen(1λ)

3.3 Algorithm c ← Enc2(K, m, r)

3.4 Algorithm m = Dec2(sk, c)

3.5 Algorithm (γ, Γ) ← BuildIndex(K, δ, V, D, Enc2)

3.6 Algorithm τs ← SrchToken(wq, pk, V)

3.7 Algorithm Fq ← Search(γ, τq)

3.8 Algorithm τq ← UpdToken(pk, f, u, w)

3.9 Algorithm γ′ ← Update(γ, τu, f1, f2)

4 Security analysis

5 Performance analysis

5.1 Multi-link index

5.2 Search token generation

5.3 Search

5.4 Update

6 Related works

6.1 Searchable encryption

6.2 Secure index

6.3 Dynamic searchable encryption

6.4 Forward privacy

7 Discussion

8 Conclusion

References

3.2 Algorithm K ← KGen(1^λ)

3.3 Algorithm c ← Enc₂(K, m, r)

3.4 Algorithm m = Dec₂(sk, c)

3.5 Algorithm (γ, Γ) ← BuildIndex(K, δ, V, D, Enc₂)

3.6 Algorithm τ_s ← SrchToken(w_q, pk, V)

3.7 Algorithm F_q ← Search(γ, τ_q)

3.8 Algorithm τ_q ← UpdToken(pk, f, u, w)

3.9 Algorithm γ′ ← Update(γ, τ_u, f₁, f₂)