A Simple TypeScript Implementation of Private Information Retrieval

TLDR: I learnt about Apple’s new Live Caller ID lookup feature last month, and I built out a simple implementation of that in TypeScript called sharded-seal-lookup. This post explains how the entire system works, its limitations and possible future work. The main purpose of this post is to document what I’ve learnt in the last few days.

Background

Apple has a reputation for preserving user-privacy and collecting only as much information as absolutely necessary, but this often limits what users can achieve on the platform. Android users have been able to use applications like Truecaller to identify spam calls since forever, but that’s not been possible until very recently with iOS.

On Android, the OS calls a lookup service, providing it the mobile number of the caller, and it receives a reply indicating if the call is spam or not.

sequenceDiagram OS->>Third party server: mobile number Third party server->>OS: spam or not response

This design however leaks the caller mobile number to this third-party service, and it’s easy to imagine why Apple perceives this as undesirable. Is there a way however to encrypt the mobile number, pass it on to a third party service, and still receive the spam indicator back?

sequenceDiagram OS->>Third party server: encrypted hash Third party server->>OS: spam or not response

The simplest implementation is obviously to send the entire spam caller database back to the client.

sequenceDiagram OS->>Third party server: database request Third party server->>OS: entire database response OS->>OS: perform lookups locally

This has privacy, latency and bandwidth implications. Remember that the OS has to wait until it gets this information back to decide whether to ring the call or perhaps block the number as spam. And besides, no commercial third-party service would want to part with their spam database. So in essence, we need a system where:

  • The server should not know what the client is looking up. But it should still be able to send back the correct information.
  • Clients shouldn’t receive more information than what they looked up. This is also particularly important if the stored information is sensitive, like for e.g. names or company information.
  • The lookup operation should be perceived as instant (i.e. <100ms).

How about PKI or shared secrets?

Usually in scenarios like this, engineers often turn to public-key encryption or shared secrets, but it doesn’t work in this instance for a few reasons. Let’s look at an example scenario below:

Let’s say the client (the OS) bootstraps a PKI keypair and sends the public key to the server (the 3rd party service). The server pre-encrypts all of its keys (the mobile numbers) with this public key.

sequenceDiagram client->>client: creates a keypair client->>server: sends public key server->>server: creates a separate database with encrypted keys and values.

Then in future lookup requests, the client looks up information with this encrypted key.

sequenceDiagram client->>client: encrypts mobile number with public key client->>server: encrypted mobile number server->>server: looks up encrypted mobile number server->>client: returns encrypted value. client->>client: decrypts value with secret key.

Since the server maintains a separate encrypted database that doesn’t link to its unencrypted one, it can return a response without knowing what mobile number was queried. But you can immediately figure out the problems:

  • The server will need to maintain a separate database for each client. This means maintaining millions (possibly billions) of instances of the same key.
  • And more importantly, the client still has to trust the server to maintain a separate encrypted database. In the first bootstrapping step, a malicious server can simply maintain this mapping and in future lookup requests, still understand what mobile number was being queried.

So then we have a couple more requirements:

  • Clients should consider servers as untrusted, and protect against malicious servers.
  • Lookups should not result in a lot of data duplication at the server end.

Enter Homomorphic Encryption

Homomorphic encryption provides primitives that can help here. In addition to having properties similar to PKI (where a public key encrypts and only the secret key can decrypt), homomorphic ciphers can also be manipulated in-situ by untrusted entities. Let’s take an example to illustrate this.

Imagine we are building a service that squares a number, i.e. it receives a number n, and returns n × n. However, we have one strange requirement: the server should never know what n is, but it should always return the correct squared answer.

First, a client generates a homomorphic keypair. It also shares the parameters of the homomorphic environment with the server. We’ll get into this later, but you can think about this as essentially configuration or metadata info that helps the server replicate the client’s homomorphic parameters.

sequenceDiagram client->>client: creates a keypair client->>server: sends public key & homomorphic configuration

The client then encrypts n with its secret key and sends it to the server.

sequenceDiagram client->>client: encrypts n with public key client->>server: encrypted n

The server performs a multiplication operation of the cipher with itself. Homomorphic ciphers have this interesting property that they can have logical mathematical operations performed on it without decrypting it. The last bit is key: remember that the server never even decrypts the value, but it can still perform mathematical operations on it!

sequenceDiagram client->>server: encrypted n server->>server: homomorphically multiplies n with itself. server->>client: returns encrypted value.

And once the server returns the response (still encrypted), the client can decrypt it with the secret key to learn the squared n × n value.

sequenceDiagram client->>client: decrypts value with secret key.

Now isn’t this interesting? Almost seems like magic, but that’s what I thought about PKI too when I heard about it first, but now I take it for granted. We can think of homomorphic encryption as a primitive that enables even untrusted servers to perform operations on our ciphers and return a value that is still encrypted.

Homomorphic Encryption in TypeScript

Let’s take a look at some code to figure out how to create a homomorphic encryption environment, create some keypairs, perform a homomorphic operation, and finally, decrypt the result. Just a note that the code below uses a library called node-seal that is a simple wrapper around one of the gold standard HE libraries called SEAL by Microsoft Research.

First, let’s set up the homomorphic environment:

/**
 * Sets various required homomorphic encryption constants.
 *
 * @returns The encryption constants for the database
 */
function getEncryptionConstants() {
  const polyModulusDegree = 4096;
  const securityLevel = seal.SecurityLevel.tc256;
  const plainModulusBitSize = 20;
  return { polyModulusDegree, securityLevel, plainModulusBitSize };
}
/**
 * Constructs a SEAL context with the encryption parameters.
 *
 * @returns The SEAL context for the encryption parameters
 */
function getSealContext() {
  const { polyModulusDegree, securityLevel, plainModulusBitSize } =
    getEncryptionConstants();
  // Set encryption parameters
  const schemeType = seal.SchemeType.bfv;
  const parms = seal.EncryptionParameters(schemeType);
  parms.setPolyModulusDegree(polyModulusDegree); // Optimized for shard size
  parms.setCoeffModulus(
    seal.CoeffModulus.BFVDefault(polyModulusDegree, securityLevel)
  );
  parms.setPlainModulus(
    seal.PlainModulus.Batching(polyModulusDegree, plainModulusBitSize)
  );
  const context = seal.Context(parms, true, seal.SecurityLevel.tc128);
  if (!context.parametersSet()) {
    throw new Error("Invalid encryption parameters.");
  }
  return context;
}

The three encryption constants have a bit of magical number flavor, and from what I’ve been able to understand:

  • polyModulusDegree is the most crucial one. It directly affects the size of the ciphertext. That is, if you set it to 4096, the ciphertext has 4096 bytes. This means that the larger the polyModulusDegree, the more information you can store in the cipher. But the larger the polyModulusDegree, the slower the homomorphic operations on it are. So striking a balance between data storage and computational performance is key. And finally, just like PKI, a higher polyModulusDegree also means greater security in terms of key strength.
  • securityLevel is a bit of an abstraction for security strength, and the default value is tc128. The SEAL codebase says that normal users should not tweak this.
  • Homomorphic encryption also facilitates logical mathematics between encrypted ciphers and plain text vectors. i.e. an encrypted number 2 can be multiplied with a plain text number 3 to get an encrypted cipher that decrypts to 6. plainModulusBitSize is just the largest plain text number that can be stored in the encryption scheme.

So that’s it, once we have the magic numbers, the rest of the code should be self explanatory, it’s just boiler plate for getting a context object that we can reuse throughout the rest of the code. The schemeType is the algorithm used for homomorphic encryption. Just like PKI has RSA or ECC, homomorphic has several schemes too, with differing tradeoffs and even some supporting more homomorphic operations.

Note that there’s also another bit to Step 1 which is when the client generates the keypair. That’s super simple, and the code should be self explanatory:

/**
 * Generate public and secret keys. This should be done once per client instance.
 *
 * @returns The generated public and secret keys.
 */
function generateKeys() {
  const keyGenerator = seal.KeyGenerator(context);
  const publicKey = keyGenerator.createPublicKey();
  const secretKey = keyGenerator.secretKey();
  return { publicKey, secretKey };
}

Step 2 is sharing these encryption parameters between the client and the server. In a simple project like this, we can just assume that the function getSealContext can be accessed by both the server and the client.

Step 3 is when the client encrypts the number n to send to the server:

function createEncryptedQuery(n: number) {
  const batchEncoder = seal.BatchEncoder(context);
  const encryptor = seal.Encryptor(context, publicKey);
  // batchEncoder.slotCount is the same as polyModulusDegree
  // i.e. we have lots of space.
  const query = new Uint32Array(batchEncoder.slotCount);
  // We only have a single number to store, so we use the first byte.
  query[0] = n; 
  const plainQuery = seal.PlainText();
  batchEncoder.encode(query, plainQuery);
  const encryptedQuery = seal.CipherText();
  encryptor.encrypt(plainQuery, encryptedQuery);
  return encryptedQuery;
}

This should be reasonably straightforward as well. Things to note:

  • In a single cipher we have a lot of space, at least polyModulusDegree bytes, for this use case it is overkill.
  • We have to encode before we encrypt, and decode before we decrypt. For this, SEAL provides a convenient batchEncoder.

Once we have encryptedQuery, Step 4 is processing it at the server end:

function processQuery(encryptedQuery: CipherText) {
  const evaluator = seal.Evaluator(context);
  const result = seal.CipherText();
  // multiply the cipher by itself and store in result
  evaluator.multiply(encryptedQuery, encryptedQuery, result);
  return result;
}

This is probably the simplest to understand. The cool magic trick here is that we multiply two encrypted numbers (the server has no clue what the number is), and return the result encrypted. Note that we multiply all bytes of the 4096 byte long Ciphertext vector by itself, and it’s only the first byte that is populated by n, our number, but our server is clueless about all of this.

The client however, knows about this, and that’s how in Step 5, we finally decrypt and get the result of n × n:

function decryptResult(encryptedResult: Cipher) {
  const batchEncoder = seal.BatchEncoder(context);
  const decryptor = seal.Decryptor(context, secretKey);
  const decryptedResult = seal.PlainText();
  decryptor.decrypt(encryptedResult, decryptedResult);
  const decodedResult = batchEncoder.decode(decryptedResult);
  // We just want the first byte.
  const result = decodedResult.slice(0, 1);
  return result;
}

This should be self-explanatory.

That’s how homomorphic encryption looks like in code. Can we use this to perform information lookups?

Modeling Private Information Retrieval

The whole idea that I described above in the background section is formally called Private Information Retrieval, i.e. the idea that clients can lookup information without the server even knowing what information has been looked up. Note that there is another aspect to PIR which is also obfuscating identity, i.e. the server doesn’t even know who is looking up the information, but we’ll not talk about that here1.

Also an additional note that the system being designed here is for academic purposes: to learn and understand how PIR works in a homomorphic context. In no way should this be replicated directly in a production environment. I’ll speak later about limitations of this simple system and my current understanding of how to create robust implementations in the final section.

Let’s go back to the example from the background section: a client wants to lookup a mobile number from the server spam database satisfying PIR. To recap, these are the fundamental properties that we want from the system, and I’ll mark these with separate REQ strings:

  • REQ1: The server should not know what the client is looking up. But it should still be able to send back the correct information.
  • REQ2: Clients shouldn’t receive more information than what they looked up.
  • REQ3: The lookup operation should be perceived as instant (i.e. <100ms).
  • REQ4: Clients should consider servers as untrusted, and protect against malicious servers.
  • REQ5: Lookups should not result in a lot of data duplication at the server end.

Let’s also recap what we learnt about homomorphic encryption:

  • We have a primitive where the server can perform mathematical operations on integer vectors without the server knowing the contents of the cipher. This should satisfy REQ1.
  • Since the server has mathematically no way of knowing the ciphertext, clients just have to do response validation to ensure that the response is in a valid and expected format. This should satisfy REQ4.
  • Since homomorphic operations allow homomorphic operations between a ciphertext sent by the client and a plaintext vector stored server-side, no additional data prep is necessary per client at the server-end. This should satisfy REQ5.

For REQ3, the relevant constraint here is the time taken to perform a homomorphic operation. This depends on the CPUs used obviously, but it’s in the order of milliseconds, and will definitely satisfy our perception constraint of <100ms.

From a relevant paper that describes a more advanced SEAL PIR.

And finally, we have REQ2. To ensure that clients don’t receive more information than requested, we have to design the data structures that the client sends and the server stores mobile numbers in intelligently. Let’s do that now.

Mobile numbers (or more technically MSISDNs) can be as much as 15 digits long. For the sake of simplicity, let’s say they are only 3 digits long for the purpose of this thought experiment. We’ll also ignore MSISDN validation in the examples below.

We’ll store these mobile numbers the same way we’ll store them in a bloom filter, except that the hashing function we’ll use is the identity function, i.e. we’ll directly map mobile numbers to vector bit positions:

  • i.e. the mobile number 001 will have the second bit set in the bit array (the first bit being the mobile number 000).
  • the mobile number 005 will have the sixth bit set in the array, and the number 100 will have the 101th bit set, and so on.

Expanding this to a set of mobile numbers, let’s say that the set of mobile numbers that are marked as spam are 001, 005 and 100. To represent this set on the server, we need at least a 101 bit long bit array, with the bits in the position 2, 6 and 101 set to 1 and all others set to 0. Here’s a diagram to make this clearer.

flowchart TD id1(0) id2(1) id3(0) id4(0) id5(0) id6(1) id7(ooo) id100(1)

From the perspective of a client, when we query for a mobile number, we’ll create a vector array that has the 1 bit set at the mobile number’s position, and 0s otherwise. i.e. if we want to query for 005, we’ll set the 1 bit at position 6 and 0 out the other 100 bits:

flowchart TD id1(0) id2(0) id3(0) id4(0) id5(0) id6(1) id7(ooo) id100(0)

When the server receives the query, it does a homomorphic multiplication of both these vector arrays, and everything except the mobile number being queried gets zeroed out. This is best understood visually:

input cipher representation (the same as above):

flowchart TD id1(0) id2(0) id3(0) id4(0) id5(0) id6(1) id7(ooo) id100(0)

data at server end (again, the same as the first diagram):

flowchart TD id1(0) id2(1) id3(0) id4(0) id5(0) id6(1) id7(ooo) id100(1)

and when we multiply these vectors in order, we get:

flowchart TD id1(0) id2(0) id3(0) id4(0) id5(0) id6(1) id7(ooo) id100(0)

Since this is a homomorphic operation, the server still doesn’t know what the cipher looks like or what mobile number was queried, but once it passes the response back to the client, the client can decrypt the response and just do a simple check:

  • If there’s a 1 bit set in the response, the number is spam.
  • If there isn’t, the number is not.

Because the only information the client receives is a single 1 or 0 bit for the information it queried, REQ2 can be considered satisfied. Of course, a malicious client can always get more data than what is considered fair by either setting the entire request bits to 1 or by sending in multiple requests, but we can mitigate that by our next strategy: sharding.2

Sharding

Let’s now make this a bit more realistic by having 10 digit mobile numbers. Why 10 and not the entire range of 15? Well, a homomorphic database like what we’ve modeled here is almost infinitely shardable, and most country mobile numbers are 10 digits or lower, so sharding by country makes a lot of sense. We also have to understand the upper limits of our homomorphic vectors, as I mentioned before, vectors are at most polyModulusDegree long, and so if that’s set to 4096, we can have at most 4096 mobile numbers in a vector. Again, sharding to the rescue: we can pretty deterministically find both the index of a shard, and the index in a shard using simple modulo arithmetic:

// We set the shardSize to a small value here so that at most 20
// mobile numbers are stored together 
// (even in our vector space of 4096).
const shardSize = 20;
function shardIndexOf(mobileNumber: number) {
  return Math.floor(mobileNumber / shardSize);
}
function indexInShardOf(mobileNumber: number) {
  return mobileNumber % shardSize;
}

Just a note that when we split the database into shards like this, the shard has to be passed in along with the encrypted query (since the server has no knowledge of the mobile number). This means that a server can at most guess that a mobile number queried could be in a particular shard, and nothing else can be guessed. If the shard size is large, this is pretty much a non-issue3. processQuery function is then modified as below to take a shardIndex too:

function processShardQuery(shardIndex: number, encryptedQuery: CipherText) {
  const evaluator = seal.Evaluator(context);
  const result = seal.CipherText();
  // Server data is stored unencrypted, so we can multiply it directly
  evaluator.multiplyPlain(encryptedQuery, shardAsPlainText(shardIndex), result);
  return result;
}
/**
 * Returns a shard as a Uint8Array.
 *
 * @param shardIndex The index of the shard to return.
 * @returns The shard value as a seal PlainText structure.
 */
function shardAsPlainText(shardIndex) {
  let shard = getShard(shardIndex);
  const batchEncoder = seal.BatchEncoder(context);
  const result = seal.PlainText();
  batchEncoder.encode(Uint32Array.from(shard), result);
  batchEncoder.delete();
  return result;
}

sharded-seal-lookup: A sample project

I’ve found that the best way to learn something is to write some code and really try out how it works, so I have an example project on Github called sharded-seal-lookup that demonstrates the ideas above. It’s meant to be run with bun, and is pretty easy to get started:

bun i
EXPLAIN=1 bun run test.ts

I have tried to document the code extensively too, start from test.ts, and make your way to all other files.

Limitations & Future Work

As I mentioned before, this code is definitely academic and shouldn’t be used as a base for a production-grade system4, and it definitely has limitations:

  • The most obvious limitation is that currently the lookup only works for presence and absence of a key on the server. That works well with a spam database that only needs to return binary information, but how about a name lookup database for mobile numbers? Or having to return larger data like perhaps a profile picture? If I have the time, I’ll post more about this in a Part 2, but this brings about interesting problems like encryption of data at rest, server trust and efficient hashing systems to store such data in a sharded system.
  • The other big limitation is mathematical. I’ve just scratched the surface of the wonderful world that is homomorphic encryption. Learning more about it can open eyes to more efficient ways to both store and manipulate such data on the server. For example, Galois keys, relinearization and noise budgets, evaluation keys, and so on. There’s a lot to learn!

For jumping in further to both these topics, I recommend these resources:

  • First, this Apple post gives a good overview of the possibilities unlocked by homomorphic encryption.
  • Then, take a look at how Apple has implemented homomorphic encryption in Swift.
  • There is also a related library that builds on top of homomorphic encryption to implement live caller ID lookup.
  • There is an excellent SEAL-based PIR implementation called SealPIR, and it’s got pretty simple to read code. Understanding the mathematics behind this is more challenging, but a good start is the linked paper.

That’s it! Do try out these and let me know what you think. Cheers!


  1. For folks interested in learning more, read up on Oblivious HTTP ↩︎
  2. And of course, rate limiting, but that’s a well understood problem. ↩︎
  3. Except when the server stores multiple pieces of information and a timing attack can make this intersection much smaller. Apple has an interesting mitigation strategy for this here. ↩︎
  4. I keep repeating this, because so many prototypes end up in production 🙂 ↩︎

Discover more from Vishnu Gopal

Subscribe to get the latest posts sent to your email.

Leave a Reply