On a better security token

An experiment in secure authentication tokens for zero-trust microservices.

Owen Kelly
Thursday, May 25, 2017

In case you haven’t heard, JSON Web Tokens (JWTs) are dead. In fact they’ve been dead for a while. According to some, they were dead on arrival. It’s hard to go a week now without someone saying you should stop using them.

But JWTs are so useful.

It’s very appealing to ignore this and continue to use JWTs. They’re built into just about everything now, and support for them is great. There’s just one catch. As Thomas H. Ptáček wrote in a Hacker News thread on this topic, “… it's unethical to build systems you know have security weaknesses.”

It’s getting impossible to ignore the problems with JWTs. I’m not going to rehash them here; suffice it to say that the lack of a defined scope and the extensive configuration options have made JWTs very dangerous to implement.

For most situations a JWT is simply too complex. It looks simple once explained, but there’s too much configurability lurking in the standard. You need to really think about your reasons for picking a JWT out of your toolbox over anything else.

Session tokens are still solid and reliable, when done correctly.

There is one specific use case that’s not served well by session tokens, and where JWTs really become useful: authentication in a system with zero-trust microservices. In this system we have an authentication service, one or more client services (front-end websites, mobile clients, etc.) and multiple internal services. To do this right, we shouldn’t implicitly trust a request to an internal service; after all, you can’t know for sure there isn’t a bad actor on your internal network. So we need a method to provide authentication to the internal microservices.

A quick and dirty way to do this is to share a secret between all services and then create a JWT with that secret. If you also have the secret you can verify the token. Sounds good, right?

Well, not really. Sure, you can verify the token has not been tampered with, but you cannot guarantee that the token came from your authentication service. In fact, any service with the shared secret could create a token. Now, this does not mean that any bad actor on your network could suddenly start forging tokens; they don’t have the shared secret. It does mean that any exploit in your services that could expose the secret makes the whole system vulnerable.
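To make the problem concrete, here is a minimal sketch using Node’s built-in crypto module. It is not a JWT implementation; the `sign` and `verify` helpers, the token format and the secret value are assumptions made up for the illustration. It shows that a token minted by any holder of the shared secret verifies exactly like one from the authentication service.

```javascript
const crypto = require('node:crypto');

// Hypothetical secret shared by every service in the system.
const sharedSecret = 'shared-between-all-services';

// Encode a payload and append an HMAC-SHA256 tag.
function sign(payload, secret) {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64url');
  const tag = crypto.createHmac('sha256', secret).update(body).digest('base64url');
  return `${body}.${tag}`;
}

// Check the tag; returns the payload, or null if the token was altered.
function verify(token, secret) {
  const [body, tag] = token.split('.');
  if (!body || !tag) return null;
  const expected = crypto.createHmac('sha256', secret).update(body).digest();
  const actual = Buffer.from(tag, 'base64url');
  if (actual.length !== expected.length) return null;
  if (!crypto.timingSafeEqual(actual, expected)) return null;
  return JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
}

// The authentication service issues a token...
const genuine = sign({ uuid: 'user-1', roles: ['editor'] }, sharedSecret);
// ...but a compromised internal service holding the same secret can too.
const forged = sign({ uuid: 'user-1', roles: ['admin'] }, sharedSecret);

// Both verify; nothing distinguishes the forged token from the genuine one.
console.log(verify(genuine, sharedSecret));
console.log(verify(forged, sharedSecret));
```

The HMAC proves integrity only. Proving who created the token needs a key that only the authentication service holds.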

JWT does give us one more option though: asymmetric signing with public and private keys. You give the authentication service a private key, and all the other services the matching public key. Now they can both verify the token has not been tampered with, and verify that it was created by the authentication service.

But you still have all the issues with JWTs. So this actually isn’t an option.

What do we actually need?

Another thing to consider is what the token actually needs to contain. Every internal service needs to authenticate and authorise the request before acting on it. But we don’t want to send a request to the authentication service at the start of every request to an internal service; that’s excessively noisy and failure prone.

We need a token that can encapsulate the information required for authentication and authorisation. It should contain the information required for an internal service to verify who the request came from (authentication) and what actions they are allowed to perform (authorisation). So what do we need?

First we need a universally unique identifier (UUID). This should be unique to the user, and it should not change — otherwise we would need to contact the authentication service to find out who they are.

Next we need some way to encapsulate the user’s authorisation. In my opinion the most flexible and reliable method is using roles, and letting the internal service filter based on the role. In this setup, the authentication service stores the user’s roles on their record and adds them to the token. You would have an array containing strings such as admin, support, editor and so on. Then when evaluating the user in an internal service, you check if they have a role that matches, and only allow the action if they do. In this setup you do need coordination on the available roles, and you will want to keep the number of roles assigned to a user below 10.
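A role check of this kind can be very small. The sketch below is only an illustration; the helper name `hasAnyRole` and the example token are assumptions, not part of any library.

```javascript
// Return true when the token carries at least one of the allowed roles.
function hasAnyRole(token, allowedRoles) {
  if (!token || !Array.isArray(token.roles)) return false;
  return token.roles.some((role) => allowedRoles.includes(role));
}

// Example: an internal service that only lets admins and editors publish.
const exampleToken = { uuid: '132f4-134f-1324f', roles: ['editor'] };

console.log(hasAnyRole(exampleToken, ['admin', 'editor'])); // true
console.log(hasAnyRole(exampleToken, ['admin']));           // false
```

Each internal service keeps its own list of allowed roles per action, which is why the role names need to be coordinated across the system.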

The last thing we need to handle is expiry. It’s up to each internal service to decide whether or not a token has expired, otherwise it would need to call the authentication service on each request. So the best option we have here is to set consistent rules across the system and enforce them ruthlessly.
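As a sketch of what one such system-wide rule could look like, assume every token records when it was issued as a Unix timestamp in seconds (an `issued` field) and that every service agrees on a one-hour lifetime. Both the lifetime and the helper name `isExpired` are assumptions for the example.

```javascript
// System-wide rule (an assumed value): access tokens live for one hour.
const MAX_TOKEN_AGE_SECONDS = 60 * 60;

// `token.issued` is a Unix timestamp in seconds.
function isExpired(token, nowSeconds = Math.floor(Date.now() / 1000)) {
  if (!token || !Number.isInteger(token.issued)) return true; // malformed: reject
  if (token.issued > nowSeconds) return true; // issued in the future: reject
  return nowSeconds - token.issued > MAX_TOKEN_AGE_SECONDS;
}

console.log(isExpired({ issued: 1490139941 }, 1490139941 + 60));   // false
console.log(isExpired({ issued: 1490139941 }, 1490139941 + 7200)); // true
```

Because every service applies the same rule locally, no call to the authentication service is needed to decide whether a token is still valid.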

How do we do this?

In an earlier version of this idea, I explored having multiple types of tokens: one for refresh and one for access, for example. In reality, we only need to solve the access token problem, as the refresh token should be a random session token stored with the authentication service. Getting a new access token requires a request to the authentication service.

One other consideration is multi-tenancy, a common requirement of modern SaaS systems. Instead of deploying a system for each customer, you have one system that everyone shares. Our token needs the capability to store a UUID for the tenant the user belongs to. It’s also plausible a user could be part of more than one tenant, therefore we should store an array of tenant UUIDs. By now our token object looks like this:

```javascript
const token = {
  uuid: '132f4-134f-1324f',
  issued: 1490139941,
  roles: [
    'editor',
  ],
  tenant: [
    '6t7ihg-3f4v5tqw-w43ct5',
  ],
};
```

Hopefully your first thought when you see that is that we can’t possibly expose that information to the client. Realistically the client doesn’t need to see the contents of this token. The client implicitly trusts everything we send back to it, so if we needed to send any of this information to the client we could send it in addition to this token.

Quite often JWTs are used to send information to the client. In order to verify the contents of the JWT, you would also need to send a key down. Unless you’re doing this out of band on a different channel, the key you send has as much chance of being tampered with as the JWT itself. They cancel each other out. Further to that, if you’re using a shared key instead of an asymmetric key pair, you need to expose the key you use to sign your JWT in order to verify it.

The better option is for the client to not even be aware of the token, and instead work based on the results of the requests it makes. Say you want to load an authenticated route; well, just try it. If the token is stored as an HttpOnly, Secure cookie it will be sent along automatically by the browser. And the result of that request will inform the client if the user is logged in or not.
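For illustration, this is roughly what setting such a cookie could look like. The helper `buildTokenCookie`, the cookie name `token`, and the `SameSite`, `Path` and `Max-Age` attributes are assumptions added for the sketch; only `HttpOnly` and `Secure` come from the text above.

```javascript
// Build a Set-Cookie header value that keeps the token away from client-side scripts.
function buildTokenCookie(token, maxAgeSeconds) {
  return [
    `token=${encodeURIComponent(token)}`,
    'HttpOnly',        // not readable from JavaScript in the browser
    'Secure',          // only sent over HTTPS
    'SameSite=Strict', // assumption: not sent on cross-site requests
    'Path=/',
    `Max-Age=${maxAgeSeconds}`,
  ].join('; ');
}

console.log(buildTokenCookie('abc.def', 3600));
// token=abc.def; HttpOnly; Secure; SameSite=Strict; Path=/; Max-Age=3600
```

With Node’s built-in http module the value would be sent with `response.setHeader('Set-Cookie', buildTokenCookie(card, 3600))`.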

The client doesn’t need to read the token, but our internal services do. We also need to guarantee the token could only have been created by the authentication service.

This is where it starts to get a bit more complicated than just encrypting an object and sending that around. If you don’t need the guarantees above (which you don’t if you’re building a monolith), then you don’t need to add any extra complexity. But if you do, here’s how I’m thinking it can work.

First let’s evaluate what we need.

Our client doesn’t need to read the token, but we need to ensure the entire token is not tampered with. To meet this requirement we can HMAC the entire token, with a key shared among all the internal services.

Given the content we need in the token to avoid a request to the authentication service on every request, we need to encrypt the contents of the token. We can solve this simply and quickly using a symmetric encryption key, again shared with all internal services.

Finally, we need to guarantee the token has been created by the authentication service. For this, we use an asymmetric key pair to sign the contents of the token. We securely keep the private key on the authentication service, and share the public key with each service.
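The three requirements can be layered into a single token. What follows is a minimal sketch using Node’s built-in crypto module, not a vetted design: the algorithms (AES-256-GCM, Ed25519, HMAC-SHA256), the order of the layers (encrypt, then sign, then HMAC), the dot-separated format and the names `createToken` and `readToken` are all assumptions chosen for the illustration.

```javascript
const crypto = require('node:crypto');

// Throwaway keys for the sketch. In a real system the private key lives only
// on the authentication service; the other keys are shared with every service.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const symmetricKey = crypto.randomBytes(32); // AES-256 key
const hmacKey = crypto.randomBytes(32);

const b64 = (buffer) => buffer.toString('base64url');
const hmac = (text) => crypto.createHmac('sha256', hmacKey).update(text).digest();

// Authentication service: encrypt the payload, sign it, then HMAC everything.
function createToken(payload) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', symmetricKey, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(payload), 'utf8'),
    cipher.final(),
  ]);
  const sealed = [iv, cipher.getAuthTag(), ciphertext].map(b64).join('.');
  const signature = crypto.sign(null, Buffer.from(sealed), privateKey);
  const body = `${sealed}.${b64(signature)}`;
  return `${body}.${b64(hmac(body))}`;
}

// Internal service: check the HMAC, verify the signature, then decrypt.
function readToken(token) {
  const parts = token.split('.');
  if (parts.length !== 5) throw new Error('Malformed token');
  const [iv, tag, ciphertext, signature, mac] = parts.map((part) =>
    Buffer.from(part, 'base64url')
  );

  const expected = hmac(parts.slice(0, 4).join('.'));
  if (mac.length !== expected.length || !crypto.timingSafeEqual(mac, expected)) {
    throw new Error('Token has been tampered with');
  }

  const sealed = Buffer.from(parts.slice(0, 3).join('.'));
  if (!crypto.verify(null, sealed, publicKey, signature)) {
    throw new Error('Token was not created by the authentication service');
  }

  const decipher = crypto.createDecipheriv('aes-256-gcm', symmetricKey, iv);
  decipher.setAuthTag(tag);
  const plaintext = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  return JSON.parse(plaintext.toString('utf8'));
}

const sketchToken = createToken({ uuid: '132f4-134f-1324f', roles: ['editor'] });
console.log(readToken(sketchToken)); // the original payload
```

A service that holds only the public, symmetric and HMAC keys can read and check a token but cannot produce one that passes the signature check.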

Warden

For this to work reliably, we need a client library to handle this work for us, in the same manner that JWT libraries work. Where JWT libraries offer configuration, we should offer none. Configuration is one of the most dangerous aspects of the JWT specification. For example, the specification lets you set the signing algorithm to none.

I’ve wrapped this idea up into a project called Warden. The project is not yet at a 1.0.0 because I’m looking for feedback on the actual design and implementation. So while this project is complete and ready to be used, it has not yet been extensively validated for security.

In direct contrast to the JWT spec, encryption options are not configurable. Any change to them will be a major version change. Minor versions and patches will not change the encryption schemes used. Instead of having no safe defaults, you now only have safe defaults.

Warden is designed to solve the specific issue of authentication and authorisation in a zero-trust microservice system. It formalises the ideas explored above into a JavaScript module that you can use. The long term goal is to have the library available for other languages commonly used in microservice systems.

Warden has three classes, each with a specific function. They are: the Warden, used only on the authentication service to create new Cards; the Guard, used on every service to authenticate and authorise a user’s Card; and the Forge, used to create and rotate the keys.

The first two should be reasonably obvious, but the third solves a problem JWT left up to the user. With Warden, you don’t have a choice about key rotation. It’s part of the standard implementation and must not be ignored.

The Forge's main function is to create a key set.

This is the set of keys required for the Warden and the Guard to do their jobs.

The Forge creates two editions of a key set, one for the Warden and one for the Guards.

The only difference is the key set for the Guard does not contain the private key the Warden uses for signing the Card.

Here’s what the key sets look like:

```typescript
interface WardenKeySet {
  publicKey: string;
  privateKey: string;
  symmetric: string;
  hmac: string;
  expires: number;
}

interface GuardKeySet {
  publicKey: string;
  symmetric: string;
  hmac: string;
  expires: number;
}
```

Our key set comprises a public/private key pair, with the private key only available to the Warden. Then we have a key for the symmetric encryption, and another key for the HMAC hash. Finally we have the time the key expires.
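As a rough illustration of the two editions, the sketch below builds an object shaped like a Warden key set with Node’s built-in crypto module and derives the Guard edition by dropping the private key. The function names `createWardenKeySet` and `toGuardKeySet`, the Ed25519 key type, the key sizes, the hex encoding and the millisecond `expires` value are all assumptions for the example; the Forge may generate its keys differently.

```javascript
const crypto = require('node:crypto');

const DAY_MS = 24 * 60 * 60 * 1000;

// Build an object shaped like WardenKeySet. Algorithms and encodings are assumed.
function createWardenKeySet(now, validDays) {
  const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519', {
    publicKeyEncoding: { type: 'spki', format: 'pem' },
    privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
  });
  return {
    publicKey,
    privateKey,
    symmetric: crypto.randomBytes(32).toString('hex'),
    hmac: crypto.randomBytes(32).toString('hex'),
    expires: now + validDays * DAY_MS,
  };
}

// The Guard edition is the same key set without the private key.
function toGuardKeySet(wardenKeySet) {
  const { privateKey, ...guardKeySet } = wardenKeySet;
  return guardKeySet;
}

const exampleWardenKeySet = createWardenKeySet(Date.now(), 7);
const exampleGuardKeySet = toGuardKeySet(exampleWardenKeySet);
console.log(Object.keys(exampleGuardKeySet));
// [ 'publicKey', 'symmetric', 'hmac', 'expires' ]
```

Deriving the Guard edition from the Warden edition keeps the shared fields identical, so a Card created with one can always be checked with the other.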

Along with the library is an example Forge container. It manages two files: wardenKeySetCollection.json and guardKeySetCollection.json. These files should be mounted into shared volumes such that wardenKeySetCollection.json is available read-only to the Warden service, and guardKeySetCollection.json is available read-only to the Guard service. Both files need to be available read-write to the Forge service.

Now, you could just use the same set of keys for the life of your service. But at some point you should rotate them. You are rotating your keys, right? If you’re using a container service such as Docker or Kubernetes, you can share volumes into your containers with the keys, and let another service manage rotating them.

This is what Forge does. It maintains a collection of the key sets, and rotates them on a regular schedule. In order to properly rotate your keys, you need more than one key set available at any given time. The way we manage this is as follows:

  • The Forge regularly rotates the keys, by checking to see if any keys have expired. If they have, a new key set is generated and added to the collection.json files that are on the shared volumes.
  • The Warden always uses the newest key set when creating new Cards.
  • The Guard cycles through all the key sets when trying to confirm the HMAC, and uses the key set where the HMAC is valid.
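The three rules above can be sketched in a few functions. This is an illustration of the idea only, not Warden’s code: the function names, the rule for when a new key set is added, and the use of HMAC-SHA256 are assumptions, and key sets are reduced to the `hmac` and `expires` fields.

```javascript
const crypto = require('node:crypto');

// Forge: drop expired key sets and add a fresh one when any were dropped
// (or when the collection is empty). `createKeySet` is supplied by the caller.
function rotateKeySets(collection, now, createKeySet) {
  const live = collection.filter((keySet) => keySet.expires > now);
  if (live.length < collection.length || live.length === 0) {
    live.push(createKeySet(now));
  }
  return live;
}

// Warden: always use the newest key set, taken here as the one expiring last.
function newestKeySet(collection) {
  return collection.reduce((a, b) => (b.expires > a.expires ? b : a));
}

// Guard: try each key set and return the one whose HMAC key validates the
// card body, or null when none does. `mac` is a Buffer.
function findKeySetFor(body, mac, collection) {
  const match = collection.find((keySet) => {
    const expected = crypto.createHmac('sha256', keySet.hmac).update(body).digest();
    return mac.length === expected.length && crypto.timingSafeEqual(mac, expected);
  });
  return match ?? null;
}
```

Because the Guard accepts any key set still in the collection, Cards created just before a rotation remain valid until the key set they were created with expires.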

This method of key management also means you can invalidate all sessions by rotating all your keys, and it means the longest possible session is equal to the expiry of your keys. Currently this is set to 7 days.

This means you will need to set up a refresh token on login in addition to the access token Warden will generate for you. The refresh token should be stored with your authentication service and have its own expiry time. When you need a new access token, make a request with the refresh token; the authentication service should check that the refresh token is valid and has not been used yet, and then return a new refresh token and access token.
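A single-use refresh token can be sketched as follows. Warden does not provide this part; the in-memory `Map`, the function names and the `createAccessToken` callback are assumptions for the example, and a real authentication service would use its own persistent store.

```javascript
const crypto = require('node:crypto');

// In-memory store for the sketch: refresh token -> { uuid, expires }.
const refreshTokens = new Map();

// Issue a random refresh token for a user, valid for `ttlMs` milliseconds.
function issueRefreshToken(uuid, now, ttlMs) {
  const token = crypto.randomBytes(32).toString('hex');
  refreshTokens.set(token, { uuid, expires: now + ttlMs });
  return token;
}

// Exchange a refresh token for a new access token and a new refresh token.
// Returns null when the token is unknown, already used, or expired.
function refresh(oldToken, now, ttlMs, createAccessToken) {
  const record = refreshTokens.get(oldToken);
  refreshTokens.delete(oldToken); // single use, whether or not it was valid
  if (!record || record.expires <= now) return null;
  return {
    accessToken: createAccessToken(record.uuid),
    refreshToken: issueRefreshToken(record.uuid, now, ttlMs),
  };
}
```

Here `createAccessToken` stands in for whatever creates the access token, for example a call to the Warden.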

Alternatively, you can implement multi-factor authentication by returning a different token, much like the refresh token, that is used to request an MFA challenge, which only returns an access token upon success.

Usage

In practice you need the following: an authentication service, a Forge service, and a method to share the collection.json files with the authentication service and the other internal services. Both Docker volumes and Kubernetes secrets can do this.

The Forge manages your keys. It will generate and rotate keys for you. You implement this as a discrete service, ideally as its own container. The Forge manages two file shares, one for the Warden and one for the Guard. The keys are kept on these shares.

A Warden can create a Card; this is the bearer token you can pass around. You add the Warden to the service that provides authentication.

A Guard can validate and extract the contents of a Card. This is integrated into every service that needs to read a Card.

How to create a Card

```javascript
import { Warden } from 'warden';
import fsp from 'fs-promise';

// Path to the shared Warden keys folder (read-only for this service),
// here the directory used in the Forge example.
const wardenKeysFolder = '/srv/wardenKeys';

const wardenKeySetCollection = await fsp.readJson(
  `${wardenKeysFolder}/wardenKeySetCollection.json`
);

// Create a new warden
const warden = new Warden(wardenKeySetCollection);
const card = await warden.createCard({
  uuid: 'string',
  tenant: [
    'string',
  ],
  roles: [
    'admin',
    'editor',
  ],
  hoursUntilExpiry: 1,
});
console.log(card);
// card:
// ODJmY2ZiYTIyYzJhZjlmMjc2ZWNlZjhlY2QxNjIwN2ZkOWMzNWRkODBlOWY3MGJkM2EzYWM0MzQ2M
// zRhNTY0NjU3YjgzYzY1NWM2MmNjNmRmNGJlOGQ5NjA0YmRmY2JiMWZkZGRmN2QwMDc1M2RiZDkwZW
// Y5Y2IyY2MxZjQzNzBjZDI3ZDM3NDFhOGZlZjY1MGM3Yjk2ZDgyNjhhZTU3M2MzZGUzODQ2YjJmM2E
// 1OWUwZjUwZDNjOGU4MjcxNzBlZTZmYmM1YjkwZWMwOWRhNmVhZTZjMTE3ODI4YzhlZThiMWZjYWE4
// OThhNTc1MmYwYjYxMzU3NmYzMjlhZThjM2E3ZmEwNDg2MTc2YTJlYmY0OTljYTY5M2Q0ZDhlYzY4Z
// DZkZjUxZGY2NzkzYzdhODEzOTAzMmVjYTdlMTNiNjZkZjFjNTFjMDQ4NzdkZmY4YWFlZTQ5YmNkMj
// llOTJhOGEyMDVkNTdkMThjMWZjYWQyMDk2ZGRjMjZlNDc5MTViMWRjNWE4YWQ4MTEyN2E5M2I4MGM
// yYWJjM2Y0YzIyMGFjNzc1ZTY3OGJjOGVlOWQ2Y2JjY2NkMjVlOGI1OWYzNDIzNDM2OTNhYjRmYTYy
// NzgwMTU1NDEwZDY2YjA3NWQwNDY3NWE3YWM1ZGU0NTllODBhNTJjZDczZGM2N2E3MGUxOGNmNWE4O
// GUyODFiZWVlY2U5MTg3ZmRiNzYyYjk3YjhkZmNmYzVjZDI2YTJlOTMxMDkxOWQ1NWIwZTEwNjhiNG
// M5MzkwNzhhNzgzMjI0NTdlNzQ3NDAzZDE4ZDgwZTFiNGY4N2I0M2M1ZWFlZjVkNmJlNGM5ZmIyNjA
// 1MThkYTRhOWJiZTFjY2YxY2E3MTM2OGEwNzk2ZjYzNTQzZTg4YTZhMDE5OTI0NDFiMmU1NjVmMTAw
// M2FlZDc1ZDI0YjBhNjk3MWM0NTJhNzMzNTlkMTRhZDdmZTYyMjg0MDRkZjhkZGUzMzBkZWM5NDQzM
// WU0YTBlNzMzOTYxZTIzNWY0ODg4ODhjZDEwNzZmMWNjZDg4ZTY5Y2FmMGEyZGZlMmUwMWVjLjBiNT
// lhODcxM2JhODI5MWUwM2UzZTg5ZDhlMWJiMWM5LjdiMjVjMTMyNTk4NDZiY2YuODJmNTM5NGQ1MTF
// iMWQxYzQ0Y2Q5ZDBlZTU4NGEzNTIxZDViNzE3ZTFmMWJhMDlhMzY3MjNkYzhlOWFkOTJmMA==
```

How to use a card with a Guard

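```javascript
import { Guard } from 'warden';
import fsp from 'fs-promise';

// Path to the shared Guard keys folder (read-only for this service),
// here the directory used in the Forge example.
const guardKeysFolder = '/srv/guardKeys';

const guardKeySetCollection = await fsp.readJson(
  `${guardKeysFolder}/guardKeySetCollection.json`
);

// Create a new guard (constructor assumed to mirror the Warden's)
const guard = new Guard(guardKeySetCollection);

try {
  // `card` is the Card string received with the incoming request
  const checkedCard = await guard.checkCard(card);
  console.log(checkedCard);
} catch (err) {
  // Card is invalid
}

// checkedCard:
// {
//   uuid: '523b519b-cb8b-4fd5-8a46-ff4bab206fad',
//   roles: [ 'engineer', 'onCall' ],
//   expires: 1490232381669,
//   tenant: [ '48d2d67d-2452-4828-8ad4-cda87679fc91' ]
// }
```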

The key sets are managed by the Forge class. This class looks after key rotation and the initial generation. You should implement this as a completely separate container, either running with an internal cron job, or set to spin up every hour.

Inside the container you need to have two file mounts, one for the Warden’s keys and one for the Guards’ keys. Pass the location of these to the Forge constructor. The Forge will load up the files, check if any keys have expired, and rotate the collection if they have.

How to check and rotate the keys with Forge

```javascript
import { Forge } from "warden";

const forge = new Forge({
  // The path to the directory where the Warden keys are stored.
  // This should be a mount shared with the Wardens.
  // Forge needs read-write access, Wardens must have read-only
  wardenKeySetDirectory: "/srv/wardenKeys",

  // The path to the directory where the Guards keys are stored.
  // This should be a mount shared with the Guards.
  // Forge needs read-write access, Guards must have read-only
  guardKeySetDirectory: "/srv/guardKeys",

  // Optionally you can set the maximum number of key sets to
  // keep on rotation. When a keyset expires it is replaced with
  // a new one.
  maxKeySetsValid: 3,

  // Optionally you can set the maximum number of days a keyset
  // is valid for.
  // This must be greater or equal to the maximum time a Card
  // is valid for, because once the key a Card was created with
  // is rotated, that Card becomes invalid.
  maxKeySetValidDays: 5,
});

// This will check to see if any keys exist
// Then it will either create new ones, or cycle through
// and replace expired ones.
try {
  await forge.rotateKeys();
} catch (err) {
  console.warn(err);
  // If this fails you should crash the process and try again
  process.exit(1);
}
```

Postscript

After follow-up discussions I've had on this topic, I feel it's important to add the following note.

In general, this approach is likely overkill. More often than not, it's probably simpler to use some kind of authentication proxy. Essentially something like session tokens.

I still think this is an interesting project, and the code is worth exploring. But I don't have a practical application for it anymore.


— OK