Boldizsár's programming blog

How to remove sensitive data from your winston logs?

April 15, 2019 | 12 Minute Read

In this article I'm going to show you how you can remove sensitive information from winston logs.

You’ve written your application, it’s done and ready to ship, so what happens next? Your job is not over yet, not even close. Now comes the monitoring, debugging and troubleshooting part, when you get to see how your masterpiece behaves in the real world. We need tools that give us feedback about the performance and behavior of our code, and one of the most common is logging the different kinds of events that happen inside the app.

In the JavaScript world, the most common way to log is the good old console object. In the browser it gives us access to the debugging console, and in Node.js the console module writes to stdout and stderr. But let’s be honest, it lacks many capabilities that would be very useful. This is where the winston logging library comes into the picture. With winston you can do a lot more: set up different transports, define how different log levels are stored and output, create separate loggers for different purposes, create custom output formats, and so on. This last feature is the one we’re going to dive into today.

Sometimes we log information that is private and confidential, or that simply should not be seen by every employee. Chances are you have already seen data in a log that didn’t belong there. What can we do? We can check the input of every log call, but that gets tedious: if we log a request body object, for example, it can contain things we never thought about while developing, such as a password or other credentials. We might log something without being aware that it could contain sensitive data. To guard against this, we can leverage winston’s custom output formats. In this article I’m going to show you how I removed sensitive data from our logs so that it can’t be misused by anyone who has access to them.
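The first option, checking what goes into each log call, can be sketched as an allow-list helper. This is only an illustration: pickFields and the field names are hypothetical, and nothing here is part of winston.

```javascript
// Hypothetical helper: copy only explicitly allowed fields into a new object.
const pickFields = (source, allowed) => {
  const picked = {};
  for (const key of allowed) {
    if (Object.prototype.hasOwnProperty.call(source, key)) {
      picked[key] = source[key];
    }
  }
  return picked;
};

// Example request body, as a registration endpoint might receive it.
const body = { username: 'test', password: 'secret', name: 'John Doe' };

// Only these fields would be passed on to the logger.
const safeToLog = pickFields(body, ['username', 'name']);
console.log(JSON.stringify(safeToLog));
```

The drawback is that every log call needs its own list of fields, which is why the rest of this article filters at the logger level instead.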

Let’s create a Node.js app so you can follow along. Create a folder and cd into it, then initialize an npm project.

mkdir winstonexample && cd winstonexample
npm init -y

We’re going to make a very basic server first, with two endpoints. The first is /register, where the user sends us their username, password and name. We receive this data in the request body and, as you can see, it contains sensitive information. The other endpoint is /orders, where the user can fetch their orders. We assume the app uses JWT tokens for authentication, so we expect some sensitive data in the headers too.

Let’s install the required dependencies:

npm install express body-parser winston is-plain-object is-empty --save

First we're going to set winston up and create a basic logger. Create a logger.js file.

touch logger.js

Open it with your favourite editor and put this code there.

const winston = require('winston');

const logger = winston.createLogger({
  transports: [
    new winston.transports.Console()
  ]
});

module.exports = logger;

Then create an index.js file and open it up. Copy this code which sets up a simple server:

const express = require('express');
const bodyParser = require('body-parser');
const logger = require('./logger');
const app = express();

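// Parse URL-encoded request bodies (the format curl -d sends by default).
app.use(bodyParser.urlencoded({ extended: false }));

// Error-handling middleware; Express recognizes it by its four arguments.
app.use((err, req, res, next) => {
  logger.error(err);
  res.status(500).send('Internal server error');
});

const port = 4000;

app.listen(port, () => {
  logger.info(`Server listening on ${port}...`);
});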

Start the server.

node index.js

And if all is good we should see this log message.

{"message":"Server listening on 4000...","level":"info"}

Now let’s add the register endpoint. Nothing special here: the data comes in the body, we log that a new user has just registered, and we return a 200 with a descriptive message. Let’s put this code right below the body parser line.

app.post('/register', (req, res) => {
    const { body } = req;
    logger.info('User registered', body);
    res.status(200).send('Successful registration');
})

After restarting the server we can try our app using cURL (or any other tool like Postman).

curl -d 'username=test' -d 'password=secret' -d 'name=John Doe' -X POST http://localhost:4000/register

This is the log we should be seeing now.

{"username":"test","password":"secret","name":"John Doe","level":"info","message":"User registered"}

We can see that the password got logged, and anyone who has access to the logs could take advantage of this. Let’s create the /orders endpoint so the user can see their made-up orders. Paste this code below the POST endpoint.

app.get('/orders', (req, res) => {
    const { headers } = req;
    const authHeader = req.get('Authorization');
    if (!authHeader) return res.status(401).end();
    const token = authHeader.replace('Bearer ', '');
    logger.info(headers);
    if (token === 'bigsecret') res.send([{id: 1, name: 'Order1'}, {id: 2, name: 'Order2'}]);
    else res.status(401).end();
})

I’m using the Bearer authorization format here and let’s say bigsecret is a valid token. If we log the headers here, we log the user’s token which can lead to misuse. Let’s hit this endpoint too after restarting our server.

curl -X GET -H "Authorization: Bearer bigsecret" http://localhost:4000/orders

The expected log message now is the line below which contains our token.

{"message":{"host":"localhost:4000","user-agent":"curl/7.54.0","accept":"*/*","authorization":"Bearer bigsecret"},"level":"info"}

So far, we’ve seen two examples of how logging request-related data also logs private data. Let’s now look at how we can customize the output format of the logs. Go back to the logger.js file and add some new lines. Using the printf method of the format object we can define the output format ourselves. I’m printing the level followed by the stringified message, with any extra metadata placed under a data key. It’s still very similar to the default format. This is how logger.js should look now.

const winston = require('winston');
const isEmpty = require('is-empty');

const { createLogger, format, transports } = winston;

const { combine, printf } = format;

const logger = createLogger({
  transports: [
    new transports.Console()
  ],
  format: combine(printf(({ level, message, ...rest }) => {
    const log = { message };
    if (!isEmpty(rest)) log.data = rest;
    return `[${level}]: ${JSON.stringify(log)}`;
  }))
});

module.exports = logger;
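To see what the template function passed to printf produces without starting the server, the same logic can be run on hand-written info objects. This is a sketch: formatLine is a hypothetical name, and Object.keys(rest).length stands in for the is-empty check.

```javascript
// Same template logic as the printf callback, as a standalone function.
const formatLine = ({ level, message, ...rest }) => {
  const log = { message };
  // Attach the extra metadata only when there is any.
  if (Object.keys(rest).length > 0) log.data = rest;
  return `[${level}]: ${JSON.stringify(log)}`;
};

// A plain message, and a message with one metadata field.
console.log(formatLine({ level: 'info', message: 'Server listening on 4000...' }));
console.log(formatLine({ level: 'info', message: 'User registered', username: 'test' }));
```

The first call prints only the message, while the second one moves username under the data key.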

Now we can implement a function that recursively checks the properties of this log object and masks the value of every key that we think can contain sensitive information. For now, our app has two such keys: Authorization and password. Let’s store them in an array and create a function that does the replacement for us. It needs a dependency, so we have to require it at the beginning of the file. This code should go before the createLogger call.

const winston = require('winston');
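// Assumes is-plain-object v3 or earlier, which exports the function directly.
// From v4 on it is a named export: const { isPlainObject: isObject } = require('is-plain-object');
const isObject = require('is-plain-object');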
const isEmpty = require('is-empty');

const { createLogger, format, transports } = winston;

const { combine, printf } = format;

const excludedKeys = ['password', 'Authorization'];
const deepRegexReplace = (value, keys) => {
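  // Nothing to redact: return the value unchanged (returning {} would turn undefined properties into empty objects).
  if (typeof value === 'undefined' || typeof keys === 'undefined') return value;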

  if (Array.isArray(value)) {
    for (let i = 0; i < value.length; i = i + 1) {
      value[i] = deepRegexReplace(value[i], keys);
    }
    return value;
  }

  if (!isObject(value)) {
    return value;
  }

  if (typeof keys === 'string') {
    keys = [keys];
  }

  if (!Array.isArray(keys)) {
    return value;
  }

  for (let j = 0; j < keys.length; j++) {
    for (let key in value) {
      if (value.hasOwnProperty(key)) {
        if (new RegExp(keys[j],'i').test(key)) value[key] = '[REDACTED]';
      }
    }
  }

  for (let key in value) {
    if (value.hasOwnProperty(key)) {
      value[key] = deepRegexReplace(value[key], keys);
    }
  }

  return value;
};

As you can see, I go through the properties of the object. If it’s an array, I call the function for each member. When a key matches one of our excluded keys, I replace its value, so I still know the key was there but its value can no longer be seen. Let’s call this function before stringifying the object. The printf function will first mask all sensitive data and only then return the stringified log.

return `[${level}]: ${JSON.stringify(deepRegexReplace(log, excludedKeys))}`;
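Note that deepRegexReplace modifies the object it receives in place. If an object passed to the logger is used again afterwards (request headers, for instance), its masked fields may stay masked there too. Below is a minimal sketch of a variant that returns a redacted copy instead. redactCopy and isPlain are hypothetical names, and isPlain is a simplified stand-in for is-plain-object.

```javascript
// Simplified plain-object check (stand-in for the is-plain-object package).
const isPlain = (v) => {
  if (v === null || typeof v !== 'object') return false;
  const proto = Object.getPrototypeOf(v);
  return proto === Object.prototype || proto === null;
};

// Hypothetical non-mutating variant: builds a redacted copy, leaves the input untouched.
const redactCopy = (value, keys) => {
  if (Array.isArray(value)) {
    return value.map((item) => redactCopy(item, keys));
  }
  // Primitives and non-plain objects (Date, Error, ...) are returned as they are.
  if (!isPlain(value)) {
    return value;
  }
  const patterns = keys.map((k) => new RegExp(k, 'i'));
  const copy = {};
  for (const [key, inner] of Object.entries(value)) {
    copy[key] = patterns.some((p) => p.test(key))
      ? '[REDACTED]'
      : redactCopy(inner, keys);
  }
  return copy;
};

const headers = { host: 'localhost:4000', authorization: 'Bearer bigsecret' };
const redacted = redactCopy(headers, ['password', 'Authorization']);

console.log(JSON.stringify(redacted)); // the copy has the token masked
console.log(headers.authorization);    // the original object still holds the token
```

The trade-off is an extra allocation per log call, which is usually acceptable for request-sized objects.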

If we call our endpoints again, we should see that the password and our very secret token are gone. We could customize our winston logger further, but that should be another blog post. The log we get when we call the /register endpoint looks like this. We can see that the password key’s value got replaced. Try the other endpoint too and see what happens.

[info]: {"message":"User registered","data":{"username":"test","password":"[REDACTED]","name":"John Doe"}}
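One thing to keep in mind: because each excluded key is turned into a regular expression and checked with test, it matches any property name that contains it, not only exact names. The sketch below illustrates this with made-up property names and shows an anchored pattern as one possible way to match whole names only. Both helper names are hypothetical.

```javascript
// The check used in deepRegexReplace: case-insensitive, matches anywhere in the name.
const matchesAnywhere = (pattern, key) => new RegExp(pattern, 'i').test(key);

// Possible stricter variant (an assumption, not from the original code):
// anchor the pattern so it has to match the whole property name.
const matchesWhole = (pattern, key) => new RegExp(`^(?:${pattern})$`, 'i').test(key);

// Made-up property names for illustration.
for (const key of ['password', 'Password', 'passwordHint', 'authorization']) {
  console.log(key, matchesAnywhere('password', key), matchesWhole('password', key));
}
```

Whether the broader match is a bug or a feature depends on the data: masking passwordHint as well is often the safer default for logs.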

You can find the full source code on my GitHub.

This is the end of my first blog post. I hope you’ve found it useful. If so, leave a comment and subscribe to my newsletter so you’ll be notified when I post a new article.