Tuesday, July 22, 2014

Uploading files to a mongodb database without using express


Building functionality to upload a file to a Node.js server using express is a piece of cake. But for various reasons we sometimes do not want to use express. I had to implement file upload for a system that uses only pure Node.js. Here is my experience with it.

HTTP multipart request

HTTP is a text-based protocol: its request line and headers are plain text. A file, however, may contain binary patterns that are never found in simple text files. If such bytes are sent as though they were ordinary form text, any component that expects text may misbehave. The data could contain a byte pattern that is treated as a control signal, for example the end of transmission (EOT) character. Some components may reject bytes that are not valid text, and some may alter them. Either can corrupt the file.

To avoid such pitfalls, the HTTP multipart request is used. The body of a multipart request is formatted a little differently from that of a regular request. Most notably, the value of the Content-Type header is 'multipart/form-data', followed by a boundary parameter. The body can contain multiple parts, form fields as well as files, separated by that boundary. The data between boundaries is treated as opaque binary, so nothing needs to care what the bytes mean.

So when we upload a file to a server over the internet, what we actually do is no different from submitting a form with an HTTP POST request, except that the request body is encoded in a different way.
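To make the encoding concrete, here is a rough sketch that builds a multipart/form-data body by hand. The helper name `buildMultipartBody`, the boundary string and the sample field and file are all made up for illustration; a real user agent picks its own boundary and does this for you.

```javascript
// Build a multipart/form-data body by hand to show the wire format.
// The boundary is arbitrary; it only has to be absent from the payload.
function buildMultipartBody(boundary, fields, files) {
  var chunks = [];

  // Plain form fields: one part each, value sent as text.
  Object.keys(fields).forEach(function (name) {
    chunks.push(Buffer.from(
      '--' + boundary + '\r\n' +
      'Content-Disposition: form-data; name="' + name + '"\r\n\r\n' +
      fields[name] + '\r\n'
    ));
  });

  // Files: part headers as text, then the raw bytes untouched.
  files.forEach(function (file) {
    chunks.push(Buffer.from(
      '--' + boundary + '\r\n' +
      'Content-Disposition: form-data; name="' + file.field +
      '"; filename="' + file.filename + '"\r\n' +
      'Content-Type: ' + file.contentType + '\r\n\r\n'
    ));
    chunks.push(file.data);
    chunks.push(Buffer.from('\r\n'));
  });

  // Closing delimiter: the boundary followed by two dashes.
  chunks.push(Buffer.from('--' + boundary + '--\r\n'));
  return Buffer.concat(chunks);
}

var body = buildMultipartBody(
  '----demo-boundary',
  { title: 'My image' },
  [{
    field: 'upload',
    filename: 'my_image.png',
    contentType: 'image/png',
    data: Buffer.from([0x89, 0x50, 0x4e, 0x47]) // first bytes of a PNG
  }]
);

// latin1 keeps one character per byte, so the structure stays readable.
console.log(body.toString('latin1'));
```

The request carrying this body would also need the header Content-Type: multipart/form-data; boundary=----demo-boundary so that the server knows where each part starts and ends.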

However, the application programmer does not need to know the above, because the user agent she is programming against should know how to put together an HTTP multipart request. For example, the browser (a user agent) submits a multipart request when the following HTML form is submitted.


    <form action="/upload" enctype="multipart/form-data" method="post">
    <input type="text" name="title"><br>
    <input type="file" name="upload" multiple="multiple"><br>
    <input type="submit" value="Upload">
    </form>

Or on the Linux terminal

curl -v --include --form file=@my_image.png http://localhost:8080/upload

Server side

Just as the HTTP client encodes a multipart request for the application programmer, the server-side framework should decode it for her. As mentioned earlier, express does this without a hassle. But if express is not an option for you, if you are on pure Node.js, you might be a little confused. I was too, until I got to know about multiparty. This npm package takes in the request instance and gives you references to the files that were included in the request, saved to the temp directory on your disk, just as express would have.


var http = require('http');
var multiparty = require('multiparty');

http.createServer(function(req, res) {
  if (req.url === '/upload' && req.method === 'POST') {
    // parse a file upload
    var form = new multiparty.Form();

    form.parse(req, function(err, fields, files) {
      if (err) {
        res.writeHead(400, {'content-type': 'text/plain'});
        res.end("Upload failed: " + err.message);
        return;
      }
      // 'files' maps each field name to an array of the files saved on disk
      res.writeHead(200, {'content-type': 'text/plain'});
      res.end("File uploaded successfully!");
    });

    return;
  }

  // any other request
  res.writeHead(404);
  res.end();
}).listen(8080);

In the callback of the form.parse method it is possible to read the file in and save it to a database, rename it (move it) or do any other processing.
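As a rough sketch of the "rename it (move it)" option, the helper below moves every uploaded file out of the temp directory. It assumes the `files` argument has the shape multiparty passes to the callback (each field name mapped to an array of objects with `originalFilename` and `path`). The name `moveUploads` and the fake upload used to exercise it are made up for illustration.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Move every uploaded file from the temp directory into destDir.
// `files` looks like: { fieldName: [ { originalFilename, path }, ... ] }
function moveUploads(files, destDir, callback) {
  // Flatten the per-field arrays into one queue.
  var pending = [];
  Object.keys(files).forEach(function (field) {
    files[field].forEach(function (file) { pending.push(file); });
  });

  var moved = [];
  (function next() {
    var file = pending.shift();
    if (!file) return callback(null, moved);
    // basename() drops any directory segments in the client-supplied name.
    var target = path.join(destDir, path.basename(file.originalFilename));
    fs.rename(file.path, target, function (err) {
      if (err) return callback(err);
      moved.push(target);
      next();
    });
  })();
}

// Exercise the helper with a fake `files` object instead of a real upload.
var tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'upload-'));
var storeDir = fs.mkdtempSync(path.join(os.tmpdir(), 'stored-'));
var tmpFile = path.join(tmpDir, 'abc123.png');
fs.writeFileSync(tmpFile, Buffer.from([0x89, 0x50, 0x4e, 0x47]));

var movedFiles = null;
moveUploads(
  { upload: [{ originalFilename: 'my_image.png', path: tmpFile }] },
  storeDir,
  function (err, moved) {
    if (err) throw err;
    movedFiles = moved;
    console.log('moved to ' + moved[0]);
  }
);
```

In a real server, moveUploads(files, someDirectory, ...) would be called inside the form.parse callback before responding. Note that fs.rename only works when both paths are on the same file system.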

Processing the request

But if we are gonna save the file in the mongodb database, why save it to the disk first? Turns out we don't have to.

The form instance created by multiparty's Form constructor has 'part' and 'close' events to which handlers can be attached. The 'part' event is triggered once for each part (a file or a field) included in the multipart request. 'close' is triggered once all the parts have been read.

The handler of the 'part' event is passed a Node.js Readable stream, just like the request instance in a Node.js http server. So it has 'data' and 'end' events (among others) that can be used to read in the file, chunk by chunk.


form.on('part', function(part) {
    console.log('got file named ' + part.filename);
    var bufs = [];
    part.on('data', function(d){ bufs.push(d); }); // each chunk is a Buffer
    part.on('end', function(){
      var data = Buffer.concat(bufs);
      //data variable has the file now. It can be saved in the mongodb database.
    });
  });
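The reading pattern can be tried out without multiparty or a real upload. In the sketch below a PassThrough stream stands in for the part, and `collectStream` is a hypothetical helper name, not something multiparty provides.

```javascript
var stream = require('stream');

// Collect every chunk of a readable stream into a single Buffer.
function collectStream(readable, callback) {
  var bufs = [];
  readable.on('data', function (chunk) { bufs.push(chunk); });
  readable.on('error', callback);
  readable.on('end', function () {
    callback(null, Buffer.concat(bufs));
  });
}

// Stand-in for a multiparty part: a PassThrough fed two binary chunks.
var fakePart = new stream.PassThrough();
var collected = null;

collectStream(fakePart, function (err, data) {
  if (err) throw err;
  collected = data;
  console.log('read ' + data.length + ' bytes');
});

fakePart.write(Buffer.from([0xff, 0xd8]));
fakePart.end(Buffer.from([0xff, 0xe0]));
```

Keeping the chunks as Buffers avoids converting binary data to a string and back. Inside a 'part' handler the same helper would be called as collectStream(part, ...).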

The handler of the 'close' event can be used to respond to the client.


  form.on('close', function() {
    res.writeHead(200, {'content-type': 'text/plain'});
    res.end("File uploaded successfully!");
  });

The complete code would look like this.


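  var http = require('http');
  var multiparty = require('multiparty');

  http.createServer(function(request, response) {
    if (request.url !== '/upload' || request.method !== 'POST') {
      response.writeHead(404);
      return response.end();
    }

    var form = new multiparty.Form();
    var attachments = [];

    form.on('part', function(part) {
      if (!part.filename) { //not a file but a field
        console.log('got field named ' + part.name);
        part.resume(); //discard the field's data so that parsing can continue
        return;
      }

      console.log('got file named ' + part.filename);
      var bufs = [];
      part.on('data', function(d){ bufs.push(d); }); //each chunk is a Buffer
      part.on('end', function(){
        //The Buffer has the file now. It can be saved in the mongodb database.
        attachments.push({ filename: part.filename, data: Buffer.concat(bufs) });
      });
    });

    form.on('error', function(err) {
      response.writeHead(400, {'content-type': 'text/plain'});
      response.end("Upload failed: " + err.message);
    });

    form.on('close', function() {
      response.writeHead(200, {'content-type': 'text/plain'});
      response.end("File uploaded successfully!");
    });

    form.parse(request);
  }).listen(8080);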

Multiparty saves the files to the disk when the form.parse method is given a callback (or when its autoFiles option is enabled), so in the above case it does not. Instead, the files are expected to be processed in the event handlers of the form instance.

Saving to MongoDB

Saving the data to the mongodb database can be done using GridStore. This part is not included in this post since it is straightforward. Further, this step is the same whether we use express or not, and I want this post to be specific to the case of pure Node.js.

Thanks for checking this out!