Sunny.js, a Cloud Library for Node.js
Node.js provides a great environment for cloud applications and development. Node’s asynchronous design and great HTTP/REST support provide the building blocks for architecting non-blocking and scalable applications in the cloud.
After reviewing these libraries, I found a few features variously lacking that I would like in an ideal cloud datastore client:
- Event-based interface.
- “One-shot” requests wherever possible.
- Able to create / delete arbitrary containers (buckets).
- Configurable request headers, cloud options and metadata.
- SSL support.
With these goals in mind, I put together a basic multi-cloud datastore client called Sunny.js. Sunny initially supports Amazon Web Services’ Simple Storage Service (S3) and Google Storage for Developers (version 1). Sunny hits all the points above, and also has:
- A really well-documented API and user guide. (At least I think so!)
- Full unit tests for all cloud operations and providers.
Here are some basic resources to get started with Sunny, which we’ll dive into a little deeper in the subsequent sections of this post:
- Sunny.js Documentation: Guides and API documentation.
- Sunny.js GitHub Page: Source repository and issue tracker.
- Sunny.js NPM Page: NPM page and history.
First, install sunny from npm:
From here, we can create our Node program, and get a live cloud datastore connection as follows:
Now that we have our cloud connection object (conn), we can perform all of our cloud operations on containers and blobs. For those used to Amazon Web Services parlance, a Sunny container is equivalent to an S3 “bucket”, and a Sunny blob is an S3 “key”.
Sunny cloud operations are asynchronous and event-based. The cloud methods return either a request or a stream object, on which you set listeners and callbacks before calling end() to start the underlying real cloud network operation.
Sunny request objects are not true Node request objects, but approximate a subset of a Node HTTP request object’s events. The most common request events are:
- “error”: The underlying real cloud operation failed.
- “end”: The operation finished and we have results.
- “data”: A GET request returned a data chunk.
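The "set listeners first, then call end()" pattern described above can be sketched with Node's built-in EventEmitter. This is only an illustration of the idea, not Sunny's implementation: FakeRequest and the operation function passed to it are hypothetical stand-ins, and no real cloud call is made.

```javascript
const { EventEmitter } = require("events");

// Hypothetical stand-in for a cloud request object (NOT Sunny's class).
// Nothing happens until end() is called.
class FakeRequest extends EventEmitter {
  constructor(operation) {
    super();
    this.operation = operation; // function(callback) that does the "network" work
    this.started = false;
  }

  end() {
    if (this.started) { return; } // start the operation only once
    this.started = true;
    // Defer the work so results never arrive before end() returns.
    setImmediate(() => {
      this.operation((err, results) => {
        if (err) {
          this.emit("error", err);
        } else {
          this.emit("end", results);
        }
      });
    });
  }
}

// Usage: attach listeners, then start the operation with end().
const request = new FakeRequest((callback) => {
  callback(null, { container: { name: "sunny" } }); // pretend cloud result
});
request.on("error", (err) => console.error("Failed:", err.message));
request.on("end", (results) => console.log("Got container:", results.container.name));
request.end();
```

Deferring the work inside end() is a design choice in this sketch: it guarantees that "end" or "error" is never emitted synchronously, so the ordering of events is the same whether the underlying operation is fast or slow.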
Let’s perform a basic asynchronous operation to get a container named “sunny” from our cloud store.
which gives us the output:
Breaking the code down, we call getContainer with the container name and the option validate: true. The validate option means we actually perform a cloud request to check that the container exists before continuing. This is a good check to start with before launching into other code. However, Sunny lets the programmer skip validation and defer any network operation until the first blob request, which is faster.
The getContainer method returns a request object, on which we then set our listeners for “error” and “end” (when we have results). Finally, request.end() initiates the actual network operation.
Our “end” callback takes a results parameter, which contains a container object for further use with blob methods, along with information from the actual cloud call (metadata, HTTP headers, etc.). See the getContainer documentation for more options, results and uses.
The other request-based Sunny methods work in a similar fashion, and include the following:
- Connection: Get a list of containers. Get/create/delete a single container.
- Container: Get/create/delete a single container. Get a list of blobs. Head/delete a single blob. Get/put blob to/from a file.
- Blob: Head/delete a single blob. Get/put blob to/from a file.
See the API for full method details.
Sunny returns stream objects for data-based cloud operations (PUT or GET), which are real implementations of Node streams. A GET blob method returns a Readable Stream object, and a PUT blob method returns a Writable Stream object.
Let’s take the container object we received from our successful getContainer request above, and perform a PUT blob operation with a simple string. (Note: programmatically, this would be within the “end” callback of the get container request.) Data can be written with any number of write() calls and a single end() call (which starts the data transfer and ignores all subsequent writes).
Now, in a later operation (say, within the putBlob “end” callback, after we know the blob was written), we can retrieve the data:
which gives us:
Note that we’re getting the data in raw chunks from the “data” event, which is somewhat tedious to cobble together (as strings if encoded or Buffer objects otherwise). Given that we have full read / write stream implementations, Sunny really shines when piping data with pipe(). Essentially, you can simply connect a Sunny read / write stream to another read / write stream with pipe() and let Node and Sunny take care of the rest.
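To make the contrast concrete, here is a small sketch using only Node's built-in stream module. It collects chunks by hand and then does the same job with pipe(). The names fakeBlobStream, collect and MemorySink are hypothetical helpers invented for this example. A real Sunny GET stream would take the place of fakeBlobStream.

```javascript
const { Readable, Writable } = require("stream");

// Hypothetical stand-in for a blob GET stream: emits the data in several chunks.
function fakeBlobStream(parts) {
  return Readable.from(parts.map((part) => Buffer.from(part)));
}

// Manual approach: gather every "data" chunk, then join them on "end".
function collect(stream, callback) {
  const chunks = [];
  stream.on("data", (chunk) => chunks.push(chunk));
  stream.on("error", callback);
  stream.on("end", () => callback(null, Buffer.concat(chunks).toString("utf8")));
}

// Piped approach: a tiny in-memory Writable that any Readable can pipe into.
class MemorySink extends Writable {
  constructor() {
    super();
    this.chunks = [];
  }

  _write(chunk, encoding, done) {
    this.chunks.push(chunk);
    done();
  }

  toString() {
    return Buffer.concat(this.chunks).toString("utf8");
  }
}

collect(fakeBlobStream(["Hello ", "cloud ", "world"]), (err, text) => {
  if (err) { return console.error(err); }
  console.log("collected:", text);
});

const sink = new MemorySink();
fakeBlobStream(["Hello ", "cloud ", "world"]).pipe(sink);
sink.on("finish", () => console.log("piped:", sink.toString()));
```

With pipe(), the destination could just as easily be a file write stream or an HTTP response, without changing the reading side.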
For example, Sunny provides helper methods for getting / putting blobs to and from files. Looking at the source code, the real gist of, for example, getting a blob to a local file is the following:
There’s a little more to it, but it is mostly this easy. Take a look at the source code for the blob convenience methods, such as getToFile, to see a full implementation that binds together file and cloud streams.
Node streams available for use with Sunny get / put pipe() calls include:
- HTTP/HTTPS requests and responses. See my sunny-proxy project for a simple web server that proxies web requests to cloud blobs using Sunny blob get streams piped to HTTP responses.
- File reads and writes.
Cloud Headers and Metadata
Sunny abstracts common cloud provider headers and metadata. For example, AWS S3 has a metadata header prefix of “x-amz-meta-”, while Google Storage has an analogous one of “x-goog-meta-”.
Sunny cloud operations return a meta object that looks like:
with cloud-specific prefixes stripped out. Moreover, you can set cloud headers in the same manner by passing any of the fields above (“headers”, “cloudHeaders”, “metadata”) in the options object of a cloud request.
In this manner, Sunny makes it easy to use metadata operations as part of your application, while remaining agnostic to your actual cloud provider. See the Sunny user guide for more information.
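As a rough illustration of the prefix stripping described above, the following sketch sorts raw response headers into the three groups named in the text. It is an assumption about how such a normalization could work, not Sunny's actual code. The splitHeaders function and the PREFIXES table are hypothetical, and only the two metadata prefixes come from the text. The general “x-amz-” and “x-goog-” header prefixes are the providers' usual conventions.

```javascript
// Hypothetical prefix table. The metadata prefixes are the ones quoted above.
const PREFIXES = {
  aws: { cloud: "x-amz-", metadata: "x-amz-meta-" },
  google: { cloud: "x-goog-", metadata: "x-goog-meta-" },
};

// Sort raw headers into plain headers, cloud headers and metadata,
// stripping the provider-specific prefix from the latter two groups.
function splitHeaders(provider, rawHeaders) {
  const prefixes = PREFIXES[provider];
  const meta = { headers: {}, cloudHeaders: {}, metadata: {} };

  for (const [name, value] of Object.entries(rawHeaders)) {
    const key = name.toLowerCase();
    if (key.startsWith(prefixes.metadata)) {
      // Check the longer metadata prefix first, since it also matches the cloud prefix.
      meta.metadata[key.slice(prefixes.metadata.length)] = value;
    } else if (key.startsWith(prefixes.cloud)) {
      meta.cloudHeaders[key.slice(prefixes.cloud.length)] = value;
    } else {
      meta.headers[key] = value;
    }
  }
  return meta;
}

// The same application-level metadata comes out for either provider.
console.log(splitHeaders("aws", { "Content-Type": "text/plain", "x-amz-meta-owner": "me" }));
console.log(splitHeaders("google", { "Content-Type": "text/plain", "x-goog-meta-owner": "me" }));
```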
For programmatic ease, and to better abstract across cloud providers, Sunny wraps cloud-based errors with a custom class that adds some useful attributes. The CloudError class has the following members:
- message: The error message (usually in XML).
- statusCode: HTTP status code, if any.
and error type methods:
- isNotFound: Container / blob does not exist.
- isInvalidName: Invalid name requested for container / blob.
- isNotEmpty: Container cannot be deleted if blobs still exist within it.
- isAlreadyOwnedByYou: Creating a container that is already owned by you.
- isNotOwner: AWS and Google Storage have a global namespace, so if someone else owns the container name requested, your operations will fail.
Ultimately, Sunny “error” event listeners will get CloudError objects instead of raw Error objects on cloud method errors, which makes it easier to react programmatically (especially since Sunny correctly handles different error codes and XML from cloud providers that correspond to the same type of error).
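The idea of mapping several provider-specific error codes onto one error type can be sketched as follows. This is a simplified, hypothetical model rather than Sunny's CloudError class: CloudErrorSketch, TYPE_BY_CODE and handleError are invented for illustration, the type methods are assumed to be callable functions, and only a few well-known S3 error codes are listed.

```javascript
// A few S3 error codes mapped to provider-neutral types (illustrative subset).
const TYPE_BY_CODE = {
  NoSuchBucket: "NOT_FOUND",
  NoSuchKey: "NOT_FOUND",
  InvalidBucketName: "INVALID_NAME",
  BucketNotEmpty: "NOT_EMPTY",
  BucketAlreadyOwnedByYou: "ALREADY_OWNED_BY_YOU",
};

// Hypothetical error wrapper: keeps the raw message and status code,
// and exposes provider-neutral type checks.
class CloudErrorSketch extends Error {
  constructor(message, statusCode, code) {
    super(message);
    this.name = "CloudErrorSketch";
    this.statusCode = statusCode;
    this.type = TYPE_BY_CODE[code] || null;
  }

  isNotFound() { return this.type === "NOT_FOUND"; }
  isNotEmpty() { return this.type === "NOT_EMPTY"; }
}

// An "error" listener can then branch on the type instead of parsing XML.
function handleError(err) {
  if (err.isNotFound()) { return "create-and-retry"; }
  if (err.isNotEmpty()) { return "delete-blobs-first"; }
  return "give-up";
}

const missingKey = new CloudErrorSketch("<Error>...</Error>", 404, "NoSuchKey");
const missingBucket = new CloudErrorSketch("<Error>...</Error>", 404, "NoSuchBucket");
console.log(handleError(missingKey), handleError(missingBucket));
```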
Putting it all Together, with a Little Help from Async.js
Now that we’ve seen how to create a Sunny connection, get a container and perform operations on blobs within the container, let’s put all of our code together in a single script. (Note: I’ve collapsed the error handling to a single utility function).
Not too shabby! However, you’ll note that we have three serial cloud operations, each of which has to complete before the next starts, so we end up repeatedly nesting callbacks in “end” event listeners. Even with a mere three operations, the nesting makes the code hard to understand at first blush, and an indentation nightmare. Wouldn’t it be better if we could code our operations to match the general outline of what we’re trying to do? That is:
- Get a container.
- Once we have a container, put data to a blob.
- Once we have put data to the blob, get the data back.
Fortunately, there are many asynchronous helper libraries for Node to help in just this situation. I like Async.js, which from the project page “provides around 20 functions that include the usual ‘functional’ suspects (map, reduce, filter, forEach…) as well as some common patterns for asynchronous control flow (parallel, series, waterfall…).”
In our simple three-cloud operation script, we just need to serialize our operations for easier control flow. I am skipping over a lot of details about using Async.js, such as callbacks at the end of functions or on errors, and passing state across serialized functions, but you can probably intuit most of how things are working here:
which gives us our expected output:
Our code now appears in a serialized, contained order that matches the conceptual execution of the actual cloud operations, instead of a lot of callback nesting.
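For readers who want to see the control-flow idea without any library, here is a minimal hand-rolled helper that runs callback-style steps in order and passes each result to the next step. It is a sketch in the spirit of a "waterfall" flow, not Async.js itself. The three step functions are stubs standing in for the real cloud calls, and the blob name is made up.

```javascript
// Run tasks one after another. Each task receives the previous results
// plus a callback(err, ...results). The first error skips the remaining tasks.
function waterfall(tasks, done) {
  let index = 0;

  function next(err, ...results) {
    if (err || index === tasks.length) {
      return done(err || null, ...results);
    }
    const task = tasks[index++];
    task(...results, next);
  }

  next(null);
}

// Stub steps standing in for the three cloud operations (no network involved).
const getContainer = (callback) =>
  setImmediate(callback, null, { name: "sunny" });
const putBlob = (container, callback) =>
  setImmediate(callback, null, { container: container, blob: "my-blob.txt" });
const getBlob = (info, callback) =>
  setImmediate(callback, null, "data from " + info.container.name + "/" + info.blob);

waterfall([getContainer, putBlob, getBlob], (err, data) => {
  if (err) { return console.error("Failed:", err.message); }
  console.log(data); // prints "data from sunny/my-blob.txt"
});
```

The list passed to waterfall reads in the same order as the outline: get a container, put a blob, get the blob back. Error handling lives in one place, the final callback.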
For more on developing with Async.js, please see the excellent Async.js Readme / tutorial. It is an investment well worth your time. Sunny internally uses Async.js extensively for unit tests. Explore the source tests directory for a lot of good use cases (parallel and serial).
Have Fun in the Sun
That wraps up our introduction to Sunny.js. Hopefully the library provides a useful interface for your cloud programming needs.
The project is still very much in an early development state. See the Sunny “to do” list to get a better idea of what features and fixes are coming down the pipeline. The big ticket items I would really like to see added are:
- More cloud providers, notably Rackspace Cloud Files and OpenStack Storage.
- A basic retry function wrapper to handle expected, occasional cloud operation failures and throttling.
- Advanced cloud operation support for ACLs, policies, etc.
Help, bug reports, and pull requests for the project are most welcome. Please review the developer’s guide, and contact me with any questions or comments.