A World Full of Sharp Objects

Wednesday, July 29, 2015

JavaScript anno 2015 – Gulp

Gulp is a build automation tool. Just like MSBuild, NAnt, make, and pretty much any other build automation tool out there, Gulp runs one or more tasks that you define.

Gulp is built on Node.js which means that you define your tasks in JavaScript. To get started you need to install Gulp and create a gulpfile.js which will be the home for your automation tasks:

npm install gulp -g
npm install gulp --save-dev
new-item gulpfile.js -type file
invoke-item gulpfile.js

(btw I’m using Powershell as my shell)

If you run those four commands you should now have your gulp file open and ready for some tasks. Here is an example of a Gulp task:

var gulp = require('gulp');

gulp.task('default', function() {
    process.stdout.write("Gulp is running the default task\n");
});

Since Gulp is running in Node.js we can use Node’s process object to write to the console. To run this task just run the gulp command in your shell and the default task will run.

Gulp doesn’t really do much itself. In fact, there are only four functions in the Gulp API:
- src
- dest
- task
- watch

Src & Dest

src and dest are filesystem operations. To read files from the filesystem, use gulp.src(). To write files, use gulp.dest(). The src function takes a glob pattern as input (using Node’s glob, which in turn uses the minimatch library), so to recursively find all js-files in a Scripts folder you can use this syntax:

gulp.src('Scripts/**/*.js')
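
To get a feel for what such a pattern matches, here is a rough sketch that turns a glob into a regular expression. It is not how minimatch is implemented; the globToRegExp helper is made up for this example, only understands "**/", "*" and "?", and assumes forward slashes in the paths.

```javascript
// Simplified sketch of glob matching; NOT the minimatch implementation.
// Supports only "**/", "*" and "?" and assumes forward slashes in paths.
function globToRegExp(glob) {
    var pattern = glob
        .replace(/[.+^${}()|[\]\\]/g, '\\$&') // escape regex metacharacters
        .replace(/\*\*\//g, '\u0000')          // remember each "**/" for later
        .replace(/\*/g, '[^/]*')               // "*" stays within one path segment
        .replace(/\?/g, '[^/]')                // "?" is exactly one character
        .replace(/\u0000/g, '(?:.*/)?');       // "**/" spans zero or more folders
    return new RegExp('^' + pattern + '$');
}

var scripts = globToRegExp('Scripts/**/*.js');
console.log(scripts.test('Scripts/app.js'));       // true
console.log(scripts.test('Scripts/src/index.js')); // true
console.log(scripts.test('Scripts/site.css'));     // false
```

The point to take away is that "**/" spans any number of sub-folders (including none), while a single "*" never crosses a folder boundary.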

The dest function takes a path as parameter and will output files to that folder (and create the folder if it doesn’t exist). Matching files that already exist will be overwritten. To create a simple copy task, we can chain the src and dest functions together, and Gulp uses the pipeline pattern to do this (similar to pipes in PowerShell and bash):

gulp.src('Scripts/**/*.js').pipe(gulp.dest('Copies'));

This will copy all js-files from the Scripts-folder to a folder called Copies. Note that it will keep the directory structure from the source, so if the file index.js exists in a folder ‘src’ in the Scripts folder it will end up in ‘Copies\src\index.js’.
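
The way the directory structure is preserved can be sketched in plain Node: the part of the path below the non-glob base of the pattern (here ‘Scripts’) is re-created under the destination folder. The destinationPath helper is a made-up name for illustration and not part of the Gulp API.

```javascript
var path = require('path').posix; // forward slashes keep the example platform-neutral

// Hypothetical helper: where a file matched under `base` ends up below `dest`.
function destinationPath(base, file, dest) {
    return path.join(dest, path.relative(base, file));
}

console.log(destinationPath('Scripts', 'Scripts/src/index.js', 'Copies'));
// Copies/src/index.js
```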

Task

If we want to make a task for the file copying above we could define it like this:

gulp.task('copy', function(){
    // Return the stream so Gulp can tell when the copy has finished
    return gulp.src('Scripts/**/*.js')
        .pipe(gulp.dest('Copies'));
});

To run a specific task, we just use the gulp command and send in the name of the task as parameter:

gulp copy

If you don’t provide a named task, Gulp will look for a task called ‘default’ and run it.

The task function takes three parameters, in this order:
- name: the name of the task
- dependencies (optional): an array of strings containing the names of other tasks that should run before this one
- function: the function that defines what the task does

As with all build automation tools, tasks can be chained together. If we want the default task to depend on the copy task, we can define it like this:

gulp.task('default', ['copy'], function() {
    process.stdout.write("Gulp is running the default task\n");
});
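
The idea of running dependencies before the task itself can be sketched with a toy runner in plain Node. This is only an illustration of the ordering; it is not how Gulp schedules tasks internally, and the task and run functions below are made up for the example.

```javascript
// Toy task runner, for illustration only; Gulp's real scheduler is more involved.
var tasks = {};

function task(name, deps, fn) {
    tasks[name] = { deps: deps, fn: fn };
}

// Run the dependencies of a task first, then the task itself, each at most once.
function run(name, done) {
    done = done || {};
    if (done[name]) { return; }
    done[name] = true;
    tasks[name].deps.forEach(function (dep) { run(dep, done); });
    tasks[name].fn();
}

var order = [];
task('copy', [], function () { order.push('copy'); });
task('default', ['copy'], function () { order.push('default'); });

run('default');
console.log(order.join(' -> ')); // copy -> default
```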

Watch

The watch function is (like src() and dest()) filesystem related and takes a glob pattern as input. With watch() you get notified when a file changes and can act accordingly. For instance, if we want the copy task to run automatically when changes occur, we can set up a watch for that:

gulp.task('watch', function() {
    gulp.watch('Scripts/**/*.js', ['copy']);
});

When this task is running all changed files will be copied. Any new files will also be copied, but since the copy task only copies (doh!) it will not remove files from the destination folder when they are deleted in the source folder. If we want that, we need to give watch a callback that reacts to the ‘deleted’ event and deletes the file:

gulp.task('watch', function() {
    gulp.watch('Scripts/**/*.js', function(event){
        if (event.type === 'deleted') {
            deleteFile(event.path);
        }
        else {
            gulp.start('copy');
        }
    });
});

Note: I haven’t shown the implementation of the function deleteFile as this is just an example of how to react to different types of events. The available event types are: changed, added and deleted.
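
For the curious, one possible shape of such a deleteFile function is sketched below using only Node’s fs and path modules. It is an assumption about what the function could look like, not the implementation used in the project: it maps the deleted source path to the matching path in the destination folder and removes that copy. The two extra parameters and their default folder names (taken from the copy example) are made up for the sketch.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical deleteFile: maps a path under sourceRoot to the same relative
// path under destRoot and removes that copy if it exists.
// The folder names default to the ones used in the copy task above.
function deleteFile(sourcePath, sourceRoot, destRoot) {
    sourceRoot = path.resolve(sourceRoot || 'Scripts');
    destRoot = path.resolve(destRoot || 'Copies');

    var relative = path.relative(sourceRoot, path.resolve(sourcePath));
    if (relative.split(path.sep)[0] === '..') {
        return null; // the file is outside the source folder; leave it alone
    }

    var target = path.join(destRoot, relative);
    if (fs.existsSync(target)) {
        fs.unlinkSync(target);
    }
    return target;
}
```

With the defaults in place, the call in the watch callback stays deleteFile(event.path).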

Plugins

Since Gulp itself doesn’t do much it depends upon plugins to provide the usefulness. And there are a lot of plugins to choose from. At the time of writing there are 1690 plugins listed on the Gulp home page. Let’s start with one of the most popular of them: the JSHint plugin. Since Gulp is running on Node it’s of course using npm, so we install gulp-jshint just like any other npm package (the command also installs jshint-stylish, the reporter used in the task further down, since it is a separate package):

npm install gulp-jshint jshint-stylish --save-dev

Now back to the Gulp file to let JSHint do some error checking on our JavaScript code:

var jshint = require('gulp-jshint');

gulp.task('jshint', function(){
    return gulp.src('./src/scripts/*.js')
        .pipe(jshint())
        .pipe(jshint.reporter('jshint-stylish'));
});

Alternatives

Gulp is not the only build automation tool / task runner for JavaScript. Its predecessor is Grunt, which is quite similar to Gulp, but the definition of tasks is configuration-based and therefore more verbose. You configure tasks instead of defining them as JavaScript functions. Gulp is also more flexible than Grunt. Personally I prefer the Gulp way of doing it, but Grunt is still the most popular task runner AFAIK.

Cake is pretty much the same as Gulp, but uses CoffeeScript instead of pure JavaScript (as a side note: you can actually use CoffeeScript with Gulp too, but that’s another story).

Broccoli seems to be the new kid on the block right now and it seems promising. For bigger projects the watch-and-rebuild can take a ‘long’ time (everything is relative), so Broccoli was designed to do incremental builds. That is, it only builds whatever has changed since the last build. Broccoli is still in its early stages (at version 0.16 at the time of writing) and the “Windows support is still spotty” according to their own documentation.

Last but not least; you don’t actually need a dedicated automation tool as npm can already do this for you. I’m not going to go into detail on this one, but you can check out this blog post by Keith Cirkel for more information.

Resources

Gulp plugins – Search for available plugins
Node Glob – Documentation for the glob syntax in Node
Automate your tasks easily with Gulp.js by Justin Rexroad
Smashing Magazine: Building with Gulp
For more information on Node.js and npm, take a look at my previous post in this Getting started with JavaScript series: JavaScript anno 2015 - Node and npm

Saturday, July 25, 2015

JavaScript anno 2015 – Node and npm

Node.js is server-side JavaScript; a “runtime environment for server-side and networking applications” [Wikipedia]. It’s open source, like just about everything in the JavaScript world of frameworks and tools. Node is an amazing runtime and in just about 5 lines of code you can have a simple web server up and running:

var http = require('http');
http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(1337, '127.0.0.1');

We’re not using Node per se in our project yet. We’re running our ASP.NET web application on IIS, but it might be something we’ll be looking more at for specific tasks later on. For now we’re using it implicitly through the Node Package Manager.

Node Package Manager (NPM)

A package manager is responsible for installing, upgrading, configuring and uninstalling software packages for a given platform. NPM is a package manager for JavaScript, just like NuGet is a package manager for .NET, CPAN for Perl, Maven and Ivy for Java, RubyGems for Ruby, and so on.

NPM is bundled with Node so the way to get NPM on your machine is to install Node. With Node installed you can run npm commands from PowerShell or the command prompt.

NPM differentiates between modules and executables. When you install a module it will be placed in a node_modules folder, whereas executables are placed in a sub-folder called .bin. You can run the npm root and npm bin commands respectively to see where those folders exist on your local machine.

npm install

NPM also differentiates between installing globally or locally. Locally means local to the folder you run the npm install command from (typically your project root folder). A good rule is to never install modules globally. Only executables that need to be available across many projects should be installed globally, but to avoid versioning conflicts one should strive to install most packages locally. To install a package globally you just run the install command with the --global (or -g) option.

Example:

npm install browserify => Downloads and unpacks the Browserify package to a local node_modules folder

npm install grunt-cli -g => Downloads and unpacks the Grunt command line interface globally and makes the grunt command available on your PATH (on Windows through a grunt.cmd shim rather than an .exe).

Dependencies

Often a package uses other packages and these dependencies are expressed in the package.json file inside the package. So when npm installs a package it will look in the package.json file and install any packages that it relies on. Therefore an install of, for instance, Browserify will install no less than 49 direct dependencies, which again install their dependencies.

Unlike NuGet, the dependencies of a package will not be installed at the same directory level as their dependent. Instead they will be installed in a node_modules folder inside the Browserify folder. Inside each dependent module there might be other dependencies. In fact, installing Browserify will create no less than 87 (!) node_modules folders underneath the Browserify folder.

Placing all dependencies inside each module is a great way to prevent versioning issues between dependencies of different modules. In .NET and NuGet this wouldn’t work since there’s no way to reference 2 different versions of the same assembly into the same project. You can however reference assemblies that are dependent on different versions of the same assembly. The conflict between them is solved by assembly binding redirects in web/app.config, but it can lead to problems if a dependent assembly has breaking changes between versions. In the JavaScript world there is no concept of binary assemblies. The modules are just one or more js-files and as long as they are loaded within different scopes, they will not cause any conflicts.
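
The following sketch shows two versions of the same module living side by side. It builds a throwaway folder tree in the temp directory; the package names "a", "b" and "dep" are made up for the example, and only Node’s built-in modules are used.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Build a throwaway folder tree in which two (made-up) packages, "a" and "b",
// each carry their own private copy of a (made-up) package "dep".
var root = fs.mkdtempSync(path.join(os.tmpdir(), 'nested-deps-'));

function writePackage(owner, depVersion) {
    var depDir = path.join(root, owner, 'node_modules', 'dep');
    fs.mkdirSync(depDir, { recursive: true });
    fs.writeFileSync(path.join(depDir, 'index.js'),
        'module.exports = { version: "' + depVersion + '" };');
    fs.writeFileSync(path.join(root, owner, 'index.js'),
        'module.exports = require("dep");');
}

writePackage('a', '1.0.0');
writePackage('b', '2.0.0');

// Each package resolves "dep" from its own node_modules folder.
var depOfA = require(path.join(root, 'a'));
var depOfB = require(path.join(root, 'b'));
console.log(depOfA.version, depOfB.version); // 1.0.0 2.0.0
```

Both copies of "dep" are loaded into the same process, yet neither package ever sees the other’s version.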

There’s a lot more to say about dependencies in npm, but the one thing you need to be aware of is the difference between dependencies and devDependencies. The former are the dependencies required to run the package, while the latter are additional dependencies needed only for development (there are also peerDependencies and bundledDependencies, but I won’t go into that here). Typically devDependencies will include unit test frameworks, test harnesses, minifiers, transpilers, etc.

Package.json

If you run npm install without any package name, npm will look for a package.json in the directory where you run the install command. If it finds it npm will install all dependencies listed (including devDependencies). The package.json is similar to packages.config in NuGet, but I dare to say that npm seems a lot more sophisticated and solid than NuGet.

Typically you will have a package.json file in the root directory of your project and you will place all your dependencies there. A simple package file might look like this:

{
    "name": "my-app",
    "version": "0.0.1",
    "dependencies": {
        "browserify": "11.0.x"
    }
}

Note the ‘x’ in the Browserify version. This means that any 11.0 version will do, so when you do an npm install (or update), 11.0.0, 11.0.1, etc. are OK, but not 11.1.0 or 12.0.0. Npm follows semantic versioning (semver) and the numbers mean ‘major.minor.patch’. If you’re happy to trust that Browserify will be backward compatible across minor versions (which it should be if they follow semver), you can specify ‘11.x’ instead of ‘11.0.x’. If you just want the newest version regardless of breaking changes (not recommended!), you can just put an ‘x’ instead of ‘11.0.x’.
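
The x-range rule can be sketched in a few lines of plain JavaScript. This is a simplification for illustration; npm itself relies on the semver package, which handles many more range styles, and the matchesXRange function is a made-up name.

```javascript
// Simplified x-range check; npm itself relies on the semver package,
// which supports many more range styles than this sketch does.
function matchesXRange(version, range) {
    var wanted = range.split('.');
    var actual = version.split('.');
    return wanted.every(function (part, i) {
        return part === 'x' || part === actual[i];
    });
}

console.log(matchesXRange('11.0.1', '11.0.x')); // true
console.log(matchesXRange('11.1.0', '11.0.x')); // false
console.log(matchesXRange('11.1.0', '11.x'));   // true
console.log(matchesXRange('12.0.0', 'x'));      // true
```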

You could create the package.json file manually, but a better way is to use the init command:

npm init
Note: if you create the file manually, be sure that the file is ASCII encoded. Unicode will result in a parsing error in npm and UTF-8 will result in the file not being updated (e.g. when running with the --save flag below).

You could also edit your package.json file manually and add all dependencies by hand, but again, it’s better to let npm handle this. You do this by appending a --save flag (-S for short) to the install command:

npm install browserify --save
If the package is only meant for development purposes and is not applicable to the production environment (testing frameworks, for instance), you can use the --save-dev flag instead:

npm install jasmine --save-dev
When you have a lot of packages installed, there’s a good chance that some of them share some dependencies. Because of npm’s hierarchical structure, it’s possible to optimize these shared dependencies by moving them further up the tree and thereby get rid of duplicated modules. The command for that is dedupe:
npm dedupe

If you want to remove any packages that are not in your package.json, you can run the prune command:

npm prune
If you run the prune command with the --production flag, all devDependencies will be removed (nice when deploying to production).

If you want to see all installed packages there’s an ls command for that:

npm ls
Note that this will list all top level packages as well as all their dependencies. If you’re only interested in the top level packages you can add the --depth 0 parameter to ls.

If you want to search for available packages, there’s a search:
npm search
…which also can do regular expressions. But for the most part it’s just easier to browse and search on npm’s home page.

npm update

Installing is just one side of the story. Once you’ve added any dependencies to your project you would like to keep them updated as well:

npm update
If you run it without specifying a particular package to update, npm will go through the package.json file and see if any newer versions are available. It will of course respect the versioning you’ve applied, so it won’t upgrade to a newer major version if you have specified that only minor versions are acceptable.

You can update a specific package by providing the name of the package. If the package is installed in the global scope you need to add the -g flag.

npm update will also download any missing packages, but you need to add the --dev flag to get all devDependencies. One thing to be aware of is that npm will not do any recursive update of all package dependencies. It will only update the top level packages, but you can force recursion with the --depth flag.

As with the install command, you can let npm update your package.json file with the updated package versions:

npm update --save
If you want to check whether any newer package versions exist without installing them, you can run the outdated command:

npm outdated

npm uninstall

Removing packages is as easy as installing. Just run the uninstall command with the name of the package to remove and the package is gone:

npm uninstall browserify
As with install and update you can let npm update package.json:
npm uninstall browserify --save

Wrap up

The three major take-aways from this post should be that

1. Npm packages come in two flavors: executables and modules. Executables are typically command line tools, while modules are libraries that you want to use in your code.

2. Npm has two modes of operation: global and local. In general you should install executables in the global scope and modules in the local.

3. Put a package.json file in the root of your project and add all of your project dependencies there.

What I haven’t talked about is configuring npm. There’s a lot to say about this, but I’m just going to keep it short and say that npm is highly configurable and I’ll just point you to the resources below.

Resources

npmjs.org – The home page for npm where you can search and browse for available packages
docs.npmjs.org – The documentation for npm is darn good if I may say so. I really recommend taking a look at it as I promise you’ll learn a lot from it.
As for configuring npm, here is a starting point for you.
For a great explanation of the difference between the various dependencies in npm, take a look at the top-voted answer to this question on StackOverflow.
To get some insight into the history of npm and why it is as it is, read through the answers from Isaac Schlueter (the main developer on npm) in this thread.
nodejs.org – The home page for Node.

Wednesday, June 3, 2015

Logging to SQL Server with Log4Net

How do you know what’s happening on your production servers? Logging of course (if you wonder: no, ‘debug & breakpoints’ is never the correct answer. Never ever. Ever.).
We have been using Log4Net as our logging tool for 3-4 years now and I just wanted to share how we are using it and how incredibly powerful good logging can be.
First of all, if you are not familiar with Log4Net, it is an open source, free-for-use logging framework under the Apache Foundation umbrella. Among its strengths are that it is fairly easy to get started with, it has a low impact on application performance, and it has a lot of appenders that let you log to a lot of different destinations (console, file, database, event log, etc.).
At the beginning we set up logging to console (for those systems that had console output) and file, but after a while we added logging to SQL Server. It is the combination of logs stored in a SQL database and full-text indexing of these logs that really gives us eyes into what happens on our production servers.

Log to console

Logging to console is definitely the easiest way to get started with Log4Net. But writing to the console output is also the option that gives you the least payback in the form of long-term insight into your production systems. Log4Net can be configured using either xml or code, but xml is by far the most used. Typically you do the xml configuration in your app/web.config, but you can also keep the Log4Net configuration in separate xml files if you prefer. We chose the app/web.config approach, so the xml for console logging looks like this:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <configSections>
        <section 
            name="log4net" 
            type="log4net.Config.Log4NetConfigurationSectionHandler,Log4net" />
    </configSections>
    <log4net>
        <root>
            <level 
                value="DEBUG" />
            <appender-ref 
                ref="ConsoleAppender" />
        </root>
        <appender 
            name="ConsoleAppender" 
            type="log4net.Appender.ConsoleAppender">
            <layout 
                type="log4net.Layout.PatternLayout">
                <param 
                    name="ConversionPattern" 
                    value="%d [%t] %-5p [%x] - %m%n" />
            </layout>
            <filter 
                type="log4net.Filter.LevelRangeFilter">
                <param 
                    name="LevelMin" 
                    value="DEBUG" />
                <param 
                    name="LevelMax" 
                    value="FATAL" />
            </filter>
        </appender>
    </log4net>
</configuration>

You can do this configuration in code as well, but the great benefit of using xml for the configuration is that you can change the settings (for instance the log level threshold) without re-deploying your application. In the case of web hosts you can even change it without restarting the application. If you’ve been a good boy/girl and set up debug-level logging in your code, you can just flip an xml-switch and additional log entries will start flowing in.

Log to file

If you want your logs to survive application restarts (and the console window buffer) and/or have an application that doesn’t have console output, logging to file would be the next step on the logging ladder.
The main thing to keep in mind when logging to file is to set limits on how large each log file can get. Log4Net has some defaults that might not suit your situation, so be sure to check out the documentation on how you can configure logging to file. For one of our systems we chose to have a 10 MB limit on each file, which you can see in this xml config:

<log4net>
    <root>
        <level value="DEBUG" />
        <appender-ref ref="LogFileAppender" />
    </root>
    <appender 
        name="LogFileAppender" 
        type="log4net.Appender.RollingFileAppender">
        <param 
            name="File" 
            value="logs.txt" />
        <param 
            name="AppendToFile" 
            value="true" />
        <!-- Logfiles are rolled over to backup files when size limit is reached -->
        <rollingStyle 
            value="Size" />
        <!-- Maximum number of backup files that are kept before the oldest is erased -->
        <maxSizeRollBackups 
            value="10" />
        <!-- Maximum size that the output file is allowed to reach before being rolled over to backup files -->
        <maximumFileSize 
            value="10MB" />
        <!-- Indicating whether to always log to the same file -->
        <staticLogFileName 
            value="true" />
        <layout type="log4net.Layout.PatternLayout">
            <param 
                name="ConversionPattern" 
                value="%-5p%d{yyyy-MM-dd HH:mm:ss} - %m%n" />
        </layout>
    </appender>
</log4net>

The above config keeps at most roughly 110 MB of logs on disk: 10 MB per file, for the active file plus up to 10 backup files.

Log to console and file

There is no problem logging to both console and file simultaneously and you can even set different log levels on each appender. If you want to have different files for different log levels (e.g. ‘debug.log’, ‘info.log’, etc), you can just configure as many file appenders as you need. Here is an example of logging to both console and file at the same time:

<log4net>
    <root>
        <level value="INFO" />
        <appender-ref ref="LogFileAppender" />
        <appender-ref ref="ConsoleAppender" />
    </root>
    <appender name="LogFileAppender" type="log4net.Appender.RollingFileAppender">
        <filter type="log4net.Filter.LevelRangeFilter">
            <param name="LevelMin" value="WARN" />
            <param name="LevelMax" value="FATAL" />
        </filter>
        ...
    </appender>
    <appender name="ConsoleAppender" type="log4net.Appender.ConsoleAppender">
        ...
    </appender>
</log4net>

The default log level is set to INFO, which means that unless otherwise specified in the appenders, messages with level INFO, WARN, ERROR and FATAL will be logged. The file appender is set to only log WARN, ERROR and FATAL though.
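
The interplay between the root level and an appender’s level range can be sketched as a small function. This only illustrates the rule described above; it is not Log4Net code, and the isLogged name is made up.

```javascript
var LEVELS = ['DEBUG', 'INFO', 'WARN', 'ERROR', 'FATAL'];

// Illustration only: an event is written when it is at or above the root level
// and falls inside the appender's LevelMin..LevelMax range (if one is set).
function isLogged(level, rootLevel, levelMin, levelMax) {
    var rank = LEVELS.indexOf(level);
    var min = levelMin ? LEVELS.indexOf(levelMin) : 0;
    var max = levelMax ? LEVELS.indexOf(levelMax) : LEVELS.length - 1;
    return rank >= LEVELS.indexOf(rootLevel) && rank >= min && rank <= max;
}

console.log(isLogged('INFO', 'INFO'));                   // true  (console appender)
console.log(isLogged('INFO', 'INFO', 'WARN', 'FATAL'));  // false (file appender)
console.log(isLogged('ERROR', 'INFO', 'WARN', 'FATAL')); // true  (file appender)
console.log(isLogged('DEBUG', 'INFO'));                  // false (below root level)
```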

Log to SQL Server

As already mentioned the logging to file and console is easy to get started with and does not take much effort to set up. Setting up logging to a database takes a bit more work, but it is far from difficult. Here is how we configured logging to a SQL database from one of our web hosts:

<log4net>
<root>
    <level value="DEBUG" />
    <appender-ref ref="AdoNetAppender" />
</root>
<appender 
    name="AdoNetAppender" 
    type="log4net.Appender.AdoNetAppender">
    <threshold>INFO</threshold>
    <bufferSize 
        value="50" />
    <connectionType 
        value="System.Data.SqlClient.SqlConnection, System.Data, Version=1.0.3300.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
    <connectionString 
        value="data source=SERVERNAME;initial catalog=DATABASE;integrated security=false;persist security info=True;User ID=USERNAME;Password=PASSWORD" />
    <commandText 
        value="INSERT INTO Logs ([Date],[Thread],[Source],[Level],[Logger],[Message],[Exception],[HostName]) VALUES (@log_date, @thread, 'LOG SOURCE',@log_level, @logger, @message, @exception, @hostname)" />
    <parameter>
        <parameterName value="@log_date" />
        <dbType value="DateTime" />
        <layout type="log4net.Layout.RawTimeStampLayout" />
    </parameter>
    <parameter>
        <parameterName value="@thread" />
        <dbType value="String" />
        <size value="255" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%thread" />
        </layout>
    </parameter>
    <parameter>
        <parameterName value="@hostname" />
        <dbType value="String" />
        <size value="255" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%property{log4net:HostName}" />
        </layout>
    </parameter>
    <parameter>
        <parameterName value="@log_level" />
        <dbType value="String" />
        <size value="50" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%level" />
        </layout>
    </parameter>
    <parameter>
        <parameterName value="@logger" />
        <dbType value="String" />
        <size value="255" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%logger" />
        </layout>
    </parameter>
    <parameter>
        <parameterName value="@message" />
        <dbType value="String" />
        <size value="-1" />
        <layout type="log4net.Layout.PatternLayout">
            <conversionPattern value="%message" />
        </layout>
    </parameter>
    <parameter>
        <parameterName value="@exception" />
        <dbType value="String" />
        <size value="-1" />
        <layout type="log4net.Layout.ExceptionLayout" />
    </parameter>
</appender>
</log4net>

The xml config is the same whether you are configuring logging in web- or app.config (you need to insert your own values for servername, database and login).
The main thing to point out here is the ‘bufferSize’ element, which tells Log4Net how many log entries to buffer up before writing them to the database. There isn’t any correct number here and you need to figure out what suits your environment best. The trade-off is performance versus reliability: a low buffer size takes more resources because of the many writes to the database table (and yes, we learned that the hard way of course). A high buffer size is less reliable because if your application crashes, the logs not yet written will never be written.
Also, it might make sense to have different buffer sizes for different environments. In the development and test/QA environments, a low limit might be preferable since the logs will be written to the database sooner. Since the number of log entries is far lower there than in the production system, it might take a long time before the logs become available if you run with the same limits as in production. In a production environment, instant logs are in most cases not relevant and performance is more critical. Then again, reliability is also a good thing, so you need to find a good trade-off.
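
To make the trade-off concrete, here is a small sketch of the buffering idea. It is an illustration only and not how Log4Net implements its buffering; the BufferedLogger type and the writeBatch callback (standing in for the database insert) are made up for the example.

```javascript
// Illustration of buffering, not Log4Net's implementation.
// `writeBatch` is a hypothetical callback standing in for the database insert.
function BufferedLogger(bufferSize, writeBatch) {
    this.bufferSize = bufferSize;
    this.writeBatch = writeBatch;
    this.pending = [];
}

BufferedLogger.prototype.log = function (entry) {
    this.pending.push(entry);
    if (this.pending.length >= this.bufferSize) {
        this.flush();
    }
};

BufferedLogger.prototype.flush = function () {
    if (this.pending.length === 0) { return; }
    this.writeBatch(this.pending);
    this.pending = [];
};

var batches = [];
var logger = new BufferedLogger(3, function (batch) { batches.push(batch); });
['a', 'b', 'c', 'd'].forEach(function (entry) { logger.log(entry); });

console.log(batches.length);        // 1 (one batch of three was written)
console.log(logger.pending.length); // 1 ('d' would be lost in a crash)
```

A larger buffer means fewer, bigger writes, but also more entries sitting in memory that disappear if the process dies before the next flush.
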
Another thing to notice is that we have a lot of subsystems (web hosts, Windows services, message bus, cron jobs, etc.) that log to the database. To know where the logs come from, we replace ‘LOG SOURCE’ with the name of the subsystem in which the config is defined (e.g. ‘CommandsHost’ for the web host that receives commands from our application).
To get the logs into a database, you will need to create a table that matches the log entry that you have defined in the appender config. Here is the T-SQL to create a table that matches the above config:

CREATE TABLE [dbo].[Logs](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Date] [datetime] NOT NULL,
    [Thread] [varchar](255) NOT NULL,
    [Level] [varchar](50) NOT NULL,
    [Logger] [varchar](255) NOT NULL,
    [Message] [nvarchar](max) NOT NULL,
    [Exception] [nvarchar](max) NULL,
    [Source] [varchar](100) NULL,
    [HostName] [nvarchar](255) NULL,
    CONSTRAINT [PK_Log] PRIMARY KEY CLUSTERED 
    (
        [Id] ASC
    )
)

Xml transforms

Using xml transforms is an easy way to set up different settings for different environments. For web projects this is built into Visual Studio and MSBuild/MSDeploy, so the tooling support for this is pretty good. The only caveat is that the transformation is only run during deployment, not during the build. So if you’re switching between different build configs in Visual Studio, the web host on your dev machine will only use the web.config and not any of the web.debug.config, web.release.config, etc. (unless you are actually deploying to your local IIS).
If you are developing a console/WPF/WinForms application you can still take advantage of the same xml transforms as web projects, but the tooling is not built into Visual Studio or MSBuild/MSDeploy. There is however an excellent free tool called SlowCheetah, developed by Sayed Ibrahim Hashimi, that will do this for you. You can download it as a Visual Studio extension, and it has an extra gem that Visual Studio doesn’t have: transformation preview.

SQL Server Full-Text search

The real power when it comes to database log entries is when you pair it with full-text searching. Full-text search will require quite a bit of resources in the form of hardware (disk, memory, cpu), but you don’t have to (and shouldn’t) set up the full-text indexing on your production database server. Instead you should set up log shipping in SQL Server (or some other form of pulling the logs off your production servers) and then do your full-text indexing and searching on a separate database server.

Pair full-text search of logs with a message based (event driven) system, and you have an incredible insight into your production system and an invaluable, searchable history.

Resources

Log4Net: http://logging.apache.org/log4net/

SlowCheetah: https://visualstudiogallery.msdn.microsoft.com/69023d00-a4f9-4a34-a6cd-7e854ba318b5