Comet or Long Loop message pattern - put an end to polling!

NOTE: as of 2018, these approaches are hopelessly out of date 🙂 Current tools like socket.io and/or meteor.js handle all of this, are scalable, and provide a bunch of other features/benefits...

A while back I posted a possible solution to dealing with long running processes in a web application. While that solution works for very basic processes, the use of threading in an asp.net application can be the cause of a lot of grief (there are just too many ways outside of your control for those threads to be aborted prematurely).

I did a little research and came up with a MUCH better solution - simply execute the ajax request for the long running process, and then listen for messages on another ajax request. The key to making this work in IIS/.NET, however, is to ensure that your long running process is a SESSIONLESS request; otherwise it will block further ajax requests from the same session until it has completed.

In my previous solution, I pointed out a number of weaknesses with the simple polling technique I used to send status messages back to the client. Polling is too chatty, especially if you wind up sending redundant status data back from the server regularly.

In this blog post, I'll focus on the improvement to the message passing portion of the solution.

As web developers we are constantly on the lookout for server requests that take 'too long' to respond, so this solution may be a little counter-intuitive for some. The basic idea is that we submit an ajax request to the server which INTENTIONALLY doesn't return a response to the client until it has new data to send back. The code on the server waits in a loop until the relevant data is available. When the client eventually receives the response, it IMMEDIATELY makes another request to the server for more data. In this way it will always have an open connection to the server, which can send data back to the client when IT is ready, not when the client decides it's interested. This simulates a 'push' of data from the server to the client, which is an extremely useful concept.

In order to have a good user experience this design does need to incorporate a bit more error handling. First, the client must handle timeout errors (default timeouts can be set when you make the ajax request, or you can accept the server defaults, typically 20 minutes). Second, the client should have a way of aborting the 'listening' process based on user input. Third, the server should handle some type of logical timeout in case it unexpectedly stops finding data to send back to the client. The method used to communicate a server-initiated timeout could also be used to communicate other logical statuses, such as 'process completed - stop listening'.

On the server

I'll show the code for the server first, because this is actually the simplest part of the solution:

As you can see, this simply loops until it has something to send back to the client. I chose to sleep the loop for .5 seconds in order to avoid consuming too many resources on the server. You can also see that I implemented a simple 1 minute timeout.
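
The sample app's server code is a C# controller action, but the shape of the loop is language-neutral. Here is a minimal sketch of it in JavaScript, not the sample's actual code: waitForMessage, poll, sleep and now are names made up for this illustration, and the clock and sleep functions are passed in so the loop can be exercised without really waiting.

```javascript
// Hypothetical sketch of the server-side wait loop (not the sample app's C# code).
// poll()  - returns the next message, or null if nothing is ready yet
// sleep() - blocks for the given number of milliseconds (Thread.Sleep in .NET)
// now()   - returns the current time in milliseconds
function waitForMessage(poll, sleep, now, intervalMs, timeoutMs) {
  var deadline = now() + timeoutMs;
  while (now() < deadline) {
    var message = poll();
    if (message !== null) {
      // Something to send: respond right away and let the client reconnect.
      return { status: "continue", message: message };
    }
    sleep(intervalMs); // rest between checks instead of spinning the CPU
  }
  // Logical timeout: tell the client to stop listening.
  return { status: "timeout", message: null };
}
```

With the values described above this would be called as waitForMessage(poll, sleep, now, 500, 60000): check every half second, give up after one minute. The "status" field is an assumed way of carrying the logical statuses mentioned earlier.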

Note: in a real-world production application, you should use asynchronous controller methods, or you will risk depleting the IIS threadpool. This is just a sample app focusing on the communication pattern. Another alternative would be to use a completely different messaging server that doesn't consume threads as aggressively as IIS/.net - such as Faye built on top of node.js. In this option, the .net server would send messages to the messaging server, and the web clients would 'listen' to the messaging server.

On the client

This is a simple ajax request, with handlers for errors and success. You'll notice I return the jqXHR object received from the ajax call. This is used later to abort the open connection based on user input. The error handler isn't that interesting - it just displays the error and stops the process.

When we receive a message from the server, we display it and then check to see if we should continue listening for messages. In this case I have encapsulated the code into a 'listener' object which has start(), abort(), and isListening() methods, so I call the abort() method simply to reset the state of the object, not to abort the ajax request (which is completed at this point).

Here are their implementations:

And to wrap things up, here is how these are wired into the user interface:

You might notice that I have the UI code pass in a callback function which is responsible for displaying the received messages in the UI. This provides a nice clean separation of concerns between the message handling and the UI elements.
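
To make the moving parts concrete, here is a minimal sketch of such a listener. It is not the code from the demo solution: createListener, sendRequest and the { status, message } response shape are assumptions made for illustration. sendRequest stands in for the jQuery ajax call and is expected to return an object with an abort() method, as a jqXHR does.

```javascript
// Hypothetical listener sketch; sendRequest(callbacks) is an assumed stand-in
// for the ajax call and must return a handle that has an abort() method.
function createListener(sendRequest, onMessage) {
  var listening = false;
  var pending = null; // handle of the currently open request, if any

  function listen() {
    pending = sendRequest({
      success: function (response) {
        pending = null;
        if (!listening) { return; } // late reply after abort(): ignore it
        onMessage(response.message);
        if (response.status === "continue") {
          listen(); // immediately reopen the connection
        } else {
          listening = false; // server said stop (completed or timed out)
        }
      },
      error: function (reason) {
        pending = null;
        listening = false;
        onMessage("Error: " + reason);
      }
    });
  }

  return {
    start: function () {
      if (!listening) { listening = true; listen(); }
    },
    abort: function () {
      listening = false;
      if (pending) { pending.abort(); pending = null; }
    },
    isListening: function () { return listening; }
  };
}
```

In a page using jQuery, sendRequest could be a thin wrapper that hands the two callbacks to $.ajax and returns its result. The onMessage callback is where the UI code plugs in, which keeps the message handling free of any knowledge of the page.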

You can find a complete asp.net MVC3 solution here: Comet Demo. Naturally this pattern could be implemented on any development platform, although an ajax library such as jQuery and an MVC framework that handles the details of dealing with json data keep things a lot simpler.

Let me know if you have feedback on this pattern or on the code sample.

Cheers,

Javascript videos - by Douglas Crockford of YAHOO

If you are a web developer, you almost certainly need to program in Javascript. If you need to program in Javascript, you need to watch this series of video presentations by Douglas Crockford.

Hopefully most of you (web developers) know who Mr Crockford is, but for those that don't recognize his name: he works for Yahoo, and is a well known author and presenter on Javascript topics. He is a member of the ECMAScript standards body and a general Javascript guru - developer of JSLint, the JSON spec and author of JavaScript: The Good Parts.

These lectures were given to (some members of) the Yahoo development team. The first lecture is a fascinating history of computing and language development which is really informative and sets up the other lectures (on Javascript) really well. If you don't have the time for the first lecture you can dive in on the second one and get right into the language implementation details, but I really do recommend you start with the first video. Each presentation is about 2hrs long, so make sure you've set aside enough time - it will be worth it!

I can't recommend this enough - if you're serious about your professional development as a web developer, Mr. Crockford's material is must-read and must-watch.

Long Running ASP.net Processes - a simple example

The problem

Sometimes, your web application needs to do something that takes a really long time - perhaps process a batch of files, backup or archive data, gather a bunch of data from external sources, or similar. When dealing with this situation, you're faced with a few challenges:

  • browser and other timeout settings - web frameworks aren't designed to take more than a few seconds to process a request and send back a response to the user.
  • user feedback - the user needs some sort of indication that the system is working as intended and has not frozen or encountered an error.
  • user productivity - the user may want to do something else within your app while waiting for your process to finish.

I had to solve this problem myself a little while ago and thought I'd share my solution, which has a few concepts I did not find while searching for articles on the topic:

  • Status update in the form of a log or history of process rather than just a single %age complete number used in a progress bar
  • Providing parameters into the long running process, and
  • Getting access to the HTTPContext during the long running process.

The Solution

The key technologies used in solving the posted challenges are:

  • Ajax in the browser to dynamically update the UI with the current status in a smooth and expected manner for the user.
    • I used the excellent jQuery libraries for this behavior.
  • Threads on the server to spawn a process that continues to run after a response has been given to the user.
  • Use of the ASP.net Cache as a way of communicating between the long running process and the rest of the web application.
  • Use of JSON data to pass information between the browser and the server.

I've chosen to use ASP.net MVC in my example, because it provides very easy means to work with JSON requests and responses (and the framework rocks!). However the same approach could be used with ASP.net webforms as well. The MVC framework is not core to the solution. In fact, this approach could be used in a Java application just as well, given the ability to create threads and use some means of inter-process communication (session variables or similar).

The design is pretty straightforward. Let's look at the code in the browser first.

In the Browser

I won't show the HTML markup here, but it is very simple - just a jQueryUI button for the user to click. When the user clicks the button, we display a dialog box to hold the status updates, and the process is kicked off with an ajax request.

The browser stores that processID and creates a timer which will poll the server every second for an update on the status of that process:

Obviously polling every second may not be ideal so the frequency should be adjusted to suit your needs.

When the response with the updated status returns, the UI is updated:

When the process is completed, we let the user know and trigger an optional cleanup routine on the server that I'll go into in more detail later. In this example I'm throwing an alert dialog to make it obvious to the user that the process is done, but this is just an example - you might want to do something a bit more user-friendly than that in your application.

Now, let's see what's happening on the server.

On the Server

Here is the controller action that triggers the long running process and returns a unique ID:

That is pretty straightforward and not very interesting, other than to point out that the only job of this action is to trigger the long running process, not do any of the actual work. More interesting is how it triggers the process:

Here we create a new Thread to do the actual work. 'MyLongRunningProcess' is a method that we created to do the actual work. The 'ParameterizedThreadStart' delegate allows us to pass a single parameter to that method, so we use a simple object to hold whatever data we want to use during that process.

There are a few interesting points about the parameters:

  1. Because we've used a simple object to hold the data passed to the process, we can pass in whatever complex information and objects we might need.
  2. I've passed in a GUID processID that the long running process can use to tag its status updates, and I've passed in the current HTTPContext so that the long-running process can have access to the environment of the web application. This is particularly useful if you are migrating controller code that may have assumptions/dependencies on the HTTPContext.

Here is the long running process itself:

Notice that I've assigned the HTTPContext to the System.Web.HttpContext.Current property. I was actually surprised that I was able to do this. We are now keeping a handle to that Context which would otherwise be disposed of after the original request was made. I am assuming that when this thread finishes execution, things will be disposed of during the normal .net garbage collection process, so I don't explicitly null the System.Web.HttpContext.Current at the end of the process. Perhaps I should - anyone with a bit more insight into this please leave a comment!

The process leaves status updates in the HTTPCache:

All we are doing is appending a new message onto the end of the existing status, which is simply an html string. This is very simple and actually works really well, but there are a few issues with this that I want to highlight:

  • Because of the polling design, we won't know when the user has the very last status update, so I rely on a final action triggered by the browser to clean up the cache entry. Obviously this isn't guaranteed to happen - the user might close the browser or navigate to a different page, or the network might fail and not deliver the cleanup request.
    • However, because the ASP.net cache will eventually kick the entry out of the cache based on its expiry rules, we don't have to worry about a permanent buildup of garbage like we would if we were storing status in a file or other more permanent resource.
  • The amount of data passed from the server to the client grows with each update to the status, and the majority of that data is redundant. Depending on the polling frequency, if your status update includes lists of thousands of files or similar, this could quickly become a real performance issue.
    • This design could be refactored to only retrieve status updates that the client hasn't already received. Because of the inherent unreliability of http, we can't just delete the status information once we send it back to the browser. Instead, requests for status updates would have to include some type of pointer to the last update received (datetime might suffice, but I think an ID of some sort would be more reliable), and the server would send only the updates that have occurred after that point.
  • The example stores the status as HTML (it includes <br /> tags as line separators). This is not a good separation of design/layout and content/logic.
    • An improvement would be to separate the updates with some other token (newlines), parse the data on the client, and apply whatever layout styling is appropriate, perhaps using jQuery templates.
    • A variation would be to have the updateStatus method apply some type of template rendering (HTML.RenderPartial()?) to the status before it is inserted into the cache.
  • The example uses a magic string in the returned status to determine when the process is completed. This is also not a good separation of design/layout and content/logic, and the status update could incorrectly think the process is finished if your status update includes that magic keyword before the process is finished - such as having the keyword in a filename, or similar.
    • An improvement would be to return the 'complete or not' status (or a %age complete amount) as an additional parameter in the status update response, which the browser could check. This could be stored in a separate cache element (make sure to use the processID in the key).
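
Two of the improvements above (sending only the updates the client hasn't seen yet, and reporting completion as a separate flag rather than a magic string) can be sketched together. This is an illustration in JavaScript rather than the C# of the sample, and createStatusStore and its method names are made up; a plain object stands in for the ASP.net cache.

```javascript
// Hypothetical in-memory status store keyed by process ID.
function createStatusStore() {
  var processes = {}; // processId -> { entries: [{ id, text }], complete: boolean }

  function entryFor(processId) {
    if (!processes[processId]) {
      processes[processId] = { entries: [], complete: false };
    }
    return processes[processId];
  }

  return {
    // Called by the long running process; IDs are 1, 2, 3... per process.
    append: function (processId, text) {
      var entries = entryFor(processId).entries;
      entries.push({ id: entries.length + 1, text: text });
    },
    // Completion is a flag of its own, so message text can never trigger it.
    markComplete: function (processId) {
      entryFor(processId).complete = true;
    },
    // Called by the status action; lastId is the last ID the client
    // received (0 if it has received nothing yet).
    since: function (processId, lastId) {
      var p = processes[processId] || { entries: [], complete: false };
      return { updates: p.entries.slice(lastId), complete: p.complete };
    },
    // Called by the cleanup action.
    remove: function (processId) {
      delete processes[processId];
    }
  };
}
```

The client remembers the id of the last update it displayed and sends it with the next request. Nothing is deleted when a response goes out, so a lost response just means the same updates are returned again on the next poll.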

Here is the action used to return the status update to the browser:

And here is the action used to cleanup the cache entry when we're done:

General notes and comments

In this example, the user can trigger multiple processes by clicking the button again before existing processes are finished. Each status dialog will get its own updates appropriately. The user can also close the dialogs before the process is finished. This only hides the dialog. The updates continue to happen, the user just can't see them. You might need to provide a way for the user to bring the dialog back up again. Obviously the choice of a dialog in the first place is just one of convenience - you can put the status updates wherever you like.

Note - this process still runs inside the asp.net application pool, so is subject to issues like app-pool recycling for various reasons, which would terminate your long running process. If you have a really long running process, or need one that is guaranteed to stay running through application restarts, you need to create a windows service, run the thread there, and communicate to and from the web application in a completely different manner - something beyond the scope of this blog post. 🙂

I hope this helps someone needing to do something similar in their application. Feel free to leave questions or comments!

Here is a link to the source code: Long Running Process Example

Learning another language

I've been meaning to learn another programming language as a way of 'sharpening my tools', so to speak. For years I've had my eye on LISP, but wasn't sure it would be worth my time, given the predominance of the MS .net stack, Java, javascript, and some of the new scripting languages - ruby, python, etc. But I stumbled across a recommendation to watch a series of videos on LISP from MIT professors Hal Abelson and Gerald Jay Sussman.

They are freely posted here: http://www.archive.org/details/mitocwsicp

Well I tried the first lecture, and while I chuckled at the dress and hairstyles of the early 80's, I was quickly hooked on the content. I quickly realized that this wasn't just a course on LISP, but rather a great course on the really great concepts of computer programming. I'm on the 3rd lecture so far, and having a blast. This might be old-hat for some CS grads, but I'd still really recommend checking these out if you want to geek out on some advanced ideas.

Cisco VPN for Win7 x64

The Cisco VPN client won't run on 64bit versions of windows, and Cisco has no plans to ship a 64 bit client anytime soon. In order to get VPN working on my shiny new Windows 7 x64 install, I tried to get the Cygwin/linux vpnc client working. I ran into a number of problems, so this post explains how it can be done and hopefully will help you avoid the same problems I had. I spent a lot of time researching information on the 'net, and wound up tweaking steps a bit and hacking scripts myself, but much of this is based on work by Li Zhao and Salty.

NOTE: update 22 Mar 2010 - The Shrew.net client supports Win7 now, and seems to work flawlessly. It's a much better option if you don't already have a need for Cygwin.

To get Cisco VPN working on an x64 Windows 7, you need:

  • Cygwin - a wonderful set of tools to emulate a linux environment on windows
  • OpenVPN - to provide a virtual TAP network adapter
  • VPNC - the VPN Client

Here's how to do it: I turned off UAC for the purposes of getting all of this installed. I ran into a number of permissions issues and false starts when I left it turned on. This reduces the number of things that can go wrong in the install process, particularly with Cygwin. Feel free to turn UAC back on after the install process if you're so inclined.

Install Cygwin:

  • Download the cygwin installer (http://www.cygwin.com/setup.exe) and save it to your hard drive (don't just run it directly)
  • Right-click on the setup file, and 'Run as administrator'. This may not actually be necessary but it makes me feel good.
  • Accept defaults except for local packages directory - I changed that to c:\cygwin\packages. Saving stuff to my desktop is annoying.
  • Choose a mirror to download packages from. Choose something physically close to you.
  • Choose the following additional packages to include: (choose them by clicking on the word 'skip' in the 'New' column)
    • Devel > gcc-g++: C++ Compiler
    • Devel > make: The GNU version of the 'make' utility
    • Libs > libgcrypt: A general purpose crypto library based on the code from GnuPG
    • Libs > libgcrypt-devel: A general purpose crypto library based on the code from GnuPG (development)
    • Libs > libgpg-error: A library that defines common error values for GnuPG
    • Perl > perl: Larry Wall's Practical Extraction and Reporting Language
  • The rest of the cygwin install process is pretty obvious; just a couple more mouse clicks.

Install OpenVPN:

  • Download the latest OPENVPN release candidate from http://openvpn.net/index.php/open-source/downloads.html
  • Save it to your hard drive somewhere you can find it.
  • Make sure to run the installer as an administrator and in windows vista compatibility mode: (this IS necessary!!)
    • Open up windows explorer to that location.
    • Right click on the file you saved, and select 'properties'
      • Select the 'compatibility' tab
      • In the Compatibility mode section, check the box 'Run this program in compatibility mode for:', choose Windows Vista (service pack 2)
      • In the Privilege Level section, check the box 'Run this program as an administrator'
      • OK/Apply the settings.
  • Now run the installer. Agree to the license agreement.
  • Uncheck all of the options EXCEPT for: 'TAP Virtual Ethernet Adapter', and 'add shortcuts to Start Menu' (in case you want to add another vpn connection)
  • Install to default location.
  • The installer should have created a TAP adapter in Control Panel\Network and Internet\Network Connections. It's probably called Local Area Connection 2. You NEED to rename it to something without a space in the name and something you'll remember. I called mine 'CiscoVPN'.

Build and install VPNC:

  • Download the latest release of vpnc from http://www.unix-ag.uni-kl.de/~massar/vpnc/ - I got http://www.unix-ag.uni-kl.de/~massar/vpnc/vpnc-0.5.3.tar.gz
  • Save the file to a convenient temporary location, such as c:\cygwin\tmp
  • Right click on the Cygwin Bash Shell shortcut (should be on your desktop and/or start menu) and select 'Run as administrator' to open up a cygwin bash command prompt. IT IS CRITICAL THAT YOU RUN THIS AS ADMINISTRATOR OR YOU WILL FAIL AT CERTAIN POINTS IN THIS PROCESS SILENTLY!!!
  • Now extract the contents of the file and change to the extracted folder:
$ cd /tmp
$ tar xvfz vpnc<tab>
$ cd vpnc<tab>
  • Now, compile it with make:
$ make
$ make PREFIX=/usr install
$ mkdir /var/run/vpnc
  • For some reason my own Cygwin install did not put /usr/sbin into the 'path' environment variable. Not sure why that is. You could add c:\cygwin\usr\sbin to your system path via advanced system properties in Windows, but here's how I fixed it: I edited my .bashrc file (found in your 'home' directory: /home//.bashrc ), and added these two lines:
export PATH=${PATH}:/usr/local/bin
export PATH=${PATH}:/usr/sbin/
  • You'll want to log out of the bash shell and log in again to have that take effect. REMEMBER TO 'RUN AS ADMINISTRATOR'
  • You can check to make sure you can 'find' vpnc with this command:
$ which vpnc
/usr/sbin/vpnc

Repair the VPN Routing configuration script:

  • This is the bit that's broken in Win 7, or perhaps all x64 bit OS. The script in question is: /etc/vpnc/vpnc-script-win.js. I've posted my updated script here: vpnc-script-win. Note that I added a bunch of debugging information to the script. You can make it much quieter by removing or commenting out the two echo commands inside the run function definition on lines 17 and 19.
  • The script supplied with vpnc relies on the execution of the "route print" command to extract the default gateway on this computer. For some reason the results of that command do not format the default gateway information in Win 7 x64 in the expected manner (perhaps other versions as well - I've seen references to this problem with Vista x64).
  • To fix the script, you need to replace the function getDefaultGateway with this one I hacked together:
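function getDefaultGateway()
 {
 var output = run( "route print 0.0.0.0" );
 // find the default route row (destination 0.0.0.0, netmask 0.0.0.0)
 var pos = output.indexOf("0.0.0.0          0.0.0.0 ");
 if (pos < 0) return ""; // no default route found
 pos += 24; // skip the destination and netmask columns
 while (output.charAt(pos) == " ") pos++; // gateway column is right-aligned, so skip its padding
 var end = pos;
 while (end < output.length && " \r\n".indexOf(output.charAt(end)) < 0) end++; // read up to the next space or line end
 var gw = output.substring(pos, end);
 echo("Default Gateway: [" + gw + "]");
 return gw;
 }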
  • I suck at regular expressions so I didn't try to use them for this. If you're better with regular expressions than I am feel free to post a comment with an improved function.
  • I also found it necessary to add a small pause to the script before adding the internal routes. Without the pause, the 'route add' commands were not adding correct routes in every case.
echo("Pausing for 4 seconds to allow the adapter to register itself correctly and therefore correct routing inferences made. You may need to supply a longer delay.");
WScript.Sleep(4000);
  • I put that sleep command just before line: if (env("CISCO_SPLIT_INC")) {
  • I also noticed that after disconnecting (ctrl-c), there were still stray routes in the route table. I'm not sure that it matters because the route to the main vpn gateway is properly removed, but I went ahead and added some code to remove all of the routes added by the script when disconnecting:
run("route delete " + env("VPNGATEWAY") + " mask 255.255.255.255");
//remove internal network routes
if (env("CISCO_SPLIT_INC")) {
  for (var i = 0 ; i < parseInt(env("CISCO_SPLIT_INC"), 10); i++) {
    var network = env("CISCO_SPLIT_INC_" + i + "_ADDR");
    run("route delete " + network );
  }
}
  • Alternatively, I've posted an updated file here: vpnc-script-win Just copy that to your /etc/vpnc folder.
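
For readers who do want a regular expression for the default gateway lookup, here is one possible sketch. It assumes the default route row of "route print" lists destination, netmask and gateway in that order, separated by spaces. parseDefaultGateway is a hypothetical helper, not part of vpnc; it is written with var and no newer syntax in the hope that it also runs under Windows Script Host, but it has not been tested there.

```javascript
// Hypothetical helper: pull the gateway out of "route print" output.
// Returns "" when no default route with a numeric gateway is found.
function parseDefaultGateway(routePrintOutput) {
  // Row looks like: 0.0.0.0   0.0.0.0   <gateway>   <interface>   <metric>
  var row = /^[ \t]*0\.0\.0\.0[ \t]+0\.0\.0\.0[ \t]+(\d{1,3}(?:\.\d{1,3}){3})[ \t]/m;
  var match = row.exec(routePrintOutput);
  return match ? match[1] : "";
}
```

Inside the script it would be used as var gw = parseDefaultGateway(run("route print 0.0.0.0")); in place of the string searching. Because it matches on whitespace rather than fixed column offsets, the length of the gateway address does not matter.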

Create your VPNC Configuration file:

  • You need to convert your old Cisco .pcf file into a config file for vpnc in order to get the shared secret included in that file carried over correctly, and to make sure you have the ip address of the vpn gateway, etc. So, first copy your existing cisco .pcf file into somewhere convenient. /tmp is pretty convenient. Then run the pcf2vpnc utility:
$ cd /tmp
$ pcf2vpnc <old Cisco filename>.pcf /etc/vpnc/<profilename>.conf
  • <profilename> is whatever unique name you want it to be. Note that we're saving the new configuration file to /etc/vpnc. You could always copy the created file to that location later.
  • Now, edit the file it created with your favorite unix friendly editor. Windows Notepad is no good. Go download Notepad++ right now; I'll wait. Got it? OK. Open the file /etc/vpnc/<profilename>.conf
  • You'll need to add a few items to the file:
Interface name <NameOfTheInterfaceYouPickedEarlier> 
# mine is: CiscoVPN

Interface mode tap
Pidfile /var/run/vpnc/<uniqueName>.pid 
# need a unique one per profile, so may as well use <profilename>.pid

Local Port 0  #auto selects a port
NAT Traversal Mode force-natt
No Detach
  • You really shouldn't do it, but you can also enter your password in this config file if you don't mind that password being there in plain text. The syntax for that is:
Xauth password <yourpassword>  # You've got a secure computer, right? Really? Are you sure?
  • If you have any trouble, you can add a debug flag to your config file:
Debug 1     # valid values: 1-3, 99.  99 = everything, including authorization information (passwords), so be careful

Putting it all together:

  • Actually there's not much to put together, you just need to use your shiny new vpnc command:
$ vpnc <profileNameWithoutExtension>
  • You could drop that command into an executable file to save a few keystrokes. I'll leave figuring out how to launch the cygwin bash shell and execute that command from a windows batch file (and so from an icon on your desktop) up to you.

I hope this works for you, but I can't guarantee anything! Feel free to post comments with your experiences.

Update 10 Aug 09:

I have had issues with routing within the network if the vpn concentrator provides me with a new IP address. I'm not sure exactly why that happened or what to do in order to fix the routing - I'm not a network routing guru. I did figure out how to work around this issue though. In Control Panel\Network and Internet\Network Connections I went to the TCP/IPv4 settings and switched to dhcp ('Obtain an IP address automatically', 'Obtain DNS server address automatically'). These settings are overwritten when connecting to the vpn again, but for some reason making that change clears out something so that when connecting via vpnc the routing works again. For today anyways. 🙂

Cheers,

Allan

Update 11 Aug 09:

I forgot to mention that in my research, this VPN client: http://www.shrew.net/software was mentioned by a number of folks as a valid free alternative. It actually looks pretty good, but I didn't try it, as I wanted to get the vpnc option figured out. I always install cygwin on my windows boxes so I didn't mind going down this route. If anyone has experience with the shrewsoft client on x64 bit windows, please comment here.