Nerdworks Blogorama

Nerdspeak

Writing a sensor driver for the Wiimote on Windows 7
Technobabble
2/5/2010 4:57:28 PM  

Ever since I saw Johnny Chung Lee's demos of the innovative ways in which the Nintendo Wiimote can be used as an input device for the PC, I've been hooked! The Wiimote is a surprisingly self-contained piece of hardware that is able to operate independently of the Wii console. Among other things, the Wiimote features a 3-axis accelerometer, which is the primary enabler for letting you do things like wave a virtual tennis racquet or roll a bowling ball down the alley.

When Microsoft released Windows 7, one of the new things that they added to the system was a brand new platform for managing a certain class of hardware devices known as "sensors". Sensors are basically devices that, well, sense things! A GPS device for example is a sensor that can provide geographical location information. Another example is an ambient light sensor that lets the system know how bright the ambient light is. Now, drivers for these kinds of devices could be written and used even before Windows 7; it's just that now we have standard ways of exposing and consuming sensor data so that device vendors and application developers are able to communicate with each other in non-proprietary ways.

So, putting the two together, I figured it'd be kind of neat to come up with a sensor driver that exposed the accelerometer data from the Wiimote to the sensor platform. After a bit of head-scratching I managed to put something together. I've written up an article about it and posted it over at CodeProject.com. If you're interested, you can head over there and read all about it! Here's the link:

 
Enabling JSONP calls on ASP.NET MVC
Technobabble
10/19/2009 2:15:09 PM  

JSONP is a technique for making cross-domain HTTP requests from JavaScript that return data in the JSON format, circumventing the browser security restrictions that normally block such requests. By cross-domain, we refer to the practice of making HTTP requests to URLs that refer to resources residing on a domain other than the one from which the page being viewed/executed was loaded. You'll find a good description of what JSONP is all about here. Briefly, the technique exploits a browser back-door where the SRC attribute on a SCRIPT tag is allowed to be a "foreign" URL and the browser will download the code and evaluate it. But then again, this is perhaps not so much a back-door as a feature that allows you to build mash-ups by composing code that draws on functionality hosted on different servers. If this were not allowed then something like Content Delivery Networks (CDNs) would just not work. In fact this blog that you're reading right now loads the jQuery JavaScript library via the Google CDN.

Here's an example of this in practice - I've loaded up the 5 most "interesting" photos uploaded on Flickr in the last day or so below:

And here's the code that accomplishes this with a little jQuery magic:

//
// Flickr REST url
//
var url = "http://api.flickr.com/services/rest/?";

//
// My Flickr API key
//
var api_key = "<<your flickr api key here>>";

//
// build a Flickr URL from a photo object
//
function buildPhotoUrl(photo) {
    return "http://farm" + photo.farm +
           ".static.flickr.com/" + photo.server + "/" +
           photo.id + "_" + photo.secret + "_t.jpg";
}

//
// get interesting photos
//
function getInterestingPhotos() {
    //
    // build the URL
    //
    var call = url + "method=flickr.interestingness.getList&api_key=" +
               api_key + "&per_page=5&page=1&format=json&jsoncallback=?";

    //
    // make the ajax call
    //
    $.getJSON(call, function(rsp) {
        if (rsp.stat != "ok") {
            //
            // something went wrong!
            //
            $("#interesting_photos").append(
                "<label style=\"color:red\">Whoops!  It didn't work!" +
                "  This is embarrassing!  Here's what Flickr had to " +
                " say about this - " + rsp.message + "</label>");
        }
        else {
            //
            // build the html
            //
            var html = "";
            $.each(rsp.photos.photo, function() {
                var photo = this;
                html += "<span><img src=\"" + buildPhotoUrl(photo) +
                        "\" title=\"" + photo.title + "\" alt=\"" + photo.title +
                        "\" /></span> ";
            });

            //
            // append this to the div
            //
            $("#interesting_photos").append(html);
        }
    });
}

//
// get the photos
//
$(getInterestingPhotos);

The basic idea is to dynamically add a SCRIPT tag to the DOM where the SRC attribute is pointing to the external URL where the data that we want resides. Upon encountering a new SCRIPT node, the browser immediately starts loading the code and evaluating it (and also freezes pretty much everything else in the browser while it is doing this!). If we can have the external resource render a call to a function that we define (or to a well-known function name) in the script that it produces then we can effectively have a callback routine invoked when the data is received from the foreign domain. The function to be called when the script dynamically loads is usually passed in as a query string parameter in the GET request or can be a function name that is mandated by the 3rd party site. Flickr for example defaults to rendering a call to a function called jsonFlickrApi but allows you to override this by passing a different name via the jsoncallback query string parameter.

jQuery has direct support for JSONP in that if you include a callback=? parameter in the AJAX URL for the getJSON call then it will replace the ? with a dynamically generated global JavaScript function name that it also defines. The word callback can be replaced with anything else as we did above by using jsoncallback. All the grunt work of dynamically adding SCRIPT tags to the DOM is done by jQuery.
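To make that grunt work a little more concrete, here's a rough sketch of the script-tag mechanics. This is not jQuery's actual implementation; the jsonp helper, its parameters and the __jsonp_cb_ naming scheme are all made up for illustration. The document and global objects are passed in explicitly so that the sketch can also run outside a browser.

```javascript
// Minimal JSONP helper (illustrative only). `doc` and `globalObj` are injected;
// in a page you would pass `document` and `window`.
var jsonpCounter = 0;

function jsonp(url, callbackParam, onData, doc, globalObj) {
    // Generate a unique global function name for this request.
    var name = "__jsonp_cb_" + (jsonpCounter++);
    var script = doc.createElement("script");

    globalObj[name] = function (data) {
        // Clean up the global and the SCRIPT node, then hand over the data.
        delete globalObj[name];
        if (script.parentNode) {
            script.parentNode.removeChild(script);
        }
        onData(data);
    };

    // Tell the server which function to call, e.g. "...&jsoncallback=__jsonp_cb_0".
    var separator = url.indexOf("?") === -1 ? "?" : "&";
    script.src = url + separator + encodeURIComponent(callbackParam) + "=" + name;

    // Appending the node is what triggers the download and evaluation.
    doc.getElementsByTagName("head")[0].appendChild(script);
    return name;
}
```

In a browser, a call along the lines of jsonp(flickrUrl, "jsoncallback", handler, document, window) would play the role that getJSON plays above, with the difference that this helper appends the callback parameter itself instead of looking for a ? placeholder.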

Now, imagine that you designed your own little REST based service using ASP.NET MVC and have opted to provide JSON as one of the data output formats and wish to support JSONP requests. This basically means that you simply need to wrap your JSON output in a call to a user-supplied or a standard JavaScript function. So instead of rendering something like this out to the client:

{
    "photos" : [
        {
            "id" : 232992,
            "secret" : "bAC980980c09c08a0ef"
        }
    ],
    "page" : 1,
    "total" : 500,
    "photosPerPage" : 10
}

You want to render something like this:

callback({
    "photos" : [
        {
            "id" : 232992,
            "secret" : "bAC980980c09c08a0ef"
        }
    ],
    "page" : 1,
    "total" : 500,
    "photosPerPage" : 10
});

Where callback is the name of the JavaScript function defined by the client that needs to be invoked when the script is evaluated by the browser. If you wanted your ASP.NET MVC controllers to support this, how would you do it? The first approach I took was to simply define a custom ActionResult class that would produce the correct script. Here's what this looks like:

/// <summary>
/// Renders result as JSON and also wraps the JSON in a call
/// to the callback function specified in "JsonpResult.Callback".
/// </summary>
public class JsonpResult : JsonResult
{
    /// <summary>
    /// Gets or sets the javascript callback function that is
    /// to be invoked in the resulting script output.
    /// </summary>
    /// <value>The callback function name.</value>
    public string Callback { get; set; }

    /// <summary>
    /// Enables processing of the result of an action method by a
    /// custom type that inherits from <see cref="T:System.Web.Mvc.ActionResult"/>.
    /// </summary>
    /// <param name="context">The context within which the
    /// result is executed.</param>
    public override void ExecuteResult(ControllerContext context)
    {
        if (context == null)
            throw new ArgumentNullException("context");

        HttpResponseBase response = context.HttpContext.Response;
        if (!String.IsNullOrEmpty(ContentType))
            response.ContentType = ContentType;
        else
            // the response body is a script, not bare JSON
            response.ContentType = "application/javascript";

        if (ContentEncoding != null)
            response.ContentEncoding = ContentEncoding;

        if (Callback == null || Callback.Length == 0)
            Callback = context.HttpContext.Request.QueryString["callback"];

        if (Data != null)
        {
            // The JavaScriptSerializer type was marked as obsolete
            // prior to .NET Framework 3.5 SP1 
#pragma warning disable 0618
            JavaScriptSerializer serializer = new JavaScriptSerializer();
            string ser = serializer.Serialize(Data);
            response.Write(Callback + "(" + ser + ");");
#pragma warning restore 0618
        }
    }
}
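Stripped of the MVC plumbing, the heart of ExecuteResult is plain string wrapping. Here's a rough JavaScript sketch of that step. The toJsonp name is hypothetical, and the check that the callback looks like a plain (optionally dotted) identifier is an extra precaution that the C# class above does not perform; without something like it, whatever arrives in the query string is echoed straight into the script.

```javascript
// Wrap a value in a JSONP call (illustrative sketch; toJsonp is a made-up name).
function toJsonp(callbackName, data) {
    // Accept only identifiers such as "callback" or "app.handlers.onData".
    var validName = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/;
    if (typeof callbackName !== "string" || !validName.test(callbackName)) {
        throw new Error("Invalid JSONP callback name: " + callbackName);
    }
    return callbackName + "(" + JSON.stringify(data) + ");";
}

var script = toJsonp("callback", { page: 1, total: 500 });
// script is 'callback({"page":1,"total":500});'
```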

Now you can simply return a JsonpResult (note the "p" before the "R" in "Result") from your action methods instead of a JsonResult and everything should just work. Clients would indicate that this is a JSONP request by appending a query string parameter called callback to the request with the name of the JavaScript function that is to be called. I wasn't, however, entirely happy with this situation as it would mean that I'd have to go and change the return type of all my action methods to JsonpResult instead of JsonResult, and also update the code that instantiated JsonResult. I wanted a less disruptive solution, if you will.

It is precisely for scenarios such as this that the ASP.NET MVC framework includes extension points in the form of action filters. Action filters are basically hooks that you can define to intercept and enhance request processing in various ways. ASP.NET MVC allows you to define filters that get called before and/or after the action method is invoked and also before and/or after the ExecuteResult method on the ActionResult is invoked. Action filters are defined as .NET custom attribute classes that inherit from ActionFilterAttribute. ActionFilterAttribute defines 4 virtual methods (OnActionExecuting, OnActionExecuted, OnResultExecuting and OnResultExecuted) that you can override to hook into the request processing pipeline at the right juncture. The OnActionExecuted method for instance is invoked immediately after the action method has been invoked and gives you a chance to further process the returned result. This perfectly suits our purpose here, where we wish to conditionally supplant the JsonResult object being returned from the action method with a JsonpResult instead. Here's what I came up with:

public class JsonpFilterAttribute : ActionFilterAttribute
{
    public override void OnActionExecuted(ActionExecutedContext filterContext)
    {
        if(filterContext == null)
            throw new ArgumentNullException("filterContext");

        //
        // see if this request included a "callback" querystring parameter
        //
        string callback = filterContext.HttpContext.Request.QueryString["callback"];
        if (callback != null && callback.Length > 0)
        {
            //
            // ensure that the result is a "JsonResult"
            //
            JsonResult result = filterContext.Result as JsonResult;
            if (result == null)
            {
                throw new InvalidOperationException("JsonpFilterAttribute must be applied only " +
                    "on controllers and actions that return a JsonResult object.");
            }

            filterContext.Result = new JsonpResult
            {
                ContentEncoding = result.ContentEncoding,
                ContentType = result.ContentType,
                Data = result.Data,
                Callback = callback
            };
        }
    }
}

As is perhaps self-evident, we simply check if the query string included a parameter called callback and if it did, then we supplant the Result object in filterContext with a new JsonpResult instance. This approach allows us to enable JSONP processing on a controller or an action method by simply tagging on the JsonpFilter attribute. If I wanted all of the methods in a controller to support JSONP, I could do this:

[JsonpFilter]
public class DoofusController : Controller
{
    //
    // all your action methods here
    //
}

Or if I wanted it to work with only a specific action method then I'd do this:

public class DoofusController : Controller
{
    [JsonpFilter]
    public JsonResult DoTheThing(string data, string moreData)
    {
        return new JsonResult
        {
            Data = FetchSomeData(data, moreData)
        };
    }
}

ASP.NET MVC is kind of cool eh?! :-P

 
Memoization - Optimize your function calls
Technobabble
7/31/2009 12:04:34 PM  

Memoization is an optimization technique where the idea is to create functions that cache computed values for various input sets so that subsequent invocations can return the value from the cache instead of re-computing the results. It is a simple space-for-time trade-off where processing time is reduced at the expense of increased memory use. I figured I'd implement an automatic memoization mechanism in JavaScript that will work well under certain well-defined constraints. Here's what I came up with:

function memoize( fn ) {
    //
    // check if this function has already been memoized
    //
    if( typeof( fn.__fn ) == "undefined" ) {
        //
        // the cache where the results are to be stored
        //
        fn.__cache = {};
        
        //
        // memoized version of the function
        //
        fn.__fn = function() {
            //
            // build the key that represents the given input set;
            // note that this thing works only so long as the operation
            // "toString" returns a meaningful result on all of the
            // parameters
            //
            var key = "";
            for( var i = 0 ; i < arguments.length ; ++i )
                key += arguments[i].toString() + "-";
            
            //
            // if the result for the current parameter set already exists
            // in the cache then return that; otherwise call the original
            // routine and store the result
            //
            if( typeof( fn.__cache[key] ) == "undefined" )
                fn.__cache[key] = fn.apply( null, arguments );
            return fn.__cache[key];
        }
    }
    
    return fn.__fn;
}

Imagine that you are tasked with the job of writing a JavaScript function that returns the factorial of a given number. If you wished to write a memoized version of the function here's what you'd do:

var fac = memoize( function( n ) {
    if( n <= 0 )
        return 1;
    return ( n * fac( n - 1 ) );
} );

If you invoked fac passing 5 for instance, and traced the control flow, here's the sequence of calls you'd see:

. __fn(5)
.. fac(5)
... __fn(4)
.... fac(4)
..... __fn(3)
...... fac(3)
....... __fn(2)
........ fac(2)
......... __fn(1)
.......... fac(1)
........... __fn(0)
............ fac(0)

The number of dots to the left of the function names indicates the stack depth and here it tends to increase because fac is a recursive function and keeps calling itself till the termination criterion is met. This seems like a drawback: where you'd have had a sequence of calls to just fac, we now see it interspersed with calls to __fn as well, effectively doubling the number of function calls that need to be made. The benefit, however, is realized when we observe what happens when fac is invoked a second time, passing 5.

. __fn(5)

As you can see, this time around, there was only a single function call. No further computation was needed as the required result was already available in the cache and only a simple look-up operation was performed. Observe what happens when we pass 3 next.

. __fn(3)

This was resolved via cache look-up also because the first call with 5 had recursively invoked fac with 3 as well. On a similar note, passing 7 produces the following call sequence:

. __fn(7)
.. fac(7)
... __fn(6)
.... fac(6)
..... __fn(5)
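
The call counts in these traces are easy to check with a counter. The sketch below restates memoize in a compact form (same key scheme, but with the cache held in a closure variable rather than on the function object) and counts how often the original, un-memoized body actually runs; the originalCalls variable is there purely for illustration.

```javascript
// Compact restatement of memoize using the same "arg-arg-" key scheme.
function memoize(fn) {
    var cache = {};
    return function () {
        var key = "";
        for (var i = 0; i < arguments.length; ++i) {
            key += arguments[i].toString() + "-";
        }
        if (typeof cache[key] === "undefined") {
            cache[key] = fn.apply(null, arguments);
        }
        return cache[key];
    };
}

var originalCalls = 0;   // how many times the un-memoized body runs

var fac = memoize(function (n) {
    ++originalCalls;
    if (n <= 0) {
        return 1;
    }
    return n * fac(n - 1);
});

fac(5);
var afterFirst = originalCalls;     // 6: the body ran for 5, 4, 3, 2, 1 and 0
fac(5);
fac(3);
var afterRepeats = originalCalls;   // still 6: both calls were cache hits
fac(7);
var afterSeven = originalCalls;     // 8: only 7 and 6 needed computing
```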

Automatic memoization ensures that the original routine is called only the bare minimum number of times - in this case, for the values 7 and 6. The implementation of memoize given above is useful only so long as the following hold true:

  1. Each parameter passed to the actual function has a toString method that returns a meaningful value; if you passed a custom object for instance, you're going to have to define a toString method on it.
  2. The CPU overhead of constructing a unique key for a given parameter set and doing hashtable lookups should be lower (preferably significantly) than simply performing the actual work.
  3. For obvious reasons, the function should have at least 1 parameter!
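
The first constraint has a subtle corollary: because the key is built by joining toString results with "-", two different argument lists can end up with the same key. Calling with ("a-", "b") and with ("a", "-b") both produce "a--b-", for instance. One possible way around this, sketched below under the assumption that the arguments are JSON-serializable, is to use JSON.stringify to build the key. The memoizeJson name is made up for this example; values that JSON cannot represent faithfully (functions, undefined, NaN) would still need special handling.

```javascript
// Variant of memoize that keys the cache on the JSON form of the argument list.
function memoizeJson(fn) {
    var cache = {};
    return function () {
        // '["a-","b"]' and '["a","-b"]' are distinct keys.
        var key = JSON.stringify(Array.prototype.slice.call(arguments));
        // hasOwnProperty lets results that are themselves undefined be cached too.
        if (!Object.prototype.hasOwnProperty.call(cache, key)) {
            cache[key] = fn.apply(this, arguments);
        }
        return cache[key];
    };
}

var calls = 0;
var join = memoizeJson(function (a, b) {
    ++calls;
    return a + "|" + b;
});

var first = join("a-", "b");    // "a-|b"
var second = join("a", "-b");   // "a|-b", computed separately
```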

If you've been following this blog, you know that I am trying to learn the Common Lisp programming language. I figured, I'll try and implement memoize in Common Lisp as well, to see if I'd learned enough of it to be able to do this. After looking up the Common Lisp Hyperspec a few times, I was in fact able to come up with an implementation that does essentially the same thing as what the JavaScript version above does. Here goes:

(defun make-key (lst)
  (format nil "~{~A-~}" lst))

(defun memoize (fn)
  (let ((cache (make-hash-table :test 'equal)))
    #'(lambda (&rest p)
        (let ((key (make-key p)))
          (if (not (gethash key cache))
            (setf (gethash key cache) (apply fn p)))
          (gethash key cache)))))

(setf fac (memoize #'(lambda (n)
                        (if (<= n 0) 1
                            (* n (funcall fac (- n 1)))))))
 
Random Lisp thoughts
Technobabble
7/23/2009 9:04:37 AM  

I have covered about 4 chapters from the book Practical Common Lisp and what I have learnt so far is quite fascinating to say the least. Here are some random observations that I happened to make about Common Lisp. Please note that whenever I use the term Lisp below I refer to the Common Lisp dialect of the language.

  • Contrary to what one might conclude upon encountering a typical Lisp program, the basic Lisp syntax is actually quite minimalistic. It is so minimalistic in fact that I can probably explain all of Lisp syntax in about one line (OK, not all of it, but maybe a significant chunk of it - the point is, the basic structure of Lisp code is fairly straightforward)! Here goes:

    Any Lisp code is a space-delimited list of "things" and nested lists, enclosed in parentheses.

    By "things", I mean pretty much everything that can go into Lisp code. Here's some lisp code:

    (believe it or not but this is some lisp code!)

    And here's another example with nested lists:

    (lisp code (with nested lists))

    If you tried entering this stuff at an interactive Lisp shell prompt, however, you are bound to have been slapped with some errors, and that's because even though this conforms to Lisp code structure it really doesn't mean anything. It's like trying to feed some random bit of XML to an XSLT processor, or trying to pass an XSL file to the ANT program. In each of these cases, though we are passing well-formed XML, it doesn't have the tags that the respective program is looking for.

    What we have accomplished with our definition above therefore is to specify what Lisp expressions are to look like. Imagine specifying the entire XML specification in one line (which of course is quite impossible because the actual spec runs to about 50 pages)! For a fully featured multi-paradigm programming language I think this is quite an incredible feat!

    The technical Lisp term for what we have defined above is an s-expression. Any piece of valid Lisp code is always a valid s-expression but, as you might have discovered if you tried getting a Lisp interpreter to parse the examples given above, not all valid s-expressions are valid Lisp code. And that is where the typical Lisp noob (such as yours truly) spends time learning the language. But as is perhaps self-evident, the terseness of the basic syntax for Lisp code goes a long way in accelerating the learning process and all the apparently confounding parentheses actually end up being a rather natural way of doing things.

  • The basic Lisp form is so simple that I was able to whip up a little Lisp parser in JavaScript in about half an hour! Here's the complete parser:

    //
    // Each node in the parse tree can be a list or a symbol.
    //
    var NodeType = {
        List: 0,
        Symbol: 1
    };
    
    //
    // Each node is defined by its type and content which in
    // turn can be another node (list) or a symbol name.
    //
    function Node(type, content) {
        this.type = type;
        this.content = content;
    }
    
    //
    // Helper function that generates a "Node" object given
    // a symbol name.
    //
    function symbol(sym) {
        return new Node(NodeType.Symbol, sym);
    }
    
    //
    // Helper function that generates a "Node" object given
    // a list object.
    //
    function list(lst) {
        return new Node(NodeType.List, lst);
    }
    
    //
    // Helper function that trims a string for leading/trailing white space.
    //
    String.prototype.trim = function() {
        return this.replace(/^\s+|\s+$/g, "");
    }
    
    //
    // The Lisp lexer that can tokenize a string of Lisp code.
    //
    function Lexer(code) {
        this.code = code;
        var current = 0;
        var delims = "( )";
    
        this.nextToken = function() {
            var token = null;
            //
            // skip all white-space only tokens
            //
            while (((token = internalNextToken.apply(this)) != null) && (token.trim().length == 0));
            return token;
        }
    
        function internalNextToken() {
            //
            // if we have reached end of code then return null
            //
            if (current >= this.code.length)
                return null;
    
            //
            // accumulate characters into token till one of
            // the delimiters are encountered
            //
            var token = "";
            for (; current < this.code.length; ++(current)) {
                var ch = this.code.charAt(current);
                if (isDelim(ch)) {
                    if (token.length == 0) {
                        token += ch;
                        ++current;
                    }
    
                    break;
                }
    
                token += ch;
            }
    
            return token;
        }
    
        function isDelim(ch) {
            return (delims.indexOf(ch) != -1);
        }
    }
    
    //
    // The Lisp parser class that generates a parse tree composed of
    // "Node" object given a lexer object.
    //
    function Parser(lexer) {
        this.lexer = lexer;
    
        this.parseList = function() {
            var token;
            var lst = [];
            while (((token = this.lexer.nextToken()) != null) && (token != ")")) {
                switch (token) {
                    case "(":
                        lst.push(this.parseList());
                        break;
                    default:
                        lst.push(symbol(token));
                        break;
                }
            }
    
            return list(lst);
        };
    
        this.parse = function() {
            //
            // the first token MUST be a "("
            //
            if (this.lexer.nextToken() != "(")
                return null;
            return this.parseList();
        };
    }
    
    //
    // Parse some lisp code.
    //
    var code = "(sum (gen-multiples (gen-series 1000) 3 5))";
    var parser = new Parser(new Lexer(code));
    var root = parser.parse();

    This code simply generates an in-memory tree representation of the Lisp expression without performing any validation to check for correctness. This is akin to writing a non-validating DOM parser for XML - it simply hands you an object graph that you can programmatically traverse and do stuff with. I wrote a little HTML page to generate parse-tree visualizations given Lisp expressions of arbitrary complexity using the Google Visualization API and the Organizational Chart visualization in particular. Given the following Lisp expression for instance:

    (sum (gen-multiples (gen-series 1000) 3 5))

    Here's the tree representation:

    Parse tree for Lisp expression (sum (gen-multiples (gen-series 1000) 3 5))

    And for the following slightly more complex Lisp expression:

    (sqrt (1+ (* (- (/ 28 2) 10) 2)))

    Here's the tree representation:

    Parse tree for Lisp expression (sqrt (1+ (* (- (/ 28 2) 10) 2)))

    You can try building Lisp expression trees of your own at the following location:

    Build your own Lisp expression trees
  • Lisp s-expressions (i.e. valid Lisp form s-expressions) are of 3 types:

    1. Function calls
    2. Macro invocations
    3. Special forms

    Function calls are, well, function calls! In keeping with the minimalistic syntax philosophy, almost everything in Lisp is a function call (except of course, macros and special forms). Take for instance, arithmetic operators. In most languages, the compiler is able to natively recognize operators such as [+ - / *] and assign special meanings depending on the context where they are used and for that reason must, among other things, differentiate between unary, binary and ternary operators. In Lisp however, these are all function calls!

    The general form of a function call is like so:

    (function-name [arg1 arg2 ... argN])

    As is perhaps obvious, a function call is expressed simply as a list where the first element is considered to be the function name and everything else its arguments. As always, the arguments can be Lisp s-expressions themselves. If you wanted to add 2 numbers for instance, here's what you'd do:

    (+ 1 2)

    Here, + is the name of the function being invoked and 1 and 2 are its arguments. The function + can accept a variable number of arguments (like the C printf function), which means that in order to add 4 numbers for instance, you can simply do this:

    (+ 1 2 3 4)

    You can build compound lists by nesting s-expressions like so:

    (+ 1 2 (+ 3 4))

    Lisp will always evaluate the function arguments first from left to right (evaluating nested s-expressions if any) and only then invoke the function with the result of evaluating the argument forms. In the example above therefore, it will evaluate 1 and 2 first - which evaluate to themselves, i.e. 1 and 2 - followed by (+ 3 4) - which evaluates to 7. The resulting 3 numbers (1, 2 and 7) are then passed to + which evaluates the composite expression to the value 10.
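
To see the evaluation rule in action, here's a toy evaluator in JavaScript. It is a sketch only: it uses its own compact tokenizer and parser rather than the Lexer and Parser classes shown earlier, it assumes every atom is a number, and the functions table covers just the handful of operators that appear in this post. All the names in it (tokenize, parse, evaluate, run) are made up for the example.

```javascript
// Split source text into "(", ")" and atom tokens.
function tokenize(code) {
    return code.replace(/\(/g, " ( ").replace(/\)/g, " ) ")
               .split(/\s+/)
               .filter(function (t) { return t.length > 0; });
}

// Lists become arrays, atoms stay as strings.
function parse(tokens) {
    var token = tokens.shift();
    if (token !== "(") {
        return token;
    }
    var lst = [];
    while (tokens.length > 0 && tokens[0] !== ")") {
        lst.push(parse(tokens));
    }
    tokens.shift();   // discard the closing ")"
    return lst;
}

// Just enough functions for the examples in this post.
var functions = {
    "+": function (a) { return a.reduce(function (x, y) { return x + y; }, 0); },
    "*": function (a) { return a.reduce(function (x, y) { return x * y; }, 1); },
    "-": function (a) {
        return a.length === 1 ? -a[0]
             : a.slice(1).reduce(function (x, y) { return x - y; }, a[0]);
    },
    "/": function (a) {
        return a.length === 1 ? 1 / a[0]
             : a.slice(1).reduce(function (x, y) { return x / y; }, a[0]);
    },
    "1+": function (a) { return a[0] + 1; },
    "sqrt": function (a) { return Math.sqrt(a[0]); }
};

function evaluate(form) {
    if (!Array.isArray(form)) {
        return Number(form);   // assumption: atoms are always numbers
    }
    if (!Object.prototype.hasOwnProperty.call(functions, form[0])) {
        throw new Error("Unknown function: " + form[0]);
    }
    // Evaluate the arguments left to right, then invoke the function.
    var args = form.slice(1).map(function (arg) { return evaluate(arg); });
    return functions[form[0]](args);
}

function run(code) {
    return evaluate(parse(tokenize(code)));
}

var sum = run("(+ 1 2 (+ 3 4))");                       // 10
var root = run("(sqrt (1+ (* (- (/ 28 2) 10) 2)))");    // 3
```

Feeding it (sum (gen-multiples (gen-series 1000) 3 5)) throws an "Unknown function" error, which is the toy version of a well-formed s-expression that isn't valid code.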

There's a lot more that one can say about functions. I'll cover some of it in a subsequent post and also talk a bit about macros (a rather unique and quite powerful feature of Lisp) and special forms.

 
Learning Common Lisp
Technobabble
7/18/2009 2:38:15 PM  

As is frequently the case with such things (at least with me) I have decided, for no compelling reason whatsoever, that I will learn to program in the Common Lisp programming language. It might perhaps have something to do with the fact that I stumbled upon the following impassioned pleas for that language and the functional style of programming, all of which are in my opinion well worth your while should you choose to spend the next hour or so of your life reading them, even if you have no plans of learning Lisp:

Functional Programming For The Rest of Us
- an introduction to functional programming designed to appeal to us imperative grunt programmers.

The Nature of Lisp
- an admirable attempt at showing the Lisp noob what it's all really about.

Beating The Averages
- if you thought Lisp was only for "intellectual" academicians, then you've got to hear from this guy who made, like 50 million bucks by selling a piece of software to Yahoo - and guess what he wrote it in?

If Lisp is So Great
- tries to answer this question: if Lisp is so great, why don't more people use it?

This being my blog and everything, I plan to use it to post what I hope will be a series of entries chronicling my experiments with Common Lisp. There is frequent reference in popular Lisp literature to this so-called moment of enlightenment that you apparently experience at some point as you work your way through the language. If this happens to me, well, you'll know about it!

Here's what I am using to learn the thing:

  • Practical Common Lisp by Peter Seibel - a great free book that teaches you ANSI Common Lisp
  • CLISP - a free open source implementation of the ANSI Common Lisp language (running on Cygwin on Windows boxes - just remember to select the "clisp" package when you install Cygwin)

Here's what this dude called Eric Raymond had to say about Lisp:

Lisp is worth learning for the profound enlightenment experience you will have when you finally get it; that experience will make you a better programmer for the rest of your days...

I intend to personally find out if there's any truth to this or if Eric was just high when he wrote it (OK, so I do have a reason for learning Common Lisp - I never said I'll never contradict myself you know).

 
JavaScript closures act like implicit function state?
Technobabble
7/12/2009 3:36:27 PM  

Consider this JavaScript code:

function acc( n ) {
    return function( i ) {
        return n += i;
    };
}

var fn = acc( 5 );
var n1 = fn( 1 );
var n2 = fn( 2 );

The question, of course, is what the values of n1 and n2 will be. It is perhaps evident that n1 must now equal 6. But what about n2? Will it now contain 7 (i.e. 2 + 5) or will successive calls to fn result in the parameters being accumulated by addition with the value that was passed to acc (5)? We find by experimentation that the latter turns out to be true. n2 now equals 8, i.e., 5 + 1 + 2!

The conclusion to draw here therefore is that JavaScript function closures are essentially implicit function state. In the code snippet given above, if you treat fn as a regular object (which in fact it is), then the variable n, which is part of the closure captured by the function object when it was returned from acc, now acts like member state of fn. This is why multiple calls to fn cause the mutation to n to persist across those calls.

This is further corroborated by the fact that a subsequent call to acc to create another function object results in that instance getting a separate copy of the closure containing n. Here's an example:

function acc( n ) {
    return function( i ) {
        return n += i;
    };
}

var fn = acc( 5 );
var n1 = fn( 1 );
var n2 = fn( 2 );

var fn2 = acc( 10 );
var n3 = fn2( 1 );
var n4 = fn2( 2 );

Here, n3 and n4 hold 11 and 13 respectively. The function fn has no effect whatsoever on fn2 (or vice versa). This in fact, is the basis for creating the C++ equivalent of private member data in JavaScript. Imagine that you wish to create the equivalent of the .NET StringBuilder class and want to make the buffer where the string is actually stored a private member of the class. If you did this:

function StringBuilder() {
    this.buffer = [];
}

Then buffer is a public member and can be accessed via an instance. To make it private, simply declare buffer as a local variable inside StringBuilder. Like this:

function StringBuilder() {
    var buffer = [];
    
    this.getBuffer = function() {
        //
        // TODO:
        // return a copy so the original buffer is
        // left intact
        //
        return buffer;
    }
}

Now buffer is not visible to routines defined outside StringBuilder via an instance. But all methods defined inside StringBuilder can access buffer like any other member. You'd have to add accessor methods if you wished to provide access to private data. The same principle applies to member methods as well. Any local functions that you defined inside StringBuilder remain accessible only from other functions defined inside that class.
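
Putting the pieces together, here's one way a fuller StringBuilder could look. It is only a sketch: the append and toString methods are my own additions for illustration, and getBuffer now returns a copy, as the TODO above suggests.

```javascript
function StringBuilder() {
    var buffer = [];   // private: reachable only from the closures below

    this.append = function (text) {
        buffer.push(String(text));
        return this;            // allow chained calls
    };

    this.toString = function () {
        return buffer.join("");
    };

    this.getBuffer = function () {
        return buffer.slice();  // a copy, so callers cannot alter the original
    };
}

var sb = new StringBuilder();
sb.append("Hello").append(", ").append("world");
var text = sb.toString();       // "Hello, world"
```

Since sb.buffer is undefined, code outside the constructor has no way of reaching the array except through these three methods.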

Cool eh?!

Link Comment
 
Calling a JavaScript function with variable arguments
Technobabble
7/4/2009 3:15:50 PM  

I am working on this little Windows Scripting Host script using JavaScript where I basically need to load up a Word document and do a bunch of text transformation tasks on each line and dump the output to the console (which I plan to redirect to a file). I decided to employ the builder pattern a bit and set something up like this first:

//
// transform events collection
//
var transformTable = {
    parseFileBegin : [],
    parseLineBegin : [],
    parseLineEnd : [],
    parseFileEnd : []
};

The idea is to populate the arrays in transformTable (parseFileBegin, parseLineBegin and so on) with a set of function object references that would get called in sequence at the appropriate time. To make calling these callbacks easier I decided to come up with a fireEvent routine which I could then use to fire a particular set of callback functions. I also wanted to be able to call fireEvent passing as many arguments as are needed for that particular callback. When invoking parseFileBegin for instance, I wanted to pass the name of the file as a parameter to the callback routines and when calling parseLineBegin I wanted to pass in a tokenized form of each line along with the line string itself. Here are a couple of examples of how I wanted to call fireEvent.

fireEvent(transformTable.parseFileBegin, fileName);
fireEvent(transformTable.parseLineBegin, line, tokens);

And here's what I came up with for fireEvent:

function forEach( arr, cb ) {
    for( var i = 0 ; i < arr.length ; ++i )
        if( cb( arr[i] ) == false )
            return false;

    return true;
}

function fireEvent(eventHandlers) {
    //
    // everything after the first argument must be
    // considered as parameters to be passed to the
    // event handler routines
    //
    var args = [];
    var i = 0;
    forEach(arguments, function( arg ) {
        if( i++ == 0 )
            return;
        args.push( arg );
    });

    //
    // iterate through the handlers collection and call one by one
    //
    forEach(eventHandlers, function(handler) {
        // TODO: call the handler
    });
}

I needed to somehow call the function referenced by handler and pass all the values in the args array as parameters to it. One way might have been to dynamically build a string of JavaScript code that calls handler and then have it executed by calling eval on the string. But I preferred a more direct method if one were available. As it turned out, one was in fact available in the form of the apply method on Function objects. Consider this code:

var foo = function(s1, s2) {
    alert( s1 + " - " + s2 );
}

There are a couple of ways you can invoke foo. You can call it as you normally would with functions or, alternatively, you can call the member method apply that all function objects possess. Here's an example:

var foo = function(s1, s2) {
    alert( s1 + " - " + s2 );
}

foo("ding", 20);                 // call like normal function
foo.apply( null, ["ding", 20] ); // call via "apply" method

The apply method takes 2 parameters. The first one indicates the object on which the function must be invoked, which means that the function will be invoked as though it were a member function of that object. The effect of this is that the variable "this" within that function will refer to the object you pass as the first argument. If you pass null then it will execute like a global function. Here's an example:

var person = {
    name : "binga",
    age : 20
};

var print = function() {
    alert( this.name + " - " + this.age );
}

print.apply( person );

From within the function print here, the reference to "this" turns out to refer to the first parameter that you pass to apply. So what happens if you called print like this?

print.apply( null );

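As things tend to be in such cases, print simply runs like a global function: in ordinary (non-strict) code "this" refers to the global object, so this.name and this.age will typically turn out to be "undefined" from inside print. In strict mode code "this" would be null instead and the property access would throw a TypeError.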
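If you want to check what "this" actually is when null is passed to apply, here's a small sketch. It builds the two test functions with the Function constructor so that one is guaranteed to be non-strict and the other strict regardless of where the snippet is pasted; the names getThisSloppy and getThisStrict are just made up for the illustration.

```javascript
// A function body created via the Function constructor is non-strict
// unless it starts with a "use strict" directive.
var getThisSloppy = new Function("return this;");
var getThisStrict = new Function("'use strict'; return this;");

// Called plainly, a non-strict function sees the global object as "this".
var globalObject = getThisSloppy();

// Non-strict: a null "this" is replaced by the global object.
var sloppyThis = getThisSloppy.apply(null);

// Strict: the value is passed through untouched.
var strictThis = getThisStrict.apply(null);

var sloppyIsGlobal = (sloppyThis === globalObject); // true
var strictIsNull = (strictThis === null);           // true
```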

The second parameter to apply is of course, the array of parameters that are to be passed to the function. So with this new information, fireEvent looks like this:

function fireEvent(eventHandlers) {
    //
    // everything after the first argument must be
    // considered as parameters to be passed to the
    // event handler routines
    //
    var args = [];
    var i = 0;
    forEach(arguments, function( arg ) {
        if( i++ == 0 )
            return;
        args.push( arg );
    });

    //
    // iterate through the handlers collection and call one by one
    //
    forEach(eventHandlers, function(handler) {
        handler.apply(null, args);
    });
}

Simple enough, when you know how to do it eh?!
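As an aside, the argument-copying loop can be shortened by borrowing the slice method from Array, which also works on the array-like arguments object. What follows is only a sketch of that variation, not the code I actually used in the script, and the two handlers are made up for the demonstration.

```javascript
function fireEvent(eventHandlers) {
    // copy everything after the first argument into a real array
    var args = Array.prototype.slice.call(arguments, 1);

    // call each handler in turn with the copied arguments
    for (var i = 0; i < eventHandlers.length; ++i) {
        eventHandlers[i].apply(null, args);
    }
}

// hypothetical handlers that record what they were called with
var seen = [];
var handlers = [
    function (line, tokens) { seen.push("A:" + line + ":" + tokens.length); },
    function (line, tokens) { seen.push("B:" + tokens.join("+")); }
];

fireEvent(handlers, "a b", ["a", "b"]);
// seen now holds "A:a b:2" followed by "B:a+b"
```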

Link Comment
 
Centering elements on a canvas in WPF
Technobabble
3/26/2009 11:42:04 AM  

While I was fiddling around with WPF in general I noticed that when I dropped inane little circles and boxes onto a canvas I'd never be able to center it inside the containing canvas just right. If you try the following XAML in the excellent Kaxaml tool for instance:

<Window xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Width="300"
        Height="300"
        Title="Centered?">
  <Grid>  
    <Canvas Background="Black" Width="300" Height="300">
      <Ellipse Width="50"
               Height="50"
               Canvas.Left="125"
               Canvas.Top="125"
               Fill="LightGray"
               x:Name="ellipse" />
    </Canvas>
  </Grid>
</Window>

Here's what you see:

Off center

Clearly the ellipse is not positioned in the center of the containing canvas even though the point (125,125) should in fact have done so. If however you set the value None for the WindowStyle attribute of the Window tag like so (note the WindowStyle attribute at the end of the Window tag):

<Window xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Width="300"
        Height="300"
        Title="Centered?"
        WindowStyle="None">
  <Grid>  
    <Canvas Background="Black" Width="300" Height="300">
      <Ellipse Width="50"
               Height="50"
               Canvas.Left="125"
               Canvas.Top="125"
               Fill="LightGray"
               x:Name="ellipse" />
    </Canvas>
  </Grid>
</Window>

Then here's what you get (you'll have to hit Alt+F4 to close this window):

Centered but no title bar

As you can probably tell, the circle seems nicely centered now. The problem therefore is the title bar. In the first XAML code snippet above I had given a dimension of 300x300 for the canvas. Accounting for the height of the title bar, this causes the canvas to actually extend beyond the window border which of course, gets clipped by the OS. In order to trim the canvas to size I decided to put it inside a dock panel and remove the explicit width/height specification. Here's what I came up with:

<Window xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Width="300"
        Height="300"
        Title="Centered?">
  <DockPanel>  
    <Canvas Background="Black">
      <Ellipse Width="50"
               Height="50"
               Canvas.Left="125"
               Canvas.Top="125"
               Fill="LightGray"
               x:Name="ellipse" />
    </Canvas>
  </DockPanel>
</Window>

This produced a window that looked like this:

Again off center

Again, not quite in the center of the client area. The issue here is of course that now, the canvas size is not 300x300 which means the point (125,125) should not be the top left co-ordinate of the ellipse if we want it centered. Here's a little XAML that shows you the real dimensions of the canvas.

<Window xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Width="300"
        Height="300"
        Title="Centered?">
  <DockPanel>  
    <Canvas Background="Black" x:Name="canvas">
      <Ellipse Width="50"
               Height="50"
               Canvas.Left="125"
               Canvas.Top="125"
               Fill="LightGray"
               x:Name="ellipse" />
      <TextBlock Text="{Binding ElementName=canvas, Path=ActualWidth}"
                 Canvas.Left="10"
                 Canvas.Top="10"
                 Foreground="White"/>
      <TextBlock Text=", "
                 Canvas.Left="30"
                 Canvas.Top="10"
                 Foreground="White"/>
      <TextBlock Text="{Binding ElementName=canvas, Path=ActualHeight}"
                 Canvas.Left="38"
                 Canvas.Top="10"
                 Foreground="White"/>
    </Canvas>
  </DockPanel>
</Window>

Here's what the window that this produces looks like:

Actual dimensions of the canvas

The correct top-left co-ordinate to center the ellipse therefore is (121,108). The straightforward solution seems to be to just handle it in code and be done with it. For example, the following code in the window class's constructor manages to do the job:

this.Loaded += (sender, e) =>
{
    Canvas.SetLeft(ellipse, (canvas.ActualWidth - ellipse.ActualWidth) / 2);
    Canvas.SetTop(ellipse, (canvas.ActualHeight - ellipse.ActualHeight) / 2);
};
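The arithmetic behind those two lines is simple enough to check by hand. Here's a quick sketch of it in JavaScript (picked only because it's handy for a calculation like this). The 292x266 canvas size is my assumption: it's the size that the (121,108) figure above implies for a 50x50 ellipse.

```javascript
// offset that centers an element of the given size inside a container
function centeredOffset(containerSize, elementSize) {
    return (containerSize - elementSize) / 2;
}

// the naive assumption: the canvas really is 300x300
var naiveOffset = centeredOffset(300, 50); // 125

// assumed actual canvas size of 292x266
var leftOffset = centeredOffset(292, 50);  // 121
var topOffset = centeredOffset(266, 50);   // 108
```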

But this isn't quite the WPF way of doing things and besides, if you resized the window the ellipse would again be off center and you'd need to write more code to handle the window resize events and re-layout the ellipse in response. Ideally, this sort of thing should be handled using the WPF binding system. At first I figured this should be really easy to do with XAML like this:

<Window xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Width="300"
        Height="300"
        Title="Centered?">
  <DockPanel>  
    <Canvas Background="Black" x:Name="canvas">
      <Ellipse Width="50"
               Height="50"
               Canvas.Left="{Binding ElementName=canvas, Path=((ActualWidth - 50)/2)}"
               Canvas.Top="{Binding ElementName=canvas, Path=((ActualHeight - 50)/2)}"
               Fill="LightGray"
               x:Name="ellipse" />
    </Canvas>
  </DockPanel>
</Window>

Here's what this produced:

Passing expressions to binding path does not work

As it turns out one cannot use expressions for the value of the Path attribute of the Binding markup extension! Some folks over at blendables.com have however solved this problem by developing a custom WPF markup extension that allows the specification of expressions. You can get it here. But this seemed a bit extreme given the circumstances. The alternative as it happens is to use multi-binding with a custom value converter. First we write a class that implements IMultiValueConverter like so:

using System;
using System.Globalization;
using System.Windows.Data;

public class HalfValueConverter : IMultiValueConverter
{
    #region IMultiValueConverter Members

    public object Convert(object[] values,
                          Type targetType,
                          object parameter,
                          CultureInfo culture)
    {
        if (values == null || values.Length < 2)
        {
            throw new ArgumentException(
                "HalfValueConverter expects 2 double values to be passed" +
                " in this order -> totalWidth, width",
                "values");
        }

        double totalWidth = (double)values[0];
        double width = (double)values[1];
        return (object)((totalWidth - width) / 2);
    }

    public object[] ConvertBack(object value,
                                Type[] targetTypes,
                                object parameter,
                                CultureInfo culture)
    {
        throw new NotImplementedException();
    }

    #endregion
}

And then we use the following XAML (you cannot do this in Kaxaml because it does not support writing code-behind) to get the job done!

<Window x:Class="CenterWin.Window1"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Centered?"
        Height="300"
        Width="300"
        Background="Black"
        xmlns:local="clr-namespace:CenterWin">
    <DockPanel>
        <DockPanel.Resources>
            <local:HalfValueConverter x:Key="HalfValue" />
        </DockPanel.Resources>
        <Canvas x:Name="canvas">
            <Ellipse Width="50"
                     Height="50"
                     Fill="LightGray"
                     x:Name="ellipse">
                <Canvas.Left>
                    <MultiBinding Converter="{StaticResource HalfValue}">
                        <Binding ElementName="canvas" Path="ActualWidth" />
                        <Binding ElementName="ellipse" Path="ActualWidth" />
                    </MultiBinding>
                </Canvas.Left>
                <Canvas.Top>
                    <MultiBinding Converter="{StaticResource HalfValue}">
                        <Binding ElementName="canvas" Path="ActualHeight" />
                        <Binding ElementName="ellipse" Path="ActualHeight" />
                    </MultiBinding>
                </Canvas.Top>
            </Ellipse>
        </Canvas>
    </DockPanel>
</Window>

That's it! Now you can resize to your heart's content and the WPF binding system will take care of all the updates. Here's a screenshot of the window after resizing it a bit.

Nicely centered!
Link Comment (4)
 
Custom windows with WPF
Technobabble
3/10/2009 8:47:07 PM  

I am working on a little WPF app to build a UI for the Freebase web database and decided that the entry point to the app will be a simple textbox where the user can type in the search text. Here's what it looks like:

Screenshot of the search textbox.

While this was the look that I had envisioned, I did not really expect to be able to achieve it as easily as I did with WPF! I started off with a simple window and set its size to what I wanted.

<Window x:Class="TextboxApp.Window1"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Freebase Search"
        Height="65"
        WindowStyle="None"
        Width="708"
        MinHeight="65"
        MinWidth="400"
        WindowStartupLocation="CenterScreen"
        x:Name="MainWindow">
...

I then plonked a DockPanel into the window with a TextBox in it.

<DockPanel Background="Transparent">
    <TextBox x:Name="txtSearch"
             FontSize="40"
             TextAlignment="Center"
             VerticalAlignment="Center"
             Foreground="Wheat">
    </TextBox>
</DockPanel>

I wanted a swanky gradient background and also get the nice rounded corners for the window. After a few attempts with less than stellar results I settled upon the following structure.

<DockPanel Background="Transparent">
    <Border CornerRadius="9,9,9,9">
        <Border.Background>
            <LinearGradientBrush StartPoint="0,0" EndPoint="2,1" x:Name="WindowBackground">
                <GradientStop Color="#CC000000"  Offset="0.0" />
                <GradientStop Color="#CCFFFFFF" Offset="1.0" />
            </LinearGradientBrush>
        </Border.Background>
        <TextBox x:Name="txtSearch"
                 FontSize="40"
                 TextAlignment="Center"
                 VerticalAlignment="Center"
                 KeyDown="txtSearch_KeyDown"
                 Background="Transparent"
                 Foreground="Wheat"
                 BorderBrush="Transparent"
                 BorderThickness="0"
                 Visibility="Hidden">
        </TextBox>
    </Border>
</DockPanel>

I now had a textbox with nice rounded corners but the window itself was a standard rectangle. A little googling revealed the pixie dust you've got to sprinkle to get it going. First you enable transparency on the window itself and make its background transparent like so (note the AllowsTransparency and Background attributes below):

<Window x:Class="TextboxApp.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        Title="Freebase Search"
        Height="65"
        Width="708"
        WindowStyle="None"
        MinHeight="65"
        MinWidth="400"
        WindowStartupLocation="CenterScreen"
        x:Name="TheMainWindow"
        AllowsTransparency="True"
        Background="Transparent"
        ShowInTaskbar="False">

The basic idea is to make all backgrounds and borders transparent except for the Border tag in the DockPanel so that only the rounded corners in the border element are visible. Here's the modified code (again, note the transparent Background and BorderBrush on the TextBox):

<DockPanel Background="Transparent">
    <Border CornerRadius="9,9,9,9">
        <Border.Background>
            <LinearGradientBrush StartPoint="0,0" EndPoint="2,1" x:Name="WindowBackground">
                <GradientStop Color="#CC000000"  Offset="0.0" />
                <GradientStop Color="#CCFFFFFF" Offset="1.0" />
            </LinearGradientBrush>
        </Border.Background>
        <TextBox x:Name="txtSearch"
                 FontSize="40"
                 TextAlignment="Center"
                 VerticalAlignment="Center"
                 KeyDown="txtSearch_KeyDown"
                 Background="Transparent"
                 Foreground="Wheat"
                 BorderBrush="Transparent"
                 BorderThickness="0">
        </TextBox>
    </Border>
</DockPanel>

That's it! Now I had the nice custom round cornered window! I added some pizzazz with some fade-in/fade-out animation and I was in business!

Screenshot of the search textbox.

If you'd like to take a look at the code, here's the link you'll need.

Link Comment
 
What's a good error message?
Technobabble
8/21/2008 3:27:46 AM  

In general, when writing code to handle the case where something goes wrong, the programmer's natural instinct seems to be to just get it over with as soon as possible and move on to other more exciting things. I speak, of course, from first hand experience. Now, there is a class of errors for which even the programmer is at a loss to provide guidance on an appropriate recourse, but really, if one is honest with oneself, they are in truth few and far between. If you're willing to put in the effort required, more often than not, an error message that allows the end user to actually do something about the problem is only a few minutes and a few key strokes away.

I was working with SQL Server Management Studio Express edition today when I came across this error:

SQL Server Management Studio Express Edition - Error message.  Database diagram support objects cannot be installed because this database does not have a valid owner.  To continue, first use the Files page of the Database Properties dialog box or the ALTER AUTHORIZATION statement to set the database owner to a valid login, then add the database diagram support objects.

I present this as an example of a good error message. I knew exactly what I had to do to remedy the situation once I actually paid attention to what the message said (I admit, however, to having pig-headedly re-tried what I was trying to do a few times, unabashedly dismissing this message box every time, before admitting defeat and reading the message!). I did what this message box told me to do and the thing just worked! This might not sound like a big deal, but it makes me that little bit happier and as Joel puts it, these tiny victories tend to add up and contribute to one's feeling positively disposed towards the product at a subconscious level!

Link Comment
 
How to create a simple Workflow Foundation (WF) activity
Technobabble
8/14/2008 9:21:41 AM  

Quite suddenly, with no prior warning whatsoever, I resolved with firm determination that I would without further delay inflict upon an unsuspecting world my first screen recorded, poorly narrated technical tutorial. After many failed attempts with many miserable little screen capture programs, I finally managed to put something together using a trial edition of Camtasia which in my opinion is a stunningly useful piece of software if you are given to this sort of impulsive screen recording fit. I just wish it didn't cost quite as much as it does.

The tutorial is a short 20 minute video that shows you how you can create an extremely simple, fairly useless workflow activity using the Windows Workflow Foundation (WF). It shows you how you can create a workflow that uses the activity and then how you can host the runtime and execute the workflow.

The Camtasia produced Flash file has been hosted on a site called Hot Link Files who in their boundless magnanimity allow basically everybody to host whatever they want on their servers and happily provide URLs to those files. Go Hot Link Files! They do have a clause however that they'll delete this file after 30 days of inactivity (note to self: figure out another cheap stingy way of hosting files and not spend American dollars).

[Update (22-Aug): I have since then changed my hosting provider and now have a gigabyte of disk space which is considerably more than the 20 MB that I used to scrounge with earlier and have therefore moved the SWF file for this movie on to my web server itself. It did cost American dollars though (dang!).]

Without further mindless blathering then, here's the tutorial. Oh, one more thing - unless you take great delight in squinting at the screen trying to make out extremely small text you might want to click the full-screen button on the video player below; you should find a small button that looks like a cross-hair on the bottom right hand corner of the player once you start playback and clicking it will hopefully launch the player in full-screen mode.

Link Comment (2)
 
Gmail - Can't take security for granted!
Technobabble
3/10/2008 3:34:15 PM  

A couple of disturbing stories about how your email account can be compromised!

Link Comment
 
Silverlight Tic-Tac-Toe
Technobabble
1/17/2008 4:09:33 PM  

I got it into my head one fine day that I will try to find out what this Silverlight thingy is all about. I read the quickstart tutorials a bit, read some arbit blogs and finally decided to write up a tic-tac-toe implementation using the alpha version of Silverlight 1.1 (or is it 2.0 now?). If you have Silverlight 1.1 installed on your Windows box and happen to use Internet Explorer or Firefox then you can click here to take a look at the game. The game AI isn't exactly HAL 9000 but does an admirable job of appearing to be intelligent by making inexplicable random moves. If you're a geeky kind of person and want to look at the source then the links are available below.

Link Comment
 
Switching threads - my second article for Code Project!
Technobabble
9/16/2007 11:29:19 PM  

After much grievous toil my second Code Project article has finally been posted (phew)! It talks about a thread switching technique that lets you switch the thread on which a routine is running by messing around with stack pointers, CPU registers and window messages. You'll find the article here:

http://www.codeproject.com/useritems/threadswitch.asp

I have written another article for Code Project by the way, an article that had its origin here as a blog post. It talks about writing self-deleting executables. You'll find it here:

http://www.codeproject.com/useritems/selfdel.asp

Feel free to leave comments about the articles and to vote for them on Code Project!

Link Comment
 
Concurrency with an STA?
Technobabble
9/8/2007 3:20:31 PM  

I was recently experimenting with the Windows Running Object Table (ROT) when I ran into a peculiar problem. Here's the scenario: I had a simple in-process COM component configured to run in a single threaded apartment (STA). The STA configuration, if you didn't know, is an indication to the COM runtime that the component does not know anything about thread synchronization and will miserably stomp over itself and do other unpleasant things if unbridled concurrent access to an instance of it is made available from multiple threads. Wanting to test whether the COM runtime's call serialization guarantee would continue to hold true even when we expose an in-process component to other processes via the ROT, I put together a quick sample containing the following:

  • A simple in-process STA COM component named Dong that contained a single method, again, called Dong that simply popped up a message box to announce to the world the fact that it had indeed been found worthy of being invoked.

  • A console application called AddToROT.exe that created an instance of Dong and added it to the ROT. After that it hung around running a Windows message loop set to terminate upon a key press. The message loop is needed because the COM runtime's call serialization implementation depends on it. We'll learn why exactly in a moment. Here's what the loop looks like:

    while( !_kbhit() )
    {
    	MSG msg;
    	if( PeekMessage( &msg, NULL, 0, 0, PM_REMOVE ) )
    	{
    		TranslateMessage( &msg );
    		DispatchMessage( &msg );
    	}
    	else
    		Sleep( 500 );
    }

    As you can see, a fairly straightforward message loop that keeps spinning till you hit a key on the keyboard.

  • The last piece in the sample was a console application called GetFromROT.exe that fetched a reference to Dong from the ROT and invoked its sole method.

My plan was to first run AddToROT and then launch multiple instances of GetFromROT. I expected that the message box from Dong.Dong() would get displayed one after the other even though the client processes were running more or less concurrently. I expected this because that's what the COM runtime guarantees for components marked as an STA. How exactly does it provide this guarantee? It's quite simple actually.

Whenever a method call is made on a component from the thread on which it was created, it behaves like a regular method call, i.e. your call results in a simple transfer of control to the called method in the component. When you wish to call the method from another thread however, the COM powers that be have mandated that you first marshal the interface pointer across the thread boundary before invoking methods on it. You do this via CoMarshalInterThreadInterfaceInStream and CoGetInterfaceAndReleaseStream, i.e. you call the former from the thread on which the component was created to be handed an IStream pointer which you somehow pass to the second thread from where you call the latter to be handed a component pointer which in turn you can use to call its methods. If you do all this the COM runtime guarantees that the calls will get serialized and peace shall reign everywhere.

Now if all that sounds a bit confusing here's a small code snippet that'll hopefully clear the air for you (the code below is meant to just illustrate the concept which means that there are no error checks; and what's more, it won't even compile!).

Thread 1

//
// let's assume this is a global variable
//
IStream *g_pStream;

//
// create an instance of dong
//
IDong *pDong;
CoCreateInstance( ..., &pDong );

//
// marshal the interface pointer into a stream
//
CoMarshalInterThreadInterfaceInStream(
   __uuidof( IDong ),
   pDong,
   &g_pStream );

//
// simple straightforward method call
//
pDong->Dong();

Thread 2

IDong *pDong;

//
// un-marshal the interface pointer
//
CoGetInterfaceAndReleaseStream(
   g_pStream,
   __uuidof( IDong ),
   &pDong );

//
// "pDong" is in truth a proxy object that marshalls the call
// across thread/process boundaries; the COM runtime ensures that
// the component gets only one call at a time
//
pDong->Dong();

Now that you know how marshalling interface pointers across threads is accomplished, let's go back to our question of how the COM runtime provides the call serialization guarantee and what it has to do with running message loops. As it turns out whenever you create an STA COM component, the COM runtime secretly goes and creates a hidden window. When you marshal the interface pointer across to another thread (or another process for that matter) what you are actually handed is a proxy object. When you invoke a method on the proxy all that it does is to serialize the method parameters and post (or rather, send) a regular window message to the hidden window. The window procedure that handles the message unpacks the parameters and calls the method on the actual component. Simple! No matter how many concurrent clients exist for the component, as long as all the method calls are routed through the hidden window, call serialization is automatically guaranteed!

As must be evident, in order for windows (hidden or otherwise) to receive messages there must be a message loop that's retrieving and dispatching the messages. This is the reason why COM's call serialization guarantee works only so long as the thread on which the component was created has a message loop going. So far so good!
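To see why funnelling every call through one queue serializes them, here's a toy model in JavaScript. It is only a sketch of the idea: a plain array stands in for the hidden window's message queue, and postCall, pumpMessages and dong are made-up names rather than anything from the COM runtime.

```javascript
// Toy model only: an array plays the part of the hidden window's queue.
var queue = [];
var log = [];

// What a proxy does in this model: package the call and queue it.
function postCall(caller) {
    queue.push(caller);
}

// The component's method; it is only ever entered from pumpMessages.
function dong(caller) {
    log.push("begin " + caller);
    log.push("end " + caller);
}

// The message loop: take one message at a time and dispatch it.
function pumpMessages() {
    while (queue.length > 0) {
        dong(queue.shift());
    }
}

postCall("client 1");
postCall("client 2");
pumpMessages();
// log: begin client 1, end client 1, begin client 2, end client 2
```

Since there's just the one loop pulling messages off the queue, a call can begin only after the previous one has returned. And if nobody runs pumpMessages, nothing gets called at all.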

In our sample setup therefore you couldn't have blamed me too much for expecting that when I run 2 instances of GetFromROT one after the other without dismissing the message box shown as a result of the first instance the 2nd instance would essentially block on the method call till I dismissed the first message box. After it had been dismissed however I would see the message box appearing a second time, courtesy the 2nd instance of GetFromROT. Here's a screenshot of what I actually saw!

As you can see, the second instance of GetFromROT was also somehow given access to Dong while the first invocation still hadn't returned! What's even stranger is that both the calls seem to have occurred on the same thread!! This is evident from the fact that both the message boxes show the same thread ID as returned by the GetCurrentThreadId API.

For a couple of days there I walked about with tousled hair, unshaved chin and rumpled shirt with a murderous look in my eye. What has the world come to if one can't trust the COM runtime to do what it had promised to do?! This sorry state of affairs ended finally one day as I was performing my morning ablutions (and I could hear mother nature letting out a sigh of relief) when it dawned upon me with startling clarity that the Windows message box spawns a little message loop of its own!

That was indeed the problem here! As it turns out whenever you pop up a message box (or any modal dialog box for that matter) a local message loop is executed from that dialog. This is done because of the modal nature of the dialog. The thread that has the message pump running is now blocked on the call to the modal dialog which means that the message pump isn't doing a whole lot while the dialog is active. It would also mean that the dialog itself would remain unresponsive since there's nobody picking the messages from the queue and having them processed. To counter all this, modal dialog boxes always run their own message loop till the dialog is dismissed.
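The effect of a message loop running inside a method call can be imitated with a toy model in JavaScript. Again, this is only a sketch: an array stands in for the message queue and showModal is a made-up stand-in for the message box, which in this model does nothing except run its own loop.

```javascript
// Toy model only: an array plays the part of the thread's message queue.
var queue = [];
var log = [];
var depth = 0;

// The message loop: dispatch queued calls one at a time.
function pumpMessages() {
    while (queue.length > 0) {
        dong(queue.shift());
    }
}

// Stand-in for the modal message box: it runs a loop of its own.
function showModal() {
    pumpMessages();
}

// The component's method, which pops up the "message box".
function dong(caller) {
    depth++;
    log.push("begin " + caller + " (depth " + depth + ")");
    showModal();
    log.push("end " + caller);
    depth--;
}

queue.push("client 1");
queue.push("client 2");
pumpMessages();
// log: begin client 1 (depth 1), begin client 2 (depth 2),
//      end client 2, end client 1
```

The second call begins while the first is still on the stack, all on the one thread, and the first call cannot finish until the second one has.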

So, in our case the 2nd call to Dong.Dong was facilitated not by the message loop running in AddToROT but from the one running in the message box that had been invoked from the previous call to Dong.Dong. We can easily verify this by taking a look at the call-stack of the primary thread in AddToROT while Dong.Dong is running.

Here's the call-stack while the message box is being shown as a result of running the first instance of GetFromROT. The stack was captured using the excellent Process Explorer tool written by Mark Russinovich of Sysinternals fame. I have snipped some of the function calls from the stack so that we can focus on the relevant stuff.

snip.. snip..

USER32.dll!NtUserWaitMessage+0xc
USER32.dll!InternalDialogBox+0xd0
USER32.dll!SoftModalMessageBox+0x938
USER32.dll!MessageBoxWorker+0x2ba
USER32.dll!MessageBoxTimeoutW+0x7a
USER32.dll!MessageBoxExW+0x1b
USER32.dll!MessageBoxW+0x45
SampleCOM.dll!CDong::Dong+0x7e          <-- this is our function
RPCRT4.dll!Invoke+0x30
RPCRT4.dll!NdrStubCall2+0x297

snip.. snip..

ole32.dll!StubInvoke+0xa7
ole32.dll!CCtxComChnl::ContextInvoke+0xe3
ole32.dll!MTAInvoke+0x1a
ole32.dll!STAInvoke+0x4a

snip.. snip..

USER32.dll!DispatchMessageWorker+0x306
USER32.dll!DispatchMessageW+0xf         <-- and this is the
                                            DispatchMessage call
AddToROT.exe!wmain+0x135
AddToROT.exe!__tmainCRTStartup+0x1a6
AddToROT.exe!wmainCRTStartup+0xd
kernel32.dll!BaseProcessStart+0x23

Now take a look at what the stack looks like after the second instance of GetFromROT is launched without dismissing the first message box.

snip.. snip..

USER32.dll!NtUserWaitMessage+0xc
USER32.dll!InternalDialogBox+0xd0
USER32.dll!SoftModalMessageBox+0x938
USER32.dll!MessageBoxWorker+0x2ba
USER32.dll!MessageBoxTimeoutW+0x7a
USER32.dll!MessageBoxExW+0x1b
USER32.dll!MessageBoxW+0x45
SampleCOM.dll!CDong::Dong+0x7e          <-- second invocation
                                            of Dong.Dong
RPCRT4.dll!Invoke+0x30
RPCRT4.dll!NdrStubCall2+0x297

snip.. snip..

ole32.dll!StubInvoke+0xa7
ole32.dll!CCtxComChnl::ContextInvoke+0xe3
ole32.dll!MTAInvoke+0x1a
ole32.dll!STAInvoke+0x4a

snip.. snip..

USER32.dll!DispatchMessageWorker+0x306
USER32.dll!DispatchMessageW+0xf         <-- dispatch message from
                                            the loop in MessageBox
USER32.dll!DialogBox2+0x15a
USER32.dll!InternalDialogBox+0xd0
USER32.dll!SoftModalMessageBox+0x938
USER32.dll!MessageBoxWorker+0x2ba
USER32.dll!MessageBoxTimeoutW+0x7a
USER32.dll!MessageBoxExW+0x1b
USER32.dll!MessageBoxW+0x45
SampleCOM.dll!CDong::Dong+0x7e          <-- first invocation of
                                            Dong.Dong
RPCRT4.dll!Invoke+0x30
RPCRT4.dll!NdrStubCall2+0x297

snip.. snip..

ole32.dll!StubInvoke+0xa7
ole32.dll!CCtxComChnl::ContextInvoke+0xe3
ole32.dll!MTAInvoke+0x1a
ole32.dll!STAInvoke+0x4a

snip.. snip..

USER32.dll!DispatchMessageWorker+0x306
USER32.dll!DispatchMessageW+0xf         <-- original dispatch msg
                                            for first invocation
AddToROT.exe!wmain+0x135
AddToROT.exe!__tmainCRTStartup+0x1a6
AddToROT.exe!wmainCRTStartup+0xd
kernel32.dll!BaseProcessStart+0x23

As is evident, the fact that the message loop in the MessageBox API does not filter for messages that are applicable only to the message box window and its descendants results in this side effect. Inadvertently our STA component has actually become re-entrant! The behaviour we were expecting to see is evident the moment you change the MessageBox call to a _tprintf and make the method wait for user input via a call to _getch. The following implementation of Dong.Dong causes the second launch of GetFromROT to wait till the first launch has been responded to by pressing a key in the AddToROT console window.

STDMETHODIMP CDong::Dong(LONG* plRetVal)
{
    *plRetVal = 50;

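    // Format the thread and object IDs; each literal gets its own _T()
    // so that the pieces concatenate cleanly in Unicode builds.
    TCHAR szBuf[1024];
    _stprintf( szBuf,
               _T( "Dong - Thread ID = 0x%X, " )
               _T( "Object ID = %d\n" ),
               GetCurrentThreadId(), m_iObjectID );
    _tprintf( _T( "%s\nPress any key to return from " )
              _T( "CDong::Dong\n" ), szBuf );

    // Block on the console instead of a message box: no nested
    // message loop runs here, so no second call can sneak in.
    _getch();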
    return S_OK;
}

A nasty sort of issue to run into wouldn't you think?!

PathIsDirectory woes!
Technobabble
4/18/2007 1:21:32 PM  

Here's what MSDN has to say about PathIsDirectory's return value:

Returns TRUE if the path is a valid directory, or FALSE otherwise.

One of the architectural goals for the product that I am working on is the enabling of cross platform source code portability with minimal development effort. To this end we have a practice of wrapping platform specific API calls via simple functions that mostly just forward the call to the OS API. In this spirit therefore I wrote the simplest of wrapper routines for the PathIsDirectory Win32 API like so (not the actual function name of course!).

bool IsThePathADangDirectory( const TCHAR *pszPath )
{
	return ( PathIsDirectory( pszPath ) == TRUE );
}

I had a perfectly normal directory at the location "C:\WINDDK" and guess what this helper routine returned when I called it like this?

 
_tprintf( _T( "%d\n" ),
          IsThePathADangDirectory( _T( "C:\\WINDDK" ) ) );
 

It returned a big fat false! As it turns out PathIsDirectory does not in fact return TRUE when the path is valid. Instead, it returns a non-zero value. The difference is important because in this case it returns the number 16 when it is happy with the path! Since we were explicitly checking for the return value TRUE, IsThePathADangDirectory dutifully reported that it wasn't a dang directory! The right way to write the function would therefore be,

bool IsThePathADangDirectory( const TCHAR *pszPath )
{
	return !( PathIsDirectory( pszPath ) == FALSE );
}
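The same trap is easy to reproduce outside Win32. The Python 3 sketch below uses a made-up stand-in, `path_is_directory_stub`, that returns 16 for one known path, as observed above. The number 16 happens to equal the value of the FILE_ATTRIBUTE_DIRECTORY constant (0x10), which may well be where it comes from, but that link is an assumption here.

```python
TRUE = 1                           # value of the Win32 TRUE macro
FALSE = 0                          # value of the Win32 FALSE macro
FILE_ATTRIBUTE_DIRECTORY = 0x10    # 16


def path_is_directory_stub(path):
    """Hypothetical stand-in for PathIsDirectory: non-zero means 'directory'."""
    return FILE_ATTRIBUTE_DIRECTORY if path == "C:\\WINDDK" else FALSE


def is_directory_fragile(path):
    # Mirrors the first wrapper: only the exact value 1 counts as success.
    return path_is_directory_stub(path) == TRUE


def is_directory_robust(path):
    # Mirrors the corrected wrapper: anything other than FALSE is success.
    return path_is_directory_stub(path) != FALSE


if __name__ == "__main__":
    print(is_directory_fragile("C:\\WINDDK"), is_directory_robust("C:\\WINDDK"))
```

The general rule this illustrates: treat a BOOL result as "zero or not zero" and never compare it against TRUE.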

Grrr... Trust no one I say, not even the documentation!

On Bugs and Destiny
Technobabble
2/10/2007 1:34:00 PM  

A recent experience at work has left me convinced that sometimes a bug is quite simply just meant to be. I do not refer mind you to bugs that choose to make an appearance only after the user has performed a jiggle in front of the monitor having muttered strange incantations while standing on his/her head with the planets aligned in just the right manner. I refer rather, to bugs that a blind man in a dark room would have been hard pressed to miss (now, if you feel the need to clarify with me the fine point of how a dark room can make the process of locating an object harder than it would otherwise have been to a blind man, then I strongly urge you to resist that need).

Any given feature in the product that I am working on must necessarily pass through the following quality checks before it ever sees the light of day. It must first negotiate a series of unit test cases cunningly designed to trip it up when it's not looking. It must then survive the traumatic experience of system testing where the entire development team has a go at it. This is followed by days upon days of unending torment by the quality assurance team. Only those features that display the resilience to pass through all this without blemish make it to the final release. And yet, a bug that failed the most basic of test cases managed to escape notice and show up in front of the customer!

Like I said, some bugs are just meant to be!

Pay attention to that demo
Technobabble
12/31/2006 1:08:43 PM  

It seems implicitly apparent to me that when someone bothers to create and distribute a demo version of their software product, it must be because they want people to have a go at it and hopefully so impress them that they will want to fork out money and actually buy it. So when somebody invests thousands upon thousands of American dollars on carefully designing and building a product and expends further effort creating a less functional demo edition, the least that one expects is being able to actually use it. For some mysterious reason however the demo of the game Battlefield 2142 does everything except allow you to play it!

The demo edition does not allow you to play solo even though the full version supports it (doesn’t it make more sense to make the demo edition single player only and reserve multi-player for the full version?). You absolutely have to go multi-player. When you launch the game it takes you to a page where presumably it would list all the game servers that are currently available. After waiting a while it listed exactly 1 server! When you try joining that server however it doesn’t work. I forget what exactly the problem was.

But there is this other tab called “Advanced” where it lists lots of servers (why is this the advanced tab again?). At last, I thought I was making some progress. When you try joining a server for the first time it makes you wait for like 5 full minutes assuring you that it is only “caching shaders” or something like that. Once you are through that you finally enter the game and are plonked into a spawn point (you are actually made to select a spawn point – how on earth is the gamer supposed to know where on the map s/he should spawn – at least while joining a game afresh?).

After sitting through all of this you’re finally in the game world. You have the gun in your hand and you’ve just looked around for like 5 seconds when it suddenly pops up a message saying you’ve been kicked out of the server because everybody else voted on it. Clearly it was some kind of private game server (I think it is about time game developers used less brutal messages for notifying somebody that they have been ejected from a server for whatever reason; kicked out is quite… demeaning eh?).

I had just about run out of patience by this time. Figuring I’d give it one more shot, I tried joining another server. A couple of minutes later I was back at square one – kicked out of the server by something called punkbuster. Turns out punkbuster is some kind of bot that looks for gamers who have got cheats turned on in their game and kicks them out. Needless to say I had done nothing of the sort. But that really was the last straw. Not only am I not likely to buy that game; I’d probably be giving it some bad publicity too!

Moral of the story: pay more attention to your demo software!

System API call hooking
Technobabble
9/24/2006 1:38:49 PM  

I have for some time been meaning to investigate how exactly one sets about hooking system API calls, i.e., being able to intercept/instrument calls to Win32 APIs made by any given process on the system. Surprisingly, there are quite a few good, informed articles on the subject. Here're some links to a few of them:

  • API hooking revealed: a good article that covers all the options available to achieve this.
  • Process-wide API spying - an ultimate hack: describes Import Address Table (IAT) patching in fair detail.
  • Three Ways to Inject Your Code into Another Process: another API spying DLL injection article.
  • Windows NT System-Call Hooking: a great article from Mark Russinovich and Bryce Cogswell of Sysinternals fame detailing interception of system calls by patching system call dispatch tables from the kernel mode.
  • Tracing NT Kernel-Mode Calls: talks about intercepting kernel mode APIs such as IoAllocateIrp and IoCallDriver.

My primary interest was in being able to intercept calls to APIs like CopyFile, MoveFile and DeleteFile. Having recently developed an interest in kernel mode programming I initially figured that I'd write this as some sort of kernel mode filter driver and roll a super-cool interception system. But I came to realise in the end that this was not going to be possible without writing some fairly intricate and basically shaky code. As the articles I've given links to above indicate, it is quite possible to do this with a lot less fuss from user mode itself.

To avoid duplicating information already available in these articles I'll just briefly describe the approach I took:

  • I created a DLL that would hook routines that I am interested in from DllMain.
  • I would then inject this DLL into the process that I am interested in using the CreateRemoteThread technique.
  • The injected DLL would call back to the EXE whenever the relevant APIs were called by sending WM_COPYDATA messages.

That's all! One thing that I did not do, however, is implement the fancy function-patching code myself. I used the Microsoft Research Detours library for this; rather than patching the IAT, it rewrites the first few instructions of the target function so that they jump to your hook, and it does so in a very clean, structured fashion. And finally, the whole thing will work only on systems running Windows 2000 and later (who uses Windows 95, 98 and ME anyway!).
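The interception idea itself can be modelled in a few lines. The Python 3 sketch below is only a toy: `import_table`, `call_api`, `hook` and the fake `copy_file` are invented for illustration, and swapping an entry in a lookup table is closest in spirit to IAT patching. It is not how any particular hooking library is implemented.

```python
def copy_file(source, destination):
    """Fake 'system API' standing in for something like CopyFile."""
    return "copied %s -> %s" % (source, destination)


# Stands in for a table of resolved API addresses: callers go through it.
import_table = {"CopyFile": copy_file}


def call_api(name, *args):
    """How the 'application' calls an API: always via the table."""
    return import_table[name](*args)


def hook(name, on_call):
    """Swap the table entry for a wrapper that reports, then forwards."""
    original = import_table[name]

    def detour(*args):
        on_call(name, args)        # tell the spy about the call
        return original(*args)     # then let the real routine do its job

    import_table[name] = detour
    return original                # keep it so the hook can be undone


if __name__ == "__main__":
    seen = []
    hook("CopyFile", lambda name, args: seen.append((name, args)))
    print(call_api("CopyFile", "a.txt", "b.txt"))
    print(seen)
```

The caller's code does not change at all, which is the whole appeal: only the table entry is redirected, and the original routine still runs.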

Here's a screen shot of what the UI for this program that I wrote looks like:

IOSpy screenshot

And here're the binaries and the source code should you feel like taking a look. Please note that I haven't included the Detours library here. You'll have to download it from the link given here yourself (it's only 519 KB in size) and set your build environment up so that the compiler and the linker can find the "detours.h", "detours.lib" and the "detoured.lib" files.

Ultimate list of developer/power user tools
Technobabble
9/11/2006 6:27:24 PM  

Find Scott Hanselman's 2006 Ultimate Developer and Power Users Tool List for Windows here:

http://www.hanselman.com/tools

Who is Scott Hanselman eh? Err.. I don't really know but the tools that he lists I know (well, some of them at least)! From his top 10 life/work changing utilities I am already using 5 (Notepad++, Lutz's Reflector for .NET, Google Desktop, ZoomIt and various other Sysinternals tools) and I am going to give his other suggestions a try as they sound like they're going to be equally super cool!

Put simply these are tools that you cannot afford to leave home without!

Self deleting executables
Technobabble
9/10/2006 5:52:57 PM  

I read an interesting article the other day that spoke about the various mechanisms a Win32 application can employ for deleting itself from the disk once execution completes. The basic issue is of course that while the module is being executed the operating system has the file locked. So something like this will just not work:

    TCHAR szModule[MAX_PATH];
    GetModuleFileName( NULL, szModule, MAX_PATH );
    DeleteFile( szModule );

Of the various options available, the author of the said article had suggested the following approach as being the definitive one as it has the added benefit of functioning correctly on all versions of Microsoft Windows (starting with '95).

Now would be a good time to hop over to the article and see what it's about (and while you're there make sure you look at some of the other articles - pretty neat). Here's the link:

http://www.catch22.net/tuts/selfdel.asp

And here's the approach in brief:

  • When it's time to delete ourselves we first spawn an external process that is guaranteed to exist on all Windows computers (explorer.exe for example) in the suspended state. We do this by calling CreateProcess passing CREATE_SUSPENDED for the dwCreationFlags parameter. Note that when a process is launched this way there's really no telling at what point the primary thread of the process will get suspended. But it does appear to get suspended long before the entry point gets invoked and in fact it occurs even before the Win32 environment for the process has been fully initialized.

  • After this we get the CONTEXT data (basically, the CPU register state) for the suspended primary thread (in the remote process) via GetThreadContext.

  • We then manipulate the stack pointer (ESP) to allocate some space on the remote stack for storing some of our data (like the path to the executable to be deleted). After this we plonk the binary code for a local routine that we've written for deleting files over to the remote process (along with the data it needs) by calling WriteProcessMemory.

  • Next we mess around with the instruction pointer (EIP) so that it points to the binary code we've copied to the remote process and update the suspended thread's context (via SetThreadContext).

  • And finally, we resume execution of the remote process (via ResumeThread). Since the EIP in the remote thread is now pointing to our code, it executes it; which of course, happily deletes the original executable. And that's it!

While this approach does get the job done, the fact that our deletion code executes in the remote process even before Windows has had a chance to initialize it fully places some restrictions on the kind of APIs that we can invoke. It so turns out that APIs like DeleteFile and ExitProcess do work while the process is in this half-baked state. So I figured I'll modify the approach somewhat so that it allows us to call any API we want from our injected code. Here's what I did:

  • As before we launch the external process in a suspended state. However, instead of plonking our code at the location that ESP happens to be pointing at when it got suspended, we put it over the executable's entry-point routine, i.e., we replace the remote process's entry point with our own injected code. And when the entry-point code executes we can be pretty sure that the Win32 environment is fully initialized and primed for use!

  • Figuring out where the entry point of a module lives requires us to parse PE file format structures. In your own program for example, the following code would give you a pointer to the entry point routine in the process's executable image:

#pragma pack( push, 1 )

struct coff_header
{
    unsigned short machine;
    unsigned short sections;
    unsigned int timestamp;
    unsigned int symboltable;
    unsigned int symbols;
    unsigned short size_of_opt_header;
    unsigned short characteristics;
};

struct optional_header
{
    unsigned short magic;
    char linker_version_major;
    char linker_version_minor;
    unsigned int code_size;
    unsigned int idata_size;
    unsigned int udata_size;
    unsigned int entry_point;
    unsigned int code_base;
};

#pragma pack( pop )

//
// get the module address
//
char *module = (char *)GetModuleHandle( NULL );

//
// get the sig
//
int *offset = (int*)( module + 0x3c );
char *sig = module + *offset;

//
// get the coff header
//
coff_header *coff = (coff_header *)( sig + 4 );

//
// get the optional header
//
optional_header *opt = (optional_header *)( (char *)coff + sizeof( coff_header ) );

//
// get the entry point
//
char *entry_point = (char *)module + opt->entry_point;
  • The entry point that you define by the way - main or WinMain - isn't the actual entry point routine. The compiler inserts its own entry point which in turn calls our function. This entry point typically does stuff like CRT initialization and cleanup. In an ANSI console app for instance the actual entry point routine is something called mainCRTStartup.
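    To see the header walk in isolation, here is a rough Python 3 sketch that performs the same three hops (offset at 0x3c, 4-byte signature, 20-byte COFF header, then the entry point field 16 bytes into the optional header) on a byte buffer. The helper `make_fake_image` builds a synthetic, mostly empty image purely for demonstration, and all function names here are made up.

```python
import struct

COFF_HEADER_SIZE = 20          # matches sizeof( coff_header ) above
ENTRY_POINT_FIELD_OFFSET = 16  # offset of entry_point in optional_header


def entry_point_rva(image):
    """Return the entry point's offset from the image base (its RVA)."""
    # The 32-bit value at 0x3c says where the "PE\0\0" signature lives.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    optional_header = pe_offset + 4 + COFF_HEADER_SIZE
    (rva,) = struct.unpack_from(
        "<I", image, optional_header + ENTRY_POINT_FIELD_OFFSET)
    return rva


def entry_point_address(image_base, image):
    """Entry point as an absolute address, given where the image is loaded."""
    return image_base + entry_point_rva(image)


def make_fake_image(pe_offset=0x80, rva=0x1234):
    """Build a tiny synthetic buffer with just the fields we read."""
    image = bytearray(pe_offset + 4 + COFF_HEADER_SIZE + 24)
    image[0:2] = b"MZ"
    struct.pack_into("<I", image, 0x3C, pe_offset)
    image[pe_offset:pe_offset + 4] = b"PE\0\0"
    struct.pack_into(
        "<I", image,
        pe_offset + 4 + COFF_HEADER_SIZE + ENTRY_POINT_FIELD_OFFSET, rva)
    return bytes(image)


if __name__ == "__main__":
    print(hex(entry_point_address(0x01000000, make_fake_image())))
```

    Pointing `entry_point_rva` at the bytes of a real 32-bit executable read from disk should work the same way, since only the fixed-offset fields at the front of the headers are touched.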

  • It appears logical that we should be able to find the entry point routine in remote processes also in a similar fashion using ReadProcessMemory. While that is so, finding the equivalent of the module variable in the code given above for remote processes turned out to be trickier than anticipated. The problem is that there is no convenient GetModuleHandle routine that'll work for remote processes.

  • As it turns out GetModuleHandle only knows about modules loaded in the calling process, and the address it returns is meaningful only within that process's address space. ReadProcessMemory, on the other hand, needs an address that is valid in the remote process. So the question is, how do we get to know the remote process's base address in memory? The solution as it turned out requires us to dig deep into the OS's internals! The credit for this solution goes to Ashkbiz Danehkar whose article called Injective Code inside Import Table on Code Project outlines a method for finding this.

  • In brief, the operating system maintains a user-mode data structure for every thread in the system called the Thread Environment Block (TEB) which describes pretty much everything you'd want to know about the thread including a pointer to another data structure called the Process Environment Block (PEB) which, as may be apparent describes processes including, happily for us, a pointer to the image's base address in memory! These structures are not however documented (by Microsoft that is ;). But some very very clever folks at http://undocumented.ntinternals.net/ managed to figure out the layout for these structures all by themselves!

  • So all we need to do is:

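    • Figure out where the TEB for the primary thread lives in the remote process. On 32-bit Windows the thread's FS segment register selects a segment whose base address is the TEB; we get the selector value from GetThreadContext and look up its descriptor via the GetThreadSelectorEntry API.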
    • Read the PEB using the pointer to it in the thread's TEB via ReadProcessMemory.
    • Use the pointer to the image's base address in the PEB and parse the PE structures till we are left with a reference to the remote process's entry point routine.
    • Phew!

    Here's the code that achieves this:

//
// Gets the address of the entry point routine given a
// handle to a process and its primary thread.
//
DWORD GetProcessEntryPointAddress( HANDLE hProcess, HANDLE hThread )
{
    CONTEXT             context;
    LDT_ENTRY           entry;
    TEB                 teb;
    PEB                 peb;
    DWORD               read;
    DWORD               dwFSBase;
    DWORD               dwImageBase, dwOffset;
    DWORD               dwOptHeaderOffset;
    optional_header     opt;
    
    //
    // get the current thread context
    //
    context.ContextFlags = CONTEXT_FULL | CONTEXT_DEBUG_REGISTERS;
    GetThreadContext( hThread, &context );
    
    //
    // use the segment register value to get a pointer to
    // the TEB
    //
    GetThreadSelectorEntry( hThread, context.SegFs, &entry );
    dwFSBase = ( entry.HighWord.Bits.BaseHi << 24 ) |
                     ( entry.HighWord.Bits.BaseMid << 16 ) |
                     ( entry.BaseLow );
    
    //
    // read the teb
    //
    ReadProcessMemory( hProcess, (LPCVOID)dwFSBase,
                       &teb, sizeof( TEB ), &read );
    
    //
    // read the peb from the location pointed at by the teb
    //
    ReadProcessMemory( hProcess, (LPCVOID)teb.Peb,
                       &peb, sizeof( PEB ), &read );
    
    //
    // figure out where the entry point is located;
    //
    dwImageBase = (DWORD)peb.ImageBaseAddress;
    ReadProcessMemory( hProcess, (LPCVOID)( dwImageBase + 0x3c ),
                       &dwOffset, sizeof( DWORD ), &read );
    
    dwOptHeaderOffset = ( dwImageBase + dwOffset + 4 + sizeof( coff_header ) );
    ReadProcessMemory( hProcess, (LPCVOID)dwOptHeaderOffset,
                       &opt, sizeof( optional_header ), &read );
    
    return ( dwImageBase + opt.entry_point );
}
  • If you're wondering what the weird code initializing dwFSBase means all I can do is direct you to the documentation for the LDT_ENTRY data structure in MSDN. Structures of this kind are partly the reason why system programmers tend to go bald early in life.
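    For the curious, the same bit-shuffling can be sketched in Python 3. The layout assumed here is the standard 8-byte x86 segment descriptor (limit low, base low, base mid, two flag bytes, base high), which is what LDT_ENTRY mirrors; `segment_base` and the sample values are invented for illustration.

```python
import struct


def segment_base(descriptor):
    """Reassemble the 32-bit base address scattered across a descriptor."""
    # Fields in order: LimitLow (16 bits), BaseLow (16 bits), BaseMid (8),
    # Flags1 (8), Flags2 (8), BaseHi (8). The base is split three ways.
    (_limit_low, base_low, base_mid,
     _flags1, _flags2, base_hi) = struct.unpack("<HHBBBB", descriptor)
    return (base_hi << 24) | (base_mid << 16) | base_low


if __name__ == "__main__":
    # A made-up descriptor whose base is 0x7FFDF000.
    sample = struct.pack("<HHBBBB", 0x0FFF, 0xF000, 0xFD, 0xF3, 0x40, 0x7F)
    print(hex(segment_base(sample)))
```

    The base really is stored in three separate pieces, which is why the C code has to shift and OR them back together.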

  • Now that we know where the entry point lives in the remote process it should be really straightforward right? Wrong! There still is that itsy bitsy problem of figuring out how we are to pass data to the remote process!

    The routine that deletes our executable looks like this:

#pragma pack(push, 1)

//
//  Structure to inject into remote process. Contains 
//  function pointers and code to execute.
//
typedef struct _SELFDEL
{
    HANDLE  hParent;                // parent process handle
    FARPROC fnWaitForSingleObject;
    FARPROC fnCloseHandle;
    FARPROC fnDeleteFile;
    FARPROC fnSleep;
    FARPROC fnExitProcess;
    FARPROC fnRemoveDirectory;
    FARPROC fnGetLastError;
    FARPROC fnLoadLibrary;
    FARPROC fnGetProcAddress;
    BOOL    fRemDir;
    TCHAR   szFileName[MAX_PATH];   // file to delete
} SELFDEL;

#pragma pack(pop)

//
//  Routine to execute in remote process. 
//
static void remote_thread(SELFDEL *remote)
{
    // wait for parent process to terminate
    remote->fnWaitForSingleObject(remote->hParent, INFINITE);
    remote->fnCloseHandle(remote->hParent);

    // try to delete the executable file 
    while(!remote->fnDeleteFile(remote->szFileName))
    {
        // failed - try again in one second's time
        remote->fnSleep(1000);
    }

    // finished! exit so that we don't execute garbage code
    remote->fnExitProcess(0);
}
  • As you might have noticed the function remote_thread makes all system calls via function pointers instead of calling them directly. This is done because, in the normal course, the compiler generates tiny stubs whenever calls to routines in dynamically loaded DLLs are made from a program. This stub jumps to a function pointer stored in a table initialized by the operating system's loader at runtime. Since we don't want these fancy stubs generated for code that is meant to be injected into a remote process, we deal exclusively with function pointers.

    Fortunately for us, the system APIs (in kernel32, user32 etc.) always get loaded at the same virtual address in all processes. So all we need to do is initialize a data structure with pointers to all the system calls we want to make from the remote process and pass this structure along also. With our entry-point overwrite strategy of course, how are we to do this? To make a long story short, I settled for the following approach.

  • First, I modified remote_thread to look like this:

//
//  Routine to execute in remote process. 
//
static void remote_thread()
{
    //
    // this will get replaced with a
    // real pointer to the data when it
    // gets injected into the remote
    // process
    //
    SELFDEL *remote = (SELFDEL *)0xFFFFFFFF;

    //
    // wait for parent process to terminate
    //
    remote->fnWaitForSingleObject(remote->hParent, INFINITE);
    remote->fnCloseHandle(remote->hParent);

    //
    // try to delete the executable file 
    //
    while(!remote->fnDeleteFile(remote->szFileName))
    {
        //
        // failed - try again in one second's time
        //
        remote->fnSleep(1000);
    }

    //
    // finished! exit so that we don't execute garbage code
    //
    remote->fnExitProcess(0);
}
  • I then converted this into shellcode (the exact mechanics of which I'll outline in another post) to arrive at what looks like this (this is just representative shellcode and not the one that got generated for the routine shown above):

char shellcode[] = {
    '\x55', '\x8B', '\xEC', '\x83', '\xEC', 
    '\x10', '\x53', '\xC7', '\x45', '\xF0',
    '\xFF', '\xFF', '\xFF', '\xFF',   // replace these 4 bytes
                                      // with actual address
    '\x8B', '\x45', '\xF0', '\x8B', '\x48',
    '\x20', '\x89', '\x4D', '\xF4', '\x8B',
    '\x55', '\xF0', '\x8B', '\x42', '\x24',
    '\x89', '\x45', '\xFC', '\x6A', '\xFF', ... more shell code here

  • shellcode, if you didn't know, is the technical term used (in security circles) to refer to binary machine code that is typically used in exploits as the payload. As it turns out in our case the value 0xFFFFFFFF that we initialized the pointer remote with in remote_thread shows up the exact same way in the shellcode also. Since we know where the entry point lives in the remote process, all we need to do is to first replace 0xFFFFFFFF in the shellcode with the actual pointer to the data before over-writing the entry point. Here's how this looks:

STARTUPINFO             si = { sizeof(si) };
PROCESS_INFORMATION     pi;
SELFDEL                 local;
DWORD                   data;
DWORD                   oldProt;
TCHAR                   szExe[MAX_PATH] = _T( "explorer.exe" );
DWORD                   process_entry;

//
// this shellcode self-deletes and then shows a messagebox
//
char shellcode[] = {
    '\x55', '\x8B', '\xEC', '\x83',
    '\xEC', '\x10', '\x53', '\xC7',
    '\x45', '\xF0',
    '\xFF', '\xFF', '\xFF', '\xFF',   // replace these 4 bytes
                                      // (offsets 10 to 13) with
                                      // the actual address
    '\x8B', '\x45', '\xF0', '\x8B',
    '\x48', '\x20', '\x89', '\x4D',

    ... snipped lots of meaningless shellcode here! ...

    '\xFF', '\xD0', '\x5B', '\x8B',
    '\xE5', '\x5D', '\xC3'
};

//
// initialize the SELFDEL object
//
local.fnWaitForSingleObject     = (FARPROC)WaitForSingleObject;
local.fnCloseHandle             = (FARPROC)CloseHandle;
local.fnDeleteFile              = (FARPROC)DeleteFile;
local.fnSleep                   = (FARPROC)Sleep;
local.fnExitProcess             = (FARPROC)ExitProcess;
local.fnRemoveDirectory         = (FARPROC)RemoveDirectory;
local.fnGetLastError            = (FARPROC)GetLastError;
local.fnLoadLibrary             = (FARPROC)LoadLibrary;
local.fnGetProcAddress          = (FARPROC)GetProcAddress;

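//
// Launch the host process in the suspended state
//
CreateProcess( NULL, szExe, NULL, NULL, FALSE,
               CREATE_SUSPENDED, NULL, NULL, &si, &pi );

//
// Give remote process a copy of our own process handle
//
DuplicateHandle(GetCurrentProcess(), GetCurrentProcess(), 
    pi.hProcess, &local.hParent, 0, FALSE, 0);
GetModuleFileName(0, local.szFileName, MAX_PATH);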

//
// get the process's entry point address
//
process_entry = GetProcessEntryPointAddress( pi.hProcess, pi.hThread );

//
// replace the address of the data inside the
// shellcode (bytes 10 to 13)
//
data = process_entry + sizeof( shellcode );
shellcode[13] = (char)( data >> 24 );
shellcode[12] = (char)( ( data >> 16 ) & 0xFF );
shellcode[11] = (char)( ( data >> 8 ) & 0xFF );
shellcode[10] = (char)( data & 0xFF );

//
// copy our code+data at the exe's entry-point
//
VirtualProtectEx( pi.hProcess,
                  (PVOID)process_entry,
                  sizeof( local ) + sizeof( shellcode ),
                  PAGE_EXECUTE_READWRITE,
                  &oldProt );
WriteProcessMemory( pi.hProcess,
                    (PVOID)process_entry,
                    shellcode,
                    sizeof( shellcode ), 0);
WriteProcessMemory( pi.hProcess,
                    (PVOID)data,
                    &local,
                    sizeof( local ), 0);

//
// Let the process continue
//
ResumeThread(pi.hThread);
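The byte-patching step can be tried on its own. The Python 3 sketch below locates the 0xFFFFFFFF placeholder instead of hard-coding offsets 10 to 13, and writes the address in little-endian order just as the four assignments above do. `patch_data_pointer`, the sample fragment (the representative bytes listed earlier, not complete shellcode) and the entry point value are all illustrative assumptions.

```python
import struct

PLACEHOLDER = b"\xff\xff\xff\xff"


def patch_data_pointer(shellcode, data_address):
    """Return a copy of shellcode with the placeholder replaced."""
    # Insist on exactly one match so that an unrelated run of 0xFF bytes
    # (an immediate operand, say) is never patched by mistake.
    if shellcode.count(PLACEHOLDER) != 1:
        raise ValueError("expected exactly one 0xFFFFFFFF placeholder")
    offset = shellcode.find(PLACEHOLDER)
    pointer = struct.pack("<I", data_address)   # least significant byte first
    return shellcode[:offset] + pointer + shellcode[offset + 4:]


# The representative fragment shown earlier (not the full routine).
SAMPLE = bytes.fromhex(
    "55 8B EC 83 EC 10 53 C7 45 F0 FF FF FF FF "
    "8B 45 F0 8B 48 20 89 4D F4 8B 55 F0 8B 42 24 "
    "89 45 FC 6A FF")


if __name__ == "__main__":
    entry = 0x01001000                 # made-up entry point address
    data = entry + len(SAMPLE)         # data sits right after the code
    print(patch_data_pointer(SAMPLE, data).hex(" "))
```

Searching for the marker keeps the patch correct even if the compiler emits a slightly different prologue, at the cost of having to make sure the marker value cannot appear anywhere else in the generated code.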

There! That's all there is to it. Please find the code for a self-deleting executable (that among other things also displays a message box from the remote process's hijacked entry point) here:

myselfdel.c
ntundoc.h
 
Python rocks!
Technobabble
8/18/2006 9:14:20 AM  

I have been dabbling with a bit of 3D graphics programming for the last couple of days; trying my hand at exporting models designed using the free 3D modelling software Blender and getting them rendered using OpenGL. Blender supports a Python based scripting system where pretty much everything in Blender can be accessed via Python scripts. So, along the way I happened to see what Python was all about.

Exporting co-ordinates and normal vectors from Blender turned out to be fairly straightforward. The following script does a quick and dirty job of producing a file with all the numbers.

 import Blender;
 from Blender import *;

 import Blender.Scene;
 from Blender.Scene import *;

 f = open( "co.csv", "w" )

 scenes = Scene.Get(); # iterate thru all ze scenes
 for sc in scenes:
   objects = sc.getChildren(); # run thru all ze objects in this scene
   for obj in objects:
     if( obj.getType() != "Mesh" ): # we are interested only in meshes
       continue;
     print "exporting ", obj.name;
     f.write( "# " + obj.name + "\n" ); # write the name of the object into
                                        # the file; our renderer ignores
                                        # lines starting with the '#' character

     data = obj.getData(0, 1); # get co-ord data
     for face in data.faces: # blender gives us the co-ords face-wise
       #
       # first we write out the normal vector for this polygon
       #
       f.write( "--" +
                ( '%f' % face.no.x ) + "," +
                ( '%f' % face.no.y ) + "," +
                ( '%f' % face.no.z ) + "\n" );

       #
       # now, write out all the vertices
       #
       for v in face.verts:
           f.write( ( '%f' % v.co.x ) + "," +
                    ( '%f' % v.co.y ) + "," +
                    ( '%f' % v.co.z ) + "\n" )
    
 f.close();

Nothing spectacular about the script really. Here's a screenshot of what Blender's default Suzanne monkey model looks like in my OpenGL program.

suzanne monkey

But one problem I quickly ran into was with regard to the scale of the co-ordinates output by Blender. Sometimes Blender's data would cause the model to be rendered in gigantic proportions and I'd have to render it far into the screen to make it fit inside the window. What I really needed was a script that could post-process the co-ordinates from Blender (basically, scale them down). I figured I'll write it in Python (I could have done the scaling while exporting them from Blender of course, but where's the fun in that!) and boy was it cool, or was it cool!

Python supports this super-cool feature called "List comprehension" that allows you to succinctly express operations that you want performed on elements in a collection. In my case the file containing the list of co-ordinates looked like this:

-3.448783,-1.912737,-0.861946
-3.281278,-2.116843,-0.861946
-2.328871,-1.164436,-3.555760
-2.328871,-1.164436,-3.555760
-3.573251,-1.679874,-0.861946
-3.448783,-1.912737,-0.861946

And I wanted each of those numbers scaled down by a factor. The script turned out to be remarkably short and it has elegance written all over it!

 import string;

 #
 # open the source, dest files
 #
 src = open( "co.csv", "r" );
 dst = open( "cos.csv", "w" );

 #
 # process each line
 #
 line = src.readline();
 while( len( line ) > 0 ):
  #
  # process it only if it is a non-comment and non
  # normal line
  #
  if( line[0] != "#" and line[0:2] != "--" ):
   line = string.join( [ str( float( x ) * 0.25 )
           for x in string.split( line, "," ) ], "," ) + "\n";
   dst.write( line );
  line = src.readline();

 src.close();
 dst.close();

Take special note of the statement that builds the new line with string.join. Believe it or not, that single statement splits a line of comma-delimited text, converts each resulting token into a float, multiplies it by 0.25, converts the value back into a string and then concatenates the list of converted values into a comma-delimited string again! Now I know that some of you are thinking that we have traded off clarity of code for expressive power and that is true to an extent here. But it isn't any more obscure than, say, a regular expression! I guess it boils down to personal preference at the end of the day.
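For anyone on Python 3, where string.join and string.split are no longer available as module functions, here is a minimal sketch of the same idea using the equivalent str methods. The function name scale_line and its factor parameter are made up for illustration and are not part of the original script.

```python
def scale_line(line, factor=0.25):
    """Scale every comma-separated number in `line` by `factor`."""
    # split on commas -> convert each token to float -> scale it ->
    # convert back to a string -> join the pieces with commas again
    return ",".join([str(float(x) * factor) for x in line.split(",")])


# one line taken from the sample co-ordinate file shown earlier
print(scale_line("-3.448783,-1.912737,-0.861946"))
```

float() ignores surrounding whitespace, so a trailing newline on the input line does no harm; the caller just has to append a fresh newline before writing the result out.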

I could have written 20 lines to do the same thing but I sure wouldn't be feeling as pleased as I am feeling right now! Python rocks!

 
Fedora Linux - Not for the faint of heart!
Technobabble
7/29/2006 12:04:42 AM  

I happened to install the 64 bit edition of Fedora Core 5 on my computer a few days back. Here's a log of all that transpired during that time before I could finally wipe the sweat off my forehead, heave a sigh of quiet satisfaction and watch GNOME load in all its resplendent glory!

22nd July, 2006 - 15:00 hrs

Received Fedora Core 64 bit bootable DVD via courier that I'd ordered through Buylinuxdvd.com. This is a great site for buying Linux distros in India by the way. You will not find fancy shmancy payment gateways though. But the service is just great.

15:30 hrs

Step 1 was of course, to carve out some space on my hard drive where I could install Fedora. Since I had lots of free space on one of my partitions I decided to hew it out of that partition and thus began my search for a free disk partition tool that could create partitions out of free space on existing partitions.

I went ahead and burnt the GParted LiveCD ISO image onto a CD-ROM and rebooted, only to find GParted indefinitely scanning for hardware! Ran out of patience and tried to see if PartitionLogic was any better. Well it wasn't! It had an issue with crossing a certain line called A20 (whatever that is!).

By this time, though, I had read somewhere that while GParted might take a while scanning things, it would eventually deliver. Deciding to give it another shot, I rebooted into GParted, got the scanning thing going and went to watch my favourite TV show. By the time I came back (around 30 mins later) it had managed to figure out where my hard drive was and how it was laid out. God bless GParted developers! From there on it was a piece of cake (creating the partitions that is).

The lesson learnt here is of course that, "GParted's mill grinds slow, but sure."!

20:00 hrs

After a failed attempt at getting Fedora's Anaconda installer to start in GUI mode I successfully got Fedora installed using the plain jane character mode interface! Hurray! I now had Fedora Linux installed and it didn't mess up my XP installation! The experience so far had been so smooth and bump free that the only way forward was - down!

It all started when I decided that I had to have a GUI!

23rd July, 2006 - 02:20 hrs

Still no GUI :(. Got a few Nvidia display drivers for Linux downloaded on my XP system and was wondering how to access them from Fedora (my Windows partitions are all NTFS formatted you see). Turns out that some brilliant folks have actually reverse engineered the NTFS file system and written drivers for Linux that'll let us mount NTFS partitions on Linux in read-only mode! These guys rock!

I happily issued the command:

yum install kmod-ntfs

Only to be informed that I needed to upgrade my kernel and must download and install a 21 MB package before even thinking about kmod-ntfs. I had already started using stilts by this time to prevent falling face down on the keyboard before I finally got the hint my body had been trying to give me all along, that I needed to grab some sleep and that getting X to work on Fedora could actually wait (what blasphemy)! I let the download progress however and fell into the closest bed that I could find, falling asleep long before actually hitting ground (or well, bed)!

11:00 hrs

No Sir. Still staring at the dang white blinking cursor on the black dang screen! I installed kmod-ntfs (finally!), copied the Nvidia drivers over and even got them to build and install. But doing an init 5 continued to cause X to go nuts! After scouring all over the internet and trying 20 different home-brewed solutions I was almost ready to give up and concede defeat when I suddenly decided to take a look at /var/log/Xorg.0.log.

As it turned out, X was having trouble locating where exactly the module "nvidia" was to be found. I searched for "nvidia_drv.so" and found it easily enough in /usr/X11R6/lib64/modules/drivers. What I really wanted to do then was to give X a thorough shake and beat it into its puny brain where the driver lived. Grrr.

24th July, 2006 - 00:00 hrs

While I was thus considering the modalities of how such a shake could be administered I suddenly noticed that X was loading keyboard and mouse drivers from a different location - /usr/lib64/xorg/modules! Voila! That was of course the problem! I immediately decided to copy the nvidia driver files from /usr/X11R6/lib64/modules to this location. I did that first, then checked whether xorg.conf was configured right, ran nvidia-xconfig again just to be sure, checked whether the planets were aligned just right and ran the command:

startx

No words can express the joy I experienced when a beautiful screen like this opened up in 16-bit splendour!

Fedora GNOME screenshot

Only then did I wipe the sweat off my brow, heave a sigh of quiet satisfaction and rest my weary head for well-deserved sleep.

 
On Unicode
Technobabble
7/16/2006 4:47:58 PM  

There were some interesting Unicode related issues that cropped up recently in the project that I am working on that led to my doing a little research into what the fuss around Unicode was all about. While I had some understanding of what Unicode was, there were a few things that I managed to learn anew. So, if you didn't know, here's the lowdown on the Unicode standard.

First, some basic facts

  • Firstly there are two parallel efforts aimed at standardizing the use of characters in computer programs! One is the ISO 10646 project called the Universal Character Set (UCS) and the other is of course, Unicode. Around 1991 however, participants from both the projects fortunately decided that it would probably not be a good idea to have two competing standards for solving the same problem and decided to make both of their specifications compatible.

  • The primary goal of the Unicode standard is the definition of a universal character set (!), i.e., a character set to replace all the other character sets. Further, it would also be able to accommodate characters from all the languages spoken/written in the world.

  • It achieves this by assigning unique numbers – called code points – to each character. The Kannada letter “ka” for example has been assigned the code point 3221. What this means is that 3221 is forever the code for the Kannada letter “ka” all over the planet! Numbers such as this are assigned for all characters in all languages.

  • Code points are always assigned from the range 0x000000 to 0x10FFFF. You’d need 21 bits to represent this information at most. Around 5% of this space (works out to about 50,000 characters) is currently in use, another 5% is in preparation, about 13% is reserved for private use and about 2% is just reserved and not to be used for representing characters. The remaining 75% (around 835,000 characters) is open for future use!

  • Interestingly, effort is underway for assigning code points to characters from imaginary languages as well! JRR Tolkien invented a whole slew of languages each with its own grammar and script for his epic trilogy – “The Lord of the Rings”. Languages spoken and written by elves, dwarves, hobbits and ents (large walking/talking trees!) including a language called “Black Speech” used by orcs and other such dark residents of Mordor!

Some caveats

  • You might have heard that Unicode characters can be represented by 2 byte unsigned integers. Well, this is not entirely true. While it is possible to represent all the Unicode characters that exist in the world today (which represents the most frequently used set of characters) using 2 byte unsigned integers (given that only around 50,000 characters exist and an unsigned short can have a maximum value of 65,535) it is possible that code points get created whose value is greater than the maximum that can be accommodated in an unsigned short. The most commonly used characters however have been assigned numbers within the range 0x0000 to 0xFFFF (this is called the Basic Multilingual Plane or BMP).

  • The closest data type in C/C++ that can be used to represent all the possible Unicode code points is a 4 byte integer. But this would also mean that 11 bits would get wasted for every character given that all the code points can be represented with just 21 bits. The size of the C/C++ wchar_t data-type is compiler dependent and the standard does not say anything on how big it must be.

  • Even while using 2 bytes per character you’ll immediately notice that using them is wasteful when you’re mostly dealing with characters belonging, for example to the ASCII character set (all the ASCII character code assignments have been retained in Unicode for ensuring backward compatibility by the way) since the second byte would always have the value zero for all the characters.
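The arithmetic behind these caveats is easy to check. The following is only an illustrative sketch: the helper names are invented, and U+1D11E (the musical G clef symbol) is simply one example of a character that lives outside the BMP.

```python
MAX_CODE_POINT = 0x10FFFF   # upper end of the Unicode code space
BMP_LIMIT = 0xFFFF          # largest value a 2 byte unsigned integer can hold


def bits_needed(code_point):
    """Number of bits required to store `code_point` as a plain integer."""
    return max(1, code_point.bit_length())


def fits_in_16_bits(code_point):
    """True if the code point lies inside the Basic Multilingual Plane."""
    return code_point <= BMP_LIMIT


bits = bits_needed(MAX_CODE_POINT)
print(bits, 32 - bits)              # bits used, bits left over in a 4 byte integer
print(fits_in_16_bits(ord("A")))    # an ASCII character
print(fits_in_16_bits(3221))        # the Kannada letter "ka"
print(fits_in_16_bits(0x1D11E))     # a character beyond the BMP
```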

Encodings

  • To get around this problem some clever folks invented “encoding” schemes such as UTF-8 and UTF-16 that lay out how any Unicode code point from the entire spectrum can be represented using the least number of bytes. UTF-8 in particular is quite popular as it automatically ensures backward compatibility with older documents. All existing ASCII documents are already valid UTF-8 files. Here’s a nifty little table that specifies how Unicode code points will be represented in the UTF-8 encoding scheme.

    Unicode code point range (hex)   UTF-8 byte sequence (binary)
    00000000 - 0000007F              0xxxxxxx
    00000080 - 000007FF              110xxxxx 10xxxxxx
    00000800 - 0000FFFF              1110xxxx 10xxxxxx 10xxxxxx
    00010000 - 001FFFFF              11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    00200000 - 03FFFFFF              111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
    04000000 - 7FFFFFFF              1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx

    The first column specifies the range of code points and the second column is a bitwise representation of how it will be denoted under UTF-8. Code points up to 0x7F (the ASCII character set) for instance will be represented using a single byte. For code points greater than 0x7F at least 2 bytes are needed, and the number of contiguous bits set to 1 in the first byte before a zero is encountered indicates the number of bytes used to represent that code point. For example, 3 bytes are required for representing code points in the range 0x00000800 – 0x0000FFFF and this is indicated by the fact that the 3 most significant bits in the first byte are set to 1 followed by a zero bit.

  • UTF-8 is a “variable encoding” scheme where each character in the document can correspond to a varying number of bytes. Finding the size of such a document post encoding can be somewhat tricky.
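To make the table concrete, here is a hand-rolled sketch of an encoder that follows its first four rows. Since code points stop at 0x10FFFF, the 5 and 6 byte rows are never needed for actual Unicode characters. The function name utf8_encode is made up for this example, and the sketch does not special-case the surrogate range; real code should just call str.encode("utf-8").

```python
def utf8_encode(code_point):
    """Encode a single code point as UTF-8 bytes, row by row from the table."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if code_point <= 0x7F:                       # 0xxxxxxx
        return bytes([code_point])
    if code_point <= 0x7FF:                      # 110xxxxx 10xxxxxx
        return bytes([0b11000000 | (code_point >> 6),
                      0b10000000 | (code_point & 0x3F)])
    if code_point <= 0xFFFF:                     # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0b11100000 | (code_point >> 12),
                      0b10000000 | ((code_point >> 6) & 0x3F),
                      0b10000000 | (code_point & 0x3F)])
    return bytes([0b11110000 | (code_point >> 18),   # 11110xxx + 3 continuation bytes
                  0b10000000 | ((code_point >> 12) & 0x3F),
                  0b10000000 | ((code_point >> 6) & 0x3F),
                  0b10000000 | (code_point & 0x3F)])


# the Kannada letter "ka" (code point 3221) falls in the 3 byte row
print(utf8_encode(3221).hex())
```

Because the number of bytes depends on the code point, the encoded size of a document can only be found by walking through its characters (or by encoding it and measuring), which is the "tricky" part mentioned above.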

Most of this information has been taken from the following great resources on this topic.

http://www.cl.cam.ac.uk/~mgk25/unicode.html
This is an FAQ on what it takes to support Unicode on Linux and has a lot of information on Unicode and UCS in general.

http://www-128.ibm.com/developerworks/library/codepages.html
Talks about various character sets. Good introduction to Unicode.

http://icu.sourceforge.net/docs/papers/unicode_wchar_t.html
Talks about issues relating to size of the C/C++ wchar_t data-type.

 
Windows hooks & call sequence
Technobabble
7/6/2006 9:19:58 PM  

I ran into an interesting little issue at work the other day. There is this program that I am working on which happens to embed Office documents into a browser control hosted in a CHtmlView derived view window (that's right, we use MFC). The requirement was to pop up a little toolbar with a save button whenever the user did some in-place editing on whatever's displayed in the browser control. We already knew how to detect whether at a given point in time the document currently displayed in the browser has been edited. This was done in this manner (error checks omitted for brevity):

bool IsDirty()
{
    //
    // get hold of the html view object somehow
    //
    CHtmlView *pView = GetViewSomehow();

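    //
    // get a pointer to the HTML document object; GetHtmlDocument
    // hands back a pointer that has already been AddRef'd, so we
    // attach to it instead of adding a second reference
    //
    CComPtr<IDispatch> spDocument;
    spDocument.Attach( pView->GetHtmlDocument() );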

    //
    // turns out that you can QI on the doc object to
    // get a "IPersistStorage" object which is directly linked
    // to whatever happens to be embedded in the browser
    //
    CComPtr<IPersistStorage> spStorage;
    HRESULT hr = spDocument->QueryInterface( IID_IPersistStorage, (void **)&spStorage );

    //
    // now we just call "IPersistStorage::IsDirty" to figure
    // out whether the doc's been edited
    //
    hr = spStorage->IsDirty();
    return ( hr == S_OK );
}

The question of course was to figure out when to invoke this routine so we could display the toolbar. We first came up with an approach that involved continuously polling for changes by making some creative use of CWinApp::OnIdle. But that resulted in short bursts of CPU usage and, while the solution worked, it somehow didn't feel right!

The next thing we tried was to see if Windows hooks can be put to some use here. We quickly set up a system-wide keyboard and mouse hook which "phoned home" so to speak whenever an event occurred by posting a custom message to a window. Whenever this message is received by the window it would call our clever little IsDirty routine to check if the document has been modified and get the toolbar displayed if need be. I let out a satisfied little burp at this point and compiled, linked and hit Ctrl+F5.

I loaded up a PowerPoint file into the browser control and pressed a few keys and whopeeee(!) the toolbar appeared straightaway. On doing some additional testing however I discovered that the damn thing showed up only with the second key press and not immediately after the first one! As it turns out, Windows delivers messages to hooks before they are delivered to the target application. So my hook was getting the keyboard event before PowerPoint was getting it and the call to IsDirty was consequently returning false as PowerPoint hadn't had a chance to mark the file as having been modified yet.

This thing drove us a little nuts until of course we figured a way out. I even made a newsgroup post on this issue (with no response by the way). The solution in the end turned out to be quite simple.

Well-behaved hooks are required to call the CallNextHookEx function before returning from the hook routine. This is to let other hooks that are installed on the system have a go at the message. I had done a PostMessage to the application window before calling CallNextHookEx like so:


PostMessage( hwndNotify, WM_HOOK_NOTIFY_MOUSE, 0, 0 );
return CallNextHookEx( NULL, nCode, wParam, lParam );

I made a small change to the order of invocation in this manner:


LRESULT lResult = CallNextHookEx( NULL, nCode, wParam, lParam );
PostMessage( hwndNotify, WM_HOOK_NOTIFY_MOUSE, 0, 0 );
return lResult;

And voila! it started working! It must be pretty evident what the issue was by just looking at the change that was done. Looks like CallNextHookEx, apart from calling other hooks that may have been installed also actually delivers the message to the target application before returning. In this case, this was just what the doctor ordered :)!

Cool eh?!

 
Turn off VS.NET 2005 deprecation
Technobabble
6/25/2006 3:13:32 PM  

If you've been compiling projects created using earlier versions of Visual Studio in Visual Studio 2005 then you have most certainly noticed the new security warnings that get displayed whenever an insecure CRT routine is invoked from your code. If you call strcpy for instance, you'd see this:

    warning C4996: 'strcpy' was declared deprecated
    This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use
    _CRT_SECURE_NO_DEPRECATE. See online help for details.

If you're writing new code then it is generally a good idea to listen to this warning and change your code. If you are building an existing project however (an open source project for example) then you're likely to get hundreds of C4996 warnings. One straightforward way of turning them off is to disable deprecation warnings by defining _CRT_SECURE_NO_DEPRECATE.

If you've got a large solution with 15-20 projects and 2-3 build configurations in each then defining this symbol for each project can be one seriously daunting task. In a 10 project solution with each project having a "Debug" and a "Release" configuration for instance you'd have to define this symbol in the project properties dialog 20 times (10 * 2)! I found myself having to do this often enough to warrant the writing of a small Visual Studio macro to do the job. This macro captures the preprocessor definitions from an input box and adds them to all the configurations of each project that is currently selected in the solution explorer. Here's the macro definition:

Public Sub AddPreprocessorMacroToAllProjects()

    '
    ' check whether at least one project has been selected
    '
    If DTE.ActiveSolutionProjects.Length = 0 Then
        MsgBox("Please select the Visual C++ project(s) " + _
            "to which you would like a macro to be added.")
        Exit Sub
    End If

    '
    ' get the macro names and values
    '
retry:
    Dim macro As String
    macro = InputBox("Please enter one or more macros " + _
        "(e.g. _WIN32_WINNT=0x0500; WINVER=0x0500)", _
            "Enter Macros").Trim()
    If macro.Length = 0 Then
        If MsgBox("An empty macro was entered.  Retry?", _
                MsgBoxStyle.YesNo, _
                "Wrong macro") = MsgBoxResult.Yes Then
            GoTo retry
        Else
            Exit Sub
        End If
    End If

    '
    ' now iterate through each project in the array and add the
    ' macro to all the configurations of all visual c++ projects
    '
    Dim i As Integer
    Dim project As Project
    Dim VCProjectKind As String = "{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}"

    For i = 0 To DTE.ActiveSolutionProjects.Length - 1
        project = DTE.ActiveSolutionProjects(i)

        '
        ' is it a visual c++ project?
        '
        If project.Kind = VCProjectKind Then
            Dim vcproj As Microsoft.VisualStudio.VCProjectEngine.VCProject
            vcproj = project.Object
            Dim j As Integer

            '
            ' iterate through each configuration on this project
            '
            For j = 1 To vcproj.Configurations.Count
                Dim config As Microsoft.VisualStudio.VCProjectEngine.VCConfiguration
                config = vcproj.Configurations(j)

                '
                ' now add the macro to the compiler settings of this configuration
                '
                Dim cl As Microsoft.VisualStudio.VCProjectEngine.VCCLCompilerTool
                cl = config.Tools("VCCLCompilerTool")
                cl.PreprocessorDefinitions = cl.PreprocessorDefinitions + "; " + macro
            Next
        End If
    Next

    MsgBox("Done.")

End Sub

Feel free to use it if you find it useful!
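If running a macro inside the IDE is not an option, a different route (not what the macro above does) is to edit the project files directly, since Visual Studio 2005 stores C++ projects as XML. The sketch below assumes the usual .vcproj layout where each configuration has a Tool element named VCCLCompilerTool carrying a PreprocessorDefinitions attribute; the function name add_define and the cut-down SAMPLE project are invented for illustration, and a real project file has many more elements and attributes.

```python
import xml.etree.ElementTree as ET


def add_define(vcproj_xml, macro):
    """Return the project XML with `macro` added to every compiler tool."""
    root = ET.fromstring(vcproj_xml)
    for tool in root.iter("Tool"):
        if tool.get("Name") != "VCCLCompilerTool":
            continue  # leave the linker and other tools alone
        current = tool.get("PreprocessorDefinitions", "")
        defines = [d.strip() for d in current.split(";") if d.strip()]
        if macro not in defines:  # don't add the same symbol twice
            defines.append(macro)
        tool.set("PreprocessorDefinitions", ";".join(defines))
    return ET.tostring(root, encoding="unicode")


# hypothetical, heavily trimmed project file used only for this demo
SAMPLE = """<VisualStudioProject Name="demo">
  <Configurations>
    <Configuration Name="Debug|Win32">
      <Tool Name="VCCLCompilerTool" PreprocessorDefinitions="WIN32;_DEBUG"/>
      <Tool Name="VCLinkerTool"/>
    </Configuration>
    <Configuration Name="Release|Win32">
      <Tool Name="VCCLCompilerTool" PreprocessorDefinitions="WIN32;NDEBUG"/>
    </Configuration>
  </Configurations>
</VisualStudioProject>"""

print(add_define(SAMPLE, "_CRT_SECURE_NO_DEPRECATE"))
```

Close the solution and back up the project files before trying anything like this on real projects.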

 
NTFS File Streams
Technobabble
6/4/2006 1:19:34 PM  

The Windows NTFS file system has for a long long time included support for what is known as "streams". The idea is to view a file as being a container for 1 or more data streams.  Security information for a file for instance could be stored in stream A and the main file data in stream B.  The interesting thing is that the operating system will directly recognize only data stored in what is known as the default stream.  This default stream is always called $DATA and crud stored in this stream alone is used while accounting for things like file size!  To see streams in action try this out (this will obviously work only if your file system is NTFS):

  • Open a command prompt.
  • Type echo This is in the default stream > ding.txt.
  • Type dir ding.txt. System reports the size as being 32 bytes.
  • Now type echo This is in a hidden stream > ding.txt:bar.
  • Type dir ding.txt. System still reports the size as being 32 bytes.
  • Type more < ding.txt. System prints out This is in the default stream.
  • Now type more < ding.txt:bar. System prints out This is in a hidden stream!

As it turns out, you can specify a stream name along with the file name to deal with specific streams inside a file (ding.txt:bar). Further, for all practical purposes, data stored in non-default streams seems to get ignored by the operating system. While that is so, when you do things like copy/move files from one location to another the system does ensure that it copies the supplementary stream also with it. Now if you're a worm/virus writer I can see you rubbing your hands in glee thinking of all the security implications. But given that this capability has been around since 1993, chances are, all the security folks already know about it!
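The same file:stream naming works from code too, because the name is simply passed through to the file system. Here is a small sketch in Python; the helper names write_stream and read_stream are made up. It only demonstrates streams when the temporary folder sits on an NTFS volume. On other file systems the colon name is either rejected or just creates an ordinary, separate file.

```python
import os
import tempfile


def write_stream(path, stream, text):
    """Write `text` into the named stream `stream` of the file at `path`."""
    with open(f"{path}:{stream}", "w") as f:
        f.write(text)


def read_stream(path, stream):
    """Read back the contents of the named stream."""
    with open(f"{path}:{stream}") as f:
        return f.read()


folder = tempfile.mkdtemp()
path = os.path.join(folder, "ding.txt")

# the default stream: this is what dir and os.path.getsize report on
with open(path, "w") as f:
    f.write("This is in the default stream\n")
size_before = os.path.getsize(path)

write_stream(path, "bar", "This is in a hidden stream\n")

print(os.path.getsize(path) == size_before)   # reported size is unchanged
print(read_stream(path, "bar"))
```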

There's a nice little article on this topic at the following URL. Go take a look!

http://www.osronline.com/article.cfm?article=457
 
What it takes to write 64-bit apps
Technobabble
6/2/2006 6:08:35 PM  

Matt Pietrek has written a nice new article on MSDN about everything that you need to know to get started writing applications for 64-bit versions of Windows - otherwise known as x64. It is available here:

http://msdn.microsoft.com/msdnmag/issues/06/05/x64/default.aspx

For your quick reference (and mine :) I've jotted down the important points here if you haven't got the time or inclination to read the article. Here goes.

  • Moving from 32-bit to 64-bit is not just a re-compile away.
  • You get to address really really large chunks of memory (on the order of terabytes and even larger). Each process gets its own 8TB chunk of memory from the OS.
  • All system DLLs are loaded above 4GB typically at addresses around 0x7FF00000000.
  • The x64 linker assigns the default load address for 64-bit applications to just above 32 bits. This is being done so that you can quickly discover porting bugs. If you are for instance using a 32-bit pointer where you should have used a 64-bit pointer, because the base address is greater than what can be accommodated in 32 bits the pointer will effectively get truncated resulting in an access violation.
  • Most Win32 datatypes continue to retain the size from the 32-bit world. INTs, LONGs and DWORDs continue to be 32 bits wide (and WORDs stay at 16 bits). HANDLEs have become 64 bits wide though.
  • x64 versions of Windows include a sub-system called WOW64 that allows 32-bit applications to just work on x64.
  • A 64-bit application cannot load 32-bit DLLs and vice versa. However 64 and 32-bit applications can still talk to each other using inter-process communication mechanisms (shared memory, named pipes, synchronization objects).
  • x64 is officially off limits for 16-bit applications.
  • Two copies of system DLLs are maintained - one for 64-bit applications and one for 32-bit applications. WOW64 silently re-directs file I/O on the system folder for 32-bit applications to \Windows\SysWow64. If you want to figure out the path to the 32-bit system folder from a 64-bit app then you can do that by calling GetSystemWow64Directory.
  • Just like the file system, WOW64 silently re-directs access to the registry also for 32-bit applications. This is to prevent each type of application from stepping on each other's toes, as can happen when a 64-bit app CoCreateInstances a 32-bit COM component! So, a 32-bit application would see a different HKEY_CLASSES_ROOT (and a few other keys) as opposed to a 64-bit app.
  • x64 now includes a feature called PatchGuard which basically BSODs your system if any kernel mode code alters important kernel mode data structures such as the Interrupt Dispatch Table (IDT). Hmm. Now there's a challenge for rootkit writers and tools such as Regmon that rely on being able to mess around with the IDT.
  • CPU registers are now 64-bits wide (obviously) and are called RAX, RBX, RCX, RDX, RSI and so forth. 8 new general purpose registers have been added and are called R8, R9, R10 and so on till R15.
  • No more __cdecl, __stdcall, __fastcall or __thiscall! There is only a single calling convention in x64 which passes the first 4 parameters to a function via RCX, RDX, R8 and R9 and the remaining parameters via the stack.

A few quick points on writing code that works on 32-bit and 64-bit computers:

  • Pointers cannot be stored in 32-bit types such as ints, longs or DWORDs.
  • Use DWORD_PTRs, INT_PTRs and LONG_PTRs when you want to store pointer values. These types automatically become 64-bits wide on x64 systems (after a recompile that is).
  • When you use functions like printf and sprintf do not use %X to print pointer values. Use %p and you're automagically protected.
  • Inline assembly is not supported in the 64-bit C++ compiler. Boo! hoo!
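The pointer truncation problem mentioned in both lists can be mimicked with plain integer arithmetic. This is only a toy model: IMAGE_BASE is an assumed example of a load address just above the 4GB mark and store_in_dword is a made-up helper that keeps the low 32 bits, which is what happens when a pointer is squeezed into a DWORD.

```python
IMAGE_BASE = 0x140000000   # assumed example of a load address above 4GB


def store_in_dword(value):
    """Mimic storing a value in a 32-bit DWORD: only the low 32 bits survive."""
    return value & 0xFFFFFFFF


pointer = IMAGE_BASE + 0x1234          # some address inside the module
truncated = store_in_dword(pointer)

print(hex(pointer), hex(truncated), pointer == truncated)

# an address below 4GB survives the round trip, which is why the same
# mistake goes unnoticed in a 32-bit build
low_pointer = 0x00401234
print(store_in_dword(low_pointer) == low_pointer)
```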
 
How I lost my right foot!
Technobabble
5/4/2006 2:08:09 AM  

I don't particularly mind dealing with bugs as long as they are interesting to fix and you learn something along the way.  I was not always so well disposed towards them though.  It is only after reading Debugging Applications by John Robbins that I realized how much fun fixing bugs can be 8-)!  There are bugs that you feel good about when you fix 'em because it required you to use your "amazing ingenuity" and "extensive knowledge" of how stuff works under the hood and then there are bugs that make you feel rotten because the darn thing won't get reproduced on your box!

Those are the nastiest kind of bugs to get stuck with because you know it is there somewhere and there isn't a single thing that you can do about it!  I had to deal with a bug like that the other day the root cause for which in the end turned out to be a fairly silly little programming error.  "If you can shoot yourself in the foot with 'C', you can blow your whole leg off with C++" reminisced the wise old man while helping himself up the handicap ramp seated on his wheelchair.  I know he said this because I was right there behind him when he said it with a big smoking hole in my right foot.  Here's what happened.

Ours is an ambitious little web application that seeks to do a whole lot of things all by itself using pretty much every single technology that mankind has managed to come up with till about 5 minutes back.  One of the things that it jauntily goes about doing every now and then is to call a little Internet Server API (ISAPI) extension on the web server whenever somebody logs off.  When a user happens to crash out of a web session however (as can happen for instance when lightning strikes the user's computer and does not give her a chance to cleanly exit the browser and shut the computer down) that little notification does not ever reach the ISAPI extension and the web server remains tragically unaware of the user's untimely end.  After twiddling thumbs for some time though the ISAPI extension runs out of patience and just ends that user's session.  Now, here's the important part - as part of the processing where it terminates that non-responsive user's session, it turns on a little boolean member in a little C++ class to register the fact that that session has been aborted (as opposed to cleanly logging off).

All the session information is stored in a Standard Template Library list<> object and this is what the session object looks like:

class CSession
{
public:
    bool    m_bAbnormalLogOff; // this tells me whether
                               // this session ended abnormally
    int     haplessInt;
    float   haplessFloat;

public:
    CSession()
    {
        //
        // initialize everything
        //
        m_bAbnormalLogOff = false;
        haplessInt = 0;
        haplessFloat = 0.0f;
    }

    CSession( const CSession& s1 )
    {
        haplessInt = s1.haplessInt;
        haplessFloat = s1.haplessFloat;
    }
};

Now, can you spot the error in this code?  The error is of course that m_bAbnormalLogOff is not initialized in the copy constructor.  Why is that a bad thing?  Please look at the following code and try predicting what the output will be:

list<CSession> listOfSessions;

//
// create an abnormal session and push
// it onto the list
//
CSession badSession;
badSession.m_bAbnormalLogOff = true;
listOfSessions.push_back( badSession );

//
// now pop it off the list and push
// a normal session object
//
listOfSessions.pop_front();
CSession goodSession;    // here m_bAbnormalLogOff is "false"
listOfSessions.push_back( goodSession );

//
// print the flag of the copy that the list made
// via the copy constructor
//
cout<<listOfSessions.front().m_bAbnormalLogOff<<endl;

If you said that it would print 0 (zero), then wouldn't you be surprised if I told you that the actual output on Microsoft's C++ compiler (the one they give free with Visual C++ Express Edition 2005) is 1?  Well, fact is, it prints 1 and this is the nasty little bug that troubled us no end!  Here's my take on what is most likely happening in this case:

  • When the first CSession object gets pushed on to the list<> it allocates some space for it and keeps it there.  The point to note here is that list<> classes maintain their own copies of the objects that they are tracking and routines like list<>::push_back invoke the object's copy constructor for creating the copy.  This is the reason why it is important that classes that you plan to store in STL containers implement the copy constructor and the assignment operator.
  • When this object gets popped off the list<> and is replaced by another CSession object, the new instance, instead of occupying fresh memory space just sits nice and snug in the space that the previous CSession object had occupied.  As before list<>::push_back dutifully invokes CSession::CSession( const CSession& s1 ) for creating the object copy.
  • Since we forgot to copy the value for m_bAbnormalLogOff from s1 in the copy constructor, the member is left uninitialized and simply takes on whatever value is currently stored at that memory location.
  • Given that we initialized m_bAbnormalLogOff with the value true for badSession, the list's copy of goodSession ends up with the same value!

The fallout of this little beauty is that every once in a while the system would report sessions where the user had logged off legitimately as having been aborted.  Invariably this would happen for 2 or 3 sessions that immediately followed a session that got aborted!

 
On preprocessors & Java
Technobabble
5/3/2006 12:01:18 PM  

The following is a link to an open source pre-processor program for Java:

http://jappo.opensourcefinland.org/

I think it is shockingly short-sighted on the part of Sun developers that they are so obstinately refusing to include a pre-processor with the Java compiler. While MACROs are admittedly the source of many bugs in the C and C++ world, a more reasonable approach should have been taken instead of completely cutting them out, as the benefits derived out of using them are real and indispensable. C# and .NET for instance include a pre-processor that has relatively fewer capabilities as compared to the C/C++ pre-processor but still allows for the writing of conditionally compiled code.  Microsoft has got it perfectly right on this count IMO.

If you’re wondering where all this angst is coming from, we recently discovered that a set of changes that we had made to a certain applet had not been implemented in the source branch for Microsoft's VM (yep, we have customers who still use that VM!). If we’d had conditional compilation then it would have been a simple matter of #ifdef-ing out the relevant portions instead of having an entire branch!
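
For anyone who hasn't used it, this is roughly what conditional compilation looks like in C++, the kind of thing Java has no equivalent for. It is only a sketch: the TARGET_MSJVM flag, the kTargetVm constant and the DescribeBuild function are all made up for illustration and have nothing to do with our actual applet.

```cpp
#include <cassert>
#include <string>

// TARGET_MSJVM is a hypothetical build flag. A build meant for the
// Microsoft VM would define it (e.g. with -DTARGET_MSJVM on the command
// line); every other build leaves it undefined.
#ifdef TARGET_MSJVM
const char* const kTargetVm = "Microsoft VM";
#else
const char* const kTargetVm = "Sun VM";
#endif

// Both variants live in one source file; the pre-processor decides which
// one the compiler ever gets to see.
std::string DescribeBuild(const char* appletName) {
    assert(appletName != nullptr);  // passing null is a programming error
    return std::string(appletName) + " built for " + kTargetVm;
}
```

One source tree, two builds, and no second branch to forget about.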

The fact that an open source effort for developing a pre-processor for Java exists at all is evidence enough IMO of the need for it. Grrrr.

 
Now Blogorama features RSS syndication!
Technobabble
5/1/2006 3:09:34 PM  

The nerd is pleased to announce the general availability of Really Simple Syndication (RSS) on Nerdworks Blogorama! Muaha! ha! ha! ha! Please use the link pointed at by the orange button on the right-hand side of your screen in your favourite RSS feed reader.

A screen shot of what this looks like using Google's Feed Reader is available here.

 
Poor man's network bandwidth detection technique
Technobabble
5/1/2006 11:52:54 AM  

I am currently working on an online e-learning/collaboration tool that features all the bells and whistles that one would normally expect of such a tool.  The basic functioning of the application is fairly straightforward in that it lets a presenter collaborate with a set of participants by passing messages around through a web server.  There was this recent requirement where the application had to automatically detect the network bandwidth available to a user and provide appropriate warnings on finding that it is less than the minimum required.  The idea is to measure the net time it takes for a message to travel from the presenter to the web server and then from the web server to the participant (and vice versa).

The first thing that you would probably try is:

  • have the presenter put a timestamp on the message that she sends (let's call this PrTS - for presenter timestamp)
  • have the server put another timestamp before forwarding it to the participant (this would be SvrTS - for server timestamp)
  • and finally, let the participant mark the time of receipt of the message (let's call this PaTS - for participant timestamp) and do a little arithmetic to figure out the net latency; basically the net latency from presenter to participant is a simple matter of subtracting PrTS from PaTS

Perfect.  Except that the whole thing comes crashing down when you realize that,

  1. the presenter, the server and the participant could be in different time-zones (but this can of course be handled), and
  2. the presenter and the participant might have set their clocks to the previous (or maybe the next) century!

Basically, this system requires that the clocks on all three computers be accurate and in sync with each other.  So we went back to the drawing board and came up with this, IMO, nifty little approach:

  • the presenter sends out a message of a fixed size with a timestamp - i.e. with presenter's local time (let's call this PrTS1)
  • the server plonks its timestamp on to the message before forwarding it to the participant (let's call this one SvrTS1)
  • the participant receives the message, marks the time of receipt and just sits pretty (let's call this PaTS1)
  • the presenter, after sending the first message, waits for an arbitrary interval (say 5 seconds) and sends out a second message, again with a timestamp (let's call this PrTS2)
  • the server, as before, puts its time of receipt on the message and forwards it to the participant (SvrTS2)
  • the participant, upon receiving this second message, records the time of receipt (PaTS2) and does the following arithmetic to figure out the latency:

    Presenter to Server latency (PrSvrL) = ( SvrTS2 - SvrTS1 ) - ( PrTS2 - PrTS1 )
    Server to Participant latency (SvrPaL) = ( PaTS2 - PaTS1 ) - ( SvrTS2 - SvrTS1 )
    And finally, Presenter to Participant latency (PrPaL) = PrSvrL + SvrPaL
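
The arithmetic above can be sketched in a few lines of C++. The field names mirror the abbreviations used in the steps; the ProbeTimes and ProbeResult structs, the Compare function and the choice of milliseconds are assumptions made for this sketch, not code from the actual application.

```cpp
#include <cassert>

// Timestamps for one message, in milliseconds. Each value is read from the
// local clock of the machine named, so the three are not comparable directly.
struct ProbeTimes {
    long long prTS;   // presenter's clock when the message was sent
    long long svrTS;  // server's clock when the message was received
    long long paTS;   // participant's clock when the message was received
};

struct ProbeResult {
    long long prSvrL;  // presenter-to-server figure
    long long svrPaL;  // server-to-participant figure
    long long prPaL;   // presenter-to-participant figure
};

// Applies the three formulas to the timestamps of the two messages.
ProbeResult Compare(const ProbeTimes& first, const ProbeTimes& second) {
    assert(second.prTS >= first.prTS);  // second message is sent after the first

    ProbeResult r;
    r.prSvrL = (second.svrTS - first.svrTS) - (second.prTS - first.prTS);
    r.svrPaL = (second.paTS - first.paTS) - (second.svrTS - first.svrTS);
    r.prPaL = r.prSvrL + r.svrPaL;
    return r;
}
```

Only differences between readings taken on the same clock ever appear, so a constant offset on any machine's clock (a different time zone, or a clock set to the wrong century) drops out of the result.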

Now I know this sounds complicated, but really, it isn't.  Work it out; it seems to work! :)

[Updated - 27 May, 2006]

Well, some further analysis reveals that this algorithm does not in fact measure latency.  What it does measure however is jitter, i.e., variations in latency.  We are only measuring the difference between the latencies of the first message and the second message and not the latency itself.  Sigh!
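
To see why, here is the presenter-to-server leg written out. The symbols are introduced purely for this derivation: message i leaves the presenter at true time t_i and takes ℓ_i to reach the server, the presenter's clock is off from true time by a constant a, and the server's clock by a constant b.

```latex
\begin{aligned}
\mathrm{PrTS}_i &= t_i + a, \qquad \mathrm{SvrTS}_i = t_i + \ell_i + b \\
\mathrm{PrSvrL} &= (\mathrm{SvrTS}_2 - \mathrm{SvrTS}_1) - (\mathrm{PrTS}_2 - \mathrm{PrTS}_1) \\
                &= (t_2 - t_1 + \ell_2 - \ell_1) - (t_2 - t_1) \\
                &= \ell_2 - \ell_1
\end{aligned}
```

The clock offsets a and b cancel, which is the nice part, but so does everything except the change in latency between the two messages. If both messages take exactly the same time to arrive, the result is zero no matter how slow the link is. The server-to-participant leg works out the same way.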

 
Google launches free 3D modelling software
Technobabble
4/30/2006 11:56:44 PM  

Google has launched a free 3D modelling program called SketchUp that lets you quickly create 3D models, a sort of 3D Studio Max for dummies. As with most software coming out of the Google stables, SketchUp is delightfully simple to use. There’s a free version and a pro version (which comes at a price), with one major difference being that only the pro version lets you export models into different file formats. The free version lets you save either in the native SketchUp format or in a format that’s suitable for use with Google Earth.  While the download is a tad heavy at 19MB, it is certainly worth it.  The downside is of course that the free version doesn't let you save the models in any format that you can readily use elsewhere (in an OpenGL program for instance).  SketchUp is another Google acquisition incidentally.

 