March 2010 – The Nomadic Developer

In thinking about what makes certain recently emerged technologies so compelling, I keep seeing a common theme. The best technologies don’t just do something useful; they make the user think about the right things, the things that lead to better designs and more robust software. Let’s start with some of the most compelling technologies or techniques that have had a lot of buzz in the last couple of years:

Dependency Injection

Dependency injection is a great technique for managing coupling. At least it is in practice. But nothing would stop you from using dependency injection to, say, create a giant class, inject thirtyteen-hundred dependencies into it, and give yourself a maintenance nightmare.

What is important about dependency injection is that it makes you think about dependencies. When I am writing code, I want creating a dependency from one class to another to hurt. Not a lot, but enough that it makes me put an entry in a file somewhere – be it an XML config, or a module, or something. A DI tool that wires everything up for me without my having to explicitly think about it each time – well, that is to me like an HR management tool that does not have an “are you sure” prompt on the “fire employee” button.
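To make the "entry in a file somewhere" idea concrete, here is a minimal sketch of hand-written wiring in a module. All of the names (IClock, SystemClock, ReportService, AppModule) are hypothetical and exist only for this illustration; no particular DI container is assumed.

```csharp
using System;

// Hypothetical abstraction that the service depends on.
public interface IClock
{
    DateTime Now { get; }
}

// Hypothetical production implementation.
public class SystemClock : IClock
{
    public DateTime Now { get { return DateTime.Now; } }
}

// The dependency is visible in the constructor signature.
public class ReportService
{
    private readonly IClock _clock;

    public ReportService(IClock clock)
    {
        if (clock == null) throw new ArgumentNullException("clock");
        _clock = clock;
    }

    public string Header()
    {
        return "Report generated " + _clock.Now.ToString("yyyy-MM-dd");
    }
}

// The "module": every new dependency needs a line here, written by hand.
public static class AppModule
{
    public static ReportService CreateReportService()
    {
        return new ReportService(new SystemClock());
    }
}
```

Adding a second dependency to ReportService means touching both the constructor and AppModule, which is roughly the small amount of friction I am arguing for.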

Continuous Integration

Continuous Integration tools do some nifty things, one of the most important being providing some visibility into the state of the code. When properly implemented, the benefits are pretty staggering. However, in my experience, one of the most important benefits is that Continuous Integration forces you to think about how to automate the deployment process. You can’t continuously integrate if some person has to flip the bozo bit in order for the build to work. CI promotes a model that makes tools that require human intervention to install seem obsolete. I can’t see this as a bad thing.

Functional Programming

Most anyone who knows me in a professional context knows I am a big fan of functional programming. I have grown to like the syntax and expressiveness of languages like F#, and there are a great many reasons why the language is important. But to me, the most striking is that functional languages make you think long and hard about state. You can do state in F# (through the mutable keyword), and even in pure functional languages like Haskell if you use a state monad. But the important thing here is that you have to do significant work to create state in these languages. That leads to choices about state that tend to be more mindful than in languages where state is the default.

I could talk through more examples, but I think you get the picture. The opposite tends to hold true as well: my main issue with a technology like ASP.NET Web Forms is that it makes certain things easy (ViewState, notably) that it should not, in order to help you avoid thinking about the fact that you are running an application over a stateless medium. When you are considering an emerging technology, don’t think features; think “What choices is this technology forcing me to make?”

An update on a project I have been working on for some time, and for which the time is definitely ripe for an update.

It was an afternoon in 2007 when I was pondering… “Why am I writing the same Dictionary<K,V> collections just for indexing, and putting them inside my collection classes so I can do query optimization?” While also thinking about how LINQ could work with such a thing, I decided to go forth and spike out a specialized collection type that would internally implement indexes and just use them when you issue a query to the collection. The result was the i4o project (codeplex.com/i4o).

If you are not familiar with i4o, the problem it solves is this: it takes any IEnumerable<T> in .NET and allows you to easily put indexes on it, such that LINQ queries use the index rather than doing a “tablescan” style search through the collection. You would create an IndexableCollection<T> as follows:

var someIndexable = new IndexableCollection<T>(someEnumerable, someIndexSpecification);

At this point, someIndexable wraps someEnumerable, applying an index to one or more properties, as specified in someIndexSpecification. You could also inherit from IndexableCollection<T> and get indexing in your own custom collections, such that you could do a query like this:

var someItems = from s in someIndexable where s.SomeProperty == 42 select s;

… and the query would use the index for searching, rather than scan each item in the collection.
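To show the difference between a scan and an indexed lookup, here is a minimal sketch that uses only standard LINQ (Where and ToLookup). It is not how i4o is implemented; the Item type and the IndexSketch helper are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type used only for this illustration.
public class Item
{
    public int SomeProperty { get; set; }
    public string Name { get; set; }
}

public static class IndexSketch
{
    // "Tablescan": examines every element on every query.
    public static List<Item> Scan(IEnumerable<Item> source, int value)
    {
        return source.Where(s => s.SomeProperty == value).ToList();
    }

    // Builds a hash-based index once, keyed on the property value.
    public static ILookup<int, Item> BuildIndex(IEnumerable<Item> source)
    {
        return source.ToLookup(s => s.SomeProperty);
    }

    // Indexed query: a single hash lookup instead of a walk over the collection.
    // ILookup returns an empty sequence when the key is absent.
    public static List<Item> Query(ILookup<int, Item> index, int value)
    {
        return index[value].ToList();
    }
}
```

Both paths return the same items; the indexed path pays the cost of building the lookup up front and then answers equality queries without visiting every element.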

In the years since (approaching three!), the project has evolved, little by little. In 2008, Jason Jarrett started contributing, bringing in important performance enhancements, better code organization, continuous integration, and various other things, which he elaborates on here.

For 2010, we wanted to do a significant refresh, with some new capabilities for core functionality. The second major revision of i4o will feature the following:

* Support for INotifyPropertyChanged

Previously, if an indexed property changed, you had to manually notify the index of the change and reindex the item. For i4o2, we have formally added support for property change notification, such that if T supports INotifyPropertyChanged, changes to any child will result in automatic reindexing of that item.
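As a rough sketch of what automatic reindexing can look like, the index below subscribes to PropertyChanged on each item and moves the item to a new bucket when the indexed property changes. This is an illustration under my own assumptions, not the i4o2 code; Person and AgeIndex are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;

// Hypothetical item type that raises PropertyChanged when Age changes.
public class Person : INotifyPropertyChanged
{
    private int _age;

    public event PropertyChangedEventHandler PropertyChanged;

    public string Name { get; set; }

    public int Age
    {
        get { return _age; }
        set
        {
            if (_age == value) return;
            _age = value;
            var handler = PropertyChanged;
            if (handler != null) handler(this, new PropertyChangedEventArgs("Age"));
        }
    }
}

// Hypothetical single-property index; not the i4o implementation.
public class AgeIndex
{
    private readonly Dictionary<int, HashSet<Person>> _byAge =
        new Dictionary<int, HashSet<Person>>();

    // Remembers the key each item is currently filed under.
    private readonly Dictionary<Person, int> _indexedAge =
        new Dictionary<Person, int>();

    public void Add(Person person)
    {
        Insert(person);
        person.PropertyChanged += OnItemChanged;
    }

    public IEnumerable<Person> Find(int age)
    {
        HashSet<Person> bucket;
        return _byAge.TryGetValue(age, out bucket)
            ? (IEnumerable<Person>)bucket
            : new Person[0];
    }

    private void OnItemChanged(object sender, PropertyChangedEventArgs e)
    {
        if (e.PropertyName != "Age") return;
        var person = (Person)sender;
        // Remove the item from its old bucket, then file it under the new value.
        _byAge[_indexedAge[person]].Remove(person);
        Insert(person);
    }

    private void Insert(Person person)
    {
        HashSet<Person> bucket;
        if (!_byAge.TryGetValue(person.Age, out bucket))
        {
            bucket = new HashSet<Person>();
            _byAge[person.Age] = bucket;
        }
        bucket.Add(person);
        _indexedAge[person] = person.Age;
    }
}
```

The caller never tells the index about the change; setting Age is enough for Find to return the item under its new value.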

* Support for Observable<T>

If a collection inherits from Observable<T>, and T also supports INotifyPropertyChanged, you will be able to wrap an index around that which will notice changes in the collection.

* Support for a far wider range of query operations

Previously, if a query was anything other than object.Property == <some constant | some expression>, i4o would simply use Enumerable.Where, rather than try to optimize the query using the index. The index was based on strict hash tables, which limited the kinds of operations that could be optimized. In v2, there is a much more sophisticated query processor that:

1.) Optimizes the structure based on what the child type supports. If the child type is IComparable, it uses a Red-Black tree to hold the index so that queries that use comparison operations rather than equality can be optimized.

2.) Uses a more sophisticated algorithm for visiting the expression tree of the predicate, allowing for indexes to work on much more complex predicates. So if you are doing a query where your predicate is more complex (A and B and (C or D), for example) the processor will handle those as well.
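To give a feel for what "visiting the expression tree of the predicate" involves, here is a minimal sketch that recognizes only the simplest shape, item.Property == constant, using the standard System.Linq.Expressions types. It is not the i4o2 query processor; PredicateInspector and SimpleItem are hypothetical names, and a real processor would also need to handle captured variables, comparisons, and And/Or combinations.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical item type used only for this illustration.
public class SimpleItem
{
    public int Age { get; set; }
    public string Name { get; set; }
}

public static class PredicateInspector
{
    // Returns true when the predicate has the shape: item.Property == <constant>.
    // Anything else returns false, meaning "fall back to a plain scan".
    public static bool TryGetEqualityLookup<T>(
        Expression<Func<T, bool>> predicate,
        out string propertyName,
        out object value)
    {
        propertyName = null;
        value = null;

        // The body must be a single == comparison.
        var binary = predicate.Body as BinaryExpression;
        if (binary == null || binary.NodeType != ExpressionType.Equal) return false;

        // The left side must read a member directly off the lambda parameter.
        var member = binary.Left as MemberExpression;
        if (member == null || member.Expression != predicate.Parameters[0]) return false;

        // The right side must be a literal constant in this simplified sketch.
        var constant = binary.Right as ConstantExpression;
        if (constant == null) return false;

        propertyName = member.Member.Name;
        value = constant.Value;
        return true;
    }
}
```

With the property name and value in hand, a processor can go to the matching index instead of calling Enumerable.Where; for any predicate this sketch does not recognize, the scan remains the safe fallback.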

* Creation via a builder pattern:

Rather than make the user try to figure out what index is best, creating an index is a matter of calling IndexBuilder.BuildIndicesFor with a source collection and a spec:

var someIndex = IndexBuilder.BuildIndicesFor(someCollection, indexSpec);

The builder detects the types passed to it and builds the most suitable index type (ObservableIndexSet or just an IndexSet) based on the type of enumerable and its child type.

* More DSL-like index specification builder

While this has been supported for some time, we added some language to the builder so that the code reads better:

            var indexSpec = IndexSpecification<SimpleClass>.Build()
                .With(person => person.FavoriteColor)
                .And(person => person.Age)
                .And(person => person.Name);

These changes are very much in early alpha testing right now, with a goal of having them fully baked prior to the Visual Studio 2010 release. At that time, we intend to move the changes into the project trunk and do a full release.
If you want early access to the changes before then, visit codeplex.com/i4o and see the i4o2 branch of the code repository. Once we are sufficiently stable and have integrated some of the older perf improvements into the new code base, we will likely move the i4o2 branch to the trunk.


Surviving and Thriving in the World of Technology Consulting


Just Make Me Think! The Best Technologies Force Hard Choices.

i4o v2