This is an update on a project I have been working on for some time, and the time is definitely ripe for one.
It was an afternoon in 2007 when I was pondering: “Why am I writing the same Dictionary<K,V> collections just for indexing, and burying them inside my collection classes so I can do query optimization?” While also thinking about how LINQ could work with such a thing, I decided to go forth and spike out a specialized collection type that would implement indexes internally and simply use them when you issue a query against the collection. The result was the i4o project (codeplex.com/i4o).
If you are not familiar with i4o, the problem it solves is this: it takes any IEnumerable<T> in .NET and lets you easily put indexes on it, so that LINQ queries use the index rather than doing a “tablescan” style search through the collection. You would create an IndexableCollection<T> as follows:
var someIndexable = new IndexableCollection<T>(someEnumerable, someIndexSpecification);
At this point, someIndexable wraps someEnumerable, applying an index to one or more properties, as specified in someIndexSpecification. You could also inherit from IndexableCollection<T> to get indexing in your own custom collections, so that you could write a query like this:
var someItems = from s in someIndexable where s.SomeProperty == 42 select s;
… and the query would use the index for searching, rather than scan each item in the collection.
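To make the index versus “tablescan” difference concrete, here is a rough sketch of the kind of hand-rolled Dictionary index I kept writing before i4o. It is not i4o's actual code; Person and NaiveIndex are names made up purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type, used only for this illustration.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hand-rolled equality index over one property (not the i4o implementation).
// Null keys are not handled in this sketch.
public class NaiveIndex<TItem, TKey>
{
    private readonly Dictionary<TKey, List<TItem>> _buckets =
        new Dictionary<TKey, List<TItem>>();

    public NaiveIndex(IEnumerable<TItem> source, Func<TItem, TKey> keySelector)
    {
        // Group every item under the value of its indexed property.
        foreach (var item in source)
        {
            var key = keySelector(item);
            List<TItem> bucket;
            if (!_buckets.TryGetValue(key, out bucket))
            {
                bucket = new List<TItem>();
                _buckets[key] = bucket;
            }
            bucket.Add(item);
        }
    }

    // One hash lookup instead of testing every item in the collection.
    public IEnumerable<TItem> WhereEquals(TKey key)
    {
        List<TItem> bucket;
        return _buckets.TryGetValue(key, out bucket)
            ? bucket
            : Enumerable.Empty<TItem>();
    }
}
```

With something like this in place, new NaiveIndex<Person, int>(people, p => p.Age).WhereEquals(42) answers the same question as people.Where(p => p.Age == 42), without visiting every element.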
In the years since (approaching 3!), the project has evolved little by little. In 2008, Jason Jarrett started contributing work, bringing in important performance enhancements, better code organization, continuous integration, and various other things that he elaborates on here.
For 2010, we wanted to do a significant refresh, with some new capabilities for core functionality. The second major revision of i4o will feature the following:
* Support for INotifyPropertyChanged
Previously, if an indexed property changed, you had to manually notify the index of the change and reindex the item. For i4o2, we have formally added support for property change notification: if T supports INotifyPropertyChanged, a change to any item in the collection results in automatic reindexing of that item.
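Roughly, automatic reindexing means the index listens for PropertyChanged and re-files the item under its new value. The following is a minimal sketch of that idea under my own assumptions, not the i4o2 source; TrackedPerson and SelfUpdatingIndex are hypothetical names, and the sketch assumes each item appears once and uses default reference equality.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;

// Hypothetical item type that raises PropertyChanged when Age changes.
public class TrackedPerson : INotifyPropertyChanged
{
    private int _age;
    public event PropertyChangedEventHandler PropertyChanged;

    public int Age
    {
        get { return _age; }
        set
        {
            if (_age == value) return;
            _age = value;
            var handler = PropertyChanged;
            if (handler != null)
                handler(this, new PropertyChangedEventArgs("Age"));
        }
    }
}

// Equality index that re-files an item whenever the indexed property changes.
public class SelfUpdatingIndex<TItem, TKey>
    where TItem : class, INotifyPropertyChanged
{
    private readonly string _propertyName;
    private readonly Func<TItem, TKey> _keySelector;
    private readonly Dictionary<TKey, List<TItem>> _buckets =
        new Dictionary<TKey, List<TItem>>();
    // Remembers the key each item is currently filed under.
    private readonly Dictionary<TItem, TKey> _filedUnder =
        new Dictionary<TItem, TKey>();

    public SelfUpdatingIndex(IEnumerable<TItem> source, string propertyName,
                             Func<TItem, TKey> keySelector)
    {
        _propertyName = propertyName;
        _keySelector = keySelector;
        foreach (var item in source)
        {
            File(item);
            item.PropertyChanged += OnItemChanged;
        }
    }

    // Returns a copy so callers never see the bucket being modified.
    public IEnumerable<TItem> WhereEquals(TKey key)
    {
        List<TItem> bucket;
        return _buckets.TryGetValue(key, out bucket)
            ? bucket.ToList()
            : new List<TItem>();
    }

    private void File(TItem item)
    {
        var key = _keySelector(item);
        List<TItem> bucket;
        if (!_buckets.TryGetValue(key, out bucket))
        {
            bucket = new List<TItem>();
            _buckets[key] = bucket;
        }
        bucket.Add(item);
        _filedUnder[item] = key;
    }

    private void OnItemChanged(object sender, PropertyChangedEventArgs e)
    {
        // A null or empty name conventionally means "all properties changed".
        if (!string.IsNullOrEmpty(e.PropertyName) && e.PropertyName != _propertyName)
            return;

        var item = (TItem)sender;
        TKey oldKey;
        if (!_filedUnder.TryGetValue(item, out oldKey))
            return;

        _buckets[oldKey].Remove(item); // take it out of the stale bucket
        File(item);                    // file it under the current value
    }
}
```

The second dictionary is there because by the time the event fires, the item already holds its new value, so the index needs its own record of where the item was filed.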
* Support for ObservableCollection<T>
If a collection inherits from ObservableCollection<T>, and T also supports INotifyPropertyChanged, you will be able to wrap it in an index that notices changes to the collection.
* Support for a far wider range of query operations
Previously, if a query was anything other than object.Property == <some constant | some expression>, i4o would simply use Enumerable.Where, rather than try to optimize the query using the index. The index was based on strict hash tables, which limited the kinds of operations that could be optimized. In v2, there is a much more sophisticated query processor that:
1.) Optimizes the structure based on what the child type supports. If the child type is IComparable, it uses a Red-Black tree to hold the index so that queries that use comparison operations rather than equality can be optimized.
2.) Uses a more sophisticated algorithm for visiting the expression tree of the predicate, allowing indexes to work on much more complex predicates. So if your predicate is more complex (A and B and (C or D), for example), the processor will handle that as well.
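To illustrate the two points above, here is a small sketch under my own assumptions; it is not the i4o2 query processor. OrderedIndex uses the framework's SortedSet<T> as a stand-in for a dedicated red-black tree, and PredicateInspector (a hypothetical helper) only finds equality tests in the AND-connected part of a predicate, leaving OR branches alone. Comparisons wrapped in conversion nodes (enums, for instance) are not recognized here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Point 1: keep keys ordered so comparisons can be answered without a full scan.
public class OrderedIndex<TItem, TKey> where TKey : IComparable<TKey>
{
    private readonly SortedSet<TKey> _keys = new SortedSet<TKey>();
    private readonly Dictionary<TKey, List<TItem>> _buckets =
        new Dictionary<TKey, List<TItem>>();

    public OrderedIndex(IEnumerable<TItem> source, Func<TItem, TKey> keySelector)
    {
        foreach (var item in source)
        {
            var key = keySelector(item);
            if (_keys.Add(key))              // first time this key is seen
                _buckets[key] = new List<TItem>();
            _buckets[key].Add(item);
        }
    }

    // Items whose key is >= lower, visited in ascending key order.
    public IEnumerable<TItem> WhereAtLeast(TKey lower)
    {
        if (_keys.Count == 0 || lower.CompareTo(_keys.Max) > 0)
            yield break;
        foreach (var key in _keys.GetViewBetween(lower, _keys.Max))
            foreach (var item in _buckets[key])
                yield return item;
    }
}

// Point 2: walk the predicate and collect members tested with ==
// in clauses joined by &&; these are candidates for an index lookup.
public static class PredicateInspector
{
    public static List<string> EqualityMembers<T>(Expression<Func<T, bool>> predicate)
    {
        var found = new List<string>();
        Visit(predicate.Body, found);
        return found;
    }

    private static void Visit(Expression node, List<string> found)
    {
        var binary = node as BinaryExpression;
        if (binary == null) return;

        switch (binary.NodeType)
        {
            case ExpressionType.AndAlso:
                Visit(binary.Left, found);
                Visit(binary.Right, found);
                break;
            case ExpressionType.Equal:
                var member = binary.Left as MemberExpression;
                if (member != null && member.Expression is ParameterExpression)
                    found.Add(member.Member.Name);
                break;
            // OrElse and other operators are left for a normal scan in this sketch.
        }
    }
}
```

A real processor would go further and combine the results of several index lookups, but the shape is the same: recurse through the tree, and use an index wherever a clause matches something the index can answer.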
* Creation via a builder pattern:
Rather than make the user try to figure out what index is best, creating an index is a matter of calling IndexBuilder.BuildIndicesFor with a source collection and a spec:
var someIndex = IndexBuilder.BuildIndicesFor(someCollection, indexSpec);
The builder detects the types passed to it and builds the most appropriate index type (ObservableIndexSet or just an IndexSet), based on the type of enumerable passed to it and the child type.
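The decision the builder makes can be pictured with a sketch like the one below. This is only my illustration of the idea; the real builder may well do this through overloads rather than runtime checks, and IndexKind, IndexKindChooser, and NotifyingItem are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Collections.Specialized;
using System.ComponentModel;

public enum IndexKind { Plain, Observable }

// Hypothetical item type that supports change notification.
public class NotifyingItem : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;
}

public static class IndexKindChooser
{
    // An observable index only makes sense when both the collection
    // and its items can announce changes; otherwise use a plain one.
    public static IndexKind ChooseFor<T>(IEnumerable<T> source)
    {
        bool collectionNotifies = source is INotifyCollectionChanged;
        bool itemsNotify = typeof(INotifyPropertyChanged).IsAssignableFrom(typeof(T));
        return collectionNotifies && itemsNotify
            ? IndexKind.Observable
            : IndexKind.Plain;
    }
}
```

Under these assumptions, a List<NotifyingItem> would get a plain index and an ObservableCollection<NotifyingItem> would get an observable one.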
* More DSL-like index specification builder
While this has been supported for some time, we added some language to the builder so the code would read better:
var indexSpec = IndexSpecification<SimpleClass>.Build()
    .With(person => person.FavoriteColor)
    .And(person => person.Age)
    .And(person => person.Name);
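For anyone curious how a fluent specification like this can work, here is a minimal sketch, assuming the lambdas are captured as expression trees so the property names can be read out of them. SketchSpec is a made-up stand-in, not the actual IndexSpecification class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical stand-in for a fluent index specification.
public class SketchSpec<T>
{
    private readonly List<string> _properties = new List<string>();

    private SketchSpec() { }

    public static SketchSpec<T> Build()
    {
        return new SketchSpec<T>();
    }

    public IList<string> IndexedProperties
    {
        get { return _properties.AsReadOnly(); }
    }

    // Records the property named by a lambda such as x => x.Age.
    public SketchSpec<T> With<TProperty>(Expression<Func<T, TProperty>> selector)
    {
        var member = selector.Body as MemberExpression;
        if (member == null)
            throw new ArgumentException(
                "Expected a simple property access such as x => x.Age", "selector");

        if (!_properties.Contains(member.Member.Name))
            _properties.Add(member.Member.Name);
        return this; // returning this is what allows chaining
    }

    // Same behavior as With; it exists only so chained calls read naturally.
    public SketchSpec<T> And<TProperty>(Expression<Func<T, TProperty>> selector)
    {
        return With(selector);
    }
}
```

For example, SketchSpec<Uri>.Build().With(u => u.Host).And(u => u.Port) records the property names Host and Port, which an index builder could then use to decide what to index.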
These changes are very much in early alpha testing right now, with a goal of having them fully baked prior to the Visual Studio 2010 release. At that time, we intend to move the changes into the project trunk and do a full release.
If you want early access to the changes before then, I encourage anyone who is interested to visit codeplex.com/i4o and see the i4o2 branch of the code repository. Once we are sufficiently stable and have integrated some of the older perf improvements into the new code base, we will likely move the i4o2 branch to the trunk.