Velocity 101: Get the Engineering Practices Right

If one could equate faster typing with velocity, engineering practices perhaps would not matter in the world of software development productivity. Thankfully, there are good reasons most organizations do not use words per minute as an evaluation criterion when hiring new software developers. Slamming out low-quality code and claiming progress, be it in story points or merely finished tasks on a Gantt chart, is a fast track to creating boat anchors that hold companies back rather than push them forward.

Without proper engineering practices, you will not see the benefits of agile software development. It comes down to basic engineering principles. A highly coupled system – be it software, mechanical, or otherwise – provides more vectors over which change in one part of the system can ripple into other parts of the system. This is desirable to a degree – you need your transmission to couple to the engine in order for the engine to make the car move. But the more coupling you have beyond the minimum needed to make the system work, the more the overall system destabilizes. If the braking system suddenly activated because a Justin Bieber CD was inserted into the car stereo, you would probably see that as a pretty bad defect. And not just because of the horrible “music” coming out of the speakers.

So what are the specific engineering practices? Some are lower-level coding practices; others are higher-level architectural concerns. Most rely on automation in some respect to guard against complacency. While I am generally loath to use the term “best practices,” for fear that someone might take these practices and try to apply them in something of a cargo-cult manner, these are some general practices that seem to hold across a broad section of the software development world:

Test Driven Development and SOLID

While detractors remain, it has ceased to be controversial to suggest that the practices that emerged out of the Extreme Programming movement of the early 2000s are helpful.  Test driven development as a design technique selects for creating decoupled classes.  It is certainly possible to use TDD to drive yourself to a highly-coupled mess, given enough work and abuse of mocking frameworks.  However, anyone with any sensitivity to pain will quickly realize that having dozens of dependencies in a “God” class makes you slow, makes you work harder to add new functionality, and generally makes your tests brittle and worthless.

To move away from this pain, you write smaller, testable classes that have fewer dependencies. By reducing dependencies, you reduce coupling. When you reduce coupling, you create more stable systems that are more amenable to change – to say nothing of the other benefits you get from good test coverage. Even if you only used TDD for its design benefits – and never ran the tests after initially writing them – you would get better, less coupled designs, which leads to greater velocity when you need to make changes. TDD doesn’t just help you in the future. It helps you move faster now.

Indeed, TDD is just one step on the way to keeping your code clean. Robert Martin treats the subject in much more depth in his book Clean Code, where he calls on all of us to be professionals, to keep our standards, and not to give in to the temptation to write a bunch of code that meets external acceptance criteria at the cost of internal quality. While you can, in theory, slap some code together that meets surface criteria, it is a false economy to assume that bad code today will have a positive effect on velocity beyond the current iteration.

Of course, having good test coverage – particularly good automated integration, functional, performance, and acceptance tests – has the wonderful side effect of providing a robust means of regression testing your system on a constant basis. While it has been years since I have worked on systems that lacked decent coverage, from time to time I consult for companies that want to “move to agile”. Almost invariably, when I do this, I find situations where a captive IT department is proposing that an entire team spend 6 months to introduce 6 simple features into a system. I see organizations whose QA departments have to spend another 6 months manually regression testing. TDD is a good start, but these other types of automated testing are needed as well to keep the velocity improvements going – both during and after the initial build.

Simple, but not too simple, application architecture (just enough to do the job)

While SOLID and TDD (or BDD and some of the ongoing improvements) are important, it is also important to emphasize simplicity specifically as a virtue. That is not to say that SOLID and TDD can’t lead to simplicity – they certainly can, especially in the hands of an experienced practitioner. But without a conscious effort to keep things simple (aka the KISS principle – keep it simple, stupid), regardless of development technique, excess complexity creeps in.

There are natural reasons for this, one of which is the wanna-be architect effect. Many organizations have a career path where, to advance, a developer needs to attain the title of architect – often a role where, at least as it is perceived, you get to select design patterns and ESB buses without having to get your hands dirty writing code. There are developers who believe that, in order to be seen as an architect, you need to use as many GoF patterns as possible, ideally all in the same project. It is projects like these where you eventually see the “factory factories” that Joel Spolsky lampooned in his seminal Architecture Astronaut essay. Long story short, don’t let an aspiring architecture astronaut introduce more layers than you need!

It doesn’t take a wanna-be architecture astronaut to create a Rube Goldbergesque nightmare. Sometimes unchecked assumptions about non-functional requirements lead a team to create a more complex solution than is actually needed. It could be “Sarbanes-Oxley Auditors Gone Wild” (now there is a blog post of its own!) using an aggressive interpretation of the law to demand layers you don’t really need, or being asked for five 9s of reliability when you only really need three. These kinds of excesses show up all the time in enterprise software development, especially when they come from non-technical sources.

The point is this – enterprise software frequently introduces non-functional requirements in something of a cargo-cult manner “just to be safe”, and as a result, multiplies the cost of software delivery by 10.  If you have a layer being introduced as the result of a non-functional requirement, consider challenging it to make sure it is really a requirement.  Sometimes it will be, but you would be surprised how often it isn’t.

Automated Builds, Continuous Integration

If creating a developer setup requires 300 pages of documentation, manual setup, and other wizardry to get right, you are likely to move much more slowly. Even if you have unit tests, automated regression tests, and other practices, lacking an automated way to build the app frequently results in “Works on My Machine” syndrome. Once you have a lot of setup variation, which is what you get when setup is manual, defect resolution goes from a straightforward process to something like this:

  1. Defect Logged by QA
  2. Developer has to manually re-create defect, spends 2 hours trying, unable to do so
  3. Developer closes defect as “unable to reproduce”
  4. QA calls developer over, reproduces
  5. Argument ensues about QA not being able to setup the environment correctly
  6. Developer complaining “works on my machine”
  7. 2 hour meeting to resolve dispute
  8. Developer has to end up diagnosing the configuration issue
  9. Developer realizes that DEVWKSTATION42 is not equivalent to LOCALHOST for everyone in the company
  10. Developer walks away in shame, one day later

Indeed, having builds be automated, regular, and integrated continuously can help avoid wasting a day or five every time a defect is logged.  It should not be controversial to say that this practice increases velocity.

Design of Software Matters

Design isn’t dead. Good software design can help lead to good velocity. Getting the balance wrong – too simple or too complex – cripples velocity. Technical practices matter. Future articles in this velocity series will focus on some of the more people-related ways to increase velocity, and they are certainly important. But with backward engineering practices, none of the things you do on the people front will really work.


Why Does Custom Software Cost So Much?

After nearly 20 years of writing custom software, mostly in corporate IT departments, I encounter a nearly uniform meme among those who sponsor projects involving custom software development:

“Custom software costs too much!”

There are stories, anecdotes, studies, and all sorts of experience describing software development projects that run over schedule, go well over budget, and fail to meet expectations. Far rarer are stories about software delivered in less time than was estimated. It should be no surprise that, despite the fact that corporations are sitting on more cash than at any point in corporate history, companies are still skittish about embarking on new custom development projects.

This, of course, is not new; it has been a theme in software development since well before Fred Brooks wrote The Mythical Man-Month. What is surprising to me is how this condition survives even with agile and lean coming to the fore. Yes, it’s true: even in purported agile and lean projects, there is great variability in productivity that, combined with a tendency in corporate IT to budget and plan based on best-case scenarios, results in far too many late and over-budget projects. Agile and lean help – indeed, they are the right direction (this is one of the reasons I work for ThoughtWorks!) – but they are not a silver bullet. The reality is that, despite everything we do to mitigate risks with agile and lean, project sponsors are rightly concerned with the level of risk they see as inherent in software projects. As a result, they are going to use the levers available to them to manage those risks.

Levers Used to Manage Risk

So what levers do investors have in order to manage risk?  The ones that most use tend to be:

  • Manage using process controls (scope management, quality gates, governance, PMBOK, ITIL)
  • Manage down the invested capital at risk

What happens when you pull these levers?  Well, sadly, this is what you tend to get:

  • Waterfall software development, often fixed-bid, with strict “de jure” scope control and the real “de facto” scope additions arriving through a back channel.
  • Attempts to get the lowest cost per developer hour.

While this does not occur in more enlightened organizations that understand that not all “developer hours” are the same, the sad fact is that most organizations lack such enlightenment. It is easy to imagine why this occurs. If you are a CIO managing a 2,000-person IT department, part of your bias is going to be that you, being a smart CIO and all, have such a large organization for a reason. At this level of corporate politics, not only is your prestige bound up in the size of your department, but the idea that you have 80% more people than you might need is, to put it simply, heretical. Given that heretics who “rock the boat” tend to have short careers in high-level corporate IT, it is not surprising that such “resource-oriented” ways of thinking dominate the conversation at the highest levels of IT.

Why These Levers Fail

Not all failure is the same, and not all failing is bad. The problem is when we fail to learn from failure. And the way these levers fail tends to encourage the kind of failure you don’t learn from – the kind that begets more failure rather than learning from mistakes.

How the Hourly Rate Lever Fails

Let us take the failure that results when you try to succeed by pulling the cost-reduction lever. A common way this gets sold is when a vendor management group – especially one where the leading VP is bonused strictly on lowering rates – wields a lot of power in an organization. The presence of companies selling developers chiefly on the basis of rate attests to the fact that there is a market for this. You do not have to look hard for it – much of the market for offshore development caters to this group. Not to be outdone, many first-world (US in particular) companies also try to play the low end of the market.

Surprisingly, another source of the “search for the lowest rate” comes from initial (and often feeble) attempts to get estimating right after a failed waterfall project. If the last project failed, chances are the next round of estimating will be much more conservative. The conservative estimate, often higher than initial budget expectations, causes the sponsor to try to drive down the unit cost per hour – remember, in such a situation, the hour is perceived as the fixed quantity. This attempt to fit a large number of hours into a fixed budget is also a large source of project sponsors pulling the hourly-rate lever.

Predictably, these kinds of moves fail. When you go lower on the rate scale, you tend to get less qualified software developers. While there is a lot of corporate hubris tied up in the fiction that there is some magical source of low-cost developers who are otherwise brilliant and able to achieve results above their pay grade, such moves seldom work out. Low-cost shops tend to have lots of turnover because of low wages – good developers are not stupid, and will not tolerate low wages for long. Beyond this, of course, low-cost shops tend to underestimate work as well, because they often compete with other low-cost shops. It is a race to the bottom that, ironically, causes even more project failure.

The result of all this, of course, is that chasing low-cost development tends to cause a self-reinforcing failure cycle: the project fails, the next project gets longer estimates, the sponsor does not want to spend more money on failure, and that leads to the next push to drive rates even lower. This continues until the organization loses faith in its ability to write software at all. The only thing that stops it is, literally, company bankruptcy, acquisition, or a major change in management at the top of IT (if not the CEO) so that the negative feedback cycle can be broken.

How the “Strong Process” Lever Fails

One step up from project sponsors who will hit the cost lever like a mouse hitting a lever in a Skinner box are the project sponsors who put their full faith in processes – such as those prescribed by the PMBOK or recommended by ITIL – to solve their cost problems. The thought is that if only we could get a better plan, and stick to the plan, we could get better at predicting the cost of software.

The chief thing such frameworks try to do is impose strong change management, contracted service levels, and detailed planning in order to achieve predictable software development. This approach, of course, leads to several problems that are well understood by many people in the Agile and Lean communities but certainly bear repeating:

  • Software development itself, particularly when done with business analysts, frequently leads to discoveries. As Mary Poppendieck once pointed out, the most valuable feedback for changes to software comes at the latter end of a project, when the group analyzing and developing the software best understands the domain. Traditional processes tend to consider these late changes the most costly, seeking to minimize the highest-value work.
  • Velocity is deeply variable based on factors that management usually does not want to talk about.  A good software developer can be 10 times more effective than an average software developer.  Other environmental factors that have little to do with the presence of a prescriptive process can cause additional wild swings.  A future article in this series will go into more detail about how potential swings in velocity are an order of magnitude more significant than the swings in “cost per hour”.

Prescriptive plans that try to predict cost and schedule without understanding the individuals, the service climate, the DevOps maturity, and the multitude of other variables that affect the speed of software development will surely fail. The simple fact that corporate budgeting processes select for optimistic thinking leads, most of the time, to assuming that a simple program consisting of 4 main screens will take 8 weeks rather than 8 quarters.

Why is this a cycle of failure? Well, what happens is that you bring PMBOK, ITIL, or some other “capital-P” Process in, and it fails. Because such processes tend to be more accepted by the corporate finance community, the reaction to a failed project tends to be “we didn’t do enough process control,” rather than “we did too much.” The next round of projects tends to have more detailed estimating and stronger change control. I personally know of requirements efforts measured in years, rather than weeks.

What Levers Remain?

There are levers that are far too underused. In my experience, they are as follows:

Manage Project Duration Down

In general, prefer smaller projects to larger ones. Predictability works better when the team is smaller and the timeframe is short. I can almost decently predict what one team of 8 people will complete in an iteration if I know all the people, skills, and modes of working on that team. I, personally, am a believer in doing lots of small things rather than one big thing, as it allows for a faster feedback cycle and an overall less risky investment. However, there are many projects – some of the most interesting ones – that will never be small-budget, short-duration projects. This leads to the lever I use when I can’t manage duration down:

Manage Velocity Up

I have seen a team of 6 people in one company deliver in 7 months the same amount of work I have seen 150 people in other companies deliver in 2 years. And I am not alone: I have talked with others in this business, and nearly everyone I talk to has either seen, or worked on, one of these very effective software development teams.

Velocity matters. It is hard to measure productivity the way you measure a person’s shoe size, but you can certainly manage relative productivity over a long period of time. The perception that software costs too much is, in my opinion, one of the factors holding the global economy back from expanding. In any large company, any decent developer would be shocked at the waste involved in many business processes – waste that could be eliminated by better software implementation. There are companies that will not change commission plans, product offerings, or even product pricing to be more profitable, chiefly because of software constraints. If you think software does not have a serious effect on both the opportunities and the limitations of a company, you must not be paying very much attention.

What’s Next?

In this series of upcoming posts, I am going to explain various factors that affect velocity, and how you can optimize them in a manner that does not sacrifice quality – factors that, when understood and properly managed, can mean the difference between a flexible company that can respond to market conditions and a company whose technology does not allow it to run so much as a second shift when demand ramps up.

Stay tuned…


What Working at ThoughtWorks Has Taught Me About Consulting So Far

In my book The Nomadic Developer, I spent an entire chapter covering techniques that allow you to thrive as a technology consultant.  Of course, I wrote that before I joined ThoughtWorks.  Since joining, I can certainly say that ThoughtWorks has given me quite an education about technology consulting.  This post explores some of the things I have learned over the past 18 months.

Technology Consulting is 10% Technology, 90% Consulting

Being a great technologist is around 10% of the skillset required for being a good technology consultant. I used to think it was 50%, but my understanding has since shifted sharply toward the so-called “soft skills” being more important. In most companies, there are thousands of opportunities to make things better using technology in some way, shape, or form. The trick to unlocking those opportunities is overcoming the massive “wall of cynicism” towards these kinds of investments. Discovering the opportunities, overcoming the wall of cynicism, getting the human stakeholders on board (not just upper management!), and then actually putting this all together to get a project funded and delivered seems to be 90% of the challenge.

Technology Consulting is a Subset Of, not Different Than, Management Consulting

You can do a project that has no deep implications for the overall business, but I doubt that is the kind of project I would ever want to work on. Most technology investments are, in fact, capital investments in the business. It is very common for technology to end up both constraining and enabling corporate strategy. Even minor implementation details can have significant effects on the strategic choices a company will be able to make down the road. Everything from the ability to execute mergers effectively, to price changes, new product launches, and other strategic initiatives, depends deeply on business technology. For good or ill, technology decisions drive business strategy – and until that condition changes, it is my conjecture that effective technology consulting is really a part of the overall management consulting picture.

ThoughtWorks, having recognized this, is formally rolling out our Consulting Practice, which, among other things, seeks to do formally what we have been doing informally since our inception: advise companies not just on how to implement technology, but on what technology to implement.

Clients are Never Perfect

Expecting to come to TW and work only with perfectly “aligned” clients that always espouse our values is, to be blunt, a poor expectation. Most people call the doctor when they are sick, not when they are well. Sometimes – no, all the time – organizational transformation is HARD, and you will have to, excuse my French, slog through some shit in order to help a client that needs it become that perfectly functioning agile organization practicing continuous delivery.

Never Underestimate The Importance of Soft Skills

Good consulting involves:

  • Controlling your temper and not being “shocked” when you see things like bad code and retrograde practices. When you go on site, expect anything and everything.
  • Understanding and addressing the skepticism of organizations for which you are the 4th, 5th, or 10th person who has been put there to try to fix things.
  • Learning how to build credibility so you can spend it later when you need to.
  • Understanding the limits of your own capabilities, so you can know when to call for help (aka you do not have all the answers).
  • Learning how to understand a domain quickly and credibly, so you can talk in the language of the client
  • … and a million other little things that have more to do with relations between humans than they have to do with technology …

It has been quite an experience, personally, just finding out how much I didn’t know while learning how to apply these principles in a large programme of work. That these things are important is rather intuitively obvious, but at scale it becomes a list of things you have to remind yourself about every day. When the stakes go up and the number of people increases, the importance of these basic fundamentals really starts to outweigh almost everything else!

The Bottom Line

If you want to actually deliver a great technology solution, getting the technology right is just the table stakes.  Getting the people thing right – the consulting – is 90% of the actual work.  It is thrilling, engrossing work, but it certainly isn’t just about software!


The Unheralded Benefits of the F# Programming Language

As many long-time readers know, I am an enthusiast of the F# programming language. I make no apologies for the fact that, if you are developing software on the .NET platform, F# is one of the better choices you can make, for numerous reasons. It is one of the reasons I proudly contributed as a co-author to the book Professional F# 2.0, which is being published by Wrox in October.

Some of the oft-cited benefits of F# are, to distill them quickly, that it is good at intensely mathematical operations, it is built for parallelism, and it is good at helping define domain-specific languages. Those benefits are cited so often by speakers on the F# circuit that they pretty much seem cliché to me at this point (note: yours truly is proud to call himself a member of said circuit, and often gives this talk!). As great as those features are, there are a couple of features that, in my more mundane F# experience, stand out as the things that “save my ass,” for lack of a better phrase, more often than not.

Advantage 1: Near Eradication of the Evil NullReferenceException

The first feature I am most grateful for is that, when working with F#, you almost never deal with the concept of null. A side effect of being “immutable by default” is that the pattern of “let’s leave this thing uninitialized until I use it later, then forget I didn’t initialize it when I try to use it” mostly goes away. In other words, immutable code almost never has a reason to be null; therefore, if you stick to immutable structures, there are no null pointer exceptions.

How often do you see this pattern in a legacy code base:

if (thisThing != null && thisThing.AndThatThing != null)
{
    ...
    doSomethingWith(thisThing.AndThatThing);
    ...
}

… or worse, deeply nested examples of the same? When I am doing code archaeology, sometimes even on more recent code bases, I usually spot this kind of code as marking places where:

a.) Some very careful programmer is doing null checks to make sure she does not get a null pointer exception

… or, more commonly…

b.) Someone fixed one or more NullReferenceExceptions they were getting.

The only time you routinely deal with the concept of null in F# is when doing interop work, likely with someone else’s ill-initialized C# code. Of course, one may wonder how to represent the idea of something actually being “missing” – that is, something roughly analogous to null – in F# code. That is where the option type comes to the rescue. If, for example, you have a concept of weather, you might have this:

type Weather =
    { Skies  : Sky
      Precip : Precipitation option }   // Some(precipValue) when there is precipitation, None when there is not

In this example, weather will always have a sky, but might only sometimes have precipitation. If something is optional, you say so, which means the value of Precip can be either Some(somePrecipValue) or None. For C# programmers, this is roughly analogous to Nullable<T>, except that it applies to objects, not just value types. This forces the programmer to state explicitly, as the exception rather than the rule, which values can be “absent.” In the same way that a database design becomes more robust when you make more of your fields non-nullable, software becomes more robust and less prone to bugs when fewer things are “optional” as well.
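
To make that concrete, here is a small, hypothetical usage sketch (it assumes the Weather record above, with Sky and Precipitation defined elsewhere); the compiler will not let you read Precip without saying what happens when it is None:

let describeWeather (weather : Weather) =
    match weather.Precip with
    | Some precip -> sprintf "%A skies with %A" weather.Skies precip       // precipitation present
    | None        -> sprintf "%A skies, no precipitation" weather.Skies    // the "missing" case, handled explicitly

Leave out the None case and the compiler warns about an incomplete match – a far friendlier prompt than a NullReferenceException at runtime.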

Advantage 2: Your Entire Domain in One Page of Code

The second advantage – at least in my mind – is that, unlike in C# and Java, the lack of syntax noise in F# means that nobody uses the “one class per file” rule that is conventional in most mainstream programming languages. The nice thing about this is that, frequently, you can put an entire, reasonably complex domain model on one printed page of code.

One pattern I use quite a bit is to put the set of relationships between types in one file, along with the common functions between elements. If I need to extend the domain in any way – such as later adding a conversion function from the domain to a viewmodel – I put that in a separate file, where I can write extension methods that adapt the domain to whatever it needs to be adapted to.
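
As a rough sketch of that layout (hypothetical types, not from any real project), the whole core domain can sit in one small file, with a viewmodel extension living in a second file:

// Domain.fs – the types, their relationships, and common functions on one page
module Domain

type Customer =
    { Name  : string
      Email : string option }

type OrderLine =
    { Sku      : string
      Quantity : int
      Price    : decimal }

type Order =
    { Customer : Customer
      Lines    : OrderLine list }

let orderTotal order =
    order.Lines |> List.sumBy (fun line -> decimal line.Quantity * line.Price)

// OrderViewModels.fs – a separate file extends the domain for one specific need
module OrderViewModels

open Domain

type OrderSummary =
    { CustomerName : string
      Total        : decimal }

type Order with
    member this.ToSummary() =
        { CustomerName = this.Customer.Name
          Total        = orderTotal this }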

Extension methods? What if I need a private member to extend the domain? Well, that raises a question: when do functional programmers use private members? I can only vouch for myself, but I seldom feel the need to hide anything in F# programs. Think about why we ever need encapsulation – it is usually to stop outsiders from changing the member variables inside our class. If nothing varies – as is the case in a system built from immutable constructs – then there is less need for such encapsulation. I may have a private member somewhere to hide an implementation detail, but even that tends not to be something I would ever use in an extension scenario (e.g. projecting a domain object to a viewmodel).

The overall advantage, in the way F# is written, is that in most systems you can have all of your related concerns on a single page of code. This ability to “print out and study the domain in the loo” is a subtle but important reason why F# is good for expressing domains.

Is it for Everything?

No – not by a long shot. But more and more, I see F# as useful for many things beyond the math, science, and obviously parallel applications that functional languages are traditionally considered good for. Specifically, the more I use it in MVC and REST-style applications, the more it grows on me – especially when I am working with Java or C# code, fixing someone else’s NullReferenceExceptions!


Upper Management Support the Key to Success? No.

You have heard this one before. The key to project success is “Upper Management Support.” I hear the phrase so much it is pretty much a cliché, right up there with “be aligned with the business.” It ranks right up there with “brush your teeth in the morning” and “exercise if you want to be healthy.”

So I have a humble request. First of all, I think we can just start assuming that if you have a goal that requires more than, say, the amount of money a small office spends annually on copiers, you are going to need upper management support – especially if it is going to change the business. If you are embarking on a “change the company for the better” initiative, getting upper management support is basically the first, and one of the easier, steps.

  • The real hard stuff isn’t upper management, it’s middle management.  Upper management does not usually have their empires or jobs threatened by a significant IT initiative.  Middle management, on the other hand, often does.
  • The day to day impact of a large programme on the lives of upper management is comparatively low.  Most of the time, they can get on with their normal duties and continue to operate at the “strategic” level rather than hands on.  Middle management, on the other hand, often has to expend significant time, energy, and political capital in the programme in order to make sure it works.
  • The number of people in upper management is lower, well, because they are upper management; you can more easily get a small number of people on the same page. Middle management numbers are much higher, making the politics of middle management engagement much more difficult.
  • Various groups in middle management that are impacted unevenly create opportunities for internecine politics to enter the scene.  Even if you get middle management engagement, keeping it is a much more difficult chore.

There are all sorts of things you need in order to make a large programme of transformational work successful. Upper management support is just the “table stakes” – the ante for getting to the table. Getting the support of middle management, and getting past the many roadblocks they can put in your way, is much harder, frequently underestimated, and frankly, in my experience, much more of a project success factor.


The “Dark Matter” of Technical Debt: Enterprise Software

Bespoke software is expensive. As we all well know, it is risky to build, technical debt can easily creep in, and you can easily end up with a maintenance nightmare. And software developers, well – we all know they are hard to work with, they tend to have opinions about things, and did I mention, they are expensive?

The argument has always been that with purchased software, you get an economy of scale because you share the software with others. Of course, this works out well most of the time – nobody should ever be developing their own internal commodity software (think operating systems, databases, and other “utilities”).

However, not all software is “utility.” There is a continuum of types of software, going from something like Microsoft Windows or Linux on one end, which nobody in their right mind would write themselves, to company-specific applications of all kinds that have zero applicability outside of a given, well, “Enterprise,” on the other. The software I am talking about in this post lies somewhere in the middle of these extremes.

Almost anyone who does work in corporate IT has probably encountered one of these systems. The following traits commonly pop up:

  • It is oriented at a vertical market.  The number of customers is often measured in 10s or 100s.
  • The cost for purchase is usually measured with at least 6 figures in USD.
  • It usually requires significant customization – either by code, or by a byzantine set of configuration options.
  • It was almost certainly sold on a golf course, or in a steak house.
  • You usually need the vendor’s own consultants to do a decent installation. The company that sells the software has professional services revenues at or above its software license revenues.

It is my observation that software in this category is almost always loaded with technical debt – technical debt that you can’t refactor, technical debt that becomes a permanent fixture of the organization for many years to come. Enterprise software – especially software sold as “Enterprise Software” to non-technical decision makers – is, more often than not, a boat anchor that holds organizations back, adding negative value.

Why is this?  Enterprise software is often sold on the basis of flexibility.  A common process, sadly, in the world of package selection, is to simply draw up a list of features, evaluate a set of vendors on the basis of desired features, and balance that against some license cost + implementation cost threshold.  Lip service is given to “cost-of-ownership”, but the incentives in place reward minimizing the perceived future costs.  What this process selects for is a combination of maximum flexibility, moderate license cost relative to a build (but often high), and minimized estimates of implementation cost.  Even if one company bucks the trend, the competitive landscape always selects for things in this direction.

Why is that true?  We don’t assess the technical debt of enterprise software.  I have seen a lot of buy versus build analysis in my years as a technology consultant, and not once did I see something that assessed the internal quality of the solution.  Enterprise software is bought based on external features, not internal quality.  Nobody asks about cyclomatic complexity or afferent coupling on the golf course.

Does the internal quality of purchased software matter? Absolutely. In spades. It is hardly uncommon for companies to start down the path of a packaged software implementation, find some limitation, and then need to come to an agreement to customize the source code. Rarely does anyone intend to take on the source when the software is purchased, but frequently it happens anyway when the big hairy implementation runs into difficulty. Even if you never take possession of the source code, your ability to get upgrades to the solution will be affected by the packaged software vendor’s ability to add features. If the internal quality is bad, it will affect the cost structure of the software going forward. APIs around software with bad internal quality tend to leak that bad quality, making integration difficult and spreading around the code smells that are presumably supposed to be kept “inside the black box.”

What is the end result? Package implementations that end up costing far in excess of what it would have cost to build custom software in the first place. Lots of good money thrown after bad. Even when the implementation works, massive maintenance costs going forward. It gets worse, though. The cost of the last implementation often colors expectations for what the replacement should cost, which tends to bias organizations towards replacing one nasty behemoth enterprise software package with something equally bad. It is what the French like to call a fine mess.

So what is the solution?  We need to change how we buy enterprise software.  The tools we have for buy versus build analysis are deficient – as few models include a real, robust cost-of-ownership analysis that properly includes the effects of insufficient internal quality.  It is amazing that in this day and age, when lack of proper due diligence in package selection can cost an organization literally billions of dollars, that so little attention is paid to internal quality.

What would happen?  There would be a renewed incentive to internal quality.  Much of today’s mediocre software would suddenly look expensive – providing room for new solutions that are easier to work with, maintain, and provide more lasting business value.  More money could be allocated to strategic software that uniquely helps the company, providing more space for innovation.  In short, we would realize vastly more value out of our software investments than we do today.


F# Based Discriminated Union/Structural Similarity

Imagine you have a need to take one type, which may or may not be a discriminated union, and see if it “fits” inside of another type.  A typical case might be whether one discriminated union case would be a possible case for a different discriminated union.  That is, could the structure of type A fit into the structure of type B.  For lack of a better word, I am calling this “structural similarity”.

Let’s start with some test cases:

module UnionTypeStructuralComparisonTest

open StructuralTypeSimilarity
open NUnit.Framework

type FooBar =
    | Salami of int
    | Foo of int * int
    | Bar of string

type FizzBuz =
    | Toast of int
    | Zap of int * int
    | Bang of string

type BigOption =
    | Crap of int * int
    | Bang of string
    | Kaboom of decimal

type Compound =
    | Frazzle of FizzBuz * FooBar
    | Crapola of double

[<TestFixture>]
type PersonalInsultTestCase() =

    [<Test>]
    member this.BangCanGoInFooBar() =
        let bang = Bang("I like cheese")
        Assert.IsTrue(bang =~= typeof<FizzBuz>)
        Assert.IsTrue(bang =~= typeof<FooBar>)
        Assert.IsTrue(bang =~= typeof<BigOption>)

    [<Test>]
    member this.KaboomDecimalDoesNotFitInFizzBuz() =
        let kaboom = Kaboom(45m)
        Assert.IsFalse(kaboom =~= typeof<FizzBuz>)

    [<Test>]
    member this.SomeStringCanBeFooBar() =
        let someString = "I like beer"
        Assert.IsTrue(someString =~= typeof<FooBar>)

    [<Test>]
    member this.SomeFoobarCanBeString() =
        let someFoobar = Bar("I like beer")
        Assert.IsTrue(someFoobar =~= typeof<string>)

    [<Test>]
    member this.SomeFoobarTypeCanBeString() =
        Assert.IsTrue(typeof<FooBar> =~= typeof<string>)

    [<Test>]
    member this.CompoundUnionTest() =
        let someCompound = Frazzle(Toast(4), Salami(2))
        Assert.IsTrue(someCompound =~= typeof<FooBar>)

To make this work, we are going to need to implement our =~= operator, and then do some F# type-fu in order to compare the structures:

module StructuralTypeSimilarity

open System
open Microsoft.FSharp.Reflection
open NLPParserCore

let isACase (testUnionType:Type) =
    testUnionType
    |> FSharpType.GetUnionCases
    |> Array.exists (fun u -> u.Name = testUnionType.Name)

let caseToTuple (case:UnionCaseInfo) =
    let fields = case.GetFields()
    if fields.Length > 1 then
        fields
        |> Array.map (fun pi -> pi.PropertyType)
        |> FSharpType.MakeTupleType
    else
        fields.[0].PropertyType

let rec UnionTypeSourceSimilarToTargetSimpleType (testUnionType:Type) (targetType:Type) =
    if (testUnionType |> FSharpType.IsUnion)
       && (not (targetType |> FSharpType.IsUnion)) then
        if testUnionType |> isACase then
            let unionType =
                testUnionType
                |> FSharpType.GetUnionCases
                |> Array.find (fun u -> u.Name = testUnionType.Name)
            let myCaseType = caseToTuple unionType
            myCaseType =~= targetType
        else
            testUnionType
            |> FSharpType.GetUnionCases
            |> Array.map (fun case -> (case |> caseToTuple) =~= targetType)
            |> Array.exists (fun result -> result)
    else
        raise (new InvalidOperationException())

and UnionTypeSourceSimilarToUnionTypeTarget (testUnionType:Type) (targetUnionType:Type) =
    if (testUnionType |> FSharpType.IsUnion)
       && (targetUnionType |> FSharpType.IsUnion) then
        if testUnionType |> isACase then
            targetUnionType
            |> FSharpType.GetUnionCases
            |> Array.map (fun u -> u |> caseToTuple)
            |> Array.map (fun targetTuple -> testUnionType =~= targetTuple)
            |> Array.exists (fun result -> result)
        else
            testUnionType
            |> FSharpType.GetUnionCases
            |> Array.map (fun case -> (case |> caseToTuple) =~= targetUnionType)
            |> Array.exists (fun result -> result)
    else
        raise (new InvalidOperationException())

and SimpleTypeSourceSimilarToUnionTypeTarget (testSimpleType:Type) (targetUnionType:Type) =
    if (not (testSimpleType |> FSharpType.IsUnion))
       && (targetUnionType |> FSharpType.IsUnion) then
        targetUnionType
        |> FSharpType.GetUnionCases
        |> Array.map (fun u -> u |> caseToTuple)
        |> Array.map (fun targetTuple -> testSimpleType =~= targetTuple)
        |> Array.exists (fun result -> result)
    else
        raise (new InvalidOperationException())

and SimpleTypeSourceSimilarToSimpleTypeTarget (testSimpleType:Type) (targetSimpleType:Type) =
    if (testSimpleType |> FSharpType.IsTuple) && (targetSimpleType |> FSharpType.IsTuple) then
        let testTupleTypes = testSimpleType |> FSharpType.GetTupleElements
        let targetTupleTypes = targetSimpleType |> FSharpType.GetTupleElements
        if testTupleTypes.Length = targetTupleTypes.Length then
            let matches =
                Array.zip testTupleTypes targetTupleTypes
                |> Array.map (fun (test, target) -> test =~= target)
            not (matches |> Array.exists (fun result -> not result))
        else
            false
    else
        testSimpleType = targetSimpleType

and (=~=) (testObject:obj) (targetType:Type) =
    let objIsType (o:obj) =
        match o with
        | :? Type -> true
        | _ -> false

    let resolveToType (o:obj) =
        match objIsType o with
        | true -> o :?> Type
        | false -> o.GetType()

    let testObjectIsAType = testObject |> objIsType
    let testObjectTypeIsUnion =
        match testObjectIsAType with
        | true -> testObject |> resolveToType |> FSharpType.IsUnion
        | false -> false
    let targetTypeIsAUnion = targetType |> FSharpType.IsUnion

    let resolvedType = testObject |> resolveToType

    match testObjectIsAType, testObjectTypeIsUnion, targetTypeIsAUnion with
    | false, _, _ -> resolvedType =~= targetType
    | true, true, false -> UnionTypeSourceSimilarToTargetSimpleType resolvedType targetType
    | true, false, false -> SimpleTypeSourceSimilarToSimpleTypeTarget resolvedType targetType
    | true, true, true -> UnionTypeSourceSimilarToUnionTypeTarget resolvedType targetType
    | true, false, true -> SimpleTypeSourceSimilarToUnionTypeTarget resolvedType targetType

Getting this to work seemed harder than it should have been. While my tests pass, I am sure there are both cases I have not yet covered and simpler ways I could accomplish some of the same goals.

While this is a work in progress, if anyone has any thoughts for simpler ways to do something like this, I am all ears.


Using Dynamic with C# to read XML

On April 10th (less than 1 week away), I am doing an updated version of my talk at Twin Cities Code Camp about using dynamic with C#.

One core technique I am seeking to demonstrate is the concept of a dynamic XML reader as a more human-readable way to use XML content from C# or any other dynamic language.

Consider the following usage scenarios:

http://pastie.org/904555

What we would like is an object that uses dynamic in C# so that we can read XML without having to think about all the nasty mechanics of searching, XPath, and other stuff that isn’t “I am looking for the foobar configuration setting” or whatever else it is we are after in the XML. The following is the basic spiked implementation:

http://pastie.org/904557

The real hard part was figuring out the mechanics of how DynamicMetaObject actually does its work. Doing a dynamic object, if you are not going to do it the easy way and simply inherit from DynamicObject, means you are going to write two classes:

  • Something that implements IDynamicMetaObjectProvider
  • Something that inherits from DynamicMetaObject

The job of IDynamicMetaObjectProvider – at least as far as I can tell – is simply to point to the right DynamicMetaObject implementation, and somehow associate something with it that will potentially drive how the dynamic object will respond to various calls.  Most of the interesting stuff happens in DynamicMetaObject, where we get to specify how various kinds of bindings will work.

In a simple case like this where we are doing everything with properties, we merely need to override BindGetMember.  The return value of BindGetMember will generally be another DynamicMetaObject derived class.

Generally, DynamicMetaObjects take three parameters on construction:

  • A target expression
  • A binding restriction (something that tells the object a little about what it has – not entirely sure why…)
  • The actual thing to pass back

In the way I am using this, there are three main ways we return. If we find we can resolve the property call to a string, we just wrap the string in a constant expression, specify a type restriction on string, and send that back. If we are resolving to another XElement that has sub-items, we wrap the XElement in a new DynamicXmlMetaObject, as a means to allow further dot notation to get at sub-elements. Lastly, if we have a group of the same item, we return an array of either wrapped strings or wrapped DynamicXmlMetaObjects. Managing these three cases is where most of the complexity lies.

This is a work in progress – and I have already been told by some that this is a bad idea (e.g. “why not XPath?”, “that looks dangerous”, and “what if the Republicans use this?”). But certainly, for certain kinds of problems, I can definitely see myself using this kind of thing to remove lots of LINQ to XML plumbing code! (Note: some work to integrate this, and perhaps somehow combine it with XPath queries, will probably happen.)


Just Make Me Think! The Best Technologies Force Hard Choices.

In thinking about what is so compelling about certain technologies that have emerged in recent years, a common theme starts to appear: the best technologies don’t just do something useful, they make the user think about the right things – the things that lead to better designs and more robust software. Let’s start by thinking about some of the most compelling technologies or techniques that have had a lot of buzz in the last couple of years:

Dependency Injection

Dependency injection is a great technique for managing coupling – at least it is in practice. But there is nothing stopping you from using dependency injection to, say, create a giant class, inject thirtyteen-hundred dependencies into it, and give yourself a maintenance nightmare.

What is important about dependency injection is that it makes you think about dependencies. When I am writing code, I want creating a dependency from one class to another to hurt – not a lot, but enough that it makes me put an entry in a file somewhere, be it an XML config, a module, or something else. A DI tool that wires everything up for me without my having to explicitly think about it each time is, to me, like an HR management tool that does not have an “are you sure?” prompt on the “fire employee” button.
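
To illustrate the kind of friction I mean – a minimal, hypothetical F# sketch, not a recommendation of any particular container – here is the explicit-wiring end of the spectrum, where every new dependency is a visible edit to a constructor signature and to a hand-written composition root:

type IOrderRepository =
    abstract Save : orderId:int -> unit

type INotifier =
    abstract OrderPlaced : orderId:int -> unit

type SqlOrderRepository() =
    interface IOrderRepository with
        member this.Save orderId = printfn "saving order %d" orderId

type EmailNotifier() =
    interface INotifier with
        member this.OrderPlaced orderId = printfn "emailing about order %d" orderId

// Adding a third dependency here should "hurt" just enough to make you pause.
type OrderService(repository : IOrderRepository, notifier : INotifier) =
    member this.Place orderId =
        repository.Save orderId
        notifier.OrderPlaced orderId

// The composition root: each wiring decision is a deliberate, reviewable line.
let composeOrderService () =
    OrderService(SqlOrderRepository(), EmailNotifier())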

Continuous Integration

Continuous Integration tools do some nifty things, not least providing visibility into the state of the code. When properly implemented, the benefits are pretty staggering. However, in my experience, one of the most important benefits is that Continuous Integration forces you to think about how to automate the deployment process. You can’t continuously integrate if some person has to flip the bozo bit in order for the build to work. CI promotes a model that makes tools requiring human intervention to install seem obsolete. I can’t see this as a bad thing.

Functional Programming

Almost anyone who knows me in a professional context knows I am a big fan of functional programming. I have grown to like the syntax and expressiveness of languages like F#, and there are a great many reasons why the language is important. But to me, the most striking is that functional languages make you think long and hard about state. You can have state in F# (through the mutable keyword), and even in a pure functional language like Haskell if you implement a state monad. But the important thing is that you have to do deliberate work to create state in these languages, which leads to choices about state that tend to be more mindful than in languages where mutable state is the default.
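
A deliberately trivial F# illustration: the immutable binding is the default and needs no ceremony, while state has to announce itself with the mutable keyword and the distinct <- assignment operator.

let answer = 42                   // immutable by default; re-assigning it is a compile error

let mutable runningTotal = 0      // state requires an explicit opt-in
runningTotal <- runningTotal + answer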

I could talk through more examples, but I think you get the picture.  The opposite tends to hold true as well – my main issue with a technology like ASP.NET Webforms is that it makes certain things easy (ViewState, notably) that it should not, in order to help you avoid having to think about the fact that you are running an application over a stateless medium.  When you are considering a new technology that is emerging – don’t think features, think “What choices is this technology forcing me to make?”


i4o v2

An update to a project I have been working on for some time, for which the time is definitely ripe.

It was an afternoon in 2007 when I was pondering, “Why am I writing the same Dictionary<K,V> collections just for indexing, and putting them internal to my collection classes, so I can do query optimization?” Thinking at the same time about how LINQ could work with such a thing, I decided to go forth and spike out a specialized collection type that would internally implement indexes and simply use them when you issue a query against the collection. The result was the i4o project (codeplex.com/i4o).

If you are not familiar with i4o, the problem it solves is that it takes any IEnumerable<T> in .NET and allows you to easily put indexes on it, such that LINQ queries use the index rather than doing a “table scan” style search through the collection. You create an IndexableCollection<T> as follows:

var someIndexable = new IndexableCollection<T>(someEnumerable, someIndexSpecification);

At this point, someIndexable wraps someEnumerable, applying an index to one or more properties, as specified in someIndexSpecification. You could also inherit from IndexableCollection<T> to get indexing in your own custom collections, and then run a query like this:

var someItems = from s in someIndexable where s.SomeProperty == 42 select s;

… and the query would use the index for searching, rather than scan each item in the collection.

Over the years since (approaching three!), the project has evolved little by little. In 2008, Jason Jarrett started contributing work, bringing important performance enhancements, better code organization, continuous integration, and various other things that he elaborates on here.

For 2010, we wanted to do a significant refresh, with some new capabilities for core functionality.  The second major revision of i4o will feature the following:

* Support for INotifyPropertyChanged

Previously, if an indexed property changed, you had to manually notify the index of the change and reindex the item.  For i4o2, we have formally added support for property change notification, such that if T supports INotifyPropertyChanged, changes to any child will result in automatic reindexing of that item.

* Support for Observable<T>

If a collection inherits from Observable<T>, and T also supports INotifyPropertyChanged, you will be able to wrap an index around that which will notice changes in the collection.

* Support for a far wider range of query operations

Previously, if a query was anything other than object.Property == <some constant | some expression>, i4o would simply use Enumerable.Where, rather than try to optimize the query using the index.  The index was based on strict hash tables, which limited the kinds of operations that could be optimized.  In v2, there is a much more sophisticated query processor that:

1.) Optimizes the structure based on what the child type supports.  If the child type is IComparable, it uses a Red-Black tree to hold the index so that queries that use comparison operations rather than equality can be optimized.

2.) Uses a more sophisticated algorithm for visiting the expression tree of the predicate, allowing for indexes to work on much more complex predicates.  So if you are doing a query where your predicate is more complex (A and B and (C or D), for example) the processor will handle those as well.

* Creation via a builder pattern:

Rather than make the user try to figure out what index is best, creating an index is a matter of calling IndexBuilder.BuildIndicesFor with a source collection and a spec:

var someIndex = IndexBuilder.BuildIndicesFor(someCollection, indexSpec);

The builder detects the types passed to it, and simply builds the most optimal index type (ObservableIndexSet or just an IndexSet) based on the type of enumerable passed to it and the child type.

* More DSL-like index specification builder

While this has been supported for some time, we added some language to the builder, so the code would read better:

            var indexSpec = IndexSpecification<SimpleClass>.Build()
                .With(person => person.FavoriteColor)
                .And(person => person.Age)
                .And(person => person.Name);

These changes are very much in early alpha testing right now, with a goal of having them fully baked prior to the Visual Studio 2010 release.  At that time, we intend to move the changes into the project trunk and do a full release.

If you want early access to the changes before then, I encourage you to visit codeplex.com/i4o and look at the i4o2 branch of the code repository. Once we are sufficiently stable and have integrated some of the older perf improvements into the new code base, we will likely move the i4o2 branch to the trunk.
