Monday, June 06, 2005

Threading .. the next revolution

I've been thinking for years that the next major language/runtime improvement to follow automatic memory management is automated synchronization. Not the heavy-handed top-level "synch all" but something that is smart about what needs synchronization and what doesn't. It will have to know about deadlocks and stale reads, and atomicity violations, and prevent them all with reliable accuracy.

Why this, and why now? It seems like a logical next step after memory management, and I'm far from being the first to have that notion. In the last couple of years, however, it's becoming apparent that some of the push for this kind of solution will come from the hardware that our systems are running on. Industry pundits have been saying that single-core speeds are going to stop increasing. However, this prediction has become real enough now that multi-core single dies and hyper-threading approaches are in production. To take advantage of these performance increases, you often must change the way your application functions to focus on multi-threaded capabilities.

Tools that have some of this knowledge already exist, mostly geared towards development-time analysis. What we'll need, however, is a run-time handling of the problem, backed by some kind of language adaptation (either new features, or new usage of old features)

At first glance, this problem looks a lot like the memory allocation problem. There will be run-time tradeoffs for development time gains. It should simplify the tedious and complicated business of determining where synchronization issues are. It takes a run-time tool (memory analyzers) and turns it into a run-time action (garbage collection).

However, there's a major difference. Where many developers were willing to put up with an overall runtime slowdown in order to gain an increase in developer efficiency, that won't be the case for automatic synchronization. The entire point of automatic synchronization is to take advantage of the potential performance gains in the new thread-heavy architecture. Add to that the fact that unlike most applications, the runtime synchroniztion must be 100% (possibly needing to be provably) correct, and you have a big task.

I've said something here that needs elaboration. It is my belief that most multithreaded applications today work, but are not correct. It's been my experience that if you examine a complex multithreaded system closely enough, you are almost guaranteed to find situations where there exists the potential for deadlock or dirty-read issues. In practice, these issues "almost never matter" because of outside guarantees: "That thread never runs when this thread is running" or of timing scenarios that, while possible, are very unlikely. An automated system would not have the luxury of being lucky, and as a result will incur overhead that hand-synchronized systems will not need to handle.

Speaking of handling, what would it have to handle? Here's a list of synchronization problems off the top of my head:
  • stale thread information
  • atomic reads/writes
  • deadlocks
  • transactions
The last one is the interesting one. It's not enough to have perfect field-level automatic synchronization. You must be able to designate a group of changes as being written atomically. A classic example is a bank balance transfer. The new values in 'checking' and 'savings' must be written at the same time, to prevent any user of the thread from getting a mismatched value. I think this is where new language features, or new usages of old features, are going to drive a new approach to programming.

Of course, nothing is new under the sun. That 'classic' example of transactions is just like writing to a database. Both operations need to be done atomically to the database, so that if they fail, they fail together. Approaches from relational database programming can certainly be applied here, leading us to think of shared memory like a database. Even so, I don't think there's a standard API or approach to database save transactions currently in wide use across languages, and probably not even within any open language. Closed languages like Delphi and others may likely have a single dictated approach.

Now, obviously I'm not the first to come up with the idea of automatic synchronization. There are academic papers written 15 years ago or more on the subject. Graph theory certainly has something to say about this kind of thing, and parts of its tenets are even older. There's a lot to read even before getting started, and it's a big project.

But big projects, in the XP way, should be reduced to smaller projects. So I ask: "What is the smallest amount of the above problem that we can solve, and still be useful?"