Sunday 16 March 2008

The end of coding as we know it?

I was talking to a friend and colleague, who shall remain nameless, about the use of models as a principal means of deriving applications. Oddly enough, the day before, I was also talking to one of my new colleagues at Cognizant about something not dissimilar. In the former case there is at least one organisation (and probably many) that now seeks to reduce the coding burden and has made efforts to turn its coding shops into testing shops with a little coding on the side. In the latter case we were talking about the real IP being in the process models and not in the coding.

Clearly there is much work in MDA circles to solve these problems. After all, there have been attempts at moving to executable UML. Few, if any, have really succeeded, and those that come close tend to do so based on a more siloed view of an application. There are also initiatives within the OMG to codify process models based on BPDM and relate them back to UML. This latter move is of considerable interest because it recognises both that UML today does not facilitate the encoding of business processes and that we need some description of the peered, observable behavior of a set of roles, which we might call a choreography description.

Can we really move towards a world in which models drive everything else, and do so automatically? And if we can, what do these models need to provide? What are the requirements?

I would contend that any such model, we might call it a dynamic blueprint for a SOA, needs to fulfill at least the following requirements:

  1. A dynamic model MUST be able to describe the common collaborative behaviors of a set of peered roles.
  2. A dynamic model MUST be supported by a static model describing the types of information exchanged.
  3. A dynamic model MUST NOT dictate any one physical implementation.
  4. A dynamic model MUST be verifiable against requirements.
  5. An implementation of a dynamic model MUST be verifiable against that dynamic model.
  6. A dynamic model MUST be verifiable against liveness properties; that is, it may be shown to be free from deadlocks, live locks and race conditions.
  7. A dynamic model MUST be able to be simulated based on a set of input criteria.
  8. A dynamic model MUST enable generation of role-based state behaviours to a range of targets including, but not limited to, UML activity diagrams, UML state charts, Abstract WS-BPEL, WS-BPEL, WSDL, Java and C#.

Let me examine what these really mean, and then I shall summarise what I think the implications are for the software market as a whole.

Requirement 1 really states that any model must be able to describe the way in which services (which might be the embodiment of a role) exchange information and the ordering rules by which they do so. The types of information exchanged might be given in a static model (requirement 2). The ordering rules would be the conditional paths, loops and parallel paths that constitute the collaborative behavior of the model with respect to the peered roles.
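To make requirement 1 concrete, here is a minimal sketch, in Java, of what a dynamic model might capture: a set of peered roles and the ordered message exchanges between them. All of the names here (Role, Interaction, the buyer/seller messages) are illustrative assumptions of mine, not WS-CDL constructs.

```java
import java.util.List;

// Hypothetical sketch: a choreography as an ordered set of interactions
// between peered roles. Names are invented for illustration only.
public class ChoreographySketch {
    enum Role { BUYER, SELLER }

    // One observable message exchange between two roles.
    record Interaction(Role from, Role to, String messageType) {}

    public static void main(String[] args) {
        // The ordering rule here is simply the sequence of the list; a real
        // dynamic model would also capture choices, loops and parallelism.
        List<Interaction> buyNow = List.of(
            new Interaction(Role.BUYER,  Role.SELLER, "RequestQuote"),
            new Interaction(Role.SELLER, Role.BUYER,  "Quote"),
            new Interaction(Role.BUYER,  Role.SELLER, "PlaceOrder"),
            new Interaction(Role.SELLER, Role.BUYER,  "OrderConfirmation")
        );
        buyNow.forEach(i ->
            System.out.println(i.from() + " -> " + i.to() + " : " + i.messageType()));
    }
}
```

A real choreography language would, of course, also type the message contents against the static model of requirement 2.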

Requirement 2 simply states that a static information or data model is required, and that this can either be in place when creating a dynamic model or be created alongside the dynamic model, which would provide context for the information types. When we iterate between sequence diagrams and static data models today this is essentially what we do anyway. The difference is that the dynamic model is also complete, unlike sequence diagrams, which provide context only for the scenario that they represent.

Requirement 3 says that a dynamic model should not dictate any specific physical implementation; that is, it should not require a solution to be hub-and-spoke, peered, hierarchical and so on. It should be capable of being implemented in a range of physical architectures which are independent of the dynamic model.

Requirements 4 to 6 say that a model must be subject to, and support, various forms of automatic verification, just as programming languages are today, when the compiler picks up errors and so prevents the code from being made executable. In the case of a dynamic model we would want to ensure that it meets a set of requirements for the domain that it represents. This might be achieved by validating the dynamic model against an agreed set of messages and an agreed set of sequence diagrams which collectively describe one or more use cases.

On the other hand, we would want to use a validated dynamic model, which as a result of validation we know meets our requirements, to verify that an implementation of that model conforms to the model. That is, there must not exist, in any observed execution of the implementation across all of its constituent services, any set of observable exchanges or conditions that cannot be directly mapped to the dynamic model. Putting it another way, we want to use the dynamic model as input to some form of runtime governance applied to the behavior of our set of peered services.

The requirements that mention liveness, live locks and so on are really no different from saying that in any programming language it is illegal to access an array of 10 elements by writing x = array[11]. The difference is that we are looking to prevent badly formed and potentially disastrous problems arising in a distributed system, not in a single application as compilers do. Model checking applied to a dynamic model for distributed systems is one way of ensuring that this does not happen, in much the same way that type checking prevents errors at a localised application level. I mentioned something akin to this in my blog on the workshop I attended, entitled "OO languages with session types".
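As a rough illustration of what such runtime governance might look like, the following sketch checks an observed trace of message exchanges against the ordering that the model licenses. The prefix-based check and all of the names are my own simplifications; a real conformance checker would work against the full choreography, including choices and parallelism.

```java
import java.util.List;

// Hypothetical sketch of runtime governance: does an observed sequence of
// exchanges conform to the dynamic model? The model here is reduced to an
// expected ordering of message types.
public class ConformanceSketch {
    static boolean conforms(List<String> model, List<String> observed) {
        int next = 0;
        // Every observed exchange must map, in order, onto the model.
        for (String exchange : observed) {
            if (next >= model.size() || !model.get(next).equals(exchange)) {
                return false; // an exchange not licensed by the model
            }
            next++;
        }
        return true; // a (possibly partial) run of the modelled behavior
    }

    public static void main(String[] args) {
        List<String> model = List.of("RequestQuote", "Quote", "PlaceOrder");
        // A partial run that follows the model is conformant so far.
        System.out.println(conforms(model, List.of("RequestQuote", "Quote")));
        // Exchanges out of order are a violation.
        System.out.println(conforms(model, List.of("PlaceOrder", "RequestQuote")));
    }
}
```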

Requirement 7 states that a model must be able to be simulated. What this means in practice is that if a dynamic model captures the collaborative behavior of a set of peered roles, then we must be able to provide such a model with some input data and see the dynamic model activated. For example, if a dynamic model starts with the offering of a product then we must be able to direct it to some product information and see the exchanges that then occur. Equally, if we introduce a number of bidders into an auction system we need to be able to enact the choreography.
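A simulation of the auction example might, in the simplest possible terms, look like the sketch below: the input criteria are the bidders, and enacting the model produces the exchanges. Role and message names are invented for illustration.

```java
import java.util.List;

// Hypothetical sketch of simulating a dynamic model: given input data
// (a list of bidders), enact the exchanges the choreography prescribes.
public class SimulationSketch {
    public static void main(String[] args) {
        List<String> bidders = List.of("Alice", "Bob"); // the input criteria
        System.out.println("AUCTIONEER -> ALL : OfferItem");
        for (String bidder : bidders) {                 // one iteration per bidder
            System.out.println(bidder + " -> AUCTIONEER : Bid");
        }
        System.out.println("AUCTIONEER -> ALL : AnnounceWinner");
    }
}
```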

Requirement 8 is all about reviewability - if such a word exists. Simply stated, it is the ability to generate or display a dynamic model in a form that reviewers can understand, comment on and so sign off.

If we had a language that we could use to describe such dynamic models (and of course I would contend that WS-CDL is a good starting point along with BPDM; if you are interested in the future then look at Scribble too), then what does this mean for the software market as a whole? In simple terms it changes the shape and size of delivery and has an impact on testing. It compresses things.

On the one hand we can view the dynamic model as UML artefacts empowering implementors. If we know that the dynamic model is correct with respect to the requirements, and we know that it is correct with respect to any unintended consequences (aka liveness), then we can be sure that the implementors will have a precise and correct specification in UML of what they should write on a per-service/role basis, and so ensure that all the implemented services will not have any integration problems. It makes it much more efficient to outsource development, because the dynamic modeling can be done close to the domain and the development can be done where it is most cost effective - hooray for offshore development. Coupled with the ability to use the dynamic model as input for testing, it also becomes possible to verify that a service is playing the correct role as it is being executed in testing.
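The per-service/role specification mentioned here is essentially a projection of the global model onto one role: each role sees only its own sends and receives. A naive sketch of such a projection, with invented names, might be:

```java
import java.util.List;

// Hypothetical sketch of projecting a global dynamic model onto one role,
// yielding the "role based state behaviour" of requirement 8. A real
// projection would preserve choices, loops and parallelism too.
public class ProjectionSketch {
    record Interaction(String from, String to, String messageType) {}

    static List<String> projectOnto(String role, List<Interaction> global) {
        return global.stream()
            .filter(i -> i.from().equals(role) || i.to().equals(role))
            .map(i -> (i.from().equals(role) ? "send " : "receive ") + i.messageType())
            .toList();
    }

    public static void main(String[] args) {
        List<Interaction> global = List.of(
            new Interaction("Buyer",   "Seller",  "PlaceOrder"),
            new Interaction("Seller",  "Shipper", "ShipRequest"),
            new Interaction("Shipper", "Buyer",   "DeliveryNotice")
        );
        // The Buyer never sees the Seller/Shipper exchange.
        System.out.println(projectOnto("Buyer", global));
    }
}
```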

On the other hand, if we can generate Java do we really need coders? And this is the dilemma: can we really do without coders? If the high-level dynamic model only deals with the externally observable behavior then somehow we still need the internal behavior (the business logic). If the internal behavior can be described fully in UML, and code subsequently generated, we can indeed generate everything. The dynamic model of the system as a whole, plus the UML models for service business logic, combine to provide a two-step, high-level description of a system in which no code needs to be specified at all. So no coders? Of course it sounds too good to be true. Where does the code go? Where are the actions specified? In UML this could be done using an appropriate action language, something that has found its way into UML 2.0 but has yet to be fully formed as a concrete language. Someone still has to write this stuff, so the coders just move up the stack and become more productive, as they did with the onset of OO languages as a whole.

Is this the end of coding as we know it? One thing is for sure: it is not the end right now. However far and however fast the growing wave towards modeling, as opposed to coding, takes us, for me at least it cannot be all the way. The action semantics still need to be coded or written textually, and that is really coding by another name. I remember that in the early 1990s much was made of visual programming languages. None of them made it.

What it is all about is structure: making the structure of things visible, and so easier to manipulate. That is a huge leap forward, because we have simply never had anything that enforces structure for a distributed system before.

In the grand scheme of things, the very fact that we can articulate such possibilities (the end of coding as we know it) means that our industry as a whole is maturing, and our understanding of the complexity inherent in distributed systems (SOA and the rest) is becoming clearer every day. It does not mean that we are there yet, but because we can now think about the requirements of a language needed to describe such structure, we are at least on the right path.