Monday, November 13, 2006

Millions of LOC in the pi-calculus (or any process algebra).

I had been thinking of writing on this blog ever since I heard it had been opened, but I failed to work out some basic things, such as how to log in. Now I have learned it from Luca, and I will write a few lines.

In this first entry, I wish to write about programs. There are many programs in the world, open source or closed source. There are hundreds of programming languages, millions of lines of C code for systems-level programming, and millions of lines of application code. Some of them form the infrastructure of computing. And software lives long, often longer than machine architectures. No ISA I know of has an instruction for "while", but "while" continues to be used, surviving all the different ISAs, present or future. That is software.

Over the last decades, the Internet and the World Wide Web have become a new computing reality, and open source has initiated a new way to build software infrastructure. During this period I have been studying process algebras, together with colleagues in the community. What has this activity been? We have surely been working on general theories, but we have also been working on descriptions of processes, small or large, chaotic or well-behaved. So we have also been programming. Programming in processes, which have almost always been abstract, but still building behaviour, just as practical programmers do with C, Java, etc. And there are so many behaviours that interacting processes can realise (from vending machines to SKI combinators to file servers to data structures to Turing machines to higher-order functions to objects to cryptographic protocols to biological processes).

As I wrote, ours has so far been programming in the abstract. Through recent dialogues with people in industry, however, I have come to realise a strange thing: it seems that in all areas of programming, from embedded software to OS programming to servers to application integration, what I was trained in as abstract programming is becoming reality. Why? For the simple reason that there is an increasing need to describe interacting processes. This is so when we wish to make clear how complex, long-term financial transactions involving many parties can be stipulated so that every party is sure how it should behave; when we need to be sure that a networking product from one software vendor is interoperable with one from a different vendor, based on information disclosed by them; when we wish to run a program really fast on a multicore processor, making the best of its on-chip interconnect; when we wish to do a complex mash-up of many web services to create a richer service; or when we wish to control mobile devices which get spontaneously engaged in many kinds of high-level interactions.

All this has been more or less out there for some time, or at least in preparation. But it is now becoming visible and gathering pace.

I have had chances to talk with industry people who are in dire need of ways to describe and work with such interactions. For example, a chief software architect in a major international bank told me that he needs to have a complete grasp of how thousands of applications in different departments of his bank interact with each other as a whole, and to describe and manage it (he is looking for a suitable language for doing this). When I listened to his story, what I saw in my mind's eye was millions of lines of pi-calculus processes in action. Surely the pi-calculus in particular and process algebras in general are too fine-grained for describing application-level interactions: their setup is aimed at distilling fundamental concepts, for developing basic theories. And yet what this architect should have in the near future is nothing but descriptions of communicating processes. Such interacting software should be developed rapidly, safely upgraded and extended at run-time (without worries about deadlock, livelock, lack of compatibility, violation of regulations, etc.), usable for decision-making, and manageable. In short, we need description, analysis and control.

Such description and analysis are nothing but part of what we have been doing in process theories, albeit in the abstract. And our inquiries into basic theory will continue, since I am afraid we have only scratched its surface. Yet even from our study so far, we know several basic things about how to treat interacting processes. For example, we know that it is hard even to understand what it means for two processes to behave in the same way. Now thousands of programmers are going to program the behaviours of such processes. Each of them will ask: What are the communicating processes that I am writing? What is their behaviour, and how can I organise them?

I expect the coming interaction between engineering and theory to be a non-trivial and rich one. Many elements will make this collaboration far from straightforward and, at the same time, thrilling. It will pose many interesting questions. Today I have written long enough (I had thought a single paragraph would be enough). In the near future, we may be able to come back to this blog and discuss a couple of concrete problems picked up in my recent interactions with the engineering world.


