New programming model

Last week I started tweeting about a new programming language/model that I'm working on with Tijs van der Storm. If you've talked to me about software in the last 10 years, you've probably been exposed to some of my ideas on this topic. I've been trying to make sense of it for a while, but now I'm pleased to say that it is coming together in a concrete way. We aren't ready to announce anything yet, but I can give you some idea of our guiding principles.

* Program with forests, not trees
The idea here is that all data/information should be represented explicitly as semantically integrated networks of typed values with attributes and relationships. There are two important aspects to this idea, which immediately distinguish our approach from both OO and FP. In contrast to OO, we declare structures at a larger level of granularity, to capture the semantic integrity of collections of objects, rather than on individual objects. An OO programmer sees individual objects (the "trees") but cannot really see the "forest". Relative to FP, we allow explicit cycles, so that our representation is graph-based, rather than being based on trees as in FP. I know that lazy functional programs can express cyclic structures, but the cycles are not observable. We tend to call our forests "models", although it is best not to import too many assumptions from MDD or UML when we use the term.
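To make the "explicit, observable cycles" point concrete, here is a small Python sketch. The `Department` and `Employee` classes and the `hire` helper are hypothetical names invented for illustration; they are not part of our system. The only point is that a two-way relationship is ordinary data that a program can follow and inspect.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # identity-based equality, so cyclic values compare safely
class Department:
    name: str
    employees: List["Employee"] = field(default_factory=list)


@dataclass(eq=False)
class Employee:
    name: str
    department: Optional[Department] = None


def hire(dept: Department, emp: Employee) -> None:
    """Update both ends of the relationship so the pair stays consistent."""
    dept.employees.append(emp)
    emp.department = dept


research = Department("Research")
alice = Employee("Alice")
hire(research, alice)

# The cycle is observable: following the references leads back to the start.
round_trip = alice.department.employees[0]
```

Here `round_trip` is the very same object as `alice`, which is exactly the kind of identity a tree-shaped representation cannot express directly.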

* Support many languages
This means that we support domain-specific languages. In effect, every information model you create is a language. It can have multiple interpretations. The distinction between textual and visual languages is unimportant, because text and graphics are just two different presentations of an underlying information structure.

* Dynamic checking
The structure of a model is described by other models, which represent structural and behavioral constraints. That is, all data is described by metadata. And the metadata can be more interesting than just structural types. At the top we use the typical self-describing models. However, since everything is a value, all checking is done dynamically.
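As a rough illustration of checking data against metadata at run time, consider the Python sketch below. The `SCHEMA` layout, the `_class` tag, and the `check` function are assumptions made up for this example, not our actual schema format, and the sketch covers structural constraints only (the self-describing top level is not shown).

```python
# Hypothetical mini-schema: plain data mapping class names to typed fields.
SCHEMA = {
    "Point": {"x": "int", "y": "int"},
    "Line": {"start": "Point", "end": "Point"},
}

PRIMITIVES = {"int": int, "str": str}


def check(value, type_name, schema, seen=None):
    """Return a list of error messages; an empty list means the value conforms."""
    seen = set() if seen is None else seen
    if type_name in PRIMITIVES:
        ok = isinstance(value, PRIMITIVES[type_name])
        return [] if ok else [f"expected {type_name}, got {value!r}"]
    if not isinstance(value, dict) or value.get("_class") != type_name:
        return [f"expected {type_name}, got {value!r}"]
    if id(value) in seen:  # already being checked, so we are inside a cycle
        return []
    seen.add(id(value))
    errors = []
    for name, field_type in schema[type_name].items():
        if name not in value:
            errors.append(f"{type_name}.{name} is missing")
        else:
            errors.extend(check(value[name], field_type, schema, seen))
    return errors


good = {
    "_class": "Line",
    "start": {"_class": "Point", "x": 0, "y": 0},
    "end": {"_class": "Point", "x": 3, "y": 4},
}
bad = {"_class": "Line", "start": {"_class": "Point", "x": 0, "y": "oops"}}

good_errors = check(good, "Line", SCHEMA)
bad_errors = check(bad, "Line", SCHEMA)
```

Because the schema is just another value, the same checker works for any schema you hand it; nothing is known about `Point` or `Line` until the check runs.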

* Generic operations
Because our "types" have lots of useful information in them, and can be manipulated just like any other value, it's easy to write very generic operations, including equality, differencing, parsing, analysis, etc. We do extreme polytypic/generic programming, but don't worry about static checking. We'll worry about that later :-) Richer metadata (aka types or meta-models) means more powerful generic operations.
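Here is a minimal Python sketch of what "generic" means in this setting: one `diff` and one `equal` that work for any class the schema describes, because they consult the schema instead of being written per class. The schema layout and function names are hypothetical, and the sketch handles flat records only.

```python
# Hypothetical schema: class name -> list of field names.
SCHEMA = {"Person": ["name", "email", "age"]}


def diff(a, b, schema):
    """List (field, old, new) for each schema field on which a and b differ."""
    cls = a["_class"]
    if b["_class"] != cls:
        raise ValueError("cannot diff values of different classes")
    return [
        (f, a.get(f), b.get(f))
        for f in schema[cls]
        if a.get(f) != b.get(f)
    ]


def equal(a, b, schema):
    """Two values are equal when they share a class and no field differs."""
    return a["_class"] == b["_class"] and not diff(a, b, schema)


old = {"_class": "Person", "name": "Ada", "email": "ada@example.org", "age": 36}
new = dict(old, age=37)

changes = diff(old, new, SCHEMA)
```

Adding a new class to `SCHEMA` gives it equality and differencing for free, which is the sense in which richer metadata buys more powerful generic operations.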

* Use code for transformations, but never generate code
We like code. It's great for projecting models onto models or computing analyses of models. We are developing a family of cyclic maps, which are like FP maps but they work on our circular structures. But you should never ever explicitly generate code. This is the big mistake of a lot of work on model-driven development. Instead, we use partial evaluation to generate code. Partial evaluation is great because it turns interpreters into compilers automatically (if you are careful!). Model-to-model transformations are fine and can be written either in code or as an interpretation of some other transformation language. But requiring all transformations to be models (not code), or generating code from models, is bad. I know others might disagree, but this is what we believe.
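One possible reading of a "cyclic map", sketched in Python: apply a function to every node of a graph and return a new graph with the same shape, cycles included. The `Node` class and `cyclic_map` function are illustrative assumptions, not our actual design. The trick is to register each copy before visiting its neighbours, so that a back edge finds the copy instead of recursing forever.

```python
class Node:
    def __init__(self, label):
        self.label = label
        self.edges = []  # outgoing references; these may form cycles


def cyclic_map(fn, root):
    """Copy the graph reachable from root, applying fn to every label."""
    copies = {}  # id(original node) -> its copy

    def visit(node):
        if id(node) in copies:
            return copies[id(node)]
        copy = Node(fn(node.label))
        copies[id(node)] = copy  # register first so cycles terminate
        copy.edges = [visit(n) for n in node.edges]
        return copy

    return visit(root)


# A two-node cycle: a -> b -> a
a, b = Node("a"), Node("b")
a.edges.append(b)
b.edges.append(a)

upper = cyclic_map(str.upper, a)
```

The result is a fresh two-node cycle labelled "A" and "B"; the original graph is left untouched.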

* Extreme feature-oriented modularity
That is, every idea should be written once. Allow mixins and inheritance/composition at all levels. These are very natural operations on models: to compose them and merge them. It's not easy, but we think we can make it work. You have to compose the syntax and the semantics cleanly. We are inspired by Don Batory's work here.
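To give a flavor of composing models, here is a deliberately simple Python sketch that merges two schema fragments, each contributed by a different feature. The `compose` function and the example fragments are hypothetical, and real composition has to deal with syntax and behavior too, which this sketch ignores.

```python
def compose(base, feature):
    """Recursively merge feature into base and return a new dict.

    Nested dicts are merged; a scalar defined differently on both sides is
    reported as a conflict rather than silently overwritten.
    """
    result = dict(base)
    for key, value in feature.items():
        if key in result and isinstance(result[key], dict) and isinstance(value, dict):
            result[key] = compose(result[key], value)
        elif key in result and result[key] != value:
            raise ValueError(f"conflict on {key!r}: {result[key]!r} vs {value!r}")
        else:
            result[key] = value
    return result


# Each feature contributes its own fragment of the model.
core = {"Account": {"id": "int", "owner": "str"}}
audit = {"Account": {"modified": "str"}, "LogEntry": {"message": "str"}}

model = compose(core, audit)
```

After composition, `Account` carries the fields from both fragments and `LogEntry` is added alongside it, while `core` itself is unchanged.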

Our goal is to create the "Smalltalk of Modeling". That is, a simple and elegant system that is based on models all the way down. It has a small, well-defined kernel, and we are working on building real applications too, as we build the system. We are implementing in Ruby, although this is just because it is such a great language for this kind of reflective exploration. Our new system is not object-oriented; it is model-oriented. But we are looking for the key ideas in the modeling world, and not necessarily adopting any conventional wisdom. We are exploring!

1 comment:

Unknown said...

Dear William,

Your work on Enso is very interesting! It is remarkable that there is a substantial overlap with recent ideas of feature-oriented software development (FOSD) -- not only the fact that modularity of features is important, but also deeper connections to the FOSD model. I know that you are aware of this connection, but other readers may be interested:

- Forests are the key data structure in a formal model of FOSD [1]. The goal is similar to yours: be language-independent and represent object collaborations.

- Recently, forests have been extended to graphs to encode more semantics and to support more automatic reasoning activities [2].

- Operations on forests (e.g., composition or conflict detection) are defined generically and can be plugged in on demand [3].

- The entire FOSD model is language-independent and has been used with many different artifact types (e.g., Java, C, Haskell, Alloy programs) [3].

- FOSD tools (the internal parsers and reasoning tools) are generated based on annotated grammars, whose annotations supply semantic information [3].

- The entire forest/grammar/annotation-based model has been enriched by adding behavior to features [4].

It would be very interesting to learn how your programming model relates to these recent FOSD ideas or what different goals you have, also what connection you see to Smalltalk.

[1] Sven Apel, Christian Lengauer, Bernhard Möller, and Christian Kästner. An Algebraic Foundation for Automatic Feature-Based Program Synthesis. Science of Computer Programming (SCP), 75(11):1022–1047, November 2010.

[2] Sven Apel, Wolfgang Scholz, Christian Lengauer, and Christian Kästner. Language-Independent Reference Checking in Software Product Lines. In Proceedings of the International Workshop on Feature-Oriented Software Development (FOSD), pages 65–71. ACM Press, October 2010.

[3] Sven Apel, Christian Kästner, and Christian Lengauer. FeatureHouse: Language-Independent, Automated Software Composition. In Proceedings of the ACM/IEEE International Conference on Software Engineering (ICSE), pages 221–231. IEEE Computer Society, May 2009.

[4] P. Höfner, R. Khedri, B. Möller. Supplementing Product Families with Behaviour. International Journal of Software and Informatics, 2011. To appear.

Best

Sven