LPML Essay

(WHY LPML IS THE ANSWER FOR WHAT WE ALL REALLY WANT--MOST OF US ANYWAYS)

by Scott Harrison; April 2002

...Promoting peace and understanding with software installation...

Software developers want a no-headache, straightforward solution for installing software from a software source repository to a target filesystem.

Computer users with any real level of experience and savvy want to know WHAT EXACTLY is being installed and HOW each file is installed.

The Linux Packaging Markup Language (LPML) is meant to address the above developer and user demands. There are some, though, for whom LPML provides no satisfaction. There are software developers who truly enjoy and fully appreciate the nuts and bolts of macros, recursive makefiles, SRPMS, and other package building tools with native syntax commands. There are also users for whom ALL that matters is that the application run on whatever legacy machine they have and that all other existing applications continue to work. If this is your life, cool. If not, read on.

Make no mistake. I view both the efforts AND accomplishments of the Automake/Autoconf crowd as AWESOME. However, I would make a brief analogy to state that they are similar to "TeX"--all powerful and also difficult to fully understand. I would venture my hopes that LPML be like "HTML"--representing a vanilla-layer markup syntax which is understandable at all stages of building the software.

There is a notable exception to the need for LPML. A significant number of applications run off of only one binary executable. In this case, the packaging of one file is, of course, not of any real consequence. Contrastingly, web applications usually rely on large numbers of files, so I think LPML is perfect for web applications.

The priority we all place on designing robust software packaging is worth considering. A user's typical response to installing software is "the-darn-thing-better-work". A typical software developer mindset I encounter is "the installation can't be nearly as important or complicated as this wonderful piece of space-age software I am writing". These viewpoints are completely legitimate both emotionally and economically for all concerned. They do, however, lead to a sad conclusion for those working in the installation cobwebs. There is no glory, I repeat, absolutely no glory in software packaging. Hence the endless pathology--no resources committed to software packaging; the software breaks; commit minimal resources; the software breaks; throw hands up in despair; no resources committed to software packaging; and so on. If you are only moonlighting as a software packager (like me) and spend most of your time developing and using software code, perhaps you too have this internal dialogue...(I hear voices :) )!

The only reward here is in the hidden job-well-done. If you try out LPML, I expect you will experience, like I did, that month after month goes by without witnessing a single LPML mistake despite the continual addition of many files and customizations of a software package. Something must be right. Join the coolness! If you ever have any questions about the contents of a software package you are developing, there is an easy solution: emacs yourpackage.lpml &.

There you have it. If you are proposing shell scripts, then tell me--do you run "make config" or "make xconfig" when custom-compiling your kernel? (I expect you run "make xconfig"; it pays to organize information.) Makefile-technology is super-important to the health and welfare of freeware IMHO and having you either on-board with LPML or developing alternative approaches is very significant! It never ceases to surprise me how quickly once-dispassionate users and developers all of a sudden become passionate when the computer (for some obscure software-packaging reason) is no longer acting as the beacon of determinative logic that we all greatly desire from it. Moral: Regardless of wherever or however we contribute to the freeware community, making sure that the software ALWAYS is in working condition can help move things FORWARD (avoiding the dodgy bug hunts).

No one can change all of the world, but we can focus on a given part of it; I still believe in controlling a computer system where every single file, directory, and link can be fully and completely accounted for. The rest of the world oscillates between "legacy" systems and brand bleeping-new systems. Wouldn't it be great, for instance, if every Linux legacy box transformed into a brand bleeping-new system with a transformative and understandable one-line command?

I am aware (thanks to colleagues) of some other possibilities out there like Apache Ant. However, I find that these (relatively few) other possibilities are feature-rich yet confusing compared to the cut-and-dry file-centric approach of LPML. Again, I only offer my enthusiasm to all for the sake of freeware Makefile-technology.

I think LPML is the best thing currently going.

Background

LPML was developed in a unique crucible--a quickly developing, sophisticated software project among a large and broad range of users with different skill levels (instructors and students). This is the LON-CAPA educational software project with which I have been broadly involved.

I cannot thank enough the many university instructor/computer enthusiasts and fellow members of the LON-CAPA software development team for their openness and patience toward the LPML idea. That being said, necessity is the mother of invention, and the stakes were (and are) sufficiently high and intense from all concerned--the instructors and students mainly want to work with the educational content (not LPML!). In the university setting, state-of-the-art software generation needs to be carefully conserved toward things that work and make sense in terms of the ongoing research projects. Mantra: Software packaging/installing/upgrading must not interfere with student homework submission and other tasks of priority for an educational institution.

We can of course avoid upgrading--a simple solution for stability is to get a single system up and running, and then never ever log in as root again!

Difficulty with this single system approach emerges when you are not working with a single system, but rather have multiple systems running on four different geographical continents in a wide range of system administration "habitats". Difficulty increases when three or more software files are being added de novo to the software system on a weekly basis. Difficulty increases when there is a steady stream of feature addition and modification. Difficulty increases further still when the permissions and ownerships of target files are open to change and debate. Difficulty increases further still when the backing up and preservation of existing configuration files changes strategy four times. Difficulty increases with all the choices software developers want to have when installing from a CVS repository. (Difficulty reaches the boiling point when you are a moonlighting researcher (here I am at 4:53 AM doing ANOTHER all-nighter) studying bacterial chromosomes who is also writing educational software and then, after all that and other things, heralding a great brave new installation strategy for...some reason or other.)

A policy of non-interference between all the difficulties REQUIRES strong certainty that things ALWAYS work.

Simply put, WE WANT A LOT OF CERTAINTY with software installation because there is enough uncertainty elsewhere.

(The precise formalisms of Kurt Goedel's work on uncertainty are gems; I hope he does not roll over at my attempt at brief summary and conjugacy.) Goedel points out that predicate logic systems have intrinsic uncertainty compared to propositional logic systems. LPML allows for CERTAINTY because it avoids the PREDICATE LOGIC typical of traditional Makefile/macro file-globbing schemes. The low granularity of LPML being file-centric results in an EXACT AND SPECIFIC PROPOSITIONAL LOGIC.

A file-centric strategy means that it is easy to cut, copy, paste, filter, monitor, test, build, glue, document and target most any set of files according to a wide variety of criteria. Contrastingly, a file-glob strategy has to carefully curate each of these actions.
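To make the contrast concrete, here is a small sketch in Python (not part of LPML itself). Only loncom/krb.conf comes from this essay; the other file names, the category labels other than "conf", and the helper names by_glob and by_category are made up for illustration. The idea is simply that a glob selects whatever happens to be lying around, while an explicit per-file list selects only what was declared.

```python
from fnmatch import fnmatch

# Hypothetical contents of a source directory (only krb.conf is from the essay).
ON_DISK = [
    "loncom/krb.conf",
    "loncom/access.conf",
    "loncom/scratch.conf",   # a leftover nobody meant to ship
    "loncom/README",
]

# Hypothetical file-centric declaration: one record per file, nothing implied.
DECLARED = [
    {"source": "loncom/krb.conf", "category": "conf"},
    {"source": "loncom/access.conf", "category": "conf"},
    {"source": "loncom/README", "category": "doc"},
]


def by_glob(paths, pattern):
    """Glob strategy: select every path that matches the pattern."""
    return [p for p in paths if fnmatch(p, pattern)]


def by_category(entries, category):
    """File-centric strategy: select only declared files in a category."""
    return [e["source"] for e in entries if e["category"] == category]


if __name__ == "__main__":
    print("glob:    ", by_glob(ON_DISK, "loncom/*.conf"))
    print("declared:", by_category(DECLARED, "conf"))
```

The glob picks up the stray scratch.conf; the declared list does not, and it can be filtered by any other recorded property just as easily.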

Syntactically, LPML is XML which gives anyone a fighting chance of understanding it (compared to, for instance, recursive globbing makefiles). I would suggest the following LPML code is obvious:

<lpml>
  <targetroot>/</targetroot>
  <file>
    <source>loncom/krb.conf</source>
    <target dist='default'>etc/krb.conf</target>
    <categoryname>conf</categoryname>
    <description>
      which Kerberos server to contact for specified Kerberos domains
    </description>
    <note>
      list elements are separated by newlines; each list element consists of only two subelements separated by a colon (Kerberos domain value, and Kerberos server IP address)
    </note>
  </file>
</lpml>
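Because this is ordinary XML, any standard XML parser can read it. The following is a minimal sketch in Python, assuming only the elements shown in the example above. It is not how lpml_parse.pl works internally (that is a Perl script, and its logic is not reproduced here); the function name list_files and the way the target root is joined to the target path are my own assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# The example entry from the essay, trimmed to the fields used below.
LPML_TEXT = """
<lpml>
  <targetroot>/</targetroot>
  <file>
    <source>loncom/krb.conf</source>
    <target dist='default'>etc/krb.conf</target>
    <categoryname>conf</categoryname>
    <description>
      which Kerberos server to contact for specified Kerberos domains
    </description>
  </file>
</lpml>
"""


def list_files(lpml_text):
    """Return one dict per <file> element: source, full target, category."""
    root = ET.fromstring(lpml_text)
    # Assumption: the target path is relative to <targetroot>.
    targetroot = (root.findtext("targetroot") or "/").strip()
    entries = []
    for f in root.findall("file"):
        source = (f.findtext("source") or "").strip()
        target = (f.findtext("target") or "").strip()
        category = (f.findtext("categoryname") or "").strip()
        entries.append({
            "source": source,
            "target": targetroot.rstrip("/") + "/" + target,
            "category": category,
        })
    return entries


if __name__ == "__main__":
    for e in list_files(LPML_TEXT):
        print(e["source"], "->", e["target"], "[" + e["category"] + "]")
```

Each file is one self-contained record, so answering "what gets installed, and where?" is a matter of walking the list.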

As of this writing, LPML works quite nicely with an extraction markup language (XFML--distributed with LPML) and a process launching language (PIML--distributed with LPML). I am aware that there are also many sophisticated XSL/etc type approaches that could also be incorporated.... It is a promising road. In fact, I think anyone could sit down and write a page on a lot of possible features to develop and immediately use with LPML. Having that kind of intuition with a software product demonstrates large potential. One recently implemented approach was to use LPML to curate and maintain files on a network of computers through the usage of sudo and secure shell.

Summary

LPML was written in protest. I am critical of recursive Makefile approaches--the information is distributed in different files in various not-always-obvious directories. This limits abilities for documentation and, sometimes, modification of global installation properties. I am also critical of macros--they break easily on software dependencies and are difficult to debug. I am also critical of RPMs--the building syntax is a little complex. The XML of LPML, on the other hand, greatly streamlines the handling of information.

LPML was written under "duress". Despite the fancy potential, the core nature of LPML is that IT WORKS. It was written as the best possible working solution. Not only is it a stable solution during the development of a software package, LPML has also proven to be very stable when modifying the actual lpml_parse.pl software code--the beauty of a structured XML-parsing approach.

LPML is easy for the developer to debug (though very few software-packaging bugs are encountered with LPML). If a file fails to install, all you have to do is LOOK AT THE PART OF THE LPML file WHICH DESCRIBES THAT FILE. Truly now, we have here a simple proposition for the developer to work with.
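That lookup is easy to do by eye in an editor, and just as easy to script. Here is a hedged sketch in Python, assuming the element names from the example earlier in this essay. The second file entry (loncom/example.pl) and the function name find_entry are hypothetical, and nothing here is taken from lpml_parse.pl.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-file LPML document; only the krb.conf entry is from the essay.
LPML_TEXT = """<lpml>
  <targetroot>/</targetroot>
  <file>
    <source>loncom/krb.conf</source>
    <target dist='default'>etc/krb.conf</target>
    <categoryname>conf</categoryname>
  </file>
  <file>
    <source>loncom/example.pl</source>
    <target dist='default'>usr/local/bin/example.pl</target>
    <categoryname>script</categoryname>
  </file>
</lpml>"""


def find_entry(lpml_text, target_path):
    """Return the <file> element whose <target> matches target_path, or None."""
    root = ET.fromstring(lpml_text)
    wanted = target_path.lstrip("/")
    for f in root.findall("file"):
        target = (f.findtext("target") or "").strip()
        if target.lstrip("/") == wanted:
            return f
    return None


if __name__ == "__main__":
    entry = find_entry(LPML_TEXT, "/etc/krb.conf")
    if entry is None:
        print("no LPML entry describes that file")
    else:
        # Show exactly the part of the LPML that describes the file.
        print(ET.tostring(entry, encoding="unicode"))
```

If the lookup returns nothing, that is itself the answer: the file was never declared, so it was never supposed to be installed.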

There is a final Catch-22 situation of "installing the installation software". LPML combats this by using relatively few files and dependencies. LPML is distributed as ONE core script file named lpml_parse.pl. For input, lpml_parse.pl typically reads in ONE LPML file for your entire software package. All you need to start with are those two files. What this means is that you do not need to have pre-installed installation software (such as macro processors). And, installing files should generate no more confusion than existed before--since now we have certainty.

Or at least a lot more of it.