Canadian Information Processing Society (CIPS)
 
 

CIPS CONNECTIONS

INTERVIEWS by STEPHEN IBARAKI, I.S.P.

XML, XSLT, SVG, and XQuery international authority speaks

This week, Stephen Ibaraki, I.S.P., has an exclusive interview with the international IT expert and author, Kurt Cagle.

Due to his widely recognized expertise in XSLT, SVG, and XSL-FO issues, Kurt has been a featured speaker at more than a dozen conferences in the last five years. He is also a founding writer and regular contributor for Fawcette’s XML and Web Services Magazine and his achievements include best sellers amongst his more than 14 book credits.

His recent co-authoring contributions include SAMS XQuery Kick Start.

Discussion:

Q: Kurt, you are amongst the best to write extensively about XML, SVG, and XSLT. We are indeed fortunate to have you do this interview—thank you.

A: Thank you. That's high praise indeed.

Q: Detail your journey into computing and writing?

A: The first American novel was written by James Fenimore Cooper, an American who, having just read a Penny Dreadful imported from England, uttered the immortal words, "I can write better than that!" - then proceeded to show with his first work, a book called Precaution, that he couldn't. Still, he kept at it, eventually producing Last of the Mohicans and The Deerslayer.

I can empathize a lot with Mr. Cooper. I put myself through college (University of Illinois) writing computer programs for college professors for the princely sum of $5 an hour (yeah, even in the early 1980s, this wasn't a lot of money), but I didn't pursue a computer degree because I was told by several well meaning professors that computer programming was a dead profession … it just didn't have any real future. So I graduated with a degree in physics instead, and then went to work shortly thereafter teaching community college kids how to use computers.

I went into desktop publishing about that same time, getting a thorough grounding on markup languages in the process. There were a few years in there which I spent getting all of the wanderlust out of my system after four years working on a degree, then I moved to the Pacific Northwest because I was becoming thoroughly tired of cornfields and had family in the area. After about a year of odd jobs, I worked for a game company, producing children's games and TV spin-offs (I even animated Vanna White in 16 colors!).

It was at that time that I had my James Fenimore Cooper moment -- I was reading through a copy of the Macromedia Users Journal, a small newsletter produced at the time by Tony Bove and Cheryl Rhodes, and thought "I can write better than that." I'd actually published a science fiction magazine in college and had a few really bad published stories in circulation, so I figured that technical writing couldn't be that much harder. In truth, the first few articles weren't all that good - I made all the same mistakes that most beginning technical writers make, but Tony continued asking for my work. By the end of the year I had become a regular columnist, and actually had an inked book with Random House within 18 months (Macromedia Lingo Studio).

Q: Describe the challenges and their solutions you have faced in your career. What valuable lessons can you share with our audience?

A: First Lesson. You will not get rich writing computer books. You can make a pretty decent living, mind you, if you're willing to put in long hours, but the field is a lot like the general publishing field -- you'll have a few people who will hit the market with the right technology at the right time that will do astronomically well, and the rest will make a modest living. The real challenge is finding that sweet spot, while still maintaining interest in the field.

A typical computer book has an extraordinarily short shelf-life -- a year if you're lucky, four to five months if you're not. What's more, these are typically large books -- I've written five books in their entirety, all but one above 500 pages in length. That's a lot of writing, sometimes as much as six months of your time to produce, especially if you're doing other things concurrently.  Given that advances have actually become smaller over the years, adjusted for inflation, this means that while you're writing a book you had best have other sources of income on the side, or you'll find yourself in a crisis about mid way through.

On a more technical note, being a computer book writer has meant that I've had to develop those faculties for seeing trends as early as possible … it may take as much as a year to get a book in print from the time you begin negotiations on a contract to the time that it actually hits the shelves at Barnes and Noble or Borders, and even there you want to time it so that it precedes the market by a few months to pick up the largest volume of early adopters.

I work with open standards trends -- XML, HTML, Web Services, and so forth -- which can be especially problematic to gauge: these standards are usually not adopted by the large vendors such as Microsoft or Sun (unless they are the ones to push them in the first place) until the standards either appear to be picking up momentum without the benefit of marketing or have so obviously trumped a corporate product that the vendors can't sit idly on it anymore. This means that most open standards percolate up through the market quietly rather than being imposed from without with great fanfare. Predicting which ones will succeed then becomes a matter of determining if (and when) they are going to hit a tipping point.

Q: Why should IT professionals read your book, XQuery Kick Start? Which features make your book unique, and what benefits can readers derive from the book?

A: First a comment. XQuery Kick Start was written by several writers, reflecting a great deal of expertise across the board. Per Bothner wrote one of the first (and to date one of the best) XQuery interpreters -- Qexo, using the Kawa Scheme engine which he also wrote. James McGovern and Vaidyanathan Nagarajan have also worked fairly heavily in the XQuery and XML arenas in the past. I'm the most published of the group, but we've all worked hard on this book.

XQuery is intended to address a problem that's become more contentious as XML has become more pervasive. One useful way of thinking about XML is to envision it as a way to represent localized data. The ability to structure this information in robust trees makes it much easier to represent relationships than you can with a traditional relational database. However, XML is also a royal pain to store, index, and query, especially when you're dealing with large datasets. Relational databases are better for that sort of thing, but they don't tend to lend themselves as well to the kind of flexibility that XML practically demands.

There were a number of attempts early on to provide a SQL interface to certain XML tools, but these always seemed like awkward add-ons: blocks of SQL code transported in XML tags. There was very little way to structure that data, once received, and, to make it worse, the underlying data-models for the two are radically different. XQuery was meant to provide a consistent mechanism that treated all data, not just XML or even relational databases, as being equivalent. In that respect XQuery is kind of like a super SQL. As a prediction, if XQuery takes off, it will end up replacing SQL within fifteen years.

While it's no sure thing, there's a huge amount of momentum for XQuery, waiting for the spec itself to reach maturation, which will be, at a guess, in first quarter 2004. Microsoft's next version of SQL Server will incorporate an XQuery engine, as will Oracle and IBM's DB2 (I've worked with beta versions of all three). These provide vehicles for doing very XMLish things with databases, without the sometimes awkward interfaces that exist now. It's perhaps best to visualize XQuery as a way to write stored procedures for any sort of data, and its modular structure lends itself well both to object oriented programming and web services.

XQuery Kick Start is one of the first major books to deal with XQuery -- more importantly, it is also one of the most comprehensive, dealing with XQuery both from the standpoint of its underlying structure and its utility across a number of different platforms.

It's worth remembering that SQL was developed in order to try to put some order into the fairly chaotic scene of database development that existed in the mid-1980s, and its pervasiveness today (try to find a job listing in IT without some SQL experience) attests to the degree to which it was successful. The same situation exists right now for XML access of data, and the programmer (or project manager) who understands how to bridge the SQL and XML worlds will be able to go anywhere. That's what our book is intended to do, help people understand how to make use of this incredibly potent technology.

Q: Can you share some pointers from your book, XQuery Kick Start?

A: Here's one. Understand the strengths and weaknesses of XQuery before you use it. The principal purpose of XQuery is to generate new XML sequences built from other resources (whether XML, SQL or otherwise). You can, theoretically, use it for reporting, but in practice this can prove to be more problematic than it's worth. To me, the best place for XQuery is as part of a pipeline - a filter that connects to a data-store (such as an XML file, or a SQL database), drawing in the data and threading it together into an XML data stream that can then be passed on to some other process such as an XSLT transformation (which I would use for reporting).

Q: Describe in detail the relationship between XPath and XQuery?

A: XQuery is actually built upon the XPath 2 specification -- in a way you can think of XQuery as adding certain control structures and control syntax to the XPath language. XPath2 however has undergone a significant amount of revamping as well. The most significant change there includes the introduction of sequences as an alternative to the node-set that was intrinsic to XPath 1.0. Sequences are more general purpose structures, containing not just XML nodes but text, numbers, dates, and just about anything that can be represented in XML, as well as mixed content. This makes it easier to create nodes on the fly, and it also gives you many of the same list manipulation advantages that you get in languages such as Lisp, Scheme, and so on.

If recordsets can be thought of as the data model for SQL, then sequences are the analogous data model for XQuery. In a similar fashion, the query mechanism for SQL - SELECT statements primarily - has an analog in the XQuery FLWOR statements - where FLWOR (pronounced 'flower') is an acronym for For/Let/Where/Order by/Return, the principal keywords of XQuery. Thus, a SQL statement such as

SELECT firstName,lastName FROM users u WHERE u.state = 'CA'

would have an analogous XQuery statement:

for $user in $users where $user/state = 'CA'
return ($user/firstName, $user/lastName)

The difference between SQL and XQuery lies in the way you think about the dataset -- the SQL statement has the terse CASE-like pseudo-English that was so common about twenty years ago (upon what I think was the fallacious assumption that you could make programming easier by using English-like statements) while the XQuery statement follows a more traditional programming model (kind of a Visual Basic metaphor that is far more familiar to the current generation of programmers).
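A minimal sketch of that contrast, using Python as a runnable stand-in rather than XQuery itself. The sample users, the table layout, and the function names are hypothetical. The first function hands a declarative SELECT to SQLite; the second walks an XML document in a for/where/return shape.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data, held once as relational rows and once as XML.
ROWS = [("Ann", "Lee", "CA"), ("Bob", "Ray", "WA"), ("Cy", "Diaz", "CA")]
USERS_XML = """<users>
  <user><firstName>Ann</firstName><lastName>Lee</lastName><state>CA</state></user>
  <user><firstName>Bob</firstName><lastName>Ray</lastName><state>WA</state></user>
  <user><firstName>Cy</firstName><lastName>Diaz</lastName><state>CA</state></user>
</users>"""


def names_via_sql(state):
    """SQL form: describe the record-set you want and let the engine find it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (firstName TEXT, lastName TEXT, state TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(
        "SELECT firstName, lastName FROM users u WHERE u.state = ? ORDER BY rowid",
        (state,),
    ).fetchall()
    conn.close()
    return rows


def names_via_flwor_style(state):
    """FLWOR-shaped form: for each user, where the state matches, return the names."""
    users = ET.fromstring(USERS_XML)
    return [
        (user.findtext("firstName"), user.findtext("lastName"))  # return
        for user in users.findall("user")                        # for
        if user.findtext("state") == state                       # where
    ]
```

Both functions return the same pairs for the same state, which is the point of the analogy: the data model differs, the query shape carries over.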

XPath 2.0 also brings with it a wealth of functions, considerably more than existed with XPath 1.0 and (for the most part) adding just what's needed to make the language sufficiently robust for work. This includes such things as a regular expression engine that will allow you the ability to perform matches and to do regex replacements, something that significantly expands the ability of the language. This also includes both a tokenizer and a sequence concatenation function which, in conjunction with regular expressions, lets you parse even complex text files cleanly and efficiently. Indeed, I think that it’s not an unreasonable statement that XPath2 (possibly in conjunction with XSLT2) may even end up becoming the first pass parser for other language compilers, such as C# or Java, especially given the advantages of moving at least portions of source code into XML.
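To make the text-parsing claim concrete, here is a small sketch in Python, a stand-in for XPath 2.0 rather than the real thing. The helper names are loosely modeled on the match, replace, and tokenize operations described above, and the log format is invented for the example. Dropping empty tokens is a choice of this sketch, not a statement about the specification.

```python
import re

# Hypothetical log lines; the format is made up for this sketch.
LOG_LINES = [
    "2003-05-02  ERROR   disk/full,  retry=3",
    "2003-05-03  INFO    backup/done",
]


def matches(text, pattern):
    """True when the regular expression matches anywhere in the text."""
    return re.search(pattern, text) is not None


def replace(text, pattern, replacement):
    """Replace every match of the regular expression."""
    return re.sub(pattern, replacement, text)


def tokenize(text, pattern):
    """Split on a regex separator; empty tokens are dropped in this sketch."""
    return [token for token in re.split(pattern, text) if token]


def parse(lines):
    """Keep lines that start with a date, then concatenate their token sequences."""
    tokens = []
    for line in lines:
        if matches(line, r"^\d{4}-\d{2}-\d{2}\s"):
            tokens = tokens + tokenize(line, r"[\s,]+")  # sequence concatenation
    return tokens
```

Calling `parse(LOG_LINES)` yields one flat sequence of seven tokens, starting with `"2003-05-02"` and ending with `"backup/done"`.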

One of the other major additions to the XPath/XQuery model is the introduction of data-types, about which I have considerably more ambivalence. Data-types first emerged as a necessity, a way of specifying to the computer what size of word to use in retaining principally numeric information. This guaranteed, for instance, that if you wrote a loop with an incrementing variable, that the computer would use an integer rather than a floating point number and not inadvertently view 9.995 as 9 rather than 10. However, even a halfway decent interpreter or compiler should be able to ferret this information out by context, without having the user specify it -- indeed, a good optimizing compiler is generally much better at ascertaining the exact type of primitive data-type that's appropriate to a given operation.

The second problem that emerges here is that XML does not hold information in a given type, but instead maintains this information as abstract meta-data; this means that type definitions are essentially imposed from outside rather than being intrinsic to the XML representations, which actually adds to the overhead of working with the XML.

My experience working with data-types in XQuery has been that they are a royal nuisance; introducing a typed expression in XQuery is likely to force you to add types everywhere else, even when they are not explicitly needed for functioning. If you are serializing a result sequence to a binary representation, this may be useful, but this to me defeats the whole purpose of using XML in the first place.

Q: Summarize how you can encapsulate and retrieve XQuery functionality with modules.

A: Modules are cool. The thing to understand with modules is that the functionality doesn't really reside within XQuery per se, but is actually one of the more subtle pieces of XPath2. XPath2 includes in the spec a generalized interface that can be used to add new function sets into XPath, such that they can be referenced in an XPath expression. These interfaces use a language independent API that is explicitly defined within the XPath specification.

XQuery modules, consequently, are XPath interfaces that are written using XQuery. These modules can be defined externally (and indeed make most sense when designed that way), and it is possible and indeed preferable to group together multiple related functions in specific namespace libraries.

One consequence of this is that you are able to create a kind of super stored procedure library. For instance, suppose that you had a series of business routines that needed to be accessed from your database to handle sales reporting (such as an XML document that presented regional sales figures). Routines such as sales:getRegionalSalesFigures() and sales:getAggregateSalesData() could be contained within a single centralized repository of functions, a file called sales.xquery, that could then be imported in as a module.

Put another way, a significant amount of a company's data analysis and business logic can be incorporated into XQuery, rather than tied up in hard coded binary libraries. This makes it a lot easier to make changes to business logic as needed (or to provide access only to those business functions that users have the permissions for) and to distribute business logic as web services, or something similar.
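The idea of a namespaced function library can be sketched as follows. This is Python standing in for an XQuery module, not real module syntax; the `sales:` names come from the example above, while the sample records, the registry, and the `call` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sales records: (region, amount).
SALES = [("West", 120.0), ("East", 80.0), ("West", 30.0)]


def get_regional_sales_figures(records):
    """Return an XML element with one <region> child per region and its total."""
    totals = {}
    for region, amount in records:
        totals[region] = totals.get(region, 0.0) + amount
    root = ET.Element("regionalSales")
    for region in sorted(totals):
        ET.SubElement(root, "region", name=region, total=f"{totals[region]:.2f}")
    return root


def get_aggregate_sales_data(records):
    """Return an XML element carrying the overall total and record count."""
    root = ET.Element("aggregateSales")
    root.set("total", f"{sum(amount for _, amount in records):.2f}")
    root.set("count", str(len(records)))
    return root


# Stand-in for importing a module: related functions grouped under one prefix.
SALES_LIBRARY = {
    "sales:getRegionalSalesFigures": get_regional_sales_figures,
    "sales:getAggregateSalesData": get_aggregate_sales_data,
}


def call(qualified_name, *args):
    """Resolve a prefix:name to its function and invoke it."""
    return SALES_LIBRARY[qualified_name](*args)
```

Because callers only know the qualified names, the business logic behind `sales:getRegionalSalesFigures` can change, or be withheld from users without permission, without touching the calling code.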

This to me is actually a critical notion, one that hasn't really been stressed much yet. So far, when people talk about web services, they tend to think about the retrieval (or update) of data. Yet we are moving to a level of abstraction that makes it feasible to start pumping logic through those same pipes, whether in the form of application logic, orchestrated choreography, user interface information, or even "pre-potent" code that can then be compiled at the receiving end of a web service interaction. I see XQuery playing a fairly big part in that.

Q: Provide some tips on combining XQuery with XSLT for presentation and graphics.

A: There is a tendency when looking at XQuery and XSLT to wonder whether in fact one could do the other job. At first that was my impression as well, that XQuery stood a good chance of displacing XSLT, but as I've worked more with both languages I've realized that they actually balance one another quite well.

XSLT is a transformation language. A lot of people don't really understand the significance of that, and as a consequence tend to think that the principal purpose of XSLT is to convert XML object descriptions into HTML. Yet I've found, after about six years of using it (I started working with Microsoft's XML Patterns, an XSLT precursor, back in 1998, not long after XML itself went gold) that XSLT is MUCH more powerful. Formally, the purpose of an XSLT "program" is to map instances of one namespace (or combination of namespaces) to another instance of a namespace (ditto). However, if you're not completely immersed in namespaces, the post-object-oriented paradigm and the like, that particular definition is pretty much meaningless.

I think a better example is to imagine an XML instance as a description of a "thing" -- an invoice, a poem, a baseball team. In essence, the namespace here can be thought of as a model -- a generic invoice, poem, baseball team, etc. -- and a specific XML document would be one particular instance of that -- the invoice for a computer job just completed, a Shakespearean sonnet, the Seattle Mariners baseball team (which BETTER get to the playoffs next year). The XSLT can then be thought of as a black box that will take this thing and generate other "things", perhaps with the aid of a few parameters that direct specific characteristics of the generation.

Thus, suppose that you wanted to create a web page showcasing the most productive players on the Seattle Mariners. That web page is, in effect, a new thing, a description of a document when viewed through a certain type of viewer. This particular black box corresponds to the traditional view of XSLT. However, a different black box could take the team stats and the current state of the game (a second parametric stream) and from this build a Scalable Vector Graphics (SVG) image showing the location of each player, the location of strikes and balls, the order of double plays and so forth. Or the XSLT could generate a payroll listing in XML, in descending order, providing a comparison of payroll to performance. At this point, the correlation between XSLT and presentation drops away entirely -- indeed, here the XSLT is acting in a manner consonant with business logic.
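The black-box view can be sketched in a few lines. This is Python standing in for an XSLT stylesheet, under stated assumptions: the team document, its `x`/`y` attributes, and the `radius` parameter are all invented for the example. One XML "thing" goes in, a different XML "thing" (an SVG image) comes out, steered by a parameter.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical team description; the x/y attributes are made-up field positions.
TEAM_XML = """<team name="Example Nine">
  <player name="A. Catcher" x="100" y="180"/>
  <player name="B. Pitcher" x="100" y="110"/>
  <player name="C. Fielder" x="40" y="30"/>
</team>"""


def team_to_svg(team_xml, radius=5):
    """Map a <team> document to an SVG image with one circle per player."""
    team = ET.fromstring(team_xml)
    svg = ET.Element(f"{{{SVG_NS}}}svg", width="200", height="200")
    title = ET.SubElement(svg, f"{{{SVG_NS}}}title")
    title.text = team.get("name")
    for player in team.findall("player"):
        circle = ET.SubElement(
            svg,
            f"{{{SVG_NS}}}circle",
            cx=player.get("x"),
            cy=player.get("y"),
            r=str(radius),  # the parameter steers the generated output
        )
        # Carry the player's name along as the circle's tooltip title.
        label = ET.SubElement(circle, f"{{{SVG_NS}}}title")
        label.text = player.get("name")
    return svg
```

A second function over the same `TEAM_XML` could just as easily emit a payroll listing, which is where the link between transformation and presentation drops away.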

The principal problem with XSLT is the fact that it typically requires the use of a centralized in-memory representation of the XML it's transforming (what's called a Document Object Model, or DOM). If you have a database with some 1 billion records, you're not going to want to convert all of those records into an XML stream so that XSLT can parse it -- there's probably not enough memory on the planet to handle any but the most trivial example of this.

Instead, you want to provide a filter that will pass a manageable subset, partially digested (I'll not touch that metaphor too closely, by the way) to the XSLT to perform the final manipulation. Nobody will want to look at 10,000 records on a computer screen at once; getting much more than 100 gets problematic in terms of both memory and usability. An XQuery against a database has the intrinsic advantage of being reasonably mappable to the optimized internal data language used by that database, something which is generally not true of XSLT. Requesting the top 100 orders out of 10,000,000 can be done via XQuery because the language serves only as an abstraction mechanism which lets you perform the same type of query across multiple types of databases without needing to know the particular characteristics of each.

Thus, XQuery filters, XSLT transforms. 'Nuff said.
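A rough sketch of that division of labor, with Python and SQLite standing in for an XQuery engine and an XSLT processor. The `orders` table, its synthetic totals, and the function names are hypothetical. The filter stage lets the store do the sorting and hands on only a small XML subset; the transform stage never sees the full dataset.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_store(n_orders=1000):
    """Create an in-memory store with synthetic order totals."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        ((i, (i * 37) % 1009) for i in range(1, n_orders + 1)),
    )
    return conn


def filter_top_orders(conn, limit):
    """Filter stage: the database sorts; only `limit` rows ever become XML."""
    root = ET.Element("orders")
    query = "SELECT id, total FROM orders ORDER BY total DESC, id LIMIT ?"
    for order_id, total in conn.execute(query, (limit,)):
        ET.SubElement(root, "order", id=str(order_id), total=str(total))
    return root


def transform_to_html(orders):
    """Transform stage: reshape the small XML subset into table rows."""
    table = ET.Element("table")
    for order in orders.findall("order"):
        row = ET.SubElement(table, "tr")
        ET.SubElement(row, "td").text = order.get("id")
        ET.SubElement(row, "td").text = order.get("total")
    return table
```

With `conn = build_store()`, the call `transform_to_html(filter_top_orders(conn, 5))` builds a five-row table while the other 995 orders stay in the store.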

Q: How do XQuery semantics facilitate optimization and performance?

A: SQL and XQuery both are set manipulation languages. The significant difference between them is that with SQL you generally don't have a clear-cut notion of a single record … everything is a record-set, contained within a table.

XQuery, on the other hand, is much more robust in terms of working with complex sets that don't necessarily map easily to a relational table model. Yet at the same time, the very consistent shape of FLWOR expressions mirrors the optimizations that have been made in SQL, usually far better than procedural languages such as C# or Java could.

Moreover, both SQL and XQuery are notable for being declarative languages: you can't change the value of a variable once assigned, and you generally can't have functions that cause some internal state change to the environment. These two seemingly minor restrictions have a huge impact upon performance, optimization, and maintenance, because you can bypass a lot of the state maintenance overhead that arises from having to keep track of changing variable values across different scopes.
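One consequence of those restrictions can be shown with a small Python sketch (an illustration of the general principle, not of how any particular XQuery engine works). When a function has no side effects and its inputs never change, an engine is free to evaluate it once per distinct input and reuse the result. The `tax` function, its 8% rate, and the call counter are hypothetical; the counter exists only so the saving can be observed.

```python
from functools import lru_cache

# Instrumentation only: counts how often the function body actually runs.
CALLS = {"count": 0}


@lru_cache(maxsize=None)
def tax(amount_cents):
    """Pure calculation: the same input always gives the same output."""
    CALLS["count"] += 1
    return amount_cents * 8 // 100


def total_tax(amounts):
    """Sum the tax over all amounts; repeated amounts reuse the cached result."""
    return sum(tax(amount) for amount in amounts)
```

For the five amounts `[1000, 2500, 1000, 1000, 2500]` the body of `tax` runs only twice, once per distinct value. That shortcut would be unsafe if `tax` could read or change mutable state.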

We're still fairly early in the process of trying to benchmark XQuery, but the preliminary reports I've seen all suggest that XQuery is in fact even more performant than SQL.

Q: Comment on the querying of XML data for use in document publishing and storage.

A: You don't ask easy questions, do you?

There are three main camps within the XML community -- the XML as document community, XML as data, and XML as process (there's also an XML as marketing group, which I might charitably describe as camp followers). When XML was first being formulated, the XML as document group was dominant, to such an extent that one of the first XML books that I can remember reading described XML as baby SGML, spent dozens of pages discussing the minutiae of DTDs, entities and notations, and yet didn't have one word talking about XML as a way to store non-document data.

By 2000, the XML as Data camp had become ascendant, and the role of XML as a document store got pushed largely to the sidelines. The triumph of this group can be seen in the rising prominence of XML Schema, XQuery, and the SOAP/WSDL combination which was more concerned about finding a convenient way to break the DCOM/CORBA log-jam than it was in significantly advancing the state of the art in the XML space. Don't get me wrong - web services are important, but the press devoted to web services far more reflects the massive advertising budgets of companies that see web services as a way to extend their own offerings than it does the overall significance of that movement.

In the last couple of years, however, I've also seen a barely perceived undercurrent gathering strength -- the XML as Process camp. You see bits and pieces of this in arenas like RSS, which is a largely spontaneous attempt to build a fully distributed content management architecture. In many ways RSS to me is easily as exciting a story as web services, because it takes advantage of one of the other fundamental properties of XML -- the fact that everything can be linked by assigning a URL to a blob of data. I think in many ways that it is this relational model, predicated upon the same kind of architecture (Representational State Transfer, or REST) that underlies HTML, that will be the dominant form that these large-scale, stateless distributed application spaces will likely take. This also has some interesting consequences for framework architectures, something I'll talk about in a bit.

Coming back to the question - there is a fundamental problem that document/content management systems face: metadata. Dedicated CMS systems usually provide tools to introduce metadata into the content -- convert the incoming format into a manipulable XML format, map this format to an internal XML format, then based upon this internal format, introduce editorial abstraction. This is another way of saying that XML is not a panacea -- at some point in the publishing process, you still need a pair of eyes and a squishy brain looking at a document and imputing editorial commentary that puts the information in context. The topic maps and RDF crowd is trying to figure out how to break this bottleneck, but most of the efforts seen in that respect simply end up pushing the abstraction layer onto somebody else.

RSS, by the way, is precisely this process. An RSS feed consists of a set of links that somebody felt had some relevance to one another, perhaps with a bit of abstraction metadata that describes that relevance. The relevance may be temporal, but more often than not it is principally topical -- this feed deals with XML related topics, a second deals with the Seattle Mariners, another deals with politics. RSS is perhaps the most human touch on the web right now.

I don't see XQuery becoming central to the task of taking XML content and repackaging it for a particular presentation device such as a web page or a printer. XSLT is far better suited to that. Where XQuery will become important is in dealing with the metadata of that content, what the content is about, who wrote it, when was it written, and so forth. This information can be centralized, but it doesn't have to be (my XQuery engine could retrieve links to subscribed RSS providers, query the metadata of each provider in turn for relevant content, then pass the appropriate links and contents on to XSLT to format). Because XQuery is not, ultimately, tied to a centralized data store in the same way that SQL is, this distributive model of content management could be far more wide-reaching, and reinforces in my mind the notion that data storage is becoming distributed in URL space.
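The distributed metadata query described here might look roughly like this. Python stands in for the XQuery engine, and the feed URLs, feed contents, and function names are hypothetical; the feeds are inline strings so the sketch runs without network access. Each provider's item metadata is queried in turn, and only the relevant links are handed on to a formatting step.

```python
import xml.etree.ElementTree as ET

# Hypothetical subscribed feeds, keyed by a made-up URL.
FEEDS = {
    "http://example.org/markup.rss": """<rss version="2.0"><channel>
      <title>Markup Notes</title>
      <item><title>FLWOR basics</title><link>http://example.org/flwor</link>
        <category>XML</category></item>
      <item><title>Opening day</title><link>http://example.org/opening-day</link>
        <category>Baseball</category></item>
    </channel></rss>""",
    "http://example.net/daily.rss": """<rss version="2.0"><channel>
      <title>Daily Links</title>
      <item><title>Schema tips</title><link>http://example.net/schema</link>
        <category>XML</category><category>Databases</category></item>
    </channel></rss>""",
}


def relevant_links(feeds, topic):
    """Query each feed's item metadata and keep the items tagged with the topic."""
    hits = []
    for source, text in feeds.items():
        channel = ET.fromstring(text).find("channel")
        for item in channel.findall("item"):
            categories = [category.text for category in item.findall("category")]
            if topic in categories:
                hits.append((source, item.findtext("title"), item.findtext("link")))
    return hits


def format_links(hits):
    """Formatting step (the XSLT role): turn the selected links into an HTML list."""
    ul = ET.Element("ul")
    for _source, title, link in hits:
        anchor = ET.SubElement(ET.SubElement(ul, "li"), "a", href=link)
        anchor.text = title
    return ul
```

Nothing here requires the feeds to live in one store; each is addressed by URL and queried where it sits.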

Q: How can you integrate XQuery with Java applications and generate Web pages?

A: Currently, if you want to do XQuery against the latest working draft, you're talking about Java. Two key implementations come to mind right now - Michael Kay's Saxon parser now supports the most recent (May 2003) XQuery draft, and I believe that Qexo, written by XQuery Kick Start co-author Per Bothner, is also now compliant. I fully expect that XQuery will end up becoming a part of the Java XML libraries, though I'll have to admit ignorance about the current state of this effort.

Q: Contrast the different development frameworks.

A: Okay, I'm going to upset a lot of people here, but I think that the "development framework" as a concept is on its way to the dustbin, except in very specialized cases. That should get me hate-mail from both the Java and the Microsoft supporters.

Take a look at the Java 1.4 release. There is something like a dozen top-order domains in the application tree, and the "official" class tree probably numbers into the thousands of distinct classes, especially once you start getting into J2EE. This doesn't count the hundreds of application classes that any given company develops for its own products, not to mention incompatible versions of those classes (backwards compatibility is largely a myth, something that any programmer should know instinctively). Once upon a time, a typical programmer could probably have known what to do with 80-90% of a given foundation class; now that number has probably dropped to around 30%. Moreover, in order to support all of that additional functionality, the underlying Java run-time has got to be huge.

Java was created by Sun initially partially in response to the foundation class proliferation that you saw with Microsoft's C++ MFC architecture, which in turn spawned ATL (intended to simplify things), templates (intended to simplify things), and several subsequent revs that I've frankly lost track of (all of which were intended to simplify things). C# may have been produced as a way to blunt the Java development community, but its big selling point has been that it will "simplify" programming.

The irony here is that in most cases the underlying languages are not that complex. Yes, C++ has a somewhat cryptic notation and you have to be real comfortable working with second and third order references, Java's even simpler, and the distinction between Java and C# comes down to a few keywords, capitalization conventions, and my favorite -- garbage collection.  However, inevitably when a vendor wants to create a new language, they start there at the bottom, thinking that if they can get the right combination of tokens, the right ratio of direct vs. indirect references, and so forth, they'll be able to hit the magic sweet-spot.

Yet to me, the problem is much more fundamental than these designers are admitting. It comes down to the notion that every programming language is just that -- a language. It is a way of modeling our perceived reality, constrained by the limitations that define the characteristics of that language. Most programming language developers have tried to develop English. English is fairly unique among languages in that it has evolved through aggregation of terms from any number of cultures. It's a hard language to learn because it is a language in which vocabulary is more important than grammar, where tofu, burrito and ketchup are all legitimate words with legitimate spelling, and where words are introduced into it (and die from it) at an astonishing rate. Moreover, you have trade cants and specialist jargon that have meaning within very limited spheres but are so much gibberish outside of those spheres, or worse may have meanings that differ significantly from common usage.

Most foundation class libraries attempt to emulate English, without the realization that English is dynamic, while any such computer language is perforce static and bounded once defined. This means that when a framework is first established, it's usually sleek and agile, but as the language is asked to do more, it begins to bulk up. Inheritance locks the lower classes in place, which means that if a class is ill-defined, it leaves a legacy of ill-defined descendants. The people working on the language see it as being a great pyramid to which they are adding their own contributions, but it's always worth remembering that the purpose of a pyramid was to house dead people.

Every so often, an earthquake occurs. The performance characteristics of the language begin to drop, the caliber of programmer needed to pull off projects rises (and that programmer becomes more expensive), and the overhead of documentation becomes increasingly onerous. The supposed advantages that come from code reuse are outweighed by the amount of time fixing bugs, and it becomes less expensive (and time consuming) to rebuild the application from scratch than it does to try to fix the errors.  People get fed up with the language, and head off in pursuit of a language that will simplify their coding needs, one more time, which usually spawns two languages, the new one that's awaiting its own encrustation, and a revamp of the old in order to keep the rest of the developer base from wholesale migration.

One of the characteristics that I've found so fascinating with XML is the fact that it's viral in nature. People start using it to add some functionality to a web page, or to represent a particular type of object. Once in the system, however, people realize that XML is good for abstracting other objects, or they realize that XML can provide a common communication channel between two components. Transformation of content to web pages comes next, then the use of transformations for mapping data objects. Before long, a system that had been purely procedural code is now pumping XML all over the place. It becomes easier to decouple one piece of a system from another this way, and to abstract out your data access stories so that this crucial (and frequently transient) information isn't locked into your applications.

This same story seems to be happening all over the place, regardless of platform, framework or type of application. It's perhaps more remarkable because it isn't necessarily planned, and in many cases even works against the offerings of existing vendors. For every formal XML language there are a lot of ad hoc ones, local "classes" that flash into existence long enough to get the work done, then disappear. Additionally, even the formal XML languages are tightly focused -- rather than an overarching language that describes everything, you have thousands of languages that describe different things in various ways.

In theory this chaos should be unworkable, yet oddly enough this approach actually seems to be quite effective. I have a theory that this has to do with the fact that in trying to create overarching frameworks we end up trying to build single "solutions" to a given domain, which ends up working very well for problems similar to the one involved with the solution but working less well (or even not at all) for problems that involve other aspects of this problem domain. With XML, the same domain of interest can be modeled multiple ways without penalty, because the problems each model addresses are different.

Q: Where is it all heading? What do you see as the major technologies in the future? How about predictions about their implementation? Who will be the big winners and losers?  Any possible Killer Apps?

A: That's not fair. That's five questions! Seriously, I see this shift to the next level of abstraction continuing for the foreseeable future. It's a direct corollary of Moore's law. Abstraction is expensive computationally, because it relies upon both the ability to create linked relationships and the ability to subsume much of the task of optimization beneath that level of abstraction. When computational resources are scarce, for instance, you become chary about every byte of memory; when computational resources are freely available and fast, on the other hand, you can readily afford to let the computer deal with lower level programming optimizations, and in most cases the system will likely provide a better optimization than you could get by hand.

This becomes especially true in the face of strong interconnectivity. The idea that you could have had a distributed database even ten years ago would have been laughable, but in a highly networked world, the distributed database is increasingly becoming the norm. The traditional client/server model is also becoming much more peer-to-peer like, particularly since people are realizing that both "business logic" and presentation can be expressed as the result of a stream of data from a "web service", even if that service is in fact local.

This is going to have the effect of pushing local procedural infrastructure further down the stack -- rather than handling business logic, the procedural code will run the transformations, queries, and other XML processors instead. It won't happen overnight, and there will be a lot of different variations about what kind of business logic is run across which processors, but I think the trends are clearly in place there. "Application" servers, when stripped down, are going to look a lot skinnier in the procedural department than they are now.

As to new technologies … I think that interactive paper is going to be an interesting revolution, and is a natural adjunct to the wireless layers being laid now. I think we're at the tipping point on consumer wireless networks, and 802.11g is rapidly replacing the slower b networks. One of the very real effects of wireless is that it really is an enabling technology for virtual businesses and organizations, though it's also been a godsend to Starbucks. I've been involved in at least four different virtual businesses that operated by remote access, and Starbucks seems to be the office of choice, though Tully's coffee here in Seattle is gaining ground. If someone ever married interactive paper with the sleeves on Starbucks cups, the results would be truly mind-boggling.

One thing I do expect. When a major new technology emerges, such as the network/computer PC revolution that started in the 1980s and is really now reaching true fruition, you see definite stages of development. The first, the development stage, is when the core ideas of the technologies emerge, usually with the creation of a small cadre of "hobbyists", students and researchers who are attempting to solve a problem. These people may have some funding, but by and large they are doing it because they have a problem to solve and are trying to find the best way to solve that problem.

Once you hit the tipping point, the next wave is entrepreneurial. How do we make money off of it? This results in some very good things developing, a lot of interim, or bridge, products that get incorporated into the infrastructure, and a whole lot of really bad ideas (web enabled valet services, products for your pets, just to name a few of the more notable), looking for enough of a niche to grab the pool of investment money. Some people get very, very wealthy, a lot of people end up broke. One effect of this, however, is that you also saturate the available niche space for the given level of technology, at least from an entrepreneurial standpoint, and you get a bust; too many business ideas chasing too few niches of demand.

We're entering into the third phase now, what I'd call the civic phase. This is the realization that you can't build large scale structures through the agency of any single company, no matter how wealthy, though a few companies benefit disproportionately from this period. People work together to establish standards of communication, laws and regulations (and taxes) catch up with the explosion of the technology, and what had been entrepreneurial increasingly becomes oriented toward a communal undertaking (which in this wave has its exponent in the open source movement). On the other hand, more of the focus shifts toward exploring the characteristics of the new medium that makes it different from others.

In general, this pattern seems to be consistent with most new communication innovations. Telegraphy in the 1870s, telephony in the 1890s, radio in the 1920s, television in the 1950s all have gone through similar bouts of evolution. The Internet has characteristics that are perhaps most similar to radio -- the potential for both broadcast and peer-to-peer communication and so forth, but the differences with radio are also significant. Because of the fairly wide spectrum needed for radio broadcasts, spectrum became a valuable resource, and those with the deepest pockets could afford the most powerful transmitters. This had the effect of keeping radio effectively broadcast only through much of its history, though the shortwave bands even today still have some point-to-point communication.

Modern radio (what we refer to as wireless) harkens back a lot more to those old crystal radio sets than it does to the mammoth broadcast towers of today. One of the key innovations of the Internet was the notion of a URL to act as a generalized location on the network; tie that URL to a wireless transceiver and you have a unique way of identifying any communication node regardless of where it is in physical space (even if it's moving, for that matter). You can associate a mnemonic -- a domain name -- to that URL (for a fee), but the numeric IP address is something that is effectively unique (and free). This is one of the reasons I'm so bullish about interactive paper.

Create a magnetic sensitive mesh, such as that which runs through a Wacom tablet, beneath a layer of specially built "paper" consisting of either small charged balls of varying intensity or organics that serve the same purpose. Place this in a frame with a small processor that controls the sweep of a magnetic field used for refreshing the paper. The processor has a limited display engine that takes an XML based format (such as an SVG file) and maintains its internal state based upon that. Along one side of this paper runs an antenna capable of picking up (and transmitting) wireless signals.

The onboard chip is optimized for display, not processing, but may have a certain limited processing capability. However, most of the smarts of this particular "computer" reside within web services transmitted from and sent to more powerful computers. A good quality "book" would weigh about a pound, would cost $50 to $100, and would retain enough information locally to make this useful even when offline.

More significantly, though, would be "folders" that might cost $5-$10 apiece, would work primarily with local wireless nets, and could serve purposes as varied as providing news, interactive maps, promotional materials, research papers, school assignments, and so forth. Such folders are potentially secure (place encryption on incoming and outgoing streams) and because they could flash their memory easily enough they don't have the potential of retaining vulnerable data, as contemporary hard drives do.

With interactive paper, the web becomes truly pervasive. You no longer have to shell out a thousand bucks for a laptop, which is still a significant barrier for a lot of people. It will likely revolutionize schools, change the face of publishing, not to mention reduce the number of flyers that you end up having to cart around at trade shows that you'll never get the chance to read anyway. That alone will save hundreds of old-growth forests.

Q: What are the major problems and successes with Open Source? Can you make future predictions about specific products and services coming from the Open Source movement?

A: I’m actually going to bundle part of the last question into this one, as I think Open Source is arguably one of the most potent factors at play right now in terms of the evolution of computing. As I mentioned earlier, I think that the most recent era of entrepreneurial computing is drawing to a close. That's not to say that people won't continue making money selling software, but I think as the nature of programming changes, our relationship to software is also changing.

There are a lot of people who see the Open Source movement as being something radical, but I would in fact argue that it is the more recent entrepreneurial view of software that is in fact the deviation. During the time of main-frame computing, the software that ran on a given computer was most likely written for that computer by the company that produced the computer, or by a small army of dedicated consultants that had developed a symbiotic relationship with that company.

In the academic settings of places like Berkeley and MIT, there was a similar symbiotic relationship between vendor (such as AT&T) and university -- for the use of the computer, AT&T got topflight code that it could then roll back into its own operating system. It was only when AT&T reneged on the agreement that the issue of software ownership vs. authorship really arose.

In the 1970s, a lot of the software that was written for the emerging PC market was written by hobbyists for other hobbyists, and while they might charge and package the software, the source code typically was contained as well. This changed as businesses began to see the software as a potential source of revenue, and this became the dominant viewpoint from about 1982 on.

I hold the belief that it is society and economy that create the conditions for groundbreaking technologies or radical movements. Had Tim Berners-Lee charged for the use of the HTTP protocol or HTML, neither of these standards would have grown so dynamically, but something else would have. The need was there. Had Microsoft not realized the potential for the operating system early on, some other company (probably Apple) would have, and we'd be bemoaning the domination of Steve Jobs. More importantly, Linux would have occurred without Linus Torvalds (though it would probably have not been quite so endearingly quirky).

With comparatively little budget for marketing, with no real expectations for remuneration, facing some of the deepest pockets in the world, a group of developers that would comfortably fit into one lunchroom of one building at Microsoft have built a world-wide operating system that routinely bests performance comparisons with the top software manufacturers in the world. What gives?

I think a big part of the growth of Open Source comes from the fact that many programmers, perhaps most of them, are ultimately scientists. They want to understand how and why things work, and they want to fix things, they want to solve problems. The impulse that makes you a crack programmer can be seen in the mathematician, in the theoretical physicist, in the climatologist. A programmer is an instinctive modeler who wants to play god, not for the power involved but for the understanding.

If you place a dozen unemployed programmers in a room for three days, you will have seven open source projects, a dozen standards for exchange, and at least three white papers when you reopen the door. That's essentially what has happened over the last decade, as programmers, unemployed by factors largely beyond their control, burned out from years of trying to meet impossible deadlines for stupid projects or just bored with the lack of challenges that they faced in the corporate world, locked themselves into the giant chat room known as the Internet (I speak from experience here), and said, in essence, "Let's Play!"

What's emerged from this act of creative play is singularly awesome. Linux provides a foundation for building anything from an embedded system to a complex animation studio, upon the fundamental assumption first articulated by Newton, who proclaimed that if he could see further than most it was because he stood upon the shoulders of giants. This is a civic view, along with its corollary that states that computing (like all science) only truly succeeds when information is freely shared.

This attitude is permeating other endeavors -- it took years for people to get a handle on AIDS, in great part because there were a number of corporations that wanted to be first out with a vaccine that they could then patent. SARS, when it struck, had the potential to be the Spanish Flu of the 21st century -- a pandemic capable of killing millions worldwide. Because of this, everyone -- Chinese, American, Canadian, African, and European virologists and geneticists -- worked together to come up with a complete genome map of the disease within two weeks, compared to three and a half years for AIDS. The factor here wasn't computational power … it was cooperation.

Open Source (and its analog, Open Standards) is another viral meme. I don't think that, in the time scale relevant to most corporations, Open Source will ever make much sense, if the CTOs and CEOs think of software as a product. Yet it makes perfect sense to governments that don't want to be entangled in complex licensing agreements, want to have control over software if things go wrong, and want to be able to tailor the applications to their own needs. It makes sense to universities and colleges (and even high school and grade schools) who are financially strapped and can ill afford to upgrade their entire computer infrastructure every two to three years because Company X needs to make their third quarter numbers -- and who similarly may wish to teach computing that falls outside of the traditional oeuvre of tools, or need a small kernel for building robots or control systems. It makes sense to companies that see software only as the means to better enable their own business processes, rather than as a vehicle to realize a profit.

I run Mandrake Linux on my home server, with Apache serving up my various web sites using PHP as a low-level protocol for invoking XSLT transformations on XML data coming from a mySQL database.  My wife and ten year old daughter use the same machine with their own accounts; my daughter writes her school reports in Open Office, plays Shockwave games via the CrossOver bridge, and has become more proficient in customizing her environment than I am. We had XP on the system for a time, but I had problems with stability, and got slammed hard when the wave of viruses hit earlier this year.

The reason I bring this up is because to me it gives the lie to the claim that Open Source is not yet desktop ready. Another data point: I recently saw that Mattel was actually releasing a Barbie laptop that runs Linux, with its very own customized BarbieOS.

Your ten year old daughters will be running Linux before the year is out.

Think very hard about this one. For the price of a video game player and a couple of games, your daughters will have KDE (running the XML graphics language SVG by the next release), Apache, Perl, PHP, mySQL, access to any number of networked communication tools, graphical editors, video production tools, and so forth, and will have the profound understanding that programming is PINK. THIS is the future of computing.

Q: How do you create a successful business model for Web services?

A: You create a successful business model for web services by realizing that they, like all software, are services, not commodities. When the UDDI specification was first proposed, it was sold by its principal developers as a means for businesses to advertise their web services on the web, so that you could point to a UDDI node to find out who provided hexagonal wing nuts with a half millimeter screw, connect to that company, buy a hundred wing nuts through the appropriate service, and have the wing nuts on your desk by the next FedEx shipment.

This is fine in theory, but it misses one essential ingredient: trust. A good buyer cultivates relationships, evaluates potential sellers based upon their reputation, previous history and the quality of their products, and sees where deals can be made to achieve mutually beneficial results. In other words, the buyer/seller relationship is one that is based on social interactions, or trust. This is something that is hideously difficult to encode in bits and bytes, thankfully. Not surprisingly, UDDI as universal yellow pages has not taken off in any meaningful way.

The next selling point about web services came with the notion that you could sell streams of information. The problem here is that there are very few types of information that are sufficiently timely that they have any significant value as a commodity -- broad stock market data, weather, the latest news in a given field, resource libraries (such as clip-art graphics, video-on-demand, or music) … and most of these could be referenced just as readily through traditional HTTP. Contrary to what vendors of web services software may try to convince you, you are a rare company that has content that people are willing to pay good money for.

So who is the primary consumer of the information within your company? Why, your company. As mentioned before, XML is an abstraction mechanism. A web service is also an abstraction mechanism that you can use to hide the specific implementation of a particular interface behind a URL. Your accounts receivable data is important to your financial analysts and shipping departments, but in far too many cases, vendors have sold turnkey solutions that lack the flexibility to adapt as the company grows. If you can abstract the interfaces that accounts receivable provides so that you in analysis can retrieve content and incorporate it into your own software without knowing (or needing to know) what the back-end system looks like, your code becomes simpler, and it becomes easier to change if your own requirements change.
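As a rough illustration of that last point, here is a minimal Python sketch of an analyst's code that only knows a service path and an XML shape. Everything in it is hypothetical and invented for the example: the `/receivables` path, the `<receivables>`/`<invoice>` element names, the two stand-in back ends, and the sample invoice data. It is not a description of any particular product.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal
from typing import Callable, Dict, List, Tuple

Invoice = Tuple[str, str]  # (invoice id, amount due as a decimal string)

# Hypothetical sample data shared by both stand-in back ends.
SAMPLE: Dict[str, List[Invoice]] = {
    "C100": [("INV-1", "250.00"), ("INV-2", "99.50")],
}

def to_xml(customer_id: str, invoices: List[Invoice]) -> str:
    """Serialize invoices into the one XML shape every back end must return."""
    root = ET.Element("receivables", customer=customer_id)
    for invoice_id, amount in invoices:
        ET.SubElement(root, "invoice", id=invoice_id, due=amount)
    return ET.tostring(root, encoding="unicode")

def turnkey_backend(customer_id: str) -> str:
    # Stand-in for an older packaged system.
    return to_xml(customer_id, SAMPLE.get(customer_id, []))

def database_backend(customer_id: str) -> str:
    # Stand-in for a newer store; the order differs, the contract does not.
    return to_xml(customer_id, list(reversed(SAMPLE.get(customer_id, []))))

# The path is all the consumer knows; which back end answers is a deployment detail.
SERVICES: Dict[str, Callable[[str], str]] = {"/receivables": turnkey_backend}

def fetch(path: str, customer_id: str) -> str:
    """Return the XML document published at the given service path."""
    return SERVICES[path](customer_id)

def total_due(xml_text: str) -> Decimal:
    """Analyst-side code: depends only on the XML shape, not on the back end."""
    root = ET.fromstring(xml_text)
    return sum((Decimal(inv.get("due")) for inv in root.iter("invoice")), Decimal("0"))
```

Rebinding `SERVICES["/receivables"]` to `database_backend` leaves `total_due` untouched, which is the kind of flexibility the paragraph above is describing.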

Q: Describe the current trends in business models for Web services. What impact will Web services have on these traditional business models?

A: There are effectively two different ways that web services can be used. The first approach, what I'll call the .NET approach, is to use the web service to build up a proxy of a class located on another machine. When used this way, you end up binding that distant server to your own code as if it were a local class. This is an example of tight binding.

The second approach is more message oriented. It assumes that connections are (both physically and semantically) transient and that it is the content of the message that is more important than the location of the service (or its underlying "class" semantics). This form of web services has actually been in use for a while -- using HTTP to retrieve XML through extended URL names.

Surprisingly enough, for all the talk of web services in the press as provided by complex class architectures, this second approach to web services continues to gain ground. It's simple. You don't have the danger of people writing classes that end up invoking these classes, which could result in hours of head scratching when things don't work (and an exponential increase in the number of ad hoc classes that you're dealing with in your framework). Systems built this way are more distributed, decoupled, more readily able to jump from one configuration to another as system requirements change.
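The message-oriented style can be sketched in a few lines of Python. This is only an illustration under assumed names: the `http://example.com/quotes` address, the query parameters, and the `<quote>` message format are all made up for the example. The consumer builds an extended URL and then cares only about the content of the XML that comes back, ignoring anything it does not recognize.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

def build_request_url(base: str, **params: str) -> str:
    """Encode the request as an extended URL; no proxy class is generated."""
    # Sorting keeps the URL stable regardless of keyword order.
    return base + "?" + urlencode(sorted(params.items()))

def read_quote(message: str) -> dict:
    """Pull out only the parts of the message this consumer cares about.

    Unknown elements are ignored, so the sender can extend the message
    without breaking this code.
    """
    root = ET.fromstring(message)
    return {"symbol": root.get("symbol"), "price": float(root.findtext("price"))}

# Hypothetical reply; in practice it would arrive as the HTTP response body.
reply = '<quote symbol="ACME"><price>12.5</price><volume>900</volume></quote>'
url = build_request_url("http://example.com/quotes", symbol="ACME", format="xml")
quote = read_quote(reply)
```

Nothing here binds the caller to a class on the remote machine; swapping the service for a local file or a different server changes only where `reply` comes from.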

Web services create dependencies, and I think that these dependencies in turn can shape the business models built around them. Dependent systems are more efficient in the short term and are generally easier to write, but they are harder to change once established, and the more dependencies that are built in, the harder it is to alter the system. This means that dependent systems decay over time, to a point where they interfere with the ability of the business to adapt to new requirements. If you can minimize these dependencies, keep them out of the binary code and move them into more of a referential base, your IT systems will become much more nimble in the face of change.
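One way to read "a referential base" is a small document that maps logical service names to locations, so that code refers to names and the addresses can change without a rebuild. The Python sketch below assumes a made-up `<services>` registry format and made-up `intranet.example` addresses; it is one possible interpretation, not a standard.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical registry document; in practice it would live outside the code.
REGISTRY_XML = """
<services>
  <service name="receivables" href="http://intranet.example/ar/v1"/>
  <service name="shipping" href="http://intranet.example/ship/v2"/>
</services>
"""

def load_registry(xml_text: str) -> Dict[str, str]:
    """Map each logical service name to its current location."""
    root = ET.fromstring(xml_text)
    return {svc.get("name"): svc.get("href") for svc in root.iter("service")}

def resolve(registry: Dict[str, str], name: str) -> str:
    """Look up a service by name; callers never hard-code the address."""
    try:
        return registry[name]
    except KeyError:
        raise LookupError(f"no service registered as {name!r}") from None

registry = load_registry(REGISTRY_XML)
shipping_url = resolve(registry, "shipping")
```

Moving the shipping service then means editing one `href` in the registry rather than recompiling every program that talks to it.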

Businesses have a tendency to strongly reflect their lines of communication. I think that web services will tend to make this association even stronger. If you build software that binds your business processes exclusively to another business, then this binding makes it that much harder to build other relationships. Intelligent businesses may find that it is better to build the web services architecture lightly over the rest of the virtual business space, the better to adjust rapidly in the face of missed deliveries, business bankruptcies, corporate shifts of focus and so forth. The few milliseconds of efficiency that you lose by this kind of design can save months and dollars by increasing your flexibility in the marketplace.

Q: What are some common problems and their solutions facing developers today?

A: Unemployment?

Seriously, I think that there's a major sea-change underway in the developer community, and this is having a huge impact upon the issues facing them. I like to use the term cohorts to describe groups of programmers that entered into the field within roughly the same time for the same reasons.

For instance, I view myself as being part of the desktop publishing cohort -- those people who entered programming in the late 1980s with the advent of desktop publishing, and who seem to have migrated pretty much universally into the arena of knowledge management after having significantly written the rules of the Internet. There was a Windows/Visual Basic cohort that entered into the field with RAD tools (1990-1996), and learned to program by the seat of their pants; there was the later Web cohort (1995-2000) that started with HTML and worked their way into JavaScript and some PHP; and we're now about half-way through the Linux cohort, who were attracted to the field by Linux and the open source movement. I'd place the previous generations to mine as the database cohort (1982-1987) and the PC cohort (1976 to 1981).

These dates are all rough, of course, but what I'm trying to get at here is that the people in each of these groups have tended to have the same experiences, were drawn by the same technologies, and in many cases are at roughly the same place in their careers. Understanding these cohorts also gives you a better appreciation for some of the problems that I see upcoming.

Most businesses have tended to assume that all programmers are roughly interchangeable, but that's generally not true. A programming language is a fairly complex piece of work, and just as you would probably not want a mechanical engineer designing your electrical systems, you probably don't want an XML specialist writing C++ code (or vice versa).
 
A typical software engineer has an active programming life usually only of about twelve to eighteen years. It requires a great deal of focus and commitment, and while spending 80 hours a week in full-out "geek-mode" is something that a single programmer fresh out of college could sustain for a while (and even think was cool), by the time those same programmers hit 40, with a mortgage, kids in high school, a book in the works, and so forth, programming for its own sake has lost its appeal. Some, like me, kind of migrate into academia or publishing.  Others push into management, while not a few leave the field entirely to go do something else with their life.

This means that the number of crack SQL developers, the leaders in the field, is now dropping pretty dramatically. My cohort has moved on en masse because the information management field has been weaving back and forth with programming for years, but I think that we're moving into a mode where the two are fairly distinct disciplines. If I were Microsoft, I would be scared silly -- there was a fairly massive influx into the programming community with the introduction of Windows (especially that of VB), but I think it hit its peak in about 1993. This means that the average VB programmer has been in the field for a decade, and from here on out the number of VB programmers will be dropping pretty dramatically, especially given the comparative dearth of work right now.

I think that for the programmers that remain, the salaries are going to start looking pretty good again, but the work is going to be mired in the past. While I admire .NET, I don't see it attracting the same kind of interest that VB did at its height -- it's too complex, for starters -- and that means that innovation (which I suspect is directly proportional to the number of developers in a given discipline) is going to suffer in the Windows community in the next decade.

This is also one of the reasons that I'm fairly bullish on Linux -- its cohort is near as big as the one for Visual Basic, and the Linux cohort has not even hit its initial peak yet. Moreover, Linux programmers tend to be more knowledgeable about the way their system operates, and are more inclined toward local solution programming rather than large, monolithic applications, because the operating system encourages this mode of thinking.

Q: Can you provide your list of the ten most important issues facing corporations and IT professionals today? How can these issues be resolved?

A: 1) Over-automation. Automation replaces large, slow humans who are very good at pattern recognition and problem solving with incredibly fast programs that are good at computation and abysmally poor at pattern recognition or problem solving. This means that when a computer makes a mistake, it makes it many, many times. Computers can save a lot of money compared to a person in the short term, but keep in mind that artificial intelligence is a lot like Astroturf -- it may look good, but it's remarkable how many stadium lawns have since replaced it with real grass.

2) Exporting Customers. Labor in the United States is expensive. Labor in India, China, Singapore, etc. is cheap, for comparable skills. The ability to use the Internet makes it possible to get to these cheap programmers and have them work as part of your team, possibly displacing local labor in the process. It's an easy equation. The US economy is, outside of security and military contracting, growing at maybe 2%. The economies in India or China are growing in the double digits. Your customers are now speaking Cantonese instead of English, and soon will start developing their own companies to compete with yours. Do you speak Cantonese?

3) The Credit Squeeze. Billions of dollars have entered the economy, but it's not finding its way into the commercial credit markets – rather it's been going into housing and automobiles, where it looks to have created more than a bit of a bubble, or is going into government debt. This in turn is pushing up rates AND making it harder for businesses to get credit for expansion. I think this will become more pressing in the next year, especially since the dollar seems determined to pass $1.20 to the Euro and reach parity with the 100-yen note.

4) The Incipient Labor Crunch. The demographics mentioned earlier with the cohorts are part of a larger issue involving shifting demographics, coupled with changing demand in areas such as China and India. There's some fairly major fragmentation going on in the labor market right now, because of the increasing need for specialists in all areas of IT, and this will decrease the available pool of such specialists (who tend to be older, and are thus farther along the cohort curve). Meanwhile, both China and India are developing fairly rich middle-classes again, which means that there will be an increasing demand for skilled labor at home, reducing the available pool of such hires or outsource services. Finally, the cohorts that are now entering into the labor pool are a much smaller group than the one that swelled IT in the early 1990s, so that even beginning level programmers will soon be in short supply (soon being within two years).

5) Security (Too Much). Security is a game of diminishing returns. You have to assess your location (physically and in cyberspace) and determine what constitutes reasonable precaution – backing up data regularly, putting up abstraction interfaces in front of potentially compromisable data, subscribing to a trusted anti-virus service, and so forth, but if you become obsessed with security you will waste time, energy and money that can be spent far more productively elsewhere. The best precautions should not be oriented on prevention – keeping people out – but on common sense provisions such as making sure that the crown jewels are not sitting out in the hallway. Hackers WILL get into your system if they are persistent enough. If you assume they are already in and handle your systems appropriately, the worst that can happen is that they could vandalize your network (which is why you back up, regularly).

6) The Turnkey Shuffle. Corporate managers like turnkey solutions. Take it out of the box, set it up, don't have to deal with a programmer. The problem here is that each corporation is unique – it deals with differing products, services, customers, regulations, and so forth, and while it is certainly possible to take advantage of the similarities in businesses, the differences are what give businesses a competitive edge. Turnkey solutions are one-size-fits-all business models, and inevitably end up requiring far more investment down the road than was anticipated. I suspect that, if applied correctly, web services will be the death knell of turnkey solutions, while if applied incorrectly, will end up having all the worst disadvantages of turnkeys.

7) Patents and Copyrights. The patent system has effectively collapsed. The European Parliament recently adopted a patent policy which states, in essence, that software is not patentable because it involves the incremental innovation of technologies through the agencies of others. Similarly business concepts cannot be patented. They adopted this in great part because of the chaos here in the US. Patents exist to provide the inventor of innovative technology a small window in which to exploit that technology, but they were never meant to be the blanket protection against infringement that they have become. Moreover, both patents and copyrights were not originally intended as legal instruments which could transfer. As an author, I think it is right that I own the copyright of my work for the duration of my life. I do not, however, feel that my children should benefit from the protection of my work after I'm gone – this discourages creativity and innovation on their part. A corporation is effectively an immortal construct, and the idea that a corporation should hold a patent or a copyright will ultimately be extremely stifling to innovation.

8) The eBusiness Paradox. The eBusiness revolution that was supposed to come in the late 1990s didn't happen quite the way everyone thought. A few large vendors sold a lot of eBusiness “kits” that would let your company be part of the great revolution sweeping the world, where the smallest companies could have all the efficiencies of electronic business that the largest ones did. We saw a whole host of marketing acronyms: CRM, ERP, and so forth. Yet it is arguable whether or not this software made any significant change in the profitability of a company; indeed, CRM packages to me served only to further Criminalize the Customer (point 9). There are a number of open initiatives with the UN, OASIS, and even the W3C, which recognize that ultimately business is just another form of communication; watch ebXML and its related technologies very closely.

9) Criminalizing the Customer. The recent actions of the RIAA and the BSA to me would be inconceivable, save that such actions have become far too common of late. When I was in college, I went into a bookstore run by a shop-owner who was absolutely convinced that every customer was there to steal him blind; he hovered over everyone who entered, watching them with the most unpleasant expression I had ever seen, and I heard from others that he had threatened to call the police on them if they didn't submit to searches. The word got around, people no longer went to his shop, and within six months the store was vacant. I've seen too many companies (and consortia) of late that seem to have been taken over by this guy. Somewhere along the line, a company seems to lose sight of the fact that a customer is a person, not a demographic, one who will seek other solutions to their needs if they feel that the price (not just cash price, but the stress from intimidating behavior) isn't worth it.

The BSA has done more to help the Open Source movement grow than any advertising that the OSS people could have done on their own – I've talked to more than one IT manager who switched his entire operation to Linux after having received threatening form letter messages sent out indicating that the BSA would and could audit people for illegal copies of software.  These people generally were “law-abiding” - they paid for their software, they religiously upgraded systems, and so forth – but they were treated like criminals because it was easier for the BSA to create a campaign of terror than it was to take the efforts to woo these customers. When you get enough of these customers shifting (then pushing the needs that they have into the software base, which also happened) then all of a sudden Open Source software begins to look pretty good indeed in comparison.

I think that a very unhealthy relationship has emerged between many companies and their customers. What's happened is that when customers are looked at as marketing statistics, companies base their actions upon “expected behavior”. Companies then go after the demographics that are most likely to buy their goods while often giving less “profitable” demographics the cold shoulder. The problem here is that people do communicate with one another, and they are far more likely to communicate unhappiness than satisfaction. This unhappiness manifests in everything from web postings to migration away from products or (when the costs are right) accessing music or software from other channels. When these companies (often through industry groups such as the RIAA or the BSA) go after these sites or people as being criminals, this not only ignores the fundamental problem (the dissatisfaction) but further denigrates the legitimacy of the company's position in the eyes of its customers. I think that this has already spiraled out of control, and in the end will have far more adverse effects upon the business community than it does upon its customers.

10) Virtual Society. This is more of a trend than an issue, but one I think has a strong impact upon companies. If the cubicle was the work symbol of the 1990s, the laptop in the coffeehouse is definitely becoming its analog this decade – connected via wireless, working either across VPNs or in internet groups, these designers, developers, and project coordinators are essentially building an amorphous network of links and relationships that are bypassing the traditional corporate structures altogether. It also means that increasingly the workers on your projects are providing their own computer systems (which are both better configured for their particular needs and often more muscular to boot than anything that may be available in house), defining their own hours, and establishing their own spaces.

This has advantages and drawbacks for companies.  On the plus side, by putting the onus of computational resources on the employee, the company saves money, especially for off-site work. The IT force consequently is becoming more and more scarce in the building, matching to a great extent the sales force (which also tends to prefer their own IT resources, not surprisingly); while not a huge factor, I wouldn't be surprised if at least a part of the reason for the current glut of office space is because more and more companies are partially or completely virtualized.

The downside is that the business relationship between employer and employee becomes much more tenuous as well. In a traditional business, the employer provides the tools, the workspace, connections into the system, and myriad other small things (parking, coffee or vending machines, bathroom facilities, etc). Additionally, employers provide contractual benefits – pension plans and health care coverage, as well as the occasional stock purchase/company ownership plans. However, as pension plans start being drawn down (those that haven't been raided), as health care coverage costs escalate out of control, and as employees realize the pitfalls of stock options, these factors are increasingly being taken out of the equation (and in many cases also assumed by the programmer or graphic designer).

The primary benefits that a company provides are the steadiness of paychecks (which makes budgeting possible) and accounting, yet even here companies are facing problems – wholesale job cuts are still with us, and the possibility of job cuts, while potentially making people less inclined to rock the boat in the short term, also makes them less inclined to consider the long term health of the company in their decisions. At some point, the virtual networks that employees have established will hold greater economic incentive than remaining with companies does, and these people will change to that mode. That point is not far distant – a year, maybe five, but enough so that we're on the brink of a major shift in the way work is done in this society. You are seeing the rise of the twenty-first century analog to unions, though organized in much more fluid terms because of the intrinsically networked interactions of the participants, rather than the pyramidical hierarchies of 20th century trade unions.

Intelligent managers will be planning for this, building the emergent networks into their organizations. Others will be broadsided when the economy shifts from being a loose labor market (as it is today) to a tight labor market by 2008. Given the six month focus that most businesses have, this shift is going to catch a lot of people by surprise.

Q: Any last comments?

A: I write a monthly blog and e-newsletter called the Metaphorical Web (http://www.metaphoricalweb.com) where I focus primarily on XML related issues, though I've been known to branch out to IT in general and even broader economic issues. The book, XQuery Kick Start from SAMS, should be out on the shelves now. If not, check out http://www.samspublishing.com for more information.

Thanks for giving me the opportunity to talk.