Depending Upon the Kindness of Strangers:
Notes on Open Source and Free Software
J. L. Sloan
This is a set of notes on my observations regarding open source software from my personal perspective, which is as a developer of open source software, as a user of open source software, as an employee of a multinational corporation that uses open source software, and as the owner of a small technology business. I am not a lawyer, nor any kind of legal expert, nor do I play one on TV. My degrees are in computer science, not law. These notes do not in any way constitute legal advice.
Microsoft defines open source software as “software in which both source and binaries are distributed or accessible for a given product, usually for free.” What distinguishes open source from “shareware” or “public domain software” is that open source software is licensed by its copyright holder in such a way as to prevent any restrictions being placed on the distribution of the source and binaries of the software, and to require the distribution of any modifications that may be made to the software. “Distribution” here means “to make available”. This can range from providing software on CD-ROMs to placing it on a publicly accessible web site.
The Open Source Initiative
The Open Source Initiative (OSI) is a not-for-profit organization founded in 1998 that promotes the creation and use of open source software. It defines open source software as software whose license conforms to a set of principles that include the following.
- The license cannot restrict anyone anywhere from redistributing the software. Redistribution includes either selling it or giving it away. This prohibits the charging of a royalty fee.
- The distribution of the software must include source code. The license must allow this source code to be distributed with the software.
- The license must allow modifications to the source code. It must allow new works to be derived from the software. It must allow the modifications and derived works to be redistributed under the same license terms.
- The license cannot discriminate against persons, groups, or fields of endeavor. It cannot require that the software only be used in a specific product. It cannot restrict other software, including requiring that other software be open source.
As is often said, open source software is “free” in the sense of “free speech”, not “free beer”. For example, you can sell open source software on magnetic or optical media, but you cannot prevent the person to whom you sell it from selling copies of it, with or without modifications, or even giving it away. As we will see later, the various open source license agreements do prevent that person from restricting access to that same open source code in any way.
The Free Software Foundation
A related concept is that of free software, in the same sense of “freedom”, not “free beer”. The Free Software Foundation (FSF) (http://www.fsf.org) is a not-for-profit foundation founded in 1985 by Richard Stallman, a researcher at MIT. Its purpose is to promote the creation and use of free software.
The FSF has produced a number of free software packages under the brand GNU, a recursive acronym meaning “GNU’s Not UNIX”. Although the FSF did not produce the Linux operating system kernel itself, many of the standard utilities that ship with every Linux distribution are part of the GNU software distribution. Most notably, the GNU Compiler Collection (GCC), which includes what are probably the most used C and C++ compilers and associated run-time libraries in the world, is distributed under the GNU banner.
According to the FSF, free software has a license which conforms to a set of principles which include the following.
- You must be allowed to run the software for any purpose.
- You must be allowed to study how the software works and to adapt it to your needs.
- You must be allowed to redistribute copies of the software to anyone anywhere.
- You must be allowed to modify the software, and distribute your improvements to anyone anywhere.
The GNU General Public License
Besides producing one of the most used collections of free software, the FSF is also known for originating a series of software licenses widely used to promote free software. The core of these licenses is the GNU General Public License (GPL). The GPL is referred to by the FSF as a “copyleft” (as opposed to a “copyright”) license. The FSF defines copyleft as “a general method for making a program or other work free, and requiring all modified and extended versions of the program to be free as well”.
The GPL is perhaps best known for its “viral licensing” clause:
You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.
This clause requires any work derived from software licensed under the GPL to also be licensed under the GPL. Just what constitutes a “derived work” is a matter of some debate, as you might imagine, and requires some fairly subtle distinctions for those that are not technically inclined.
For example, the Linux kernel is licensed under the GPL. Source code modifications to the Linux kernel are clearly covered by the GPL. In addition, device drivers compiled as part of the kernel are also covered under the GPL. However, device drivers incorporated dynamically at run time as separate, dynamically loadable modules are not covered by the GPL. This allows a company to develop a “closed” device driver using proprietary technology, yet use the driver with Linux. Furthermore, programs compiled by the GNU Compiler Collection and run under the Linux kernel, both of which are licensed under the GPL, are not themselves covered by the GPL.
The viral licensing implications for derived works were sufficiently restrictive that a second, less restrictive license, the GNU Lesser General Public License (LGPL), was formulated by the FSF. The LGPL was intended to be applied to software libraries. The LGPL includes the following clause.
A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License.
However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License.
Again, what constitutes a “work that uses the library” is open to interpretation. It is generally accepted that an application that is dynamically linked uses the library but is not a derived work, while an application that is statically linked is a derived work. The distinction is a technical one. When an application is statically linked, portions of the library are incorporated into the binary executable file of the application itself. When you copy the application, you are copying that portion of the library as well. When an application is dynamically linked, it makes use of a separate copy of the library in memory that is shared among all applications on the same computer. There is no portion of the library incorporated into the application. Shared libraries are a technology used in Linux and most variants of UNIX including MacOS, as well as in Windows where they are called Dynamic Link Libraries (DLLs).
The prohibition against static linking is enough of an issue that some open source software, such as that intended for embedded applications which typically require static linking, frequently includes a clause modifying the LGPL that specifically permits static linking without any viral licensing implications. Here is an example of such a clause.
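The mechanics of dynamic linking can be illustrated with a minimal sketch. It assumes a Linux or similar POSIX system with Python's standard ctypes module; the function names load_c_library and shared_strlen are hypothetical and chosen only for this illustration. The point is that the library code stays in a separate shared object that is located and loaded at run time, and none of it is copied into the calling program. It illustrates the mechanism only and says nothing about how any license applies to a particular program.

```python
import ctypes
import ctypes.util


def load_c_library():
    """Load the platform's shared C library at run time (POSIX assumed)."""
    # find_library returns a library name or path, or None if none is found.
    # On POSIX systems CDLL(None) opens the symbols already loaded into the
    # current process, which normally include the C library.
    return ctypes.CDLL(ctypes.util.find_library("c"))


def shared_strlen(data: bytes) -> int:
    """Call strlen from the shared C library; no library code is copied here."""
    libc = load_c_library()
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t
    return libc.strlen(data)


if __name__ == "__main__":
    print(shared_strlen(b"copyleft"))
```

A statically linked program, by contrast, would carry its own copy of the strlen code inside its executable file, which is the situation the LGPL clause quoted above addresses.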
As a special exception, if other files instantiate templates or use macros or inline functions from this file, or you compile this file and link it with other works to produce a work based on this file, this file does not by itself cause the resulting work to be covered by the GNU Lesser General Public License. However the source code for this file must still be made available in accordance with the GNU Lesser General Public License.
This exception does not invalidate any other reasons why a work based on this file might be covered by the GNU Lesser General Public License.
(I have adopted just such a clause for my own open source development intended for embedded applications. For examples, see the Digital Aggregates Desperado library.)
The FSF also offers licenses for things other than software, for example documentation.
The FSF licenses are very widely used. It is obviously a good idea to make sure you know which variant of the FSF license applies to the software you are using. In addition, there are many other licenses for open source software that are also widely used. These licenses were written for specific software projects, then adopted by other, perhaps unrelated, projects. These include the BSD License, the Artistic License, the Open Software License, the Apache License, the Common Public License, the Mozilla Public License, and many others. They merit close scrutiny before you make use of any of the software to which they apply in any commercial application.
The OSI has a section of its website indicating which of these licenses meet with their approval. The FSF has a section of its website discussing whether or not these licenses are compatible with the GPL or whether they exhibit a copyleft provision.
It may be an overly broad statement, but my interpretation is that all free software is open source, but not all open source software is free, at least not in the sense meant by the FSF. The FSF has a strong philosophical viewpoint against proprietary intellectual property, particularly when applied to software and algorithms. It is a view echoed in the term “copyleft”. It is not necessarily held by other open source advocates.
It is possible for software to be dual licensed: the copyright holder can license the software under more than one license. For example, the software can be licensed under the GPL, but also under a commercial license that allows the user to modify the software in a closed, proprietary way, typically for a fee paid to the copyright holder. The FSF discourages this kind of thing since it violates their charter, but it may be an attractive option for some software and applications.
(Although it seems controversial in the politically charged climate of open source and free software advocates, I personally believe dual licensing offers the best of both worlds to both the producer and consumer of open source software.)
Open Source Developers
There are several mechanisms through which open source software is produced. The most common is that some individual writes some software, typically for their own use, that they think might also be useful to someone else, applies an appropriate open source license (frequently the GPL) to it, and announces its availability to the world. There are web sites devoted to just such announcements, such as http://www.freshmeat.net. Large established projects with large user communities will have dedicated web sites for the distribution of their product(s). All of these web sites host discussion groups in which announcements can be made when new releases are available, and in which users can ask for assistance, report bugs, and submit suggested modifications.
Eric S. Raymond, an open source advocate, has said that much open source software is the result of a single developer scratching a personal itch. In fact, the Linux kernel itself was originally the result of one such developer, Finnish computer scientist Linus Torvalds.
CIO Magazine reported that 58% of open source developers are professional software developers with 11 years of experience on average, and 30% are paid to write open source.
The flip side is that 70% of open source developers are doing it sans monetary compensation. They write open source code for the sheer joy of writing code. They write it because they need to use the software themselves and their efforts are informed and magnified by the contributions of a larger user community. They write it to get respect among their peers.
(This last motivation is frequently cited as the most common. I find this remarkable from the perspective of a consumer of open source software, but completely understandable as a producer, and will remark on it further, below.)
The smaller open source software projects remain the products of individuals working on them as a hobby. The larger open source software projects necessarily evolve into multi-developer efforts. For example, the Linux kernel is now maintained by a group of dedicated developers under the “benign dictatorship” of Torvalds. The Apache web server, which has more than 60% market share for web server software, beating even Microsoft, is maintained by a dedicated group of developers with a rotating leadership.
There are some common forces at work on every successful open source project. Most projects, large or small, have at their core a single lead developer (perhaps the only developer) whose responsibility it is to provide the technical vision, and to approve changes and fixes. (The Apache project is a rarity in that apparently its members vote on changes.) Each project is a collaborative effort between its developer(s) and its user community. Users of open source software have a much more direct link to the developer(s) than is typical with closed software, where multiple tiers of (frequently dysfunctional, in my own experience as a consumer) technical support may separate the user from the engineering staff.
Because the source code is always available for any open source software, users can not only report bugs, but develop fixes. Approval and incorporation of the fixes into the new release of software is ultimately the decision of the lead developer. And depending on the open source license, users may be required to submit their fixes to the software developer, who is in turn required to distribute those fixes that are accepted into the code base to the user community. This creates a lot of synergy.
Because the source is required to be made available, its distribution also serves as a very broad system of backup and escrow. Open source software cannot be orphaned. If the original developer dies, goes mad, gets a life, or otherwise loses interest, there is at least the potential for another developer to pick up the project. Open source software cannot “go out of business”, although the consumer of it may be left with the responsibility to maintain it themselves.
Open Source Users
Open source developers are responsible for deciding what contributions from the user community to incorporate and how, and for the initial testing of the software. But the user community serves as a large base of beta testers. There is a strong innovation adoption curve (as described by Everett Rogers in his book Diffusion of Innovations) at work here that is visible in projects like the Linux kernel.
Innovators have read-only anonymous access directly to the Linux source code repository used by the software developers. They can download the latest source code that is “hot off the presses”, frequently unstable, undergoing a lot of churn, but which incorporates critical bug fixes or new, perhaps experimental, features in which they are interested or even to which they are contributing themselves. Even so, the changes in the code base will have been tested and vetted by the core group of Linux developers under the leadership of Torvalds. The innovators provide quick feedback to those developers, often including bug fixes, through the various web-based discussion groups.
The Linux kernel convention is that odd numbered releases (e.g. 2.5) incorporate new features that have not been widely tested. Early adopters download the odd numbered releases with the expectation that the software will mostly work. These users can also be expected to provide bug reports and occasionally bug fixes.
Both innovators and early adopters are frequently researchers that are depending on the newest features of Linux, or may be developing those features, as part of their own work. But they also may be companies which develop their own applications on top of the Linux kernel. Increasingly, mainstream companies like Oracle are marketing Linux versions of their own closed, proprietary products, particularly as Linux continues to gain market share among server (as opposed to desktop or laptop) operating systems. These companies are depending on getting the “latest and greatest” in order to get a head start on their own development, or maintain their technical lead over their competitors, even though the official release of their Linux-based products may be for a later, more stable, version of the kernel.
Even numbered releases (e.g. 2.6) of the Linux kernel are considered “ready for prime time”, well tested, and consist of bug fixes to the odd numbered release. Early majority users download the latest even numbered releases, which are a good compromise between stability and innovation.
Early majority users are frequently companies which specialize in selling packaged Linux distributions on CD-ROM and DVD. There are many of these companies now, each with a slightly different market focus. Red Hat specializes in large enterprise customers. SUSE (a division of Novell) has a distribution that is particularly suited for deployment on laptops. MontaVista specializes in configurations for real-time and embedded applications. Ubuntu has tools and modifications for localized languages and accessibility for those with disabilities. Gentoo is Linux for übergeeks. (For an example outside the Linux kernel, Cygnus Solutions, now owned by Red Hat, specialized in packaging and selling the GNU Compiler Collection and related tools from the Free Software Foundation, with a particular emphasis on support for the embedded market.)
Late majority users may wait for one of the companies that provide pre-packaged Linux distributions on CD-ROM or DVD, such as Red Hat, SUSE, etc. By the time these users install Linux, the features in their release will be well tested by thousands of earlier users. Installation of these packaged distributions is frequently driven by a graphical user interface (GUI) front end, simplifying configuration.
Laggards will stick with the prior even-numbered Linux release (e.g. 2.4), happy that they have an extraordinarily stable, well tested, well understood product.
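The odd and even numbering convention described above can be sketched in a few lines. This is only an illustration of the convention as these notes describe it; the function name classify_kernel_release and the labels "development" and "stable" are hypothetical choices, not anything defined by the kernel project.

```python
def classify_kernel_release(version: str) -> str:
    """Classify a kernel version string by the parity of its second number.

    Odd second numbers (e.g. 2.5) are treated as development releases and
    even ones (e.g. 2.4, 2.6) as stable releases, per the convention above.
    """
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least 'major.minor', got {version!r}")
    minor = int(parts[1])
    return "development" if minor % 2 == 1 else "stable"


if __name__ == "__main__":
    for release in ("2.4", "2.5", "2.6"):
        print(release, classify_kernel_release(release))
```

Under this sketch, an early adopter would be looking for releases classified as development, while the early majority, late majority, and laggards would all be choosing among the stable ones.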
Smaller open source projects, often the product of a single individual, go through a similar process, albeit with a much smaller user community. Eric Raymond describes how he evolved from a user of the open source e-mail utility “fetchmail” (originally called “popclient” in an earlier incarnation) into becoming its maintainer when the original developer got a life. Raymond quickly found himself as the central point of contact to a community of about 250 users (no doubt a fraction of the total number of fetchmail users) who were submitting bug reports, suggestions for new features, and bug fixes. The product eventually reached maturity at which point bug reports became rare. But at the project’s height, Raymond was releasing a new version about once a day.
Open source development exploits the “Wisdom of Crowds” phenomenon in which a large base of collaborators contribute in one way or another to a project, typically managed by the vision of a single coordinator. It leverages the fact that, thanks to the Internet, communication between users and developers is cheap and fast, that computing resources, thanks to the personal computer, are cheap enough that many users can run and test a software product, and that, apparently, the needs of a vast number of computer users are not being met by commercial, closed, proprietary software.
The open source development process is remarkably market and consensus driven when compared to the centrally planned process that drives most closed software development. That such a process can generate software of a quality equal to or surpassing commercial software, without resorting to the formal (and expensive) requirements and testing efforts typified in closed software development organizations, nor conforming to any but the lowest level of the Capability Maturity Model (CMM) put forth by the Software Engineering Institute (SEI), cannot be doubted. Les Hatton is a professor of Forensic Software Engineering at Kingston University in the U.K. He writes frequently on the subjects of “software safety”, software quality, and failure mode analysis. He has described the Linux kernel as “arguably the most reliable application the human race has ever produced”.
Open source does not dominate the entire computing community, however. Windows is still the predominant platform for the desktop, and my personal experience indicates it will stay that way for some time to come. However, Windows, and Microsoft in general, has lost the battle for server operating system share. Linux is the predominant platform for servers, and Apache has at least 60% of the web server market share. Relatively new open source applications like Asterisk, an open source PBX, are making inroads in markets that have been traditionally serviced by closed systems.
The reason most cited by developers for doing open source development is to gain the respect of their peers.
M. Fink, The Business and Economics of Linux and Open Source, Prentice-Hall, 2003
L. Hatton, “Linux and the CMM”, http://www.leshatton.org/WEB_1199.html, 1999
C. Koch, “Your Open Source Plan”, CIO Magazine, March 15, 2003
E. Raymond, The Cathedral & The Bazaar, O’Reilly, 2001
A. St. Laurent, Understanding Open Source & Free Software Licensing, O’Reilly, 2004
J. Surowiecki, The Wisdom of Crowds, Random House, 2004
John Sloan is a principal in the Digital Aggregates Corporation, which specializes in applying object-oriented design and implementation to real-time, embedded, and device software applications. Digital Aggregates maintains Desperado, a library of components for embedded applications, amounting to about 60,000 lines of C++ open source code and documentation licensed under a version of an FSF license. John is also a developer in R&D of a multinational corporation that leverages open source for its own extensive product line.