You may have heard the argument, or seen the poster, in software development organizations: Reuse the Code, Do Not Re-invent the Wheel. Using off-the-shelf code to accelerate software development and reduce costs is nothing new. If it is available, and does the job, then use it. Open source software is probably the ultimate manifestation of code reuse, widely recognized in software organizations. Without open source, many of the technology phenomena of the last 15 years, from social networking to web applications to mobile communications and more, would not be with us in their current form.

With the accelerated use of third-party software comes the task of managing the list of components in a software project (the Bill of Materials, or BOM). Tracking third-party and open source components helps manage the quality and security of the project. It also helps ensure compliance with the terms of each component's license.
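As a rough illustration of what tracking a BOM can look like in practice, here is a minimal sketch. The `Component` record, its fields, and the example entries are hypothetical and invented for this illustration; real BOM formats and tools carry far more detail.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Component:
    """One third-party entry in a hypothetical Bill of Materials."""
    name: str
    version: str
    origin: str                    # where the code was obtained
    license: Optional[str] = None  # None means no license was found


def unlicensed(bom: List[Component]) -> List[Component]:
    """Return the components that carry no license and need follow-up."""
    return [c for c in bom if c.license is None]


# Example entries are made up for illustration only.
bom = [
    Component("parser-lib", "2.1.0", "https://example.org/parser-lib", "MIT"),
    Component("date-utils", "0.3.2", "https://example.org/date-utils"),
]

for component in unlicensed(bom):
    print(f"No license recorded for {component.name} {component.version}")
```

Even a list this simple makes the compliance question concrete: every entry without a license is an open item someone has to resolve.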

A Real-Life Story of Compliance

Our software assessment team regularly carries out software audits for companies. Most of the audits focus on ensuring that open source license obligations are in line with the business model and that those obligations are met. To carry out IP due diligence on a software project that contains more than a few dozen files, we scan the code portfolio with automated code-scanning tools. (Most projects run between five and 100,000 files.) The automated scan results are then reviewed manually to confirm the findings or to fill in missing information.
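To give a feel for the first pass of such a scan, here is a simplified sketch that only inspects file headers. It is not how any particular scanning tool works: commercial scanners also match code against reference databases, and the patterns, category names, and 30-line header window below are assumptions chosen for illustration.

```python
import re
from pathlib import Path
from typing import Dict

# Heuristic patterns (assumptions): a copyright notice and any license mention.
COPYRIGHT_RE = re.compile(r"copyright\s+(\(c\)|©|\d{4})", re.IGNORECASE)
LICENSE_RE = re.compile(r"\blicen[sc]e", re.IGNORECASE)


def classify_header(text: str, header_lines: int = 30) -> str:
    """Classify a file by what its first few lines declare."""
    head = "\n".join(text.splitlines()[:header_lines])
    if LICENSE_RE.search(head):
        return "licensed"
    if COPYRIGHT_RE.search(head):
        return "copyright-only"  # owner named, no permission granted
    return "no-header"


def scan(root: str) -> Dict[str, str]:
    """Walk a directory tree and classify every regular file."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            results[str(path)] = classify_header(text)
    return results
```

Files that land in the "copyright-only" or "no-header" buckets are the ones that would go to manual review.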

Confirming detected open source projects that carry licenses is quick and easy. Confirming proprietary code, which sometimes has clear headers, is also painless: normally the same authors (developers) show up in the headers, or a quick conversation with the engineering group resolves the identity of the code.

By far the most time-consuming aspect of an audit project is publicly available code that has clear copyright ownership (a copyright statement in the file, or a match to open source code held in an open source reference database) but no license. In most jurisdictions, you do not need to put a copyright statement on what you write: if you created it, you own it. Nobody else can use your code unless you explicitly give them permission.

And here is where the trouble starts.

When we identify such code (and we almost always find unlicensed, publicly available code in a portfolio), our clients have to track down the author of the code and get permission in writing. Assuming that the author or copyright owner of the original code can be tracked down (a big assumption), the process becomes a patchwork of detective work intermixed with licensing and corporate decision-making. We have had cases where the offending code had to be pulled out of the portfolio and replaced, in some cases with proprietary software. When you are trying to ship your product, or are involved in an M&A where IP due diligence is one of the last activities, the delay caused by using unlicensed code can be very expensive.

So it was with interest that we came across this article by Simon Phipps, well known for his activities in the open source arena and his experience with open source licenses. His argument revolves around the fact that most code on GitHub does not have a specific license. Moreover, there is a movement that believes “software licenses are outdated” and encourages code forking without considering how the original code, or the end result, is licensed. Although GitHub is singled out here, the behavior is not unique to GitHub. SourceForge has a good number of project pages with no license listed, or with just a mention of an “approved OSI License” against the project. In all fairness, though, according to our own Global IP Signatures database, GitHub is probably the biggest source of unlicensed projects.

It is unreasonable to expect repository hosts, like GitHub, to enforce license requirements on anyone who posts or stores code on their forge. If unlicensed code does not appear on GitHub, it will appear on other sites. Instead, developers need to be made aware that publicly posted code has little chance of adoption if it doesn’t carry explicit permission for others to use it as well. The explicit permission, called a license, need not be complicated, and certainly doesn’t have to be invented from scratch. The Open Source Initiative has a collection of open source licenses, categorized and updated as needed.

Put a License On It