The value of open source to universities: UC Santa Cruz tests the water

July 28, 2020 - by Andrew Oram

In addition to providing the world with its most enduring and valuable software--Internet protocol stacks, Linux, and more--free and open source software offers immense educational benefits. The gap between producing a nice prime number generator for a professor and contributing to real-life production code is huge, and numerous programmers can attest that free and open source projects helped them cross the chasm.

A few universities have recognized the value they can provide to their own students and to the world at large by sponsoring open source projects. I talked recently to Dr. Carlos Maltzahn, Adjunct Professor for Computer Science & Engineering at University of California, Santa Cruz, and the founder and director of the Center for Research in Open Source Software (CROSS). Our wide-ranging discussion covered many ways that open source works with university programs.

Over its five-year existence, CROSS has found that students are eager to join the program, that people around the world with no formal connection to UCSC sign up to work on projects, and that sponsors are willing to support the projects with substantial donations. Sustainability--especially financial stability--is still hard to achieve, and CROSS has been experimenting with multiple ways to raise funds so that projects can stand on their own after a few years.

Other universities that reach out and build communities around open source include the University of California, Berkeley (around Spark), Stanford University (around Open vSwitch), the University of Illinois at Urbana-Champaign (around LLVM and Clang), and Boston University, through a partnership with Red Hat that I have written about. Red Hat, along with the National Science Foundation, also launched a web resource called Professors' Open Source Software Experience (POSSE) that teaches teachers how to incorporate open source into education. Here I will discuss only CROSS and the implications of its work for other universities.

Education for free software

Every successful computer science student, whether or not they want to work on free software, has to cross the gap I mentioned earlier between class exercises and work environments. The gap encompasses everything the programmer has to do before writing code and after writing code. Before coding, they have to assess user needs, choose useful tasks, and get buy-in from teams. After coding, they have to carry out testing and integration, submit code for review, accept feedback, remain responsible for maintenance, and stay in touch with their team generally. Free software projects provide excellent communities to learn these skills.

Furthermore, computer science programs don't usually teach students how to read other people's code, so students don't get to see what professional, production-ready projects look like. Free and open source projects are rich sources of high-quality code, which UCSC students study as part of their immersion in open source.

Some of the training in open source for UCSC computer science students is classroom-based. They learn how to work on GitHub, the dynamics of free software communities, and the history of two major communities: the Linux kernel and FreeBSD. But they also get an unusual hands-on project they must complete in order to get a passing grade: they must get a patch to the Linux kernel accepted. Their patch can be trivial, such as a documentation change, but the assignment requires them to engage with the community.

CROSS projects

CROSS awards fellowships for work on research and incubator projects. Research fellows are UCSC Ph.D. students who are working on cutting-edge innovation with a plausible path to generate open source software projects. Incubator fellows are postdocs who are building a developer community around an open source research prototype.

CROSS calls for research and incubator proposals twice a year and tries to start at least two new projects per year, subject to availability of funding. CROSS, together with its industry sponsors and advisory committee (which includes Doug Cutting, Sage Weil, Karen Sandler, Nissa Strottman, and James Davis), reviews all projects twice a year and expects to fund each fellow for 2-4 years. The organization helps incubator fellows seed their developer community through the Open Source Research Experience program, which encourages students to work on project ideas authored and mentored by CROSS fellows. Thus, CROSS is both a research facility and an incubator. The current list of CROSS projects is available online.

Given that CROSS has been in existence for five years, the oldest incubator project is nearing completion of its fourth year and is facing the challenge of making the project sustainable outside of CROSS. Incubator fellows are doing a lot of grant proposal writing right now, while making their projects attractive to well-established open source software communities.

CROSS support includes paying incubator fellows, giving them time to recruit and mentor new developers, and guiding them as they connect to well-established open source projects and outside sources of financial support. When proposing an incubator project, applicants must demonstrate interest from well-known developer communities outside of UCSC. We'll see in the next section what's in it for these outsiders.

Working on a CROSS project is not lucrative. Maltzahn says that an incubator fellow could probably earn four times as much money in a typical coding job in nearby Silicon Valley. So students work on the projects out of a passion to make them successful, with something of a start-up mentality.

The sterling model for a student open source project is Ceph, the most popular open source software for object storage. Sage Weil developed it while he was a grad student at UC Santa Cruz. He spun out a company around Ceph, eventually selling it to Red Hat while Ceph remained open source. Weil then became an advisor to UCSC in its creation of CROSS.

Corporate sponsors have played a key role in CROSS from the start. It began with three sponsors, and typically gets about $100,000 from each sponsor per year, for a total income of $300,000 to $600,000 per year. Companies sign on because open source helps them create and shape new markets, because they are looking for opportunities to network with talent and potential recruits, and because they want to track and influence education, research, and next-generation open source software ecosystems.

But money is only one of the important benefits a sponsor offers. Sponsors also advise the students on real-life business requirements that affect their projects.

For instance, one CROSS project called SkyhookDM adds smart computations (such as distributed queries) to Ceph. Some companies that manufacture computer storage devices support the project because they have narrow profit margins and prefer to outsource high-risk, pre-competitive research to universities. These companies also provide a reality check for the SkyhookDM developers by explaining how much additional cost new features can impose on storage products. Of all CROSS projects, SkyhookDM is furthest along toward developing a sustainable funding model.

Outside volunteers

CROSS measures the success of an open source project largely by the health of the community that forms around it: what matters is not only the number of contributors, but also the number of different organizations they come from. As we have seen, the project must attract interest from developers unaffiliated with UCSC even before getting CROSS approval. The organization has found that its projects appeal to programmers around the world, in countries such as Mexico, Guatemala, Nigeria, and India.

Some of these are volunteers; others are paid during summer sessions and sometimes continue to be paid if there is leftover money in the fall. This summer, CROSS is employing 11 students, all undergraduates, of whom five are at UCSC and six are from elsewhere.

Maltzahn believes that money is not the prime motivator, because about half stay on as volunteers after the payments end. They stay because they can learn from illustrious mentors at UCSC, getting training that is not available in their local communities. He has seen that the experience these students get in CROSS projects helps them get into degree programs.

Getting the university's act together

To adequately support free and open source projects, university lawyers and administrators have to learn a lot more about their licenses and communities. Maltzahn points out that academic institutions have invested heavily in expertise about publishing, patenting, launching businesses, and other laws and logistics in the proprietary world, but still have to catch up on open source strategies that could amplify their impact on society.

Students and professors report that open source computer science education is valuable. For instance, students learn to go find the tools they need, rather than just working in the environment set up by the professor. This lets them become "productively lost," in the phrase attributed to David Humphrey of Seneca College. Graduates report that the skills they learned, particularly how to work with other people, have made it easier to get jobs and succeed in them, whether or not the job involves open source.

Maltzahn recommends that research universities create open source program offices (OSPOs), mirroring the "Talk Openly, Develop Openly" network of big corporations' OSPOs. He'd also love to see universities track their impact on society through the production of open source software, such as Ceph. I have a sense that such tracking would attract attention to free and open source software, and prompt a lot more colleges to make it a part of the computer science curriculum.

About Andrew Oram:

Andy is a writer and editor in the computer field. His editorial projects at O'Reilly Media ranged from a legal guide covering intellectual property to a graphic novel about teenage hackers. Andy also writes often on health IT, on policy issues related to the Internet, and on trends affecting technical innovation and its effects on society. Print publications where his work has appeared include The Economist, Communications of the ACM, Copyright World, the Journal of Information Technology & Politics, Vanguardia Dossier, and Internet Law and Business. Conferences where he has presented talks include O'Reilly's Open Source Convention, FISL (Brazil), FOSDEM (Brussels), DebConf, and LibrePlanet. Andy participates in the Association for Computing Machinery's policy organization, USTPC.