Open data on universities – New fuel for transformation
Some describe data as the new oil while others suggest it is a new form of capital or compare it to electricity. Either way, there appears to be a groundswell of interest in the potential of data to fuel development.
Whether the proliferation of data is skewing development in favour of globally networked elites or disrupting existing asymmetries of information and power is the subject of ongoing debate. Certainly, there are those who claim that open data, from a development perspective, could catalyse disruption and redistribution.
Open data is data that is free to use without restriction. Governments and their agencies, universities and their researchers, non-governmental organisations and their donors, and even corporations, are all potential sources of open data.
Open government data, as a public rather than a private resource, embedded in principles of universal access, participation and transparency, is touted as being able to restore the deteriorating levels of trust between citizens and their governments.
Open data promises to do so by making the decisions and processes of the state more transparent and inclusive, empowering citizens to participate and to hold public institutions to account for the distribution of public services and resources.
Benefits of open data
Open data has other benefits over its more cloistered cousins (data in private networks, big data, etc). By democratising access, open data makes possible the use of data on, for example, health services, crime, the environment, procurement and education by a range of different users, each bringing their own perspective to bear on the data. This can expose bias in the data or may improve the quality of the data by surfacing data errors. Both are important when data is used to shape government policies.
Removing barriers to reusing data, such as copyright restrictions or licence fees, allows tech-savvy entrepreneurs to develop applications that help the public make more informed decisions by making available easy-to-understand information on medicine prices, crime hot-spots, air quality, beneficial ownership, school performance, etc. And access to open research data can improve quality and efficiency in science.
Scientists can check and confirm the data on which important discoveries are based if the data is open, and, in some cases, researchers can reuse open data from other studies, saving them the cost and effort of collecting the data themselves.
But access alone is not enough for open data to realise its potential. Open data must also be used. And data is used if it holds some value for the user. Governments have been known to publish server rooms full of data that no one is interested in to support claims of transparency and of support for the knowledge economy. That practice is called 'open washing'.
In South Africa, the Department of Higher Education and Training, or DHET, receives bi-annual data submissions of student and staff records from each of the now 25 public universities in South Africa. The department validates and stores the data in the Higher Education Management Information System, known as HEMIS.
While the HEMIS data is not accessible to the public, the department does publish on its website anonymised data tables extracted from HEMIS on the 'Universities' page under the heading 'HEMIS Resources'. At the time of writing, the most recent data available are student and staff data for the 2015 academic year.
Data on the department’s website consist of large Microsoft Excel tables. The tables are published with no accompanying information to guide users as to what data the tables contain, whether there are any restrictions on data reuse, how the data is structured or what any of the jargon and numerous acronyms mean. For example, it is not made clear what the difference is between the tables described as 'Enrolment 2.12 (2001-2014)' and 'Enrolment 2.7 (2001-2014)'.
Accessibility and interpretability
On earlier versions of the DHET website, the data were particularly difficult to locate. This situation has improved with more recent versions of the website, but it remains questionable whether the department's data meet two of the eight dimensions of data quality spelled out by Statistics South Africa in its South African Statistical Quality Assessment Framework – accessibility and interpretability.
To remedy the scarcity of accessible and usable data on South African universities, the Centre for Higher Education Trust, or CHET, has been publishing open data on the performance of South Africa’s state-funded public universities since 2009.
Initially, CHET published data as indicators, selected to be of use to university planners and executives attempting to steer South Africa’s public universities along a path mapped out by the erstwhile Department of Higher Education’s funding model. Many of the indicators relate to student and staff data: enrolments, graduates, number of academic staff, etc.
Unlike the DHET data tables, however, CHET’s indicators include additional data on two critical areas of performance: research productivity and finances.
Open data was published in the form of an application that allowed users to generate customised graphs of the indicators from the underlying data. It allowed users to compare universities on a single graph, to download the graph generated as an image or to download the data for further analysis. Each graph generated is presented with a data table and a glossary of terms (see Figure 1).
Figure 1: CHET’s open data application
Research has shown that CHET, along with other actors, plays an important role by catalysing the flow of data in the higher education system (see Figure 2) despite the shortcomings of the open data published on the Department of Higher Education and Training’s website.
One of the reasons why CHET is successful in liberating university data from its viscous state is that it goes beyond access. Initially, university planners were its target audience. But it became apparent from the requests it received that the use of the data was more diverse than anticipated. Researchers, journalists, consultants and private companies were using the data in addition to university planners.
Researchers and private companies wanted more granular data; journalists wanted data easily transmittable in more visual formats.
Figure 2: The flow of data in the South African university data system
More granular data
CHET has responded to these requests as best it can by publishing more granular data for the existing indicators, by creating a graph image gallery, and by alerting users to data updates. It also rechecks all the department’s data prior to publication.
CHET has also expanded the number of indicators as the interests of data users change. For example, it included data on the racial and gender composition of students as interest in what is perceived to be the slow pace of transformation gained traction. And, more recently, it has added new financial indicators as the Fees Must Fall movement sprang up and drew attention to the costs and financing of universities.
In June 2017, not long after DHET published the 2015 data, CHET published its updated open data for 2015. The data are structured in the form of 26 indicators. Data for up to and including 2015 captures an important juncture in the history and development of South African universities because it reflects trends in university funding and expenditure at the time just before the Fees Must Fall student movement mobilised. What does the CHET data reveal?
The data on the relative contribution of three sources of university income show that in all but two universities, student fee income increased as a percentage of total income when comparing 2015 to 2009 (see Figure 3).
At Durban University of Technology, University of the Free State, University of KwaZulu-Natal or UKZN and the University of Limpopo, there was a shift of more than 10% in the contribution of student fees to all income.
The data also show a decrease in the contribution of income from private sources, particularly at the University of Fort Hare, University of the Free State, UKZN, the University of South Africa or UNISA, the University of the Witwatersrand or Wits, and the University of Zululand. Government funding increased noticeably as a percentage of all income at Fort Hare, UNISA, Wits and Zululand.
Figure 3: % change in sources of university income, 2009-2015
CHET’s open data from a newly added indicator show that at all South African public universities, the income from student fees and other related fee income increased per full-time equivalent student enrolled (see Figure 4).
For example, at the University of Pretoria, the income from student fees and other fee-related income in 2009 was ZAR21,477 (US$1,600) per full-time equivalent student. By 2015, this amount had increased to ZAR38,912 (US$2,900), an increase of more than 80% over a six-year period, without taking into consideration the effects of inflation on the university’s costs.
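The size of that increase can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the two University of Pretoria figures quoted above, and the function name `percent_increase` is an assumption made for this example, not part of CHET's application.

```python
def percent_increase(start: float, end: float) -> float:
    """Return the nominal (not inflation-adjusted) percentage increase."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (end - start) / start * 100.0


# Fee income per full-time equivalent student in ZAR, as quoted in the text.
fee_income_2009 = 21_477
fee_income_2015 = 38_912

increase = percent_increase(fee_income_2009, fee_income_2015)
print(f"Nominal increase 2009-2015: {increase:.1f}%")
```

The result is a nominal figure; comparing it with inflation would require a price index for the same period, which the article does not supply.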
If the CHET open data are used to calculate the average annual increase in universities’ income from student fees, then the results show that at seven universities, fee income per student increased at a rate less than the 9.0% often claimed as the real rate of inflation (see Figure 5). At eight universities, the rate of increase was below the national average of 9.8% while increases at three universities, at more than 14%, were well above both the rate of inflation and the national average.
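An average annual rate of this kind can be derived from a start value and an end value. The article does not state how the averages in Figure 5 were calculated, so the sketch below assumes a compound annual growth rate. The function names are hypothetical, and the University of Pretoria figures quoted earlier are used purely as an illustration. The 9.0% and 9.8% benchmarks are the ones mentioned in the text.

```python
def compound_annual_growth(start: float, end: float, years: int) -> float:
    """Return the compound annual growth rate, in percent.

    Assumption: growth is compounded once per year over `years` intervals.
    """
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return ((end / start) ** (1 / years) - 1) * 100.0


def compare_to_benchmarks(rate: float, benchmarks: dict[str, float]) -> dict[str, bool]:
    """Map each benchmark name to True if `rate` exceeds that benchmark."""
    return {name: rate > value for name, value in benchmarks.items()}


# Benchmarks quoted in the article, in percent per year.
BENCHMARKS = {"claimed inflation rate": 9.0, "national average": 9.8}

# Fee income per FTE student in ZAR for 2009 and 2015 (six annual intervals).
rate = compound_annual_growth(21_477, 38_912, years=6)
above = compare_to_benchmarks(rate, BENCHMARKS)

print(f"Average annual increase: {rate:.1f}%")
for name, is_above in above.items():
    print(f"Above {name}: {is_above}")
```

An arithmetic mean of the six year-on-year changes would give a slightly different figure, but calculating it requires the full annual series rather than just the two endpoints.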
Figure 4: Formal tuition and related fee income per FTE enrolment 2009-2015
Figure 5: Student fee income per student: Average year-on-year increase 2009-2015
The graphs above only hint at what is possible in terms of creating an empirically-based indication of the state of South African universities. To be sure, the graphs raise many questions requiring further triangulation, analysis and debate.
Other data providers exist in the university data system, and the responsible government department does attempt to publish some data in the public domain. But it is the accessible, usable and relevant open data on South African universities published by CHET that makes it possible for journalists, researchers, employers, consultants, donor funders and others to monitor, advise and challenge, from an informed vantage point, South Africa’s universities and the policies that steer their transformation.
Francois van Schalkwyk is a doctoral candidate at Stellenbosch University’s Centre for Research on Evaluation, Science and Technology (CREST), host of the DST-NRF Centre of Excellence in Scientometrics and STI Policy. Van Schalkwyk is an independent researcher in the areas of higher education, open data and scholarly publishing. Parts of this article draw on the paper ‘Viscous Open Data: The Roles of Intermediaries in an Open Data Ecosystem’ by François van Schalkwyk, Michelle Willmers & Maurice McNaughton, Information Technology for Development, Vol. 22, Iss. sup1, 2016. http://dx.doi.org/10.1080/02681102.2015.1081868.