Seeking 200 data scientists for huge radio telescope

South Africa is seeking 200 data scientists for research that will flow from the huge global Square Kilometre Array radio telescope project. Last year the new Sol Plaatje University became the first in Africa to introduce a dedicated degree in data science. And last week three institutions teamed up to form the Inter-University Institute for Data Intensive Astronomy, or IDIA, which aims to train up to 100 young data scientists over five years.

The universities of Cape Town, the Western Cape and North-West are partners in the IDIA, which was launched at the South African Astronomical Observatory in Cape Town last Thursday by Minister of Science and Technology Naledi Pandor.

Pandor described the Square Kilometre Array, or SKA, radio telescope as “not simply an astronomy project. Or a big science project. Or an infrastructure project. It’s certainly a global infrastructure project and there will be activities in some 20 countries on five continents.

“Total project costs will run into billions of euros, with much being spent on relaying, storing and analysing the data captured by the antennae – a task that will require processing power estimated to be equal to several million of today’s fastest computers,” she said at the launch.

The SKA project will build the world’s largest radio telescope, 50 times more powerful and 10,000 times faster than any other. The SKA will be constructed in Western Australia and in the Karoo in South Africa, and eight other African countries will help to host antennae spread over 3,000 kilometres.

Construction of the SKA is planned to start in Carnarvon, Northern Cape province, in 2017-18, with some elements operational by 2020 and full operation in 2025. The project’s first phase was granted a budget of €650 million (US$724 million) in 2013.

Pushing boundaries

Pandor told the IDIA launch that Professor John Womersley, a member of the SKA board, once said that “SKA is to some extent an IT project with an astronomy question as a driver”.

“It’s an IT project of the kind that pushes the boundaries of global technology,” Pandor said.

“Big tech companies like IBM and Cisco are already involved because they know it will allow them to develop the knowledge and technologies that will keep them at the leading edge of computing. This in turn will benefit computer users in many spheres, from finance to government through industry and medicine to other science researchers.

“SKA challenges big data to the extreme. All science pushes the boundaries of knowledge but big science like SKA has the ambition to push those boundaries on the largest scale imaginable.

“Our challenge in Africa is to use big data to find answers to big science questions. To do that, we have to develop capacity in Africa.”

Developing capacity

SKA has challenged universities in South Africa to respond to the field of big data.

Pandor said that the newly established Sol Plaatje University, in Kimberley in Northern Cape Province – where SKA is located – had made history last year by becoming the first institution in Africa to introduce a dedicated undergraduate degree in data science. The current intake for the degree is around 30 students.

“Other universities have recognised the urgent need to develop programmes in the area of big data to be globally competitive in SKA research and are starting up programmes at postgraduate level and appointing senior staff with data science backgrounds,” she said.

The Department of Science and Technology, or DST, was supporting postgraduate students through grants from institutions such as the Centre for High Performance Computing, or CHPC. “For the past three years, an average of 15 postgraduate (masters and doctoral) students per year graduated from CHPC-supported programmes.”

Further, said Pandor, the government-funded National Integrated Cyber-Infrastructure System, through its Data Intensive Research Initiative for South Africa, or DIRISA, would support the development of data science across the national research and innovation space.

“This will be done by enabling and facilitating data-intensive research activities in and between higher education and research institutions. Data-intensive research-capacity development programmes will be established at two institutions during this year and this initiative will be expanded to other institutions.”

Negotiations were underway with high-tech companies such as IBM to develop massive open online courses, or MOOCs, on big data science topics that could be included in courses offered by universities.

“Apart from the significant investments – R200 million [US$14 million] per year – to date in supporting infrastructures for big-data research, the DST will invest about R100 million over the next three years in the establishment of DIRISA.

“In addition, work with European partners (and funding) in developing training initiatives is underway. Efforts have been initiated at a national level to better coordinate various research community efforts and infrastructures in support of developing big data skills and projects,” the minister said.

The new institute

The IDIA will gather together researchers in the fields of astronomy, computer science, statistics and e-research technologies to create data science capacity to lead the MeerKAT – a precursor radio telescope – and SKA.

Professor Russ Taylor, IDIA founding director, told IT Web last week that the institute would enable South African universities to advance within the global SKA project, leading in data science rather than merely providing data.

“The leading edge of the knowledge and information economy is increasingly driven by analytics of big data. If we cannot fill the gap as a country, we will be less and less competitive on the world stage,” he said.

“Universities that rise to the challenge of the data revolution will be globally competitive in this new era of data-intensive research.”

Pandor described the IDIA initiative as timely. It planned to provide training in SKA-driven data-science research for up to 100 young data scientists over the next five years. SKA SA also had a significant programme that was starting to focus on supporting work in big data.

“IDIA will make a significant and broad contribution to the research enterprise in South Africa. Through a focused research and training programme in data-intensive science, IDIA will drive innovation in big data solutions that will have impact beyond astronomy.

“We will be working proactively to transfer knowledge and expertise to benefit a broad range of data-challenged domains in science, humanities and commerce.”