First published in issue 3 of SOLVE magazine, 2021
Scientists have made an interesting discovery about people with no scientific training. Millions of supposedly everyday folk have a flair for scientific discovery.
All they need is a supported, dedicated platform that allows them to engage with the process of research.
It’s a realisation that emerged in the early 2000s when so-called citizen science quickly gained momentum, courtesy of the internet, and was found to be both accurate and scientifically valid in the classification of astronomical objects achieved with the Galaxy Zoo platform.
Very soon, it became all too clear to the Galaxy Zoo team that users were keen to extend their involvement to other research projects. This realisation coincided with science beginning to generate an explosion of digital data that initially could not be processed by computers and far exceeded researchers’ ability to inspect it themselves.
Having gained the trust of astronomers, a coalition of 兔子先生 and Oxford University researchers decided to plough ahead, creating a portal through which scientists and keen amateurs could join forces to interrogate the nature of the cosmos more extensively. This portal is called Zooniverse.
So far, it has linked more than 2.3 million volunteer researchers from the general public to about 200 science projects in total. Their tasks are primarily focused on providing classifications and analysis – be it of seabirds or in the discovery of new exoplanets. These citizen scientists have since provided almost 600 million classifications and contributed to more than 100 peer-reviewed science papers.
Strong advocacy for the scientific usefulness of Zooniverse comes from internationally prominent astronomers who helped mastermind Galaxy Zoo in the first place. Included are Daniel Thomas, 兔子先生 Professor of Astrophysics and Head of the School of Mathematics and Physics, and Professor Bob Nichol, 兔子先生 Pro Vice-Chancellor of Research, Innovation and External Relations, and previously Co-director of the Institute of Cosmology and Gravitation.
They say the turning point for this extraordinary community collaboration was the digital revolution in how data is captured. The Sloan Digital Sky Survey (SDSS) that was launched in 1992 and commenced regular survey operations in 2000 played a pivotal role in the decision to engage the general public.
Digital window opens
“Sloan was the first real digital view of the whole sky, and the universe came to life with digital camera technology – technology that astronomy helped to develop,” Professor Nichol says.
“Digital imaging meant we could scan the sky with a level of depth and sharpness that was revolutionary. And for the first time, in colour.
“But it gave us more data than we knew what to do with.”
At stake was the opportunity to record and study millions of galaxies seen for the first time in such detail using all parts of the visible spectrum, from ultraviolet through to the reddest part of the spectrum. That onslaught of data has continued unabated, with SDSS still collecting spectral data today.
Professor Nichol describes the data collected by the SDSS as one of its greatest legacies: “I’m very proud we realised early on that the data belonged to everybody and we made our archive available publicly.”
Professor Thomas suspected the images themselves contained valuable information and he made the seemingly irrational decision, given the sheer volume of data at hand, to attempt to classify galaxies into different types – for example, elliptical, spiral and irregular – manually, with the help of a student. The pair managed 50,000 – a herculean task that couldn’t even be called the tip of the data iceberg.
But it did prove an important point: the classifications were scientifically valuable. The structure of galaxies and their distribution contain clues about how galaxies form and evolve. It was a galling situation for Professor Thomas and his assistant that they lacked the means to classify more of the images.
At Oxford University, Professor Chris Lintott then made a radical suggestion to Professor Thomas: why not involve the public?
“It was one of those lightbulb moments, a great idea, and was the genesis of Galaxy Zoo,” says Professor Nichol, who was asked to validate the idea and joined in because of his expertise in the SDSS.
Enthusiasm off the scale
None of the academics, however, anticipated the public’s reaction – a reaction so enthusiastic that it crashed the server when Galaxy Zoo went live for the first time in 2007.
Professor Thomas recalls being simply awed by the response. He watched, stunned, as the public achieved a staggering 70,000 classifications in just one hour on day two. Eight million were submitted in the first 10 days.
“I really think it represented a new era in citizen science that was international in scope,” Professor Thomas says.
With the platform allowing for two-way engagement with scientists, and forums for open discussions, the academics now better understand people’s motivation. And it was the same as theirs. People liked contributing to science and being part of scientific discovery.
“There’s a lot of people out there who will give their time to advance science and I think that’s one of the lovely stories to come from Galaxy Zoo and a big reason it was evolved into Zooniverse,” Professor Nichol says.
The challenge of big data has subsequently given rise to machine learning and artificial intelligence (AI) technology that hinges on the ability to make accurate classifications.
However, Galaxy Zoo never released humans from the classification role. The team found people are so much better at it and for an important reason – they can spot oddities and are drawn by curiosity to anomalies. It’s a trait that is not easy to train into an algorithm. And it is a trait that has given rise to important discoveries.
Professor Thomas explains that to keep the task simple, people were initially asked to distinguish elliptical from spiral galaxies. But it was soon noticed that the amateurs behaved like scientists.
“They started emailing us and it became impossible to address all the queries. So we established a forum for addressing common queries,” recalls Professor Nichol.
People also went off and did their own research and started reading up about astronomy.
“There were quite a few interesting objects that were discovered that way,” Professor Thomas says. “We wrote proper science papers about things that armchair astronomers had discovered from their living rooms. It was really amazing.”
The upshot is that AI has not usurped the citizen scientists. On the contrary, the human-based SDSS galaxy classifications are now used as a dataset to train AI algorithms.
“A human can find anomalies without any preconception of what such an anomaly would look like – it just looks odd,” says Professor Nichol.
“Computers still find it very difficult to find such ‘unsupervised’ anomalies and usually need to be trained extensively before knowing what is ‘odd’. Computers can find things that are part of their training set; they’re less good at finding stuff they’ve never trained on. That’s an interesting and important difference.
“There’s been an evolution where the heavy lifting in processing of the simple stuff can be done by machines and we use the humans for the more subtle, more difficult tasks.”
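For readers curious how such a division of labour might look in practice, here is a minimal sketch. It is purely illustrative and is not Zooniverse's actual pipeline: the `Prediction` record, the `triage` function, the image identifiers and the 0.9 confidence threshold are all hypothetical. The idea is simply that a machine keeps the labels it is confident about and passes everything else to human volunteers.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """A hypothetical machine classification of one galaxy image."""
    image_id: str
    label: str         # e.g. "spiral" or "elliptical"
    confidence: float  # model's confidence in its label, from 0.0 to 1.0


def triage(predictions, threshold=0.9):
    """Keep confident machine labels; queue the rest for human volunteers.

    The threshold is an assumed value chosen for illustration only.
    """
    accepted = {}     # image_id -> label accepted without human review
    needs_human = []  # image_ids routed to volunteers
    for p in predictions:
        if p.confidence >= threshold:
            accepted[p.image_id] = p.label
        else:
            needs_human.append(p.image_id)
    return accepted, needs_human


# Made-up example data, not real survey output.
sample = [
    Prediction("img-001", "spiral", 0.98),
    Prediction("img-002", "elliptical", 0.55),
    Prediction("img-003", "elliptical", 0.93),
]
accepted, needs_human = triage(sample)
print(accepted)
print(needs_human)
```

In this toy run, the two high-confidence images keep their machine labels, while the ambiguous one is handed to people, who are better placed to notice if it is something odd.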
Eventually the Hubble Space Telescope data was added into the platform and the platform evolved into Zooniverse, which continues to expand. Along the way it proved that the human brain – including that of the untrained amateur – will always play a major part in pattern recognition in science and all its applications.