The other day, Katie E. M. tweeted the following:
Astrophysicists are all like, “ah, yes, this important piece of technology? It’s called the Very Very Very Long Extra Big Spectrometer Device. This other one? We call it NINJATURTLE for short, and that stands for Nitrogen Inspecting Nice Jiggly Astronomical Technology…Urtle.”
and the physics community gave a collective sheepish grin.
I suddenly remembered an experiment I did (but never published) where astronomy professor Meghan Gray of the University of Nottingham sent me a list of astronomy acronyms for a neural network to try to imitate. Since I was training the neural network from scratch, it didn’t have any built-in concept of what acronyms are (or even what English is) and proceeded to completely fail to understand what it was supposed to do.
SIGA Search for Astrophysical Research in Antarctica
GONG Grouble Research Apparatus
FAST Fart Agension Probe
APOCOS Alloon-borbed Interferementated Systems
SARA Sutellite pour le Radio Sources
BATBEAR POLAT Collacting Optical Telescope
COLERA Compact Objects from Earth-Asterion Spectrometer
SHART Super Near Infrared Camera
EPOCh Extrasolar Planet Observations of Shart Survey
BAGASS Bamma-ray Astrophysics Loose Space Telescope
SORE Starch of Nearby Galaxies and Interferometry
Recently, though, I’ve been revisiting old datasets that didn’t work or barely worked, because there’s now a free-to-use neural net called mini-GPT-2 that’s able to make headway on datasets that completely flummoxed the others. It has more computing power than the neural nets that fit on my laptop, it uses attention to keep track of information sentences or even paragraphs later, and it starts with the ability to generate a pretty impressive array of kinds of text, thanks to the time it spent training on a huge chunk of internet text.
Could the more powerful mini-GPT-2 figure out the astronomy acronyms?
The answer, as it turns out, is no.
First, there were the acronyms whose explanations did not even remotely match the letters.
AIRholes An Quest for Explanatory Astronomical Data
NOAAD Occluminated Propagation of Deposited Vegetations
PHATLIN Franklin Optical Seismic Telescope
DELETE Data Processing Interchange
IMF Too Large to Grasp Space
LMICE People to Detect Interstellar Sky-ins and Back-Noses in the Sky of Unstoppable Objects in the Supernovae
It also tended to leave off the acronyms entirely, but produce rather attention-getting project names:
Mothworm Monitoring Vortex Zone Laboratory
Vaboom Vantage Xybara Wide Angle Camera Array de Humboldt
Radio-When-Willing Sirius Survey
Surreal Visual Sky Imager
Orbiting Objects That Might Backfire
Shiftless Data Structures
Ask Astronomers When to Shut Up
Unicorn Astronomy Survey over Red-Terrestrial Regions
Hawking Telescope Ant Objects
Zongutan Obscene Galaxy
Amazingly Faint Explosions Eyeballs
Part of that could be lack of training data - I only had a list of about 200 acronyms, so there may not have been enough for GPT-2 to grasp the pattern. It spent its time learning to put the entries in alphabetical order instead. With a larger set of example acronyms, maybe it would have caught on. Maybe not.
And then there was another set of acronyms I would categorize as simply “NO.” Let’s just say that NASA would have trouble getting funding for these. If you want to read them, become an AI Weirdness supporter to get them as bonus content.