This Week in AI: Let us not forget the humble data annotator

Kyle Wiggers and Devin Coldewey

Updated 3 May 2024 at 12:29 pm

Keeping up with an industry as fast-moving as AI is a tall order. So until an AI can do it for you, here’s a handy roundup of recent stories in the world of machine learning, along with notable research and experiments we didn’t cover on their own.

This week in AI, I'd like to turn the spotlight on labeling and annotation startups -- startups like Scale AI, which is reportedly in talks to raise new funds at a $13 billion valuation. Labeling and annotation platforms might not get the attention flashy new generative AI models like OpenAI's Sora do. But they're essential. Without them, modern AI models arguably wouldn't exist.

The data on which many models train has to be labeled. Why? Labels, or tags, help the models understand and interpret data during the training process. For example, labels to train an image recognition model might take the form of markings around objects (known as "bounding boxes") or captions referring to each person, place or object depicted in an image.
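To make the idea of a label concrete, here is a minimal sketch of what one annotated image might look like as a data record. The class names, field names, file name and coordinates are all hypothetical and chosen for illustration; real annotation platforms each use their own formats.

```python
from dataclasses import dataclass, field


@dataclass
class BoundingBox:
    """A rectangular marking around one object, in pixel coordinates."""
    label: str   # what the annotator says the object is
    x: int       # left edge of the box
    y: int       # top edge of the box
    width: int
    height: int

    def area(self) -> int:
        return self.width * self.height


@dataclass
class ImageAnnotation:
    """Everything an annotator attached to a single image."""
    image_path: str
    caption: str
    boxes: list[BoundingBox] = field(default_factory=list)

    def labels(self) -> list[str]:
        # Distinct object labels present in the image, sorted alphabetically.
        return sorted({box.label for box in self.boxes})


# Hypothetical example record; the file name and coordinates are made up.
example = ImageAnnotation(
    image_path="street_scene.jpg",
    caption="A person walking a dog past a parked car.",
    boxes=[
        BoundingBox("person", x=40, y=60, width=80, height=200),
        BoundingBox("dog", x=130, y=190, width=70, height=60),
        BoundingBox("car", x=260, y=120, width=300, height=150),
    ],
)
```

Every box and caption in a record like this is something a person had to draw or type, which is why large datasets add up to so much annotation work.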

The accuracy and quality of labels significantly impact the performance -- and reliability -- of the trained models. And annotation is a vast undertaking, requiring thousands to millions of labels for the larger and more sophisticated datasets in use.

So you'd think data annotators would be treated well, paid living wages and given the same benefits that the engineers building the models themselves enjoy. But often, the opposite is true -- a product of the brutal working conditions that many annotation and labeling startups foster.

Companies with billions in the bank, like OpenAI, have relied on annotators in third-world countries paid only a few dollars per hour. Some of these annotators are exposed to highly disturbing content, like graphic imagery, yet aren't given time off (as they're usually contractors) or access to mental health resources.


An excellent piece in NY Mag peels back the curtain on Scale AI in particular, which recruits annotators in places as far-flung as Nairobi, Kenya. Some of the tasks required by Scale AI take labelers multiple eight-hour workdays -- no breaks -- and pay as little as $10. And these workers are beholden to the whims of the platform. Annotators sometimes go long stretches without receiving work, or they're unceremoniously booted off Scale AI -- as happened to contractors in Thailand, Vietnam, Poland and Pakistan recently.

Some annotation and labeling platforms claim to provide "fair-trade" work. In fact, they've made it a central part of their branding. But as MIT Tech Review's Kate Kaye notes, there are no regulations, only weak industry standards for what ethical labeling work means -- and companies’ own definitions vary widely.

So, what to do? Barring a massive technological breakthrough, the need to annotate and label data for AI training isn't going away. We can hope that the platforms self-regulate, but the more realistic solution seems to be policymaking. That itself is a tricky prospect -- but it's the best shot we have, I'd argue, at changing things for the better. Or at least starting to.

Here are some other AI stories of note from the past few days:

OpenAI builds a voice cloner: OpenAI is previewing a new AI-powered tool it developed, Voice Engine, that enables users to clone a voice from a 15-second recording of someone speaking. But the company is choosing not to release it widely (yet), citing risks of misuse and abuse.
Amazon doubles down on Anthropic: Amazon has invested an additional $2.75 billion in the growing AI startup Anthropic, following through on the option it left open last September.
Google.org launches an accelerator: Google.org, Google’s charitable wing, is launching a new $20 million, six-month program to help fund nonprofits developing tech that leverages generative AI.
A new model architecture: AI startup AI21 Labs has released a generative AI model, Jamba, that employs a new(ish) model architecture -- state space models, or SSMs -- to improve efficiency.
Databricks launches DBRX: In other model news, Databricks this week released DBRX, a generative AI model akin to OpenAI’s GPT series and Google’s Gemini. The company claims it achieves state-of-the-art results on a number of popular AI benchmarks, including several measuring reasoning.
Uber Eats and UK AI regulation: Natasha writes about how an Uber Eats courier’s fight against AI bias shows that justice under the U.K.'s AI regulations is hard won.
EU election security guidance: The European Union published draft election security guidelines Tuesday aimed at the roughly two dozen platforms regulated under the Digital Services Act, including guidelines on preventing content recommendation algorithms from spreading generative AI–based disinformation (aka political deepfakes).
Grok gets upgraded: X’s Grok chatbot will soon get an upgraded underlying model, Grok-1.5 -- at the same time all Premium subscribers on X will gain access to Grok. (Grok was previously exclusive to X Premium+ customers.)
Adobe expands Firefly: This week, Adobe unveiled Firefly Services, a set of more than 20 new generative and creative APIs, tools and services. It also launched Custom Models, which allows businesses to fine-tune Firefly models based on their assets -- a part of Adobe's new GenStudio suite.

More machine learnings

How's the weather? AI is increasingly able to tell you this. I noted a few efforts in hourly, weekly, and century-scale forecasting a few months ago, but like all things AI, the field is moving fast. The teams behind MetNet-3 and GraphCast have published a paper describing a new system called SEEDS (Scalable Ensemble Envelope Diffusion Sampler).

Animation showing how more predictions create a more even distribution of weather predictions. Image Credits: Google

SEEDS uses diffusion to generate "ensembles" of plausible weather outcomes for an area based on the input (radar readings or orbital imagery perhaps) much faster than physics-based models. With bigger ensemble counts, they can cover more edge cases (like an event that only occurs in 1 out of 100 possible scenarios) and can be more confident about more likely situations.
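A bit of arithmetic shows why ensemble size matters for those edge cases. The sketch below assumes each ensemble member is an independent sample, which is a simplification made here for illustration and not a claim about how SEEDS works; the function name is hypothetical.

```python
def chance_ensemble_sees_event(event_probability: float, ensemble_size: int) -> float:
    """Probability that at least one of `ensemble_size` independent samples
    contains an event that occurs with `event_probability` per sample."""
    return 1.0 - (1.0 - event_probability) ** ensemble_size


# How likely is an ensemble to include the 1-in-100 scenario at all?
for size in (10, 50, 100, 500):
    chance = chance_ensemble_sees_event(0.01, size)
    print(f"{size:>4} members: {chance:.1%} chance of seeing the 1-in-100 event")
```

Under that independence assumption, a 10-member ensemble includes the rare scenario only about 10% of the time, while a 100-member ensemble includes it about 63% of the time. That gap is the practical appeal of generating ensemble members cheaply.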

Fujitsu is also hoping to better understand the natural world by applying AI image handling techniques to underwater imagery and lidar data collected by underwater autonomous vehicles. Improving the quality of the imagery will let other, less sophisticated processes (like 3D conversion) work better on the target data.

Image Credits: Fujitsu

The idea is to build a "digital twin" of waters that can help simulate and predict new developments. We're a long way off from that, but you gotta start somewhere.

Over among the large language models (LLMs), researchers have found that they mimic intelligence by an even simpler-than-expected method: linear functions. Frankly, the math is beyond me (vector stuff in many dimensions) but this writeup at MIT makes it pretty clear that the recall mechanism of these models is pretty … basic.

"Even though these models are really complicated, nonlinear functions that are trained on lots of data and are very hard to understand, there are sometimes really simple mechanisms working inside them. This is one instance of that," said co-lead author Evan Hernandez. If you're more technically minded, check out the researchers' paper here.
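For readers wondering what "linear function" means in practice, here is a toy illustration: a matrix multiplication plus an offset that maps one vector of numbers to another. The weights, bias and input below are made up, and the sketch does not reproduce the researchers' method; it only shows the kind of simple operation being described.

```python
def apply_linear_relation(weights, bias, subject_vector):
    """Return weights @ subject_vector + bias using plain Python lists."""
    return [
        sum(w * s for w, s in zip(row, subject_vector)) + b
        for row, b in zip(weights, bias)
    ]


# Made-up numbers: a 2x3 weight matrix, a 2-element bias, a 3-element input.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, -1.0]]
b = [0.5, -0.5]
subject = [1.0, 2.0, 3.0]

# Each output is a weighted sum of the inputs plus an offset.
attribute = apply_linear_relation(W, b, subject)
```

Real models work with vectors that have thousands of dimensions, but the operation is the same multiply-and-add.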

One way these models can fail is not understanding context or feedback. Even a really capable LLM might not "get it" if you tell it your name is pronounced a certain way, since these models don't actually know or understand anything. In cases where that might be important, like human-robot interactions, it could put people off if the robot acts that way.

Disney Research has been looking into automated character interactions for a long time, and this name pronunciation and reuse paper just showed up a little while back. It seems obvious, but extracting the phonemes when someone introduces themselves and encoding that rather than just the written name is a smart approach.

Image Credits: Disney Research

Lastly, as AI and search overlap more and more, it's worth reassessing how these tools are used and whether there are any new risks presented by this unholy union. Safiya Umoja Noble has been an important voice in AI and search ethics for years, and her opinion is always enlightening. She did a nice interview with the UCLA news team about how her work has evolved and why we need to stay frosty when it comes to bias and bad habits in search.

Video: https://www.youtube.com/watch?v=thTwpAJ9jQM

