Speech recognition works for kids, and it's about time


Speech recognition technology is finally working for kids.

That wasn’t the case back in 1999, when my colleagues at Scholastic Education and I launched a reading intervention program called READ 180. We’d hoped to incorporate voice-enabled capabilities: Children would read to a computer program, which would provide real-time feedback on their fluency and literacy. Teachers, in turn, would receive information about their students’ progress.

Unfortunately, our idea was 20 years ahead of the technology, and we launched READ 180 without speech-recognition capabilities. Even at the height of the dot-com bubble, speech recognition for classrooms was still largely the stuff of science fiction.

Artificial intelligence and machine learning hadn’t yet enabled us to map the terabytes of data required to block out ambient noise in busy classrooms. Nor had they evolved to grasp the complexity of children’s voices, which differ from adults’ in pitch and speech patterns, much less to recognize a variety of dialects and accents or, last but not least, to manage children’s less-than-predictable behaviors when engaging with technology.

At Scholastic, we didn’t want to tell kids they were mastering something when they weren’t, and we understood the profound implications of telling a young student they got something wrong when they were actually right.

Fast forward to today. Speech recognition has advanced to the point where it can recognize and process children’s speech and account for differences in accents or dialects. Companies like Dublin-based SoapBox Labs have developed speech-recognition technology that is modeled on the diversity of children’s voices you’d find in a busy playground or classroom. Thanks to the high accuracy and performance of this technology, elementary school educators can now rely on it to help them gauge students’ progress with more regularity and offer more personalized approaches to their instruction.

Such advances could not have come at a more crucial moment.

Even before the pandemic, more than 80% of children from economically disadvantaged families failed to reach reading proficiency by fourth grade. After a year of separation from skilled educators, fumbling with technology designed for adults and vast gaps in digital equity, students had learned just 87% of the reading that they would have in a typical year, according to a report from McKinsey & Co. They lost an average of three months of learning during spring school closures.

Not surprisingly, reading losses were especially acute in schools that serve mostly students of color, where reading scores were just 77% of the historical average.

As students return to classrooms, speech recognition can revolutionize education, as well as remote learning and entertainment in the home, by transforming the way children interact with technology. Voice-enabled literacy, math and language programs can further professionalize the field by taking the administrative work out of measuring a child’s learning rate and acquisition of foundational skills.

For example, speech recognition can generate regular and valuable insights into a student’s reading progress, pick up on patterns and isolate areas where improvement is needed. Teachers can review the progress and assessment data generated by voice-enabled tools, adapt the learning paths for each child’s needs, screen for challenges such as dyslexia and schedule timely interventions when necessary.

Voice-enabled reading tools allow every child to spend time reading aloud and receiving feedback during the school day, something that simply isn’t practical for one teacher to offer. To put the challenge in perspective: 15 minutes of individual time per child in a class of 25 would eat up more than six hours of a teacher’s day, every day. That sort of individualized observation and assessment was a persistent challenge for teachers before COVID-19. It becomes even more challenging with the emergence of remote learning and as students return to school with unprecedented educational and emotional issues.
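
The arithmetic behind that six-hour figure is easy to check. Here is a minimal sketch in Python; the function name is a hypothetical helper introduced only for illustration, and the inputs are the 15 minutes and 25 students cited above.

```python
# Hypothetical helper for illustration; the figures come from the paragraph above.
def one_on_one_hours(minutes_per_child: float, class_size: int) -> float:
    """Return the total teacher time, in hours, needed to give every child
    the stated number of individual minutes."""
    return minutes_per_child * class_size / 60


hours = one_on_one_hours(15, 25)
print(f"{hours:.2f} hours per day")  # 375 minutes, i.e. 6.25 hours
```

Even cutting the individual time to 10 minutes per child would still take more than four hours of the school day.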

Speech recognition technology also has the potential to increase equity in the classroom. Human reading assessment is, after all, highly subjective, and recent studies have shown variances of up to 18% caused by assessor bias. The child-centered high-accuracy speech recognition available today overcomes inevitable human bias by ensuring that every child’s voice is understood regardless of accent or dialect.

In a few years, this technology will be part of instruction in every classroom, accelerating the reading, math and language skills of young students. Educators will find it enables them to be more strategic in their instruction. And it holds tremendous promise for something desperately needed in the era of COVID-19: technology that can significantly improve reading outcomes and tackle the global literacy crisis in a real and profound way.
