Shrinking massive neural networks used to model language


Deep learning neural networks can be massive, demanding major computing power. In a test of the Lottery Ticket Hypothesis, MIT researchers have found leaner, more efficient subnetworks hidden within BERT models. Credit: Jose-Luis Olivares, MIT

Jonathan Frankle researches artificial intelligence, and his "lottery ticket hypothesis" rests on a simple idea: hidden within massive neural networks are leaner subnetworks that can complete the same task more efficiently. The trick is finding those "lucky" subnetworks, dubbed winning lottery tickets.
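In practice, winning tickets are usually sought with iterative magnitude pruning: train the network, remove the smallest-magnitude weights, rewind the surviving weights to their original initialization, and repeat. The following is a minimal sketch of that loop, assuming PyTorch; the toy model, pruning rate, and omitted training step are illustrative placeholders rather than the setup used in the paper.

```python
# Minimal sketch of a lottery-ticket search via iterative magnitude pruning.
# Assumes PyTorch; the toy model and pruning fraction are illustrative only.
import torch
import torch.nn as nn


def magnitude_prune(model: nn.Module, masks: dict, prune_frac: float) -> dict:
    """Prune the smallest-magnitude surviving weights in each weight matrix."""
    new_masks = {}
    for name, param in model.named_parameters():
        if "weight" not in name:
            continue
        mask = masks.get(name, torch.ones_like(param))
        surviving = param[mask.bool()].abs()
        if surviving.numel() == 0:
            new_masks[name] = mask
            continue
        # Threshold below which surviving weights are removed this round.
        k = int(prune_frac * surviving.numel())
        threshold = surviving.sort().values[k] if k > 0 else 0.0
        new_masks[name] = mask * (param.abs() > threshold).float()
    return new_masks


def apply_masks(model: nn.Module, masks: dict) -> None:
    """Force pruned weights to zero so only the subnetwork remains active."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])


# Toy model standing in for a large network such as BERT.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
initial_state = {k: v.clone() for k, v in model.state_dict().items()}

masks = {}
for round_idx in range(3):  # a few prune-and-rewind rounds
    # ... train `model` on the task here ...
    masks = magnitude_prune(model, masks, prune_frac=0.2)
    # Rewind surviving weights to their original initialization; the sparse
    # subnetwork that still trains well is a candidate "winning ticket".
    model.load_state_dict(initial_state)
    apply_masks(model, masks)
```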

In a new paper, Frankle and colleagues discovered such subnetworks lurking within BERT, a state-of-the-art neural network approach to natural language processing (NLP). As a branch of artificial intelligence, NLP aims to decipher and analyze human language, with applications like predictive text generation or online chatbots. In computational terms, BERT is bulky, typically demanding supercomputing power unavailable to most users. Access to BERT's winning lottery ticket could level the playing field, potentially allowing more users to develop effective NLP tools on a smartphone—no sledgehammer needed.
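To get a sense of why BERT counts as bulky, one can load a pretrained checkpoint and count its parameters. This sketch assumes the Hugging Face transformers library and the "bert-base-uncased" checkpoint, which are illustrative choices rather than the exact models studied in the paper.

```python
# Count the parameters of a pretrained BERT model (assumes the Hugging Face
# "transformers" library is installed).
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

total = sum(p.numel() for p in model.parameters())
print(f"Total parameters: {total / 1e6:.1f}M")  # roughly 110M for bert-base

# Training at this scale is what typically pushes BERT beyond consumer
# hardware; a sparse winning-ticket subnetwork would keep only a fraction
# of these weights.
```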

"We're hitting the point where we're going to have to make these models leaner and more efficient," says Frankle, adding that this advance could one day "reduce barriers to entry" for NLP.
