Researchers from the University of Oxford's Department of Computer Science have developed a software program called 'LipNet' that can read people's lips with remarkable accuracy. The program takes a video sequence as input and analyzes it frame by frame to predict complete sentences.
Lipreading is a difficult task that the average person can't do at all. Experienced lipreaders, such as people who are hard of hearing, fare much better but still average just 52 percent accuracy. In Oxford's tests, LipNet reached 93 percent accuracy, a substantial improvement over human performance.
LipNet was trained using GRID, a data set of short English sentences built from a fixed grammar that allows 64,000 possible sentences. The researchers plan to adapt the system for real-world sentences in the future.
Lip-Reading Software Programs
Oxford's 'LipNet' Can Read Lips with Over 90 Percent Accuracy
Trend Themes
- Improved Lip-reading Technology: The development of more accurate lip-reading technology has disruptive innovation potential in fields such as communication accessibility and surveillance.
- Artificial Intelligence for Video Analysis: The use of AI to analyze video sequences for speech analysis and prediction has disruptive innovation potential in industries such as security and social media.
- Adaptation of Existing AI Technology: The adaptation of existing AI technology, such as LipNet, for real-world applications and contexts has disruptive innovation potential in industries such as education and customer service.
Industry Implications
- Communication Accessibility: The improved accuracy of lip-reading technology can disrupt the communication accessibility industry by creating new avenues for people with hearing impairments to communicate effectively.
- Security Surveillance: AI-based video analysis tools like LipNet have disruptive innovation potential for the security surveillance industry by providing more accurate speech recognition capabilities for forensic and security purposes.
- Social Media: The use of AI-based video analysis for speech prediction has disruptive innovation potential in the social media industry, enabling new features for language translation and accessibility for videos and live streams.