Oxford's 'LipNet' Can Read Lips with Over 90 Percent Accuracy
Researchers from the University of Oxford's Department of Computer Science have developed a software program called 'LipNet' that can read people's lips with more than 90 percent accuracy. The program takes a video sequence as input and analyzes its frames to predict complete sentences.
Lipreading is a difficult task that the average person cannot do reliably. People with lipreading experience -- such as those who are hard of hearing -- fare much better, but still average just 52 percent accuracy. In Oxford's tests, LipNet reached 93 percent accuracy, well above the human lipreaders.
LipNet was trained using GRID, a data set of short English sentences built from a fixed grammar that allows 64,000 possible sentences. The researchers plan to adapt the system for real-world sentences in the future.
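For readers curious how an accuracy figure like those above might be computed, here is a minimal sketch. It assumes the percentages are word-level accuracy, meaning one minus the word error rate between the true sentence and the predicted one. The function names and the sample sentence are illustrative and are not taken from the LipNet code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    # prev[j] holds the distance between the words of ref seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Return 1 - WER, assuming accuracy is measured per word.

    The value can drop below zero when the hypothesis contains
    many extra words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


# Illustrative sentence in the style of a fixed command grammar.
score = word_accuracy("place blue at f two now", "place blue at f nine now")
print(f"word accuracy: {score:.3f}")
```

In this example one of six words is wrong, so the score is five sixths. A reported figure would be such a score averaged over many test sentences.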
Trend Themes
1. Improved Lipreading Technology - The development of more accurate lipreading technology has disruptive innovation potential in fields such as communication accessibility and surveillance.
2. Artificial Intelligence for Video Analysis - The use of AI to analyze video sequences for speech analysis and prediction has disruptive innovation potential in industries such as security and social media.
3. Adaptation of Existing AI Technology - The adaptation of existing AI technology, such as LipNet, for real-world applications and contexts has disruptive innovation potential in industries such as education and customer service.
Industry Implications
1. Communication Accessibility - The improved accuracy of lipreading technology can disrupt the communication accessibility industry by creating new avenues for people with hearing impairments to communicate effectively.
2. Security Surveillance - AI-based video analysis tools like LipNet have disruptive innovation potential for the security surveillance industry by providing more accurate speech recognition capabilities for forensic and security purposes.
3. Social Media - The use of AI-based video analysis for speech prediction has disruptive innovation potential in the social media industry, enabling new features for language translation and accessibility for videos and live streams.