Pennsylvania State University in the United States announced in a blog post on August 8 that its computer science team has developed a new “wireless tapping” technique. It combines AI with a millimeter-wave radar sensor and can transcribe the content of a conversation from up to three meters away, with an accuracy of about 60%.
IT之家 reported that the technique uses a millimeter-wave radar sensor, placed about 3 meters from a mobile phone, to remotely detect the tiny vibrations on the device's surface caused by speech played through its earpiece. The captured vibration signal is then fed into the open-source speech recognition model “Whisper”.
Because the radar signal is noisy and of low quality, the researchers used “Low-Rank Adaptation” (LoRA), a method that fine-tunes only a small portion of a machine learning model’s parameters so that the model can adapt quickly to new data sources or tasks. They fine-tuned only 1% of the model’s parameters, which was enough for it to recognize the radar data efficiently.
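To illustrate the idea behind low-rank adaptation, the following is a minimal sketch in Python with NumPy. It is not the researchers' code: the layer size, the rank `r`, the scaling factor `alpha`, and the function names are all assumptions chosen for illustration. The pretrained weight matrix `W` stays frozen, and only two small matrices `A` and `B` would be trained.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Frozen linear layer plus a scaled low-rank update: x W^T + (alpha/r) x A^T B^T."""
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

def trainable_fraction(d_out, d_in, r):
    """Share of this layer's parameters that the adapter would train."""
    frozen = d_out * d_in        # W is never updated
    adapter = r * (d_in + d_out) # A is (r, d_in), B is (d_out, r)
    return adapter / (frozen + adapter)

# Illustrative sizes only (assumed, not taken from the study).
rng = np.random.default_rng(0)
d_in, d_out, r = 512, 512, 4

W = rng.standard_normal((d_out, d_in))      # pretrained weights, frozen
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init

x = rng.standard_normal((2, d_in))
y = lora_forward(x, W, A, B, alpha=8.0)
fraction = trainable_fraction(d_out, d_in, r)
```

Because `B` starts at zero, the adapted layer initially behaves exactly like the original one, and training only has to learn the correction needed for the new data. With these assumed sizes, the adapter accounts for about 1.5% of the layer's parameters, the same order of magnitude as the 1% reported in the article.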

First author Suryoday Basak. Image source: Pennsylvania State University, USA.
In the end, the system can transcribe mobile phone calls into text with an accuracy of up to 60 percent over a vocabulary of up to 10,000 words. Compared with the team’s previous work in 2022, which could recognize only a few preset words, the new technique is considerably closer to practical use. The study shows that even with recognition errors, key words and some of the call's content can still be inferred through manual correction, for example by drawing on context.
It is worth noting that this technology is currently limited to academic research and has not been commercialized. The researchers emphasize that their goal is to warn the public that remote eavesdropping on mobile phone calls may become possible in the future, and to urge users to be more vigilant when making sensitive calls.