Applications of the VOCALOID singing voice synthesizer technology have become a national obsession in Japan. Yamaha has pioneered the technology since 2003, although the original synthesis engine grew out of a joint research project, started in 2000 and led by Kenmochi Hideki at the Pompeu Fabra University in Barcelona, that was never intended to be a full commercial project. In 2014, Yamaha used this singing voice synthesizer technology to re-create the singing voice of the Japanese musician "hide" (Hideto Matsumoto), who passed away in 1998. One of his previously unrecorded and unreleased songs has even been recorded using VOCALOID and made available for purchase.
Now, the NHK-led project "Bringing Hibari Misora Back with AI" (title unofficially translated by Yamaha), which Yamaha directly assisted, has set out to use modern artificial intelligence (AI) technology to present a live performance of a new song by Hibari Misora, an illustrious entertainer who long stood at the forefront of popular music in Japan, to commemorate the 30th anniversary of her passing.
Until her passing in 1989, Hibari Misora recorded over 1,500 songs, leaving behind a series of hits from a career of more than 40 years as Japan’s top singer. She posthumously became the first female recipient of the People’s Honor Award, one of the highest honors in Japan.
With her likeness reproduced in cutting-edge 4K 3D video, Hibari Misora took the stage and dazzled viewers with her rendition of the new song. Yamaha’s role in the project was to use its VOCALOID:AI technology to faithfully reproduce Hibari Misora’s characteristic singing voice and speech for the live performance. Recordings of the artist’s songs and speech made during her lifetime were used as machine learning data to reproduce her singing.
The singing source data used for machine learning included background musical accompaniment, but Yamaha’s accompaniment suppression technology made it possible to generate high-quality machine learning data from these recordings, further improving the quality of the synthesized singing voice. These Yamaha technologies incorporated deep learning, a relatively new and rapidly developing type of AI, to tackle the challenge of bringing back the voice of one of Japan’s foremost singers using modern technology.
“We believe it was the Yamaha technologies and sensibilities cultivated over 130 years of developing and producing musical instruments and audio equipment which enabled us to successfully capture the essence of her singing. Our cooperation in this project with this new evolution of singing synthesis technology has illuminated new possibilities in music by transcending the barriers of time to dazzle listeners with incredible singing,” says Koichi Morita, Senior General Manager of Research and Development Division, Technology Unit, Yamaha Corporation.
VOCALOID is a singing synthesis technology developed and released by Yamaha in 2003, and it has since gained wide recognition as a technology that can produce singing using virtual singers. VOCALOID:AI is a singing synthesis technology that uses deep learning to analyze singing characteristics, such as tone and expression, in recordings of a chosen vocalist singing in any language. It can then synthesize singing for any melody and lyrics while retaining the unique mannerisms and nuances of that vocalist, and it vastly improves the vocal expression of tone changes in particular.
Currently available as VOCALOID 5, the Yamaha singing voice synthesis technology enables users to synthesize a singing voice simply by entering a melody and lyrics. The Cyber Diva voice library, released in 2015 together with a dedicated Editor for Cubase, was the first available in American English.
With the public debut of VOCALOID:AI, the VOCALOID label now encompasses all of Yamaha’s singing synthesis technologies, while VOCALOID:AI specifically refers to those which incorporate AI.