The Vocaloid singing synthesizer technology is categorized as concatenative synthesis, which splices and processes vocal fragments extracted
from human singing voices in the frequency domain. In singing synthesis, the system produces realistic voices by adding information of
vocal expressions like vibrato to score information. The Vocaloid synthesis technology was initially called "Frequency-domain Singing
Articulation Splicing and Shaping", although Yamaha no longer uses this name on its websites. "Singing Articulation" is explained as
"vocal expressions" such as vibrato and vocal fragments necessary for singing. The Vocaloid and Vocaloid 2 synthesis engines are
designed for singing, not reading text aloud, though software such as Vocaloid-flex and Voiceroid have been developed for that.
They cannot naturally replicate singing expressions like hoarse voices or shouts, but Appends are made to create different tones
such as "whisper" and "power".
System architecture~
The main parts of the Vocaloid 2 system are the Score Editor (Vocaloid 2 Editor), the Singer Library, and the Synthesis Engine.
The Synthesis Engine receives score information from the Score Editor, selects appropriate samples from the Singer Library, and concatenates
them to output synthesized voices. There is basically no difference in the Score Editor and the Synthesis Engine provided by Yamaha
among different Vocaloid 2 products. If a Vocaloid 2 product is already installed, the user can enable another Vocaloid 2 product by
adding its library. The system supports two languages, Japanese and English, although other languages may be added as options in the future.
It runs standalone (with playback and export to WAV) and as a ReWire application or as a VSTi accessible from a DAW.
Score Editor~
The Score Editor is a piano roll style editor to input notes, lyrics, and some expressions. For a Japanese Singer Library, the user can
input gojuon lyrics in hiragana, katakana or romaji writing. For an English library, the Editor automatically converts the lyrics into
the IPA phonetic symbols using the built-in pronunciation dictionary. The user can directly edit the phonetic symbols of unregistered words.
A Japanese library and an English library differ in the lyrics input method, but share the same platform. Therefore, the Japanese editor
can load an English library and vice versa. As mentioned above, the lyrics input method is library-dependent, and so the Japanese and English
editors differ only in the menus. The Score Editor offers various parameters to add expressions to singing voices. The user is expected to
adjust these parameters to best fit the synthesized tune when creating voices. This editor supports ReWire and can be synchronized with a DAW.
Real-time "playback" of songs with predefined lyrics using a MIDI keyboard is also supported.
Singer Library~
Each Vocaloid licensee develops the Singer Library, or a database of vocal fragments sampled from real people. The database must have all
possible combinations of phonemes of the target language, including diphones (a chain of two different phonemes) and sustained vowels,
as well as polyphones with more than two phonemes if necessary. For example, the voice corresponding to the word "sing" can be synthesized by
concatenating its sequence of diphones with the sustained vowel. The Vocaloid system changes the pitch of these fragments so that they fit the melody.
To get more natural sounds, three or four different pitch ranges must be stored in the library. Japanese requires 500 diphones per pitch, whereas English
requires 2,500. Japanese has fewer diphones
because it has fewer phonemes and most syllabic sounds are open syllables ending in a vowel. In Japanese, there are basically three patterns of
diphones containing a consonant: voiceless-consonant, vowel-consonant, and consonant-vowel. On the other hand, English has many closed syllables
ending in a consonant, and consonant-consonant and consonant-voiceless diphones as well. Thus, more diphones need to be recorded into an English
library than into a Japanese one. Due to this linguistic difference, a Japanese library is not suitable for singing in English.
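The diphone decomposition and the library sizes described above can be illustrated with a minimal Python sketch. It is not Yamaha's implementation: the "#" boundary symbol for silence, the phoneme labels "s", "I", "N", and the function names are assumptions made only for this example. The diphone counts are the figures quoted above.

```python
def to_diphones(phonemes, boundary="#"):
    """Split a phoneme sequence into overlapping diphones.

    `boundary` marks the silence before and after the word; the "#"
    symbol is an assumption made for this sketch.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def library_size(diphones_per_pitch, pitch_ranges):
    """Rough count of diphone recordings across all pitch ranges."""
    return diphones_per_pitch * pitch_ranges


if __name__ == "__main__":
    # "sing" written with three illustrative phoneme labels.
    print(to_diphones(["s", "I", "N"]))
    # Diphones per pitch as quoted in the text, for three and four ranges.
    for language, per_pitch in (("Japanese", 500), ("English", 2500)):
        print(language, [library_size(per_pitch, n) for n in (3, 4)])
```

Under these assumptions, a Japanese library holds roughly 1,500 to 2,000 diphone recordings and an English library roughly 7,500 to 10,000, not counting sustained vowels and polyphones.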
Synthesis Engine~
The Synthesis Engine receives score information contained in dedicated MIDI messages called Vocaloid MIDI sent by the Score Editor, adjusts the pitch
and timbre of the selected samples in the frequency domain, and splices them to synthesize singing voices. When Vocaloid runs as a VSTi accessible from a DAW,
the bundled VST plug-in bypasses the Score Editor and directly sends these messages to the Synthesis Engine.
Timing adjustment~
In singing voices, the consonant onset of a syllable is uttered before the vowel onset. The starting position of a note, called "Note-On",
must coincide with the vowel onset, not the start of the syllable. Vocaloid keeps the "synthesized score" in memory to adjust sample timing
so that the vowel onset falls exactly on the "Note-On" position. Without this timing adjustment, the vowel onset would be delayed.
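The timing rule can be shown with a small arithmetic sketch in Python. The function names and the millisecond values are hypothetical and serve only to illustrate why the sample must start before the Note-On position; the actual engine's scheduling is not documented here.

```python
def adjusted_sample_start(note_on_ms, consonant_ms):
    """Start the syllable sample early so the vowel lands on Note-On."""
    return note_on_ms - consonant_ms


def vowel_onset(sample_start_ms, consonant_ms):
    """The vowel begins once the leading consonant has been uttered."""
    return sample_start_ms + consonant_ms


if __name__ == "__main__":
    note_on_ms = 1000   # hypothetical Note-On position
    consonant_ms = 80   # hypothetical consonant length of the sample

    # Without adjustment the sample starts at Note-On, so the vowel is late.
    naive = vowel_onset(note_on_ms, consonant_ms)
    # With adjustment the sample starts early by the consonant length.
    start = adjusted_sample_start(note_on_ms, consonant_ms)
    adjusted = vowel_onset(start, consonant_ms)

    print("vowel onset without adjustment:", naive, "ms")
    print("vowel onset with adjustment:", adjusted, "ms")
```

Because the start of a syllable depends on the note that follows the current one, this kind of adjustment needs the score to be known in advance, which is consistent with the engine keeping the synthesized score in memory.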
Pitch conversion~
Since the samples are recorded in different pitches, pitch conversion is required when concatenating the samples. The engine calculates a desired pitch
from the notes, attack time, and vibrato parameters, and then selects the necessary samples from the library.
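A rough Python sketch of this step follows. It assumes standard MIDI note numbering (note 69 = 440 Hz), models vibrato as a sinusoidal deviation in cents, and leaves out the attack time entirely. The function names, the vibrato parameters and the recorded pitches are hypothetical, and the real engine's pitch model may differ.

```python
import math


def note_to_hz(midi_note):
    """Equal-tempered frequency of a MIDI note (69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)


def target_pitch_hz(midi_note, t, vibrato_depth_cents=0.0, vibrato_rate_hz=5.5):
    """Desired pitch at time t (seconds), with sinusoidal vibrato.

    Attack time is ignored in this sketch.
    """
    cents = vibrato_depth_cents * math.sin(2.0 * math.pi * vibrato_rate_hz * t)
    return note_to_hz(midi_note) * 2.0 ** (cents / 1200.0)


def pick_recorded_pitch(target_hz, recorded_hz):
    """Choose the recorded pitch closest to the target on a log scale."""
    return min(recorded_hz, key=lambda hz: abs(math.log2(target_hz / hz)))


if __name__ == "__main__":
    recorded = [220.0, 392.0, 523.25]  # hypothetical recorded pitches in Hz
    target = target_pitch_hz(69, t=0.0, vibrato_depth_cents=50.0)
    source = pick_recorded_pitch(target, recorded)
    # The ratio is the factor by which the chosen sample must be shifted.
    print(f"target {target:.2f} Hz, source {source:.2f} Hz, ratio {target / source:.4f}")
```

Choosing the nearest recorded pitch keeps the conversion ratio close to 1, which is one plausible reason for storing several pitch ranges per library.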
Timbre manipulation~
The engine smooths the timbre around the junction of the samples. The timbre of a sustained vowel is generated by interpolating spectral envelopes of the
surrounding samples. For example, when concatenating a sequence of diphones "s-e, e, e-t" of the English word "set", the spectral envelope of the sustained e
at each frame is generated by interpolating between the e at the end of "s-e" and the e at the beginning of "e-t".
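The interpolation can be sketched in Python as follows. Linear interpolation and the representation of an envelope as a list of per-bin magnitudes are assumptions made for illustration. The text above does not specify which interpolation the engine uses, and the envelope values below are invented.

```python
def interpolate_envelopes(start_env, end_env, n_frames):
    """Linearly interpolate between two spectral envelopes.

    Each envelope is a list of magnitudes, one per frequency bin.
    Returns n_frames envelopes moving from start_env to end_env.
    """
    if len(start_env) != len(end_env):
        raise ValueError("envelopes must have the same number of bins")
    frames = []
    for i in range(n_frames):
        # Weight goes from 0.0 (first frame) to 1.0 (last frame).
        w = i / (n_frames - 1) if n_frames > 1 else 0.0
        frames.append([(1.0 - w) * a + w * b for a, b in zip(start_env, end_env)])
    return frames


if __name__ == "__main__":
    e_after_s = [1.0, 0.5]   # hypothetical envelope of e at the end of "s-e"
    e_before_t = [0.0, 1.5]  # hypothetical envelope of e at the start of "e-t"
    for frame in interpolate_envelopes(e_after_s, e_before_t, 3):
        print(frame)
```

The first and last frames match the neighbouring diphones, so the sustained vowel joins both samples without an abrupt change in timbre.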
Vocaloid 1,2,3~
Vocaloid 1~
Yamaha started development of Vocaloid in March 2000 and announced it for the first time at the German fair Musikmesse on March 5–9, 2003. The first Vocaloids,
Leon and Lola, were released by the studio Zero-G on March 3, 2004, both of which were sold as a "Virtual Soul Vocalist". Leon and Lola made their
first appearance at the NAMM Show on January 15, 2004. Leon and Lola were also demonstrated at the Zero-G Limited booth during Wired Nextfest and won the
2005 Electronic Musician Editor's Choice Award. Zero-G later released Miriam, with her voice provided by Miriam Stockley, in July 2004. Later that year,
Crypton Future Media also released their first Vocaloid, Meiko. In June 2005, Yamaha upgraded the engine version to 1.1. A patch was later released to
update all Vocaloid engines to Vocaloid 1.1.2, adding new features to the software, although the output of the engine differed between
versions. A total of five Vocaloid products were released from 2004 to 2006. Vocaloid had no previous rival technology to contend with at the time of its
release, with the English version only having to face the later release of VirSyn's Cantor software during its original run. Despite having Japanese
phonetics, the interface lacked a Japanese version and both Japanese and English vocals had an English interface. The only differences between versions
were the color and logo that changed per template. As of 2011, this version of the software is no longer supported by Yamaha and will no longer be updated.
Vocaloid 2~
Vocaloid 2 was announced in 2007. Due to time constraints, unlike the previous engine version, it did not have a public beta test and instead the software
was updated as users reported issues with it. The synthesis engine and the user interface were completely revamped, with Japanese Vocaloids possessing
a Japanese interface. New features such as note auditioning, transparent control track, toggling between playback and rendering, and expression control
were implemented. Breath noise and a husky voice can be recorded into the library to make realistic sounds. This version is not backward compatible
and its editor cannot load a library built for the previous version. Aside from the PC software, NetVocaloid services are offered. Despite this, the
software was not localized: an English or Japanese Vocaloid shipped only with the software in its own language. Thus, although Megurine Luka had an English
library included, as a Japanese Vocaloid she only had access to the Japanese version of the software. In total, there were 17 packages produced for Vocaloid
2 in the Japanese version of the software and five in the English version; these packages offered 35 voicebanks between them in either English or Japanese.
Vocaloid 3~
Vocaloid 3 launched on October 21, 2011, along with several products in Japanese and a Korean product, the first of its kind. Several studios are providing updates to
allow Vocaloid 2 vocal libraries to be used with Vocaloid 3. It will also include the software "VocaListener", which adjusts parameters iteratively from a
user's singing to create natural synthesized singing. It supports additional languages including Chinese, Korean, and Spanish. It is also able to use
plug-ins for the software itself and switch between normal and "classic" mode for less realistic vocal results. Unlike previous versions, the vocal
libraries and main editing software are sold as two separate items. The vocal libraries themselves only contain a "tiny" version of the Vocaloid 3 editing
software. Yamaha will also license plug-ins and the use of the Vocaloid software for additional media such as video games. Also, unlike Vocaloid 2,
Vocaloid 3 supports triphones, which improves its language capabilities. The first Spanish Vocaloids, Clara and Bruno, were released in 2011.
Software~
Vocaloid-flex~
Yamaha developed Vocaloid-flex, a singing software application based on the Vocaloid engine, which contains a speech synthesizer. According to the official
announcement, users can edit its phonological system more delicately than those of other Vocaloid series to get closer to the actual speech language; for
example, it enables final devoicing, unvoicing of vowel sounds, or weakening/strengthening of consonant sounds. It was used in the video game Metal Gear Solid: Peace
Walker released on April 28, 2010. It is still a corporate product and a consumer version has not been announced. This software was also used for the robot
model HRP-4C at CEATEC Japan 2009. Gachapoid has access to this engine and it is used through the software V-Talk.
VocaListener~
VocaListener is another Vocaloid tool, a software package that helps users produce realistic Vocaloid songs.
MikuMikuDance~
To aid in the production of 3D Vocaloid animations, the program MikuMikuDance was developed. This freeware fueled a boom in fan-made and
derivative characters and boosted the promotion of Vocaloid songs. MikuMikuDance's developer went on a hiatus in May 2011 (initially announced
as a retirement from development), but started updating the software again in June 2013.
NetVocaloid~
NetVocaloid was an online vocal synthesis service. Users could synthesize singing voices on a device connected to the Internet by executing the Vocaloid
engine on the server. This service could be used even if the user did not own the Vocaloid software. The service was available in both English and Japanese.
However, as of April 2012, the service was no longer being offered on Yamaha's website.
MMDAgent~
MMDAgent is software developed by the International Voice Engineering Institute in the Nagoya Institute of Technology, and the Alpha version was released
on December 25, 2010. This software allows users to interact with 3D models of the Vocaloid mascots. The software is made from 3D models and
sound files that have already been made available on the internet and will be distributed as freeware for that reason.
NetVocalis~
NetVocalis is software being developed by Bplats, the makers of the VY series, and is similar to VocaListener.
Vocaloid Editor for Cubase~
This particular version of Vocaloid is built solely for Cubase. It features no additional voices but will use any voice from Vocaloid 2 and Vocaloid 3
and acts as a plugin for the Cubase software. As a result, this version is compatible with most functions of Cubase 6.5 and can use its tools, such as
buses, filters and mixers, without complications.
Vocalodama~
An iOS game app made using the Vocaloid software.
Hardware~
Vocaloid-Board~
A hardware version of Vocaloid, called Vocaloid-Board, is planned.
eVocaloid~
This is an LSI sound generator that uses the voice of "VY1" (in a version dubbed "eVY1") and can be used in mobile devices.