Glossary
What Is Vocal Harmony in Singing?
Vocal harmony is when two or more voices sing different notes that sound good together — the magic behind choruses and group singing.
The short version
Vocal harmony is what happens when two or more people sing different notes at the same time, and the combination sounds good. Instead of everyone singing the same melody (which is called unison), each voice takes a different note that fits with the others.
Harmony is the difference between a single voice singing a song and a group creating a sound that is greater than the sum of its parts. It is the magic behind choirs, barbershop quartets, the Beach Boys, and most pop choruses.
Unison vs. harmony
In unison, multiple singers perform the exact same melody at the same time. You hear one voice, but louder. This is common in national anthems, protest songs, and group chants.
In harmony, the singers perform different notes that are musically related. You hear multiple voices, each with its own role. The combination is the music.
The simplest form of harmony is two voices: one singing the main melody (the "lead"), and another singing a note that complements it (the "harmony"). The harmony note is usually a third or a fifth above or below the lead note. This is the basis of most pop harmony.
Interval harmonies
The distance between two notes sung together is called an interval. Different intervals create different feelings.
Unison (0 semitones). Same note, no interval. Sounds powerful but plain.
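Third (3 or 4 semitones). The most common harmony interval in pop. A minor third spans three semitones and a major third spans four; which one applies depends on the key and on the melody note. Two notes a third apart sound bright and sweet. The melody is on one note, the harmony on a note a third higher or lower.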
Fifth (7 semitones). A fifth is the leap from the first "Twinkle" to the second in "Twinkle Twinkle Little Star" (do up to sol). It sounds open and hollow, and is the basis of power chords in rock music.
Sixth (8 or 9 semitones). A minor sixth spans eight semitones and a major sixth spans nine. A sixth is darker and more contemplative than a third. Common in soul and R&B harmonies.
Octave (12 semitones). The same note, but an octave higher or lower. Sounds like a thicker version of the same voice. Common in choral music and in pop where a high voice doubles a low voice.
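The interval sizes above can be turned into a small lookup table. The following is a minimal Python sketch, not a reference implementation. It assumes equal temperament and MIDI note numbers (60 is middle C, 69 is the A at 440 Hz), and the names INTERVALS, harmony_midi, and midi_to_hz are made up for illustration.

```python
# Interval sizes in semitones. Thirds and sixths come in a minor and a major size.
INTERVALS = {
    "unison": 0,
    "minor third": 3,
    "major third": 4,
    "perfect fifth": 7,
    "minor sixth": 8,
    "major sixth": 9,
    "octave": 12,
}


def harmony_midi(lead_midi, interval, above=True):
    """Return the MIDI note a given interval above (or below) the lead note."""
    step = INTERVALS[interval]
    return lead_midi + step if above else lead_midi - step


def midi_to_hz(midi):
    """Convert a MIDI note number to a frequency in Hz (equal temperament, A4 = 440 Hz)."""
    return 440.0 * 2 ** ((midi - 69) / 12)


if __name__ == "__main__":
    lead = 60  # middle C
    for name in INTERVALS:
        note = harmony_midi(lead, name)
        print(f"{name}: MIDI {note}, {midi_to_hz(note):.1f} Hz")
```

For example, harmony_midi(60, "major third") gives 64 (the E above middle C), and passing above=False places the harmony below the lead instead.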
Most pop harmonies stack multiple intervals at once. A three-part harmony might have a lead on the melody, a third above, and a fifth above — or some other combination that sounds full and balanced.
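As a rough illustration of stacking, the sketch below builds a three-part harmony by placing a third and a fifth above each melody note while staying inside a major key. It is one simple approach under stated assumptions, not how any particular arranger or product works: notes are MIDI numbers, scale degrees are counted from 0, and the function names are hypothetical.

```python
# Semitone offsets of the seven notes of a major scale, measured from the tonic.
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11]


def scale_degree_to_midi(tonic_midi, degree):
    """Map a 0-based scale degree (may exceed 6) to a MIDI note in the major key."""
    octave, index = divmod(degree, 7)
    return tonic_midi + 12 * octave + MAJOR_SCALE_STEPS[index]


def three_part(tonic_midi, melody_degrees):
    """For each melody degree, return (lead, third above, fifth above) as MIDI notes."""
    return [
        (
            scale_degree_to_midi(tonic_midi, d),      # lead
            scale_degree_to_midi(tonic_midi, d + 2),  # two scale steps up: a third
            scale_degree_to_midi(tonic_midi, d + 4),  # four scale steps up: a fifth
        )
        for d in melody_degrees
    ]


if __name__ == "__main__":
    # Melody do-re-mi in C major (tonic = MIDI 60).
    for lead, third, fifth in three_part(60, [0, 1, 2]):
        print(lead, third, fifth)
```

With the melody do-re-mi in C major this yields (60, 64, 67), (62, 65, 69), (64, 67, 71). The third above do is four semitones, while the third above re is three, because staying in the key decides whether each third is major or minor.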
How band and group singing creates harmony
When a group of untrained singers performs together, the harmonies that emerge are usually simple: octaves and thirds, stacked on top of the melody. Voices tend to fall into these intervals without planning; anything more complex usually requires training or a written arrangement.
Trained vocal groups can produce dense, intricate harmonies. Barbershop quartets, for example, sing in four-part harmony with each voice assigned a specific role: lead, tenor, baritone, bass. The arrangements are carefully crafted so that the four parts lock together.
In rock and pop, harmonies are often overdubbed: the melody is recorded first, and the harmony parts are then recorded on top of it, sometimes by the same singer. Overdubbing is part of how the Beatles, the Beach Boys, Fleetwood Mac, and many others built their signature vocal sound.
How Band Mode enables group AI covers
Band Mode is a VibeSing feature that lets multiple users sing the same song together, with each user's cloned voice taking a different harmony part.
The result is a group performance that is generated automatically. Each user contributes their voice, and the system arranges them into a multi-part harmony. Users can choose which part they sing (lead, high harmony, low harmony) or let the system assign parts automatically.
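The part selection described above could work in many ways. The sketch below is purely illustrative and is not VibeSing's implementation or API: the function assign_parts, the first-come-first-served rule, and the choice to double parts when there are more singers than parts are all assumptions. Only the three part names come from the description.

```python
from typing import Dict, Optional

# Part names taken from the description above; everything else is hypothetical.
PARTS = ["lead", "high harmony", "low harmony"]


def assign_parts(requests: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Map each user to a part.

    `requests` maps a user name to the part they asked for, or None to
    let the assignment be automatic.
    """
    assigned: Dict[str, str] = {}
    taken = set()

    # First pass: honor explicit choices, first come first served.
    for user, part in requests.items():
        if part in PARTS and part not in taken:
            assigned[user] = part
            taken.add(part)

    # Second pass: everyone else gets the next free part. Once every part
    # is taken, extra singers double existing parts in rotation.
    free = [p for p in PARTS if p not in taken]
    for user in requests:
        if user not in assigned:
            if free:
                assigned[user] = free.pop(0)
            else:
                assigned[user] = PARTS[len(assigned) % len(PARTS)]
    return assigned


if __name__ == "__main__":
    print(assign_parts({"ana": "lead", "ben": None, "cy": "lead"}))
```

Here "ana" keeps the lead, and "ben" and "cy" receive the two remaining harmony parts because the lead was already taken when "cy" asked for it.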
For people who want to experience the feeling of being in a vocal group, Band Mode is a way to do it without needing to find three friends who can all show up at the same studio session. Each user can be in a different city, on a different schedule, and the harmonies still come together.
AI-generated harmonies
AI models can generate harmonies automatically, without any human arrangement. Given a melody and a set of voices, the model can produce harmony parts that complement the lead and fit the style of the song.
The quality of AI-generated harmony varies. Simple third-above harmonies are usually reliable. More complex arrangements (close intervals, contrary motion, syncopation) require either a more capable model or a human arranger.
For casual group singing and social sharing, AI-generated harmony is good enough to be a lot of fun. It is not yet at the level of a professional vocal arranger, but it does not need to be — the point is participation, not perfection.