The speech apparatus and its structure, sound formation and diagnosis of disorders

Human speech and the formation of sounds are possible thanks to the speech apparatus. The speech apparatus is a set of coordinated organs that produce the voice, regulate it and shape it into meaningful utterances. It includes every element directly involved in creating sounds: the articulatory apparatus, the central nervous system, the respiratory organs (lungs and bronchi), the pharynx and larynx, and the oral and nasal cavities.

The structure of the speech apparatus

The human speech apparatus is divided into two sections: central and peripheral. The central section is the brain with its synapses and nerves; it also includes the higher parts of the central nervous system. The peripheral section, also known as the executive section, is the whole set of body structures that ensure the formation of voice and speech. The peripheral part of the speech apparatus is in turn divided into three subsections:

The respiratory section, which regulates breathing. Breathing is a vital function of the body; it is controlled by dedicated nerve centers and occurs automatically. Speech sounds are normally produced during exhalation, and the air stream formed at this moment serves two purposes at once: voice formation and articulation. This section includes the lungs and bronchi, the intercostal muscles, and the diaphragm.
The voice section. The voice has three characteristics: loudness, timbre and pitch. The vocal cords set the air vibrating, and these vibrations travel outward and are perceived as the voice.
The articulatory apparatus is the section that shapes sounds into speech. It consists of active and passive organs. The active organs of articulation are movable and do the work of forming sounds. The main ones are the lips, the tongue, the soft palate and the lower jaw. Changes in their position create narrowings at different places in the articulatory tract, and the character of the sound produced depends on this position. The lower jaw helps create stressed vowels. The tongue is the main muscle of the articulatory apparatus; the clarity of pronounced sounds depends on its flexibility and ability to change shape. The lips are also movable; they contribute to the formation of vowels and, working together with specific placements of the tongue, play an important part in articulating words.

The passive organs of the articulatory apparatus are immobile. Their main task is to serve as a base of support for the active organs. They include the teeth, the hard palate and the oral cavity as a whole, as well as the pharynx and larynx. Although they do not move, they still influence, albeit slightly, the character of a person’s speech.

What parts does the human hearing organ consist of?

Outer ear
Middle ear
Inner ear.

Outer ear

The outer ear is the only externally visible part of the hearing organ. It consists of:

The auricle (pinna), which collects sounds and directs them into the external auditory canal.
The external auditory canal, which conducts sound vibrations from the auricle to the eardrum and on to the tympanic cavity of the middle ear. Its length in adults is approximately 2.6 cm. The surface of the canal also contains sebaceous glands that secrete earwax, which protects the ear from germs.
The eardrum, which separates the outer ear from the middle ear.

Middle ear

The middle ear is an air-filled cavity behind the eardrum. It is connected to the nasopharynx by the Eustachian tube, which equalizes the pressure on both sides of the eardrum. That is why a person whose ears feel blocked reflexively yawns or swallows. The middle ear also contains the smallest bones of the human skeleton: the malleus (hammer), incus (anvil) and stapes (stirrup). They not only transmit sound vibrations from the outer ear to the inner ear, but also amplify them.

Inner ear

The inner ear is the most complex part of the hearing organ; because of its intricate shape it is also called the labyrinth. It consists of:

The vestibule and semicircular canals, which are responsible for the sense of balance and body position in space.
The cochlea, which is filled with fluid. This is where sound vibrations arrive. Inside the cochlea is the organ of Corti, which is directly responsible for hearing. It contains about 30,000 hair cells that detect sound vibrations and pass the signal on to the auditory area of the cerebral cortex. Each hair cell responds to a particular sound frequency, which is why, when hair cells die, hearing loss occurs and a person stops hearing sounds of the frequency for which the dead cells were responsible.

Voice formation

Every language on our planet has a specific set of sounds that creates its acoustic image. A sound acquires meaning only within words and sentences, where it helps to distinguish one word from another. Such a sound is called a phoneme of the language. The sounds of a language differ in their articulatory characteristics, that is, in how they are formed in the speech apparatus, and in their acoustic characteristics, that is, in how they sound.

The voice can be considered the result of coordinated muscular work in the various components of the peripheral speech apparatus. Three of its sections contribute to the formation of sound:

the respiratory section, or energy source: the lungs, bronchi, trachea and throat;
the voice-forming section, or generator: the larynx together with the vocal cords and muscles;
the sound-shaping section, or resonator: the cavities of the mouth, pharynx and nose.

These sections of the speech apparatus can work in full coordination only under the central control of speech and voice formation. In other words, breathing, articulation and sound formation are all controlled by the human nervous system. Its influence also extends to peripheral processes:

the work of the respiratory organs regulates the loudness of the voice;
the work of the oral cavity is responsible for forming vowels and consonants and for the differences in articulation between them;
the nasal cavity adjusts the overtones of the sound.

The central speech apparatus plays the key role in the formation of the voice. The jaw and lips, the palate and epiglottis, the pharynx and lungs are all involved in the process. The source of sound is the stream of air leaving the lungs, which passes through the larynx and then through the mouth and nose. On its way, the air passes between the vocal cords. If they are relaxed, the air passes freely and no sound is formed. If they are brought close together and tensed, the passing air sets them vibrating, and the result is sound. Then the movable organs of the oral cavity shape this sound into speech sounds and words.

What is sound and voice from a physics point of view?

Let's start with what sound and voice are from a physics point of view. Sound is a physical phenomenon: the propagation of mechanical vibrations as waves through a medium.

From a physics point of view, sound has three properties:

pitch;
loudness (strength);
sound spectrum.

Pitch depends on the vibration frequency. Vibrations repeat with a certain periodicity, and their frequency is measured in hertz (Hz). The hertz is the unit of frequency for periodic processes in the International System of Units (SI), as well as in the CGS and MKGSS systems; one hertz is one vibration per second.

The strength of a sound (its loudness) depends on the amplitude of the vibrations: the greater the amplitude, the stronger the sound. Sound level is measured in decibels (dB). For example, the rustling of leaves is about 10 dB, and a loud conversation is up to 90 dB.
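The link between amplitude and decibels can be made concrete with a short calculation. The sketch below assumes the standard definition of sound pressure level, L = 20 * log10(p / p0), with the conventional reference pressure p0 = 20 micropascals. The function names are illustrative and are not part of the course.

```python
import math

# Conventional reference pressure for sound in air, in pascals (an assumption
# of this sketch; the lesson itself does not state it).
P_REF = 20e-6


def sound_pressure_level(pressure_pa: float) -> float:
    """Return the sound pressure level in dB for an RMS pressure in pascals."""
    return 20 * math.log10(pressure_pa / P_REF)


def pressure_ratio(level_a_db: float, level_b_db: float) -> float:
    """Return how many times larger the pressure amplitude of sound A is than B."""
    return 10 ** ((level_a_db - level_b_db) / 20)


# A tenfold increase in pressure amplitude adds 20 dB.
print(round(sound_pressure_level(10 * P_REF), 1))
# Amplitude ratio between a 90 dB sound and a 10 dB sound.
print(round(pressure_ratio(90, 10)))
```

Because the scale is logarithmic, the 80 dB gap between rustling leaves and a loud conversation corresponds to a pressure amplitude about 10,000 times larger.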

The sound spectrum is the set of additional vibrations, or overtones, that arise along with the fundamental frequency. This is especially noticeable in music and singing. The frequencies of harmonic overtones are whole-number multiples of the fundamental tone (overtone: over + tone), and they give the sound additional color, i.e. timbre.
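As a small illustration of the multiple ratios mentioned above, the following sketch lists the harmonics of a tone. It assumes ideal harmonic overtones at exact whole-number multiples of the fundamental; the 110 Hz value and the function name are hypothetical examples.

```python
def harmonic_series(fundamental_hz: float, count: int) -> list[float]:
    """Return the first `count` harmonics: the fundamental and its whole-number multiples."""
    return [fundamental_hz * n for n in range(1, count + 1)]


# Hypothetical example: a low sung note with a 110 Hz fundamental.
for n, freq in enumerate(harmonic_series(110.0, 5), start=1):
    print(f"harmonic {n}: {freq:.0f} Hz")
```

Two voices singing the same fundamental differ in how strong each of these harmonics is, which is what the ear perceives as timbre.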

Sounds with periodic (identical and evenly repeating) vibrations are called musical tones. Sounds whose vibrations are not periodic are not musical tones; examples include creaking, crackling and similar noises.

We will talk about the power of sound and sound spectra in the following lessons, but now let's return to the pitch of sound.

Types of sound by frequency:

Audible sound: vibrations perceived by the human ear, i.e. in the range of 16-20,000 Hz (hertz).
Ultrasound: sound waves with frequencies higher than those perceived by the human ear, i.e. above 20,000 Hz.
Infrasound: sound vibrations with frequencies lower than those perceived by the human ear, i.e. below 16 Hz.
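The three ranges listed above can be expressed as a simple classification. This is a minimal sketch that uses the 16 Hz and 20,000 Hz limits given in the lesson; treating the limits themselves as audible is an assumption of the sketch, and the names are illustrative.

```python
AUDIBLE_MIN_HZ = 16.0      # lower limit of hearing used in this lesson
AUDIBLE_MAX_HZ = 20_000.0  # upper limit of hearing used in this lesson


def classify_frequency(freq_hz: float) -> str:
    """Classify a frequency as infrasound, audible sound or ultrasound."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    if freq_hz < AUDIBLE_MIN_HZ:
        return "infrasound"
    if freq_hz > AUDIBLE_MAX_HZ:
        return "ultrasound"
    return "audible"


for freq in (10, 440, 25_000):
    print(freq, "Hz:", classify_frequency(freq))
```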

Thus, the higher the vibration frequency, the higher the sound. In the context of our course on voice development, we are interested in the audible range, i.e. 16-20,000 Hz. In the lower part of the range, sound is subjectively perceived as dull and bassy, in the upper part as thinner and more ringing. The entire audible range is laid out along the so-called note-octave scale (see Fig. 1a), which is built on doubling: each octave starts at twice the frequency of the previous one.

The fact is that sounds whose frequencies differ by a factor of 2 (twice as high or twice as low) are perceived by the ear as similar. This table is well known to musicians; it is shown to everyone else to give an idea of how great the capabilities of human hearing and, accordingly, of the voice are. You don’t have to delve into the designations of notes and octaves for now. We will touch on this topic when we talk about the development of the singing voice.
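Since each octave is a doubling of frequency, the number of octaves between two frequencies is the base-2 logarithm of their ratio. The sketch below applies this to the audible range quoted in the lesson; the function name is illustrative.

```python
import math


def octaves_between(low_hz: float, high_hz: float) -> float:
    """Return the number of octaves (frequency doublings) from low_hz up to high_hz."""
    return math.log2(high_hz / low_hz)


# Width of the audible range used in this lesson, 16 Hz to 20,000 Hz.
print(f"{octaves_between(16, 20_000):.2f} octaves")
```

By this measure the audible range spans a little over ten octaves.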

So we come to what voice is and how it differs from sound. Sound is a broader concept. In the context of our course, sound is absolutely everything we can hear. This is the singing of birds, the rustling of grass, the splash of water, the roar of a motor, the hum of a printer, the clink of glasses and, of course, the human voice.

The voice is the result of the work of the vocal apparatus and sound production organs (we will talk about their structure from an anatomical point of view later). The capabilities of the voice are somewhat less than the capabilities of the human ear, in the sense that even record holders cannot cover the entire gamut of sounds with their voices in the range of 16-20,000 Hz. True, some of them may go beyond the audible range.

Record-breaking voices from the Guinness Book of Records:

The highest vocal note among men, “F sharp” of the 5th octave (5,989 Hz), was sung by Amirhossein Molai in Tehran (Iran) on July 31, 2021 [Guinness World Records, 2019].
The highest note among women, “G” of the 7th octave (25,087 Hz), was sung by Brazilian singer Georgia Brown in 2004. Technically this note is not musical. Georgia Brown also holds the world record for the widest vocal range among women. Her range extends from the “G” of the major octave (98 Hz) to the “G” of the 7th octave (8 octaves in total) [Guinness World Records, 2004].
The lowest vocal note sung by a woman was 57.9 Hz, which is slightly higher than the “A” of the counter octave. It was sung by Maryana Pavlova (UK) in Wallington, Greater London, UK, on June 3, 2021 [Guinness World Records, 2019].
The lowest vocal note produced by a man is G-7 (0.189 Hz), achieved by singer and songwriter Tim Storms (USA) at Citywalk Studios in Branson, Missouri, USA, on March 30, 2012. The frequency output of Timothy's voice was measured using Bruel & Kjaer equipment (low-frequency microphone, precision audio analyzer and post-analysis laptop) [Guinness World Records, 2012].

By the way, music does not usually use the entire audible range. You can easily verify this by looking at a piano keyboard. Its 88 keys (36 black and 52 white) cover the range from the “A” of the subcontra octave (27.5 Hz) to the “C” of the 5th octave (4,186 Hz). This is quite sufficient to perform any piece of music that is comfortable for the human ear.
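The two end frequencies of the piano can be reproduced with a short formula. The sketch below assumes twelve-tone equal temperament with the 49th key (the “A” of the 1st octave) tuned to 440 Hz, which is a common convention but is not stated in the lesson; the function name is illustrative.

```python
def piano_key_frequency(key_number: int, a4_hz: float = 440.0) -> float:
    """Return the frequency of a piano key, numbered 1 (lowest) to 88 (highest).

    Assumes equal temperament: each key is a factor of 2 ** (1 / 12) above
    the previous one, and key 49 is tuned to a4_hz.
    """
    if not 1 <= key_number <= 88:
        raise ValueError("a standard piano has keys 1 to 88")
    return a4_hz * 2 ** ((key_number - 49) / 12)


for key in (1, 40, 49, 88):
    print(f"key {key}: {piano_key_frequency(key):.2f} Hz")
```

Under these assumptions key 1 comes out at 27.5 Hz and key 88 at about 4,186 Hz, matching the range given above.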

You can check the capabilities of your own voice by downloading Pano Tuner and allowing the application access to the microphone. Try to sing the highest and the lowest note currently available to you, but make no more than three attempts in a row, because more can overstrain the vocal apparatus. Record the result and repeat the experiment after completing our course. If you have never trained your voice or worked on expanding your range before, you may well be able to expand it once you have studied the anatomy of the voice and the techniques that help with sound production.

There are also online options that do not require downloading an app, for example the Vocal Max service. To begin, click the “Start” button, choose the note that is most comfortable for you to start with, and sing it. As soon as you hit it, it changes color, and the next step is possible only to one of the adjacent notes, up or down. If you are not yet familiar with notes, try singing the note “C” of the 1st octave: it is within almost everyone’s range. After studying the previous illustration, you can easily find it yourself.

As you sing each subsequent note, it will also change color. When you have exhausted your range, click “Finish” and save the result, for example as a screenshot. Most likely, after 1-2 months of training, your range will become wider.

And finally, speech. Speech is the joint result of the work of the vocal apparatus and thinking. To make simple sounds (screaming, crying, moaning and others) we only need the vocal apparatus, whereas for speech we first need to understand what we want to say. Speech also uses a smaller vocal range than these other kinds of sounds.

There is a distinction between internal and external speech, but in the context of our course we are interested, first of all, in the development of external speech, where the organs of sound production are involved. At the same time, beautiful external speech is impossible without the development of internal speech, i.e. planning and control “in the mind” of speech actions, internal pronunciation of planned phrases.

By the way, in our course you will learn that sound production can also be planned! “In your mind” you can rehearse not only the text of the future message, but also its emotional intensity, volume, and pitch. A meaningful approach to sound production, knowledge of the anatomy of the voice and an understanding of how certain movements and positions of the sound production organs affect your voice and your speech will help with this.

Structural components of speech

Two brain centers are responsible for the speech function:

The sensory speech center handles the perception of speech sounds, based on the language's system of sound discrimination; Wernicke's area in the left hemisphere of the brain is responsible for this process.
The motor speech center is served by Broca's area, which makes it possible to produce sounds, words and phrases.

In this regard, clinical psychology uses the concept of impressive speech, that is, the perception and understanding of oral and written speech. There is also the concept of expressive speech: speech that is spoken aloud with a certain tempo, rhythm and emotion.

In the process of speech formation, each person should have a clear understanding of the following subsystems of their native language:

phonetics (which syllables and sound combinations are possible, and their correct structure and combination);
syntax (understanding exactly how words relate to and combine with one another);
vocabulary (knowledge of the words of the language);
semantics (the ability to understand the meaning of words long before acquiring pronunciation skills);
pragmatics (the relationships between sign systems and those who use them).

The phonological component of a language means knowledge of its meaning-distinguishing units (phonemes). Physically, speech sounds can be divided into noises (consonants) and tones (vowels). Every language relies on a set of distinctive features; if one of them is changed, the meaning of a word can change completely. The main distinctive features include voicelessness and voicing, softness and hardness, and stressed versus unstressed position. These features form the basis of the phonemes of the language system. Languages differ in the number of phonemes they have, usually from 11 to 141.

The Russian language involves the use of 42 phonemes, in particular, 6 vowels and 36 consonants.

Any healthy infant in the first year of life is reported to be able to produce 75 different minimal sound units, in other words, is capable of learning any language. Most often, however, children in the early stages of development grow up in a single language environment, so over time they lose the ability to produce sounds that do not belong to their native language.

What is sound and voice in terms of anatomy?

Voice is a sound produced by exhaled air, vibration of the vocal cords and resonance. In this case, the vocal folds (or cords - we will use both terms) are involved only in fine control of the voice, and the main work is done by air flow and resonance.

To understand how the voice works and how speech is formed, we need to study the structure of the human sound apparatus. Let's divide the problem into two components: how the vocal apparatus works and how hearing works. Let's start with the voice.

General structure of the respiratory system and vocal apparatus:

1	lungs;
2	rib cage;
3	diaphragm;
4	abdominal muscles;
5	bronchi;
6	trachea;
7	larynx;
8	vocal cords (located inside the larynx);
9	pharynx;
10	oral cavity;
11	nasal cavity with accessory cavities;
12	elements of the nervous system that conduct and transmit signals connecting the vocal organs with the brain centers.

We consider the respiratory system and vocal apparatus as a single whole, because sound production occurs due to exhaled air. We strongly recommend that you first study the structure of the respiratory system and vocal apparatus in the most general schematic form, and only then move on to a detailed study, analyzing illustrations with an increasing number of elements and details.

Here's a simple illustration:

Here's a more complicated one:

This approach will allow you not to get confused when studying sound production and voice formation, and to maintain a gradual transition from simple to complex.

Simple illustration:

And more complex:

Now let's move on to how exactly sound production occurs. Let's look at the process step by step.

Sound production - how it happens:

Sound is produced by the flow of air exhaled from the lungs.
The air stream sets the vocal folds in motion.
The pitch of the voice is determined by the length and tension of the vocal cords. The greater the tension, the higher the voice. The longer the vocal cords, the lower the voice.
The strength of the voice is determined by how tightly the vocal cords close and by the air pressure.
The degree of tension of the vocal cords changes as the internal muscles of the larynx contract.
The vocal cords are attached to the arytenoid cartilages and to the thyroid cartilage, whose displacement determines the position of the cords.
With the help of the tongue, lips, soft and hard palate, one or another shape of the oral cavity is created, which determines which sound is produced.

This is how the voice is formed:

And this is the speech:

Let us dwell in more detail on how voice pitch depends on the vocal cords. The pitch of the voice, i.e. the frequency of sound vibrations, depends on both the tension and the length of the vocal cords.

Short cords vibrate more easily, so they can complete more vibrations per unit of time. The more vibrations, the higher the voice. Long cords are harder to set “swinging”, so they complete fewer vibrations per unit of time. The fewer vibrations, the lower the voice.
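The same dependence on length and tension appears in the textbook formula for an ideal stretched string, f = sqrt(T / mu) / (2 * L). The sketch below uses that formula purely as an analogy: real vocal cords are layered tissue rather than ideal strings, and the numbers in the example are hypothetical values chosen only to show the trend, not measurements.

```python
import math


def ideal_string_frequency(length_m: float, tension_n: float,
                           linear_density_kg_per_m: float) -> float:
    """Fundamental frequency of an ideal stretched string, in Hz.

    An analogy for the vocal cords, not a physiological model.
    """
    wave_speed = math.sqrt(tension_n / linear_density_kg_per_m)
    return wave_speed / (2 * length_m)


# Hypothetical values, chosen only to show the trend.
base = ideal_string_frequency(0.016, 0.5, 0.001)
longer = ideal_string_frequency(0.032, 0.5, 0.001)   # double the length
tighter = ideal_string_frequency(0.016, 2.0, 0.001)  # four times the tension

print(f"longer / base:  {longer / base:.2f}")
print(f"tighter / base: {tighter / base:.2f}")
```

In this idealized model, doubling the length halves the frequency, while the tension has to be quadrupled to double it.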

This phenomenon can be figuratively compared to the wing beats of birds. A miniature hummingbird beats its wings 50 to 80 times per second, and in some species up to 200 times per second, while a noticeably larger bird such as the stork makes only 2 (two!) wing beats per second.

Let us illustrate what has been said:

Looking at the diagram of the dependence of voice pitch on the length of the vocal cords, we can roughly say that basses are our storks, and sopranos are hummingbirds. This division, adopted in academic opera, is very arbitrary. There are people whose vocal range extends far beyond any one voice type.

For example, the singer Dimash has a range of 6 octaves and 5 semitones, from the “A” of the counter octave to the “D” of the 5th octave. Thus, his voice fully accommodates the range in which baritone, tenor, alto and soprano sing, and also captures the upper part of the bass register and part of the so-called “whistle register”, which extends beyond the upper notes of the soprano. After watching Dimash’s performance, you can hear how delicately he sings the notes of the melody (not to be confused with the lyrics of the song!) in the whistle register:

Here we come to the difference between the formation of voice and the formation of speech. From the first part of our lesson, you already know that voice is a broader concept than speech. Voice without speech is possible (for example, screaming, moaning, crying), but speech without voice is not.

The position of the tongue, lips, and soft and hard palate plays an important role in the formation of speech. With their help, we give the oral cavity one shape or another. The shape of the oral cavity directs the air flow in a particular way, which accounts for the differences between the sounds produced with different positions of the tongue, lips, and soft and hard palate.

Now a few words about where to direct the air flow. By default, the air flow is directed into resonators: cavities inside the body in which the vibrating air is reflected and reinforced as sound. The main resonators of the human voice are the pharynx, the oral and nasal cavities, and the trachea. The paranasal sinuses, parietal bone, and other cavities inside the skull are also capable of resonating.

This, by the way, explains why we hear our own voice differently than the people listening to us. We perceive wave vibrations from the skull and other resonating cavities that pass through our body. Thus, we perceive the voice we emit not only through hearing. And all our listeners perceive our voice exclusively through their hearing. Therefore, if we want to find out how others hear our voice, we need to make an audio recording - for example, on a voice recorder in a smartphone.

Due to the fact that our course is of an applied nature, it would be appropriate to clarify a couple of points here. In most courses teaching public speaking, vocals, and speech technique, you will come across such a concept as a “chest resonator.” This is a certain convention due to the fact that it is difficult for the average person to imagine where his trachea is, but it is very easy to feel the vibrations inside the chest.

Therefore, if you want to expand the lower part of your vocal range, work on the beautiful sound of the low notes of your voice, you should be prepared for the fact that you will come across formulations like “working the chest resonator”, “exercises to develop the chest resonator”, etc.

Looking ahead, let's say that in lesson No. 2 we will touch on the topic of opening resonators, including the chest one, and explain the terms in more detail, so that in the future you can independently use any books and online resources on in-depth training in speech and/or vocal techniques.

So, resonators are natural amplifiers of our voice inside our body, allowing us to make our voice higher or lower, louder or quieter, and enrich the color of the timbre. Having learned to control the flow of air and direct it to one or another resonator, you will learn to control your voice as you see fit.

To understand what resonance is, think back to childhood, roughly the middle or senior group of kindergarten, when we blew on a piece of paper and got a real whistle, or made a “telephone” from cups and string. The point is that the string must be taut; only then can sound travel along it. By the way, you can repeat this experiment now and, at the same time, introduce your children or nephews to the basics of physics. Video instructions are attached:

By the way, note that at 01:25 an acoustic guitar comes into the frame. The sound of a guitar is also an example of resonance. The vibrations pass through the sound hole into the body and are reinforced as sound, the pitch of which depends on which string you press and at which fret. Here the strings are the analogue of the vocal cords. An electric guitar with a solid soundboard also has resonating cavities inside, they are just not visible from the outside. The electrical part only amplifies the mechanical vibrations perceived by human hearing. And now we come to how hearing works.