Phonetics and phonology

The key principles that underlie the study of linguistics.

International Phonetic Alphabet (IPA)
Onset, nucleus, and coda
Tonal languages

Defining phonetics

Phonetics and phonology are two subfields of linguistics that study the sounds of language. While both are concerned with speech sounds, they differ in scope and focus. Phonetics is the study of the physical properties of speech sounds.

It is concerned with how speech sounds are produced, transmitted, and perceived by the human auditory system. Phonetics is an experimental science that uses a range of tools and techniques to measure and analyze the physical properties of speech sounds, including spectrograms, speech synthesizers, and articulatory models.

Phonology, on the other hand, is the study of the abstract, mental representations of speech sounds that are used by speakers of a language. It is concerned with the ways in which speech sounds are organized and patterned in a language, and the rules that govern their use.

Phonology deals with the sound patterns of a language at the level of mental representations, rather than the physical properties of speech sounds.

While phonetics deals with the physical properties of speech sounds, phonology is concerned with the mental categorization and patterning of those sounds.

Articulatory, acoustic, and auditory phonetics

Phonetics is traditionally divided into three subfields: articulatory phonetics, acoustic phonetics, and auditory phonetics.

Articulatory phonetics is concerned with the way speech sounds are produced in the human vocal tract, including the movements and positioning of the articulators, such as the lips, tongue, and jaw, that are used to produce speech sounds.

Articulatory phonetics uses techniques such as X-ray recordings and electromagnetic articulography to study the movements of the articulators and to understand how speech sounds are produced.

Acoustic phonetics is concerned with the physical properties of speech sounds as they are transmitted through the air. It is concerned with the way that speech sounds are represented in the acoustic signal, including their frequency and amplitude, and the way that they change over time. Acoustic phonetics uses techniques such as spectrograms and acoustic analysis software to study the acoustic properties of speech sounds.

Auditory phonetics is concerned with the way that speech sounds are perceived by the human auditory system. It is concerned with the way that speech sounds are transformed from the acoustic signal into a representation in the auditory system, and the way that this representation is processed by the brain to form the perception of speech sounds.

Auditory phonetics uses techniques such as psychoacoustic experiments and brain imaging to study the way that speech sounds are perceived by the human auditory system.

Phonetic transcription

Phonetic transcription is the representation of speech sounds in written form, using symbols to represent specific sounds. The goal of phonetic transcription is to accurately capture the pronunciation of a word or speech sound, so that it can be easily recognized and understood by someone who is familiar with the transcription system being used.

There are several different systems of phonetic transcription, each with its own set of symbols and conventions. The most widely used system is the International Phonetic Alphabet (IPA), which is used by linguists and speech-language pathologists to describe the sounds of the world’s languages. In IPA transcription, each symbol represents a specific speech sound, or phoneme.

For example, the symbol “p” represents the voiceless bilabial plosive sound, which is produced by pressing the lips together and quickly releasing the air pressure to create a burst of sound.

In English, “a” in “about”, “e” in “taken”, “i” in “pencil”, “o” in “memory”, and “u” in “supply” all represent the same sound: the unstressed central vowel known as schwa. In Albanian, the same sound is represented with “ë”.

Thus, one sound can be represented in drastically different ways across different languages and even in the same language. In the IPA, the schwa is represented with “ə”. By using one standardized transcription, linguists working on different languages can communicate easily without having to rely on the vagaries and idiosyncrasies of the individual languages’ spelling conventions.

Phonetic features of speech sounds

Phonetic features of speech sounds are the basic properties of speech sounds that are relevant for capturing the distinctive sounds of a language. Phonetic features are classified into three main categories: the place of articulation, manner of articulation, and voicing.

Place of articulation refers to the location in the vocal tract where two articulators come into contact or proximity to produce a speech sound. For example, sounds produced with the tongue touching the alveolar ridge (the area just behind the upper teeth), such as “t”, “d”, and “n”, are said to have an alveolar place of articulation.

Some other places of articulation include bilabial (lips, e.g. “p”, “b”, “m”), labiodental (lips and upper teeth, e.g. “f”, “v”), dental (tongue and upper teeth, e.g. “th”), velar (tongue and soft palate, e.g. “k”, “g”), and glottal (vocal cords).

Manner of articulation refers to the way in which two articulators interact to produce a speech sound. For example, sounds produced by a complete closure of the vocal tract, such as “p”, “d”, or “k”, are said to have a plosive (or stop) manner of articulation.

Voicing refers to the vibration of the vocal cords during the production of a speech sound. Sounds produced with the vocal cords vibrating are said to be voiced, while sounds produced without vocal cord vibration are said to be unvoiced.

Phonemes, allophones, and phonological rules

Phonology is the study of the abstract, mental representations of speech sounds in a language. It deals with the systematic organization of speech sounds. There are several basic concepts in phonology that are essential for understanding the field.

A phoneme is the smallest unit of sound that can change the meaning of a word. Phonemes are abstract entities, or mental representations of speech sounds that allow us to distinguish between different words. For example, the difference between the words “pat” and “bat” is a single phoneme, represented as “p” and “b”.

An allophone is a particular realization of a phoneme in speech. Allophones are the physical speech sounds that we actually hear, and they vary depending on context.

For instance, the voiceless stops “p”, “t”, and “k” have different allophonic realizations in English depending on their position in a word. The sound “t” is aspirated, i.e. pronounced with a stronger burst of air, at the beginning of a word, as in “top”, but not after an “s”, as in “stop”. Try putting your palm in front of your mouth to feel the difference in the strength of the burst.

Phonologists use the symbol “tʰ” to represent the stronger (aspirated) sound (i.e. “tʰop” vs. “stop”). Thus, in English, “tʰ” is an allophone, or a contextual variation, of “t”.

A phonological rule is a general principle that governs the behavior of phonemes in a language. Phonological rules describe how phonemes combine and change in different contexts. For example, the rule which captures the English stop allophony discussed previously could be stated as “at the beginning of a word, voiceless stops become aspirated.”
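The aspiration rule just stated can be sketched as a small program. This is only an illustrative toy, assuming a simplified transcription in which aspiration is marked by appending “ʰ” directly after the consonant:

```python
# Toy phonological rule: "at the beginning of a word,
# voiceless stops become aspirated."
VOICELESS_STOPS = {"p", "t", "k"}

def aspirate(word: str) -> str:
    """Mark a word-initial voiceless stop as aspirated with 'ʰ'."""
    if word and word[0] in VOICELESS_STOPS:
        return word[0] + "ʰ" + word[1:]
    return word

print(aspirate("top"))   # tʰop  (word-initial "t" is aspirated)
print(aspirate("stop"))  # stop  (no change: "t" follows "s")
```

Writing the rule this way makes its context-sensitivity explicit: the same phoneme “t” surfaces differently depending on its position in the word.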

Phonological processes

Phonology is the study of the patterns of speech sounds in a language, and one of the key areas of investigation in phonology is the study of phonological processes.

Phonological processes are the ways in which speech sounds are modified or altered to fit the patterns and rules of a particular language. Cross-linguistically common processes include assimilation, lenition, and epenthesis.

Assimilation is a process by which a speech sound changes its place, manner, or quality to become more similar to a neighboring sound. For example, in English, the word “input” is usually pronounced as “imput”, as the nasal “n” assimilates in place to the following stop “p” and becomes “m”.
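The “input” → “imput” change can be sketched as a toy rewrite rule. This works on plain spelled-out strings for illustration only; real phonological analysis operates on sounds, not letters:

```python
# Toy place assimilation: "n" becomes "m" before a bilabial stop.
BILABIAL_STOPS = {"p", "b"}

def assimilate(word: str) -> str:
    """Rewrite 'n' as 'm' whenever the next letter is 'p' or 'b'."""
    out = []
    for i, ch in enumerate(word):
        if ch == "n" and i + 1 < len(word) and word[i + 1] in BILABIAL_STOPS:
            out.append("m")
        else:
            out.append(ch)
    return "".join(out)

print(assimilate("input"))  # imput
```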

Lenition is a process by which a speech sound becomes less strong or less pronounced. This often occurs in unstressed syllables, and involves the weakening of consonants.

For example, observe that the “t” in “better” is hardly ever realized as “t”. Rather, it is typically a softer “d” sound (“bedder”, common e.g. in the US) or a short pause (“be-er”, common e.g. in the UK).

Epenthesis is the process of inserting a sound into a word to make its pronunciation easier. For example, in English, the consonant “p” can be added in the word “hamster” between “m” and “s”, making it sound like “hampster”.

Phonotactics and syllable structure

Phonotactics is the study of how sounds are put together to form syllables and words. It looks at the rules that govern which combinations of sounds can occur in a language, such as whether certain consonants or vowels can be followed by another sound.

For example, in English, a word can begin with the “sp” sequence, as in “spy”. In Spanish, on the other hand, the “sp” sequence cannot occur at the beginning of a word. Hence, the English word “spy” is borrowed as “espiar”, with the addition of the epenthetic “e” at the beginning of the word.


Syllable structure refers to the way in which syllables are formed within a language. A syllable is composed of three basic parts: the onset, the nucleus, and the coda.

The onset is all the consonants that precede the vowel in a syllable. The nucleus is the vowel. The coda comprises all the consonants that follow the nucleus. For example, in the word “sixths”, “s” is the onset, “i” is the nucleus, and “xths” is the coda.

Note that languages can place different restrictions on the different components of the syllable: While “s” can appear in the coda (e.g. “less”), there are no syllables in English that start with “xths”.
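The onset/nucleus/coda split described above can be sketched for simple one-syllable English words. This toy version works on spelling, treating the vowel letters “a, e, i, o, u” as the nucleus; genuine syllabification operates on sounds rather than letters:

```python
# Toy syllable parser for monosyllabic words:
# onset = consonants before the vowel(s),
# nucleus = the vowel letter(s), coda = consonants after.
VOWELS = set("aeiou")

def split_syllable(word: str) -> tuple[str, str, str]:
    first = next(i for i, ch in enumerate(word) if ch in VOWELS)
    last = max(i for i, ch in enumerate(word) if ch in VOWELS)
    return word[:first], word[first:last + 1], word[last + 1:]

print(split_syllable("sixths"))  # ('s', 'i', 'xths')
print(split_syllable("less"))    # ('l', 'e', 'ss')
```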

Phonological diversity around the world

A wide variety of sound systems is found across different languages. Some of them are unique or uncommon, and provide valuable insights into the diversity of human language.

For instance, many African and Asian languages are tonal. In tonal languages, the meaning of a word can change depending on the tone used to pronounce it, with different tones indicating different grammatical or lexical meanings.

In Mandarin Chinese, the word “ma” can have four different meanings depending on the tone used to pronounce it: “mā” with a high flat tone means “mother”, “má” with a rising tone means “hemp”, “mǎ” with a dipping (falling and rising) tone means “horse”, “mà” with a sharp falling tone means “to scold”.

Pulmonic languages

In European languages, consonants are almost exclusively pulmonic; that is to say, the airstream used for making those sounds is produced by air pressure from the lungs.

However, many of the world’s languages also have ejective, implosive, and click consonants articulated by creating pressure in the oral cavity. Ejective consonants (found e.g. among the Afroasiatic and Mayan languages) are produced by constricting and raising the glottis, which creates pressure in the mouth and leads to a dramatic burst of air.

Implosive consonants, widespread among the languages of Sub-Saharan Africa and Southeast Asia, are articulated by moving the glottis downward. Finally, some African languages, such as Xhosa and Zulu, are known for their unique use of click consonants, which are articulated with two closures in the mouth.

Clicks that English speakers may be familiar with include “tsk-tsk!” used to express disapproval or pity and the “clip-clop” sound made by children to imitate a horse’s trotting.

Suprasegmental phonology: Stress and prosody

Suprasegmental phonology refers to the study of linguistic units that extend beyond individual speech sounds, such as stress and prosody. Stress refers to the relative emphasis placed on certain syllables within a word.

Stress can be contrastive, which means that changing which syllable is emphasized can change the meaning of the word. For example, “insight” is stressed on the first syllable, and “incite” is stressed on the second syllable. The words have two different meanings despite having the exact same segmental features (i.e. same vowels and consonants).

Prosody refers to the melody, rhythm, and intonation of speech. It encompasses a range of linguistic features, including pitch, stress, and duration, and is used to convey meaning in ways that go beyond the words themselves. Prosody is important for conveying emotions, but also for signaling emphasis and making distinctions between different sentence types. For example, in English, declarative statements tend to have falling intonation at the ends of the sentence.

Yes/no questions tend to have rising intonation at the end of the sentence. Despite seeming very natural to English speakers, these intonational patterns are not universal. In some languages, for example, subordinate clauses have rising intonation, and yes/no questions are not distinguished from declarative sentences. Thus, prosody is a language-specific system for communicating meaning, just like vowels, consonants, and stress.

The phonology of sign languages

The phonology of sign languages refers to the study of the organization of the system of signs in sign languages, which are used by deaf communities as a means of communication.

Sign languages are complex and dynamic systems that share many similarities with spoken languages, including syntax, morphology, and phonology. In contrast to spoken languages, sign languages do not have a sound-based phonology, but rather a visual-spatial phonology.


The basic unit of this phonology is the sign, which can be thought of as equivalent to a word in a spoken language. Signs are made up of a combination of handshapes, movements, and facial expressions, each of which contributes to the meaning of the sign.

Sign languages have a phonological structure that is similar to that of spoken languages. Just as spoken languages have phonemes (the smallest units of sound that convey a difference in meaning), sign languages have phonological units that are used to create meaning.

For example, changes in hand shape, movement, facial expression, or the location of the sign can change the meaning of a sign, just as changing or modifying a sound can change the meaning of a word in a spoken language.
