Proto-Indo-European language, hypothetical language that is the assumed ancestor of the Indo-European language family. Proto-Indo-European (often shortened to PIE) has been linguistically reconstructed from existing Indo-European languages; no record of it exists, as its speakers did not have writing. The language was probably spoken about 6,000 years ago, although some scholars have argued for an earlier date. The prefix “proto” derives from the Greek prōtos, which means “first,” and denotes the language’s status as the ancestor of the Indo-European languages that followed, making it the forerunner of the Romance, Germanic, Slavic, Celtic, Iranian, Indian, and Anatolian languages.

Discovery and reconstruction

That the Romance languages were descended from Latin and thus constituted one “family” had been known for centuries, but the existence of the Indo-European family of languages and the nature of their genealogical relationship were first demonstrated in the 19th century. The demonstration was built on earlier work by scholars who observed that the classical Indian language Sanskrit bore several striking resemblances to both Greek and Latin. In 1786 the British orientalist Sir William Jones brought greater attention to these observations when he put forth the hypothesis that all three languages must have some common source, which likely no longer existed.

The next important step came in 1822, when the German scholar Jacob Grimm demonstrated a number of systematic correspondences between the sounds of Germanic and the sounds of Greek, Latin, and Sanskrit in related words; this set of correspondences is now called Grimm’s Law. By the 1870s linguists had a strong understanding of consonant mapping across Indo-European languages, but vowel systems and distributions in PIE were yet to be determined. In 1878 Ferdinand de Saussure hypothesized that all verbs in the ancestral language had just one core vowel and illustrated how vowel alternations occurred as the languages evolved. The early 20th-century decipherment of the Hittite language of ancient Anatolia validated Saussure’s hypothesis, solidifying the structure of the proposed Proto-Indo-European language.
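The systematic character of such correspondences can be illustrated with a small lookup table. The sketch below encodes the textbook statement of Grimm’s Law, which this article does not spell out and which is supplied here as an assumption: the Proto-Indo-European voiceless stops became Germanic fricatives, the voiced stops became voiceless, and the voiced aspirates became plain voiced stops. Only the labial, dental, and velar series are covered, exceptions such as Verner’s Law are ignored, and the names GRIMM and grimm_shift are illustrative.

```python
# Textbook Grimm's Law for the labial, dental, and velar stops (simplified).
GRIMM = {
    "p": "f", "t": "\u00fe", "k": "h",   # voiceless stops -> fricatives (\u00fe is thorn, th in "thin")
    "b": "p", "d": "t", "g": "k",        # voiced stops -> voiceless stops
    "bh": "b", "dh": "d", "gh": "g",     # voiced aspirates -> plain voiced stops
}


def grimm_shift(pie_stop: str) -> str:
    """Return the expected Germanic reflex of a PIE stop, or the input unchanged."""
    return GRIMM.get(pie_stop, pie_stop)


if __name__ == "__main__":
    for stop in ("p", "d", "bh"):
        print(stop, "->", grimm_shift(stop))
```

Latin ped- and English foot, both cited later in this article, show two of these correspondences at once: p answers to f, and d answers to t.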


Origin and divergence

One of the major questions historically investigated by scholars has been where people spoke Proto-Indo-European or, as it is often phrased, the location of the Indo-European homeland. While a number of homelands have been suggested, in the 21st century most scholars have converged on two primary locations: Neolithic Anatolia and the Bronze Age Pontic-Caspian steppes.

Some researchers, most notably Colin Renfrew, have argued that Proto-Indo-European found its home in Neolithic Anatolia (c. 7th millennium bce). Renfrew and other proponents of this hypothesis assert that Indo-European languages spread into Europe with the Neolithic package that brought agriculture to the region. The spread of Indo-European languages to Central and South Asia is thought to have occurred either with a spread of the Neolithic from Anatolia eastward or centuries later with a Bronze Age migration.

In the more popular of the two hypotheses, Proto-Indo-European is believed to have been spoken about 6,000 years ago, in the Pontic-Caspian steppe region north of the Black and Caspian seas, in present-day Ukraine, southern Russia, and into Kazakhstan. Subgroups are believed to have diverged and spread across the Near East, North India, and Europe in the fourth and third millennia bce.

PIE diverged into several branches, though linguists have not found a reliable and precise way to determine from linguistic evidence alone the date at which any set of related languages must have begun diverging. The branches generally recognized are Anatolian, Indo-Iranian, Greek, Italic, Germanic, Armenian, Tocharian, Celtic, Balto-Slavic, and Albanian.


Structure

The phonology and morphology of PIE have been extensively reconstructed, but less is known about PIE’s vocabulary. From linguistic evidence alone, scholars are unable to determine if a particular word existed or not in PIE; they can only state the probability of its existence.

Phonology

Consonants

Proto-Indo-European probably had 15 stop consonants. In the following grid these sounds are arranged according to the place in the mouth where the stoppage was made and the activity of the vocal cords during and immediately after the stoppage:

                    labial   dental   palatal   velar   labiovelar
voiceless           p        t        ḱ         k       kʷ
voiced              b        d        ǵ         g       gʷ
voiced aspirated    bh       dh       ǵh        gh      gʷh

A labial sound is made with the lips, and a dental sound is made with the tip of the tongue against the back of the teeth. The palatal and velar sounds were probably made by contact between the back of the tongue and the soft palate—more toward the front of the mouth in the case of the palatals and more toward the back in the case of the velars (compare Arabic kalb ‘dog’ versus qalb ‘heart’). The labiovelar sounds were made by contact between the back of the tongue and the soft palate with concomitant rounding of the lips. Voiceless designates sounds made without vibration of the vocal cords; voiced sounds are pronounced with vibration of the vocal cords. The exact pronunciation of the voiced aspirates is somewhat uncertain; they were probably similar to the sounds transcribed bh, dh, and gh in Hindi.
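The two dimensions of the stop system, place of articulation and activity of the vocal cords, lend themselves to a simple lookup structure. The sketch below is one possible encoding, using the traditional symbols that appear elsewhere in this article; the names PLACES, SERIES, STOPS, and stops_at are illustrative assumptions rather than any standard notation.

```python
# The 15 reconstructed stops, indexed by series (voicing) and place.
PLACES = ("labial", "dental", "palatal", "velar", "labiovelar")
SERIES = {
    "voiceless":        ("p",  "t",  "ḱ",  "k",  "kʷ"),
    "voiced":           ("b",  "d",  "ǵ",  "g",  "gʷ"),
    "voiced aspirated": ("bh", "dh", "ǵh", "gh", "gʷh"),
}

# Map each (series, place) pair to its symbol.
STOPS = {
    (series, place): symbol
    for series, row in SERIES.items()
    for place, symbol in zip(PLACES, row)
}


def stops_at(place: str) -> list:
    """Return the stops made at one place, in the order the series are listed."""
    return [STOPS[(series, place)] for series in SERIES]


if __name__ == "__main__":
    print(len(STOPS), "stops in total")
    print("dental:", stops_at("dental"))
```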

Correspondences pointing to the voiced labial stop b are rare, leading some scholars to deny that b existed at all in the parent language. A minority view holds that the traditionally reconstructed voiced stops were actually glottalized sounds produced with accompanying closure of the vocal cords. The status of the velar stops k, g, and gh has likewise been questioned. The earlier view that Proto-Indo-European had a series of voiceless aspirated stops ph, th, ḱh, kh, and kʷh has largely been abandoned. (Aspirated consonants are sounds accompanied by a puff of breath.) There was one sibilant consonant, s, with a voiced alternant, z, that occurred automatically next to voiced stops. The existence of a second apical spirant (that is, a spirant formed with the tip of the tongue), þ (with a presumed pronunciation like that of th in English thin), is extremely uncertain.

There is general agreement that Proto-Indo-European had one or more additional consonants, for which the label laryngeal is used. These consonants, however, have mostly disappeared or have become identical with other sounds in the recorded Indo-European languages, so their former existence has had to be deduced mainly from their effects on neighboring sounds. Hence, the laryngeal sounds were not suspected until 1878, and even then they were rejected by most scholars until after 1927, when the Polish linguist Jerzy Kuryłowicz showed that Hittite often has ḫ (perhaps a velar spirant like the ch in German ach) in places where a laryngeal had been posited on the evidence of the other Indo-European languages. There is still considerable disagreement about how many laryngeals there were, what they sounded like, what traces they left, and how best to symbolize them. Most scholars now believe there were three, which can be written H1, H2, and H3. Of these, H1 may have been h or a glottal stop; H2 was perhaps a pharyngeal spirant like Arabic ḥ in ḥajar ‘stone’; H3, whatever its other features, was probably voiced. The principal traces they left outside Anatolian are in the quality and length of neighboring vowels, H2 changing a neighboring e to a, and probably H3 changing it to o, while all laryngeals lengthened a preceding vowel in the same syllable. In Anatolian H2 and H3 remained as ḫ, at least in some positions.
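The vowel effects just described can be written out as a short rule cascade. The following is a minimal sketch, not a full model of laryngeal theory. It assumes roots written with H1, H2, and H3, treats any laryngeal not followed by a vowel as belonging to the same syllable as the preceding vowel, and simply deletes the remaining laryngeals, so it does not handle laryngeals between consonants or the Anatolian outcomes. The function name laryngeal_reflex is hypothetical.

```python
import re

# Vowel "color" imposed on a neighboring e by each laryngeal.
COLOR = {"1": "e", "2": "a", "3": "o"}
# Long counterparts of the short vowels (a-macron, e-macron, o-macron).
LONG = {"a": "\u0101", "e": "\u0113", "o": "\u014d"}


def laryngeal_reflex(root: str) -> str:
    """Apply coloration, lengthening, and laryngeal loss, in that order."""
    # 1. Coloration: e next to H2 becomes a, e next to H3 becomes o.
    s = re.sub(r"H([123])e", lambda m: "H" + m.group(1) + COLOR[m.group(1)], root)
    s = re.sub(r"eH([123])", lambda m: COLOR[m.group(1)] + "H" + m.group(1), s)
    # 2. Lengthening: vowel + laryngeal, when no vowel follows, gives a long vowel.
    s = re.sub(r"([aeo])H[123](?![aeo])", lambda m: LONG[m.group(1)], s)
    # 3. Any laryngeal still present is lost (simplification).
    return re.sub(r"H[123]", "", s)


if __name__ == "__main__":
    for root in ("H2eg", "deH3", "dheH1", "steH2", "H1es"):
        print("*" + root + "-", "->", laryngeal_reflex(root) + "-")
```

Applied to *deH3- and *dheH1-, the cascade yields forms with long ō and ē, in line with the Greek stems dō- ‘give’ and thē- ‘put’ discussed in the section on vowels.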

When laryngeals between consonants disappeared, a vowel sometimes remained, as in Greek stásis, Sanskrit sthitis, Old English stede ‘a standing (place)’ from Proto-Indo-European *stH2tis. Before the advent of the laryngeal theory, a separate Proto-Indo-European vowel ə (called schwa indogermanicum) was reconstructed to account for these correspondences.

Finally, there were the nasal sounds n and m, the liquids l and r, and the semivowels y and w. When y and w occurred between consonants, they were replaced by the vowels i and u. The nasals and liquids functioning as nuclei of syllables in this position (like the final sounds of English bottom, button, bottle, butter) are traditionally written m̥, n̥, l̥, r̥. Some scholars dispense with these diacritical marks and with the distinction between syllabic i and u and nonsyllabic y and w, but this obscures certain distinctions, such as that between -wn̥- in *ḱwn̥su ‘among dogs,’ Sanskrit śvasu, and -un- in *tund- ‘shove,’ Sanskrit tundate.

Vowels

The vowel system of Proto-Indo-European consisted of the following sounds:

[Table: Vowel system of Proto-Indo-European language]

In forming front vowels, the highest point of the tongue is in the front of the mouth; for back vowels, that point is in the back. High vowels are those in which the tongue is highest—closest to the roof of the mouth. Mid vowels are made with the tongue between the extremes of high and low.

The four mid vowels participated in a pattern of alternation called ablaut. In the course of inflection and word formation, roots and suffixes could appear in the “e-grade” (also called “normal grade”; compare Latin ped-is ‘of a foot’ [genitive singular]), “o-grade” (e.g., Greek pód-es ‘feet’), “zero-grade” (e.g., Avestan fra-bd-a- ‘forefoot,’ with -bd- from *-pd-), “lengthened e-grade” (e.g., Latin pēs ‘foot’ [nominative singular] from *pēd-s), and/or “lengthened o-grade” (e.g., English foot, Old English fōt).
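The five grades differ only in the vowel that sits between the consonants of the root, which makes the pattern easy to generate mechanically. The sketch below does this for a root of the shape consonant(s) + vowel + consonant(s), using the ‘foot’ root from the examples above; the function name and the split into onset and coda are illustrative assumptions, and accent is ignored.

```python
def ablaut_grades(onset: str, coda: str) -> dict:
    """Return the five ablaut grades of a root shaped onset + vowel + coda."""
    vowels = {
        "e-grade": "e",
        "o-grade": "o",
        "zero-grade": "",                 # no vowel at all
        "lengthened e-grade": "\u0113",   # e with macron
        "lengthened o-grade": "\u014d",   # o with macron
    }
    return {grade: onset + vowel + coda for grade, vowel in vowels.items()}


if __name__ == "__main__":
    for grade, form in ablaut_grades("p", "d").items():
        print(f"{grade:20s} *{form}-")
```

For the root of ‘foot’ this produces *ped-, *pod-, *pd-, *pēd-, and *pōd-, matching the forms behind the Latin, Greek, Avestan, and Old English examples in the paragraph above.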

There is some evidence for a similar pattern of alternation involving a, ā, and zero. Most instances of apparent a and ā, however, arose by “coloration” of e under the influence of a preceding or following H2 (e.g., Greek ag- ‘lead’ comes from *H2eǵ-, stā- ‘stand’ comes from *stH2-). Some cases of o, ō, and ē are likewise of laryngeal origin (e.g., Greek op- ‘see’ comes from *H3ekʷ-, dō- ‘give’ comes from *deH3-, thē- ‘put’ comes from *dheH1-). Among the high vowels, i and u did not participate in ablaut alternations but rather functioned primarily as the syllabic realizations of the consonants y and w, as in *leykʷ- ‘leave,’ zero-grade *likʷ-, parallel to *derḱ- ‘see,’ zero-grade *dr̥ḱ-. Long ī and ū in the recorded languages derive in large part from sequences of i or u plus laryngeal, as in Latin vīvus ‘alive’ from *gʷiH3wós.
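The relation between e-grade and zero-grade in roots containing y, w, or a liquid or nasal can be sketched as a single rule: remove the e, and let an immediately following resonant carry the syllable if no vowel comes after it. The code below is a simplified illustration under that assumption. It looks only at the sound directly after the vowel, ignores accent and laryngeals, and uses the hypothetical name zero_grade.

```python
import re

# Syllabic counterparts: y -> i, w -> u; liquids and nasals take a ring below (U+0325).
SYLLABIC = {
    "y": "i", "w": "u",
    "r": "r\u0325", "l": "l\u0325", "m": "m\u0325", "n": "n\u0325",
}


def zero_grade(root: str) -> str:
    """Drop the root vowel e; a following resonant becomes syllabic
    unless a vowel comes right after it."""
    # e + resonant not followed by a vowel -> syllabic resonant
    result = re.sub(
        r"e([ywrlmn])(?![aeiou])",
        lambda m: SYLLABIC[m.group(1)],
        root,
        count=1,
    )
    if result == root:
        # No resonant after the vowel: simply delete the e.
        result = root.replace("e", "", 1)
    return result


if __name__ == "__main__":
    for root in ("leykʷ", "derḱ", "men", "ped"):
        print("*" + root + "-", "->", "*" + zero_grade(root) + "-")
```

The first two inputs reproduce the pairs cited above, *leykʷ- with zero-grade *likʷ- and *derḱ- with zero-grade *dr̥ḱ-.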

The accent just before the breakup of the parent language was apparently mainly one of pitch rather than stress. Each full word had one accented syllable, presumably pronounced on a higher pitch than the others.

Morphology and syntax

Verbal inflection

The Proto-Indo-European verb had three aspects: imperfective, perfective, and stative. Aspect refers to the nature of an action as described by the speaker—e.g., an event occurring once, an event recurring repeatedly, a continuing process, or a state. The difference between English simple and “progressive” verb forms is largely one of aspect—e.g., “John wrote a letter yesterday” (implying that he finished it) versus “John was writing a letter yesterday” (describing an ongoing process, with no implication as to whether it was finished or not).

The imperfective aspect, traditionally called “present,” was used for repeated actions and for ongoing processes or states—e.g., *stí-stH2-(e)- ‘stand up more than once, be in the process of standing up,’ *mn̥-yé- ‘ponder, think,’ *H1es- ‘be.’ The perfective aspect, traditionally called “aorist,” expressed a single, completed occurrence of an action or process—e.g., *steH2- ‘stand up, come to a stop,’ *men- ‘think of, bring to mind.’ The stative aspect, traditionally called “perfect,” described states of the subject—e.g., *ste-stóH2- ‘be in a standing position,’ *me-món- ‘have in mind.’

Verb roots were by themselves either perfective (like *steH2- ‘stand’ and *men- ‘think’) or imperfective (like *H1es- ‘be’). This basic aspect, however, could be reversed by morphological devices such as ablaut, suffixation, and reduplication. The stative aspect was normally marked by reduplication and the o-grade of the root in the indicative singular; it had personal endings that were partly distinct from those of the other two aspects.
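The stative examples cited above, *me-món- and *ste-stóH2-, show the pattern concretely: the initial consonants of the root are repeated with the vowel e, and the root itself appears with an accented ó. A minimal sketch of that pattern follows. It assumes a root written with a single e as its vowel and copies the whole initial cluster, which fits the two examples in this article but is not claimed to cover every reconstructed stative; the function name is hypothetical.

```python
def stative_stem(root: str) -> str:
    """Build a reduplicated stative (perfect) singular stem with accented o."""
    idx = root.index("e")      # position of the root vowel
    onset = root[:idx]         # initial consonant or cluster, e.g. "m" or "st"
    rest = root[idx + 1:]      # everything after the root vowel
    # reduplication syllable + root with o-acute (U+00F3) in place of e
    return onset + "e-" + onset + "\u00f3" + rest


if __name__ == "__main__":
    for root in ("men", "steH2"):
        print("*" + root + "-", "->", "*" + stative_stem(root) + "-")
```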

From one aspect of a given verb the shape and even the existence of the other two aspects could not be predicted; for example, *H1es- ‘be’ had only the imperfective aspect. Ways of forming imperfectives were especially numerous and often involved, in addition to their imperfective aspectual meaning, some other notion, such as performing the action habitually or repeatedly (iterative), or causing someone else to perform it (causative). One root could thus have several imperfective stems; so to the root *H1er- ‘move’ there were at least a causative form, *H1r̥-new- ‘set in motion,’ and an iterative form, *H1r̥-sḱé- ‘go repeatedly.’

The Proto-Indo-European verb was also inflected for mood, by which speakers could indicate whether they were making statements or inquiries about matters of fact; making predictions, surmises, or wishes about the future or about unreal but imagined situations; or giving commands. Compare English “If John is home now (he is eating lunch)” with the verb is in the indicative mood, discussing a matter of fact, with “If John were home now (he would be eating lunch)” with the verb were in the subjunctive mood, describing an unreal situation. There were two Proto-Indo-European suffixes expressing mood: -e- alternating with -o- for the subjunctive, corresponding roughly in meaning to the English auxiliaries ‘shall’ and ‘will,’ and -yeH1- alternating with -iH1- for the optative, corresponding roughly to English ‘should’ and ‘would.’ Verbs without one of these two suffixes were marked for mood and tense by their personal endings alone.

These personal endings basically expressed the person and number of the verb’s subject, as in Latin amō ‘I love,’ amās ‘you (singular) love,’ amat ‘he or she loves,’ amāmus ‘we love,’ and so on. In the imperfective and perfective aspects there were two sets of endings, distinguishing two voices: active, in which typically the subject was not affected by the action, and mediopassive, in which typically the subject was affected, directly or indirectly. Thus, Sanskrit active yájati and mediopassive yájate both mean ‘he sacrifices,’ but the former is said of a priest who performs a sacrifice for the benefit of another, while the latter is said of a layperson who hires a priest to perform a sacrifice. In the stative aspect there was originally no distinction of voice.

To mark mood and tense, imperfective verbs that did not have a mood suffix distinguished three subtypes of active and mediopassive endings: imperative, primary, and secondary. Verbs with imperative endings belonged to the imperative mood (used for commands), e.g., *H1s-dhí ‘be (singular),’ *H1és-tu ‘let him be.’ Verbs with primary endings were marked as non-past (present or future) in tense and indicative in mood, e.g., *H1és-ti ‘he is.’ (Indicative mood signifies objective statements and questions.) Verbs with secondary endings were unmarked for tense and mood but were normally used as past indicatives (e.g., *H1és-t ‘he was,’ *gʷhén-t ‘he slew’) and to fill out gaps in the imperative paradigm (e.g., *H1és-te or *H1s-té ‘you [plural] were,’ but also ‘be [plural]’; *gʷhén-te or *gʷhn̥-té ‘you [plural] slew,’ but also ‘slay [plural]’). To mark such forms unambiguously as past indicatives, an augment, usually consisting of the vowel e, could be prefixed, e.g., *é-gʷhen-t ‘he slew,’ *é-H1es-t ‘he was.’
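The interplay of ending sets and augment in the third person singular active can be laid out as a small lookup. The sketch below uses only the three endings visible in the examples above (-ti, -t, -tu) and moves the accent onto the augment when one is prefixed, as in *é-H1es-t. It is an illustration of those examples, not a full paradigm, and the names used are assumptions.

```python
# Third person singular active endings taken from the examples in the text.
THIRD_SINGULAR = {"primary": "ti", "secondary": "t", "imperative": "tu"}


def third_singular(stem: str, ending_set: str, augment: bool = False) -> str:
    """Attach a third singular ending; optionally prefix the accented augment."""
    form = stem + "-" + THIRD_SINGULAR[ending_set]
    if augment:
        if ending_set != "secondary":
            raise ValueError("the augment marks past indicatives built on secondary endings")
        # The augment carries the accent, so the stem's e-acute (U+00E9) loses it.
        form = "\u00e9-" + form.replace("\u00e9", "e")
    return form


if __name__ == "__main__":
    stem = "H1\u00e9s"  # 'be', accented e-grade
    print("*" + third_singular(stem, "primary"))            # 'he is'
    print("*" + third_singular(stem, "secondary"))          # 'he was'
    print("*" + third_singular(stem, "secondary", True))    # augmented 'he was'
    print("*" + third_singular(stem, "imperative"))         # 'let him be'
```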

Verbs in the perfective aspect without a mood suffix did not occur with primary endings and thus lacked a true present tense. Verbs in the stative aspect substituted a distinctive set of endings for those of the primary set but apparently used the imperative and secondary endings in the usual way to form a stative imperative and a stative past indicative.

Nominal inflection

The inflectional categories of the noun were case, number, and gender. Eight cases can be reconstructed: nominative, for the subject of a verb; accusative, for the direct object; genitive, for the relations expressed by English of; dative, corresponding to the English preposition to, as in “give a prize to the winner”; locative, corresponding to at, in; ablative, from; instrumental, with; and vocative, used for the person being addressed. Besides singular and plural number, there was a dual number for referring to two items. Each noun belonged to one of three genders: masculine, to which belonged most nouns designating male creatures; feminine, to which belonged most names of female creatures; and neuter, to which belonged only a few words for individual adult living creatures. The gender of nouns not designating living creatures was only partly predictable from their meaning.
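The categories just listed define the grid of forms a noun could in principle take. As a rough illustration, the sketch below enumerates the case and number combinations; it says nothing about the actual endings or about forms that coincided, and the constant and function names are illustrative.

```python
from itertools import product

CASES = (
    "nominative", "accusative", "genitive", "dative",
    "locative", "ablative", "instrumental", "vocative",
)
NUMBERS = ("singular", "dual", "plural")
GENDERS = ("masculine", "feminine", "neuter")  # a fixed property of each noun


def paradigm_slots() -> list:
    """Return every (case, number) combination a noun could be inflected for."""
    return list(product(CASES, NUMBERS))


if __name__ == "__main__":
    slots = paradigm_slots()
    print(len(slots), "case-number slots per noun")
    print(slots[:3])
```

Gender is kept separate because, as described above, each noun belonged to one gender rather than being inflected for it.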

Adjectives were nounlike words that varied in gender according to the gender of another noun with which they were in agreement, or, if used by themselves, according to the sex of the entity to which they referred; thus, Latin bonus sermō ‘good speech’ (masculine), bona aetās ‘good age’ (feminine), bonum cor ‘good heart’ (neuter), or bonus ‘a good man,’ bona ‘a good woman,’ bonum ‘a good thing.’ The neuter of an adjective was often identical with the masculine except for having different endings in the nominative and accusative cases. Feminine gender was either completely identical with the masculine or derived from it by means of a suffix, the two commonest being *-eH2- and *-iH2- (*-yeH2-).

Demonstrative, interrogative, relative, and indefinite pronouns were inflected like adjectives, with some special endings. Personal pronouns were inflected very differently. They lacked the category of gender, and they marked number and case (in part) not by endings but by different stems, as is still seen in English singular nominative “I,” but oblique “my,” “me”; plural nominative “we,” but plural oblique “our,” “us.” (The oblique is any case other than nominative or vocative.)

Syntax

Some notable features of Proto-Indo-European syntax were the non-ergative case system, in which the subject of an intransitive verb received the same case marking as the subject (rather than the object) of a transitive verb; concord (agreement) in case, number, and gender between adjective and noun; and the use of singular verbs with neuter plural subjects, as in Greek pánta rheĩ ‘all things flow,’ with the same (singular) verb as ho potamòs rheĩ ‘the river (masculine) flows,’ contrasting with hoi potamoì rhéousi ‘the rivers flow’ (indicating that neuter plurals were originally collectives and grammatically singular). Proto-Indo-European word order was flexible, but basic declarative sentences typically had the structure subject–object–verb (SOV).
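The agreement rule for neuter plural subjects can be stated as a one-line exception to ordinary number agreement. The sketch below encodes just that rule as described in this paragraph; the function name and the string labels are illustrative assumptions.

```python
def verb_number(subject_gender: str, subject_number: str) -> str:
    """Return the number of the verb agreeing with a given subject."""
    # Neuter plurals behaved as collectives and took a singular verb.
    if subject_gender == "neuter" and subject_number == "plural":
        return "singular"
    # Otherwise the verb simply matches the subject's number.
    return subject_number


if __name__ == "__main__":
    print(verb_number("neuter", "plural"))       # as with Greek panta 'all things'
    print(verb_number("masculine", "plural"))    # as with hoi potamoi 'the rivers'
    print(verb_number("masculine", "singular"))  # as with ho potamos 'the river'
```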

Lexicon and culture

Much less is known about the parent language’s vocabulary than about its phonology and grammar. Sounds and grammatical categories do not easily disappear or undergo radical change in so many daughter languages that their former existence can no longer be detected. It is relatively easy, however, for an individual word to disappear or shift meaning in so many daughter languages that its existence or meaning in the parent language cannot be confidently inferred. Hence, from the linguistic evidence alone, scholars can never say that Proto-Indo-European lacked a word for any particular concept; they can only state the probability that certain items did exist and from these items make inferences about the culture and location in time and space of the speakers of Proto-Indo-European.

Thus it is supposed that the Proto-Indo-European community knew and talked about dogs (*ḱwón-), horses (*H1éḱwo-), sheep (*H3éwi-), and almost certainly cows (*gʷów-) and pigs (*súH-). Probably all these animals were domesticated. At least one cereal grain was known (*yéwo-), and at least one metal (*H2éyos). There were vehicles (*wóǵho-) with wheels (*kʷékʷlo-), pulled by teams joined by yokes (*yugó-). Honey was known, and it probably formed the basis of an alcoholic drink (*mélit-, *médhu) related to the English mead. Numerals up through 100 (*ḱm̥tóm) were in use. All this suggests a people with a well-developed Neolithic (characterized by simple agriculture and polished stone tools) or even Chalcolithic (copper- or bronze-using) technology.

Sanat Pai Raikar, Warren Cowgill, Jay H. Jasanoff, The Editors of Encyclopaedia Britannica