Language Families Catalog
A reference catalog of the world’s language families, their branches, geographic spread, approximate language counts, native speaker totals, and a representative sample of member languages with speaker counts. Living language inventories follow Ethnologue / Glottolog as of the mid-2020s. Language isolates and the major controversial macro-groupings are covered in their own sections after the family listings.
I. Indo-European
Approximate scale: ~445 living languages; ~3.2 billion native speakers; spread across Europe, the Iranian plateau, the Indian subcontinent, and (via colonial expansion) the Americas, southern Africa, and Oceania.
Branch | Notable Languages | Speakers (approx.) | Notes
Anatolian (extinct) | Hittite, Luwian, Lydian, Palaic | - | Earliest attested IE branch (2nd millennium BCE)
Tocharian (extinct) | Tocharian A (Agnean), Tocharian B (Kuchean) | - | Tarim Basin; extinct by ~1000 CE
Indo-Iranian: Indo-Aryan | Hindi-Urdu (590M), Bengali (270M), Marathi (95M), Gujarati (60M), Punjabi (130M), Sanskrit (liturgical) | ~1.5B | Devanagari and related Brahmic scripts
Indo-Iranian: Iranian | Persian/Farsi (80M), Pashto (60M), Kurdish (30M), Tajik (8M), Balochi | ~200M | Persian = Farsi/Dari/Tajik continuum
Indo-Iranian: Nuristani | Kati, Waigali | <50k | Hindu Kush
Greek | Modern Greek (~13M), Ancient and Koine Greek (classical/liturgical) | ~13M | Continuous attestation since c.1450 BCE (Linear B)
Italic: Latin/Romance | Spanish (490M), French (320M), Portuguese (260M), Italian (65M), Romanian (24M), Catalan (10M), Galician, Occitan, Sardinian, Romansh | ~1.1B | All descend from Vulgar Latin
Celtic | Irish (~170k L1), Welsh (~560k), Scottish Gaelic (~57k), Breton (~210k), Cornish (revived), Manx (revived) | ~1M | Two surviving subgroups: Goidelic and Brythonic
Germanic: West | English (380M L1, 1.5B total), German (95M), Dutch (24M), Afrikaans (7M L1), Frisian, Yiddish (~600k), Luxembourgish | ~530M L1 | English dominant world lingua franca
Germanic: North | Swedish (10M), Danish (6M), Norwegian (5M), Icelandic (330k), Faroese (70k) | ~22M | Mutually intelligible mainland Scandinavian
Balto-Slavic: Slavic | Russian (150M), Polish (40M), Ukrainian (30M), Czech (10M), Belarusian, Bulgarian, Serbo-Croatian, Slovenian, Slovak, Macedonian | ~290M | East / West / South Slavic
Balto-Slavic: Baltic | Lithuanian (3M), Latvian (1.5M) | ~4.5M | Most archaic IE phonology preserved
Albanian | Albanian (Tosk and Gheg, ~6M) | ~6M | Own branch since at least 2nd millennium BCE
Armenian | Eastern Armenian (3M), Western Armenian (1M) | ~4M | Distinct branch; own script (Mesropian, 405 CE)
II. Sino-Tibetan
Approximate scale: ~510 languages; ~1.4 billion speakers; China, Tibet, Himalaya, Myanmar, NE India.
Branch | Notable Languages | Speakers (approx.) | Notes
Sinitic | Mandarin (920M L1), Cantonese / Yue (85M), Wu / Shanghainese (80M), Min Hokkien / Taiwanese (50M), Hakka (40M), Xiang (40M), Gan (20M) | ~1.3B | Tonal; written in Chinese characters
Tibetic | Standard Tibetan / Ü-Tsang (1.2M), Khams, Amdo, Dzongkha (640k, Bhutan official), Ladakhi, Sherpa | ~6M | Pre-1959 literary standard for Mahayana
Burmic | Burmese (33M), Karen languages (~7M), Arakanese | ~40M | Burmese script is Brahmic, adapted from earlier Mon/Pyu writing; also used to write Pali
Lolo-Burmese | Yi / Lolo (~7M), Lisu, Lahu, Naxi | ~10M | SW China; Naxi has unique pictographic Dongba script
Naga & Kuki-Chin | Naga languages (40+), Mizo (700k), Meitei (1.8M) | ~5M | NE India / Myanmar borderlands
III. Niger-Congo
Approximate scale: ~1,500 languages; ~700 million speakers; sub-Saharan Africa. Largest family by language count.
Branch | Notable Languages | Speakers (approx.) | Notes
Bantu | Swahili (~200M L2; 18M L1), Zulu (28M), Xhosa (19M), Shona (15M), Lingala (40M L2), Kikuyu (8M), Luganda (10M), Kinyarwanda (12M), Kirundi (12M), Sotho, Tswana | ~350M | Noun-class system; spread south/east c.500 BCE–500 CE
Mande | Bambara (15M), Mandinka (1.3M), Soninke, Dyula | ~70M | West Africa; N’Ko script
Atlantic | Fula / Fulfulde (40M), Wolof (12M), Serer | ~70M | Sahel and West African coast
Volta-Niger | Yoruba (50M), Igbo (45M), Edo, Igala | ~110M | Nigeria-Benin
Gur | Mooré (8M), Dagbani, Senufo | ~25M | Burkina Faso, Mali, Ghana
Kwa | Akan / Twi (11M), Ewe (8M), Ga, Fon | ~30M | Ghana, Togo, Benin
Adamawa-Ubangi | Sango (5M L2, lingua franca CAR), Mumuye, Mbum | ~10M | Central Africa
Kru | Bété, Krahn, Klao | ~5M | Côte d’Ivoire, Liberia
IV. Afro-Asiatic
Approximate scale: ~370 languages; ~500 million speakers; North Africa, Horn, Middle East.
Branch | Notable Languages | Speakers (approx.) | Notes
Semitic | Arabic (420M total across varieties), Hebrew (9M), Maltese (520k), Amharic (32M), Tigrinya (9M), Aramaic varieties (~750k), Syriac (liturgical) | ~470M | Triconsonantal root system
Berber (Tamazight) | Tashelhit (~8M), Kabyle (6M), Tarifit, Tuareg | ~30M | Tifinagh script revived in Morocco/Algeria
Chadic | Hausa (50M L1, 80M total), Bole, Margi | ~60M | Northern Nigeria, Niger; lingua franca
Cushitic | Somali (22M), Oromo (37M), Afar, Sidamo, Beja | ~75M | Horn of Africa
Egyptian (extinct) | Ancient Egyptian, Coptic (liturgical) | - | 3000 BCE – 17th c CE
Omotic | Wolaitta, Gamo, Aari | ~5M | SW Ethiopia; status as Afro-Asiatic debated
V. Austronesian
Approximate scale: ~1,250 languages; ~390 million speakers; Madagascar to Easter Island.
Branch | Notable Languages | Speakers (approx.) | Notes
Formosan (~14 subfamilies) | Atayal, Paiwan, Amis, Bunun | ~200k | Aboriginal Taiwan; homeland of the family
Malayo-Polynesian: Western | Malay/Indonesian (280M total), Filipino/Tagalog (28M L1, 80M total), Cebuano (20M), Javanese (80M), Sundanese (40M), Madurese (15M), Malagasy (25M) | ~350M | Insular SE Asia + Madagascar
Malayo-Polynesian: Central | Bali Sasak (4M), Manggarai, Tetum (1.2M; East Timor co-official) | ~10M | Lesser Sundas
Malayo-Polynesian: Eastern (Oceanic) | Fijian (650k), Samoan (510k), Tongan (190k), Maori (185k), Hawaiian (24k), Tahitian (68k), Tok Pisin (4M L2; Papua New Guinea; an English-lexifier creole rather than an Austronesian language by descent) | ~6M | Pacific from Fiji to Hawaii / Easter Island
VI. Dravidian
Approximate scale: ~85 languages; ~250 million speakers; southern India, Sri Lanka, pockets in Pakistan.
Branch | Notable Languages | Speakers (approx.) | Notes
South Dravidian | Tamil (80M), Kannada (45M), Malayalam (38M), Tulu (2M) | ~165M | Tamil Sangam literature from 3rd c BCE
South-Central | Telugu (95M), Gondi, Kui, Kuvi | ~100M | Telugu = largest Dravidian by population
Central | Kolami, Naiki, Parji | <500k | Maharashtra-AP borderlands
Northern | Brahui (2.4M; Balochistan), Kurukh (2M), Malto | ~5M | Brahui = isolated Pakistani enclave
VII. Trans-New Guinea
Approximate scale: ~480 languages; ~3 million speakers; PNG highlands and adjacent.
Branch | Notable Languages | Speakers (approx.)
Engan | Enga (230k), Huli (150k) | ~400k
Chimbu-Wahgi | Melpa, Kuman, Wahgi | ~300k
Madang | Amele, Gum | ~100k
Finisterre-Huon | Selepet, Kâte | ~100k
VIII. Altaic (controversial macrofamily; the five groups listed are usually treated as separate families)
Family | Notable Languages | Speakers (approx.) | Notes
Turkic | Turkish (88M), Azerbaijani (24M), Uzbek (35M), Kazakh (13M), Uyghur (11M), Tatar (5M), Turkmen (7M), Kyrgyz (5M) | ~200M | Vowel harmony; agglutinative
Mongolic | Mongolian / Khalkha (5M), Buryat (270k), Kalmyk (155k) | ~6M | Traditional Mongolian script + Cyrillic
Tungusic | Manchu (near-extinct; 20 fluent speakers), Evenki (4k), Nanai | <20k | Severely endangered
Koreanic | Korean (77M) | ~77M | Treated as isolate or sole member of Koreanic
Japonic | Japanese (125M), Ryukyuan varieties (Okinawan, Amami, Miyako, Yaeyama) | ~127M | Ryukyuan UNESCO-endangered
IX. Uralic
Approximate scale: ~38 languages; ~25 million speakers; Hungary, Finland, Estonia, NW Russia, Siberia.
Branch | Notable Languages | Speakers (approx.)
Finnic | Finnish (5.4M), Estonian (1.1M), Karelian (40k), Veps, Livonian | ~7M
Sámi | Northern Sámi (25k), Lule, Inari, Skolt | ~30k total
Ugric | Hungarian (13M), Khanty (10k), Mansi (1k) | ~13M
Samoyedic | Nenets (22k), Selkup, Nganasan | ~30k
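The branch totals in the Uralic table above can be checked against the family-level scale line with a little arithmetic. The following Python sketch is illustrative only: the helper name `parse_count` and the dictionary layout are assumptions made for this example, not part of any catalog tooling. It reads the approximate figures as written, ignoring the '~' and '<' markers, so the result is only as good as the rounded inputs.

```python
import re

# Multipliers for the suffixes used in this catalog's speaker columns.
_SUFFIX = {"k": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(cell):
    """Turn a cell such as '~7M', '<500k' or '~30k total' into an integer.

    Approximation markers ('~', '<') are ignored, so '<500k' is read as its
    upper bound. Cells without a k/M/B-suffixed figure (an empty cell, or a
    bare number such as '~300') return None; this sketch does not handle them.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s*([kMB])", cell)
    if match is None:
        return None
    number, suffix = match.groups()
    return round(float(number) * _SUFFIX[suffix])


# Branch totals copied from the Uralic table (section IX).
uralic_branches = {
    "Finnic": "~7M",
    "Sámi": "~30k total",
    "Ugric": "~13M",
    "Samoyedic": "~30k",
}

# The scale line says "~25 million speakers"; written here with the M suffix.
stated_total = parse_count("~25M")
branch_sum = sum(parse_count(cell) for cell in uralic_branches.values())

print(f"sum of branches: {branch_sum:,}")
print(f"stated total:    {stated_total:,}")
print(f"difference:      {stated_total - branch_sum:,}")
```

A gap between the two figures does not show which one is wrong. It may come from rounding, from mixing L1 and total-speaker counts, or from languages that the sample rows leave out.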
X. Tai-Kadai (Kra-Dai)
Approximate scale: ~95 languages; ~80 million speakers; SE Asia and southern China.
Branch | Notable Languages | Speakers (approx.)
Tai | Thai / Siamese (70M total), Lao (30M total), Shan (5M), Zhuang (16M) | ~95M
Kam-Sui | Kam (Dong, 2.6M) | ~3M
Hlai | Hlai (Li, 760k; Hainan) | ~800k
XI. Austroasiatic
Approximate scale: ~170 languages; ~120 million speakers; mainland SE Asia, scattered India.
Branch | Notable Languages | Speakers (approx.)
Vietic | Vietnamese (86M), Mường (1.5M) | ~88M
Khmer | Khmer (17M), Northern Khmer | ~18M
Mon | Mon (1M), Nyah Kur | ~1M
Munda | Santali (7.6M), Mundari, Ho, Korku | ~13M
XII. Hmong-Mien (Miao-Yao)
Approximate scale: ~38 languages; ~10 million speakers; southern China, SE Asia, US diaspora.
Branch | Notable Languages | Speakers (approx.)
Hmongic | Hmong / Miao (5M; includes White, Green) | ~5M
Mienic | Iu Mien / Yao (1.2M) | ~1M
XIII. North American Indigenous Families
Family | Geographic Spread | Notable Languages | Speakers
Eskimo-Aleut | Arctic from Siberia to Greenland | Inuktitut (39k), Kalaallisut / Greenlandic (57k), Central Yupik (10k), Aleut (150) | ~110k
Na-Dene | NW North America + SW US | Navajo (170k), Apache (~3k), Tlingit (100), Carrier, Slavey, Chipewyan | ~175k
Dene-Yeniseian (proposed by Vajda 2010) | linking Na-Dene to Yeniseian (Siberia) | Ket (210), Yug (extinct) | -
Algic | Algonquian + Wiyot, Yurok | Cree (96k), Ojibwe (50k), Blackfoot (5k), Cheyenne (1.7k), Lenape, Mi’kmaq (8k) | ~165k
Iroquoian | NE North America | Cherokee (2k), Mohawk (2.4k), Onondaga, Seneca, Oneida, Cayuga, Tuscarora | ~5k
Siouan-Catawban | Great Plains, SE | Lakota (2k), Dakota (290), Crow (3.5k), Hidatsa, Mandan (5) | ~6k
Salishan | Pacific NW | Halkomelem, Lushootseed, Lillooet | ~2k
Sahaptian | Plateau | Sahaptin (100), Nez Perce (200) | ~300
Wakashan | Vancouver Island, BC coast | Kwak’wala, Nuu-chah-nulth, Haisla | ~1k
Chinookan | Columbia River | Wasco-Wishram (10) | <50
Uto-Aztecan | SW US, Mexico, Central America | Nahuatl (1.7M), Hopi (6k), Comanche (100), Pima/Tohono O’odham (16k), Shoshone (1k), Tarahumara (70k) | ~2M
Mayan | Mesoamerica | Yucatec (770k), K’iche’ (1.7M), Cakchiquel (450k), Mam (600k), Q’eqchi’ (800k), Tzeltal (560k), Tzotzil (600k) | ~6M
Oto-Manguean | Mexico | Zapotec (490k), Mixtec (520k), Otomi (290k), Mazahua, Mazatec, Chinantec | ~2M
Misumalpan | Nicaragua, Honduras | Miskito (180k), Sumo | ~200k
XIV. South American Indigenous Families
Family | Geographic Spread | Notable Languages | Speakers
Quechuan | Andes from Ecuador to Argentina | Southern Quechua (5M), Cuzco, Ayacucho, Ancash | ~7M
Aymaran | Bolivia, Peru, Chile | Aymara (1.7M), Jaqaru | ~2M
Arawakan | Amazon, Caribbean | Garifuna (200k), Wayuu (400k), Achagua, Asháninka | ~700k
Tupi-Guarani | Brazil, Paraguay | Guaraní (6.5M; Paraguay co-official), Tupinambá (extinct), Nheengatu | ~7M
Carib | Northern S America | Carib, Pemon, Macushi | ~80k
Macro-Jê | Brazilian Cerrado | Kayapó, Xavante, Krenak | ~30k
Pano-Tacanan | W Amazon | Shipibo-Conibo (35k), Cashinahua | ~50k
XV. Khoisan (defunct macrofamily; now Tuu + Kx’a + Khoe-Kwadi + isolates)
Family / Language | Geographic Spread | Notable Languages | Speakers | Notes
Khoe-Kwadi | Botswana, Namibia | Khoekhoe / Nama (250k), Naro | ~300k | -
Kx’a | Botswana, Namibia | Juǀ’hoan (15k), ǀ’Hoan | ~20k | -
Tuu | Southern Africa | !Xóõ (4k), ǂHõã | ~5k | -
Sandawe (isolate) | Tanzania | Sandawe | 60k | Click consonants
Hadza (isolate) | Tanzania | Hadza | 1k | Click consonants; no demonstrated relatives
XVI. Other Smaller Families
Family | Geographic Spread | Notable Languages | Speakers
Northeast Caucasian | Dagestan, Chechnya | Chechen (1.4M), Avar (800k), Lezgian (650k), Dargwa | ~3.5M
Northwest Caucasian | NW Caucasus | Kabardian (1.6M), Abkhaz (130k), Adyghe (590k) | ~2.5M
Kartvelian | Georgia | Georgian (4M), Mingrelian (350k), Svan (15k), Laz (30k) | ~4.5M
Dene-Yeniseian (proposed) | Siberia + N America | Ket (210), Yug (extinct); Yeniseian side only | <500
Nilo-Saharan (debated grouping) | Sub-Saharan Africa | Luo (5M), Dinka (4.5M), Nuer (1.5M), Maasai (1.5M), Songhay, Kanuri | ~50M
Kresh-Aja, Kuliak, Saharan, Maban | Sahel | (sub-groupings of Nilo-Saharan) | -
XVII. Language Isolates
Languages with no demonstrated genetic relatives:
Isolate | Region | Speakers | Notes
Basque (Euskara) | W Pyrenees | 750k | Pre-IE Europe survivor
Burushaski | N Pakistan | 110k | Hunza, Nagar valleys
Korean | Korean peninsula | 77M | Sometimes grouped as Koreanic
Japanese | Japan | 125M | Sometimes grouped as Japonic with Ryukyuan
Ainu | Hokkaido | ~10 | Severely moribund
Nivkh (Gilyak) | Sakhalin, Amur | 200 | -
Kusunda | Nepal | 1–2 | Critically endangered
Pirahã | Brazilian Amazon | 250–380 | Recursion debate (Everett 2005)
Zuni | New Mexico | 9k | -
Purépecha (Tarascan) | Michoacán | 130k | -
Sumerian (extinct) | Mesopotamia | - | c.3100 – 2nd c BCE
Elamite (extinct) | SW Iran | - | c.2800 – 4th c BCE
Etruscan (extinct) | Etruria | - | c.700 BCE – 1st c CE
Hurrian-Urartian (extinct) | Anatolia, S Caucasus | - | c.2nd millennium BCE
Hattic (extinct) | Anatolia | - | Pre-Hittite Anatolia
Iberian (extinct) | E Iberian peninsula | - | Pre-Roman Spain
Tartessian (extinct) | SW Iberia | - | -
Meroitic (extinct) | Nubia | - | 3rd c BCE – 5th c CE
XVIII. Macro-Groupings (controversial / rejected)
Proposal | Proposer / Date | Status
Nostratic (IE + Uralic + Altaic + Kartvelian + Dravidian + Afro-Asiatic) | Pedersen 1903; Illich-Svitych 1960s; Bomhard | Held by a minority; standard practice treats families as separate
Eurasiatic (IE + Uralic + Altaic + Eskimo-Aleut + others) | Greenberg, Indo-European and Its Closest Relatives (2000–02); Ruhlen | Marginal acceptance
Amerind (almost all Native American languages outside Na-Dene and Eskimo-Aleut) | Greenberg 1987 | Widely rejected; methodology criticized by Campbell, Goddard, Kaufman
Dene-Yeniseian (Na-Dene + Yeniseian) | Vajda 2010 | Cautiously accepted by many; first widely recognized New World–Old World genetic link
Sino-Caucasian / Dene-Caucasian | Starostin; Bengtson | Speculative
Austric (Austroasiatic + Austronesian + Tai-Kadai + Hmong-Mien) | Schmidt 1906 | Largely abandoned
XIX. Sign Languages
Sign languages are full natural languages with their own genealogies, not derivatives of spoken languages.
Family | Notable Languages | Notes
French Sign Language family | LSF, ASL (American), LIBRAS (Brazilian), Irish SL, Russian SL | Spread from de l’Épée’s Paris school (1760s)
British, Australian, New Zealand SL (BANZSL) | BSL, Auslan, NZSL | Mutually intelligible
Japanese SL family | JSL, Korean SL, Taiwanese SL | -
German SL family | DGS, Polish SL, Israeli SL (partly) | -
Village sign languages | Kata Kolok (Bali), Adamorobe SL (Ghana), Ban Khor SL (Thailand) | Emerged in communities with high rates of hereditary deafness
Home sign | - | Idiosyncratic, family-internal
XX. Pidgins, Creoles, and Mixed Languages
Language | Base / Lexifier | Region | Speakers
Haitian Creole | French | Haiti | 12M
Tok Pisin | English | PNG | 4M (L2)
Bislama | English | Vanuatu | 200k
Jamaican Patois | English | Jamaica | 3M
Krio | English | Sierra Leone | 7M (L2)
Sranan Tongo | English | Suriname | 500k
Papiamento | Portuguese/Spanish | Aruba, Curaçao, Bonaire | 320k
Chavacano | Spanish | Zamboanga, Philippines | 700k
Michif | French + Cree | Métis, North America | 1k
Media Lengua | Spanish + Quechua | Ecuador | <1k