I strongly disagree that "it's a shame" that English does not use diacritics. English is my second language (third maybe, considering that the country of my birth is bilingual), and is my favorite language to read and to write. I tried to learn French for two years and stopped, and all those excessive writing marks were among the reasons.
God bless all those monks who decided to keep English writing clean.
Coming from Spanish, with just the right diacritics to make pronunciation obvious, at first I didn't get the concept of a "Spelling Bee". Did it involve something besides spelling? Did "Bee" was a metaphor for the actual hard part of it?
I was first exposed to written English, so after trying conversational English, I learned why its pronunciation/writing is a national competition. It might as well be random.
English would have benefited a great deal from an equivalent to the Royal Spanish Academy.
Possibly... English has a lot of linguistics with a lot of varied roots. You have many words taken from Old Norse and other Scandinavian influence as well as Latin, French and via proxy Greek derived words. Great Britain was highly fought over, contested, changed hands and merged cultures over the millennia.
It is far more organic and mixed from different sources than many prescribed languages or very local dialects of other languages. It would be very hard to pin that down. Not to mention the history of printing presses themselves, such as how the Thorn character was itself replaced as well as deprecating a few other characters that were in common use in earlier Old English.
I think it's a mistake to view that situation as unique to English.
Spain is still a multi lingual country with several local languages each of them centuries old. But even ignoring that and focusing only on Castilian, there were invasions by goths, who left behind words like ropa or guardar, and Arabic speakers, who left behind words like almacén.
Like English having both cow and beef, there are words with historical overlap but different etymologies and divergent meaning over time. For example almacén and bodega were both words for a warehouse.
There are also tons of words where Spanish had phonetically diverged from latin, but then the same word was re-imported from latin in "educated" use.
What’s that have to do with how terrible the English writing system is? Why not just reform written English to read the same way it’s sounds? I’m maybe a B2 level Russian learner and can near perfectly pronounce almost any modern Russian writing because it’s written almost exactly the way it’s spoken. I assume it’s the same with many other languages.
The article touches on this, but there have been countless attempts to restandardize English spelling or replace the Latin alphabet with one more suited to English. But English is a global language with no central authority responsible for deciding what is correct, making coordinated change nearly impossible.
To my mind, the best such attempt was Kingsley Read's, made at the behest of G. B. Shaw: https://www.shavian.info
Plus the Enlightenment reimported a lot of Greek for science and made a lot of greek morphology productive in the language again or for the first time, at least in scientific vernacular and jargon, but a lot of that makes it into daily use. (It's also why we still have fun debates today over plurals like octopi versus octopuses or matrices versus matrixes; do we follow the Greek morphology through to its Greek plurals or do we just use the boring English plural morphology? We use both, but which you use becomes in part a signifier of "learnedness" or rule-following. As a learnéd nonconformist, I find it more fun to use the English plural morphology here more often than not, but also sometimes silly uses of díacritics.)
Plus English still is extremely active (to this day) in borrowing words from neighboring languages, with a lot of Spanish words directly borrowed (generally from Mexican/Dominican/Puerto Rican influences in US English, then back out to UK English). There are even French words in today's English that weren't Norman Conquest imports, but American Revolution imports (the French were key US allies and neighbors in the Canadian and Louisiana Territories).
There's a lot of jokes/memes that English has always been a language willing to borrow the best words of any language in a similar way that school bullies are often looking for new sources of milk money to extort.
A simple rule of thumb suffices most of the time, and native speakers will still understand you when you get it wrong.
But yeah, I wish Spanish omitted genders most of the time like Japanese does. It complicates things and adds very little in exchange.
The real kick in the gonads is verb conjugations. Nearly every common verb is irregular and there are something like 18 tenses, times six subjects. Even many native speakers struggle to get them right.
It always makes me sad when a language's alphabet is different from their phonetic alphabet because it means that unless you hear how the word is pronounced there's basically no way of know how to pronounce it. Right now I'm learning Portuguese Portuguese and it just makes me so sad that it legit pushed me away from learning the language.
They pronounce 's' at the end on the word the same as how they pronounce 'x' and many many more such examples, basically no word is pronounced the way it's written.
My native language is Slovenian, the way you say the letters in the alphabet is how you pronounce them in 99% of the words and even if you miss-pronounce the 1%, the words are usually so close that people still understand you.
It just really made me appreciate my language even though it has many other things that just makes it difficult to the point that most of my writings are in English, were I don't really need to think about all the rules and can just focus on telling the story.
I'm of the opinion that all languages should use their phonetic alphabet as their alphabet, that way, once you've learned the (phonetical) alphabet you would know how to pronounce all the words. (Unlike in Portuguese where milk is written as 'leite' but it's pronounced very similarly to the word 'light' in English. (not to mention the Brazilian Portuguese)).
And to the Spanish people, your language is just slightly more aligned than Portuguese, but nowhere near as clear as I would like it to be.
I agree with the parent, Greek is much easier to pronounce, at least when compared to Spanish and Portuguese, though though the emphasis of the words not always being at the front of the word can make things a bit difficult, I'm looking at you κοτόπουλο (chicken).
If anything, I'd guess that when speaking English as second language, harder than knowing the accents on words would just be keeping track of all the exceptions in pronunciation between words that you basically just have to memorize. Tough, though, taught, thought, through, thorough, throughout, etc.
You can indeed. However, if I knew all of the words I would mispronounce, I would have already looked them up! The trouble with mispronouncing words is that often you’re unaware that you’re doing so.
The best one I've heard was from an extremely bright and well-read friend of mine in high school who once pronounced "formaldehyde" like "formal dee-hide" with emphasis on the "dee" syllable.
I have French as second and English as my third language. English comes easy and natural because we're saturated by the language. That's one of the reasons my children don't mispronounce English words as often as they do French words.
Both languages are equally terrible.
On the other hand a few weeks ago my daughter demonstrated a nearly perfect pronunciation of Italian while reading a text without understanding a word. Looks like the Italians got their shit straight. Apart from pistacchio. Nobody pronounces it pistacchio...
I also wouldn’t go as far as saying that it’s a shame that English does not use diacritics, on the other hand, I also wouldn’t say that diacritics make a language more difficult.
Learning how to use them in Spanish and German takes about an afternoon, and when it comes to learning languages, that’s a negligible amount of time.
Also agree, you can already use combinations of multiple characters to define other sounds, and that's faster to type too
Shame, though, that in English the sounds that combinations of characters make, aren't well or uniquely defined (e.g. bird, word, hurt, heard, herd, ... all sound like the same vowel)
They're faster to type largely because your keyboard is English. Other languages (French, German, etc) have diacritics right there. Even Japanese isn't that much harder to type once you actually learn it (and, on fact, is quite pleasant on a smartphone even at beginner level).
On the topic of similar word sounds, this is a big thing that hangs up English speakers on romantic languages. Their vowels are sloppy and contextual, so when they're given explicit symbols that say "use this vowel", they struggle to pick that vowel out. That "symbol to sound" wiring isn't up in the noggin'. A Spanish person learning English will see the Spanish equivalent and go "duh". But an English speaker needs those "like in bird" tables.
Luckily, we have a huge phonemic index (because of all the stealing), so we're actually at an advantage from many languages once that hurdle is crossed. Spare tonality.
2nding this. The "non-phonetic alphabet" is the biggest non-issue I see people raise a stink about. It really doesn't matter, context is the heavy-weight backbone of language.
On top of that, I think people really underestimate how inappropriate diacritics would be for English. It has a massive phonemic inventory, with 44 unique items. Compare with Spanish's 24. English's "phonetic" writing system would have to be as complex as a romanized tonal language like Mandarin (which has to account for 46 unique glyphs once you account for 4 tones over 6 vowels + the 22 consonants). Or you know, the absolute mess that is romanization of Afro-Asiatic languages. El 3arabizi daiman byi5ali el siza yid7ako, el Latin bas nizaam kteebe mish la2e2 3a lugha hal2ad m3a2ade.
> The "non-phonetic alphabet" is the biggest non-issue I see people raise a stink about
Myself and many friends who aren’t native have struggled with speaking fluently because of it. Most of us still mispronounce some words (my friend pronounced “draught beer” like the lack of rain, instead of like draft).
Doesn’t mean things should change, but it’s certainly not a “non-issue”
The bureaucratization of language is more problematic in my view, where things are seen as wrong and right and we try to cram the beauty of of natural language into a restricted box that can be cleanly and easily defined and worked with universally. I quite literally have nothing but detest for this conception of language, that it must bend to the whims of rigidity when it's very clearly a natural, highly chaotic dynamic system constantly undergoing evolution in unexpected ways.
How would you account for the fact that for many words, there isn't a consistent pronunciation rule for it at all? For example, I would guess that 50% of English speakers are non-rhotic.
Same way other dialect continuums account for it: you standardize spelling on some variant, or several variants if that is non-viable (which, yes, does mean that e.g. American and British English spellings would diverge somewhat).
To be clear, I'm not particularly advocating for making english a phonetic language. I'm just saying it being non-phonetic does cause issues (and makes it frustrating, but also shows a very interesting history).
Assuming we wanted to make English a phonetic language, then your question is kind of moot: phonetic means we need to pick the pronunciation rules for phonemes, which would make other ways to pronounce these phonemes incorrect. Some of currently-correct english would become incorrect english.
> For example, I would guess that 50% of English speakers are non-rhotic
Note that accent isn't really what people talk about when they complain about pronunciation. The problem is that there's no mapping from letters to phoneme in any english accent: laughter/slaughter, draught/draught, G(a)vin/D(a)vid...
All those examples follow the linguistic patterns of the languages they come from. They aren't arbitrary, they just don't teach us the context when we're learning as children.
Of course there’s always reasons. Teaching it to children isn’t really a solution: you’d need to know where words come from before reading them correctly, and also many people don’t learn English as children.
Phonetic languages do borrow words from other languages too, they adapt them to their own language keeping the pronunciation (the only example coming to mind right now is the Czech for sandwich, sendvič). English could do that just fine being phonetic was a goal
Does relate to the point that English still doesn't have a central linguistics authority (and likely won't ever). Just various reformers that have been more or less successful and in how distributed their reforms have been. Draught versus draft was indeed one of Noah Webster's proposed reforms that influenced a lot of American spellings and in turn is still influencing UK spellings. It's not as obvious as color versus colour, but there is a bit of US versus UK in draft versus draught.
(Webster also went on to suggest dawter over daughter, to remove more of these vestigial augh spellings, but that one still hasn't caught on even in the US. Just as the cot/caught split is its own weird remaining reform discussion.)
> It has a massive phonemic inventory, with 44 unique items. Compare with Spanish's 24, or German's 25.
I'm not sure where you're getting these numbers from, but German has around 45 phonemes according to all sources I could find, depending on how you count: 17 vowels (including two different schwa sounds), 3 diphthongs, 25 consonants.
If Arabic had to cater to afro-asiatic dialects phonemes then the script would have been even more messier. I'm a speaker of one, and my dialect is heavily influenced by the indigenous Tamazight language. and I think this is why many of the Amazigh community were and some still disappointed with the neo-Tifinagh script. While it carries symbolic weight, it doesn’t offer practical readability, phonemic clarity and tech accessibility of a modern script that Tamazight deserves. Latin script, ironically, fits Tamazight much more naturally.
You don't have to make a perfect pronunciation system. It's OK if a vowel is pronounced slightly differently, as long as its pronunciation can be predicted from context. Even if it can only be predicted 99% of the time.
Insisting that the writing system captures every little distinction is a common mistake enterprising linguists do (often when designing an alphabet for a bible translation, or "modernizing" the spelling of a language which is not their own). They don't have to. Even if you do it, it won't last long. Letters only have to be a reasonably consistent shorthand for how things are pronounced. People don't like a ton of markers or, god forbid, digits sprinkled into their writing to specify a detailed pronunciation.
English has accumulated inconsistencies for so long, though, that it can't really be said to be consistent anymore. Usually, there are radicals who just cut through and start writing more sensibly here and there (without digits or quirky phonetical markers), cutting down on the worst excesses of inconsistency. But in English, these radicals have been soundly defeated in prestige by conservative writers.
Diacritics don't need to be used the way they are in French, i.e. to preserve the original spelling. On the contrary, most languages use them to make their spelling more phonetic.
Nor is there a need for some insane kind of diacritics to handle English. Its phonemic inventory is considerable, yes, but it can be easily organized, especially when you keep in mind that many distinct sounds are allophones (and thus don't need a separate representation) - a good example is the glottal stop for "t" in words like "cat", it really doesn't need its own character since it's predictable.
Let's take General American as an example. First you have the consonant phonemes:
Nasals: m,n,ŋ
Plosives: p,b,t,d,k,g
Affricates: t͡ʃ, d͡ʒ
Fricatives: f,v,θ,ð,s,z,ʃ,ʒ,h
Approximants: l,r,j,w
Right away we can see that most are actually covered by the basic Latin alphabet. Affricates can be reasonably represented as plosive-fricative pairs since English doesn't have a contrast between tʃ/t͡ʃ or between dʒ/d͡ʒ; then we can repurpose Jj for ʒ. For ŋ one can adopt a phonemic analysis which treats it as an allophone of the sequence ng that only occurs at the end of the word (with g deleted in this context) and as allophone of n before velars.
Thus, distinct characters are only strictly needed for θ,ð,ʃ, and perhaps ʒ. All of these except for θ actually exist as extended Latin characters in their own right, with proper upper/lowercase pairs, so we could just use them as such: Ðð Ʃʃ Ʒʒ. And for θ there's the historical English thorn: Þþ. The same goes for Ŋŋ if we decide that we do want a distinct letter for it.
If one wants to hew closer to basic Latin look, we could use diacritics. Caron is the obvious candidate for Šš =ʃ and Žž=ʒ, and we could use e.g. crossbar for the other two: Đđ and Ŧŧ. If we're doing that, we might also take Čč for c. And if we really want a distinct letter for ŋ, we could use Ňň.
You can also consider which basic Latin letters are redundant in English when using phonemic spelling. These would be c (can always be replaced with k or s), q (can always be replaced with k), and x (can always be replaced with ks or gz). These can then be repurposed - e.g. if we go with two-letter affricates and then take c=ʃ x=ð q=θ we don't need any diacritics at all!
Moving on to vowels, in GA we have:
Monopthongs: ʌ,æ,ɑ,ɛ,ə,i,ɪ,o,u,ʊ
Diphthongs: aɪ,eɪ,ɔɪ,aʊ,oʊ
R-colored: ɑ˞,ɚ,ɔ˞.
Diphthongs can be reasonably represented using the combination of vowel + y/w for the glide, thus: ay,ey,oy,aw,ow.
For monophthongs, firstly, ʌ can be treated as stressed allophone of ə. If we do so, then all vowels (save for o which stands by itself) form natural pairs which can be expressed as diacritics: Aa=ɑ, Ää=æ, Ee=ɛ, Ëë=ə, Ii=i, Ïï=ɪ, Oo=o, Uu=u, Üü=ʊ.
For R-colored vowels, we can just adopt the phonemic analysis that treats them as vowel+r pairs: ar, er, or.
To sum it all up, we could have a decent phonemic American English spelling using just 4 extra vowel letters with diacritics: ä,ë,ï,ü - if we're okay with repurposing existing redundant letters and spelling affricates as two-letter sequences.
And worst case - if we don't repurpose letters, and with each affricate as well as ŋ getting its own letter - we need 10: ä,č,đ,ë,ï,ň,š,ŧ,ž,ü.
I don't think that's particularly excessive, not even the latter variant.
In about five minutes any literate English speaker can learn to read at full speed with no spaces or other punctuation. Or upside down. Or at an almost arbitrary angle.
I taught myself this when I was learning Japanese 30 years ago to prove a point. Now it’s merely an interesting trick but one with an interesting staying power: with zero practice I maintain the ability.
Accents in french are pretty irrelevant, you can totally ignore them and master the language. Most french people ignore them while chatting/mailing/texting online.
If you ignore accents, some words can be mistaken for other words (with different accents), but if you check the context, the problem quickly go away.
Accents are just useful to help you pronounce correctly words ; they are also a hint about the word's origin (ex: ^ means the words is greek) ; I don't get why it stopped you from learning the language.
> Accents in french are pretty irrelevant, you can totally ignore them and master the language. Most french people ignore them while chatting/mailing/texting online.
“Master” would definitely not be correct, but you could write intelligibly enough indeed. It will cause you issues here and there (not being taken seriously, having some miscommunications when the diacritic disambiguates the word…)
If you can’t read the diacritics though, you’ll pronounce words very incorrectly and French is a very unforgiving language for mispronunciation: you will simply not be understood
I feel not being understood when pronunciation is off is more of a France french issue.
You will be understood eitherway in Canada (given you speak with french Canadians). But I sometime have difficulty being understood by frenchmans, less so with other french speaking cultures
It would be like a speaker who can’t distinguish the uh sound in “but” with the ih sound in “bit”. Is it really the native English speaker’s fault if he can’t understand that personal dialect?
France’s vowel inventory is bigger than (or just as big as) English’s, and it has a lot more homophones. I imagine all the context goes toward disambiguating the actual homophones and not the arbitrary sets of words foreigners can’t pronounce because they don’t want to learn the accents (the system is not that hard and completely predictable).
I don't see why it couldn't be. It has a pretty large corpus of decent literature/poetry/other media/etc, and the worst people seem to complain about is its inconsistent spelling rules that even native speakers struggle with. In general I'd rather deal with spell check failing on some common homophone from time to time than say, having to memorize arbitrary genders for inanimate nouns that lack any consistent marker and then tables of grammatical cases to apply on them based on those genders. Or having to shove a verb to the end of a complicated sentence and having to unroll the whole thing to figure out what's being said (not to pick on any particular language(s) I've learned).
Oh thank god, someone said it. Who cares if "tree" is masculine or feminine, it does not give my any information. In Italian, tree is a masculine word: what can I do knowing "tree" is masculine?
Grammatical gender can serve as disambiguation. I just heard this sentence recently while watching something in Spanish:
"No me compares con alguien como tú, que llegaste aquí de una isla oriental sólo porque te impresionó un espectáculo de magia barato."
In the phrase "un espectáculo de magia barato," which means "cheap magic show" here, you can tell from the genders of the nouns and adjectives that it's that "barato" modifies "espectáculo," meaning that the show is cheap and it's not that the magic is cheap.
It's not that useful here, because it's not hard to figure out the correct meaning from the context anyway, but it's a tool that helps clarity regardless. And when you learn a language well enough, it's not like you're thinking about this super consciously, you just know the word and gendering it and its adjectives flows right off your tongue. I think this is probably easier for a non-native to learn than all the irregular spellings of English, but I wouldn't know, being a native English speaker.
It seems like we can invent better checksums and referents than grammatical gender. Arguably that's a fascinating part of the pronoun discussions in English, being one of the last remaining bastions of grammatical gender in English (that and familial relationship words). I don't expect us to invent better things at all quickly, but it seems worth trying and it is interesting seeing various experiments.
One of the things I liked in studying lojban (a conlang of interesting background) was the use of mathematical identifiers as pronouns and "math genders" more related to linguistic role, referents like "the first noun", "the third verb" as pronouns. Referring to things by number is particularly great either, but it was interesting seeing a different approach to it.
Similarly, I think the language with the best pronouns I've experienced is ASL (American Sign Language). Signed languages have the ability to use three dimensional space in ways to anchor references that are impractical in spoken languages but so useful in signed languages.
I think English makes a lot of sense, but only if you invest the time to learn some of its etymology. Knowing some Latin, German, and Greek roots (in that order) is immensely helpful. You don't have to learn those languages per se, just some of the vocab. Eventually, you can look at a word, know if it's Latin/French, Germanic, or Greek in origin and all the spelling rules make much more sense.
This takes a lot of time, effort, and interest however, which is why many (most?) people think English is nonsensical.
You can also have a (maybe wrong) sense of familiarity that feels like it makes sense.
I'm ESL but after so many years of daily contact I find writing stuff in English easier than in my native German. Never lived anywhere else. I'm not claiming it's free of errors but it just feels like less work.
None of English is nonsense. But without diacritics, you need to know the historical contexts behind the different spelling or pronunciations to understand the rules.
In English I need to find how each word is pronounced individually. What the hell is the difference between "men" and "man"? What's the difference between "bitch" and "beach"? Why "though" sounds closer to "throw" than "through" or "thought"? Those differences are encoded in a unclear way that there are more exceptions than rules.
Portuguese (my native language) is not perfect in that sense, but at least it has more rules than exceptions. Part of that is because we use the diacritic marks.
Then, I prefer excessive writing marks than excessive unclear special cases
Rules exist, but most are never taught and instead only learned through exposure. It's why "ghoti" is a trick - you have to break several rules of English pronunciation to get "fish" out of that.
Here's a page where someone tried to reconstruct as many of those rules as possible: https://www.zompist.com/spell.html - obviously it can't eliminate all exceptions but it does surprisingly well.
Rules 6-8 are relevant to one of your examples, including the explanation afterwards.
The complexity of these rules, and the number of exceptions that you need to learn notwithstanding the rules, can be roughly estimated for any given language by training a language model on word <-> IPA correspondence for that language (using a subset of the vocabulary as a training set), and then seeing how well it can predict the remaining words. You can run it in either direction, too, to separately measure the difficulty of reading (word -> IPA) and writing (IPA -> word) that language.
This was actually done for a number of languages including English:
You can see how languages with true phonemic spellings tend to be in the >90% range on both reading and writing, with Esperanto at 99%. Spanish and German are in 60-80% range. English is dismal at ~30% for both, though, with only French and Chinese being harder to write, and all other languages tested being easier to read.
I couldn't help to look and see if the company behind commercials that are burned into my brain from 40 years ago are still a thing, and lo, Hooked on Phonics is still going strong!
This page[1] walks through the basics of phonemic awareness that children need to learn via exposure & repetition in order to learn to apply that aural learning to reading.
It makes me wonder if a program like this, aimed at English-speaking children, might help those adults learning to speak & read English if they could put up with being addressed as if they were a child.
> how each word is pronounced individually. What the hell is the difference between "men" and "man"? What's the difference between "bitch" and "beach"?
From what I could easily research, Portuguese has a pretty wide variety of vowel sounds, but it still pales in comparison to the Germanic languages that English took from; and across the spectrum of English dialects and accents you can end up hearing pretty much anything vowel-like that the human voice apparatus can generate. The strength of the difference between "men" and "man" will depend on who's speaking, but it's generally less than Portuguese phonology can accommodate. The "e" sound here should be familiar; the "a" sound not so much. Spanish (and, say, Japanese) learners of English will have much the same problem, but more so; their natural "e" is a bit off.
(From what Wikipedia is telling me, many Brazilian Portuguese dialects will use the right /ɪ/ sound for "bitch" in unstressed syllables. But then, my local accent contrasts /ɪ/ with /i/ quite strongly.)
On the flip side, I struggled with pronouncing Dutch when I made a brief attempt to pick it up; the individual sounds are all straightforward enough, but certain combinations are really unnatural.
> What the hell is the difference between "men" and "man"? What's the difference between "bitch" and "beach"?
Those words all have completely different vowels in English; they're not irregular spellings. If you can't tell the difference, you probably just haven't listened to enough English or have said them incorrectly too much to tell the difference.
I think that's probably more because English uses etymological orthography.
So spelling rules are based on four distinct "primary" systems of phonics that can be used depending on whether the word or morpheme has a Germanic, Greek, Latin or French origin. (Yes I know French comes from Latin origin, but the spelling rules differ depending on whether the word was imported directly from Latin, or came in via Norman French.) And then the Germanic and French origin words can get even messier because their spelling was standardized before the Great Vowel Shift. And then whenever we take loanwords from other languages that use the Latin alphabet, we preserve that language's spelling. Which creates a whole mess of special cases where the spelling doesn't follow any of the regular phonetic rules.
If you look at languages where the writing system is famously difficult to learn, a common element they all share is etymological orthography.
>but the spelling rules differ depending on whether the word was imported directly from Latin, or came in via Norman French
In fact it can be even more complicated because in English the words can come from Norman dialects and "typical" French simultaneously. For example, warden and guardian come from the same word in Old French, the former is closer to how Normans pronounce it and the latter is closer to its modern French pronunciation.
> Do men/man and bitch/beach sound the same to you?
Not exactly the same, but I differentiate them more based on the context than in the pronunciation.
Giving an example for Portuguese that has about the same difference: "roupa de lá" (clothes from there) and "roupa de lã" (wool clothes). If you write them in Google Translate or similar you'll see the difference, which is very subtle for non-Portuguese speakers but sounds completely different to us.
In Portuguese, they indicate that a syllable is stressed and alternate ways to say the vowels. e.g. "país" is stressed in "i" and means "country", while "pais" is stressed in "a" and means "parents". Tilde (~) indicates that the vowel is nasal, e.g. the "ã" in "São Paulo" means that it sounds like the "u" in "sun"; the default sound of "a" in Portuguese is the same as in "car".
because you know the stress syllable by looking at the word. take Desert and Dessert, do we say DES-ert or des-ERT. Also in portuguese, at least, I can know which "e" sound [1] each "e" in the word makes by knowing this (well, almost, but not completely, but much better than English.)
I sometimes wonder if English dominated programming and the Internet partly because it doesn't use accents or special characters. You have limited space on a keyboard, and as a native Arabic/French speaker, typing in those languages is a real hassle. French requires é, à, ç and other accents, while Arabic is even more complex with right-to-left text and changing letter forms. English just flows naturally. Maybe the Internet's language wasn't just shaped by politics or economics, but by something as simple as which language was more convenient to type.
Tangential: Ùù has always seemed immensely silly to me. It’s given an entire key on the CSA keyboard despite only being officially used in 1 non-proper-noun word: où. It’s there solely to disambiguate with ou, the actual phonetics are not affected. Whenever I look down at my MacBook’s keyboard I think it seems a bit out of place haha
I mean that might be part of it, but also because the internet developed out of the ARPAnet which was a United States Department of Defense project, at a time when the United States was one of two superpowers (right as the other superpower stopped being a superpower or for that matter a state), in a world that already gave pretty heavy weight towards English as the lingua franca in international institutions after World War II or simply because it was the lowest-common denominator in a lot of the world post-the British Empire.
English had a lot of wind beneath its wings. Still does.
Maybe that makes sense for french vs english, but there are plenty of languages that avoid accents in transcription (despite being just as tonal or more so than english) because they don't have the analytic diction required to discuss the abstract concepts all over programming, computer science, and even just "business".
I think that's underselling the west's post-WW2 influence and the amount of innovation that was fueled by a booming capitalist society that the entire world wanted to take part in.
I'd say English's simple, non-accented latin characters being easy to represent mathematically was a happy coincidence.
Apple's HyperCard had a French dialect, and AppleScript followed with one too. It was short-lived but did provide a window as to how these programming languages might have looked like had they originated in a non-English world.
A fun factoid I just discovered: on March 11, 1968, President Lyndon B. Johnson signs an executive order mandating that ASCII be adopted as a federal information processing standard for electronic data interchange between federal agencies. This order was known as... Executive Order 11110 :)
Do you have a source for the executive order claim? I can't find it on this list of executive orders signed by Lydon Johnson, https://en.m.wikipedia.org/wiki/List_of_executive_actions_by... . And as far as I can tell the claim originates on ascii-code.com and spread from there?
Googling executive order 11110 gives no primary information.
From a quick search it seems this was a Presidential Memorandum not an Executive Action.
“Executive orders are generally more formal, require publication in the Federal Register, and must cite the President's legal authority, while memoranda are less formal, may not be published, and do not always require a justification of authority.”
These sometimes get called executive orders, like some memos that trump has signed in the last few months were called executive orders by the news and online.
They are essentially the same though. Memos carry legal weight and can direct agencies to carry out specific actions.
The 11110 thing is a myth, though. The closest that we get to that is that it is number 127 in the NARA's catalogue of the Johnson's public papers for 1968.
Well, there's this story about how printing failed Arabic. Allegedly, in Italy, they tried to print a Koran, but because the printers didn't speak Arabic, and were trained on Latin scripts, they messed it up so much that the Arab world came to believe printing is not going to work for them. Even though most scientific books of the day were written in Arabic and the best schools spoke the language, it quickly fell out of favor, being replaced by Latin in Europe.
In turn, the Caliphate made a point of standardizing the script and creating libraries which fueled research science for a good few centuries.
----
Even before Internet, languages with diacritics (eg. Russian Ё) were deprecating their use. I believe something similar is happening in German (with ß). Also, languages with long history seen incremental thinning out of the alphabet to remove duplication and rare special cases. Sometimes, the opposite happened, but it was usually brought by reactionary politics, especially inspired by local nationalism which looked for validation in ancient history. So, for example, in the 90s Ukrainians brought back the letter Ґ that was used in only a handful of words, and was happily forgotten during the Soviet times.
So, convenience and suitability for new technology can be a meaningful factor in adoption.
You don't even have to leave English to find examples of printing shifting script. The printing press killed the thorn "Þ" character which made the the "th" sound. It got replaced with either a "y" (which looked sorta-kinda like a thorn) as in "Ye Olde" or a "th", which is how a speaker not accustomed to the sound might approximate it "tuh-huh".
That is a really bad example, because English does have fairly productive pronunciation rules [1], and trying to make 'fish' come out of ghoti requires breaking them. 'gh' only occurs as an /f/ sound when it occurs at the end of a syllable; as an initial consonant cluster, it's invariably /g/. Turning 'ti' to /ʃ/ is a fairly normal affricatization, which requires a subsequent vowel, which is lacking here (consider words like 'ratio', 'gracious', or 'nation'). Even turning the 'o' into /ɪ/ relies on fairly regular vowel destressing, which there's no reason to expect in 'ghoti'--which should be pronounced per English rules, pretty unambiguously, like goatee.
There are some real issues with English spelling, like the inconsistency of pronouncing 'ea' as /i/ or /ɛ/ (consider, uh, read and read). But 'ghoti' isn't one of them, because that's a case where there's not a lot of ambiguity in English pronunciation.
[1] The worst offenders in English pronunciation are when English borrows foreign words both with foreign pronunciations and foreign spellings.
It has become a thing where folks are taught, basically, that English is not a phonetic language. It is truly mind boggling the number of college educated folks I've talked with that start to try and argue that we don't have a phonetic alphabet.
And, like, I get it. We don't have a fully regular one. But this is like the people that think we don't have a single word to describe some things, when they have to basically ignore adjectives and many many synonyms to get to that idea.
Even better when folks complain that we have different ways to refer to people from other nations. Ignoring that a large part of that is that we heavily deferred to how said people wanted to be referred to.
At least one really obvious way to know that English is a phonetic language: fantasy authors create all sorts of made up names in their books. Sure, sometimes there are disagreements over how to pronounce these names, but generally readers come up with quite similar pronunciations.
The confusion may come from the various spelling conventions in the numerous loan words. In many of the counterintuitive cases, you could imagine a more phonetic spelling. The tradition has been to preserve buffet as is, instead of rewriting it as, "buffay".
The distinction is there. English can be used phonetically. We prefer to preserve the heritage of various loan words instead.
Hearing Americans pronounce the French loanword 'niche' as 'nitch' instead of 'neesh' is cringe-inducing.
English pronunciation is just kind of a mess (especially in the US). It is one of the few languages where highly educated mature people are regularly unsure of how to pronounce a word in their own language or where there is no agreed upon 'non-dialect'/standard pronunciation.
...we all agree that the right pronunciation of "nitch" is "neesh", though, or at least I've never heard a serious argument to the contrary. People just genuinely don't know how to pronounce it because they've only seen it written.
One that still gets me personally is "hyperbole"--I know how it's pronounced but when I read it, I still say "hyper-bowl" in my head more often than not. I don't think I've ever made the mistake while reading out loud to someone yet, but it will likely happen some day and when it does I will feel very stupid.
>It is one of the few languages where highly educated mature people are regularly unsure of how to pronounce a word in their own language
Which is worse, being unable to correctly pronounce a word (but still being close enough to be understandable) or being completely unable to write a word?
Some Americans clearly must do this, but personally, I've never heard this in my life until I saw it on a YouTube video of a British person complaining how Americans pronounce words. Obviously, your experience may vary - it's a big country.
The transatlantic dispute over "aluminum/aluminium" seems minor when you consider how English is used globally. Even within Britain, there are considerable variations.
The one that gets me, as an American is nuclear vs nucular. Both have been in use verbally and written for decades... academics have adopted the former, even if the latter was more common in most early use. And that's just one, pretty recent example.
I'd argue that is mostly because 1) people follow audiobook or TV series pronunciations and 2) most discussions happen online and not in verbal form.
This is definitely a problem when it surfaces. For example the Stormlight Archive [1] series has two voice actors narrating the audiobook, and they don't even agree between them how to pronounce half the made up names.
As someone who has listened to The Stormlight Archive (and The Wheel of Time with the same two narrators), the differences are absolutely there, but they're relatively small.
Fantasy novels predate the widespread popularity of audiobooks. It used to be quite expensive to distribute a large enough volume of audio. The old "books on tape" cost a lot of money, were frequently abridged, and only existed for the most popular titles.
Reminiscent of a tweet about the death of the inventor of the GIF, who reportedly said it should be pronounced "jif" — the retweeter's comment was, "I guess he's with Jod."
English is more phonetic than not. There are a lot of words where it isn't clear what is the correct pronunciation, but if you put a random sequence of letters together there are only a few possible pronunciations, often exactly one.
I wish English was more phonetic. Spelling and pronunciations is a mess. However the language is mostly phonetic.
There's something you speakers of non-phonetic languages cannot fully grasp, I'm afraid!
We Italians, when we were children, we were taught to read based on the written letters, and we were able to read any word. It was normal, during primary school, to pronounce a word correctly and then ask the teacher what it meant. This is something you can not do in English.
And the converse was true as well! An Italian child is able to hear the surname of a new acquaintance, or the name of the village they are from, and write it down properly. In Italian, the question "How do you spell it?" does not make any sense! Again, this is something you can not do in English. Nor can you do it in French, because in the past centuries they had ink to spare and as such they started writing down useless letters that they do not pronounce.
You can frequently do that in English too. Of course there are exceptions, but if anything it's typically because of words/names from other languages.
In my experience learning Spanish, their loan words are Spanish-ized, being made to be pronounced and spelled in a format that makes more sense in Spanish. Whereas in English, the pronunciation and spelling is usually taken more directly from the source, so you get a bunch of instances where a word's spelling doesn't really match its pronunciation.
> We Italians, when we were children, we were taught to read based on the written letters, and we were able to read any word. It was normal, during primary school, to pronounce a word correctly and then ask the teacher what it meant. This is something you can not do in English.
We're still taught very basic phonetic rules in English. Like how vowels have a long sound and a short sound, where "ee" is the long e sound, or "<vowel> <consonant> e" triggers the long sound for that vowel. But you're also taught that many words are exceptions (e.g. bear vs beard). And you learn there are patterns to the exceptions, like how "ea," if it doesn't sound like "ee," will sound like a short e, like in "head" or "breadth," and particularly in cases like "dream - dreamt" or "leap - leapt."
And if you do a lot of reading as a kid, you vaguely recognize in the back of your mind some words that seem to follow a different set of pronunciation rules not taught in school (e.g. rouge, mirage, entourage, entrée, matinée, parfait, buffet, memoir, soirée, patois), which you learn implicitly. I remember this as a kid, only later learning those were French.
And this lets you guess pretty well how you'd pronounce a word. Just with basic rules and a lot of input to learn from, you can guess how to pronounce pretty much anything with good accuracy, because there are rules, and even a logic to the exceptions, but the rules are overlapping, so it's more like a set of rules you choose from.
I'd liken it to machine learning. You can learn the rules without even being taught the rules, like I did in the case of French loan words. And there are probably rules we follow without even realizing it, just instinctively thinking it's the natural way to pronounce the word without knowing why.
I'm not saying it's as good as being as phonetic as Italian, but it's not like we just have to memorize the pronunciation and spelling of every word as though it were a structureless string of letters and a corresponding, unrelated sound.
Yes, but Italy had to centralize its language in order to accomplish this. 1000 Italian dialects were suppressed in a very heavyweight process. (And probably some people didn't like speaking Florentine, which became modern Italian.)
English is complicated because it's decentralized and there is no authority to regularize it. Which is a feature, not a bug.
1 - Being fluent in the national language does not prevent people from maintaining their dialects in parallel.
2 - Whether a language is phonetic has no relation to political issues concerning dialects.
3 - Whether a language is phonetic has no relation to whether people like to use it.
4 - English got decentralized starting with the Age of Sail, but the lack of correspondence between written and oral forms is systemic and older than that.
> English got decentralized starting with the Age of Sail, but the lack of correspondence between written and oral forms is systemic and older than that.
That's not really true -- there is and was a great deal of dialect diversity within England itself. It was widespread printing that allowed languages to be standardized at the scale of nation-states in the first place: the divergences that developed after the age of sail were reversing convergence that had only begun a couple of hundred years earlier.
And although versions of English from the south and east of England became the basis for modern standard English, other dialects persisted and sometimes spread around the world, so some of the differences between English dialects globally are due to disparate influences from different dialects originating within the British Isles.
being fluent in a language makes you less likely to be interested in a second when everyone speaks the first. This plays out over generations in killing the less common languages.
There is a still a lot more linguistic diversity in Italy than across the entire English speaking world.
e.g. Northern Italian languages are technically more closely related to Gallo-Romance languages from the other side of the Alps than to standard Italian.
I think you're trying to to argue something like: "the set of dialects that make up English have a large(r?) set of allowable IPA orthographic representations than the accepted set of English orthographies" or something to that effect? And, that, perhaps, Spanish (French? Ukrainian?) have a much smaller set of alternate IPA orthographies for a given acceptable orthography?
I guess I'm really confused. It's not like English is some Arabic language where the orthography is in a second nearly unintelligible languages? Or, Chinese or Egyptian hieroglyphs... ?
> I think you're trying to to argue something like:
I'm arguing exactly what I wrote: a phonetic language is one when you can see a written word and pronounce it correctly, without knowing what it means and without having ever heard it before.
Edit - as an example, consider "door" and "pool": the written form is not sufficient to guess the sound to associate to the double o.
This is something that should be looked up and not argued about. As far as I can remember, the vast majority of alphabetic languages are phonetic. English, French, and Portuguese are not.
Being able to guess how something is pronounced sometimes is not enough to say that English is phonetically spelled. English often borrows spellings directly from the languages that it is borrowing a word from, those spellings are usually phonetic (based on the source language's rules), and due to the presence of certain peculiar sounds, one can often guess which phonetically-spelled language a word was borrowed from. That's not an English word being spelled phonetically, that's people being forced to become language detectives. You can get lucky and guess the pronunciation of a Chinese character that you've never seen before (based on the radicals), but no one would say that Chinese characters are a phonetic alphabet.
Other than the soundalikes "b" = "v" and in Latin America soft "c" and "z" = "s", when Spanish speakers don't know how to spell a word, it's because they are also saying the word wrong when they speak.
The door vowel placed between P and L would make the word 'Paul' or 'pall' in most English accents. If I imagine 'door' with the pool vowel, I get something like a Scottish pronunciation of 'dour'.
I'm not stating that English is anything like that. Just that it is not phonetic, in the sense that the written form of a word is not sufficient to pronounce it correctly.
That isn't what that means, though. It is not regular, it is phonetic. Indeed, your argument that there is confusion in spelling is because it is phonetic, but not regular. You know the letters in "glasses" correspond to sounding out something. In contrast to something like an emoji, :glasses:, which you don't.
I have to agree with you. With respect to emojis, English is phonetic. But this statement is as stretched as considering a diesel guzzling truck green because the fuel it burns was indeed created using solar energy.
No it isn't. Pedantically, English the language is definitionally phonetic, as it is spoken. Sign language is not phonetic, nor are things like smoke signals/traffic signals/etc.
Just as it would be silly to claim that Japanese is not phonetic. Of course spoken Japanese is phonetic. They even have two fully regular alphabets that can both express the same phonemes, but are used for different reasons. As well, they have a completely logographic set that does not relate to phonemes, even though it is used for most writing.
We're discussing features of written language ("phonetic" -- or the etymologically related "phonological") is a way of categorizing writing systems by their relationship to spoken language.
> Of course spoken Japanese is phonetic
"Phonetic" is not a feature of spoken language, but of the relation between other language forms (usually, written, but you could make the same distinction for, say, sign languages) and spoken language.
> They even have two fully regular alphabets
I assume from "two fully regular" you are referring to hiragana and katakana, but those are syllabaries, not alphabets. (Romaji is an alphabetic system, though, but I don't know where you'd find a second one.)
Phonetic is absolutely a feature linked to spoken languages, though? It quite literally is relating to spoken sounds. Sign language, for example, is not phonetic, as many users of it cannot speak or hear.
Fair that I should have said they have two phonetic writing systems, decidedly not alphabets. I'm not sure the distinction is one that matters for what we are covering here?
> Phonetic is absolutely a feature linked to spoken languages, though?
It's a feature linked to spoken languages, since it is a feature of the relation of non-spoken (usually written) language to a spoken language.
But it is not a feature of a spoken language.
> Sign language, for example, is not phonetic, as many users of it cannot speak or hear.
Yes, in causal terms, the fact many users of sign languages aren't familiar with the sounds of the spoken language is a reason sign languages tend not be phonetic, but they are not phonetic in definitional terms because the symbols in the sign language do not represent the sounds of spoken language.
But it would make no sense to call a spoken language phonetic (except maybe if it was a code for a different spoken language, in which the phonemes in one mapped to the individual phonemes, rather than ideas, of the other.)
It absolutely is a feature of spoken languages. It is in contrast to vocalizations, specifically because it is about speech and not just the sounds animals can make.
I get what you are aiming at, but phonetics is about speech. Is why you can reliably say how many phonemes different languages have. If you had to cover all vocalizations that people could do, you would have a bit more trouble.
"phonetics" is about speech, but the noun "phonetics" is not the adjective "phonetic" as applied to a language. "phonetic" is not a modifier that applies to spoken language (with the hypothetical caveat I gave upthread), and even if it was, it would have a different definition than the one that applies to non-spoken language and is about the relation such a language has to a spoken language, so trying to redirect to it in a discussion of that use of the adjective "phonetic" would be equivocation, argumentative conflation of different definitions of the same word.
It is hard for me to read this. You seem to have given up on capital letters. And sentences. I don't like criticizing run-on sentences as being indicative of bad thinking; but I do literally feel you grasping here.
I'm largely comfortable with the idea that there is something lacking in the orthography of English. Fully comfortable, even. I'm growing frustrated with how many are pushing the idea that it is not phonetic. The system is literally to convey, in writing, the words that you would speak in English. And the word "phonetic" captures that perfectly.
If you want to argue that we are building a new use of the word "phonetic" applied to writing that supersedes "orthography" and related terms. You do you. It still seems nonsensical to me and only works if you ignore that we have an alphabet that is literally used to convey speech sounds.
The issue at the start of the conversation is not about speaking or gesturing. It is about using the Latin alphabet properly (i.e. phonetically, as it was designed) or "with some imagination" as the English does.
The alphabet is used to communicate the spoken words. Not the concepts or something else, literally the spoken words. Is a big part of why slang is so popular in fiction settings, as they would use the letter to convey pronunciation. Because the letters generally represent phonemes.
I think I was too swayed by Sold a Story; but I am heavily convinced that the non phonics based attempts to teach reading was a massive disaster. And not just for reading literature, but also for reading math. Without learning to effectively interact with symbols, people grow to think they either get the math or they don't.
Yeah, English orthography is a hot mess, but it's still fundamentally phonetic and alphabetic. Just try to learn to read Japanese or Chinese, and you'll very quickly come to miss English's pile of nonsense.
> That is a really bad example, because English does have fairly productive pronunciation rules
Not really. There's no way to guess how many english words are pronounced based on the written form, unless you've heard it before. And of course the pronunciation may vary wildly based on region/country as well.
The most telling evidence of this is the existence of Spelling Bee competitions in english language countries. The fact that hearing a word being spoken is challenging enough to figure out how it is written that it is a competitive sport, says it all.
There are many languages where the concept of a spelling bee competition makes no sense at all, because as soon as you hear the word being spoken, it is 100% deterministically obvious how it is written. English, not so much.
According to this paper [0] and my own experience, it's way easier to pronounce a word in French given the spelled word than in English. It's slightly harder to spell French than English for the model of the study, but it's really close. Now, in my personal experience, I feel like French has a lot of rules while English has a lot of outliers which do not follow any rules. But my native language is French, so I am obviously biased.
Yeah as far as I know, in French words are always pronounced consistent with how they're spelled. The same is not true in English. Americans complain a lot about french spellings '-ioux', 'eau', etc. but they offer no gripe over the difference between '-ough' in 'enough' vs 'through'.
French is funny to me because the written language and the spoken language are in some ways quite different, with written french introducing considerable complexity. aller, allait, allais, allaient, alleé, etc. Since the spoken context for all the conjugations is almost always clear, I'm not sure why someone introduced the extra complexity.
> Yeah as far as I know, in French words are always pronounced consistent with how they're spelled.
Whoa, very much not! I have spent the last 20 years trying to learn how to pronounce french words (my partner is a native french speaker, so I keep trying). The only somewhat consistent pattern I have is that the last few letters of each word are often silent, but even that is not really always consistent.
I'm fluent in 4 languages but french is an impossibly tough nut to crack for me.
> Yeah as far as I know, in French words are always pronounced consistent with how they're spelled.
It's far from as bad as English, but here's a Reddit thread with lots of French words which are not spelt as they are written. Not esoteric words either; along the lines of hier and monsieur
I disagree. For whatever reason, most proficient readers I know have an intuition about the correct pronunciation of a word even if they’ve never heard it spoken before. And even if they use an intuitive pronunciation that isn’t identical to standard pronunciation, they’ll still be understood.
Spelling bee is the opposite direction, going from pronunciation to spelling; not a fair comparison.
> For whatever reason, most proficient readers I know have an intuition about the correct pronunciation of a word even if they’ve never heard it spoken before.
Because pronunciation rules exist, they're just never explicitly taught and instead learned through exposure. For example, here's someone reconstructing as many of the rules as they can: https://www.zompist.com/spell.html
Finnish is extremely easy, there is one sound for each letter and zero exceptions.
Spanish is also very predictable. While there are a few exceptions (like 'c' can be 'c' or 's'), they are very easy rules to follow, so never any surprises.
English and French are in the batshit crazy category. It's pretty much all random, you just have to know from memorization.
I would expect that spelling bees would select words that are not phonetically spelled. This selection bias does not imply that English does not have productive pronunciation rules.
True, in that spelling bees will select for harder words.
But the fact that such words exist, in such large quantities that memorizing them all is so challenging that this becomes a competitive sport, is why engligh is so impossible.
Dutch, which has a pretty reasonable sound-to-orthography mapping (some exceptions of course, but not all that many) also has spelling bees. Often won by the Belgians.
> Not really. There's no way to guess how many english words are pronounced based on the written form, unless you've heard it before. And of course the pronunciation may vary wildly based on region/country as well.
> The most telling evidence of this is the existence of Spelling Bee competitions in english language countries. The fact that hearing a word being spoken is challenging enough to figure out how it is written that it is a competitive sport, says it all.
That's two exact opposite things.
Languages for which you know how to pronounce a word just from its written form => you can have spelling bee competition there.
Languages for which you know how to write a word when you hear it pronounced => no spelling bee competition.
I'll take French as an example : if you see "o", "au", "eau" in a word you know how to pronounce it. There is one and only way. But if you hear "o" in a word then good luck knowing how to write it. So you got dictées (spelling bees) even if you can easily guess how a written word sounds like. The existence of spelling bee competition in the English world is not proof that the language written word pronunciation are a guess.
At least that's often spelled "fo'c'sle" these days, which gives you a good idea of the actual pronunciation.
My personal favorite in English is "colonel" being pronounced the same as "kernel". Which is insane even from an etymological perspective because the word is a derivative of "column" (as in, a colonel is someone who commands/leads a column of soldiers).
A lot of nautical terms have unusual pronunciations. English sailors primarily came from coastal regions, and were very happy to have a lingo that was incomprehensible to the landsman. All of this carried over to North America as well.
French is a lot less bad than English in this regard. In French you can usually (though not always) predict how a word is pronounced from its spelling, but not vice versa. In English, both directions are impossible.
French is not a good example. Pronunciation often deviates from spelling in French (e.g. many silent letters and inconsistent mappings).
Hungarian, however, is pronounced the way it is written, as its orthographic type is phonemic, whereas French and English are of type deep orthography.
Serbian is of the perfectly phonemic type. "Write as you speak, read as it is written" is a common saying.
The silent letters are not the point - that's why the poster you replied to said it doesn't work speech->writing in French. But writing->speech is much, much more consistent than in English, even if the orthography itself is kinda criminal with all the silent letters and whatnot.
I've only really been exposed to French in music, where I've sung various French pieces of the years. But from my experience, at least French is consistent? As-written is as-pronounced.
Is this not really the case, and therefore is French also guilty of having the same vowels/consonants pronounced differently for completely arb reasons?
My son's first year teacher said (I may have the numbers slightly wrong) that Spanish has 23 phonemes (sounds the mouth makes) and 23 graphemes (ways to write sounds). English, on the other hand, has 43 phonemes and over 500 graphemes.
Spanish is better than English, but it's nowhere near that regular. There are three different ways to pronounce "x", wild dialectal variations in "ll" and "c", etc.
The rules are very clear on when those are used though, you are not really arguing the original point imo. What are the dialectical variations in "ll" and "c"?
(B2-ish Spanish learner here but) "ll" is pronounced in at least three variants that I know of: "y", "j", and something between "sh" and "ch". E.g. "llama" might be pronounced like (in English writing) "yama", "zhama", or "shama". The last one really threw me for a while; it's super common in Argentina at least.
I spent time in the "Rio de la Plata" area in the late 1970s, mainly Montevideo, and learned rioplatense Spanish, and would use the ZH sound as in "meaSure" for Y/LL letters in "playa" and "calle".
In the last 40 years I've spent mostly in the USA I rarely have heard Uruguayan/Argentinian Spanish in person or in media, but was surprised to hear Messi and others in recent interviews use SH as in "puSH" for the Y/LL, this apparent has been a generational shift in that area, first in Argentina and then Uruguay. I'd sound old-fashioned if I were to go back to Montevideo these days.
I see what you mean. I think you should stick to one form and learn by difference or you could quickly get lost.
"ll" in standard spanish is a strong english "y".
However, in spanish argentinian from the area of Buenos Aires (but not the argentinian Córdoba, which sounds more like colombian spanish) it is "sh", being that s something like a mix in-between of "j" and "s" + h as in "she" but the sound is a bit different.
Without being able to record some sound I cannot express it better but I am sure you can find something around. Javier Milei, the president, has such an accent.
AFAIK "ll" can also be the palatalized "l" sound in some dialects, i.e. in the same relationship to regular "l" as "ñ" is to "n". Indeed, this is the original pronunciation from which all others have diverged.
I think that must have been within one dialect. If you include all dialects of English (Scottish, Irish, Australian, Singaporian, Indian, American, etc. etc.) I'm sure you have a lot more than 43 phonemes.
In any case, her point wasn't to give a lecture on linguistics, but to impress upon the parents how complicated English really is to learn to read.
x is pronounced four different ways in Spanish: like j in México, like the English “sh” in Xcaret, like s in xenofobia and like English “x” in extremo.
The first two are not productive now in normal Spanish words: they are only used in old spellings that have irregularly been retained, and in loanwords from indigenous languages. But they do exist.
Well, yes. I was speaking about standard Spanish from Spain.
Xenofobia is an s, yes, and excursión is "ks"
In fsct, Méjico is the traditional way to write Mexico in Spanish grom Spain until it was accepted the other form a few years ago. I still write "Méjico" myself.
Since less than 10% of Spanish speakers are from Spain, there’s no reason to assume you were specifically talking about that one country when referring to the Spanish language in general.
And anyway, as you point out, even in Spain the form México is accepted now.
I thought it is perfectly reasonable to talk about spanish from Spsin the same you talk about English from England.
After all, it is where they come from originally and have their own spelling (colour vs color, etc.)
An x in standard spanish has always been the two sounds I told you and that mexican deviation is specific to Mexico.
Yes, it is over 100 million speakers but I was still assuming the root language in its original place as the reference. Sorry if I did not express it correctly.
I get your point, but FWIW, México is not a Mexican deviation; it's just an older Spanish spelling. E.g. Jiménez was once spelled Ximénez and there are probably lots of other examples.
The "root language spoken in its original place" absolutely did pronounce X like modern J.
This isn't entirely correct. A distinct sound that the mouth makes is a "phone". A phoneme is almost always a group of several phones - allophones - that native language speakers perceive as a single sound. Another way to phrase it is that if you change one phoneme to another one, it makes a different word (possibly a non-existing one, but regardless the native speakers would consider it distinct), but changing from one phone to another doesn't change the word.
For example, in English, the phoneme /t/ has allophones [t], [tʰ], [ɾ], or [ʔ] depending on context. OTOH [ɾ] is a distinct phoneme in Spanish, and [ʔ] is a distinct phoneme in Arabic.
Unfortunately these two are often confused, so one should be careful with such counts and comparing them - it's not uncommon when people count phonemes in their native language, but phones in other languages (when those phones sound distinct to them).
This can also vary significantly from dialect to dialect, since one very common thing in language evolution is for two similar phonemes to collapse into a single one while retaining the original distinction as allophones. For English, in particular, the number of phonemes varies a lot between American and British English (with the latter having more distinctions).
Spanish "maps" very nicely but even Spanish isn't exactly 1:1
- /k/ can be written both c and qu, and k where it occasionally appears in the language (e.g. kilo) - and the u in qu is silent.
- /s/ can be written c, s, and z, though stress rules are different for c and z.
- r and rr are distinct sounds but r = rr at the beginning of words, I think.
- At least in Mexican Spanish: The "ua" sound can be spelled ua or oa (e.g. Michoacan, Oaxaca) - and also the breathy sound of j can also be written with an x.
- d has a sound a little like English voiced-th at the end of words (e.g. juventud)
qu: the u is always silent and qu is followed by i or e. It is still a systematic way of reading. It is like gue and gui, you pronounce as in "singer" the "ge", the u is mute. If you want to pronounce the u, as in pingüino, you set the diaeresis.
The stress rules, to the best of my knowledge, is very systemaic (not 100% but I would say "almost" at least for the words in use). Even the stress rules are very uniform.
> r and rr are distinct sounds but r = rr at the beginning of words, I think.
This is still systematic reading. At the start of a word it is the strong one, yes. And when it is preceded by a consonant, such as in "enredar" (that is strong r). There is no exception of any kind here.
> d has a sound a little like English voiced-th at the end of words (e.g. juventud)
That is some dialects in some areas. We pronounce a clean d at the end in my area (around Valencia). It is also the correct, standard way to do it for spanish. The other is a deviation existing in León, for example.
Yes, I'll always remember the long time spent asking for the whereabouts of Ocean Drive, mispronounced by me because the correct pronunciation would require the word to be written as Oshean or maybe Oshan. It was 1995. I have had very few occasions to hear native speakers. A lot of people and I were figuring out plausible but incorrect pronunciations by applying the most usual pronunciation rules to the written words.
> Spanish is totally systematic in this sense and once you can read it, you can pronounce it.
IMHO purely phonemic orthography makes orthography unnecessary complex, as there are language features like assimilation[1] that happens naturally in spoken form but does not make sense in written form.
In contrast, morphophonemic orthography keeps systematic and consistent mapping between spoken and written form for individual morphemes, but not necessary for words, as in written form morphemes are just concatenated (to make words), while in spoken form there may be complex interactions.
It's not so strict, but we try most time to keep it consistent. For example, here in Buenos Aires we almost don't say the "d" at the end of the word, like in "ciudad" (city), in some pronunciation guides I saw it written with a tiny d.
If the variant get's too popular the two versions become the official spelling, for example "septiembre" and "setiembre" (September) are correct. I hate the second one and I never use it, but it's popular somewhere. After many years, sometime the old spelling disappears and is marked as archaic.
An orthography that surfaces (non-phonemic) assimilation would be phonetic rather than phonemic. For example, many languages assimilate "n" to "m" before "b", but the phoneme is still /n/, and native speakers are often not even aware that this assimilation occurs (which is what indicates that it's still the same phoneme).
This is true, but Spanish orthography isn't completely phonemic (and simpler for it). It is very shallow and very consistent but it doesn't spell out things like assimilation differences, people are just wrong to describe it as completely phonemic.
This often gets trotted out, but it's not really true. English is a solidly Germanic language, which merely happened to lose the core attribute of Indo-European languages (extensive verb inflection), and in more recent centuries, there's been a tendency to adopt Latin and Greek words for new word formation rather than (as German did) using native words. So 'technology' instead of 'craftlearn' or 'television' instead of 'farsight'.
Even among major languages, English isn't anywhere near the worst offender of copulating with other languages for features--it never really adopted foreign grammar, the way you see with, e.g., Turkic languages.
Solidly Germanic with an absurd amount of French, down to nearly identical spelling for many common words. I’m not talking about cognates but actually 100% the same spelling and meaning and they’re often not from some recent century but from old French.
I’m sure you have a solid basis for saying this but it’s basically impossible to write many sentences without by accident using French down to the original spelling.
I was going to highlight all the examples I used by accident myself in this post but I gave up because the links were making it too long.
I believe this is because England was conquered by the Normans (french speakers). I think it was within the last 100 years or so that the English aristocracy finally stopped speaking French among themselves.
As I understand it, English at it's core is a Germanic language that underwent significant creolization with scandinavian sources. That core then acquired a significant amount of Old French and latin vocabulary, particularly in upper class terminology.
The creolization is why English has a relatively simple grammar, and all the word sources is why we have like 16-20 vowel sounds trying to cram into latin characters.
> in more recent centuries, there's been a tendency to adopt Latin and Greek words for new word formation rather than (as German did) using native words
Note that the prevalence of native words in German is the result of a modern reform movement, not something that happened naturally within the language.
> [English] never really adopted foreign grammar
There's the argument that do-support is borrowed from Celtic.
There's a really good podcast [1] that dives into the background of English. It starts off even further back, talking about PIE and how that affected all the earlier languages of the region. And then starts tying the pieces together on how English was formed.
Almost all the most used words in English are Germanic. Latin in particular is overrpresented because of scientific and technical terms which are rarely used.
That might be true if you just count up every word in the dictionary by origin. However if you weight the words by frequency, Germanic will be way higher. That is, if you take a transcript of an average conversation in English, the proportion of words inherited from Old English (i.e., Germanic) will be much higher than 26%.
> Spanish is totally systematic in this sense and once you can read it, you can pronounce it.
is there no accent variation in Spanish?
Such a 1:1 system would never work in English, because the way words are pronounced can be very different in e.g. Melbourne, Newcastle-upon-Tyne and Boston, for example.
One of the problems in english (not the only one, but one of them) is that for the vowels there are 5 graphs (is this term correct? Sorry but hope it is understandable) but many more sounds. In Spanish there are 5 vowels in the latin alphabet and exactly five sounds and nothing else.
Valencian has 7 sounds though, two for e and two for o. Similarly, Catalan also (and in some circumstances the o sounds as u, when the stress is not in it and other stuff). But they still have quite strict rules.
Yeah but we represent a lot of vowel sounds by combining vowels - 5 letters (not including y), if we allow any combo of two to represent a different sound that's 25 combos, and if we remember that preceding and following consonants can modify vowels too (though, dough, caught bought vs thou, bao, sour, or; on, con, Ron vs how, cow, ow) that's quite a lot of combos.
Now, you can (and should!) accuse me of cherry-picking examples, since the rules are less consistent and/or vastly more complicated than what I represented. But I maintain that there are orders of magnitude more ways to represent vowel sounds than 5, and the clue is the context. Not, as many will suggest, memorizing each individual case (though there's certainly plenty of that going around, much like Spanish's infamous irregularly verb conjugations), but understanding categories and families and patterns.
English sounds usually are best understood with groups of three letters, rather than one letter at a time. If you looks at throuples, you'll likely find far more of that consistency we all so deeply desire.
Yes, English is VERY consistent. The problem is that there are multiple systems working inside English vocabulary, so you have to get familiar with more than one rule set.
You're right to point out that English pronunciation varies widely across regions, but that doesn't fully negate the value of a systematic orthography. What germandiago is referring to is the relationship between graphemes (letters) and phonemes (sounds). Spanish has a highly phonemic orthography, meaning the rules for converting letters to sounds (and vice versa) are consistent and predictable. Yes, there are accentual and dialectal variations within Spanish (e.g. seseo in Latin America vs. ceceo in parts of Andalusia) but these are largely phonological shifts applied systematically, not random deviations from spelling norms.
In contrast, English has a deep orthography, where historical layers (e.g. Norman French, Old Norse, Latin borrowings) and sound changes (like the Great Vowel Shift) have led to a chaotic mapping between spelling and pronunciation. A consistent system wouldn't eliminate dialectal variation, but it could reduce ambiguity and aid literacy, as evidenced by languages like Finnish or Korean.
I don't know if Korean is ultimately that good. Hangeul are a monstrous improvement over the old mixed script (which itself is better than the Japanese iteration because the Koreans only used Chinese characters for Chinese loans), but it still has a lot of sound change rules and can be a bit of a pain to read because of how letters flow to the next syllable. It's not in the same league with Finnish or Spanish, at any rate.
Yeah there are multiple accents in Spanish, but each accent is still a 1:1 mapping from written word to pronunciation, there's no enough/through/dough nonsense.
In Spain you'll listen the three cases at once and all of them are perfectly valid.
-ito it's almost the universal way everywhere in the Hispanic world.
-ico it's widely used in the South of Navarre and Aragón and everyone will understand you. Heck, it's the diminutive from used by the hick people, and thus, it's uber known, altough you might look like a bumfuck village redneck sheepherd with a beret by using -ico outside of Navarre/Aragón.
-illo it's more from the South, but, again, understood everywhere.
In Argentina everyone will understand you, but if you don't use "ito" then people may ask where are you from.
"ico" is used in many countries of Central America and Caribe. I asked someone from Colombia, so I'm sure about Colombia but I'm no sure about every other country.
Is "illo" used in Madrid? I think I heard it in movies or TV programs from Spain.
The explanation you gave is already contained in the cited Wikipedia article. I think this "ghoti" example is more of a tongue-in-cheek mocking of pronunciation inconsistencies. If you want a jarring example, consider laughter and slaughter. I know, i know, they have different origins, but still, it confuses foreigners like me while learning the language.
But English orthography isn't meant to serve foreigners.
Im ESL, I struggled with English spelling as much as the next latin speaker who's already learned to read and write in foreigner.
But now that I get the reason behind it, I love it. I consider English orthography worthy of UNESCO protection, even. In fact, I am annoyed at the regular spelling of my two latin languages that have left so much history behind.
It’s fairly good at helping us understand the etymology. Have a “y” acting as a vowel in the spelling? Good chance it’s Greek. Have a “k”? Almost certainly not Latin.
That is trivia that is useless in almost all contexts. I've been a native English speaker all my life and this is the first I've heard of that. I can't think of any situation in life where knowing that fact would have been helpful. Your claim seems reasonable, but if someone says you are wrong I wouldn't fact check it even if clear links were posted so that I could.
But if you had known it (aka, if anyone had taught it to you), it wouldn't be useless, as you would know the context and how to pronounce it...not to mention the meaning behind it
If you’re seeing a word for the first time, it is pretty useful - partly with pronunciation but definitely with meaning.
You do have to have some familiarity with the source languages, but if it’s an unfamiliar but nativized word, those are almost always ultimately Latin or Greek.
If you're seeing the word for the first time and need to figure out how to pronounce it, how would you know that “y” is acting as a vowel and not as a consonant in the first place?
If it's followed by a vowel, it's likely a Germanic word: yule, your, young, yellow (and you probably know the word, since our core vocabulary is still mostly Germanic). If it's at the end or between consonants, like syllabary or ontogeny, probably Greek.
You might also just happen to know a smattering (or even a lot) of Greek and Latin.
Probably not. Toddlers generally don't have the brain to learn any reading. Spanish's advantages in reading isn't how young you can start learning to read, it is how fast you can stop reading. Spanish schools stop teaching reading takes about 5 years to learn, English 6, and Japanese 9 - after that much training kids are finally considered to read anything. (sometimes we talk about college level reading, but that is more about mastery of topic specific topic - Doctors, lawyers, and engineers each have special vocabulary that needs extra training to read, but they cannot read each other's technical papers)
I want to know who thought that chinese transliterated into "english characters" should use a whole bunch of q, x and zs to represent sounds in a way that no other english word does.
Pinyin was written by Chinese speakers for Chinese speakers. There are other romanizations written by westerners, and these are easier to see where the sounds come from; e.g., "tsai" rather than "cai".
What use is "q" as a letter at all in English? It makes a "k" sound and always occurs with a "u" after it. Why not use it for the "tch" sound? (Which, btb, is different than the "ch" sound.)
"C" is about the same -- by itself it always sounds exactly like "k" or "s". Why not use it for the "ts" sound?
As for "zhou" -- in English, z is very similar to an s, but voiced. So in pinyin, zh is just like ch, but voiced.
Lots of languages do this BTW. When people from Wycliffe want to translate a Bible into an obscure language without a writing system, they first have to invent a writing system. They could invent all new characters, but why? All it would do is make that language hard to type. So they take the sounds that language has, and map them onto Latin characters. Sometimes there's an obvious mapping, sometimes not.
Look up Welsh's spelling for another example of this.
> Why not use it for the "tch" sound? (Which, btb, is different than the "ch" sound.)
What are you thinking of? There is no difference between those things.
But your major point here is correct; on the fundamentals there is no reason for the English alphabet to feature a Q.
> "C" is about the same -- by itself it always sounds exactly like "k" or "s". Why not use it for the "ts" sound?
With the modern alphabet there's no reason for a C either. However, the answer to "why not use it for the 'ts' sound" is pretty obvious - that sound isn't part of the English phonemic inventory. It occurs, but that is almost always just a result of what is supposed to be a bare /t/ being followed by /s/ for grammatical reasons. (For an example of the general feeling here, note that an English word cannot start with /ts/ at all.) Why would we use any letter to represent the "ts" sound? We represent it the same way it exists in our language, as a sequence of two unrelated sounds.
> So in pinyin, zh is just like ch, but voiced.
Technically the only voiced consonants in pinyin are m / n / ng / l / r. I think a voicing contrast was present in Middle Chinese, and there's one today in Shanghainese and presumably other Wu dialects, but not in Mandarin.
> What are you thinking of? There is no difference between those things.
I'm talking about pinyin here. In Mandarin, there are to distinct sounds, one represented in pinyin by 'q', and one by 'ch'. It took me months to hear the difference, and months more to be able to pronounce them properly. I think there are other romanizations where the 'q' sound is represented "tch".
(In fact, I'm inclined to think that there are actually two different sounds in English as well; "witch" and "Charlie" don't feel the same in my mouth.)
> Technically the only voiced consonants in pinyin are m / n / ng / l / r.
I think we're using different definitions of "voiced". Other voiced / unvoiced pairs in English include g/k, b/p, v/f, z/s. See [1] for an "official" example of "voiced" being used the way I'm using it.
How else would you describe the difference between "qu" and "ju", or "chou" and "zhou"? The only difference I can feel is when your vocal cords turn on.
> In fact, I'm inclined to think that there are actually two different sounds in English as well; "witch" and "Charlie" don't feel the same in my mouth.
There aren't.
> I think there are other romanizations where the 'q' sound is represented "tch".
Well, maybe; there are a large number of romanizations of Mandarin. But there are no significant romanizations where that is true. It's q in pinyin, ch' in Wade-Giles, and ts' or k' in postal romanization.
> How else would you describe the difference between "qu" and "ju", or "chou" and "zhou"? The only difference I can feel is when your vocal cords turn on.
You could read my other comment in the thread. qu and chou are aspirated; ju and zhou aren't. Your vocal cords don't turn on at different points for those syllables. Mandarin Chinese doesn't use voicing contrasts.
> I think we're using different definitions of "voiced". Other voiced / unvoiced pairs in English include g/k, b/p, v/f, z/s. See [1] for an "official" example of "voiced" being used the way I'm using it.
Yes, I know what voicing is. You don't seem to know what consonants are used in Mandarin.
> qu and chou are aspirated; ju and zhou aren't. ...Compare [ref]
So the idea here is that chou and zhou are related in a similar way that the t's in "top" and "stop" are related: your mouth and vocal cords are doing the same thing, but in one case you have the puff of air and the other you don't.
At any rate, going back to the original question: the logic behind the choice is still consistent. On this classification, in Mandarin, p and t and ch are aspirated, and in English p and t and ch are voiceless; b and d and j and zh are unaspirated, and in English b and d and j and z are voiced. (And q is mainly thrown in to fill the gap, but its pronunciation in English is voiceless as well.)
Or, to explicitly quote from the ref you shared:
> Such pairs [of aspirated and unaspirated plosives and fricatives] are represented in the pinyin system mostly using letters which in Romance languages generally denote voiceless/voiced pairs (for example [p] and [b]).
Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction. In the former case unvoiced consonants often have aspirated allophones as in English, and in the latter case unaspirated consonants often have voiced allophones especially between vowels, as in Chinese or Korean. Hence why it makes sense to map the two in this manner - if your native language uses aspiration as the primary feature, and you hear someone who uses voicing, your brain will generally map it "automatically" for you, and their speech will sound weird but understandable.
(But then you get Hindi with a four-way distinction, both voiced/unvoiced and aspirated/unaspirated in all possible combinations.)
> Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction.
Yes, that makes sense -- I certainly learned something from this conversation. It makes sense that speakers would naturally tend to classify things along different lines, and in Chinese the aspirated / unaspirated classification makes sense.
That said, after having had some time to sit with the proposition that 'j' in the English name "Joe" is voiced, and the "zh" in Chinese word "zhou" is unvoiced, it continues to seem obviously false to me. It seems very much to me like mistaking of the map for the territory [1].
>> True aspirated voiced consonants, as opposed to murmured (breathy-voice) consonants such as the [bʱ], [dʱ], [ɡʱ] that are common among the languages of India, are extremely rare.
> Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction.
My understanding is that all of these options are fairly common:
- two-way contrast between aspirated and unaspirated
- two-way contrast between voiced and voiceless
- three-way contrast between voiceless aspirated, voiceless, and voiced
- three-way contrast for labial and alveolar stops; two-way contrast for velar stops
> They're spelled that way; I don't think they're supposed to be pronounced that way.
True, but most languages don't distinguish between [h] and [ɦ] to begin with, with one often the allophone of the other. So listening to Hindi it sounds like the same thing, more or less.
It's best not to think of Hanyu Pinyin as using "English characters" to pronounce Mandarin. It's just a mapping of the initial, medial, and final sounds onto the Latin alphabet in a consistent way, so that once you know the mapping, you know the pronunciation right away, and more practically, you can _type_ it right away.
I used to always think these romanization schemes were really bad, until I realized they were just not for me. The ease of sight-reading and getting the correct pronunciation for a random english speaker is not the goal. It's primarily for the convenience of users of other languages to have a systematic encoding. To make it pronunciation-friendly you would have to have to add a bunch of complexity to the mapping that would compromise its usage by the real audience.
In general, it's not transliteration into English characters, it's transliteration into the Latin alphabet. That means that transliteration tends to be shared across the various European languages that use the Latin alphabet. And given that the English were one of the last powers to actually engage in the naval trade war, they're less likely to be the basis of a major transliteration effort.
In the case of the q and x, I believe it comes from 500-year old Portuguese.
> That means that transliteration tends to be shared across the various European languages that use the Latin alphabet
Not just European languages. Pinyin is useful for everyone that has to interact with Chinese words, whether their first language is English, French, Swahili, or even Mandarin.
A lot of people might not realize that the primary users of Pinyin are Chinese people. The way typing Chinese works is that you type the pronunciation in Pinyin and then a box pops up with choices of characters from which you select the correct one. It's also used in dictionaries to give the pronunciation of unfamiliar characters.
Your first question, who thought of the system, has a straight answer. From Wikipedia:
> Hanyu Pinyin was designed by a group of mostly Chinese linguists, including Wang Li, Lu Zhiwei, Li Jinxi, Luo Changpei, as well as Zhou Youguang (1906–2017), an economist by trade, as part of a Chinese government project in the 1950s.
By the way, they are not “English” characters; they are Latin/Roman characters, and used in a huge number of languages with different spelling conventions. Pinyin was created for the entire world to use, not specifically English speakers.
How would you spell that sound in a way that is consistently recognized?
"zh" is actually one of the more reasonable pinyin digraphs because it follows the same pattern as "sh". If "s" + "h" results in [ʃ], then logically "z" + "h" should result in [ʒ].
"c" is used the way pinyin uses it in many languages (e.g. pretty much all Slavic ones that use the Latin alphabet, for starters).
"x" and "q" are more questionable, but there's precedent for either in languages using Latin-based alphabets - "x" can be [ʃ] in Spanish, for example, and "q" is [c͡ç] in Albanian.
> "zh" is actually one of the more reasonable pinyin digraphs because it follows the same pattern as "sh". If "s" + "h" results in [ʃ], then logically "z" + "h" should result in [ʒ].
Note that the sound [ʒ] is common in Mandarin, but its pinyin spelling is "r". "zh" isn't voiced and is affricated.
Wait til you get a load of Tamil/Malayalam transliterations’ use of “zh”. It was proposed by some German linguist to represent a really retroflex “r” and now makes outsiders pronounce kozhikode as “cozy-code” instead a closer “korikode”
Pinyin uses s in a very common way, z in the way of Italian, and c more or less in the manner of various Slavic languages. They are a sequence of related sounds: s is the fricative, z is affricated, and c is both affricated and aspirated.
Sh, zh, and ch are a sequence of sounds related to s, z, and c. Sh is a fricative articulated farther back in the mouth, zh is its affricated form, and ch is both affricated and aspirated.
And as a bonus, sh and ch match English usage, which isn't likely to have been a primary concern.
It's also worth noting that for many Chinese speakers, there is no difference between s/sh, z/zh, or c/ch.
(x, j, and q are what you get if you use the middle of your tongue, instead of the tip, to pronounce sh/zh/ch. They occur before front vowels; sh/zh/ch only appear before back (or central) vowels.)
A friend of mine remarked to me once that when she was in school, her teacher informed the class that English speakers would not understand what the pinyin letter "q" was supposed to mean, which I immediately confirmed. She thought this was hilarious.
Well that is a good point. For some reason I just assumed that pinyin was specific to english and that other languages used different transliteration schemes.
> Turning 'ti' to /ʃ/ is a fairly normal affricatization
It can't be an affrication, because /ʃ/ is not an affricate. (Although /tj/ is affricated, as /tʃ/ [think "gotcha"] - when you say 'ti', you're referring to words that were pronounced with /s/ rather than /t/.)
Wouldn't /sj/ -> /ʃ/ usually just be called "palatalization"?
(The specific phenomenon in the context of English appears to be called "yod-coalescence".)
As a native, "toward" is pronounced exactly like "to ward", but (usually) with the highly-unstressed vowel variant of "to". Remember that "w" is a semivowel, but it's not doing anything special here (at least in the vast majority of mainstream English dialects). In contexts where it is emphasized (or I suppose in more formal registers) it can strengthened to merely the normal lack of stress.
English might make more sense if someone actually sat down and wrote out the real stress rules, rather than trying to cram everything into just "unstressed" and "stressed" and only caring within a word.
=====
"To" might be one of the syllables with the most possible stress levels, with at least 4 and possible more. As I spell them,
1. "too" - full stress. Common for "two" and "too", but possible for "to" under rare circumstances.
2. "to" - less emphasized but still arguably stressed; still has the "proper" vowel. Usually this is as strong as "to" gets; "two" and "too" often fall down to this level if before a stressed syllable. Arguably this could be split into "stressed but near words with even more stress" and "unstressed but still enunciated" (which occurs even within a register).
3. "tah/tuh" - unstressed, the vowel mutates toward the schwa. Very common for "to", but forbidden in a few contexts. May be slightly merged into the previous syllable. Can we split this?
4. "t'" - very unstressed vowel has basically disappeared; may or may not remain a separate syllable from the one that follows (should that be split?).
The infinitive particle can't be 3 (normally 2, not sure if 1) if the following verb is implied (but not if the speech is cut off). At the start of the sentence it also can't be 3, and 1 is possible as seen below though 2 remains the default. Note that many common verbs act specially when before an infinitive particle; although sometimes treated as phrasal verbs it would be silly to treat them as taking a bare infinitive as their argument.
Adverbial particle "to" when the phrasal verb takes a direct object can be 2 or 3; this likely depends on the specific verb it's part of. Note that many people parse this as a preposition (taking a prepositional object), but this is technically incorrect (though there are some verbs where it really is unclear even when doing the rearrangement and translation/synonym tests).
Adverbial particle "to" when the phrasal verb does not have a direct object is usually 2 or even 1 (e.g. in the imperative). Some heretics have started calling this a preposition too (unfortunately, often in ESL contexts), but this should be avoided at all costs; they're just too cowardly to give particles the respect they deserve. Probably the only common example in modern English is "come to", but there are several others in jargon or archaic English.
Particle/preposition (the parsing is arguable) "to" used between numbers (range, ratio, exponentiation, time before the hour) tends to be 3, especially if one of the numbers is a "two". With variables it is slightly more likely to be 2.
Preposition "to" meaning "direction", or "contact", or "comparison/containment" tends to be 2, but can usually fall to 3 (less likely at the start of a sentence, and can also be prevented by what precedes it, e.g. "look to" can fall to 3 without much effort, but "looked to" strongly stays at 2). Contrast with "toward" of related meaning, which takes effort to get from 4 to 3.
Preposition "to" meaning "according to", "degree", or "target" (including but not limited to the explicit expression of an indirect object with most verbs, which we could argue should count as a particle instead. If you're wondering what verbs are excepted, one is "ask" - it can only use "of", as in "ask a question of him") is much more strongly 2, and requires significant effort to force it down to 3.
Adverb "to" is always 2 I think, but this is rare enough that I'm not sure.
=====
"To be or not to be", as famous as it is, has a pretty unusual stress pattern for most of its words: full stress on the first "to", semi-stress on the first "be", no stress (but still full length) on "or" (normal), full stress on "not", some stress on the second "to", and some stress on the second "be" (more than "to" but less than "not").
"engage", "engorge", "engrave", "engross", "engulf" are all fairly common words that are either often or exclusively pronounced that way (some dictionaries might show /in-g/, but /n/ is really /ŋ/ before g or k, even if they remain). Since these can take prefixes, this also proves we're not limited to being at the start of a word. Searching for words that can be spelled with with "ing" or "eng" finds a few more but nothing super interesting (though a few are in the middle of a word).
Obviously words where "g" is pronounced /dʒ/ (like "j" for those who can't read IPA) aren't subject to this.
In my local (dialect and) accent all of these words have a pretty clear initial /ɛ/ and not /ɪ/. (But also: /ɪ/ usually contrasts strongly with /i/ here, but the sound before /ŋ/ is almost a third in-between vowel.)
You might be right, but for what it’s worth I’ve literally never heard any of those words pronounced that way. I’ve only ever heard the word “English” start with the same sound as inside, while “engage” and your other examples start with the same sound as entertain.
While you're right, I feel like there's no safe argument to make here, because some group somewhere will pronounce some word in a certain way, so there can't really be a blanket rule.
You're correct on the reasons why "ghoti" cannot be pronounced like "fish," but what your explanation illustrates is that the mapping from English spelling to pronunciation is extremely nuanced - needlessly so.
A more direct phonetic writing system, like many other languages have, would make it much easier to learn how to read and write English.
ghoti is a ridiculous example. it takes its components entirely out of context. 'gh' as 'f' only occurs at the end of a syllable, 'ti' as 'sh' only exists as part of '-tion' where the pronunciation slurred over time. Pretending it says anything about the nature of the English language outside of English being a complex merging of various other languages that has evolved with time is silly.
As someone that enjoys reading I can't think of a more descriptive language than the English language...It's easily one of the most powerful languages on earth and has twice (!) the number of words in its vocabulary compared to something like French, which is heavily centralized by some managerial class. You just have to appreciate the language for what its strengths are (unbounded capability to communicate using just words) vs what you as a novice need to do to master it. Which, to be frank, is easier than mastering something like the Korean language that has all this drama and ceremony around politeness and speech levels.
> I doubt that this page will convince anyone that English spelling is a good system. There's too many oddities. [...] What I hope to have shown, however, is that beneath all the pitfalls, there's a rather clever and fairly regular mechanism at work, and one which still gets the vast majority of words pretty much correct. It's not to modern tastes, but by no means as broken as people think.
Which is to say, English spelling is definitely messed up. But it's not some insane thing that lacks any hint of sanity that some people try to portray it as.
This article feels to me as it was written in bad faith, trying and failing to prove a point, but then positing the point was proved.
The author happily start the article by submitting:
>The purpose of this page is to describe [...] the rules that tell you how to pronounce a written word correctly over 85% of the time.
but then they quietly show that with their whole page of rules, the reader will not actually pronounce 85% of the words correctly as they just claimed, but actually less than 60%. By arbitrarily deciding that a number of errors can be considered small, the author bumps the number of "correctly pronounced words" to 85%.
Are we talking about 85% of the whole language? No, just 5000 words. Even if they are the most frequent in the written language, they would still only account for around 95% of all the words.
The author position is:
- people complain about the English spelling all the time, saying it's horrible
- the English spelling is actually pretty systematic and this page will explain the rules to understand it
- when you will have mastered these rules, you will pronounce half of the words perfectly - for extremely common words such as "give", "get", "real", "very", "put", "half" you are still SOL
- the english spelling is not so horrible after all: as a perfect student you will only butcher more than 1 word every 10 spoken
To me, the author has proved the point he was trying to disprove.
(and in which rule do /ˈsɪŋɚ/ and /ˈfɪŋɡəɹ/ end up?)
All languages have inconsistencies, but it seems in vogue these days to single out english and use it as a punching bag. Furthermore, no natural human languages (i.e. not artificially constructed ones like esperanto) are logical. They all have irregularities and illogical aspects.
It is not "beyond fucked" that things have different pronunciations sometimes. Other languages have problems for people who solely learn by speaking. It's not unique to english.
Amongst niche circles of linguists, maybe. It's easy to single out for the average person because it's popular- english is a language learned around the entire world.
Besides, I'd rather have some more word pronunications than memorizing a table of der/das/die,dem/dem/der and a word's gender on top of learning the word itself. Or changing the position of a verb depending on if I used a modal verb or not.
Your examples are more or less regular though. English is a stress-based language, so it's expected that pronunciation might change when you add an extra syllable, if the stress moves (syllable -> syllabic is another example, btw).
> wind, rewind
This one is trivial, no? the "wind" in "rewind" is pronounced the same, with /aj/. The "wind" with /ɪ/ is unrelated.
Could you please share your list? I have this discussion a few times per year and I'd love to hand that list to people that think written English makes sense.
I was thinking of writing a blog article on it but I don't think I'd need to anymore!
Most English words are regular, and most commonly used ones too. "the", "be", "are", "why", "can", "might", "life", etc. are all perfectly regular if you understand how to read english orthography (which uses character clusters and can't be read a letter at a time).
Infinite/finite regularly related, too - the reason the pronunciation of the finite cluster changes is due to stress differences (initial in- always takes the stress, and then the following syllable must be destressed). Note that the long vowel at the end comes back in the 4 syllable "infinitum", again due to regular stress rules.
Yep! Not only that but people will actively mispronounce words as a form of vetting. Mispronunciations also becoming a form of tribal identity. Speaking of American vs proper English. America is the most diverse cultural landscape in human history. If you stay put, you won’t see it. Start traveling around the country and its the only thing you see.
this is not hyperbole. Sure other places are diverse, however because of the unique nature of the US and its size it just ends up attracting and subsequently absorbing.
America is diverse in some ways, but in terms of language and dialects (which is what we're discussing here), America is remarkably homogeneous. There are many tiny countries with more linguistic diversity than the US.
Specifically to english and dialects, you are correct. England proper has a different dialect and accent for every nook! London for literal neighborhoods! It also has several hundred if not thousand of years on the US for language to develop. Africa has everyone beat on this front. Bantu alone has who knows how many sublanguages! America has done a pretty remarkable thing in keeping its language internally consistent despite it’s overwhelming cultural diversity and influences! That it sucks to learn for the uninitiated is exactly for this reason.
I thought the article did a good job of explaining how English uses additional letters where French use accents, like the "h" in "ship" to indicate how the s is pronounced.
Yes, and how do those entirely true observations connect to the non-use of diacritics in English?
I pointed out the ship example from the text, which was used to demonstrate how "this early French influence over English, which arose from the Norman Conquest, is the beginning of the reason why English is written without accent marks. ... This was the French habit that the Normans brought to England: the use of extra letters to spell sounds that the alphabet didn’t have special letters for. This is why English has combinations like sh, th, ee, oo, ou that each make only a single sound."
That's an extra letter being used to indicate a different sound than the base sound, similar to how diacritics are used to indicate a different sound than the base sound ("the cedilla has the function of ensuring that a c can be pronounced like an s, despite coming before an a, o, or, u").
> "is the beginning of the reason why English is written without accent marks"
> sh, th, ee, oo, ou
That's cool 'n all, but I believe that only applies to French writing in English for English people.
Many languages have combinations of letters that have a single sound, it's no excuse for not having accents.
In German one can write strasse and straße or müller and mueller (different writing, same sound). They too don't have accents, but words written differently also sound different: schon = "already" and schön = "beautiful".
But German, on one hand retained diacritic marks, on the other it's also almost deterministic about pronunciation.
a it's always /a/
ä it's always /ɛ/ or /ə/ like e
sch it's always /ʃ/ as in schule
ch it's always /x/ after a, o, u and /ç/ after e, i
and so on
English doesn't use diacritics, IMO, because English doesn't make sense, it's a pastiche of lowest common denominators, so fck diacritics, they are too hard, let's write words as we like and pronounce them the way we feel they should sound, regardless of how they are written.
But it could use accents, for example rècord and recòrd, present and presènt, pérmit and permìt it's just they never thought it could be useful...
> Many languages have combinations of letters that have a single sound, it's no excuse for not having accents.
You don't need an "excuse" for not having accents. Digraphs and diacritical marks are simply two different ways to mark a letter as being pronounced as "somewhat similar but different". Whether one is better than the other is a matter of subjective perception, and it's very common for languages to not do it consistently. For example, Spanish has "ll" but also "ñ" (ironically the latter used to be "nn"!), and Czech has "č" but also "ch".
What's criminal about English is not the lack of diacritics, but rather the extremely convoluted and hard to predict rules for interpreting digraphs and trigraphs. If "ch" always meant the same thing, it would be just fine.
> but I believe that only applies to French writing in English for English people
Shrug. Yes, languages have different paths in their linguist and lexicographic evolution. Film at 11.
I still like what this linguistics PhD wrote about the specific history of one aspect of English language evolution.
> English doesn't make sense
That is of course an exaggeration. Just because the rules are complex and full of exceptions doesn't mean there's no sense. Even if you reject all of linguistics, Shannon in “Prediction and entropy of printed English”, demonstrated that English is compressible, which means there must be some patterns.
You're not wrong, except the technological reason. As I understand it, English lost a lot of characters when the movable type printing press was created.
Your link says ſ (the long s) didn't disappear (from English) until several hundred years after the movable type printing press and makes no mention of physical problems when using that letter, suggesting instead removal gave a type a more modern feel:
> Pioneer of type design John Bell (1746–1831), who started the British Letter Foundry in 1788, is often "credited with the demise of the long s".[12] Paul W. Nash concluded that the change mostly happened very fast in 1800, and believes that this was triggered by the Seditious Societies Act. To discourage subversive publications, this required printing to name the identity of the printer, and so in Nash's view gave printers an incentive to make their work look more modern.
Yes, this was explicitly called out in the ASCII standard, and is the reason ASCII has ~ (in place of the proposed ‾) and ‘^’ (which replaced the ‘↑’ in the original 1963 version).
Interesting! The z80 card in my family’s Apple 2 would render “^” as “↑” and I always wondered the connection. I guess they were using the original spec.
This comes from typewriters. Curiously, the reason why Esperanto uses Ĉ, Ĝ, Ĥ, Ĵ, and Ŝ is because the circumflex was present on French typewriters (which were very common in Europe at the time). Even though French itself only uses it for Â, Ê, Û - since it was a distinct key used for overtyping, it could be repurposed in this manner, just like Unicode combining marks today.
The Economist magazine uses a diæresis (two dots) in words like “coöperate” and “reëlect” to indicate that both vowels are pronounced separately, rather than as a diphthong. This is considered old-school and uncommon though.
That is the fun thing about English. There isn't really a single right way to speak or write it. It is defined by common usage. As long as your audience understands you, it is correct.
As someone else pointed out, loan words often have accents. At what point does jalapeño become en english word? There is no other english word to refer to the pepper, therefore it is now an english word and therefore english words can have diacritics.
The closest thing we have to a source of truth for the english language is the OED. It isn't prescriptive, it just lists how words are used rather than how words should be used.
> That is the fun thing about English. There isn't really a single right way to speak or write it. It is defined by common usage. As long as your audience understands you, it is correct.
That's how all languages work - to the chagrin of l'Académie Française - English is no special exception.
Learning the relationship between a diæresis and a diphthong and then seeing that the word diæresis contains a diphthong has rounded out my day nicely, thanks for that.
I enjoyed learning recently that the most common diacritics in Czech are the háček and the čárka. The word "háček" has a čárka followed by a háček, while the word "čárka" has a háček followed by by a čárka!
A "calque" is a word that's been brought from one language into another by translating the individual parts. A "loanword" is a word that's been brought over by just taking the word with little modification.
For example, "calque" is a loanword, while "loanword" (from German "Lehnwort") is a calque.
Similarly, a grave accent is sometimes used in poetry to indicate that a single vowel is voiced - e.g. in "cursèd" to indicate that the word should be pronounced as two syllables "curse-ed", rather than a single syllable "curst".
Loanwords often retain their accents as well: cliché, façade, doppelgänger, jalapeño.
I’ve always seen it written with an acute accent: ‘curséd.’ Wikipedia notes both usages, but to my knowledge I have never once read a poem which used a grave accent that way.
Winged and legged are still pronounced like that too, at least by some.
Interestingly, as an addition to the parent comment, there's a certain point in time where a lot of -ed words are often spelt -'d, which presumably is from the transitionary period between the expectation that the -ed was pronounced and today's general pronunciation.
The Economist uses diacritics in French, German, Italian, Portuguese and Spanish words, but deletes the diacritics from other languages (or maybe they keep them when they happen to resemble diacritics from those languages). I think I once saw a letter from a Hungarian complaining about that: a word they'd used meant something silly or obscene after they'd removed the diacritic.
>Similarly, Chinese and Korean names are usually written in the order they are pronounced, while Japanese names are reversed.
As a Chinese speaker this is maddeningly confusing when reading Western media. It's also a fairly new trend, I want to say a decade ago Chinese and Korean names were also read in Western order.
That seems like a quirk of the magazine for thsie pstticular words, but its more common for some others like "naïve" and "Zoë", although that's gone out of fashion somewhat since computers took over (and I believe both of those are loan words in english)
I love this, because I always do a double take and start pronouncing it as coOUUUperate and REEEElect, giving me much entertainment (I am easily entertained!).
Oh interesting, I've never seen those cases. I'd say it's more common (although maybe still a little old-school?) to use it in words like Noël or Chloë.
The difference is whether the sequence of vowels crosses the morpheme boundary or not. When it does, as in "cooperate", it's usually readily obvious to native speakers even when seeing the word for the first time, thus they don't need a mark to disambiguate.
I speak differently than my brothers because I grew up at my grandparents 3 MILES! away and if I go to my family restaurant 2 MILES the other direction there is a different accent again, and I mean different words too not just the sound. Where I used to go to school 10 miles away they don't understand if I speak my dialect because it's a different region.
The whole Italy is like that, a different dialect every 2-3 miles, every family, town, city, province, county and region has different accents and ways to make food and recipes. My town is 3200 years old, older than the Romans, they used to fight, then ally then fight again with them etc., this dialect thing is very old, cultures, traditions and families.
Of course we have the Italian language in common and the main dialects are separated by the main city of the region then by the region itself but yep, that's how it is.
Having so many different dialects (and full minor languages!) saying the same word slightly differently, Italians were forced to find (and use) a way to put the correct accent in writing.
Other languages probably don't have the mind boggling number of dialects Italy has. GP was not exaggerating, it really changes every few kilometers.
Like the article says: "situations like these are surprisingly few in English"
well, if you ignore the current country borders then "German" would encompass a large portion of Switzerland and the Netherlands. So, with that assumption, I would be surprised if Italian had more dialects than German.
My cofounder's wife, during a parents together at school, was "advised" by some of the mothers to not "hang around those" mothers because they're stranger folk. Turns out, they lived 1.5 miles away in the next village.
Well, that's because they're really languages and not dialects! They all derive from Latin, there is no "old Italian" or anything, at some point we decided the Florentine "dialect", having the most literary prestige, would be standard Italian.
Italians only really started speaking Italian in their day-to-day life after the war. It was mostly a written/literary language before that.
Yes, surprisingly few Italian dialects are actually Italian derivatives (maybe only a couple?)
But there are differences between a dialect and a language, we can't say all of those are languages even if most come from Latin.
Italian wikipedia says that officially in Italy there are about 13 recognized languages (not counting Italian, plus French and Slovenian in some parts), and about a dozen main dialects.
In wikipedia you will notice 3 big dialect groups that are just that, groups of many, many dialects that do not qualify as languages.
It's more a difference of how recognized by the community those are, and how unified by grammar, locality and uniqueness. Kind of a gray area for many.
> But there are differences between a dialect and a language, we can't say all of those are languages even if most come from Latin.
That's not really true. There's no scientific reason to say that some varieties are "dialects" and some are "languages". It is purely a political and culture question.
> Well, that's because they're really languages and not dialects!
Indeed they are not strictly dialects of Italian, which followed its own evolution alongside them. I think most of them could still be explained as dialects of Latin, who underwent major "niche differentiation" in the immediate aftermath of the fall of Rome and the rise of barbaric kingdoms.
> [Italian] was mostly a written/literary language before that.
This is a bit of an exaggeration. Clearly, even before the early modern era "Italians" could understand each other. Dante (from Florence) lived in Genoa and Ravenna, and had no need for an interpreter from what we can gather. Ditto the many "Renaissance men" who toured around Italy (Leonardo: Florence->Milan; Raphael and Michelangelo: Florence->Rome; Galileo:Pisa->Padua). This level of interconnection becomes really hard to explain without a high degree of mutual intelligibility.
Dante is a poor example for language proficiency, as he was educated / traveled/ well read. The common person would have a much different lived experience
I have colleagues in India. It's a diverse mesh of regions that vary in about every way. Was explained people grow up with 3 languages, their regional language, a neighboring region's language, a more general language, & then educated folk are taught English. Then in school they were still taking classes for other romantic languages. At an Indian restaurant with one colleague I noticed they would mostly rely on hand gestures. One factor here is that there may often be a language barrier
I've also interacted a bit with Senegalese, which has Wolof as the primary language, then French taught in schools. Many only know Wolof (with French influence weaved in). & the well educated learn to speak English, & how to maintain more European French accent
England has small accent shifts every 25 mins (the other audible accent / http://news.bbc.co.uk/1/hi/business/7843058.stm) - the situation you describe is two communication orders more complicated than that!
Closer than that in some places. I'm from Sunderland, which is contiguous with Gateshead, and then Newcastle. I can clearly hear when someone is from Sunderland vs. Newcastle, although 'a foreigner' - say, someone from London - might not be able to pick it.
I dare say Liverpudlians and Mancunians and Glaswegians and so on would make the same claim.
It doesn't compare to that coolness you just shared, but I'm from Long Island (right outside New York City) and I and everyone from my childhood town can differentiate a Long Island accent from a New Jersey accent (very similar but subtly different; a suburb on the other side of NYC) from a Queens accent (a type of NY accent from a NY neighborhood, whose most famous exemplar is The Nanny) from a Brooklyn accent (another type of NY accent, the Mel Brooks sort and how my dad speaks), etc etc. So, while, the US is nothing like Italy where every 3 miles there's a different language-or-dialect, the US accent isn't nearly as uniform as one might think, for even within cities and their suburbs, like my hometown in the above example, there is a comparable dynamic, where going not-that-far (these neighborhoods and suburbs aren't far from each other) people speak in accents that are notably different to locals, although surely people not from NY group it all together as "the NY accent" without differentiating the level-of-nasal-ness and other such contributing factors to the accent.
Sadly those Brooklyn and Queens accents are becoming rare in large parts of Brooklyn and Queens. You really have to go out to areas with few transplants (Long Island, Staten Island, or rapidly shrinking white working class parts of Bk/Queens) to hear the typical NYC-area accents being used as the main variety of the majority of the community.
I grew up in the province of Friesland [0], which is part of the Frisia cultural region, an area that was not occupied by the Romans back when so it retained some of its identity and culture - although a lot of that was erased by Christian missionaries and subsequent invasions and government takeovers etc etc etc.
Anyway, super local accent changes are a thing there as well, go north a few kilometers from where I grew up and you go from the "woods" to the "clay", which has its own intonation and possibly words. Then there were town specific stereotypes - people from this town will knife you, that town is full of inbreds, etc. That's probably a lot of made-up intentional drama though, lol.
Similarly in Norway and Sweden, new dialects every few miles, with both pronounciation and word changes. Places that could reach each other by boat tend to have more similar dialects (while if there's a mountain in the way you can have a bigger difference, though flight distance is shorter)
Interesting. I know that as a spanish speaker, there are some Italians whom I understand almost perfectly (like 90% and I can fill in the other 10% from context), but there are other Italians speakers where I can't understand anything at all.
When I was doing a bunch of learning about linguistics, situations like this were very interesting and confusing to me. I still don't have a good working intuition for how this is possible. I don't understand what maintains the sound differences in the face of the continuous exposure to substantially different accents. It's empirically possible, but it's never made sense to me. Why don't you and your brothers end up talking the same after a while?
I mean, people do end up talking the same after a while. Regional differences are disappearing and being leveled all over the world due to the influence of centralized education systems and media.
Same in some parts of Germany. In the area where I grew up in you can tell in which village a person is from just by the way they talk, and the villages are just ~3 km apart!
From what I know this is because it was a relatively remote, dangerous and poor region (all by the standards of hundred years back) which changed ownership a lot (between clergy, bavaria, prussia) and people were mostly left to themselves
You think that's bad, visit your friends to the East in Slovenia. You'd think they're doing it on purpose! How do so few people in such a small area make so many variations in the "same" language?
Generally speaking, countries that have a lot of different ethnic groups and/or introduced universal education relatively late tend to be those with more diverse dialects. Think about it: in a world without newspapers and TV, where most people live their entire lives in the same village they have been born in, and relatively few travelers, any linguistic innovation that appears in one place is going to take a very long time to travel elsewhere. Thus, local dialects tend to diverge. Universal school education slows this down by introducing a standard literary language (and, historically, often in a very forcible way). Mass media, TV especially, leads to further homogenization.
In 6th grade, so back in 1982, I read the French SF novel "Malevil".
I was astounded (speaking as a US kid here), to learn that French people born and raised in France didn't natively speak French, but instead learned their regional language.
> And besides, Thomas was already quite isolated enough as it was: by his youth, by his city origins, by his cast of thought, by his character, and by his ignorance of our patois. I had to ask La Menou and Peyssou not to overdo the use of their first language — since neither of them had learned much French till they went to school — because at mealtimes, if they began a conversation in patois, then everyone else, little by little, would begin to drop into patois too, and after a while Thomas was made to feel a stranger in our life.
Two minutes ago I learned that "patois" has a distinct meaning in France: "patois refers to any sociolect associated with uneducated rural classes, in contrast with the dominant prestige language (Standard French)" https://en.wikipedia.org/wiki/Patois
I am very ill-informed on the history of the topic, including the national language policies of France and Italy. I do know that Sardinian is not a dialect of Italian, but my knowledge isn't much deeper than that. ;)
IIRC in the early 1900s, coercive methods were used to stop children speaking their native regional languages, a lot of it in school.
In my region of Brittany (France) the most famous example that was on posters detailing good manners would say : "Il est interdit de parler breton et de cracher par terre" meaning "It's forbidden to speak Breton and to spit on the ground", placing both on the same level.
Stamping out minority languages and dialects was (and often still is) unfortunately common in most countries. I'm Russian, and my native regional dialect has some minor differences from standard Russian that make it sound a bit more like Belarusian. I remember how in school we had a teacher making fun of our manner of pronouncing words as "kolkhoznik speech" (implying that only the uneducated speak like that). This was in 1990s.
> I was astounded (speaking as a US kid here), to learn that French people born and raised in France didn't natively speak French, but instead learned their regional language.
As a French person born before 1982, I find this sentence questionable.
If you mean "there were some people who learned a local dialect", then sure, you could dig some up.
If you mean "many regions had dialects that were learned before French", then I believe you misunderstood (or were misled).
Finding anyone who even spoke a regional dialect would've been a novelty, let alone one who grew up speaking it before French.
I mean "there were some people", not all people - Thomas, in the quote, came from Paris and spoke French. He did not learn a regional language.
I don't mean 'many regions' because the only example I had was one region. The fact that there was at least one region where local French people, in a region which had been part of France seemingly since at least the Middle Ages, did not speak French as their mother tongue, astonished me.
FWIW, the French Wikipedia page says:
> Ainsi Malevil serait partiellement inspiré du site de Commarques (sa grotte, son abri troglodyte et son château)[2], tandis que le village de la Roque serait partiellement inspiré de la Roque Saint-Christophe, forteresse troglodyte voisine du château de Commarque. ...
and the location,
> La vallée des Rhunes : inspirée de la vallée des Beunes, et plus précisément la grande Beune.
so the author's fictional location was supposed to suggest the department of Dordogne in south west France.
> Limousin ... is a dialect of the Occitan language, spoken in the three departments of Limousin, parts of Charente and the Dordogne in the southwest of France. ... Limousin is used primarily by people over age 50 in rural communities. All speakers speak French as a first or second language. Due to the French single language policy, it is not recognised by the government and therefore considered endangered by the linguistic community.
Those people over age of 50 would likely have been children in a book written 53 years ago, with Limousin as a much more common language amongst the local adults.
"Over 50 in rural communities" in one of the more sparsely populated areas of France makes for a very small slice of the population even in that area, and even then, as pointed out, French is spoken by everyone.
On top of that, it is more "anyone who speaks limousin is likely over 50, and in a rural community", than "anyone over 50 in rural communities in that area likely speaks Limousin".
There are 10k speakers of Limousin today (according to Wikipedia), out of about 1.2M residents in Dordogne and Limousin combined. That's less than 1% just for that area.
To me, it is more of a local curiosity than a mind-blowing fact, but I suppose I grew up learning about the various dialects in France, so I have a different take.
> and even then, as pointed out, French is spoken by everyone.
Yes, as even the Malevil quote I gave pointed out. (At least by school age.)
> On top of that, it is more
The book was written over 50 years ago, so the Wikipedia article about present day use of Limousin isn't all that indicative of what it was like for the adult characters in the book, who would have been born before 1950.
> There are 10k speakers of Limousin
Why are you being so nit-picky? Look, this is a fictional place and the specific local language is never stated. I just today read the Wikipedia entry which give info about the location.
I specifically picked out Limousin, yes, because it fit the area, and because I could quote how the Limousin language was more widely spoken when the book was written than now.
But as the text I quoted says "Limousin ... is a dialect of the Occitan language". Wikipedia says there are about 200,000 speakers of Occitan, so that's the more relevant comparison, and "Though it was still an everyday language for most of the rural population of southern France well into the 20th century, the language is now declining in every region where it was spoken." - https://en.wikipedia.org/wiki/Occitan_language
It seems to me that when Malevil was written, Occitan was still widely spoken as a first language in the area. Wikipedia says the author was living in the area when he wrote the book, so he should know.
The only reason I mentioned it was because you wrote "Finding anyone who even spoke a regional dialect would've been a novelty, let alone one who grew up speaking it before French." while the book, written by the French novelist Robert Merle - Wikipedia informs me he was "a household name in France, with the author repeatedly called the Alexandre Dumas of the 20th century" - comes across that speaking in patois was not a novelty but simply something expected, and which effectively all locals spoke.
I simply cannot reconcile your surprise with my reading and limited understanding except by assuming it's from before your time, from a mostly forgotten era.
> I suppose I grew up learning about the various dialects in France
That's .. kinda the issue, isn't? In Malevil the local language patois is not seen as a dialect of French, as I quoted, it was a language learned in school.
Wikipedia says it's more related to Catalan than French.
Did I? I mentioned dialects in France, not of French, IIRC.
I'm nitpicking because, TBH, I quite likely just read too much into your use of "astounded" in your original comment. It seemed to me that you were overestimating how prevalent or significant these languages were.
By the mid-20th century, they were already quite less popular and even less so by the time Malevil took place (1977, I take it, even though it was written a few years earlier), especially when it came to being taught before French.
At the same time, I guess I was maybe as surprised to learn that Louisiana French is still a thing as you were about these areas in France. :)
When is something a dialect in France and when is it a language in France?
> It seemed to me that you were overestimating how prevalent or significant these languages were.
I said I was in sixth grade, a kid living in the US.
I didn't even know then there was more than one Romance language in Italy - as I alluded to in my original comment.
Yes, I now, decades later, know more. But I was sharing my childhood misapprehension and how I learned the world was more complicated than 11 year old me thought as something meant for others to smile at and enjoy, not to be nitpicked as if my comment was any profound statement about all of France.
My interpretation was not "questionable" - the story clearly was supposed to take place in a part of France where many of those in the countryside still learned a Romance language other than French as their mother tongue. That matches the real history for that supposed area that the author drew from. Yes, it's certainly something that's a lot less common now, some 50 years later. But then just say that things have changed.
it remains true to this day. gascon[0] is still spoken in south of france, by both young and old. i know because i've heard it spoken. the idea that the french speak french, italians italian, is very modern. european nations weren't as properly integrated as modern history will have us believe. iirc the integration sped up post-ww2. cf seeing like a state[1].
> mean different words too not just the sound. Where I used to go to school 10 miles away they don't understand if I speak my dialect because it's a different region.
Oh geez, for example in Italian to say here you say "qui", where I grew up I say "mchi" but my brothers say "mqui" or "mque", where I used to go to school they say "meque" with the weirdest sound.
To say what are you doing in Italian is "cosa fai" but I say "co fei" and my brothers "sa fei" and where I used to go to school they say "che fe".
These are just simple simple things but almost everything changes here and there and I can't put the sound with the words here, they actually sound different, and change where the actual accents are.
I grew up in southern Switzerland and the dialect situation is the same as you describe.
Not necessarily every town retained their distinctive dialect in practice because people move, not all parents pass the dialect down to their kids etc.
But I remember a friend of mine lived in this village of 40 inhabitants where they said "e peu que?" instead of italian "e poi cosa?"
I have relatives in Bari so I've been fascinated by Barese. My Italian is not good but I can passively pick it up when listening or watching television, but Barese sounds 100% like a completely different language to me. French and Spanish are more intelligible.
Funny also I moved to USA ~20 years ago and you lose the Italian, you don't remember words etc. but you'll never lose your dialect, it just comes natural because that's how you grew up instead of what you learned growing up and from school, Tuscan people have it easier because the language comes from their dialect, Dante etc.
And to add, I wouldn't click that link if you paid me lol, I hate the Barese... ok I clicked, funny stuff.
not obvious at all when every sentence uses "you" to indicate a general rule that applies to every Italian rather than "I" to indicate a personal experience
> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that"
Personally, the parent comment added a lot more, even inadvertantly, than one complaining about whether someone has or has not read the article.
The Godwin vignette at the beginning is such a clever way to dramatize what would otherwise be a dry spelling shift. Also, I never realized the irony that English avoids diacritics because of French influence
According to the article, Norman influence led to double letters being used to better mark out sounds, which achieves the same as diacritics. It made English mostly good enough (failures like 'lead' are rare). Being good enough, and lacking a strong central authority, the language only accepted a conservative standardisation, and avoided larger changes such as including diacritics. Without these Norman changes, there is more chance diacritics would have been added, as it would not have been 'good enough'.
Written English is a worse is better story. The Norman influenced version being the first-mover that users cling to even when better comes along.
Well, the "lead issue" could be fixed by writing the verb "leed" (after all, it's exactly the same sound as in the word "queen" mentioned in the article), but for some reason this hasn't happened...
It happened in newspaper jargon for the leading sentence of an article (though they used the spelling "lede" instead), because "lead" was already a metonym for hot type (which was cast out of the metal).
It hasn't happened otherwise presumably because the risk of confusion is normally very low when not in a Pb-filled context.
Diacritics wouldn't have helped moderns if they were in from the beginning - most of the confusing words used to be pronounced like they are spelled (at least to people of the time). Maybe they would have helped to petrify pronunciations and slowed or stopped linguistic drift but I somewhat doubt that given historical literacy rates.
> to dramatize what would otherwise be a dry spelling shift
I don't think that's how it was developed, though. I really doubt there are real-world cases where cwen was scrubbed and queen written above it (correct me if I'm wrong!).
I think it’s more like “people stopped writing English for time being, only learned to write Norman and Latin, so when they needed to write a word or two, they’s use the spelling they knew. Eventually, this spelling because the way of writing English”.
I don’t think a situation with Godwin is plausible.
> I never realized the irony that English avoids
> diacritics because of French influence
I'm not sure that's the best way to put it. Old English also generally didn’t use diacritics (modern texts add them: we’d use cwēn instead of cƿen, but these are modern invention).
So, English didn't use diacritics before Normans, and Normans didn't change this.
Do the tittles on `i` and `j` count as diacritics? In English those vowel symbols never appear without their tittles. (In contrast, the related vowel symbol `y`, which is like an `ij` combo and is named "Greek I" in French, never appears with tittles!) In some sense, the glyphs are idiomatically atomic with the diacritics permanently stuck to them.
I don't know what a headless `i` or `j` would sound like in English, since they aren't used. So it's not really a verifiable claim that the tittles don't alter the sound.
For the record, Wikipedia states they are diacritics. I'm leaning toward agreeing due to this observation: "In most Latin-based orthographies, the lowercase letter i conventionally has its dot replaced when a diacritical mark atop the letter, such as a tilde or caron, is placed." Ex: aeiouyj, âêîôûyj, äëïöüÿj, áéíóúyj́ (oops, my example shows an accent above the J's tittle–down one, but three up)
You don't even use them consistently in the same sentence (your unaccented i has at least 3 different values, for instance).
The real reason English spelling is frozen in the 1600s is that that is the last time all English speakers had a common language community. Since the foundation of the colonies, Englishes have diverged from each other from that starting point, so that no reform can be neutral to all current Englishes - some have merged what was distinct in early modern English (e.g. cot-caught merger); while in other cases what was a single class has been split (e.g. the bath-trap split). Wikipedia has a (non-comprehensive) table: https://en.wikipedia.org/wiki/Sound_correspondences_between_... note for example that even where two varieties have merged phonemes, they might have merged them differently (compare Southern American to Australian). You might try to come up with a spelling system that covers all possible combinations, but it would be then very hard for the speakers who have mergers (i.e. all of them) to use - how is an Australian supposed to know which äː vowels are æ in American and which are ɑ? How are the Americans supposed to know which ɑ's are äː vs ɒ in Australia? etc. etc.
It's very important for English to have 9000 vowels so we can tell where you're from within about a five mile radius, no matter how hard you try to hide it.
If you mess up every vowel in an English sentence, everybody can understand every word, but it makes everybody a little upset and a little aggressive. If you want to play it safe, just make every vowel a schwa and people will think you're from New Zealand.
You'd be surprised! I actually see a fair amount of haceks hereabouts in PNW because they are used in the orthography of the local Salish Native American languages, so you end up with road signs like these: https://www.charkoosta.com/news/salish-language-stop-signs-e...
You're trying to create set of rules for something that's evolved from strong oral to written to emphasis on oral again. It's organic and used in coordination with many other countries and their languages. If you understood that many of our rules are defined within specific instances, by specific needs (publishers example), and are somewhat arbitrary, you'd be amazed we have any consensus at all. My theory is that publishers, broadcasters (Like the BBC) and educational institutions are really where standardization has been enforced. Outside of that English and language is as flexible as a sender and receiver of a communication will allow.
It does sometimes, though its use may mark the author as among the agèd.
Not to mention loanwords, which of course English is full of, and are sometimes considered properly spelt with their original accents, though many will spell them naïvely without.
Diphthongs too, especially in British English, are not just an archæological find, though out of pragmatism usually written digitally with two separate characters.
On the internet the most marked issue is the difference between British English spellings (England, Canada, Australia, New Zealand) and the USA. It is frustrating that on most spell checked text boxes words like: harbour, labour, actualise, etc shown as misspelt.
I find it most irksome that the Australian Labor Party has chosen the USA spelling in spite of being part of the Commonwealth.
The great thing about using Singapore as my locale is that it accepts pretty much any English spelling you throw at it, British or American. You see quite a mix of both on signs and documents here, too.
> As a result of these circumstances, things like spelling practices varied from one place to another, and one scribe to another. The same word could even be written on the same page in multiple ways.
I believe we can all still be confident scribes and maybe even have our own preferred way of writing words, where we within reason push the boundaries or push our own viewpoints through self expression :D
No it isn't. -t instead of -ed in general for many words is dialectal for one thing, more commonly retained (a Saxonism) in the West Country than elsewhere. Misspelled in particular though is distinctly American, everywhere in Britain uses misspelt.
(Ironically, I'm not sure if deliberately ironically, you 'mispelt' both, fwiw.)
> though its use may mark the author as among the agèd
Thirtysomething here. I use diaeresis (a/k/a diæresis) over e.g. coöperate. It’s more concise than a hyphen. And it makes more sense than cooperate, given cooper is a word.
English still sometimes (albeit very rarely) uses one type of diacritic. The diaeresis is in occasional use. Now days it is mainly used in the word "naïve," but it will be familiar to readers of the New Yorker on words like "coöperate."
The diaeresis is used disambiguate when a pair of vowels make two separate vowel-sounds, instead of one. For instance, if you didn't know better, you would think that the words "naive" and "nave" (said "nāv", the congregational of a traditional church) were homophones. But the diaeresis shows you that the "a" and "i" are said independently (nah-ēv).
Of course, English also uses diacritics occasionally in some borrowed words: résumé, née, fiancée/fiancé. But these are also considered optional.
>>
The use of diacritics arises out of a mismatch between an alphabet and the language it’s being used to write: if an alphabet were well adapted to a language, it would have letters for all the language’s sounds.
<<
and then it use 'ç' as an example even though French has 's' for the same sound, amusing.
To summarize the article, when a language has a single creator, (in this case, the person who runs the first major printing press in France,) that person has immense power to make significant changes to the language when needed. On the other hand if the language has multiple collaborators each with some influence to make changes to the language, such changes tend to be much more conservative ones.
A point that is always good to think about, a population defines its language and a language defines its population, it's a symbiotic relationship. The language you speak, will shape how you perceive/interact/understand the world itself.
We do, of course, use accents and other diacritics. It's not as common as in other languages, but most people will come across a few each day. The popular argument here is that many are French, with the accents optional, yet soupçon and exposé are rarely written naked. If you want non-French, pick up the New Yorker and you will find coöperative and reëlect, or a poem to find changèd and learnèd. We use them in names, from Brontë to Beyoncé.
Between languages, even the letters have different uses. Diacritics can be used to signal a different sound or the tonicity of the word (at least in the languages I know those are the two uses).
I don't understand what this thread is all about. English doesn't need accents because there's no universal meaning attached to each one? That doesn't make sense.
Do you have any examples? As a Finnish speaker the Swedish "a" sounds the same. "Pappa", "framtiden" etc.
It's "ä" and "e" which have swapped uses, but it's not exactly consistent (e.g. "Järnvägstorget" where first ä is close to the Finnish ä, second ä is closer to e but so is the e at the end)
Many native english speaker here like to fantasize on the superiority of other cultures / languages but what good are diacritics for when there are still a shitload of letters that have no diacritics and can be pronounced in different ways?
For example let's take french... A cat is a "chat" but you don't pronounced the 't'. Oh but in "chatte" (pussycat or pussy), you pronounce the t's. While in other words in french you pronounced the 't', like in "table" (yup, it means a table btw).
Speaking of which, the 'e' in "le chat" isn't pronounced the same as the mostly (but not entirely) silent 'e' in "table".
No diacritics on these 'e' here and yet they've got different pronunciations.
Don't come and say: "but that's only with silent letters". Definitely not. "elle" (she) and "le" (the)... Different pronunciation for these three e's.
I've got better: "les fils" (the sons) vs "les fils" (the cables). Exact same spelling. But in one you pronounce the 's', in the other you don't.
Wait, even better: "le fils" (the son) vs "les fils" (the sons). Same pronunciation for "fils", no plural or singular: just one word with a 's' at the end.
Stop romanticizing about french: it probably has more exceptions and weirdness than english.
And you probably don't want to get me started on the average reading and writing skills in elementary and secondary schools in France. It's in freefall so the whole point is kinda moot: the digital natives can't use diacritics properly in french. Heck, many can't even (and don't want to) speak proper french. The language is becoming simpler and simpler, dumber and dumber.
> Stop romanticizing about french: it probably has more exceptions and weirdness than english.
As a non-native to both French and English who was taught both languages at school, there is a difference that french pronunciation rules were taught from the beginning, while english pronunciation was taught just as IPA transcription of dictionary words.
The problem with French is that pronounciation changed but not ortography. It's easy to see that you did pronounce chaT in the past. Other languages periodically review their ortography. My language had that twice in my lifetime.
>Many native english speaker here like to fantasize on the superiority of other cultures / languages
Some languages really are a lot better than English as far as mapping between spellings and pronunciations. French just isn't one of them; as you pointed out, it's possibly even worse.
I point to German as the superior European language in this regard. I learned some in high school. I can't speak conversationally any more, but I know the pronunciation rules, so if I can read it, I can say it and pronounce it well enough for a German speaker to understand me, even though I don't understand it myself.
That said, German is a nightmare compared to English because of the grammatical complexity (cases etc.), but for pronunciation in relation to spelling, it's excellent. The written form really does reflect the spoken form accurately.
> it probably has more exceptions and weirdness than english.
Pronunciation-wise, I doubt it. All your examples have English counterparts.
Consider eleven (the vowel sounds for the same letter), psychology (silent p), wind / rewind, many irregular verbs (like read, read, read), Wednesday and business (many letters are just not/weirdly pronounced), history and litterature (one fewer syllable than expected), the complex rules to pronounce the ed + exceptions... You basically have to know how an English word is pronounced to pronounce it correctly. Guessing works but only so far, and I believe less than for French (and I'm a French speaker too).
I have a close friend from the US who likes to make fun of the French language, but when I cite English, he says oh yeah, but for English we already know that! :-)
Anyway, English and French are both quite bad at this, and you are right, that's nothing to be proud about. It's just a reality we have to deal with.
> The language is becoming simpler and simpler, dumber and dumber.
Simpler is not dumber and I absolutely don't think the language is becoming dumber. The last reform (1990) brings more regularity and this is most welcome, freeing us time for things that actually matter, making the language more accessible to foreigners as well as people with conditions like dyslexia or dysorthography and less a status tool. I welcome the French language becoming more welcoming.
Or please strongly back your dumber and dumber statement. Because usually that's just baseless, tired rambling from clueless conservative people saying such things. A French speciality (a national sport even, championed by the Figaro?).
> And you probably don't want to get me started on the average reading and writing skills in elementary and secondary schools in France. It's in freefall
That too. Maybe you should fix your English before lamenting on the writing skills of people, because you are making a lot of basic language mistakes in this very comment in which you are doing this. That's harsh and not nice, but that's what you are seemingly doing to others and I want to take the opportunity to make you feel what it may feel like. Actually, you probably cannot even begin to imagine how you may sound like to people for whom writing is a struggle. Such people often feel ashamed because of people like you. Let's just be forgiving, tolerant, more empathetic and stop using language skills as status and start focusing on the content.
I have a close acquaintance who expresses themself perfectly, only writing without mistakes is hard for them. They even have an official disability recognition for their strong dyslexia (so they can have a related tool on their workstation). Let's just cut people some slack on their writing skills (which are in the vast majority not related to laziness - or maybe you are suggesting people are dumber and dumber?) and the world will be a better place.
See also [1] for a nuanced discussion on "Writing skills are lower and lower". It turns out it's partly due to more people going to school and not only the elite, which is a good thing, including children whose first language is not French and whose life in general may likely be a bit more complicated than the one of a random privileged French child (like I was).
FWIW, both “history” and “literature” have the number syllables you would expect in my dialect of English (Western American), at least among people I know. But I know exactly what you are talking about! Many regional dialects drop the “o” in history and the first “e” in literature.
On the other hand, we do violence to the pronunciation of “comfortable”. I’ve lived in so many parts of the English speaking world that I can partially code switch pronunciation for some dialects. Kind of weird but not that bad.
In American English, it is common to pronounce it something like “comf-ter-ble” in most dialects. Some dialects of e.g. British English pronounce it as you would expect from the spelling with 4 syllables. I can’t think of an American dialect that pronounces it correctly. Perhaps some New England or Canadian dialects do?
My experience traveling around the English speaking world is that it is very forgiving of pronunciation. What trips you up is differences in vocabulary and semantics. You have to learn a new dictionary and a bit of inexplicable grammar everywhere you travel. I’ve learned very different languages that had similar relationships to adjacent languages; the words are all familiar but the meanings of those words have been remapped to something else. English tends toward a similar pattern.
As a (sort of) Englishman, it's a strange feeling reading about the Normans (or Vikings!) as "they", when in fact it's now "us":
> Then the smile vanishes. There are no more English queens or kings. Only Normans.
Fun fact: due to pedigree collapse, if you have white British ancestors, you most likely have a direct linear connection to every Viking, Norman, and peasant who still has living descendents today. William the Conqueror is your great(great, etc) grandfather, as is Cnut the Great, Kenneth MacAlpin, and Rhodri the Great, etc etc.
It’s funny, how people identify can diverge a lot from genetic reality. Even e.g. Brazilians, who are mostly descendent of Europeans, will always say “we were colonised”
I have a theory that English is popular because pronunciation encodes almost no information so it works well regardless of accent. Some asian languages, and even French, heavily depend on tone for understanding so are tougher for non-native speakers to communicate in. Butchered English can still be generally understood, thus it's position as lingua franca.
Former linguistics major here. Interestingly, 'lingua franca' originally referred to a specific pidgin trade language spoken in the Mediterranean. The 'franca' part referred to the Franks, who were originally a Germanic tribe that established kingdoms in what is now France and much of western Europe. By the late Byzantine period, 'Franks' had become a blanket term for all Western Europeans. What happened to both 'Franks' and eventually to 'lingua franca' is an example of semantic broadening.
This elides a lot of history, despite being glib it's mostly correct.
If English wasn't as easy to learn as it is, it would have been destroyed though.
The absolute selling point of English is the fact that since it has no proper rules it's the "glue" of European languages, it's the bash of human linguistics.
Ugly, crude, nearly impossible to master if you're not using it daily and all it really does is pin together superior languages that actually have formal rules, but could never be as flexible as "common".
Yes, it enjoyed tremendous success due to the british empire, and continues to dominate thanks to the hollywood propaganda machine - and it owes about 90% of it's success to that. But it's important to note that last 10% is important too, and that is because English is an easy language to learn and it is able to evolve rapidly.
> The absolute selling point of English is the fact that since it has no proper rules ...
Anyone who thinks English has "no proper rules" clearly has never had the joy of learning English as a second language.
(Or maybe they have a really warped notion of what "formal rules" mean when it comes to languages. There are no natural human languages in the world that are dictated by formal rules. All formal rules are after-the-fact descriptions devised to explain the language that is already there.)
If people want a language with "proper rules"... head over to conlangs. https://youtu.be/x_x_PQ85_0khttps://en.wikipedia.org/wiki/Ithkuil is my favorite (I've got a copy of the grammar guide that is on my shelf of random things next to Random Numbers by the RAND corporation).
English regularly violates its own rules and additionally has no correcting body (Swedish has central body that dictates language rules for example).
That's part of why it's so difficult to fully master, and there are rules (sentence structure) for clarity, but there's no actually solid rules for pronunciation (it differs depending on word) or even what words are really proper words (there are central dictionaries that largely agree, but there are also "Hinglish", patois and the other creole dialects).
English steals aggressively from other languages, since that's its history. Other languages might borrow some words but there's multiple branches of these inside english. You can use English with only latin-root words, or English with only Germanic-root words and both are as valid english as each other.
That's true for any human language. E.g. in Russian, adjectives use the gender, case and plurality of a noun, until they suddenly don't.
> English steals aggressively from other languages, since that's its history.
That's not unique to English. E.g. Japanese has even borrowed numerals, and some of its pronouns are borrowings. Russian has borrowed verb forms.
Having a lot of Latin borrowings is quite common in most European languages. Even in Romance languages, there are a lot of Latin borrowings (e.g. minuto is Latin borrowing, miúdo is a native Portuguese word).
> You can use English with only latin-root words, or English with only Germanic-root words and both are as valid english as each other.
That's similar to how e.g. Romanian has Latin-based and Slavic-based vocabulary. This is not that unique.
> but there are also "Hinglish", patois and the other creole dialects
Many languages have or had patois and creoles based on them.
> If English wasn't as easy to learn as it is, it would have been destroyed though.
I really dislike this argument. It treats English as a mythical, exceptional language even though it really is not.
English was not particularly hard or easy compared to other European languages. It did not have a particularly hard or easy structure, and orthography took centuries to normalise in continental languages as well. It had the quirk of combining Germanic grammar with Romance vocabulary, but that’s relevant for linguists, not most speakers.
What happened is that it was simplified and adapted over the course of centuries.
French was not displaced by English because of some magical language qualities. The French were displaced by the British somewhat, but mostly by the Americans and language followed.
That probably depends where you live. A lot of Nordic people tell me the learnt English as a kid watching cartoons, long before they were thinking of such things.
we learn english because it is a subject in school. money does not come into consideration for most people. the motivation to teach english in school is another question however. as is the motivation for parents to pay for extra english classes outside of school.
Another way of saying this is that spoken English has a lot of inbuilt, inherent error correct-ability, ala a very large minimum Hamming distance between spoken words/phrases.
I always found French to be very much the opposite in spoken form, due to the 'consonnes finales muettes' and liaison and élision, along with the large amount of homonyms and general colloquialism used in everyday speech. Yet in written form, it is nearly as straightforward as English, as you get back those damn letters that aren't being spoken.
> I always found French to be very much the opposite in spoken form, due to the 'consonnes finales muettes' and liaison and élision, along with the large amount of homonyms and general colloquialism used in everyday speech.
It is not that different from English in that respect. I found both to be quite difficult compared to e.g. German, which is very regular, or Spanish (which is annoying grammar-wise but straightforward to pronounce).
Spoken English is full of elisions and silent letters, and also full of locale-dependent colloquialisms that take some time getting used to. I remember struggling for a while living in New York and London despite having a decent level in “standard” English. I still occasionally struggle with my mates from Ireland and Yorkshire. After living more than a decade in English-speaking countries I accepted that I will never be able to pronounce correctly a word I never heard before.
Missing liaisons is not problematic when you speak French. It marks you as a non-native but it does not make you harder to understand. They can be often omitted by natives as well, depending on the accent.
> I will never be able to pronounce correctly a word I never heard before.
This happens not infrequently to native English speakers. It's especially prominent for people who read a lot when they're young and develop a large vocabulary that doesn't get socialized until much later. My English teacher, of all people, was notorious for this.
Real life examples from native speakers: Emphasizing the wrong syllable in "forage", "respite", and "parameter". Pronouncing "draught" like "fraught". Softening "chasm" and "chaos". And an extra syllable (long e sound) in "homogeneous".
It's fairly easy when the stress is consistent or mostly so. In English thought it's a phonemic feature in its own right so there's no general rule to memorize in the first place - you just have to learn it for each word.
Fun facts almost one third (1/3) of English language vocabulary are similar to French. To be exact most of the professional and legal version of the English words are taken from French. Hence if you understand English, you can read short notice or announcement in French, and understand them mostly. But if you have people spoken the same notice and announcement in French version to you without you reading it, most probably you won't understand most of the same sentences.
Plus there is another 1/3 coming from Latin which French speakers has no issue understanding either. English is basically akin to a dumbed down pidgin of French (exponentially less verb conjugation, no gender agreement, less pronouns with the thou/you merge, less articles and annoying small words, etc.) starting over a Germanic core.
Harsh but that rings true. In it's defense i'll point out that English is exponentially more useful in the modern world and even French has started borrowing nouns from English. Also English has more words then
any other language which in my mind makes it the best. (to clarify i know a little of other languages and i understand that there are concepts which English is not even equipped to express properly but i stand by what i write)
I'm still learning, English is huge and it can be a delight to discover.
Most words in foreign languages that most people believe don’t have an English equivalent often do, but the English word is so obscure that almost no native English speaker knows it, and as you point out, English vocabulary is so large that no one will ever come close to learning it all. English is the C++ of human languages.
What interests me is the prominence of words in foreign languages that have an extremely obscure equivalent in English. Like, why do they devote common vocabulary to it and what does it mean that they do?
I have been conversational in languages almost no one learns from parts of the world no one cares about. They are full of words like this and I still use those words in English because that was the first word I learned for the concept. But when I’ve taken the time to see if an equivalent English word exists, it always does. Ironically, it is safer to assume that my ignorance of the English language (my native language) is more likely than the lack of a word in English for a thing.
That's exactly what I'm alluding in my other comments thread but referring to Chinese language and writing system complexity rather than English for the C++ and Rust, but on second thought Rust probably be the Chinese equivalent.
> But when I’ve taken the time to see if an equivalent English word exists, it always does.
It's the same happen with C++ that has been ripping up Dlang features for quite sometimes now including its new module system [1].
[1] Converting a large mathematical software package written in C++ to C++20 modules (42 comments):
During the Norman Conquest, England was ruled by the French... and that is when those words entered the language.
Also from that time was many culinary words. The word for the meat in English is the word for the animal in French (the word for the animal in English is likely germanic in origin). That was in part because the when the French speaking nobility wanted boef (French for cow), they didn't want a cow (German Kuh) - they wanted the meat of a cow. So English got beef. Pork? French asked for porc, but didn't want a swine.
While ~29% of the dictionary words come from French, in any written or spoken sentence the number falls dramatically. All the small joining words we use and the core of our grammar is Germanic.
Fun fact, in your first sentence there is approximately 30% of the words which are not Germanic.
(my separation of the words, which may be slightly off:
While of the words come from, in any written or spoken the falls. All the small words we and the of our is
//
dictionary French sentence number dramatically joining use core grammar Germanic
)
Is it the case that it encodes no information, or is it the case that the information is somehow..."optional"? I know I selectively ignore unintentionally snarky or sarcastic tones from non native English speakers. Even the simple example of turning "ok" into "oook" can be used to imply someone is being unreasonable.
Singlish says, "Hi." It's fascinating to watch that in action here in Singapore, as I find Singlish to be a compact and efficient form of English that greedily borrows words from Mandarin, Hokkien, Hakka, Teochew, Malay, Tamil, and a few other languages to enable rich communication among the various cultures, ethnicities, and language groups found here. It even borrows the grammatical structures of some of those languages, and yet the meaning still gets through. It didn't take me long to get comfortable with it, and it helped me appreciate the promiscuous nature of English even more.
Chinese doesn't use accents, but the characters are extremely complicated in comparison. The chacters are both the images and the specific strokes which draw the image.
Spoken Chinese has at least five tones (1,2,3,4,5 Number five stands for neutral) but to native speakers there is much nuance.
I won't explain the reason of its popularity. Someone braver than I may do it. Grammar is very simple, by the way
Chinese is hard in unnecessary way both in language speaking and writing. I've got the impression that they make in unnecessary hard so only certain people can operate the language and work in government or I call it the elite mentality. I've got the same impression about complex programming languages for examples C++ and Rust. The languages are so complex that you cannot even make the compiler fast [1].
Spoken mandarin has 5 tones but the original ancient Chinese is similar to Cantonese and it has 7 tones. The modern Chinese writing characters is considered simplified because in Taiwan they use the original and more complex Chinese characters.
Fun facts King Sejong of Korea actually get rid of the cumbersome Chinese characters for writing Korean languages and introduced new Korean characters Hangul in 15th CE [2[. It's reported Korean literacy rate skyrocketed in a very short time because it's much easier and suited the Korean language better. Another fun facts, Korean characters can be learnt overnight but you need to memorize and understand several thousands of Chinese characters just to read and understand the newspaper headlines in Chinese. I have a Chinese friend who has Chinese mother tongue and is a well accomplished senior engineer but he cannot even read Chinese newspapers since he did not has a formal education in Chinese writing system.
As Einstein famously remark you should make it simple but not simpler.
[1] Why is the Rust compiler so slow? (425 comments):
> Chinese is hard in unnecessary way both in language speaking and writing.
Is spoken Mandarin really "hard in an unnecessary way"? I think it's quite straightforward, except for the tones. The tones are difficult for anyone who isn't a native speaker of a tonal language. But they are trivial to learn as a child, and easy to learn for native speakers of say Thai (a mostly unrelated language that also happens to use tones). Uneducated people in all walks of life speak both Mandarin and their local dialect well.
Written Chinese really is objectively difficult, and it's a believable argument that before Mao it was intentionally gatekept that way to have a caste of intellectual "elites".
in addition, chinese grammar is very easy. what makes learning chinese hard is the writing because it is difficult in itself and you can't use it to reinforce the learning of spoken words or vice versa.
As much as I approve of shitting on Chinese characters, a lot of the arguments about literacy don't really apply in the modern age. Back in the 1400s when Sejong and his ministers published the Hun'ming'jeong'eun, sure, but in the modern day literacy is pretty much driven by the modern schooling system and even Japan achieves high literacy rates. It's a bunch of unnecessary extra work, but it's not an impediment to being able to read if that work is put in.
It is true that in 1400s Korea being able to read was a sign of status, and the literati argued against making it easier to preserve their station. The same applied to postwar Japan according to J. Marshall Unger.
Given the meaning of "accent" given in the article Chinese seems like a very accented language (saying that as a Mandarin speaker). Aren't Chinese tones the very definition of an accented language? (as defined by the article, accent is a broad term)
> This is why English has combinations like sh, th, ee, oo, ou that each make only a single sound.
Struggling with the th and ou here as only making a single sound.
Through and rough, both not the same ou as sound.
That and Thames, but this might be becaues Thames is proper noun?
The only thing I like about Croatian is that there is none of this nonsense. If you understand the letters and how to pronounce them, you can read a word and pronounce it correctly. In English there are so many words that you would have no idea the correct pronunciation until you've heard a native speaker say it. Even that's no guarantee it will be correct though!
The claim that “English doesn’t use accents” is, quite frankly, a bit naïve. After all, one only needs to step into a café for a crème brûlée, or to read a novel featuring a tête-à-tête, to see that English is perfectly comfortable with accented words. From the résumé of a job seeker to the façade of a building, from the attaché case of a businessman to the naïve assumptions of a newcomer, accents are sprinkled throughout our language like so many éclairs in a patisserie. English may not have been born with diacritics, but it has certainly acquired a taste for them.
In almost every example you state, we've taken the word and the pronunciation, but have dropped the accent marks themselves. It's part of what makes English pronunciation a minefield.
If I recall correctly, the accent marks only started to get dropped when people began using keyboards. Resumé and fiancée still got an accent when it was handwritten
Seeing that virtually none of these are pronounced as they are in the original, I would say that English keeps them out of respect for their source language, but is definitely not comfortable with them.
As Esperanto was designed to be a neutral language that took words from many (European) languages, it needed an orthography that would unambiguously denote those sounds.
You can trivially transliterate the circumflex letter with digraphs using h anyway. ĝ -> gh
Still perfectly readable.
If you want a language where there is a trivial and unambiguous mapping between written and spoken language and the sounds don't exist in basic latin script, you need to do something. You can use diacritics, you can use digraphs and other tricks or well you just give up. But saying they are "unnecessary complexity" is very mistaken.
Digraphs can still unambiguously denote sounds, but Zamenhof really wanted the canonical spelling to have one character = one phoneme.
And, ironically, it was easier to do it back then compared to the computer age, because typewriters did something similar to Unicode combining marks - specifically, on the French typewriter, you'd have a single key for ^ which was used to overtype letters where circumflex was needed. And since French typewriters were very common in all countries Zamenhof used circumflex for his letters (except Ŭ, seemingly just because).
As far as digraphs, using "h" can be ambiguous in some cases, which is why Esperanto digraphs are more often written using "x" these days when diacritics aren't available for some reason. I actually quite like that scheme for several reasons. The obvious one is that "x" is then strictly a modifier letter with no sound value of its own, like hard/soft sign in Cyrillic, so all digraphs are unambiguous. But also, there's a matching diacritical mark, "combining X above". So we could e.g. say that "cx" and "c̽" are the same thing, and you use the latter when possible falling back to the former when you need ASCII.
Tldr. 1066 French didn't have them. Later on that each language French / English independently solved the "need more letters" problem. English adds them e.g. "th" is 2 letters for a sound. French uses diacritics.
I strongly disagree that "it's a shame" that English does not use diacritics. English is my second language (third maybe, considering that the country of my birth is bilingual), and is my favorite language to read and to write. I tried to learn French for two years and stopped, and all those excessive writing marks were among the reasons.
God bless all those monks who decided to keep English writing clean.
You can always tell someone who is well read in English when they mispronounce everything they say.
Greek is so much easier than English to pronounce words correctly.
Coming from Spanish, with just the right diacritics to make pronunciation obvious, at first I didn't get the concept of a "Spelling Bee". Did it involve something besides spelling? Did "Bee" was a metaphor for the actual hard part of it?
I was first exposed to written English, so after trying conversational English, I learned why its pronunciation/writing is a national competition. It might as well be random.
English would have benefited a great deal from an equivalent to the Royal Spanish Academy.
Possibly... English has a lot of linguistics with a lot of varied roots. You have many words taken from Old Norse and other Scandinavian influence as well as Latin, French and via proxy Greek derived words. Great Britain was highly fought over, contested, changed hands and merged cultures over the millennia.
It is far more organic and mixed from different sources than many prescribed languages or very local dialects of other languages. It would be very hard to pin that down. Not to mention the history of printing presses themselves, such as how the Thorn character was itself replaced as well as deprecating a few other characters that were in common use in earlier Old English.
I think it's a mistake to view that situation as unique to English.
Spain is still a multi lingual country with several local languages each of them centuries old. But even ignoring that and focusing only on Castilian, there were invasions by goths, who left behind words like ropa or guardar, and Arabic speakers, who left behind words like almacén.
Like English having both cow and beef, there are words with historical overlap but different etymologies and divergent meaning over time. For example almacén and bodega were both words for a warehouse.
There are also tons of words where Spanish had phonetically diverged from latin, but then the same word was re-imported from latin in "educated" use.
What’s that have to do with how terrible the English writing system is? Why not just reform written English to read the same way it’s sounds? I’m maybe a B2 level Russian learner and can near perfectly pronounce almost any modern Russian writing because it’s written almost exactly the way it’s spoken. I assume it’s the same with many other languages.
The article touches on this, but there have been countless attempts to restandardize English spelling or replace the Latin alphabet with one more suited to English. But English is a global language with no central authority responsible for deciding what is correct, making coordinated change nearly impossible.
To my mind, the best such attempt was Kingsley Read's, made at the behest of G. B. Shaw: https://www.shavian.info
Go to England and try to get even two small towns to agree on what the sounds are
This does not even account for the bizarre spelling of many (most?) English words. For example, letters that are skipped.
That’s funny!
But really, these days we have Hollywood and it sorta decides what English sounds like. Even if it sounds different in your town in the USA.
Plus the Enlightenment reimported a lot of Greek for science and made a lot of greek morphology productive in the language again or for the first time, at least in scientific vernacular and jargon, but a lot of that makes it into daily use. (It's also why we still have fun debates today over plurals like octopi versus octopuses or matrices versus matrixes; do we follow the Greek morphology through to its Greek plurals or do we just use the boring English plural morphology? We use both, but which you use becomes in part a signifier of "learnedness" or rule-following. As a learnéd nonconformist, I find it more fun to use the English plural morphology here more often than not, but also sometimes silly uses of díacritics.)
Plus English still is extremely active (to this day) in borrowing words from neighboring languages, with a lot of Spanish words directly borrowed (generally from Mexican/Dominican/Puerto Rican influences in US English, then back out to UK English). There are even French words in today's English that weren't Norman Conquest imports, but American Revolution imports (the French were key US allies and neighbors in the Canadian and Louisiana Territories).
There's a lot of jokes/memes that English has always been a language willing to borrow the best words of any language in a similar way that school bullies are often looking for new sources of milk money to extort.
On the other hand an "Noun Gender Bee" would probably be more interesting in Spanish than English? ;)
A simple rule of thumb suffices most of the time, and native speakers will still understand you when you get it wrong.
But yeah, I wish Spanish omitted genders most of the time like Japanese does. It complicates things and adds very little in exchange.
The real kick in the gonads is verb conjugations. Nearly every common verb is irregular and there are something like 18 tenses, times six subjects. Even many native speakers struggle to get them right.
Sure, but saying the word would often give it away.
It always makes me sad when a language's alphabet is different from their phonetic alphabet because it means that unless you hear how the word is pronounced there's basically no way of know how to pronounce it. Right now I'm learning Portuguese Portuguese and it just makes me so sad that it legit pushed me away from learning the language.
They pronounce 's' at the end on the word the same as how they pronounce 'x' and many many more such examples, basically no word is pronounced the way it's written.
My native language is Slovenian, the way you say the letters in the alphabet is how you pronounce them in 99% of the words and even if you miss-pronounce the 1%, the words are usually so close that people still understand you.
It just really made me appreciate my language even though it has many other things that just makes it difficult to the point that most of my writings are in English, were I don't really need to think about all the rules and can just focus on telling the story.
I'm of the opinion that all languages should use their phonetic alphabet as their alphabet, that way, once you've learned the (phonetical) alphabet you would know how to pronounce all the words. (Unlike in Portuguese where milk is written as 'leite' but it's pronounced very similarly to the word 'light' in English. (not to mention the Brazilian Portuguese)).
And to the Spanish people, your language is just slightly more aligned than Portuguese, but nowhere near as clear as I would like it to be.
I agree with the parent, Greek is much easier to pronounce, at least when compared to Spanish and Portuguese, though though the emphasis of the words not always being at the front of the word can make things a bit difficult, I'm looking at you κοτόπουλο (chicken).
If anything, I'd guess that when speaking English as second language, harder than knowing the accents on words would just be keeping track of all the exceptions in pronunciation between words that you basically just have to memorize. Tough, though, taught, thought, through, thorough, throughout, etc.
My Greek teacher had trouble with early and yearly
> You can always tell someone who is well read in English when they mispronounce everything they say.
This is a very popular pro-reading sentiment. The trouble is that you can also read about how to pronounce words.
You can indeed. However, if I knew all of the words I would mispronounce, I would have already looked them up! The trouble with mispronouncing words is that often you’re unaware that you’re doing so.
Respite got me recently.
The best one I've heard was from an extremely bright and well-read friend of mine in high school who once pronounced "formaldehyde" like "formal dee-hide" with emphasis on the "dee" syllable.
Well maybe they were just from rural Texas!
I have French as second and English as my third language. English comes easy and natural because we're saturated by the language. That's one of the reasons my children don't mispronounce English words as often as they do French words. Both languages are equally terrible. On the other hand a few weeks ago my daughter demonstrated a nearly perfect pronunciation of Italian while reading a text without understanding a word. Looks like the Italians got their shit straight. Apart from pistacchio. Nobody pronounces it pistacchio...
As an ESL person I do wish English used accents the way Spanish does to indicate what syllable has the primary emphasis.
I also wouldn’t go as far as saying that it’s a shame that English does not use diacritics, on the other hand, I also wouldn’t say that diacritics make a language more difficult.
Learning how to use them in Spanish and German takes about an afternoon, and when it comes to learning languages, that’s a negligible amount of time.
Not exactly sure what you mean (in German) but just learning about them and using them correctly are 2 completely different things.
I mean, you could maybe ignore the use of going a -> ä for plural forms, I would argue that learning all these words are part of it.
I'm not saying it's hugely complicated but I've seen enough people struggle with it.
Spanish is a bit tidier with them then French, though
Also agree, you can already use combinations of multiple characters to define other sounds, and that's faster to type too
Shame, though, that in English the sounds that combinations of characters make, aren't well or uniquely defined (e.g. bird, word, hurt, heard, herd, ... all sound like the same vowel)
They're faster to type largely because your keyboard is English. Other languages (French, German, etc) have diacritics right there. Even Japanese isn't that much harder to type once you actually learn it (and, on fact, is quite pleasant on a smartphone even at beginner level).
On the topic of similar word sounds, this is a big thing that hangs up English speakers on romantic languages. Their vowels are sloppy and contextual, so when they're given explicit symbols that say "use this vowel", they struggle to pick that vowel out. That "symbol to sound" wiring isn't up in the noggin'. A Spanish person learning English will see the Spanish equivalent and go "duh". But an English speaker needs those "like in bird" tables.
Luckily, we have a huge phonemic index (because of all the stealing), so we're actually at an advantage from many languages once that hurdle is crossed. Spare tonality.
2nding this. The "non-phonetic alphabet" is the biggest non-issue I see people raise a stink about. It really doesn't matter, context is the heavy-weight backbone of language.
On top of that, I think people really underestimate how inappropriate diacritics would be for English. It has a massive phonemic inventory, with 44 unique items. Compare with Spanish's 24. English's "phonetic" writing system would have to be as complex as a romanized tonal language like Mandarin (which has to account for 46 unique glyphs once you account for 4 tones over 6 vowels + the 22 consonants). Or you know, the absolute mess that is romanization of Afro-Asiatic languages. El 3arabizi daiman byi5ali el siza yid7ako, el Latin bas nizaam kteebe mish la2e2 3a lugha hal2ad m3a2ade.
> The "non-phonetic alphabet" is the biggest non-issue I see people raise a stink about
Myself and many friends who aren’t native have struggled with speaking fluently because of it. Most of us still mispronounce some words (my friend pronounced “draught beer” like the lack of rain, instead of like draft).
Doesn’t mean things should change, but it’s certainly not a “non-issue”
> Most of us still mispronounce some words
The bureaucratization of language is more problematic in my view, where things are seen as wrong and right and we try to cram the beauty of of natural language into a restricted box that can be cleanly and easily defined and worked with universally. I quite literally have nothing but detest for this conception of language, that it must bend to the whims of rigidity when it's very clearly a natural, highly chaotic dynamic system constantly undergoing evolution in unexpected ways.
[dead]
How would you account for the fact that for many words, there isn't a consistent pronunciation rule for it at all? For example, I would guess that 50% of English speakers are non-rhotic.
English pronunciation does vary quite widely and it would be difficult to rewrite all the books and websites into all the different accents.
It's also decentralized - there's no authority to tell the English-speaking community how to spell things or how to say things.
I think these are both advantages that outweigh the phonetic inconsistency.
Same way other dialect continuums account for it: you standardize spelling on some variant, or several variants if that is non-viable (which, yes, does mean that e.g. American and British English spellings would diverge somewhat).
To be clear, I'm not particularly advocating for making english a phonetic language. I'm just saying it being non-phonetic does cause issues (and makes it frustrating, but also shows a very interesting history).
Assuming we wanted to make English a phonetic language, then your question is kind of moot: phonetic means we need to pick the pronunciation rules for phonemes, which would make other ways to pronounce these phonemes incorrect. Some of currently-correct english would become incorrect english.
> For example, I would guess that 50% of English speakers are non-rhotic
Note that accent isn't really what people talk about when they complain about pronunciation. The problem is that there's no mapping from letters to phoneme in any english accent: laughter/slaughter, draught/draught, G(a)vin/D(a)vid...
All those examples follow the linguistic patterns of the languages they come from. They aren't arbitrary, they just don't teach us the context when we're learning as children.
Of course there’s always reasons. Teaching it to children isn’t really a solution: you’d need to know where words come from before reading them correctly, and also many people don’t learn English as children.
Phonetic languages do borrow words from other languages too, they adapt them to their own language keeping the pronunciation (the only example coming to mind right now is the Czech for sandwich, sendvič). English could do that just fine being phonetic was a goal
at some point these differences would qualify as different languages...
Draught beer is a linguistic holdout. I think many USA places list it as draft beer.
Does relate to the point that English still doesn't have a central linguistics authority (and likely won't ever). Just various reformers that have been more or less successful and in how distributed their reforms have been. Draught versus draft was indeed one of Noah Webster's proposed reforms that influenced a lot of American spellings and in turn is still influencing UK spellings. It's not as obvious as color versus colour, but there is a bit of US versus UK in draft versus draught.
(Webster also went on to suggest dawter over daughter, to remove more of these vestigial augh spellings, but that one still hasn't caught on even in the US. Just as the cot/caught split is its own weird remaining reform discussion.)
> It has a massive phonemic inventory, with 44 unique items. Compare with Spanish's 24, or German's 25.
I'm not sure where you're getting these numbers from, but German has around 45 phonemes according to all sources I could find, depending on how you count: 17 vowels (including two different schwa sounds), 3 diphthongs, 25 consonants.
Edited for accuracy, thanks.
If Arabic had to cater to afro-asiatic dialects phonemes then the script would have been even more messier. I'm a speaker of one, and my dialect is heavily influenced by the indigenous Tamazight language. and I think this is why many of the Amazigh community were and some still disappointed with the neo-Tifinagh script. While it carries symbolic weight, it doesn’t offer practical readability, phonemic clarity and tech accessibility of a modern script that Tamazight deserves. Latin script, ironically, fits Tamazight much more naturally.
You don't have to make a perfect pronunciation system. It's OK if a vowel is pronounced slightly differently, as long as its pronunciation can be predicted from context. Even if it can only be predicted 99% of the time.
Insisting that the writing system captures every little distinction is a common mistake enterprising linguists do (often when designing an alphabet for a bible translation, or "modernizing" the spelling of a language which is not their own). They don't have to. Even if you do it, it won't last long. Letters only have to be a reasonably consistent shorthand for how things are pronounced. People don't like a ton of markers or, god forbid, digits sprinkled into their writing to specify a detailed pronunciation.
English has accumulated inconsistencies for so long, though, that it can't really be said to be consistent anymore. Usually, there are radicals who just cut through and start writing more sensibly here and there (without digits or quirky phonetical markers), cutting down on the worst excesses of inconsistency. But in English, these radicals have been soundly defeated in prestige by conservative writers.
Diacritics don't need to be used the way they are in French, i.e. to preserve the original spelling. On the contrary, most languages use them to make their spelling more phonetic.
Nor is there a need for some insane kind of diacritics to handle English. Its phonemic inventory is considerable, yes, but it can be easily organized, especially when you keep in mind that many distinct sounds are allophones (and thus don't need a separate representation) - a good example is the glottal stop for "t" in words like "cat", it really doesn't need its own character since it's predictable.
Let's take General American as an example. First you have the consonant phonemes:
Nasals: m,n,ŋ
Plosives: p,b,t,d,k,g
Affricates: t͡ʃ, d͡ʒ
Fricatives: f,v,θ,ð,s,z,ʃ,ʒ,h
Approximants: l,r,j,w
Right away we can see that most are actually covered by the basic Latin alphabet. Affricates can be reasonably represented as plosive-fricative pairs since English doesn't have a contrast between tʃ/t͡ʃ or between dʒ/d͡ʒ; then we can repurpose Jj for ʒ. For ŋ one can adopt a phonemic analysis which treats it as an allophone of the sequence ng that only occurs at the end of the word (with g deleted in this context) and as allophone of n before velars.
Thus, distinct characters are only strictly needed for θ,ð,ʃ, and perhaps ʒ. All of these except for θ actually exist as extended Latin characters in their own right, with proper upper/lowercase pairs, so we could just use them as such: Ðð Ʃʃ Ʒʒ. And for θ there's the historical English thorn: Þþ. The same goes for Ŋŋ if we decide that we do want a distinct letter for it.
If one wants to hew closer to basic Latin look, we could use diacritics. Caron is the obvious candidate for Šš =ʃ and Žž=ʒ, and we could use e.g. crossbar for the other two: Đđ and Ŧŧ. If we're doing that, we might also take Čč for c. And if we really want a distinct letter for ŋ, we could use Ňň.
You can also consider which basic Latin letters are redundant in English when using phonemic spelling. These would be c (can always be replaced with k or s), q (can always be replaced with k), and x (can always be replaced with ks or gz). These can then be repurposed - e.g. if we go with two-letter affricates and then take c=ʃ x=ð q=θ we don't need any diacritics at all!
Moving on to vowels, in GA we have:
Monopthongs: ʌ,æ,ɑ,ɛ,ə,i,ɪ,o,u,ʊ
Diphthongs: aɪ,eɪ,ɔɪ,aʊ,oʊ
R-colored: ɑ˞,ɚ,ɔ˞.
Diphthongs can be reasonably represented using the combination of vowel + y/w for the glide, thus: ay,ey,oy,aw,ow.
For monophthongs, firstly, ʌ can be treated as stressed allophone of ə. If we do so, then all vowels (save for o which stands by itself) form natural pairs which can be expressed as diacritics: Aa=ɑ, Ää=æ, Ee=ɛ, Ëë=ə, Ii=i, Ïï=ɪ, Oo=o, Uu=u, Üü=ʊ.
For R-colored vowels, we can just adopt the phonemic analysis that treats them as vowel+r pairs: ar, er, or.
To sum it all up, we could have a decent phonemic American English spelling using just 4 extra vowel letters with diacritics: ä,ë,ï,ü - if we're okay with repurposing existing redundant letters and spelling affricates as two-letter sequences.
And worst case - if we don't repurpose letters, and with each affricate as well as ŋ getting its own letter - we need 10: ä,č,đ,ë,ï,ň,š,ŧ,ž,ü.
I don't think that's particularly excessive, not even the latter variant.
Now try to get close to a billion people around the world with already varied cultures to follow the "new" rules of their native language.
I'm well aware that any kind of English spelling reform is non-viable for backwards compatibility reasons.
But that is a different argument from saying that English can't use diacritic-based orthography because the phonemic inventory is too complex.
I'll just say that learning Serbian Cyrillic in two days and knowing instantly how to pronounce any word I read was amazing.
Agreed.
Honestly you don’t even need most punctuation.
In about five minutes any literate English speaker can learn to read at full speed with no spaces or other punctuation. Or upside down. Or at an almost arbitrary angle.
I taught myself this when I was learning Japanese 30 years ago to prove a point. Now it’s merely an interesting trick but one with an interesting staying power: with zero practice I maintain the ability.
Punctuation was indeed a later addition to Latin, as well as lowercase letters.
Accents in french are pretty irrelevant, you can totally ignore them and master the language. Most french people ignore them while chatting/mailing/texting online.
If you ignore accents, some words can be mistaken for other words (with different accents), but if you check the context, the problem quickly go away.
Accents are just useful to help you pronounce correctly words ; they are also a hint about the word's origin (ex: ^ means the words is greek) ; I don't get why it stopped you from learning the language.
> Accents in french are pretty irrelevant, you can totally ignore them and master the language. Most french people ignore them while chatting/mailing/texting online.
“Master” would definitely not be correct, but you could write intelligibly enough indeed. It will cause you issues here and there (not being taken seriously, having some miscommunications when the diacritic disambiguates the word…)
If you can’t read the diacritics though, you’ll pronounce words very incorrectly and French is a very unforgiving language for mispronunciation: you will simply not be understood
I feel not being understood when pronunciation is off is more of a France french issue. You will be understood eitherway in Canada (given you speak with french Canadians). But I sometime have difficulty being understood by frenchmans, less so with other french speaking cultures
It would be like a speaker who can’t distinguish the uh sound in “but” with the ih sound in “bit”. Is it really the native English speaker’s fault if he can’t understand that personal dialect?
France’s vowel inventory is bigger than (or just as big as) English’s, and it has a lot more homophones. I imagine all the context goes toward disambiguating the actual homophones and not the arbitrary sets of words foreigners can’t pronounce because they don’t want to learn the accents (the system is not that hard and completely predictable).
English is your favorite language to read and write? Said no one ever…
I don't see why it couldn't be. It has a pretty large corpus of decent literature/poetry/other media/etc, and the worst people seem to complain about is its inconsistent spelling rules that even native speakers struggle with. In general I'd rather deal with spell check failing on some common homophone from time to time than say, having to memorize arbitrary genders for inanimate nouns that lack any consistent marker and then tables of grammatical cases to apply on them based on those genders. Or having to shove a verb to the end of a complicated sentence and having to unroll the whole thing to figure out what's being said (not to pick on any particular language(s) I've learned).
Oh thank god, someone said it. Who cares if "tree" is masculine or feminine, it does not give my any information. In Italian, tree is a masculine word: what can I do knowing "tree" is masculine?
Grammatical gender can serve as disambiguation. I just heard this sentence recently while watching something in Spanish:
"No me compares con alguien como tú, que llegaste aquí de una isla oriental sólo porque te impresionó un espectáculo de magia barato."
In the phrase "un espectáculo de magia barato," which means "cheap magic show" here, you can tell from the genders of the nouns and adjectives that it's that "barato" modifies "espectáculo," meaning that the show is cheap and it's not that the magic is cheap.
It's not that useful here, because it's not hard to figure out the correct meaning from the context anyway, but it's a tool that helps clarity regardless. And when you learn a language well enough, it's not like you're thinking about this super consciously, you just know the word and gendering it and its adjectives flows right off your tongue. I think this is probably easier for a non-native to learn than all the irregular spellings of English, but I wouldn't know, being a native English speaker.
It seems like we can invent better checksums and referents than grammatical gender. Arguably that's a fascinating part of the pronoun discussions in English, being one of the last remaining bastions of grammatical gender in English (that and familial relationship words). I don't expect us to invent better things at all quickly, but it seems worth trying and it is interesting seeing various experiments.
One of the things I liked in studying lojban (a conlang of interesting background) was the use of mathematical identifiers as pronouns and "math genders" more related to linguistic role, referents like "the first noun", "the third verb" as pronouns. Referring to things by number is particularly great either, but it was interesting seeing a different approach to it.
Similarly, I think the language with the best pronouns I've experienced is ASL (American Sign Language). Signed languages have the ability to use three dimensional space in ways to anchor references that are impractical in spoken languages but so useful in signed languages.
It is my favourite language to read and write.
I am English though.
Just because something is complete nonsense, doesn't mean it can't be enjoyable.
I think English makes a lot of sense, but only if you invest the time to learn some of its etymology. Knowing some Latin, German, and Greek roots (in that order) is immensely helpful. You don't have to learn those languages per se, just some of the vocab. Eventually, you can look at a word, know if it's Latin/French, Germanic, or Greek in origin and all the spelling rules make much more sense.
This takes a lot of time, effort, and interest however, which is why many (most?) people think English is nonsensical.
You can also have a (maybe wrong) sense of familiarity that feels like it makes sense.
I'm ESL but after so many years of daily contact I find writing stuff in English easier than in my native German. Never lived anywhere else. I'm not claiming it's free of errors but it just feels like less work.
None of English is nonsense. But without diacritics, you need to know the historical contexts behind the different spelling or pronunciations to understand the rules.
> excessive writing marks
In English I need to find how each word is pronounced individually. What the hell is the difference between "men" and "man"? What's the difference between "bitch" and "beach"? Why "though" sounds closer to "throw" than "through" or "thought"? Those differences are encoded in a unclear way that there are more exceptions than rules.
Portuguese (my native language) is not perfect in that sense, but at least it has more rules than exceptions. Part of that is because we use the diacritic marks.
Then, I prefer excessive writing marks than excessive unclear special cases
Rules exist, but most are never taught and instead only learned through exposure. It's why "ghoti" is a trick - you have to break several rules of English pronunciation to get "fish" out of that.
Here's a page where someone tried to reconstruct as many of those rules as possible: https://www.zompist.com/spell.html - obviously it can't eliminate all exceptions but it does surprisingly well.
Rules 6-8 are relevant to one of your examples, including the explanation afterwards.
The complexity of these rules, and the number of exceptions that you need to learn notwithstanding the rules, can be roughly estimated for any given language by training a language model on word <-> IPA correspondence for that language (using a subset of the vocabulary as a training set), and then seeing how well it can predict the remaining words. You can run it in either direction, too, to separately measure the difficulty of reading (word -> IPA) and writing (IPA -> word) that language.
This was actually done for a number of languages including English:
https://arxiv.org/abs/1912.13321
You can see how languages with true phonemic spellings tend to be in the >90% range on both reading and writing, with Esperanto at 99%. Spanish and German are in 60-80% range. English is dismal at ~30% for both, though, with only French and Chinese being harder to write, and all other languages tested being easier to read.
Nice!
I couldn't help to look and see if the company behind commercials that are burned into my brain from 40 years ago are still a thing, and lo, Hooked on Phonics is still going strong!
This page[1] walks through the basics of phonemic awareness that children need to learn via exposure & repetition in order to learn to apply that aural learning to reading.
It makes me wonder if a program like this, aimed at English-speaking children, might help those adults learning to speak & read English if they could put up with being addressed as if they were a child.
[1] https://www.hookedonphonics.com/reading/phonemic-awareness/
> how each word is pronounced individually. What the hell is the difference between "men" and "man"? What's the difference between "bitch" and "beach"?
From what I could easily research, Portuguese has a pretty wide variety of vowel sounds, but it still pales in comparison to the Germanic languages that English took from; and across the spectrum of English dialects and accents you can end up hearing pretty much anything vowel-like that the human voice apparatus can generate. The strength of the difference between "men" and "man" will depend on who's speaking, but it's generally less than Portuguese phonology can accommodate. The "e" sound here should be familiar; the "a" sound not so much. Spanish (and, say, Japanese) learners of English will have much the same problem, but more so; their natural "e" is a bit off.
(From what Wikipedia is telling me, many Brazilian Portuguese dialects will use the right /ɪ/ sound for "bitch" in unstressed syllables. But then, my local accent contrasts /ɪ/ with /i/ quite strongly.)
On the flip side, I struggled with pronouncing Dutch when I made a brief attempt to pick it up; the individual sounds are all straightforward enough, but certain combinations are really unnatural.
> What the hell is the difference between "men" and "man"? What's the difference between "bitch" and "beach"?
Those words all have completely different vowels in English; they're not irregular spellings. If you can't tell the difference, you probably just haven't listened to enough English or have said them incorrectly too much to tell the difference.
I think that's probably more because English uses etymological orthography.
So spelling rules are based on four distinct "primary" systems of phonics that can be used depending on whether the word or morpheme has a Germanic, Greek, Latin or French origin. (Yes I know French comes from Latin origin, but the spelling rules differ depending on whether the word was imported directly from Latin, or came in via Norman French.) And then the Germanic and French origin words can get even messier because their spelling was standardized before the Great Vowel Shift. And then whenever we take loanwords from other languages that use the Latin alphabet, we preserve that language's spelling. Which creates a whole mess of special cases where the spelling doesn't follow any of the regular phonetic rules.
If you look at languages where the writing system is famously difficult to learn, a common element they all share is etymological orthography.
>but the spelling rules differ depending on whether the word was imported directly from Latin, or came in via Norman French
In fact it can be even more complicated because in English the words can come from Norman dialects and "typical" French simultaneously. For example, warden and guardian come from the same word in Old French, the former is closer to how Normans pronounce it and the latter is closer to its modern French pronunciation.
Do men/man and bitch/beach sound the same to you? I am kinda confused here, these words have distinct meanings and sounds.
> Do men/man and bitch/beach sound the same to you?
Not exactly the same, but I differentiate them more based on the context than in the pronunciation.
Giving an example for Portuguese that has about the same difference: "roupa de lá" (clothes from there) and "roupa de lã" (wool clothes). If you write them in Google Translate or similar you'll see the difference, which is very subtle for non-Portuguese speakers but sounds completely different to us.
Portuguese has a ton of such examples.
"O meu canto" can mean "My corner" and "My singing".
"Conselho" means "advice" and "Concelho" means "council".
"Aço" means steel, and "Asso" means "I roast".
All of these pairs sound exactly the same.
How can writing marks help in this regard? I can imagine a language with both a lot of exceptions and writing marks.
In Portuguese, they indicate that a syllable is stressed and alternate ways to say the vowels. e.g. "país" is stressed in "i" and means "country", while "pais" is stressed in "a" and means "parents". Tilde (~) indicates that the vowel is nasal, e.g. the "ã" in "São Paulo" means that it sounds like the "u" in "sun"; the default sound of "a" in Portuguese is the same as in "car".
Accent marks give additional phonetic information.
because you know the stress syllable by looking at the word. take Desert and Dessert, do we say DES-ert or des-ERT. Also in portuguese, at least, I can know which "e" sound [1] each "e" in the word makes by knowing this (well, almost, but not completely, but much better than English.)
[1]: https://en.wikipedia.org/wiki/IPA_vowel_chart_with_audio
Maybe Jazz Emu is onto something: https://www.youtube.com/watch?v=zJ69ny57pR0
I sometimes wonder if English dominated programming and the Internet partly because it doesn't use accents or special characters. You have limited space on a keyboard, and as a native Arabic/French speaker, typing in those languages is a real hassle. French requires é, à, ç and other accents, while Arabic is even more complex with right-to-left text and changing letter forms. English just flows naturally. Maybe the Internet's language wasn't just shaped by politics or economics, but by something as simple as which language was more convenient to type.
Tangential: Ùù has always seemed immensely silly to me. It’s given an entire key on the CSA keyboard despite only being officially used in 1 non-proper-noun word: où. It’s there solely to disambiguate with ou, the actual phonetics are not affected. Whenever I look down at my MacBook’s keyboard I think it seems a bit out of place haha
I mean that might be part of it, but also because the internet developed out of the ARPAnet which was a United States Department of Defense project, at a time when the United States was one of two superpowers (right as the other superpower stopped being a superpower or for that matter a state), in a world that already gave pretty heavy weight towards English as the lingua franca in international institutions after World War II or simply because it was the lowest-common denominator in a lot of the world post-the British Empire.
English had a lot of wind beneath its wings. Still does.
Maybe that makes sense for french vs english, but there are plenty of languages that avoid accents in transcription (despite being just as tonal or more so than english) because they don't have the analytic diction required to discuss the abstract concepts all over programming, computer science, and even just "business".
I think that's underselling the west's post-WW2 influence and the amount of innovation that was fueled by a booming capitalist society that the entire world wanted to take part in.
I'd say English's simple, non-accented latin characters being easy to represent mathematically was a happy coincidence.
> booming capitalist society that the entire world wanted to take part in.
You can't be that naive. Not every country wanted and a lot of them were forced into.
A much more simpler answer is the fact that much of the tech and ecosystem for computers was developed and commercialized in the US.
If Silicon Valley was in France, we'd all be using AZERTY and Minitel.
Yes, this is the pragmatic answer.
Apple's HyperCard had a French dialect, and AppleScript followed with one too. It was short-lived but did provide a window as to how these programming languages might have looked like had they originated in a non-English world.
A fun factoid I just discovered: on March 11, 1968, President Lyndon B. Johnson signs an executive order mandating that ASCII be adopted as a federal information processing standard for electronic data interchange between federal agencies. This order was known as... Executive Order 11110 :)
Do you have a source for the executive order claim? I can't find it on this list of executive orders signed by Lydon Johnson, https://en.m.wikipedia.org/wiki/List_of_executive_actions_by... . And as far as I can tell the claim originates on ascii-code.com and spread from there?
Googling executive order 11110 gives no primary information.
Edit: found it https://www.presidency.ucsb.edu/documents/memorandum-approvi...
It's just not an executive order as far as I can tell. Not an expert on US governance by any means though.
Edit 2: I mixed up Executive orders and Actions it seems?
From a quick search it seems this was a Presidential Memorandum not an Executive Action.
“Executive orders are generally more formal, require publication in the Federal Register, and must cite the President's legal authority, while memoranda are less formal, may not be published, and do not always require a justification of authority.”
These sometimes get called executive orders, like some memos that trump has signed in the last few months were called executive orders by the news and online.
They are essentially the same though. Memos carry legal weight and can direct agencies to carry out specific actions.
The 11110 thing is a myth, though. The closest that we get to that is that it is number 127 in the NARA's catalogue of the Johnson's public papers for 1968.
* https://www.govinfo.gov/app/details/PPP-1968-book1
Thank you for the correction.
Well, there's this story about how printing failed Arabic. Allegedly, in Italy, they tried to print a Koran, but because the printers didn't speak Arabic, and were trained on Latin scripts, they messed it up so much that the Arab world came to believe printing is not going to work for them. Even though most scientific books of the day were written in Arabic and the best schools spoke the language, it quickly fell out of favor, being replaced by Latin in Europe.
In turn, the Caliphate made a point of standardizing the script and creating libraries which fueled research science for a good few centuries.
----
Even before Internet, languages with diacritics (eg. Russian Ё) were deprecating their use. I believe something similar is happening in German (with ß). Also, languages with long history seen incremental thinning out of the alphabet to remove duplication and rare special cases. Sometimes, the opposite happened, but it was usually brought by reactionary politics, especially inspired by local nationalism which looked for validation in ancient history. So, for example, in the 90s Ukrainians brought back the letter Ґ that was used in only a handful of words, and was happily forgotten during the Soviet times.
So, convenience and suitability for new technology can be a meaningful factor in adoption.
You don't even have to leave English to find examples of printing shifting script. The printing press killed the thorn "Þ" character which made the the "th" sound. It got replaced with either a "y" (which looked sorta-kinda like a thorn) as in "Ye Olde" or a "th", which is how a speaker not accustomed to the sound might approximate it "tuh-huh".
I came here to say the same, I remember the days when some systems only had upper case characters and the character set was limited.
That may have allowed for more control characters they would be allowed in other European Languages.
English doesn't use accents because the speakers don't give a __ about the correspondence between the written form and the pronunciation.
https://en.wikipedia.org/wiki/Ghoti
That is a really bad example, because English does have fairly productive pronunciation rules [1], and trying to make 'fish' come out of ghoti requires breaking them. 'gh' only occurs as an /f/ sound when it occurs at the end of a syllable; as an initial consonant cluster, it's invariably /g/. Turning 'ti' to /ʃ/ is a fairly normal affricatization, which requires a subsequent vowel, which is lacking here (consider words like 'ratio', 'gracious', or 'nation'). Even turning the 'o' into /ɪ/ relies on fairly regular vowel destressing, which there's no reason to expect in 'ghoti'--which should be pronounced per English rules, pretty unambiguously, like goatee.
There are some real issues with English spelling, like the inconsistency of pronouncing 'ea' as /i/ or /ɛ/ (consider, uh, read and read). But 'ghoti' isn't one of them, because that's a case where there's not a lot of ambiguity in English pronunciation.
[1] The worst offenders in English pronunciation are when English borrows foreign words both with foreign pronunciations and foreign spellings.
It has become a thing where folks are taught, basically, that English is not a phonetic language. It is truly mind boggling the number of college educated folks I've talked with that start to try and argue that we don't have a phonetic alphabet.
And, like, I get it. We don't have a fully regular one. But this is like the people that think we don't have a single word to describe some things, when they have to basically ignore adjectives and many many synonyms to get to that idea.
Even better when folks complain that we have different ways to refer to people from other nations. Ignoring that a large part of that is that we heavily deferred to how said people wanted to be referred to.
At least one really obvious way to know that English is a phonetic language: fantasy authors create all sorts of made up names in their books. Sure, sometimes there are disagreements over how to pronounce these names, but generally readers come up with quite similar pronunciations.
The confusion may come from the various spelling conventions in the numerous loan words. In many of the counterintuitive cases, you could imagine a more phonetic spelling. The tradition has been to preserve buffet as is, instead of rewriting it as, "buffay".
The distinction is there. English can be used phonetically. We prefer to preserve the heritage of various loan words instead.
Eh. Only sometimes.
Hearing Americans pronounce the French loanword 'niche' as 'nitch' instead of 'neesh' is cringe-inducing.
English pronunciation is just kind of a mess (especially in the US). It is one of the few languages where highly educated mature people are regularly unsure of how to pronounce a word in their own language or where there is no agreed upon 'non-dialect'/standard pronunciation.
...we all agree that the right pronunciation of "nitch" is "neesh", though, or at least I've never heard a serious argument to the contrary. People just genuinely don't know how to pronounce it because they've only seen it written.
One that still gets me personally is "hyperbole"--I know how it's pronounced but when I read it, I still say "hyper-bowl" in my head more often than not. I don't think I've ever made the mistake while reading out loud to someone yet, but it will likely happen some day and when it does I will feel very stupid.
> I've never heard a serious argument to the contrary.
Well, here you go: https://www.merriam-webster.com/dictionary/niche#did-you-kno...
> I still say "hyper-bowl" in my head more often than not.
Same. This is where diacritics would fix the problem: Hyperbolé. Although hyperbolee would also work, of course.
>It is one of the few languages where highly educated mature people are regularly unsure of how to pronounce a word in their own language
Which is worse, being unable to correctly pronounce a word (but still being close enough to be understandable) or being completely unable to write a word?
https://globalchinapulse.net/character-amnesia-in-china/
Some Americans clearly must do this, but personally, I've never heard this in my life until I saw it on a YouTube video of a British person complaining how Americans pronounce words. Obviously, your experience may vary - it's a big country.
The transatlantic dispute over "aluminum/aluminium" seems minor when you consider how English is used globally. Even within Britain, there are considerable variations.
https://en.wikipedia.org/wiki/Anglosphere#/media/File:Anglos...
The one that gets me, as an American is nuclear vs nucular. Both have been in use verbally and written for decades... academics have adopted the former, even if the latter was more common in most early use. And that's just one, pretty recent example.
I'd argue that is mostly because 1) people follow audiobook or TV series pronunciations and 2) most discussions happen online and not in verbal form.
This is definitely a problem when it surfaces. For example the Stormlight Archive [1] series has two voice actors narrating the audiobook, and they don't even agree between them how to pronounce half the made up names.
[1] https://en.wikipedia.org/wiki/The_Stormlight_Archive
As someone who has listened to The Stormlight Archive (and The Wheel of Time with the same two narrators), the differences are absolutely there, but they're relatively small.
Fantasy novels predate the widespread popularity of audiobooks. It used to be quite expensive to distribute a large enough volume of audio. The old "books on tape" cost a lot of money, were frequently abridged, and only existed for the most popular titles.
"It's pronounced Jandalf!"
Reminiscent of a tweet about the death of the inventor of the GIF, who reportedly said it should be pronounced "jif" — the retweeter's comment was, "I guess he's with Jod."
https://twitter.com/andylevy/status/1506748105735159818 (not there anymore; maybe the account holder ditched Twitter)
I don’t think he could be taken seriously with a name like Yandalf.
cue GIF pronunciation war.
It's not pronounced "jraphics interchange format", therefore hard G
And yet it is gif like in gin and giraffe...
And yet there's also girl, gift, gimp, gill, gibbon, and giggle.
But but but the creator himself said it is gif like in gin and giraffe... right?
TIL: gimp is gimp and not gimp? I always pronounced this like gin.
> But but but the creator himself said it is gif like in gin and giraffe... right?
Yeah, that's what the creator said, and that's actually how I pronounce it, too. Just pointing out that "gi-" words can have both hard and soft Gs.
> TIL: gimp is gimp and not gimp? I always pronounced this like gin.
You learn something new every day!
> English is not a phonetic language.
Whoever says that English is a phonetic language does not know what a phonetic language is.
The property that characterizes a phonetic language is that you can properly pronounce a written word that you know nothing about.
English is more phonetic than not. There are a lot of words where it isn't clear what is the correct pronunciation, but if you put a random sequence of letters together there are only a few possible pronunciations, often exactly one.
I wish English was more phonetic. Spelling and pronunciations is a mess. However the language is mostly phonetic.
There's something you speakers of non-phonetic languages cannot fully grasp, I'm afraid!
We Italians, when we were children, we were taught to read based on the written letters, and we were able to read any word. It was normal, during primary school, to pronounce a word correctly and then ask the teacher what it meant. This is something you can not do in English.
And the converse was true as well! An Italian child is able to hear the surname of a new acquaintance, or the name of the village they are from, and write it down properly. In Italian, the question "How do you spell it?" does not make any sense! Again, this is something you can not do in English. Nor can you do it in French, because in the past centuries they had ink to spare and as such they started writing down useless letters that they do not pronounce.
You can frequently do that in English too. Of course there are exceptions, but if anything it's typically because of words/names from other languages.
In my experience learning Spanish, their loan words are Spanish-ized, being made to be pronounced and spelled in a format that makes more sense in Spanish. Whereas in English, the pronunciation and spelling is usually taken more directly from the source, so you get a bunch of instances where a word's spelling doesn't really match its pronunciation.
> We Italians, when we were children, we were taught to read based on the written letters, and we were able to read any word. It was normal, during primary school, to pronounce a word correctly and then ask the teacher what it meant. This is something you can not do in English.
We're still taught very basic phonetic rules in English. Like how vowels have a long sound and a short sound, where "ee" is the long e sound, or "<vowel> <consonant> e" triggers the long sound for that vowel. But you're also taught that many words are exceptions (e.g. bear vs beard). And you learn there are patterns to the exceptions, like how "ea," if it doesn't sound like "ee," will sound like a short e, like in "head" or "breadth," and particularly in cases like "dream - dreamt" or "leap - leapt."
And if you do a lot of reading as a kid, you vaguely recognize in the back of your mind some words that seem to follow a different set of pronunciation rules not taught in school (e.g. rouge, mirage, entourage, entrée, matinée, parfait, buffet, memoir, soirée, patois), which you learn implicitly. I remember this as a kid, only later learning those were French.
And this lets you guess pretty well how you'd pronounce a word. Just with basic rules and a lot of input to learn from, you can guess how to pronounce pretty much anything with good accuracy, because there are rules, and even a logic to the exceptions, but the rules are overlapping, so it's more like a set of rules you choose from.
I'd liken it to machine learning. You can learn the rules without even being taught the rules, like I did in the case of French loan words. And there are probably rules we follow without even realizing it, just instinctively thinking it's the natural way to pronounce the word without knowing why.
I'm not saying it's as good as being as phonetic as Italian, but it's not like we just have to memorize the pronunciation and spelling of every word as though it were a structureless string of letters and a corresponding, unrelated sound.
Sorry for the long comment.
Yes, but Italy had to centralize its language in order to accomplish this. 1000 Italian dialects were suppressed in a very heavyweight process. (And probably some people didn't like speaking Florentine, which became modern Italian.)
English is complicated because it's decentralized and there is no authority to regularize it. Which is a feature, not a bug.
You are wrong on several levels.
1 - Being fluent in the national language does not prevent people from maintaining their dialects in parallel.
2 - Whether a language is phonetic has no relation to political issues concerning dialects.
3 - Whether a language is phonetic has no relation to whether people like to use it.
4 - English got decentralized starting with the Age of Sail, but the lack of correspondence between written and oral forms is systemic and older than that.
> English got decentralized starting with the Age of Sail, but the lack of correspondence between written and oral forms is systemic and older than that.
That's not really true -- there is and was a great deal of dialect diversity within England itself. It was widespread printing that allowed languages to be standardized at the scale of nation-states in the first place: the divergences that developed after the age of sail were reversing convergence that had only begun a couple of hundred years earlier.
And although versions of English from the south and east of England became the basis for modern standard English, other dialects persisted and sometimes spread around the world, so some of the differences between English dialects globally are due to disparate influences from different dialects originating within the British Isles.
being fluent in a language makes you less likely to be interested in a second when everyone speaks the first. This plays out over generations in killing the less common languages.
There is a still a lot more linguistic diversity in Italy than across the entire English speaking world.
e.g. Northern Italian languages are technically more closely related to Gallo-Romance languages from the other side of the Alps than to standard Italian.
I think you're trying to to argue something like: "the set of dialects that make up English have a large(r?) set of allowable IPA orthographic representations than the accepted set of English orthographies" or something to that effect? And, that, perhaps, Spanish (French? Ukrainian?) have a much smaller set of alternate IPA orthographies for a given acceptable orthography?
I guess I'm really confused. It's not like English is some Arabic language where the orthography is in a second nearly unintelligible languages? Or, Chinese or Egyptian hieroglyphs... ?
> I think you're trying to to argue something like:
I'm arguing exactly what I wrote: a phonetic language is one when you can see a written word and pronounce it correctly, without knowing what it means and without having ever heard it before.
Edit - as an example, consider "door" and "pool": the written form is not sufficient to guess the sound to associate to the double o.
Which language is phonetic? I think you're beating around the bush here; you claim English isn't phonetic, but which language is?
Spanish and Italian are.
There's also Finnish.
This is something that should be looked up and not argued about. As far as I can remember, the vast majority of alphabetic languages are phonetic. English, French, and Portuguese are not.
Being able to guess how something is pronounced sometimes is not enough to say that English is phonetically spelled. English often borrows spellings directly from the languages that it is borrowing a word from, those spellings are usually phonetic (based on the source language's rules), and due to the presence of certain peculiar sounds, one can often guess which phonetically-spelled language a word was borrowed from. That's not an English word being spelled phonetically, that's people being forced to become language detectives. You can get lucky and guess the pronunciation of a Chinese character that you've never seen before (based on the radicals), but no one would say that Chinese characters are a phonetic alphabet.
Other than the soundalikes "b" = "v" and in Latin America soft "c" and "z" = "s", when Spanish speakers don't know how to spell a word, it's because they are also saying the word wrong when they speak.
Door and pool are pronounced the same where I am, with a drawn out double o sound. When spoken rapidly, the vowel contracts, especially in door.
The door vowel placed between P and L would make the word 'Paul' or 'pall' in most English accents. If I imagine 'door' with the pool vowel, I get something like a Scottish pronunciation of 'dour'.
dew-r pew-l
blood
Counter point, anyone that claims English isn't a phonetic language doesn't know what a logographic writing system is. Or what a gesture language is.
I'm not stating that English is anything like that. Just that it is not phonetic, in the sense that the written form of a word is not sufficient to pronounce it correctly.
That isn't what that means, though. It is not regular, it is phonetic. Indeed, your argument that there is confusion in spelling is because it is phonetic, but not regular. You know the letters in "glasses" correspond to sounding out something. In contrast to something like an emoji, :glasses:, which you don't.
I have to agree with you. With respect to emojis, English is phonetic. But this statement is as stretched as considering a diesel guzzling truck green because the fuel it burns was indeed created using solar energy.
No it isn't. Pedantically, English the language is definitionally phonetic, as it is spoken. Sign language is not phonetic, nor are things like smoke signals/traffic signals/etc.
Just as it would be silly to claim that Japanese is not phonetic. Of course spoken Japanese is phonetic. They even have two fully regular alphabets that can both express the same phonemes, but are used for different reasons. As well, they have a completely logographic set that does not relate to phonemes, even though it is used for most writing.
We're discussing features of written language ("phonetic" -- or the etymologically related "phonological") is a way of categorizing writing systems by their relationship to spoken language.
> Of course spoken Japanese is phonetic
"Phonetic" is not a feature of spoken language, but of the relation between other language forms (usually, written, but you could make the same distinction for, say, sign languages) and spoken language.
> They even have two fully regular alphabets
I assume from "two fully regular" you are referring to hiragana and katakana, but those are syllabaries, not alphabets. (Romaji is an alphabetic system, though, but I don't know where you'd find a second one.)
Phonetic is absolutely a feature linked to spoken languages, though? It quite literally is relating to spoken sounds. Sign language, for example, is not phonetic, as many users of it cannot speak or hear.
Fair that I should have said they have two phonetic writing systems, decidedly not alphabets. I'm not sure the distinction is one that matters for what we are covering here?
> Phonetic is absolutely a feature linked to spoken languages, though?
It's a feature linked to spoken languages, since it is a feature of the relation of non-spoken (usually written) language to a spoken language.
But it is not a feature of a spoken language.
> Sign language, for example, is not phonetic, as many users of it cannot speak or hear.
Yes, in causal terms, the fact many users of sign languages aren't familiar with the sounds of the spoken language is a reason sign languages tend not be phonetic, but they are not phonetic in definitional terms because the symbols in the sign language do not represent the sounds of spoken language.
But it would make no sense to call a spoken language phonetic (except maybe if it was a code for a different spoken language, in which the phonemes in one mapped to the individual phonemes, rather than ideas, of the other.)
It absolutely is a feature of spoken languages. It is in contrast to vocalizations, specifically because it is about speech and not just the sounds animals can make.
I get what you are aiming at, but phonetics is about speech. Is why you can reliably say how many phonemes different languages have. If you had to cover all vocalizations that people could do, you would have a bit more trouble.
"phonetics" is about speech, but the noun "phonetics" is not the adjective "phonetic" as applied to a language. "phonetic" is not a modifier that applies to spoken language (with the hypothetical caveat I gave upthread), and even if it was, it would have a different definition than the one that applies to non-spoken language and is about the relation such a language has to a spoken language, so trying to redirect to it in a discussion of that use of the adjective "phonetic" would be equivocation, argumentative conflation of different definitions of the same word.
It is hard for me to read this. You seem to have given up on capital letters. And sentences. I don't like criticizing run-on sentences as being indicative of bad thinking; but I do literally feel you grasping here.
I'm largely comfortable with the idea that there is something lacking in the orthography of English. Fully comfortable, even. I'm growing frustrated with how many are pushing the idea that it is not phonetic. The system is literally to convey, in writing, the words that you would speak in English. And the word "phonetic" captures that perfectly.
If you want to argue that we are building a new use of the word "phonetic" applied to writing that supersedes "orthography" and related terms. You do you. It still seems nonsensical to me and only works if you ignore that we have an alphabet that is literally used to convey speech sounds.
The issue at the start of the conversation is not about speaking or gesturing. It is about using the Latin alphabet properly (i.e. phonetically, as it was designed) or "with some imagination" as the English does.
The alphabet is used to communicate the spoken words. Not the concepts or something else, literally the spoken words. Is a big part of why slang is so popular in fiction settings, as they would use the letter to convey pronunciation. Because the letters generally represent phonemes.
> but not regular.
There is "not being regular" and there is "not even trying, and getting it right by a stroke of luck from time to time".
I learned to read phonetically, sounding out the words. It worked very well. No other scheme for learning to read has worked remotely as well.
I think I was too swayed by Sold a Story; but I am heavily convinced that the non phonics based attempts to teach reading was a massive disaster. And not just for reading literature, but also for reading math. Without learning to effectively interact with symbols, people grow to think they either get the math or they don't.
No Professor or "expert" in the Education field ever advanced their own career by advocating for simple & obvious things which actually worked.
/s?
Yeah, English orthography is a hot mess, but it's still fundamentally phonetic and alphabetic. Just try to learn to read Japanese or Chinese, and you'll very quickly come to miss English's pile of nonsense.
> That is a really bad example, because English does have fairly productive pronunciation rules
Not really. There's no way to guess how many english words are pronounced based on the written form, unless you've heard it before. And of course the pronunciation may vary wildly based on region/country as well.
The most telling evidence of this is the existence of Spelling Bee competitions in english language countries. The fact that hearing a word being spoken is challenging enough to figure out how it is written that it is a competitive sport, says it all.
There are many languages where the concept of a spelling bee competition makes no sense at all, because as soon as you hear the word being spoken, it is 100% deterministically obvious how it is written. English, not so much.
But, french is much worse!
According to this paper [0] and my own experience, it's way easier to pronounce a word in French given the spelled word than in English. It's slightly harder to spell French than English for the model of the study, but it's really close. Now, in my personal experience, I feel like French has a lot of rules while English has a lot of outliers which do not follow any rules. But my native language is French, so I am obviously biased.
[0]: https://arxiv.org/pdf/1912.13321
Yeah as far as I know, in French words are always pronounced consistent with how they're spelled. The same is not true in English. Americans complain a lot about french spellings '-ioux', 'eau', etc. but they offer no gripe over the difference between '-ough' in 'enough' vs 'through'.
French is funny to me because the written language and the spoken language are in some ways quite different, with written french introducing considerable complexity. aller, allait, allais, allaient, alleé, etc. Since the spoken context for all the conjugations is almost always clear, I'm not sure why someone introduced the extra complexity.
> Yeah as far as I know, in French words are always pronounced consistent with how they're spelled.
Whoa, very much not! I have spent the last 20 years trying to learn how to pronounce french words (my partner is a native french speaker, so I keep trying). The only somewhat consistent pattern I have is that the last few letters of each word are often silent, but even that is not really always consistent.
I'm fluent in 4 languages but french is an impossibly tough nut to crack for me.
> Yeah as far as I know, in French words are always pronounced consistent with how they're spelled.
It's far from as bad as English, but here's a Reddit thread with lots of French words which are not spelt as they are written. Not esoteric words either; along the lines of hier and monsieur
https://www.reddit.com/r/French/comments/1269a2x/is_there_a_...
I disagree. For whatever reason, most proficient readers I know have an intuition about the correct pronunciation of a word even if they’ve never heard it spoken before. And even if they use an intuitive pronunciation that isn’t identical to standard pronunciation, they’ll still be understood.
Spelling bee is the opposite direction, going from pronunciation to spelling; not a fair comparison.
> For whatever reason, most proficient readers I know have an intuition about the correct pronunciation of a word even if they’ve never heard it spoken before.
Because pronunciation rules exist, they're just never explicitly taught and instead learned through exposure. For example, here's someone reconstructing as many of the rules as they can: https://www.zompist.com/spell.html
South Slavic languages have 1-1 mappings thanks to an engineer in disguise [1]
https://en.wikipedia.org/wiki/Vuk_Karad%C5%BEi%C4%87#Linguis...
It's really not that hard. Some other examples of the same include Belarusian (albeit with Cyrillic alphabet) and Finnish.
Finnish is extremely easy, there is one sound for each letter and zero exceptions.
Spanish is also very predictable. While there are a few exceptions (like 'c' can be 'c' or 's'), they are very easy rules to follow, so never any surprises.
English and French are in the batshit crazy category. It's pretty much all random, you just have to know from memorization.
I would expect that spelling bees would select words that are not phonetically spelled. This selection bias does not imply that English does not have productive pronunciation rules.
True, in that spelling bees will select for harder words.
But the fact that such words exist, in such large quantities that memorizing them all is so challenging that this becomes a competitive sport, is why engligh is so impossible.
Dutch, which has a pretty reasonable sound-to-orthography mapping (some exceptions of course, but not all that many) also has spelling bees. Often won by the Belgians.
> Not really. There's no way to guess how many english words are pronounced based on the written form, unless you've heard it before. And of course the pronunciation may vary wildly based on region/country as well.
> The most telling evidence of this is the existence of Spelling Bee competitions in english language countries. The fact that hearing a word being spoken is challenging enough to figure out how it is written that it is a competitive sport, says it all.
That's two exact opposite things.
Languages for which you know how to pronounce a word just from its written form => you can have spelling bee competition there.
Languages for which you know how to write a word when you hear it pronounced => no spelling bee competition.
I'll take French as an example : if you see "o", "au", "eau" in a word you know how to pronounce it. There is one and only way. But if you hear "o" in a word then good luck knowing how to write it. So you got dictées (spelling bees) even if you can easily guess how a written word sounds like. The existence of spelling bee competition in the English world is not proof that the language written word pronunciation are a guess.
> But, french is much worse!
Nah. Having learned both, French is easier in this regard. It is not as random, it has rules they work most of the time.
French has much stricter rules, but I could see how the abundance of silent letters would make a spelling bee harder.
French also has some weird gotchas, e.g. "la démocratie" where the spelling represents the word's root rather than pronunciation.
An even worse example is the last name "de Broglie", which I think most French natives would likely get wrong
https://www.youtube.com/watch?v=k45IZDkg2Pg
Not nearly as bad as the English pronouncing, say, the name Cholmondeley :-)
As a spanish I could say the most challenging part of english is the lack of consistency between how you write something and how you pronounce it.
Spanish is totally systematic in this sense and once you can read it, you can pronounce it.
English is a bit messy regarding to this, for whatever reasons.
Portuguese and German are like that.
You’ve never seen the word before, but when reading it for the first time, you’ll probably pronounce it correctly.
English is awful, but French takes the crown on this one—though more because it has the same pronunciation for many different words and written forms.
English, on the other hand, the alphabet doesn’t map well.
Mood and flood both have “oo”, yet each is pronounced differently. You need to know the word beforehand to know exactly how it’s pronounced.
Or live and live, read and read (past participle), or castle (the t is mute) or bear, beard, the ea is different.
I do not want to be offensive, there are lots more , but it is an amazing sh*tshow the mapping.
If you think castle is bad, wait till you hear forecastle (“fok-sul”)
At least that's often spelled "fo'c'sle" these days, which gives you a good idea of the actual pronunciation.
My personal favorite in English is "colonel" being pronounced the same as "kernel". Which is insane even from an etymological perspective because the word is a derivative of "column" (as in, a colonel is someone who commands/leads a column of soldiers).
Yes, an incredibly rare use of double apostrophes in English! More uncommonly you'll see bo's'n as well, for boatswain.
Hahaha. I was not aware of that one. Yes, looks like undecipherable.
A lot of nautical terms have unusual pronunciations. English sailors primarily came from coastal regions, and were very happy to have a lingo that was incomprehensible to the landsman. All of this carried over to North America as well.
its just elision. "four-cassle" vs "fo'k'sul."
French is a lot less bad than English in this regard. In French you can usually (though not always) predict how a word is pronounced from its spelling, but not vice versa. In English, both directions are impossible.
French is not a good example. Pronunciation often deviates from spelling in French (e.g. many silent letters and inconsistent mappings).
Hungarian, however, is pronounced the way it is written, as its orthographic type is phonemic, whereas French and English are of type deep orthography.
Serbian is of the perfectly phonemic type. "Write as you speak, read as it is written" is a common saying.
The silent letters are not the point - that's why the poster you replied to said it doesn't work speech->writing in French. But writing->speech is much, much more consistent than in English, even if the orthography itself is kinda criminal with all the silent letters and whatnot.
I am inclined to agree.
I've only really been exposed to French in music, where I've sung various French pieces of the years. But from my experience, at least French is consistent? As-written is as-pronounced.
Is this not really the case, and therefore is French also guilty of having the same vowels/consonants pronounced differently for completely arb reasons?
Fritteuse.
My son's first year teacher said (I may have the numbers slightly wrong) that Spanish has 23 phonemes (sounds the mouth makes) and 23 graphemes (ways to write sounds). English, on the other hand, has 43 phonemes and over 500 graphemes.
Spanish is better than English, but it's nowhere near that regular. There are three different ways to pronounce "x", wild dialectal variations in "ll" and "c", etc.
The rules are very clear on when those are used though, you are not really arguing the original point imo. What are the dialectical variations in "ll" and "c"?
(B2-ish Spanish learner here but) "ll" is pronounced in at least three variants that I know of: "y", "j", and something between "sh" and "ch". E.g. "llama" might be pronounced like (in English writing) "yama", "zhama", or "shama". The last one really threw me for a while; it's super common in Argentina at least.
I spent time in the "Rio de la Plata" area in the late 1970s, mainly Montevideo, and learned rioplatense Spanish, and would use the ZH sound as in "meaSure" for Y/LL letters in "playa" and "calle".
In the last 40 years I've spent mostly in the USA I rarely have heard Uruguayan/Argentinian Spanish in person or in media, but was surprised to hear Messi and others in recent interviews use SH as in "puSH" for the Y/LL, this apparent has been a generational shift in that area, first in Argentina and then Uruguay. I'd sound old-fashioned if I were to go back to Montevideo these days.
I see what you mean. I think you should stick to one form and learn by difference or you could quickly get lost.
"ll" in standard spanish is a strong english "y".
However, in spanish argentinian from the area of Buenos Aires (but not the argentinian Córdoba, which sounds more like colombian spanish) it is "sh", being that s something like a mix in-between of "j" and "s" + h as in "she" but the sound is a bit different.
Without being able to record some sound I cannot express it better but I am sure you can find something around. Javier Milei, the president, has such an accent.
AFAIK "ll" can also be the palatalized "l" sound in some dialects, i.e. in the same relationship to regular "l" as "ñ" is to "n". Indeed, this is the original pronunciation from which all others have diverged.
as has been stated many times in this thread, the rules are also very clear in English. They just aren't taught.
I think that must have been within one dialect. If you include all dialects of English (Scottish, Irish, Australian, Singaporian, Indian, American, etc. etc.) I'm sure you have a lot more than 43 phonemes.
In any case, her point wasn't to give a lecture on linguistics, but to impress upon the parents how complicated English really is to learn to read.
The dialects I can buy it but I think the x has only two ways? It is a very regular language from the point of view of the written mapping to sound.
x is pronounced four different ways in Spanish: like j in México, like the English “sh” in Xcaret, like s in xenofobia and like English “x” in extremo.
The first two are not productive now in normal Spanish words: they are only used in old spellings that have irregularly been retained, and in loanwords from indigenous languages. But they do exist.
Well, yes. I was speaking about standard Spanish from Spain.
Xenofobia is an s, yes, and excursión is "ks" In fsct, Méjico is the traditional way to write Mexico in Spanish grom Spain until it was accepted the other form a few years ago. I still write "Méjico" myself.
Since less than 10% of Spanish speakers are from Spain, there’s no reason to assume you were specifically talking about that one country when referring to the Spanish language in general.
And anyway, as you point out, even in Spain the form México is accepted now.
I thought it is perfectly reasonable to talk about spanish from Spsin the same you talk about English from England.
After all, it is where they come from originally and have their own spelling (colour vs color, etc.)
An x in standard spanish has always been the two sounds I told you and that mexican deviation is specific to Mexico.
Yes, it is over 100 million speakers but I was still assuming the root language in its original place as the reference. Sorry if I did not express it correctly.
I get your point, but FWIW, México is not a Mexican deviation; it's just an older Spanish spelling. E.g. Jiménez was once spelled Ximénez and there are probably lots of other examples.
The "root language spoken in its original place" absolutely did pronounce X like modern J.
True, I forgot that detail. Ximenez did exist in fact and I forgot that. So it must be that.
Dialects are different since they are still internally consistent.
> phonemes (sounds the mouth makes)
This isn't entirely correct. A distinct sound that the mouth makes is a "phone". A phoneme is almost always a group of several phones - allophones - that native language speakers perceive as a single sound. Another way to phrase it is that if you change one phoneme to another one, it makes a different word (possibly a non-existing one, but regardless the native speakers would consider it distinct), but changing from one phone to another doesn't change the word.
For example, in English, the phoneme /t/ has allophones [t], [tʰ], [ɾ], or [ʔ] depending on context. OTOH [ɾ] is a distinct phoneme in Spanish, and [ʔ] is a distinct phoneme in Arabic.
Unfortunately these two are often confused, so one should be careful with such counts and comparing them - it's not uncommon when people count phonemes in their native language, but phones in other languages (when those phones sound distinct to them).
This can also vary significantly from dialect to dialect, since one very common thing in language evolution is for two similar phonemes to collapse into a single one while retaining the original distinction as allophones. For English, in particular, the number of phonemes varies a lot between American and British English (with the latter having more distinctions).
Spanish "maps" very nicely but even Spanish isn't exactly 1:1
- /k/ can be written both c and qu, and k where it occasionally appears in the language (e.g. kilo) - and the u in qu is silent.
- /s/ can be written c, s, and z, though stress rules are different for c and z.
- r and rr are distinct sounds but r = rr at the beginning of words, I think.
- At least in Mexican Spanish: The "ua" sound can be spelled ua or oa (e.g. Michoacan, Oaxaca) - and also the breathy sound of j can also be written with an x.
- d has a sound a little like English voiced-th at the end of words (e.g. juventud)
qu: the u is always silent and qu is followed by i or e. It is still a systematic way of reading. It is like gue and gui, you pronounce as in "singer" the "ge", the u is mute. If you want to pronounce the u, as in pingüino, you set the diaeresis.
The stress rules, to the best of my knowledge, is very systemaic (not 100% but I would say "almost" at least for the words in use). Even the stress rules are very uniform.
> r and rr are distinct sounds but r = rr at the beginning of words, I think.
This is still systematic reading. At the start of a word it is the strong one, yes. And when it is preceded by a consonant, such as in "enredar" (that is strong r). There is no exception of any kind here.
> d has a sound a little like English voiced-th at the end of words (e.g. juventud)
That is some dialects in some areas. We pronounce a clean d at the end in my area (around Valencia). It is also the correct, standard way to do it for spanish. The other is a deviation existing in León, for example.
Not an issue with /s/ in Iberian Spanish. Once you have 'distinción', most ortography errors I've read overseas plummet down there.
Yes, I'll always remember the long time spent asking for the whereabouts of Ocean Drive, mispronounced by me because the correct pronunciation would require the word to be written as Oshean or maybe Oshan. It was 1995. I have had very few occasions to hear native speakers. A lot of people and I were figuring out plausible but incorrect pronunciations by applying the most usual pronunciation rules to the written words.
> Spanish is totally systematic in this sense and once you can read it, you can pronounce it.
IMHO purely phonemic orthography makes orthography unnecessary complex, as there are language features like assimilation[1] that happens naturally in spoken form but does not make sense in written form.
In contrast, morphophonemic orthography keeps systematic and consistent mapping between spoken and written form for individual morphemes, but not necessary for words, as in written form morphemes are just concatenated (to make words), while in spoken form there may be complex interactions.
[1] https://en.wikipedia.org/wiki/Assimilation_(linguistics)
It's not so strict, but we try most time to keep it consistent. For example, here in Buenos Aires we almost don't say the "d" at the end of the word, like in "ciudad" (city), in some pronunciation guides I saw it written with a tiny d.
If the variant get's too popular the two versions become the official spelling, for example "septiembre" and "setiembre" (September) are correct. I hate the second one and I never use it, but it's popular somewhere. After many years, sometime the old spelling disappears and is marked as archaic.
An orthography that surfaces (non-phonemic) assimilation would be phonetic rather than phonemic. For example, many languages assimilate "n" to "m" before "b", but the phoneme is still /n/, and native speakers are often not even aware that this assimilation occurs (which is what indicates that it's still the same phoneme).
Strictly speaking, spanish has the same sound for v and b, unlike other romance languages. G and j when followed by e or i also.
This is true, but Spanish orthography isn't completely phonemic (and simpler for it). It is very shallow and very consistent but it doesn't spell out things like assimilation differences, people are just wrong to describe it as completely phonemic.
as i understand it, english is actually 3-5 other languages in a trench coat.
This often gets trotted out, but it's not really true. English is a solidly Germanic language, which merely happened to lose the core attribute of Indo-European languages (extensive verb inflection), and in more recent centuries, there's been a tendency to adopt Latin and Greek words for new word formation rather than (as German did) using native words. So 'technology' instead of 'craftlearn' or 'television' instead of 'farsight'.
Even among major languages, English isn't anywhere near the worst offender of copulating with other languages for features--it never really adopted foreign grammar, the way you see with, e.g., Turkic languages.
Solidly Germanic with an absurd amount of French, down to nearly identical spelling for many common words. I’m not talking about cognates but actually 100% the same spelling and meaning and they’re often not from some recent century but from old French.
I’m sure you have a solid basis for saying this but it’s basically impossible to write many sentences without by accident using French down to the original spelling.
I was going to highlight all the examples I used by accident myself in this post but I gave up because the links were making it too long.
This is why something like Anglish even exists https://en.wikipedia.org/wiki/Linguistic_purism_in_English
I believe this is because England was conquered by the Normans (french speakers). I think it was within the last 100 years or so that the English aristocracy finally stopped speaking French among themselves.
As I understand it, English at it's core is a Germanic language that underwent significant creolization with scandinavian sources. That core then acquired a significant amount of Old French and latin vocabulary, particularly in upper class terminology.
The creolization is why English has a relatively simple grammar, and all the word sources is why we have like 16-20 vowel sounds trying to cram into latin characters.
> English has a relatively simple grammar
You mean "relatively simple morphology". English phonology and syntax are not simple at all (e.g. lots of information carried by word order).
Let's not downplay the influence that the French language had on English.
> in more recent centuries, there's been a tendency to adopt Latin and Greek words for new word formation rather than (as German did) using native words
Note that the prevalence of native words in German is the result of a modern reform movement, not something that happened naturally within the language.
> [English] never really adopted foreign grammar
There's the argument that do-support is borrowed from Celtic.
There's a really good podcast [1] that dives into the background of English. It starts off even further back, talking about PIE and how that affected all the earlier languages of the region. And then starts tying the pieces together on how English was formed.
[1] https://historyofenglishpodcast.com
At least 3:
~26% Germanic
~29% Latin
~29% French
~16% Other
RobWords covers this really well: https://youtu.be/PCE4C9GvqI0?si=4Wd6NFus4v1YqmC3
Almost all the most used words in English are Germanic. Latin in particular is overrpresented because of scientific and technical terms which are rarely used.
That might be true if you just count up every word in the dictionary by origin. However if you weight the words by frequency, Germanic will be way higher. That is, if you take a transcript of an average conversation in English, the proportion of words inherited from Old English (i.e., Germanic) will be much higher than 26%.
It seems hard to measure with any kind of objectivity here, considering how much Latin is in French (and even modern German) as well.
Blame the Normans for that one...well English was already kind of a mess, but the the conquest of England by the Normans really sealed the deal.
> Spanish is totally systematic in this sense and once you can read it, you can pronounce it.
is there no accent variation in Spanish?
Such a 1:1 system would never work in English, because the way words are pronounced can be very different in e.g. Melbourne, Newcastle-upon-Tyne and Boston, for example.
One of the problems in english (not the only one, but one of them) is that for the vowels there are 5 graphs (is this term correct? Sorry but hope it is understandable) but many more sounds. In Spanish there are 5 vowels in the latin alphabet and exactly five sounds and nothing else.
Valencian has 7 sounds though, two for e and two for o. Similarly, Catalan also (and in some circumstances the o sounds as u, when the stress is not in it and other stuff). But they still have quite strict rules.
Yeah but we represent a lot of vowel sounds by combining vowels - 5 letters (not including y), if we allow any combo of two to represent a different sound that's 25 combos, and if we remember that preceding and following consonants can modify vowels too (though, dough, caught bought vs thou, bao, sour, or; on, con, Ron vs how, cow, ow) that's quite a lot of combos.
Now, you can (and should!) accuse me of cherry-picking examples, since the rules are less consistent and/or vastly more complicated than what I represented. But I maintain that there are orders of magnitude more ways to represent vowel sounds than 5, and the clue is the context. Not, as many will suggest, memorizing each individual case (though there's certainly plenty of that going around, much like Spanish's infamous irregularly verb conjugations), but understanding categories and families and patterns.
English sounds usually are best understood with groups of three letters, rather than one letter at a time. If you looks at throuples, you'll likely find far more of that consistency we all so deeply desire.
Yes, English is VERY consistent. The problem is that there are multiple systems working inside English vocabulary, so you have to get familiar with more than one rule set.
You're right to point out that English pronunciation varies widely across regions, but that doesn't fully negate the value of a systematic orthography. What germandiago is referring to is the relationship between graphemes (letters) and phonemes (sounds). Spanish has a highly phonemic orthography, meaning the rules for converting letters to sounds (and vice versa) are consistent and predictable. Yes, there are accentual and dialectal variations within Spanish (e.g. seseo in Latin America vs. ceceo in parts of Andalusia) but these are largely phonological shifts applied systematically, not random deviations from spelling norms.
In contrast, English has a deep orthography, where historical layers (e.g. Norman French, Old Norse, Latin borrowings) and sound changes (like the Great Vowel Shift) have led to a chaotic mapping between spelling and pronunciation. A consistent system wouldn't eliminate dialectal variation, but it could reduce ambiguity and aid literacy, as evidenced by languages like Finnish or Korean.
I don't know if Korean is ultimately that good. Hangeul are a monstrous improvement over the old mixed script (which itself is better than the Japanese iteration because the Koreans only used Chinese characters for Chinese loans), but it still has a lot of sound change rules and can be a bit of a pain to read because of how letters flow to the next syllable. It's not in the same league with Finnish or Spanish, at any rate.
Yeah there are multiple accents in Spanish, but each accent is still a 1:1 mapping from written word to pronunciation, there's no enough/through/dough nonsense.
For example for a small car ("auto") you say and write:
In Argentina: "autito"
In Colombia: "autico"
In Spain: "autillo"
the same rule applies for all words, not only for cars.
In Spain you'll listen the three cases at once and all of them are perfectly valid.
-ito it's almost the universal way everywhere in the Hispanic world.
-ico it's widely used in the South of Navarre and Aragón and everyone will understand you. Heck, it's the diminutive from used by the hick people, and thus, it's uber known, altough you might look like a bumfuck village redneck sheepherd with a beret by using -ico outside of Navarre/Aragón.
-illo it's more from the South, but, again, understood everywhere.
In Argentina everyone will understand you, but if you don't use "ito" then people may ask where are you from.
"ico" is used in many countries of Central America and Caribe. I asked someone from Colombia, so I'm sure about Colombia but I'm no sure about every other country.
Is "illo" used in Madrid? I think I heard it in movies or TV programs from Spain.
Yes, it's used, all over the whole country.
The explanation you gave is already contained in the cited Wikipedia article. I think this "ghoti" example is more of a tongue-in-cheek mocking of pronunciation inconsistencies. If you want a jarring example, consider laughter and slaughter. I know, i know, they have different origins, but still, it confuses foreigners like me while learning the language.
But English orthography isn't meant to serve foreigners.
Im ESL, I struggled with English spelling as much as the next latin speaker who's already learned to read and write in foreigner.
But now that I get the reason behind it, I love it. I consider English orthography worthy of UNESCO protection, even. In fact, I am annoyed at the regular spelling of my two latin languages that have left so much history behind.
English Orthography doesn't exactly serve native speakers either.
It’s fairly good at helping us understand the etymology. Have a “y” acting as a vowel in the spelling? Good chance it’s Greek. Have a “k”? Almost certainly not Latin.
That is trivia that is useless in almost all contexts. I've been a native English speaker all my life and this is the first I've heard of that. I can't think of any situation in life where knowing that fact would have been helpful. Your claim seems reasonable, but if someone says you are wrong I wouldn't fact check it even if clear links were posted so that I could.
But if you had known it (aka, if anyone had taught it to you), it wouldn't be useless, as you would know the context and how to pronounce it...not to mention the meaning behind it
If you’re seeing a word for the first time, it is pretty useful - partly with pronunciation but definitely with meaning.
You do have to have some familiarity with the source languages, but if it’s an unfamiliar but nativized word, those are almost always ultimately Latin or Greek.
If you're seeing the word for the first time and need to figure out how to pronounce it, how would you know that “y” is acting as a vowel and not as a consonant in the first place?
If it's followed by a vowel, it's likely a Germanic word: yule, your, young, yellow (and you probably know the word, since our core vocabulary is still mostly Germanic). If it's at the end or between consonants, like syllabary or ontogeny, probably Greek.
You might also just happen to know a smattering (or even a lot) of Greek and Latin.
Im a materials scientist and I use etymology every day.
Knowing etymology is a an easy way to memorize things.
> But English orthography isn't meant to serve foreigners.
Or natives. It is slower for children to learn to read English than other languages.
Teaching my toddler to read now and I definitely feel like if we spoke Spanish my work would be done already.
Probably not. Toddlers generally don't have the brain to learn any reading. Spanish's advantages in reading isn't how young you can start learning to read, it is how fast you can stop reading. Spanish schools stop teaching reading takes about 5 years to learn, English 6, and Japanese 9 - after that much training kids are finally considered to read anything. (sometimes we talk about college level reading, but that is more about mastery of topic specific topic - Doctors, lawyers, and engineers each have special vocabulary that needs extra training to read, but they cannot read each other's technical papers)
I learned how to read in six months.
My kid took two and half years.
The Chinese take 10 years.
So what? Are the Chinese terribly educated?
English is not a phonetic language and it also lacks accents.
Saying it has pronunciartion rules it is an strech. You have conventions.
In languages like spanish if you read a word, is very hard to misspronounce it.
no, for the millionth time, English has rules, NOT conventions, you just need to know the historical context behind the multiple rules.
I want to know who thought that chinese transliterated into "english characters" should use a whole bunch of q, x and zs to represent sounds in a way that no other english word does.
Why is Zhou pronounced that way?!
Pinyin was written by Chinese speakers for Chinese speakers. There are other romanizations written by westerners, and these are easier to see where the sounds come from; e.g., "tsai" rather than "cai".
What use is "q" as a letter at all in English? It makes a "k" sound and always occurs with a "u" after it. Why not use it for the "tch" sound? (Which, btb, is different than the "ch" sound.)
"C" is about the same -- by itself it always sounds exactly like "k" or "s". Why not use it for the "ts" sound?
As for "zhou" -- in English, z is very similar to an s, but voiced. So in pinyin, zh is just like ch, but voiced.
Lots of languages do this BTW. When people from Wycliffe want to translate a Bible into an obscure language without a writing system, they first have to invent a writing system. They could invent all new characters, but why? All it would do is make that language hard to type. So they take the sounds that language has, and map them onto Latin characters. Sometimes there's an obvious mapping, sometimes not.
Look up Welsh's spelling for another example of this.
> Why not use it for the "tch" sound? (Which, btb, is different than the "ch" sound.)
What are you thinking of? There is no difference between those things.
But your major point here is correct; on the fundamentals there is no reason for the English alphabet to feature a Q.
> "C" is about the same -- by itself it always sounds exactly like "k" or "s". Why not use it for the "ts" sound?
With the modern alphabet there's no reason for a C either. However, the answer to "why not use it for the 'ts' sound" is pretty obvious - that sound isn't part of the English phonemic inventory. It occurs, but that is almost always just a result of what is supposed to be a bare /t/ being followed by /s/ for grammatical reasons. (For an example of the general feeling here, note that an English word cannot start with /ts/ at all.) Why would we use any letter to represent the "ts" sound? We represent it the same way it exists in our language, as a sequence of two unrelated sounds.
> So in pinyin, zh is just like ch, but voiced.
Technically the only voiced consonants in pinyin are m / n / ng / l / r. I think a voicing contrast was present in Middle Chinese, and there's one today in Shanghainese and presumably other Wu dialects, but not in Mandarin.
> What are you thinking of? There is no difference between those things.
I'm talking about pinyin here. In Mandarin, there are to distinct sounds, one represented in pinyin by 'q', and one by 'ch'. It took me months to hear the difference, and months more to be able to pronounce them properly. I think there are other romanizations where the 'q' sound is represented "tch".
(In fact, I'm inclined to think that there are actually two different sounds in English as well; "witch" and "Charlie" don't feel the same in my mouth.)
> Technically the only voiced consonants in pinyin are m / n / ng / l / r.
I think we're using different definitions of "voiced". Other voiced / unvoiced pairs in English include g/k, b/p, v/f, z/s. See [1] for an "official" example of "voiced" being used the way I'm using it.
How else would you describe the difference between "qu" and "ju", or "chou" and "zhou"? The only difference I can feel is when your vocal cords turn on.
[1] https://en.wikipedia.org/wiki/Plosive#Voice
The article you linked to specifically says there are only voiceless plosives in Mandarin!
And, you'll notice I pointed out English voiced plosives. :-D
> In fact, I'm inclined to think that there are actually two different sounds in English as well; "witch" and "Charlie" don't feel the same in my mouth.
There aren't.
> I think there are other romanizations where the 'q' sound is represented "tch".
Well, maybe; there are a large number of romanizations of Mandarin. But there are no significant romanizations where that is true. It's q in pinyin, ch' in Wade-Giles, and ts' or k' in postal romanization.
> How else would you describe the difference between "qu" and "ju", or "chou" and "zhou"? The only difference I can feel is when your vocal cords turn on.
You could read my other comment in the thread. qu and chou are aspirated; ju and zhou aren't. Your vocal cords don't turn on at different points for those syllables. Mandarin Chinese doesn't use voicing contrasts.
> I think we're using different definitions of "voiced". Other voiced / unvoiced pairs in English include g/k, b/p, v/f, z/s. See [1] for an "official" example of "voiced" being used the way I'm using it.
Yes, I know what voicing is. You don't seem to know what consonants are used in Mandarin.
Compare https://en.wikipedia.org/wiki/Standard_Chinese_phonology#Con... .
> qu and chou are aspirated; ju and zhou aren't. ...Compare [ref]
So the idea here is that chou and zhou are related in a similar way that the t's in "top" and "stop" are related: your mouth and vocal cords are doing the same thing, but in one case you have the puff of air and the other you don't.
At any rate, going back to the original question: the logic behind the choice is still consistent. On this classification, in Mandarin, p and t and ch are aspirated, and in English p and t and ch are voiceless; b and d and j and zh are unaspirated, and in English b and d and j and z are voiced. (And q is mainly thrown in to fill the gap, but its pronunciation in English is voiceless as well.)
Or, to explicitly quote from the ref you shared:
> Such pairs [of aspirated and unaspirated plosives and fricatives] are represented in the pinyin system mostly using letters which in Romance languages generally denote voiceless/voiced pairs (for example [p] and [b]).
Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction. In the former case unvoiced consonants often have aspirated allophones as in English, and in the latter case unaspirated consonants often have voiced allophones especially between vowels, as in Chinese or Korean. Hence why it makes sense to map the two in this manner - if your native language uses aspiration as the primary feature, and you hear someone who uses voicing, your brain will generally map it "automatically" for you, and their speech will sound weird but understandable.
(But then you get Hindi with a four-way distinction, both voiced/unvoiced and aspirated/unaspirated in all possible combinations.)
> Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction.
Yes, that makes sense -- I certainly learned something from this conversation. It makes sense that speakers would naturally tend to classify things along different lines, and in Chinese the aspirated / unaspirated classification makes sense.
That said, after having had some time to sit with the proposition that 'j' in the English name "Joe" is voiced, and the "zh" in Chinese word "zhou" is unvoiced, it continues to seem obviously false to me. It seems very much to me like mistaking of the map for the territory [1].
[1] https://en.wikipedia.org/wiki/Map%E2%80%93territory_relation
> But then you get Hindi with a four-way distinction, both voiced/unvoiced and aspirated/unaspirated in all possible combinations.
They're spelled that way; I don't think they're supposed to be pronounced that way.
https://en.wikipedia.org/wiki/Aspirated_consonant#Voiced_con...
>> True aspirated voiced consonants, as opposed to murmured (breathy-voice) consonants such as the [bʱ], [dʱ], [ɡʱ] that are common among the languages of India, are extremely rare.
> Languages usually have either the voiced/unvoiced distinction as phonemic, or the aspirated/unaspirated distinction.
My understanding is that all of these options are fairly common:
- two-way contrast between aspirated and unaspirated
- two-way contrast between voiced and voiceless
- three-way contrast between voiceless aspirated, voiceless, and voiced
- three-way contrast for labial and alveolar stops; two-way contrast for velar stops
> They're spelled that way; I don't think they're supposed to be pronounced that way.
True, but most languages don't distinguish between [h] and [ɦ] to begin with, with one often the allophone of the other. So listening to Hindi it sounds like the same thing, more or less.
It's best not to think of Hanyu Pinyin as using "English characters" to pronounce Mandarin. It's just a mapping of the initial, medial, and final sounds onto the Latin alphabet in a consistent way, so that once you know the mapping, you know the pronunciation right away, and more practically, you can _type_ it right away.
https://en.wikipedia.org/wiki/Pinyin
I used to always think these romanization schemes were really bad, until I realized they were just not for me. The ease of sight-reading and getting the correct pronunciation for a random english speaker is not the goal. It's primarily for the convenience of users of other languages to have a systematic encoding. To make it pronunciation-friendly you would have to have to add a bunch of complexity to the mapping that would compromise its usage by the real audience.
A few plausible answers to that:
In general, it's not transliteration into English characters, it's transliteration into the Latin alphabet. That means that transliteration tends to be shared across the various European languages that use the Latin alphabet. And given that the English were one of the last powers to actually engage in the naval trade war, they're less likely to be the basis of a major transliteration effort.
In the case of the q and x, I believe it comes from 500-year old Portuguese.
> That means that transliteration tends to be shared across the various European languages that use the Latin alphabet
Not just European languages. Pinyin is useful for everyone that has to interact with Chinese words, whether their first language is English, French, Swahili, or even Mandarin.
A lot of people might not realize that the primary users of Pinyin are Chinese people. The way typing Chinese works is that you type the pronunciation in Pinyin and then a box pops up with choices of characters from which you select the correct one. It's also used in dictionaries to give the pronunciation of unfamiliar characters.
Your first question, who thought of the system, has a straight answer. From Wikipedia:
> Hanyu Pinyin was designed by a group of mostly Chinese linguists, including Wang Li, Lu Zhiwei, Li Jinxi, Luo Changpei, as well as Zhou Youguang (1906–2017), an economist by trade, as part of a Chinese government project in the 1950s.
By the way, they are not “English” characters; they are Latin/Roman characters, and used in a huge number of languages with different spelling conventions. Pinyin was created for the entire world to use, not specifically English speakers.
How would you spell that sound in a way that is consistently recognized?
"zh" is actually one of the more reasonable pinyin digraphs because it follows the same pattern as "sh". If "s" + "h" results in [ʃ], then logically "z" + "h" should result in [ʒ].
"c" is used the way pinyin uses it in many languages (e.g. pretty much all Slavic ones that use the Latin alphabet, for starters).
"x" and "q" are more questionable, but there's precedent for either in languages using Latin-based alphabets - "x" can be [ʃ] in Spanish, for example, and "q" is [c͡ç] in Albanian.
> "zh" is actually one of the more reasonable pinyin digraphs because it follows the same pattern as "sh". If "s" + "h" results in [ʃ], then logically "z" + "h" should result in [ʒ].
Note that the sound [ʒ] is common in Mandarin, but its pinyin spelling is "r". "zh" isn't voiced and is affricated.
Wait til you get a load of Tamil/Malayalam transliterations’ use of “zh”. It was proposed by some German linguist to represent a really retroflex “r” and now makes outsiders pronounce kozhikode as “cozy-code” instead a closer “korikode”
Who called them "english characters"?
Pinyin uses s in a very common way, z in the way of Italian, and c more or less in the manner of various Slavic languages. They are a sequence of related sounds: s is the fricative, z is affricated, and c is both affricated and aspirated.
Sh, zh, and ch are a sequence of sounds related to s, z, and c. Sh is a fricative articulated farther back in the mouth, zh is its affricated form, and ch is both affricated and aspirated.
And as a bonus, sh and ch match English usage, which isn't likely to have been a primary concern.
It's also worth noting that for many Chinese speakers, there is no difference between s/sh, z/zh, or c/ch.
(x, j, and q are what you get if you use the middle of your tongue, instead of the tip, to pronounce sh/zh/ch. They occur before front vowels; sh/zh/ch only appear before back (or central) vowels.)
A friend of mine remarked to me once that when she was in school, her teacher informed the class that English speakers would not understand what the pinyin letter "q" was supposed to mean, which I immediately confirmed. She thought this was hilarious.
Well that is a good point. For some reason I just assumed that pinyin was specific to english and that other languages used different transliteration schemes.
The English are definitely characters.
[dead]
Agreed, a far better critique of English spelling:
https://people.cs.georgetown.edu/nschneid/cosc272/f17/a1/cha...
> Even turning the 'o' into /ɪ/ relies on fairly regular vowel destressing
Isn't the "o" in "women" stressed?
It depends on your accent, within a couple steps of me I can find someone that pronounces it “Wimmen”, “Wuhmen”, or “Woemen”.
English is a particularly challenging language to spell with. How many languages have a national spelling competition?
> Turning 'ti' to /ʃ/ is a fairly normal affricatization
It can't be an affrication, because /ʃ/ is not an affricate. (Although /tj/ is affricated, as /tʃ/ [think "gotcha"] - when you say 'ti', you're referring to words that were pronounced with /s/ rather than /t/.)
Wouldn't /sj/ -> /ʃ/ usually just be called "palatalization"?
(The specific phenomenon in the context of English appears to be called "yod-coalescence".)
As a non native, it still bothers me how "toward" is pronounced, "toord", really?
That's just one accent. Most accents pronounce that W (especially outside the US).
toward is pronounced exactly as its spelt in Canada
come to texas and experience a whole universe of dipthongs (one of which remedies this)
‘W’ started out as a long ‘U’ so it’s not unreasonable
It was spelled as a double U originally (hence the name), but that doesn't mean that it was pronounced as a long U! It was always an approximant.
As a native, "toward" is pronounced exactly like "to ward", but (usually) with the highly-unstressed vowel variant of "to". Remember that "w" is a semivowel, but it's not doing anything special here (at least in the vast majority of mainstream English dialects). In contexts where it is emphasized (or I suppose in more formal registers) it can strengthened to merely the normal lack of stress.
English might make more sense if someone actually sat down and wrote out the real stress rules, rather than trying to cram everything into just "unstressed" and "stressed" and only caring within a word.
=====
"To" might be one of the syllables with the most possible stress levels, with at least 4 and possible more. As I spell them,
1. "too" - full stress. Common for "two" and "too", but possible for "to" under rare circumstances.
2. "to" - less emphasized but still arguably stressed; still has the "proper" vowel. Usually this is as strong as "to" gets; "two" and "too" often fall down to this level if before a stressed syllable. Arguably this could be split into "stressed but near words with even more stress" and "unstressed but still enunciated" (which occurs even within a register).
3. "tah/tuh" - unstressed, the vowel mutates toward the schwa. Very common for "to", but forbidden in a few contexts. May be slightly merged into the previous syllable. Can we split this?
4. "t'" - very unstressed vowel has basically disappeared; may or may not remain a separate syllable from the one that follows (should that be split?).
The infinitive particle can't be 3 (normally 2, not sure if 1) if the following verb is implied (but not if the speech is cut off). At the start of the sentence it also can't be 3, and 1 is possible as seen below though 2 remains the default. Note that many common verbs act specially when before an infinitive particle; although sometimes treated as phrasal verbs it would be silly to treat them as taking a bare infinitive as their argument.
Adverbial particle "to" when the phrasal verb takes a direct object can be 2 or 3; this likely depends on the specific verb it's part of. Note that many people parse this as a preposition (taking a prepositional object), but this is technically incorrect (though there are some verbs where it really is unclear even when doing the rearrangement and translation/synonym tests).
Adverbial particle "to" when the phrasal verb does not have a direct object is usually 2 or even 1 (e.g. in the imperative). Some heretics have started calling this a preposition too (unfortunately, often in ESL contexts), but this should be avoided at all costs; they're just too cowardly to give particles the respect they deserve. Probably the only common example in modern English is "come to", but there are several others in jargon or archaic English.
Particle/preposition (the parsing is arguable) "to" used between numbers (range, ratio, exponentiation, time before the hour) tends to be 3, especially if one of the numbers is a "two". With variables it is slightly more likely to be 2.
Preposition "to" meaning "direction", or "contact", or "comparison/containment" tends to be 2, but can usually fall to 3 (less likely at the start of a sentence, and can also be prevented by what precedes it, e.g. "look to" can fall to 3 without much effort, but "looked to" strongly stays at 2). Contrast with "toward" of related meaning, which takes effort to get from 4 to 3.
Preposition "to" meaning "according to", "degree", or "target" (including but not limited to the explicit expression of an indirect object with most verbs, which we could argue should count as a particle instead. If you're wondering what verbs are excepted, one is "ask" - it can only use "of", as in "ask a question of him") is much more strongly 2, and requires significant effort to force it down to 3.
Adverb "to" is always 2 I think, but this is rare enough that I'm not sure.
=====
"To be or not to be", as famous as it is, has a pretty unusual stress pattern for most of its words: full stress on the first "to", semi-stress on the first "be", no stress (but still full length) on "or" (normal), full stress on "not", some stress on the second "to", and some stress on the second "be" (more than "to" but less than "not").
It's not, unless you're a yankee. They're going to hear you're a foreigner anyway, might as well speak Queen's English.
If you think that’s crazy, consider that “English” is the only word in the English language that spells the /ɪŋ/ (‘ing’) sound as “eng”.
Angland - Eng-land - Ing-land
Eh, not really.
"engage", "engorge", "engrave", "engross", "engulf" are all fairly common words that are either often or exclusively pronounced that way (some dictionaries might show /in-g/, but /n/ is really /ŋ/ before g or k, even if they remain). Since these can take prefixes, this also proves we're not limited to being at the start of a word. Searching for words that can be spelled with with "ing" or "eng" finds a few more but nothing super interesting (though a few are in the middle of a word).
Obviously words where "g" is pronounced /dʒ/ (like "j" for those who can't read IPA) aren't subject to this.
In my local (dialect and) accent all of these words have a pretty clear initial /ɛ/ and not /ɪ/. (But also: /ɪ/ usually contrasts strongly with /i/ here, but the sound before /ŋ/ is almost a third in-between vowel.)
Sorry, I was sloppy and wrote /i/ rather than trying to dig up how to enter the correct vowel when I was focused on the "ng".
- English - /ˈɪŋ(ɡ)lɪʃ/
- engage - /ɪnˈɡeɪd͡ʒ/, /ɛnˈɡeɪd͡ʒ/
- engorge - /ɪnˈɡɔːdʒ/
- engross - /ɪnˈɡɹəʊs/, /ɪŋˈɡɹəʊs/, /ɛnˈɡɹoʊs/, /ɛŋˈɡɹoʊs/
- engulf - /ɪŋˈɡʌlf/
According to Wiktionary only engulf and engross also use /ŋ/.
I've never heard engulf pronounced similarly to English
The stress is on a different syllable so it's kinda pointless to compare.
You might be right, but for what it’s worth I’ve literally never heard any of those words pronounced that way. I’ve only ever heard the word “English” start with the same sound as inside, while “engage” and your other examples start with the same sound as entertain.
While you're right, I feel like there's no safe argument to make here, because some group somewhere will pronounce some word in a certain way, so there can't really be a blanket rule.
You're correct on the reasons why "ghoti" cannot be pronounced like "fish," but what your explanation illustrates is that the mapping from English spelling to pronunciation is extremely nuanced - needlessly so.
A more direct phonetic writing system, like many other languages have, would make it much easier to learn how to read and write English.
ghoti is a ridiculous example. it takes its components entirely out of context. 'gh' as 'f' only occurs at the end of a syllable, 'ti' as 'sh' only exists as part of '-tion' where the pronunciation slurred over time. Pretending it says anything about the nature of the English language outside of English being a complex merging of various other languages that has evolved with time is silly.
Read and read are the same exact fucking letters and are pronounced differently. You really don't need to go very far to find many examples.
English is fucked up. The only way to learn how to speak it properly is by memorization.
Other languages like Spanish or Korean keep a near-perfect one to one correspondence between written form and expected pronunciation.
As someone that enjoys reading I can't think of a more descriptive language than the English language...It's easily one of the most powerful languages on earth and has twice (!) the number of words in its vocabulary compared to something like French, which is heavily centralized by some managerial class. You just have to appreciate the language for what its strengths are (unbounded capability to communicate using just words) vs what you as a novice need to do to master it. Which, to be frank, is easier than mastering something like the Korean language that has all this drama and ceremony around politeness and speech levels.
[flagged]
The best 'defense' of English spelling I'm aware of is https://www.zompist.com/spell.html, which at the end admits:
> I doubt that this page will convince anyone that English spelling is a good system. There's too many oddities. [...] What I hope to have shown, however, is that beneath all the pitfalls, there's a rather clever and fairly regular mechanism at work, and one which still gets the vast majority of words pretty much correct. It's not to modern tastes, but by no means as broken as people think.
Which is to say, English spelling is definitely messed up. But it's not some insane thing that lacks any hint of sanity that some people try to portray it as.
This article feels to me as it was written in bad faith, trying and failing to prove a point, but then positing the point was proved.
The author happily start the article by submitting:
>The purpose of this page is to describe [...] the rules that tell you how to pronounce a written word correctly over 85% of the time.
but then they quietly show that with their whole page of rules, the reader will not actually pronounce 85% of the words correctly as they just claimed, but actually less than 60%. By arbitrarily deciding that a number of errors can be considered small, the author bumps the number of "correctly pronounced words" to 85%.
Are we talking about 85% of the whole language? No, just 5000 words. Even if they are the most frequent in the written language, they would still only account for around 95% of all the words.
The author position is:
- people complain about the English spelling all the time, saying it's horrible
- the English spelling is actually pretty systematic and this page will explain the rules to understand it
- when you will have mastered these rules, you will pronounce half of the words perfectly - for extremely common words such as "give", "get", "real", "very", "put", "half" you are still SOL
- the english spelling is not so horrible after all: as a perfect student you will only butcher more than 1 word every 10 spoken
To me, the author has proved the point he was trying to disprove.
(and in which rule do /ˈsɪŋɚ/ and /ˈfɪŋɡəɹ/ end up?)
All languages have inconsistencies, but it seems in vogue these days to single out english and use it as a punching bag. Furthermore, no natural human languages (i.e. not artificially constructed ones like esperanto) are logical. They all have irregularities and illogical aspects.
It is not "beyond fucked" that things have different pronunciations sometimes. Other languages have problems for people who solely learn by speaking. It's not unique to english.
English is singled out because its orthography really is atrociously bad even when compared with other languages.
https://arxiv.org/pdf/1912.13321
Amongst niche circles of linguists, maybe. It's easy to single out for the average person because it's popular- english is a language learned around the entire world.
Besides, I'd rather have some more word pronunications than memorizing a table of der/das/die,dem/dem/der and a word's gender on top of learning the word itself. Or changing the position of a verb depending on if I used a modal verb or not.
English lacking a language regulator makes it hard to be a bootlicker for that particular language. Whose boot are they licking?
The blob's
> wind, rewind
Wind and rewind are fine. It's just wind and wind are a problem, like read and read.
For example, wind and rewind sound the same in: Rewind that, so I can see him wind up.
also wound and wound
A Dutch man wrote an amazing poem highlighting the absurdity of English orthography: https://people.cs.georgetown.edu/nschneid/cosc272/f17/a1/cha...
> In my native language these kinds of errors are impossible as how you pronounce letters doesn't change depending on the word they are in
Don't forget when the pronunciation depends solely on the meaning.
Live or Live?
Your examples are more or less regular though. English is a stress-based language, so it's expected that pronunciation might change when you add an extra syllable, if the stress moves (syllable -> syllabic is another example, btw).
> wind, rewind
This one is trivial, no? the "wind" in "rewind" is pronounced the same, with /aj/. The "wind" with /ɪ/ is unrelated.
Could you please share your list? I have this discussion a few times per year and I'd love to hand that list to people that think written English makes sense.
I was thinking of writing a blog article on it but I don't think I'd need to anymore!
I'll organize it a bit and I can (if I don't forget) share it tomorrow
What was the poem (song?) that captured many more of these? (Anyone?)
The Chaos (1922) by Gerard Nolst Trenité ?
https://people.cs.georgetown.edu/nschneid/cosc272/f17/a1/cha...
https://www.youtube.com/watch?v=1edPxKqiptw (and others similar)
Bingo!
Like others, I don't find that nonsense word particularly enlightening.
But maybe compare '-ough' in: cough, tough, dough, through, plough.
You missed enough and borough
The -Ough in enough and tough and in borough and dough are pronounced the same, at least in any accent I can think of.
To be honest, English orthography is such a Frankenstein's monster of historical layers that even if we did care, untangling it would be a nightmare
Indeed, you could never tell how an English word is pronounced unless you “just know”. And then it's still inconsistent (e.g. finite / infinite).
Most English words are regular, and most commonly used ones too. "the", "be", "are", "why", "can", "might", "life", etc. are all perfectly regular if you understand how to read english orthography (which uses character clusters and can't be read a letter at a time).
Infinite/finite regularly related, too - the reason the pronunciation of the finite cluster changes is due to stress differences (initial in- always takes the stress, and then the following syllable must be destressed). Note that the long vowel at the end comes back in the 4 syllable "infinitum", again due to regular stress rules.
Yep! Not only that but people will actively mispronounce words as a form of vetting. Mispronunciations also becoming a form of tribal identity. Speaking of American vs proper English. America is the most diverse cultural landscape in human history. If you stay put, you won’t see it. Start traveling around the country and its the only thing you see.
this is not hyperbole. Sure other places are diverse, however because of the unique nature of the US and its size it just ends up attracting and subsequently absorbing.
America is diverse in some ways, but in terms of language and dialects (which is what we're discussing here), America is remarkably homogeneous. There are many tiny countries with more linguistic diversity than the US.
Specifically to english and dialects, you are correct. England proper has a different dialect and accent for every nook! London for literal neighborhoods! It also has several hundred if not thousand of years on the US for language to develop. Africa has everyone beat on this front. Bantu alone has who knows how many sublanguages! America has done a pretty remarkable thing in keeping its language internally consistent despite it’s overwhelming cultural diversity and influences! That it sucks to learn for the uninitiated is exactly for this reason.
I thought the article did a good job of explaining how English uses additional letters where French use accents, like the "h" in "ship" to indicate how the s is pronounced.
that's how "sh" is pronounced, not how "s" is pronounced
same pronunciation of sh in ship is found in
- sugar
- sure
- machine
- Chicago
- mustache
- sheikh
- nation (!!!!)
Can you notice that some of those words do not have any "s" in them?
English doesn't make any sense.
Yes, and how do those entirely true observations connect to the non-use of diacritics in English?
I pointed out the ship example from the text, which was used to demonstrate how "this early French influence over English, which arose from the Norman Conquest, is the beginning of the reason why English is written without accent marks. ... This was the French habit that the Normans brought to England: the use of extra letters to spell sounds that the alphabet didn’t have special letters for. This is why English has combinations like sh, th, ee, oo, ou that each make only a single sound."
That's an extra letter being used to indicate a different sound than the base sound, similar to how diacritics are used to indicate a different sound than the base sound ("the cedilla has the function of ensuring that a c can be pronounced like an s, despite coming before an a, o, or, u").
> "is the beginning of the reason why English is written without accent marks" > sh, th, ee, oo, ou
That's cool 'n all, but I believe that only applies to French writing in English for English people.
Many languages have combinations of letters that have a single sound, it's no excuse for not having accents.
In German one can write strasse and straße or müller and mueller (different writing, same sound). They too don't have accents, but words written differently also sound different: schon = "already" and schön = "beautiful".
But German, on one hand retained diacritic marks, on the other it's also almost deterministic about pronunciation.
a it's always /a/
ä it's always /ɛ/ or /ə/ like e
sch it's always /ʃ/ as in schule
ch it's always /x/ after a, o, u and /ç/ after e, i
and so on
English doesn't use diacritics, IMO, because English doesn't make sense, it's a pastiche of lowest common denominators, so fck diacritics, they are too hard, let's write words as we like and pronounce them the way we feel they should sound, regardless of how they are written.
But it could use accents, for example rècord and recòrd, present and presènt, pérmit and permìt it's just they never thought it could be useful...
> Many languages have combinations of letters that have a single sound, it's no excuse for not having accents.
You don't need an "excuse" for not having accents. Digraphs and diacritical marks are simply two different ways to mark a letter as being pronounced as "somewhat similar but different". Whether one is better than the other is a matter of subjective perception, and it's very common for languages to not do it consistently. For example, Spanish has "ll" but also "ñ" (ironically the latter used to be "nn"!), and Czech has "č" but also "ch".
What's criminal about English is not the lack of diacritics, but rather the extremely convoluted and hard to predict rules for interpreting digraphs and trigraphs. If "ch" always meant the same thing, it would be just fine.
> but I believe that only applies to French writing in English for English people
Shrug. Yes, languages have different paths in their linguist and lexicographic evolution. Film at 11.
I still like what this linguistics PhD wrote about the specific history of one aspect of English language evolution.
> English doesn't make sense
That is of course an exaggeration. Just because the rules are complex and full of exceptions doesn't mean there's no sense. Even if you reject all of linguistics, Shannon in “Prediction and entropy of printed English”, demonstrated that English is compressible, which means there must be some patterns.
Now to drink some maté.
Clearly the early scribes were looking forward to the 7-bit ASCII code and needed to reduce the number of characters that were represented.
You're not wrong, except the technological reason. As I understand it, English lost a lot of characters when the movable type printing press was created.
Only þ (thorn) died with the printing of Caxton's Bible using y-, for cost reasons.
The other letters -- ƿ (wynn), æ (ash), and ð (eth) -- went out of use long before movable type printing. https://www.deadlanguagesociety.com/p/the-lost-letters-of-th...
And long s:
https://en.m.wikipedia.org/wiki/Long_s
AFAIK it was dropped out because the top hook of the long s punch broke easily, and could be easily replaced with a basic s.
Your link says ſ (the long s) didn't disappear (from English) until several hundred years after the movable type printing press and makes no mention of physical problems when using that letter, suggesting instead removal gave a type a more modern feel:
> Pioneer of type design John Bell (1746–1831), who started the British Letter Foundry in 1788, is often "credited with the demise of the long s".[12] Paul W. Nash concluded that the change mostly happened very fast in 1800, and believes that this was triggered by the Seditious Societies Act. To discourage subversive publications, this required printing to name the identity of the printer, and so in Nash's view gave printers an incentive to make their work look more modern.
Þe 'ae' glyph is still used in printed English, e.g. spelling of 'Encyclopaedia', 'paedophile'
Similarly þhe 'oe' glyph is also used, often in medical contexts.
Þe loss of þorn is somewhat sad, as it is still easily understood by native speakers when substituted for its modern digraph.
As I understand it, æ was a letter in Old English while Þe same glyph in Modern English is a ligature, wið no linguistic connection between the two.
Last year at https://news.ycombinator.com/item?id=40267080 I found that in the 1800s the ligatures æ, œ, fl, ff, ffi, fi and ffl were pretty common in type collections.
And yet only in non-American printed English.
If you go early enough, my understanding is that people would write accents in ascii by doing:
e <backspace character> '
Which was called "overstriking".
Yes, this was explicitly called out in the ASCII standard, and is the reason ASCII has ~ (in place of the proposed ‾) and ‘^’ (which replaced the ‘↑’ in the original 1963 version).
Interesting! The z80 card in my family’s Apple 2 would render “^” as “↑” and I always wondered the connection. I guess they were using the original spec.
And probably ‘←’ where we now have ‘_’. Character-generator ICs with the 1963 64-character set hung on around for a couple decades.
This comes from typewriters. Curiously, the reason why Esperanto uses Ĉ, Ĝ, Ĥ, Ĵ, and Ŝ is because the circumflex was present on French typewriters (which were very common in Europe at the time). Even though French itself only uses it for Â, Ê, Û - since it was a distinct key used for overtyping, it could be repurposed in this manner, just like Unicode combining marks today.
If you go back even further, you get the iota subscript [0]
[0] https://en.wikipedia.org/wiki/Iota_subscript
Iota subscript is a 12-century invention. Rough and smooth breathings (ἁ for ha, ἀ for a) are much older Greek diacritics.
For another example of classical diacritics, see apices in Latin (á for long a).
All hail the first software engineers of the scriptorium
But they added extra letters to words to make up for lack of number of letters. They'd be fans of utf-8 maybe.
The Economist magazine uses a diæresis (two dots) in words like “coöperate” and “reëlect” to indicate that both vowels are pronounced separately, rather than as a diphthong. This is considered old-school and uncommon though.
Unless The Economist does it as well, you were probably thinking of The New Yorker.
https://www.arrantpedantry.com/2020/03/24/umlauts-diaereses-...
Oops I think you are right. My parents subscribed to both and I must have mixed them up.
That is the fun thing about English. There isn't really a single right way to speak or write it. It is defined by common usage. As long as your audience understands you, it is correct.
As someone else pointed out, loan words often have accents. At what point does jalapeño become en english word? There is no other english word to refer to the pepper, therefore it is now an english word and therefore english words can have diacritics.
The closest thing we have to a source of truth for the english language is the OED. It isn't prescriptive, it just lists how words are used rather than how words should be used.
Jalapeño is in the OED with the tilde https://www.oed.com/dictionary/jalapeno_n?tab=factsheet#1253...
> That is the fun thing about English. There isn't really a single right way to speak or write it. It is defined by common usage. As long as your audience understands you, it is correct.
That's how all languages work - to the chagrin of l'Académie Française - English is no special exception.
I like to believe that, by definition, the only person who speaks English properly is the King of England. Everyone else has an accent.
I find it interesting that the Spanish consider the ñ to be a separate letter, in their 27 letter alphabet.
The double l "ll" is also a separate letter and is pronounced the same as y in "eye"
I see naïve as an example of diacritics in English as well.
Learning the relationship between a diæresis and a diphthong and then seeing that the word diæresis contains a diphthong has rounded out my day nicely, thanks for that.
I enjoyed learning recently that the most common diacritics in Czech are the háček and the čárka. The word "háček" has a čárka followed by a háček, while the word "čárka" has a háček followed by by a čárka!
A "calque" is a word that's been brought from one language into another by translating the individual parts. A "loanword" is a word that's been brought over by just taking the word with little modification.
For example, "calque" is a loanword, while "loanword" (from German "Lehnwort") is a calque.
Similarly, a grave accent is sometimes used in poetry to indicate that a single vowel is voiced - e.g. in "cursèd" to indicate that the word should be pronounced as two syllables "curse-ed", rather than a single syllable "curst".
Loanwords often retain their accents as well: cliché, façade, doppelgänger, jalapeño.
But it's habanero, not habañero - people mistakenly put the ñ by analogy with jalapeño.
I’ve always seen it written with an acute accent: ‘curséd.’ Wikipedia notes both usages, but to my knowledge I have never once read a poem which used a grave accent that way.
The adjective "learnèd" (meaning "well-educated") is a native English word that should take the grave accent even outside of poetry. Also "unlearnèd".
Winged and legged are still pronounced like that too, at least by some.
Interestingly, as an addition to the parent comment, there's a certain point in time where a lot of -ed words are often spelt -'d, which presumably is from the transitionary period between the expectation that the -ed was pronounced and today's general pronunciation.
Oddly enough, I pronounce 'legged' that way, but not 'winged'.
e.g. "Long legged monster"
You see this in Shakespeare's plays, "-ed" endings are the equivalent of "-èd", whereas "-'d" is pronounced "-ed" as is common today.
There’s also the (dying) use of diareses to indicate vowel stress, for example coöperative or naïve.
Used to just be a dash, like re-elect. Cooperate was co-operate. People got tired of writing dashes and they got shortened.
The Economist uses diacritics in French, German, Italian, Portuguese and Spanish words, but deletes the diacritics from other languages (or maybe they keep them when they happen to resemble diacritics from those languages). I think I once saw a letter from a Hungarian complaining about that: a word they'd used meant something silly or obscene after they'd removed the diacritic.
Most publications are haphazard like this. The diacritic example also applies to Vietnamese, despite the alphabet coming from Portuguese.
Similarly, Chinese and Korean names are usually written in the order they are pronounced, while Japanese names are reversed.
>Similarly, Chinese and Korean names are usually written in the order they are pronounced, while Japanese names are reversed.
As a Chinese speaker this is maddeningly confusing when reading Western media. It's also a fairly new trend, I want to say a decade ago Chinese and Korean names were also read in Western order.
That seems like a quirk of the magazine for thsie pstticular words, but its more common for some others like "naïve" and "Zoë", although that's gone out of fashion somewhat since computers took over (and I believe both of those are loan words in english)
I love this, because I always do a double take and start pronouncing it as coOUUUperate and REEEElect, giving me much entertainment (I am easily entertained!).
Edit: also see rôle, which invokes this classic: https://i.redd.it/qrfr7o4ue2z51.jpg
Oh interesting, I've never seen those cases. I'd say it's more common (although maybe still a little old-school?) to use it in words like Noël or Chloë.
The difference is whether the sequence of vowels crosses the morpheme boundary or not. When it does, as in "cooperate", it's usually readily obvious to native speakers even when seeing the word for the first time, thus they don't need a mark to disambiguate.
I don't remember ever seeing that in The Economist.
I think you're thinking of New Yorker magazine, perhaps?
Fascinating. I had wrote that off as a bug in their CMS.
French.. you people have no idea how Italy is.
I speak differently than my brothers because I grew up at my grandparents 3 MILES! away and if I go to my family restaurant 2 MILES the other direction there is a different accent again, and I mean different words too not just the sound. Where I used to go to school 10 miles away they don't understand if I speak my dialect because it's a different region.
The whole Italy is like that, a different dialect every 2-3 miles, every family, town, city, province, county and region has different accents and ways to make food and recipes. My town is 3200 years old, older than the Romans, they used to fight, then ally then fight again with them etc., this dialect thing is very old, cultures, traditions and families.
Of course we have the Italian language in common and the main dialects are separated by the main city of the region then by the region itself but yep, that's how it is.
This article is about accents on letters (diacritics), not accents as in dialects.
I found your post interesting neverthelesss.
It is probably connected.
Having so many different dialects (and full minor languages!) saying the same word slightly differently, Italians were forced to find (and use) a way to put the correct accent in writing.
Other languages probably don't have the mind boggling number of dialects Italy has. GP was not exaggerating, it really changes every few kilometers.
Like the article says: "situations like these are surprisingly few in English"
Germany is similar. Especially in more rural areas, a couple villages away people are going to have a hard time understanding you.
Though there's typically a common dialect variant everybody speaks, usually the one spoken by the largest city in the region.
E.g. every middle-franconian understands Nuremberg franconian dialect and is able to talk in a way they would understand.
well, if you ignore the current country borders then "German" would encompass a large portion of Switzerland and the Netherlands. So, with that assumption, I would be surprised if Italian had more dialects than German.
Heck, Swiss German is like this lol.
My cofounder's wife, during a parents together at school, was "advised" by some of the mothers to not "hang around those" mothers because they're stranger folk. Turns out, they lived 1.5 miles away in the next village.
I'm American
My ear has just gotten to the point of noticing German dialects, and spotting the quizzical looks of other German/Austrian/Swiss people in the group
Fascinating. I feel like they had 1,000 years to resolve this
1000? Prussia dissolved only in 1947 and the nation state of Germany was reunified only in 1990.
In any case, communication technology (trains, TVs) is a greater determinant of dialect than government.
suboptimal outcome for sure
Italian script doesn't use diacritics, though, so it's not the same kind of accent as the article talks about.
Italian script most definitely requires diacritics.
"è" (is) vs "e" (and)
"pero" (pear tree) vs "però" (but)
"perché" is the only correct one and "perche" and "perchè" do not exist
and so many other examples.
Oh huh, I've forgotten more Italian than I thought, thanks for the correction!
Well, that's because they're really languages and not dialects! They all derive from Latin, there is no "old Italian" or anything, at some point we decided the Florentine "dialect", having the most literary prestige, would be standard Italian.
Italians only really started speaking Italian in their day-to-day life after the war. It was mostly a written/literary language before that.
Yes, surprisingly few Italian dialects are actually Italian derivatives (maybe only a couple?)
But there are differences between a dialect and a language, we can't say all of those are languages even if most come from Latin.
Italian wikipedia says that officially in Italy there are about 13 recognized languages (not counting Italian, plus French and Slovenian in some parts), and about a dozen main dialects.
In wikipedia you will notice 3 big dialect groups that are just that, groups of many, many dialects that do not qualify as languages.
It's more a difference of how recognized by the community those are, and how unified by grammar, locality and uniqueness. Kind of a gray area for many.
> But there are differences between a dialect and a language, we can't say all of those are languages even if most come from Latin.
That's not really true. There's no scientific reason to say that some varieties are "dialects" and some are "languages". It is purely a political and culture question.
> Well, that's because they're really languages and not dialects!
Indeed they are not strictly dialects of Italian, which followed its own evolution alongside them. I think most of them could still be explained as dialects of Latin, who underwent major "niche differentiation" in the immediate aftermath of the fall of Rome and the rise of barbaric kingdoms.
> [Italian] was mostly a written/literary language before that.
This is a bit of an exaggeration. Clearly, even before the early modern era "Italians" could understand each other. Dante (from Florence) lived in Genoa and Ravenna, and had no need for an interpreter from what we can gather. Ditto the many "Renaissance men" who toured around Italy (Leonardo: Florence->Milan; Raphael and Michelangelo: Florence->Rome; Galileo:Pisa->Padua). This level of interconnection becomes really hard to explain without a high degree of mutual intelligibility.
Dante is a poor example for language proficiency, as he was educated / traveled/ well read. The common person would have a much different lived experience
I have colleagues in India. It's a diverse mesh of regions that vary in about every way. Was explained people grow up with 3 languages, their regional language, a neighboring region's language, a more general language, & then educated folk are taught English. Then in school they were still taking classes for other romantic languages. At an Indian restaurant with one colleague I noticed they would mostly rely on hand gestures. One factor here is that there may often be a language barrier
I've also interacted a bit with Senegalese, which has Wolof as the primary language, then French taught in schools. Many only know Wolof (with French influence weaved in). & the well educated learn to speak English, & how to maintain more European French accent
And it's great.
My HS Italian teacher's university thesis was on the different dialects within Naples and their various (ancient) Greek origins.
England has small accent shifts every 25 mins (the other audible accent / http://news.bbc.co.uk/1/hi/business/7843058.stm) - the situation you describe is two communication orders more complicated than that!
Closer than that in some places. I'm from Sunderland, which is contiguous with Gateshead, and then Newcastle. I can clearly hear when someone is from Sunderland vs. Newcastle, although 'a foreigner' - say, someone from London - might not be able to pick it.
I dare say Liverpudlians and Mancunians and Glaswegians and so on would make the same claim.
It doesn't compare to that coolness you just shared, but I'm from Long Island (right outside New York City) and I and everyone from my childhood town can differentiate a Long Island accent from a New Jersey accent (very similar but subtly different; a suburb on the other side of NYC) from a Queens accent (a type of NY accent from a NY neighborhood, whose most famous exemplar is The Nanny) from a Brooklyn accent (another type of NY accent, the Mel Brooks sort and how my dad speaks), etc etc. So, while, the US is nothing like Italy where every 3 miles there's a different language-or-dialect, the US accent isn't nearly as uniform as one might think, for even within cities and their suburbs, like my hometown in the above example, there is a comparable dynamic, where going not-that-far (these neighborhoods and suburbs aren't far from each other) people speak in accents that are notably different to locals, although surely people not from NY group it all together as "the NY accent" without differentiating the level-of-nasal-ness and other such contributing factors to the accent.
Sadly those Brooklyn and Queens accents are becoming rare in large parts of Brooklyn and Queens. You really have to go out to areas with few transplants (Long Island, Staten Island, or rapidly shrinking white working class parts of Bk/Queens) to hear the typical NYC-area accents being used as the main variety of the majority of the community.
I grew up in the province of Friesland [0], which is part of the Frisia cultural region, an area that was not occupied by the Romans back when so it retained some of its identity and culture - although a lot of that was erased by Christian missionaries and subsequent invasions and government takeovers etc etc etc.
Anyway, super local accent changes are a thing there as well, go north a few kilometers from where I grew up and you go from the "woods" to the "clay", which has its own intonation and possibly words. Then there were town specific stereotypes - people from this town will knife you, that town is full of inbreds, etc. That's probably a lot of made-up intentional drama though, lol.
[0] https://en.wikipedia.org/wiki/Friesland
[1] https://en.wikipedia.org/wiki/Frisia
Similarly in Norway and Sweden, new dialects every few miles, with both pronounciation and word changes. Places that could reach each other by boat tend to have more similar dialects (while if there's a mountain in the way you can have a bigger difference, though flight distance is shorter)
Interesting. I know that as a spanish speaker, there are some Italians whom I understand almost perfectly (like 90% and I can fill in the other 10% from context), but there are other Italians speakers where I can't understand anything at all.
When I was doing a bunch of learning about linguistics, situations like this were very interesting and confusing to me. I still don't have a good working intuition for how this is possible. I don't understand what maintains the sound differences in the face of the continuous exposure to substantially different accents. It's empirically possible, but it's never made sense to me. Why don't you and your brothers end up talking the same after a while?
I mean, people do end up talking the same after a while. Regional differences are disappearing and being leveled all over the world due to the influence of centralized education systems and media.
Same in some parts of Germany. In the area where I grew up in you can tell in which village a person is from just by the way they talk, and the villages are just ~3 km apart!
From what I know this is because it was a relatively remote, dangerous and poor region (all by the standards of hundred years back) which changed ownership a lot (between clergy, bavaria, prussia) and people were mostly left to themselves
'Ennery 'Iggins, is that you?
Italian here... Are you from the south of Italy, by any chance? Because I'm from there and it's exactly how you describe it.
Yeah, from near Urbino but moved to USA ~20 years ago.
Urbino is more like Central Italy.
You think that's bad, visit your friends to the East in Slovenia. You'd think they're doing it on purpose! How do so few people in such a small area make so many variations in the "same" language?
Generally speaking, countries that have a lot of different ethnic groups and/or introduced universal education relatively late tend to be those with more diverse dialects. Think about it: in a world without newspapers and TV, where most people live their entire lives in the same village they have been born in, and relatively few travelers, any linguistic innovation that appears in one place is going to take a very long time to travel elsewhere. Thus, local dialects tend to diverge. Universal school education slows this down by introducing a standard literary language (and, historically, often in a very forcible way). Mass media, TV especially, leads to further homogenization.
In 6th grade, so back in 1982, I read the French SF novel "Malevil".
I was astounded (speaking as a US kid here), to learn that French people born and raised in France didn't natively speak French, but instead learned their regional language.
Here is an example, from https://archive.org/details/malevilmerl00merl/page/150/mode/... :
> And besides, Thomas was already quite isolated enough as it was: by his youth, by his city origins, by his cast of thought, by his character, and by his ignorance of our patois. I had to ask La Menou and Peyssou not to overdo the use of their first language — since neither of them had learned much French till they went to school — because at mealtimes, if they began a conversation in patois, then everyone else, little by little, would begin to drop into patois too, and after a while Thomas was made to feel a stranger in our life.
Two minutes ago I learned that "patois" has a distinct meaning in France: "patois refers to any sociolect associated with uneducated rural classes, in contrast with the dominant prestige language (Standard French)" https://en.wikipedia.org/wiki/Patois
I am very ill-informed on the history of the topic, including the national language policies of France and Italy. I do know that Sardinian is not a dialect of Italian, but my knowledge isn't much deeper than that. ;)
IIRC in the early 1900s, coercive methods were used to stop children speaking their native regional languages, a lot of it in school.
In my region of Brittany (France) the most famous example that was on posters detailing good manners would say : "Il est interdit de parler breton et de cracher par terre" meaning "It's forbidden to speak Breton and to spit on the ground", placing both on the same level.
Stamping out minority languages and dialects was (and often still is) unfortunately common in most countries. I'm Russian, and my native regional dialect has some minor differences from standard Russian that make it sound a bit more like Belarusian. I remember how in school we had a teacher making fun of our manner of pronouncing words as "kolkhoznik speech" (implying that only the uneducated speak like that). This was in 1990s.
I am afraid this quote is an urban legend. It never existed.
> I was astounded (speaking as a US kid here), to learn that French people born and raised in France didn't natively speak French, but instead learned their regional language.
As a French person born before 1982, I find this sentence questionable.
If you mean "there were some people who learned a local dialect", then sure, you could dig some up.
If you mean "many regions had dialects that were learned before French", then I believe you misunderstood (or were misled).
Finding anyone who even spoke a regional dialect would've been a novelty, let alone one who grew up speaking it before French.
FWIW, the book was written in 1972.
I mean "there were some people", not all people - Thomas, in the quote, came from Paris and spoke French. He did not learn a regional language.
I don't mean 'many regions' because the only example I had was one region. The fact that there was at least one region where local French people, in a region which had been part of France seemingly since at least the Middle Ages, did not speak French as their mother tongue, astonished me.
FWIW, the French Wikipedia page says:
> Ainsi Malevil serait partiellement inspiré du site de Commarques (sa grotte, son abri troglodyte et son château)[2], tandis que le village de la Roque serait partiellement inspiré de la Roque Saint-Christophe, forteresse troglodyte voisine du château de Commarque. ...
and the location,
> La vallée des Rhunes : inspirée de la vallée des Beunes, et plus précisément la grande Beune.
so the author's fictional location was supposed to suggest the department of Dordogne in south west France.
https://en.wikipedia.org/wiki/Limousin_dialect tells me
> Limousin ... is a dialect of the Occitan language, spoken in the three departments of Limousin, parts of Charente and the Dordogne in the southwest of France. ... Limousin is used primarily by people over age 50 in rural communities. All speakers speak French as a first or second language. Due to the French single language policy, it is not recognised by the government and therefore considered endangered by the linguistic community.
Those people over age of 50 would likely have been children in a book written 53 years ago, with Limousin as a much more common language amongst the local adults.
"Over 50 in rural communities" in one of the more sparsely populated areas of France makes for a very small slice of the population even in that area, and even then, as pointed out, French is spoken by everyone.
On top of that, it is more "anyone who speaks limousin is likely over 50, and in a rural community", than "anyone over 50 in rural communities in that area likely speaks Limousin".
There are 10k speakers of Limousin today (according to Wikipedia), out of about 1.2M residents in Dordogne and Limousin combined. That's less than 1% just for that area.
To me, it is more of a local curiosity than a mind-blowing fact, but I suppose I grew up learning about the various dialects in France, so I have a different take.
> and even then, as pointed out, French is spoken by everyone.
Yes, as even the Malevil quote I gave pointed out. (At least by school age.)
> On top of that, it is more
The book was written over 50 years ago, so the Wikipedia article about present day use of Limousin isn't all that indicative of what it was like for the adult characters in the book, who would have been born before 1950.
> There are 10k speakers of Limousin
Why are you being so nit-picky? Look, this is a fictional place and the specific local language is never stated. I just today read the Wikipedia entry which give info about the location.
I specifically picked out Limousin, yes, because it fit the area, and because I could quote how the Limousin language was more widely spoken when the book was written than now.
But as the text I quoted says "Limousin ... is a dialect of the Occitan language". Wikipedia says there are about 200,000 speakers of Occitan, so that's the more relevant comparison, and "Though it was still an everyday language for most of the rural population of southern France well into the 20th century, the language is now declining in every region where it was spoken." - https://en.wikipedia.org/wiki/Occitan_language
It seems to me that when Malevil was written, Occitan was still widely spoken as a first language in the area. Wikipedia says the author was living in the area when he wrote the book, so he should know.
The only reason I mentioned it was because you wrote "Finding anyone who even spoke a regional dialect would've been a novelty, let alone one who grew up speaking it before French." while the book, written by the French novelist Robert Merle - Wikipedia informs me he was "a household name in France, with the author repeatedly called the Alexandre Dumas of the 20th century" - comes across that speaking in patois was not a novelty but simply something expected, and which effectively all locals spoke.
I simply cannot reconcile your surprise with my reading and limited understanding except by assuming it's from before your time, from a mostly forgotten era.
> I suppose I grew up learning about the various dialects in France
That's .. kinda the issue, isn't? In Malevil the local language patois is not seen as a dialect of French, as I quoted, it was a language learned in school.
Wikipedia says it's more related to Catalan than French.
Why do you describe it as dialect of French?
> Why do you describe it as dialect of French?
Did I? I mentioned dialects in France, not of French, IIRC.
I'm nitpicking because, TBH, I quite likely just read too much into your use of "astounded" in your original comment. It seemed to me that you were overestimating how prevalent or significant these languages were.
By the mid-20th century, they were already quite less popular and even less so by the time Malevil took place (1977, I take it, even though it was written a few years earlier), especially when it came to being taught before French.
At the same time, I guess I was maybe as surprised to learn that Louisiana French is still a thing as you were about these areas in France. :)
When is something a dialect in France and when is it a language in France?
> It seemed to me that you were overestimating how prevalent or significant these languages were.
I said I was in sixth grade, a kid living in the US.
I didn't even know then there was more than one Romance language in Italy - as I alluded to in my original comment.
Yes, I now, decades later, know more. But I was sharing my childhood misapprehension and how I learned the world was more complicated than 11 year old me thought as something meant for others to smile at and enjoy, not to be nitpicked as if my comment was any profound statement about all of France.
My interpretation was not "questionable" - the story clearly was supposed to take place in a part of France where many of those in the countryside still learned a Romance language other than French as their mother tongue. That matches the real history for that supposed area that the author drew from. Yes, it's certainly something that's a lot less common now, some 50 years later. But then just say that things have changed.
My bad for nitpicking. Sorry.
See also: Jamaican Patois aka speakyspoke
https://en.wikipedia.org/wiki/Jamaican_Patois
it remains true to this day. gascon[0] is still spoken in south of france, by both young and old. i know because i've heard it spoken. the idea that the french speak french, italians italian, is very modern. european nations weren't as properly integrated as modern history will have us believe. iirc the integration sped up post-ww2. cf seeing like a state[1].
[0]: https://en.wikipedia.org/wiki/Gascon_dialect
[1]: https://en.wikipedia.org/wiki/Seeing_Like_a_State
I studied Anglo-Norman French (circa 1300s), and found it strikingly useful speaking with a woman who worked in the Breton region of France.
> mean different words too not just the sound. Where I used to go to school 10 miles away they don't understand if I speak my dialect because it's a different region.
Like what? You have to give us examples.
Oh geez, for example in Italian to say here you say "qui", where I grew up I say "mchi" but my brothers say "mqui" or "mque", where I used to go to school they say "meque" with the weirdest sound.
To say what are you doing in Italian is "cosa fai" but I say "co fei" and my brothers "sa fei" and where I used to go to school they say "che fe".
These are just simple simple things but almost everything changes here and there and I can't put the sound with the words here, they actually sound different, and change where the actual accents are.
I grew up in southern Switzerland and the dialect situation is the same as you describe.
Not necessarily every town retained their distinctive dialect in practice because people move, not all parents pass the dialect down to their kids etc.
But I remember a friend of mine lived in this village of 40 inhabitants where they said "e peu que?" instead of italian "e poi cosa?"
I have relatives in Bari so I've been fascinated by Barese. My Italian is not good but I can passively pick it up when listening or watching television, but Barese sounds 100% like a completely different language to me. French and Spanish are more intelligible.
https://www.youtube.com/shorts/gEKxf8RD-OM
Funny also I moved to USA ~20 years ago and you lose the Italian, you don't remember words etc. but you'll never lose your dialect, it just comes natural because that's how you grew up instead of what you learned growing up and from school, Tuscan people have it easier because the language comes from their dialect, Dante etc.
And to add, I wouldn't click that link if you paid me lol, I hate the Barese... ok I clicked, funny stuff.
you are making the mistake of confusing your experience, which is of course legit but anecdata, with "how it works" in general.
I'm an almost 50 years old Italian so not a spring chicken but I definitely learnt Italian growing up, not a dialect, and not "from school".
I guess it's the difference between growing up in a city vs a village.
Well yeah, GP's comment obviously only applies in the case that your native language is not standard Italian.
not obvious at all when every sentence uses "you" to indicate a general rule that applies to every Italian rather than "I" to indicate a personal experience
Hah, Barese sounds like a Frenchman is trying to speak Italian but can't be bothered.
you clearly haven't read the article... they are talking about diacritics (accents) and not inflection of the spoken language.
i find absolutely worrisome that nobody is reading the articles anymore, and they just read the title.
it makes the quality of the discussion very very low.
From the site guidelines:
> Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that"
Personally, the parent comment added a lot more, even inadvertantly, than one complaining about whether someone has or has not read the article.
The Godwin vignette at the beginning is such a clever way to dramatize what would otherwise be a dry spelling shift. Also, I never realized the irony that English avoids diacritics because of French influence
> Also, I never realized the irony that English avoids diacritics because of French influence
But is that why they avoid diacritics? It sounds like English probably wouldn't have had diacritics even if the Normans hadn't come in.
Seeing my son try to learn to read things like "cycle", I feel like diacritics would make English writing a lot more accessible.
According to the article, Norman influence led to double letters being used to better mark out sounds, which achieves the same as diacritics. It made English mostly good enough (failures like 'lead' are rare). Being good enough, and lacking a strong central authority, the language only accepted a conservative standardisation, and avoided larger changes such as including diacritics. Without these Norman changes, there is more chance diacritics would have been added, as it would not have been 'good enough'.
Written English is a worse is better story. The Norman influenced version being the first-mover that users cling to even when better comes along.
Well, the "lead issue" could be fixed by writing the verb "leed" (after all, it's exactly the same sound as in the word "queen" mentioned in the article), but for some reason this hasn't happened...
It happened in newspaper jargon for the leading sentence of an article (though they used the spelling "lede" instead), because "lead" was already a metonym for hot type (which was cast out of the metal).
It hasn't happened otherwise presumably because the risk of confusion is normally very low when not in a Pb-filled context.
Diacritics wouldn't have helped moderns if they were in from the beginning - most of the confusing words used to be pronounced like they are spelled (at least to people of the time). Maybe they would have helped to petrify pronunciations and slowed or stopped linguistic drift but I somewhat doubt that given historical literacy rates.
> to dramatize what would otherwise be a dry spelling shift
I don't think that's how it was developed, though. I really doubt there are real-world cases where cwen was scrubbed and queen written above it (correct me if I'm wrong!).
I think it’s more like “people stopped writing English for time being, only learned to write Norman and Latin, so when they needed to write a word or two, they’s use the spelling they knew. Eventually, this spelling because the way of writing English”.
I don’t think a situation with Godwin is plausible.
> I never realized the irony that English avoids > diacritics because of French influence
I'm not sure that's the best way to put it. Old English also generally didn’t use diacritics (modern texts add them: we’d use cwēn instead of cƿen, but these are modern invention).
So, English didn't use diacritics before Normans, and Normans didn't change this.
Do the tittles on `i` and `j` count as diacritics? In English those vowel symbols never appear without their tittles. (In contrast, the related vowel symbol `y`, which is like an `ij` combo and is named "Greek I" in French, never appears with tittles!) In some sense, the glyphs are idiomatically atomic with the diacritics permanently stuck to them.
I don't think so; the tittle is part of the lowercase form of the letter. If it was a diacritic, it could appear with the uppercase forms as well.
In French you can (and most people do) drop diacritics from all uppercase forms.
They are not diacritics since they don’t alter the sound of the letter.
I don't know what a headless `i` or `j` would sound like in English, since they aren't used. So it's not really a verifiable claim that the tittles don't alter the sound.
For the record, Wikipedia states they are diacritics. I'm leaning toward agreeing due to this observation: "In most Latin-based orthographies, the lowercase letter i conventionally has its dot replaced when a diacritical mark atop the letter, such as a tilde or caron, is placed." Ex: aeiouyj, âêîôûyj, äëïöüÿj, áéíóúyj́ (oops, my example shows an accent above the J's tittle–down one, but three up)
https://en.wikipedia.org/wiki/Tittle
Ai blív ingliš šud imbreis diakritiks, ounli daunsaid ai sí is it wud spel dí end of speling bí kompetišns. Pun intendid.
You don't even use them consistently in the same sentence (your unaccented i has at least 3 different values, for instance).
The real reason English spelling is frozen in the 1600s is that that is the last time all English speakers had a common language community. Since the foundation of the colonies, Englishes have diverged from each other from that starting point, so that no reform can be neutral to all current Englishes - some have merged what was distinct in early modern English (e.g. cot-caught merger); while in other cases what was a single class has been split (e.g. the bath-trap split). Wikipedia has a (non-comprehensive) table: https://en.wikipedia.org/wiki/Sound_correspondences_between_... note for example that even where two varieties have merged phonemes, they might have merged them differently (compare Southern American to Australian). You might try to come up with a spelling system that covers all possible combinations, but it would be then very hard for the speakers who have mergers (i.e. all of them) to use - how is an Australian supposed to know which äː vowels are æ in American and which are ɑ? How are the Americans supposed to know which ɑ's are äː vs ɒ in Australia? etc. etc.
It's very important for English to have 9000 vowels so we can tell where you're from within about a five mile radius, no matter how hard you try to hide it.
If you mess up every vowel in an English sentence, everybody can understand every word, but it makes everybody a little upset and a little aggressive. If you want to play it safe, just make every vowel a schwa and people will think you're from New Zealand.
Czech diacritics taking over the world :-)
You'd be surprised! I actually see a fair amount of haceks hereabouts in PNW because they are used in the orthography of the local Salish Native American languages, so you end up with road signs like these: https://www.charkoosta.com/news/salish-language-stop-signs-e...
You're trying to create set of rules for something that's evolved from strong oral to written to emphasis on oral again. It's organic and used in coordination with many other countries and their languages. If you understood that many of our rules are defined within specific instances, by specific needs (publishers example), and are somewhat arbitrary, you'd be amazed we have any consensus at all. My theory is that publishers, broadcasters (Like the BBC) and educational institutions are really where standardization has been enforced. Outside of that English and language is as flexible as a sender and receiver of a communication will allow.
It does sometimes, though its use may mark the author as among the agèd.
Not to mention loanwords, which of course English is full of, and are sometimes considered properly spelt with their original accents, though many will spell them naïvely without.
Diphthongs too, especially in British English, are not just an archæological find, though out of pragmatism usually written digitally with two separate characters.
On the internet the most marked issue is the difference between British English spellings (England, Canada, Australia, New Zealand) and the USA. It is frustrating that on most spell checked text boxes words like: harbour, labour, actualise, etc shown as misspelt.
I find it most irksome that the Australian Labor Party has chosen the USA spelling in spite of being part of the Commonwealth.
The great thing about using Singapore as my locale is that it accepts pretty much any English spelling you throw at it, British or American. You see quite a mix of both on signs and documents here, too.
After enough programming over the years, I feel like my mind has separated the concept of a colour (which I learned as a child) and `Color` data.
For me (Brit), ‘program’ is some software and ‘programme’ is a collection of projects, in the project management sense
me too: 'dialog' is a computer popup; 'dialogue' is a verbal exchange
It gets worse. Canadian is different to both British and American (or it has some of each) and Australian is different again from all three.
Did you mean misspelled
> As a result of these circumstances, things like spelling practices varied from one place to another, and one scribe to another. The same word could even be written on the same page in multiple ways.
I believe we can all still be confident scribes and maybe even have our own preferred way of writing words, where we within reason push the boundaries or push our own viewpoints through self expression :D
"mispelt" is a somewhat archaic British spelling. "mispelled" is more modern.
No it isn't. -t instead of -ed in general for many words is dialectal for one thing, more commonly retained (a Saxonism) in the West Country than elsewhere. Misspelled in particular though is distinctly American, everywhere in Britain uses misspelt.
(Ironically, I'm not sure if deliberately ironically, you 'mispelt' both, fwiw.)
> though its use may mark the author as among the agèd
Thirtysomething here. I use diaeresis (a/k/a diæresis) over e.g. coöperate. It’s more concise than a hyphen. And it makes more sense than cooperate, given cooper is a word.
Thirtysomething here too. I see a diaeresis in naïve often enough to remember that it happens yet uncommon enough to be taken by surprise anyway.
So we can use them if you're feeling fancy or writing for The New Yorker
English still sometimes (albeit very rarely) uses one type of diacritic. The diaeresis is in occasional use. Now days it is mainly used in the word "naïve," but it will be familiar to readers of the New Yorker on words like "coöperate."
The diaeresis is used disambiguate when a pair of vowels make two separate vowel-sounds, instead of one. For instance, if you didn't know better, you would think that the words "naive" and "nave" (said "nāv", the congregational of a traditional church) were homophones. But the diaeresis shows you that the "a" and "i" are said independently (nah-ēv).
Of course, English also uses diacritics occasionally in some borrowed words: résumé, née, fiancée/fiancé. But these are also considered optional.
> if you didn't know better
It's funny how The New Yorker clings to this given its readership is mostly people who absolutely do know better.
From the article
>> The use of diacritics arises out of a mismatch between an alphabet and the language it’s being used to write: if an alphabet were well adapted to a language, it would have letters for all the language’s sounds. <<
and then it use 'ç' as an example even though French has 's' for the same sound, amusing.
To summarize the article, when a language has a single creator, (in this case, the person who runs the first major printing press in France,) that person has immense power to make significant changes to the language when needed. On the other hand if the language has multiple collaborators each with some influence to make changes to the language, such changes tend to be much more conservative ones.
A point that is always good to think about, a population defines its language and a language defines its population, it's a symbiotic relationship. The language you speak, will shape how you perceive/interact/understand the world itself.
We do, of course, use accents and other diacritics. It's not as common as in other languages, but most people will come across a few each day. The popular argument here is that many are French, with the accents optional, yet soupçon and exposé are rarely written naked. If you want non-French, pick up the New Yorker and you will find coöperative and reëlect, or a poem to find changèd and learnèd. We use them in names, from Brontë to Beyoncé.
There is an excellent Wikipedia article that goes into detail on the subject: https://en.wikipedia.org/wiki/English_terms_with_diacritical...
-Naïve- is the example I think I see most often but I think it’s often spelled -naive- and no one would fuss too much with either spelling
Diacritics aren't unambiguous, there are different conventions for using them. What sound does "ā" make? It depends.
If what it depends on is the language then thats trivial.
Why is it trivial?
The ä and a sounds in Swedish and Finnish are swapped; and they're direct neighbours (with compulsory education for Swedish in Finland, no less).
But within each language it is well defined.
Between languages, even the letters have different uses. Diacritics can be used to signal a different sound or the tonicity of the word (at least in the languages I know those are the two uses).
I don't understand what this thread is all about. English doesn't need accents because there's no universal meaning attached to each one? That doesn't make sense.
Do you have any examples? As a Finnish speaker the Swedish "a" sounds the same. "Pappa", "framtiden" etc.
It's "ä" and "e" which have swapped uses, but it's not exactly consistent (e.g. "Järnvägstorget" where first ä is close to the Finnish ä, second ä is closer to e but so is the e at the end)
Ä in Swedish is an æ sound.
Ä in Finnish is a pitched A sound, like the A in “cat”.
The pitched “a” in Swedish is the default one.
Wikipedia lists both "cat" and the Finnish "mäki" under æ: https://en.m.wikipedia.org/wiki/Near-open_front_unrounded_vo...
Do you have some example words that would show the difference?
Well, mostly hearing people say the words will be telling.
Gävle in Sweden: https://forvo.com/word/g%C3%A4vle/
Linnanmäki in Finland: https://forvo.com/word/linnanm%C3%A4ki/
In the Finnish example you can hear both the soft “en” (linnan) and the higher pitched “” (maki) which is triggered with umlauts;
Where the Swedish A is softened by umlauts in the Gävle example.
That's an american cat then, because that sounds crazy to my ears
(how did I get downvoted for this when I literally lived in both countries)
not just neighbors by country. a not insignificant proportion of finns speak swedish natively
Many native english speaker here like to fantasize on the superiority of other cultures / languages but what good are diacritics for when there are still a shitload of letters that have no diacritics and can be pronounced in different ways?
For example let's take french... A cat is a "chat" but you don't pronounced the 't'. Oh but in "chatte" (pussycat or pussy), you pronounce the t's. While in other words in french you pronounced the 't', like in "table" (yup, it means a table btw).
Speaking of which, the 'e' in "le chat" isn't pronounced the same as the mostly (but not entirely) silent 'e' in "table".
No diacritics on these 'e' here and yet they've got different pronunciations.
Don't come and say: "but that's only with silent letters". Definitely not. "elle" (she) and "le" (the)... Different pronunciation for these three e's.
I've got better: "les fils" (the sons) vs "les fils" (the cables). Exact same spelling. But in one you pronounce the 's', in the other you don't.
Wait, even better: "le fils" (the son) vs "les fils" (the sons). Same pronunciation for "fils", no plural or singular: just one word with a 's' at the end.
Stop romanticizing about french: it probably has more exceptions and weirdness than english.
And you probably don't want to get me started on the average reading and writing skills in elementary and secondary schools in France. It's in freefall so the whole point is kinda moot: the digital natives can't use diacritics properly in french. Heck, many can't even (and don't want to) speak proper french. The language is becoming simpler and simpler, dumber and dumber.
Source: I'm a native french speaker.
> Stop romanticizing about french: it probably has more exceptions and weirdness than english.
As a non-native to both French and English who was taught both languages at school, there is a difference that french pronunciation rules were taught from the beginning, while english pronunciation was taught just as IPA transcription of dictionary words.
I love French. Institutionalized mumbling.
You might not hear the t in chat, but you need to know it's there to pronounce it properly. And especially if there's a vowel starting the next word.
The problem with French is that pronounciation changed but not ortography. It's easy to see that you did pronounce chaT in the past. Other languages periodically review their ortography. My language had that twice in my lifetime.
>Many native english speaker here like to fantasize on the superiority of other cultures / languages
Some languages really are a lot better than English as far as mapping between spellings and pronunciations. French just isn't one of them; as you pointed out, it's possibly even worse.
I point to German as the superior European language in this regard. I learned some in high school. I can't speak conversationally any more, but I know the pronunciation rules, so if I can read it, I can say it and pronounce it well enough for a German speaker to understand me, even though I don't understand it myself.
That said, German is a nightmare compared to English because of the grammatical complexity (cases etc.), but for pronunciation in relation to spelling, it's excellent. The written form really does reflect the spoken form accurately.
Spanish seems quite decent in both of these aspects.
It's easier than German, but it's still a pain because of gendered nouns. English was right to dump that crap centuries ago.
Indeed!
> it probably has more exceptions and weirdness than english.
Pronunciation-wise, I doubt it. All your examples have English counterparts.
Consider eleven (the vowel sounds for the same letter), psychology (silent p), wind / rewind, many irregular verbs (like read, read, read), Wednesday and business (many letters are just not/weirdly pronounced), history and litterature (one fewer syllable than expected), the complex rules to pronounce the ed + exceptions... You basically have to know how an English word is pronounced to pronounce it correctly. Guessing works but only so far, and I believe less than for French (and I'm a French speaker too).
I have a close friend from the US who likes to make fun of the French language, but when I cite English, he says oh yeah, but for English we already know that! :-)
Anyway, English and French are both quite bad at this, and you are right, that's nothing to be proud about. It's just a reality we have to deal with.
> The language is becoming simpler and simpler, dumber and dumber.
Simpler is not dumber and I absolutely don't think the language is becoming dumber. The last reform (1990) brings more regularity and this is most welcome, freeing us time for things that actually matter, making the language more accessible to foreigners as well as people with conditions like dyslexia or dysorthography and less a status tool. I welcome the French language becoming more welcoming.
Or please strongly back your dumber and dumber statement. Because usually that's just baseless, tired rambling from clueless conservative people saying such things. A French speciality (a national sport even, championed by the Figaro?).
> And you probably don't want to get me started on the average reading and writing skills in elementary and secondary schools in France. It's in freefall
That too. Maybe you should fix your English before lamenting on the writing skills of people, because you are making a lot of basic language mistakes in this very comment in which you are doing this. That's harsh and not nice, but that's what you are seemingly doing to others and I want to take the opportunity to make you feel what it may feel like. Actually, you probably cannot even begin to imagine how you may sound like to people for whom writing is a struggle. Such people often feel ashamed because of people like you. Let's just be forgiving, tolerant, more empathetic and stop using language skills as status and start focusing on the content.
I have a close acquaintance who expresses themself perfectly, only writing without mistakes is hard for them. They even have an official disability recognition for their strong dyslexia (so they can have a related tool on their workstation). Let's just cut people some slack on their writing skills (which are in the vast majority not related to laziness - or maybe you are suggesting people are dumber and dumber?) and the world will be a better place.
See also [1] for a nuanced discussion on "Writing skills are lower and lower". It turns out it's partly due to more people going to school and not only the elite, which is a good thing, including children whose first language is not French and whose life in general may likely be a bit more complicated than the one of a random privileged French child (like I was).
[1] https://www.youtube.com/watch?v=p8SJ6v2A0qU&t=120 (in French)
FWIW, both “history” and “literature” have the number syllables you would expect in my dialect of English (Western American), at least among people I know. But I know exactly what you are talking about! Many regional dialects drop the “o” in history and the first “e” in literature.
On the other hand, we do violence to the pronunciation of “comfortable”. I’ve lived in so many parts of the English speaking world that I can partially code switch pronunciation for some dialects. Kind of weird but not that bad.
Interesting!
So how do you pronounce comfortable there?
In American English, it is common to pronounce it something like “comf-ter-ble” in most dialects. Some dialects of e.g. British English pronounce it as you would expect from the spelling with 4 syllables. I can’t think of an American dialect that pronounces it correctly. Perhaps some New England or Canadian dialects do?
My experience traveling around the English speaking world is that it is very forgiving of pronunciation. What trips you up is differences in vocabulary and semantics. You have to learn a new dictionary and a bit of inexplicable grammar everywhere you travel. I’ve learned very different languages that had similar relationships to adjacent languages; the words are all familiar but the meanings of those words have been remapped to something else. English tends toward a similar pattern.
As a (sort of) Englishman, it's a strange feeling reading about the Normans (or Vikings!) as "they", when in fact it's now "us":
> Then the smile vanishes. There are no more English queens or kings. Only Normans.
Fun fact: due to pedigree collapse, if you have white British ancestors, you most likely have a direct linear connection to every Viking, Norman, and peasant who still has living descendents today. William the Conqueror is your great(great, etc) grandfather, as is Cnut the Great, Kenneth MacAlpin, and Rhodri the Great, etc etc.
It’s funny, how people identify can diverge a lot from genetic reality. Even e.g. Brazilians, who are mostly descendent of Europeans, will always say “we were colonised”
I have a theory that English is popular because pronunciation encodes almost no information so it works well regardless of accent. Some asian languages, and even French, heavily depend on tone for understanding so are tougher for non-native speakers to communicate in. Butchered English can still be generally understood, thus it's position as lingua franca.
French was the lingua franca for a very long time (pun intended)
Former linguistics major here. Interestingly, 'lingua franca' originally referred to a specific pidgin trade language spoken in the Mediterranean. The 'franca' part referred to the Franks, who were originally a Germanic tribe that established kingdoms in what is now France and much of western Europe. By the late Byzantine period, 'Franks' had become a blanket term for all Western Europeans. What happened to both 'Franks' and eventually to 'lingua franca' is an example of semantic broadening.
Yes, the “Franks” in “lingua franca” were mostly Italians.
English is currently popular because money is always popular
This elides a lot of history, despite being glib it's mostly correct.
If English wasn't as easy to learn as it is, it would have been destroyed though.
The absolute selling point of English is the fact that since it has no proper rules it's the "glue" of European languages, it's the bash of human linguistics.
Ugly, crude, nearly impossible to master if you're not using it daily and all it really does is pin together superior languages that actually have formal rules, but could never be as flexible as "common".
Yes, it enjoyed tremendous success due to the british empire, and continues to dominate thanks to the hollywood propaganda machine - and it owes about 90% of it's success to that. But it's important to note that last 10% is important too, and that is because English is an easy language to learn and it is able to evolve rapidly.
> The absolute selling point of English is the fact that since it has no proper rules ...
Anyone who thinks English has "no proper rules" clearly has never had the joy of learning English as a second language.
(Or maybe they have a really warped notion of what "formal rules" mean when it comes to languages. There are no natural human languages in the world that are dictated by formal rules. All formal rules are after-the-fact descriptions devised to explain the language that is already there.)
> Anyone who thinks English has "no proper rules" clearly has never had the joy of learning English as a second language.
https://en.wikipedia.org/wiki/Apophony#Ablaut-motivated_comp... and https://en.wikipedia.org/wiki/Reduplication#English
Tic Tac Toe is the one that I remember most easily...
> Examples include: bric-a-brac, chit-chat, clip-clop, ding-dong, flimflam, flip-flop, hip-hop, jibber-jabber, kitty-cat, knick-knack, mishmash, ping-pong, pitter-patter, riffraff, sing-song, slipslop, splish-splash, tick-tock, ticky-tacky, tip-top, whiff-whaff, wibble-wobble, wishy-washy, zig-zag.
Saying any of those in the wrong order sounds wrong to a native ear.
It even shows up in Live, Laugh, Love.
And then there's adjective order... https://dictionary.cambridge.org/us/grammar/british-grammar/...
--
If people want a language with "proper rules"... head over to conlangs. https://youtu.be/x_x_PQ85_0k https://en.wikipedia.org/wiki/Ithkuil is my favorite (I've got a copy of the grammar guide that is on my shelf of random things next to Random Numbers by the RAND corporation).
English regularly violates its own rules and additionally has no correcting body (Swedish has central body that dictates language rules for example).
That's part of why it's so difficult to fully master, and there are rules (sentence structure) for clarity, but there's no actually solid rules for pronunciation (it differs depending on word) or even what words are really proper words (there are central dictionaries that largely agree, but there are also "Hinglish", patois and the other creole dialects).
English steals aggressively from other languages, since that's its history. Other languages might borrow some words but there's multiple branches of these inside english. You can use English with only latin-root words, or English with only Germanic-root words and both are as valid english as each other.
Easy to learn; awful to master.
> English regularly violates its own rules
That's true for any human language. E.g. in Russian, adjectives use the gender, case and plurality of a noun, until they suddenly don't.
> English steals aggressively from other languages, since that's its history.
That's not unique to English. E.g. Japanese has even borrowed numerals, and some of its pronouns are borrowings. Russian has borrowed verb forms.
Having a lot of Latin borrowings is quite common in most European languages. Even in Romance languages, there are a lot of Latin borrowings (e.g. minuto is Latin borrowing, miúdo is a native Portuguese word).
> You can use English with only latin-root words, or English with only Germanic-root words and both are as valid english as each other.
That's similar to how e.g. Romanian has Latin-based and Slavic-based vocabulary. This is not that unique.
> but there are also "Hinglish", patois and the other creole dialects
Many languages have or had patois and creoles based on them.
> If English wasn't as easy to learn as it is, it would have been destroyed though.
I really dislike this argument. It treats English as a mythical, exceptional language even though it really is not.
English was not particularly hard or easy compared to other European languages. It did not have a particularly hard or easy structure, and orthography took centuries to normalise in continental languages as well. It had the quirk of combining Germanic grammar with Romance vocabulary, but that’s relevant for linguists, not most speakers.
What happened is that it was simplified and adapted over the course of centuries.
French was not displaced by English because of some magical language qualities. The French were displaced by the British somewhat, but mostly by the Americans and language followed.
> the hollywood propaganda machine - and it owes about 90% of it's success to that
Who's being glib now? Most people learn English because it's means making more money - in technology, finance, tourism, ...
That probably depends where you live. A lot of Nordic people tell me the learnt English as a kid watching cartoons, long before they were thinking of such things.
we learn english because it is a subject in school. money does not come into consideration for most people. the motivation to teach english in school is another question however. as is the motivation for parents to pay for extra english classes outside of school.
Another way of saying this is that spoken English has a lot of inbuilt, inherent error correct-ability, ala a very large minimum Hamming distance between spoken words/phrases.
I always found French to be very much the opposite in spoken form, due to the 'consonnes finales muettes' and liaison and élision, along with the large amount of homonyms and general colloquialism used in everyday speech. Yet in written form, it is nearly as straightforward as English, as you get back those damn letters that aren't being spoken.
> I always found French to be very much the opposite in spoken form, due to the 'consonnes finales muettes' and liaison and élision, along with the large amount of homonyms and general colloquialism used in everyday speech.
It is not that different from English in that respect. I found both to be quite difficult compared to e.g. German, which is very regular, or Spanish (which is annoying grammar-wise but straightforward to pronounce).
Spoken English is full of elisions and silent letters, and also full of locale-dependent colloquialisms that take some time getting used to. I remember struggling for a while living in New York and London despite having a decent level in “standard” English. I still occasionally struggle with my mates from Ireland and Yorkshire. After living more than a decade in English-speaking countries I accepted that I will never be able to pronounce correctly a word I never heard before.
Missing liaisons is not problematic when you speak French. It marks you as a non-native but it does not make you harder to understand. They can be often omitted by natives as well, depending on the accent.
> I will never be able to pronounce correctly a word I never heard before.
This happens not infrequently to native English speakers. It's especially prominent for people who read a lot when they're young and develop a large vocabulary that doesn't get socialized until much later. My English teacher, of all people, was notorious for this.
Real life examples from native speakers: Emphasizing the wrong syllable in "forage", "respite", and "parameter". Pronouncing "draught" like "fraught". Softening "chasm" and "chaos". And an extra syllable (long e sound) in "homogeneous".
This is my experience as well. I would just add
> Emphasizing the wrong syllable
As a native French speaker, I will never be able to emphasise the right syllable in any language, ever :D
Stress is just not really a thing in French, and it is quite difficult to get it intuitively later in life.
It's fairly easy when the stress is consistent or mostly so. In English thought it's a phonemic feature in its own right so there's no general rule to memorize in the first place - you just have to learn it for each word.
But I think its global dominance has more to do with historical and economic factors than linguistic flexibility
Fun facts almost one third (1/3) of English language vocabulary are similar to French. To be exact most of the professional and legal version of the English words are taken from French. Hence if you understand English, you can read short notice or announcement in French, and understand them mostly. But if you have people spoken the same notice and announcement in French version to you without you reading it, most probably you won't understand most of the same sentences.
Plus there is another 1/3 coming from Latin which French speakers has no issue understanding either. English is basically akin to a dumbed down pidgin of French (exponentially less verb conjugation, no gender agreement, less pronouns with the thou/you merge, less articles and annoying small words, etc.) starting over a Germanic core.
Harsh but that rings true. In it's defense i'll point out that English is exponentially more useful in the modern world and even French has started borrowing nouns from English. Also English has more words then any other language which in my mind makes it the best. (to clarify i know a little of other languages and i understand that there are concepts which English is not even equipped to express properly but i stand by what i write)
I'm still learning, English is huge and it can be a delight to discover.
Most words in foreign languages that most people believe don’t have an English equivalent often do, but the English word is so obscure that almost no native English speaker knows it, and as you point out, English vocabulary is so large that no one will ever come close to learning it all. English is the C++ of human languages.
What interests me is the prominence of words in foreign languages that have an extremely obscure equivalent in English. Like, why do they devote common vocabulary to it and what does it mean that they do?
I have been conversational in languages almost no one learns from parts of the world no one cares about. They are full of words like this and I still use those words in English because that was the first word I learned for the concept. But when I’ve taken the time to see if an equivalent English word exists, it always does. Ironically, it is safer to assume that my ignorance of the English language (my native language) is more likely than the lack of a word in English for a thing.
> English is the C++ of human languages
You can say that again.
That's exactly what I'm alluding in my other comments thread but referring to Chinese language and writing system complexity rather than English for the C++ and Rust, but on second thought Rust probably be the Chinese equivalent.
> But when I’ve taken the time to see if an equivalent English word exists, it always does.
It's the same happen with C++ that has been ripping up Dlang features for quite sometimes now including its new module system [1].
[1] Converting a large mathematical software package written in C++ to C++20 modules (42 comments):
https://news.ycombinator.com/item?id=44433899
During the Norman Conquest, England was ruled by the French... and that is when those words entered the language.
Also from that time was many culinary words. The word for the meat in English is the word for the animal in French (the word for the animal in English is likely germanic in origin). That was in part because the when the French speaking nobility wanted boef (French for cow), they didn't want a cow (German Kuh) - they wanted the meat of a cow. So English got beef. Pork? French asked for porc, but didn't want a swine.
While ~29% of the dictionary words come from French, in any written or spoken sentence the number falls dramatically. All the small joining words we use and the core of our grammar is Germanic.
See Rob Word's "Is English really a Germanic language?" https://www.youtube.com/watch?v=PCE4C9GvqI0
Fun fact, in your first sentence there is approximately 30% of the words which are not Germanic.
(my separation of the words, which may be slightly off:
While of the words come from, in any written or spoken the falls. All the small words we and the of our is // dictionary French sentence number dramatically joining use core grammar Germanic )
Is it the case that it encodes no information, or is it the case that the information is somehow..."optional"? I know I selectively ignore unintentionally snarky or sarcastic tones from non native English speakers. Even the simple example of turning "ok" into "oook" can be used to imply someone is being unreasonable.
Singlish says, "Hi." It's fascinating to watch that in action here in Singapore, as I find Singlish to be a compact and efficient form of English that greedily borrows words from Mandarin, Hokkien, Hakka, Teochew, Malay, Tamil, and a few other languages to enable rich communication among the various cultures, ethnicities, and language groups found here. It even borrows the grammatical structures of some of those languages, and yet the meaning still gets through. It didn't take me long to get comfortable with it, and it helped me appreciate the promiscuous nature of English even more.
Chinese doesn't use accents, but the characters are extremely complicated in comparison. The chacters are both the images and the specific strokes which draw the image.
Spoken Chinese has at least five tones (1,2,3,4,5 Number five stands for neutral) but to native speakers there is much nuance.
I won't explain the reason of its popularity. Someone braver than I may do it. Grammar is very simple, by the way
Chinese is hard in unnecessary way both in language speaking and writing. I've got the impression that they make in unnecessary hard so only certain people can operate the language and work in government or I call it the elite mentality. I've got the same impression about complex programming languages for examples C++ and Rust. The languages are so complex that you cannot even make the compiler fast [1].
Spoken mandarin has 5 tones but the original ancient Chinese is similar to Cantonese and it has 7 tones. The modern Chinese writing characters is considered simplified because in Taiwan they use the original and more complex Chinese characters.
Fun facts King Sejong of Korea actually get rid of the cumbersome Chinese characters for writing Korean languages and introduced new Korean characters Hangul in 15th CE [2[. It's reported Korean literacy rate skyrocketed in a very short time because it's much easier and suited the Korean language better. Another fun facts, Korean characters can be learnt overnight but you need to memorize and understand several thousands of Chinese characters just to read and understand the newspaper headlines in Chinese. I have a Chinese friend who has Chinese mother tongue and is a well accomplished senior engineer but he cannot even read Chinese newspapers since he did not has a formal education in Chinese writing system.
As Einstein famously remark you should make it simple but not simpler.
[1] Why is the Rust compiler so slow? (425 comments):
https://news.ycombinator.com/item?id=44390488
[2] Hangul:
https://en.wikipedia.org/wiki/Hangul
> Chinese is hard in unnecessary way both in language speaking and writing.
Is spoken Mandarin really "hard in an unnecessary way"? I think it's quite straightforward, except for the tones. The tones are difficult for anyone who isn't a native speaker of a tonal language. But they are trivial to learn as a child, and easy to learn for native speakers of say Thai (a mostly unrelated language that also happens to use tones). Uneducated people in all walks of life speak both Mandarin and their local dialect well.
Written Chinese really is objectively difficult, and it's a believable argument that before Mao it was intentionally gatekept that way to have a caste of intellectual "elites".
in addition, chinese grammar is very easy. what makes learning chinese hard is the writing because it is difficult in itself and you can't use it to reinforce the learning of spoken words or vice versa.
>The tones are difficult for anyone who isn't a native speaker of a tonal language
That's the the majority of the world's population.
Sure, but it's a tiny minority of Chinese learners.
"Hard in an unnecessary way" implies that it's objectively difficult and complex, not just different from what certain outsiders are used to.
As much as I approve of shitting on Chinese characters, a lot of the arguments about literacy don't really apply in the modern age. Back in the 1400s when Sejong and his ministers published the Hun'ming'jeong'eun, sure, but in the modern day literacy is pretty much driven by the modern schooling system and even Japan achieves high literacy rates. It's a bunch of unnecessary extra work, but it's not an impediment to being able to read if that work is put in.
It is true that in 1400s Korea being able to read was a sign of status, and the literati argued against making it easier to preserve their station. The same applied to postwar Japan according to J. Marshall Unger.
Given the meaning of "accent" given in the article Chinese seems like a very accented language (saying that as a Mandarin speaker). Aren't Chinese tones the very definition of an accented language? (as defined by the article, accent is a broad term)
The article is talking about diacritics, ie. turning e to é and so on. Even by that standard, though, pinyin uses diacritics to mark tone.
So that it fits into 7-b it ASCII.
> This is why English has combinations like sh, th, ee, oo, ou that each make only a single sound.
Struggling with the th and ou here as only making a single sound.
Through and rough, both not the same ou as sound.
That and Thames, but this might be becaues Thames is proper noun?
The only thing I like about Croatian is that there is none of this nonsense. If you understand the letters and how to pronounce them, you can read a word and pronounce it correctly. In English there are so many words that you would have no idea the correct pronunciation until you've heard a native speaker say it. Even that's no guarantee it will be correct though!
Correct. It's "one sound" represented by two letters, but not always the same sound in different words.
Ah, I misunderstood "one sound" as meaning, "these letters only ever produce one sound that is unchanging regardless the word".
The claim that “English doesn’t use accents” is, quite frankly, a bit naïve. After all, one only needs to step into a café for a crème brûlée, or to read a novel featuring a tête-à-tête, to see that English is perfectly comfortable with accented words. From the résumé of a job seeker to the façade of a building, from the attaché case of a businessman to the naïve assumptions of a newcomer, accents are sprinkled throughout our language like so many éclairs in a patisserie. English may not have been born with diacritics, but it has certainly acquired a taste for them.
In almost every example you state, we've taken the word and the pronunciation, but have dropped the accent marks themselves. It's part of what makes English pronunciation a minefield.
If I recall correctly, the accent marks only started to get dropped when people began using keyboards. Resumé and fiancée still got an accent when it was handwritten
If only you had written an example that is not a French word!
You have actually reinforced the thesis you were trying to dispute: English does not use accents.
> perfectly comfortable
Seeing that virtually none of these are pronounced as they are in the original, I would say that English keeps them out of respect for their source language, but is definitely not comfortable with them.
French barely has diacritics.
There aren’t that many different diacritics, but every other word has one. Just look at any text in French, it’s full of é, è, ê, and à.
You mean Umlauts?
Fascinating site...
Diacritics mostly just add a layer of unnecessary complexity, and make it hard for computers to handle.
Given the above, I'm surprised Esperanto was designed using accent marks. But I suspect those weren't the most practical people.
Computers were not much of a concern in 1887.
As Esperanto was designed to be a neutral language that took words from many (European) languages, it needed an orthography that would unambiguously denote those sounds.
You can trivially transliterate the circumflex letter with digraphs using h anyway. ĝ -> gh
Still perfectly readable.
If you want a language where there is a trivial and unambiguous mapping between written and spoken language and the sounds don't exist in basic latin script, you need to do something. You can use diacritics, you can use digraphs and other tricks or well you just give up. But saying they are "unnecessary complexity" is very mistaken.
Digraphs can still unambiguously denote sounds, but Zamenhof really wanted the canonical spelling to have one character = one phoneme.
And, ironically, it was easier to do it back then compared to the computer age, because typewriters did something similar to Unicode combining marks - specifically, on the French typewriter, you'd have a single key for ^ which was used to overtype letters where circumflex was needed. And since French typewriters were very common in all countries Zamenhof used circumflex for his letters (except Ŭ, seemingly just because).
As far as digraphs, using "h" can be ambiguous in some cases, which is why Esperanto digraphs are more often written using "x" these days when diacritics aren't available for some reason. I actually quite like that scheme for several reasons. The obvious one is that "x" is then strictly a modifier letter with no sound value of its own, like hard/soft sign in Cyrillic, so all digraphs are unambiguous. But also, there's a matching diacritical mark, "combining X above". So we could e.g. say that "cx" and "c̽" are the same thing, and you use the latter when possible falling back to the former when you need ASCII.
> Computers were not much of a concern in 1887.
Obviously true, but typewriters were in common business use by then.
Tldr. 1066 French didn't have them. Later on that each language French / English independently solved the "need more letters" problem. English adds them e.g. "th" is 2 letters for a sound. French uses diacritics.
We are polite and it is considered racist.