2869: Blueberries aren’t Blue* Oct 29, 2024
Blueberries are of course named for the color despite the strong reddish undertone—from the same natural chemicals that give red onions and red cabbage their purple color. Of course, there are also ‘blackberries’ and potentially ‘raspberries’ (see more on what color that is) also with color-based names, but naming precision aside, not everyone around the world agrees on blueberries.
The Modern Hebrew word for them is אוכמנית (uchmanit) from an Aramaic word אוכמא (ukama) meaning ‘black’. Likewise, the Finnish ‘mustikka’ comes from ‘musta’ (black), and Russian черника (chernika) from чёрный (chornyj), also for ‘black’. Georgian’s word, მოცვი (mocvi) does sort of mean “blue+berry” but the language has 3 different words for blue, and this is the lightest, more like a cyan or turquoise.
2868: Henna & Heresy (Gopher pt. 3) Oct 28, 2024
The mysterious gopher wood may be cypress, related to Cyprus and copper, but it also has another place that it pops up. As with the Hebrew כופר (cofer) the Greek κύπρος (kupros) means ‘henna’ which English got from Arabic, along with ‘tachina’ (a.k.a. tahini) meaning ‘to grind; smear’. While the Greek and Hebrew words (Greek adopted its root from a Semitic source) relate to both the plant and the body-paint byproduct, Hebrew כופר also means ‘heretic’*. The sense was extended from smearing to blotting out, and then somewhat metaphorically to blotting out someone’s name.
The root letters can also mean ‘atonement’ or ‘village’ but these 3 roots are not clearly related.
2867: Cyprus and Cypress pt. 2 Oct 26, 2024
The Greek word κύπρος (kúpros) is not by itself a confusing term, but it pops up in lots of interesting places. Κύπρος is the Greek name for the island of Cyprus, which is the same name it has had since antiquity. Because the island was famous as the largest copper producer in the Mediterranean during the Bronze Age, this is the source of the metal’s name in Greek and eventually in English. Of course, it is also the name for the cypress tree, differentiated in spelling in English from the island for clarity, but pronounced the same.
Over the years, some have tried to connect the name to different religious practices of the Phoenicians or others, but even if this is true, this is not the source of the name of the island.
2866: Gopher Wood: An Etymological Mystery pt. 1 Oct 25, 2024
There are numerous untranslated (potentially untranslatable) words from Biblical Hebrew, typically in the realm of natural items like plants, animals, and minerals. One such example is גפר, the type of wood used for Noah’s Ark. The sentence structure itself is somewhat notable insofar as עשה לך תבת עצי־גפר (Make for yourself a box of woods/trees of gopher) is plural, which is atypical for mass nouns, but it is definitely a tree type.
The main candidate for what this would have been is, in English, the cypress. The name of this tree, specifically the Mediterranean cypress, is traceable to Ancient Greek κύπρος but that itself appears to be of Semitic origin, potentially related to the Akkadian kupru, denoting a type of aromatic tree sap, and would be related to the root גפר.
This is part one. There will be more on this root tomorrow.
2865: A Non-Standard Unit? Oct 24, 2024
Those partial to the metric system may mock imperial or US customary units for their lack of consistent increments, but the top contender for inconsistency may be the penny, abbreviated d, historically the abbreviation for pence. It is used for nails, especially those not measurable with wire gauges, but what makes the sizes non-standard is their lack of internal consistency. For instance, while at first they seem to increase by ¼ inch per penny, this does not hold up.
2d = 1 in., 4d = 1½ in. and 10d = 3 in. in length, but that’s where the logical increments stop. Meanwhile 20d = 4 in. (not 5½ in.) and 60d is only 6 in. and not the staggering 15½ inches one would expect having looked at the beginning of the scale.
This is because, even though nails are still sold in this way (only) as a system of dimensional measurement, the name is a reflection of how much 100 (or 120) nails of such a size cost in 15th century England. That is to say, 2 pennies got a hundred 1-inch nails, but 60 pennies only got a hundred 6-inch nails.
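The arithmetic above can be checked with a short sketch. It assumes the naive rule implied by the start of the scale (2d = 1 in., plus ¼ inch for each further penny) and compares it against only the lengths quoted in this post; the names `QUOTED_LENGTHS` and `naive_length` are made up for illustration.

```python
from fractions import Fraction

# Nail lengths in inches for each penny size, as quoted in the post.
QUOTED_LENGTHS = {
    2: Fraction(1),
    4: Fraction(3, 2),
    10: Fraction(3),
    20: Fraction(4),
    60: Fraction(6),
}


def naive_length(penny: int) -> Fraction:
    """Length if every penny above 2d added exactly 1/4 inch (an assumed rule)."""
    return Fraction(1) + Fraction(penny - 2, 4)


for penny, quoted in QUOTED_LENGTHS.items():
    predicted = naive_length(penny)
    verdict = "matches" if predicted == quoted else "breaks the pattern"
    print(f"{penny}d: naive {float(predicted)} in., quoted {float(quoted)} in. ({verdict})")
```

Under this assumed rule the sizes up to 10d line up, while 20d and 60d come out at 5½ and 15½ inches rather than the quoted 4 and 6.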
2864: Oldest "New" City Oct 23, 2024
These days, cities or other regions being named for another place is pretty common. Just looking in America, 4 of 50 states—New Hampshire, New Jersey, New Mexico, New York—are named as ‘New …’, not to mention countless cities, both with the epithet or not (e.g. Cairo, Illinois or St. Petersburg, Florida). It may be surprising, but the ancient world had some ‘New …’ cities as well.
The oldest contender is probably Carthage, in what is now Tunis. In Phoenician it was Qart Hadasht, just meaning ‘new city’, named when the Phoenicians a.k.a. Canaanites moved their seat of power there from Tyre, Lebanon sometime in the 9th century BC. That might not count since it is not named for another place. What definitely was named for another place is New Carthage, known now as the Spanish city of Cartagena, from Latin Carthago Nova, or in Phoenician Qart Hadasht, again just meaning ‘new city’; in Latin it was directly named for the city of Carthage, as the Romans did not understand the name meaningfully. This New New City, as it were, was founded or at least rededicated shortly after the First Punic War.
Elsewhere, the city of Naples comes from the Greek Νεάπολις (Neápolis) also meaning ‘new city’, after having expanded the port city of Παρθενόπη (Parthenope) meaning "Pure Eyes". Again, this is not exactly naming after another place, but to consider a “new city” in the 7th century BC is notable.
Otherwise, the practice of naming a city or region wholesale after another place only became common in the Age of Exploration, but was common for all nations involved, including the Portuguese, Spanish, Dutch, French, and English alike. This also came at a time when the frequency of naming cities was maybe at its height.
2863: Numerals for Letters Oct 22, 2024
In languages that use alphabetic writing systems originally developed for other languages, or even just earlier versions of the given language, there are usually two ways of representing sounds outside of that alphabet. The first is the use of diacritic marks on the letters, and the second is by using digraphs and so on, which can be different letters like the English <th> or the same letter like the Spanish <ll> or <rr>.
One less common but equally effective way is to simply make up a new letter. Often this is a modification on an existing letter, like the Turkish İ, i obviously modeled after the lower case Latin I,i, but some languages have also borrowed from numbers. For instance, the Squamish name for its people is Sḵwx̱wú7mesh where its notable 7 is seen representing the glottal stop, quite similar to the IPA symbol for one, which is ⟨ʔ⟩, but is more available to print and type. In some Mesoamerican languages, namely Yucatec, historically <ꜭ> and <ꜫ> were used for the ejectives /k’/ and /q’/ respectively. These letters are respectively called Cuatrillo and Tresillo meaning ‘little 4 / 3’ in Spanish. While no longer in use, replaced by both diacritics and digraphs, they were some of the earliest attempts to transcribe these sounds unfamiliar to the Spanish. It was easier back then when it was not a concern to have to contend with the limitations of the printing press, nor certainly keyboards.
2862: Cedilla Ç: Why? Oct 22, 2024
The letter Ç, known as a cedille (from French) or a cedilla (from Spanish) is used in Turkic languages, along with a few Romance languages. The origins have been discussed here before, including how it originated to represent the /t͡ʃ/ (like CH in English 'chew'), but this still raises the question of why specifically the letter C was chosen to be modified, and about its strange name.
The name ‘cedilla’ means ‘little Z’ in Spanish, and likewise with its name in Portuguese and French. This is because it is not a C with a comma underneath, c̦, like in some Eastern European languages (also, those have a space in between), but originated as a combination of the letter C and Z: ꝣ, but in a cursive form like ʒ, similar to the German ß. In both the original /t͡ʃ/ pronunciation and the modern French / Portuguese pronunciation, which is like /s/, this makes some sense, but why C?
This is because in Romance languages especially, the letter C has a tendency to vary widely depending on the linguistic environment, namely based on which vowel follows. Consider the most, or maybe only, widely used Ç in an English word: façade. In Latin faciēs (meaning, and related to ‘face’) was pronounced with a /k/ sound, which became Italian ‘facciata’ with a /t͡ʃ/ sound, which was then loaned into French with the /s/ sound. Of course, it could have been written with an <s>, but to keep some remnant of the etymology while also indicating that it is not pronounced like /k/ (which it normally would be before <a>), the letter Ç is used.
None of this etymological reasoning applies in Turkic languages; there it is simply convenient to have a distinct letter, already available in typefaces, for /t͡ʃ/.
2861: Arabic Afrikaans Oct 20, 2024
The South African language of Afrikaans dates back to the 17th century as it diverged from its Dutch origins, but the first texts only arose as late as the early 19th century. While Afrikaans is a Germanic language, the earliest written samples used Arabic script. These were Islamic religious texts written by the Cape Malay population, one of the earliest groups to take to Afrikaans as a first language.
Arabic Afrikaans was a short-lived orthographic trend, lasting until the language’s writing system developed with Latin letters and was standardized in the 20th century. Nevertheless, over seventy Arabic Afrikaans texts remain extant, with these early writings shedding light on the robust literary and organizational skills of the Cape Muslim slave population. In comparison, the Dutch Afrikaner group did not publish written material in Afrikaans until the late 19th century.
-Jordyn Stone
2860: A Confusing Surrender Oct 19, 2024
Diglossia, a situation that occurs when two languages exist within one community, is not that rare. Usually there is a higher and lower register, such as with Latin around Europe, or Classical Arabic (and even today Standard Arabic) in the Middle East and North Africa, which were used as formal or official languages but not used by the common person day to day.
This kind of academic-only bilingualism was on display when the Empire of Japan surrendered to the Allied Forces in WWII, despite the fact that the cultural practice of using an older, more formal version of Japanese had been on its way out already a century before this. Emperor Hirohito made an announcement broadcast on national radio in a form of Medieval Japanese that, while not completely unintelligible, would have been unfamiliar and difficult to understand for the typical Japanese citizen. Adding to the confusion, he never used any terms of surrender, only referring to the “conditions of the Potsdam agreement”. This meant that radio announcers had to separately clarify that Japan had surrendered.
2859: Raspberry Oct 18, 2024
‘Raspberry’, like ‘cranberry’, is not parsable; that is to say, ‘rasp’ is a word, but unlike blueberry, it is not clearly “rasp + berry”. In fact, it is not clear where it comes from at all. On the one hand, it could be from the Latin “vinum raspeys” referring to a type of wine, with the berry named after the color of the wine. It could also have a Germanic origin and indeed be related to rasp, referring either to the berries’ coarse texture or potentially the thorny branches.
2857: Final Kaf כ: Arabic transliteration in Hebrew Oct 16, 2024
Hebrew has developed certain standards when it comes to transliterating other languages, which usually avoid the problems posed by the limits of applying its writing system to other languages. For instance, certain letters are used only vocalically in transliteration (see more about that here). Another issue is that some letters change environmentally but this is not treated as such in loans. For instance, traditionally, פּ (peh) is found only after a consonant, but after a vowel it is פ (feh), and the letter is also different at the end of the word. Since peh is always feh at the end of a native word, written in the word-final form ף (such as סוף sof ‘end’), in loans a final /p/ is written with the regular peh, such as from English telescope: טלסקופ. A similar situation exists with the letter ך/כ (kaf / khaf) which also goes from plosive to fricative after a vowel, and it has a word-final form.
In loan words from most languages, the letter ק (koof) is used for the /k/ sound no matter the spelling (note ‘quiche’: קיש), even going back to ancient loan words like קיסר (kesar) for ‘Caesar’. This is despite the fact that in traditional Hebrew ק (koof) represents the guttural* /q/, but it does not have a fricative form like כ (caf) does. However, Arabic does have both a /k/ and /q/, unlike say Latin or English, so in transliterating those words, the כ (caf) must be used. This is why words like הי טק ‘high tech’ or לינק ‘link’ (online), taken directly from English, use a ק (koof). Arabic allows /k/ or /x/ after a vowel, as in תאריך (tarikh) ‘calendar date’, but it also has a /q/, as seen in the Hebrew אופק (ofeq) ‘horizon’ and אפרסק (afarseq) ‘peach’. While it isn’t loaned into Hebrew, the term שמכ (shemek) meaning ‘your name’ (masculine) is spelt with the standard כ (caf) form for clarity.
That said, while the spelling from Arabic is preserved, the distinction in Modern Hebrew of using /q/ barely exists, but even the choice between two letters when they make the same sound can be important for a sense of etymological clarity, which English does too, arguably to a fault.
*This is just a colloquialism in English; it is a uvular consonant.
2856: Beheld and Beholden Oct 15, 2024
Behold has two possible participial forms: beheld and beholden. The former is also used as the past tense, but as with other words in English (see: hanged-hung) there is a semantic difference between the two: ‘beheld’ denotes looking or regarding, as ‘behold’ does, while ‘beholden’, meaning ‘obliged’, does not act the same way. You might think this was a shift over time of the whole word, but really ‘beholden’ is acting more in line with the root ‘hold’. The prefix be-, which was more productive in Old English, changed the meaning to something extended temporally (beholden = held continuously) but also metaphorically (behold/beheld = held in view). As a result, in older forms of English ‘beheld’ and ‘beholden’ etc. could be used interchangeably, but when the prefix be- became less productive, the meanings narrowed and diverged.
2855: The Rise of ‘Human’ Oct 14, 2024
The word ‘human’ is very common today and is increasingly used in a generic sense like ‘person’ instead of its previous relegation to the field of science. It is easy to see why it replaced ‘man’, with ‘humankind’ rising to similar levels as ‘mankind’ in recent decades, and ‘mankind’ on a precipitous decline, but ‘human’ is also beginning to be interchanged with ‘person’ outside of formal settings. The word in particular exploded in popularity over ‘man’ starting in the early 1960s and especially in the next decade, and now seems to be losing its academic connotations.
Somewhat ironically, ‘man’ in English was not always gendered, but later replaced the word ‘wer’. German as well now has the word ‘Mann’ (man; husband) but uses ‘man’ as the pronoun ‘one’, and far more commonly than in English wherein most people opt for the 2ⁿᵈ person ‘you’ (acting like 3ʳᵈ person). Meanwhile, ‘human’ comes from the same Latin root as ‘homem’, ‘homme’, and ‘hombre’ (in Portuguese, French, and Spanish) for ‘a man’.
2854: Problems with S- (part 2): Germanic Oct 13, 2024
From the Atlantic coast of Portugal all the way through to much of Central Europe, you will not find many native words beginning with [sp] or [st], or even [sk] sounds. This was explored yesterday in Romance languages. In German, the sounds ST and SP are not possible at the start of a syllable, and are realized as [ʃt] and [ʃp] (like SH-p/t) even though the spelling would indicate otherwise. Technically, SK is also not really possible, and morphed into Sch [ʃ] (just SH-). This is true at the start of syllables, not just words. All of this is also true of Yiddish although that uses a different writing system.
But otherwise it does not really look like other Germanic languages struggle with this; after all English and, say, Afrikaans are replete with words beginning with S+ -T, -P, or -K. While these will not have the same difficulty exactly, things are not so simple.
In English, the stops after S are referred to as being clear, meaning that they are not denoted as having much aspiration (as would be found if it was the first sound of a word) nor are they glottalized at all as might happen at the end. For clarity, say a word like ‘pop’ or ‘tot’ and notice that the first consonant doesn’t sound exactly like the second. The trouble with these clear plosives is that they also are not clearly voiced, and so while in the other languages looked at here the S has changed, in English it is the following consonants that change. For instance, it is not clear whether the word ‘stop’ is pronounced [stɒp] or voiced as [sdɒp] (with a [d]).
This is taken to a further extreme in Afrikaans, where these stops are losing voicing entirely, except in between vowels, in a system that more closely resembles tones, where the sounds are differentiated by pitch. Admittedly this is not a problem only for plosives following S.
You can read more about what’s going on in Afrikaans (not from Word Facts): here.
2853: Why So Many Spanish Words start 'Es-'? (pt. 1) Oct 12, 2024
In Western European Romance languages, excluding Italian, there is always an E- in front of S+consonant, like in Spanish ‘estoy’ (‘I am’) that used to be ‘sto’, and just look at these English cognates in Spanish: school → escuela (French: école), special → especial though not actually in ‘Español’ (‘Spanish’) which comes from ‘Hispania’. ‘Emerald’ is ‘esmeralda’ in Spanish, from ‘smaragdus’ in Latin, but French words not only gained the E like with Spanish and Portuguese, but also dropped the S, hence where English got it.
This occurs in words that developed from Latin beginning in SP, ST, and SC [sk] (in other words, <S> + [unvoiced plosive]), and the vowel [e] was added for phonetic ease so as not to have to begin a word this way.
While this sound combination may not look so difficult, especially from an English speaking background, keep in mind, Romance languages tend to articulate the /s/ sound further back than in Germanic languages (see more [s̠]) and these sorts of slight differences will culminate in very large changes over whole words.
There will be more on this phenomenon in Germanic languages, including English, in the post tomorrow.
2851: How Long are Long Vowels? Oct 10, 2024
If you were operating a telegraph, you would need to distinguish between dots (*) and dashes (—), and to time the empty spacing, given that ambiguity in any of these lengths would lead to confusion and unintelligibility. This means that the relative length of a dot to a dash is standardly 1:3. A similar problem emerges in languages with short and long sounds.
In some languages, this is somewhat foolproof, because vowel length, for instance, is interdependent with the surrounding sounds. In Swedish and pre-Modern Hebrew* for instance, a long consonant is always preceded by a short vowel and vice versa. Meanwhile, in languages like Estonian and Arabic, the length of any consonant or vowel is an independent feature for the most part.
Languages with phonemic vowel length or consonant length most commonly see a ratio of 1:2 when comparing how long a short and a long vowel are held. Some exceptions exist, like Finnish which is 1:3, so you might expect other languages where the relative lengths of consonants and vowels are independent to have similarly large ratios. Actually in Arabic, the ratios can still be 1:2 or even closer because the average length for long vowels (for all languages, really) depends to an extent on the vowel in question, with high vowels like /i:/ usually being shorter. In Estonian too, with short, long, and overlong vowels, the ratio is somewhere like 1:2:4 or even 1:2:5.
*In Hebrew, this principle is often undermined by elision that instead forces a long consonant after a long vowel.
2850: Why Croatian Catholics Never Used Latin Oct 9, 2024
One of the defining factors of pre-modern Europe and especially the Catholic church was the proliferation of Latin. Through into the Early Modern period, it was a lingua franca used by all people for academic and other formal writings or procedures. It was even the official language of Hungary into the 19th century.
Croatia used it as well in many of those areas; however, it was the only Catholic region permitted to not use Latin for the liturgy before the 20th century. Instead it used Church Slavonic, which in Eastern Orthodox regions held similar sway to Latin in the West.
Pope Innocent IV (AD 1195–1254) allowed the Croatians to use Church Slavonic as written in the Glagolitic script, a combination used elsewhere exclusively by the Eastern Orthodox Church. This is especially strange given that Latin was the sole official language of much of what is now Croatia until 1847, so while it was used in legal and academic settings, religious services were conducted in Slavonic. Both were considered dead languages even at the time, but the fact that Church Slavonic was permitted as the only exception to Latin in the Catholic Church was a concession to Croatia as a key borderland territory, so close to the Italian peninsula that its potential downfall posed a real risk to Rome.
2849: Short, Long, and Overlong Consonants Oct 8, 2024
Usually, the distinction between the consonants [b, d, g] and [p, t, k] is one of voicing, which is to say whether the vocal cords in the larynx are engaged or not, but this is not true in Estonian orthography. Estonian, which has both phonemic vowel and consonant length, uses [b, d, g] for the short consonants, and [p, t, k] for the long, aka geminated consonants. This is only true of those plosives, not other consonants.
Where many languages have merely short and long consonants (or like English, no meaningful difference in length at all), Estonian has 3 categories. In the case of overlong consonants, the spelling is duplicated, i.e. [pp, tt, kk]. For example,
kabi (‘hoof’) is actually pronounced /kɑpi/ (short). Meanwhile
kapi (‘of the wardrobe’, genitive singular) is pronounced /kɑpːi/ (long), while kappi, pronounced /kɑpːːi/ (overlong), is the illative case, meaning “inside the wardrobe”.
Note that Estonian also has 3 levels of phonemic vowel length, with for instance all single-syllable words having an overlong vowel.
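The spelling rule for plosives described above can be sketched as a small lookup. This is only an illustration of the three-way mapping as stated in this post, not a full model of Estonian spelling or pronunciation; the function name `classify_plosive` is hypothetical.

```python
# Letters b, d, g spell the short versions of the plosives /p, t, k/ (per the post).
SHORT_SPELLINGS = {"b": "p", "d": "t", "g": "k"}


def classify_plosive(spelling: str) -> tuple[str, str]:
    """Map a plosive spelling to (consonant, length category) as described above."""
    s = spelling.lower()
    if s in ("pp", "tt", "kk"):
        return s[0], "overlong"   # doubled letter: overlong
    if s in ("p", "t", "k"):
        return s, "long"          # single p, t, k: long
    if s in SHORT_SPELLINGS:
        return SHORT_SPELLINGS[s], "short"  # b, d, g: short
    raise ValueError(f"not a plosive spelling covered by this sketch: {spelling!r}")


# The medial consonants of the three example words: kabi, kapi, kappi.
for word, medial in [("kabi", "b"), ("kapi", "p"), ("kappi", "pp")]:
    consonant, length = classify_plosive(medial)
    print(f"{word}: /{consonant}/ is {length}")
```

The sketch takes the medial consonant spelling as input rather than finding it in the word, which keeps it from making claims about word-initial or word-final plosives that the post does not cover.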
2848: Box: From Knotted Shrub to any Rectangle Oct 7, 2024
Boxwood is quite a hardy wood, despite the fact that it comes from an evergreen shrub, so due to its strength but size limitations, it is used in small strips to reinforce pieces, in a process known as ‘boxing’. While it would be impractical for making large boxes, it seems to be the origin of the word ‘box’ from Latin pyxis (‘medicine box’) likely related to the Ancient Greek πυξίς (puxís) for ‘boxwood’. This has led to a number of words that are now only distantly related in meaning, like the Irish ‘piseog’ for ‘evil curse’ from the original Latin meaning, or indeed the sport of ‘boxing’, or the more generic use seen in Afrikaans ‘bos’ meaning ‘bush; shrub’ of all types, hence ‘rooibos’ (red bush).
In English, despite the fact that the root possibly meant ‘to bend’ in reference to the shrubbery branches, it can now be used for any sort of rectangle or rectangular prism, real or imagined. Insofar as it can be used generically to refer to a rectangle particularly to enclose something, this was reinforced by the shrub being one of many frequently used for acting as a kind of fence or barrier, because it is very sturdy.