Understanding Phonetic Modifiers

Note: You will likely need to have the "Arial Unicode MS" font installed on your system to read the tables below.

Phonetic modifiers (also known as diacritics) are small glyphs attached to a primary alphabetic glyph, generally to modify the pronunciation of that glyph or of the characters around it. This document has three tables: common modifiers, uncommon modifiers, and Thai modifiers. The common modifiers are used in the most widespread European languages, the uncommon modifiers are used sporadically in other languages, and the Thai modifiers are specific to Thai. The Thai table is included because we often build applications for the Thai market and have to handle these modifiers in text processing code. Modifiers also exist in a number of scripts (writing systems) that we don't cover here, but the concepts are similar.

In many cases, modifiers are built into the characters they are commonly used with. For example, there are independent pre-built characters for á, ý, ć and ń, as well as a standalone ´ character. This simplifies code that processes and draws text, since it doesn't need to combine characters with their modifiers. In other cases, however, no pre-built character exists and text processing code must do the combining itself. In particular, Thai uses modifiers so extensively that the number of possible combinations makes pre-built characters impractical, so text processing and drawing code must handle the combining. This makes writing text editors a bit more difficult, to say the least.
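The difference between pre-built and combined characters can be observed with Unicode normalization. The following is a minimal sketch using Python's standard unicodedata module. The helper name has_prebuilt_form is our own illustrative choice, not part of any library. It reports whether a base character plus a combining modifier collapses into a single pre-built character under NFC normalization.

```python
import unicodedata

def has_prebuilt_form(base: str, mark: str) -> bool:
    """Return True if base + combining mark composes to one code point under NFC."""
    composed = unicodedata.normalize("NFC", base + mark)
    return len(composed) == 1

# 'a' + COMBINING ACUTE ACCENT (0301) composes to the pre-built á (00E1).
print(has_prebuilt_form("a", "\u0301"))

# Thai ko kai (0E01) + mai ek (0E48) has no pre-built form,
# so the pair stays as two code points.
print(has_prebuilt_form("\u0E01", "\u0E48"))
```

Going the other way, NFD normalization splits a pre-built character such as á back into its base character and combining modifier.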

Common Modifiers Table

Here is a table of the primary modifiers used by most languages. Each modifier has a Unicode value and a combining Unicode value. The Unicode value refers to a standalone modifier character that occupies its own character cell. The combining Unicode value refers to the same mark positioned so that, when drawn, it appears on the previous character instead of in a cell of its own. Combining forms let you synthesize modified characters yourself. For example, here is an equals sign with a caron, implemented with two characters:

=̌  (003D followed by 030C)

Modifier  Name                   Example  Unicode Value  Combining Unicode Value
´         acute accent           á        02CA           0301
`         grave accent           à        02CB           0300
ˆ         circumflex accent      â        02C6           0302
˜         tilde                  ã        02DC           0303
¨         diaeresis (or umlaut)  ä        00A8           0308
˚         ring (or bolle)        å        02DA           030A
¸         cedilla                ç        00B8           0327
ˇ         caron                  ǎ        02C7           030C
˘         breve                  ă        02D8           0306
ˉ         macron                 ā        02C9           0304
˙         dot above              ċ        02D9           0307
˛         ogonek                 ą        02DB           0328
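
As a sketch of how the combining values above might be used, the following Python snippet appends a combining modifier to a base character. The COMBINING dictionary and the attach function are illustrative names of our own, and only a few rows of the table are included.

```python
import unicodedata

# Combining values copied from the table above (a subset, for illustration).
COMBINING = {
    "acute accent": "\u0301",
    "grave accent": "\u0300",
    "diaeresis": "\u0308",
    "caron": "\u030C",
}

def attach(base: str, modifier_name: str) -> str:
    """Return base followed by the combining form of the named modifier."""
    return base + COMBINING[modifier_name]

equals_caron = attach("=", "caron")  # two code points, no pre-built equivalent
a_caron = attach("a", "caron")       # two code points, pre-built form exists

# List the code points that make up the synthesized character.
for ch in equals_caron:
    print(f"{ord(ch):04X} {unicodedata.name(ch)}")
```

The synthesized equals sign stays as two code points even after NFC normalization, while the a with caron can be composed into the single pre-built character 01CE.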

Uncommon Modifiers Table

Below is a table of uncommonly used modifiers. These modifiers are used in African languages, among others. You probably won't ever need to work with these.

Modifier  Name                 Example  Unicode Value  Combining Unicode Value
 ̐         candrabindu                   0310           0310
 ̑         inverted breve                0311           0311
ʻ         turned comma                  02BB           0312
ʽ         reversed comma       a̔        02BD           0314
ˍ         low macron                    02CD           0331
ˎ         low grave accent              02CE           0316
ˏ         low acute accent              02CF           0317
 ̏         double grave accent           <NA>           030F
˝         double acute accent           02DD           030B
˞         rhotic hook          ɚ        02DE           <NA>
 ̣         dot below                     0323           0323
 ̤         diaeresis below               <NA>           0324
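
When working with less familiar values like the ones above, it can help to ask the Unicode database what a value actually is. The snippet below is a small sketch using Python's standard unicodedata module. The describe function is a hypothetical helper of our own, and it treats any character whose general category starts with "M" as a combining mark.

```python
import unicodedata
from typing import Tuple

def describe(hex_value: str) -> Tuple[str, bool]:
    """Return (Unicode name, is_combining) for a hex value from the table."""
    ch = chr(int(hex_value, 16))
    # General categories starting with "M" (Mn, Mc, Me) are combining marks.
    is_combining = unicodedata.category(ch).startswith("M")
    return unicodedata.name(ch), is_combining

for value in ("0310", "030F", "0324", "02DE"):
    name, is_combining = describe(value)
    print(value, name, "combining" if is_combining else "standalone")
```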

Thai Modifiers Table

Below is a table of all Thai modifier characters. Unlike the modifiers above, Thai modifiers exist only in combining form. There are no pre-built Thai characters with modifiers built into them, as there are for many Western characters.

Modifier  Name               Example  Combining Unicode Value
 ั         Thai mai han-akat  กั        0E31
 ิ         Thai sara i        กิ        0E34
 ี         Thai sara ii       กี        0E35
 ึ         Thai sara ue       กึ        0E36
 ื         Thai sara uee      กื        0E37
 ุ         Thai sara u        กุ        0E38
 ู         Thai sara uu       กู        0E39
 ฺ         Thai phinthu       กฺ        0E3A
 ็         Thai maitaikhu     ก็        0E47
 ่         Thai mai ek        ก่        0E48
 ้         Thai mai tho       ก้        0E49
 ๊         Thai mai tri       ก๊        0E4A
 ๋         Thai mai chattawa  ก๋        0E4B
 ์         Thai thanthakhat   ก์        0E4C
 ํ         Thai nikhahit      กํ        0E4D
 ๎         Thai yamakkan      ก๎        0E4E
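
Because Thai modifiers are always separate code points, text processing code generally needs to keep each base character together with the modifiers that follow it. The following Python sketch shows one simple way to do that. The split_clusters function is our own illustrative helper, and it is a deliberate simplification: full grapheme cluster segmentation as described in Unicode Standard Annex #29 has additional rules that are not handled here.

```python
import unicodedata
from typing import List

def split_clusters(text: str) -> List[str]:
    """Group each base character with the combining marks that follow it."""
    clusters: List[str] = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            clusters[-1] += ch   # attach the mark to the preceding base
        else:
            clusters.append(ch)  # start a new cluster
    return clusters

# ko kai + sara i + mai ek, followed by a bare ko kai:
# four code points, but only two clusters.
word = "\u0E01\u0E34\u0E48\u0E01"
print(len(word), len(split_clusters(word)))
```

A text editor built along these lines would move the cursor and delete by cluster rather than by code point, so that a base character is never separated from its modifiers.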

