Paul Pedriana
Senior Software Engineer
Electronic Arts
ppedriana@ea.com
EAText is an advanced next-generation text characterization, layout, and font engine.
The biggest difference between EAText and other systems is that EAText supports all conceivable EA locales, including Western, Chinese, Japanese, Korean, and complex locales such as Thai, Hebrew, Hindi, and Arabic. This is the first time the problem of complex script layout has been universally solved within Electronic Arts and in doing so it should help lower one of the most significant barriers to multi-locale development. By using EAText, you automatically get support for about 25 languages in your game. This support includes basic layout and display, HTML/styled display, and interactive text editing. EAText additionally comes with a set of basic pipeline tools that can be used with EAText but can also be used with other systems due to their pipeline/engine-neutral design. Lastly, EAText and its associated packages follow standards such as Unicode, CSS, and XHTML, so expectations and behavior follow well-known conventions.
There are some packages within EA that have overlapping functionality with EAText. Most prevalent among these is the rwfont package. EAText does not seek to replace or compete against these packages but rather it attempts to provide augmented functionality for users that have a need for its features. Much of the EAText library and associated tools were written in such a way as to be agnostic to the underlying rendering package and so can mix with other related packages.
Many examples here show pictures of text in various EA games. In most cases these examples were done with text layout against screenshots of the game and were not done by modifying the game itself to display the text.
This document provides a graphical introduction to many of EAText's features. A summary of the primary features is as follows:
EAText largely follows the established conventions of typography. There is no definitive standard for typographical metrics, though there is little disagreement on the topic. We present font metrics and glyph metrics respectively as defined by EAText. Font metrics are metrics that describe the entire font, whereas glyph metrics describe individual glyphs.
Font Metrics
Glyph Metrics
Western languages use scripts (writing systems) that are mostly straightforward in their layout rules. Each stroke of the user's keyboard corresponds to a single glyph on the screen, placed to the right of the previous glyph. Spaces separate words and punctuation ends sentences. The left arrow key moves the cursor to the glyph to the left and the backspace key deletes the glyph to the left. However, these rules don't hold for some scripts, and we call such scripts complex.
Bidirectionality
Some scripts such as Hebrew are right-to-left (RTL) instead of left-to-right (LTR), and thus the text starts on the right and each appended character is on the left. Additionally, when numerals or Western script is used with RTL script, they are ordered LTR within the RTL text. This is called bidirectionality, or bidi for short. And when characters such as (), [], or “” are used in such bidirectional text they need to be graphically reversed by the layout engine. Let's look at a few examples.
Typing order (left) and display order (right) of Hebrew embedded into English
Hebrew embedded into English embedded into Hebrew
There are additional problems encountered beyond re-ordering, such as glyph mirroring, punctuation placement, justification, character affinity, etc. A thorough discussion of these problems is outside the scope of this document.
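To make the reordering concrete, here is a minimal sketch of display reordering for a left-to-right paragraph. It is not EAText's implementation and not the full Unicode Bidirectional Algorithm (UAX #9): it assumes no explicit embedding controls, mirrors only a handful of brackets, and the names `visual_order`, `reverse_rtl_run`, and `MIRROR` are hypothetical.

```python
import unicodedata

# Brackets to swap inside right-to-left runs (a tiny, assumed subset of the
# mirrored characters a real engine would handle).
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "["}


def bidi_class(ch):
    """Unicode bidirectional class, e.g. 'L', 'R', 'AL', 'EN', 'WS', 'ON'."""
    return unicodedata.bidirectional(ch)


def reverse_rtl_run(run):
    """Reverse one RTL run, keeping digit sequences LTR and mirroring brackets."""
    pieces = []
    k = 0
    while k < len(run):
        if bidi_class(run[k]) == "EN":
            # Group consecutive European digits so they keep their order.
            m = k
            while m < len(run) and bidi_class(run[m]) == "EN":
                m += 1
            pieces.append(run[k:m])
            k = m
        else:
            pieces.append(MIRROR.get(run[k], run[k]))
            k += 1
    return "".join(reversed(pieces))


def visual_order(text):
    """Convert typing (logical) order to display order in an LTR paragraph."""
    out = []
    i = 0
    while i < len(text):
        if bidi_class(text[i]) not in ("R", "AL"):
            out.append(text[i])
            i += 1
            continue
        # The run extends to the last RTL letter seen before the next LTR letter.
        j = last_rtl = i
        while j < len(text) and bidi_class(text[j]) != "L":
            if bidi_class(text[j]) in ("R", "AL"):
                last_rtl = j
            j += 1
        out.append(reverse_rtl_run(text[i:last_rtl + 1]))
        i = last_rtl + 1
    return "".join(out)
```

Calling `visual_order` on English text with an embedded Hebrew word leaves the English untouched and reverses only the Hebrew run. Trailing neutrals, nested embeddings, and right-to-left paragraphs are deliberately left out of this sketch.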
Substitution
Some scripts require the displayed glyphs to be different from the characters the user typed, depending on their context within a line of text. In addition to being RTL, Arabic is a cursive script (i.e. like handwriting) and requires the displayed text to be smoothly connected, like you connect characters when handwriting. To implement this on a computer using discrete glyphs requires contextual substitution. Additionally, Arabic has required ligatures; recall that a ligature is the combining of two glyphs into a single glyph. We'll provide a few simple examples here.
Arabic contextual substitution
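The contextual substitution described above can be sketched with the legacy Unicode presentation forms. This is an illustration only: a production engine would normally drive substitution from the font's OpenType tables and the Unicode joining data rather than from character names. The sketch also ignores ligatures and vowel marks, and `shape` and its helpers are hypothetical names.

```python
import unicodedata


def _form(ch, form):
    """Return the presentation form of ch ('ISOLATED', 'INITIAL', ...), or None."""
    name = unicodedata.name(ch, "")
    if not name.startswith("ARABIC LETTER "):
        return None
    try:
        return unicodedata.lookup(f"{name} {form} FORM")
    except KeyError:
        return None  # Unicode defines no such form for this letter


def joins_to_next(ch):
    """Letters with an INITIAL form can connect to the following letter."""
    return _form(ch, "INITIAL") is not None


def joins_to_prev(ch):
    """Letters with a FINAL form can connect to the preceding letter."""
    return _form(ch, "FINAL") is not None


def shape(text):
    """Pick isolated/initial/medial/final forms from each letter's neighbors."""
    out = []
    for i, ch in enumerate(text):
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        link_prev = bool(prev_ch) and joins_to_next(prev_ch) and joins_to_prev(ch)
        link_next = bool(next_ch) and joins_to_next(ch) and joins_to_prev(next_ch)
        if link_prev and link_next:
            form = "MEDIAL"
        elif link_prev:
            form = "FINAL"
        elif link_next:
            form = "INITIAL"
        else:
            form = "ISOLATED"
        out.append(_form(ch, form) or ch)  # non-Arabic characters pass through
    return "".join(out)
```

For the three letters BEH, ALEF, BEH the sketch selects the initial, final, and isolated forms respectively, because ALEF connects only to the letter before it.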
Positioning
Some scripts require that the displayed glyphs be positioned on screen at a particular location relative to other glyphs, instead of simply following the previous glyph. Thai and Hindi are two languages that require custom glyph positioning; without it their text looks odd or is illegible. We provide a couple of examples to illustrate this.
Thai glyph positioning
Hindi glyph positioning
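As a rough illustration of relative positioning, the sketch below stacks nonspacing marks over the preceding base glyph instead of advancing the pen. It is a simplification under stated assumptions: every base glyph has the same hypothetical advance, every mark stacks upward, and below-base marks and Devanagari reordering are ignored. Real fonts supply per-glyph anchor data that a layout engine would use instead.

```python
import unicodedata

ADVANCE = 10   # assumed fixed advance width for every base glyph
MARK_STEP = 4  # assumed vertical distance between stacked marks


def position_glyphs(text):
    """Return (char, x, y_offset) tuples; marks reuse the x of their base glyph."""
    placed = []
    pen_x = 0
    base_x = 0
    stack = 0
    for ch in text:
        if unicodedata.category(ch) == "Mn":  # nonspacing mark
            stack += 1
            placed.append((ch, base_x, stack * MARK_STEP))
        else:
            base_x = pen_x
            stack = 0
            placed.append((ch, base_x, 0))
            pen_x += ADVANCE
    return placed
```

With a Thai consonant followed by a vowel sign and a tone mark, both marks land at the consonant's x position, one step and two steps above it, and the next consonant starts one advance further along.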
EAText supports layout and interactive editing in 28 languages from 9 scripts, including:
Arabic (bidirectional); Chinese; Czech; Danish; Dutch; English; Finnish; French; German; Greek; Hebrew (bidirectional); Hindi (complex composition); Hungarian; Icelandic; Italian; Japanese (kanji, hiragana, katakana); Korean; Norwegian; Polish; Portuguese; Romanian; Russian; Spanish; Swedish; Tagalog; Thai (complex composition and breaking); Turkish; Vietnamese
Some scripts are easier to support than others, with the bidirectional and complex scripts being particularly difficult. Here we present some EAText renderings of a few of the more esoteric scripts.
Korean script -- known as Hangul script -- is a highly organized phonemic alphabet organized into syllabic blocks. Interactive editing of Hangul requires an IME (input method editor) which composes building block consonants and vowels (jamo) into glyphs.
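The composition step an IME performs can be illustrated with the arithmetic the Unicode Standard defines for Hangul syllables. This sketch uses conjoining jamo code points (U+1100 and up) and a hypothetical function name. A real IME also tracks in-progress composition state, which is omitted here.

```python
import unicodedata

S_BASE = 0xAC00   # first precomposed Hangul syllable
L_BASE = 0x1100   # first leading consonant
V_BASE = 0x1161   # first vowel
T_BASE = 0x11A7   # one before the first trailing consonant
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28


def compose_syllable(lead, vowel, tail=""):
    """Combine conjoining jamo into one precomposed Hangul syllable."""
    l_index = ord(lead) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(tail) - T_BASE if tail else 0
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT):
        raise ValueError("lead or vowel is not a modern conjoining jamo")
    if tail and not (1 <= t_index < T_COUNT):
        raise ValueError("tail is not a modern trailing consonant")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


# Cross-check against the normalization built into Python.
jamo = "\u1112\u1161\u11ab"
print(compose_syllable(*jamo) == unicodedata.normalize("NFC", jamo))
```

The three counts multiply to 19 x 21 x 28 = 11,172, which is the size of the Hangul Syllables block.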
Thai is a complex script known for its vertical glyph composition, whereby decorations are dynamically applied to glyphs in positions that depend on the glyphs and other decorations. Thai is also unusual because it doesn't separate words with spaces and doesn't delimit sentences with punctuation (note the picture below). Line breaking during layout needs to be done by dictionary lookup or by invisible characters. Because Thai decorations are vertically stacked upon base glyphs, Thai requires more vertical space than most other scripts.
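Dictionary-based breaking can be sketched as a greedy longest-match segmentation, where each word boundary becomes a permitted line-break position. The tiny dictionary and the function name are assumptions for illustration. Greedy matching can mis-segment real text, so treat this as the idea rather than a production algorithm.

```python
def segment(text, dictionary):
    """Split text into words by greedy longest match against a dictionary.

    Characters that start no dictionary word become one-character words.
    """
    max_len = max(map(len, dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; size 1 always succeeds.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary or size == 1:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical three-entry dictionary: "language", "Thai", and a shorter prefix.
THAI_WORDS = {"ภาษา", "ไทย", "ภา"}
print(segment("ภาษาไทย", THAI_WORDS))
```

The invisible-character alternative mentioned above needs no dictionary at runtime: the text author or a pipeline tool inserts break opportunities such as U+200B ZERO WIDTH SPACE ahead of time.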
Hebrew is a right-to-left script which is noted for lacking vowels in most informal usage. While Hebrew is a right-to-left script, embedded left-to-right script and numbers must be rendered left-to-right inline with the right-to-left Hebrew. While Western GUIs and their text are left-aligned, Hebrew GUIs and text are right-aligned.
Arabic is a cursive script (i.e. like handwriting), and its displayed glyphs and their connections must be chosen by the layout engine based on their relative positions combined with OpenType information associated with the font. Arabic is a right-to-left script but embedded Western script and numbers must be rendered left-to-right, as we can see in the example below.
Hindi (Devanagari script) is noted for its beautiful text and raised baseline. The selection of the appropriate glyphs and their positioning must be done by the layout engine based on the context of the text combined with OpenType information associated with the font.
Interactive text editing is a common task in PC-based games and an occasional task on console-based games. It is one of the more difficult things to do properly in a GUI, especially in the face of localization. As a result, many games shy away from the problem by providing minimal text creation or editing features in their games. EAText solves this problem and makes it relatively easy for any game to implement text editing, as it comes with a generic multi-line text editor that is independent of the rendering system and can be plugged into any application. With it you get proper localized support for text editing in all of the languages listed earlier. Think of it as being a fully abstracted "owner-draw" text editing widget.
Here's a listing of TextEdit's primary features:
Here's a 2D scrolling TextEdit showing some highlighted text:
Here's a TextEdit being used interactively with a terrain view:
EAText has "fast mode" layout functionality for implementing debug HUDs. In this mode there is no advanced layout and instead characters are simply placed one after another in a tight programming loop. This is useful for non-localized needs such as frames-per-second counters, on-screen debug traces, single lines of Western text, etc. Here we show an example of a debug stats HUD done via the fast mode functions.
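The loop described above amounts to something like the following sketch. It is not EAText's fast-mode API; `fast_layout`, the advance table, and the default advance are hypothetical, and the point is only that each character maps to one glyph with no shaping, reordering, or wrapping.

```python
def fast_layout(text, advances, default_advance, x=0.0, y=0.0):
    """Place one glyph per character, left to right, on a single baseline.

    advances maps a character to its advance width; unknown characters use
    default_advance. Returns the glyph positions and the final pen x.
    """
    positions = []
    for ch in text:
        positions.append((ch, x, y))
        x += advances.get(ch, default_advance)
    return positions, x


# Example: a frames-per-second label with made-up advance widths.
glyphs, end_x = fast_layout("FPS", {"F": 7, "P": 8, "S": 6}, default_advance=8)
```

Because the loop never inspects neighboring characters, it is only suitable for text where that is safe, such as the Western debug strings listed above.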
EAText supports locale-proper advanced layout such as HTML. The XHTML package is a separate package which implements a standards-compliant XHTML/CSS parsing and display engine using EAText. XHTML is the most modern form of HTML and is the direction the W3C standardization committee is going. While XHTML is best known for describing web pages presented by web browsers, it can also be used to display a richer gaming interface, especially interfaces with dynamic content.
The following is a somewhat unrefined demo, but it serves the purpose of demonstrating some of the XHTML package's capabilities.
Below we have a pre-shipping example of XHTML as used in EA's Spore game. The view consists of XHTML dynamically downloaded by the game from a conventional HTTP server. The view below is essentially an in-game web browser displaying content authored on the server by conventional tools and formats.
In addition to TrueType fonts, EAText supports bitmapped fonts. A bitmapped font is one based on custom-made bitmaps as opposed to TrueType's scalable outlines. The primary advantage of bitmapped fonts is that they can be multicolored with arbitrary alpha (i.e. ARGB) as opposed to being limited to monochrome with edge alpha. The primary disadvantage of bitmapped fonts is that they use more memory and don't scale in size well.
The following should help convey the difference between outline and bitmapped fonts, respectively:
A primary use of bitmapped fonts is to embed symbols or pictures inline with text, such as controller button symbols. This makes things easy because the pictures are treated like characters, and thus no extra effort is required by the programmer or artist.
With EAText you can use bitmapped fonts mixed with TrueType fonts. Because EAText layout is style-based instead of font-based, setting up the display or editing of mixed outline/bitmapped fonts requires little effort on the part of the programmer or artist.
EAText supports some basic common text effects dynamically. By dynamically we mean that the effect is synthesized algorithmically at runtime, as opposed to being done at production time with a bitmapped font. While dynamic effects have a runtime execution cost, they can save memory because only what's needed is created. These text effects are implemented per glyph rather than per layout, and thus they have some limitations that are best overcome by layout-level effects (e.g. double line rendering).
outline
shadow
emboss
The above effects are not done by EAText; they are Photoshop mockups. EAText has these options but as of this writing (January 2007) the options don't look as nice as they should and are scheduled to be revised.
Alignment
EAText supports the usual set of alignment options with plain text, styled text, and interactive text editing. This functionality is properly implemented across all scripts, including cursive scripts such as Arabic. We display some horizontal alignment options here, but EAText also does vertical alignment.
Ellipsizing
One of the options provided is line-level and paragraph-level ellipsizing, whereby ellipses are automatically laid out with the text appropriately for all languages.
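Line-level ellipsizing can be sketched as trimming characters until the remaining text plus an ellipsis fits the available width. The function and the `advance` callback are hypothetical. A language-aware version would trim whole clusters and account for bidirectional text rather than cutting at arbitrary code points.

```python
ELLIPSIS = "\u2026"


def ellipsize(text, max_width, advance):
    """Return text unchanged if it fits, else a trimmed prefix plus an ellipsis.

    advance is a callable giving the advance width of a single character.
    """
    if sum(advance(ch) for ch in text) <= max_width:
        return text
    budget = max_width - advance(ELLIPSIS)
    width = 0
    kept = []
    for ch in text:
        if width + advance(ch) > budget:
            break
        width += advance(ch)
        kept.append(ch)
    # Drop trailing spaces so the ellipsis attaches to the last word.
    return "".join(kept).rstrip() + ELLIPSIS
```

With a monospaced advance of 1 and a width of 7, "hello world" becomes "hello" followed by the ellipsis character.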
Highlighting
EAText can provide highlighting information, which allows correct highlighting of text by the rendering system. This is useful in an interactive text editing environment and is used by the TextEdit component of EAText. This feature is not trivial, as highlighted text can have no self-overlapping regions (the user may be drawing with alpha) and the highlighted ranges may be discontiguous in the presence of bidirectional text.
The following consists of a contiguous selection. Really.
Try it for yourself:
The Hebrew author wrote, "שפן אכל קצת גזר בטעם (yes, lettuce) חסה, ודי" and I realized later that it was a pangram.
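The way a contiguous selection becomes visually discontiguous can be sketched as follows. The input is a hypothetical `visual_to_logical` list (for each on-screen position, the index of the character in typing order), which a bidirectional layout pass would produce. The function merges adjacent selected positions so that each highlighted region is emitted once and no two regions overlap.

```python
def highlight_ranges(visual_to_logical, sel_start, sel_end):
    """Map the logical selection [sel_start, sel_end) to merged visual ranges.

    Returns a list of half-open (start, end) ranges in visual positions.
    """
    ranges = []
    for visual_index, logical_index in enumerate(visual_to_logical):
        if sel_start <= logical_index < sel_end:
            if ranges and ranges[-1][1] == visual_index:
                # Extend the previous range instead of starting a new one.
                ranges[-1] = (ranges[-1][0], visual_index + 1)
            else:
                ranges.append((visual_index, visual_index + 1))
    return ranges


# Nine characters: three LTR, three RTL (shown reversed), three LTR.
order = [0, 1, 2, 5, 4, 3, 6, 7, 8]
print(highlight_ranges(order, 1, 5))  # selection spans the LTR/RTL boundary
```

Selecting logical characters 1 through 4 in this example yields two separate visual ranges, because the selected right-to-left characters sit at the far end of their reversed run.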
Overlapping glyphs should contribute to highlight alpha only once:
Underlining
EAText can provide underline layout, which allows correct underlining of text by the rendering system. This is most useful for PC titles that implement hyperlinks in-game. This feature is not quite as simple as it may seem, as glyphs may be overlapping and underlined ranges may be visually discontiguous in bidirectional environments.
Owner draw
EAText allows the user to implement owner-draw objects inline with the text without having to resort to using full-blown XHTML. The objects use the same glyph metrics as do text glyphs and so are not limited to sitting on the baseline. The user can assign basic properties to objects in order to control their word- and line-breaking properties. Here we show two objects owner-drawn and a lightning post-effect linking them.
Password shaping
EAText implements password display and interactive entry without the programmer or font artist having to do any work, as is commonly needed for this. To use password display with EAText/Typesetter, just enable the TextStyle password option and it does the rest.
Line wrapping
EAText supports the following line breaking modes which follow the CSS3 text-wrap standard:
Character and line spacing
EAText allows the user to explicitly control character spacing (tracking) and line spacing. Here we have some text laid out with variable character spacing settings.
Curved text
EAText allows you to lay out text along the inside or the outside of an arbitrary user-defined curve. A Bezier mode comes built-in, and a simple Bezier was used to generate the picture below. A common in-game use would be to flow text around a logo, put text on players' jerseys, place text on a curving road, etc.
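One way to lay glyphs along a Bezier is to walk the curve by arc length, placing each glyph origin at the accumulated advance distance and rotating it to the local tangent. The sketch below is a generic illustration under that assumption; it is not EAText's curve interface, and all names in it are hypothetical.

```python
import math


def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier with control points p0..p3 at parameter t."""
    u = 1.0 - t
    b0, b1, b2, b3 = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return x, y


def layout_on_curve(advances, control_points, samples=256):
    """Return (x, y, angle_radians) for each glyph origin along the curve."""
    pts = [bezier_point(*control_points, i / samples) for i in range(samples + 1)]
    # Cumulative arc length at each sample point.
    lengths = [0.0]
    for a, b in zip(pts, pts[1:]):
        lengths.append(lengths[-1] + math.dist(a, b))

    placements = []
    distance = 0.0
    seg = 0
    for adv in advances:
        if distance > lengths[-1]:
            break  # ran out of curve; remaining glyphs are not placed
        while seg < samples - 1 and lengths[seg + 1] < distance:
            seg += 1
        a, b = pts[seg], pts[seg + 1]
        span = lengths[seg + 1] - lengths[seg]
        f = (distance - lengths[seg]) / span if span else 0.0
        x = a[0] + f * (b[0] - a[0])
        y = a[1] + f * (b[1] - a[1])
        placements.append((x, y, math.atan2(b[1] - a[1], b[0] - a[0])))
        distance += adv
    return placements
```

Placing text on the inside or outside of the curve would then be a matter of offsetting each glyph along the normal to the tangent, which this sketch leaves out.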
The BitmapFontEditor (BFE) is a WYSIWYG editor for multi-colored bitmapped
fonts. It is flexible with respect to the font file format it works with, as it
is independent of any private or proprietary binary standards. The font metrics
consist of a text file and the font graphics consist of open standard graphics
formats such as .png, .tga, or anything else gimex.dll can read/write.
Developers are expected to convert these graphics files to optimized
platform-specific formats as part of the build process. It is not up to
BitmapFontEditor to try to decide what's best in this respect for teams.
BFE is a typographically proper implementation of a font generator and editor.
All aspects of editing are available via the GUI and it has the following basic
feature set:
BitmapFont examples
Here we present a small collection of bitmapped fonts created with BitmapFontEditor (graphics by Photoshop). A couple of these may be slightly over-the-top, but they represent real things you can do with ARGB fonts.
BitmapFontEditor screenshots
EAText supports 3D polygonal text generated from an arbitrary TrueType font with various parameters via PolygonFontGenerator. User-supplied glyph models can be used as well and can still be laid out by the EAText Typesetter. EAText baseline curve fitting can be applied to polygonal text the same as to textured text.
PolygonFontGenerator converts a TrueType font (outline font) to a polygonal representation suitable for rendering with 3D graphics operations. This is as opposed to the traditional method of converting an outline font to a bitmapped representation at runtime.
Here's a picture of a PolygonFont being used with the EAText text editor in 3D:
Here we can see the mesh used to draw a simple glyph:
ContourFontGenerator creates fonts that are defined not by polygons but by their outlines as line segments. You can do various effects with this such as treating the outline contour as a path. The following is an example of how a pixel shader might implement a stroked path along a ContourFont. ContourFont is new for EAText and doesn't yet have an elaborate demo, so we make a nifty picture here with Photoshop to suggest a possibility.
EATextViewer is a tool that lets artists, producers, and programmers rapidly preview text in a variety of fonts and styles in WYSIWYG form.
The app shown below is a Windows app, but the tool runs natively on Xbox 360 and PS3 as well (native Wii not available yet), so you can get a WYSIWYG view directly on the target platform. A couple of the options shown below haven't been implemented in the viewer yet, but that will be rectified. The GUI below looks like a Windows GUI, but it is actually a UTFWin GUI designed to mimic Windows. The four large text boxes at the bottom are text editors. You need to connect a mouse and keyboard to your 360 or PS3 if you want to use the app interactively on those platforms.
A comprehensive discussion of benchmarks is outside the scope of this document and would be more suitable for a technical document. Nevertheless, users would probably like to know how fast EAText is and how much memory it uses. It turns out that neither of these has a simple answer, as the performance and memory usage of EAText depend on how you set its configuration options and how you use it. We present a few bulleted statements on the topic here:
Here we provide some basic benchmark results of character throughput under various configurations and platforms. The numbers here are highly dependent on a number of factors, but should provide rough figures to go by. Note that the measurements are for drawing 1000 characters, which is at least an order of magnitude more than most games need to lay out at a time. These benchmarks will improve when additional optimization is implemented.
We provide a brief glossary of terms used in this document.
| Term | Definition |
|---|---|
| Advance / Advance width | Advance width refers to the distance from the beginning of one character on a printed display to the beginning of the next character on the display. |
| Ascent | The ascent of a glyph is the distance from the baseline to the topmost portion of the glyph. The ascent of a font is the distance from the baseline to the topmost portion of any glyph. See Descent. See http://en.wikipedia.org/wiki/Typeface for some related typographical concepts. |
| Baseline | The baseline is the line upon which a string of text is drawn. For horizontal text, it is at the bottom of the glyphs; for vertical text, it goes through the center of glyphs. |
| Bidirectional / Bidi | Refers to text that when viewed flows in two opposite directions, usually left-to-right and right-to-left. See http://en.wikipedia.org/wiki/Bi-directional_text See http://www.unicode.org/reports/tr9/ |
| Character | A character is an atomic unit of a written language. Characters include letters, Asian ideograms, numerals, punctuation, etc. In computer systems, the term 'character' is somewhat ambiguous. See http://en.wikipedia.org/wiki/Grapheme |
| Character set | A character set is an assignment of characters to numerical values (e.g. A == 65). ASCII is a character set, as is Chinese Big 5. Unicode specifies a single unified character set with each character called a "code point." See http://en.wikipedia.org/wiki/Character_encoding |
| Cluster | A cluster is a contiguous set of characters that combine together in a single glyph cell and are considered one indivisible unit. A simple example is the cluster of a and ` together as à. Complex scripts have more complex examples. |
| Complex script | A complex script has at least one of the attributes discussed earlier in this document: bidirectionality, contextual glyph substitution, or contextual glyph positioning. |
| CSS / Cascading Style Sheets | Cascading Style Sheets is a mechanism for adding style (e.g. fonts, colors, spacing) to documents. It is used with HTML to provide a rich layout system. See http://www.w3.org/Style/CSS. |
| Descent | The descent of a glyph is the distance between the baseline and the lowest part of a glyph. The descent of a font is the largest distance between the baseline and the bottom of any glyph. For most Western scripts, the descent is a value less than or equal to zero, as the bottoms of such glyphs are usually at the baseline or below it. See Ascent. See http://en.wikipedia.org/wiki/Typeface for some related typographical concepts. |
| Diacritic | A diacritic is an accent mark added to a letter. Examples include (but are not limited to) accent, macron, umlaut, breve, caron, circumflex, cedilla, and ogonek. See http://en.wikipedia.org/wiki/Diacritical |
| Embedding | Embedding refers to the inclusion of a run of opposing direction text within a run of text. If you have an English sentence and put some RTL text in the middle of it, the RTL text is embedded within the primary LTR text. If you then embed a run of LTR text within the RTL text then you have an additional level of embedding in place. An embedding level of 0 means LTR, a level of 1 means RTL alone or within LTR of level 0, a level of 2 means LTR within RTL of level 1, a level of 3 means RTL within LTR of level 2, etc. See http://www.unicode.org/reports/tr9/ |
| Family | A family is a class of related fonts. A family is very similar to a typeface, though strictly speaking it is possible to define a family which has multiple typefaces in it. However, it often occurs that a family has a single typeface in it and the two terms become synonymous. |
| Font | A font is a particular incarnation of a typeface (e.g. 10 pt courier bold). It is common to confuse fonts with their superset -- typefaces. |
| Glyph | A glyph is a graphical representation of a character. In terms of computer text processing and display, a glyph may be a combination of multiple code points, as would be the case with separate accents or other diacriticals. There is a distinction between glyphs and characters that is subtle but significant. Characters are what the user types on their keyboard, whereas glyphs are the physical manifestation of such characters. The ratio between characters and glyphs is not necessarily 1:1, though it is often so for simple English text. See http://en.wikipedia.org/wiki/Glyph |
| Jamo | Jamo are the Korean equivalent of letters. They are combined in square two-dimensional patterns of two to four jamo to form what are known as Hangul. |
| Justification | Justification is the process of fitting horizontal text within its left and right boundaries such that it meets both edges. It is a common mistake to confuse alignment with justification. There is left alignment, but there is no such thing as left justification, as that's like saying something meets both sides on the left side. |
| Hangul | The name of the script used to write the Korean language. There are 11,172 encoded characters in the Hangul Syllables character block, AC00..D7A3. Looks like this: 한류 |
| Kerning | Kerning is the process of altering the space between specific pairs of glyphs. Normally kerning is found only with variably spaced fonts and not with monospaced fonts. A kerning value for a font is an adjustment of the normal advance vector for a given glyph to the next glyph. Thus, a zero value means that there is no adjustment and the glyphs are regularly drawn; a negative value means the glyphs are made closer together than normal, and a positive value means the glyphs are farther apart than normal. See http://en.wikipedia.org/wiki/Kerning |
| Layout | Layout is the process of arranging glyphs from a string of text. Layout of some scripts (e.g. Latin) is relatively simple, whereas the layout of others (e.g. Arabic, Thai) is complex. |
| Ligature | A ligature is a single glyph which is derived from two otherwise independent side-by-side glyphs. The combination of a and e to form æ is an example of a ligature. See http://en.wikipedia.org/wiki/Ligature_(typography) |
| LTR / Left to Right | LTR refers to text display directionality. Most scripts use the left to right direction to display their text. See RTL. |
| OpenType | A typographical system that consists of a font specification and a layout assistance specification. OpenType fonts are similar to TrueType fonts but have additional information and can have PostScript outlines. OpenType font files usually use the .otf file extension. See http://en.wikipedia.org/wiki/OpenType |
| Outline font | An outline font is a font that is primarily defined by filled curves such as Bezier curves. Such fonts are usually scalable; a single outline specification allows for various sizes of rendered glyphs. TrueType, PostScript, and OpenType fonts are examples of outline fonts. |
| RTL / Right to Left | RTL refers to text display directionality. Middle-Eastern scripts such as Arabic, Hebrew, and Farsi are right to left. See LTR. |
| Script | A script is a writing system, such as Latin, Arabic, Hangul. Western languages such as English, French, and Spanish all fall under the category of Latin script. See http://en.wikipedia.org/wiki/Writing_system |
| Shaping | Shaping is the process of converting a string of keyboard derived characters into a string of glyphs properly designed for display. This may include adding, removing, and changing glyphs from the original string. For example, a two character string of a` may be shaped into à. Similarly, Arabic text shaping causes characters to change representation depending on their position within a word (beginning, middle, end, or alone). |
| TrueType | TrueType is an outline font specification. TrueType font files use the .ttf or .ttc file extension. See http://en.wikipedia.org/wiki/TrueType |
| Unicode | Unicode is a character set that attempts to encompass all known written languages. Additionally Unicode defines a number of standardized conventions for the basic computational processing of text from these languages. See http://en.wikipedia.org/wiki/Unicode See http://en.wikipedia.org/wiki/ISO_10646 |
| x-height | The height of short lower-case glyphs such as 'x' in Latin fonts. See http://en.wikipedia.org/wiki/Typeface for some related typographical concepts. |