More about core fonts

This section describes XFree86-created enhancements to the core X11 fonts system that were adopted by X.Org.

Core fonts and internationalisation

The scalable font backends (Type 1 and TrueType) can automatically re-encode fonts to the encoding specified in the XLFD in fonts.dir. For example, a fonts.dir file can contain entries for the Type 1 Courier font such as

cour.pfa -adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-1
cour.pfa -adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-2

which will lead to the font being recoded to ISO 8859-1 and ISO 8859-2 respectively.
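As a rough illustration of where the requested encoding comes from, the sketch below extracts the final two XLFD fields (charset registry and encoding) from a font name. The function name xlfd_charset is hypothetical and is not part of any X library; the sketch assumes a complete 14-field XLFD such as the ones shown above.

```python
def xlfd_charset(xlfd: str) -> str:
    """Return the 'registry-encoding' suffix of a full XLFD name, lowercased."""
    fields = xlfd.split("-")
    # A full XLFD starts with a dash and has 14 dash-separated fields,
    # so splitting yields one empty string followed by the 14 fields.
    if len(fields) != 15 or fields[0] != "":
        raise ValueError(f"not a 14-field XLFD: {xlfd!r}")
    # The last two fields are the charset registry and the charset encoding.
    return f"{fields[13]}-{fields[14]}".lower()


if __name__ == "__main__":
    for name in (
        "-adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-1",
        "-adobe-courier-medium-r-normal--0-0-0-0-m-0-iso8859-2",
    ):
        print(xlfd_charset(name))
```

The two fonts.dir entries above differ only in this suffix, which is what selects the encoding the backend recodes to.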

The fontenc layer

Two of the scalable backends (Type 1 and the FreeType TrueType backend) use a common fontenc layer for font re-encoding. This allows these backends to share their encoding data, and allows simple configuration of new locales independently of font type.

Please note: the X-TrueType (X-TT) backend is not included in X11R7.7. That functionality has been merged into the FreeType backend.

In the fontenc layer, an encoding is defined by a name (such as iso8859-1), possibly a number of aliases (alternate names), and an ordered collection of mappings. A mapping defines the way the encoding can be mapped into one of the target encodings known to fontenc; currently, these consist of Unicode, Adobe glyph names, and arbitrary TrueType cmaps.

A number of encodings are hardwired into fontenc, and are therefore always available; the hardcoded encodings cannot easily be redefined. These include:

  • iso10646-1: Unicode;

  • iso8859-1: ISO Latin-1 (Western Europe);

  • iso8859-2: ISO Latin-2 (Eastern Europe);

  • iso8859-3: ISO Latin-3 (Southern Europe);

  • iso8859-4: ISO Latin-4 (Northern Europe);

  • iso8859-5: ISO Cyrillic;

  • iso8859-6: ISO Arabic;

  • iso8859-7: ISO Greek;

  • iso8859-8: ISO Hebrew;

  • iso8859-9: ISO Latin-5 (Turkish);

  • iso8859-10: ISO Latin-6 (Nordic);

  • iso8859-15: ISO Latin-9, or Latin-0 (Revised Western-European);

  • koi8-r: KOI8 Russian;

  • koi8-u: KOI8 Ukrainian (see RFC 2319);

  • koi8-ru: KOI8 Russian/Ukrainian;

  • koi8-uni: KOI8 Unified (Russian, Ukrainian, and Byelorussian);

  • koi8-e: KOI8 European, ISO-IR-111, or ECMA-Cyrillic;

  • microsoft-symbol and apple-roman: these are only likely to be useful with TrueType symbol fonts.

Additional encodings can be added by defining encoding files. When a font encoding is requested that the fontenc layer doesn't know about, the backend checks the directory in which the font file resides (not necessarily the directory with fonts.dir!) for a file named encodings.dir. If found, this file is scanned for the requested encoding, and the relevant encoding definition file is read in. The mkfontdir utility, when invoked with the -e option followed by the name of a directory containing encoding files, can be used to automatically build encodings.dir files. Please see the mkfontdir(1) manual page for more details.

A number of encoding files for common encodings are included with X11R7.7. Information on writing new encoding files can be found in Format of encoding directory files and Format of encoding files later in this document.

Backend-specific notes about fontenc

The FreeType backend

For TrueType and OpenType fonts, the FreeType backend scans the mappings in order. Mappings with a target of PostScript are ignored; mappings with a TrueType or Unicode target are checked against all the cmaps in the file. The first applicable mapping is used.

For Type 1 fonts, the FreeType backend first searches for a mapping with a target of PostScript. If one is found, it is used. Otherwise, the backend searches for a mapping with target Unicode, which is then composed with a built-in table mapping codes to glyph names. Note that this table only covers part of the Unicode code points that have been assigned names by Adobe.

Specifying an encoding value of adobe-fontspecific for a Type 1 font disables the encoding mechanism. This is useful with symbol and incorrectly encoded fonts (see Hints about using badly encoded fonts below).

If a suitable mapping is not found, the FreeType backend defaults to ISO 8859-1.
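The selection order described above can be sketched as follows. This is an illustrative model only, not the backend's actual code: mapping targets are represented as tuples, a font's cmaps as a set of (platform, encoding) pairs, and the test for "has a Unicode cmap" (platform 0, or platform 3 with encoding 1) is an assumption made for the sketch.

```python
from typing import Optional, Sequence, Set, Tuple

# Hypothetical representation of a mapping target:
#   ("unicode",), ("postscript",) or ("cmap", platform_id, encoding_id)
Target = Tuple


def pick_truetype_mapping(targets: Sequence[Target],
                          font_cmaps: Set[Tuple[int, int]]) -> Optional[int]:
    """Return the index of the first applicable mapping, or None."""
    # Assumption: a font "has a Unicode cmap" if it has platform 0 or cmap 3/1.
    has_unicode = any(p == 0 or (p, e) == (3, 1) for p, e in font_cmaps)
    for index, target in enumerate(targets):
        if target[0] == "postscript":
            continue  # PostScript mappings are ignored for TrueType/OpenType
        if target[0] == "cmap" and (target[1], target[2]) in font_cmaps:
            return index
        if target[0] == "unicode" and has_unicode:
            return index
    return None  # caller falls back to ISO 8859-1


def pick_type1_mapping(targets: Sequence[Target]) -> Optional[int]:
    """Prefer a PostScript mapping; otherwise use a Unicode mapping."""
    for wanted in ("postscript", "unicode"):
        for index, target in enumerate(targets):
            if target[0] == wanted:
                return index
    return None  # caller falls back to ISO 8859-1


if __name__ == "__main__":
    targets = [("postscript",), ("cmap", 1, 0), ("unicode",)]
    print(pick_truetype_mapping(targets, {(3, 1)}))
    print(pick_type1_mapping(targets))
```

A return value of None stands for the case where no suitable mapping is found and the backend defaults to ISO 8859-1.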

Format of encoding directory files

In order to use a font in an encoding that the font backend does not know about, you need to have an encodings.dir file either in the same directory as the font file used or in a system-wide location (/usr/share/fonts/X11/encodings/ by default).

The encodings.dir file has a similar format to fonts.dir. Its first line specifies the number of encodings, while every successive line has two columns: the name of the encoding, and the name of the encoding file. The file name can be relative to the current directory, or absolute. Every encoding name should agree with the encoding name defined in the encoding file. For example,

3
mulearabic-0 /usr/share/fonts/X11/encodings/mulearabic-0.enc
mulearabic-1 /usr/share/fonts/X11/encodings/mulearabic-1.enc
mulearabic-2 /usr/share/fonts/X11/encodings/mulearabic-2.enc

The name of an encoding must be specified in the encoding file's STARTENCODING or ALIAS line. It is not enough to create an encodings.dir entry.
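A minimal sketch of reading this format is shown below. The function name parse_encodings_dir is hypothetical, relative file names are resolved against the directory that holds encodings.dir (an assumption), and a count that disagrees with the number of entries is treated as an error, which may be stricter than the real implementation.

```python
import posixpath
from typing import Dict


def parse_encodings_dir(text: str, base_dir: str) -> Dict[str, str]:
    """Map encoding names to encoding file paths from encodings.dir content."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty encodings.dir")
    count = int(lines[0])  # first line: number of encodings
    entries: Dict[str, str] = {}
    for line in lines[1:]:
        # Two columns: encoding name, then encoding file name.
        name, path = line.split(None, 1)
        # Absolute paths are kept; relative ones are joined to base_dir
        # (assumption: base_dir is the directory containing encodings.dir).
        entries[name] = posixpath.normpath(posixpath.join(base_dir, path.strip()))
    if len(lines) - 1 != count:
        raise ValueError(f"expected {count} entries, found {len(lines) - 1}")
    return entries


if __name__ == "__main__":
    sample = (
        "3\n"
        "mulearabic-0 /usr/share/fonts/X11/encodings/mulearabic-0.enc\n"
        "mulearabic-1 /usr/share/fonts/X11/encodings/mulearabic-1.enc\n"
        "mulearabic-2 /usr/share/fonts/X11/encodings/mulearabic-2.enc\n"
    )
    for name, path in parse_encodings_dir(sample, "/usr/share/fonts/X11/encodings").items():
        print(name, path)
```

Finding an entry here is only the first step: the name must still match a STARTENCODING or ALIAS line in the encoding file itself.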

If your platform supports it (it probably does), encoding files may be compressed or gzipped.

The encodings.dir files are best maintained by the mkfontdir utility. Please see the mkfontdir(1) manual page for more information.

Format of encoding files

The encoding files are free form, i.e. any string of whitespace is equivalent to a single space. Keywords are parsed case-insensitively, so size, SIZE, and SiZE all parse as the same keyword; case is significant in glyph names, however.

Numbers can be written in decimal, as in 256, in hexadecimal, as in 0x100, or in octal, as in 0400.

Comments are introduced by a hash sign #. A # may appear at any point in a line, and all characters following the # are ignored, up to the end of the line.
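The lexical rules above (free-form whitespace, # comments, and the three number notations) can be sketched in a few lines. The helper names tokenize_line and parse_number are hypothetical. Note that a leading 0 has to be handled by hand, since Python's own integer parsing does not treat 0400 as octal.

```python
from typing import List


def tokenize_line(line: str) -> List[str]:
    """Drop a trailing # comment and split the rest on runs of whitespace."""
    without_comment = line.split("#", 1)[0]
    return without_comment.split()


def parse_number(token: str) -> int:
    """Parse decimal (256), hexadecimal (0x100) or octal (0400) notation."""
    lowered = token.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)  # a leading zero marks an octal number
    return int(lowered, 10)


if __name__ == "__main__":
    tokens = tokenize_line("SiZE   0x2B    # maximum code plus one")
    keyword = tokens[0].upper()  # keywords are case-insensitive
    print(keyword, [parse_number(t) for t in tokens[1:]])
```

Only the keyword is case-folded here; glyph names in PostScript mappings must be left untouched because their case is significant.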

The encoding file starts with the definition of the name of the encoding, and possibly its alternate names (aliases):

STARTENCODING mulearabic-0
ALIAS arabic-0

The name of the encoding and its aliases should be suitable for use in an XLFD font name, and therefore contain exactly one dash -.

The encoding file may then optionally declare the size of the encoding. For a linear encoding (such as ISO 8859-1), the SIZE line specifies the maximum code plus one:

SIZE 0x2B

For a matrix encoding, it should specify two numbers. The first is the number of the last row plus one; the second is the highest column number plus one. In the case of jisx0208.1990-0 (JIS X 0208(1990), double-byte encoding, high bit clear), it should be

SIZE 0x75 0x80

In the case of a matrix encoding, a FIRSTINDEX line may be included to specify the minimum glyph index in an encoding. The keyword FIRSTINDEX is followed by two integers, the minimum row number followed by the minimum column number:

FIRSTINDEX 0x20 0x20

In the case of a linear encoding, a FIRSTINDEX line is not very useful. If for some reason, however, you choose to include one, it should be followed by a single integer.

Note that in most font backends inclusion of a FIRSTINDEX line has the side effect of disabling default glyph generation, and this keyword should therefore be avoided unless absolutely necessary.

Codes outside the region defined by the SIZE and FIRSTINDEX lines are understood to be undefined. Encodings default to linear encoding with a size of 256 (0x100). This means that you must declare the size of all 16-bit encodings.
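The region test implied by SIZE and FIRSTINDEX can be sketched as below. The function name code_is_in_region is hypothetical, and splitting a two-byte code into row (high byte) and column (low byte) is an assumption about how matrix codes are laid out. The sketch covers only the declared region; mapping sections refine it further.

```python
from typing import Optional, Tuple


def code_is_in_region(code: int,
                      size: Tuple[int, ...] = (0x100,),
                      first: Optional[Tuple[int, ...]] = None) -> bool:
    """Check a code against SIZE (one or two numbers) and optional FIRSTINDEX."""
    if len(size) == 1:
        # Linear encoding: SIZE is the maximum code plus one.
        lowest = first[0] if first else 0
        return lowest <= code < size[0]
    # Matrix encoding: SIZE is (last row + 1, highest column + 1).
    rows, cols = size
    first_row, first_col = first if first else (0, 0)
    row, col = code >> 8, code & 0xFF  # assumption: row = high byte
    return first_row <= row < rows and first_col <= col < cols


if __name__ == "__main__":
    # Default: linear encoding of size 256.
    print(code_is_in_region(0xFF), code_is_in_region(0x100))
    # jisx0208.1990-0 style region: SIZE 0x75 0x80, FIRSTINDEX 0x20 0x20.
    print(code_is_in_region(0x2121, (0x75, 0x80), (0x20, 0x20)))
```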

What follows is one or more mapping sections. A mapping section starts with a STARTMAPPING line stating the target of the mapping. The target may be one of:

  • Unicode (ISO 10646):

    STARTMAPPING unicode
    

  • a given TrueType cmap:

    STARTMAPPING cmap 3 1
    

  • PostScript glyph names:

    STARTMAPPING postscript
    

Every line in a mapping section maps one code from the encoding being defined to the target of the mapping. In mappings with a Unicode or TrueType target, codes are mapped to codes:

0x21 0x0660
0x22 0x0661
...

As an abbreviation, it is possible to map a contiguous range of codes in a single line. A line consisting of three integers

start end target

is an abbreviation for the range of lines

start     target
start+1   target+1
...
end       target+end-start

For example, the line

0x2121 0x215F 0x8140

is an abbreviation for

0x2121 0x8140
0x2122 0x8141
...
0x215F 0x817E
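The abbreviation rule can be checked with a short sketch. The function name expand_mapping_line is hypothetical; it takes the already-parsed integers of one mapping line and returns the individual (code, target) pairs.

```python
from typing import List, Sequence, Tuple


def expand_mapping_line(numbers: Sequence[int]) -> List[Tuple[int, int]]:
    """Expand 'code target' or 'start end target' into (code, target) pairs."""
    if len(numbers) == 2:
        return [(numbers[0], numbers[1])]
    if len(numbers) == 3:
        start, end, target = numbers
        if end < start:
            raise ValueError("range end is below range start")
        # start+i maps to target+i, up to end -> target+end-start.
        return [(start + i, target + i) for i in range(end - start + 1)]
    raise ValueError("a mapping line has two or three integers")


if __name__ == "__main__":
    pairs = expand_mapping_line([0x2121, 0x215F, 0x8140])
    for code, target in (pairs[0], pairs[1], pairs[-1]):
        print(f"0x{code:04X} 0x{target:04X}")
```

Run on the example line, the first, second and last pairs match the three explicit lines listed above.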

Codes not listed are assumed to map through the identity (i.e. to the same numerical value). In order to override this default mapping, you may specify a range of codes to be undefined by using an UNDEFINE line:

UNDEFINE 0x00 0x2A

or, for a single code,

UNDEFINE 0x1234

PostScript mappings are different. Every line in a PostScript mapping maps a code to a glyph name

0x41 A
0x42 B
...

and codes not explicitly listed are undefined.
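The difference between the two defaults can be sketched as a pair of lookup functions. Both names are hypothetical. How an UNDEFINE line interacts with an explicit entry for the same code is not spelled out here, so the sketch lets the explicit entry win; treat that ordering as an assumption.

```python
from typing import Dict, Optional, Sequence, Tuple


def lookup_numeric(code: int,
                   explicit: Dict[int, int],
                   undefined: Sequence[Tuple[int, int]]) -> Optional[int]:
    """Unicode or cmap mapping: unlisted codes map through the identity."""
    if code in explicit:
        return explicit[code]  # assumption: explicit entries take precedence
    if any(low <= code <= high for low, high in undefined):
        return None  # covered by an UNDEFINE line
    return code  # identity default


def lookup_postscript(code: int, glyph_names: Dict[int, str]) -> Optional[str]:
    """PostScript mapping: unlisted codes are simply undefined."""
    return glyph_names.get(code)


if __name__ == "__main__":
    explicit = {0x21: 0x0660, 0x22: 0x0661}
    undefined = [(0x00, 0x20)]  # inclusive range, as in "UNDEFINE 0x00 0x20"
    print(lookup_numeric(0x21, explicit, undefined))
    print(lookup_numeric(0x10, explicit, undefined))
    print(lookup_numeric(0x41, explicit, undefined))
    print(lookup_postscript(0x43, {0x41: "A", 0x42: "B"}))
```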

A mapping section ends with an ENDMAPPING line

ENDMAPPING

After all the mappings have been defined, the file ends with an ENDENCODING line

ENDENCODING

In order to make future extensions to the format possible, lines starting with an unknown keyword are silently ignored, as are mapping sections with an unknown target.

Using symbol fonts

Type 1 symbol fonts should be installed using the adobe-fontspecific encoding.

In an ideal world, all TrueType symbol fonts would be installed using one of the microsoft-symbol and apple-roman encodings. A number of symbol fonts, however, are not marked as such; such fonts should be installed using microsoft-cp1252, or, for older fonts, microsoft-win3.1.

In order to guarantee consistent results (especially between Type 1 and TrueType versions of the same font), it is possible to define a special encoding for a given font. This has already been done for the ZapfDingbats font; see the file encodings/adobe-dingbats.enc.

Hints about using badly encoded fonts

A number of text fonts are incorrectly encoded. Incorrect encoding is sometimes done by design, in order to make a font for an exotic script appear like an ordinary Western text font on systems which are not easily extended with new locale data. It is often the result of the font designer's laziness or incompetence; for some reason, most people seem to find it easier to invent idiosyncratic glyph names rather than follow the Adobe glyph list.

There are two ways of dealing with such fonts: using them with the encoding they were designed for, and creating an ad hoc encoding file.

Using fonts with the designer's encoding

In the case of Type 1 fonts, the font designer can specify a default encoding; this encoding is requested by using the adobe-fontspecific encoding in the XLFD name. Sometimes, the font designer omitted to specify a reasonable default encoding, in which case you should experiment with adobe-standard, iso8859-1, microsoft-cp1252, and microsoft-win3.1. (The encoding microsoft-symbol doesn't make sense for Type 1 fonts).

TrueType fonts do not have a default encoding. However, most TrueType fonts are designed with either Microsoft or Apple platforms in mind, so one of microsoft-symbol, microsoft-cp1252, microsoft-win3.1, or apple-roman should yield reasonable results.

Specifying an ad hoc encoding file

It is always possible to define an encoding file to put the glyphs in a font in any desired order. Again, see the encodings/adobe-dingbats.enc file to see how this is done.

Specifying font aliases

By following the directions above, you will find yourself with a number of fonts with unusual names, with encodings such as adobe-fontspecific, microsoft-win3.1, etc. In order to use these fonts with standard applications, it may be useful to remap them to their proper names.

This is done by writing a fonts.alias file. The format of this file is very simple: it consists of a series of lines each mapping an alias name to a font name. A fonts.alias file might look as follows:

"-ogonki-alamakota-medium-r-normal--0-0-0-0-p-0-iso8859-2" \
  "-ogonki-alamakota-medium-r-normal--0-0-0-0-p-0-adobe-fontspecific"

(both XLFD names on a single line). The syntax of the fonts.alias file is more precisely described in the mkfontdir(1) manual page.

Additional notes about scalable core fonts

About the FreeType backend

The FreeType backend (formerly xfsft) is a backend based on version 2 of the FreeType library (see the FreeType web site) and has the X-TT functionalities for CJKV support provided by the After X-TT Project (see the After X-TT Project web site). The FreeType backend has support for the fontenc style of internationalisation (see The fontenc layer). This backend supports TrueType font files (*.ttf), OpenType font files (*.otf), TrueType Collections (*.ttc), OpenType Collections (*.otc) and Type 1 font files (*.pfa and *.pfb).

In order to access the faces in a TrueType Collection file, the face number must be specified in the fonts.dir file before the filename, within a pair of colons, or by setting the 'fn' TTCap option. For example,

:1:mincho.ttc -misc-pmincho-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0

refers to face 1 in the mincho.ttc TrueType Collection file.

The new FreeType backend supports the extended fonts.dir syntax introduced by X-TrueType with a number of options, collectively known as TTCap. A TTCap entry follows the general syntax

option=value:

and should be specified before the filename. The new FreeType backend almost perfectly supports the TTCap options that are compatible with X-TT 1.4. The Automatic Italic (ai), Double Strike (ds) and Bounding box Width (bw) options are indispensable for CJKV fonts. For example,

mincho.ttc -misc-mincho-medium-r-normal--0-0-0-0-c-0-jisx0208.1990-0
ds=y:mincho.ttc -misc-mincho-bold-r-normal--0-0-0-0-c-0-jisx0208.1990-0
ai=0.2:mincho.ttc -misc-mincho-medium-i-normal--0-0-0-0-c-0-jisx0208.1990-0
ds=y:ai=0.2:mincho.ttc -misc-mincho-bold-i-normal--0-0-0-0-c-0-jisx0208.1990-0
bw=0.5:mincho.ttc -misc-mincho-medium-r-normal--0-0-0-0-c-0-jisx0201.1976-0
bw=0.5:ds=y:mincho.ttc -misc-mincho-bold-r-normal--0-0-0-0-c-0-jisx0201.1976-0
bw=0.5:ai=0.2:mincho.ttc -misc-mincho-medium-i-normal--0-0-0-0-c-0-jisx0201.1976-0
bw=0.5:ds=y:ai=0.2:mincho.ttc -misc-mincho-bold-i-normal--0-0-0-0-c-0-jisx0201.1976-0

set up the complete combination of jisx0208 and jisx0201 using mincho.ttc only. More information on the TTCap syntax can be found on the After X-TT Project page.
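To make the extended syntax concrete, the sketch below splits one fonts.dir entry into its file name, XLFD, TTCap options and optional face number. The function name parse_fonts_dir_entry is hypothetical, option values are kept as uninterpreted strings, and the sketch does not validate option names.

```python
from typing import Dict, Optional, Tuple


def parse_fonts_dir_entry(line: str) -> Tuple[str, str, Dict[str, str], Optional[int]]:
    """Return (filename, xlfd, ttcap_options, face_number) for one entry."""
    # The XLFD may contain spaces, so split on the first whitespace run only.
    file_part, xlfd = line.strip().split(None, 1)
    *prefixes, filename = file_part.split(":")
    options: Dict[str, str] = {}
    face: Optional[int] = None
    for item in prefixes:
        if not item:
            continue  # empty piece produced by a leading colon, as in ":1:"
        if "=" in item:
            key, value = item.split("=", 1)
            options[key] = value  # TTCap option=value
        elif item.isdigit():
            face = int(item)  # face number within a collection file
        else:
            raise ValueError(f"unrecognised prefix {item!r}")
    return filename, xlfd, options, face


if __name__ == "__main__":
    entry = ("bw=0.5:ds=y:ai=0.2:mincho.ttc "
             "-misc-mincho-bold-i-normal--0-0-0-0-c-0-jisx0201.1976-0")
    print(parse_fonts_dir_entry(entry))
    print(parse_fonts_dir_entry(
        ":1:mincho.ttc -misc-pmincho-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0"))
```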

The FreeType backend uses the fontenc layer in order to support recoding of fonts; this was described in The fontenc layer and especially The FreeType backend earlier in this document.

Delayed glyph rasterisation

When loading a proportional font that contains a huge number of glyphs, the old FreeType backend delayed glyph rasterisation until the time at which the glyph was first used. The new FreeType backend (libfreetype-xtt2) has an improved very lazy metric calculation method to speed up the process when loading TrueType or OpenType fonts. Although the X-TT module also has this method, the "vl=y" TTCap option must be set if you want to use it with X-TT. This is the default method for FreeType when it loads multi-byte fonts. Even if you use a Unicode font which has tens of thousands of glyphs, this delay will not be worrisome as long as you use the new FreeType backend, whose very lazy method is very fast.

The maximum error of bitmap position using the very lazy method is 1 pixel, the same as that of character-cell spacing. When the X-TT backend is used with the vl=y option, a chipped bitmap is displayed with certain fonts. However, the new FreeType backend has minimal problems with this, since it corrects left- and right-side bearings using italicAngle in the TrueType/OpenType post table, and automatically corrects bitmap positions during rasterisation so that chipped bitmaps are not displayed. Nevertheless, if you don't want to use the very lazy method with multi-byte fonts, set vl=n in the TTCap option to disable it:

vl=n:luxirr.ttf -b&h-Luxi Serif-medium-r-normal--0-0-0-0-p-0-iso10646-1

Of course, both backends also support an optimisation for character-cell fonts (fonts with all glyph metrics equal, or terminal fonts). A font with an XLFD specifying a character-cell spacing c, as in

-misc-mincho-medium-r-normal--0-0-0-0-c-0-jisx0208.1990-0

or

fs=c:mincho.ttc -misc-mincho-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0

will not compute the metric for each glyph, but instead trust the font to be a character-cell font. You are encouraged to make use of this optimisation when useful, but be warned that not all monospaced fonts are character-cell fonts.