Adobe CIDs and glyphs in CJK TrueType font
(aka CJKTTCID.INF in gs-cjk project)

For other information, see the Ghostscript overview.


Overview

This note describes how CJK (Chinese, Japanese and Korean) TrueType fonts can be used as Type 2 CID-keyed fonts, and where that approach is valid and where it is limited. To compose a Type 2 CID-keyed font from a CJK TrueType font on the fly, the gs-cjk project uses not only the general-purpose (ToCID) CMaps but also the ToUnicode and ToCode CMaps that Adobe Systems Incorporated distributes freely with Acrobat Reader. The current algorithm for mapping CIDs to TrueType glyphs depends entirely on the Adobe CMaps and on the TrueType cmap and GSUB tables.
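The way these three inputs could fit together can be sketched as follows. This is only an illustration under assumptions, not the gs-cjk implementation: the function name `compose_cid_to_gid`, the dict-shaped inputs, the order of the passes and the fallback to the upright glyph are all hypothetical.

```python
def compose_cid_to_gid(to_cid_h, to_cid_v, tt_cmap, gsub_vert, cid_to_code=None):
    """Build a CID -> glyph index table from already-parsed tables.

    to_cid_h, to_cid_v: character code -> CID (horizontal / vertical ToCID CMap)
    tt_cmap:            character code -> glyph index (TrueType cmap subtable)
    gsub_vert:          glyph index -> vertical glyph index (from GSUB)
    cid_to_code:        CID -> character code (ToUnicode or ToCode CMap), optional
    """
    cid_to_gid = {}
    # Horizontal pass: a code reaches a CID through the CMap and a glyph
    # through the TrueType cmap, which ties the CID to the glyph.
    for code, cid in to_cid_h.items():
        gid = tt_cmap.get(code)
        if gid is not None:
            cid_to_gid.setdefault(cid, gid)
    # Vertical pass: only codes whose vertical CID differs need a GSUB lookup.
    for code, cid_v in to_cid_v.items():
        gid = tt_cmap.get(code)
        if gid is None or cid_v == to_cid_h.get(code):
            continue
        # Assumption: fall back to the upright glyph when GSUB has no entry.
        cid_to_gid.setdefault(cid_v, gsub_vert.get(gid, gid))
    # Assumption: the ToUnicode/ToCode CMap only fills CIDs still unassigned.
    for cid, code in (cid_to_code or {}).items():
        gid = tt_cmap.get(code)
        if gid is not None:
            cid_to_gid.setdefault(cid, gid)
    return cid_to_gid
```

CIDs that never appear in the result are the ones reported as "lacking" later in this note.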

The current revision of the on-the-fly Type 2 CID-keyed font technology supports the following kinds of CID-keyed fonts (RO: Registry-Ordering):

[RO]
Adobe-CNS1
Adobe-GB1
Adobe-Japan1
Adobe-Japan2
Adobe-Korea1

and doesn't support the following kinds of CID-keyed fonts:
[RO]
Adobe-CNS2
Adobe-HongKong1
Adobe-Korea2
Adobe-Vietnam1

The current revision can handle the following kinds (Encoding in cmap table) of TrueType fonts as CID-keyed fonts:
[Encoding]  [RO]
Unicode     Adobe-*
ShiftJIS    Adobe-Japan1
PRC         Adobe-GB1
Big5        Adobe-CNS1
Wansung     Adobe-Korea1
Johab       Adobe-Korea1

It does not yet support TrueType fonts with UCS-4 Encoding. For the Unicode Encoding, the RO can be detected by reading the ``Code Page Character Range'' field of the TTF's OS/2 table, as follows:
[Encoding]  [Code Page]          [RO]
Unicode     Japanese             Adobe-Japan1
            Simplified Chinese   Adobe-GB1
            Korean Wansung       Adobe-Korea1
            Traditional Chinese  Adobe-CNS1
            Korean Johab         Adobe-Korea1
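
The two lookups described above can be sketched in Python. The encoding IDs of Windows-platform cmap subtables and the bit positions of ulCodePageRange1 follow the OpenType specification [1]; the function names, the choice of which bit wins when a font sets several of them, and the handling of short OS/2 tables are assumptions made for this example.

```python
import struct

# Encoding IDs of Windows-platform (platformID 3) cmap subtables.
# ID 10 (UCS-4) is left out because it is not supported.
CMAP_ENCODINGS = {1: "Unicode", 2: "ShiftJIS", 3: "PRC",
                  4: "Big5", 5: "Wansung", 6: "Johab"}
FIXED_RO = {"ShiftJIS": "Adobe-Japan1", "PRC": "Adobe-GB1", "Big5": "Adobe-CNS1",
            "Wansung": "Adobe-Korea1", "Johab": "Adobe-Korea1"}
# Bits of ulCodePageRange1 in the OS/2 table, in the order of the table above.
CODE_PAGE_BITS = [(17, "Adobe-Japan1"), (18, "Adobe-GB1"), (19, "Adobe-Korea1"),
                  (20, "Adobe-CNS1"), (21, "Adobe-Korea1")]

def cmap_encodings(cmap_table):
    """List the supported encodings found in a raw cmap table."""
    _version, num_tables = struct.unpack_from(">HH", cmap_table, 0)
    names = []
    for i in range(num_tables):
        # Each encoding record: platformID, encodingID, 32-bit subtable offset.
        platform_id, encoding_id, _offset = struct.unpack_from(
            ">HHI", cmap_table, 4 + 8 * i)
        if platform_id == 3 and encoding_id in CMAP_ENCODINGS:
            names.append(CMAP_ENCODINGS[encoding_id])
    return names

def ro_from_code_page_range(ul_code_page_range1):
    """Assumption: the first matching bit, in table order, decides the RO."""
    for bit, ro in CODE_PAGE_BITS:
        if ul_code_page_range1 & (1 << bit):
            return ro
    return None

def detect_ro(encoding, os2_table=None):
    """Return the RO for a cmap encoding, consulting OS/2 only for Unicode."""
    if encoding != "Unicode":
        return FIXED_RO.get(encoding)
    # ulCodePageRange1 sits at byte offset 78 and exists from OS/2 version 1.
    if os2_table is None or len(os2_table) < 82:
        return None
    (version,) = struct.unpack_from(">H", os2_table, 0)
    if version < 1:
        return None
    (range1,) = struct.unpack_from(">I", os2_table, 78)
    return ro_from_code_page_range(range1)
```

A font that sets none of the five bits yields `None` here; how such a font should be treated is left open.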

For each combination of RO and TTF Encoding, the following Adobe CMaps are applied:
[RO-Encoding]                 [Supplement limit]
  [used CMap]                 - [Comment]

Adobe-CNS1-Big5               0
  Adobe-CNS1-ETen-B5          - ToCode CMap
  ETen-B5-V                   - ToCID CMap
  ETen-B5-H                   - ToCID CMap

Adobe-CNS1-Unicode            3
  Adobe-CNS1-UCS2             - ToUnicode CMap
  UniCNS-UCS2-V               - ToCID CMap
  UniCNS-UCS2-H               - ToCID CMap

Adobe-GB1-PRC                 2
  Adobe-GB1-GBK-EUC           - ToCode CMap
  GBK-EUC-V                   - ToCID CMap
  GBK-EUC-H                   - ToCID CMap

Adobe-GB1-Unicode             4
  Adobe-GB1-UCS2              - ToUnicode CMap
  UniGB-UCS2-V                - ToCID CMap
  UniGB-UCS2-H                - ToCID CMap

Adobe-Japan1-ShiftJIS         2
  Adobe-Japan1-90ms-RKSJ      - ToCode CMap
  90ms-RKSJ-V                 - ToCID CMap
  90ms-RKSJ-H                 - ToCID CMap

Adobe-Japan1-Unicode          4
  Adobe-Japan1-UCS2           - ToUnicode CMap
  UniJIS-UCS2-V               - ToCID CMap
  UniJIS-UCS2-H               - ToCID CMap

Adobe-Japan2-Unicode          0
  UniHojo-UCS2-V              - ToCID CMap
  UniHojo-UCS2-H              - ToCID CMap

Adobe-Korea1-Johab            1
  KSC-Johab-V                 - ToCID CMap
  KSC-Johab-H                 - ToCID CMap

Adobe-Korea1-Unicode          2
  Adobe-Korea1-UCS2           - ToUnicode CMap
  UniKS-UCS2-V                - ToCID CMap
  UniKS-UCS2-H                - ToCID CMap

Adobe-Korea1-Wansung          1
  Adobe-Korea1-KSCms-UHC      - ToCode CMap
  KSCms-UHC-V                 - ToCID CMap
  KSCms-UHC-H                 - ToCID CMap

Here each Supplement value is the limit determined by the maximum CID that appears in the CMaps used.
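
The CMap table above amounts to a lookup keyed by RO and TTF Encoding. A minimal sketch of that lookup, with the CMap names and Supplement limits copied from the table, could look like this; the names `CMAP_TABLE` and `select_cmaps` and the shape of the returned dict are hypothetical.

```python
# (RO, TTF encoding) -> (Supplement limit, ToUnicode/ToCode CMap or None,
#                        vertical ToCID CMap, horizontal ToCID CMap)
CMAP_TABLE = {
    ("Adobe-CNS1", "Big5"): (0, "Adobe-CNS1-ETen-B5", "ETen-B5-V", "ETen-B5-H"),
    ("Adobe-CNS1", "Unicode"): (3, "Adobe-CNS1-UCS2", "UniCNS-UCS2-V", "UniCNS-UCS2-H"),
    ("Adobe-GB1", "PRC"): (2, "Adobe-GB1-GBK-EUC", "GBK-EUC-V", "GBK-EUC-H"),
    ("Adobe-GB1", "Unicode"): (4, "Adobe-GB1-UCS2", "UniGB-UCS2-V", "UniGB-UCS2-H"),
    ("Adobe-Japan1", "ShiftJIS"): (2, "Adobe-Japan1-90ms-RKSJ", "90ms-RKSJ-V", "90ms-RKSJ-H"),
    ("Adobe-Japan1", "Unicode"): (4, "Adobe-Japan1-UCS2", "UniJIS-UCS2-V", "UniJIS-UCS2-H"),
    ("Adobe-Japan2", "Unicode"): (0, None, "UniHojo-UCS2-V", "UniHojo-UCS2-H"),
    ("Adobe-Korea1", "Johab"): (1, None, "KSC-Johab-V", "KSC-Johab-H"),
    ("Adobe-Korea1", "Unicode"): (2, "Adobe-Korea1-UCS2", "UniKS-UCS2-V", "UniKS-UCS2-H"),
    ("Adobe-Korea1", "Wansung"): (1, "Adobe-Korea1-KSCms-UHC", "KSCms-UHC-V", "KSCms-UHC-H"),
}

def select_cmaps(ro, encoding):
    """Return the CMaps applied to one RO/encoding pair, or raise KeyError."""
    entry = CMAP_TABLE.get((ro, encoding))
    if entry is None:
        raise KeyError("unsupported combination: %s with %s" % (ro, encoding))
    limit, to_code, vertical, horizontal = entry
    return {"supplement_limit": limit, "to_code": to_code,
            "vertical": vertical, "horizontal": horizontal}
```

Two entries carry `None` because the table lists no ToUnicode or ToCode CMap for Adobe-Japan2-Unicode and Adobe-Korea1-Johab.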

For vertically used glyphs, the Single Substitution Format 2 subtables of the TTF's Glyph Substitution (GSUB) table are read. The current revision does not handle any other GSUB format, so handling ligatures and variants in CID-keyed fonts remains a task for the future.
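
As an illustration of what reading such a subtable involves, the sketch below decodes one Single Substitution Format 2 subtable together with its Coverage table, following the layout in the OpenType specification [1]. It is not the gs-cjk code: the function names are hypothetical, and locating the subtable through the GSUB ScriptList, FeatureList and LookupList is left out.

```python
import struct

def parse_coverage(data, offset):
    """Return the covered glyph IDs in coverage-index order."""
    fmt, count = struct.unpack_from(">HH", data, offset)
    if fmt == 1:
        # Format 1: an explicit, sorted list of glyph IDs.
        return list(struct.unpack_from(">%dH" % count, data, offset + 4))
    if fmt == 2:
        # Format 2: sorted ranges (start glyph, end glyph, start coverage index).
        glyphs = []
        for i in range(count):
            start, end, _start_index = struct.unpack_from(
                ">HHH", data, offset + 4 + 6 * i)
            glyphs.extend(range(start, end + 1))
        return glyphs
    raise ValueError("unknown coverage format %d" % fmt)

def parse_single_subst_format2(data, offset=0):
    """Map each covered glyph ID to its substitute glyph ID."""
    fmt, coverage_offset, glyph_count = struct.unpack_from(">HHH", data, offset)
    if fmt != 2:
        # Other formats are ignored, as in the current revision.
        return {}
    substitutes = struct.unpack_from(">%dH" % glyph_count, data, offset + 6)
    # The coverage offset is relative to the start of the subtable.
    covered = parse_coverage(data, offset + coverage_offset)
    return dict(zip(covered, substitutes))
```

The resulting dict plays the role of the glyph-to-vertical-glyph table when vertical CIDs are assigned.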

Recent CID-keyed fonts define pre-rotated Latin glyphs, but the current revision simply maps them to the normal Latin glyphs. I do not know whether the gs-cjk project will be able to handle pre-rotated glyphs in the future.

Notes of CJKTTCID

The following tables and comments give the details of validity and limitations, at the current revision, for each kind of CID-keyed font composed from widely available Unicode TrueType fonts. Naturally, which glyphs are lacking depends on the TrueType fonts you use.

Adobe-CNS1 CID-keyed font composed from Traditional Chinese Unicode TTF
-----------------------------------------------------------------------
[ROS]         [CID range]   [Comment]
Adobe-CNS1-0      0-  505   96,97,124-127,228,260 are lacking
                506-  561   no problem
                562-  594   all glyphs are lacking
                595-13645   no problem
              13646-13748   13646,13647 are lacking
              13749-13998   13996-13998 are lacking
              13999-14098   some glyphs are lacking
Adobe-CNS1-1  14099-17407   lots of glyphs are lacking (*1)
Adobe-CNS1-2  17408-17600   17503,17504 are lacking (*2)
Adobe-CNS1-3  17601-17605   17603 is lacking
              17606-18845   lots of glyphs are lacking (*3)
Adobe-CNS1-4  18846-18961   no glyphs can be assigned (*4)
(*1) HK GCCS
(*2) not pre-rotated
(*3) HK SCS
(*4) HK SCS (unused in the UniCNS-UCS2 CMap, though used in UniCNS-UTF8,
     UniCNS-UTF16, UniCNS-UTF32, also ETHK-B5 and, needless to say, HKscs-B5)


Adobe-GB1 CID-keyed font composed from Simplified Chinese Unicode TTF
---------------------------------------------------------------------
[ROS]        [CID range]   [Comment]
Adobe-GB1-0      0-  939   99,695,698,737,935,938 are lacking
               940- 7702   no problem
              7703- 7716   7705,7708 are incorrect
Adobe-GB1-1   7717- 9896   no problem
Adobe-GB1-2   9897-22126   no problem
Adobe-GB1-3  22127-22352   22347,22350,22352 are lacking (*1)
Adobe-GB1-4  22353-22427   no glyphs are available (*2)
             22428-29058   no glyphs are available (*3)
             29059-29063   no glyphs are available (*4)
(*1) not pre-rotated
(*2) additional Hiragana and Katakana, extended Bopomofo glyphs
(*3) the Unified Han Ideographs Extension A
(*4) pre-rotated glyphs


Adobe-Japan1 CID-keyed font composed from Japanese Unicode TTF
--------------------------------------------------------------
[ROS]           [CID range]   [Comment]
Adobe-Japan1-0      0- 1124   lots of glyphs are lacking or incorrect:
                              96-98,127,128,130-133,135-137,226,326,
                              390,396,422,424,502,506-509,512,513,515,
                              606,607,632
                 1125- 7477   no problem
                 7478- 7632   7478 is lacking and 7608,7609 are incorrect
                 7633- 8004   lots of glyphs are lacking or incorrect (*4)
                 8005- 8283   lots of glyphs are lacking or incorrect:
                              8008,8053,8059-8061,8091,8102-8111,8166-8181,
                              8189,8190,8227-8229,8260
Adobe-Japan1-1   8284- 8358   lots of glyphs are lacking or incorrect:
                              8295-8297,8300-8302,
                              8306,8307,8321,8322,8325,8326
Adobe-Japan1-2   8359- 8717   no problem
                 8718- 8719   8718 is lacking and 8719 is incorrect
Adobe-Japan1-3   8720- 9353   some glyphs are lacking or incorrect (*1)
Adobe-Japan1-4   9354- 9737   some glyphs are lacking or incorrect (*2)
                 9738-13319   lots of glyphs are lacking or incorrect (*3)
                13320-15443   all glyphs are variants or lacking (*4)
(*1) not pre-rotated
(*2) not italic form
(*3) many ligature, pre-rotated, pre-rotated and italic form glyphs
(*4) lots of variants are assigned substitutes


Adobe-Japan2 CID-keyed font composed from Japanese Unicode TTF
--------------------------------------------------------------
[ROS]           [CID range]   [Comment]
Adobe-Japan2-0      0- 6067   no problem


Adobe-Korea1 CID-keyed font composed from Korean Unicode TTF
------------------------------------------------------------
[ROS]           [CID range]   [Comment]
Adobe-Korea1-0      0-  357   some glyphs are lacking or incorrect:
                              61,97,100,104,111,227
                  358- 3435   no problem
                 3436- 8055   no problem
                 8056- 8190   lots of glyphs are lacking or incorrect:
                              8059,8061,8075,8083-8085,8089,8091,8093,8190
                 8191- 9332   no problem
Adobe-Korea1-1   9333-18154   perhaps no problem, but I can't check (*)
Adobe-Korea1-2  18155-18351   some glyphs are lacking
(*) The Technical Note on Adobe-Korea1-1 and Adobe-Korea1-2 has not been published yet [6].
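
A listing of lacking CIDs in the style of the comments above could be produced from a CID-to-glyph table as sketched below. This is a hypothetical helper, not the procedure actually used for this note: it assumes that a missing entry or glyph index 0 (.notdef) means "lacking", and it cannot detect glyphs that are present but incorrect.

```python
def lacking_cids(cid_to_gid, first, last):
    """CIDs in [first, last] with no glyph, or only glyph 0 (.notdef)."""
    return [cid for cid in range(first, last + 1)
            if cid_to_gid.get(cid, 0) == 0]

def compress(cids):
    """Format sorted CIDs as in the tables, e.g. '96,97,124-127'."""
    parts = []
    i = 0
    while i < len(cids):
        # Extend j to the end of the run of consecutive CIDs starting at i.
        j = i
        while j + 1 < len(cids) and cids[j + 1] == cids[j] + 1:
            j += 1
        if j - i >= 2:
            # Runs of three or more are written as a range.
            parts.append("%d-%d" % (cids[i], cids[j]))
        else:
            parts.extend(str(c) for c in cids[i:j + 1])
        i = j + 1
    return ",".join(parts)
```

For example, `compress(lacking_cids(table, 0, 505))` would summarize the first CID range of a collection for a given font.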

Future Works

As stated above, the current mapping algorithm, which is based on ToCID and ToUnicode CMaps, still has problems. The gs-cjk project is considering how to resolve them.

References

  1. Microsoft Corporation, ``OpenType specification,'' http://www.asia.microsoft.com/typography/otspec/
  2. Adobe Systems Incorporated, ``Adobe-CNS1-4 Character Collection for CID-Keyed Fonts,'' http://partners.adobe.com/asn/developer/pdfs/tn/5080.Adobe-CNS1-4.pdf
  3. Adobe Systems Incorporated, ``Adobe-GB1-4 Character Collection for CID-Keyed Fonts,'' http://partners.adobe.com/asn/developer/pdfs/tn/5079.Adobe-GB1-4.pdf
  4. Adobe Systems Incorporated, ``Adobe-Japan1-4 Character Collection for CID-Keyed Fonts,'' http://partners.adobe.com/asn/developer/pdfs/tn/5078.Adobe-Japan1-4.pdf
  5. Adobe Systems Incorporated, ``Adobe-Japan2-0 Character Collection for CID-Keyed Fonts,'' http://partners.adobe.com/asn/developer/pdfs/tn/5097.Adobe-Japan2-0.pdf
  6. Adobe Systems Incorporated, ``Adobe-Korea1-0 Character Collection for CID-Keyed Fonts,'' http://partners.adobe.com/asn/developer/pdfs/tn/5093.Adobe-Korea1-0.pdf
  7. Taiji Yamada, ``Tips on PostScript,'' http://www.aihara.co.jp/~taiji/tops/
  8. ``gs-cjk project,'' http://www.gyve.org/gs-cjk/

This article was written by Taiji Yamada <taiji@aihara.co.jp>. He takes full responsibility for the wording and content of this article.

Copyright © 2001 Taiji Yamada <taiji@aihara.co.jp> and gs-cjk project.

Copyright © 2002 artofcode LLC. All rights reserved.

This file is part of GNU Ghostscript. See the GNU General Public License (the "License") for full details of the terms of using, copying, modifying, and redistributing GNU Ghostscript.

GNU Ghostscript version 6.53, 13 February 2002