##Adobe File Version: 1.000
#=======================================================================
#   FTP file name:  JAPANESE.TXT
#
#   Contents:       Map (external version) from Mac OS Japanese
#                   encoding to Unicode 2.1
#
#   Copyright:      (c) 1994-1999 by Apple Computer, Inc., all rights
#                   reserved.
#
#   Contact:        charsets@apple.com
#
#   Changes:
#
#       b03  1999-Sep-22    Change mappings for 0xFF, 0x8150, 0x8163,
#                           0xEB50, and 0xEB63 to reduce problems in
#                           interconversion with CP 932. Update contact
#                           e-mail address. Matches internal utom<b9>,
#                           ufrm<b10>, and Text Encoding Converter
#                           version 1.5.
#       b02  1998-Aug-18    Update the mappings for Mac OS Japanese
#                           0x8650 and 0x8855. Update the description of
#                           the PostScript screen variant. Matches
#                           internal utom<b7>, ufrm<b7>.
#       n06  1998-Feb-05    Update to match internal utom<n10>, ufrm<n27>
#                           and Text Encoding Converter version 1.3:
#                           Use standard Unicodes plus transcoding hints
#                           instead of single corporate characters; see
#                           details below. Also reorder into single list
#                           with all one-byte characters at the
#                           beginning, and rewrite all the initial
#                           comments.
#       n03  1995-Apr-15    Matches internal ufrm<n11>.
#
# Standard header:
# ----------------
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are subject to change.
#   The latest tables should be available from the following:
#
#   <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#   <ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
#
#   For general information about Mac OS encodings and these mapping
#   tables, see the file "README.TXT".
#
# Format:
# -------
#
#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the Mac OS Japanese code (in hex as 0xNN or 0xNNNN)
#     Column #2 is the corresponding Unicode or Unicode sequence (in
#       hex as 0xNNNN, 0xNNNN+0xNNNN, etc.).
#     Column #3 is a comment containing the Unicode name.
#       In some cases an additional comment follows the Unicode name.
#
#   The entries are in Mac OS Japanese code order.
#   All one-byte characters are at the beginning.
#
#   Some of these mappings require the use of corporate characters.
#   See the file "CORPCHAR.TXT" and notes below.
#
#   Control character mappings are not shown in this table, following
#   the conventions of the standard UTC mapping tables. However, the
#   Mac OS Japanese encoding uses the standard control characters at
#   0x00-0x1F and 0x7F.
#
# Notes on Mac OS Japanese:
# -------------------------
#
#   This table covers the standard Mac OS Japanese encoding used
#   in Mac OS versions 7.1 and later. The Mac OS Japanese encoding is
#   based on Shift-JIS, but adds another 300 or so characters using
#   code points that are unassigned in Shift-JIS. Certain Mac OS
#   Japanese fonts are based on a modified version of the Mac OS
#   Japanese encoding; see below.
#
#   Some of the information below comes from Ken Lunde's book
#   "Understanding Japanese Information Processing", O'Reilly & Assoc.,
#   1993.
#
#   1. Conventional Shift-JIS
#
#   Most Shift-JIS implementations include the following characters:
#
#   a) One-byte characters from JIS X0201-1976. This has two parts:
#
#      - JIS-Roman, the Japanese national variant of ISO 646 (the
#        international version of ASCII).
#        This is identical to ASCII
#        except that 0x5C is YEN SIGN instead of REVERSE SOLIDUS,
#        0x7E is OVERLINE instead of TILDE, and usually 0x7C is
#        BROKEN BAR instead of VERTICAL LINE (although this last
#        difference is sometimes seen as just a glyph variant).
#
#      - "Halfwidth" katakana and punctuation characters with codes
#        0xA1-0xDF.
#
#   b) Two-byte characters with first/lead/high byte in the range
#      0x81-0x9F or 0xE0-0xFC, and second/trail/low byte in the
#      range 0x40-0x7E or 0x80-0xFC. The first byte range was chosen
#      to avoid any JIS X0201 characters. The two-byte characters
#      include:
#
#      - Characters from JIS X0208-1990, transformed so they map
#        onto Shift-JIS code points 0x8140-0xEFFC. The original JIS
#        X0208 characters have code points in the range 0x2121 to
#        0x7E7E (corresponding to "ku-ten" codes in the range 1,1 to
#        94,94 - i.e. row and column on a JIS X0208 chart).
#
#      - A user-defined range using Shift-JIS code points
#        0xF040-0xFCFC, providing 2444 code points.
#
#   Note: PostScript fonts are based on JIS X0208-1983 (formerly
#   known as JIS C6226-1983). This earlier version of JIS X0208
#   lacks two Kanji characters that were added for JIS X0208-1990;
#   these have Shift-JIS codes 0xEAA3 and 0xEAA4.
#
#   2. Mac OS Japanese changes and additions
#
#   a) One-byte changes and additions
#
#      - Changes to JIS-Roman: In Mac OS Japanese, 0x7C and 0x7E
#        are assigned as in ASCII, not as in JIS-Roman:
#        0x7C  VERTICAL LINE (can be considered a glyph difference)
#        0x7E  TILDE
#
#      - Additional one-byte characters: basic Shift-JIS leaves
#        five one-byte code points unassigned. Mac OS Japanese assigns
#        these as follows:
#        0x80  REVERSE SOLIDUS (the character at 0x5C in ASCII)
#        0xA0  NO-BREAK SPACE ("halfwidth"; common Shift-JIS addition)
#        0xFD  COPYRIGHT SIGN
#        0xFE  TRADE MARK SIGN
#        0xFF  halfwidth horizontal ellipsis
#
#   b) Two-byte additions
#
#      Many JIS X0208 code points are unassigned; these correspond to
#      many unassigned code points in Shift-JIS.
#      Many implementations
#      of Shift-JIS, including Mac OS Japanese, add characters using
#      these code points. The standard Mac OS Japanese additions are:
#
#      - 260 symbols and dingbat-like number and letter forms using
#        Shift-JIS codes in the range 0x8540-0x886D. These include
#        circled and parenthesized numbers and letters, square katakana
#        and Kanji forms, etc.
#
#      - 53 vertical forms for hiragana, katakana, and punctuation,
#        using Shift-JIS codes in the range 0xEB41-0xED96. These are
#        so-called "ku+84" vertical forms, since their ku-ten code is
#        derived from that of the corresponding abstract or horizontal
#        form character by adding 84 to the ku (row) number.
#
#      Most of these additional characters are found in other vendor
#      implementations of Shift-JIS, although often with different
#      code points, and so most of these characters are also found
#      in Unicode. However, some of these additional characters do
#      not correspond to any standard single Unicode character.
#
#   3. Mac OS Japanese font variants
#
#   Some fonts used with Mac OS Japanese implement variants of the
#   encoding described above.
#
#   a) Basic variant
#
#   This is used with the fonts TohabaGothic and TohabaMincho. This
#   variant has none of the two-byte additions described in section 2b
#   above, but it does have all the one-byte changes and additions
#   described in section 2a.
#
#   These fonts also lack glyphs for the Kanji characters at 0xEAA3
#   and 0xEAA4.
#
#   b) PostScript screen variant
#
#   This is used with the screen fonts ChuGothic and SaiMincho. This
#   variant does not have the Apple 260 symbols and dingbat-like
#   additions in the range 0x8540-0x886D; instead it has a different
#   set of about 160 symbols and dingbat-like additions in the range
#   0x86A2-0x879C.
#   Like the standard variant, it does have the ku+84
#   vertical forms in the range 0xEB41-0xED96; it also has additional
#   vertical forms in the ranges 0xEE5F-0xEE6E and 0xEE80-0xEE81
#   (although many fonts do not have glyphs for these additional
#   vertical forms). This variant also has the one-byte changes at
#   0x7C and 0x7E and the additions at 0x80 and 0xA0, but these fonts
#   lack glyphs for the one-byte additions at 0xFD-0xFF. These fonts
#   also lack glyphs for the Kanji characters at 0xEAA3 and 0xEAA4.
#
#   c) PostScript printer variant
#
#   When the screen fonts ChuGothic and SaiMincho are printed on
#   certain printers such as Apple's LaserWriter NTX-J, the printer
#   will use a built-in font that is supposed to match the screen font.
#   In fact, the printer fonts implement a superset of the screen
#   variant ...