author | Marc G. Fournier | 1998-03-15 07:39:04 +0000 |
---|---|---|
committer | Marc G. Fournier | 1998-03-15 07:39:04 +0000 |
commit | 661ecf3c48e16a9add216287eb969d7615e47968 (patch) | |
tree | 91b54d5905aa2e22bd0ae9ea8c6b0f3cab75d3f4 /doc/README.mb | |
parent | 31a925c4d07675bc098a742ee9ca642ec79a40ee (diff) |
From: t-ishii@sra.co.jp
Included are patches that allow PostgreSQL to handle multi-byte
character sets such as EUC (Extended Unix Code), Unicode and
Mule internal code. With the MB patch you can use multi-byte character
sets in regexp and LIKE. The encoding system is chosen at
compile time.
To enable the MB extension, define the variable "MB" in
Makefile.global or in Makefile.custom. For further information, see
README.mb in the doc directory.
(Note that, unlike the "jp patch", I no longer use a modified GNU
regexp. Instead I changed Henry Spencer's regexp, which comes with
PostgreSQL.)
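
The commit message says the patch makes regexp and LIKE work with multi-byte character sets. The reason this needs encoding-specific code is that one character may occupy several bytes, so a matcher that advances one byte at a time can split a character in the middle. The following is a minimal sketch of that idea for EUC_JP only. It is not taken from the patch: the function names `euc_jp_char_len` and `euc_jp_strlen` are hypothetical, and the lead-byte rules are the standard EUC_JP layout (SS2 `0x8e` starts a 2-byte character, SS3 `0x8f` starts a 3-byte character, any other byte with the high bit set starts a 2-byte character).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the patch): byte length of the EUC_JP
 * character whose first byte is at s. */
size_t euc_jp_char_len(const unsigned char *s)
{
    if (*s == 0x8e)     /* SS2: half-width katakana, 2 bytes */
        return 2;
    if (*s == 0x8f)     /* SS3: JIS X 0212, 3 bytes */
        return 3;
    if (*s & 0x80)      /* JIS X 0208, 2 bytes */
        return 2;
    return 1;           /* ASCII, 1 byte */
}

/* Hypothetical helper: count characters instead of bytes. Counting stops
 * at a truncated trailing character rather than reading past the end. */
size_t euc_jp_strlen(const char *str)
{
    assert(str != NULL);

    const unsigned char *s = (const unsigned char *) str;
    size_t remaining = strlen(str);
    size_t chars = 0;

    while (remaining > 0) {
        size_t len = euc_jp_char_len(s);

        if (len > remaining)
            break;      /* incomplete character at end of input */
        s += len;
        remaining -= len;
        chars++;
    }
    return chars;
}
```

For a string made of three 2-byte EUC_JP characters followed by the ASCII text "db", `strlen` reports 8 bytes while `euc_jp_strlen` reports 5 characters. A single-character wildcard in LIKE has to advance by the character length, not by one byte, to stay aligned on character boundaries.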
Diffstat (limited to 'doc/README.mb')
-rw-r--r-- | doc/README.mb | 67 |
1 files changed, 67 insertions, 0 deletions
diff --git a/doc/README.mb b/doc/README.mb
new file mode 100644
index 00000000000..d6ff7e569b1
--- /dev/null
+++ b/doc/README.mb
@@ -0,0 +1,67 @@
+postgresql 6.3 multi-byte(MB) patch PL2 README Mar 10 1998
+
+ Tatsuo Ishii
+ t-ishii@sra.co.jp
+ http://www.sra.co.jp/people/t-ishii/PostgreSQL/
+
+Introduction
+
+MB patch is intended for allowing PostgreSQL to handle multi-byte
+charachter sets such as EUC(Extende Unix Code), Unicode and Mule
+internal code. With the MB patch you can use multi-byte character sets
+in regexp and LIKE. The encoding system chosen is determined at the
+compile time.
+
+The patch also fixes some problems concerning with 8-bit single byte
+character sets including ISO8859. (I would not say all of problems
+have been fixed. I just confirmed that the regression test ran fine
+and a few French characters could be used with the patch. Please let
+me know if you find any problem while using 8-bit characters)
+
+How to use
+
+After applying the MB patch, create src/Makefile.custom with a line
+including:
+
+MB=encoding_system
+
+where encoding_system is one of:
+
+EUC_JP Japanese EUC
+EUC_CN Chinese EUC
+EUC_KR Korean EUC
+EUC_TW Taiwan EUC
+UNICODE Unicode(UTF-8)
+MULE_INTERNAL Mule internal
+
+Example:
+
+% cat Makefile.custom
+MB=EUC_JP
+
+If MB is not defined, nothing is changed except better supporting for
+8-bit single byte character sets.
+
+References
+
+These are good sources to start learning various kind of encoding
+systems.
+
+ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
+ Detailed explanations of EUC_JP, EUC_CN, EUC_KR, EUC_TW
+ appear in section 3.2.
+
+Unicode: http://www.unicode.org/
+ The homepage of UNICODE.
+
+ RFC 2044
+ UTF-8 is defined here.
+
+History
+
+Mar 10, 1998 PL2 released
+ * add regression test for EUC_JP, EUC_CN and MULE_INTERNAL
+ * add an English document (this file)
+ * fix problems concerning 8-bit single byte characters
+
+Mar 1, 1998 PL1 released