From: t-ishii@sra.co.jp

Included are patches intended to allow PostgreSQL to handle multi-byte character sets such as EUC (Extended Unix Code), Unicode and Mule internal code. With the MB patch you can use multi-byte character sets in regexp and LIKE. The encoding system is chosen at compile time. To enable the MB extension, you need to define a variable "MB" in Makefile.global or in Makefile.custom. For further information, please take a look at README.mb under the doc directory. (Note that unlike the "jp patch", I no longer use a modified GNU regexp. I changed the Henry Spencer regexp that comes with PostgreSQL.)
author: Marc G. Fournier 1998-03-15 07:39:04 +0000
committer: Marc G. Fournier 1998-03-15 07:39:04 +0000
commit: 661ecf3c48e16a9add216287eb969d7615e47968 (patch)
tree: 91b54d5905aa2e22bd0ae9ea8c6b0f3cab75d3f4 /doc/README.mb
parent: 31a925c4d07675bc098a742ee9ca642ec79a40ee (diff)
1 files changed, 67 insertions, 0 deletions
diff --git a/doc/README.mb b/doc/README.mb
new file mode 100644
index 00000000000..d6ff7e569b1
--- /dev/null
+++ b/doc/README.mb
@@ -0,0 +1,67 @@
+postgresql 6.3 multi-byte(MB) patch PL2 README	  Mar 10 1998
+
+						Tatsuo Ishii
+						t-ishii@sra.co.jp
+		  http://www.sra.co.jp/people/t-ishii/PostgreSQL/
+
+Introduction
+
+The MB patch is intended to allow PostgreSQL to handle multi-byte
+character sets such as EUC (Extended Unix Code), Unicode and Mule
+internal code. With the MB patch you can use multi-byte character sets
+in regexp and LIKE. The encoding system is chosen at
+compile time.
+
+The patch also fixes some problems with 8-bit single-byte
+character sets, including ISO8859. (I would not say that all problems
+have been fixed. I have only confirmed that the regression tests run fine
+and that a few French characters can be used with the patch. Please let
+me know if you find any problems while using 8-bit characters.)
+
+How to use
+
+After applying the MB patch, create src/Makefile.custom containing
+the line:
+
+MB=encoding_system
+
+where encoding_system is one of:
+
+EUC_JP			Japanese EUC
+EUC_CN			Chinese EUC
+EUC_KR			Korean EUC
+EUC_TW			Taiwan EUC
+UNICODE			Unicode(UTF-8)
+MULE_INTERNAL		Mule internal
+
+Example:
+
+% cat Makefile.custom
+MB=EUC_JP
+
+If MB is not defined, nothing changes except for better support of
+8-bit single-byte character sets.
+
+References
+
+These are good starting points for learning about the various kinds of
+encoding systems.
+
+ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
+	Detailed explanations of EUC_JP, EUC_CN, EUC_KR, EUC_TW
+	appear in section 3.2.
+
+Unicode: http://www.unicode.org/
+	The homepage of UNICODE.
+
+	RFC 2044
+	UTF-8 is defined here.
+
+History
+
+Mar 10, 1998 PL2 released
+	* add regression test for EUC_JP, EUC_CN and MULE_INTERNAL
+	* add an English document (this file)
+	* fix problems concerning 8-bit single byte characters
+
+Mar 1, 1998 PL1 released
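To illustrate why regexp and LIKE need to know the encoding, here is a minimal Python sketch. It is not the patch's C code; the function name like_match and the sample word are hypothetical, chosen only for this example. It applies the same LIKE-style pattern once character by character and once byte by byte to an EUC_JP string, in which each kanji occupies two bytes.

```python
def like_match(pattern, text):
    """LIKE-style match over two sequences of units (characters or byte values).

    '_' matches exactly one unit and '%' matches any run of units.
    Pass two str objects to match by character, or two bytes objects
    to match by byte. Illustrative only, not PostgreSQL's matcher.
    """
    # Indexing bytes yields ints, so pick wildcard values of the right type.
    one, run = ("_", "%") if isinstance(pattern, str) else (ord("_"), ord("%"))
    if len(pattern) == 0:
        return len(text) == 0
    head, rest = pattern[0], pattern[1:]
    if head == run:
        # '%' may absorb zero or more units of the text.
        return any(like_match(rest, text[i:]) for i in range(len(text) + 1))
    if len(text) == 0:
        return False
    return (head == one or head == text[0]) and like_match(rest, text[1:])


word = "日本語"                  # three characters
raw = word.encode("euc_jp")      # the same text as EUC_JP bytes

# Character-wise matching: '_' consumes one whole character.
by_char = like_match("日_語", word)

# Byte-wise matching: '_' consumes one byte, half of a two-byte character.
by_byte = like_match("日_語".encode("euc_jp"), raw)

print(len(word), len(raw), by_char, by_byte)
```

Matching by byte makes the single-character wildcard consume only half of a two-byte character, so the pattern no longer matches. A matcher that steps through the text one character at a time, as the character-wise call does here, avoids that.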