author: Marc G. Fournier, 1998-03-15 07:39:04 +0000
committer: Marc G. Fournier, 1998-03-15 07:39:04 +0000
commit: 661ecf3c48e16a9add216287eb969d7615e47968
tree: 91b54d5905aa2e22bd0ae9ea8c6b0f3cab75d3f4 /doc/README.mb
parent: 31a925c4d07675bc098a742ee9ca642ec79a40ee
From: t-ishii@sra.co.jp
Included are patches intended to allow PostgreSQL to handle multi-byte character sets such as EUC (Extended Unix Code), Unicode and Mule internal code. With the MB patch you can use multi-byte character sets in regexp and LIKE. The encoding system is chosen at compile time. To enable the MB extension, define the variable "MB" in Makefile.global or in Makefile.custom. For further information, see README.mb in the doc directory. (Note that, unlike the "jp patch", I no longer use a modified GNU regexp. I changed Henry Spencer's regexp that comes with PostgreSQL.)
Diffstat (limited to 'doc/README.mb')
-rw-r--r-- doc/README.mb | 67
1 files changed, 67 insertions, 0 deletions
diff --git a/doc/README.mb b/doc/README.mb
new file mode 100644
index 00000000000..d6ff7e569b1
--- /dev/null
+++ b/doc/README.mb
@@ -0,0 +1,67 @@
+PostgreSQL 6.3 multi-byte (MB) patch PL2 README          Mar 10 1998
+
+ Tatsuo Ishii
+ t-ishii@sra.co.jp
+ http://www.sra.co.jp/people/t-ishii/PostgreSQL/
+
+Introduction
+
+The MB patch is intended to allow PostgreSQL to handle multi-byte
+character sets such as EUC (Extended Unix Code), Unicode and Mule
+internal code. With the MB patch you can use multi-byte character sets
+in regexp and LIKE. The encoding system is chosen at compile
+time.
+
+The patch also fixes some problems with 8-bit single-byte
+character sets, including ISO8859. (I would not say that all problems
+have been fixed. I have only confirmed that the regression tests run
+fine and that a few French characters can be used with the patch.
+Please let me know if you find any problems while using 8-bit characters.)
+
+How to use
+
+After applying the MB patch, create src/Makefile.custom containing
+the line:
+
+MB=encoding_system
+
+where encoding_system is one of:
+
+EUC_JP          Japanese EUC
+EUC_CN          Chinese EUC
+EUC_KR          Korean EUC
+EUC_TW          Taiwan EUC
+UNICODE         Unicode (UTF-8)
+MULE_INTERNAL   Mule internal
+
+Example:
+
+% cat Makefile.custom
+MB=EUC_JP
+
+If MB is not defined, nothing changes except for improved support for
+8-bit single-byte character sets.
+
+References
+
+These are good starting points for learning about the various kinds of
+encoding systems.
+
+ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
+ Detailed explanations of EUC_JP, EUC_CN, EUC_KR, EUC_TW
+ appear in section 3.2.
+
+Unicode: http://www.unicode.org/
+ The Unicode home page.
+
+ RFC 2044
+ UTF-8 is defined here.
+
+History
+
+Mar 10, 1998 PL2 released
+ * add regression test for EUC_JP, EUC_CN and MULE_INTERNAL
+ * add an English document (this file)
+ * fix problems concerning 8-bit single byte characters
+
+Mar 1, 1998 PL1 released
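
The Introduction's point that regexp and LIKE need to be multi-byte aware can be illustrated outside PostgreSQL. The following is a minimal sketch in Python, not code from the patch. It assumes Python's standard "euc_jp" and "utf-8" codecs as stand-ins for the EUC_JP and UNICODE settings, and the sample string is made up for the illustration. It shows that a byte-oriented matcher sees each character as several bytes, so a single-character wildcard no longer matches one character.

```python
import re

# Hypothetical sample text: three Japanese characters.
text = "日本語"

# Python's codecs stand in for the README's EUC_JP and UNICODE (UTF-8)
# settings; this is an assumption made for illustration only.
euc_jp_bytes = text.encode("euc_jp")
utf8_bytes = text.encode("utf-8")

char_count = len(text)          # number of characters
euc_jp_len = len(euc_jp_bytes)  # number of bytes in EUC_JP
utf8_len = len(utf8_bytes)      # number of bytes in UTF-8

# A byte-oriented "." matches exactly one byte, so it cannot match a
# whole multi-byte character; a character-aware "." can.
first_char = text[0]
bytewise_match = re.fullmatch(rb".", first_char.encode("euc_jp")) is not None
charwise_match = re.fullmatch(r".", first_char) is not None

print(char_count, euc_jp_len, utf8_len)
print(bytewise_match, charwise_match)
```

In this sketch the three characters occupy 6 bytes in EUC_JP and 9 bytes in UTF-8, and the byte-wise wildcard fails where the character-wise one succeeds. That mismatch is the kind of problem a multi-byte aware regexp and LIKE have to avoid.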