This topic describes how to create partitioned tables and replicated tables by using the CREATE TABLE
statement in AnalyticDB for MySQL and how to specify distribution keys, partition keys, indexes, partition lifecycle, and tiered storage policies of hot and cold data for tables.
Syntax
CREATE TABLE [IF NOT EXISTS] table_name
({column_name column_type [column_attributes] [ column_constraints ] [COMMENT 'column_comment']
| table_constraints}
[, ... ])
[table_attribute]
[partition_options]
[index_all]
[storage_policy]
[block_size]
[engine]
[table_properties]
[AS query_expr]
[COMMENT 'table_comment']
column_attributes:
[DEFAULT {constant | CURRENT_TIMESTAMP}]
[AUTO_INCREMENT]
column_constraints:
[{NOT NULL|NULL} ]
[PRIMARY KEY]
table_constraints:
[{INDEX|KEY} [index_name] (column_name|column_name->'$.json_path'|column_name->'$[*]')][,...]
[FULLTEXT [INDEX|KEY] [index_name] (column_name) [index_option]] [,...]
[PRIMARY KEY [index_name] (column_name,...)]
[CLUSTERED KEY [index_name] (column_name,...) [ASC|DESC]]
[[CONSTRAINT [symbol]] FOREIGN KEY (fk_column_name) REFERENCES pk_table_name (pk_column_name)][,...]
[ANN INDEX [index_name] (column_name,...) [index_option]] [,...]
table_attribute:
DISTRIBUTED BY HASH(column_name,...) | DISTRIBUTED BY BROADCAST
partition_options:
PARTITION BY
{VALUE(column_name) | VALUE(DATE_FORMAT(column_name, 'format')) | VALUE(FROM_UNIXTIME(column_name, 'format'))}
LIFECYCLE N
index_all:
INDEX_ALL= 'Y|N'
storage_policy:
STORAGE_POLICY= {'HOT'|'COLD'|'MIXED' {hot_partition_count=N}}
block_size:
BLOCK_SIZE= VALUE
engine:
ENGINE= 'XUANWU|XUANWU_V2'
By default, internal tables of AnalyticDB for MySQL use the zstd compression algorithm.
Parameters
table_name, column_name, column_type, and COMMENT
partition_options (Partition key and lifecycle)
INDEX_ALL (Indexes on all columns)
Examples
Create a partitioned table that is automatically partitioned by date
Create a partitioned table named sales that is automatically partitioned by the date value of the sale_time volume.
CREATE TABLE sales (
sale_id BIGINT NOT NULL COMMENT 'Order ID',
customer_id VARCHAR NOT NULL COMMENT 'Customer ID',
phone_num BIGINT NOT NULL COMMENT 'Phone number',
revenue DECIMAL(15, 2) COMMENT 'Total amount',
sale_time TIMESTAMP NOT NULL COMMENT 'Order time',
PRIMARY KEY (sale_time,sale_id)
)
DISTRIBUTED BY HASH(sale_id)
PARTITION BY VALUE(DATE_FORMAT(sale_time, '%Y%m%d'));
Create a partitioned table that is configured with a partition lifecycle
Create a partitioned table named customer. Specify login_time
, customer_id
, and phone_num
as the composite primary key, customer_id
as the distribution key, and login_time
as the partition key. Set the partition lifecycle to 30.
All partitions are sorted in descending order based on the values of the login_time
partition key. Only the first 30 partitions are retained. When data is written to the 31st partition, the partition with the smallest partition key value is automatically deleted.
Assume that data from the 1st (with login_time value 20231201) to the 30th (with login_time value 20231230) day is written to corresponding partitions from Partition 20231201 to Partition 20231230. When data with login_time value 20231231 is written to the database on the 31st day, the partition with the smallest login_time value (Partition 20231201) is automatically deleted. This way, only data within the last 30 days is retained.
CREATE TABLE customer (
customer_id BIGINT NOT NULL COMMENT 'Customer ID',
customer_name VARCHAR NOT NULL COMMENT 'Customer name',
phone_num BIGINT NOT NULL COMMENT 'Phone number',
city_name VARCHAR NOT NULL COMMENT 'City',
sex INT NOT NULL COMMENT 'Gender',
id_number VARCHAR NOT NULL COMMENT 'ID card number',
home_address VARCHAR NOT NULL COMMENT 'Home address',
office_address VARCHAR NOT NULL COMMENT 'Office address',
age INT NOT NULL COMMENT 'Age',
login_time TIMESTAMP NOT NULL COMMENT 'Logon time',
PRIMARY KEY (login_time,customer_id,phone_num)
)
DISTRIBUTED BY HASH(customer_id)
PARTITION BY VALUE(DATE_FORMAT(login_time, '%Y%m%d')) LIFECYCLE 30
COMMENT 'Customer information table';
Create a non-partitioned table
Create a non-partitioned table without distribution or partition keys
If you create a table that has a primary key but not a distribution key, AnalyticDB for MySQL automatically uses the primary key as the distribution key.
CREATE TABLE orders (
order_id BIGINT NOT NULL COMMENT 'Order ID',
customer_id INT NOT NULL COMMENT 'Customer ID',
order_status VARCHAR(1) NOT NULL COMMENT 'Order status',
total_price DECIMAL (15, 2) NOT NULL COMMENT 'Total amount',
order_date DATE NOT NULL COMMENT 'Order date',
PRIMARY KEY(order_id,order_date)
);
Query the statement that is used to create the table and verify that the primary key columns order_id and order_date are used as the distribution key.
SHOW CREATE TABLE orders;
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| orders | CREATE TABLE `orders` ( |
| | 'order_id' bigint NOT NULL COMMENT 'Order ID', |
| | 'customer_id' int NOT NULL COMMENT 'Customer ID', |
| | 'order_status' varchar(1) NOT NULL COMMENT 'Order status', |
| | 'total_price' decimal(15, 2) NOT NULL COMMENT 'Total amount', |
| | 'order_date' date NOT NULL COMMENT 'Order date', |
| | PRIMARY KEY (`order_id`,`order_date`) |
| | ) DISTRIBUTED BY HASH(`order_id`,`order_date`) INDEX_ALL='Y' STORAGE_POLICY='HOT' ENGINE='XUANWU' TABLE_PROPERTIES='{"format":"columnstore"}' |
+---------+-----------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)
Create a non-partitioned table without primary or distribution keys
If you create a table that does not have a primary key or a distribution key, AnalyticDB for MySQL adds the __adb_auto_id__
column to the table and uses the column as the primary key and the distribution key.
CREATE TABLE orders_new (
order_id BIGINT NOT NULL COMMENT 'Order ID',
customer_id INT NOT NULL COMMENT 'Customer ID',
order_status VARCHAR(1) NOT NULL COMMENT 'Order status',
total_price DECIMAL (15, 2) NOT NULL COMMENT 'Total amount',
order_date DATE NOT NULL COMMENT 'Order date'
);
Query the statement that is used to create the table and verify that the auto-increment column named __adb_auto_id__ is automatically added to the table and used as the primary key and the distribution key.
SHOW CREATE TABLE orders_new;
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| orders_new | CREATE TABLE `orders_new` ( |
| | `__adb_auto_id__` bigint AUTO_INCREMENT, |
| | 'order_id' bigint NOT NULL COMMENT 'Order ID', |
| | 'customer_id' int NOT NULL COMMENT 'Customer ID', |
| | 'order_status' varchar(1) NOT NULL COMMENT 'Order status', |
| | 'total_price' decimal(15, 2) NOT NULL COMMENT 'Total amount', |
| | 'order_date' date NOT NULL COMMENT 'Order date', |
| | PRIMARY KEY (`__adb_auto_id__`) |
| | ) DISTRIBUTED BY HASH(`__adb_auto_id__`) INDEX_ALL='Y' STORAGE_POLICY='HOT' ENGINE='XUANWU' TABLE_PROPERTIES='{"format":"columnstore"}' |
+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.04 sec)
Create a non-partitioned table with primary and distribution keys but without a partition key
Create a table named supplier that uses the supplier_id
auto-increment column as the distribution key and is sharded based on the hash values of supplier_id
values.
CREATE TABLE supplier (
supplier_id BIGINT AUTO_INCREMENT PRIMARY KEY,
supplier_name VARCHAR,
address INT,
phone VARCHAR
)
DISTRIBUTED BY HASH(supplier_id);
Specify storage policies
Specify a cold storage policy
CREATE TABLE item (
order_id BIGINT NOT NULL,
item_id INT NOT NULL,
quantity DECIMAL(15, 2) NOT NULL,
discount DECIMAL(15, 2) NOT NULL,
shipdate DATE NOT NULL,
PRIMARY KEY (order_id,item_id,shipdate)
)
DISTRIBUTED BY HASH(item_id)
PARTITION BY VALUE(date_format(shipdate, '%Y%m')) LIFECYCLE 200
STORAGE_POLICY='COLD';
Specify a hot storage policy
CREATE TABLE item (
order_id BIGINT NOT NULL,
item_id INT NOT NULL,
quantity DECIMAL(15, 2) NOT NULL,
discount DECIMAL(15, 2) NOT NULL,
shipdate DECIMAL NOT NULL,
PRIMARY KEY (order_id,item_id,shipdate)
)
DISTRIBUTED BY HASH(item_id)
PARTITION BY VALUE(date_format(shipdate, '%Y%m')) LIFECYCLE 200
STORAGE_POLICY='HOT';
Specify a tiered storage policy and set the number of hot partitions to 16
CREATE TABLE item (
order_id BIGINT NOT NULL,
item_id INT NOT NULL,
quantity DECIMAL(15, 2) NOT NULL,
discount DECIMAL(15, 2) NOT NULL,
shipdate DATE NOT NULL,
PRIMARY KEY (order_id,item_id,shipdate)
)
DISTRIBUTED BY HASH(item_id)
PARTITION BY VALUE(date_format(shipdate, '%Y%m')) LIFECYCLE 200
STORAGE_POLICY='MIXED' HOT_PARTITION_COUNT=16;
Create regular indexes on specific columns
Create regular indexes on the id and date columns.
CREATE TABLE index_tb (
id INT,
sales DECIMAL(15, 2),
date DATE,
INDEX (id),
INDEX (date),
PRIMARY KEY (id)
)
DISTRIBUTED BY HASH(id);
Specify a full-text index
Create a full-text index named fidx_c on the content column.
CREATE TABLE fulltext_tb (
id INT,
content VARCHAR,
keyword VARCHAR,
FULLTEXT INDEX fidx_c(content),
PRIMARY KEY (id)
)
DISTRIBUTED BY HASH(id);
For information about how to create and change a full-text index, see the "Create a full-text index for an existing table" section of the Create a full-text index topic.
For information about full-text search, see Full-text search.
Specify a vector index
Create a table that has a four-dimensional vector column of the ARRAY<SMALLINT> type named short_feature and a four-dimensional vector column of the ARRAY<FLOAT> type named float_feature.
Create vector indexes named short_feature_index and float_feature_index on the vector columns for the table.
CREATE TABLE fact_tb (
xid BIGINT NOT NULL,
cid BIGINT NOT NULL,
uid VARCHAR NOT NULL,
vid VARCHAR NOT NULL,
wid VARCHAR NOT NULL,
short_feature array<smallint>(4),
float_feature array<float>(4),
ann index short_feature_index(short_feature),
ann index float_feature_index(float_feature),
PRIMARY KEY (xid, cid, vid)
)
DISTRIBUTED BY HASH(xid) PARTITION BY VALUE(cid) LIFECYCLE 4;
For more information about vector indexing and vector search, see Vector search.
Specify a foreign key index
Create a table named store_returns
. Use the FOREIGN KEY
clause to associate the sr_item_sk
column of the store_returns table with the primary key column customer_id
of the customer
table.
CREATE TABLE store_returns (
sr_sale_id BIGINT NOT NULL PRIMARY KEY,
sr_store_sk BIGINT,
sr_item_sk BIGINT NOT NULL,
FOREIGN KEY (sr_item_sk) REFERENCES customer (customer_id)
);
Specify a JSON array index
Create a table and create a JSON array index named idx_vj on the vj column.
CREATE TABLE json(
id INT,
vj JSON,
INDEX idx_vj(vj->'$[*]')
)
DISTRIBUTED BY HASH(id);
For information about how to create and change JSON array indexes, see the "Create a JSON array index" section of the JSON indexes topic and the "JSON array indexes" section of the ALTER TABLE topic.
FAQ
Column attributes and constraints
Distribution key, partition key, and lifecycle
Indexes
Column-oriented storage
Others
Common errors and troubleshooting
References
For information about how to write data to a table, see INSERT INTO.
For information about how to insert query results into a table or overwrite specific data in a table with query results, see INSERT SELECT FROM or INSERT OVERWRITE SELECT.
For information about how to import data from data sources, such as ApsaraDB RDS, MaxCompute, and OSS, into AnalyticDB for MySQL, see Data import.