pgxc_ctl

Description

Manually configuring the individual components of a Postgres-XL cluster can be cumbersome and error-prone. The Postgres-XL Cluster Control utility, pgxc_ctl, simplifies this by handling the configuration, initialization, starting, stopping, monitoring and failover of Postgres-XL components. This section describes how to use pgxc_ctl.

Building and installing pgxc_ctl

You should build pgxc_ctl using your Postgres-XL build environment. The pgxc_ctl source code comes with the Postgres-XL source code tarball. The latest version of the source code is available at its home repository, pgxc_ctl.

If you would like to use the latest version from the pgxc_ctl home repository, get the source code tarball and expand it in the contrib directory of the source tree in your Postgres-XL build environment. If you are using pgxc_ctl from the Postgres-XL tarball, you don't have to do this.

Before building pgxc_ctl, you should build the Postgres-XL binaries, for example:

    $ cd your Postgres-XL build directory
    $ ./configure your configuration option
    $ make

Please note that you don't have to install the Postgres-XL binaries to build pgxc_ctl. Also please note that the Postgres-XL top-level make and make install commands do not build or install pgxc_ctl. You should build and install it separately.

Then you can build pgxc_ctl as follows:

    $ cd contrib/pgxc_ctl
    $ make
    $ make install

The pgxc_ctl binary will be installed in the same directory as the Postgres-XL binaries.

Postgres-XL consists of many components (or "nodes") running on various physical or virtual machines. Because pgxc_ctl relies on ssh connections between the machine where pgxc_ctl runs and the machines where the other nodes run, you should set up ssh-agent authentication to avoid typing a password each time pgxc_ctl issues an ssh command.

pgxc_ctl home directory

pgxc_ctl uses its own work directory, where it stores configuration files and logs, as well as other resources.
The default value is $HOME/pgxc_ctl, and you can specify a different one with the --home option. The pgxc_ctl home directory is referred to as $PGXC_CTL_HOME throughout this manual.

You do not have to create this directory manually. pgxc_ctl will create the directory when it is invoked for the first time. Please note that you need appropriate privileges to create $PGXC_CTL_HOME. You can also create it manually, of course. For details, please refer to later sections.

pgxc_ctl configuration file

pgxc_ctl uses a configuration file. The default name and location is $PGXC_CTL_HOME/pgxc_ctl.conf. When you change the Postgres-XL cluster configuration using pgxc_ctl commands, this file is updated. Depending on your configuration, pgxc_ctl also backs up this file.

pgxc_ctl provides the prepare command to set up a prototype of this file. For details, please refer to the command syntax of pgxc_ctl.

pgxc_ctl initialization file

You can specify your preferred parameters for pgxc_ctl behavior in /etc/pgxc_ctl and/or $HOME/.pgxc_ctl. Settings in $HOME/.pgxc_ctl have higher priority, so you can specify system-wide settings in /etc/pgxc_ctl and then your personal preferences in $HOME/.pgxc_ctl. The format of this file is described in a later section.

Running pgxc_ctl for the first time

Unless you build $PGXC_CTL_HOME and the configuration file from scratch, you should run pgxc_ctl to build your $PGXC_CTL_HOME and get a prototype of the configuration file. From your shell prompt, simply type pgxc_ctl. You will get the default prompt, which you can modify at any time through the initialization files:

    $ pgxc_ctl
    PGXC$

Try typing pwd. You will see what your $PGXC_CTL_HOME is.

    $ pgxc_ctl
    PGXC$ pwd
    /home/postgres-xl/pgxc_ctl
    PGXC$

If you specify the --home option with another directory, pgxc_ctl will start in that directory, after creating it if needed.
    $ pgxc_ctl --home /home/postgres-xl/my_pgxc_ctl
    PGXC$ pwd
    /home/postgres-xl/my_pgxc_ctl
    PGXC$

You can also specify the pgxc_ctl home directory with the environment variable PGXC_CTL_HOME, or as the variable pgxc_ctl_home in your initialization files. The command line option has the highest priority, followed by the environment variable, $HOME/.pgxc_ctl and /etc/pgxc_ctl.

Type prepare or prepare config to get a configuration template file pgxc_ctl.conf in $PGXC_CTL_HOME. You may add a file name as an option to write the configuration template to a file of your choice. For example:

    PGXC$ prepare
    PGXC$

or

    PGXC$ prepare config my_pgxc.conf
    PGXC$

You may also generate a template configuration file suitable for testing Postgres-XL on the localhost. Use the option minimal to generate such a template configuration file.

    PGXC$ prepare config minimal
    PGXC$ prepare config minimal my_minimal_pgxc.conf

If you prefer, you can start off with a completely empty cluster and add all the nodes one by one. Use the option empty to generate an empty template configuration file.

    PGXC$ prepare config empty
    PGXC$ prepare config empty my_empty_pgxc.conf

A more detailed syntax of the command is described in a later section.

Make your configuration

If you are starting with an empty configuration file, there is no real need to provide values for most of the variables. However, if you want to provide custom values for pg_hba.conf entries or additional parameters to be added to your postgresql.conf file, you need to do so before going ahead with your cluster creation. You can skip the rest of this section if you are going ahead with an empty configuration.

Please take a look at the template of the configuration file you created in the previous section. This file is actually a bash script that sets various bash variables, which are passed to pgxc_ctl the next time you run it.
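Because the configuration file is a bash script, one possible way to sanity-check it before running pgxc_ctl is to let bash source it and compare the lengths of related arrays. The following is only a sketch under stated assumptions: the sample file is hypothetical and reuses the GTM proxy variable names that appear in the generated template, and count_entries is a helper invented for this illustration, not part of pgxc_ctl.

```shell
#!/bin/sh
# Sketch only: check that related arrays in a pgxc_ctl-style configuration
# file have the same number of entries. The sample file is hypothetical.

conf=$(mktemp)
cat > "$conf" <<'EOF'
gtmProxyDir=$HOME/pgxc/nodes/gtm_pxy
gtmProxyNames=(gtm_pxy1 gtm_pxy2 gtm_pxy3 gtm_pxy4)
gtmProxyServers=(node06 node07 node08 node09)
gtmProxyPorts=(20001 20001 20001 20001)
gtmProxyDirs=($gtmProxyDir $gtmProxyDir $gtmProxyDir $gtmProxyDir)
EOF

# The configuration file uses bash arrays, so bash sources it and prints
# "name count" for every array name passed as an argument.
# count_entries is a hypothetical helper, not a pgxc_ctl feature.
count_entries() {
    bash -c '
        source "$1"
        shift
        for name in "$@"; do
            ref="${name}[@]"
            values=("${!ref}")
            echo "$name ${#values[@]}"
        done
    ' _ "$conf" "$@"
}

report=$(count_entries gtmProxyNames gtmProxyServers gtmProxyPorts gtmProxyDirs)
echo "$report"

# All counts must match the first one.
if echo "$report" | awk 'NR == 1 { n = $2 } $2 != n { bad = 1 } END { exit bad }'; then
    echo "array lengths are consistent"
else
    echo "array lengths differ" >&2
fi

rm -f "$conf"
```

To check a real file, the temporary sample would be replaced by the path of your own configuration file, and the array names by the ones you want to compare.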
The Postgres-XL configuration needs the same or similar values for each node's configuration, for example the work directory, port, etc. To avoid trivial errors, you can define such a value once as a default in a variable and refer to it later when setting up each variable. For example, a part of your template may look like this:

    #---- Shortcuts ------
    gtmProxyDir=$HOME/pgxc/nodes/gtm_pxy

    #---- Overall -------
    gtmProxy=y      # Specify y if you configure at least one GTM
                    # proxy.
                    # You may not configure gtm proxies only when
                    # you don't configure GTM slaves.
                    # If you specify this value not to y, the
                    # following parameters will be set to default
                    # empty values.
                    # If we find there're no valid Proxy server
                    # names (means, every servers are specified
                    # as none), then gtmProxy value will be set to
                    # "n" and all the entries will be set to empty
                    # values.
    gtmProxyNames=(gtm_pxy1 gtm_pxy2 gtm_pxy3 gtm_pxy4)
                    # Not used if it is not configured
    gtmProxyServers=(node06 node07 node08 node09)
                    # Specify none if you don't configure it.
    gtmProxyPorts=(20001 20001 20001 20001)
                    # Not used if it is not configured.
    gtmProxyDirs=($gtmProxyDir $gtmProxyDir $gtmProxyDir $gtmProxyDir)
                    # Not used if it is not configured.

This section specifies the GTM proxy configuration. We have four GTM proxies, one on each server. They share the same working directory path, which is defined as a shortcut and referred to later. You can use this technique in any part of the configuration file.

Please note that the working directory of this script is $PGXC_CTL_HOME, unless you change it explicitly in this configuration file.

pgxc_ctl invocation options

When you invoke pgxc_ctl from your shell, it accepts several options to control its behavior. The pgxc_ctl command format is as follows:

    pgxc_ctl [options ... ] [pgxc_command]

Options are as follows:

Specifies the configuration file. The default is pgxc_ctl.conf, or the value of the configFile variable found in the initialization file.

Specifies the $PGXC_CTL_HOME directory.
You can also specify this as the pgxc_ctl_home variable in the initialization file.

Specifies where to read pgxc_ctl commands from. There is no corresponding variable in the initialization file. The default is the standard input.

Specifies where to write the log. The path is relative to $PGXC_CTL_HOME, or to the log directory given by the log directory option or the logDir variable in the initialization file, if specified.

Specifies the directory of the log file. The default is $PGXC_CTL_HOME/pgxc_log/.

Specifies where to write pgxc_ctl output. The default is the standard output.

Specifies to run pgxc_ctl without printing many messages. This can also be set with the verbose variable in the initialization file. You can also set the message level with the logMessage and printMessage variables in the initialization file.

Prints the pgxc_ctl version and exits.

Specifies to run pgxc_ctl printing as many messages as possible.

pgxc_ctl initialization file

As described in previous sections, the pgxc_ctl behavior controlled by command line options can also be specified in advance in the initialization file /etc/pgxc_ctl or $HOME/.pgxc_ctl. The syntax is as follows:

    name value [ value ... ] # comment

Blank lines and lines beginning with '#' are ignored. If you'd like to include spaces or tabs in the variable name, enclose the name in '...' or "...". Please note that this file is not a bash script.

The names and their values are as follows:

Specifies the configuration file name. The default is pgxc_ctl.conf. This can be overridden by a command line option.

Specifies the remote temporary directory. The default is /tmp.

Specifies the directory to write the log to. Can be overridden by a command line option.

Specifies the log file name, which is relative to $PGXC_CTL_HOME/pgxc_log or to the value of the logDir variable. Can be overridden by a command line option.

Specifies the message level to print to the terminal or output file. Valid values are MANDATORY, PANIC, ERROR, WARNING, NOTICE, NOTICE2, INFO, DEBUG1, DEBUG2 and DEBUG3. The default is NOTICE.
Specifies $PGXC_CTL_HOME. The default is $HOME/pgxc_ctl or the value of the environment variable $PGXC_CTL_HOME. Can be overridden by a command line option.

Specifies the message level to print to the log file. Valid values are MANDATORY, PANIC, ERROR, WARNING, NOTICE, NOTICE2, INFO, DEBUG1, DEBUG2 and DEBUG3. The default is NOTICE.

Specifies the local temporary directory. The default is /tmp.

Specifies verbose message output from pgxc_ctl. The value should be y or n. Can be overridden by a command line option.

Specifies the pgxc_ctl prompt. The default is 'PGXC$ '.

A typical example of this initialization file is as follows:

    $ cat ~/.pgxc_ctl
    xc_prompt 'PGXC$ '
    verbose y
    logMessage 'DEBUG3'
    printMessage 'DEBUG1'
    printLocation y
    $

Postgres-XL basics and its resources

Postgres-XL components

Postgres-XL consists of the following components. Each component may be called a node, which does not necessarily correspond to a physical or virtual server, because you can configure more than one node on one physical or virtual server. You should consider how many of them to configure. Hereafter, we call such a physical or virtual server a server.

GTM

GTM stands for global transaction manager. You must have one in the cluster. For production, the GTM should be configured on a separate server. The GTM can have a slave to which it can be failed over. The GTM slave should preferably be installed on a separate server, but it can be installed on one of the other servers where you have a gtm_proxy, Coordinators and Datanodes.

GTM-Proxy

The GTM proxy reduces the communication load between Coordinators and the GTM and helps with GTM failover. You should configure one gtm_proxy on each server where you have a Coordinator or Datanode, as described below.

Coordinator

The Coordinator handles application connections and statement handling. For simplicity and load balancing, it is a good idea to install a Coordinator on each server other than those where the GTM (and GTM slave) are configured. Coordinators can have a slave.
Slaves can be configured on one of the servers where another Coordinator master is installed.

Datanode

A Datanode stores the data and runs local SQL statements supplied by a Coordinator. A Datanode should also be configured on all the servers except those for the GTM (and GTM slave).

Common resource assignment and configuration practice

Each component requires the following resources:

- Host name or IP address that you can resolve through DNS, /etc/hosts or equivalent means
- Port
- Work directory

Also, Coordinators and Datanodes need an additional port for connection pooling to other nodes.

On the same host, you must not assign the same port or the same work directory to more than one node. pgxc_ctl checks this. When assigning a port, you should be careful not to use one that is already assigned to another service.

Also, please note the following:

- You should not assign the same port to the GTM master and GTM slave.
- GTM, Coordinators and Datanodes can configure their slaves. pgxc_ctl does not support cascaded slaves or more than one slave per Coordinator or Datanode. This is a restriction of pgxc_ctl, not of Postgres-XL.
- At present, Coordinator and Datanode slaves are connected using synchronous replication in pgxc_ctl. This is not a Postgres-XL restriction either. In the future, asynchronous, cascaded and multiple slaves may be supported.

Configuration

As described in the previous section, you can configure your Postgres-XL cluster by editing pgxc_ctl.conf or other configuration files manually. But editing the file can be a bit of work. A better way is to start off with an empty configuration file. The pgxc_ctl utility supports three types of templates, as shown below.

    PGXC$ prepare config empty

or

    PGXC$ prepare config minimal

or

    PGXC$ prepare config

The default pgxc_ctl.conf file can be found in the $HOME/pgxc_ctl directory.
You can edit it to configure your Postgres-XL cluster, or you can choose to start with an empty cluster and add components one by one. If the configuration gets messed up, you can create a fresh template of your choice again with the appropriate prepare config command. You can also specify your own custom name for the configuration file, as shown below:

    PGXC$ prepare config empty my_config.conf

Then you can edit this file to configure and customize your Postgres-XL cluster. This configuration file is basically a bash script that declares many variables to define the cluster configuration. Although it might seem confusing at first, the template values and comments make it easy to understand what each of these variables means.

You can also generate a minimal configuration file, good enough to test Postgres-XL on the localhost, by specifying minimal. For example:

    PGXC$ prepare config minimal
    PGXC$ prepare config minimal my_minimal_config.conf

Given below is a description of the various variables, in the order in which they appear in the configuration file.

Overall

Option to back up the configuration file to a remote server. Specify y if you'd like to back up the configuration file, n otherwise.

Name of the configuration backup file. Effective when configuration file backup is enabled.

Host name (or IP address) of the server where you back up the configuration file. Effective when configuration file backup is enabled.

Local directory used by pgxc_ctl itself. You need full access to this directory. This parameter was left here for compatibility with the bash version. It is recommended to configure this parameter in the initialization file.

Postgres-XL should be installed at least on the server where you are running pgxc_ctl. This variable specifies the installation directory, as you specify with the corresponding option to the configure script. The whole installation will be copied to the same directory on each server, and you should grant appropriate privileges on this directory in advance.
Name of the database user who owns the whole Postgres-XL database. This can be different from $pgxcUser. In the present version, however, we assume these two are the same.

Name of the operating system user you log in as to act as the Postgres-XL owner. At present, this should be the same as $pgxcOwner.

Directory used for work on each server except the one where pgxc_ctl runs. You need full access to this directory on all the servers. This parameter was left here for compatibility with the bash version. It is recommended to configure this parameter in the initialization file.

GTM

Node name of the GTM master.

If you'd like to add specific configuration parameters to both the GTM master and slave, specify the file that contains such lines for the gtm.conf file. Otherwise, specify none.

Work directory for the GTM master.

Listening port number of the GTM master.

Host name where the GTM master runs.

If you'd like to add specific configuration parameters only to the GTM master, specify the file that contains such lines for the gtm.conf file. Otherwise, specify none.

Option to enable a GTM slave. Specify y to enable, n otherwise.

Node name of the GTM slave.

Work directory for the GTM slave.

Listening port number of the GTM slave.

Host name where the GTM slave runs. Effective only when a GTM slave is configured.

If you'd like to add specific configuration parameters only to the GTM slave, specify the file that contains such lines for the gtm.conf file. Otherwise, specify none.

GTM Proxy

Specifies whether you configure any GTM proxy in your Postgres-XL cluster. Specify y if you do, n otherwise. If you specify n, all the other parameters for gtm_proxy will be ignored.

Shortcut used to assign the same work directory to all GTM proxies. You don't have to worry about it when you specify these values manually.

Specify the work directory for each GTM proxy.
If you'd like to add configuration values to all GTM proxies, specify the name of the file that contains such lines for gtm_proxy.conf. Otherwise specify none.

Specify a unique name for each GTM proxy. This is an array. In the template, we have four servers for Coordinators and Datanodes, and we have four GTM proxies as well.

Specify the listening port number for each GTM proxy.

Specify the host name where each GTM proxy runs. Specify the server names in the same order as $gtmProxyNames.

If you'd like to add specific configuration values to each GTM proxy, specify the names of the files with such lines for gtm_proxy.conf. Otherwise specify none.

Coordinators

Shortcut to assign the same WAL archive directory to all Coordinator slaves. Not needed if you specify these manually.

Array of WAL archive log directories for each Coordinator slave. If you don't configure Coordinator slaves and set the coordSlave variable to n, you don't have to worry about this variable.

If you would like to add extra configuration values for all Coordinators, specify the name of the file containing such lines for postgresql.conf. Specify none otherwise.

Name of the file that contains entries for the pg_hba.conf file for all Coordinators. Specify none if you do not have such a file.

Shortcut to assign the same work directory to all Coordinator masters. Not needed if you specify these manually.

Array of Coordinator master work directories.

Array of the host names where each Coordinator master runs. Specify these in the same order as the Coordinator names.

Shortcut to assign the same value to each member of the Coordinator max_wal_senders array. Not needed if you assign the values manually.

Array of Coordinator max_wal_senders values. Note that a master and its slave share the same value of this variable.

Array to specify Coordinator names. A Coordinator slave uses the same name as its master.

Array of CIDR addresses to be added to pg_hba.conf. A pg_hba.conf entry for the owner user will be created for each address.

Array of the listening port numbers for each Coordinator.

Array of the port numbers for each pooler.
The pooler takes care of the connections between a Coordinator and the Datanodes and needs a separate port.

Specify y if you configure Coordinator slaves, n otherwise. If you specify n, all the other variables for Coordinator slaves will be ignored.

Shortcut to assign the same work directory to all the Coordinator slaves. Not needed if you specify these manually.

Array of work directories for each Coordinator slave.

Array of the host names where the slave of each Coordinator runs. Specify none if you don't configure a slave for a specific Coordinator.

Array of the listening port numbers for each Coordinator slave.

Array of the port numbers for each pooler. The pooler takes care of the connections between a Coordinator and the Datanodes and needs a separate port.

Array of the names of the files that contain extra configuration values for each Coordinator. Specify none if you don't have such a file.

Datanodes

Shortcut to assign the same WAL archive directory to all the Datanode slaves. Not needed if you specify these manually.

Array of WAL archive log directories for each Datanode slave.

If you would like to add extra configuration values for all the Datanodes, specify the name of the file containing such lines for postgresql.conf. Specify none otherwise.

Name of the file that contains entries for the pg_hba.conf file of all the Datanodes. Specify none if you don't have such a file.

Shortcut to assign the same work directory to all Datanode masters. Not needed if you specify these manually.

Array of Datanode masters' work directories.

Array of Datanode masters' XLOG directories.

Array of the host names where each Datanode master runs. Specify in the order of $coordNames above.

Shortcut to assign the same value to each member of datanodeMaxWalSenders. Not needed if you assign the values manually.

Array of Datanode max_wal_senders values.

Array to specify Datanode names.

Array of CIDR addresses to be added to pg_hba.conf. A pg_hba.conf entry for the $pgxcOwner user will be created for each address.
Array of the listening port numbers for each Datanode.

Array of the port numbers for each pooler. The pooler takes care of the connections between a Coordinator and the Datanodes and needs a separate port.

Specify y if you configure Datanode slaves, n otherwise. If you specify n, all the other variables for Datanode slaves will be ignored.

Shortcut to assign the same work directory to all Datanode slaves. Not needed if you specify these manually.

Array of work directories for each Datanode slave.

Array of XLOG directories for each Datanode slave.

Array of the host names where the slave of each Datanode runs. Specify none if you don't configure a slave for a specific Datanode.

Array of the listening port numbers for each Datanode slave.

Array of the port numbers for each pooler. The pooler takes care of the connections between a Coordinator and the Datanodes and needs a separate port.

Array of the names of the files that contain extra configuration values for each Datanode. Specify none if you don't have such a file.

Array of the names of the files that contain specific extra pg_hba.conf entries for each Datanode. Specify none if you don't have such a file.

Specify the name of the primary node. This must be one of the names in $datanodeNames. If you don't want a primary node, specify N/A or none.

pgxc_ctl commands

pgxc_ctl command names and literal options are case-insensitive. Other options are case-sensitive. Any other command is passed to your shell. When the shell command finishes, control returns to pgxc_ctl.

    add gtm master name host port dir
    add gtm slave name host port dir
    add gtm_proxy name host port dir
    add coordinator master name host port pooler dir extraServerConf extraPgHbaConf
    add coordinator slave name host port pooler dir archDir
    add datanode master name host port pooler dir waldir extraServerConf extraPgHbaConf
    add datanode slave name host port pooler dir waldir archDir

Adds the specified node to your Postgres-XL cluster. Each node needs a host name and its work directory.
The GTM master, GTM slave, GTM proxy, Coordinator master/slave and Datanode master/slave each need their own port to listen on. Coordinators and Datanodes also need a pooler port to pool connections to Datanodes. Coordinator and Datanode slaves need a directory to receive WAL segments from their master. When adding a Coordinator master or a Datanode master, extra server configuration and extra pg_hba configuration parameters can be specified in a file. A separate XLOG directory may also be specified for a Datanode master and slave.

When you add a Coordinator or Datanode master, the node information at all of the Coordinators will be updated with the new node, and a gtm_proxy will be selected automatically based on where the new node runs. You cannot add a slave without its master.

Typically, when you start with an empty configuration file, you first add your GTM node. Then you add your first Coordinator master, and then the first Datanode master. When you add a Coordinator master and it is the first Coordinator in the cluster, it starts up on its own with empty node metadata. Otherwise the new Coordinator master connects to any existing Coordinator and gets the existing node metadata of the cluster. When you add a Datanode master and it is the first Datanode, it connects to any existing Coordinator to get the node metadata. Otherwise the Datanode master connects to any existing Datanode and gets the current metadata from it.

    Createdb [ - coordinator ] createdb_option ...

Invokes the createdb utility to create a new database using the specified Coordinator. If no Coordinator is specified, pgxc_ctl chooses one of the available ones.

    Createuser [ - coordinator ] createuser_option ...

Invokes the createuser utility to create a new user using the specified Coordinator. If no Coordinator is specified, pgxc_ctl chooses one of the available ones.

    deploy [ all | host ... ]

Deploys Postgres-XL binaries and other installation material to the specified hosts.
If "all" is specified, they will be deployed to all hosts found in the configuration file. If a list of hosts is specified, deployment will be done to all the specified hosts, regardless of whether they are found in the configuration file or not. The target directory is taken from the installation directory variable in the configuration file.

    failover [ gtm | coordinator nodename | datanode nodename | nodename ]

Fails the specified node over to its slave.

    init [force] all
    init [force] nodename ...
    init [force] gtm [ master | slave | all ]
    init [force] gtm_proxy [ all | nodename ... ]
    init [force] coordinator nodename ...
    init [force] coordinator [ master | slave ] [ all | nodename ... ]
    init [force] datanode nodename ...
    init [force] datanode [ master | slave ] [ all | nodename ... ]

Initializes the specified nodes. At initialization, the working directory of each component is created if it does not exist. If it exists and force is specified, all contents under the working directory are removed. Without the force option, existing non-empty directories are not cleaned and the server starts with the existing data. When the "all" option is specified, node information is set up at each Coordinator.

    kill all
    kill nodename ...
    kill gtm [ master | slave | all ]
    kill gtm_proxy [ all | nodename ... ]
    kill coordinator nodename ...
    kill coordinator [ master | slave ] [ all | nodename ... ]
    kill datanode nodename ...
    kill datanode [ master | slave ] [ all | nodename ... ]

Kills the specified node. If nodename is specified and it has both a master and a slave, both the master and the slave are chosen. When components are killed, their ports are cleaned up too.

    log [ variable | var ] varname
    log [ message | msg ] message_body

Prints the specified contents to the log file. The variable or var option writes the specified variable name and its value. The message or msg option writes the specified message.

    monitor all
    monitor nodename ...
    monitor gtm [ master | slave | all ]
    monitor gtm_proxy [ all | nodename ... ]
    monitor coordinator nodename ...
    monitor coordinator [ master | slave ] [ all | nodename ... ]
    monitor datanode nodename ...
    monitor datanode [ master | slave ] [ all | nodename ... ]

Checks whether the specified nodes are running.

    prepare [ path ]

Writes a pgxc_ctl configuration file template to the specified file. If the path option is not specified, the target file is the default configuration file, or the file specified by the configFile option in /etc/pgxc_ctl or ~/.pgxc_ctl. If you specify a relative path, it is resolved against pgxc_ctl_home.

    psql [ - coordinator ] psql_option ...

Invokes psql targeted at the specified Coordinator. If no Coordinator is specified, pgxc_ctl chooses one of the available ones.

    q | quit | exit

Exits pgxc_ctl. This command has no options.

    reconnect gtm_proxy [ all | nodename ... ]

Reconnects the specified gtm_proxy to a new GTM. This is needed after you fail over a GTM to its slave.

    remove gtm master [ clean ]
    remove gtm slave [ clean ]
    remove gtm_proxy nodename [ clean ]
    remove coordinator [ master | slave ] nodename [ clean ]
    remove datanode [ master | slave ] nodename [ clean ]

Removes the specified node from the cluster. If the clean option is specified, the work directory and listening socket are cleared.

    set varname value ...

Sets a variable value. You can assign multiple values to a variable. In this case, simply specify the values separated by spaces.

    show [ variable | var ] [ all | varname ... ]

Displays the configuration, or a variable name and its value.

    start all
    start nodename ...
    start gtm [ master | slave | all ]
    start gtm_proxy [ all | nodename ... ]
    start coordinator nodename ...
    start coordinator [ master | slave ] [ all | nodename ... ]
    start datanode nodename ...
    start datanode [ master | slave ] [ all | nodename ... ]

Starts the specified node.

    stop [ -m smart | fast | immediate ] all
    stop gtm [ master | slave | all ]
    stop gtm_proxy [ all | nodename ... ]
    stop [ -m smart | fast | immediate ] coordinator nodename ...
    stop [ -m smart | fast | immediate ] coordinator [ master | slave ] [ all | nodename ... ]
    stop [ -m smart | fast | immediate ] datanode nodename ...
    stop [ -m smart | fast | immediate ] datanode [ master | slave ] [ all | nodename ... ]

Stops the specified node. For Datanodes and Coordinators, you can specify the stop mode as in the "pg_ctl stop" command. When you stop a Coordinator or Datanode slave, the master is reconfigured to remove synchronous replication.

    unregister unregister_option ...

Unregisters the specified node from the GTM. This may be needed when starting a new node after a node crashes. unregister_option is one of the following:

name: Specifies the name of the node to unregister.

{ gtm | gtm_proxy | gtm_proxy_postmaster | coordinator | datanode }: Specifies the category of the specified node.
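The add sequence described earlier for an empty configuration (GTM first, then the first Coordinator master, then the first Datanode master) can be sketched as a small shell script that only prints the pgxc_ctl commands in that order. This is an illustration under stated assumptions: all node names, the host name, the ports and the directories are hypothetical, print_add_commands is a helper invented here, and none is passed for the optional configuration files and the separate WAL directory on the assumption that the add command accepts none there, as the configuration variables do.

```shell
#!/bin/sh
# Sketch only: print pgxc_ctl "add" commands for a first, minimal cluster
# in the order described above. Names, host, ports and paths are hypothetical.

host=localhost
root=/tmp/pgxc_demo        # hypothetical root for the node work directories

print_add_commands() {
    # add gtm master name host port dir
    echo "add gtm master gtm $host 20001 $root/gtm"
    # add coordinator master name host port pooler dir extraServerConf extraPgHbaConf
    echo "add coordinator master coord1 $host 30001 30011 $root/coord_master.1 none none"
    # add datanode master name host port pooler dir waldir extraServerConf extraPgHbaConf
    echo "add datanode master dn1 $host 40001 40011 $root/dn_master.1 none none none"
    # Afterwards, check whether the new nodes are running.
    echo "monitor all"
}

commands=$(print_add_commands)
echo "$commands"
```

The printed lines can then be typed at the PGXC$ prompt one by one, after adapting the names, ports and directories to your own servers.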