Create common infrastructure for cross-version upgrade testing.

To test pg_upgrade across major PG versions, we have to be able to modify or drop any old objects with no-longer-supported properties, and we have to be able to deal with cosmetic changes in pg_dump output. Up to now, the buildfarm and pg_upgrade's own test infrastructure had separate implementations of the former, and we had nothing but very ad-hoc rules for the latter (including an arbitrary threshold on how many lines of unchecked diff were okay!). This patch creates a Perl module that can be shared by both those use-cases, and adds logic that deals with pg_dump output diffs in a much more tightly defined fashion. This largely supersedes previous efforts in commits 0df9641d3, 9814ff550, and 62be9e4cd, which developed a SQL-script-based solution for the task of dropping old objects. There was nothing fundamentally wrong with that work in itself, but it had no basis for solving the output-formatting problem. The most plausible way to deal with formatting is to build a Perl module that can perform editing on the dump files; and once we commit to that, it makes more sense for the same module to also embed the knowledge of what has to be done for dropping old objects. Back-patch versions of the helper module as far as 9.2, to support buildfarm animals that still test that far back. It's also necessary to back-patch PostgreSQL/Version.pm, because the new code depends on that. I fixed up pg_upgrade's 002_pg_upgrade.pl in v15, but did not look into back-patching it further than that. Tom Lane and Andrew Dunstan Discussion: https://postgr.es/m/891521.1673657296@sss.pgh.pa.us
author: Tom Lane 2023-01-17 01:35:53 +0000
committer: Tom Lane 2023-01-17 01:35:55 +0000
commit: 52585f8f072ac187380f7e02183e87dcf6789ff0 (patch)
tree: b129468957ad9c1c3cd8cb84514ae80950ea2c7f /src/bin
parent: 980ae173108e353045e5ab4a842bb21e9dfe6715 (diff)
3 files changed, 62 insertions, 190 deletions
diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING
index 98286231d7..81a4324a76 100644
--- a/src/bin/pg_upgrade/TESTING
+++ b/src/bin/pg_upgrade/TESTING
@@ -10,31 +10,14 @@ This will run the TAP tests to run pg_upgrade, performing an upgrade
 from the version in this source tree to a new instance of the same
 version.
 
-Testing an upgrade from a different version requires a dump to set up
-the contents of this instance, with its set of binaries.  The following
-variables are available to control the test (see DETAILS below about
-the creation of the dump):
+Testing an upgrade from a different PG version is also possible, and
+provides a more thorough test that pg_upgrade does what it's meant for.
+This requires both a source tree and an installed tree for the old
+version, as well as a dump file to set up the instance to be upgraded.
+The following environment variables must be set to enable this testing:
 export olddump=...somewhere/dump.sql	(old version's dump)
 export oldinstall=...otherversion/	(old version's install base path)
-
-"filter_rules" is a variable that can be used to specify a file with custom
-filtering rules applied before comparing the dumps of the PostgreSQL
-instances near the end of the tests, in the shape of regular expressions
-valid for perl.  This is useful to enforce certain validation cases where
-pg_dump could create inconsistent outputs across major versions.
-For example:
-
-	# Remove all CREATE POLICY statements
-	s/^CREATE\sPOLICY.*//mgx
-	# Replace REFRESH with DROP for materialized views
-	s/^REFRESH\s(MATERIALIZED\sVIEW)/DROP $1/mgx
-
-Lines beginning with '#' and empty lines are ignored.  One rule can be
-defined per line.
-
-Finally, the tests can be done by running
-
-	make check
+See DETAILS below for more information about creation of the dump.
 
 You can also test the different transfer modes (--copy, --link,
 --clone) by setting the environment variable PG_TEST_PG_UPGRADE_MODE
@@ -52,22 +35,32 @@ The most effective way to test pg_upgrade, aside from testing on user
 data, is by upgrading the PostgreSQL regression database.
 
 This testing process first requires the creation of a valid regression
-database dump that can be then used for $olddump.  Such files contain
+database dump that can then be used for $olddump.  Such files contain
 most database features and are specific to each major version of Postgres.
 
 Here are the steps needed to create a dump file:
 
 1)  Create and populate the regression database in the old cluster.
     This database can be created by running 'make installcheck' from
-    src/test/regress using its source code tree.
+    src/test/regress in the old version's source code tree.
 
-2)  Use pg_dumpall to dump out the contents of the instance, including the
-    regression database, in the shape of a SQL file.  This requires the *old*
-    cluster's pg_dumpall so as the dump created is compatible with the
-    version of the cluster it is dumped into.
+    If you like, you can also populate regression databases for one or
+    more contrib modules by running 'make installcheck USE_MODULE_DB=1'
+    in their directories.  (USE_MODULE_DB is essential so that the
+    pg_upgrade test script will understand which database is which.)
 
-Once the dump is created, it can be repeatedly used with $olddump and
-`make check`, that automates the dump of the old database, its upgrade,
-the dump out of the new database and the comparison of the dumps between
-the old and new databases.  The contents of the dumps can also be manually
-compared.
+2)  Use pg_dumpall to dump out the contents of the instance, including the
+    regression database(s), into a SQL file.  Use the *old* version's
+    pg_dumpall so that the dump created is compatible with that version.
+
+Once the dump file is created, it can be used repeatedly.  Set $olddump
+to point to the dump file and run 'make check' or 'make installcheck'
+in the new version's src/bin/pg_upgrade directory.  (If you included any
+contrib databases in the old dump, you must use 'make installcheck' and
+ensure that the corresponding contrib modules have been installed in
+the new version's installation tree.)  This will build a temporary cluster
+using the old installation's executables, populate it from the dump file,
+and then try to pg_upgrade it to the new version.  Success is reported
+if pg_dumpall output matches between the pre-upgrade and post-upgrade
+databases.  In case of trouble, manually comparing those dump files may
+help to isolate the problem.
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index c066fd7d93..62a8fa9d8b 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -10,6 +10,7 @@ use File::Path qw(rmtree);
 
 use PostgreSQL::Test::Cluster;
 use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::AdjustUpgrade;
 use Test::More;
 
 # Can be changed to test the other modes.
@@ -37,37 +38,16 @@ sub generate_db
 # This returns the path to the filtered dump.
 sub filter_dump
 {
-	my ($node, $dump_file) = @_;
+	my ($is_old, $old_version, $dump_file) = @_;
 	my $dump_contents = slurp_file($dump_file);
 
-	# Remove the comments.
-	$dump_contents =~ s/^\-\-.*//mgx;
-	# Remove empty lines.
-	$dump_contents =~ s/^\n//mgx;
-
-	# Apply custom filtering rules, if any.
-	if (defined($ENV{filter_rules}))
+	if ($is_old)
 	{
-		my $filter_file = $ENV{filter_rules};
-		die "no file with custom filter rules found!" unless -e $filter_file;
-
-		open my $filter_handle, '<', $filter_file
-		  or die "could not open $filter_file";
-		while (<$filter_handle>)
-		{
-			my $filter_line = $_;
-
-			# Skip comments and empty lines
-			next if ($filter_line =~ /^#/);
-			next if ($filter_line =~ /^\s*$/);
-
-			# Apply lines with filters.
-			note "Applying custom rule $filter_line to $dump_file";
-			my $filter = "\$dump_contents =~ $filter_line";
-			## no critic (ProhibitStringyEval)
-			eval $filter;
-		}
-		close $filter_handle;
+		$dump_contents = adjust_old_dumpfile($old_version, $dump_contents);
+	}
+	else
+	{
+		$dump_contents = adjust_new_dumpfile($old_version, $dump_contents);
 	}
 
 	my $dump_file_filtered = "${dump_file}_filtered";
@@ -83,7 +63,7 @@ sub filter_dump
 # that gets upgraded.  Before running the upgrade, a logical dump of the
 # old cluster is taken, and a second logical dump of the new one is taken
 # after the upgrade.  The upgrade test passes if there are no differences
-# in these two dumps.
+# (after filtering) in these two dumps.
 
 # Testing upgrades with an older version of PostgreSQL requires setting up
 # two environment variables, as of:
@@ -198,15 +178,29 @@ my $oldbindir = $oldnode->config_data('--bindir');
 # only if different major versions are used for the dump.
 if (defined($ENV{oldinstall}))
 {
-	# Note that upgrade_adapt.sql and psql from the new version are used,
-	# to cope with an upgrade to this version.
-	$newnode->command_ok(
-		[
-			'psql', '-X',
-			'-f',   "$srcdir/src/bin/pg_upgrade/upgrade_adapt.sql",
-			'-d',   $oldnode->connstr('regression'),
-		],
-		'ran adapt script');
+	# Consult AdjustUpgrade to find out what we need to do.
+	my $dbnames =
+	  $oldnode->safe_psql('postgres', qq(SELECT datname FROM pg_database));
+	my %dbnames;
+	do { $dbnames{$_} = 1; }
+	  foreach split /\s+/s, $dbnames;
+	my $adjust_cmds =
+	  adjust_database_contents($oldnode->pg_version, %dbnames);
+
+	foreach my $updb (keys %$adjust_cmds)
+	{
+		my $upcmds = join(";\n", @{ $adjust_cmds->{$updb} });
+
+		# For simplicity, use the newer version's psql to issue the commands.
+		$newnode->command_ok(
+			[
+				'psql', '-X',
+				'-v',   'ON_ERROR_STOP=1',
+				'-c',   $upcmds,
+				'-d',   $oldnode->connstr($updb),
+			],
+			"ran version adaptation commands for database $updb");
+	}
 }
 
 # Take a dump before performing the upgrade as a base comparison. Note
@@ -359,8 +353,8 @@ my $dump1_filtered = $dump1_file;
 my $dump2_filtered = $dump2_file;
 if ($oldnode->pg_version != $newnode->pg_version)
 {
-	$dump1_filtered = filter_dump($oldnode, $dump1_file);
-	$dump2_filtered = filter_dump($newnode, $dump2_file);
+	$dump1_filtered = filter_dump(1, $oldnode->pg_version, $dump1_file);
+	$dump2_filtered = filter_dump(0, $oldnode->pg_version, $dump2_file);
 }
 
 # Compare the two dumps, there should be no differences.
@@ -371,7 +365,7 @@ is($compare_res, 0, 'old and new dumps match after pg_upgrade');
 if ($compare_res != 0)
 {
 	my ($stdout, $stderr) =
-	  run_command([ 'diff', $dump1_filtered, $dump2_filtered ]);
+	  run_command([ 'diff', '-u', $dump1_filtered, $dump2_filtered ]);
 	print "=== diff of $dump1_filtered and $dump2_filtered\n";
 	print "=== stdout ===\n";
 	print $stdout;
diff --git a/src/bin/pg_upgrade/upgrade_adapt.sql b/src/bin/pg_upgrade/upgrade_adapt.sql
deleted file mode 100644
index a368a51ff5..0000000000
--- a/src/bin/pg_upgrade/upgrade_adapt.sql
+++ /dev/null
@@ -1,115 +0,0 @@
---
--- SQL queries for upgrade tests across different major versions.
---
--- This file includes a set of SQL queries to make a cluster to-be-upgraded
--- compatible with the version this file is based on.  Note that this
--- requires psql, as per-version queries are controlled with a set of \if
--- clauses.
-
--- This script is backward-compatible, so it is able to work with any version
--- newer than 9.2 we are upgrading from, up to the branch this script is stored
--- on (even if this would not run if running pg_upgrade with the same version
--- for the origin and the target).
-
--- \if accepts a simple boolean value, so all the version checks are
--- saved based on this assumption.
-SELECT
-  ver <= 902 AS oldpgversion_le92,
-  ver <= 904 AS oldpgversion_le94,
-  ver <= 906 AS oldpgversion_le96,
-  ver <= 1000 AS oldpgversion_le10,
-  ver <= 1100 AS oldpgversion_le11,
-  ver <= 1300 AS oldpgversion_le13,
-  ver <= 1500 AS oldpgversion_le15
-  FROM (SELECT current_setting('server_version_num')::int / 100 AS ver) AS v;
-\gset
-
--- Objects last appearing in 9.2.
-\if :oldpgversion_le92
--- Note that those tables are removed from the regression tests in 9.3
--- and newer versions.
-DROP TABLE abstime_tbl;
-DROP TABLE reltime_tbl;
-DROP TABLE tinterval_tbl;
-\endif
-
--- Objects last appearing in 9.4.
-\if :oldpgversion_le94
--- This aggregate has been fixed in 9.5 and later versions, so drop
--- and re-create it.
-DROP AGGREGATE array_cat_accum(anyarray);
-CREATE AGGREGATE array_larger_accum (anyarray) (
-                  sfunc = array_larger,
-                  stype = anyarray,
-                  initcond = $${}$$);
--- This operator has been fixed in 9.5 and later versions, so drop and
--- re-create it.
-DROP OPERATOR @#@ (NONE, bigint);
-CREATE OPERATOR @#@ (PROCEDURE = factorial,
-                     RIGHTARG = bigint);
-\endif
-
--- Objects last appearing in 9.6.
-\if :oldpgversion_le96
-DROP FUNCTION public.oldstyle_length(integer, text);
-\endif
-
--- Objects last appearing in 10.
-\if :oldpgversion_le10
-DROP FUNCTION IF EXISTS boxarea(box);
-DROP FUNCTION IF EXISTS funny_dup17();
-\endif
-
--- Objects last appearing in 11.
-\if :oldpgversion_le11
--- WITH OIDS is supported until v11, so remove its support for any
--- relations marked as such.
-DO $stmt$
-  DECLARE
-    rec text;
-  BEGIN
-  FOR rec in
-    SELECT oid::regclass::text
-    FROM pg_class
-    WHERE relname !~ '^pg_'
-      AND relhasoids
-      AND relkind in ('r', 'f')
-    ORDER BY 1
-  LOOP
-    EXECUTE 'ALTER TABLE ' || quote_ident(rec) || ' SET WITHOUT OIDS';
-  END LOOP;
-  END; $stmt$;
-\endif
-
--- Objects last appearing in 13.
-\if :oldpgversion_le13
--- Until v10, operators could only be dropped one at a time, so be careful
--- to stick with one command for each drop here.
-DROP OPERATOR public.#@# (pg_catalog.int8, NONE);
-DROP OPERATOR public.#%# (pg_catalog.int8, NONE);
-DROP OPERATOR public.!=- (pg_catalog.int8, NONE);
-DROP OPERATOR public.#@%# (pg_catalog.int8, NONE);
-\endif
-
--- Objects last appearing in 15.
--- The internal format of "aclitem" has changed in 16, so replace it with
--- text type in tables.
-\if :oldpgversion_le15
-DO $stmt$
-  DECLARE
-    rec record;
-  BEGIN
-  FOR rec in
-    SELECT oid::regclass::text as rel, attname as col
-    FROM pg_class c, pg_attribute a
-    WHERE c.relname !~ '^pg_'
-      AND c.relkind IN ('r')
-      AND a.attrelid = c.oid
-      AND a.atttypid = 'aclitem'::regtype
-    ORDER BY 1
-  LOOP
-    EXECUTE 'ALTER TABLE ' || quote_ident(rec.rel) || ' ALTER COLUMN ' ||
-      quote_ident(rec.col) || ' SET DATA TYPE text';
-  END LOOP;
-  END; $stmt$;
-\endif
author	Tom Lane	2023-01-17 01:35:53 +0000
committer	Tom Lane	2023-01-17 01:35:55 +0000
commit	52585f8f072ac187380f7e02183e87dcf6789ff0 (patch)
tree	b129468957ad9c1c3cd8cb84514ae80950ea2c7f /src/bin
parent	980ae173108e353045e5ab4a842bb21e9dfe6715 (diff)