Correctifs appliqués

Heikki Linnakangas pushed:

Robert Haas pushed:

  • psql: Fix incorrect version check for table partitioning. Table partitioning was added in 10, not 9.6. Fabrízio de Royes Mello, per report from Jeff Janes
  • doc: Fix purported type of pg_am.amhandler to match reality. Joel Jacobson
  • Remove should_free arguments to tuplesort routines. Since commit e94568ecc10f2638e542ae34f2990b821bbf90ac, the answer is always "false", and we do not need to complicate the API by arranging to return a constant value. Peter Geoghegan Discussion:
  • doc: Improve documentation related to table partitioning feature. Commit f0e44751d7175fa3394da2c8f85e3ceb3cdbfe63 implemented table partitioning, but failed to mention the "no row movement" restriction in the documentation. Fix that and a few other issues. Amit Langote, with some additional wordsmithing by me.
  • Update typedefs.list. So developers can more easily run pgindent locally
  • Clean up code, comments, and formatting for table partitioning. Amit Langote, plus pgindent-ing by me. Inspired in part by review comments from Tomas Vondra.
  • Fix bug in hashbulkdelete. Commit 6d46f4783efe457f74816a75173eb23ed8930020 failed to account for the possibility that hashbulkdelete() might encounter a bucket that has been split since it began scanning the bucket array. Repair. Extracted from a larger pathc by Amit Kapila; I rewrote the comment.
  • Fix bugs in RelationGetPartitionDispatchInfo. The previous coding was not quite right for cases involving multiple levels of partitioning. Amit Langote
  • Remove _hash_wrtbuf() in favor of calling MarkBufferDirty(). The whole concept of _hash_wrtbuf() is that we need to know at the time we're releasing the buffer lock (and pin) whether we dirtied the buffer, but this is easy to get wrong. This patch actually fixes one non-obvious bug of that form: hashbucketcleanup forgot to signal _hash_squeezebucket, which gets the primary bucket page already locked, as to whether it had already dirtied the page. Calling MarkBufferDirty() at the places where we dirty the buffer is more intuitive and lets us simplify the code in various places as well. On top of all that, the ultimate goal here is to make hash indexes WAL-logged, and as the comments to _hash_wrtbuf() note, it should go away when that happens. Making it go away a little earlier than that seems like a good preparatory step. Report by Jeff Janes. Diagnosis by Amit Kapila, Kuntal Ghosh, and Dilip Kumar. Patch by me, after studying an alternative patch submitted by Amit Kapila. Discussion:
  • Fix more hash index bugs around marking buffers dirty. In _hash_freeovflpage(), if we're freeing the overflow page that immediate follows the page to which tuples are being moved (the confusingly-named "write buffer"), don't forget to mark that page dirty after updating its hasho_nextblkno. In _hash_squeezebucket(), it's not necessary to mark the primary bucket page dirty if there are no overflow pages, because there's nothing to squeeze in that case. Amit Kapila, with help from Kuntal Ghosh and Dilip Kumar, after an initial trouble report by Jeff Janes.
  • Unbreak Finalize HashAggregate over Partial HashAggregate. Commit 5dfc198146b49ce7ecc8a1fc9d5e171fb75f6ba5 introduced the use of a new type of hash table with linear reprobing for hash aggregates. Such a hash table behaves very poorly if keys are inserted in hash order, which does in fact happen in the case where a query use a Finalize HashAggregate node fed (via Gather) by a Partial HashAggregate node. In fact, queries with this type of plan tend to run effectively forever. Fix that by seeding the hash value differently in each worker (and in the leader, if it participates). Andres Freund and Robert Haas
  • Simplify LWLock tranche machinery by removing array_base/array_stride. array_base and array_stride were added so that we could identify the offset of an LWLock within a tranche, but this facility is only very marginally used apart from the main tranche. So, give every lock in the main tranche its own tranche ID and get rid of array_base, array_stride, and all that's attached. For debugging facilities (Trace_lwlocks and LWLOCK_STATS) print the pointer address of the LWLock using %p instead of the offset. This is arguably more useful, and certainly a lot cheaper. Drop the offset-within-tranche from the information reported to dtrace and from one can't-happen message inside lwlock.c. The main user-visible impact of this change is that pg_stat_activity will now report all waits for LWLocks as "LWLock" rather than reporting some as "LWLockTranche" and others as "LWLockNamed". The main motivation for this change is that the need to specify an array_base and an array_stride is awkward for parallel query. There is only a very limited supply of tranche IDs so we can't just keep allocating new ones, and if we try to use the same tranche IDs every time then we run into trouble when multiple parallel contexts are use simultaneously. So if we didn't get rid of this mechanism we'd have to make it even more complicated. By simplifying it in this way, we instead reduce the size of the generated code for lwlock.c by about 5%. Discussion:
  • Fix outdated comment in lwlock.c Commit 3761fe3c20bb040b15f0e8da58d824631da00caa should have made this change, but didn't. Reported by Ãlvaro Herrera.

Tom Lane pushed:

  • Make the different Unix-y semaphore implementations ABI-compatible. Previously, the "sem" field of PGPROC varied in size depending on which kernel semaphore API we were using. That was okay as long as there was only one likely choice per platform, but in the wake of commit ecb0d20a9, that assumption seems rather shaky. It doesn't seem out of the question anymore that an extension compiled against one API choice might be loaded into a postmaster built with another choice. Moreover, this prevents any possibility of selecting the semaphore API at postmaster startup, which might be something we want to do in future. Hence, change PGPROC.sem to be PGSemaphore (i.e. a pointer) for all Unix semaphore APIs, and turn the pointed-to data into an opaque struct whose contents are only known within the responsible modules. For the SysV and unnamed-POSIX APIs, the pointed-to data has to be allocated elsewhere in shared memory, which takes a little bit of rejiggering of the InitShmemAllocation code sequence. (I invented a ShmemAllocUnlocked() function to make that a little cleaner than it used to be. That function is not meant for any uses other than the ones it has now, but it beats having InitShmemAllocation() know explicitly about allocation of space for semaphores and spinlocks.) This change means an extra indirection to access the semaphore data, but since we only touch that when blocking or awakening a process, there shouldn't be any meaningful performance penalty. Moreover, at least for the unnamed-POSIX case on Linux, the sem_t type is quite a bit wider than a pointer, so this reduces sizeof(PGPROC) which seems like a good thing. For the named-POSIX API, there's effectively no change: the PGPROC.sem field was and still is a pointer to something returned by sem_open() in the postmaster's memory space. Document and check the pre-existing limitation that this case can't work in EXEC_BACKEND mode. It did not seem worth unifying the Windows semaphore ABI with the Unix cases, since there's no likelihood of needing ABI compatibility much less runtime switching across those cases. However, we can simplify the Windows code a bit if we define PGSemaphore as being directly a HANDLE, rather than pointer to HANDLE, so let's do that while we're here. (This also ends up being no change in what's physically stored in PGPROC.sem. We're just moving the HANDLE fetch from callees to callers.) It would take a bunch of additional code shuffling to get to the point of actually choosing a semaphore API at postmaster start, but the effects of that would now be localized in the port/XXX_sema.c files, so it seems like fit material for a separate patch. The need for it is unproven as yet, anyhow, whereas the ABI risk to extensions seems real enough. Discussion:
  • Fix race condition in test_decoding "slot" test. This test, just added in commit a924c327e, sometimes fails because the old backend hasn't finished dropping the temporary replication slot when the new backend looks. Borrow the previously-invented methodology for waiting for the old process to disappear from pg_stat_activity. Petr Jelinek Discussion:
  • Catversion bump for temporary replication slots. Missed in commit a924c327e2793d2025b19e18de7917110dc8afd8. Per Fujii Masao.
  • Fix creative, but unportable, spelling of "ptr != NULL". Or at least I suppose that's what was really meant here. But even aside from the not-per-project-style use of "0" to mean "NULL", I doubt it's safe to assume that all valid pointers are > NULL. Per buildfarm member pademelon.
  • Prevent planagg.c from failing on queries containing CTEs. The existing tests in preprocess_minmax_aggregates() usually prevent it from trying to do anything with queries containing CTEs, but there's an exception: a CTE could be present as a member of an appendrel, if we flattened a UNION ALL that contains CTE references. If it did try to generate an optimized path for a query using a CTE, it failed with "could not find plan for CTE", as reported by Torsten Förtsch. The proximate cause is an unwise decision in commit 3fc6e2d7f to clear subroot->cte_plan_ids in build_minmax_path(). That left the subroot's cte_plan_ids list out of step with its parse->cteList. Removing the "subroot->cte_plan_ids = NIL;" assignment is enough to let the case work again, but really it's pretty silly to be expending any cycles at all in this module when there are CTEs: we always treat their outputs as unordered so there's no way for the optimization to win. Hence, also add an early-exit test so we don't waste time like that. Back-patch to 9.6 where the misbehavior was introduced. Report:
  • Improve handling of array elements as getdiag_targets and cursor_variables. There's no good reason why plpgsql's GET DIAGNOSTICS statement can't support an array element as target variable, since the execution code already uses the generic exec_assign_value() function to assign to it. Hence, refactor the grammar to allow that, by making getdiag_target depend on the assign_var production. Ideally we'd also let a cursor_variable expand to an element of a refcursor[] array, but that's substantially harder since those statements also have to handle bound-cursor-variable cases. For now, just make sure the reported error is sensible, ie "cursor variable must be a simple variable" not "variable must be of type cursor or refcursor". The latter was quite confusing from the user's viewpoint, since what he wrote satisfies the claimed restriction. Per bug #14463 from Zhou Digoal. Given the lack of previous complaints, I see no need for a back-patch. Discussion:
  • Sync our copy of the timezone library with IANA release tzcode2016j. This is a trivial update (consisting in fact only in the addition of a comment). The point is just to get back to being synced with an official release of tzcode, rather than some ad-hoc point in their commit history, which is where commit 1f87181e1 left it.
  • Improve documentation around TS_execute(). I got frustrated by the lack of commentary in this area, so here is some reverse-engineered documentation, along with minor stylistic cleanup. No code changes more significant than removal of unused variables. Back-patch to 9.6, not because that's useful in itself, but because we have some bugs to fix in phrase search and this would cause merge failures if it's only in HEAD.
  • Fix FK-based join selectivity estimation for semi/antijoins. This case wasn't thought through sufficiently in commit 100340e2d. It's true that the FK proves that every outer row has a match in the inner table, but we forgot that some of the inner rows might be filtered away by WHERE conditions located within the semijoin's RHS. If the RHS is just one table, we can reasonably take the semijoin selectivity as equal to the fraction of the referenced table's rows that are expected to survive its restriction clauses. If the RHS is a join, it's not clear how much of the referenced table might get through the join, so fall back to the same rule we were already using for other outer-join cases: use the minimum of the regular per-clause selectivity estimates. This gives the same result as if we hadn't considered the FK at all when there's a single FK column, but it should still help for multi-column FKs, which is the case that 100340e2d is really meant to help with. Back-patch to 9.6 where the previous commit came in. Discussion:
  • In contrib/uuid-ossp, #include headers needed for ntohl() and ntohs(). Oversight in commit b8cc8f947. I just noticed this causes compiler warnings on FreeBSD, and it really ought to cause warnings elsewhere too: all references I can find say that <arpa/inet.h> is required for these. We have a lot of code elsewhere that thinks that both <netinet/in.h> and <arpa/inet.h> should be included for these functions, so do it that way here too, even though <arpa/inet.h> ought to be sufficient according to the references I consulted. Back-patch to 9.4 where the previous commit landed.

Peter Eisentraut pushed:

Magnus Hagander pushed:

Fujii Masao pushed:

  • Add missing documentation for effective_io_concurrency tablespace option. The description of effective_io_concurrency option was missing in ALTER TABLESPACE docs though it's included in CREATE TABLESPACE one. Back-patch to 9.6 where effective_io_concurrency tablespace option was added. Michael Paquier, reported by Marc-Olaf Jaschke
  • Ensure that num_sync is greater than zero in synchronous_standby_names. Previously num_sync could be set to zero and this setting caused an assertion failure. This means that multiple synchronous standbys code should assume that num_sync is greater than zero. Also setting num_sync to zero is nonsense because it's basically the configuration for synchronous replication. If users want not to make transaction commits wait for any standbys, synchronous_standby_names should be emptied to disable synchronous replication instead of setting num_sync to zero. This patch forbids users from setting num_sync to zero in synchronous_standby_names. If zero is specified, an error will happen during processing the parameter settings. Back-patch to 9.6 where multiple synchronous standbys feature was added. Patch by me. Reviewed by Tom Lane. Discussion: <>

Noah Misch pushed:

Correctifs en attente

Pavel Stěhule sent in two more revisions of a patch to implement \g[b]store commands for psql.

Rafia Sabih sent in a patch to implement parallel index-only scans.

Michaël Paquier sent in a patch to fix an issue with strong random on Win32.

Michaël Paquier, Amit Kapila and Ashutosh Sharma traded patches to fix an issue that manifested as a hang in pldebugger.

Amit Kapila sent in another revision of a patch to fix some hash index issues.

Tomas Vondra sent in another revision of a patch to implement multivariate statistics.

Nikita Glukhov sent in a patch to implement recursive processing of nested objects and arrays in json[b]_populate_record[set](), json[b]_to_record[set]().

Tomas Vondra sent in two more revisions of a patch to implement two slab-like memory allocators.

Kyotaro HORIGUCHI sent in another revision of a patch to implement character conversion with radix trees.

Dilip Kumar sent in another revision of a patch to implement parallel bitmap heap scan.

Tom Lane sent in a patch to fix matching of boolean index columns to sort ordering.

Aleksander Alekseev sent in a patch to fix documentation of the timestamp type.

Gilles Darold sent in another revision of a patch to implement pg_current_logfile().

Andrew Borodin sent in a patch to add a sleep sort test for pg_background.

Masahiko Sawada and Fujii Masao traded patches to add quorum commit.

Dilip Kumar sent in another revision of a patch to implement parallel merge join.

Robert Haas sent in a patch to refactor tstate serialization in preparation for making CURRENT_* parallel safe.

Jesper Pedersen sent in a patch to create a pg_catversion() builtin function.

Michaël Paquier, Fujii Masao, and Kyotaro HORIGUCHI traded patches to fix a pg_basebackup crash.

Vladimir Rusinov sent in a patch to rename pg_switch_xlog to pg_switch_wal.

Robert Haas sent in a patch to remove _hash_wrtbuf() and cause those functions which previously called it to do MarkBufferDirty() directly.

Peter Eisentraut, Petr Jelínek, and Steve Singer traded patches to implement logical replication.

Rahila Syed sent in another revision of a patch to turn the types columns with unknown type in views to text.

Dagfinn Ilmari Mannsåker sent in a patch to add GUCs for predicate lock promotion thresholds.

Magnus Hagander sent in a patch to remove the -X option from pg_basebackup.

Magnus Hagander sent in a patch to add missing newlines to error messages in pg_basebackup.

Wesley Massuda sent in a patch to implement composite type NOT NULL constraints.

Alexander Law sent in a patch to speed up documentation builds.

Pavel Stěhule sent in two more revisions of a patch to implement server-side session variables.

Dmitry Ivanov sent in a patch to fix an infelicity between the ancient sql_inheritance GUC and the new native partitions.

Heikki Linnakangas sent in a patch to shorten the window between writing the timeline history file and writing the end-of-recovery record.

Magnus Hagander sent in two revisions of a patch to add a --no-slot option to pg_basebackup.

Heikki Linnakangas sent in a patch to use the "plain:" prefix for plaintext passwords stored in pg_authid.

Amit Kapila sent in two revisions of a patch to fix dirty marking and lock chaining for hash indexes.

Mithun Cy sent in another revision of a patch to cache meta page for hash indexes.

Antonin Houska sent in a patch to simplify some code in basebackup.

Peter Moser sent in another revision of a patch to implement temporal query processing with range types.

Fujii Masao sent in a patch to fix a bug in synchronous replication where there were an invalid number of sync standbys in synchronous_standby_names.

Stephen Frost sent in a patch to fix some infelicities between pg_dump and TRANSFORMs, in passing fixing pg_dump for some ancient versions of PostgreSQL.

Robert Haas sent in a patch to remove the sql_inheritance GUC.

Michaël Paquier sent in a patch to fix some typos in waldender.c and slot.c.

Dmitry Dolgov sent in a patch to implement jsonb_delete with arrays.

Dean Rasheed sent in a patch to fix a CREATE OR REPLACE VIEW bug.

Stas Kelvich sent in a patch to speed up two-phase transactions via a twophase recovery list.

Kevin Grittner sent in another revision of a patch to add transition tables for, among other things, materialized view maintenance and row access in per-statement triggers.

Pavel Stěhule sent in another revision of a patch to implement xmltable().

Peter Eisentraut sent in a patch to add a function to import operation system collations.