Correctifs appliqués

Andres Freund pushed:

There's three categories of changes leading to better performance: - Splitting the per-attribute part of SendRowDescriptionMessage into a v2 and a v3 version allows avoiding branches for every attribute. - Preallocating the size of the buffer to be big enough for all attributes and then using pq_write* avoids unnecessary buffer size checks & resizing. - Reusing a persistently allocated StringInfo for all. SendRowDescriptionMessage() invocations avoids repeated allocations & reallocations. Author: Andres Freund Discussion:

  • Force "restrict" not to be used when compiling with xlc. Per buildfarm animal Hornet and followup manual testing by Noah Misch, it appears xlc miscompiles code using "restrict" in at least some cases. Allow disabling restrict usage with FORCE_DISABLE_RESTRICT=yes in template files, and do so for aix/xlc. Author: Andres Freund and Tom Lane Discussion:
  • Improve sys/catcache performance. The following are the individual improvements: 1) Avoidance of FunctionCallInfo based function calls, replaced by more efficient functions with a native C argument interface. 2) Don't extract columns from a cache entry's tuple whenever matching entries - instead store them as a Datum array. This also allows to get rid of having to build dummy tuples for negative & list entries, and of a hack for dealing with cstring vs. text weirdness. 3) Reorder members of catcache.h struct, so imortant entries are more likely to be on one cacheline. 4) Allowing the compiler to specialize critical SearchCatCache for a specific number of attributes allows to unroll loops and avoid other nkeys dependant initialization. 5) Only initializing the ScanKey when necessary, i.e. catcache misses, greatly reduces cache unnecessary cpu cache misses. 6) Split of the cache-miss case from the hash lookup, reducing stack allocations etc in the common case. 7) CatCTup and their corresponding heaptuple are allocated in one piece. This results in making cache lookups themselves roughly three times as fast - full-system benchmarks obviously improve less than that. I've also evaluated further techniques: - replace open coded hash with simplehash - the list walk right now shows up in profiles. Unfortunately it's not easy to do so safely as an entry's memory location can change at various times, which doesn't work well with the refcounting and cache invalidation. - Cacheline-aligning CatCTup entries - helps some with performance, but the win isn't big and the code for it is ugly, because the tuples have to be freed as well. - add more proper functions, rather than macros for SearchSysCacheCopyN etc., but right now they don't show up in profiles. The reason the macro wrapper for syscache.c/h have to be changed, rather than just catcache, is that doing otherwise would require exposing the SysCache array to the outside. That might be a good idea anyway, but it's for another day. Author: Andres Freund Reviewed-By: Robert Haas Discussion:
  • Add pg_noinline macro to c.h. Forcing a function not to be inlined can be useful if it's the slow-path of a performance critical function, or should be visible in profiles to allow for proper cost attribution. Author: Andres Freund Discussion:

Peter Eisentraut pushed:

Tom Lane pushed:

  • Add missing clean step to src/test/modules/brin/Makefile. I noticed the tmp_check subdirectory wasn't getting cleaned up after a check-world run. Apparently will only do this for you if you've defined REGRESS. The only other src/test/modules Makefile that does not set that is snapshot_too_old, and it does it like this.
  • Regenerate configure script. Not sure how fffd651e83ccbd6191a76be6ec7c6b1b27888fde ended up probing for "strnlenfrak" rather than "strnlen". My autoconf doesn't do that ...
  • Fix low-probability loss of NOTIFY messages due to XID wraparound. Up to now async.c has used TransactionIdIsInProgress() to detect whether a notify message's source transaction is still running. However, that function has a quick-exit path that reports that XIDs before RecentXmin are no longer running. If a listening backend is doing nothing but listening, and not running any queries, there is nothing that will advance its value of RecentXmin. Once 2 billion transactions elapse, the RecentXmin check causes active transactions to be reported as not running. If they aren't committed yet according to CLOG, async.c decides they aborted and discards their messages. The timing for that is a bit tight but it can happen when multiple backends are sending notifies concurrently. The net symptom therefore is that a sufficiently-long-surviving listen-only backend starts to miss some fraction of NOTIFY traffic, but only under heavy load. The only function that updates RecentXmin is GetSnapshotData(). A brute-force fix would therefore be to take a snapshot before processing incoming notify messages. But that would add cycles, as well as contention for the ProcArrayLock. We can be smarter: having taken the snapshot, let's use that to check for running XIDs, and not call TransactionIdIsInProgress() at all. In this way we reduce the number of ProcArrayLock acquisitions from one per message to one per notify interrupt; that's the same under light load but should be a benefit under heavy load. Light testing says that this change is a wash performance-wise for normal loads. I looked around for other callers of TransactionIdIsInProgress() that might be at similar risk, and didn't find any; all of them are inside transactions that presumably have already taken a snapshot. Problem report and diagnosis by Marko Tiikkaja, patch by me. Back-patch to all supported branches, since it's been like this since 9.0. Discussion:
  • Doc: fix missing explanation of default object privileges. The GRANT reference page, which lists the default privileges for new objects, failed to mention that USAGE is granted by default for data types and domains. As a lesser sin, it also did not specify anything about the initial privileges for sequences, FDWs, foreign servers, or large objects. Fix that, and add a comment to acldefault() in the probably vain hope of getting people to maintain this list in future. Noted by Laurenz Albe, though I editorialized on the wording a bit. Back-patch to all supported branches, since they all have this behavior. Discussion:
  • Remove unnecessary PG_TRY overhead for CurrentResourceOwner changes. resowner/README contained advice to use a PG_TRY block to restore the old CurrentResourceOwner value anywhere that that variable is transiently changed. That advice was only inconsistently followed, however, and on reflection it seems like unnecessary overhead. We don't bother with such a convention for transient CurrentMemoryContext changes, on the grounds that any (sub)transaction abort will start out by resetting CurrentMemoryContext to what it wants. But the same is true of CurrentResourceOwner, so there seems no need to treat it differently. Hence, remove PG_TRY blocks that exist only to restore CurrentResourceOwner before re-throwing the error. There are a couple of places that restore it along with some other actions, and I left those alone; the restore is probably unnecessary but no noticeable gain will result from removing it. Discussion:
  • Prevent sharing transition states between ordered-set aggregates. This ought to work, but the built-in OSAs are not capable of coping, because their final-functions destructively modify their transition state (specifically, the contained tuplesort object). That was fine when those functions were written, but commit 804163bc2 moved the goalposts without telling orderedsetaggs.c. We should fix the built-in OSAs to support this, but it will take a little work, especially if we don't want to sacrifice performance in the normal non-shared-state case. Given that it took a year after 9.6 release for anyone to notice this bug, we should not prioritize sharable-state over nonsharable-state performance. And a proper fix is likely to be more complicated than we'd want to back-patch, too. Therefore, let's just put in this stop-gap patch to prevent nodeAgg.c from choosing to use shared state for OSAs. We can revert it in HEAD when we get a better fix. Report from Lukas Eder, diagnosis by me, patch by David Rowley. Back-patch to 9.6 where the problem was introduced. Discussion:
  • Doc: fix typo in release notes. Ioseph Kim Discussion:
  • Add port/strnlen support to libpq and ecpg Makefiles. In the wake of fffd651e8, any makefile that pulls in snprintf.c from src/port/ needs to be prepared to pull in strnlen.c as well. Per buildfarm.
  • Fix AggGetAggref() so it won't lie to aggregate final functions. If we merge the transition calculations for two different aggregates, it's reasonable to assume that the transition function should not care which of those Aggref structs it gets from AggGetAggref(). It is not reasonable to make the same assumption about an aggregate final function, however. Commit 804163bc2 broke this, as it will pass whichever Aggref was first associated with the transition state in both cases. This doesn't create an observable bug so far as the core system is concerned, because the only existing uses of AggGetAggref() are in ordered-set aggregates that happen to not pay attention to anything but the input properties of the Aggref; and besides that, we disabled sharing of transition calculations for OSAs yesterday. Nonetheless, if some third-party code were using AggGetAggref() in a normal aggregate, they would be entitled to call this a bug. Hence, back-patch the fix to 9.6 where the problem was introduced. In passing, improve some of the comments about transition state sharing. Discussion:
  • Rely on sizeof(typename) rather than sizeof(variable) in pqformat.h. In each of the pq_writeintN functions, the three uses of sizeof() should surely all be consistent. I started out to make them all sizeof(ni), but on reflection let's make them sizeof(typename) instead. That's more like our usual style elsewhere, and it's just barely possible that the failures buildfarm member hornet has shown since 4c119fbcd went in are caused by the compiler getting confused about sizeof() a parameter that it's optimizing away. In passing, improve a couple of comments. Discussion:
  • Improve implementation of CRE-stack-flattening in map_variable_attnos(). I (tgl) objected to the obscure implementation introduced in commit 1c497fa72. This one seems a bit less action-at-a-distance-y, at the price of repeating a few lines of code. Improve the comments about what the function is doing, too. Amit Khandekar, whacked around a bit more by me Discussion:
  • Explicitly track whether aggregate final functions modify transition state. Up to now, there's been hard-wired assumptions that normal aggregates' final functions never modify their transition states, while ordered-set aggregates' final functions always do. This has always been a bit limiting, and in particular it's getting in the way of improving the built-in ordered-set aggregates to allow merging of transition states. Therefore, let's introduce catalog and CREATE AGGREGATE infrastructure that lets the finalfn's behavior be declared explicitly. There are now three possibilities for the finalfn behavior: it's purely read-only, it trashes the transition state irrecoverably, or it changes the state in such a way that no more transfn calls are possible but the state can still be passed to other, compatible finalfns. There are no examples of this third case today, but we'll shortly make the built-in OSAs act like that. This change allows user-defined aggregates to explicitly disclaim support for use as window functions, and/or to prevent transition state merging, if their implementations cannot handle that. While it was previously possible to handle the window case with a run-time error check, there was not any way to prevent transition state merging, which in retrospect is something commit 804163bc2 should have provided for. But better late than never. In passing, split out pg_aggregate.c's extern function declarations into a new header file pg_aggregate_fn.h, similarly to what we've done for some other catalog headers, so that pg_aggregate.h itself can be safe for frontend files to include. This lets pg_dump use the symbolic names for relevant constants. Discussion:
  • gcc's support for __attribute__((noinline)) hasn't been around forever. Buildfarm member gaur says it wasn't there in 2.95.3. Guess that 3.0 and later have it.

Robert Haas pushed:

Álvaro Herrera pushed:

Joe Conway pushed:

Correctifs en attente

Nathan Bossart sent in another revision of a patch to add more logging to VACUUM and ANALYZE.

Amul Sul sent in three more revisions of a patch to add hash partitioning.

Ashutosh Bapat sent in a patch to support partition-wise join for dummy partitioned relation.

Amit Kapila sent in another revision of a patch to parallize queries containing initplans.

Jeevan Chalke sent in another revision of a patch to implement partition-wise aggregation/grouping.

David Rowley sent in two revisions of a patch to remove left joins with a DISTINCT clause, as the DISTINCT makes the LEFT JOIN redundant.

Julien Rouhaud and Thomas Munro traded patches to fix an oversight in EphemeralNamedRelation support.

Rushabh Lathia sent in another revision of a patch to add parallel B-tree index build sorting.

Michaël Paquier sent in another revision of a patch to refactor the routine to test connection to an SSL server, support channel binding 'tls-unique' in SCRAM, add connection parameters "saslname" and "saslchannelbinding", and implement channel binding tls-server-end-point for SCRAM.

Masahiko Sawada and Aleksander Alekseev traded patches to ensure that updated columns are not null in logical replication when they shouldn't be.

Laurenz Albe sent in a patch to document that PUBLIC has USAGE privileges on newly created types.

Amit Khandekar sent in another revision of a patch to implement parallel append.

Ashutosh Bapat sent in another revision of a patch to modify the bound comparision functions to accept members of PartitionKey, add partition-wise join for 1:1, 1:0, 0:1 partition matching, and add tests for same.

Yugo Nagata sent in a patch to implement lockable views.

Tom Lane sent in a patch to remove Windows warnings from VS 2017 by removing MemSet.

Dilip Kumar sent in another revision of a patch to improve bitmap costing for lossy pages.

Pavel Stěhule sent in another revision of a patch to PL/pgsql to allow forcing either a generic or a custom plan.

Peter Eisentraut sent in a patch to replace GrantObjectType with ObjectType.

Jeevan Chalke sent in a patch to add a cost_append function which determines and returns the cost of an Append node.

Amit Kapila sent in a patch to handle redundant ConvertRowtypeExpr nodes.

Álvaro Herrera sent in a patch to refactor the relkind check in DefineIndex from an ugly rat's nest of 'if' statements to a switch statement.

Amit Kapila sent in a patch to fix parallel safety for extern params, fix simple expr interaction gather, and fix parallel mode nested execution.

Tomas Vondra sent in a PoC patch to enable parallel execution for cursors explicitly.

Pavel Stěhule sent in another revision of a patch to enable default namespaces for XPath expressions.