Code reviews

Applied patches

Magnus Hagander pushed:

Peter Eisentraut pushed:

Tom Lane pushed:

  • Reduce idle power consumption of walwriter and checkpointer processes. This patch modifies the walwriter process so that, when it has not found anything useful to do for many consecutive wakeup cycles, it extends its sleep time to reduce the server's idle power consumption. It reverts to normal as soon as it's done any successful flushes. It's still true that during any async commit, backends check for completed, unflushed pages of WAL and signal the walwriter if there are any; so that in practice the walwriter can get awakened and returned to normal operation sooner than the sleep time might suggest. Also, improve the checkpointer so that it uses a latch and a computed delay time to not wake up at all except when it has something to do, replacing a previous hardcoded 0.5 sec wakeup cycle. This also is primarily useful for reducing the server's power consumption when idle. In passing, get rid of the dedicated latch for signaling the walwriter in favor of using its procLatch, since that comports better with possible generic signal handlers using that latch. Also, fix a pre-existing bug with failure to save/restore errno in walwriter's signal handlers. Peter Geoghegan, somewhat simplified by Tom
  • Reduce idle power consumption of stats collector process. Latch-ify the stats collector, so that it does not need an arbitrary wakeup cycle to check for postmaster death. The incremental savings in idle power is pretty marginal, since we only had it waking every two seconds; but I believe that this patch may also improve the collector's performance under load, by reducing the number of kernel calls made per message when messages are arriving constantly (we now avoid a select/poll call except when we need to sleep). The change also reduces the time needed for a normal database shutdown on platforms where signals don't interrupt select().
  • Fix an issue in recent walwriter hibernation patch. Users of asynchronous-commit mode expect there to be a guaranteed maximum delay before an async commit's WAL records get flushed to disk. The original version of the walwriter hibernation patch broke that. Add an extra shared-memory flag to allow async commits to kick the walwriter out of hibernation mode, without adding any noticeable overhead in cases where no action is needed.
  • Improve control logic for bgwriter hibernation mode. Commit 6d90eaaa89a007e0d365f49d6436f35d2392cfeb added a hibernation mode to the bgwriter to reduce the server's idle-power consumption. However, its interaction with the detailed behavior of BgBufferSync's feedback control loop wasn't very well thought out. That control loop depends primarily on the rate of buffer allocation, not the rate of buffer dirtying, so the hibernation mode has to be designed to operate only when no new buffer allocations are happening. Also, the check for whether the system is effectively idle was not quite right and would fail to detect a constant low level of activity, thus allowing the bgwriter to go into hibernation mode in a way that would let the cycle time vary quite a bit, possibly further confusing the feedback loop. To fix, move the wakeup support from MarkBufferDirty and SetBufferCommitInfoNeedsSave into StrategyGetBuffer, and prevent the bgwriter from entering hibernation mode unless no buffer allocations have happened recently. In addition, fix the delaying logic to remove the problem of possibly not responding to signals promptly, which was basically caused by trying to use the process latch's is_set flag for multiple purposes. I can't prove it but I'm suspicious that that hack was responsible for the intermittent "postmaster does not shut down" failures we've been seeing in the buildfarm lately. In any case it did nothing to improve the readability or robustness of the code. In passing, express the hibernation sleep time as a multiplier on BgWriterDelay, not a constant. I'm not sure whether there's any value in exposing the longer sleep time as an independently configurable setting, but we can at least make it act like this for little extra code.
  • Further tweaking of nomenclature in checkpointer.c. Get rid of some more naming choices that only make sense if you know that this code used to be in the bgwriter, as well as some stray comments referencing the bgwriter.
  • Improve tests for postmaster death in auxiliary processes. In checkpointer and walwriter, avoid calling PostmasterIsAlive unless WaitLatch has reported WL_POSTMASTER_DEATH. This saves a kernel call per iteration of the process's outer loop, which is not all that much, but a cycle shaved is a cycle earned. I had already removed the unconditional PostmasterIsAlive calls in bgwriter and pgstat in previous patches, but forgot that WL_POSTMASTER_DEATH is supposed to be treated as untrustworthy (per comment in unix_latch.c); so adjust those two cases to match. There are a few other places where the same idea might be applied, but only after substantial code rearrangement, so I didn't bother.
  • Remove unportable use of SGML character-code entity. It'd be nice to be able to spell Jan Urbanski's name with the correct accent marks, but we haven't yet found a way that works in everybody's docs toolchain. This way definitely doesn't.
  • Improve Windows implementation of WaitLatch/WaitLatchOrSocket. Ensure that signal handlers are serviced before this function returns. This should make the behavior more like Unix. Also, add some more error checking, and make some other cosmetic improvements. No back-patch since it's not clear whether this is fixing any live bug that would affect 9.1. I'm more concerned about 9.2 anyway given our considerable recent expansions in the usage of WaitLatch.
  • Fix Windows implementation of PGSemaphoreLock. The original coding failed to reset ImmediateInterruptOK before returning, which would potentially allow a subsequent query-cancel interrupt to be accepted at an unsafe point. This is a really nasty bug since it's so hard to predict the consequences, but they could be unpleasant. Also, ensure that signal handlers are serviced before this function returns, even if the semaphore is already set. This should make the behavior more like Unix. Back-patch to all supported versions.
  • Make WaitLatch's WL_POSTMASTER_DEATH result trustworthy; simplify callers. Per a suggestion from Peter Geoghegan, make WaitLatch responsible for verifying that the WL_POSTMASTER_DEATH bit it returns is truthful (by testing PostmasterIsAlive). Then simplify its callers, who no longer need to do that for themselves. Remove weasel wording about falsely-set result bits from WaitLatch's API contract.
  • Temporarily revert stats collector latch changes so we can ship beta1. This patch reverts commit 49340037ee3ab46cb24144a86705e35f272c24d5 and some follow-on tweaking in pgstat.c. While the basic scheme of latch-ifying the stats collector seems sound enough, it's failing on most Windows buildfarm members for unknown reasons, and there's no time left to debug that before 9.2beta1. Better to ship a beta version without this improvement. I hope to re-revert this once beta1 is out, though.
  • Tweak documentation wording to avoid "pdfendlink" failure. HEAD documentation was failing to build as US PDF for me, because a link to "CREATE CAST" was getting split across pages. Adjust wording to remove this rather gratuitous cross-reference.
  • Stamp 9.2beta1.
  • Improve discussion of setting server parameters. Rewrite description of "include_if_exists" for clarity. Add subsection headings to make the structure of the page a little clearer. A couple other minor improvements too. Josh Kupershmidt and Tom Lane
  • Fix contrib/citext's upgrade script to handle array and domain cases. We previously recognized that citext wouldn't get marked as collatable during pg_upgrade from a pre-9.1 installation, and hacked its create-from-unpackaged script to manually perform the necessary catalog adjustments. However, we overlooked the fact that domains over citext, as well as the citext[] array type, need the same adjustments. Extend the script to handle those cases. Also, the documentation suggested that this was only an issue in pg_upgrade scenarios, which is quite wrong; loading any dump containing citext from a pre-9.1 server will also result in the type being wrongly marked. I approached the documentation problem by changing the 9.1.2 release note paragraphs about this issue, which is historically inaccurate. But it seems better than having the information scattered in multiple places, and leaving incorrect info in the 9.1.2 notes would be bad anyway. We'll still need to mention the issue again in the 9.1.4 notes, but perhaps they can just reference 9.1.2 for fix instructions. Per report from Evan Carroll. Back-patch into 9.1.
  • Cosmetic adjustments for postmaster's handling of checkpointer. Correct some comments, order some operations a bit more consistently. No functional changes.
  • Update example of process titles shown by "ps". This example was quite old: it lacked the WAL writer and autovac launcher as well as the more recently added checkpointer. Linux "ps" seems to show slightly different stuff now too.
  • Explain compatibility item about language names a bit more. Since we've got an "open items" list item about this, apparently some people are pretty worried about it. In passing remove a lot of trailing whitespace.
  • Fix WaitLatchOrSocket to handle EOF on socket correctly. When using poll(), EOF on a socket is reported with the POLLHUP not POLLIN flag (at least on Linux). WaitLatchOrSocket failed to check this bit, causing it to go into a busy-wait loop if EOF occurs. We earlier fixed the same mistake in the test for the state of the postmaster_alive socket, but missed it for the caller-supplied socket. Fortunately, this error is new in 9.2, since 9.1 only had a select() based code path not a poll() based one.
  • Avoid unnecessary process wakeups in the log collector. syslogger was coded to wake up once per second whether there was anything useful to do or not. As part of our campaign to reduce the server's idle power consumption, change it to use a latch for waiting. Now, in the absence of any data to log or any signals to service, it will only wake up at the programmed logfile rotation times (if any).
  • Fix bogus declaration of local variable. rc should be an int here, not a pgsocket. Fairly harmless as long as pgsocket is an integer type, but nonetheless wrong. Error introduced in commit 87091cb1f1ed914e2ddca424fa28f94fdf8461d2.
  • Attempt to fix some issues in our Windows socket code. Make sure WaitLatchOrSocket regards FD_CLOSE as a read-ready condition. We might want to tweak this further, but it was surely wrong as-is. Make pgwin32_waitforsinglesocket detach its private event object from the passed socket before returning. I suspect that failure to do so leads to race conditions when other code (such as WaitLatchOrSocket) attaches a different event object to the same socket. Moreover, the existing coding meant that repeated calls to pgwin32_waitforsinglesocket would perform ResetEvent on an event actively connected to a socket, which is rumored to be an unsafe practice; the WSAEventSelect documentation appears to recommend against this, though it does not say not to do it in so many words. Also, uniformly use the coding pattern "WSAEventSelect(s, NULL, 0)" to detach events from sockets, rather than passing the event in the second parameter. The WSAEventSelect documentation says that the second parameter is ignored if the third is 0, so theoretically this should make no difference. However, elsewhere on the same reference page the use of NULL in this context is recommended, and I have found suggestions on the net that some versions of Windows have bugs with a non-NULL second parameter in this usage. Some other mostly-cosmetic cleanup, such as using the right one of WSAGetLastError and GetLastError for reporting errors from these functions.
  • Re-revert stats collector latch changes. This reverts commit cb2f2873d6b81ad7f0a9733ba738bfac0746fb7b, restoring the latch-ified stats collector logic. We'll soon see if this works any better on the Windows buildfarm machines.
  • Fix DROP TABLESPACE to unlink symlink when directory is not there. If the tablespace directory is missing entirely, we allow DROP TABLESPACE to go through, on the grounds that it should be possible to clean up the catalog entry in such a situation. However, we forgot that the pg_tblspc symlink might still be there. We should try to remove the symlink too (but not fail if it's no longer there), since not doing so can lead to weird behavior subsequently, as per report from Michael Nolan. There was some discussion of adding dependency links to prevent DROP TABLESPACE when the catalogs still contain references to the tablespace. That might be worth doing too, but it's an orthogonal question, and in any case wouldn't be back-patchable. Back-patch to 9.0, which is as far back as the logic looks like this. We could possibly do something similar in 8.x, but given the lack of reports I'm not sure it's worth the trouble, and anyway the case could not arise in the form the logic is meant to cover (namely, a post-DROP transaction rollback having resurrected the pg_tablespace entry after some or all of the filesystem infrastructure is gone).
  • Add some temporary instrumentation to pgstat.c. Log main-loop blocking events and the results of inquiry messages. This is to get some clarity as to what's happening on those Windows buildfarm members that still don't like the latch-ified stats collector. This bulks up the postmaster log a tad, so I won't leave it in place for long.
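
The WaitLatchOrSocket EOF fix listed above turns on a detail of poll(): a hangup is reported in revents as POLLHUP even when only POLLIN was requested, so a caller that tests POLLIN alone can miss EOF and spin. The following is a minimal sketch of that check, not the actual PostgreSQL code; the function names wait_fd_readable and demo_pipe_eof are hypothetical, and a pipe stands in for a socket to keep the example short.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not PostgreSQL's implementation):
 * wait up to timeout_ms for fd to become read-ready, counting EOF/hangup
 * as read-ready so the caller goes on to read() and observes the EOF.
 */
bool
wait_fd_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int         rc;             /* poll() returns an int count */

    assert(fd >= 0);
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return false;           /* timeout, or an error such as EINTR */

    /* POLLHUP and POLLERR are reported even though not set in events */
    return (pfd.revents & (POLLIN | POLLHUP | POLLERR)) != 0;
}

/*
 * Usage sketch: once the write end of a pipe is closed, the read end
 * must be reported as ready even though no data is pending.
 */
bool
demo_pipe_eof(void)
{
    int         fds[2];
    bool        ready_before;
    bool        ready_after;

    if (pipe(fds) != 0)
        return false;

    ready_before = wait_fd_readable(fds[0], 0); /* open writer, no data */
    close(fds[1]);                              /* EOF for the reader */
    ready_after = wait_fd_readable(fds[0], 0);
    close(fds[0]);

    return !ready_before && ready_after;
}
```

Testing only `revents & POLLIN` in the final return is the variant that can loop: poll() keeps returning immediately on the hung-up descriptor, yet the caller never treats it as ready.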

Bruce Momjian pushed:

Simon Riggs pushed:

Joe Conway pushed:

Heikki Linnakangas pushed:

Robert Haas pushed:

Rejected patches (to date)

  • No one was disappointed this week :-)

Pending patches

  • Robert Haas sent in several approaches to optimizing CLOG background writing.
  • Zoltan Boszormenyi sent in two more revisions of the patch to create and use a lock timeout and SIGALRM framework.
  • Pavel Stehule sent in another revision of the patch to add an enhanced ErrorData structure to PL/pgSQL.
  • Magnus Hagander sent in a patch to fix a bug where pg_receivexlog didn't handle timeouts correctly.
  • Antonin Houska sent in a patch to implement some more of the LATERAL functionality via functions.
  • Jeff Janes sent in a patch against pgbench which adds a --foreign-keys option to initialization mode; the option creates all the relevant constraints between the default tables.
  • Noah Misch sent in a patch which updates code comments to reflect the PGPROC/PGXACT split.