Chapter 45. Background Worker Processes
PostgreSQL can be extended to run user-supplied code in separate processes. Such processes are started, stopped and monitored by postgres, which permits them to have a lifetime closely linked to the server's status. These processes have the option to attach to PostgreSQL's shared memory area and to connect to databases internally; they can also run multiple transactions serially, just like a regular client-connected server process. Also, by linking to libpq they can connect to the server and behave like a regular client application.
Warning
There are considerable robustness and security risks in using background worker processes because, being written in the C language, they have unrestricted access to data. Administrators wishing to enable modules that include background worker processes should exercise extreme caution. Only carefully audited modules should be permitted to run background worker processes.
Background workers can be initialized at the time that PostgreSQL is started by including the module name in shared_preload_libraries. A module wishing to run a background worker can register it by calling RegisterBackgroundWorker(BackgroundWorker *worker) from its _PG_init(). Background workers can also be started after the system is up and running by calling the function RegisterDynamicBackgroundWorker(BackgroundWorker *worker, BackgroundWorkerHandle **handle). Unlike RegisterBackgroundWorker, which can only be called from within the postmaster, RegisterDynamicBackgroundWorker must be called from a regular backend.
The structure BackgroundWorker is defined thus:
    typedef void (*bgworker_main_type)(Datum main_arg);

    typedef struct BackgroundWorker
    {
        char        bgw_name[BGW_MAXLEN];
        int         bgw_flags;
        BgWorkerStartTime bgw_start_time;
        int         bgw_restart_time;       /* in seconds, or BGW_NEVER_RESTART */
        bgworker_main_type bgw_main;
        char        bgw_library_name[BGW_MAXLEN];   /* only if bgw_main is NULL */
        char        bgw_function_name[BGW_MAXLEN];  /* only if bgw_main is NULL */
        Datum       bgw_main_arg;
        int         bgw_notify_pid;
    } BackgroundWorker;
bgw_name is a string to be used in log messages, process listings and similar contexts.
bgw_flags is a bitwise-or'd bitmask indicating the capabilities that the module wants. Possible values are BGWORKER_SHMEM_ACCESS (requesting shared memory access) and BGWORKER_BACKEND_DATABASE_CONNECTION (requesting the ability to establish a database connection, through which it can later run transactions and queries). A background worker using BGWORKER_BACKEND_DATABASE_CONNECTION to connect to a database must also attach shared memory using BGWORKER_SHMEM_ACCESS, or worker start-up will fail.
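The dependency between the two flags can be checked before a worker is registered. The following is a minimal standalone sketch, not PostgreSQL source code: the flag values are stand-ins defined locally so that the example compiles without the server headers (a real module takes them from postmaster/bgworker.h), and check_bgw_flags is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values (assumption); a real module uses postmaster/bgworker.h. */
#define BGWORKER_SHMEM_ACCESS                 0x0001
#define BGWORKER_BACKEND_DATABASE_CONNECTION  0x0002

/*
 * Hypothetical helper: returns true if the flag combination obeys the rule
 * that a database connection also requires shared memory access.
 * On failure, *problem points to a static description of the violation;
 * on success it is set to NULL.
 */
bool
check_bgw_flags(int flags, const char **problem)
{
    assert(problem != NULL);

    if ((flags & BGWORKER_BACKEND_DATABASE_CONNECTION) != 0 &&
        (flags & BGWORKER_SHMEM_ACCESS) == 0)
    {
        *problem = "database connection requested without shared memory access";
        return false;
    }

    *problem = NULL;
    return true;
}
```

A module could run such a check on its own flag word before calling a registration function, so that a misconfiguration is reported early instead of surfacing as a failed worker start-up.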
bgw_start_time is the server state during which postgres should start the process; it can be one of BgWorkerStart_PostmasterStart (start as soon as postgres itself has finished its own initialization; processes requesting this are not eligible for database connections), BgWorkerStart_ConsistentState (start as soon as a consistent state has been reached in a hot standby, allowing processes to connect to databases and run read-only queries), and BgWorkerStart_RecoveryFinished (start as soon as the system has entered normal read-write state). Note the last two values are equivalent in a server that's not a hot standby. Note that this setting only indicates when the processes are to be started; they do not stop when a different state is reached.
bgw_restart_time is the interval, in seconds, that postgres should wait before restarting the process, in case it crashes. It can be any positive value, or BGW_NEVER_RESTART, indicating not to restart the process in case of a crash.
bgw_main is a pointer to the function to run when the process is started. This function must take a single argument of type Datum and return void. bgw_main_arg will be passed to it as its only argument. Note that the global variable MyBgworkerEntry points to a copy of the BackgroundWorker structure passed at registration time. bgw_main may be NULL; in that case, bgw_library_name and bgw_function_name will be used to determine the entrypoint. This is useful for background workers launched after postmaster startup, where the postmaster does not have the requisite library loaded.
bgw_library_name is the name of a library in which the initial entrypoint for the background worker should be sought. It is ignored unless bgw_main is NULL. But if bgw_main is NULL, then the named library will be dynamically loaded by the worker process and bgw_function_name will be used to identify the function to be called.
bgw_function_name is the name of a function in a dynamically loaded library which should be used as the initial entrypoint for a new background worker. It is ignored unless bgw_main is NULL.
bgw_notify_pid is the PID of a PostgreSQL backend process to which the postmaster should send SIGUSR1 when the process is started or exits. It should be 0 for workers registered at postmaster startup time, or when the backend registering the worker does not wish to wait for the worker to start up. Otherwise, it should be initialized to MyProcPid.
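The field descriptions above can be illustrated by filling in a structure for a dynamically launched worker, where bgw_main is NULL and the entry point is named through bgw_library_name and bgw_function_name. This is a sketch under stated assumptions: the declarations at the top mirror the structure shown earlier but are local stand-ins (including the values of BGW_MAXLEN, BGW_NEVER_RESTART and the flags, and the use of uintptr_t for Datum) so that the example compiles without the server headers, and init_dynamic_worker is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* ---- Local stand-ins (assumptions); real modules include postgres.h
 * ---- and postmaster/bgworker.h instead. ---- */
#define BGW_MAXLEN                            64
#define BGW_NEVER_RESTART                     (-1)
#define BGWORKER_SHMEM_ACCESS                 0x0001
#define BGWORKER_BACKEND_DATABASE_CONNECTION  0x0002

typedef uintptr_t Datum;
typedef void (*bgworker_main_type)(Datum main_arg);

typedef enum
{
    BgWorkerStart_PostmasterStart,
    BgWorkerStart_ConsistentState,
    BgWorkerStart_RecoveryFinished
} BgWorkerStartTime;

typedef struct BackgroundWorker
{
    char        bgw_name[BGW_MAXLEN];
    int         bgw_flags;
    BgWorkerStartTime bgw_start_time;
    int         bgw_restart_time;
    bgworker_main_type bgw_main;
    char        bgw_library_name[BGW_MAXLEN];
    char        bgw_function_name[BGW_MAXLEN];
    Datum       bgw_main_arg;
    int         bgw_notify_pid;
} BackgroundWorker;

/*
 * Hypothetical helper: describe a worker whose entry point is looked up by
 * library and function name, as needed when the postmaster does not have
 * the library loaded. Strings longer than BGW_MAXLEN - 1 are truncated.
 */
void
init_dynamic_worker(BackgroundWorker *worker, const char *name,
                    const char *library, const char *function,
                    Datum arg, int notify_pid)
{
    assert(worker != NULL && name != NULL);
    assert(library != NULL && function != NULL);

    memset(worker, 0, sizeof(*worker));
    snprintf(worker->bgw_name, BGW_MAXLEN, "%s", name);

    /* A database connection also requires shared memory access. */
    worker->bgw_flags = BGWORKER_SHMEM_ACCESS |
        BGWORKER_BACKEND_DATABASE_CONNECTION;
    worker->bgw_start_time = BgWorkerStart_RecoveryFinished;
    worker->bgw_restart_time = BGW_NEVER_RESTART;

    /* NULL bgw_main: the library and function names select the entry point. */
    worker->bgw_main = NULL;
    snprintf(worker->bgw_library_name, BGW_MAXLEN, "%s", library);
    snprintf(worker->bgw_function_name, BGW_MAXLEN, "%s", function);

    worker->bgw_main_arg = arg;
    worker->bgw_notify_pid = notify_pid;    /* 0 if nobody waits for start-up */
}
```

In a real backend the filled structure would then be passed to RegisterDynamicBackgroundWorker, with MyProcPid as the notify PID when the caller intends to wait for the worker to start.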
Once running, the process can connect to a database by calling BackgroundWorkerInitializeConnection(char *dbname, char *username). This allows the process to run transactions and queries using the SPI interface. If dbname is NULL, the session is not connected to any particular database, but shared catalogs can be accessed. If username is NULL, the process will run as the superuser created during initdb. BackgroundWorkerInitializeConnection can only be called once per background process; it is not possible to switch databases.
Signals are initially blocked when control reaches the bgw_main function, and must be unblocked by it; this is to allow the process to customize its signal handlers, if necessary. Signals can be unblocked in the new process by calling BackgroundWorkerUnblockSignals and blocked by calling BackgroundWorkerBlockSignals.
If bgw_restart_time for a background worker is configured as BGW_NEVER_RESTART, or if it exits with an exit code of 0 or is terminated by TerminateBackgroundWorker, it will be automatically unregistered by the postmaster on exit. Otherwise, it will be restarted after the time period configured via bgw_restart_time, or immediately if the postmaster reinitializes the cluster due to a backend failure. Background workers that need to suspend execution only temporarily should use an interruptible sleep rather than exiting; this can be achieved by calling WaitLatch(). Make sure the WL_POSTMASTER_DEATH flag is set when calling that function, and check the return code so that the worker exits promptly in the emergency case that postgres itself has terminated.
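The unregister-or-restart rules above can be summarized as a small decision function. This is a standalone model of the rules as stated in the text, not the postmaster's actual code: WorkerExitAction and decide_exit_action are hypothetical names, and the value of BGW_NEVER_RESTART is a local stand-in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BGW_NEVER_RESTART (-1)      /* stand-in value (assumption) */

/* Hypothetical outcome type for this model. */
typedef enum
{
    WORKER_UNREGISTER,
    WORKER_RESTART_AFTER_DELAY,
    WORKER_RESTART_IMMEDIATELY
} WorkerExitAction;

/*
 * Model of what happens to a worker slot once the worker process has exited.
 * *delay_seconds receives the wait before a restart, or 0 when no wait
 * applies.
 */
WorkerExitAction
decide_exit_action(int restart_time, int exit_code,
                   bool terminated_by_request, bool cluster_reinitialized,
                   int *delay_seconds)
{
    assert(delay_seconds != NULL);
    *delay_seconds = 0;

    /* Any of these three conditions leads to automatic unregistration. */
    if (restart_time == BGW_NEVER_RESTART ||
        exit_code == 0 ||
        terminated_by_request)
        return WORKER_UNREGISTER;

    /* A cluster reinitialization restarts the worker without waiting. */
    if (cluster_reinitialized)
        return WORKER_RESTART_IMMEDIATELY;

    *delay_seconds = restart_time;
    return WORKER_RESTART_AFTER_DELAY;
}
```

One consequence the model makes visible: a worker that should be restarted after finishing a unit of work must exit with a nonzero code, because an exit code of 0 unregisters it regardless of bgw_restart_time.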
When a background worker is registered using the RegisterDynamicBackgroundWorker function, it is possible for the backend performing the registration to obtain information regarding the status of the worker. Backends wishing to do this should pass the address of a BackgroundWorkerHandle * as the second argument to RegisterDynamicBackgroundWorker. If the worker is successfully registered, this pointer will be initialized with an opaque handle that can subsequently be passed to GetBackgroundWorkerPid(BackgroundWorkerHandle *, pid_t *) or TerminateBackgroundWorker(BackgroundWorkerHandle *). GetBackgroundWorkerPid can be used to poll the status of the worker: a return value of BGWH_NOT_YET_STARTED indicates that the worker has not yet been started by the postmaster; BGWH_STOPPED indicates that it has been started but is no longer running; and BGWH_STARTED indicates that it is currently running. In this last case, the PID will also be returned via the second argument. TerminateBackgroundWorker causes the postmaster to send SIGTERM to the worker if it is running, and to unregister it as soon as it is not.
In some cases, a process which registers a background worker may wish to wait for the worker to start up. This can be accomplished by initializing bgw_notify_pid to MyProcPid and then passing the BackgroundWorkerHandle * obtained at registration time to the WaitForBackgroundWorkerStartup(BackgroundWorkerHandle *handle, pid_t *) function. This function will block until the postmaster has attempted to start the background worker, or until the postmaster dies. If the background worker is running, the return value will be BGWH_STARTED, and the PID will be written to the provided address. Otherwise, the return value will be BGWH_STOPPED or BGWH_POSTMASTER_DIED.
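A caller typically branches on the status returned after waiting. The sketch below models that branching in isolation: the BgwHandleStatus enumeration is a local stand-in that lists the status names used in this chapter (a real backend gets it from postmaster/bgworker.h), and startup_succeeded is a hypothetical helper that would be applied to the result of WaitForBackgroundWorkerStartup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Local stand-in (assumption) for the status type named in the text. */
typedef enum
{
    BGWH_STARTED,
    BGWH_NOT_YET_STARTED,
    BGWH_STOPPED,
    BGWH_POSTMASTER_DIED
} BgwHandleStatus;

/*
 * Hypothetical helper: returns true only if the worker is running, and
 * writes a human-readable summary of the outcome into msg.
 */
bool
startup_succeeded(BgwHandleStatus status, int pid, char *msg, size_t msglen)
{
    assert(msg != NULL && msglen > 0);

    switch (status)
    {
        case BGWH_STARTED:
            snprintf(msg, msglen, "worker running with PID %d", pid);
            return true;
        case BGWH_STOPPED:
            snprintf(msg, msglen, "worker was started but is no longer running");
            return false;
        case BGWH_POSTMASTER_DIED:
            snprintf(msg, msglen, "postmaster died while waiting for the worker");
            return false;
        case BGWH_NOT_YET_STARTED:
            /* Only expected when polling, not after a completed wait. */
            snprintf(msg, msglen, "worker has not been started yet");
            return false;
    }

    snprintf(msg, msglen, "unrecognized status");
    return false;
}
```

The PID argument is meaningful only in the BGWH_STARTED case, matching the rule that the PID is written to the provided address only when the worker is running.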
The worker_spi contrib module contains a working example, which demonstrates some useful techniques.
The maximum number of registered background workers is limited by max_worker_processes.