
MediaWiki/JobQueue

Revision as of 14:44, 1 June 2023 by Admin (talk | contribs) (link MediaWiki-Docker to the github repo)

MediaWiki is a web system for knowledge sharing, so its primary 'job' is serving web requests. The 'read' case is simple: ask for a web page and Apache returns it. The writer's use case is more complex: it can involve uploading images and authoring content that is composed from multiple pages (e.g. categories and templates). Because of the supporting subsystems, the complex functionality, and operational considerations, a number of operations have to be handled outside the web request by a secondary process. Each such unit of work is generically called a 'job', and jobs are handled by a system called the Job Queue. Examples of jobs are listed on the manual page and include updating the 'links' database table when a template is changed, pre-rendering common thumbnails on file upload, HTML cache invalidation, and audio and video transcoding. Transcoding in particular is not suitable for running during web requests, so you need a background runner.

Furthermore, when operating in a container-based microservices architecture, there must be a way to routinely execute a plethora of 'maintenance' scripts.

In a single, monolithic environment you would program the Linux cron system to handle the queue and use SSH and shell commands to execute maintenance scripts. In a containerized deployment you instead need a way to invoke a service container that has full "knowledge" of the primary application and service endpoints but is NOT used for handling web requests. In a restaurant analogy, it is not the wait staff serving customers, or even the chef cooking food. It is the staff taking care of inventory, menu offerings, seating, staffing and schedules, dishes, etc. (the manager, the hostess, the bus boy and the dishwasher). In a Docker environment this is done by setting up a Jobrunner container, which is exactly how MediaWiki Docker does it.
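As a rough illustration of what a Jobrunner container might execute, the sketch below repeatedly invokes MediaWiki's runJobs.php maintenance script in bounded batches. It is not taken from MediaWiki Docker: the script path, the batch limits, the pause between passes, and the helper names build_command and run_forever are all assumptions made for this example.

```python
import subprocess
import time

# Assumed location of the maintenance script inside the container;
# adjust to match the image's actual layout.
RUN_JOBS = "/var/www/html/maintenance/runJobs.php"


def build_command(max_jobs=100, max_time=60):
    """Return the argv for one bounded pass of runJobs.php."""
    return [
        "php", RUN_JOBS,
        "--maxjobs", str(max_jobs),   # stop after this many jobs
        "--maxtime", str(max_time),   # or after this many seconds
    ]


def run_forever(pause_seconds=10):
    """Run bounded passes in a loop, sleeping between them."""
    while True:
        result = subprocess.run(build_command(), check=False)
        if result.returncode != 0:
            # Keep the loop alive; a failed pass is retried next time.
            print(f"runJobs.php exited with status {result.returncode}")
        time.sleep(pause_seconds)
```

Calling run_forever() as the container's entrypoint would keep the queue draining without tying jobs to web requests. Bounding each pass keeps a single invocation from running indefinitely; the specific limits here are arbitrary.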

Requirements

  1. $wgJobRunRate must be set to 0 so that jobs are not run as part of regular web requests.

Reference

  1. https://www.mediawiki.org/wiki/Manual:Job_queue
  2. https://www.mediawiki.org/wiki/Manual:Job_queue/For_developers


The job queue manual describes creating a continuous service, but that recipe targets MediaWiki on a traditional 'LAMP' stack, not a containerized app.

The job queue can be backed by the database (the default) or by a Redis store; see $wgJobTypeConf.

Visibility

You can see the current number of jobs using the API: api.php?action=query&format=json&meta=siteinfo&formatversion=2&siprop=statistics. The count is reported in the jobs field of the returned statistics.
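The same API request can be scripted, for example to watch the queue from a monitoring task. The following is a minimal sketch using only the Python standard library; the function names and the example base URL are hypothetical, and it assumes the API is readable without authentication.

```python
import json
import urllib.parse
import urllib.request

# Query parameters taken from the API request shown above.
PARAMS = {
    "action": "query",
    "format": "json",
    "meta": "siteinfo",
    "formatversion": "2",
    "siprop": "statistics",
}


def stats_url(api_base):
    """Build the full siteinfo statistics URL for a wiki's api.php."""
    return f"{api_base}?{urllib.parse.urlencode(PARAMS)}"


def job_count(payload):
    """Extract the job count from a decoded siteinfo response."""
    return int(payload["query"]["statistics"]["jobs"])


def fetch_job_count(api_base, timeout=10):
    """Fetch the statistics over HTTP and return the job count."""
    with urllib.request.urlopen(stats_url(api_base), timeout=timeout) as resp:
        return job_count(json.load(resp))
```

For instance, fetch_job_count("https://wiki.example.org/w/api.php") would return the queue length for a wiki served at that (made-up) address.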