MediaWiki is a web system for knowledge sharing. Naturally, the primary 'job' of this system is serving web requests. The 'read' case is simple: ask for a web page and Apache returns it. But when you add in the ''writer's'' use case - which can involve uploading images and complex authoring that pulls in multiple pages (e.g. categories and templates) to compose the final content - things get a bit more complex. The supporting subsystems, complex functionality, and operational considerations require that a number of operations be handled by a secondary process. Each such operation is generically called a 'job', and jobs are handled by a system called the '''Job Queue'''. [[mediawikiwiki:Manual:Job_queue#Job_examples|Examples of jobs]] are listed on the manual page and include updating the 'links' database tables when a template is changed, pre-rendering common thumbnails on file upload, HTML cache invalidation, and audio/video transcoding. Transcoding in particular is not suitable for running during a web request, so you need a background runner.
 
Furthermore, when operating in a container-based microservices architecture, there must be a way to routinely execute a plethora of 'maintenance' scripts.  
 
Instead of doing this in a singular, monolithic environment - where you would program the Linux '''cron''' system to handle the queue and use SSH and shell commands to execute maintenance scripts - you need a way to invoke a service container that has full "knowledge" of the primary application and service endpoints but is NOT used for handling web requests. In a restaurant analogy, it is not the wait staff serving customers, or even the chef cooking the food; it is the manager, the hostess, the bus boy and the dishwasher taking care of inventory, menu offerings, seating, staffing, schedules, dishes and so on. In a Docker environment, setting up a [[mediawikiwiki:MediaWiki-Docker/Configuration_recipes/Jobrunner|Jobrunner container]] is exactly how it's done in [https://github.com/wikimedia/mediawiki-docker/tree/master MediaWiki Docker].

==Requirements==

# [[mediawikiwiki:Special:MyLanguage/Manual:$wgJobRunRate|$wgJobRunRate]] needs to be set to zero so we're not running jobs with regular web requests (a minimal <code>LocalSettings.php</code> sketch follows the list).
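
A minimal <code>LocalSettings.php</code> sketch for that requirement, assuming a separate runner (a cron entry or the jobrunner container described above) invokes <code>maintenance/runJobs.php</code> on its own schedule:

<syntaxhighlight lang="php">
# LocalSettings.php (excerpt)
# Never run queued jobs as part of ordinary web requests;
# a dedicated runner calls maintenance/runJobs.php instead.
$wgJobRunRate = 0;
</syntaxhighlight>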

==Reference==

# https://www.mediawiki.org/wiki/Manual:Job_queue
# https://www.mediawiki.org/wiki/Manual:Job_queue/For_developers
 
The job queue manual talks about [[mediawikiwiki:Manual:Job_queue#Continuous_service|creating a continuous service]], but that is a recipe for running MediaWiki on a traditional 'LAMP' stack - not a containerized app.
 
The job queue can be implemented with a DB backend (the default) or a Redis store; see [[mediawikiwiki:Manual:$wgJobTypeConf|$wgJobTypeConf]]. A configuration sketch is shown below.
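
This is a sketch modelled on the manual's Redis example, not a drop-in configuration: the <code>redis</code> hostname is an assumption, and the exact parameters supported depend on your MediaWiki version, so check [[mediawikiwiki:Manual:$wgJobTypeConf|$wgJobTypeConf]] before using it.

<syntaxhighlight lang="php">
# LocalSettings.php (excerpt) - route all job types ('default') to Redis
$wgJobTypeConf['default'] = [
	'class'       => 'JobQueueRedis',
	'redisServer' => 'redis:6379', # hostname of your Redis service (assumption)
	'redisConfig' => [],
	'claimTTL'    => 3600,         # seconds before an abandoned job may be re-claimed
	'daemonized'  => true,         # JobQueueRedis expects a dedicated runner process
];
</syntaxhighlight>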

==Visibility==

You can see the current number of jobs using the API <code>api.php?action=query&format=json&meta=siteinfo&formatversion=2&siprop=statistics</code>: [https://wiki.freephile.org/wiki/api.php?action=query&format=json&meta=siteinfo&formatversion=2&siprop=statistics stats for this site].
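
The statistics block in the response includes a <code>jobs</code> field with an approximate count of queued jobs. A small PHP sketch that reads it, using this wiki's endpoint (substitute your own <code>api.php</code> URL):

<syntaxhighlight lang="php">
<?php
// Fetch siteinfo statistics and print the approximate number of queued jobs.
$url = 'https://wiki.freephile.org/wiki/api.php'
	. '?action=query&format=json&meta=siteinfo&formatversion=2&siprop=statistics';

$json = file_get_contents( $url );
if ( $json === false ) {
	exit( "API request failed\n" );
}

$data = json_decode( $json, true );
echo "Queued jobs: " . $data['query']['statistics']['jobs'] . "\n";
</syntaxhighlight>

On the server itself, <code>php maintenance/showJobs.php --group</code> prints a per-type breakdown of the queue.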
 
[[Category:DevOps]]
 