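Object cache
MediaWiki uses caching in many components and at multiple layers. This page documents the various caches used inside the MediaWiki PHP application.
General
The object cache in MediaWiki describes two kinds of stores:
- Cache. A place to store the result of a computation, or data fetched from an external source (for higher access speed). This is a "cache" in the computer science sense.
- Stash. A place to store lightweight data that is not stored anywhere else. Also known as a "hoard" of objects. These values cannot (or may not) be recomputed on demand.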
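Terminology
A cache key is said to be "verifiable" if the program is able to verify that the value is not outdated.
This applies when a key can only ever have one possible value. For example, the 100th digit of Pi could be cached under the key math_pi_digit:100.
The result can be safely stored in high-speed access stores without coordination, because it never needs to be updated or purged.
If it expires from the cache, it can be recomputed and will produce the same result.
The same applies to storing the wikitext of a certain revision of a page.
Revision 123 has been created and will always have the same content.
If the program knows the revision ID it is looking for, a cache key like revision_content:123 is also a verifiable cache key.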
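The property described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `REVISIONS` dict stands in for the database, a plain dict stands in for the cache, and none of the names are MediaWiki APIs. It only shows that a key with exactly one possible value can be dropped at any time and recomputed with the same result.

```python
# Hypothetical in-memory stand-ins; none of these names are MediaWiki APIs.
REVISIONS = {123: "Hello, world!"}  # immutable source of truth: revision ID -> wikitext
cache = {}                          # plain dict standing in for a fast cache
source_reads = 0                    # counts how often the slow source is hit


def revision_content(rev_id):
    """Return the wikitext of a revision, caching it under a verifiable key."""
    global source_reads
    key = f"revision_content:{rev_id}"
    if key not in cache:
        source_reads += 1           # slow path: read from the source of truth
        cache[key] = REVISIONS[rev_id]
    return cache[key]


first = revision_content(123)   # miss: fetched from the source
second = revision_content(123)  # hit: served from the cache
cache.clear()                   # simulate expiry or eviction
third = revision_content(123)   # recomputed, identical to the earlier results
```

No purge or update step appears anywhere in the sketch, because the value behind the key cannot change.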
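Storing structured data
- Consider the WANObjectCache::getWithSet idiom and its "version" option, which automatically handles forward and backward compatibility, including invalidating cache keys across versions of the software.
- Avoid storing class objects. Store primitives or (nested) arrays of primitives. Classes should be converted to and from simple arrays, and stored as those simple arrays or as JSON strings. Encoding and serialization must be done by the caller, rather than by e.g. the BagOStuff or WANObjectCache interface. (In the future, MediaWiki may do this automatically for classes implementing JsonUnserializable, which was introduced in MediaWiki 1.36.)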
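As a rough illustration of these two practices, the sketch below converts a value class to and from a plain array, stores it as a JSON string, and puts a schema version into the cache key. The class `UserSummary`, the constant `SCHEMA_VERSION`, and the key layout are all made up for this example. Embedding the version in the key is only one simple way to get the effect of a "version" option; it is not a description of how WANObjectCache implements it.

```python
import json
from dataclasses import dataclass

SCHEMA_VERSION = 2  # hypothetical: bump whenever the stored array shape changes


@dataclass
class UserSummary:
    """Hypothetical value class; instances are never stored in the cache directly."""
    name: str
    edit_count: int

    def to_array(self):
        # Only primitives go into the cache.
        return {"name": self.name, "edit_count": self.edit_count}

    @classmethod
    def from_array(cls, data):
        return cls(name=data["name"], edit_count=data["edit_count"])


cache = {}  # plain dict standing in for a cache backend


def cache_key(user_id):
    # The version is part of the key, so values written by older or newer
    # code are never read back by code that expects a different shape.
    return f"user_summary:v{SCHEMA_VERSION}:{user_id}"


def store(user_id, summary):
    # The caller does the encoding; the cache only ever sees a string.
    cache[cache_key(user_id)] = json.dumps(summary.to_array())


def load(user_id):
    raw = cache.get(cache_key(user_id))
    return None if raw is None else UserSummary.from_array(json.loads(raw))


store(7, UserSummary(name="Alice", edit_count=42))
restored = load(7)
```

Because only a JSON string reaches the cache, renaming or restructuring the class later does not break values that are already stored; bumping the version simply makes old entries unreachable.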
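Services
These are the abstract stores available to MediaWiki features. See the Uses section for examples.
Local server
- Accessed via MediaWikiServices->getLocalServerObjectCache().
- Configurable: No (automatically detected).
- Behaviour: Very fast (<0.1ms, from local memory), low capacity, not shared between application servers.
Values in this store are kept only in the local RAM of a given web server (typically using php-apcu).
They are not replicated to other servers or clusters, and there are no options for coordinating updates or purges.
If the web server does not have php-apcu (or an equivalent) installed, this interface falls back to an empty placeholder in which no keys are stored.
It is also set to an empty interface for maintenance scripts and other command-line modes.
MediaWiki supports APCu and WinCache.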
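The fallback behaviour can be pictured with a small sketch. The class and function names below are hypothetical and do not correspond to MediaWiki classes. The point is only that calling code stays the same whether a real per-server store or an empty placeholder is returned; with the placeholder, every lookup is a miss and the value is recomputed each time.

```python
class LocalMemoryCache:
    """Hypothetical per-server store, standing in for an APCu-like backend."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class EmptyCache:
    """Placeholder that accepts writes but never stores anything."""

    def get(self, key):
        return None

    def set(self, key, value):
        pass


def local_server_cache(backend_available, cli_mode=False):
    # Assumed selection rule, following the description above.
    if backend_available and not cli_mode:
        return LocalMemoryCache()
    return EmptyCache()


def cached_square(cache, n):
    key = f"square:{n}"
    value = cache.get(key)
    if value is None:
        value = n * n          # recomputed on every miss
        cache.set(key, value)
    return value


web = local_server_cache(backend_available=True)
cli = local_server_cache(backend_available=True, cli_mode=True)
results = (cached_square(web, 4), cached_square(cli, 4))
```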
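Local cluster
- Accessed via MediaWikiServices->getObjectCacheFactory()->getLocalClusterInstance().
- Configurable: Yes, via $wgMainCacheType.
- Behaviour: Fast (~1ms, from service memory), medium capacity, shared between application servers, not replicated across data centres.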
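WAN cache
- Accessed via MediaWikiServices->getMainWANObjectCache().
- Configurable: Yes, via $wgMainWANCache, which defaults to $wgMainCacheType.
- Behaviour: Fast (~1ms, from service memory), medium capacity, shared between application servers, with invalidation events replicated across data centres.
Compute and store values via the getWithSet method.
To invalidate caches, purge the key (rather than setting the key directly).
See also WANObjectCache on wikitech.wikimedia.org.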
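The compute-and-store pattern, and the advice to purge rather than set, can be sketched in a few lines. `TinyWanCache` is a hypothetical single-process model written for this page, not the WANObjectCache API, and it leaves out everything that makes the real service useful across data centres (such as broadcasting purges). It only shows the shape of the calling code.

```python
import time


class TinyWanCache:
    """Hypothetical, single-process model of a get-with-set cache."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expiry timestamp, value)

    def get_with_set(self, key, ttl, compute):
        """Return the cached value, or compute, store and return it."""
        entry = self._data.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]
        value = compute()
        self._data[key] = (self._clock() + ttl, value)
        return value

    def delete(self, key):
        """Purge a key so that the next reader recomputes it."""
        self._data.pop(key, None)


database = {"Main_Page": "old text"}  # hypothetical source of truth
cache = TinyWanCache()


def page_text(title):
    return cache.get_with_set(f"page_text:{title}", 3600, lambda: database[title])


before = page_text("Main_Page")      # computed from the source and cached
database["Main_Page"] = "new text"   # the source of truth changes...
stale = page_text("Main_Page")       # ...but the cache still serves the old value
cache.delete("page_text:Main_Page")  # purge instead of writing the new value
after = page_text("Main_Page")       # recomputed from the source
```

Purging keeps the callback as the only place that knows how to build the value, so writers never need to duplicate that logic or race with readers to store it.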
MicroStash
- Accessed via MediaWikiServices->getMicroStash().
- Configurable: Yes, via $wgMicroStashType, which defaults to CACHE_ANYTHING.
- Behaviour: Fast (~1ms, from service memory), medium capacity, shared between application servers, with invalidation done via TTL. Stored data is evicted only when its TTL expires, regardless of whether the data gets used.
Values in this store are stored centrally in the primary data centre (typically using Memcached as backend). Values are not replicated to other data centres, and data is evicted only when its time to live (TTL) elapses.
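Main stash
- Accessed via MediaWikiServices->getMainObjectStash().
- Configurable: Yes, via $wgMainStash.
- Behaviour: May involve a disk read (1-10ms), semi-persistent, shared between application servers, and replicated across data centres.
Uses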
Session store
- Accessed via Session objects, which are themselves accessed via SessionManager or RequestContext->getRequest()->getSession().
- Configured via $wgSessionCacheType.
This is not really a cache, in the sense that the data is not stored elsewhere.
Interwiki cache
See Interwiki cache for details, and also ClearInterwikiCache.php.
Parser cache
- Accessed via the ParserCache class.
- Backend configured by $wgParserCacheType (typically MySQL).
- Keys are canonical by page ID and populated when a page is parsed.
- Revision ID is verified on retrieval.
See Manual:Parser cache for details. See also purgeParserCache.php.
Message cache
- Accessed via MessageCache.
- Backend configurable by $wgMessageCacheType (defaults to $wgMainCacheType, with fallback to MySQL).
Revision text
- Accessed via SqlBlobStore::getBlob.
- Stored in the WAN cache, using key class SqlBlobStore-blob.
- Keys are verifiable and values immutable. Cache is populated on demand.
Background
The main use case for caching revision text (as opposed to fetching directly from the text table or External Storage) is for handling cases where the text of many different pages is needed by a single web request.
- Originally implemented in 2006 (r16549, commit 376014e).
- Process cache added in 2016 (git #I77575d6, git #Ic61ee91).
- Adopted by MessageCache in 2017 (git #Ib668e69).
This is primarily used by:
- Parsing wikitext. When parsing a given wiki page, the Parser needs the source of the current page, but also recursively needs the source of all transcluded template pages (and Lua module pages). It is not unusual for a popular article to indirectly transclude over 300 such pages. The use of Memcached saves time when saving edits and rendering page views.
- MessageCache. This is a wiki-specific layer on top of LocalisationCache, which consists primarily of message overrides from "MediaWiki:"-namespace pages on the given wiki. When building this blob, the source text of many different pages needs to be fetched. This is cached per-cluster in Memcached, and locally per-server (to reduce Memcached bandwidth; r11678, commit 6d82fa2).
Example
Key: WANCache:v:global:SqlBlobStore-blob:<wiki>:<content address>.
"content address" refers to content.content_address in the wiki's main database (e.g. "tt:1123").
This in turn refers to the text table or External Storage.
To reverse engineer which page or revision this relates to, find the content.content_id for the content address (SELECT content_id FROM content WHERE content_address = "tt:963546992";), then find the revision ID for that content slot (SELECT slot_revision_id FROM slots WHERE slot_content_id = 943285896;).
The revision ID can then be used on-wiki in a URL like https://en.wikipedia.org/w/index.php?oldid=951705319, or you can look it up in the revision and page tables.
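The two lookups can also be combined into one joined query. The sketch below runs it against an in-memory SQLite database that contains only the two tables and the columns named above; the single sample row in each table reuses the IDs from the text purely for illustration and says nothing about a real wiki database. The wiki name `enwiki` in the sample key, and the assumption that the key appears literally as shown (with no escaping of the content address), are also made up for this example.

```python
import sqlite3

# Toy schema with only the columns mentioned above; sample rows are illustrative.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE content (content_id INTEGER PRIMARY KEY, content_address TEXT);
    CREATE TABLE slots (slot_revision_id INTEGER, slot_content_id INTEGER);
    INSERT INTO content VALUES (943285896, 'tt:963546992');
    INSERT INTO slots VALUES (951705319, 943285896);
""")


def address_from_cache_key(cache_key):
    # Key layout: WANCache:v:global:SqlBlobStore-blob:<wiki>:<content address>
    # The content address itself contains a colon, so split only on the
    # first five separators.
    return cache_key.split(":", 5)[5]


def revision_ids_for_address(conn, content_address):
    """Resolve a content address to the revision IDs whose slots use it."""
    rows = conn.execute(
        "SELECT s.slot_revision_id"
        " FROM content AS c"
        " JOIN slots AS s ON s.slot_content_id = c.content_id"
        " WHERE c.content_address = ?",
        (content_address,),
    ).fetchall()
    return [row[0] for row in rows]


key = "WANCache:v:global:SqlBlobStore-blob:enwiki:tt:963546992"
address = address_from_cache_key(key)
revision_ids = revision_ids_for_address(db, address)
```

The helper returns a list because more than one slot row may point at the same content.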
Revision meta data
- Accessed via RevisionStore::getKnownCurrentRevision.
- Stored in the WAN cache, using key class revision-row-1.29.
- Keys are verifiable (by page and revision ID) and values immutable. Cache is populated on demand.
MessageBlobStore
Stores interface text used by ResourceLoader modules. It is similar to LocalisationCache, but includes the wiki-specific overrides (LocalisationCache is wiki-agnostic). These overrides come from the database, as wiki pages in the MediaWiki namespace.
- Accessed via MessageBlobStore.
- Stored in the WAN cache, using key class MessageBlobStore.
- Keys are verifiable (by ResourceLoader module name and hash of message keys). Values are mutable and expire after a week. Cache is populated on demand.
- All keys are purged when LocalisationCache is rebuilt. When a user saves a change to a MediaWiki-namespace page on the wiki, a subset of the keys is also purged.
Minification cache
ResourceLoader caches the minified versions of raw JavaScript and CSS input files.
- Accessed via ResourceLoader::filter.
- Stored locally on the server (APCu).
- Keys are verifiable (deterministic value). No purge strategy needed. Cache is populated on demand.
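One way to obtain a deterministic, verifiable key for this kind of cache is to derive it from the input itself. The sketch below does that with a toy minifier. How ResourceLoader actually builds its keys is not stated here, so the `minify:` prefix, the use of SHA-256, and the minifier are all assumptions made for illustration.

```python
import hashlib
import re

cache = {}        # plain dict standing in for the per-server store
minify_calls = 0  # counts how often the expensive step actually runs


def minify(source):
    """Toy minifier: strips /* ... */ comments and collapses whitespace."""
    global minify_calls
    minify_calls += 1
    without_comments = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    return re.sub(r"\s+", " ", without_comments).strip()


def cached_minify(source):
    # Same input always yields the same key and the same output,
    # so entries never need to be purged.
    key = "minify:" + hashlib.sha256(source.encode("utf-8")).hexdigest()
    if key not in cache:
        cache[key] = minify(source)
    return cache[key]


css = "a {\n  color: red; /* brand colour */\n}\n"
first = cached_minify(css)                           # miss: minified and stored
second = cached_minify(css)                          # hit: same key
changed = cached_minify(css.replace("red", "blue"))  # different input, different key
```

When a source file changes, its content maps to a new key; the old entry is simply never requested again.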
LESS compilation cache
ResourceLoader caches the meta data and parser output of LESS files it has compiled.
- Accessed via ResourceLoaderFileModule::compileLessFile.
- Stored locally on the server (APCu).
File content hasher
ResourceLoader caches the checksum of any file directly or indirectly used by a module. When serving the startup manifest to users, it needs the hashes of many thousands of files. To reduce I/O overhead, it caches this content hash locally, keyed by path and mtime.
- Accessed via FileContentsHasher.
- Stored locally on the server (APCu).
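Keying by path and mtime, as described above, can be sketched as follows. This is a minimal model rather than the FileContentsHasher implementation: the hash algorithm, the dict used as the store, and the function name are assumptions made for the example.

```python
import hashlib
import os
import tempfile

cache = {}      # plain dict standing in for the per-server store
file_reads = 0  # counts how often file contents are actually read


def file_content_hash(path):
    """Return a content hash for a file, cached by path and mtime."""
    global file_reads
    key = (path, os.stat(path).st_mtime_ns)  # a stat call is cheaper than a read
    if key not in cache:
        file_reads += 1
        with open(path, "rb") as handle:
            cache[key] = hashlib.sha1(handle.read()).hexdigest()
    return cache[key]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "module.js")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("console.log( 1 );")
    first = file_content_hash(path)
    second = file_content_hash(path)  # same path and mtime: no file read

    with open(path, "w", encoding="utf-8") as handle:
        handle.write("console.log( 2 );")
    # Set the mtime explicitly so the example does not depend on clock resolution.
    info = os.stat(path)
    os.utime(path, ns=(info.st_atime_ns, info.st_mtime_ns + 10_000_000_000))
    third = file_content_hash(path)   # new mtime: file is read and hashed again
```

A changed file gets a new mtime and therefore a new key, so stale hashes are never returned and nothing has to be purged.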
See also
- Architectural modules/Cache
- Manual:Performance tuning: How to configure your web server, cache proxy, and MediaWiki to improve performance.
- Manual:File cache: Simplistic cache mechanism that caches HTTP responses on-disk.
- Manual:Configuration settings#Cache: Various configuration settings to set up caching backends and enable parts of the application to use them.
- Wikimedia BagOStuff (ObjectCache), the library providing this functionality