Condition variables and spurious wakeup

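A note on "spurious wakeup" in connection with the condition variable synchronization primitive. There seems to be no established Japanese translation of the term, so this article uses the English term as-is*1. "Spurious" means "false; pseudo; sham"*2, and "wakeup" means that a thread waiting on a condition variable is unblocked and resumes execution (that is, a sleeping thread wakes up).

Summary:

● A spurious wakeup is the phenomenon in which a wait on a condition variable is unblocked even though no notification was issued. It arises from causes that users of the synchronization primitive cannot control, such as the library's internal implementation or hardware/OS circumstances. Its frequency is usually not documented, but in general it should occur only with low probability.
● To cope with spurious wakeups, always write the wait on the condition variable and the check of the wait condition inside a loop. With the C++11 standard library in particular, using the wait overloads that take a Predicate prevents this coding mistake.
● Many environments explicitly state that spurious wakeups can occur*3: the C++11 standard library, the Boost.Thread library, POSIX (pthread), the Windows API, Java, and others.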

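A concrete example of spurious wakeup

Consider a simple producer-consumer pattern implemented with the C++11 standard library. The producer thread notifies the condition variable after adding data to the FIFO queue. The consumer thread waits on the condition variable avail while the queue is empty, and takes data out of the FIFO queue once avail has been notified. In other words, the condition variable avail serves as an inter-thread synchronization mechanism that waits until the condition "data exists in the queue" holds. In the wait avail.wait(lk) on the line marked ★, it is possible for the thread to be unblocked and start running even though no notification was sent to the condition variable.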

#include <deque>
#include <mutex>
#include <condition_variable>

struct Data { /*...*/ };
std::deque<Data> queue;         // FIFO queue
std::mutex mtx;                 // protects queue
std::condition_variable avail;  // is there valid data in queue?

// producer thread
void producer()
{
  for (;;) {
    Data data{};  // generate the data (details elided)
    {
      std::lock_guard<std::mutex> lk(mtx);
      queue.push_back(data);
    }
    avail.notify_one();  // notify the condition variable avail
  }
}

// consumer thread
void consumer()
{
  for (;;) {
    Data data;
    {
      std::unique_lock<std::mutex> lk(mtx);
      while (queue.empty()) {
        avail.wait(lk);  // ★ wait for a notification on the condition variable avail
      }
      data = queue.front();
      queue.pop_front();
    }
    /* consume data */;
  }
}

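In the code above, the wait on the condition variable and the check of the wait condition queue.empty() are written inside a loop. Therefore, even if a spurious wakeup unblocks the thread, the thread detects that the wait condition is not satisfied (there is no data in the queue) and waits on the condition variable again. For reference, an incorrect use of the condition variable and the problem that a spurious wakeup then causes are shown below.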

// BUG: incorrect implementation of the consumer thread
{
  std::unique_lock<std::mutex> lk(mtx);
  if (queue.empty()) {  // BUG: the wait is not enclosed in a loop
    avail.wait(lk);  // (1) if a spurious wakeup unblocks the thread here...
  }
  // (2) ...data is taken from the FIFO queue even though it is empty!
  data = queue.front();  queue.pop_front();
}

For this boilerplate, an overload of the wait member function that takes a Predicate as its second argument is provided.

template <class Predicate>
  void wait(unique_lock<mutex>& lock, Predicate pred);
// Effects: while (!pred()) wait(lock);
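Read literally, the Effects clause above amounts to the following free function. This is only a sketch for illustration: the name wait_with_predicate is hypothetical and is not part of the standard library, and the assert merely spells out the precondition that the caller already holds the lock.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical helper (not a standard API): behaves like
// cv.wait(lock, pred) as described by the Effects clause.
template <class Predicate>
void wait_with_predicate(std::condition_variable& cv,
                         std::unique_lock<std::mutex>& lock,
                         Predicate pred)
{
  assert(lock.owns_lock());  // the caller must already hold the mutex
  while (!pred()) {          // re-check the condition after every wakeup
    cv.wait(lock);           // may return on a notification or spuriously
  }
}
```

If the predicate is already true on entry, the function returns without blocking at all, which is also how the member function overload behaves.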

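Because the explicit loop becomes unnecessary and the risk of misusing the condition variable goes away, this overload should be used whenever possible. Moreover, a return from the wait member function now means that the wait condition holds, so spurious wakeups are completely hidden from the user.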

{
  std::unique_lock<std::mutex> lk(mtx);
  // wait until the condition "data exists in the queue" holds
  avail.wait(lk, [&]{ return !queue.empty(); });
  data = queue.front();  queue.pop_front();
}
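Putting the pieces together, the following is a minimal self-contained sketch of the producer-consumer pattern using the Predicate overload. The function name sum_via_queue, the int payload, and the fixed item count n are assumptions made for illustration (the original code loops forever over an unspecified Data type).

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>

// Hypothetical demo: a producer pushes 1..n, the calling thread consumes
// the values and returns their sum.
int sum_via_queue(int n)
{
  std::deque<int> queue;          // FIFO queue
  std::mutex mtx;                 // protects queue
  std::condition_variable avail;  // notified after each push

  std::thread producer([&] {
    for (int i = 1; i <= n; ++i) {
      {
        std::lock_guard<std::mutex> lk(mtx);
        queue.push_back(i);
      }
      avail.notify_one();
    }
  });

  int sum = 0;
  for (int received = 0; received < n; ++received) {
    std::unique_lock<std::mutex> lk(mtx);
    // Returns only when the queue is non-empty, spurious wakeup or not.
    avail.wait(lk, [&] { return !queue.empty(); });
    assert(!queue.empty());
    sum += queue.front();
    queue.pop_front();
  }

  producer.join();
  return sum;
}
```

Calling sum_via_queue(100) should yield 5050 regardless of how the two threads interleave, because the consumer never touches the queue unless the predicate holds.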

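C++11 standard library: std::condition_variable

The condition variable objects added in the C++11 standard library (std::condition_variable and std::condition_variable_any) are documented as being able to unblock and resume a thread spuriously. Quoted from N3337 30.5.1/p10 (the phrase to note is "or spuriously").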

void wait(unique_lock<mutex>& lock);
Effects:

Atomically calls lock.unlock() and blocks on *this.

When unblocked, calls lock.lock() (possibly blocking on the lock), then returns.

The function will unblock when signaled by a call to notify_one() or a call to notify_all(), or spuriously.

If the function exits via an exception, lock.lock() shall be called prior to exiting the function scope.
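The timed waits are subject to the same caveat, and they too have Predicate overloads. As a minimal sketch, the helper below (wait_ready is a hypothetical name, and the plain bool flag is an assumption for illustration) uses wait_for with a predicate. That overload returns the value of the predicate, so false means the timeout elapsed with the condition still unmet, and a spurious wakeup alone never makes it return true.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: waits up to `timeout` for `ready` to become true.
// `ready` must only be modified while holding `mtx`.
bool wait_ready(std::condition_variable& cv, std::mutex& mtx,
                const bool& ready, std::chrono::milliseconds timeout)
{
  std::unique_lock<std::mutex> lk(mtx);
  const bool ok = cv.wait_for(lk, timeout, [&] { return ready; });
  assert(lk.owns_lock());  // the lock is held again when wait_for returns
  return ok;               // true: condition holds, false: timed out
}
```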

Boost.Thread library: boost::condition_variable

The same holds for the Boost.Thread condition variable objects (boost::condition_variable and boost::condition_variable_any).

Effects:
Atomically call lock.unlock() and blocks the current thread. The thread will unblock when notified by a call to this->notify_one() or this->notify_all(), or spuriously. When the thread is unblocked (for whatever reason), the lock is reacquired by invoking lock.lock() before the call to wait returns. The lock is also reacquired by invoking lock.lock() if the function exits with an exception.
http://www.boost.org/doc/html/thread/synchronization.html#thread.synchronization.condvar_ref.condition_variable.wait

POSIX (pthread): pthread_cond_t

The pthread condition variable (type pthread_cond_t). According to wikipedia:en:Spurious_wakeup, pthread may be where the term originated?

When using condition variables there is always a Boolean predicate involving shared variables associated with each condition wait that is true if the thread should proceed. Spurious wakeups from the pthread_cond_timedwait() or pthread_cond_wait() functions may occur. Since the return from pthread_cond_timedwait() or pthread_cond_wait() does not imply anything about the value of this predicate, the predicate should be re-evaluated upon such return.
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_wait.html

(The following sections are informative.)
RATIONALE
An added benefit of allowing spurious wakeups is that applications are forced to code a predicate-testing-loop around the condition wait. This also makes the application tolerate superfluous condition broadcasts or signals on the same condition variable that may be coded in some other part of the application. The resulting applications are thus more robust. Therefore, IEEE Std 1003.1-2001 explicitly documents that spurious wakeups may occur.
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_signal.html
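The "predicate-testing-loop" that the rationale describes can be sketched as follows, calling the pthread API from C++. The names Flag, set_flag, and wait_for_flag are hypothetical and exist only for this illustration; error handling is kept to a minimum.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical shared state: a flag guarded by a mutex and a condition variable.
struct Flag {
  pthread_mutex_t mtx;
  pthread_cond_t cond;
  bool ready;
};

// Thread entry point: sets the flag under the mutex, then signals.
extern "C" void* set_flag(void* arg)
{
  Flag* f = static_cast<Flag*>(arg);
  pthread_mutex_lock(&f->mtx);
  f->ready = true;
  pthread_mutex_unlock(&f->mtx);
  pthread_cond_signal(&f->cond);
  return nullptr;
}

// Starts the setter thread and waits until the flag becomes true.
bool wait_for_flag()
{
  Flag f;
  f.ready = false;
  pthread_mutex_init(&f.mtx, nullptr);
  pthread_cond_init(&f.cond, nullptr);

  pthread_t th;
  if (pthread_create(&th, nullptr, set_flag, &f) != 0) {
    pthread_cond_destroy(&f.cond);
    pthread_mutex_destroy(&f.mtx);
    return false;
  }

  pthread_mutex_lock(&f.mtx);
  while (!f.ready) {                     // re-evaluate the predicate on every return
    pthread_cond_wait(&f.cond, &f.mtx);  // may return spuriously
  }
  assert(f.ready);
  const bool seen = f.ready;
  pthread_mutex_unlock(&f.mtx);

  pthread_join(th, nullptr);
  pthread_cond_destroy(&f.cond);
  pthread_mutex_destroy(&f.mtx);
  return seen;
}
```

Because the predicate is tested before the first wait as well, the code is also correct when the setter thread runs to completion before the waiter ever blocks.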

Windows API: CONDITION_VARIABLE

The Windows API condition variable (type CONDITION_VARIABLE) was added in Windows Vista. The API does not exist on Windows XP and earlier.

Condition variables are subject to spurious wakeups (those not associated with an explicit wake) and stolen wakeups (another thread manages to run before the woken thread). Therefore, you should recheck a predicate (typically in a while loop) after a sleep operation returns.
http://msdn.microsoft.com/en-us/library/windows/desktop/ms682052.aspx

Java: Object#wait(), Condition interface

In Java, every object (of type Object) can be used as a monitor. In addition, the java.util.concurrent.locks package added in Java 1.5 provides a condition variable (the Condition interface). The same kind of spurious wakeup can occur in these waits as well.

A thread can also wake up without being notified, interrupted, or timing out, a so-called spurious wakeup. While this will rarely occur in practice, applications must guard against it by testing for the condition that should have caused the thread to be awakened, and continuing to wait if the condition is not satisfied. In other words, waits should always occur in loops, like this one: (snip)
http://docs.oracle.com/javase/6/docs/api/java/lang/Object.html#wait%28long%29

When waiting upon a Condition, a "spurious wakeup" is permitted to occur, in general, as a concession to the underlying platform semantics. This has little practical impact on most application programs as a Condition should always be waited upon in a loop, testing the state predicate that is being waited for. An implementation is free to remove the possibility of spurious wakeups but it is recommended that applications programmers always assume that they can occur and so always wait in a loop.
http://docs.oracle.com/javase/6/docs/api/java/util/concurrent/locks/Condition.html

Addendum 2013-10-18: reportedly, the official Java documentation did not state this explicitly up to JDK 1.4.

This is the problem that wait can resume even without a notify; the Japanese JavaDoc calls it 「スプリアスウェイクアップ」 (a transliteration of "spurious wakeup") for Object#wait and 「見せかけの起動」 ("sham startup") for Condition#await. For this reason, Object#wait and Condition#await must be enclosed in a loop on the resume condition. This became well known after it was pointed out in Effective Java. Consequently, the JavaDoc up to JDK 1.4 contains no description of it.
きしだのHatena - 正しいスレッドプログラム

*1: As far as could be found on the Web: 「スプリアス・ウェイクアップ」 (a transliteration), 「見せかけの起動」 ("sham startup"), 「偽りの目覚め」 ("false awakening"), 「疑似覚醒」 ("pseudo-awakening"), and so on. Most of them appear in explanations of this phenomenon in Java.
*2: http://ejje.weblio.jp/content/spurious
*3: A language specification or thread library design in which "a spurious wakeup never occurs" is also conceivable.

yohhoyの日記

A diary for keeping technical notes