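This came as a complete surprise to me. Can someone explain the reason behind readIORef blocking while an atomicModifyIORef is in flight? I understand the assumption is that the modifying function supplied to the latter should be very quick, but that is beside the point.

Here is a sample piece of code that reproduces what I am talking about: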

{-# LANGUAGE NumericUnderscores #-}
module Main where

import Control.Concurrent
import Control.Concurrent.Async
import Control.Monad
import Data.IORef
import Say (sayString)
import Data.Time.Clock
import System.IO.Unsafe

main :: IO ()
main = do
  ref <- newIORef (10 :: Int)
  before <- getCurrentTime
  race_ (threadBusy ref 10_000_000) (threadBlock ref)
  after <- getCurrentTime
  sayString $ "Elapsed: " ++ show (diffUTCTime after before)


threadBlock :: IORef Int -> IO ()
threadBlock ref = do
  sayString "Below threads are totally blocked on a busy IORef"
  race_ (forever $ sayString "readIORef: Wating ..." >> threadDelay 500_000) $ do
    -- need to give a bit of time to ensure ref is set to busy by another thread
    threadDelay 100_000
    x <- readIORef ref
    sayString $ "Unblocked with value: " ++ show x


threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef' ref (\x -> unsafePerformIO (threadDelay n) `seq` (x * 10000, x))
  -- threadDelay is not required above, a simple busy loop that takes a while works just as well
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y

Running this code produces:

$ stack exec --package time --package async --package say --force-dirty --resolver nightly -- ghc -O2 -threaded atomic-ref.hs && ./atomic-ref
Setting IORef to busy for 10000000 μs
Below threads are totally blocked on a busy IORef
readIORef: Wating ...
Unblocked with value: 100000
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Elapsed: 10.003357215s

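Note that readIORef: Wating ... is printed only twice, once before the blocking and once after. This is very unexpected, since it is an action that runs in a totally separate thread. It means that blocking on an IORef affects threads other than the one that called readIORef, which is even more surprising.

Are these semantics expected, or is this a bug? If it is not a bug, why is it expected? I will open a GHC bug later, unless someone has an explanation for this behavior that I haven't thought of. I would not be surprised if this is some limitation of the GHC runtime, in which case I will provide an answer here later. Regardless of the outcome, it is very useful to know about this behavior.

Edit 1

A comment requested the busy loop I tried that does not require unsafePerformIO, so here it is: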

threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef ref (\x -> busyLoop 10000000000 `seq` (x * 10000, x))
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y

-- xor comes from Data.Bits: add `import Data.Bits (xor)` to the import list
busyLoop :: Int -> Int
busyLoop n = go 1 0
  where
    go acc i
      | i < n = go (i `xor` acc) (i + 1)
      | otherwise = acc

The outcome was exactly the same, except that the run time was slightly different.

Setting IORef to busy for 10000000 μs
Below threads are totally blocked on a busy IORef
readIORef: Wating ...
Unblocked with value: 100000
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Elapsed: 8.545412986s

Edit 2

It turns out that sayString was the reason the output did not appear. This is the output when sayString is swapped for putStrLn:

Below threads are totally blocked on a busy IORef
Setting IORef to busy for 10000000 μs
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Unblocked with value: 100000
Elapsed: 10.002272691s

That still doesn't answer the question of why readIORef blocks. In fact, I just stumbled upon a quote from the book Haskell High Performance by Samuli Thomasson that says blocking should not happen:

[Image: quotation from the book; not reproduced in this text]

1 Answer

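I think I understand what is going on now. TL;DR: readIORef is not a blocking operation! Many thanks to everyone who commented on the question.

The way I break down the logic mentally is as follows (the same code as in the question, but with thread names added):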
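
A minimal sketch of that claim, not taken from the question's program: the IORef is filled with a thunk that throws when forced, which stands in for a computation that has not finished. The helper name readLeavesThunkUnforced is hypothetical. If readIORef evaluated the contents, the exception would escape at the read; instead it only shows up once the value is forced explicitly.

```haskell
module Main where

import Control.Exception (ErrorCall, evaluate, try)
import Data.IORef

-- Hypothetical helper: True means readIORef returned the stored thunk
-- without evaluating it, and only the explicit force raised the error.
readLeavesThunkUnforced :: IO Bool
readLeavesThunkUnforced = do
  ref <- newIORef (10 :: Int)
  -- The lazy variant stores an unevaluated thunk; this one throws when forced.
  atomicModifyIORef ref (\x -> (error "forced" `seq` x * 10000, ()))
  -- An exception would escape here if readIORef evaluated the contents.
  x <- readIORef ref
  r <- try (evaluate x) :: IO (Either ErrorCall Int)
  pure (either (const True) (const False) r)

main :: IO ()
main = readLeavesThunkUnforced >>= print  -- prints True
```

The read itself returns a reference to whatever is stored. Any waiting happens later, at the point where the value is actually needed.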


threadBlock :: IORef Int -> IO ()
threadBlock ref = do
  race_ ({- Thread C -} forever $ sayString "readIORef: Wating ..." >> threadDelay 500_000) $ do
    {- Thread B -}
    threadDelay 100_000
    x <- readIORef ref
    sayString $ "Unblocked with value: " ++ show x

threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do {- Thread A -}
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef' ref (\x -> unsafePerformIO (threadDelay n) `seq` (x * 10000, x))
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y
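
  • Thread A updates the contents of ref with a thunk that will be filled in once the computation unsafePerformIO (threadDelay n) `seq` (x * 10000, x) has finished. The important part is that atomicModifyIORef' is most likely implemented with CAS (compare-and-swap), and the swap succeeds: the expected value matched, and the new value was written as a thunk that has not been evaluated yet. Because atomicModifyIORef' is strict, it cannot return until the value has been computed, which takes 10 seconds. So thread A blocks.
  • Thread B reads the thunk from ref with readIORef, without blocking. However, as soon as it tries to print x, the new contents of the thunk, it has to stop and wait until the thunk is filled with a value, and that value is still being computed. Because it has to wait, it looks as if it is blocked.
  • Thread C is supposed to print a message with sayString every 0.5 seconds, but it fails to do so, so it appears blocked as well. From a quick look at the say package and GHC.IO.Handle, it seems that the Handle for stdout is held by thread B, because printing in the say package is meant to happen without interleaving. Thread C therefore cannot print anything either, and it looks blocked too. That is why switching to putStrLn unblocked thread C and allowed it to print a message every 0.5 seconds.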
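
The ordering described for thread A (swap in a thunk first, evaluate it afterwards) can be sketched in a single thread. This is an assumed model written on top of the lazy atomicModifyIORef, not the actual source of atomicModifyIORef'; the names modifyThenForce and swapHappensBeforeForce are hypothetical. A thunk that throws when forced stands in for the long-running computation, so the example stays deterministic.

```haskell
module Main where

import Control.Exception (ErrorCall, evaluate, try)
import Data.Either (isLeft)
import Data.IORef

-- Hypothetical model of a strict atomic modify: the thunk is swapped in
-- first, then the calling thread forces the new contents before returning.
modifyThenForce :: IORef a -> (a -> (a, b)) -> IO b
modifyThenForce ref f = do
  r <- atomicModifyIORef ref $ \old ->
         let (new, res) = f old in (new, new `seq` res)
  evaluate r

-- Reports (the call failed while forcing, the stored contents fail when forced).
-- The second component shows the swap had already happened.
swapHappensBeforeForce :: IO (Bool, Bool)
swapHappensBeforeForce = do
  ref <- newIORef (10 :: Int)
  called <- try (modifyThenForce ref (\x -> (error "never finishes" `seq` x * 10000, x)))
              :: IO (Either ErrorCall Int)
  contents <- readIORef ref  -- returns at once; the contents are not forced here
  forced <- try (evaluate contents) :: IO (Either ErrorCall Int)
  pure (isLeft called, isLeft forced)

main :: IO ()
main = swapHappensBeforeForce >>= print  -- prints (True,True)
```

Under this model another thread can already see the new thunk while the caller is still evaluating it, which matches what thread B observes: the read succeeds, and the wait only begins when the value is demanded.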

This explanation is definitely convincing to me, but if someone has a better one I will gladly accept another answer.

answered 2020-05-22T17:18:32.697