parsing - Applicative Parser deriving Alternative without empty

Question

From reading this answer, I understand that the inclusion of empty in Alternative was largely a design decision to make Alternative a monoid (and thus more powerful). It seems to me this was also because otherwise you couldn't express any laws for Alternative.

But this is a pain if I have a generic applicative parser like so:

newtype Parser err src target = Parser (ExceptT err (State [src]) target)
  deriving (Functor, Applicative, Alternative, Monad, MonadState [src], MonadError err)

Clearly, we can get the same behavior as <|> and many/some from Control.Applicative with:

option :: Parser e s t -> Parser e s t -> Parser e s t
option parserA parserB = do
  state <- get
  parserA `catchError` \_ -> put state >> parserB

many :: Parser e s t -> Parser e s [t]
many parser = some parser `option` return []

some :: Parser e s t -> Parser e s [t]
some parser = (:) <$> parser <*> many parser

Even though none of these use empty, it seems like I am forced to re-implement them instead of deriving Alternative, because I can't conceive a generic way to instance empty for it (of course, I'd still need to instance <|> to get the preservation of state on parserA error, but then I could get some, many, optional and friends for free).
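
For anyone who wants to try the definitions above, here is a minimal self-contained sketch. It assumes the mtl package; runParser and satisfy are hypothetical helpers that are not part of my real code, added only so the combinators can be run.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Except (ExceptT, MonadError, catchError, runExceptT, throwError)
import Control.Monad.State (MonadState, State, get, put, runState)

newtype Parser err src target = Parser (ExceptT err (State [src]) target)
  deriving (Functor, Applicative, Monad, MonadState [src], MonadError err)

-- Hypothetical helper: run a parser, returning the result and the leftover input.
runParser :: Parser e s t -> [s] -> (Either e t, [s])
runParser (Parser p) = runState (runExceptT p)

-- Hypothetical primitive: consume one token matching the predicate,
-- or fail with the supplied error without consuming anything.
satisfy :: e -> (s -> Bool) -> Parser e s s
satisfy err ok = do
  input <- get
  case input of
    (x : rest) | ok x -> put rest >> return x
    _ -> throwError err

-- Try parserA; if it fails, restore the saved input and try parserB.
option :: Parser e s t -> Parser e s t -> Parser e s t
option parserA parserB = do
  state <- get
  parserA `catchError` \_ -> put state >> parserB

many :: Parser e s t -> Parser e s [t]
many parser = some parser `option` return []

some :: Parser e s t -> Parser e s [t]
some parser = (:) <$> parser <*> many parser
```

In GHCi, runParser (many (satisfy "not an a" (== 'a'))) "aab" should evaluate to (Right "aa","b"). The explicit put state in option matters because ExceptT sits on top of State here, so input consumed before a failure would otherwise stay consumed.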

Digging into Parsec's source, it seems that it subverts this because it doesn't allow for custom error types (or at least, Parsec isn't parameterized by a custom error type):

instance Applicative.Alternative (ParsecT s u m) where
    empty = mzero
    (<|>) = mplus

instance MonadPlus (ParsecT s u m) where
    mzero = parserZero
    mplus p1 p2 = parserPlus p1 p2

-- | @parserZero@ always fails without consuming any input. @parserZero@ is defined
-- equal to the 'mzero' member of the 'MonadPlus' class and to the 'Control.Applicative.empty' member
-- of the 'Control.Applicative.Alternative' class.

parserZero :: ParsecT s u m a
parserZero
    = ParsecT $ \s _ _ _ eerr ->
      eerr $ unknownError s

unknownError :: State s u -> ParseError
unknownError state        = newErrorUnknown (statePos state)

newErrorUnknown :: SourcePos -> ParseError
newErrorUnknown pos
    = ParseError pos []

Taking inspiration from this, it seems like the only reasonable way around this is to have my generic parser wrap the user error type with something like:

data ParserError err = UserError err | UnknownError

newtype Parser err src target = Parser (ExceptT (ParserError err) (State [src]) target)
  deriving (Functor, Applicative, Monad, MonadState [src], MonadError (ParserError err))

And then, in a hand-written Alternative instance, empty can be:

empty = throwError UnknownError

This just feels wrong, though. This wrapping only exists to appease the requirement for an empty and it makes consumers of this generic parser do more work to handle errors (they also must handle UnknownError now and unwrap their custom errors). Is there any way to avoid this?
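
To make the workaround concrete, here is a rough sketch of what I mean, assuming the mtl package. The Alternative instance is written by hand so that <|> restores the input; runParser and satisfy are hypothetical helpers for illustration only.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Applicative (Alternative (..))
import Control.Monad.Except (ExceptT, MonadError, catchError, runExceptT, throwError)
import Control.Monad.State (MonadState, State, get, put, runState)

-- The wrapper exists only so that 'empty' has some error to throw.
data ParserError err = UserError err | UnknownError
  deriving (Eq, Show)

newtype Parser err src target
  = Parser (ExceptT (ParserError err) (State [src]) target)
  deriving (Functor, Applicative, Monad, MonadState [src], MonadError (ParserError err))

instance Alternative (Parser err src) where
  empty = throwError UnknownError
  -- Restore the saved input before trying the second branch.
  parserA <|> parserB = do
    saved <- get
    parserA `catchError` \_ -> put saved >> parserB

-- Hypothetical helper: run a parser, returning the result and the leftover input.
runParser :: Parser err src t -> [src] -> (Either (ParserError err) t, [src])
runParser (Parser p) = runState (runExceptT p)

-- Hypothetical primitive: consume one matching token or fail with a user error.
satisfy :: err -> (src -> Bool) -> Parser err src src
satisfy err ok = do
  input <- get
  case input of
    (x : rest) | ok x -> put rest >> return x
    _ -> throwError (UserError err)
```

With this instance, some and many come from the class defaults in Control.Applicative, but every consumer now has to pattern match on UserError and UnknownError, which is the extra work I would like to avoid.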

score 4 · Accepted Answer

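This is a problem with the standard base hierarchy: it is not very modular. In a perfect world, aiming for maximum modularity, Alternative would be split into three type classes.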

See the definition of Alternative in the PureScript world:

https://pursuit.purescript.org/packages/purescript-control/3.3.0/docs/Control.Alternative#t:Alternative

Like this:

class Functor f <= Alt f where
  alt :: forall a. f a -> f a -> f a  -- alt is (<|>)

class Alt f <= Plus f where
  empty :: forall a. f a

class (Applicative f, Plus f) <= Alternative f

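So if the type class hierarchy were modular enough, you could implement Alt for your Parser type (and get all the functions that only need <|>) without implementing Alternative. If Alternative is a Monoid, then Alt is a Semigroup: you can append elements, but you have no empty element.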
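
As a rough Haskell sketch of the same idea, assuming the mtl package: the Alt class, the <!> operator, and the manyA/someA/optionalA combinators below are defined locally for illustration and are not part of base (the semigroupoids package offers a comparable class in Data.Functor.Alt). runParser and satisfy are likewise hypothetical helpers.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Except (ExceptT, MonadError, catchError, runExceptT, throwError)
import Control.Monad.State (MonadState, State, get, put, runState)

infixl 3 <!>

-- Hypothetical local class mirroring PureScript's Alt: choice without 'empty'.
class Functor f => Alt f where
  (<!>) :: f a -> f a -> f a

-- The user's error type is used directly; no wrapper is needed.
newtype Parser err src target = Parser (ExceptT err (State [src]) target)
  deriving (Functor, Applicative, Monad, MonadState [src], MonadError err)

instance Alt (Parser err src) where
  -- Restore the saved input before trying the second branch.
  parserA <!> parserB = do
    saved <- get
    parserA `catchError` \_ -> put saved >> parserB

-- Combinators that need only Alt and Applicative, never 'empty'.
manyA :: (Alt f, Applicative f) => f a -> f [a]
manyA v = someA v <!> pure []

someA :: (Alt f, Applicative f) => f a -> f [a]
someA v = (:) <$> v <*> manyA v

optionalA :: (Alt f, Applicative f) => f a -> f (Maybe a)
optionalA v = Just <$> v <!> pure Nothing

-- Hypothetical helper: run a parser, returning the result and the leftover input.
runParser :: Parser e s t -> [s] -> (Either e t, [s])
runParser (Parser p) = runState (runExceptT p)

-- Hypothetical primitive: consume one matching token or fail with the given error.
satisfy :: e -> (s -> Bool) -> Parser e s s
satisfy err ok = do
  input <- get
  case input of
    (x : rest) | ok x -> put rest >> return x
    _ -> throwError err
```

The trade-off is that these combinators are not the ones from Control.Applicative, so library code written against Alternative will not accept this Parser.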

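Note that in earlier versions of GHC's base package, Applicative was not a superclass of Monad, and Semigroup was not a superclass of Monoid. So you can expect that something similar (modularization) may happen to Alternative in the future.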
