问

如何使用Alex monadic lexer与Happy？

Hide-my-love 发布于 2023-02-13 15:40

我正在尝试学习使用Alex + Happy来构建解析器,特别是我有兴趣学习使用monadAlex 的包装器.我已经看过亚历克斯和快乐的文档,但对于我来说,他们都非常缺乏将它们一起使用的任何有用信息.我设法让它们basic和posn包装纸一起工作,但我很茫然monad.

我已经看过关于Alex,Happy和monadic词法分析器的不同问题(包括:是否有关于使用Alex + Happy构建简单解释器的任何教程？但没有一个能够提供一个使用的简单示例monad.

大多数在线代码使用Happy与自定义词法分析器函数,或使用basic或posnAlex包装器.

这是一个类似ini的语法的简单词法分析器:

{
module IniLexer where
}

%wrapper "monad"



$spaces = [\ \t]
$alpha = [a-zA-Z]
$digits = [0-9]
$alnum = [$alpha$digits]


@identifier = $alpha $alnum*

@comment = \#.*

@integer = $digits+

@boolean = (true) | (false)

@string = \"[^\"]*\"


:-

@integer    { mkL LInteger }
@boolean    { mkL LBoolean }
@string     { mkL LString }

@identifier  { mkL LIdentifier }

\[@identifier\] { mkL LSection }

=           { mkL LAssign }

\;          { mkL LEndAssign }
@comment    ;
[\ \t \n]+  ;


{

data LexemeClass = LInteger | LBoolean | LString | LIdentifier | LSection | LAssign | LEndAssign | LEOF
    deriving (Eq, Show)


mkL :: LexemeClass -> AlexInput -> Int -> Alex Token
mkL c (p, _, _, str) len = let t = take len str
                           in case c of
                                LInteger -> return (IntegerNum ((read t) :: Integer) p)
                                LBoolean -> return (BooleanVal (if t == "true"
                                                                   then True
                                                                   else False
                                                               ) p)
                                LString -> return (StringTxt (take (length t - 2) (drop 1 t)) p)
                                LIdentifier -> return (Identifier t p)
                                LSection -> return (SectionHeader (take (length t - 2) (drop 1 t)) p)
                                LAssign -> return (Assignment p)
                                LEndAssign -> return (EndAssignment p)


-- No idea why I have to write this myself. Documentation doesn't mention it.
alexEOF :: Alex Token
alexEOF = return Eof



data Token = SectionHeader {identifier :: String, position :: AlexPosn} |
             Identifier {name :: String, position :: AlexPosn}          |
             Assignment {position :: AlexPosn}                          |
             EndAssignment {position :: AlexPosn}                       |
             IntegerNum {value :: Integer, position :: AlexPosn}        |
             BooleanVal {istrue :: Bool, position :: AlexPosn}          |
             StringTxt  {text :: String, position :: AlexPosn}          |
             Eof
    deriving (Eq, Show)


}

这是相对快乐的解析器:

{
module Main where

import IniLexer

}



%name parseIniFile
%error {parseError}
%lexer  {alexMonadScan} {AlexEOF}
%monad {Alex}
%tokentype {Token}
%token
    SECTION     {SectionHeader name _ }
    IDENT       {Identifier name _ }
    '='         {Assignment _ }
    INT         {IntegerNum value _ }
    BOOL        {BooleanVal istrue _ }
    STRING      {StringTxt text _ }
    ';'         {EndAssignment _ }


%%


ConfigFile : SequenceOfSections                    {reverse $1}

SequenceOfSections : {- empty -}                   {   []  }
                   | SequenceOfSections Section    {$2 : $1}


Section : SECTION SectionBody                      {Section (identifier $1) (reverse $2)}


SectionBody : {- empty -}        {[]}
            | SectionBody AssignmentLine ';' {$2 : $1}


AssignmentLine : IDENT '=' Value      {(name $1, $3)}

Value : INT         {IntV (value $1)}
      | BOOL        {BoolV (istrue $1)}
      | STRING      {StringV (text $1)}


{

data Value = IntV Integer | BoolV Bool | StringV String
    deriving (Eq, Show)

data Section = Section String [(String, Value)]
    deriving (Eq, Show)

data IniFile = IniFile [Section]
    deriving (Eq, Show)


parseError :: [Token] -> Alex a
parseError t = fail "a"

main = do
    s <- getContents
    print $ parseIniFile $ runAlex s alexMonadScan

}

这引发了很多编译器错误:

[...]
Couldn't match expected type `(AlexReturn t1 -> Alex a0) -> t0'
                with actual type `Alex Token'
    The function `alexMonadScan' is applied to one argument,
    but its type `Alex Token' has none
[...]

我该如何修改要使用的解析器alexMonadScan？该快乐文档不是完全清楚并努力不使用任何明确的实例(或提供的例子,从我的观点看clarying失败).

如果需要,我可以发布我posn的相同lexer +解析器的版本.

1 个回答

据我所知,你的词法分析器的定义是完全正确的.假设那里没有bug,你需要解决的唯一问题就是解析器的配置.首先,你使用的词法分析器是错误的.虽然该函数是Alex词法分析器的接口,但它具有类型
```
alexMonadScan :: Alex result
```
但是乐思的快乐想要的是类型
```
lexer :: (Token -> P a) -> P a
```
P我们使用的monad 在哪里？这就是说词法分子应该给我们Alex a一个延续的时间.我们需要一个简单的包装器:
```
lexwrap :: (Token -> Alex a) -> Alex a
lexwrap cont = do
    token <- alexMonadScan
    cont token
```
或者等价的
```
lexwrap = (alexMonadScan >>=)
```
其次,alexEOF在%lexer指令中使用会导致解析器在每次输入时失败.您在那里提供的名称将插入到生成代码中的case语句的分支中,因此您必须使用数据构造函数的名称而不是值 - 特别是,您需要使用Alex将发出的数据构造函数发出EOF信号.

这使得解析器中的词法分析器行略有不同.
```
%lexer {lexwrap} {Eof}
```
(作为旁注,这就是你需要alexEOF = return Eof自己编写的原因.你在里面返回的数据构造函数alexEOF需要与你识别为Happy的数据构造函数进行模式匹配,以结束文件.Alex无法知道你要发射什么,而且Happy无法知道你选择通过Alex发射什么.)

现在下一个问题是你的parseError类型不正确.当只使用monad时,这确实是你需要的类型,但是当你在混合中添加词法分析器时,你的parseError必须有不同的类型.此外,可能不建议使用fail,因此这里有一个稍好的定义:
```
parseError :: Token -> Alex a
parseError _ = alexError "Why is using happy and alex so hard"
```
最后,这里的主要功能有点奇怪.我们想要调用解析器的方法是使用runAlex调用它.所以这里有一个快速包装器.传入的字符串是您要解析的字符串.
```
parse :: String -> Either String [Section]
parse s = runAlex s parseIniFile
```
函数解析的类型由parseIniFile的定义决定.在这里,它是Alex [Section]如此的Either String [Section]被返回.

我认为这就是一切.
2023-02-13 15:43 回答

棒棒冰

撰写答案

今天，你开发时遇到什么问题呢？

立即提问

热门标签