lit
story into a data structureParse defines a bunch of smaller parsers that in conjunction
can be used to parse an entire literate file in the form of Text
.
Parse does the bare minimum in assembling the data structure,
assembling a [Chunks]
. For example, narratives are packaged into
Prose
objects by line, instead of attempting to parse entire
narrative sections into one Prose
object. For more information on
the types refer to Types.hs.
Much of the later portion of Parse, the listing of parsers, does not lend well to a narrative. So the narrative will resort to the traditional role of comments in non-literate programs.
An overview of the file:
<< * >>=
<< define Parse module >>
<< import modules >>
<< lit file to data structure >>
<< many parsers >>
<< string to monadic text >>
<< testing functions >>
For the sake of testing, parse exports all of its definitions.
<< define Parse module >>=
{-# LANGUAGE OverloadedStrings #-}
module Parse where
<< import modules >>=
import Text.Parsec
import Text.Parsec.Text
import qualified Data.Text as T
import Types
encode
is the primary function of Parse. It interfaces between Process and the various pipelines which interpret the [Chunk]
into various outputs.
<< lit file to data structure >>=
encode :: T.Text -> [Chunk]
encode txt =
case (parse entire "" txt) of
Left err -> []
Right result -> result
These are the many parsers from generic to most specific, which collectively parse the entire lit
file.
<< many parsers >>=
<< parse all the chunks >>
<< parse a chunk >>
<< parse a chunk of prose >>
<< parse a chunk definition >>
<< try parsing the end of a definition >>
<< parse the title of a definiton >>
<< parse inside of title >>
<< parse a subpart of the definition >>
<< parse a reference subpart >>
<< parse a code subpart >>
<< parse any line >>
<< parse space or tab >>
Parse all the chunks until the end of the file
<< parse all the chunks >>=
entire :: Parser Program
entire = manyTill chunk eof
Parse a chunk which is either of type Def
or type Prose
<< parse a chunk >>=
chunk :: Parser Chunk
chunk = (try def) <|> prose
Parse a line of prose, return Prose
container of the line.
<< parse a chunk of prose >>=
prose :: Parser Chunk
prose = grabLine >>= (\line -> return $ Prose line)
Parse a definiton of type Def
by parsing the title
and parsing the subparts (lines of code or references to chunks)
<< parse a chunk definition >>=
def :: Parser Chunk
def = do
(indent, header, lineNum) <- title
parts <- manyTill (part indent) $ endDef indent
return $ Def lineNum header parts
A parser which tries does not consume input if it fails. endDef
succeeds
when the indentation does not match the proper indentation, or when there is
a chunk title. Note: if it succeeds it will consume the newlines
without the indent
<< try parsing the end of a definition >>=
endDef :: String -> Parser ()
endDef indent = try $ do { skipMany newline; notFollowedBy (string indent) <|> (lookAhead title >> parserReturn ()) }
Parse the line which gives a title to a code chunk. title
returns unique output
to be used later. Ex. it records the indent, to match agains the body it contains.
<< parse the title of a definiton >>=
-- Returns (indent, macro-name, line-no)
title :: Parser (String, T.Text, Int)
title = do
pos <- getPosition
indent <- many ws
name <- packM =<< between (string "<<") (string ">>=") (many notDelim)
newline
return $ (indent, T.strip name, sourceLine pos)
Parse a character which could lie within a title.
<< parse inside of title >>=
notDelim = noneOf ">="
Parse a line after a title, which could either be a line of code or a reference to a code chunk
<< parse a subpart of the definition >>=
part :: String -> Parser Part
part indent =
try (string indent >> varLine) <|>
try (string indent >> defLine) <|>
(grabLine >>= \extra -> return $ Code extra)
Parse a reference to a code chunk in the form << ... >>
<< parse a reference subpart >>=
varLine :: Parser Part
varLine = do
indent <- packM =<< many ws
name <- packM =<< between (string "<<") (string ">>") (many notDelim)
newline
return $ Ref name indent
Parse a line of code, an alias for grabLine
<< parse a code subpart >>=
defLine :: Parser Part
defLine = do
line <- grabLine
return $ Code line
Parse a line, returning the line
<< parse any line >>=
grabLine :: Parser T.Text
grabLine = do
line <- many (noneOf "\n\r")
last <- newline
return $ T.pack $ line ++ [last]
Parse a character which is either a space or a \t
<< parse space or tab >>=
ws :: Parser Char
ws = char ' ' <|> char '\t'
A monadic equivalent of pack
<< string to monadic text >>=
packM str = return $ T.pack str
The following functions allow for testing Parser Text
and Parser Chunk
<< testing functions >>=
textP :: Parsec T.Text () T.Text -> T.Text -> T.Text
textP p txt =
case (parse p "" txt) of
Left err -> T.empty
Right result -> result
chunkP :: Parsec T.Text () Chunk -> T.Text -> Maybe Chunk
chunkP p txt =
case (parse p "" txt) of
Left err -> Nothing
Right result -> Just result