August 1, 2020
After several steps, we now have the tools to replace
every occurrence of \ref and \pageref in the comments of a Java
source file with the appropriate reference. All that's needed is to
read in the list of possible references, examine the
parsed-out comments, and make the replacements.
The entire program is discussed below, with a few remarks about
each piece and how it relates to earlier iterations.
The first part of this program is identical to the first part in the previous step. It loads an existing list of references.
data LabelEntry = LabelEntry {
    fileName :: String,
    labelType :: String,
    latexLabel :: String,
    section :: String,
    page :: String
  } deriving (Eq)

instance Show LabelEntry where
  show (LabelEntry f1 f2 f3 f4 f5) = f1 ++ " " ++ f2 ++ " " ++ f3 ++ " " ++
    f4 ++ " " ++ f5

-- Assumes the line has at least five whitespace-separated fields;
-- (!!) fails at run time on a shorter line.
readLabelEntry :: String -> LabelEntry
readLabelEntry s = do
  let xs = words s
  LabelEntry {fileName = xs !! 0,
              labelType = xs !! 1,
              latexLabel = xs !! 2,
              section = xs !! 3,
              page = xs !! 4
             }

-- Given the name for a database file, read it in.
readDB :: String -> IO [LabelEntry]
readDB fname = do
  contents <- readFile fname
  return $ map readLabelEntry $ lines contents
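Since readLabelEntry indexes into the word list, a short or blank line in the labels file makes it fail at run time. Here's a minimal sketch of a more forgiving variant, using only base. The names parseLabelLine and parseLabels are hypothetical and not part of the program above, and the record derives Show only to keep the sketch short.

```haskell
import Data.Maybe (mapMaybe)

-- Same five fields as the LabelEntry record above.
data LabelEntry = LabelEntry
  { fileName   :: String
  , labelType  :: String
  , latexLabel :: String
  , section    :: String
  , page       :: String
  } deriving (Eq, Show)

-- Hypothetical helper: succeeds only when the line has at least five
-- whitespace-separated fields. Extra fields are ignored, as in readLabelEntry.
parseLabelLine :: String -> Maybe LabelEntry
parseLabelLine s = case words s of
  (f : t : l : sec : p : _) -> Just (LabelEntry f t l sec p)
  _                         -> Nothing

-- Hypothetical helper: keep the well-formed lines and drop the rest.
parseLabels :: String -> [LabelEntry]
parseLabels = mapMaybe parseLabelLine . lines
```

With this variant, readDB could return parseLabels contents instead of mapping readLabelEntry over the lines. Whether dropping bad lines silently is the right choice depends on how much you trust the labels file.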
A couple more trivial functions are needed to work with the list of labels:
labelMatches :: LabelEntry -> String -> Bool
labelMatches (LabelEntry _ _ comp _ _ ) s = comp == s

labelSection :: LabelEntry -> String
labelSection (LabelEntry _ _ _ s _) = s

labelPage :: LabelEntry -> String
labelPage (LabelEntry _ _ _ _ s) = s
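To see how helpers like these turn into a lookup, here's a small base-only sketch. resolveWith and sampleEntries are hypothetical names, the sample values are made up, and the "UNDEFINED" fallback mirrors what the parser later in this post returns for an unknown label.

```haskell
import Data.List (find)

-- Same five fields as the LabelEntry record above.
data LabelEntry = LabelEntry
  { fileName   :: String
  , labelType  :: String
  , latexLabel :: String
  , section    :: String
  , page       :: String
  }

-- Hypothetical helper: find the first entry with the given label and
-- project one field out of it, or fall back to "UNDEFINED".
resolveWith :: (LabelEntry -> String) -> [LabelEntry] -> String -> String
resolveWith field entries label =
  maybe "UNDEFINED" field (find ((== label) . latexLabel) entries)

-- Made-up sample data, for illustration only.
sampleEntries :: [LabelEntry]
sampleEntries =
  [ LabelEntry "paper.tex" "section"  "sec:parser" "4.3.2" "17"
  , LabelEntry "paper.tex" "equation" "eq:energy"  "2.1"   "9"
  ]
```

Here resolveWith section sampleEntries "sec:parser" evaluates to "4.3.2", and resolveWith page sampleEntries "sec:parser" evaluates to "17". Passing the field accessor as an argument is just one way to share the lookup between the two kinds of reference.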
The Java source code parser is identical to
what appeared earlier, so there's no reason to present all of it here again. The important
point is that parseJava converts Java source code to a list of
ParsedJava values:
data ParsedJava = SLComment String
                | MLComment String
                | JavaCode String
                | WhiteSpace String

parseJava :: String -> Either ParseError [ParsedJava]
parseJava input = parse parseJavaInput "" input
The earlier parser can be used as-is, but that parser doesn't parse anything inside the comments it identifies. The comments need some further parsing to do a find-and-replace. There are three commands to look for, all of which work just as they do in LaTeX.
If \ref{label} occurs somewhere in a Java comment, then the program looks up
label in the list of LabelEntry values, and replaces
\ref{label} with the corresponding section value from the
matching LabelEntry. This field is called section because
that's usually what it is – a section number, like "4.3.2" – although it may be
an equation number, figure number, etc.
\pageref works the same way as \ref, except that
\pageref{label} is replaced by the page value from the
corresponding LabelEntry.
Use the \verb (short for "verbatim") command to turn off find-and-replace.
The first character after \verb is used to terminate the run of verbatim input.
So
\verb|Don't parse \ref{something-or-other} please|
will pass through the parser and come out the other end as
Don't parse \ref{something-or-other} please
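The delimiter rule for \verb can be tried out on its own, without Parsec. This is only a base-only sketch of that rule; splitVerb is a hypothetical name and is not used by the program.

```haskell
import Data.List (stripPrefix)

-- Hypothetical helper: given text that starts with \verb, return the
-- verbatim run and whatever follows the closing delimiter. The result is
-- Nothing when the input does not start with \verb, when nothing follows
-- \verb, or when the delimiter never appears a second time.
splitVerb :: String -> Maybe (String, String)
splitVerb input = do
  afterCmd <- stripPrefix "\\verb" input
  case afterCmd of
    []         -> Nothing
    (d : rest) -> case break (== d) rest of
      (body, _ : after) -> Just (body, after)
      _                 -> Nothing
```

Because the delimiter is whatever character follows \verb, a run that contains | can be wrapped in another character instead, for example \verb+a|b+.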
Here's a parser that finds and replaces the three possible commands:
-- Needs void from Control.Monad, plus GenParser, runParser, getState and
-- the combinators from Text.ParserCombinators.Parsec.
parseAndReplace :: String -> [LabelEntry] -> Either ParseError String
parseAndReplace input defs = runParser replaceComment defs "" input

replaceComment :: GenParser Char [LabelEntry] String
replaceComment = do
  x <- many1 (notMacro <|> isMacro <|> falseMacro)
  return $ concat x

-- Any run of text that contains no backslash.
notMacro :: GenParser Char [LabelEntry] String
notMacro = many1 $ noneOf "\\"

isMacro :: GenParser Char [LabelEntry] String
isMacro = verbMacro <|> refMacro <|> pagerefMacro

verbMacro :: GenParser Char [LabelEntry] String
verbMacro = do
  void $ try $ string "\\verb"
  x <- anyChar
  guts <- many $ noneOf [x]
  void $ char x
  return guts

refMacro :: GenParser Char [LabelEntry] String
refMacro = do
  void $ try $ string "\\ref{"
  guts <- many $ noneOf ['}']
  void $ char '}'
  defs <- getState
  let xs = filter (\t -> labelMatches t guts) defs
  if (null xs)
    then return "UNDEFINED"
    else return $ labelSection $ head xs

pagerefMacro :: GenParser Char [LabelEntry] String
pagerefMacro = do
  void $ try $ string "\\pageref{"
  guts <- many $ noneOf ['}']
  void $ char '}'
  defs <- getState
  let xs = filter (\t -> labelMatches t guts) defs
  if (null xs)
    then return "UNDEFINED"
    else return $ labelPage $ head xs

-- A backslash that does not start one of the three commands passes through.
falseMacro :: GenParser Char [LabelEntry] String
falseMacro = do
  void $ char '\\'
  return ['\\']
Use parseAndReplace on the contents of each comment. The function returns
the same comment, after making any replacements.
Making the parser stateful is relatively easy. Instead of calling parse to
invoke Parsec, use runParser, and provide the state. In this case, the
state is fixed, and it's just a [LabelEntry] value. The functions
refMacro and pagerefMacro use Parsec's getState
function to access this state data.
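If the runParser and getState pairing is new to you, here's a stripped-down sketch of the same idea on a toy problem: expanding $name placeholders from a fixed lookup table carried as Parsec user state. Everything in it (Env, variable, plain, expand, and the $name syntax itself) is made up for illustration and is not part of the program.

```haskell
import Data.Maybe (fromMaybe)
import Text.Parsec
  ( ParseError, Parsec, char, getState, letter, many, many1, noneOf
  , runParser, (<|>) )

-- The user state: a fixed table of placeholder names and their values.
type Env = [(String, String)]

-- Parse $name and replace it with its value from the state. An unknown
-- name, or a lone $, is passed through unchanged.
variable :: Parsec String Env String
variable = do
  _    <- char '$'
  name <- many letter
  env  <- getState
  return $ case name of
    [] -> "$"
    _  -> fromMaybe ('$' : name) (lookup name env)

-- Any run of text that contains no $.
plain :: Parsec String Env String
plain = many1 (noneOf "$")

-- runParser takes the initial state as its second argument.
expand :: Env -> String -> Either ParseError String
expand env = runParser (concat <$> many (variable <|> plain)) env ""
```

For example, expand [("user", "ada")] "hi $user" gives Right "hi ada". The state is never modified here, so it behaves like a read-only environment, which is also how the label list is used above.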
Finally, the top level, with main:
-- Needs the LambdaCase extension for \case, getArgs from System.Environment
-- and doesFileExist from System.Directory. allM is not in base; the extra
-- package's Control.Monad.Extra is one place to get it.
output :: [LabelEntry] -> ParsedJava -> String
output defs (JavaCode s) = s
output defs (WhiteSpace s) = s
output defs (SLComment s) = replaceMacro defs s
output defs (MLComment s) = replaceMacro defs s

-- On a parse failure, the error text replaces the comment.
replaceMacro :: [LabelEntry] -> String -> String
replaceMacro defs s =
  case parseAndReplace s defs of
    Left err -> show err
    Right valid -> valid

argsValid :: [String] -> IO Bool
argsValid names = do
  if length names /= 2
    then return False
    else doFilesExist names

doFilesExist :: [String] -> IO Bool
doFilesExist names = allM doesFileExist names

main :: IO ()
main = do
  args <- getArgs
  argsValid args >>= \case
    False -> putStrLn "Give me the labels file and the Java file."
    True -> do
      macroDefs <- readDB (args !! 0)
      contents <- readFile (args !! 1)
      let rawParse = parseJava contents
      case rawParse of
        Left err -> print err
        Right valid -> mapM_ putStr $ map (output macroDefs) valid