What Does This Do?
This simple program converts poetry written as plain text into LaTeX code compatible with the poemscol package. It's important to note that it only converts the poetry sections, so you'll still need to write other LaTeX commands manually.
- wrap the poem with
\begin{poem}
…\end{poem}
- add the title LaTeX command if an argument is given
- read the poem text from STDIN
- split the text into stanzas and wrap each one with
\begin{stanza}
…\end{stanza}
- add
\verseline
at the end of each line, except the last line of a stanza
Why Did I Make This?
Manually converting text to LaTeX is tedious, so I was looking for a converter to streamline the process. Interestingly, I tried adding a command within ChatGPT4 to convert my text into poemscol syntax, but it was too slow and became slower over time. Recognizing that the task wasn't complex, I decided to tackle it myself using my favorite language (one I'm also eager to learn more about).
- summary: I needed a text-to-LaTeX converter, but ChatGPT4 was too slow, so I wrote one myself.
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where
import System.Environment as ENV
This program may not be large, but using Data.Text is a good idea for handling text. I believe this answer provides a good explanation for this approach, and here is a summary from ChatGPT v4:
Data.Text is more space-efficient than Haskell's native String, which is a linked list of Chars with high space overhead. It is also more performant, providing better memory locality and interfacing more efficiently with native system libraries, especially for IO-heavy programs. Additionally, Data.Text offers text-specific functions not available with native Strings.
So Data.Text and Data.Text.IO are imported.
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Data.Text (Text)
Additionally, there are a few more modules for syntactic convenience. I prefer (>>>) over (.) for longer statements, especially when the code consists of more than four steps, because it's easier to follow than the right-to-left composition of (.). I use (<&>) for similar reasons: it is fmap with its arguments flipped, so the data comes first and the code reads left to right.
import Data.Either (isRight)
import Control.Arrow ((>>>))
import Data.Functor ((<&>))
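To make the reading-order argument concrete, here is a small sketch that is not part of the program itself. The names shoutBackward, shoutForward, and lengths are made up for illustration; only (.), (>>>), and (<&>) come from the text above.

```haskell
import Control.Arrow ((>>>))
import Data.Char (toUpper)
import Data.Functor ((<&>))

-- Composition with (.): the last function listed runs first,
-- so the pipeline has to be read from right to left.
shoutBackward :: String -> String
shoutBackward = unwords . map (map toUpper) . words

-- The same pipeline with (>>>): the steps appear in the order they run.
shoutForward :: String -> String
shoutForward = words >>> map (map toUpper) >>> unwords

-- (<&>) is fmap with its arguments flipped: the data comes first.
lengths :: [String] -> [Int]
lengths xs = xs <&> length

main :: IO ()
main = do
  putStrLn (shoutBackward "hello poem")
  putStrLn (shoutForward "hello poem")
  print (lengths ["ab", "cde"])
```

Both shout functions compute the same result; the only difference is the direction in which the steps are read.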
beginTitle :: Text
beginTitle = "\\sequencetitle{"

endTitle :: Text
endTitle = "}\n\\poemtitle{}"

beginStanza :: Text
beginStanza = "\\begin{stanza}"

endStanza :: Text
endStanza = "\\end{stanza}"

beginPoem :: Text
beginPoem = "\\begin{poem}"

endPoem :: Text
endPoem = "\\end{poem}"

middleEOLSurfix :: Text
middleEOLSurfix = "\\verseline"

aIndent :: Text
aIndent = " "

-- ind: indent
ind :: Int -> Text
ind = (flip T.replicate) aIndent
The title will be retrieved from the command line, while the rest of the poem will be provided through STDIN. So first, I made a Data.Text version of getArgs.
getArgsText :: IO [Text]
getArgsText = map T.pack <$> ENV.getArgs
It is always a good idea for a program to have a help message. Even if you are the person who created the program, you could forget how to use it and have to open the code again, LOL.
parseOpts :: [Text] -> IO (Either Text Text)
parseOpts ["-h"] = do
  progName <- T.pack <$> ENV.getProgName
  return $ Left $ "Usage: " <> progName <> " [OPTION] [A Poem Title]" <> "\n"
    <> "Return a poem structured for a LaTeX package, `poemscol'" <> "\n"
    <> "Read text data from STDIN." <> "\n\n"
    <> "-h show this message." <> "\n\n"
    <> ":^]\n"
While it may not be the best design, the title is built from all of the command line arguments joined together. The advantage is that you don't need to use quotes around the title.
parseOpts ts =
  return $ case (mkt ts) of
    "" -> Right ""
    tt -> Right $ (addtex tt) <> "\n\n"
  where
    mkt = T.unwords
    addtex = (beginTitle <>) . (flip (<>)) endTitle
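As a rough illustration of why quotes are unnecessary, the title logic can be tried as a pure function. The helper name titleTex is hypothetical and the LaTeX strings are inlined from the constants defined earlier; treat this as a sketch rather than the program's actual code.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Data.Text (Text)

-- Hypothetical pure helper: join every argument into one title
-- and wrap it in the poemscol title commands.
-- An empty argument list yields an empty result, as in parseOpts.
titleTex :: [Text] -> Text
titleTex args = case T.unwords args of
  "" -> ""
  t  -> "\\sequencetitle{" <> t <> "}\n\\poemtitle{}" <> "\n\n"

main :: IO ()
main = do
  -- Three separate shell words become a single title.
  T.putStr (titleTex ["Ode", "to", "Spring"])
  print (T.null (titleTex []))
```

Because the shell has already split the words, T.unwords simply joins them again with single spaces, so runs of spaces inside an unquoted title are not preserved.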
The poem text is read from STDIN. parseContents takes the previously parsed data, which is the poem title, and combines it with the parsed poem text.
parseContents :: (Either Text Text) -> IO (Either Text Text)
parseContents ei =
  if isRight ei then
    do
      pb <- parseBody
      return . ((<> pb) <$>) $ ei
  else
    return ei
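The expression ((<> pb) <$>) relies on the Functor instance of Either, which only touches Right values. A minimal sketch of that behaviour, using a hypothetical helper appendBody and placeholder strings:

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Text (Text)

-- Hypothetical helper: append the body to a successfully parsed title.
-- fmap on Either leaves a Left (here, the help message) untouched.
appendBody :: Text -> Either Text Text -> Either Text Text
appendBody body = fmap (<> body)

main :: IO ()
main = do
  print (appendBody "BODY" (Right "TITLE\n"))
  print (appendBody "BODY" (Left "Usage: ..."))
```

Since fmap already skips Left, the isRight check in parseContents appears to serve mainly to avoid reading STDIN when only the help message is going to be printed.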
parseBody is the main part of the program, which:
- Groups the text by stanza.
- Adds a special command (\verseline) to each line except the last line of a stanza.
- Adds the stanza syntax.
- Adds indentation to each line for better readability.
  where
    parseBody = T.getContents <&>
      ( T.splitOn "\n\n"   -- Group by stanza
        >>> map T.lines    -- and then divide into lines
                           -- within each group, NB: *map*
        >>> map foldLinesWithTex
        >>> foldStanzasWithTex
      )
    sil = 1       -- stanza indent level
    lil = sil + 1 -- line indent level
T.intercalate works very similarly to the general join function in most programming languages. This perfectly fits my need to ensure that the last line doesn't get an extra \verseline suffix.
    -- Add "\verseline" to the end of each line except the last line of a stanza.
    foldLinesWithTex = T.intercalate (middleEOLSurfix <> "\n" <> (ind lil))
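A small sketch of the join behaviour, leaving out the indentation for brevity. The function name joinVerse is made up for this example; T.intercalate is the real Data.Text function.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Data.Text (Text)

-- The separator goes only *between* elements,
-- so the final line never receives a trailing \verseline.
joinVerse :: [Text] -> Text
joinVerse = T.intercalate "\\verseline\n"

main :: IO ()
main = T.putStrLn (joinVerse ["first line", "second line", "last line"])
```

A single-line stanza passes through unchanged, and an empty list gives an empty Text, which is what lets the stanza fold skip empty groups later.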
Unsurprisingly, foldr is used for folding. ☺️
    -- Wrap each stanza in a stanza structure with indentation.
    foldStanzasWithTex =
      foldr (\n acc ->
        if T.null n then
          acc
        else
          (ind sil) <> beginStanza <> "\n"
          <> (ind lil) <> n <> "\n"
          <> (ind sil) <> endStanza <> "\n"
          <> acc
        ) ""
In the main function, the steps are chained with the (>>=) operator. The last block prints the result using T.putStr (Data.Text.IO.putStr). There are only two cases to handle for the Either value: print the help message for Left, or print the result for Right.
main :: IO ()
main = getArgsText >>= parseOpts >>= parseContents >>=
  (\x -> case x of
    Left l -> T.putStr l -- this will be help message
    Right r -> T.putStr (beginPoem <> "\n" <> r <> endPoem <> "\n") )
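Because parseBody reads STDIN directly, the stanza logic is awkward to try in isolation. As a sketch, the same steps (split on blank lines, join lines with \verseline, wrap each stanza with foldr) can be written as a pure function. The names indent, wrapStanzas, and renderBody are hypothetical, and the two-space indent unit is an assumption made for this example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Control.Arrow ((>>>))
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Data.Text (Text)

-- Assumption: one indent level is two spaces.
indent :: Int -> Text
indent n = T.replicate n "  "

-- Wrap every non-empty stanza body; empty groups are skipped.
wrapStanzas :: [Text] -> Text
wrapStanzas = foldr step ""
  where
    step body acc
      | T.null body = acc
      | otherwise =
          indent 1 <> "\\begin{stanza}\n"
            <> indent 2 <> body <> "\n"
            <> indent 1 <> "\\end{stanza}\n"
            <> acc

-- Pure counterpart of the body pipeline: no STDIN involved.
renderBody :: Text -> Text
renderBody =
  T.splitOn "\n\n"
    >>> map (T.lines >>> T.intercalate ("\\verseline\n" <> indent 2))
    >>> wrapStanzas

main :: IO ()
main = T.putStr (renderBody "a\nb\n\nc\n")
```

Keeping the pipeline pure like this would also make it easy to check in GHCi or in a test, with T.getContents applied only at the edge of the program.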
Any Possible Improvements?
I could handle an "empty title" as a warning, but I don't feel it's necessary here. If it were a warning instead of Right "", I would need to handle the previous result differently in parseContents to check for any fixable errors that come in. Additionally, if I needed to make that change, the Either Text Text data type would not be sufficient to handle it correctly. Perhaps Either SomeErrorHandlingDataType Text or Either Error WarningAndParsed would be more suitable. I lean towards the second option because I'd like to parse the body even if there is no title.
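One possible shape for the second option is sketched below. Only the type names Error and WarningAndParsed come from the text above; the constructors ShowHelp and EmptyTitle, the record fields, and the functions parseTitle and addBody are all hypothetical.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import qualified Data.Text as T
import Data.Text (Text)

-- Hypothetical: a fatal outcome that stops processing.
data Error = ShowHelp Text
  deriving (Eq, Show)

-- Hypothetical: a recoverable problem that still allows a result.
data Warning = EmptyTitle
  deriving (Eq, Show)

data WarningAndParsed = WarningAndParsed
  { warnings :: [Warning]
  , parsed   :: Text
  } deriving (Eq, Show)

type Result = Either Error WarningAndParsed

-- An empty title becomes a warning instead of a silent Right "".
parseTitle :: [Text] -> Result
parseTitle ["-h"] = Left (ShowHelp "usage text goes here")
parseTitle args = Right $ case T.unwords args of
  "" -> WarningAndParsed [EmptyTitle] ""
  t  -> WarningAndParsed [] t

-- The body is still appended when there is only a warning.
addBody :: Text -> Result -> Result
addBody body = fmap (\wp -> wp { parsed = parsed wp <> body })

main :: IO ()
main = do
  print (addBody "BODY" (parseTitle []))
  print (addBody "BODY" (parseTitle ["-h"]))
```

With this layout the body is parsed even without a title, and the caller can decide whether to report the collected warnings.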
Another Advantage
I can integrate this program with org-mode in Emacs, allowing me to write down the text and generate the formatted poem in the same place. I'll post about this sooner or later, but before I get lazy, here's a snippet:
**** poem
#+name: poem8
#+begin_verse
๋๋ฌผ๋ฐฉ์ธ ๊ฐ์๋ค.
ํผ์๋ด๊ธฐ๋ณด๋ค
ํฐ์ ธ๋์จ ๋ฏํ..
๋ชฉ๋ จ ๊ฝ๋ด์ค๋ฆฌ์
์ฐ์ํ ๊ธฐ๋ค๋ฆผ.
์์ํ ์ง์ฌ์ด ํผ์ด๋๋ค.
๋ง์์ ์ฌ๋ฐฑ๊ณผ ๊ฐ์ ํ์..
"์ง์ฌ์๋ ๋๋ฎ์ด๊ฐ ์๋๊ฑธ๊น."
๊ทธ ๋์ด๋ฅผ ๋ง์ถ์ด์ผ
๋ ์์ ๋นจ๋ ค๋ค์ด์
๋ง์์๊น์ง ๋ฐํ๋ ๊ฒ์ด์๋ค.
#+end_verse
#+begin_src sh :stdin poem8 :results output :var title="Magnolia"
poemscol-portion-exe $title
#+end_src
And if we execute the code above, we can get a result like the following:
#+RESULTS:
#+begin_example
\begin{poem}
\sequencetitle{Magnolia}
\poemtitle{}
\begin{stanza}
๋๋ฌผ๋ฐฉ์ธ ๊ฐ์๋ค.\verseline
ํผ์๋ด๊ธฐ๋ณด๋ค\verseline
ํฐ์ ธ๋์จ ๋ฏํ..
\end{stanza}
\begin{stanza}
๋ชฉ๋ จ ๊ฝ๋ด์ค๋ฆฌ์\verseline
์ฐ์ํ ๊ธฐ๋ค๋ฆผ.
\end{stanza}
\begin{stanza}
์์ํ ์ง์ฌ์ด ํผ์ด๋๋ค.\verseline
๋ง์์ ์ฌ๋ฐฑ๊ณผ ๊ฐ์ ํ์..
\end{stanza}
\begin{stanza}
"์ง์ฌ์๋ ๋๋ฎ์ด๊ฐ ์๋๊ฑธ๊น."
\end{stanza}
\begin{stanza}
๊ทธ ๋์ด๋ฅผ ๋ง์ถ์ด์ผ\verseline
๋ ์์ ๋นจ๋ ค๋ค์ด์\verseline
๋ง์์๊น์ง ๋ฐํ๋ ๊ฒ์ด์๋ค.
\end{stanza}
\end{poem}
#+end_example
Thank you for reading ^^;