lp - Literate Programming*
==========================
 
Chr. Clemens Lahme (clemens.lahme@techinvest.li)
 
2023-12-20
 
(* because who really needs an effing line printer.)
 
Literate programming in 2023, Not Invented Here Syndrome to its fullest.
 
So we (which is just me and myself) want to start from scratch. Why? Because why
not? Why follow someone else's ideas and use cases?
 
It shouldn't be so hard either, with today's modern languages and the basic idea
(embed code in text and then just write a book or paper) floating around since
the 80s. So what are the main ideas and tools to use here?
 
Scripting languages that easily transform text files. So regular expressions and
in particular Ruby - skipping the whole flex lex bison grammar complexity.
 
On the text front, take notice of md - markdown. Also formatting in HTML _and_
PDF, with TeX or asciidoc if they bring anything to the table, especially on
the font handling side, and leaving them out if they just pollute and
complicate the whole process, like the whole XML world with its unnecessary
garbage escape sequences.
 
The main point maybe is just: do the opposite of embedding everything inside XML
structure.
 
Instead, everything should be already readable just in its original format, and
then with tools the text can be just beautified as a bonus and on top of
it. After all, you don't need any markdown tool at all to read a markdown file!
 
Last but not least, after doing the first iteration, it becomes clear to me
that the Rails philosophy is also an inspiration. Everything, especially the
text formatting, should just be the way one would expect it, so that nothing
more is needed on top of it. That means headlines in the text are simply
formatted as headlines, without any reformatting in mind, and so far not a
single marker or escape code is needed either. Scripts have always used EOF or
EOT markers for file redirection, and these can be used here just as well and
appropriately. All the 'magic'
is then delegated to processing in separated scripts, but without any surprises
or actual real magic. At least, that's how it should appear and work in
practice.
 
Table of Contents
-----------------
  1. First Two Rules
  2. Down to the Metal
  3. First Example
  4. First Release
  5. Statistic
  6. Extract Named Files
  7. Evolving Scripts
  8. Testing
  9. Adding a Table of Contents
    1. Source Documentation
  10. Special Case Without index.txt
  11. CSS
  12. Inlining Images
  13. Installation
    1. Requirements
  14. Download
  15. Usage
  16. Add a TOC
  17. 'lp' Convenience Script
  18. Project Structure and Conventions
  19. Specification
    1. Overview
    2. General Principles
    3. Headings
    4. Main Source Code Block
    5. Source File Extraction (Here-Docs)
    6. Links
    7. Visual Formatting Notes
    8. Images
    9. Table of Contents (TOC)
    10. CSS Styling
  20. Version
  21. Help
  22. Release History
  23. Slides
  24. Issues
  25. Conclusion
  26. Glossary
    1. Literate Programming
    2. Markdown
    3. Here-Document
 
1. First Two Rules
------------------
 
So here is already the second rule. Use #! to start the program code, and
anything without whitespace at the beginning of a line and starting with
/^exit[\(\s]/ should indicate the end of program code. Everything before and
afterwards is just text!
 
Now back to the first rule. Text should not be polluted with escape or markup
language. Even less than markdown. So we do not have to invent our own markdown
language or specification, we just have to define a consistent and fixed rule
for everything that can be reformatted in another format.
 
So in this case the main headline is underlined by equal signs of the same
length as the headline itself. On the second level, headlines are just
underlined by the minus sign '-'. Should we allow sub chapters further down?
Maybe, but let's keep it open for now.
 
2. Down to the Metal
--------------------
 
So now down to the metal. How should our program look, how should
everything start? We name the program 'lp', for 'literate programming'. What is
the first feature we want to implement? Reformat everything in HTML? Reformat
just the text itself? Like reformat the lines, or even left right alignment?
 
Or just split out the program from the commentary?
 
So how would we use the program?
 
cd myproject
lp
 
Then what should happen? Should the program already be executed? No, that should
be happening with:
 
lp run
 
And just 'lp'? What should that do? Maybe we need a second project, to use this
project with? By now in 2026 we actually did and kept using it for more
projects further on.
 
Yes, maybe we could still leave the original code as it is. Like, you actually
can have both, pure text files, and also pure code files, plus a mix of
them. But then what is the purpose and the extra benefit of 'lp' itself? Maybe
just the help for me, to use a journal mode for myself, while programming?!
 
So then how to intermix commentary inside of existing files? Well, in Ruby you
just use =begin and =end inside of it, right?

Yes, and then later we extract the commentary to make it available in HTML, or
strip it out of the code and insert it into the PDF documentation.
 
So which fonts do we use? Monospace I guess. Why not. So we use TeX to create
PDF files?
 
3. First Example
----------------
 
So we started documenting another project and with more brainstorming, we have
now our first use case.
 
lp
 
This default command should just look in doc/ for the index.txt file, and create
a corresponding index.html file, with URLs also being converted.
 
#! /usr/bin/env ruby
 
def process_text_file( input_filename )
  if File.exist?( input_filename ) == false
    puts "ERROR: input file does not exist: #{input_filename}"
    exit 2
  end
  file = File.open( input_filename, "r" )
  content = file.read()
  file.close
 
  #
  # Extract named scripts (heredocs).
  #
  lines = content.split( /\n/ )
  script_filename = nil
  script = ""
  mode = nil
  f_or_t = nil
  lines.each do |line|
    if (f_or_t == nil) && (line =~ /<<\s*EO(F|T)\s*$/)
      mode = "w"
      if line =~ />>/
        mode = "a"
      end
      script_filename = line.sub( /^\s*cat\s+>>?\s+/, '' )
      script_filename.sub!( /\s.*$/, '' )
      script = ""
      f_or_t = line.sub( /^.*EO([FT])\s*$/, "\\1" )
    elsif f_or_t && script_filename && (line =~ /^\s*EO#{f_or_t}\s*$/)
      file = File.open( script_filename, mode )
      file.write( script )
      file.close
      system( "chmod 700 #{script_filename}" )
      puts script_filename
      f_or_t = nil
      script_filename = nil
      next
    elsif script_filename
      script << line
      script << "\n"
    end
  end
 
  #
  # Generate HTML output.
  #
  output =  ""
  output << "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">"
  output << "<html>\n"
  output << "<head>\n"
  output << "<title>lp: #{File.realdirpath( '.' ).sub( /^.*\//, '' )}</title>\n"
  output << "</head>\n"
  output << "<body>\n"
  output << "<tt>\n"
  content.gsub!( /&/m, "&amp;" )
  content.gsub!( /</m, "&lt;" )
  content.gsub!( />/m, "&gt;" )
  content.gsub!( /\n/m, "<br />\n" )
  content.gsub!( /<br \/>\n<br \/>\n/m, "<br />\n&nbsp;<br />\n" )
 
  # Handle links, but treat local files just relative to the doc directory if necessary,
  # without the preceding 'file://'.
  content.gsub!( /(^|\s)(file):\/\/([^\s<]+)(\s|<)/m, "\\1<a href=\"\\3\">\\2://\\3</a>\\4" )
  content.gsub!( /(^|\s)(https?):\/\/([^\s<]+)(\s|<)/m, "\\1<a href=\"\\2://\\3\">\\2://\\3</a>\\4" )
 
  # Repeated spaces have to be respected absolutely.
  content.gsub!( /  /m, '&nbsp;&nbsp;' )
 
  output << content
  output << "</tt>\n"
  output << "</body>\n"
  output << "</html>\n"
 
  output_filename = input_filename.sub( /\.txt$/, '.html' )
  file = File.open( output_filename, "w" )
  file.write( output )
  file.close
  system( "chmod 600 '#{output_filename}'" )
  puts output_filename
  # The TOC should be added only once!
  # At the moment the TOC is only added to the index.html file.
  file_index_html = File.open( output_filename, "r" )
  content_index_html = file_index_html.read()
  file_index_html.close
 
  #
  # Add Table of Contents (TOC).
  #
  if content_index_html !~ /^Table of Contents<br/m
    # We need to determine where the 'add_toc.rb' script is located.
    # Assuming it is in ../bin relative to this script.
    app_home = File.realpath( File.dirname( File.realpath( __FILE__ ) ) +
                              "/.." )
    toc_script_filename = "#{app_home}/bin/add_toc.rb"
    command = "#{toc_script_filename} '#{output_filename}'"
    puts command
    system command
  end
end
 
# help placeholder #
 
# version placeholder #
 
files = [ "./doc/index.txt" ]
if ARGV.length > 0
  files = ARGV
end
files.each do |input_filename|
  process_text_file( input_filename )
end
 
exit( 0 )
 
But this code inside this text document must also be extracted. So for this we
have some bootstrap code outside of this text document. Of course, we must later
also incorporate this separate program into this document:
 
./bin/bootstrap.rb
 
The text parser should recognize, that this line above has a corresponding local
file, which is a program. So it should include a link, instead of the whole
file, maybe. Or extract the main comment to describe this program.
 
4. First Release
----------------
 
The first release (commit) had the following statistic.
 
Lines of code:    36    28.57%
Lines of text:    90    71.43%
Total            126   100.00%
 
5. Statistic
------------
 
For the statistic we use the following code:
 
cat > ./bin/statistic.sh <<EOF
#! /bin/bash
 
TEXTLINES=$(cat doc/index.txt | grep -E -v '^\s*$' | wc -l)
CODELINES=$(cat bin/lp.rb bin/bootstrap.rb | grep -E -v '^\s*$' | grep -E -v '^#' | wc -l)
EOFLINES=0
CODELINES=$(expr $CODELINES '+' $EOFLINES)
TEXTLINES=$(expr $TEXTLINES '-' $CODELINES)
TOTALLINES=$(expr $TEXTLINES '+' $CODELINES)
CODEPERCENTAGE=$(ruby -e "puts ${CODELINES}.0 / ${TOTALLINES}.0 * 100.0")
TEXTPERCENTAGE=$(ruby -e "puts ${TEXTLINES}.0 / ${TOTALLINES}.0 * 100.0")
printf "Lines of code: %8d   %6.2f\n" ${CODELINES} ${CODEPERCENTAGE}
printf "Lines of text: %8d   %6.2f\n" ${TEXTLINES} ${TEXTPERCENTAGE}
printf "Total:         %8d   100.00\n" ${TOTALLINES}
EOF
 
6. Extract Named Files
----------------------
 
As with the statistic.sh script example, we now need to create named script or
program files, in addition to the default project script, which is named as a
Ruby file in the bin directory, with the same name as the project directory,
e.g. ./bin/myproject.rb.
 
OK, we added that to the main script above searching lines for EOT or EOF.
 
7. Evolving Scripts
-------------------
 
Now, in order to preserve the development of a program and make it easier to
understand and follow its logic, how can we in later steps adapt such scripts?
 
One idea is to add placeholders and later insert new code at these insertion
points.
Or we just add simple lines like 'source another_script.sh' into the code and
separate the logic that way, accompanied by another section of text.
For Ruby, use 'load "my/file.rb"' to add extra code later on, etc.
 
8. Testing
----------
 
First, is the resulting HTML code compliant? Use 'tidy' for that. And use
the 'check' rule in the Makefile. Yes, we didn't mention the Makefile yet, but
we use one to invoke 'bootstrap.rb' and create the 'lp.rb' file as well as the
HTML page 'index.html'.
 
So now we add the test code into the test script, which will be invoked from the
'check' rule in the Makefile (not included in this document).
 
cat > ./bin/test_lp.sh <<EOF
#! /bin/bash
# Do not edit this file directly, it is auto generated by lp.
 
set -e
 
grep -E "\\.<a " test/url/index.html && { echo "ERROR: wrong link found."; exit 1; }
 
type tidy || { echo "ERROR: tidy command is not available."; exit 2; }
echo 'tidy ./doc/index.html'
tidy ./doc/index.html 2>&1 | grep "No warnings or errors were found."
 
echo "SUCCESS: $0 - $?."
EOF
 
9. Adding a Table of Contents
-----------------------------
 
Before the first sub headline, and only for HTML, we want to automatically add
a table of contents.
 
For this we read the text file, and parse for sub headers. These are defined as:
 
1. An empty line.
2. A line of text.
3. A line of '-' signs of the same length as the line above.
4. A final empty line.
 
We number the chapters consecutively. We work directly on the HTML output, as
the text file will not be changed anyway.

For the anchors of these headings, we can just number them consecutively too.
 
cat > ./bin/add_toc.rb <<EOT
#! /usr/bin/env ruby
# Do not edit this file, it gets automatically generated by lp.
 
filename = ARGV[ 0 ]
if !filename
  filename = "./doc/index.html"
end
if File.exist?( filename )
  file = File.open( filename, "r" )
  content = file.read()
  file.close
else
  printf "INFO: file '#{filename}' does not exist, which is fine, so no TOC"
  printf " needs to be added to any file.\n      Exiting with status: 0\n"
  exit 0
end
 
lines = content.split( /\n/ )
status = nil
headline = nil
# Contains arrays of the form: [ headline, index, nil || true ]
toc = []
subsections = []
indexes = []
lines.each_with_index do |line, index|
  if status == nil
    if line =~ /^&nbsp;<br \/>$/
      status = "empty line"
      #puts "#{index}: #{status}"
      next
    end
  elsif status == "empty line"
    if line =~ /^&nbsp;<br \/>$/
      next
    elsif line =~ /^\s*-+\s*<br \/>$/
      status = nil
      next
    else
      headline = line.sub( /^\s+/, '' )
      headline.sub!( /\s*<br \/>$/, '' )
      #puts "#{index}: #{headline}"
      status = headline.length
      next
    end
  elsif status == "dashed line"
    if line =~ /^&nbsp;<br \/>$/
      # We should compare the length of the dashed line. But hey, we call it
      # Bingo anyway.
 
      # Any collected subsections must belong to the previous section.
      # So we add them and clear them.
      if toc.length > 0
        toc[ -1 ] << subsections
      end
      subsections = []
 
      toc << [ headline, index ]
      status = "empty line"
      #puts "Bingo: #{toc.inspect}"
      next
    end
  elsif status == "subsection line"
    if line =~ /^&nbsp;<br \/>$/
      # We should compare the length of the dashed line. But hey, we call it
      # Bingo anyway.
      subsections << [ headline, index ]
      status = "empty line"
      #puts "Bingo: #{toc.inspect}"
      next
    end
  else
    # Potentially the line before could be a headline.
    if line =~ /^\s*-+\s*<br \/>$/
      status = "dashed line"
      #puts status
      next
    elsif line =~ /^\s*'+\s*<br \/>$/
      status = "subsection line"
      #puts status
      next
    elsif line =~ /^&nbsp;<br \/>$/
      status = "empty line"
      #puts "#{index}: #{status}"
      next
    else
      status = nil
      next
    end
  end
end
# Again, any subsections left, they need to get attached to the previous
# section.
if toc.length > 0
  toc[ -1 ] << subsections
end
subsections = []
#puts toc.inspect
 
if toc.length > 0
  toc_content = ""
  toc_content << "&nbsp;<br />\n"
  toc_content << "Table of Contents<br />\n"
  toc_content << "-----------------<br />\n"
  # Seems the toc is not in monospace font on purpose?
  # Yes. Let's choose monospace instead for congruency.
  # Add back in <tt> flags and the beginning and end to undo this.
  toc_content << "</tt><ol>\n"
  toc.each_with_index do |row, index|
    toc_content << "<li><a href=\"#headline#{index + 1}\">#{row[ 0 ]}</a>"
    if row[ 2 ].length > 0
      toc_content << "\n<ol>\n"
      row[ 2 ].each_with_index do |subsection, index_sub|
        toc_content << "<li><a href=\"#headline#{index + 1}_#{index_sub + 1}\">#{subsection[ 0 ]}</a></li>"
      end
      toc_content << "</ol>\n"
    end
    toc_content << "</li>\n"
  end
  toc_content << "</ol><tt>"
 
  # Insert ToC.
  output = ""
  headline_index = 0
  subsection_index = 0
  lines.each_with_index do |line, index|
    if index == toc[ 0 ][ 1 ] - 3
      output << toc_content
    end
    if (headline_index - 1 < toc.length)                            &&
       (subsection_index < (toc[ headline_index - 1 ][ 2 ]).length) &&
       (index == toc[ headline_index - 1 ][ 2 ][ subsection_index ][ 1 ] - 2)
       #puts headline_index
       #puts subsection_index
       #exit 3
      output << "<a name=\"headline#{headline_index}_#{subsection_index + 1}\" />"
      output << "#{headline_index}.#{subsection_index + 1} "
    end
    if (headline_index - 1 < toc.length)                            &&
       (subsection_index < (toc[ headline_index - 1 ][ 2 ]).length) &&
       (index == toc[ headline_index - 1 ][ 2 ][ subsection_index ][ 1 ] - 1)
      output << "'" * ((headline_index).to_s.length + 2 +
                       (subsection_index + 1).to_s.length)
      subsection_index += 1
    end
    if (headline_index < toc.length)             &&
       (index == toc[ headline_index ][ 1 ] - 2)
      output << "<a name=\"headline#{headline_index + 1}\" />#{headline_index + 1}. "
    end
    if (headline_index < toc.length)             &&
       (index == toc[ headline_index ][ 1 ] - 1)
      output << '-' * ((headline_index + 1).to_s.length + 2)
      headline_index += 1
      subsection_index = 0
    end
    output << line
    output << "\n"
  end
  file = File.open( filename, "w" )
  file.write( output )
  file.close
  puts filename
end
EOT
 
Now we retroactively also change the previously created 'lp.rb' project script
to invoke this TOC adding script each time a new HTML file gets generated with
exactly that HTML file name as a parameter.
 
9.1 Source Documentation
''''''''''''''''''''''''
 
The script performs its task in a series of distinct steps:
 
1. Parsing for Headings: The script iterates through the HTML content, line
   by line, to identify Level 2 headings (sections) and Level 3 headings
   (subsections). The identification logic relies on the specific HTML output
   format generated by 'lp' for its headings.
   A heading is recognized by a specific pattern:
   - A preceding empty line (represented as '&nbsp;<br />').
   - A line of text (the heading itself).
   - A following line consisting of minus signs ('-') for a section, or of
     single quotes ("'") for a subsection.
 
2. Building the TOC List: As it finds headings, the script stores the
   heading text and its associated line number in an array called 'toc'. This
   array serves as the raw data for the Table of Contents.
 
3. Generating the TOC HTML: If any headings were found, the script builds
   an HTML fragment for the TOC. This fragment consists of:
   - A header ("Table of Contents").
   - An ordered list ('<ol>') where each list item ('<li>') contains
     an anchor link ('<a href="#headlineX">...') pointing to a specific
     section. The 'X' is a number corresponding to the heading's order in the
     document.
 
4. Modifying the HTML Document: The script then rebuilds the entire HTML
   document by iterating through the original lines again. During this
   reconstruction:
   - It inserts the generated TOC HTML fragment just before the first
     identified heading.
   - It adds anchor tags ('<a name="headlineX" />') directly before each
     heading to serve as the target for the TOC links. It also prepends a
     numbered prefix (e.g., "1. ", "2. ") to the heading text.
 
5. Writing the Output: Finally, the modified HTML content (now including
   the TOC and numbered headings) overwrites the original file. A confirmation
   message is printed to the console.
 
10. Special Case Without index.txt
----------------------------------
 
So moving on to use lp with projects that already have a default doc/index.html
file, I don't want to move that around to make space for a new index.txt file.
So in the text file there should simply be no #! .. exit section and then the
index.html version does not need to be (re)created at all.
 
Does it work like this already?
 
No. This does not work, because
 
a) the existing index.html gets overwritten, and
 
b) there is a complaint if doc/index.txt is missing.

How about if doc/index.html exists and index.txt is missing, then we skip the
creation process?
 
Ah, the solution exists already, lp doc/example.txt creates doc/example.html and
now we are fine.
 
11. CSS
-------
 
So one of our 'lp' principles is that the original text format can be used
unaltered without any escape codes, tags, or macro codes.

But how can we alter the formatting? With CSS, of course. Whether we like that
standard or not, why not use it, as we can "configure" our text output that
way with a separate standard.

So here is what we will do: if a 'styles.css' file exists in the same location
as the processed text file, we will add a link to the styles.css file inside
the produced HTML file.
 
cat > ./lib/css.rb <<EOT
# Do not edit, file is generated.
 
$: << File.dirname( __FILE__ )
require 'lplib'
 
def insert_css_link( html_filename )
  lp_dir = File.dirname( html_filename )
  css_filename = lp_dir + "/" + "styles.css"
  if File.exist?( css_filename )
    css_link = "  <link rel=stylesheet type=\"text/css\" href=\"styles.css\" title=\"css\">"
    html = Lplib.readfile( html_filename )
    html.sub!( /<\/head>/m, "#{css_link}\n</head>" )
    Lplib.writefile( html_filename, html )
    puts "Inserted CSS link into file: #{html_filename}"
  else
    puts "No CSS style sheet added."
  end
end
 
insert_css_link( ARGV[ 0 ] )
EOT
 
cat > ./lib/lplib.rb <<EOT
# Do not edit, file is generated.
 
class Lplib
  def self.readfile( filename )
    file = File.open( filename, "r" )
    ret_val = file.read
    file.close
 
    return ret_val
  end
 
  def self.writefile( filename, content )
    file = File.open( filename, "w" )
    file.write( content )
    file.close
  end
end
EOT
 
12. Inlining Images
-------------------
 
When a single line contains no spaces and the whole line names an existing
image file, the image will be inlined into the HTML output (as an img tag with
a src attribute).
 
We can therefore keep the processing separate from the rest by just post
processing the resulting HTML file.
 
cat > ./bin/inline_images.rb <<EOT
#! /usr/bin/env ruby
# Don't alter this file, it gets auto generated by lp!
 
$: << File.dirname( __FILE__ ) + "/../lib"
require 'lplib'
 
html_filename = ARGV[ 0 ]
if html_filename == nil
  puts "ERROR: no HTML input file provided!"
  exit 2
end
if File.exist?( html_filename ) == false
  puts "ERROR: given HTML file name does not exist!"
  exit 2
end
 
#puts "inlining"
 
html = Lplib.readfile( html_filename )
#puts html.length
 
changed = false
while true
  regexp = /^[^\s<]+\.(jpg|png|webp)<br \/>\n/m
  index = html.index( regexp )
  if index == nil
    break
  end
  #puts index
  output = html[ 0 ... index ]
  image_filename = html[ index .. -1 ].sub( /<br \/>\n.*\z/m, '' )
  output << "<img src=\"#{image_filename}\" /><br />\n"
  html = html[ index .. -1 ]
  html.sub!( regexp, '' )
  output << html
  html = output
  changed = true
end
if changed
  bak_filename = html_filename + ".bak"
  File.rename( html_filename, bak_filename )
  Lplib.writefile( html_filename, html )
  puts html_filename
else
  puts "Info: HTML file has no images, no need to save file again: #{html_filename}"
end
 
EOT
 
13. Installation
----------------
 
OK, finally we handle the installation process. Normally this is the first step,
together with the requirements. But as this document got written while
developing the program, this section ended up at the end (for now).
 
13.1 Requirements
'''''''''''''''''
 
Ruby version 2 must be in the environment PATH. The scripts and the configure
check below call it as 'ruby'.
 
cat > ./configure <<EOT
#! /bin/bash
# Do not edit this file, it gets generated automatically by lp.
 
echo -e "\nChecking requirements for lp...\n"
 
# Obviously bash is installed or this script would not start at all.
type ruby  || { echo "ERROR: ruby is missing."; exit 2; }
type make  || { echo "ERROR: make is missing.";  exit 2; }
type bash  || { echo "ERROR: bash is missing.";  exit 2; }
# The following tools are optional, so a missing one must not abort the check.
type tidy  || echo "INFO: tidy command for checking HTML is missing, but it is not essential."
type links || echo "INFO: links text web browser is missing, but it is not essential."
type which || echo "INFO: the 'which' command is missing, but it is only used for testing."
 
echo -e "\nAll requirements for lp are fulfilled."
echo -e "\nIn order to build the project type: make\n"
EOT
 
So the user has to run the classic:
 
./configure
make
sudo make install
 
Afterwards make sure the 'lp' command is included in the environment PATH:
 
export PATH=$PATH:/usr/local/lp/bin
 
14. Download
------------
 
And yeah, where to get it?
 
git clone http://techinvest.li/git/lp.git lp
cd lp
 
And then you do the above installation.
 
15. Usage
---------
 
Here is an example:
 
cat > ./test/example.txt <<EOT
Example Program
===============
 
This is an example to demonstrate the usage of the lp program.
 
cat > test/example_script.sh <<EOF
#! /bin/bash
 
echo "hello, world!"
EOF
EOT
 
And then to use lp on this file:
 
lp test/example.txt
 
And to execute the script created by that lp document:
 
./test/example_script.sh
 
16. Add a TOC
-------------
 
How to add a table of contents to an HTML document?
 
After you have created the HTML document as usual, run the add_toc.rb script.
It either uses the default doc/index.html file or the HTML file given as a
parameter.
 
17. 'lp' Convenience Script
---------------------------
 
Instead of using ./bin/lp.rb it is much easier to just type 'lp'.
 
cat > bin/lp <<EOT
#! /bin/bash
# Do not edit this file, it gets automatically generated by lp.
 
if [ "$APPHOME" = "" ]
then
  PRG=$0
  progname=$(basename $0)
  while [ -h "$PRG" ]
  do
    ls=$(ls -ld "$PRG")
    link=$(expr "$ls" : '.*-> \(.*\)$')
    if expr "$link" : '/.*' > /dev/null
    then
      PRG="$link"
    else
      PRG="$(dirname $PRG)/$link"
    fi
  done
 
  export APPHOME=$(dirname "$PRG")/..
  export APPHOME=$(cd $APPHOME >/dev/null; pwd)
fi
 
"$APPHOME/bin/lp.rb" "$@"
EOT
 
Then make sure the bin directory is included in your environment PATH variable,
or that you have a link to lp in your environment.
 
18. Project Structure and Conventions
-------------------------------------
 
To ensure a clear separation between documentation and generated artifacts, a
standard project structure is recommended. This structure is inspired by common
software development practices and aligns with the LP philosophy of keeping
source and output distinct.
 
The canonical entry point for any project using 'lp' is the file
'doc/index.txt'. This file serves as the main documentation hub.
 
Default File Layout:
 
myproject/
+- bin/
|  +- myproject.rb      # The default Ruby script generated by lp for this
|  |                    # project.
|  +- (more generated scripts)
+- doc/
|  +- index.txt         # The main literate input source file.
|  +- index.html        # The HTML output generated by lp.
|  +- styles.css        # Optional CSS styling.
+- src/
|  +- (more generated source files)
+- Makefile             # Optional build automation that can invoke lp.
 
The only mandatory file in this case is myproject/doc/index.txt, and only if
'lp' gets invoked without specifying any input text file. Even this file is
optional if lp is used with the input file given on the command line.
 
Usage Scenarios:
 
1. Default Build: Running the command 'lp' (without arguments) inside the
   project root directory automatically looks for 'doc/index.txt' and
   generates 'doc/index.html'. It also processes any specific file extractions
   defined within that document (such as scripts inside 'bin/').
 
2. Specific File Build: To process a specific text file and generate a
   corresponding HTML file (e.g., 'doc/example.txt' -> 'doc/example.html'),
   run: lp doc/example.txt
 
3. Project Root: The 'lp' tool expects to be run from the root of the
   project directory. This ensures that relative paths for file extraction
   (like 'cat > bin/script.sh') resolve correctly relative to the project
   root.
 
This structure ensures that the raw text files ('*.txt') remain the single
source of truth, while all generated artifacts ('*.html', '*.sh') are clearly
distinguished or placed in appropriate directories. It is a good habit to
indicate inside the generated files that users should not edit them, as the
changes will most likely be overwritten and lost.
 
19. Specification
-----------------
 
19.1 Overview
'''''''''''''
 
It is time to summarize what we got so far in features and behavior in a single
place now regarding the input text format and its conversion into HTML format:
 
The LP format is designed to be a plain text document first, readable and
editable in any text editor without the need for special tools or markup
parsers. The format prioritizes human readability over machine processing,
using minimal, intuitive visual cues for structure.
 
19.2 General Principles
'''''''''''''''''''''''
 
- No escaping: The text should remain entirely readable without processing. No
  XML tags or special escape sequences are used.
- File extensions: input files use the '.txt' extension.
 
19.3 Headings
'''''''''''''
 
Headings are defined visually using underlines. The length of the underline
must match the length of the heading text exactly.
 
- Level 1 (Main Title): Underlined with equal signs ('=').
- Level 2 (Sections): Underlined with minus signs ('-').
- Level 3 (Subsections): Underlined with single quote signs ("'").
- Level 4 (Subsubsections): A simple headline with an empty line above and
  below.
 
Here is an example with different heading levels (note, because we don't wish
to render this example in this document, we left two underline markers out in
each line to fool the algorithm):
 
'''
Main Title
= ====== =
 
Section Title
- --------- -

Subsection Title
' '''''''''''' '

Subsubsection Title
 
Here is some text or an entire paragraph
with one or more lines.
'''
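
The heading rules above can be sketched in a few lines of Ruby. This is only an
illustration and not part of lp itself: the method name 'heading_level' and the
constant are made up for this example, and Level 4 headings are left out
because they carry no underline that could be checked.

```ruby
# Hypothetical helper, not part of lp: maps the underline character to the
# heading level described in this section.
UNDERLINE_LEVELS = { "=" => 1, "-" => 2, "'" => 3 }

# Returns the heading level (1, 2 or 3) for a text line and the line that
# follows it, or nil if the pair does not form a heading.
def heading_level( text_line, underline )
  return nil if text_line.strip.empty? || underline.empty?
  char = underline[ 0 ]
  level = UNDERLINE_LEVELS[ char ]
  return nil if level.nil?
  # The underline must consist of one repeated character only and must have
  # exactly the same length as the heading text.
  return nil unless underline == char * text_line.length
  level
end

puts heading_level( "19.3 Headings", "'" * 13 ).inspect
puts heading_level( "Not a heading", "---" ).inspect
```

The second call returns nil, because the underline is shorter than the text
above it.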
 
19.4 Main Source Code Block
'''''''''''''''''''''''''''
 
Code is embedded within the text. It is identified by specific start and end
markers.
 
- Start Marker: a line beginning with a shebang '#!'.
  - Example: '#! /usr/bin/env ruby'
- End Marker: a line starting with the word 'exit' (followed by an opening
  parenthesis or whitespace).
  - Example: 'exit( 0 )'
- Behavior: everything between and including both the '#!' line and the 'exit'
  line is considered executable code. This code is extracted to a separate
  script file determined by the project's directory name.
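
How such an extraction could look is sketched below. This is an assumption for
illustration only: the real work is done by bin/bootstrap.rb, which is not
shown in this document, and the method name 'extract_main_block' is invented
here.

```ruby
# Hypothetical sketch, not the actual bootstrap.rb: collects all lines from
# the first shebang line up to and including the first line that starts with
# 'exit' followed by an opening parenthesis or whitespace.
def extract_main_block( text )
  code = []
  inside = false
  text.each_line do |line|
    inside = true if !inside && line.start_with?( "#!" )
    next unless inside
    code << line
    break if line =~ /^exit[\(\s]/
  end
  code.join
end

sample = [
  "Some introduction text.",
  "",
  "#! /usr/bin/env ruby",
  "puts 'hello'",
  "exit( 0 )",
  "",
  "More text after the program."
].join( "\n" )

puts extract_main_block( sample )
```

For a text without any shebang line the method returns an empty string, which
would correspond to a pure text document without a main program.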
 
19.5 Source File Extraction (Here-Docs)
'''''''''''''''''''''''''''''''''''''''
 
To generate specific files (scripts, configuration, etc.) from the text, LP
utilizes a "Here-Doc" style syntax similar to Unix shell scripts.
 
- Syntax:
  'cat > example.suffix <<EOF' (or '<<EOT')
  - Write Mode: 'cat > example.suffix' creates or overwrites the specified
    file.
  - Append Mode: 'cat >> example.suffix' appends to the file (detected via the '>>'
    operator instead of using '>').
- Content: all lines following this marker are written to the specified file
  until the closing marker is found.
- Closing Marker: a line containing only 'EOF' or 'EOT' (matching the opening
  marker).
- File Permissions: generated files are set to mode 700 ('chmod 700'), as they
  are assumed to be executable scripts.

Here is an example:
 
cat > ./tmp/hello_world.sh <<EOF
#! /bin/bash
# Do not edit this file directly - it gets automatically generated.
 
echo "hello, world"
EOF
 
19.6 Links
''''''''''
 
URLs are automatically detected and converted into clickable links in the HTML
output.
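
As a small illustration, the substitution that lp.rb uses for http and https
URLs can be tried on a single line. The regular expression is taken from the
main script above. The wrapper method 'linkify' and the sample line are made up
for this sketch.

```ruby
# The regular expression is the one from lp.rb; only the wrapper method and
# the sample line are invented for this example.
def linkify( html )
  html.gsub( /(^|\s)(https?):\/\/([^\s<]+)(\s|<)/m,
             "\\1<a href=\"\\2://\\3\">\\2://\\3</a>\\4" )
end

puts linkify( "See https://example.org/doc for details.<br />" )
```

Note that a URL is only recognized if it is preceded by the start of a line or
whitespace, and followed by whitespace or a '<' character.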
 
19.7 Visual Formatting Notes
''''''''''''''''''''''''''''
 
- Line Breaks: lines are preserved.
- Special Characters: ampersands ('&') and angle brackets ('<', '>') are
  HTML-escaped in the output.
- Spaces: Multiple spaces are preserved using non-breaking space entities
  ('&nbsp;').
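
The order of these replacements matters: the ampersand has to be escaped first,
otherwise the '&' of an already produced '&lt;' would be escaped a second time.
A minimal sketch that mirrors the substitutions in lp.rb, with the method name
'escape_line' chosen just for this example:

```ruby
# Mirrors the substitutions in lp.rb; the method name is hypothetical.
def escape_line( text )
  text.gsub( /&/, "&amp;" )
      .gsub( /</, "&lt;" )
      .gsub( />/, "&gt;" )
      .gsub( /  /, "&nbsp;&nbsp;" )
end

puts escape_line( "if a < b && c  > d" )
```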
 
19.8 Images
'''''''''''
 
If a line contains only a valid image filename (e.g., 'logo.png'), it is
automatically inlined into the HTML output as an '<img>' tag.
 
Example: 'logo.png' -> '<img src="logo.png" />'
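
In Ruby this rule might be sketched as follows. The list of file extensions
and the names IMAGE_LINE and inline_image are assumptions for illustration;
what lp really accepts as a valid image filename may differ.

```ruby
# Assumed pattern: the whole line is one filename with an image extension.
IMAGE_LINE = /\A\s*([\w.\-\/]+\.(?:png|jpe?g|gif|svg))\s*\z/i

# Returns an img tag for an image-only line and the unchanged line otherwise.
def inline_image( line )
  match = IMAGE_LINE.match( line )
  match ? "<img src=\"#{match[ 1 ]}\" />" : line
end

puts inline_image( "logo.png" )
puts inline_image( "The file logo.png is mentioned inside a sentence." )
```

Only the first line gets converted, because the second one contains more than
just the filename.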
 
19.9 Table of Contents (TOC)
''''''''''''''''''''''''''''
 
A Table of Contents is automatically generated for the HTML output based on the
heading structure (Section 3). It is inserted before the first sub-heading in
the document.
 
19.10 CSS Styling
'''''''''''''''''
 
Use of CSS styles is optional. However, if a file named 'styles.css' exists in
the same directory as the input text file, it is automatically linked in the
generated HTML output. No markers are required in the text file itself.
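
The lookup could be sketched in Ruby as shown here. The method name
stylesheet_link and the exact form of the link tag are assumptions, not taken
from lp.

```ruby
require 'tmpdir'

# Hypothetical sketch: returns a link tag if styles.css sits next to the
# input text file, and an empty string otherwise.
def stylesheet_link( txt_path )
  css_path = File.join( File.dirname( txt_path ), "styles.css" )
  if File.exist?( css_path )
    '<link rel="stylesheet" href="styles.css" />'
  else
    ""
  end
end

Dir.mktmpdir do |dir|
  txt_path = File.join( dir, "README.txt" )
  puts stylesheet_link( txt_path ).inspect
  File.write( File.join( dir, "styles.css" ), "body { margin: 2em; }" )
  puts stylesheet_link( txt_path ).inspect
end
```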
 
20. Version
-----------
 
Let's print the version of lp with the --version option.
 
Because lp.rb was the very first program ever written with lp, this document
does not use the cat redirect technique to create the lp.rb file in multiple
steps.
 
Anyway, first here is the code:
 
cat > ./bin/temp.rb <<EOT
if ARGV.include?( "--version" )
  printf "lp "
  lp_home = File.dirname( File.realpath( __FILE__ ) )
  local_command = "#{lp_home}/version.sh"
  system local_command
  exit 0
end
EOT
 
Now we need to add this to the main lp.rb file. Well, retroactively we add a
marker to the main file, # version placeholder #. This happens to be a comment
and is ignored by Ruby otherwise.
 
Here is a trick for inserting the temp file into lp.rb with two basic Unix
commands:
 
sed '/# version placeholder #/r  bin/temp.rb' bin/lp.rb     > bin/lp.tmp.rb
sed '/# version placeholder #/d'              bin/lp.tmp.rb > bin/lp.rb
rm                               bin/temp.rb  bin/lp.tmp.rb
 
So we add these three lines to the Makefile.
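
For comparison, the net effect of the two sed calls (append the snippet after
the marker line, then delete the marker line) can be sketched in plain Ruby.
This is only an illustration; the method name replace_placeholder is made up,
and lp itself uses the sed commands.

```ruby
# Hypothetical sketch: replaces every line containing the marker with the
# lines of the snippet. Both arguments are plain strings.
def replace_placeholder( source, snippet, marker = "# version placeholder #" )
  source.each_line.flat_map do |line|
    line.include?( marker ) ? snippet.lines : [ line ]
  end.join
end

source = "puts 'before'\n# version placeholder #\nputs 'after'\n"
snippet = "puts 'version check'\n"
puts replace_placeholder( source, snippet )
```

The snippet is assumed to end with a newline, as a file written by a here-doc
does.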
 
21. Help
--------
 
Just like with --version, first let's add the --help parameter.
 
cat > ./bin/temp_help.rb <<EOT
if ARGV.include?( "--help" )
  puts "lp [--help] [--version] [--presentation]"
  #puts "  --presentation   Creates a PDF in 16:9 format."
 
  exit 0
end
EOT
 
22. Release History
-------------------
 
Time to document the feature development a little.
 
2027-01-17   6.43   The table of contents now gets added to each HTML file, and
                    a new test checks the installation process.
2027-01-11   5.39   Subsections are now included in the table of contents.
2026-01-10   4.37   A lot more documentation. Use 'ruby' instead of 'ruby2'.
2025-10-15   3.32   Third public version.
2025-10-15   3.28   The table of contents now uses the same monospace font as
                    the rest of a generated HTML page.
2025-10-15   3.27   New release history.
2025-10-15   3.26   New --version feature for the lp command.
2025-03-12   2.17   Second published version.
2024-01-17   1.6    First published version.
2023-12-21   1.1    Minimalist first private working version.
 
23. Slides
----------
 
As with the table of contents, we need to split the whole input text into
chapters, which in this case are separate pages. Unlike the table of contents,
this step works on the original input text rather than on the generated HTML.
 
cat > ./bin/slides.rb <<EOT
#! /usr/bin/env ruby
# Do not edit this file, as it gets automatically generated by lp.
 
txt_filename = ARGV[ 0 ]
 
# Returns the content, lines, and pages with headline, index start, index end.
def split_pages( txt_filename_param )
  #puts "split_pages.enter"
 
  content = nil
  if File.exist?( txt_filename_param )
    content = Lplib.readfile( txt_filename_param )
  else
    puts "ERROR: file '#{txt_filename_param}' does not exist."
 
    exit 2
  end
 
  # A new page is defined as beginning of text or empty line, followed by a head
  # line, followed by a '=' or '-' line with the same length as the headline,
  # followed by the end of the text or an empty line.
  lines = content.split( /\n/ )
  status = "empty line"
  headline = nil
  # Contains an array for each page which contains the headline and the index of
  # the headline in the lines array and the last text line index.
  pages = []
  lines.each_with_index do |line, index|
    #puts "line[#{status}]=#{line}"
    if status == "empty line"
      if line =~ /^\s*$/
        next
      elsif line =~ /^\s*(-+|=+)\s*$/
        status = "some text"
        next
      else
        headline = line.sub( /^\s+/, '' )
        headline.sub!( /\s*$/, '' )
        status = headline.length
        next
      end
    elsif status == "some text"
      if line =~ /^\s*$/
        status = "empty line"
        puts "empty line"
        next
      end
    elsif status == "dashed line"
      if line =~ /^\s*$/
        # We should compare the length of the dashed line. But hey, we call it
        # Bingo anyway.
        if pages.length > 0
          pages[ -1 ] << (index - 4)
        end
        pages << [ headline, index - 2 ]
        status = "empty line"
        next
      end
    elsif status.class == Integer
      # Previously we got a headline.
      if line =~ /^\s*(-+|=+)\s*$/
        status = "dashed line"
        next
      elsif line =~ /^\s*$/
        status = "empty line"
        next
      else
        # So it wasn't a headline, just the beginning of some text.
        status = "some text"
        next
      end
    end
  end
  if pages.length > 0
    pages[ -1 ] << lines.length - 1
    if status == "empty line"
      pages[ -1 ][ -1 ] = pages[ -1 ][ -1 ] - 1
    end
    if status == "dashed line"
      pages[ -1 ][ -1 ] = pages[ -1 ][ -1 ] - 3
    end
  end
  #puts "status=#{status}"
  if status == "dashed line"
    pages << [ headline, lines.length - 2, lines.length - 1 ]
  end
 
  return content, lines, pages
end
 
$: << 'lib'
require 'lplib'
 
content, lines, pages = split_pages( txt_filename )
puts pages.inspect
 
$: << "#{ENV[ 'HOME' ]}/machine/src/rb/stacky/bin"
$: << "#{ENV[ 'HOME' ]}/machine/src/rb/stacky/lib"
 
require 'stacky'
 
title = pages[ 0 ][ 0 ]
push lines[ pages[ 0 ][ 1 ] .. pages[ 0 ][ 2 ] ]
stacky "[ -1 slice: ] right :"
stacky "3 right .s"
stacky "first_page ! .s"
 
stacky "["
push ".AUTHOR \"lp slides\""
push ".TITLE \"#{title}\""
#push ".SUBTITLE \"2025-02-22\""
push ".PDF_TITLE \"\\*[\$TITLE]\""
push '.\"'
#push ".DOCTYPE SLIDES ASPECT 16:9"
push ".DOCTYPE SLIDES A4"
#push ".PAPERMEDIA A4 landscape"
push ".DOCTYPE SLIDES"
push ".START"
push ".PP"
push ""
push ".PP"
push ""
push ".PP"
push ""
push ".PP"
push ""
push ".PP"
push ".HEADING 2 \"#{title}\""
push ".PP"
push ""
stacky "first_page @"
local_lines = pop
local_lines.each do |line|
  push ".PP"
  push line
end
 
# The remaining pages.
normal_pages = pages[ 1..-1 ]
 
while normal_pages.length > 0
  headline = normal_pages[ 0 ][ 0 ]
  push ".NEWPAGE"
  push ".PP"
  push ".HEADING 3 \"#{headline}\""
  push ".PP"
 
  normal_lines = lines[ normal_pages[ 0 ][ 1 ] .. normal_pages[ 0 ][ 2 ] ]
  3.times { normal_lines.delete_at( 0 ) }
  normal_lines.each do |line|
    push ".PP"
    push line
  end
 
  normal_pages.delete_at( 0 )
end
 
stacky "] lf join"
 
mom_filename = txt_filename.sub( /\.txt$/, '.mom' )
pdf_filename = txt_filename.sub( /\.txt$/, '.pdf' )
puts mom_filename
=begin
.PP
 
.PP
Exchange with Anton Schatz
.PP
 
.PP
Chr. Clemens Lahme
.PP
2025-02-22
.NEWPAGE
.B TechInvest
 
Tech Stack - Tools
.PP
Open-Source
=end
#push title
push mom_filename
stacky ".s writefile errorexit .s"
stacky "'pdfmom ' #{mom_filename} << ' > ' #{pdf_filename} << << system errorexit drop"
 
puts "SUCCESS: #{__FILE__} - 0."
EOT
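
The page rule from the comment in split_pages can also be shown in isolation.
The following Ruby sketch is a simplified stand-alone variant, not the code
used by slides.rb: the method name find_headlines is made up, and unlike
split_pages it really compares the length of the underline with the length of
the headline.

```ruby
# Hypothetical, simplified variant of the page rule: a headline is a non-empty
# line that follows an empty line (or the start of the text) and is followed
# by a line of '=' or '-' characters of the same length.
def find_headlines( lines )
  result = []
  lines.each_with_index do |line, index|
    title = line.strip
    next if title.empty?
    next unless index == 0 || lines[ index - 1 ].strip.empty?
    underline = ( lines[ index + 1 ] || "" ).strip
    next unless underline =~ /\A(-+|=+)\z/
    next unless underline.length == title.length
    # Remember the headline together with its index in the lines array.
    result << [ title, index ]
  end
  result
end

sample_lines = [
  "Title",
  "=====",
  "",
  "Intro text.",
  "",
  "First",
  "-----",
  "",
  "Body text.",
  "---"
]

puts find_headlines( sample_lines ).inspect
```

The last two sample lines are not reported as a headline, because the
underline is shorter than the text above it.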
 
24. Issues
----------
 
- The main code block extraction assumes that Ruby is used. That is not
  everybody's first choice. It should be more generic! Or is it already?
- ''' quotes should not format headings in the text.
- Are all cases of exit for the end of the canonical script covered? Do we need
  a test for it?
 
25. Conclusion
--------------
 
Just realized: we reinvented Org mode from Emacs. But no problem, we can still
use lp for generating code, and we can steal some tooling or syntax from Org
mode in the future. Yeah, using "~" instead of "'" might be a good idea.
However, my first impression is that '*' for underlining is too blatant for a
third-level heading.
 
26. Glossary
------------
 
26.1 Literate Programming
'''''''''''''''''''''''''
 
26.2 Markdown
'''''''''''''
 
26.3 Here-Document
''''''''''''''''''
 
A here-document (often abbreviated as heredoc) is a block of multi-line text
that is passed to a command as its standard input (stdin).
 
The standard syntax uses a "double less-than" ('<<') followed by a delimiter
token (usually a word like 'EOF' or 'END').
 
command << DELIMITER
This is line 1
This is line 2
And this is line 3
DELIMITER
 
Here-documents are a feature of the Bourne shell ('sh').
 
The feature was introduced by Stephen Bourne at Bell Labs in 1977-1979 when he
created the Bourne shell ('sh') to replace the Mashey shell. It was officially
released with Unix Version 7 in 1979.
 
Before the Bourne shell, the Thompson shell and Mashey shell supported input
redirection via '<', but they did not support embedding multi-line text blocks
directly in the script. Programmers had to create temporary files or chain
'echo' commands to pass text to utilities.
 
Stephen Bourne added the heredoc syntax to solve the problem of generating
multi-line data without creating temporary files on the filesystem.
 
The term comes from the phrase "here document" (as in, "the document starts
here"). In the syntax 'command << END', the '<<' operator essentially tells the
shell: "Take the input from here, directly in the script, rather than from a
file over there."
 
26.4 Emacs Org Mode
'''''''''''''''''''