10.3-old A simple text formatter system
Here we present another example of using nested classes. The domain is text formatting as known from e.g. Microsoft Word, and LaTex. The complexity of most of these systems is huge. Here we will thus stick to a very simple text formatting system.
The phenomena of our domain are documents that consists of a number of paragraphs. The paragraphs of a document are formatted according to a given template, which defines a number of styles for formating the paragraphs in the document. A style may e.g. define the font being used, the size of the font, the form of the font, etc. A paragraph may be left-aligned, centred, right-aligned, og with flushed margins.
In addition to styles, there are also properties associated with a document like page orientation (portrait or landscape), paper size (like A4, A3, and US letter), and number of columns.
To avoid ending up with a large example, we will keep the possible template and style properties to a minimum. The text formatter we define, can handle documents with two types of paragraphs: headings and body text. Headings are automatically numbered sequentially and body texts are indented by three spaces. A document has only one property which is the maximum length of a line in a paragraph.
We start by defining class Document:
class Document(temp: ref Template, lineMax: var integer):
class Paragraph(theContent: var String, theStyle: ref Style):
format -> theStyledPar: var String:
theStyledPar :=
theStyle.format(theContent.asString,lineMax)
content: obj OrderedList(Paragraph)
addPar(P: ref Paragraph):
content.insert(P)
format -> formattedContent: var String:
theStyledPar := ""
content.scan
formattedContent := formattedContent + current.format
print:
print(format)
- Class
Documenthas two parameterstemp, andlineMaxwhere temp is theTemplatedefining the format of theDocument, andlineMaxwhich defines the maximum length of a line in a paragraph.TempandlineMaxare used when adding paragraphs to a given document. - It has a local class
Paragraph, which has two parameters:theContentwhich is aString, andtheStylewhich is a reference to aStyleobject defining the style of the paragraph. - A
Paragraphhas aformatmethod that invokes theformatmethod of theStylewith theStringof theParagraphand thelineMaxas an argument. ClassParagraphis an example of a nested class. - The data-item
contentholds the paragraphs of theDocument. The objectcontentis of typeOrderedList(Paragraph).OrderedListis similar to classSetas described in Section 4. The difference is that there is no ordering of the elements in aSetobject whereas the elements of anOrderedListare ordered in the order they are inserted into the list. For details aboutOrderedList, see Section xxx. - The method
addParadds aParagraphto theDocumentby inserting it intoOrderedList. - The
printmethod ofDocumentinvokes theformatmethod and prints theStringreturned byformat.
Next, we define class Template:
class Template:
styles: obj OrderedList(Style) -- not used
heading: ref Style
bodyText: ref Style
inner(Template)
- The
Stylesdefined for a givenTemplateare held in astylesobject of typeOrderedList. - A
Templateobject has two stylesheadingandbodyTextwhich must be defined in subclasses ofTemplate
Next we define class Style:
class Style -> S: ref Style:
format(content: var String, lineMax: var integer)
-> formattedContent: var String:<
inner(format)
inner(Style)
S := this(Style)
In this simple example, a Style-object only has a virtual format method. Note that a generation of a Style object returns a reference to the generated object as expressed by assigning this(Style) to the return variable S.
We may now define DefaultHeading and DefaultBodyText as subclasses of Style:
class DefaultHeading: Style
secNo: var integer
format::<
secNo := secNo + 1
formattedContent := "\n" + secNo + ". " + content
class DefaultBodyText: Style
format::<
pos: var integer
pos := 3
formattedContent := "\n "
for (1):to(content.length):repeat
pos := pos + 1
if (pos > lineMax):then
formattedContent := formattedContent + "\n "
pos := 3
formattedContent := formattedContent + content[inx]
A DefaultHeading has a local variable secNo keeping track of the heading numbers. The format method increments secNo by one and returns a String consisting of a newline ("\n"), followed by secNo followed by a dot and a blank (". ") and the string hold in the content parameter of format.
The format method of class DefaultBodyText appends the characters in content to formattedContent. A newline and there blanks ("\n ") are inserted so each line has a maximum of lineMax characters.
We may define a standard template as a subclass of Template:
class StandardTemplate: Template
heading := DefaultHeading
bodyText:= DefaultBodyText
A StandardTemplate object initializes the predefined styles heading and bodyText to defaultHeading and defaultBodyText respectively.
We may enclose the above classes in a class SimpleTextFormatter:
class SimpleTextFormatter:
class Document:
...
class Template:
...
class Style:
...
class StandardTemplate: Template
...
We may now show a simple example of using the text formatter:
myFormatter: obj SimpleTextFormatter
doc: obj Document(StandardTemplate,10)
addCont(T: ref Text, S: ref Style):
addPar(Paragraph(T,S))
doc.addCont("Introduction",temp.heading)
doc.addCont("Once upon a time, ...",temp.bodyText)
doc.addCont("The problem",temp.heading)
doc.addCont("There was a programmer, ...",temp.bodyText)
doc.print
- The singular object
myFormatteris subclassed fromSimpleTextFormatter. - It defines a singular object
docsubclassed fromDocumentwith argumentsStandardTemplateand10– the latter settinglineMaxto 10 characters. Note thatStandardTemplategenerates an object and a reference to this object is passed as the first argument toDocument. - The
doc-object has a methodaddCont, which inserts aParagraphof a givenStylein theDocument. - Then
doc.addContis invoked four times putting two headings and two body texts into the document. For the headings, thetheStyle.bodyTextis used and for the body textstheStyle.bodyTextis used. - Finally it
printsthe document which gives the following output:
1. Introduction
Once up
on a tim
e, ...
2. The problem
There w
as a pro
grammer,
...
As can be seen, the format method may insert newlines inside a word – a real text formatting system should of course not do that.
Discussion
In the example above, we have only defined a single subclass, StandardDocument of Document and a single StandardTemplate. We may of course defined several subclasses of Document and Template. Most publishers, professional societies, etc., have their own document and template formats. For MS Word and LaTex there are numerous possibilities depending on the use of the document.
The above text formatter is very simple. If you consider at systems like MS Word, it contains a large amount of options for formating a given document. This includes fonts and font families – class Style may be extended to include a font, a font size, etc. A Paragraph may include how it is to be adjusted. A Document may have a size (A4, US Letter, …) and it may have a page orientation (landscape or portrait).
In MS Word its is possible to change the template of a given document. This implies that there must be a procedure/algorithm for converting the style of each paragraph to a similar style in the new tempalte. In MS Word this is mainly done by selecting a style of the same name.
It is a major task to develop a text system that resembles MS Word. The above classes may be extended to some extent, but it is unlikely that the structure of these classes may expand into something like MS Word. We leave it is a challenge!
Exercise
Add methods that generates a HTML-document from the content of a Document object.