Reference for unit 'HTML2TextRender' (#lazutils)

THTML2TextRenderer

Implements an HTML-to-Text renderer.

Declaration

Source position: html2textrender.pas line 31

type THTML2TextRenderer = class

public

  constructor Create();

  

Constructor for the class instance.

  destructor Destroy; override;

  

Frees the class instance.

  function Render();

  

Parses the HTML and renders the plain text output.

  property LineEndMark: string; [rw]

  

Defines the end-of-line character sequence.

  property TitleMark: string; [rw]

  

Defines the character used to delimit a title or header.

  property HorzLineMark: string; [rw]

  

Represents an HR tag in the plain text output.

  property LinkBeginMark: string; [rw]

  

Represents an A start tag in the plain text output.

  property LinkEndMark: string; [rw]

  

Represents an A end tag in the plain text output.

  property ListItemMark: string; [rw]

  

Represents a list item in the plain text output.

  property MoreMark: string; [rw]

  

Indicates that the plain text output is truncated due to a line limit restriction.

  property MaxLineLen: Integer; [rw]

  

Maximum number of characters allowed in a line output by the renderer.

  property IndentStep: Integer; [rw]

  

Increment (in spaces) for each nested HTML level.

end;

Inheritance

THTML2TextRenderer
|
TObject

Description

THTML2TextRenderer is an HTML-to-Text renderer. It converts HTML into plain text by stripping tags and their attributes. Converted text includes configurable indentation for HTML tags that affect the indentation level. Some HTML tags receive special processing in the renderer; the mark properties in the declaration cover HR, A, list item, and title or header tags.

The following named character entities are converted to their plain text equivalents:

&nbsp;  ' '
&lt;    '<'
&gt;    '>'
&amp;   '&'

Other named character entities or numeric character entities are included verbatim in the plain text output.
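The entity handling described above can be illustrated with a small stand-alone sketch. It is not the renderer's implementation: DecodeBasicEntities is a hypothetical helper written for this example, and it only mirrors the documented mapping, with every other entity left untouched.

```pascal
program EntityDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper (not part of HTML2TextRender): applies only the
// four documented conversions and leaves all other entities verbatim.
function DecodeBasicEntities(const S: string): string;
begin
  Result := StringReplace(S, '&nbsp;', ' ', [rfReplaceAll]);
  Result := StringReplace(Result, '&lt;', '<', [rfReplaceAll]);
  Result := StringReplace(Result, '&gt;', '>', [rfReplaceAll]);
  // '&amp;' is replaced last so that '&amp;lt;' yields the literal
  // text '&lt;' instead of being decoded twice.
  Result := StringReplace(Result, '&amp;', '&', [rfReplaceAll]);
end;

begin
  // The four documented entities are converted.
  WriteLn(DecodeBasicEntities('a&nbsp;&lt;&nbsp;b &amp;&amp; c &gt; d'));
  // prints: a < b && c > d

  // Other named or numeric entities pass through unchanged.
  WriteLn(DecodeBasicEntities('&copy; 2025 &#169;'));
  // prints: &copy; 2025 &#169;
end.
```

Replacing '&amp;' last is a choice made for this sketch; the order used inside the renderer is not stated in this reference.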

A UTF-8 Byte Order Mark in the HTML is ignored.

Set property values in the class instance to customize the content and formatting produced in the output. Use the Render method to parse and process the HTML content passed to the constructor, and generate the output for the class instance.
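The workflow in the preceding paragraph might look like the following minimal sketch. The declaration above elides the parameter lists of Create and Render, so two things here are assumptions: that Create accepts the HTML source as a string, and that Render can be called without arguments and returns the plain text as a string. The sample HTML and the mark values are illustrative only.

```pascal
program RenderDemo;
{$mode objfpc}{$H+}

uses
  HTML2TextRender;  // unit from the LazUtils package

const
  // Illustrative input; any HTML fragment would do.
  SampleHTML =
    '<h1>Menu</h1><ul><li>Fish &amp; chips</li><li>Soup</li></ul>' +
    '<hr><a href="https://example.com">More</a>';

var
  Renderer: THTML2TextRenderer;
begin
  // Assumption: the constructor takes the HTML source as a string.
  Renderer := THTML2TextRenderer.Create(SampleHTML);
  try
    // Customize the output before rendering.
    Renderer.LineEndMark   := #10;
    Renderer.TitleMark     := '*';
    Renderer.HorzLineMark  := '----------';
    Renderer.LinkBeginMark := '[';
    Renderer.LinkEndMark   := ']';
    Renderer.ListItemMark  := '- ';
    Renderer.IndentStep    := 2;

    // Assumption: Render returns the rendered plain text as a string.
    WriteLn(Renderer.Render);
  finally
    Renderer.Free;
  end;
end.
```

Setting the properties before the call to Render matters because they determine the content and formatting of the generated output; the try..finally block ensures the instance is freed even if rendering raises an exception.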

See also

THTML2TextRenderer.Create

  

Constructor for the class instance.

THTML2TextRenderer.Render

  

Parses the HTML and renders the plain text output.


Version 4.0 Generated 2025-05-03