erroronline1 / markdown
Markdown compiler for PHP and ECMAScript.
Requires
- php: >=8.0
My markdown parser from scratch
supposed to match GitHub-flavoured, basic and extended Markdown syntax to a reasonable extent
But why another?
This parser originates from another of my projects. That project has strict requirements regarding privacy and data integrity, so I tried to build myself whatever I was able to.
It may be useful to someone else, ideally being easy to tweak and understand, so I did not want it buried within another project's folder.
There are about 1000 PHP packages on Packagist matching this topic, BUT
- many have a huge number of other dependencies, which I generally avoid if I can.
- this project not only has a PHP library but also an ECMAScript one. You can decide whether to render the content on the server or let the user's machine do the work, with a slightly smaller payload. When using both in your project you can generally expect the same result.
- I can easily implement features I consider helpful for my projects.
Features
- Link auto-detection, as well as tel- and ftp-protocol
- Markdown link titles
- Auto-mailto
- Escaping code by double-backticks too
- Subscript, superscript and mark
- Custom header ids, as well as auto assigning referable ids to headers
- A custom Markdown ^^for larger text^^
safeMode does not convert links and HTML-escapes the characters relevant for script execution and insertion, to defuse malicious code in untrusted user input. Internal links like #heading are not affected though.
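The general idea behind escaping untrusted input can be sketched as follows. This is a minimal illustration and not the library's actual safeMode code; the names HTML_ENTITIES and escapeHtml are made up for this example, and the set of escaped characters is an assumption.

```javascript
// Hypothetical helper, not part of the library: replace the characters that
// matter for HTML and script injection with their HTML entities.
const HTML_ENTITIES = {
	"&": "&amp;",
	"<": "&lt;",
	">": "&gt;",
	'"': "&quot;",
	"'": "&#39;",
};

function escapeHtml(text) {
	// a single pass over the input, so already inserted entities are not escaped twice
	return String(text).replace(/[&<>"']/g, (char) => HTML_ENTITIES[char]);
}

// the script tag ends up as harmless text instead of being executed
console.log(escapeHtml("<script>alert('script injection')</script>"));
```

Escaping in a single pass keeps the ampersands of freshly inserted entities from being escaped again.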
Most of the major created element tags have a class="eol1_md" attribute, so you can style these more easily. This is applicable for
- a
- blockquote
- code
- dl (style dt and dd as children)
- img
- input (type checkbox)
- mark
- ol and ul (style li as children)
- pre
- span (for larger text)
- table (style tr, th and td as children)
If this is not enough you would probably wrap the output into a container and address its content for your CSS and query selectors.
In TCPDF-mode tables are prefixed with a linebreak to ensure correct nesting within lists. Also every odd row is assigned class="eol1_odd", because pseudoclasses are not supported.
Table conversion
The PHP-library has two additional methods to parse a CSV-file to a Markdown-table and vice versa.
$MARKDOWN->csv2md($path, $csv = ['separator' => ';', 'enclosure' => '"', 'escape' => '']);
$MARKDOWN->md2csv($content, $csv = ['separator' => ';', 'enclosure' => '"', 'escape' => '']);
handle this task and can take CSV-formatting into account.
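To illustrate what such a conversion involves, here is a rough ECMAScript sketch. It is not the library's csv2md (which is PHP-only and reads from a file path); the function name csvToMarkdownTable is hypothetical, and the sketch assumes that no separators or line breaks occur inside enclosed fields.

```javascript
// Hypothetical sketch, not the library's csv2md: turn simple CSV text into a
// Markdown table. Assumes no separators or line breaks inside enclosed fields.
function csvToMarkdownTable(csvText, separator = ";", enclosure = '"') {
	const rows = csvText
		.split(/\r?\n/)
		.filter((line) => line.trim() !== "")
		.map((line) =>
			line.split(separator).map((cell) => {
				let value = cell.trim();
				// strip one surrounding pair of enclosure characters, if present
				if (
					enclosure &&
					value.length >= 2 * enclosure.length &&
					value.startsWith(enclosure) &&
					value.endsWith(enclosure)
				) {
					value = value.slice(enclosure.length, -enclosure.length);
				}
				// a literal pipe would otherwise end the table cell
				return value.replace(/\|/g, "\\|");
			})
		);
	if (rows.length === 0) return "";

	const [header, ...body] = rows;
	const toLine = (cells) => "| " + cells.join(" | ") + " |";
	// the first row becomes the header, followed by the mandatory divider row
	return [toLine(header), toLine(header.map(() => "---")), ...body.map(toLine)].join("\n");
}

console.log(csvToMarkdownTable('Name;Value\n"a";1'));
```

A real CSV parser has to handle enclosed separators and escape characters as well, which is what the separator, enclosure and escape options above are for.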
Sample test
See the result from both parsers by loading the provided index.php-file in your browser. Play with the available options on your machine or head over to the live preview.
Installation
You know what? I hate unexpected dependencies and changes as much as the next guy. You are free to just grab the required file from the src-directory, import it on your own and handle changes and updates as you feel comfortable! There is only one file per language. I'm not the boss of you. Just respect the AGPL license.
The package does not really have any dependencies. It is still installable via Composer, to get standardized autoloader behaviour and because everyone but me is used to that. Run
composer require erroronline1/markdown
in your project directory and import the module with
require(__DIR__ . '/../vendor/autoload.php'); // or whatever your directory structure is
or
import { Markdown } from "../vendor/erroronline1/markdown/src/Markdown.js";
Use
Instantiate the Markdown class. In PHP you can choose to override some semantic HTML with tags supported by TCPDF v6.11.
// normal mode
$MARKDOWN = new erroronline1\Markdown\Markdown();
// or TCPDF-mode
$MARKDOWN = new erroronline1\Markdown\Markdown(true);
Convert your Markdown-content with
// normal mode
$mycontent = $MARKDOWN->md2html($mycontent);
// or safeMode to avoid malicious script insertion
$mycontent = $MARKDOWN->md2html($mycontent, true);
The same goes for the ECMAScript version. Instead of the TCPDF-flag, the md2html-method can be passed a selection of tags to render, while all others are ignored. This may improve performance when only a few features are needed. Without a selection, all formatting is applied.
const MARKDOWN = new Markdown();
mycontent = MARKDOWN.md2html(mycontent, true, ["emphasis", "larger", "br"]);
will only render bold and italic, my custom larger text and linebreaks. The safeMode will still be applied.
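The idea of rendering only selected features can be sketched with a toy renderer. This is not how the library is implemented; FEATURES and renderSelected are hypothetical names, the regular expressions are deliberately simplistic, and the generated HTML is an assumption based on the element list above.

```javascript
// Hypothetical toy renderer, not the library's implementation: every feature is
// a named text transformation, and only the selected ones are applied.
const FEATURES = {
	// ***bold and italic*** is left out to keep the sketch short
	emphasis: (text) =>
		text
			.replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>")
			.replace(/\*(.+?)\*/g, "<em>$1</em>"),
	larger: (text) => text.replace(/\^\^(.+?)\^\^/g, '<span class="eol1_md">$1</span>'),
	mark: (text) => text.replace(/==(.+?)==/g, '<mark class="eol1_md">$1</mark>'),
	// two or more trailing spaces mark an intentional linebreak
	br: (text) => text.replace(/ {2,}\n/g, "<br />\n"),
};

function renderSelected(text, selected = Object.keys(FEATURES)) {
	return selected
		// silently skip names that are not known features
		.filter((name) => Object.prototype.hasOwnProperty.call(FEATURES, name))
		.reduce((result, name) => FEATURES[name](result), text);
}

// only emphasis is rendered, the ==mark== syntax stays untouched
console.log(renderSelected("**bold** and ==marked==", ["emphasis"]));
```

Skipping transformations that are not selected saves their regular expression passes, which is where the possible performance gain comes from.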
Output
Use the following sample to check against other Markdown-parsers and decide for yourself which one is more suitable for your needs.
# Plain text (h1 header)
(ATX)
This is a markdown flavour for basic text styling.
Lines should end with two or more spaces
to have an intentional linebreak
and not just continuing.
Text can be *italic*, **bold**, ***italic and bold***, ~~striked through~~, and `code style` with two or more characters between the symbols.
Some escaping of formatting characters is possible with a leading \ as in
**bold \* asterisk**, ~~striked \~~ through~~ and `code with a \`-character`.
also ``code with ` escaped by double backticks`` and ==marked text==
Subscript like H~2~O and superscript like X^2^
Custom markdown for this engine for making ^^font larger^^
[ ] task
[x] accomplished
http://some.url, not particularly styled
a phone number: tel:012345678
[Styled link to markdown information](https://www.markdownguide.org)
Plain text (h1 header)
======================
(SETX)
--------
## Lists (h2 header) {#withcustomid}
1. Ordered list items start with a number and a period
* Unordered list items start with asterisk or dash
* Sublist nesting
* is possible
* by indentating with four spaces
1. and list types
2. are interchangeable
2. Ordered list item
with
multiple lines
1. the number
1. of ordered lists
2. actually doesn't
3. matter at all
### Nested items in lists
1. List item with
> Blockquote as item
2. Next list item with
|Table|Column2|
|---|---|
|R1C1|R1C2|
4. List item with
~~~
code with
multiple line
~~~
8. List item with
[x] accomplished task
[ ] unaccomplished task
## Tables (h3 header)
| Table header 1 | Table header 2 | Table header 3 | and 4 |
| --- | --- | --- | --- |
| *emphasis* | **is** | ***possible*** | `too` |
| linebreaks | are | not | though<br />without HTML `<br />` |
- - -
#### Blockquotes and code (h4 header)
> Blockquote
> with *multiple*
> lines
preformatted text/code must
start with 4 spaces <code>
~~~
or being surrounded by
three \` or ~
~~~
#### Nested items in blockquotes
> * List within blockquote 1
> * List within blockquote 2
> * Nested list
>
> ~~~
> Code within blockquote
> ~~~
>> Blockquote within blockquote
>
> | Tables nested | within | blockquotes |
> | :---------- | :-----: | ---: |
> | are | possible | as well |
> | like | aligning | colums |
>
> definition list
> : first definition
> : second definition
## Definitions and footnotes
definition list
: first definition
: second definition
Here's a simple footnote[^1], and here's a longer one[^bignote]. Footnotes will appear at the bottom later.
[^1]: This is the first footnote.
[^bignote]: Here's one with multiple paragraphs and code.
Indent paragraphs to include them in the footnote.
`code`
Add as many paragraphs as you like.
## Other features:
<http://some.other.url> with brackets, [urlencoded link with title](http://some.url?test2=2&test3=a=(/bcdef "some title") and [javascript: protocol](javascript:alert('hello there'))
some `code with <brackets>`
mid*word*emphasis and __underscore emphasis__
some@mail.address and escaped\@mail.address
 if loadable
123\. escaped period avoiding a list
[top header](#plain-text)
[second header](#plain-text-1)
[third header](#withcustomid)
### Safety related content that should pose lesser threat with safeMode
<script>alert('script injection')</script>
<a href="javascript:void(0)" onclick="alert('click event')">a with click event</a>
<a href="javascript:alert('click event')">href with click event</a>
[mdscript js href](javascript:alert('js href'))
<div onclick="alert('you clicked!')">clickable div</div>
renders to (look at the sourcecode, since not all features may be available in this preview...)
Plain text (h1 header)
(ATX)
This is a markdown flavour for basic text styling.
Lines should end with two or more spaces
to have an intentional linebreak
and not just continuing.
Text can be italic, bold, italic and bold, striked through, and code style with two or more characters between the symbols.
Some escaping of formatting characters is possible with a leading \ as in bold * asterisk, striked ~~ through and code with a `-character.
also code with ` escaped by double backticks and marked text
Subscript like H2O and superscript like X2
Custom markdown for this engine for making font larger
task
accomplished
http://some.url, not particularly styled
a phone number: tel:012345678
[Styled link to markdown information](https://www.markdownguide.org)
Plain text (h1 header)
(SETX)
Lists (h2 header)
- Ordered list items start with a number and a period
- Unordered list items start with asterisk or dash
- Sublist nesting
- is possible
- by indentating with four spaces
- and list types
- are interchangeable
- Ordered list item
with
multiple lines
- the number
- of ordered lists
- actually doesn't
- matter at all
Nested items in lists
- List item with
Blockquote as item
- Next list item with
| Table | Column2 |
|---|---|
| R1C1 | R1C2 |
- List item with
code with multiple line
- List item with
accomplished task
unaccomplished task
Tables (h3 header)
| Table header 1 | Table header 2 | Table header 3 | and 4 |
|---|---|---|---|
| emphasis | is | possible | too |
| linebreaks | are | not | though without HTML <br /> |
Blockquotes and code (h4 header)
Blockquote
with multiple
lines
preformatted text/code must start with 4 spaces <code>
or being surrounded by three ` or ~
Nested items in blockquotes
- List within blockquote 1
- List within blockquote 2
- Nested list
Code within blockquote
Blockquote within blockquote
| Tables nested | within | blockquotes |
|:---|:---:|---:|
| are | possible | as well |
| like | aligning | colums |
- definition list
- first definition
- second definition
Definitions and footnotes
- definition list
- first definition
- second definition
Other features:
<http://some.other.url> with brackets, [urlencoded link with title](http://some.url?test2=2&test3=a=(/bcdef "some title") and [javascript: protocol](javascript:alert('hello there'))
some code with <brackets>
midwordemphasis and underscore emphasis
some@mail.address and escaped@mail.address
if loadable
123. escaped period avoiding a list
top header
second header
third header
Safety related content that should pose lesser threat with safeMode
<script>alert('script injection')</script>
a with click event
href with click event
[mdscript js href](javascript:alert('js href'))
clickable div
- This is the first footnote. ↵
- Here's one with multiple paragraphs and code.
Indent paragraphs to include them in the footnote.
code
Add as many paragraphs as you like.
↵
without safeMode in about 0.5-2 ms in PHP (depending on the server) and 2-15 ms in ECMAScript (depending on the client's computing power). Is the source code tidy? Certainly not, but does that matter? Also no. It's about visuals anyway, isn't it?
Check for yourself.
Current limitations and things feeling off
- This flavour currently lacks support for
    - Syntax highlighting
    - Emojis