ReverseMarkdown is an HTML to Markdown converter library in C#. Conversion is reliable because the HtmlAgilityPack (HAP) library is used to traverse the HTML DOM.
If you have used and benefited from this library, please feel free to buy me a coffee!
Install the package from NuGet using Install-Package ReverseMarkdown, or clone the repository and build it yourself.
var converter = new ReverseMarkdown.Converter();
string html = "This is a sample <strong>paragraph</strong> from <a href=\"http://test.com\">my site</a>";
string result = converter.Convert(html);
Will result in:
This is a sample **paragraph** from [my site](http://test.com)
The conversion can be customized:
var config = new ReverseMarkdown.Config
{
    // Include the unknown tag completely in the result (default as well)
    UnknownTags = Config.UnknownTagsOption.PassThrough,
    // generate GitHub flavoured markdown, supported for BR, PRE and table tags
    GithubFlavored = true,
    // will ignore all comments
    RemoveComments = true,
    // remove markdown output for links where appropriate
    SmartHrefHandling = true
};
var converter = new ReverseMarkdown.Converter(config);
DefaultCodeBlockLanguage
- Option to set the default code block language for Github style markdown if class based language markers are not available
GithubFlavored
- Github style markdown for br, pre and table. Default is false
SuppressDivNewlines
- Removes prefixed newlines from div tags. Default is false
ListBulletChar
- Allows changing the bullet character. Default value is -. Some systems expect the bullet character to be * rather than -; this option lets you change it.
RemoveComments
- Removes HTML comments along with their text. Default is false
SmartHrefHandling
- How to handle the href attribute of <a> tags
false
- Outputs [{name}]({href}{title}) even if name and href are identical. This is the default option.
true
- If name and href are equal, outputs just the name. Note that if the Uri is not well formed as per Uri.IsWellFormedUriString (i.e. the string is not correctly escaped, like http://example.com/path/file name.docx), then markdown syntax will be used anyway.
- If href contains the http/https protocol and name doesn't, but they are otherwise the same, outputs href only.
- If href has a tel: or mailto: scheme but is otherwise identical to name, outputs name only.
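The SmartHrefHandling rules above can be illustrated with a minimal sketch. The sample link (text equal to its href) and the variable names are assumptions made for illustration; the comments restate the rules described above rather than a guaranteed output.

```csharp
using System;

// Hypothetical sample: a link whose text is identical to its href.
string html = "<a href=\"http://example.com\">http://example.com</a>";

var plain = new ReverseMarkdown.Converter(
    new ReverseMarkdown.Config { SmartHrefHandling = false });
var smart = new ReverseMarkdown.Converter(
    new ReverseMarkdown.Config { SmartHrefHandling = true });

string plainResult = plain.Convert(html).Trim();
string smartResult = smart.Convert(html).Trim();

// false: the full [{name}]({href}) form is kept even though both parts match.
Console.WriteLine(plainResult);
// true: name and href are equal, so only the name should remain.
Console.WriteLine(smartResult);
```

Trim() is used only to ignore any surrounding whitespace when comparing the two results.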
UnknownTags
- Handle unknown tags.
UnknownTagsOption.PassThrough
- Include the unknown tag completely in the result. That is, the tag along with the text will be left in the output. This is the default
UnknownTagsOption.Drop
- Drop the unknown tag and its content
UnknownTagsOption.Bypass
- Ignore the unknown tag but try to convert its content
UnknownTagsOption.Raise
- Raise an error to let you know
PassThroughTags
- Pass a list of tags to pass through as-is without any processing.
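To compare the UnknownTags options side by side, here is a small sketch. The tag name unknown-tag and the sample text are made up for illustration, and the code catches a general Exception for the Raise case because the exact exception type is not stated above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical input containing a tag the converter does not know.
string html = "Hello <unknown-tag>world</unknown-tag>";

var results = new Dictionary<string, string>();
var options = new[]
{
    ReverseMarkdown.Config.UnknownTagsOption.PassThrough,
    ReverseMarkdown.Config.UnknownTagsOption.Drop,
    ReverseMarkdown.Config.UnknownTagsOption.Bypass
};

foreach (var option in options)
{
    var converter = new ReverseMarkdown.Converter(
        new ReverseMarkdown.Config { UnknownTags = option });
    results[option.ToString()] = converter.Convert(html);
    Console.WriteLine($"{option}: {results[option.ToString()]}");
}

// Raise is handled separately because it signals an error instead of returning text.
bool raised = false;
try
{
    new ReverseMarkdown.Converter(
        new ReverseMarkdown.Config
        {
            UnknownTags = ReverseMarkdown.Config.UnknownTagsOption.Raise
        }).Convert(html);
}
catch (Exception ex)
{
    raised = true;
    Console.WriteLine($"Raise: {ex.GetType().Name}");
}
```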
WhitelistUriSchemes
- Specify which schemes (without trailing colon) are allowed for <a> and <img> tags. Others will be bypassed (output text or nothing). By default allows everything.
If string.Empty is provided, an href or src whose scheme cannot be determined is whitelisted.
The scheme is determined by the Uri class, except when the url begins with / (file scheme) or // (http scheme).
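A short sketch of WhitelistUriSchemes, assuming the option accepts an array of scheme names. The two sample links are hypothetical; per the description above, the ftp link is expected to be bypassed so that only its text is emitted.

```csharp
using System;

// Hypothetical input: one https link and one ftp link.
string html =
    "<a href=\"https://example.com\">site</a> " +
    "<a href=\"ftp://example.com/file.zip\">file</a>";

var config = new ReverseMarkdown.Config
{
    // Schemes are listed without the trailing colon.
    WhitelistUriSchemes = new[] { "http", "https" }
};

var converter = new ReverseMarkdown.Converter(config);
string result = converter.Convert(html);

// The https link keeps its markdown syntax; the ftp link is reduced to its text.
Console.WriteLine(result);
```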
TableWithoutHeaderRowHandling
- Handle tables without a header row
TableWithoutHeaderRowHandlingOption.Default
- First row will be used as header row (default)
TableWithoutHeaderRowHandlingOption.EmptyRow
- An empty row will be added as the header row
Note that UnknownTags config has been changed to an enumeration in v2.0.0 (breaking change)
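The two TableWithoutHeaderRowHandling options can be compared with the sketch below. It assumes the enumeration is nested in Config in the same way as UnknownTagsOption; the sample table is hypothetical.

```csharp
using System;

// Hypothetical table with no <th> cells, so there is no explicit header row.
string html =
    "<table>" +
    "<tr><td>a</td><td>b</td></tr>" +
    "<tr><td>c</td><td>d</td></tr>" +
    "</table>";

var firstRowAsHeader = new ReverseMarkdown.Converter(
    new ReverseMarkdown.Config
    {
        TableWithoutHeaderRowHandling =
            ReverseMarkdown.Config.TableWithoutHeaderRowHandlingOption.Default
    });

var emptyHeader = new ReverseMarkdown.Converter(
    new ReverseMarkdown.Config
    {
        TableWithoutHeaderRowHandling =
            ReverseMarkdown.Config.TableWithoutHeaderRowHandlingOption.EmptyRow
    });

string defaultResult = firstRowAsHeader.Convert(html);
string emptyRowResult = emptyHeader.Convert(html);

// Default: the first row (a, b) is used as the header row.
Console.WriteLine(defaultResult);
// EmptyRow: an empty header row is added and a, b stay in the table body.
Console.WriteLine(emptyRowResult);
```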
Use var config = new ReverseMarkdown.Config(githubFlavoured:true); for GitHub flavored markdown. By default, tables are always converted to GitHub flavored markdown regardless of this flag.
This library's initial implementation ideas came from the Ruby-based HTML to Markdown converter xijo/reverse_markdown.
Copyright © Babu Annamalai
ReverseMarkdown is licensed under MIT. Refer to License file for more information.