Universal markup converter
Org reader:
fancy_lists extension is enabled (#9042).
JATS reader:
<permissions> metadata (#9037, Julia Diaz). metadata objects with multiple fields are created, matching the structure in JATS.
attrib.
Markdown reader:
![[foo|bar]], when one of the wikilinks extensions is enabled (#8853).
! (#8254).
! before nonexistent reference (#9038).
LaTeX writer:
\cite and \bibitem to link up citations, even with citeproc (#9031). This will give us better accessibility; when tagging is enabled, the citation can be linked to the bibliography entry. This changes some of the details of the layout and the default template. We now make CSLReferences a special enumitem list that will contain \bibitems. Internal links inside citations to ids beginning in ref- are created using \cite instead of \hyperref.
\phantomsection and \label instead of \hypertarget (#9022).
\hyperref for LaTeX internal links, \hyperlink for beamer (since \hyperref doesn’t seem to work) (#9022).
% and # in URLs (#9014).
JATS writer:
back with an empty title.)
Man writer:
.PP right after a section heading (#9020). This is at best a no-op (in groff man and mandoc) and at worst (in some formatters) may create extra whitespace.
V and a macro that makes this act like boldface in a terminal and monospace in other formats. Unfortunately, this code uses a mechanism that is not portable (and does not work in mandoc) (#9020).
V for inline code, we simply use CR. Note that \f[CR] is emitted instead of plain \f[C], because there is no C font in man. (This produces warnings in recent versions of groff, #9020.)
.EX and .EE macros, together with .IP for spacing and indentation. This gives more standard code that can be better interpreted e.g. by mandoc (#9020).
Man template: don’t emit .hy, regardless of setting of hyphenate variable (#9020).
LaTeX template: special redefinition of \st for CJK (#9019). soul’s version raises an error on CJK text.
Use latest skylighting-format-blaze-html (#7248). This works around a longstanding iOS Safari bug that caused long lines to be displayed in a different font size in highlighted code.
Allow skylighting 0.14 (and require it in pandoc core).
Allow text 2.1.
Org reader: allow example lines to end immediately after the colon (Brian Leung).
Docx reader:
JATS reader: Fix display of block elements (#8889, Julia Diaz). A number of block elements, like disp-quote, list, and disp-formula, were always treated as inlines if appearing inside paragraphs, even when their usage warranted a separate block.
HTML reader: avoid duplicate id on header and div (#8991).
Typst writer:
~ for nonbreaking space, and escape literal ~ (#9010).
#block (#8991). Previously we were putting the label at the beginning of the Div’s contents, but according to the documentation such a label gets attached to the preceding element. We now use an explicit #block and add the label at the end.
LaTeX writer:
\hypertarget. This is unnecessary (hyperref creates an anchor based on the label) and it interferes with tagging. In addition, we now use \hyperref rather than \hyperlink for internal links. Currently \hypertarget is still being used for link anchors not on headings. Thanks to @u-fischer.
HTML format templates (style.html): Fix typo in clause for svg (Jackson Schuster).
Use latest texmath, typst-symbols, typst. Targets typst 0.7.
HTML reader: properly calculate RowHeadColumns (#8984). This fixes a bug in the calculation of the number of header columns in a table row. It also changes the algorithm for determining the table body’s RowHeadColumns based on the number of head columns in each row. Previously we used the max, and #8634 switched to the min, which led to bad results. Now we only set RowHeadColumns to a non-zero value if all rows have the same number of head columns.
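The rule described above can be sketched in a few lines. This is an illustration in Python, not pandoc's Haskell implementation; the function name `row_head_columns` and the list-of-counts input are assumptions made for the example.

```python
def row_head_columns(head_counts):
    """Return a body-level RowHeadColumns value.

    head_counts holds the number of leading header cells in each row
    (a hypothetical input shape chosen for this sketch).
    """
    if not head_counts:
        return 0
    first = head_counts[0]
    # Non-zero only when every row agrees on the number of head columns.
    return first if all(n == first for n in head_counts) else 0
```

For rows with 2, 1, and 2 leading header cells, the max strategy would give 2 and the min strategy 1, while this rule gives 0 because the rows disagree.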
OpenDocument writer:
mark (#8960). The color can be adjusted by modifying the Highlighted style.
Typst writer: escape // so it doesn’t get interpreted as a comment (#8966).
ChunkedHTML writer: Fix regression including MathJax script (#8967). The fix for #8620 caused the script to be included when the table of contents but not the body text of a page contains math. But it broke the case where the table of contents doesn’t contain math but the page does. This patch fixes the issue.
Text.Pandoc.SelfContained:
<use> (#8969).
Use pandoc-types 1.23.1. This fixes a regression with toJSONFilter (#8976), which in 1.23.0.1 no longer worked on pure values of type a -> [a].
Use ghc 9.6 for release builds (#8947).
Fix some links in FAQs (Diogo Almiro).
Fix new variant of the vulnerability in CVE-2023-35936. Guilhem Moulin noticed that the fix to CVE-2023-35936 was incomplete. An attacker could get around it by double-encoding the malicious extension to create or override arbitrary files.
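To see why double-encoding can slip past a check that decodes only once, consider the sketch below. It uses Python's standard `urllib.parse.unquote` rather than pandoc's own code, and the file name and the helper `fully_unquote` are made up for illustration.

```python
from urllib.parse import unquote


def fully_unquote(s, max_rounds=10):
    """Percent-decode repeatedly until the string stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(s)
        if decoded == s:
            break
        s = decoded
    return s


# "%25" decodes to "%", so one pass still leaves an escaped "%2e".
once = unquote("image%252elua")
# A second pass turns "%2e" into the literal ".".
twice = unquote(once)
```

A check that inspects the name after a single pass would still see the escaped form, which is the general shape of a double-encoding bypass.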
--embed-resources: Use inline SVG instead of data URIs for SVG images in HTML5 (#8948). Note that SelfContained does not have access to the writer name, so we check for HTML5 by determining whether the document starts with <!DOCTYPE html>. This means that inline SVG won’t be used when generating document fragments.
Fix regression on short boolean arguments (#8956). In 3.1.5 boolean arguments were allowed an optional argument (true|false). This created a regression for uses of fused short arguments, e.g. -somyfile.html, which was equivalent to -s -omyfile.html, but now raised an error because pandoc attempted to parse o as a boolean true or false. This change allows the fused short arguments to be used again. Note that -strue will be interpreted as -s with an argument true, not as -s -t -rue. It is best to use long option names with the optional boolean values, to avoid confusion.
Make --epub-title-page’s argument optional. It takes a boolean argument, and now that all of our boolean flags take such an argument, we can make this one optional for consistency.
Improve errors for illegal output formats. Previously if you did pandoc -s -t bbb, it would give you an error about the missing bbb template instead of saying that bbb is not a supported output format.
Improve errors for incorrect command-line option values (#8879). Always give the name of the relevant argument.
Fix typo on error message for incorrect --preserve-tabs argument. Thanks @fsoedjede.
Docx reader: use SVG version of image if present (#7244). Previously the backup PNG was exported even if an SVG was present, but the SVG should be preferred.
Typst reader: fix regression in recognition of display math (#8949). The last release caused all math to be parsed as inline math.
JATS writer: don’t use <code> for inline code (#8889). It is intended for block-level code.
HTML writer: don’t make line blocks sensitive to --wrap (#8952).
RST writer: fix figure handling (#8930, #8871). This fixes a number of regressions from pandoc 2.x. Properly handle caption, alt attribute in figures. No longer treat a paragraph with a single image in it as a figure (we have a dedicated Figure element now).
Docx writer: Copy “mirror margins” property from reference.docx (#8946).
Text.Pandoc.UTF8: Deprecate decodeArg, which is now a no-op. This was needed for old base versions which we no longer support.
Use released skylighting, typst.
Allow latest commonmark-extensions. This allows entities in wikilinks.
Switch back to using ghc 9.2 for linux and Windows binary releases (#8947, #8955). With ghc 9.4+, we were getting AVX instructions in the amd64 binary, which aren’t supported on older hardware. For maximum compatibility we switch back to ghc 9.2, which doesn’t cause the problem. (As documented, ghc should not be emitting these instructions, so we aren’t clear on the diagnosis, but the cure has been tested.)
Change Windows release build to use cabal instead of stack.
Allow all boolean flags to take an optional true or false value (#8788, Sam S. Almahri). The default is true if no value is specified, so this is fully backwards-compatible.
Support --id-prefix for markdown output (#8878).
Markdown reader:
Typst reader:
Add typst reader tests (#8942).
MediaWiki reader:
AsciiDoc writer:
asciidoc (#8936). The AsciiDoc community now regards the dialect parsed by asciidoctor as the official AsciiDoc syntax, so it should be the target of our asciidoc format. The asciidoc output format now behaves like asciidoctor used to. asciidoctor is a deprecated synonym. For the old asciidoc behavior (targeting the Python script), use asciidoc_legacy. The templates have been consolidated. Instead of separate default.asciidoctor and default.asciidoc templates, there is just default.asciidoc.
writeAsciiDoc now behaves like writeAsciiDoctor used to.
writeAsciiDoctor is now a deprecated synonym for writeAsciiDoc.
writeAsciiDocLegacy behaves like writeAsciiDoc used to.
Typst writer:
unlisted class in headings (#8941).
#bibliography command (#8937).
Docx writer:
DokuWiki writer: fix lists with Div elements (#8920). The DokuWiki writer doesn’t render Divs specially, so their presence in a list (e.g. because of custom-styles) need not prevent a regular DokuWiki list from being used. (Falling back to raw HTML in this case is pointless because no new information is given.)
LaTeX writer:
fa (should be persian).
Text.Pandoc.Class:
Add toTextM [API change]. This is like Text.Pandoc.UTF8.toText, except:
This replaces utf8ToText whenever we have the filename and are in a PandocMonad instance. This will lead to more informative error messages for UTF8-encoding, indicating the file path and byte offset where the error occurs (#8884).
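The idea of attaching a file path and byte offset to a decoding failure can be shown with a small analogue. The sketch below is Python, not the Haskell toTextM; the function name `to_text` and the message format are assumptions made for illustration.

```python
def to_text(path, data):
    """Decode UTF-8 bytes, reporting file path and byte offset on failure."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the byte offset of the first undecodable byte.
        raise ValueError(
            f"{path}: invalid UTF-8 at byte offset {err.start}"
        ) from err
```

Carrying the path alongside the bytes is what lets the error name the offending file instead of reporting only that some input was malformed.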
Remove invalid term “Subject” from Turkish translations (#8921).
stack.yaml: add pkg-config to nix packages (#8927, pacien).
Allow aeson 2.2.
MANUAL: Add clarification on --section-divs. Closes #8882.
Fix a security vulnerability in MediaBag and T.P.Class.IO.writeMedia. This vulnerability, discovered by Entroy C, allows users to write arbitrary files to any location by feeding pandoc a specially crafted URL in an image element. The vulnerability is serious for anyone using pandoc to process untrusted input. The vulnerability does not affect pandoc when run with the --sandbox flag.
Allow epub-title-page to be used in defaults files (#8908).
Issue Extracting info message (in --verbose mode) when using --extract-media or extracting media temporarily in PDF production.
HTML reader: Update TableBody RowHeadColumns calculation (#8634, Ruqi). This change sets RowHeadColumns to the minimum value of each row, which gives better results in cases where rows have different numbers of leading th tags.
Dokuwiki reader: retain image query parameters as attributes (#8887, echo0).
Textile reader: Add support for link references (#8706, Stephen Altamirano). Textile supports what it calls “link alias”, which are analogous to Markdown’s reference-style links.
LaTeX reader: support alt text on images (#8743, Albert Krewinkel).
Commonmark reader: Make implicit_figures work again. Support for this (introduced in #6350) disappeared when we made an architectural change.
JATS reader:
JATS writer:
--number-sections work.
Mediawiki writer: allow highlighting to work for F# language (Adelar da Silva Queiróz).
LaTeX writer: Fix escaping of & in \href and \url (#8903).
Docx writer:
abstract-title to be specified in docx metadata (#8794).
ChunkedHTML writer: Make math work in top-level page (#8915).
Text.Pandoc.Logging: add new log message type ScriptingWarning [API change] (Albert Krewinkel).
Lua: report warnings from Lua scripts (Albert Krewinkel). Lua’s warning system is plugged into pandoc’s reporting architecture. Warnings that are raised with the Lua warn function are now reported together with other messages.
Use crypton-connection instead of connection (#8896, Felix Yan). Follows the change introduced in tls 1.7.0.
Bump versions for skylighting-core, skylighting.
Include lua/module/sample.svg in cabal extra-source-files (Felix Yan).
Add Nynorsk (New Norwegian) translations (Per Christian Gaustad).
Add tests for fillMediaBag/extractMedia.
INSTALL.md:
pandoc-extras.md: add to “Academic publishing workflows” (#8696, Vladimir Alexiev).
New input format: typst.
New module: Text.Pandoc.Readers.Typst [API change].
DocBook reader:
JATS reader:
Org reader (Albert Krewinkel):
#+NAME as synonym for #+LABEL (#8578).
ODT reader:
RST reader:
HTML reader:
RTF reader:
Docx reader:
Markdown reader:
~ and " in markdown_strict (#8777, Albert Krewinkel). This matches the behavior of the legacy Markdown.pl as well as what is described in the manual.
LaTeX reader: ignore args to column type in \multicolumn (#8789).
HTML writer:
Ms writer:
.TL.
LaTeX writer:
Jira writer:
OpenDocument writer:
text:p inside text:p from meta (#8256).
ODT writer:
manifest.version on directory file-entry (Michael Stahl). See ODF 1.3 part 2, 4.16.14.1.
MediaWiki writer:
Typst writer:
citations not enabled (#8763). With this change, the typst writer will omit the #bibliography command when citations is not enabled. (If you want to use pandoc’s own --citeproc, you should combine it with -t typst-citations to disable native typst citations.)
<..> for labels, create internal links.
#footnote for notes (#8893).
Commonmark writer:
EPUB template: add lang attribute to <html> (Gabriel Lewertoski).
Template styles.html: fix task-list styling in reveal.js (#8731, Albert Krewinkel).
LaTeX template: Fix \babelfont (#8728).
Text.Pandoc.Parsing:
parseFromString.
Text.Pandoc.ImageSize: Drop BOM at start of SVG if present. Otherwise our code can fail to determine image size.
Lua subsystem:
Fix YAML in translation files for cs and pl (#8787).
Fix pdf output via typst (#8754). One must now use typst compile rather than typst.
MANUAL.txt:
wikilinks to non-default extensions (Ilona).
# fancy list markers don’t work with commonmark (#8772, William Lupton).
fenced_div note (#8773, William Lupton).
Update documentation for org-mode (Christian Christiansen, #8716).
doc/lua-filter.md:
CONTRIBUTING.md: update info on ghc versions.
INSTALL.md:
Tests: Rename test/docx/block_quotes_parse_indent.native for consistency (Stephan Meijer).
Add tls constraint on cabal.project. This is needed to avoid problems caused by the transition to crypton.
Require texmath 0.12.8.
Add a Lua REPL (Albert Krewinkel). This can be started with pandoc lua -i. It is also possible to instruct a filter to open the REPL at a certain point, for debugging (see pandoc.cli.repl).
Support typst as a --pdf-engine.
Add typst writer (#8713). New module Text.Pandoc.Writers.Typst, exporting writeTypst [API change].
Org reader:
DocBook reader:
<part> (#8712).
HTML reader:
-native_spans-raw_html (#8711). Previously with this configuration, <span>s were not treated as inline elements at all.
HTML writer:
.svg.gz and .png.gz etc. (#8699).
--reference-location=section or =block, use an aside element for the notes rather than a section. When --reference-location=section, include the aside element inside the section element, rather than outside. (In slide shows, this option causes footnotes on a slide to be displayed at the bottom of the slide.)
EPUB writer:
Docx writer:
Markdown writer:
Jira reader (Albert Krewinkel):
! of image markup must now be followed by a non-space character; otherwise, the enclosed text is parsed as normal content.
Ms writer:
ICML writer:
LaTeX writer:
LaTeX template:
babelfonts variable to default LaTeX template. This allows specifying certain fonts to be used with certain babel languages. Thanks to Frederik Elwert.
Lua (Albert Krewinkel):
pandoc.cli.repl function.
json.encode for nested AST elements. Ensures that objects with nested AST elements can be encoded as JSON.
pandoc.text. This only affects the name in the Lua-internal documentation. It is still possible to load the modules via require 'text', although this is deprecated.
text to pandoc.text. The latter is easier to use and more consistent with the other modules.
pandoc.format.from_path.
Text.Pandoc.Format: Add new function formatFromFilePaths [API change] (#8710, Albert Krewinkel).
The old Text.Pandoc.App.FormatHeuristics module has been removed.
In --version, use Windows %APPDATA% variable to describe user data dir (#8686, Pablo Rodríguez).
Text.Pandoc.App.CommandLineOptions: don’t lowercase arg to --from/--read (Albert Krewinkel). This prevented users from using custom writers with uppercase characters in their filenames. Format-normalization, including lower-casing of format identifiers, happens during format parsing.
Documentation:
doc/nix.md.
doc/extras.md. This was formerly in the website repo.
doc/lua-filters.md: improve docs for pandoc.zip.
Factor out make_macos_release.sh from the release candidate workflow. Use cabal instead of stack to build the macos binary.
Modify linux/make_artifacts.sh so it will work on cirrus.
Switch to hslua-2.3
Depend on latest releases of texmath, doclayout.
EPUB reader: Give additional information in error if the epub zip container can’t be unpacked.
TSV reader: don’t gobble tabs as whitespace (#8661).
Org reader: accept empty tables (#8659).
LaTeX reader: fix multiplication syntax for tabular (#8658). We recognized *{6}{...} but not *6{...} or *6c.
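The three spellings of the repetition syntax can be illustrated with a small expander. This is a sketch in Python, not the LaTeX reader's parser; the name `expand_repeats` is hypothetical. It assumes TeX's rule that an unbraced argument is a single token, and it does not handle nested braces such as p{2cm} inside the repeated group.

```python
import re

# Count: "{6}" or a single unbraced digit. Columns: "{...}" or one token.
_REPEAT = re.compile(r"\*(?:\{(\d+)\}|(\d))(?:\{([^{}]*)\}|([^{}\s]))")


def expand_repeats(spec):
    """Expand *{n}{cols}, *n{cols}, and *nc in a tabular column spec."""
    def repl(match):
        count = int(match.group(1) or match.group(2))
        cols = match.group(3) if match.group(3) is not None else match.group(4)
        return cols * count
    return _REPEAT.sub(repl, spec)
```

All three forms, `*{6}{c}`, `*6{c}`, and `*6c`, expand to six `c` columns under these assumptions.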
Docx reader: parse image alt texts in LibreOffice generated files. LibreOffice tags images slightly differently than Word; this change lets the parser take that difference into account when looking for an image description (alt text).
DocBook reader:
<xref> references to tables in DocBook files (#8626, Pavol Otto).
figure as a Figure element in the AST (#8668).
JATS reader: avoid generating duplicate figure captions (#8669).
RST reader: align with spec in syntax for role names (#8653). In particular, we now allow colons in role names.
Add note on converting from .doc format to FAQs (#8654).
Trap error in getAppUserDataDirectory (#8648). This can raise an error if pandoc is run in a non-user environment.
LaTeX writer: do not use longtable foot with Beamer (#8638, Albert Krewinkel). The table foot is made part of the table body, as otherwise it won’t show up in the output. The root cause for this is that longtable cannot detect page breaks in Beamer.
LaTeX template: Add CJKsansfont and CJKmonofont for XeLaTeX (#8656, Yudong Jin). CJKsansfont and CJKmonofont will be set for xelatex only if CJKmainfont is also provided.
URL style in ConTeXt (#8612, Thomas Hodgson). Previously, a URL like this would be in monospace text: \useURL[url1][https://example.com]. Now, it will match the main text unless the linkstyle variable is set, which controls the styling of all links. Closes #8602.
Asciidoc writer: Properly escape | in table cells (#8665).
asciidoc{,tor} template: fix revision date when author is unset (#8637, arcnmx). Revision line syntax is only valid in combination with an author line, so the date attribute must be set explicitly when the author is missing.
HTML writer: allow “track” element to be treated as block-level HTML (#8629).
Include needed polyfill when MathJax is used (#8625).
JATS writer: include alt-text in <graphic>, <inline-graphic> elements (#8631, Albert Krewinkel).
Chunked HTML writer: Retain metadata in processing sections for chunked HTML (#8620). Previously we suppressed metadata in all but the top page, in order to prevent the title block from being printed on every page. This prevented use of custom variables set by metadata fields. This commit moves to a better solution: a conditional in the default template restricts the title block to the top page.
Lua API:
pandoc.system.cputime (Albert Krewinkel). The function returns the CPU time consumed by pandoc and can be used to benchmark Lua computations.
pandoc.json to handle JSON encoding (#8605, Albert Krewinkel).
Use pandoc-lua-marshal 0.2.1 (Albert Krewinkel). All major AST elements now have __tojson metamethods that return the JSON representation of an element. This makes it possible to JSON-encode these elements with libraries that respect the __tojson metamethod, including dkjson.
Use latest zip-archive. This allows pandoc to open certain epubs that it could not open before.
Use commonmark-extensions 0.2.3.4. This fixes some bugs involving definition lists and inline formatting.
Use latest skylighting-format-context.
MANUAL.txt:
--mathml to reflect support in all major browsers (#8667).
docs/custom-readers.md: Update JSON parsing example. The example now uses the built-in pandoc.json library to parse the API output.
doc/press.md: Add article on CiTO in J Cheminform by @egonw.
doc/lua-filters.md: fix typo in run_json_filter (Morgan Willcock).
Fix regression with --print-highlight-style option (#8586).
Add new --chunk-template option (#8581), allowing more control over the filenames in chunked HTML output.
Text.Pandoc.App: Add optChunkTemplate constructor to Opt [API change].
Text.Pandoc.Options: add writerChunkTemplate constructor to WriterOptions [API change].
Text.Pandoc.Chunks: add Data, Typeable, Generic, ToJSON, FromJSON instances for PathTemplate [API change].
Text.Pandoc.Citeproc: Fix bug in metaValueToReference (#8611). This bug caused us to get some repeated content when converting MetaBlock to Inlines.
Textile reader:
ODT reader: fix blockquote indent detection (#3437, Daniel Kessler).
LaTeX writer: include short figure/table caption if one is given (Albert Krewinkel). Short captions are used by LaTeX when generating the list of figures or list of tables. Adding a short caption will now overwrite the full caption in these lists.
Powerpoint writer: fix handling of simple figures (#8565, Albert Krewinkel). This ensures that simple figures are displayed in the same way as before the introduction of a dedicated Figure constructor in the AST.
Improve handling of % in bib(la)tex parsing (#8597, #8595).
Use released skylighting 0.13.2.1
INSTALL.md: direct people to cabal install pandoc-cli.
doc/lua-filters.md: document ‘Figure’ type and constructor (Albert Krewinkel). Fix typos (Martin Joerg).
Fix link in manual (#8583, Salim B).