A fast and modern parser combinator library for Scala
Parsley is a fast and modern parser combinator library for Scala based loosely on a Haskell-style parsec API.
Parsley is distributed on Maven Central, and can be added to your project via:
// SBT
libraryDependencies += "com.github.j-mie6" %% "parsley" % "4.5.2"
// scala-cli
--dependency com.github.j-mie6::parsley:4.5.2
// or in file
//> using dep com.github.j-mie6::parsley:4.5.2
// mill
ivy"com.github.j-mie6::parsley:4.5.2"
Documentation can be found here
If you're a cats user, you may also be interested in using parsley-cats to augment parsley with instances for various cats typeclasses:
libraryDependencies += "com.github.j-mie6" %% "parsley-cats" % "1.3.0"
scala> import parsley.Parsley
scala> import parsley.syntax.character.{charLift, stringLift}
scala> val hello: Parsley[Unit] = ('h' ~> ("ello" | "i") ~> " world!").void
scala> hello.parse("hello world!")
val res0: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hi world!")
val res1: parsley.Result[String,Unit] = Success(())
scala> hello.parse("hey world!")
val res2: parsley.Result[String,Unit] =
Failure((line 1, column 2):
unexpected "ey"
expected "ello"
>hey world!
^^)
scala> import parsley.character.digit
scala> val natural: Parsley[Int] = digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)
scala> natural.parse("0")
val res3: parsley.Result[String,Int] = Success(0)
scala> natural.parse("123")
val res4: parsley.Result[String,Int] = Success(123)
For more see the Wiki!
parsec?
Mostly, this library is quite similar. However, due to Scala's differences in operator characters, a few operators are changed:
(<$>) is known as map
try is known as attempt
($>) is known as either #> or as
In addition, lift2 and lift3 are uncurried in this library: this is to provide better performance and easier usage with Scala's traditionally uncurried functions. There are also a few new operators in general to be found here!
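As a rough illustration of the renamed operators, the sketch below puts the Haskell spelling in a comment above each Scala equivalent. The object name ParsecStyle and the parsers inside it are made up for this example, and it assumes the 4.5.2 artifact shown in the installation instructions.

```scala
//> using dep com.github.j-mie6::parsley:4.5.2

import parsley.Parsley
import parsley.Parsley.attempt
import parsley.character.{char, digit, string}
import parsley.lift.lift2

// Hypothetical example object; none of these parsers ship with the library.
object ParsecStyle {
  // Haskell: digitToInt <$> digit
  val digitValue: Parsley[Int] = digit.map(_.asDigit)

  // Haskell: string "true" $> True
  val yes: Parsley[Boolean] = string("true").as(true)

  // Haskell: try (string "hey") <|> string "hello"
  // Without attempt, the first branch would consume "he" and block the second.
  val greeting: Parsley[String] = attempt(string("hey")) | string("hello")

  // Haskell: liftA2 (+) p q
  // Here lift2 takes an uncurried function and both parsers in one argument list.
  val sum: Parsley[Int] =
    lift2[Int, Int, Int](_ + _, digitValue, char('+') ~> digitValue)

  def main(args: Array[String]): Unit = {
    println(sum.parse("1+2"))        // Success(3)
    println(greeting.parse("hello")) // Success(hello)
  }
}
```

The explicit type arguments on lift2 are only there to help type inference with the placeholder lambda; a named function of type (Int, Int) => Int would work without them.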
Parsley is a modern parser combinator library, which strives to be on the bleeding-edge of parser combinator library design. This means that improvements will come naturally over time. Feel free to suggest improvements for consideration, as well as high-level problems you commonly encounter that we may be able to find a way to mitigate (see the Design Patterns for Parser Combinators paper for example!).
Part of innovation is being willing to admit design mistakes and rectify them: when a binary-breaking release is made, the opportunity may be taken to polish parts of the library's API that are clunky, or could be better organised or improved. For example, see the differences between parsley-3.3.10 and parsley-4.0.0! However, constant breaking changes are not a good way to encourage the use of a library, as users often want stability: to that end, annoyances and bugbears with the API are only addressed approximately yearly, and the frequency of these will decrease over time.
For future major releases, care will be taken to, wherever possible, publish all patch-level changes in a final version to the previous major.minor version, and then all minor-level changes as a final major.(minor+1).0 version before releasing the major-level changes as (major+1).0.0: this will allow users stuck on the old version to benefit as much as possible from the fixes and new functionality.
As of 4.0.0, parsley is strictly committed to early-semver, which means that the version numbers are significant:
x._._ and y._._ with x != y are incompatible with each other at a binary level: having x._._ on the classpath with code compiled against y._._ will most likely result in a linkage error at runtime.
a.x._ and a.y._ with x <= y are binary compatible, which means that code compiled against a.x._ will still work with a.y._ on the classpath. A "source" component y > x indicates that a.y._ has added or deprecated functionality since a.x._.
a.b.x and a.b.y are binary and source compatible, which means there are no compatibility concerns between the two versions. Code compiled against a.b.x will run with a.b.y on the classpath and vice-versa. A "patch" component y > x indicates that a.b.y fixes issues (bugs or poor performance) with a.b.x.
In short, if you are on version a.x.y, you can: feel free to upgrade to version a.x.z if z > y without worry; and upgrade to a.z._ if z > x, with a possible (but rare) need to make minor updates to your code. Occasionally, a "source" component bump may deprecate functionality, but it will provide a migration to tell you how to avoid the deprecation warning. Altered/deprecated functionality may be hidden from the public API in a binary backwards compatible way in a "source" bump and therefore may require updating when recompiled; this will be done sparingly and with minimal disruption so as not to discourage updating the library, and any immediate migration changes to user code from a.x._ to any a.y._ with y > x will be documented in a.y._'s release.
Note: all functionality marked as private[parsley] or within the parsley.internal package is not adherent to early-semver and may be removed or changed at will with no impact to regular/intended use of the library.
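The early-semver rules above can be read as two small predicates. The sketch below is not part of parsley: Version and EarlySemVer are hypothetical names, and it assumes a major version of at least 1 (early-semver treats 0.x releases differently, which does not matter for parsley 4 onwards).

```scala
// Hypothetical helper types for illustration only; parsley ships nothing like this.
final case class Version(major: Int, minor: Int, patch: Int)

object EarlySemVer {
  // Code compiled against `compiled` should still link when `onClasspath`
  // is the version actually present at runtime: same major, minor not older.
  def binaryCompatible(compiled: Version, onClasspath: Version): Boolean =
    compiled.major == onClasspath.major && compiled.minor <= onClasspath.minor

  // Versions differing only in the patch component are interchangeable
  // at both the source and the binary level.
  def sourceCompatible(a: Version, b: Version): Boolean =
    a.major == b.major && a.minor == b.minor

  def main(args: Array[String]): Unit = {
    println(binaryCompatible(Version(4, 4, 1), Version(4, 5, 2)))  // true
    println(binaryCompatible(Version(4, 5, 0), Version(4, 4, 1)))  // false
    println(binaryCompatible(Version(3, 3, 10), Version(4, 0, 0))) // false
    println(sourceCompatible(Version(4, 5, 0), Version(4, 5, 2)))  // true
  }
}
```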
Occasionally, a minor (source) release will contain either a significant body of new work, or a significant rework of some internal machinery. In these cases additional versioning may be employed:
a.b.0-Mn versions: these are (hopefully) working pre-release versions of the functionality, subject to even binary incompatible changes between M versions. When the new API and behaviour become stable, the release graduates to the a.b.0-RC1 release candidate.
a.b.0-RCn versions: release candidates, with no binary incompatible changes expected between RCx and RCy with y > x except within truly exceptional circumstances.
Finally, the release is published as a.b.0 and is hopefully truly stable.
Old versions of the library may still be given important bug-fixes after they have been obsoleted by a new release. In exceptional circumstances, performance problems may be addressed for old versions. The lifetime policy is as follows:
Some more minor bugfixes may not be ported to previous versions if (a) the bug does not appear in that version or (b) the code has changed too much internally to make porting feasible.
An exception to this policy is made for any version 3.x.y, which reaches EoL effective immediately (December 2022) excluding exceptional circumstances.
Version | Released On | EoL Status |
---|---|---|
3.3.0 | 7th January 2022 | EoL reached (3.3.10) |
4.0.0 | 30th November 2022 | EoL reached (4.0.4) |
4.1.0 | 18th January 2023 | EoL reached (4.1.8) |
4.2.0 | 22nd January 2023 | EoL reached (4.2.14) |
4.3.0 | 8th July 2023 | EoL reached (4.3.1) |
4.4.0 | 6th October 2023 | EoL reached (4.4.1) |
4.5.0 | 6th January 2024 | Enjoying indefinite support |
If you encounter a bug when using Parsley, try to minimise the example of the parser (and the input) that triggers the bug. If possible, make a self-contained example: this will help to identify the issue without too much trouble.
Parsley represents parsers as an abstract syntax tree (AST), which is constructed lazily. As a result, Parsley is able to perform analysis and optimisations on your parsers, which helps reduce the burden on you, the programmer. This representation is then compiled into a light-weight stack-based instruction set designed to run fast on the JVM. This is what gives Parsley its competitive performance, but for best effect a parser should be compiled once and used many times (so-called hot execution).
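One way to get hot execution is to keep the parser in a val and reuse it for every input, rather than rebuilding it on each call. The following is a minimal sketch under that assumption; the object Numbers and the method parseAll are hypothetical names, and natural is the same parser as in the example session earlier.

```scala
//> using dep com.github.j-mie6::parsley:4.5.2

import parsley.{Parsley, Result}
import parsley.character.digit

// Hypothetical example object.
object Numbers {
  // Defined once as a val, so the same parser value is shared by every call below.
  val natural: Parsley[Int] = digit.foldLeft1(0)((n, d) => n * 10 + d.asDigit)

  // Runs the one shared parser over many inputs.
  def parseAll(inputs: List[String]): List[Result[String, Int]] =
    inputs.map(input => natural.parse[String](input))

  def main(args: Array[String]): Unit =
    parseAll(List("0", "123", "2024")).foreach(println)
}
```

Writing natural as a def, or constructing it inside parseAll, would create a fresh parser each time and forfeit the benefit described above.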
To make recursive parsers work in this AST format, you must ensure that recursion is done by knot-tying: you should define all recursive parsers with val and introduce lazy val where necessary for the compiler to accept the definition.
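A small sketch of knot-tying, assuming the 4.5.2 artifact from the installation instructions: the parser below counts how deeply a string of balanced parentheses is nested. The names Brackets and nested are invented for this example.

```scala
//> using dep com.github.j-mie6::parsley:4.5.2

import parsley.Parsley
import parsley.Parsley.pure
import parsley.character.char

// Hypothetical example object.
object Brackets {
  // `nested` refers to itself, so it is a lazy val rather than a def:
  // the recursion ties back to this single value instead of creating
  // a new parser at every level.
  lazy val nested: Parsley[Int] =
    (char('(') ~> nested <~ char(')')).map(_ + 1) | pure(0)

  def main(args: Array[String]): Unit = {
    println(nested.parse("(())")) // Success(2)
    println(nested.parse(""))     // Success(0)
  }
}
```

This works because the right-hand argument of ~> is taken by name, so nested is not evaluated while its own definition is still being built.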