Pregex Versions

PRegEx - Programmable Regular Expressions

v2.3.0

1 year ago
  • Added classes "pregex.meta.essentials.{Text, NonWhitespace, Whitespace}".

  • Class "pregex.core.groups.Group" and method "pregex.core.pre.Pregex.group" now have an "is_case_insensitive" parameter, which applies the "case insensitive" modifier to the pattern that is wrapped within the group.

  • A "CannotBeNegatedException" is now thrown whenever one attempts to invert an instance of class "pregex.core.classes.Any".

  • Fixed bug where subtracting a character from a two-character class range wouldn't successfully remove said character from the range.

  • Slightly updated documentation and README.

  • Modified some existing tests and added some more in order to achieve 100% coverage.
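
As a rough illustration of the group-level "case insensitive" modifier listed above, the sketch below uses Python's standard "re" module rather than pregex itself. It assumes the modifier corresponds to the scoped inline flag "(?i:...)"; the release notes do not state that mapping, so treat it as an assumption.

```python
import re

# Scoped inline flag: only the text inside (?i:...) ignores case.
# Assumption: this is roughly what a case-insensitive group compiles to.
pattern = re.compile(r"(?i:hello) World")

# The wrapped part may differ in case...
matches_mixed_case_group = pattern.fullmatch("HeLLo World") is not None
# ...but the text outside the group must still match exactly.
matches_mixed_case_rest = pattern.fullmatch("hello WORLD") is not None

print(matches_mixed_case_group)  # True
print(matches_mixed_case_rest)   # False
```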

v2.2.0

1 year ago
  • Introducing the concept of "Extensibility": All patterns that correspond to classes within "pregex.meta" have a number of assertions imposed upon them, which are essential for them to match what they are supposed to. These assertions are mostly word boundaries placed at the start and at the end of the pattern, but there may be other types of assertions as well. Helpful as they are for matching, these assertions can complicate things when the pattern is used as a building block of a larger pattern. For that reason, an "is_extensible" parameter has been added to the constructor of every meta class. As a general rule of thumb, set "is_extensible" to "True" only if you wish to use a pattern as part of a larger one. For matching purposes, let "is_extensible" take its default value of "False".

  • Class "pregex.core.pre.Pregex" now contains a set of "{get, iterate}_named_captures" methods, which give access to any named captured groups in the form of dictionaries.

  • Parameter "pattern" of class "pregex.core.pre.Pregex" constructor now defaults to the empty string, thus replacing "pregex.core.pre.Empty" which has now been removed.

  • All classes within "pregex.core.operators" can now receive one or even no arguments at all without throwing a "NotEnoughArgumentsException" exception. This makes it easier to do stuff like "pre = op.Either(*patterns)" without having to check whether list "patterns" contains more than one pattern.

  • Applying the alternation operator between a pattern "P" and the empty string pattern now results in pattern "P" itself.

  • Wrapping the empty string pattern in either a capturing or a non-capturing group now results in the empty string pattern itself.

  • Classes "pregex.core.assertions.{__PositiveLookaround, __NegativeLookaround}" have been removed and replaced by a single class "pregex.core.assertions.__Lookaround".

  • Classes "pregex.core.assertions.{FollowedBy, PrecededBy, EnclosedBy}" are now able to receive more than one assertion pattern, just like their negative counterparts.

  • Class "pregex.meta.essentials.Date" now receives date formats in a list instead of as arbitrary arguments.

  • Corrected mistake where method "pregex.core.pre.Pregex.not_enclosed_by" could receive multiple arguments.

  • Updated documentation and README.

  • Modified some existing tests and added some more.
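
The "Extensibility" note above can be pictured with plain regular expressions. The sketch below uses Python's standard "re" module, not pregex; the two integer patterns are hypothetical stand-ins for a meta pattern with and without its guarding word boundaries, and are not the patterns the library actually generates.

```python
import re

# A stand-alone "integer" pattern guarded by word boundaries
# (a stand-in for the assertions a meta pattern might carry).
strict_integer = r"\b\d+\b"
# The same pattern without the guards, usable as a building block.
extensible_integer = r"\d+"

# On its own, the guarded pattern avoids partial matches such as the "7" in "a7".
standalone = re.findall(strict_integer, "room 42, seat a7")

# Embedded after "id", the leading \b can never hold between "d" and "4",
# so the guarded version fails where the unguarded one succeeds.
embedded_strict = re.search("id" + strict_integer, "id42")
embedded_extensible = re.search("id" + extensible_integer, "id42")

print(standalone)                     # ['42']
print(embedded_strict)                # None
print(embedded_extensible.group())    # id42
```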

v2.1.0

1 year ago
  • Updated documentation.
  • Added pattern chaining (see Documentation/Covering the Basics/Pattern chaining).
  • "pregex.core.groups.Backreference" can now receive integers as well as strings, in order to reference capturing groups by order, e.g. "\1", "\2", etc.
  • "pregex.core.operators.Enclose" can now receive multiple "enclosing" patterns.
  • Wrapping the empty string pattern within a group is now allowed.
  • Alternating between a pattern and the empty string pattern is now allowed.
  • Added params "capture_local_part" and "capture_domain" to "essentials.Email".
  • Renamed "CannotBeQuantifiedException" to "CannotBeRepeatedException".
  • Renamed private field "Pregex.__quantifiable" to "__repeatable".
  • Renamed protected method "Pregex._is_quantifiable" to "Pregex._is_repeatable".
  • Modified some existing tests and added some more.
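
For readers less familiar with the two kinds of backreference mentioned above, here is a minimal sketch using Python's standard "re" module rather than pregex. It only shows the underlying regex syntax; how "Backreference" renders its argument is not confirmed by these notes.

```python
import re

# Numbered backreference: \1 repeats whatever group 1 captured.
by_number = re.compile(r"(\w+) \1")
# Named backreference: (?P=word) repeats the group named "word".
by_name = re.compile(r"(?P<word>\w+) (?P=word)")

text = "it was was a typo"
number_match = by_number.search(text)
name_match = by_name.search(text)

print(number_match.group(0))      # was was
print(name_match.group("word"))   # was
```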

v2.0.0

1 year ago
  • Split pregex package into "pregex.core" and "pregex.meta". All previous modules were moved to "pregex.core", while a new module "essentials.py" was added to "pregex.meta".
  • Greatly improved documentation page.
  • Added classes "pregex.core.assertions.[Not]EnclosedBy".
  • Added "Pregex.print_pattern()" method just for printing the pattern.
  • Added "Pregex.purge()" static method for clearing the regex cache.
  • Added "__quantifiable" field and "_is_quantifiable" method.
  • All match-like methods are now able to receive a path to a text file to extract text from in order to look for matches, via the "is_path" parameter.
  • "Pregex.__infer_type" now also infers whether pattern is quantifiable or not, and returns said result along with the pattern's type.
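
The "is_path" idea above can be sketched as follows. The helper "find_matches" is hypothetical and written against Python's standard "re" module; it is not pregex's implementation, and its name and signature are assumptions made for illustration. The final call uses "re.purge()", the standard-library function for clearing the regex cache, which "Pregex.purge()" presumably relates to.

```python
import os
import re
import tempfile


def find_matches(pattern: str, source: str, is_path: bool = False) -> list:
    """Hypothetical helper: return all matches of pattern in source.

    If is_path is True, source is treated as the path to a text file
    whose contents are searched instead of the string itself.
    """
    if is_path:
        with open(source, encoding="utf-8") as f:
            text = f.read()
    else:
        text = source
    return re.findall(pattern, text)


# Write a small temporary file to search through.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("alpha 1, beta 22, gamma 333")
    path = tmp.name

from_string = find_matches(r"\d+", "x 10 y 20")
from_file = find_matches(r"\d+", path, is_path=True)
os.remove(path)

# Clear the cache of compiled patterns kept by the re module.
re.purge()

print(from_string)  # ['10', '20']
print(from_file)    # ['1', '22', '333']
```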

v1.5.0

1 year ago
  • Updated docs and README.
  • Replaced "pregex.assertions.MatchAt[Left|Right]WordBoundary" classes with single class "pregex.assertions.WordBoundary".
  • Added class "pregex.assertions.NonWordBoundary".
  • Added type "pregex.pre._Type.Empty".
  • Added class "pregex.pre._Empty".
  • Multiplying a "Pregex" instance with zero now returns the "Empty" pattern.
  • Parameter value n=0 is now allowed in "pregex.quantifiers.Exactly" and returns the "Empty" pattern.
  • Parameter combination (min, max) = (0, 0) in "pregex.quantifiers.AtLeastAtMost" is now allowed and returns the "Empty" pattern.
  • Removed private method "pregex.__add".
  • Removed unused "type: _Type" parameter from "pregex.operators.__Operator".
  • Split "pregex.assertions.__Lookaround" into "__PositiveLookaround" and "__NegativeLookaround".
  • Negative lookarounds are now able to receive more than one pattern restriction.
  • Added "pregex.exceptions.ZeroArgumentsException".
  • Classes "pregex.classes.Any[But]From" now throw an exception if provided with zero parameters.
  • Fixed bug in "Pregex.__infer_type" method.
  • Slightly modified some exceptions' messages.
  • Modified existing tests and added some more.
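
Two of the changes above have direct counterparts in ordinary regex syntax, shown here with Python's standard "re" module rather than pregex. The first part illustrates why repeating a pattern zero times can be treated as the empty pattern; the second shows several negative lookaheads stacked on one pattern. Whether pregex emits exactly these forms is an assumption.

```python
import re

# Repeating a pattern zero times leaves nothing to match,
# so "(?:ab){0}" behaves like the empty pattern.
zero_times = re.compile(r"(?:ab){0}")
empty = re.compile(r"")

samples = ["", "ab", "abab"]
same_behaviour = all(
    (zero_times.fullmatch(s) is None) == (empty.fullmatch(s) is None)
    for s in samples
)

# Several negative lookaheads can follow one pattern:
# match "x" only if it is followed by neither "1" nor "2".
not_followed = re.compile(r"x(?!1)(?!2)")
hits = [m.start() for m in not_followed.finditer("x1 x2 x3 x")]

print(same_behaviour)  # True
print(hits)            # [6, 9]
```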

v1.4.0

1 year ago
  • Updated README file and docs.
  • Added a "CONTRIBUTING.md" file.
  • Fixed a bug occurring whenever applying "Concat/Either" on more than two patterns.
  • Fixed a bug occurring whenever multiplying an "__Assertion" class instance with an integer.
  • Added unicode flag "u" to string returned by "Pregex.get_pattern" when "include_flags" is set to "True".
  • Renamed "Pregex.iter_capturing_groups[_and_pos]" to "Pregex.iter_captures[_and_pos]".
  • Renamed "Pregex.get_capturing_groups[_and_pos]" to "Pregex.get_captures[_and_pos]".
  • Renamed "Pregex.split_by_group" to "Pregex.split_by_capture".
  • Added parameter "is_global" to "AnyWordChar" and "AnyButWordChar".
  • Added "GlobalWordCharException" that is thrown whenever one tries to subtract from "AnyWordChar/AnyButWordChar" where "include_foreign_chars" has been set to "True".
  • Added a number of new classes in "tokens.py" for matching useful symbols.
  • Added a number of new classes in "classes.py" for matching foreign characters.
  • Fixed bug in "Any[But]From" resulting in these classes being able to accept non-tokens as parameters.
  • Renamed "Any[But]WithinRange" to "Any[But]Between".
  • Loosened restrictions on what constitutes a range in "AnyBetween/AnyButBetween".
  • Modified existing tests for all the above and added some new ones.
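
The distinction behind "is_global" and the foreign-character classes above can be illustrated with Python's standard "re" module. This sketch assumes a "global" word character class corresponds to the Unicode-aware "\w", while the non-global one is limited to ASCII letters, digits, and the underscore; the exact patterns pregex produces are not given in these notes.

```python
import re

# "naïve café", written with escapes so the accented letters
# are single precomposed code points.
text = "na\u00efve caf\u00e9"

# ASCII-only word characters: accented letters break the words apart.
ascii_words = re.findall(r"[A-Za-z0-9_]+", text)
# For str patterns in Python 3, \w is Unicode-aware by default.
unicode_words = re.findall(r"\w+", text)

print(ascii_words)    # ['na', 've', 'caf']
print(unicode_words)  # ['naïve', 'café']
```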

v1.3.0

1 year ago
  • Updated README file and docs.
  • Renamed "AtLeastOnce" to "OneOrMore".
  • Renamed "CapturingGroup" to "Capture".
  • Renamed "NonCapturingGroup" to "Group".
  • Renamed "Pregex.get_groups" to "Pregex.get_captured_groups".
  • Fixed bug where some punctuation characters were treated as part of ranges due to their ASCII code.
  • Removed raw-string-conversion from tokens (except for backslash).
  • Added the ability to perform a union/subtraction between a regular class and a token.
  • Providing a quantifier other than "Exactly" as parameter "pre2" to either "PrecededBy" or "NotPrecededBy" now causes a "NonFixedWidthException" exception to be thrown.
  • Moved exceptions from "Pregex" protected methods, to corresponding classes.
  • Modified existing tests for all the above and added some new ones.
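
The "NonFixedWidthException" change above mirrors a restriction in Python's own regex engine: lookbehind patterns must have a fixed width. The sketch below demonstrates that restriction with the standard "re" module only; it does not call pregex.

```python
import re

# A fixed-width lookbehind compiles fine: "b" preceded by exactly two "a"s.
fixed = re.compile(r"(?<=a{2})b")
fixed_hit = fixed.search("aab") is not None

# A variable-width lookbehind is rejected by Python's re module.
try:
    re.compile(r"(?<=a+)b")
    variable_width_rejected = False
except re.error:
    variable_width_rejected = True

print(fixed_hit)                # True
print(variable_width_rejected)  # True
```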

v1.2.0

1 year ago
  • Updated docs and README file.
  • Removed "group_on" parameters from "Pregex".
  • Added a nested "_PatternType" enum within "Pregex".
  • Added "type" parameter of type "_PatternType" to "Pregex" which dictates the instance's grouping rules.
  • Every protected method of "Pregex" that returned a "Pregex" instance, now returns a string instead.
  • Removed "Literal" class and moved its functionality to "Pregex", by adding an "escape" parameter to its constructor.
  • Removed "Token" base class.
  • Moved "simplify class patterns" functionality into classes and out of Pregex.get_pattern(), so re.sub isn't invoked every time "get_pattern" is invoked.
  • Removed "pregex.exceptions.MultiCharTokenException".
  • Renamed "NotQuantifiableException" to "CannotBeQuantifiedException".
  • Added "__Assertion" base class for "__Anchor" and "__Lookaround".
  • "__Assertion" child classes can no longer be quantified.
  • "__Quantifier" child classes can no longer be quantified.
  • Class "pregex.pre.Pregex" is now imported as well when using "from pregex import *".
  • Added corresponding tests or modified existing tests.
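
To show why an "escape" option matters when a string is meant to be matched literally, the sketch below uses "re.escape" from Python's standard library. It is an analogy only: how the "escape" parameter is implemented inside "Pregex" is not described here.

```python
import re

literal = "1+1=2 (really?)"

# Unescaped, "+", "(", ")" and "?" act as regex operators,
# so the string does not match itself.
unescaped_matches = re.fullmatch(literal, literal) is not None
# Escaped, every character is matched literally.
escaped_matches = re.fullmatch(re.escape(literal), literal) is not None

print(unescaped_matches)  # False
print(escaped_matches)    # True
```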

v1.1.0

1 year ago
  • Improved README file and documentation page.
  • Classes that contain a single character are now simplified when printing their pattern.
  • Individual chars in classes are now removed when there already exists a range that includes them.
  • Added Subtraction operation for classes.
  • Added "pregex.groups.Conditional" for constructing conditional patterns.
  • Fixed various bugs that occurred when printing patterns containing some punctuation characters.
  • Added corresponding tests for all the above.
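
Conditional patterns, as mentioned above, can be sketched in raw regex syntax with Python's standard "re" module. The example assumes "pregex.groups.Conditional" builds a construct of the form "(?(1)yes|no)"; that is an assumption, and the code does not use pregex.

```python
import re

# (<)? optionally captures an opening bracket as group 1;
# (?(1)>) then requires a closing bracket only if group 1 matched.
tag = re.compile(r"(<)?\w+(?(1)>)")

results = {
    s: tag.fullmatch(s) is not None
    for s in ["<tag>", "tag", "<tag", "tag>"]
}

print(results)
# {'<tag>': True, 'tag': True, '<tag': False, 'tag>': False}
```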

v1.0.3

1 year ago
  • "pregex.classes.Enforced" has been renamed to "pregex.classes.AtLeastOnce".
  • Shorthand character class notation is now used wherever possible, e.g. "[\d]" instead of "[0-9]". This also works with class combinations. For instance, (AnyLetter() | AnyDigit() | AnyFrom("_")).get_pattern() returns "[\w]".
  • Corrected documentation of parameter "is_greedy" in quantifiers.py.
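
The shorthand simplification above can be checked by hand. The sketch below, using only Python's standard "re" module, compares an explicit class with "[\w]" over every ASCII character. The "re.ASCII" flag is an assumption added here so that "\w" is limited to ASCII; without it, "\w" also matches non-ASCII letters in Python 3.

```python
import re

explicit = re.compile(r"[a-zA-Z0-9_]")
shorthand = re.compile(r"[\w]", re.ASCII)  # re.ASCII restricts \w to ASCII

# Compare the two classes on every ASCII character.
ascii_chars = [chr(code) for code in range(128)]
agree = all(
    (explicit.fullmatch(c) is None) == (shorthand.fullmatch(c) is None)
    for c in ascii_chars
)

print(agree)  # True
```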