T-Regx Versions

Simple library for regular expressions in PHP.

0.34.1

1 year ago

😎 T-Regx The Dinosaur is really proud to announce its release 0.34.1!

Prepared patterns are a big part of the T-Regx library. There isn't anything else in the PHP world (or in regular expressions, for that matter) that handles user data in patterns the way T-Regx's Prepared Patterns do with @ placeholders. Among other features, the placeholders are correctly handled with \@ and \Q@\E, as well as in comments #comment@\n in Extended Mode (x flag). However, comments are ended by "\n" only when the PCRE newline convention is (*LF). We're sorry to admit that we didn't take into account that, when you change the newline convention to (*CR) (or any other convention), the parsing of @ placeholders in comments should change accordingly. As of this release, newline conventions are properly taken into account. We stand by our statement that @ placeholders aren't injected into comments.

The detailed list of changes is in ChangeLog.md.

Summary of changes:

  • Features
    • Prepared patterns now support changing new line conventions with flag x (EXTENDED mode).

      This code is now valid:

      Pattern::inject("(*CR)#comment@\r@", ['value'], 'x');
      

      The first placeholder character "@" is considered a part of an extended comment, but the second placeholder @ is assigned value 'value'.

    • All types of PCRE newline conventions are supported: (*LF), (*CR), (*CRLF), (*ANYCRLF), (*ANY), (*NUL).
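
The effect of the newline convention on extended comments can be seen with vanilla preg_match(), without T-Regx. This is only a sketch of the underlying PCRE behaviour, not of how T-Regx parses placeholders; the variable names are made up for the example.

```php
<?php
// With (*CR), the extended comment ends at "\r", so "foo $" stays in the pattern
// and the effective pattern is ^foo$.
$cr = '/(*CR)^ #comment' . "\r" . ' foo $/x';

// With (*LF), "\r" does not end the comment, so everything after "#" is ignored
// and the effective pattern is just ^.
$lf = '/(*LF)^ #comment' . "\r" . ' foo $/x';

$crMatchesFoo = preg_match($cr, 'foo'); // 1: "foo" is required and present
$crMatchesBar = preg_match($cr, 'bar'); // 0: "foo" is required
$lfMatchesBar = preg_match($lf, 'bar'); // 1: "foo" was swallowed by the comment
```

This is why a placeholder parser has to know the active convention: the same characters after "#" are either part of a comment or part of the pattern.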

Rawrrrrrrr!

0.34.0

1 year ago

😎 T-Regx The Dinosaur is really proud to announce its release 0.34.0!

We noticed it would be better if we could treat the placeholder @ used in prepared patterns (Pattern::inject() and Pattern::template()) as a first-class citizen, for example put it in look-aheads, or use ? or + after it. Now, using Pattern::inject('@?', ['Foo']); works just fine. Any quantifier used on the placeholder @ behaves as if applied to the whole injected value.

Additionally, we corrected the usage of backtracking priorities in Pattern::mask(), so it's more predictable.

The detailed list of changes is in ChangeLog.md.

Summary of changes:

  • Breaking changes
    • Moved PlaceholderFigureException from TRegx\CleanRegex\Internal\Prepared\Figure\ to TRegx\CleanRegex\Exception\
  • Features
    • Added support for quantifiers to placeholders in prepared patterns.
      • Added support for quantifiers in Pattern::inject()
      • Added support for quantifiers in Pattern::template()
      • Added support for quantifiers in Pattern::builder()
      • Added support for quantifiers in PcrePattern::inject()
      • Added support for quantifiers in PcrePattern::template()
      • Added support for quantifiers in PcrePattern::builder()
    • Added support for empty placeholders in prepared patterns:
      Pattern::inject('Find:@?', ['']); // matches only "Find:"
      
    • Updated backtracking in prepared patterns.
      • Updated backtracking in Pattern.pattern().

        Pattern::template('@:Bar')->pattern('Foo:Bar|Foo'); // matches "Foo:Bar" or "Foo:Bar:Bar"
        
      • Updated backtracking in Pattern.mask().

        $template = Pattern::template('@:Bar');
        $template->mask('*', ['*' => 'Foo:Bar|Foo']); // matches "Foo:Bar" or "Foo:Bar:Bar"
        
  • Other
    • Passing arguments to Pattern::inject() that are invalid and also don't match in number now always throws \InvalidArgumentException rather than PlaceholderFigureException.
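
To illustrate what "a quantifier applies to the injected value" means, here is a minimal sketch in vanilla PHP. The helper injectQuantified() is hypothetical and is not part of T-Regx; it assumes the value is quoted with preg_quote() and wrapped in a non-capturing group, which may differ from what the library does internally.

```php
<?php
// Hypothetical helper for this sketch: quote the value and wrap it in a
// non-capturing group, so the quantifier covers the whole value.
function injectQuantified(string $value, string $quantifier): string
{
    return '(?:' . preg_quote($value, '/') . ')' . $quantifier;
}

$grouped = '/^Find:' . injectQuantified('Foo', '?') . '$/';
$naive = '/^Find:Foo?$/'; // here "?" applies to the last "o" only

$groupedFull = preg_match($grouped, 'Find:Foo');   // 1
$groupedEmpty = preg_match($grouped, 'Find:');     // 1
$groupedPartial = preg_match($grouped, 'Find:Fo'); // 0
$naivePartial = preg_match($naive, 'Find:Fo');     // 1
```

Without the group, the quantifier would bind to the last character of the value, which is the behaviour the release notes describe as fixed.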

Rawrrrrrrr!

0.33.0

2 years ago

😎 T-Regx The Dinosaur is really proud to announce its release 0.33.0!

We've started working on this change, wooo, before release of 0.27. We simultaneously were working on making the Pattern.search() happen, as well as adding new features to the library. We're happy we've finally made it.

It was bothering us that some methods of Pattern.match() were dealing with Detail (like first(), findFirst(), stream()) and others with string (like all(), distinct(), reduce()). We hated using that in production. It was poorly designed, unpleasant and inelegant. We needed to make Pattern.match() more consistent, so it would only work with Detail, but we also liked the fact that you could simply take a match and use it. To reconcile the two, we have split the matching into two methods: Pattern.match() and Pattern.search(). They have a very similar set of methods, virtually all of the previous Pattern.match() methods, with a slight difference.

Now, every method of Pattern.match() deals with Detail.

  • Pattern.match().all(), only(), filter(), distinct() all return Detail[]
  • Pattern.match().first(), findFirst(), nth(), findNth() return Detail
  • Methods groupByCallback() and reduce() accept and return Detail.

On the other hand, new Pattern.search() deals only with string.

  • Pattern.search().all(), only(), filter(), distinct() all return string[]
  • Pattern.search().first(), findFirst(), nth(), findNth() return string
  • Methods groupByCallback() and reduce() accept and return string.

The detailed list of changes is in ChangeLog.md.

Summary of changes:

  • Breaking changes
    • Removed previously deprecated Pattern.match().group().

    • Removed Pattern.match().asInt(). Use stream().asInt() or Detail.toInt().

    • Pattern.match().first() no longer accepts callable as its argument

    • Pattern.match().findFirst() no longer accepts callable as its argument

    • Stream.first() no longer accepts callable as its argument

    • Stream.findFirst() no longer accepts callable as its argument

    • Refactored Pattern.match() into match() and search().

      Pattern.match() and Pattern.search() have virtually the same set of methods, with a slight difference. All Pattern.search() methods operate on string, which is the matched occurrence, and all Pattern.match() methods now operate on Detail.

  • Features
    • Added Pattern.search(), which is similar to Pattern.match(), but its methods only operate on string. From now on, Pattern.match() only operates on Detail.

The detailed list of changes is also in ChangeLog.md.

Rawrrrrrrr!

0.31.0

2 years ago

😎 T-Regx The Dinosaur is really proud to announce its release 0.31.0!

We have refactored T-Regx quite a bit, and there's more on the way. We renamed Pattern.match() methods, some Detail methods, unified streams and integer streams, unified optionals and more. What some call "creating breaking changes" we call "paying back the tech debt".

The detailed list of changes is in ChangeLog.md.

Summary of changes:

  • Breaking changes
    • Refactored Detail.groups() and Detail.namedGroups(), so they return Group[].
    • Renamed Detail.hasGroup() to Detail.groupExists().
    • Renamed Pattern.match().hasGroup() to Pattern.match().groupExists().
    • Removed inline-groups, Pattern.match().group().
  • Bug fixes
    • Calling Detail.group().asInt() with invalid integer base on an unmatched group, threw GroupNotMatchedException. Now it throws InvalidArgumentException.
    • Calling Detail.group().toInt() with invalid integer base on an unmatched group, threw GroupNotMatchedException. Now it throws InvalidArgumentException.
    • Pattern.match().groupBy() now correctly groups by duplicate name with J modifier.
    • Removed Detail.usingDuplicateName().

The detailed list of changes is also in ChangeLog.md.

Rawrrrrrrr!

0.9.1

2 years ago

😎 T-Regx The Dinosaur is proud to announce its second alpha version! It doesn't have any known bugs - it's just missing a few time-consuming features.

What's new in this release:

  • UTF8-safe method textLength() for Match and Match.group()
  • groupsCount() method for Match details - unsurprisingly just counts the groups (without duplicates of named and regular groups)

And more changes, which required a large part of the replacing core to be rewritten, to allow more control over unmatched groups:

  • Methods by()->group()->orIgnore() and by()->group()->orElse()
  • Method by()->group()->callback() which accepts MatchGroup as an argument
  • Method by()->group()->orElse() now receives a lazy-loaded Match, instead of the less useful subject

The new features are already described in the documentation at t-regx.com :)

0.9.2

2 years ago

😎 T-Regx The Dinosaur is really proud to announce its first beta version! Despite the beta suffix, it's 100% suitable for production use. It doesn't have any known bugs - check out the issues. There are a few breaking changes (since that's a 0.* version), but there are also a looot of improvements and new features.

The detailed list of changes is in ChangeLog.md.

Here's a summary:

  • Breaking changes
    • Refactored entry points - Pattern::of()/pattern() and Pattern::pcre() (see more in ChangeLog.md)
    • Renamed parseInt() to toInt()
    • Removed pattern()->match()->test()/fails(). From now on, use pattern()->test()/fails()
    • Removed is() (see more in ChangeLog.md)
    • Refactored split() (see more in ChangeLog.md)
  • Features
    • Added Match.group().replace() 🔥
    • Added to pattern()->match():
      • fluent(), asInt(), distinct(), groups(), offsets()->fluent(), group(string)->offsets()->fluent()
    • Added pattern()->forArray()->strict() which throws for invalid values
    • Added Pattern::inject()/Pattern::bind()
  • SafeRegex
    • Added preg::grep_keys() 🔥
  • Other
    • Now MalformedPatternException is thrown, instead of CompileSafeRegexException, when using invalid PCRE syntax.

That's just a handful; the more detailed description of this release is in ChangeLog.md.

0.9.4

2 years ago

What's new, new, new?

This release brings updates in exceptions (namespaces, new detailed exceptions) and a groupBy() method.

Exceptions

In the previous release we renamed SafeRegexException to PregException. In this one, we're renaming CleanRegexException to PatternException. So now, those two general exceptions sync nicely with their base methods:

try {
  return preg::match('/Foo/', $subject);
} catch (PregException $e) {
  // handle errors raised by preg:: methods
}

try {
  return pattern('Foo')->test($subject);
} catch (PatternException $e) {
  // handle errors raised by pattern() methods
}

They both extend RegexException - base for all exceptions thrown by T-Regx. So that's the first thing.

The second exception update - previously, every exception thrown based on preg_last_error() method was RuntimePregException. Now, each error has a dedicated exception, which can be caught separately:

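try {
  return preg::match($pattern, $subject);
} catch (BacktrackLimitPregException $exception) {
  // the backtrack limit was exhausted
} catch (Utf8OffsetPregException $exception) {
  // the offset did not point at the start of a UTF-8 character
}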
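
The idea of one exception per preg_last_error() code can be sketched in vanilla PHP as below. The class names and the sketchMatch() function are hypothetical, chosen for this example only; they are not the T-Regx classes, and only two error codes are mapped.

```php
<?php
// Hypothetical exception names for this sketch; they are not T-Regx classes.
class SketchPregException extends RuntimeException {}
class SketchBadUtf8Exception extends SketchPregException {}
class SketchBadUtf8OffsetException extends SketchPregException {}

// Runs preg_match() and turns the error code into a dedicated exception.
function sketchMatch(string $pattern, string $subject, int $offset = 0): bool
{
    $result = @preg_match($pattern, $subject, $match, 0, $offset);
    $error = preg_last_error();
    if ($error === PREG_BAD_UTF8_ERROR) {
        throw new SketchBadUtf8Exception('Subject is not valid UTF-8');
    }
    if ($error === PREG_BAD_UTF8_OFFSET_ERROR) {
        throw new SketchBadUtf8OffsetException('Offset is not at a character boundary');
    }
    if ($result === false) {
        throw new SketchPregException("preg_match() failed with error code $error");
    }
    return $result === 1;
}

try {
    sketchMatch('/./u', "\xff"); // "\xff" is not valid UTF-8
    $caught = 'none';
} catch (SketchBadUtf8Exception $exception) {
    $caught = 'bad utf8';
}
```

Because the dedicated classes share a base class, callers can catch one specific error or all of them.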

The detailed list of changes is in ChangeLog.md.

groupBy()

This release also comes with a brand new method, groupBy(), which groups matches by a capturing group (name or index). It can match strings, offsets and also map them with map() and flatMap(). Additionally, it can be chained with filter() to leave out unwanted matches:

return pattern('(\d)(?<unit>cm|mm)')->match($strings)
  ->filter(function (Match $match) {
    return $match->group(1)->toInt() % 2 == 0;
  })
  ->groupBy('unit')
  ->map(function (Match $match) {
    return $match->group(1)->toInt() * 100;
  });
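
For comparison, grouping matches by a capturing group can be sketched with vanilla preg_match_all(). The function groupMatchesBy() is a hypothetical helper written for this example, not a T-Regx method, and it assumes the grouping group is matched in every occurrence.

```php
<?php
// Hypothetical helper: collect whole matches under the text of the given group.
function groupMatchesBy(string $pattern, string $subject, string $group): array
{
    preg_match_all($pattern, $subject, $matches, PREG_SET_ORDER);
    $grouped = [];
    foreach ($matches as $match) {
        // $match[0] is the whole match, $match[$group] is the grouping key
        $grouped[$match[$group]][] = $match[0];
    }
    return $grouped;
}

$grouped = groupMatchesBy('/(\d+)(?<unit>cm|mm)/', '12cm 14mm 13cm 19cm 18mm', 'unit');
// ['cm' => ['12cm', '13cm', '19cm'], 'mm' => ['14mm', '18mm']]
```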

0.9.5

2 years ago

😎 T-Regx The Dinosaur is really proud to announce its second beta version! Despite the beta suffix, it's 100% suitable for production use. It doesn't have any known bugs - check out the issues. There are a few breaking changes (since that's a 0.* version), but there are also improvements and new features.

The detailed list of changes is in ChangeLog.md.

Here's a summary:

  • Breaking changes

    • Removed iterate() method, which was only used as a substitute for forEach(), pre PHP 7.0, when keywords couldn't be used as method names.
    • Renamed pattern()->match()->forFirst() to findFirst() #70
  • Features

    • Added match()->group()->findFirst() #22 #70
    • Added alternating groups in prepared patterns 🔥
      • Pattern::bind(), Pattern::inject() and Pattern::prepare() can now accept string[], which will be treated as a regex alternation group!
        Pattern::bind('Choice: @values', [
            'values' => ['apple?', '[orange]', 'pear']
        ]);
        
        is similar to
        Pattern::of('Choice: (apple\?|\[orange\]|pear)')
        
        The alternation is also optimized for i and u flags, for example it will remove case-insensitive duplicates (like foo and FOO when i flag is used).
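
A rough idea of how a string[] can become an alternation group is sketched below in vanilla PHP. The alternation() helper is hypothetical; it uses a non-capturing group and does none of the flag-based optimizations mentioned above.

```php
<?php
// Hypothetical helper: quote each value and join the results with "|".
function alternation(array $values): string
{
    $quoted = array_map(function (string $value): string {
        return preg_quote($value, '/');
    }, $values);
    return '(?:' . implode('|', $quoted) . ')';
}

$group = alternation(['apple?', '[orange]', 'pear']);
// $group is '(?:apple\?|\[orange\]|pear)'

$pattern = '/^Choice: ' . $group . '$/';
$matchesApple = preg_match($pattern, 'Choice: apple?');    // 1
$matchesOrange = preg_match($pattern, 'Choice: [orange]'); // 1
$matchesAppl = preg_match($pattern, 'Choice: appl');       // 0
```

Quoting each value first is what keeps "?" and "[...]" from being read as regex syntax.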

That's it for this release, stay tuned!

0.9.6

2 years ago

😎 T-Regx The Dinosaur is really proud to announce its third beta version! Despite the beta suffix, it's 100% suitable for production use. This time, here come bug fixes, exception message updates, and two chaining-methods updates.

The detailed list of changes is in ChangeLog.md.

Overview

At T-Regx, we use a chainable interface to increase readability and extensibility, for example:

<?php
pattern($p)->match($subject)->group('name')->offsets()->findFirst($callable)->orThrow(Exception::class);

(not to mention versions with fluent()).

Anyway, previously, only an immediate first() call (pattern()->match()->first()) used preg_match(); any intermediate calls (pattern()->match()->offsets()->first(), group()->first(), etc.) used preg_match_all() and trimmed the result.

It wasn't easy, but in this release, every single usage of first()/findFirst() uses preg_match() under the hood! That can save us from, say, catastrophic backtracking in certain scenarios. The only exception is filter()->first(), for which it doesn't make much sense to call preg_match().
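
The difference between the two approaches can be shown with the vanilla functions themselves. This sketch only contrasts preg_match_all() with preg_match(); it does not reproduce T-Regx internals.

```php
<?php
$pattern = '/\d+/';
$subject = '12 apples, 14 pears, 16 plums';

// Described above as the old behaviour: find every match, then keep the first.
preg_match_all($pattern, $subject, $all);
$firstViaAll = $all[0][0];
$matchesFound = count($all[0]); // all 3 matches were searched for

// Described above as the new behaviour: stop as soon as the first match is found.
preg_match($pattern, $subject, $single);
$firstViaSingle = $single[0];
```

Both give the same first match, but preg_match() never examines the rest of the subject.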

Summary

  • Breaking changes
    • pattern()->match()->fluent()->distinct() will no longer re-index elements (will not remove keys).
      • To re-index keys, use distinct()->values().
      • pattern()->match()->distinct() still re-indexes keys.
  • Enhancements 🔥
    • Every match()->...()->first() method calls preg_match(), instead of preg_match_all(). See the Overview above.
  • Features
    • Added pattern()->match()->fluent()->nth(int) used to get an element based on an ordinal number.
    • Added pattern()->match()->asInt(). More below.
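
The re-indexing change for distinct() is analogous to the difference between array_unique() and array_values() in vanilla PHP. The snippet below is only that analogy, assuming "re-index" means resetting keys to 0, 1, 2, and so on; it does not call T-Regx.

```php
<?php
$matches = ['12', '14', '12', '16'];

// Duplicates removed, original keys kept: [0 => '12', 1 => '14', 3 => '16']
$distinct = array_unique($matches);

// Keys reset to consecutive integers: ['12', '14', '16']
$reindexed = array_values($distinct);
```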

Thanks

That's it today! The 1.0.0 release is coming! :)

0.9.7

2 years ago

😎 T-Regx The Dinosaur is really happy to announce its fourth beta version! Despite the beta suffix, it's 100% suitable for production use. This time, here come some bug fixes and a slightly less T-Regx'y, more PHP-y way of accessing matches.

The detailed list of changes is in ChangeLog.md.

What about the matches

In vanilla regexps, after your preg_match(), you would receive a $match array containing the whole match and the capturing groups as strings (or as [string, int] pairs if you used PREG_OFFSET_CAPTURE).

Or rather, you would, if not for the bizarre design of preg_match(), which trims every trailing empty or unmatched capturing group away. If the empty/unmatched capturing group is not at the tail of $match, then it's either '' or, if you used PREG_UNMATCHED_AS_NULL (available since PHP 7.2), null.

So, obviously, T-Regx can't let you use such black magic :D But we still understand that often PHP-y way is more concise:

$match['unit'];                 # this one sparks joy
$match->group('unit')->text();  # this one sparks less joy

So, we created asArray() method, used like so:

$subject = "Foo:14";
pattern('\w+(?<number>:\d+)?(-\d+)?')->match($subject)->asArray()->first(function (array $match) {
    var_dump($match);
});
['Foo:14', 'number' => ':14', ':14', null]

Notice that after asArray(), the first() callback receives an array, instead of Match. The array structure resembles the vanilla $match structure, except for the fact that every unmatched group is represented as null (even for PHP 7.1 and earlier), and the trailing groups are never trimmed.
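
The vanilla behaviour that asArray() smooths over can be reproduced with plain preg_match() and no flags. This sketch only shows the default $match structure; the variable names are made up for the example.

```php
<?php
// The trailing group (-\d+)? is unmatched, so preg_match() leaves it out of $match.
preg_match('/\w+(?<number>:\d+)?(-\d+)?/', 'Foo:14', $match);
$hasTrailingGroup = array_key_exists(2, $match); // false
// $match is ['Foo:14', 'number' => ':14', ':14']

// An unmatched group that is not at the tail is kept, as an empty string.
preg_match('/(a)?(b)/', 'b', $middle);
// $middle is ['b', '', 'b']
```

So with vanilla arrays, a missing key and an empty string can both mean "unmatched", depending on the position of the group.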

Why not make Match implement ArrayAccess and use array syntax?

Whereas ArrayAccess does make it possible to read/check values of array ($match[0] and isset($match[0])), which would render the asArray() method unnecessary, it does come at a price:

  • ArrayAccess enforces unset and add methods, yet we don't wish to modify the Match details.
  • Can't use array_key_exists on an object, even with ArrayAccess
  • Can't cast it to array (can't use Match with array type-hint)
  • No method that works for arrays would work for Match (such as array_push(), array_map())
  • $match[] would still not work
  • We could not throw exceptions to validate groups, for example one could use isset($match[-2]) and no error could be thrown, obviously.

But these are minor, language-specific matters. The real reasons we didn't use ArrayAccess are:

  • empty($match[0]) would return true for match "0" (string) !!
  • $match['100'] is passed to ArrayAccess.offsetGet() as 100 (integer) !! so it's impossible to validate a malformed group consistently.

Hence, we chose asArray() to return a real array for which all of those flaws go away :)

The rest of the changes are in ChangeLog.md.

Thanks

That's it for today! The 1.0.0 release is coming! :)