Lucenenet Versions

Apache Lucene.NET

Lucene.Net_4_8_0_beta00016

2 years ago

This release contains several important bug fixes and performance enhancements.

Breaking Index Changes

There are 2 breaking changes that may affect some users when reading indexes that were created from version 4.8.0-beta00015 and all prior 4.8.0 beta versions (not including 3.0.3).

  1. A bug was fixed in the generation of segment file names. This only affects users with more than 10 segments in their index.
  2. Lucene.Net.Documents.DateTools has been modified to return milliseconds since Unix epoch (that is, since Jan 1, 1970 at 00:00:00 UTC) by default to match Java Lucene. This only affects users who explicitly use Lucene.Net.Documents.DateTools in their application and store the result (in .NET ticks) in their index.

If you are affected by either of the above issues, it is recommended to regenerate your indexes during upgrading. However, if that is not feasible, we have provided the following workarounds.

  1. If you have a large index with more than 10 segments, see #576 for details on how to enable legacy segment name support.
  2. If you are storing the result of Lucene.Net.Documents.DateTools.StringToTime(string) or Lucene.Net.Documents.DateTools.Round(long) (a long) in your index, you are storing .NET ticks. There are now optional parameters inputRepresentation and outputRepresentation on these methods to specify whether the long value represents .NET ticks, .NET ticks as milliseconds, or milliseconds since the Unix epoch. To exactly match version 4.8.0-beta00015 and prior (including prior major versions):
    • Lucene.Net.Documents.DateTools.StringToTime(string, NumericRepresentation) should specify NumericRepresentation.TICKS for outputRepresentation.
    • Lucene.Net.Documents.DateTools.Round(long, NumericRepresentation, NumericRepresentation) should specify NumericRepresentation.TICKS_AS_MILLISECONDS for inputRepresentation and NumericRepresentation.TICKS for outputRepresentation.
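
To make the three representations concrete, the following sketch converts one instant between them using only the .NET base class library. The class and method names are hypothetical and not part of Lucene.NET, and the sketch assumes that stored tick values represent UTC instants. It may help when estimating how values already stored in an index relate to the new Unix epoch default.

```csharp
using System;

// Hypothetical helper, not part of Lucene.NET. Assumes stored ticks are UTC.
public static class DateRepresentationSketch
{
    // .NET ticks (100 ns intervals since 0001-01-01) -> milliseconds since the Unix epoch.
    public static long TicksToUnixMilliseconds(long ticks)
        => new DateTimeOffset(ticks, TimeSpan.Zero).ToUnixTimeMilliseconds();

    // Milliseconds since the Unix epoch -> .NET ticks.
    public static long UnixMillisecondsToTicks(long unixMilliseconds)
        => DateTimeOffset.FromUnixTimeMilliseconds(unixMilliseconds).UtcTicks;

    // .NET ticks -> "ticks as milliseconds" (milliseconds since 0001-01-01).
    public static long TicksToTicksAsMilliseconds(long ticks)
        => ticks / TimeSpan.TicksPerMillisecond;
}
```

For example, passing new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).Ticks to TicksToUnixMilliseconds() yields 0, since that instant is the Unix epoch itself.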

.NET Framework Recommendations

It is recommended that all .NET Framework users upgrade to this release as soon as possible.

  1. Lucene.Net.Support.WeakDictionary<TKey, TValue> was used in .NET Framework and .NET Standard 2.0 due to missing APIs. There is now a better solution using Prism.Core's weak events in combination with ConditionalWeakTable<TKey, TValue>, which means memory management is handled entirely by the GC in Lucene.Net.Index.IndexReader, Lucene.Net.Search.FieldCacheImpl, Lucene.Net.Search.CachingWrapperFilter and Lucene.Net.Facet.Taxonomy.CachedOrdinalsReader. See #613.
  2. All known issues with loss of floating-point precision on .NET Framework x86 have been fixed.

Change Log

Breaking Changes

  • #547 - Lucene.Net.Util.StringHelper.GOOD_FAST_HASH_SEED: converted from a static field to a property and marked obsolete. Added a new property GoodFastHashSeed. Removed SystemProperties call to populate the value of the field, since NUnit only allows us to generate a seed per test, and we need a way to inject the seed value for repeatability.
  • #547 - Lucene.Net.TestFramework: Added LuceneSetUpFixtureBuilder class to load either a subclass or our default instance of LuceneTestFrameworkInitializer. Also added LuceneTestCase.SetUpFixture to control initialization of LuceneTestFrameworkInitializer so it is only called on setup and teardown for the assembly. Added Initialize() method to LuceneTestFrameworkInitializer that must be used when setting factories during testing.
  • #547 - Lucene.Net.TestFramework.Util.LuceneTestCase: Deprecated GetClassType() method and added TestType property
  • #547 - Lucene.Net.TestFramework.Util.AbstractBeforeAfterRule: Removed LuceneTestCase parameter from Before() and After() methods.
  • #551 - Changed constructors of Lucene.Net.Util.NumberFormat and Lucene.Net.QueryParsers.Flexible.Standard.Config.NumberDateFormat to accept IFormatProvider rather than CultureInfo and changed Lucene.Net.Util.NumberFormat.Culture property to Lucene.Net.Util.NumberFormat.FormatProvider.
  • #554 - Lucene.Net.Misc: Made DocFreqComparer and TotalTermFreqComparer into static singletons, only accessible by the Default property.
  • #428, #429, #570 - Lucene.Net.Search.FieldComparer: Redesigned implementation to use reference types for numerics (from J2N) to avoid boxing.
  • #570 - Lucene.Net.Search.FieldCache.IParser: Renamed method from TermsEnum() to GetTermsEnum() to match other APIs
  • #570 - Lucene.Net.Queries: ObjectVal() returns a J2N.Numerics.Number-derived type rather than a value type cast to object. Direct casts to int, long, double, single, etc. will no longer work without first casting to the J2N.Numerics.Number-derived type. Alternatively, use the corresponding Convert.ToXXX() method for the type you wish to retrieve from the object.
  • #574 - Lucene.Net.Suggest.Fst.FSTCompletionLookup/WFSTCompletionLookup: Changed Get() to return long? instead of object to eliminate boxing/unboxing
  • #574 - Lucene.Net.Index.MergePolicy::FindForcedMerges(): Removed unnecessary nullable from FindForcedMerges() and all MergePolicy subclasses
  • #574 - Lucene.Net.Replicator: Changed callback signature from Func<bool?> to Action, since the return value had no semantic meaning
  • #575 - Lucene.Net.Index.DocValuesFieldUpdates: Refactored so the subclasses will handle getting the values from DocValuesFieldUpdatesIterator or DocValuesUpdate via a cast rather than boxing the value. Also marked internal (as well as all members of BufferedUpdates), since this was not supposed to be part of the public API.
  • #573, #576 - Changed segment file names to match Lucene 4.8.0 and Lucene.NET 3.x
  • #577 - Lucene.Net.Index.SegmentInfos: Changed Info() method to an indexer (.NET Convention)
  • #580 - Lucene.Net.Documents.DateTools - Added NumericRepresentation enum to allow converting to/from long in the following formats:
    • Unix Epoch (default): Milliseconds since Jan 1, 1970 12:00:00 AM UTC.

    • Ticks: The raw ticks from DateTime or DateTimeOffset.

    • Ticks as Milliseconds: This is for compatibility with prior versions of Lucene.NET (3.0.3 and 4.8.0-beta00001 - 4.8.0-beta00015). The conversion done on input values is time * TimeSpan.TicksPerMillisecond and the conversion to output values is time / TimeSpan.TicksPerMillisecond.

      The long return value from Lucene.Net.Documents.DateTools.StringToTime(string, NumericRepresentation) has been changed from NumericRepresentation.TICKS to NumericRepresentation.UNIX_TIME_MILLISECONDS by default.

      The long input parameter provided to Lucene.Net.Documents.DateTools.Round(long, NumericRepresentation, NumericRepresentation) has been changed from NumericRepresentation.TICKS_AS_MILLISECONDS to NumericRepresentation.UNIX_TIME_MILLISECONDS by default.

      The long return value from Lucene.Net.Documents.DateTools.Round(long, NumericRepresentation, NumericRepresentation) has changed from NumericRepresentation.TICKS to NumericRepresentation.UNIX_TIME_MILLISECONDS by default.

  • #580 - Lucene.Net.Documents.DateTools - De-nested Resolution enum and renamed it to DateResolution.
  • #580 - Lucene.Net.QueryParsers.Flexible.Standard: Changed numeric nodes to accept and return J2N.Numerics.Number-derived types instead of object.
  • #581 - SWEEP: Lucene.Net.Util.Fst: Changed API to use J2N.Numerics.Int64 instead of long? for generic closing type as it was designed to use reference equality comparison.
  • #581 - SWEEP: Lucene.Net.Util.Fst: Added class constraints to each generic FST type and reverted to reference equality comparisons.
  • #581, #279 - Lucene.Net.Util.Fst.Int32sRefFSTEnum: Added MoveNext() method and marked Next() method obsolete. This change had already been done to BytesRefFSTEnum, which made them inconsistent.
  • #583 - Lucene.Net.QueryParsers.Flexible: Removed unnecessary nullable value types from ConfigurationKeys and configuration setters/getters in StandardQueryParser. Added AbstractQueryConfig.TryGetValue() method to allow retrieving value types so they can be defaulted properly.
  • #583 - Lucene.Net.Queries.Function.ValueSources.EnumFieldSource::ctor() - changed enumIntToStringMap to accept IDictionary<int, string> instead of IDictionary<int?, string> (removed unnecessary nullable)
  • #587 - Lucene.Net.TestFramework.Store.MockDirectoryWrapper: Renamed AssertNoUnreferencedFilesOnClose to AssertNoUnreferencedFilesOnDispose
  • #619 - Lucene.Net.Spatial: Upgraded to new Spatial4n NuGet package that unifies the types from Spatial4n.Core and Spatial4n.Core.NTS
  • #619 - Lucene.Net.Spatial.Prefix.Tree.Cell: Renamed m_outerInstance > m_spatialPrefixTree and constructor parameter outerInstance > spatialPrefixTree
  • #619 - Lucene.Net.Spatial.Prefix.AbstractPrefixTreeFilter.BaseTermsEnumTraverser: Renamed m_outerInstance > m_filter and constructor parameter outerInstance > filter
  • #619 - Lucene.Net.Spatial.Prefix.AbstractPrefixTreeFilter: De-nested BaseTermsEnumTraverser class
  • #619 - Lucene.Net.Spatial.Prefix.Tree.GeohashPrefixTree.Factory: De-nested and renamed to GeohashPrefixTreeFactory
  • #619 - Lucene.Net.Spatial.Prefix.Tree.QuadPrefixTree.Factory: De-nested and renamed to QuadPrefixTreeFactory
  • #619 - Lucene.Net.Spatial.Prefix.AbstractVisitingPrefixTreeFilter: De-nested VisitorTemplate class and changed protected field m_prefixGridScanLevel to a public property named PrefixGridScanLevel.
  • #619 - Lucene.Net.Spatial.Query: Renamed UnsupportedSpatialOperation > UnsupportedSpatialOperationException to match .NET conventions
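
The #570 note about ObjectVal() can be illustrated without referencing J2N at all. The sketch below uses a hypothetical stand-in class, BoxedInt32Stub, that mimics a reference type wrapping an int. It is not the J2N implementation, and it assumes the J2N types implement IConvertible, which the recommendation to use Convert.ToXXX() suggests. It shows why a direct cast from object to int fails for such a type while Convert.ToInt32() still works.

```csharp
using System;

// Hypothetical stand-in for a J2N.Numerics.Number-derived reference type.
// NOT the J2N implementation; it only mimics "a reference type that wraps an int".
public sealed class BoxedInt32Stub : IConvertible
{
    private readonly int value;
    public BoxedInt32Stub(int value) => this.value = value;

    public TypeCode GetTypeCode() => TypeCode.Object;
    public bool ToBoolean(IFormatProvider provider) => value != 0;
    public byte ToByte(IFormatProvider provider) => Convert.ToByte(value);
    public char ToChar(IFormatProvider provider) => Convert.ToChar(value);
    public DateTime ToDateTime(IFormatProvider provider) => throw new InvalidCastException();
    public decimal ToDecimal(IFormatProvider provider) => value;
    public double ToDouble(IFormatProvider provider) => value;
    public short ToInt16(IFormatProvider provider) => Convert.ToInt16(value);
    public int ToInt32(IFormatProvider provider) => value;
    public long ToInt64(IFormatProvider provider) => value;
    public sbyte ToSByte(IFormatProvider provider) => Convert.ToSByte(value);
    public float ToSingle(IFormatProvider provider) => value;
    public string ToString(IFormatProvider provider) => value.ToString(provider);
    public object ToType(Type conversionType, IFormatProvider provider)
        => Convert.ChangeType(value, conversionType, provider);
    public ushort ToUInt16(IFormatProvider provider) => Convert.ToUInt16(value);
    public uint ToUInt32(IFormatProvider provider) => Convert.ToUInt32(value);
    public ulong ToUInt64(IFormatProvider provider) => Convert.ToUInt64(value);
}

public static class ObjectValSketch
{
    // Works for any IConvertible, so it does not depend on the concrete wrapper type.
    public static int ReadInt32(object objectVal) => Convert.ToInt32(objectVal);

    // A direct unboxing cast only succeeds when the object is a boxed int.
    public static bool DirectCastSucceeds(object objectVal)
    {
        try { _ = (int)objectVal; return true; }
        catch (InvalidCastException) { return false; }
    }
}
```

Code that previously wrote (int)value against the result of ObjectVal() would hit the failing path of DirectCastSucceeds(), whereas the Convert-based read keeps working whichever numeric wrapper is returned.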

Bugs

  • #363, #534 - Lucene.Net.Replicator.Http.HttpReplicatorTest::TestBasic(). This was failing intermittently due to the Timeout value being set to 1 second instead of the 60 second value that was used in Java. It has been increased to the .NET default of 100 seconds.
  • #363, #534 - Lucene.Net.Replicator.IndexAndTaxonomyReplicationClientTest::TestConsistencyOnExceptions() and Lucene.Net.Replicator.IndexReplicationClientTest::TestConsistencyOnExceptions() were failing due to exceptions being raised on the worker thread and missing locks, which have both been addressed.
  • #535 - Added [SuppressCodecs] attribute where required for custom Lucene.NET tests
  • #536 - Modified all TermsEnum.MoveNext() methods to return a check for null rather than returning true
  • #537 - Lucene.Net.TestFramework.Index.BasePostingsFormatTestCase: Removed IndexOptions.NONE from the list of available options, since it is not a valid test option
  • #539 - Lucene.Net.Grouping.Term.TermAllGroupHeadsCollector: Use NumericUtils.SingleToSortableInt32() to compare floating point numbers (Fixes AllGroupHeadCollectorTest.TestRandom() on .NET Framework x86).
  • #540 - Lucene.Net.Tests.Util.TestPriorityQueue: Fixed issues with comparers after introducing J2N.Randomizer, which produces negative random numbers.
  • #541 - Lucene.Net.Codecs.SimpleText.SimpleTextFieldsReader::NextDoc(): Fixed assert that was throwing on BytesRef.Utf8ToString()
  • #542 - Lucene.Net.Util.Automaton.MinimizationOperations::MinimizeHopcroft(): Fixed range in OpenBitSet.Clear()
  • #543 - Lucene.Net.Tests.QueryParser.Flexible.Standard.TestQPHelper: Use ParseExact() method to specify the date format, so it works across cultures.
  • #527, #548 - Lucene.Net.Search.Suggest.Analyzing.BlendedInfixSuggester: Apply patch from https://issues.apache.org/jira/browse/LUCENE-6093 to fix ArgumentNullException if there were discarded trailing characters in the query (Thanks @Maxwellwr)
  • #550 - SWEEP: Use StringComparer.Ordinal for Sort() methods, where appropriate
  • #551 - Lucene.Net.QueryParser.Flexible.Standard: Fixed calendar handling on .NET Core
  • #552 - Lucene.Net.Suggest.Jaspell.JaspellTernarySearchTree: Fixed random number generator so it produces random numbers
  • #553, #609 - Lucene.Net.TestFramework.Util.TestUtil::RandomAnalysisString(): Fixed ArgumentOutOfRangeException when passed a maxLength of 0.
  • #546, #557 - Lucene.Net.Search.DisjunctionMaxScorer: Fixed x86 floating point precision issue on .NET Framework
  • #558 - Lucene.Net.Expressions.ScoreFunctionValues: Fixed x86 floating point precision issue on .NET Framework
  • #559 - Lucene.Net.Spatial.Prefix.SpatialOpRecursivePrefixTreeTest: Ported over patch from https://github.com/apache/lucene/commit/e9906a334b8e123e93b917c3feb6e55fed0a8c57 (from 4.9.0).
  • #545, #565 - Lucene.Net.Index.TestDuelingCodecs::TestEquals(): There was a missing ! in Lucene.Net.Codecs.BlockTreeTermsReader.IntersectEnum.Frame::Load() that was inverting the logic, causing this test to fail intermittently.
  • #549, #566 - Lucene.Net.Search.TestJoinUtil::TestMultiValueRandomJoin(): Fixed x86 floating point precision issue on .NET Framework
  • #568 - Lucene.Net.Search.Spell.TestSpellChecker::TestConcurrentAccess(): Fixed issues that were causing the test to hang due to concurrency problems.
  • #513, #572 - Updated ControlledRealTimeReopenThread to correctly handle timing (thanks @rclabo)
  • Lucene.Net.Support.Collections.ReverseComparer<T>: Replaced CaseInsensitiveComparer with J2N.Collections.Generic.Comparer<T>. This only affects tests.
  • #597 - .github/workflows: Updated website/documentation configs to use subdirectory glob patterns for paths.
  • #598 - Website: Fixed codeclimber article broken links
  • #600 - Fixed broken book link for Instant Lucene.NET (Thanks @rclabo)
  • #606 - Lucene.Net.Search.FieldCacheImpl.Cache<TKey, TValue>::Put(): Logic was inverted on the innerCache field, so the value was being updated if it already existed, when it should not be updated in this case
  • #606 - Lucene.Net.Search.FieldCacheImpl::Cache<TKey, TValue> (Put + Get): Fixed issue with InitReader() being called prior to adding the item to the cache when it should be called after
  • #619 - Lucene.Net.Spatial.Query.SpatialArgs::ctor(): Set operation and shape fields rather than calling the virtual properties to set them (which can cause initialization issues for subclasses)
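
As a small illustration of the ordinal sorting mentioned in #550, the sketch below sorts strings with StringComparer.Ordinal, which compares UTF-16 code units and therefore gives the same order regardless of the current culture. The class name OrdinalSortSketch is hypothetical and the code is not taken from Lucene.NET.

```csharp
using System;

public static class OrdinalSortSketch
{
    // Sorts a copy of the input by UTF-16 code unit value, independent of the current culture.
    public static string[] SortOrdinal(string[] terms)
    {
        var copy = (string[])terms.Clone();
        Array.Sort(copy, StringComparer.Ordinal);
        return copy;
    }
}
```

With the input { "b", "A", "a", "B" }, the ordinal result is A, B, a, b, because uppercase ASCII letters have lower code unit values than lowercase ones. A culture-sensitive sort would typically interleave the cases instead.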

Improvements

  • #538 - Lucene.Net.TestFramework.Search.CheckHits::CheckHitCollector(): Removed unnecessary call to Convert.ToInt32() and simplified collection initialization.
  • #554 - SWEEP: Made stateless private sealed comparers into singletons to reduce allocations (unless they already have a static property)
  • #555, #526 - Deprecated support for System.Threading.Thread.Interrupt() when writing indexes due to the high possibility in .NET that it could break a Commit() or cause a deadlock.
  • #567 - Enabled [Serializable] exceptions on all target platforms (previously, exceptions were not serializable in .NET Core)
  • #274, LUCENENET-574, #567 - Removed [Serializable] support for all classes except for the following (See #567 for a complete list)
    • Exceptions
    • Collections
    • Low-level holder types (such as BytesRef, CharsRef, etc.)
    • Stateless IComparer<T> implementations that are publicly exposed directly or through collections
  • #568 - Lucene.Net.TestFramework.Util.LuceneTestCase::NewSearcher(): Added missing event handler to shut down LimitedConcurrencyLevelTaskScheduler to prevent it from accepting new work when we are attempting to end the background process.
  • #568 - Lucene.Net.Support: Factored out ICallable<V> and ICompletionService<V> interfaces, as they are not needed
  • #570 - PERFORMANCE: Lucene.Net.Search.NumericRangeQuery: Eliminated boxing when converting from T to the numeric type and when comparing equality
  • #570 - PERFORMANCE: Lucene.Net.Suggest.Jaspell: Use J2N numeric types to eliminate boxing
  • #570 - PERFORMANCE: Lucene.Net.Search.FieldCache: Use J2N parsers and formatters
  • #570 - PERFORMANCE: Lucene.Net.Classification.Utils.DatasetSplitter: Removed duplicate calls to field methods and stored values in local variables. Use default round-trip format from J2N.
  • #570 - PERFORMANCE: Lucene.Net.Search.FieldCacheRangeFilter: Use HasValue and Value for nullable value types rather than casting and comparing to null
  • #574, #583 - SWEEP: Removed unnecessary nullable value types
  • #578 - Lucene.Net.Facet: Added culture-sensitive ToString() overload on FacetResult and LabelAndValue
  • #578 - Lucene.Net.Facet.FacetResult: Added nullable reference type support
  • #579 - Lucene.Net.Facet.DrillDownQuery: Added collection initializer support
  • #580 - Lucene.Net.Documents.DateTools - Added support for TimeZoneInfo when converting to/from string
  • #580 - Lucene.Net.QueryParsers.Flexible.Standard.Config.NumberDateFormat: Added constructor overload to format a date without a time.
  • #580 - Lucene.Net.QueryParsers.Flexible.Standard.Config.NumberDateFormat: Added NumericRepresentation property to set the representation to use for both Format() and Parse().
  • #580 - Lucene.Net.QueryParsers - Added support for TimeZoneInfo when converting to/from string (Classic and Flexible query parsers)
  • #580 - Lucene.Net.QueryParsers.Classic.QueryParserBase: Use TryParse() instead of Parse() to parse numeric values. Use the current culture, but fall back to invariant culture.
  • #582 - PERFORMANCE: Lucene.Net.Search.FieldCacheRangeFilter: Eliminated boxing in Equals() check
  • #584 - Lucene.Net.Expressions.SimpleBindings: Added collection initializer support. Updated DistanceFacetsExample and ExpressionAggregationFacetsExample to demonstrate usage.
  • #586 - SWEEP: Removed conditional compilation for MSTest/xUnit and the following features:
    • TESTFRAMEWORK_MSTEST
    • TESTFRAMEWORK_NUNIT
    • TESTFRAMEWORK_XUNIT
    • FEATURE_INSTANCE_TESTDATA_INITIALIZATION
    • FEATURE_INSTANCE_CODEC_IMPERSONATION
  • #587 - Fixed the documentation comments for LuceneTestCase
  • #587 - Added some documentation for random seed configuration
  • #587 - Implemented some missing console logging
  • #588 - lucene-cli: Added embedded readme to NuGet package and updated build to update docs with release version number
  • #590 - SWEEP: Added links to release notes and documentation in each NuGet package, and corrected package descriptions.
  • #594 - Website: Improved content of contributing/source code page to show current information about Apache's two-master setup and provided additional information about contributing source code with many links to external references. (Thanks @rclabo)
  • #595 - Website: Added "How to Setup Java Debugging" page. (Thanks @rclabo)
  • #602 - Shifted most of the IndexWriter tests to Lucene.Net.Tests._I-J so that both Lucene.Net.Tests._E-I and Lucene.Net.Tests._I-J run in less than 2 minutes. This cuts the total time on Azure DevOps by around 5 minutes.
  • #603, #601 - Upgraded build tools for LuceneDocsPlugins project
  • Upgraded J2N NuGet package dependency to 2.0.0
  • Upgraded ICU4N NuGet package dependency to 60.1.0-alpha.356
  • Upgraded RandomizedTesting.Generators NuGet package dependency to 2.7.8
  • Upgraded Morfologik.Stemming NuGet package dependency to 2.1.7
  • #611 - PERFORMANCE: Fixed NIOFSDirectory bottleneck on multiple instances by switching from a static shared lock to a lock per FileStream instance.
  • #611 - Lucene.Net.Store: Updated the FSDirectory documentation to remove irrelevant Java info and replace it with performance characteristics of the .NET implementation.
  • #613, #256, #604, #605 - PERFORMANCE: Factored out WeakDictionary<TKey, TValue> in favor of weak events using Prism.Core
  • #617 - SWEEP: Changed "== null" to "is null"
  • #619 - SWEEP: Lucene.Net.Spatial: Enabled nullable reference type support
  • #619 - SWEEP: Lucene.Net.Spatial: Added guard clauses, where appropriate
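
The parsing strategy described for QueryParserBase in #580 (try a culture first, then fall back to the invariant culture) can be sketched with the base class library alone. This is not the QueryParserBase code. The class name NumericParsingSketch and the choice of NumberStyles.Float are assumptions made for illustration.

```csharp
using System;
using System.Globalization;

public static class NumericParsingSketch
{
    // Tries the supplied culture first, then falls back to the invariant culture.
    // NumberStyles.Float is used so group separators are never silently accepted.
    public static bool TryParseDouble(string text, CultureInfo culture, out double result)
        => double.TryParse(text, NumberStyles.Float, culture, out result)
        || double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out result);
}
```

With a culture whose decimal separator is a comma, both "1,5" and "1.5" parse to 1.5 under this sketch: the first through the culture, the second through the invariant fallback. Allowing thousands separators would change that, since "1.5" could then be read as 15 in such a culture.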

New Features

  • #288, #547 - Lucene.Net.TestFramework: Fixed random seed functionality so it is repeatable, so random tests can be more easily debugged. The random seed and how to configure a test assembly to repeat the same result is appended to the output message of the test (which becomes visible upon failure). The J2N.Randomizer class was used to provide random numbers, which uses the same implementation on every OS, so the random seeds are portable across operating systems.
  • #588, #612 - lucene-cli: Added multitarget support for .NET Core 3.1, .NET 5.0, and .NET 6.0
  • #592 - Added Source Link support and added documentation page to the API docs.
  • #593, #596, #364 - Added Cross-Platform Build Script
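
The idea behind repeatable random tests (#288, #547) is that a generator created from a known seed replays the same sequence, so a failing run can be reproduced. The sketch below demonstrates this with System.Random purely as a stand-in. The test framework itself uses J2N.Randomizer, and System.Random makes no promise of identical sequences across runtimes. The class name RepeatableSeedSketch is hypothetical.

```csharp
using System;

public static class RepeatableSeedSketch
{
    // Produces 'count' pseudo-random integers from a fixed seed.
    public static int[] Generate(int seed, int count)
    {
        var random = new Random(seed); // stand-in for J2N.Randomizer
        var values = new int[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = random.Next();
        }
        return values;
    }
}
```

Calling Generate() twice with the same seed in the same process returns the same values, which is the property that lets a reported seed reproduce a failing test.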

Lucene.Net_4_8_0_beta00015

2 years ago

This release contains important bug fixes, performance enhancements, concurrency improvements, and improved debugging support (full stack traces, consistent exception types, attributes for debug view, and structurally formattable lists).

Much of the exception handling has been changed so it is recommended to test thoroughly, especially if your application relies on catching exceptions from Lucene.NET for control flow. The full extent of the exception handling changes are not documented here, but can be viewed at https://github.com/apache/lucenenet/pull/476/files.

Known Issues

  • Lucene.Net.Index.IndexWriter::Dispose(): Using Thread.Interrupt() to shut down background threads in .NET is problematic because System.Threading.ThreadInterruptedException could be thrown on any lock statement with contention on it. This includes lock statements on code that we depend on or custom components that are engaged during a Commit() (such as a custom Directory implementation). These exceptions may cause Commit() to fail in unexpected ways during IndexWriter.Dispose(). While this affected all prior releases to a much larger degree, this release provides a partial solution using UninterruptableMonitor.Enter() to ensure these exceptions are ignored and the Thread.Interrupt() state restored, which greatly reduces the chance a Commit() could be broken or a deadlock can occur. This problem will not affect applications that do not call Thread.Interrupt() to shut down a thread. It is recommended never to use Thread.Interrupt() in conjunction with IndexWriter, ConcurrentMergeScheduler, or ControlledRealTimeReopenThread.
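
One common alternative to Thread.Interrupt() in application code is cooperative cancellation with a CancellationToken. The sketch below is not part of Lucene.NET and the class name BackgroundWorkerSketch is hypothetical. It shows a background thread that polls a token and is shut down by cancelling and joining, so no ThreadInterruptedException can surface inside a lock statement.

```csharp
using System;
using System.Threading;

public sealed class BackgroundWorkerSketch : IDisposable
{
    private readonly CancellationTokenSource cts = new CancellationTokenSource();
    private readonly Thread thread;
    private int iterations;

    public BackgroundWorkerSketch()
    {
        thread = new Thread(Run) { IsBackground = true };
        thread.Start();
    }

    // Number of loop iterations completed so far.
    public int Iterations => Volatile.Read(ref iterations);

    private void Run()
    {
        // Poll the token instead of relying on ThreadInterruptedException.
        while (!cts.IsCancellationRequested)
        {
            Interlocked.Increment(ref iterations);
            // Waits up to 10 ms, but wakes immediately when cancellation is requested.
            cts.Token.WaitHandle.WaitOne(10);
        }
    }

    public void Dispose()
    {
        cts.Cancel();  // request shutdown cooperatively
        thread.Join(); // wait for the loop to observe the request
        cts.Dispose();
    }
}
```

In an application, the loop body would hold whatever periodic work the thread performs, and Dispose() would be called before disposing any IndexWriter the thread uses.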

Change Log

Breaking Changes

  • #455 - lucene-cli: Changed exit codes to well-defined constants to make testing simpler
  • #407 - Moved all Document extensions to the Lucene.Net.Documents.Extensions namespace and added tests for DocumentExtensions in Lucene.Net.Tests._J-S, Lucene.Net.Tests.ICU and Lucene.Net.Tests.Facet. Added guard clauses and updated documentation of Document extension methods and some related fields.
  • #474 - Lucene.Net.TestFramework.Util.TestUtil: Renamed method parameters from abbreviations to whole words to follow .NET API conventions and improved documentation.
  • #475 - Lucene.Net.Grouping: Refactored and improved GroupingSearch Search API and added GroupByField() and GroupByFunction() methods.
  • #479 - Moved Lucene.Net.Join types to Lucene.Net.Search.Join namespace
  • Marked public exception constructors that were meant only for testing internal (affects only .NET Framework)
  • #446, #476 - Redesigned exception handling to ensure that exception behavior is the same as in Lucene and so we consistently throw the closest .NET equivalent exception across all of the projects.
  • #480 - Changed Cardinality() methods to Cardinality property. Added obsolete Cardinality() extension methods to the namespace of each of the pertinent types for backward compatibility.
    • Lucene.Net.Index.RandomAccessOrds
    • Lucene.Net.Util.FixedBitSet
    • Lucene.Net.Util.Int64BitSet
    • Lucene.Net.Util.OpenBitSet
    • Lucene.Net.Util.PForDeltaDocIdSet
    • Lucene.Net.Util.WAH8DocIdSet
  • #481 - Lucene.Net.Index.Term: Changed Text() method into Text property. Added an obsolete Text() extension method to Lucene.Net.Index namespace for backward compatibility.
  • #482 - Lucene.Net.BinaryDocValuesField: Changed fType static field to TYPE (as it was in Lucene) and added obsolete fType field for backward compatibility.
  • #483 - Changed all GetFilePointer() methods into properties named Position to match FileStream. Types affected: Lucene.Net.Store.IndexInput (and subclasses), Lucene.Net.Store.IndexOutput (and subclasses). Added obsolete extension methods for each type in Lucene.Net.Store namespace for backward compatibility.
  • #484 - Lucene.Net.QueryParser: Factored out NLS/IMessage/Message support and changed exceptions to use string messages so end users can elect whether or not to use .NET localization, as is possible with any other .NET exception type.
  • #484 - Lucene.Net.QueryParsers.Flexible.Messages: Removed entire namespace, as we have refactored to use .NET localization rather than NLS
  • #484 - Lucene.Net.Util: Removed BundleResourceManagerFactory and IResourceManagerFactory, as these were only to support NLS. The new approach to localizing messages can be achieved by registering QueryParserMessages.SetResourceProvider(SomeResource.ResourceManager, SomeOtherResource.ResourceManager) at application startup using any ResourceManager instance or designer-generated resource's ResourceManager property.
  • #497, #507 - Factored out Lucene.Net.Support.Time in favor of J2N.Time. Replaced all calls (except Lucene.Net.Tests.Search.TestDateFilter) that were Environment.TickCount and Time.CurrentTimeMilliseconds() to use Time.NanoTime() / Time.MillisecondsPerNanosecond for more accurate results. This may break some concurrent applications that are synchronizing with Lucene.NET components using Environment.TickCount.
  • #504 - Lucene.Net.Highlighter.VectorHighlight.ScoreOrderFragmentsBuilder.ScoreComparer: Implemented singleton pattern so the class can only be used via the Default property.
  • #502 - Lucene.Net.QueryParser.Flexible.Core.Nodes.IQueryNode: Added RemoveChildren() method from Lucene 8.8.1 to fix broken RemoveFromParent() method behavior (applies patch LUCENE-5805). This requires existing IQueryNode implementations to implement RemoveChildren() and TryGetTag().
  • #502 - Lucene.Net.QueryParser.Flexible.Core.Nodes.IQueryNode: Added TryGetTag() method to simplify looking up a tag by name.
  • #528 - Lucene.Net.Analysis.Stempel.Egothor.Stemmer.MultiTrie: Changed protected m_tries field from List<Trie> to IList<Trie>
  • #528 - Lucene.Net.Search.BooleanQuery: Changed protected m_weights field from List<Weight> to IList<Weight>
  • #528 - Lucene.Net.Search.DisjunctionMaxQuery: Changed protected m_weights field from List<Weight> to IList<Weight>

Bugs

  • #461 - Lucene.Net.Grouping.GroupingSearch::GroupByFieldOrFunction<TGroupValue>(): Fixed casting bug of allGroupsCollector.Groups by changing the cast to ICollection instead of IList.
  • #453, #455 - lucene-cli: Made appsettings.json file optional. This was causing a fatal FileNotFoundException after installing lucene-cli without adding an appsettings.json file.
  • #464 - Lucene.Net.Codecs.SimpleText.SimpleTextStoredFieldsWriter + Lucene.Net.Codecs.SimpleText.SimpleTextTermVectorsWriter: Fixed Abort() methods to correctly swallow any exceptions thrown by Dispose() to match the behavior of Lucene 4.8.0.
  • #394, #467 - Lucene.Net NuGet does not compile under Visual Studio 2017. Downgraded Lucene.Net.CodeAnalysis.CSharp and Lucene.Net.CodeAnalysis.VisualBasic from .NET Standard 2.0 to .NET Standard 1.3 to fix.
  • #471 - Lucene.Net.Documents.FieldType: Corrected documentation to reflect the actual default of IsTokenized as true and NumericType as NumericType.NONE, and to set to NumericType.NONE (rather than null) if the field has no numeric type.
  • #476 - Lucene.Net.Analysis.Common.Util.CharArraySet: Throw NotSupportedException when the set is readonly, not InvalidOperationException to match .NET collection behavior
  • #476 - Lucene.Net.Codecs.Bloom.BloomFilteringPostingsFormat::FieldsConsumer(): Throw NotSupportedException rather than InvalidOperationException
  • #476 - Lucene.Net.Codecs.Lucene42.Lucene42DocValuesProducer::LoadNumeric(): Throw AssertionError rather than InvalidOperationException
  • #476 - Lucene.Net.Store.CompoundFileDirectory::ReadEntries(): throw AssertionError rather than InvalidOperationException
  • #476 - Lucene.Net.Util.Packed.DirectPackedReader::Get(): Throw AssertionError rather than InvalidOperationException
  • #476 - Lucene.Net.Facet: Throw InvalidOperationException rather than ThreadStateException
  • #476 - Lucene.Net.Grouping.BlockGroupingCollector: Throw NotSupportedException rather than InvalidOperationException
  • #476 - Lucene.Net.Tests.Index.TestUniqueTermCount: Throw NotSupportedException rather than InvalidOperationException
  • #486 - Changed all references that were float.MinValue and double.MinValue to float.Epsilon and double.Epsilon because those are the .NET equivalent constants to Float.MIN_VALUE and Double.MIN_VALUE in Java
  • #492, #497 - Lucene.Net.Search.ControlledRealTimeReopenThread - Fixed time calculation issue that was causing wait to happen for unusually long time periods.
  • Lucene.Net.Tests.Search.TestMultiThreadTermVectors: Removed stray [Test] attribute that was causing extra overhead with no benefit
  • #509 - Lucene.Net.Support.WeakDictionary: Changed WeakKey to use WeakReference<T> instead of WeakReference to avoid problems with garbage collection
  • #504 - Lucene.Net.Highlighter.VectorHighlight.ScoreOrderFragmentsBuilder.ScoreComparer: Implemented singleton pattern so the class can only be used via the Default property.
  • #506, #509 - Lucene.Net.Index.IndexReader: Use ConditionalWeakTable<TKey, TValue>/WeakDictionary<TKey, TValue> to ensure dead elements are pruned and garbage collected
  • #525 - Fixed Lucene.Net.Index.TestIndexWriter::TestThreadInterruptDeadlock() and Lucene.Net.Index.TestIndexWriter::TestTwoThreadsInterruptDeadlock() that were failing due to a difference in .NET Thread.Interrupt() behavior. In Java, an InterruptedException is never thrown from synchronized (this) (the equivalent of lock (this)), but .NET may throw ThreadInterruptedException in cases where there is contention on the lock. The patch fixes our immediate problem of these 2 tests failing and deadlocks occurring, but is only a partial fix. See #526 for an explanation.
  • Lucene.Net.Tests.Suggest.Suggest.Analyzing.TestFreeTextSuggester::TestRandom(): LookupResult calculation in the test was using different order of parentheses than the production code. This bug existed in Java, but apparently the order makes no difference on that platform. This test was getting a false positive because it was using List<T>.ToString() to make the result comparison, which J2N's List<T> corrects.
  • #529 - Fix for .NET Framework x86 Support. The following tests were fixed by using the Lucene.Net.Util.NumericUtils::SingleToSortableInt32() method to compare the raw bits for equality. This change doesn't impact performance or behavior of the application as using an approximate float comparison would.
    • Lucene.Net.Expressions.TestExpressionSorts::TestQueries()
    • Lucene.Net.Sandbox.TestSlowFuzzyQuery::TestTieBreaker()
    • Lucene.Net.Sandbox.TestSlowFuzzyQuery::TestTokenLengthOpt()
    • Lucene.Net.Search.TestBooleanQuery::TestBS2DisjunctionNextVsAdvance()
    • Lucene.Net.Search.TestFuzzyQuery::TestTieBreaker()
    • Lucene.Net.Search.TestSearchAfter::TestQueries()
    • Lucene.Net.Search.TestTopDocsMerge::TestSort_1()
    • Lucene.Net.Search.TestTopDocsMerge::TestSort_2()
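
The raw-bits comparison used for the x86 fixes above can be sketched as follows. This mirrors the idea behind NumericUtils.SingleToSortableInt32() but is not the Lucene.NET source, and the class name FloatBitsSketch is hypothetical. Converting each float to its 32-bit pattern first means the comparison operates on exactly the stored value, which presumably avoids differences from extra intermediate precision on .NET Framework x86.

```csharp
using System;

public static class FloatBitsSketch
{
    // Maps a float to an int whose natural ordering matches the float ordering.
    public static int ToSortableInt32(float value)
    {
        int bits = BitConverter.ToInt32(BitConverter.GetBytes(value), 0);
        // For negative floats, flip all bits except the sign so larger magnitudes sort lower.
        if (bits < 0)
        {
            bits ^= 0x7fffffff;
        }
        return bits;
    }

    // Exact comparison on the stored 32-bit value.
    public static int Compare(float a, float b)
        => ToSortableInt32(a).CompareTo(ToSortableInt32(b));
}
```

Compare() returns a negative number, zero, or a positive number in the usual IComparer<T> style. NaN handling is left out of this sketch.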

Improvements

  • #284 - website: Converted code examples in documentation from Java to C#
  • #300 - website: Fixed formatting and many broken links on the website
  • PERFORMANCE: Lucene.Net.Tartarus.Snowball: Refactored to use Func<bool> instead of a Reflection call to execute stemmer code as in the original C# port: https://github.com/snowballstem/snowball
  • #461, #475 - Added GroupingSearch tests to demonstrate usage
  • #453, #455 - lucene-cli: Added appsettings.json file with the default settings
  • #455 - Lucene.Net.Tests.Cli: Added InstallationTest to install lucene-cli and run it to ensure it can be installed and has basic functionality.
  • #463 - Lucene.Net.Analysis.OpenNLP: Updated to OpenNLP 1.9.1.1 and added strong naming support.
  • #465 - PERFORMANCE: Lucene.Net.IndexWriter.ReaderPool: Swapped in ConcurrentDictionary<TKey, TValue> instead of Dictionary<TKey, TValue> to take advantage of the fact ConcurrentDictionary<TKey, TValue> supports deleting while iterating.
  • #466 - PERFORMANCE: Lucene.Net.Queries.Mlt.MoreLikeThis: Fixed boxing issues with RetrieveTerms() and RetrieveInterestingTerms() methods by changing object[] to a class named ScoreTerm (same refactoring as Lucene 8.2.0).
  • #467 - Lucene.Net.CodeAnalysis: Added Version.props file to make it possible to manually bump the assembly number by one revision on any code change (VS requires this, see: dotnet/roslyn#4381 (comment)).
  • website - Updated release documentation.
  • #473, #349 - Moved "benchmark" tests that cannot fail to the nightly build to reduce testing time in the normal workflow.
  • #257, #474 - Moved the RandomizedTesting generators to a separate library so they can be reused across projects.
  • #474 - Lucene.Net.TestFramework: Removed FEATURE_RANDOMIZEDCONTEXT and deleted all files related to Java randomizedtesting that were partially ported bits of its test runner.
  • #476 - Lucene.Net.TestFramework, Lucene.Net.Support: Added [DebuggerStepThrough] attribute to all assertion methods so the debugger stops in the code that fails the assert not inside of the assert method (affects only internal Lucene.NET development).
  • #446, #476 - Lucene.Net.Support.ExceptionHandling: Added ExceptionExtensions class with methods named after the Java exception types so future porting efforts can use similar catch blocks with the same behavior as in Java (e.g. catch (Exception e) when (e.IsIllegalStateException())).
  • #446, #476 - Lucene.Net.Support.ExceptionHandling: Added exception classes with the same names as Java exception types so future porting efforts can use similar catch blocks with the same behavior as in Java (e.g. throw IllegalStateException.Create("This is the message")).
  • #446, #476 - Added Lucene.Net.Tests.AllProjects project containing tests to confirm that all exceptions thrown by .NET and NUnit are correctly identified by ExceptionExtensions methods.
  • #482 - Lucene.Net.Documents.FieldType::Freeze(): Changed from void return to return this FieldType to allow direct chaining of the method in field initializers. Chained the Freeze() method in all static field initializers of Field subclasses to eliminate extra helper load methods. Marked BinaryDocValuesField.fType static field obsolete and added TYPE static field (as it was in Lucene).
  • #484 - Lucene.Net.QueryParsers.Flexible.Core.Messages: Redesigned QueryParserMessages.cs so that it is just a facade around an IResourceProvider implementation that provides the actual fallback logic. Added a QueryParserResourceProvider implementation that can be passed zero to many ResourceProvider instances to override and optionally localize the default resource messages.
  • #490 - Improved debugger experience for BytesRef. In addition to the decimal byte values, it now shows the BytesRef as a UTF8 string. If the BytesRef is not a UTF8 string, that representation will be the string's fingerprint signature.
  • #488 - Lucene.Net.Grouping: Fix SonarQube's "Any() should be used to test for emptiness" / Code Smell
  • #504 - Lucene.Net.Support: Factored out Number class in favor of using J2N's parsers and formatters
  • #504 - Lucene.Net.Highlighter: Implemented IFormattable and added culture-aware ToString() overload to WeightedPhraseInfo and WeightedFragInfo
  • #504 - PERFORMANCE: Lucene.Net.Highlighter: Use RemoveAll() extension method rather than allocating separate collections to track which enumerated items to remove.
  • #499 - PERFORMANCE: Use overloads of J2N Parse/TryParse that accept offsets rather than allocating substrings
  • #500 - PERFORMANCE: Updated collections to use optimized removal methods
  • #501 - PERFORMANCE: Lucene.Net.Support.ListExtensions::SubList(): Factored out in favor of J2N's List<T>.GetView() method. Many calls to List<T>.GetRange() were updated to J2N.Collections.Generic.List<T>.GetView(), which reduces unnecessary allocations.
  • #503 - PERFORMANCE: Lucene.Net.Util.UnicodeUtil::ToString(): Updated to cascade the call to J2N.Character.ToString() which has been optimized to use the stack for small strings.
  • #512 - Removed FEATURE_THREAD_YIELD and FEATURE_THREAD_PRIORITY, changed all applicable calls from Thread.Sleep(0) back to Thread.Yield() as they were in Lucene.
  • #523 - Removed several .NET Standard 1.x Features
    • NETSTANDARD1_X
    • FEATURE_CULTUREINFO_GETCULTURES
    • FEATURE_DTD_PROCESSING
    • FEATURE_XSLT
    • FEATURE_STACKTRACE
    • FEATURE_APPDOMAIN_ISFULLYTRUSTED
    • FEATURE_APPDOMAIN_BASEDIRECTORY
    • FEATURE_APPDOMAIN_GETASSEMBLIES
    • FEATURE_METHODBASE_GETMETHODBODY
  • #528 - Changed all instances of System.Collections.Generic.List<T> to J2N.Collections.Generic.List<T>, which is structurally equatable and structurally formattable.
  • #528 - PERFORMANCE: Lucene.Net.Util.ListExtensions: Added optimized path for J2N.Collections.Generic.List<T> in AddRange() and Sort() extension methods
  • #530 - Upgraded J2N NuGet package dependency to 2.0.0-beta-0017
  • #530 - Upgraded ICU4N NuGet package dependency to 60.1.0-alpha.355
  • #530 - Upgraded Morfologik.Stemming package dependency to 2.1.7-beta-0004
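
The catch-block pattern described in #446 and #476 relies on C# exception filters. The following is a rough sketch of the idea, not the Lucene.NET source: ExceptionExtensionsSketch and FilterDemo are hypothetical names, and the mapping of Java's IllegalStateException to InvalidOperationException (excluding ObjectDisposedException) is an assumption made for illustration.

```csharp
using System;

// Hypothetical sketch; the real ExceptionExtensions class may classify differently.
public static class ExceptionExtensionsSketch
{
    // Assumed mapping: Java's IllegalStateException is treated here as
    // InvalidOperationException, excluding its subclass ObjectDisposedException.
    public static bool IsIllegalStateException(this Exception e)
    {
        return e is InvalidOperationException && !(e is ObjectDisposedException);
    }
}

public static class FilterDemo
{
    // Runs the action and reports which catch block handled the failure.
    public static string Classify(Action action)
    {
        try
        {
            action();
            return "ok";
        }
        catch (Exception e) when (e.IsIllegalStateException())
        {
            // Reads much like Java's catch (IllegalStateException e).
            return "illegal state";
        }
        catch (Exception)
        {
            return "other";
        }
    }
}
```

Putting the classification in an extension method keeps the type-mapping decision in one place, so a ported catch block stays close to its Java original while the .NET-specific rules can be adjusted centrally.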

New Features

  • #521 - Added target and tests for net6.0

Lucene.Net_4_8_0_beta00014

3 years ago

This release contains bug fixes and minor performance improvements.

Known Issues

  • The lucene-cli tool requires an appsettings.json file, but none was shipped. Upon running lucene on the command line, the following error will be presented:

    F:\Projects\lucenenet>lucene
    Unhandled exception. System.IO.FileNotFoundException: The configuration file 'appsettings.json' was not found and is not optional. The         physical path is 'C:\Users\shad\.dotnet\tools\.store\lucene-cli\4.8.0-beta00010\lucene-cli\4.8.0-beta00010\tools\netcoreapp3.1\any\appsettings.json'.
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.HandleException(ExceptionDispatchInfo info)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load()
    at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
    at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
    at Lucene.Net.Cli.Program.Main(String[] args) in D:\a\1\s\src\dotnet\tools\lucene-cli\Program.cs:line 27
    

    Adding a text file named appsettings.json to the location specified in the error message with opening and closing brackets will prevent the exception.

    appsettings.json

    {
    }
    

    IMPORTANT: There must be at least opening and closing curly brackets in the file, or it won't be parsed as valid JSON.

Benchmarks (from #310)

Index Files

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.867 (2004/?/20H1)
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.104
  [Host]          : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00005 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00006 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00007 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00008 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00009 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00010 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00011 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00012 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00013 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00014 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT

InvocationCount=1  IterationCount=15  LaunchCount=2  
UnrollFactor=1  WarmupCount=10  

| Method | Job | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| IndexFiles | 4.8.0-beta00005 | 905.6 ms | 131.82 ms | 197.30 ms | 43000.0000 | 8000.0000 | 7000.0000 | 220.99 MB |
| IndexFiles | 4.8.0-beta00006 | 707.1 ms | 18.57 ms | 26.04 ms | 44000.0000 | 8000.0000 | 7000.0000 | 220.99 MB |
| IndexFiles | 4.8.0-beta00007 | 712.2 ms | 16.45 ms | 23.06 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.04 MB |
| IndexFiles | 4.8.0-beta00008 | 785.7 ms | 17.37 ms | 25.46 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.54 MB |
| IndexFiles | 4.8.0-beta00009 | 824.9 ms | 32.86 ms | 48.17 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.34 MB |
| IndexFiles | 4.8.0-beta00010 | 789.6 ms | 16.40 ms | 24.04 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.35 MB |
| IndexFiles | 4.8.0-beta00011 | 805.4 ms | 21.26 ms | 31.82 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.37 MB |
| IndexFiles | 4.8.0-beta00012 | 827.8 ms | 13.95 ms | 20.89 ms | 56000.0000 | 7000.0000 | 6000.0000 | 287.03 MB |
| IndexFiles | 4.8.0-beta00013 | 793.6 ms | 13.63 ms | 19.55 ms | 44000.0000 | 8000.0000 | 7000.0000 | 220.22 MB |
| IndexFiles | 4.8.0-beta00014 | 812.0 ms | 21.97 ms | 30.79 ms | 44000.0000 | 8000.0000 | 7000.0000 | 220.29 MB |

Search Files

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.867 (2004/?/20H1)
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.104
  [Host]          : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00005 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00006 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00007 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00008 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00009 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00010 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00011 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00012 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00013 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT
  4.8.0-beta00014 : .NET Core 3.1.13 (CoreCLR 4.700.21.11102, CoreFX 4.700.21.11602), X64 RyuJIT

IterationCount=15  LaunchCount=2  WarmupCount=10  

| Method | Job | Mean | Error | StdDev | Median | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|
| SearchFiles | 4.8.0-beta00005 | 421.1 ms | 111.47 ms | 163.38 ms | 326.3 ms | 18000.0000 | 1000.0000 | - | 82.11 MB |
| SearchFiles | 4.8.0-beta00006 | 349.8 ms | 24.03 ms | 35.97 ms | 338.9 ms | 18000.0000 | 1000.0000 | - | 82.11 MB |
| SearchFiles | 4.8.0-beta00007 | 333.6 ms | 17.36 ms | 25.98 ms | 336.8 ms | 18000.0000 | 1000.0000 | - | 81.9 MB |
| SearchFiles | 4.8.0-beta00008 | 191.7 ms | 7.17 ms | 10.51 ms | 187.9 ms | 17000.0000 | 1000.0000 | - | 80.13 MB |
| SearchFiles | 4.8.0-beta00009 | 186.6 ms | 8.56 ms | 12.55 ms | 184.0 ms | 17000.0000 | 1000.0000 | - | 80.13 MB |
| SearchFiles | 4.8.0-beta00010 | 182.2 ms | 6.69 ms | 9.16 ms | 181.6 ms | 17000.0000 | 1000.0000 | - | 79.85 MB |
| SearchFiles | 4.8.0-beta00011 | 208.9 ms | 17.73 ms | 26.54 ms | 207.9 ms | 17000.0000 | 1000.0000 | - | 79.85 MB |
| SearchFiles | 4.8.0-beta00012 | 192.3 ms | 10.99 ms | 16.46 ms | 187.8 ms | 18000.0000 | 1000.0000 | - | 81.11 MB |
| SearchFiles | 4.8.0-beta00013 | 177.4 ms | 7.74 ms | 11.59 ms | 175.1 ms | 14000.0000 | 1000.0000 | - | 65.78 MB |
| SearchFiles | 4.8.0-beta00014 | 172.7 ms | 5.93 ms | 8.88 ms | 168.9 ms | 14000.0000 | 1000.0000 | - | 65.78 MB |

Change Log

Breaking Changes

  • #424 - Deprecated TaskMergeScheduler, a merge scheduler that was added to support .NET Standard 1.x
  • #424 - Lucene.Net.TestFramework: Removed the public LuceneTestCase.ConcurrentMergeSchedulerFactories class

Bugs

  • #405, #415 - Lucene.Net.Index.DocTermOrds: Fixed issue with enumerator (OrdWrappedTermsEnum) incorrectly returning true when the value is null.
  • #427 - Lucene.Net.Analysis.Common: Fixed TestRollingCharBuffer::Test() to prevent out of memory exceptions when running with Verbose enabled
  • #434, #418 - Hunspell affixes' file parsing corrupts some affixes' conditions
  • #434, #419 - HunspellStemFilter does not work with zero affix
  • #439 - Lucene.Net.Facet.Taxonomy.CachedOrdinalsReader: Fixed synchronization issue between adding new items to the cache and reading RamBytesUsed method
  • #439, #417, #319 - Lucene.Net.Spatial.Util.ShapeFieldCacheProvider: Fixed atomicity issue with loading the cache by using Lazy<T>.
  • #441 - Lucene.Net.TestFramework.Support.Configuration.TestConfigurationFactory: Use Lazy<T> to ensure the configurationCache.GetOrAdd() factory is atomic.
  • #441 - Lucene.Net.TestFramework.Search.ShardSearchingTestBase: Fixed possible KeyNotFoundException when getting the value from collectionStatisticsCache
  • #441, #417, #319 - Lucene.Net.Spatial.Prefix.PrefixTreeFactory: Use Lazy<T> in ConcurrentDictionary to make the valueFactory atomic.
  • #443 - Lucene.Net.Benchmark.ByTask.Feeds.SpatialDocMaker: Since Dictionary<TKey, TValue>.this[key] is not marked virtual in .NET, subclassing Dictionary<string, string> is not a valid approach. So we implement IDictionary<string, string> instead.
  • #416 - CLI Documentation issue - environment variable token not replaced.
  • #450 - Lucene.Net.Facet: Reverted locking to the state it was in Lucene 4.8.1; however, we are still making use of ReaderWriterLockSlim to improve read performance of caches. Also removed the 1 second lock timeout from Cl2oTaxonomyWriterCache.
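
Several of the fixes above (#439, #441) wrap cache values in Lazy<T>. The reason is that ConcurrentDictionary<TKey, TValue>.GetOrAdd() may invoke its value factory more than once when threads race on the same key, although only one result is stored. A minimal sketch of the pattern follows; LazyCacheSketch and its members are hypothetical names, and the "expensive" computation is a placeholder.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical sketch of the Lazy<T>-in-ConcurrentDictionary pattern.
public static class LazyCacheSketch
{
    private static readonly ConcurrentDictionary<string, Lazy<int>> cache =
        new ConcurrentDictionary<string, Lazy<int>>();

    private static int factoryCalls;

    // Number of times the expensive computation actually ran.
    public static int FactoryCalls
    {
        get { return Volatile.Read(ref factoryCalls); }
    }

    public static int GetOrCreate(string key)
    {
        // GetOrAdd may construct several Lazy<int> wrappers under contention,
        // but only one is stored, and constructing a wrapper is cheap.
        Lazy<int> lazy = cache.GetOrAdd(key, k => new Lazy<int>(
            () => ComputeExpensiveValue(k),
            LazyThreadSafetyMode.ExecutionAndPublication));

        // Only the stored wrapper is ever asked for its value, so the
        // expensive computation runs at most once per key.
        return lazy.Value;
    }

    // Placeholder for an expensive load; counts how often it is invoked.
    private static int ComputeExpensiveValue(string key)
    {
        Interlocked.Increment(ref factoryCalls);
        return key.Length;
    }
}
```

Calling LazyCacheSketch.GetOrCreate() for the same key from many threads should leave FactoryCalls at 1 for that key, because every caller reads Value from the single stored Lazy<int> instance.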

Improvements

  • #269 - Added [AwaitsFix] attribute to known failing tests
  • #391 - Improved plugins in DocFx when generating API docs
  • #392 - Enabled GitHub Actions to Run Tests on Pull Request
  • #395 - Improved performance of build pipeline by publishing the whole solution in one step instead of one project at a time
  • #395 - Fixed dependency NuGet package version conflicts
  • #395 - Added crash and hang detection to the test runs
  • #395 - Upgraded to the latest dotnet CLI commands dotnet build and dotnet test rather than dotnet msbuild and dotnet vstest
  • #411, #259 - Reviewed tests for Lucene.Net.Tests.Facet
  • #412, #406 - Upgraded NUnit to 3.13.1 and NUnit3TestAdapter to 3.17.0 to make Console.WriteLine() work in unit tests.
  • #414, #259 - Review of tests for Lucene.Net.Tests.Join
  • #420, #259 - Review of tests for Lucene.Net.Tests.Classification
  • #422 - Lucene.Net.Classification: Removed leading underscore from private/internal member variables
  • #423 - Reduced casting
  • #423 - azure-pipelines.yml: Added RunX86Tests option to explicitly enable x86 tests without having to run a full nightly build
  • #425, #259 - Review of tests for Lucene.Net.Tests.Codecs
  • #426 - Changed multiple naming conventions of anonymous classes to just use the suffix AnonymousClass
  • #426 - Changed accessibility of anonymous classes to private
  • #427, #259 - Review of tests for Lucene.Net.Tests.Queries
  • #433, #430 - Removed FEATURE_CLONEABLE and the MSBuild property IncludeICloneable
  • #435, #259 - Review of tests for Lucene.Net.Tests.Expressions
  • #438 - Don't insert extra newline in TFIDFSim's score explanation (this minor change had already been done to Lucene 5.0, so we are back-porting it to 4.8.0)
  • #439 - Lucene.Net.Util.VirtualMethod: Removed unnecessary call to Convert.ToInt32()
  • #439 - Lucene.Net.Util.AttributeSource: Restored comment from Lucene indicating it doesn't matter if multiple threads compete to populate the ConditionalWeakTable.
  • #440 - SWEEP: Reviewed catch blocks and made improvements to preserve stack details.
  • #441, #417 - Lucene.Net.Analysis.OpenNLP.Tools.OpenNLPOpsFactory: Simplified logic by using GetOrAdd() instead of TryGetValue.
  • #441 - Lucene.Net.TestFramework.Util (LuceneTestCase + TestUtil): Refactored the CleanupTemporaryFiles() method to be more in line with the original Java implementation, including not allowing new files/directories to be added to the queue concurrently with the deletion process.
  • #441 - PERFORMANCE: Lucene.Net.Join.ToParentBlockJoinCollector: Changed from ConcurrentQueue<T> to Queue<T> because we are dealing with a collection declared within the same method so there is no reason for the extra overhead.
  • #441 - PERFORMANCE: Lucene.Net.Tests.Suggest.Spell.TestSpellChecker: Replaced ConcurrentBag<T> with ConcurrentQueue<T> because we need to be sure the underlying implementation guarantees order and the extra call to Reverse() was just slowing things down.
  • #441 - Lucene.Net.TestFramework.Search.ShardSearchingTestBase: Display the contents of the collection to the console using Collections.ToString().
  • #441 - Lucene.Net.Search.SearcherLifetimeManager: Added comment to indicate the reason we use Lazy<T> is to make the create operation atomic.
  • #441 - Directory.Build.Targets: Added FEATURE_DICTIONARY_REMOVE_CONTINUEENUMERATION so we can support this feature in .NET 5.x+ when we add a target.
  • #442 - PERFORMANCE: Lucene.Net.Search.Suggest.Fst.FSTCompletion: Use Stack<T> rather than List<T>.Reverse(). Also, removed unnecessary lock in CheckExistingAndReorder(), as it is only used in a single thread at a time.
  • #442 - PERFORMANCE: Lucene.Net.Search.Suggest.SortedInputEnumerator: Removed unnecessary call to Reverse() and allocation of HashSet<T>
  • #444, #272 - PERFORMANCE: Lucene.Net.Search.FieldCacheImpl: Reverted locking back to the state of Lucene 4.8.0.
  • #445 - Removed FEATURE_THREAD_INTERRUPT since all supported targets now support thread interrupts. Note also that Lucene depends on thread interrupts to function properly, so disabling this feature would be invalid.
  • #448 - DOCS: Added migration guide for users migrating from Lucene.NET 3.0.3 to Lucene.NET 4.8.0.
  • #396 - DOCS: Created branching scheme to track changes in documentation between different Lucene versions and removed the JavaDocToMarkdownConverter tool from the normal build workflow of the API docs. This frees us up to update the "namespace" documentation with .NET-specific information and code examples.
  • Upgraded J2N NuGet package dependency to 2.0.0-beta-0012
  • Upgraded ICU4N NuGet package dependency to 60.1.0-alpha.254
  • Upgraded Morfologik.Stemming package dependency to 2.1.7-beta-0002

New Features

  • #385, #362 - Lucene.Net.Documents.Document: Added culture-sensitive overloads of GetValues(), Get() and GetStringValue() that accept format and IFormatProvider and implemented IFormattable on Document and LazyDocument.
  • #404 - Added Commit() method to AnalyzingInfixSuggester (from LUCENE-5889)

Lucene.Net_4_8_0_beta00013

3 years ago

This release contains important bug fixes and performance enhancements.

Known Issues

  • The lucene-cli tool requires an appsettings.json file, but none was shipped. Upon running lucene on the command line, the following error will be presented:

    F:\Projects\lucenenet>lucene
    Unhandled exception. System.IO.FileNotFoundException: The configuration file 'appsettings.json' was not found and is not optional. The         physical path is 'C:\Users\shad\.dotnet\tools\.store\lucene-cli\4.8.0-beta00010\lucene-cli\4.8.0-beta00010\tools\netcoreapp3.1\any\appsettings.json'.
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.HandleException(ExceptionDispatchInfo info)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load()
    at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
    at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
    at Lucene.Net.Cli.Program.Main(String[] args) in D:\a\1\s\src\dotnet\tools\lucene-cli\Program.cs:line 27
    

    Adding a text file named appsettings.json to the location specified in the error message with opening and closing brackets will prevent the exception.

    appsettings.json

    {
    }
    

    IMPORTANT: There must be at least opening and closing curly brackets in the file, or it won't be parsed as valid JSON.

  • J2N versions prior to version 2.0.0-beta-0012 had an infinite recursion bug on Xamarin.Android which caused fatal crashes in Lucene.NET. Upgrading J2N to 2.0.0-beta-0012 or higher will prevent these crashes from occurring.

Benchmarks (from #310)

Index Files

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.630 (2004/?/20H1)
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.100
  [Host]          : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00005 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00006 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00007 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00008 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00009 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00010 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00011 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00012 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00013 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT

InvocationCount=1  IterationCount=15  LaunchCount=2  
UnrollFactor=1  WarmupCount=10  

| Method | Job | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| IndexFiles | 4.8.0-beta00005 | 628.1 ms | 8.41 ms | 12.05 ms | 43000.0000 | 8000.0000 | 7000.0000 | 220.82 MB |
| IndexFiles | 4.8.0-beta00006 | 628.3 ms | 13.19 ms | 19.33 ms | 44000.0000 | 8000.0000 | 7000.0000 | 220.67 MB |
| IndexFiles | 4.8.0-beta00007 | 617.2 ms | 8.44 ms | 11.83 ms | 44000.0000 | 8000.0000 | 7000.0000 | 220.73 MB |
| IndexFiles | 4.8.0-beta00008 | 620.6 ms | 5.62 ms | 8.41 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.06 MB |
| IndexFiles | 4.8.0-beta00009 | 632.8 ms | 12.57 ms | 18.43 ms | 44000.0000 | 8000.0000 | 7000.0000 | 220.95 MB |
| IndexFiles | 4.8.0-beta00010 | 862.3 ms | 51.13 ms | 74.95 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.22 MB |
| IndexFiles | 4.8.0-beta00011 | 636.5 ms | 11.06 ms | 15.87 ms | 44000.0000 | 8000.0000 | 7000.0000 | 221.09 MB |
| IndexFiles | 4.8.0-beta00012 | 668.8 ms | 14.78 ms | 21.66 ms | 56000.0000 | 7000.0000 | 6000.0000 | 286.63 MB |
| IndexFiles | 4.8.0-beta00013 | 626.7 ms | 7.78 ms | 10.91 ms | 43000.0000 | 8000.0000 | 7000.0000 | 219.8 MB |

Search Files

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.630 (2004/?/20H1)
Intel Core i7-8850H CPU 2.60GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.100
  [Host]          : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00005 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00006 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00007 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00008 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00009 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00010 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00011 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00012 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT
  4.8.0-beta00013 : .NET Core 3.1.9 (CoreCLR 4.700.20.47201, CoreFX 4.700.20.47203), X64 RyuJIT

IterationCount=15  LaunchCount=2  WarmupCount=10  

| Method | Job | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|
| SearchFiles | 4.8.0-beta00005 | 274.8 ms | 7.01 ms | 10.28 ms | 18000.0000 | 1000.0000 | - | 82.12 MB |
| SearchFiles | 4.8.0-beta00006 | 283.4 ms | 7.78 ms | 11.64 ms | 18000.0000 | 1000.0000 | - | 82.13 MB |
| SearchFiles | 4.8.0-beta00007 | 291.5 ms | 8.91 ms | 13.33 ms | 18000.0000 | 1000.0000 | - | 81.9 MB |
| SearchFiles | 4.8.0-beta00008 | 162.3 ms | 5.50 ms | 8.23 ms | 17000.0000 | 1000.0000 | - | 80.13 MB |
| SearchFiles | 4.8.0-beta00009 | 165.6 ms | 2.61 ms | 3.90 ms | 17000.0000 | - | - | 80.13 MB |
| SearchFiles | 4.8.0-beta00010 | 159.4 ms | 2.84 ms | 4.17 ms | 17000.0000 | 1000.0000 | - | 79.85 MB |
| SearchFiles | 4.8.0-beta00011 | 160.8 ms | 1.93 ms | 2.77 ms | 17000.0000 | 1000.0000 | - | 79.85 MB |
| SearchFiles | 4.8.0-beta00012 | 169.2 ms | 6.48 ms | 9.49 ms | 18000.0000 | 1000.0000 | - | 81.11 MB |
| SearchFiles | 4.8.0-beta00013 | 161.6 ms | 3.28 ms | 4.80 ms | 14000.0000 | 1000.0000 | - | 65.78 MB |

Change Log

Breaking Changes

  • Lucene.Net.Search.FieldCache: Added interface ICreationPlaceholder and changed CreationPlaceholder class to CreationPlaceHolder<TValue>.

Bugs

  • #356 - Lucene.Net.Store.NativeFSLockFactory: Modified options to allow read access on non-Windows operating systems. Without read access, the copy constructor of RAMDirectory threw "The process cannot access the file 'file path' because it is being used by another process" exceptions.
  • #296 - Lucene.Net.Util.Automaton.State: Removed Equals() implementation; it was intended to use reference equality as a unique key. This caused random IndexOperationExceptions to occur when using FuzzyTermsEnum/FuzzyQuery.
  • #387 - Fixed formatting in ArgumentException message for all analyzer factories so it will display the dictionary contents
  • #387 - Lucene.Net.Util.ExceptionExtensions.GetSuppressedAsList(): Use J2N.Collections.Generic.List<T> so the call to ToString() will automatically list the exception messages
  • #387 - Lucene.Net.TestFramework.Analysis.MockTokenizer: Pass the AttributeFactory argument that is provided as per the documentation comment. Note this bug exists in Lucene 4.8.0, also.
  • #387 - Lucene.Net.Analysis.Common.Tartarus.Snowball.Among: Fixed MethodObject property to return private field instead of itself
  • #387 - Lucene.Net.Document.CompressionTools: Pass the offset and length to the underlying MemoryStream
  • #388 - Downgraded minimum required Microsoft.Extensions.Configuration version to 2.0.0 on .NET Standard 2.0 and 2.1

Improvements

  • Updated code examples on website home page
    1. Show cross-OS examples of building Directory paths
    2. Demonstrate where to put using statements
    3. Removed LinqPad's Dump() method and replaced with Console.WriteLine() for clarity
    4. Fixed syntax error in initialization example of MultiPhraseQuery
  • Upgraded NuGet dependency J2N to 2.0.0-beta-0010
  • Upgraded NuGet dependency ICU4N to 60.1.0-alpha.353
  • Upgraded NuGet dependency Morfologik.Stemming to 2.1.7-beta-0001
  • #344 - PERFORMANCE: Lucene.Net.Search.FieldCacheImpl: Removed unnecessary dictionary lookup
  • #352 - Added Azure DevOps tests for x86 on all platforms
  • #348 - PERFORMANCE: Reduced FieldCacheImpl casting/boxing
  • #355 - Setup nightly build (https://dev.azure.com/lucene-net/Lucene.NET/_build?definitionId=4)
  • PERFORMANCE: Lucene.Net.Util.Automaton.SortedInt32Set: Removed unnecessary IEquatable<T> implementations and converted FrozenInt32Set into a struct.
  • PERFORMANCE: Lucene.Net.Util.Bits: Removed unnecessary GetHashCode() method from MatchAllBits and MatchNoBits (didn't exist in Lucene)
  • Lucene.Net.Util.Counter: Changed Get() to Value property and added implicit operator.
  • #361 - Made CreateDirectory() method virtual so that derived classes can provide their own Directory implementation, allowing for benchmarking of custom Directory providers (e.g. LiteDB)
  • #346, #383 - PERFORMANCE: Change delegate overloads of Debugging.Assert() to use generic parameters and string.Format() to reduce allocations. Use J2N.Text.StringFormatter to automatically format arrays and collections so the processing of converting it to a string is deferred until an assert fails.
  • #296 - PERFORMANCE: Lucene.Net.Index: Calling IndexOptions.CompareTo() causes boxing. Added new IndexOptionsComparer class to be used in codecs instead.
  • #387 - Fixed or Suppressed Code Analysis Rules
    • CA1012: Abstract types should not have constructors
    • CA1052: Static holder types should be Static or NotInheritable
    • CA1063: Implement IDisposable Properly (except for IndexWriter). Partially addresses #265.
    • CA1507: Use nameof instead of string (#366)
    • CA1802: Use Literals Where Appropriate
    • CA1810: Initialize reference type static fields inline
    • CA1815: Override equals and operator equals on value types
    • CA1819: Properties should not return arrays
    • CA1820: Test for empty strings using string length
    • CA1822: Mark members as static
    • CA1825: Avoid zero-length array allocations
    • CA2213: Disposable fields should be disposed (except for IndexWriter and subclasses which need more work)
    • IDE0016: use throw expression (#368)
    • IDE0018: Inline variable declaration
    • IDE0019: Use pattern matching to avoid 'as' followed by a 'null' check
    • IDE0020: Use pattern matching to avoid 'is' check followed by a cast
    • IDE0021: Use block body for constructors
    • IDE0025: Use expression body for properties
    • IDE0027: Use expression body for accessors
    • IDE0028: Use collection initializers
    • IDE0029: Use coalesce expression
    • IDE0030: Use coalesce expression (nullable)
    • IDE0031: Use null propagation
    • IDE0034: Simplify 'default' expression
    • IDE0038: Use pattern matching to avoid 'is' check followed by a cast
    • IDE0039: Use local function
    • IDE0040: Add accessibility modifiers
    • IDE0041: Use is null check
    • IDE0049: Use language keywords instead of framework type names for type references
    • IDE0051: Remove unused private member
    • IDE0052: Remove unread private member
    • IDE0059: Remove unnecessary value assignment
    • IDE0060: Remove unused parameter
    • IDE0063: Use simple 'using' statement
    • IDE0071: Simplify interpolation
    • IDE1005: Use conditional delegate call
    • IDE1006: Naming Styles
  • #387 - Removed dead code/commented code
  • #387 - PERFORMANCE: Added aggressive inlining in Codecs and Util namespaces
  • #387 - Simplified reuse logic of TermsEnum subclasses
  • #387 - PERFORMANCE: Lucene.Net.Index.DocValuesProducer: Optimized checks in AddXXXField() methods
  • #387 - PERFORMANCE: Lucene.Net.Index: Changed FieldInfos, FreqProxTermsWriterPerField, IndexWriter, LogMergePolicy, SegmentCoreReaders, and SegmentReader to take advantage of the fact that TryGetValue() returns a boolean
  • #370, #389 - Reverted FieldCacheImpl delegate capture introduced in #348
  • #390 - Added tests for .NET 5
  • #390 - Upgraded to C# LangVersion 9.0

New Features

  • #358 - Added Community Links page to website
  • #359 - Added builds mailing list to website
  • #365 - Added "Fork me on GitHub" to website and API docs
  • Lucene.Net.TestFramework: Added Assert.DoesNotThrow() overloads

Lucene.Net_4_8_0_beta00012

3 years ago

This release contains important bug fixes and performance enhancements.

Known Issues

  • Upgrading from Lucene.Net 4.8.0-beta00009 or higher may require restarting all instances of Visual Studio after installation in order to reload the code analysis analyzer.

  • The lucene-cli tool requires an appsettings.json file, but none was shipped. Upon running lucene on the command line, the following error will be presented:

    F:\Projects\lucenenet>lucene
    Unhandled exception. System.IO.FileNotFoundException: The configuration file 'appsettings.json' was not found and is not optional. The         physical path is 'C:\Users\shad\.dotnet\tools\.store\lucene-cli\4.8.0-beta00010\lucene-cli\4.8.0-beta00010\tools\netcoreapp3.1\any\appsettings.json'.
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.HandleException(ExceptionDispatchInfo info)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load()
    at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
    at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
    at Lucene.Net.Cli.Program.Main(String[] args) in D:\a\1\s\src\dotnet\tools\lucene-cli\Program.cs:line 27
    

    Adding a text file named appsettings.json to the location specified in the error message with opening and closing brackets will prevent the exception.

    appsettings.json

    {
    }
    

    IMPORTANT: There must be at least opening and closing curly brackets in the file, or it won't be parsed as valid JSON.

  • J2N versions prior to version 2.0.0-beta-0012 had an infinite recursion bug on Xamarin.Android which caused fatal crashes in Lucene.NET. Upgrading J2N to 2.0.0-beta-0012 or higher will prevent these crashes from occurring.

Change Log

Breaking Changes

  • Lucene.Net.Facet: Renamed LRUHashMap > LruDictionary. Changed all members to be virtual to allow users to provide their own LRU cache.
  • Lucene.Net.Facet.FacetsConfig: Removed ProcessSSDVFacetFields from public API (as was done in Lucene), avoid lock (this)
  • Lucene.Net.Facet.TaxonomyReader: Changed DoClose() to Dispose(bool) and implemented proper dispose pattern. Avoid lock (this).
  • Lucene.Net.Facet.WriterCache: Renamed NameInt32CacheLRU > NameIntCacheLru, NameHashInt32CacheLRU > NameHashInt32CacheLru. Refactored to utilize a generic type internally using composition to avoid boxing/unboxing without exposing the generic closing type publicly. Added public INameInt32CacheLru as a common interface between NameIntCacheLru and NameHashInt32CacheLru.
  • Lucene.Net.Facet.Taxonomy.TaxonomyReader: Restructured ChildrenIterator into ChildrenEnumerator
  • Lucene.Net.Facet.Taxonomy.CategoryPath: Changed FullPathLength from a method to a property
  • Lucene.Net.Facet.DrillSideways: Changed ScoreSubDocsAtOnce from a method to a property
  • Lucene.Net.Facet: Refactored OrdAndValue into a generic struct that can be used in both TopOrdAndSingleQueue and TopOrdAndInt32Queue. Added Insert method to Util.PriorityQueue<T> to allow adding value types without reading the previous value for reuse.
  • Lucene.Net.Analysis.Common.Miscellaneous.CapitalizationFilter: Changed default behavior to use invariant culture instead of the current thread's culture to match Lucene, which seems more natural when using filters inside of analyzers. This also fits more in line with how other filters are selected.
  • #279 - Lucene.Net.Analysis.Compound.Hyphenation.TernaryTree: Renamed Iterator > Enumerator, Keys() > GetEnumerator()
  • #279 - Lucene.Net.Benchmarks.ByTask.Feeds.DirContentSource: Renamed Iterator > Enumerator

Bugs

  • #269 - Removed cast from NGramTokenAnymousInnerClassHelper::IsTokenChar(int) that was causing surrogate pairs to fail in the TestUTF8FullRange() tests of NGramTokenizerTest and EdgeNGramTokenizerTest
  • Fixed potential issue with ArgumentExceptions being thrown from char.ConvertToUtf32(string, int) by reverting back to CodePointAt() method in TestCharTokenizers.TestCrossPlaneNomalization().
  • Lucene.Net.QueryParser.Surround.Query.ComposedQuery::MakeLuceneSubQueriesField(): Added missing using block on enumerator
  • #296 - Fixed surrogate pair and culture-sensitivity issues with many analyzers.
  • Lucene.Net.Analysis.Common: Fixed classes that were originally using invariant culture to do so again. J2N's Character class default is to use the current culture, which had changed from the prior Character class from Lucene.Net.Support that used invariant culture. Fixes TestICUFoldingFilter::TestRandomStrings().
  • Lucene.Net.ICU: Fixed ThaiWordBreaker to account for surrogate pairs. Also added locking to help with thread safety. Note that the class is still not completely thread-safe, but this patch fixes the behavior.
  • Lucene.Net.Spatial.Util.ShapeFieldCache: Removed unnecessary array allocation
  • Lucene.Net.TestFramework: Fixed LineFileDocs to read byte by byte the same way that Lucene does, except using a BufferedStream to improve performance.
  • Lucene.Net.TestFramework: Fixed NightlyAttribute, WeeklyAttribute, AwaitsFixAttribute, and SlowAttribute so they work at the class level
  • Lucene.Net.Analysis.Icu.Segmentation.ICUTokenizer: Corrected call to ICU4N.UChar.IsWhiteSpace() rather than System.Char.IsWhiteSpace(), which may return different results.
  • Lucene.Net.TestFramework.Search.SearchEquivalenceTestBase: Fixed exception when using OpenBitSet.FastGet() instead of OpenBitSet.Get(), since the size of the bit set is unknown.
  • Lucene.Net.Index.DocumentsWriterFlushControl: Fixed issue caused by misbehaving locking on Monitor.TryEnter(). The code was restructured so that any thread that doesn't hold a lock is kept out of InternalTryCheckoutForFlush(), so the threads do not compete for a lock.
  • #274 - Lucene.Net.Facet: Fixed null reference exception in DrillSidewaysScorer from patch in Lucene 4.10.4 https://issues.apache.org/jira/browse/LUCENE-6001
  • Lucene.Net.Facet.Taxonomy.WriterCache.Cl2oTaxonomyWriterCache: Fixed locking on Dispose() method and made it safe to call dispose multiple times
  • Reviewed and added asserts that existed in Lucene and were missing in Lucene.NET. Effectively, this meant we were missing several test conditions that have now been put into place.
  • Lucene.Net.ICU: Added locking to ICUTokenizer to only allow a single thread to manipulate the BreakIterator at a time. This is a temporary fix to get the tests to pass until a solution is found for making BreakIterator threadsafe.
  • #332 - Lucene.Net.Replicator: Fixed an issue in IndexInputStream where the read method could return a number larger than the requested read count or the size of the buffer. It now returns the total number of bytes read into the buffer, which logically can't be bigger than the buffer itself.
  • Lucene.Net.Tests.Index.TestIndexWithThreads::TestRollbackAndCommitWithThreads(): Must catch and ignore AssertionException, as was done in Lucene
  • Lucene.Net.Search.TopScoreDocCollector: Disabled optimizations on .NET Framework because of float comparison failures on x86 in Release mode. Fixes TestSearchAfter::TestQueries(), TestTopDocsMerge::TestSort_1(), TestTopDocsMerge::TestSort_2().
  • Lucene.Net.Sandbox.Queries.SlowFuzzyTermsEnum: Disabled optimizations on .NET Framework because of float comparison failures on x86 in Release mode. Fixes TestTokenLengthOpt().
  • Lucene.Net.Search.FuzzyTermsEnum: Disabled optimizations for Accept() method on .NET Framework because of float comparison failures on x86 in Release mode. Fixes TestTokenLengthOpt().
  • Fixed several references to J2N.BitConversion that were calling the overload that normalizes NaN when they should have been calling the raw bit conversion instead (as was done in Lucene).
  • #323 - Lucene.Net.Configuration: Removed the IConfigurationRoot interface from the ConfigurationRoot class when targeting a version of Microsoft.Extensions.Configuration less than 2.0. This will allow the end user to upgrade Microsoft.Extensions.Configuration seamlessly to versions 2.0 or higher.
  • #286 - Lucene.Net.CodeAnalysis: Separated CSharp and VisualBasic into different assemblies to prevent cross-language dependency issues when using analyzers

Improvements

  • #261 PERFORMANCE - Fixed FSTTester to delete while iterating forward instead of using .ElementAt() to iterate in reverse, which takes about 3x longer
  • #261 PERFORMANCE - Lucene.Net.Facet.Taxonomy.WriterCache.NameInt32CacheLRU: Changed from Dictionary to ConcurrentDictionary so we can delete items from the cache while forward iterating through it.
  • #261 PERFORMANCE - Lucene.Net.Index.FieldInfos: Changed Builder.FieldInfo() method to TryGetFieldInfo() to optimize check for value
  • Upgraded NuGet dependency J2N to 2.0.0-beta-0009
  • Upgraded NuGet dependency ICU4N to 60.1.0-alpha.352
  • Upgraded NuGet dependency Morfologik.Stemming to 2.1.6-beta-0007
  • #261 PERFORMANCE - Use J2N's ICollection<T>.ToArray() extension method that uses ICollection<T>.CopyTo(), which takes precedence over the LINQ IEnumerable<T>.ToArray() extension method. Benchmarks show about a 1/3 increase in performance.
  • #261 PERFORMANCE - Lucene.Net.Support.IO.FileSupport::CreateTempFile(): Optimized the check for invalid characters
  • Directory.Build.props: Disabled warnings for features that require .NET Standard 2.1
  • #261 PERFORMANCE - Eliminated several calls to FirstOrDefault(), LastOrDefault(), Skip(), First(), and Last()
  • Lucene.Net.Support.ListExtensions: Factored out BinarySearch in favor of implementation from J2N
  • Lucene.Net.Suggest.FreeTextSuggester: Converted from SubList().Clear() to RemoveRange()
  • #261 PERFORMANCE - Changed handling of LineFileDocs to unzip the file to a temp directory once per test run instead of using a MemoryStream to pick a random line from the file on each test. This significantly improves performance of many of the tests.
  • Lucene.Net.Analysis.Icu.Segmentation.ScriptIterator: Removed static constructor and initialized static state inline
  • Converted all explicit Analyzer classes to using the Analyzer.NewAnonymous() method to declare Analyzers inline.
  • #261 PERFORMANCE - Lucene.Net.Tests.Util.TestCollectionUtil: Optimized by using array instead of list for sorting tests
  • #261 PERFORMANCE - Lucene.Net.Util: Switched implementation of DisposableThreadLocal with that from RavenDB, with permission from its maintainers (https://issues.apache.org/jira/browse/LUCENENET-640?focusedCommentId=17033146&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17033146). The new implementation improves GC during several operations.
  • Lucene.Net.TestFramework.Util.LuceneTestCase: Removed TaskMergeScheduler completely from random testing
  • #261 PERFORMANCE - Moved scratch BytesRef instances outside of the loops that they were nested in so they can be reused in each iteration (as was done in Lucene)
  • #261 PERFORMANCE - Lucene.Net.Facet.Taxonomy.CachedOrdinalsReader: Refactored locking to make reads more efficient
  • #261 PERFORMANCE - Lucene.Net.Facet.Taxonomy.Directory.DirectoryTaxonomyReader: Refactored to use ReaderWriterLockSlim to make reads more efficient
  • #261 PERFORMANCE - Lucene.Net.Facet.Taxonomy.Directory.DirectoryTaxonomyWriter: Refactored locking for better efficiency
  • #265 - Lucene.Net.Facet.Taxonomy.WriterCache.Cl2oTaxonomyWriterCache: Added proper dispose pattern
  • #265 - Lucene.Net.Facet.Taxonomy.WriterCache.LruTaxonomyWriterCache: Added proper dispose pattern
  • Lucene.Net.Facet.Taxonomy.Directory.TaxonomyIndexArrays: Changed to use LazyInitializer and avoid lock (this)
  • #261 PERFORMANCE - Lucene.Net.Tests.Facet: Convert int to string in the invariant culture
  • Lucene.Net.Analysis.ICU: Updated Segmentation files to Lucene 8.6.1 to account for the latest features of ICU
  • #261 PERFORMANCE - Lucene.Net.Util.AttributeSource: Eliminated unnecessary try catch and made more efficient by using TryGetValue instead of ContainsKey followed by a lookup
  • Lucene.Net.Util: Streamlined DefaultAttributeFactory to make the get/update process of creating an attribute WeakReference atomic
  • #208 - Switch to simpler LIFO thread to ThreadState allocator during indexing. Technically, this is something from releases/lucene-solr/4.8.1, but profiling indicates it makes a huge difference in multithreaded scenarios
  • SWEEP - Removed unnecessary .NET Framework references from all test projects
  • Converted remaining compilation constants from target platforms to features to make it simpler to change targets. Eliminated references to NETSTANDARD.
  • Inverted logic so FEATURE_STACKTRACE is enabled rather than disabled when the System.Diagnostics.StackTrace class is available.
  • Lucene.Net.Constants: Refactored to use System.Runtime.InteropServices.RuntimeInformation on .NET Framework
  • Lucene.Net.Expressions: Eliminated .NET settings file and reused JavascriptCompiler.properties file in .NET Framework so we don't have to branch for different target platforms. Simplified reading the settings by using J2N PropertyExtensions.
  • #261 PERFORMANCE - Lucene.Net.Support.AssemblyUtils: restructured to use IEnumerable<T> for deferred execution
  • #279 - Lucene.Net.Index.Terms/TermsEnum, Lucene.Net.Suggest: Refactored iterators into enumerators. Deprecated the iterators.
  • #279 - Lucene.Net.Util.FilterIterator<T>: Converted to FilterEnumerator<T> using a predicate passed into the constructor rather than having to subclass. Deprecated FilterIterator<T>. Swapped only usage in FieldFilterAtomicReader with a LINQ query/yield return, since performance is better.
  • #279 - Lucene.Net.Util.MergedIterator<T>: Converted to MergedEnumerator<T> and deprecated MergedIterator<T>
  • #279 - Lucene.Net.Codecs.Memory.DirectDocValuesConsumer: Renamed IteratorAnonymousInnerClassHelper > Enumerator, IterableAnonymousInnerClassHelper > EnumerableAnonymousInnerClassHelper
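
Several of the performance items above (for example, the Lucene.Net.Util.AttributeSource change) replace a ContainsKey check followed by an indexer lookup with a single TryGetValue call. A minimal sketch of that pattern, using only the .NET base class library; the LookupDemo class and its method names are hypothetical and are not taken from the Lucene.NET source:

```csharp
using System.Collections.Generic;

public static class LookupDemo
{
    // Two hash lookups: one for ContainsKey, one for the indexer.
    public static int GetOrDefaultSlow(Dictionary<string, int> map, string key)
    {
        if (map.ContainsKey(key))
        {
            return map[key];
        }
        return -1;
    }

    // One hash lookup: TryGetValue reports presence and yields the value.
    public static int GetOrDefaultFast(Dictionary<string, int> map, string key)
    {
        return map.TryGetValue(key, out int value) ? value : -1;
    }
}
```

Both methods return the same results; the second simply avoids hashing the key twice on a hit.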

New Features

  • Added DeadlockAttribute to identify tests that are known to have threading contention issues and may deadlock during test runs
  • Added ability to turn asserts on in a Release build by using the system property "assert": "true". This is necessary to ensure all of the test conditions are being hit in all builds and to enable more thorough CheckIndex from the command line.
  • Lucene.Net.TestFramework: Added ability to turn off asserts when running tests by ignoring a few tests that require the asserts to be enabled in order to pass. This makes it possible to ensure that Lucene.NET works properly with asserts disabled. This feature didn't exist in Lucene.
  • Lucene.Net.Search.FieldCacheDocIdSet: Added public constructor with predicate parameter for filtering without having to create a subclass

Lucene.Net_4_8_0_beta00011

3 years ago

This release contains a critical patch for .NET Framework users that use Microsoft.Extensions.Configuration higher than version 1.1.2. See #311 for details.

This release contains impactful performance enhancements.

Known Issues

  • The lucene-cli tool requires an appsettings.json file, but none was shipped. Upon running lucene on the command line, the following error will be presented:

    F:\Projects\lucenenet>lucene
    Unhandled exception. System.IO.FileNotFoundException: The configuration file 'appsettings.json' was not found and is not optional. The physical path is 'C:\Users\shad\.dotnet\tools\.store\lucene-cli\4.8.0-beta00010\lucene-cli\4.8.0-beta00010\tools\netcoreapp3.1\any\appsettings.json'.
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.HandleException(ExceptionDispatchInfo info)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load()
    at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
    at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
    at Lucene.Net.Cli.Program.Main(String[] args) in D:\a\1\s\src\dotnet\tools\lucene-cli\Program.cs:line 27
    

    Adding a text file named appsettings.json to the location specified in the error message with opening and closing brackets will prevent the exception.

    appsettings.json

    {
    }
    

    IMPORTANT: There must be at least opening and closing curly brackets in the file, or it won't be parsed as valid JSON.

  • J2N versions prior to version 2.0.0-beta-0012 had an infinite recursion bug on Xamarin.Android which caused fatal crashes in Lucene.NET. Upgrading J2N to 2.0.0-beta-0012 or higher will prevent these crashes from occurring.

Change Log

Breaking Changes

  • PERFORMANCE: Lucene.Net.Analysis.Compound: Changed protected m_tokens field from LinkedList<T> to Queue<T> for better throughput

Bugs

  • #311 - Lucene.Net.Configuration: Removed IConfigurationBuilder implementation that prevents .NET Framework users from being able to upgrade to a higher version than 1.1.2.

Improvements

  • PERFORMANCE: Lucene.Net.TestFramework: Reverted BaseTermVectorsFormatTestCase to use the original 5000 iterations instead of 500. Reverted TestUtil.RandomSimpleString(Random) to default to a maximum string length of 10 instead of 20, which was slowing down several tests.
  • PERFORMANCE: Lucene.Net.TestFramework: Refactored Assert class to use custom comparisons for all members, since NUnit's Assert implementation uses very slow fluent expressions to do comparisons, which are not practical to use inside of tight loops.
  • PERFORMANCE: Replaced LinkedList<T> with Queue<T> where there is a performance advantage
  • PERFORMANCE: Reduced memory allocations of CaseInsensitiveComparer by using its singleton static property
  • PERFORMANCE: Lucene.Net.Util.RamUsageEstimator: Switched back to System.Collections.Generic.Dictionary because indexer of J2N's dictionary is slower
  • PERFORMANCE: Lucene.Net.TestFramework.Util.Fst.FSTTester: Use System.Collections.Generic.Dictionary for better performance
  • PERFORMANCE: Switched all remaining tests from using NUnit.Framework.Assert to Lucene.Net.TestFramework.Assert, whose methods are several orders of magnitude faster
  • PERFORMANCE: Lucene.Net.Facet: Optimized DirectoryTaxonomyReader by reducing locking, removing unnecessary casts, and using LazyInitializer for the taxonomy array initialization
  • PERFORMANCE: Lucene.Net.Tests.Analysis.Common: Changed Hunspell StemmerTestBase to use more optimized assert to compare arrays
  • SWEEP: Consolidated empty array creation code so it is more DRY
  • Upgraded to C# LangVersion 8.0
  • Lucene.Net.Tests.Support.TestApiConsistency: Added regex filter to exclude public fields from the scan
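
For subclasses affected by the m_tokens change listed under Breaking Changes, the LinkedList<T> calls map onto Queue<T> roughly as shown here. This sketch uses only System.Collections.Generic; TokenBufferDemo and its methods are hypothetical stand-ins, not code from Lucene.Net.Analysis.Compound:

```csharp
using System.Collections.Generic;

public static class TokenBufferDemo
{
    // Before: FIFO buffering with LinkedList<T> (AddLast / First / RemoveFirst).
    public static string DrainLinkedList(IEnumerable<string> tokens)
    {
        var buffer = new LinkedList<string>();
        foreach (var t in tokens) buffer.AddLast(t);

        var parts = new List<string>();
        while (buffer.Count > 0)
        {
            parts.Add(buffer.First.Value);
            buffer.RemoveFirst();
        }
        return string.Join(",", parts);
    }

    // After: the same FIFO order with Queue<T> (Enqueue / Dequeue).
    public static string DrainQueue(IEnumerable<string> tokens)
    {
        var buffer = new Queue<string>();
        foreach (var t in tokens) buffer.Enqueue(t);

        var parts = new List<string>();
        while (buffer.Count > 0) parts.Add(buffer.Dequeue());
        return string.Join(",", parts);
    }
}
```

Both methods emit tokens in insertion order, so code that only appends at the tail and removes from the head should behave the same after the switch.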

Lucene.Net_4_8_0_beta00010

3 years ago

This release contains impactful performance enhancements.

Known Issues

  • The lucene-cli tool requires an appsettings.json file, but none was shipped. Upon running lucene on the command line, the following error will be presented:

    F:\Projects\lucenenet>lucene
    Unhandled exception. System.IO.FileNotFoundException: The configuration file 'appsettings.json' was not found and is not optional. The physical path is 'C:\Users\shad\.dotnet\tools\.store\lucene-cli\4.8.0-beta00010\lucene-cli\4.8.0-beta00010\tools\netcoreapp3.1\any\appsettings.json'.
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.HandleException(ExceptionDispatchInfo info)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load()
    at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
    at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
    at Lucene.Net.Cli.Program.Main(String[] args) in D:\a\1\s\src\dotnet\tools\lucene-cli\Program.cs:line 27
    

    Adding a text file named appsettings.json to the location specified in the error message with opening and closing brackets will prevent the exception.

    appsettings.json

    {
    }
    

    IMPORTANT: There must be at least opening and closing curly brackets in the file, or it won't be parsed as valid JSON.

  • J2N versions prior to version 2.0.0-beta-0012 had an infinite recursion bug on Xamarin.Android which caused fatal crashes in Lucene.NET. Upgrading J2N to 2.0.0-beta-0012 or higher will prevent these crashes from occurring.

Change Log

Breaking Changes

  • PERFORMANCE: Lucene.Net.Suggest: Narrowed return type of Contexts property from IEnumerable<BytesRef> to ICollection<BytesRef> to improve performance for certain operations
  • Lucene.Net.TestFramework.Util.LuceneTestCase: Refactored to correctly load static fields and converted them into static properties. The names were changed to match the conventions of .NET properties.
  • #261 PERFORMANCE: Changed to use BitSet instead of BitArray so we don't have to resort to slow extension methods for certain operations

Bugs

  • Lucene.Net.Support.Collections::ToString(): Fixed overloads to write "null" when the collection passed is null rather than throw an exception
  • #267 - Lucene.Net.Codecs: Fixed testing condition for BaseTermVectorsFormatTestCase on TermVectorsReaders by throwing InvalidOperationException
  • #301 - After upgrading NuGet package dependencies for NUnit to 3.12.0, NUnit3TestAdapter to 3.16.1, and Microsoft.NET.Test.Sdk to 16.6.1, several false positives were noted due to invalid try-catch logic in tests

Improvements

  • Added missing documentation table of contents links for lucene-cli, Lucene.Net.Demo, Lucene.Net.Queries, Lucene.Net.QueryParser, Lucene.Net.Replicator, Lucene.Net.Sandbox, Lucene.Net.Spatial, and Lucene.Net.Suggest.
  • #261 PERFORMANCE: - Removed all calls to Type.GetTypeInfo() and call properties/methods on the Type instance directly.
  • #261 PERFORMANCE: - Lucene.Net.Util.AttributeSource: Create built-in attributes directly rather than using Activator.CreateInstance()
  • #261 PERFORMANCE: - Lucene.Net.Util.AttributeSource: Optimized creation of string to identify attribute type based on attribute interface name
  • #295 PERFORMANCE: - Lucene.Net.TestFramework: Added overloads of Assert.AreEqual for collections where J2N's aggressive mode can be switched off
  • #295 PERFORMANCE: - Lucene.Net.TestFramework: Replaced overloads from NUnit.Framework.CollectionAssert with optimized implementations using J2N comparers
  • #295 PERFORMANCE: - Lucene.Net.TestFramework: Compile expensive string concatenation out of the release build by using System.Diagnostics.Debug.Assert
  • #295 PERFORMANCE: - Lucene.Net.Util.Automaton: Fixed State class to initialize and trim more efficiently. Fixes the performance of Lucene.Net.Util.Automaton.TestBasicOperations::TestEmptyLanguageConcatenate().
  • #295, #261: PERFORMANCE: - Reduced zero length array allocations
  • #295, #261: PERFORMANCE: - Reduced zero length collection allocations
  • #295, #261: PERFORMANCE: - Lucene.Net.TestFramework.Search.RandomSimilarityProvider::ToString(): Use StringBuilder for better efficiency
  • #295, #261: PERFORMANCE: - Lucene.Net.TestFramework.Util.LuceneTestCase: Cache codecType and similarityName as strings so they don't have to be regenerated for each test
  • #295, #261: PERFORMANCE: - Lucene.Net.TestFramework: Use Assert.IsFalse() rather than Assert.That()
  • #295, #261: PERFORMANCE: - Reduced calls to LINQ methods and expressions.
  • #295, #261: PERFORMANCE: - Lucene.Net.Index.Term: Optimized equality checking
  • #295, #261: PERFORMANCE: - Lucene.Net.Util (BytesRef + CharsRef): Implemented IEquatable<T>
  • #295, #261: PERFORMANCE: - Lucene.Net.Tests.Index.TestDocumentsWriterDeleteQueue: Updated comparisons to reduce memory allocations
  • #295, #261: PERFORMANCE: - Lucene.Net.TestFramework: Changed ConcurrentMergeSchedulerFactories.Values to only return the TaskMergeScheduler rarely, since it is no longer a default setting and is slowing down tests
  • #295, #261: PERFORMANCE: - Added some aggressive inlining
  • Lucene.Net.TestFramework: Added overloads of Assert.Throws to supply messages
  • Lucene.Net.TestFramework: Use the Microsoft.Extensions.Configuration.EnvironmentVariables provider instead of our custom one.
  • Lucene.Net.Analysis.Tokenizer: Allow enabling "asserts" for testing
  • azure-pipelines.yml: Changed to use pipeline caching instead of build caching for better performance
  • Lucene.Net.Tests (core) - Rearranged how test projects are split so parallel jobs can process them faster
  • Fixed up leading whitespace to always use spaces instead of tabs
  • Updated NuGet package dependency for J2N to 2.0.0-beta-0008
  • Lucene.Net.Tests.Analysis.Phonetic.Language.Bm.PhoneticEnginePeformanceTest::Test(): Changed to use Stopwatch for more accurate timing
  • Lucene.Net.Tests: Changed private/internal member variables to consistently use camelCase instead of PascalCase
  • SWEEP: Changed all properties to use expression style syntax, reordered to put get before set, and changed all backing field names back to their original without the "_Renamed" suffix
  • SWEEP: Removed fully-qualified exceptions and added using directives instead
  • PERFORMANCE: Pre-compile and statically cache regular expressions
  • PERFORMANCE: Lucene.Net.Analysis: Removed unnecessary allocations caused by calling ToString() rather than passing the ICharTermAttribute directly
  • PERFORMANCE: Lucene.Net.Analysis.Common: Slight optimization of ToUpper and ToLower methods, which are used in several analyzers
  • SWEEP: Removed .NET Standard 1.x/.NET Core 1.x support from all project files
  • SWEEP: Removed unused dependencies for .NET Framework
  • PERFORMANCE: Lucene.Net.Codecs.SimpleText: Using decimal is 30% faster than using BigInteger for addition and subtraction
  • PERFORMANCE: Applied Slow and Nightly attributes where applicable for better testing performance
  • Lucene.Net.TestFramework: Added logging of system properties to the output of tests
  • PERFORMANCE: Lucene.Net.Util.ArrayUtil::GetNaturalComparer<T>(): Use statically cached singleton instance of comparer
  • Upgraded NuGet dependency ICU4N to 60.1.0-alpha.351

New Features

  • Lucene.Net.TestFramework.RandomExtensions: Added missing overload of NextInt64(long) to choose only max upper bound
  • Added a global Lucene.Net.Diagnostics.Debugging.AssertsEnabled static property that can be used to toggle "asserts" on and off in the release build, similar to how it works in Java. The setting can be injected by end users with the "assert" system property (which is a boolean).
  • Lucene.Net.TestFramework: Completed implementation of Nightly, Weekly, AwaitsFix and Slow attributes
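
As a rough illustration of the AssertsEnabled property described above, the following sketch turns asserts on in a release build from application code rather than through the "assert" system property. It assumes the Lucene.Net package is referenced and that the property can be assigned at runtime, as the description suggests; the AssertsSetup class is hypothetical.

```csharp
using Lucene.Net.Diagnostics;

public static class AssertsSetup
{
    public static void EnableLuceneAsserts()
    {
        // Assumption: the property is publicly settable; call this once,
        // early in application startup, so later Lucene.NET calls see it.
        Debugging.AssertsEnabled = true;
    }
}
```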

Lucene.Net_4_8_0_beta00009

3 years ago

Known Issues

  • The lucene-cli tool requires an appsettings.json file, but none was shipped. Upon running lucene on the command line, the following error will be presented:

    F:\Projects\lucenenet>lucene
    Unhandled exception. System.IO.FileNotFoundException: The configuration file 'appsettings.json' was not found and is not optional. The physical path is 'C:\Users\shad\.dotnet\tools\.store\lucene-cli\4.8.0-beta00010\lucene-cli\4.8.0-beta00010\tools\netcoreapp3.1\any\appsettings.json'.
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.HandleException(ExceptionDispatchInfo info)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load(Boolean reload)
    at Microsoft.Extensions.Configuration.FileConfigurationProvider.Load()
    at Microsoft.Extensions.Configuration.ConfigurationRoot..ctor(IList`1 providers)
    at Microsoft.Extensions.Configuration.ConfigurationBuilder.Build()
    at Lucene.Net.Cli.Program.Main(String[] args) in D:\a\1\s\src\dotnet\tools\lucene-cli\Program.cs:line 27
    

    Adding a text file named appsettings.json to the location specified in the error message with opening and closing brackets will prevent the exception.

    appsettings.json

    {
    }
    

    IMPORTANT: There must be at least opening and closing curly brackets in the file, or it won't be parsed as valid JSON.

  • J2N versions prior to version 2.0.0-beta-0012 had an infinite recursion bug on Xamarin.Android which caused fatal crashes in Lucene.NET. Upgrading J2N to 2.0.0-beta-0012 or higher will prevent these crashes from occurring.

Change Log

Bugs

  • #294 - Fixed the description for Lucene.NET 2.9.4 in the API documentation.

Improvements

  • #260 - Changed culture-sensitivity of function-based queries to use the invariant culture when parsing/formatting values, to match the Lucene implementation.
  • #266 - Added documentation to show example usage of injecting custom codecs, doc values formats, and postings formats with pure DI or using a dependency injection container. A demo project named Lucene.Net.Tests.TestFramework.DependencyInjection has also been provided as an example of using Microsoft.Extensions.DependencyInjection to provide custom codecs to the test environment so they can be used during testing of other components.
  • #295 - PERFORMANCE Fixed boxing issue with NUnit asserts when passing common primitive types. This significantly improves the performance of running tests.
  • Upgraded to ICU4N 60.1.0-alpha.350 and refactored the collation features of Lucene.Net.ICU to utilize the new UCultureInfo class.
  • #282 - Restructured API documentation so each assembly has its own "mini-site" to fix x-ref linking issue, which fixes the TOC and breadcrumbs.
  • #282 - Documented the docs and website building procedures.
  • #282 - Fixed links on the download page and added missing packages (https://lucenenet.apache.org/download/version-4.8.0-beta00009.html).

New Features

  • #254 - Implementation of "System Properties" which allow users to supply configuration values to the test framework and lucene-cli using configuration files or environment variables.

Lucene.Net_4_8_0_beta00008

4 years ago

This release contains performance improvements and bug fixes.

The Lucene.Net.Support namespace has been phased out, and most of the components that previously existed there have been refactored and moved to J2N, a library to fill in gaps in functionality between the JDK and .NET.

Known Issues

  • J2N versions prior to version 2.0.0-beta-0012 had an infinite recursion bug on Xamarin.Android which caused fatal crashes in Lucene.NET. Upgrading J2N to 2.0.0-beta-0012 or higher will prevent these crashes from occurring.

Change Log

Breaking Changes

  • #239 - Changed Append() overloads of OpenStringBuilder to use startIndex/count rather than start/end to match the .NET convention

  • #241 - Made public fields into properties in ArabicStemmer

  • Lucene.Net.Collections: Factored out unmodifiable methods and related classes in favor of J2N's AsReadOnly() extension methods

  • SWEEP: Lucene.Net.Support: Factored out Equatable, EquatableList, and EquatableSet, and replaced with collections from J2N

  • SWEEP: Lucene.Net.Support: Factored out Collections.Equals() and Collections.GetHashCode()

  • Lucene.Net.Support.Collections: Factored out Swap() and Shuffle() in favor of J2N's implementation

  • Lucene.Net.Support: Factored out IdentityComparer, IdentityHashMap, and IdentityHashSet and used J2N.Runtime.CompilerServices.IdentityEqualityComparer in conjunction with standard Dictionary and HashSet types

  • Factored out TreeSet and TreeDictionary from Lucene.Net.Support in favor of J2N.Collections.Generic.SortedSet and J2N.Collections.Generic.SortedDictionary

  • Lucene.Net.Support.PriorityQueue: Factored out in favor of J2N's implementation

  • Lucene.Net.Support: Refactored ConcurrentHashSet into ConcurrentSet - a wrapper class that can be used to synchronize any set object (ordered or not), similar to how it was done in Java. Changed ordered concurrent set types back to the original type from Lucene.

  • Lucene.Net.Search.FieldComparer: Replaced CompareTo() calls with JCG.Comparer<T>.Default.Compare(), factoring out Lucene.Net.Support.SignedZeroComparer in the process.

  • Factored out Character, ICharSequence, StringBuilderCharSequenceWrapper, StringBuilderExtensions, StringCharSequenceWrapper, and most StringExtensions methods in favor of J2N's implementation

  • Changed semantics of CharTermAttribute.Append() overloads to act more like the .NET StringBuilder class.

    1. The 3rd parameter was changed from an exclusive end index to a count.
    2. A null parameter on a single parameter overload will be a no-op instead of appending the text "null".
    3. A null parameter on a 3 parameter overload will be a no-op if both startIndex and count are 0, otherwise it will throw an ArgumentNullException.
  • Lucene.Net.Support: Factored out GeneralKeyedCollection and AttributeItem in favor of J2N.Collections.Generic.LinkedDictionary

  • SWEEP: Upgraded to account for breaking changes (AsCharSequence() and BitOperation) in J2N.

  • Lucene.Net.Support.Collections: Factored out ImplementsGenericInterface() in favor of J2N's implementation

  • Lucene.Net.Support.Collections: Factored out Singleton() method and used collection initializers instead

  • Lucene.Net.Support.Collections: Removed NewSetFromMap method and related SetFromMap class

  • Lucene.Net.Support: Added ConcurrentHashSet class for internal use, and factored out the use of ConcurrentSet throughout the project

  • Removed Lucene.Net.Support.ConcurrentHashMapWrapper from the public API and renamed it to ConcurrentDictionary

  • Marked ExceptionToClassNameConventionAttribute, ExceptionToNetNumericConventionAttribute, ExceptionToNullableEnumConventionAttribute, and WritableArrayAttribute internal

  • Factored out StringExtensions

  • Factored out DictionaryExtensions

  • Factored out LurchTable in favor of J2N's implementation

  • Lucene.Net.Support.Threading: Deleted ThreadLock, DisposableThreadLocalProfiler

  • Factored out Lucene.Net.Support.AssemblyExtensions in favor of J2N's implementation

  • Lucene.Net.Support: Marked all types internal

  • Moved Lucene.Net.Support.SystemConsole to Lucene.Net.Util namespace

  • Lucene.Net.Support: Moved ExceptionExtensions to Lucene.Net.Util namespace

  • Lucene.Net.Support.ListExtensions: Moved AddRange, Sort, TimSort, and IntroSort extension methods to Lucene.Net.Util.ListExtensions.

  • Lucene.Net.Support.NumberFormat: Moved to Lucene.Net.Util namespace

  • Lucene.Net.TestFramework.Support.JavaCompatibility.AbstractBeforeAfterRule: Moved from Lucene.Net.Support namespace to Lucene.Net.Util

  • Lucene.Net.TestFramework.Support: Changed namespace of ApiScanTestBase, CultureInfoSupport, and ExceptionSerializationTestBase to Lucene.Net.Util

  • Lucene.Net.Util.NumberFormat, Lucene.Net.QueryParsers.Flexible.Standard.Config.NumberDateFormat: Changed protected locale field to private, made property named Culture, and changed constructors and methods to use "culture" rather than "locale" for parameter names

  • Lucene.Net.Support: Moved SystemProperties class to Lucene.Net.Util namespace

  • Lucene.Net.Benchmark.Support: Moved EnglishNumberFormatExtensions to Lucene.Net.Util namespace
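
The CharTermAttribute.Append() change listed above follows the conventions of the .NET StringBuilder class. The sketch below does not call Lucene.NET at all. It uses only System.Text.StringBuilder to illustrate the three rules, on the assumption that the new overloads behave the same way. The class and method names (AppendSemanticsSketch, ToCount, and so on) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Text;

public static class AppendSemanticsSketch
{
    // Call sites ported from Java pass (text, start, endExclusive).
    // .NET-style overloads expect (text, startIndex, count), so the
    // third argument has to be converted when upgrading.
    public static int ToCount(int startIndex, int endExclusive)
    {
        if (startIndex < 0 || endExclusive < startIndex)
            throw new ArgumentOutOfRangeException(nameof(endExclusive));
        return endExclusive - startIndex;
    }

    // Appends the range [startIndex, endExclusive) of text.
    public static string AppendRange(string text, int startIndex, int endExclusive)
    {
        var sb = new StringBuilder();
        sb.Append(text, startIndex, ToCount(startIndex, endExclusive));
        return sb.ToString();
    }

    // Rule 2: a null argument on the single-parameter overload is a no-op.
    public static string AppendNull()
    {
        var sb = new StringBuilder("abc");
        sb.Append((string)null);
        return sb.ToString();
    }

    // Rule 3: null with startIndex == 0 and count == 0 is a no-op,
    // while null with any other range throws ArgumentNullException.
    public static bool NullWithNonEmptyRangeThrows()
    {
        var sb = new StringBuilder();
        sb.Append((string)null, 0, 0); // allowed, appends nothing
        try
        {
            sb.Append((string)null, 0, 1);
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }
}
```

When upgrading, any call that previously passed an exclusive end index as the third argument would need the equivalent of ToCount() applied, otherwise it may append too many characters or throw.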

Bugs

  • #237 - Fix duplicate FragNum value on Highlight.TextFragment
  • #235 - Fixed broken links on API documentation home page
  • Lucene.Net.Search.BooleanClause::Equals(BooleanClause): Fixed potential null reference exception when Query is set to null
  • LUCENENET-645 - Added missing call to FileStream::Flush() in FSIndexOutput::Flush() that was preventing persistence to disk from occurring at the necessary phase
  • SWEEP: Pass Random instance to all Shuffle() method calls to ensure the same pseudo-random sequence is used based on the seed
  • Lucene.Net.Index.DocumentsWriterDeleteQueue::DeleteSlice(): Removed an extra Debug.Assert() statement that wasn't in Lucene and that caused the Lucene.Net.Index.TestIndexWriterUnicode::TestEmbeddedFFFF() test to fail when running in debug mode

Improvements

  • #234 - Moved markdown converter to the src/docs folder
  • #235 - Updated API documentation so each release exists on its own URL
  • #235 - Updated documentation converter to be more robust when dealing with namespace matching
  • #239 - Replaced all private IComparer<T> implementations with the built-in .NET Comparer<T>.Create() factory method
  • #240 - Implemented IAppendable on OpenStringBuilder
  • PERFORMANCE - Removed unnecessary memory copy operations from various IndexReader dependencies
  • LUCENENET-640 - PERFORMANCE - Factored out WeakIdentityMap in favor of .NET's ConditionalWeakTable
  • SWEEP: Factored out LinkedHashMap in favor of J2N's LinkedDictionary
  • Lucene.Net.Tests.Search.TestFieldCache::Test(): Simplified expression with LINQ query
  • Lucene.Net.Highlighter, Lucene.Net.Tests.Spatial: Swapped in LinkedHashSet from J2N, like in the original Lucene implementation
  • SWEEP: Factored out Lucene.Net.Support.HashMap in favor of J2N.Collections.Generic.Dictionary
  • SWEEP: Factored out System.Collections.Generic.SortedDictionary in favor of J2N.Collections.Generic.SortedDictionary
  • Lucene.Net.Analysis.Kuromoji.Dict.UserDictionary: Swapped out C5 TreeDictionary for J2N.Collections.Generic.SortedDictionary
  • SWEEP: Changed System.Collections.Generic.SortedSet to J2N.Collections.Generic.SortedSet
  • SWEEP: Factored out C5's TreeSet
  • SWEEP: Swapped out System.Collections.Generic.HashSet for J2N.Collections.Generic.HashSet
  • SWEEP: Factored out Arrays.AsList(), which was causing both additional operational complexity and unnecessary memory allocations
  • Lucene.Net.Util.Automaton.State: Implemented IEquatable<State>, changed enumerator to a struct
  • Removed FEATURE_HASHSET_CAPACITY, since J2N now has the full .NET Core 3.x implementation with a capacity constructor
  • Lucene.Net.Util.Fst.ListOfOutputs::Merge(): Streamlined so we don't have so many casts
  • Lucene.Net.Util.Fst: Use J2N.Collections.List<T> for the closing type of Outputs<T> to ensure the outputs can be compared for structural equality
  • Lucene.Net.TestFramework.Search.AssertingScorer: Changed to use ConditionalWeakTable/WeakDictionary
  • LUCENENET-610, LUCENENET-640 - Reduced locking in FieldCache and its dependent classes
  • Added cross-framework compatibility for nullable attributes
  • Changed project files to automatically generate InternalsVisibleTo attributes based on ItemGroup/InternalsVisibleTo elements
  • Lucene.Net.Support.DictionaryExtensions: Optimized Put() method, added guard clauses to Put and PutAll
  • Lucene.Net.Support.DictionaryExtensions: Factored out Load() and Store() methods in favor of J2N's implementation
  • LUCENENET-642 - Lucene.Net.Analysis.TokenStream: Removed Reflection code that was used to force the end user to make TokenStream subclasses or their IncrementToken() method sealed
  • LUCENENET-643 - PERFORMANCE - Lucene.Net.Support.IO.FileStreamExtensions::Read(): Moved to StreamExtensions class and optimized to read bytes in bulk instead of one byte at a time
  • Lucene.Net.Tests.Index (TestBagOfPositions + TestBagOfPostings): Fixed performance issue due to converting int to string and string to int in the current culture
  • Lucene.Net.Store (FSDirectory + BufferedIndexOutput): Refactored FSDirectory.FSIndexOutput to utilize the FileStream buffer only, rather than using both a FileStream buffer and the buffer in BufferedIndexOutput.
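
As an illustration of the #239 change above, the sketch below contrasts a private IComparer<T> class with the equivalent Comparer<T>.Create() call from the .NET base class library. It is not taken from the Lucene.NET source. The type and member names (ComparerCreateSketch, LengthComparer, ByLength) are hypothetical, and ordering strings by length is just an arbitrary example.

```csharp
using System;
using System.Collections.Generic;

public static class ComparerCreateSketch
{
    // Before: a private nested class that exists only to carry one comparison.
    private sealed class LengthComparer : IComparer<string>
    {
        public int Compare(string x, string y) => x.Length.CompareTo(y.Length);
    }

    public static readonly IComparer<string> Legacy = new LengthComparer();

    // After: the same ordering expressed inline with the factory method.
    public static readonly IComparer<string> ByLength =
        Comparer<string>.Create((x, y) => x.Length.CompareTo(y.Length));

    // Returns a copy of items sorted with the supplied comparer.
    public static List<string> SortWith(IEnumerable<string> items, IComparer<string> comparer)
    {
        var list = new List<string>(items);
        list.Sort(comparer);
        return list;
    }
}
```

Both comparers produce the same ordering. The factory form simply removes a class declaration per comparison.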

New Features

  • LUCENENET-642 - Added code analyzer and code fix to force the end user to make TokenStream subclasses or their IncrementToken() method sealed at design time, rather than running Reflection code at runtime to make this check

Lucene.Net_4_8_0_beta00007

4 years ago

This release contains impactful performance improvements and bug fixes.

NOTE: The Lucene.Net.Support namespace in the core Lucene.Net assembly is being phased out for external use. Much of what used to be in this namespace has been made into first-class components and moved to J2N, a library to fill in gaps in functionality between the JDK and .NET. Please do not add any new dependencies on Lucene.Net.Support in the core Lucene.Net library. Also, please open a new JIRA ticket if you have a dependency on a component from Lucene.Net.Support that is not available in J2N.

This release drops support for .NET Standard 1.6 and adds support for .NET Standard 2.1.

Change Log

Breaking Changes

  • Lucene.Net.Codecs (in Lucene.Net library) - Changed the following methods to properties with the same name for API consistency
    • Lucene.Net.Codecs.DefaultCodecFactory::AvailableServices()
    • Lucene.Net.Codecs.DefaultDocValuesFormatFactory::AvailableServices()
    • Lucene.Net.Codecs.DefaultPostingsFormatFactory::AvailableServices()
    • Lucene.Net.Codecs.Codec::AvailableCodecs()
    • Lucene.Net.Codecs.DocValuesFormat::AvailableDocValuesFormats()
    • Lucene.Net.Codecs.PostingsFormat::AvailablePostingsFormats()
  • Lucene.Net.Analysis.Kuromoji - Changed the following methods to properties named Instance for API consistency
    • Lucene.Net.Analysis.Kuromoji.Dict.CharacterDefinition::GetInstance()
    • Lucene.Net.Analysis.Kuromoji.Dict.ConnectionCosts::GetInstance()
    • Lucene.Net.Analysis.Kuromoji.Dict.TokenInfoDictionary::GetInstance()
    • Lucene.Net.Analysis.Kuromoji.Dict.UnknownDictionary::GetInstance()
  • Lucene.Net.Support.Collections::AddAll() - Factored out in favor of ISet<T>.UnionWith()
  • Lucene.Net.Support.IdentityComparer::ctor() - Factored out in favor of new Default static property
  • Lucene.Net.Support.DictionaryExtensions::EntrySet() - Factored out because IDictionary<TKey, TValue> is already enumerable and copying its contents to another data structure for the purpose of enumeration is wasteful and unnecessary
  • Lucene.Net.Support.SetExtensions::AddAll() - Factored out in favor of UnionWith(), Lucene.Net.Support.DictionaryExtensions.PutAll(), or AddRange() depending on collection type
  • Lucene.Net.Support.HashMap<TKey, TValue> - Changed behavior of indexer property to throw a KeyNotFoundException if the key doesn't exist. The exception can already be avoided by using the TryGetValue() method.
  • Lucene.Net.Support.Buffer - Factored out in favor of J2N.IO.Buffer
  • Lucene.Net.Support.ByteBuffer - Factored out in favor of J2N.IO.ByteBuffer
  • Lucene.Net.Support.LongBuffer - Factored out in favor of J2N.IO.LongBuffer
  • Lucene.Net.Support.Number - Factored out the following methods in favor of their counterparts in J2N
    • Signum() > J2N.MathExtensions::Signum()
    • SingleToInt32Bits() > J2N.BitConversion::SingleToInt32Bits()
    • SingleToRawInt32Bits() > J2N.BitConversion::SingleToRawInt32Bits()
    • Int64BitsToDouble() > J2N.BitConversion::Int64BitsToDouble()
    • DoubleToInt64Bits() > J2N.BitConversion::DoubleToInt64Bits()
    • DoubleToRawInt64Bits() > J2N.BitConversion::DoubleToRawInt64Bits()
    • ToBinaryString() > J2N.IntegralNumberExtensions::ToBinaryString()
    • URShift() > J2N.Numerics.BitOperationExtensions::TripleShift()
    • ToString(long, radix) > J2N.IntegralNumberExtensions::ToString(long, radix)
    • BitCount() > J2N.Numerics.BitOperationExtensions::PopCount()
    • NumberOfLeadingZeros() > J2N.IntegralNumberExtensions::LeadingZeroCount()
    • NumberOfTrailingZeros() > J2N.IntegralNumberExtensions::TrailingZeroCount()
    • RotateLeft() > J2N.IntegralNumberExtensions::RotateLeft()
    • RotateRight() > J2N.IntegralNumberExtensions::RotateRight()
  • Lucene.Net.Support.Number - Removed the following methods
    • FlipEndian()
    • IsNumber()
    • ToInt64()
  • Lucene.Net.Support.MathExtensions - Factored out the following methods in favor of their counterparts in J2N
    • ToDegrees() > J2N.MathExtensions::ToDegrees()
    • ToRadians() > J2N.MathExtensions::ToRadians()
  • Lucene.Net.Support.AtomicBoolean - Factored out in favor of J2N.Threading.Atomic.AtomicBoolean
  • Lucene.Net.Support.AtomicInt32 - Factored out in favor of J2N.Threading.Atomic.AtomicInt32
  • Lucene.Net.Support.AtomicInt64 - Factored out in favor of J2N.Threading.Atomic.AtomicInt64
  • Lucene.Net.Support.AtomicObject - Factored out in favor of J2N.Threading.Atomic.AtomicReference
  • Lucene.Net.Support.AtomicReferenceArray - Factored out in favor of J2N.Threading.Atomic.AtomicReferenceArray
  • Lucene.Net.Support.Threading.ThreadClass - Factored out in favor of J2N.Threading.ThreadJob
  • Lucene.Net.Support.Character - Factored out the following methods in favor of their counterparts in J2N
    • Digit()
    • ForDigit()
  • Lucene.Net.Support.StringTokenizer - Factored out in favor of J2N.Text.StringTokenizer
  • Lucene.Net.Support.CultureContext - Factored out in favor of J2N.Globalization.CultureContext
  • Lucene.Net.Support.IndexWriterConfigExtensions - Moved to Lucene.Net.Index.Extensions namespace
  • Lucene.Net.Documents.IndexableFieldExtensions - Moved to Lucene.Net.Documents.Extensions namespace
  • Lucene.Net.Documents.DocumentExtensions - Moved to Lucene.Net.Documents.Extensions namespace
  • Lucene.Net.Support.IResourceManagerFactory - Moved to Lucene.Net.Util namespace
  • Lucene.Net.Support.BundleResourceManagerFactory - Moved to Lucene.Net.Util namespace
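
For the HashMap<TKey, TValue> indexer change listed above, code that relied on reading a missing key without an exception may need to switch to TryGetValue(). The sketch below uses the standard IDictionary<TKey, TValue> interface rather than Lucene.Net.Support.HashMap itself, on the assumption that the new indexer behavior matches the standard .NET dictionary. MissingKeySketch and its methods are hypothetical names for illustration.

```csharp
using System.Collections.Generic;

public static class MissingKeySketch
{
    // Returns the value for key, or fallback when the key is absent,
    // without touching the throwing indexer.
    public static string GetOrDefault(
        IDictionary<string, string> map, string key, string fallback)
    {
        return map.TryGetValue(key, out string value) ? value : fallback;
    }

    // Shows that the indexer throws KeyNotFoundException for a missing key.
    public static bool IndexerThrows(IDictionary<string, string> map, string key)
    {
        try
        {
            _ = map[key];
            return false;
        }
        catch (KeyNotFoundException)
        {
            return true;
        }
    }
}
```

A call such as MissingKeySketch.GetOrDefault(map, "title", null) keeps the "null when missing" pattern explicit at the call site.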

Bugs

  • LUCENENET-615 - PerFieldAnalyzerWrapper and PerFieldReuseStrategy must support null keys.
  • LUCENENET-617 - Reverted changes introduced in pull #222 that caused deadlock in Lucene.Net.Tests.Replicator.IndexAndTaxonomyReplicationClientTest::TestConsistencyOnExceptions().
  • Lucene.Net.Tests.Analysis.Common/Analysis/Util/TestCharArrayMap_TestCharArrayMap() - Was failing in the Turkish culture; lowercasing must be done in the invariant culture to match Lucene
  • Lucene.Net.Tests.Util.Fst.TestFSTs::TestPrimaryKeys() - Fixed sorting issue that was causing the test to fail with negative values
  • SWEEP: Corrected number-to-string and string-to-number conversions to use the invariant culture, as was the case in Lucene
  • LUCENENET-622 - Lucene.Net.Tests.Util.TestVersionComparer::TestVersions() - Fixed version number conversion to use the invariant culture so the test consistently passes
  • LUCENENET-621 - Lucene.Net.Tests.Search.TestSearchAfter::TestQueries() - Fixed number-to-string conversion to use the invariant culture so the test consistently passes
  • Lucene.Net.Tests.Benchmark.ByTask.Tasks.WriteLineDocTaskTest::TestMultiThreaded(): Added lock to synchronize file writes. Also refactored test to assist future debugging efforts.
  • Lucene.Net.Analysis.Common - Fixed to use invariant culture when converting numbers and uppercasing/lowercasing
  • Lucene.Net.Benchmark - Fixed to use invariant culture when converting numbers and uppercasing/lowercasing
  • Lucene.Net.Facet - Fixed to use invariant culture for number conversion
  • Lucene.Net.Misc - Fixed to use invariant culture for number conversion
  • Lucene.Net.Suggest - Fixed to use invariant culture for number conversion
  • Lucene.Net.Grouping - Fixed to use invariant culture for number conversion
  • Lucene.Net.Spatial - Fixed to use invariant culture for number conversion
  • Lucene.Net.Tests.Replicator - Fixed issue with incorrect error message due to missing Dispose() call
  • Lucene.Net.Tests.Replicator.LocalReplicatorTest::TestObtainMissingFile() - Added missing catch block for DirectoryNotFoundException
  • Lucene.Net.Support.Codecs - Fixed initialization locking between parallel tasks. Also added guard clauses and inlined variable declarations.
  • #154 - Fix OpenBitSet.Union and .Xor methods.
  • LUCENENET-618 - CRITICAL: Fixed Lucene.Net.Store.NativeFSLockFactory to be thread-safe on non-Windows operating systems
  • Fixed broken OS detection on .NET Framework
  • LUCENENET-602 - Fixed generic structs of LurchTable to function on Xamarin.iOS without throwing exceptions
  • Lucene.Net.TestFramework.Index.ThreadedIndexingAndSearchingTestCase - WeakDictionary<TKey, TValue> must be wrapped to make it thread safe
  • Lucene.Net.Codecs.Lucene45.Lucene45DocValuesConsumer::MISSING_ORD - Correct value should be -1L to match Lucene
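
Several of the fixes above replace culture-sensitive conversions with invariant-culture ones. The sketch below shows the general pattern using only System.Globalization. It is not the code from the Lucene.NET fixes, and InvariantCultureSketch and its method names are hypothetical.

```csharp
using System.Globalization;

public static class InvariantCultureSketch
{
    // Number-to-string: always uses '.' as the decimal separator,
    // regardless of the current thread culture.
    public static string FormatInvariant(double value) =>
        value.ToString(CultureInfo.InvariantCulture);

    // String-to-number: parses text produced by FormatInvariant
    // on any machine.
    public static double ParseInvariant(string text) =>
        double.Parse(text, CultureInfo.InvariantCulture);

    // Lowercasing that does not depend on the current culture, which
    // avoids surprises such as the Turkish dotless i.
    public static string LowerInvariant(string text) =>
        text.ToLowerInvariant();

    // Culture-sensitive counterpart, included only for contrast.
    // The result depends on the named culture and on the globalization
    // data available on the machine.
    public static string FormatWith(double value, string cultureName) =>
        value.ToString(CultureInfo.GetCultureInfo(cultureName));
}
```

Calling ToString() or ToLower() without an explicit culture uses the current thread culture, which is why tests of this kind can pass on one machine and fail on another.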

Improvements

  • azure-pipelines.yml: Added job to generate documentation
  • Improved automated documentation converter accuracy
  • Moved Lucene.Net.Util.VirtualMethod to its original location in the Lucene.Net assembly
  • Updated usage examples on home page of web site
  • Changed categorization on API documentation to list by assembly rather than namespace
  • Lucene.Net.Analysis.Phonetic - Create culture during static initialization to improve performance
  • Lucene.Net.Index.CheckIndex - Added guard clause to ensure dir is not null
  • Lucene.Net.Support.IO.FileSupport::CreateTempFile() - Simplified error handling to use a when clause to only catch the relevant exceptions (improving thread safety)
  • SWEEP: CA1810 - Avoid static constructors (see #224 (comment)).
  • LUCENENET-612 & LUCENENET-615 - Added documentation to clarify that the IDictionary<TKey, TValue> type that is provided will determine how the component behaves.
  • Lucene.Net.Support.BitArrayExtensions::Cardinality() - Replaced implementation with one that benchmarked more than 3x faster, at the cost of a small amount of RAM
  • Consolidated build features for the solution in the root Directory.Build.targets file
  • Added target for .NET Standard 2.1
  • Removed target for .NET Standard 1.6
  • Removed support code for targeting .NET Framework 3.5
  • LUCENENET-610 - Added conditional compilation to favor ConditionalWeakTable<TKey, TValue> over WeakDictionary<TKey, TValue> when the features Lucene requires are available.
  • Lucene.Net.Support.Arrays - Changed Equals() and GetHashCode() to be backed by J2N.Collections.ArrayEqualityComparer<T>.OneDimensional for improved performance
  • Lucene.Net.Support.Arrays::ToString() - Changed implementation to the same as Apache Harmony for improved performance
  • Lucene.Net.Support.Character
    • ToChars() - Cascade to J2N.Character.ToChars(), which benchmarked 5x faster
    • ToUpper() - Cascade to J2N.Character.ToUpper()
    • ToLower() - Cascade to J2N.Character.ToLower()
    • IsLetter() - Cascade to J2N.Character.IsLetter()
  • Lucene.Net.Support.DictionaryExtensions::Load() - Cascade the call to J2N.PropertyExtensions::LoadProperties()
  • Lucene.Net.Support.DictionaryExtensions::Store() - Cascade the call to J2N.PropertyExtensions::SaveProperties()
  • Upgraded lucene-cli to target netcoreapp3.1
  • Fixed NuGet Warning NU5048, deprecation of PackageIconUrl element
  • build.ps1 - Added testname/framework to test script to show which projects are running
  • Changed testing target from net451 to net48
  • Changed testing target from netcoreapp2.1 to netcoreapp2.2
  • Added testing target for netcoreapp3.1
  • Added new debug artifact for .pdb files so they aren't pushed into our Apache release distribution binary
  • Added .rat-excludes file for Apache RAT

New Features

  • LUCENENET-614 - Released Lucene.Net.TestFramework on NuGet
  • Added Lucene.Net.Tests.TestFramework (ported compatible tests from Lucene 8.2.0, since Lucene 4.8.0 did not have any tests)
  • Lucene.Net.Codecs.DefaultCodecFactory - Added CustomCodecTypes property that can be used to initialize custom codec types without requiring a subclass
  • Lucene.Net.Codecs.DefaultDocValuesFormatFactory - Added CustomDocValuesFormatTypes property that can be used to initialize custom doc values format types without requiring a subclass
  • Lucene.Net.Codecs.DefaultPostingsFormatFactory - Added CustomPostingsFormatTypes property that can be used to initialize custom postings format types without requiring a subclass
  • LUCENENET-623 - Released Lucene.Net.Analysis.OpenNLP on NuGet
  • LUCENENET-568 - Released Lucene.Net.Analysis.Morfologik on NuGet