OpenRefine Versions

OpenRefine is a free, open source power tool for working with messy data and improving it

3.8-beta1

2 months ago

This is the first beta release of the 3.8 series. Please back up your workspace directory before installing and report any problems that you encounter.

New features and improvements

Keyboard navigation improvements

  • Dialogs are focusable (#5578, by @Abbe98)
  • The tab order for reconciliation create and match buttons is fixed (#5685, by @Abbe98)
  • The show/hide left panel is keyboard accessible (#5852, by @Abbe98)
  • Menu buttons in the extension-bar can be opened with the keyboard (#5853, by @Abbe98)
  • Tab buttons are outlined when focused (#5851, by @Abbe98)
  • The element outlines set by the browser are retained (#5867, by @Abbe98)
  • The project permalink can be selected (#5871, by @Abbe98)
  • The button to rename a project can be selected (#5868, by @Abbe98)
  • The column header buttons are selectable/openable by keyboard (#5854, by @Abbe98)
  • Make action area tabs selectable by keyboard (#5885, by @Abbe98)
  • Make it possible to set custom headers using only the keyboard when fetching urls (#5886, by @Abbe98)
  • The menu system is navigable by keyboard (#5901, by @Abbe98)
  • The "Get data from" menu is keyboard accessible (#5900, by @Abbe98)
  • Cells in the grid can be edited by keyboard (#5855, by @Abbe98)

Reconciliation usability improvements

  • The waiting screen while guessing reconciliation types is internal to the reconciliation dialog (#4877, by @ayushrai206)
  • The "auto-match" checkbox persists after restarts of OpenRefine (#4722, by @ayushrai206)
  • The documentation of the reconciliation service is displayed in the reconciliation dialog (if available) (#5784, by @ayushrai206)
  • The default types supplied by the reconciliation service are always offered to users (#4224, by @ayushrai206)
  • The reconciliation types are displayed with both name and id (#5907, by @ayushrai206)
  • OpenRefine honours the batch size announced by reconciliation services (#5603, by @ayushrai206)
  • The dialog for the operation to add a column of entity identifiers is improved (#5998, by @elebitzero)
  • Errors encountered by the reconciliation operation are displayed in the grid and are available via the cell.recon.error GREL expression (#3194, by @ayushrai206)
  • Those errors can also be isolated via facets (#6232, by @ayushrai206)
  • The interface to select a reconciliation service when reconciling was improved (#6118, by @ayushrai206 and @Lydiaofficial)
  • The "Search for match" option is present in cells with reconciliation errors so that they can be fixed manually (#6192, by @ayushrai206)
  • The error messages generated during reconciling are more helpful (#6111, by @ayushrai206)
  • A new operation to extract URLs for reconciled cells is available (#5960, by @ayushrai206)
  • Property selection in the reconciliation dialog gives better feedback to the user about whether a column is successfully mapped to a property or not (#6060, by @elebitzero)
  • Type selection is similarly improved (#6131, by @elebitzero)
  • Only up to three reconciliation candidates are displayed by default, with the option to see more (#6154, by @ayushrai206 and @Lydiaofficial)
  • A new design for the reconciliation dialog was proposed but has not been implemented yet. Your opinion about it is welcome on the forum. By @Lydiaofficial
  • It is possible to discover the source of a column obtained by fetching data from a reconciliation service, by hovering the column header (#5130, by @ayushrai206)

Facet improvements

  • The height of text facets is fitted to their contents (#5514, by @elebitzero)
  • The text facet windows are fitted to their content (#5513, by @elebitzero)
  • The facets panel does not shift vertically on refresh (#5610, by @elebitzero)

Improvements to linking to specific parts of OpenRefine via URLs

  • It is possible to link to a given home screen panel (#5597, by @Abbe98)
  • The tags used to filter projects are also reflected in the URL (#5769, by @Abbe98)

Layout improvements to the Wikibase extension

  • The layout of the Wikibase schema editor is improved (#5449, by @prashasti-7)
  • The Wikibase issues tab displays the number of schema validation errors (#5358, by @SoniaSun810)
  • The list of available Wikibase schemas is numbered (#5893, by @Hisiste)
  • The layout and labels of the schema alignment tab are changed (#5346, by @Abbe98 and @lozanaross)
  • Clicking on the triangle unfolds Wikibase references in the schema editor (#6304, by @amparab and @cooperzoe)

Other improvements

  • Applying a list of operations stored in a JSON file is possible without copy and paste (#5022, by @IjayAbby)
  • The cluster choice limit is configurable via the preferences (#5847, by @5tigerjelly)
  • The metadata dialog shows JSON contents in a more readable way (#5870, by @Abbe98)
  • OpenRefine is known to be compatible with Java versions 11 to 21 (#5930, by @wetneb)
  • URIs with the geo: protocol are rendered as links in cells (#5940, by @Abbe98)
  • Project archives can be imported via URL (#5431, by @SoniaSun810)
  • A Windows installer is available in addition to the existing zip distribution. We no longer publish a zip distribution without embedded Java. Do let us know if this is a problem for you. (#3224, by @dori4n)
  • CSV Byte Order Mark (BOM) is supported (#1241, by @tfmorris)
  • More HTTP headers are supported when fetching a column via URLs (#6334, by @tfmorris)
  • Columns can be expanded with a more easily identifiable button (#5879, by @VhugoJc)
  • TSV export avoids adding unnecessary quotes around cells (#2071, by @tfmorris)
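
As a generic illustration of what handling a CSV Byte Order Mark involves (this is not OpenRefine's own Java implementation), the Python sketch below decodes a UTF-8 byte string that starts with a BOM before parsing it as CSV. The function name read_csv_rows and the sample data are hypothetical.

```python
import csv
import io


def read_csv_rows(raw: bytes) -> list[list[str]]:
    """Decode UTF-8 bytes, dropping a leading BOM if present, and parse as CSV."""
    # The "utf-8-sig" codec removes a leading U+FEFF marker when there is one
    # and behaves like plain UTF-8 otherwise.
    text = raw.decode("utf-8-sig")
    return list(csv.reader(io.StringIO(text)))


# Hypothetical sample: the first three bytes are the UTF-8 encoded BOM.
with_bom = b"\xef\xbb\xbfname,city\r\nAda,London\r\n"
rows = read_csv_rows(with_bom)
```

Without BOM handling, the first header would be read as "\ufeffname" rather than "name", which is the kind of problem this item addresses.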

GREL changes

  • The forRange GREL function accepts negative increments (#5520, by @Huishin-pie)
  • Accessing the record & columnNames fields works again (#5633, by @tfmorris)
  • The split() function filters out the trailing empty token when there is a trailing string separator, and the leading empty token when there is a leading pattern separator match (#5587, by @tfmorris)
  • The replaceEach() function is more faithful to its documentation (#5463, by @Huishin-pie)
  • The cross() function returns an empty list on no match (#5531, by @jenny-Musah)
  • The length() function returns the number of keys in an object (#5991, by @tfmorris)
  • The forEachIndex() control supports JSON objects and arrays (#3147, by @tfmorris)
  • Operators (+ and comparison operators such as <) have fewer unexpected behaviors (#6340, #6341, by @tfmorris)
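
To make the negative-increment behaviour of forRange concrete, here is a rough Python analogue. It assumes that the end bound stays exclusive when counting down, as it is when counting up. The helper name for_range is hypothetical, and this is not how OpenRefine implements forRange.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def for_range(start: int, stop: int, step: int, fn: Callable[[int], T]) -> list[T]:
    """Rough analogue of GREL's forRange(from, to, step, v, expr).

    Assumption: `stop` is exclusive in both directions.
    """
    return [fn(v) for v in range(start, stop, step)]


# Roughly what a GREL expression like forRange(5, 0, -1, v, v * 10)
# would be expected to produce under the assumption above.
countdown = for_range(5, 0, -1, lambda v: v * 10)
```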

Bug fixes

  • The reconciliation dialog does not suggest properties specific to a type when "No Type" is selected (#5523, by @tfmorris)
  • The word facet uses an internationalized word separator instead of just space characters (#557, by @tfmorris)
  • The HTTP proxy configuration is correctly handled (#5476, by @tledoux)
  • The 'Facet choices' dialog does not bleed over the dialog window on resize (#5619, by @elebitzero)
  • The memory usage display shows up correctly on project creation (#5665, by @elebitzero)
  • The readability of the no-projects message is improved (#5679, by @Abbe98)
  • The visual style of the database import panel is more consistent with the rest of the application (#5548, by @Abbe98)
  • Dialogs cannot be dragged past the top of the window (#5714, by @elebitzero)
  • The clipboard import input does not overflow in narrow windows (#5753, by @Abbe98)
  • File upload buttons have a style that is consistent with other buttons (#5743, by @Abbe98)
  • Errors in preference changes are properly reported (#5772, #5785, by @tsukipedia and @tfmorris)
  • Numeric facets are properly refreshed when switching tabs (#5781, by @elebitzero)
  • Importing from URLs with illegal characters is handled (#4625, by @yeungven)
  • The auto-completion for fields in the Wikibase schema conforms to the new MediaWiki API (#5716, by @SoniaSun810)
  • When removing rows, the cache of facet counts is correctly updated, updating the duplicates facet as well (#5799, by @tfmorris)
  • Column names are correctly quoted in the SQL exporter (#5388, by @mahikaajain)
  • The star/flag action correctly updates the row without reverting reconciliation changes (#5738, by @SoniaSun810)
  • The pointer cursor only changes when hovering the column menu, not the entire column headers (#5977, by @Abbe98)
  • The "Use quote as separator" defaults to False for TSV import (#3853, by @jnchen1)
  • The "Transpose cells across columns" treats blank cells like null cells (#5229, by @skhoylow8)
  • Wikibase edits on deleted items are skipped (#5385, by @wetneb)
  • The OpenRefine launcher on Windows no longer refuses to start if the Java version is too high (#5583, by @wetneb)
  • Reconciliation services can no longer be added multiple times (#5926, by @tsukipedia)
  • The SQL importer no longer checks for particular keywords in the query (#6019, by @tfmorris)
  • The set of allowed characters in file names of media uploads to Wikibase is extended (#5656, by @santi4o and @wetneb)
  • Case sensitive sort of rows works as expected (#6047, by @tfmorris)
  • Missing spaces in facet sorting and "add column based on reconciled values" dialog are back (#6047, #6143, by @frafra and @SrinathKadam048)
  • The URL fetching operation returns an error, not null, on a bad URL (#6137, by @tfmorris)
  • Bzip2 import works again and now supports concatenated compressed streams (#6129, by @tfmorris)
  • Dates on Open Project page are localized (#6172, by @tfmorris)
  • Text in alert dialogs is selectable (#6187, by @elebitzero)
  • The grid is properly updated when a cell is matched (#6236, by @elebitzero)
  • The message shown at the top of the screen after operations has fewer encoding errors (#6063, by @tfmorris)
  • Setting the log level from the command line works again (#6286, by @amparab)
  • Creating a project from a URL with trailing whitespace no longer fails (#6330, by @surajbora59)
  • A spurious warning about missing units in the Wikibase extension was removed (#5452, by @payalsaraljain)
  • The ./refine script ignores HTTP proxies for querying OpenRefine directly (#2000, by @tejasbhosale17)

Performance improvements

  • We are now using the uniVocity CSV parser instead of the Apache Commons CSV parser. Beyond the performance improvements this brings, it is likely that this change comes with different parsing behaviour in some cases. Do let us know if those seem to be regressions (#2268, #1372, by @tfmorris)
  • Project initialization is faster by sending parallel requests from the frontend (#5941, by @Abbe98)
  • The metadata files of projects are only written when needed (#3805, by @ComgLq24)
  • Newly read projects are not written before they're modified (#3805, by @tfmorris)
  • The Jython interpreter is not initialized during startup, but only the first time it is used (#6174, by @tfmorris)
  • The clustering dialog no longer runs the default clustering method by default to avoid unnecessary heavy computations on large projects (#241, by @elebitzero)

For developers

  • The timestamps for project changes have been migrated to UTC time objects (#3047), and HistoryEntry was switched from OffsetDateTime (at UTC) to Instant (#6176)
  • We migrated from using .less to .css for our stylesheets. Extensions should still be able to use .less but are encouraged to migrate to using CSS variables instead (#5525). This is part of an ongoing effort to offer a dark mode (#3017, by @Abbe98 and @elebitzero)
  • Extensions can now register commands which can respond to HTTP HEAD requests (#6097, by @Abbe98)
  • The get-all-preferences command now responds to HTTP GET requests. HTTP POST requests are still supported but extensions and clients are encouraged to migrate (#5850, by @Abbe98)
  • The create-project-from-upload command can now be used to set a project description and creator (#5739, by @Abbe98)
  • It is now optional for action areas to implement a resize function (#5598, by @Abbe98)
  • Project tags are exposed in a data attribute (#5590, by @Abbe98)
  • SVG images are supported in Butterfly (simile-butterfly#90, by @Abbe98)
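
As a sketch of what migrating a client from POST to GET could look like, the Python snippet below builds (but does not send) a GET request for the get-all-preferences command. It assumes OpenRefine is listening on its default local address http://127.0.0.1:3333 and that core commands are served under /command/core/. The helper name build_preferences_request is hypothetical.

```python
from urllib.request import Request

# Assumption: a local OpenRefine instance on its default host and port.
DEFAULT_BASE_URL = "http://127.0.0.1:3333"


def build_preferences_request(base_url: str = DEFAULT_BASE_URL) -> Request:
    """Build a GET request for the get-all-preferences command without sending it."""
    # Assumption: core commands live under /command/core/<command-name>.
    url = base_url.rstrip("/") + "/command/core/get-all-preferences"
    return Request(url, method="GET")


request = build_preferences_request()
```

Passing the returned object to urllib.request.urlopen would send the request to a running instance.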

3.7.9

2 months ago

This is the ninth stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New in 3.7.9

Signing of the MacOS package was fixed.

Vulnerabilities

  • (fixed since 3.7.8) The fix for the MySQL vulnerabilities CVE-2023-41886 and CVE-2023-41887 was corrected, as it still allowed reading arbitrary files or executing code on the machine running OpenRefine using a different set of import parameters in the database extension. This problem bears the vulnerability id CVE-2024-23833, reported by @l0n3rs.
  • (fixed since 3.7.5) Moderate vulnerabilities in the database extension were fixed. Connecting to a malicious MySQL server could allow reading files or executing arbitrary code on the machine running OpenRefine. The vulnerabilities were assigned the CVE-2023-41886 and CVE-2023-41887 identifiers respectively, and were reported by @nbxiglk0.
  • (fixed since 3.7.4) A moderate vulnerability in project import was fixed. Importing a maliciously crafted project could execute arbitrary code on the machine running OpenRefine. This vulnerability has been assigned the CVE-2023-37476 identifier. It was reported by Stefan Schiller from SonarSource.

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

Bug fixes

  • The layout of the dialog to select a reconciliation match was improved so that the auto-complete widget does not hide the other options (#4821)
  • Better in-tool documentation around the way the scatterplot facet detects numerical columns (#4890)
  • The detection of URLs in cell values was fixed (#4546)
  • The error message displayed when trying to add a Wikibase manifest with a manifest version that is too old or recent was improved (#4847)
  • Errors returned by Jython expressions are more readable (#3012)
  • The ODS exporter no longer creates a default sheet "Sheet1" in the documents it creates (#4864)
  • Longer descriptions in auto-completion widget are not cut off anymore (#4988)
  • The interface for editing cell values was improved to better explain how to input dates (#3082)
  • The Windows refine.bat script was made more consistent with the Unix refine script (#4949, #5404)
  • The "Search for Match" dialog was rearranged so that the dropdown does not cover the buttons (#4945)
  • Error handling in the scatterplot facet was improved (#4893)
  • The "Collapse consecutive whitespaces" operation now handles unicode whitespace correctly (#4898)
  • (#4991)
  • The handling of GZIP-compressed files without .gz extension was improved in the importing pipeline (#547)
  • (#5153)
  • The "Add column based on this column" dialog can be submitted by pressing "Enter" in the column field (#5143)
  • The editing of redirected Wikibase items was fixed (#5162)
  • The user experience was improved in the case of incomplete Wikibase schemas (#5131)
  • The memory usage display was improved to show the used memory instead of the total memory, and was made more precise (#5222)
  • The association of labels to form inputs was improved, enhancing the accessibility of the interface (#5239, #5242, #5249, #5284)
  • An overflow issue in the reconciliation dialog was fixed (#5285)
  • The Wikibase manifests now properly support locally-running reconciliation endpoints (#5035)
  • The aspect ratio of Wikibase logos is now properly preserved (#5306)
  • The SQL exporter interface was improved (#5224)
  • The cell edit popup and dialogs with textbox inputs became resizable (#5330)
  • When marking a set of cells as "New" in an unreconciled column, the user is prompted for the reconciliation service to use (#4985)
  • More quality assurance checks were introduced in the Wikibase extension, such as checking for identical label and description in new Wikibase items (#4980)
  • The caching of auto-completion results in the Wikibase extension was fixed (#5190)
  • The Wikidata extension was fully renamed to "Wikibase extension" (#4525)
  • The controls of the cluster and edit dialog are greyed out while clustering is taking place (#5369)
  • The handling of unicode whitespace was improved throughout the application (#5105)
  • Our MacOS packages (.DMG) are now properly signed and notarized, which should make their installation easier (#4586). Also, the presentation of the DMG image was made more user-friendly by including the customary link to the Applications folder. (#5509)
  • The parsing of the unary minus sign in GREL was fixed (#5465)
  • (From 3.7-beta3 on) The clustering dialog no longer introduces non-breaking spaces when selecting options with spaces in them (#5581)
  • (From 3.7.1 on) The display of the memory usage during project import was fixed (#5665)
  • (From 3.7.2 on) The localization in German was fixed (#5750)
  • (From 3.7.3 on) Starting openrefine.exe on Windows with Java 17 was fixed (#5583)
  • (From 3.7.3 on) Wikibase edits on deleted items are skipped and do not stall the entire batch (#5385)
  • (From 3.7.3 on) The HTML document language is aligned with the language of the interface (#5925)
  • (From 3.7.3 on) The default reconciliation types are displayed with both name and id (#5907)
  • (From 3.7.3 on) The transpose cells across columns was fixed so it treats blank cells as null (#5229)
  • (from 3.7.6 on) A missing space was added in the layout of the text facet, by @frafra (#6071)
  • (from 3.7.6 on) Browser launching on startup was fixed for Snap-packaged OpenRefine, by @kristbaum (#6065)
  • (from 3.7.6 on) The selection of Wikibase statement merging strategies was fixed, by @wetneb (#6066)
  • (from 3.7.6 on) A resizing issue in the presence of the Wikibase extension and facets was fixed, by @Abbe98 (#6070)
  • (from 3.7.7 on) A rendering issue in the Wikibase schema editor was fixed, by @wetneb (#6165)
  • (from 3.7.7 on) Attempts to fetch invalid URLs in the "Add column by fetching URLs" operation are properly reported to the user, by @tfmorris (#6141)
  • (from 3.7.7 on) A missing space between the "remove" and "configure" buttons of the "Add column from reconciled values" operation was added, by @SrinathKadam048 (#6151)

3.7.8

2 months ago

This is the eighth stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New in 3.7.8

The fix for the MySQL vulnerabilities CVE-2023-41886 and CVE-2023-41887 was corrected, as it still allowed reading arbitrary files or executing code on the machine running OpenRefine using a different set of import parameters in the database extension. This problem bears the vulnerability id CVE-2024-23833, reported by @l0n3rs.

Vulnerabilities

  • (fixed since 3.7.8) Vulnerability CVE-2024-23833 (see above).
  • (fixed since 3.7.5) Moderate vulnerabilities in the database extension were fixed. Connecting to a malicious MySQL server could allow reading files or executing arbitrary code on the machine running OpenRefine. The vulnerabilities were assigned the CVE-2023-41886 and CVE-2023-41887 identifiers respectively, and were reported by @nbxiglk0.
  • (fixed since 3.7.4) A moderate vulnerability in project import was fixed. Importing a maliciously crafted project could execute arbitrary code on the machine running OpenRefine. This vulnerability has been assigned the CVE-2023-37476 identifier. It was reported by Stefan Schiller from SonarSource.

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

Bug fixes

  • The layout of the dialog to select a reconciliation match was improved so that the auto-complete widget does not hide the other options (#4821)
  • Better in-tool documentation around the way the scatterplot facet detects numerical columns (#4890)
  • The detection of URLs in cell values was fixed (#4546)
  • The error message displayed when trying to add a Wikibase manifest with a manifest version that is too old or recent was improved (#4847)
  • Errors returned by Jython expressions are more readable (#3012)
  • The ODS exporter no longer creates a default sheet "Sheet1" in the documents it creates (#4864)
  • Longer descriptions in auto-completion widget are not cut off anymore (#4988)
  • The interface for editing cell values was improved to better explain how to input dates (#3082)
  • The Windows refine.bat script was made more consistent with the Unix refine script (#4949, #5404)
  • The "Search for Match" dialog was rearranged so that the dropdown does not cover the buttons (#4945)
  • Error handling in the scatterplot facet was improved (#4893)
  • The "Collapse consecutive whitespaces" operation now handles unicode whitespace correctly (#4898)
  • (#4991)
  • The handling of GZIP-compressed files without .gz extension was improved in the importing pipeline (#547)
  • (#5153)
  • The "Add column based on this column" dialog can be submitted by pressing "Enter" in the column field (#5143)
  • The editing of redirected Wikibase items was fixed (#5162)
  • The user experience was improved in the case of incomplete Wikibase schemas (#5131)
  • The memory usage display was improved to show the used memory instead of the total memory, and was made more precise (#5222)
  • The association of labels to form inputs was improved, enhancing the accessibility of the interface (#5239, #5242, #5249, #5284)
  • An overflow issue in the reconciliation dialog was fixed (#5285)
  • The Wikibase manifests now properly support locally-running reconciliation endpoints (#5035)
  • The aspect ratio of Wikibase logos is now properly preserved (#5306)
  • The SQL exporter interface was improved (#5224)
  • The cell edit popup and dialogs with textbox inputs became resizable (#5330)
  • When marking a set of cells as "New" in an unreconciled column, the user is prompted for the reconciliation service to use (#4985)
  • More quality assurance checks were introduced in the Wikibase extension, such as checking for identical label and description in new Wikibase items (#4980)
  • The caching of auto-completion results in the Wikibase extension was fixed (#5190)
  • The Wikidata extension was fully renamed to "Wikibase extension" (#4525)
  • The controls of the cluster and edit dialog are greyed out while clustering is taking place (#5369)
  • The handling of unicode whitespace was improved throughout the application (#5105)
  • Our MacOS packages (.DMG) are now properly signed and notarized, which should make their installation easier (#4586). Also, the presentation of the DMG image was made more user-friendly by including the customary link to the Applications folder. (#5509)
  • The parsing of the unary minus sign in GREL was fixed (#5465)
  • (From 3.7-beta3 on) The clustering dialog no longer introduces non-breaking spaces when selecting options with spaces in them (#5581)
  • (From 3.7.1 on) The display of the memory usage during project import was fixed (#5665)
  • (From 3.7.2 on) The localization in German was fixed (#5750)
  • (From 3.7.3 on) Starting openrefine.exe on Windows with Java 17 was fixed (#5583)
  • (From 3.7.3 on) Wikibase edits on deleted items are skipped and do not stall the entire batch (#5385)
  • (From 3.7.3 on) The HTML document language is aligned with the language of the interface (#5925)
  • (From 3.7.3 on) The default reconciliation types are displayed with both name and id (#5907)
  • (From 3.7.3 on) The transpose cells across columns was fixed so it treats blank cells as null (#5229)
  • (from 3.7.6 on) A missing space was added in the layout of the text facet, by @frafra (#6071)
  • (from 3.7.6 on) Browser launching on startup was fixed for Snap-packaged OpenRefine, by @kristbaum (#6065)
  • (from 3.7.6 on) The selection of Wikibase statement merging strategies was fixed, by @wetneb (#6066)
  • (from 3.7.6 on) A resizing issue in the presence of the Wikibase extension and facets was fixed, by @Abbe98 (#6070)
  • (from 3.7.7 on) A rendering issue in the Wikibase schema editor was fixed, by @wetneb (#6165)
  • (from 3.7.7 on) Attempts to fetch invalid URLs in the "Add column by fetching URLs" operation are properly reported to the user, by @tfmorris (#6141)
  • (from 3.7.7 on) A missing space between the "remove" and "configure" buttons of the "Add column from reconciled values" operation was added, by @SrinathKadam048 (#6151)

3.7.7

4 months ago

This is the seventh stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New in 3.7.7

  • A rendering issue in the Wikibase schema editor was fixed, by @wetneb (#6165)
  • Attempts to fetch invalid URLs in the "Add column by fetching URLs" operation are properly reported to the user, by @tfmorris (#6141)
  • A missing space between the "remove" and "configure" buttons of the "Add column from reconciled values" operation was added, by @SrinathKadam048 (#6151)

Vulnerabilities

  • (fixed since 3.7.5) Moderate vulnerabilities in the database extension were fixed. Connecting to a malicious MySQL server could allow reading files or executing arbitrary code on the machine running OpenRefine. The vulnerabilities were assigned the CVE-2023-41886 and CVE-2023-41887 identifiers respectively, and were reported by @nbxiglk0.
  • (fixed since 3.7.4) A moderate vulnerability in project import was fixed. Importing a maliciously crafted project could execute arbitrary code on the machine running OpenRefine. This vulnerability has been assigned the CVE-2023-37476 identifier. It was reported by Stefan Schiller from SonarSource.

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

Bug fixes

  • The layout of the dialog to select a reconciliation match was improved so that the auto-complete widget does not hide the other options (#4821)
  • Better in-tool documentation around the way the scatterplot facet detects numerical columns (#4890)
  • The detection of URLs in cell values was fixed (#4546)
  • The error message displayed when trying to add a Wikibase manifest with a manifest version that is too old or too recent was improved (#4847)
  • Errors returned by Jython expressions are more readable (#3012)
  • The ODS exporter no longer creates a default sheet "Sheet1" in the documents it creates (#4864)
  • Longer descriptions in the auto-completion widget are no longer cut off (#4988)
  • The interface for editing cell values was improved to better explain how to input dates (#3082)
  • The Windows refine.bat script was made more consistent with the Unix refine script (#4949, #5404)
  • The "Search for Match" dialog was rearranged so that the dropdown does not cover the buttons (#4945)
  • Error handling in the scatterplot facet was improved (#4893)
  • The "Collapse consecutive whitespaces" operation now handles unicode whitespace correctly (#4898)
  • (#4991)
  • The handling of GZIP-compressed files without .gz extension was improved in the importing pipeline (#547)
  • (#5153)
  • The "Add column based on this column" dialog can be submitted by pressing "Enter" in the column field (#5143)
  • The editing of redirected Wikibase items was fixed (#5162)
  • The user experience was improved in the case of incomplete Wikibase schemas (#5131)
  • The memory usage display was improved to show the used memory instead of the total memory, and was made more precise (#5222)
  • The association of labels to form inputs was improved, enhancing the accessibility of the interface (#5239, #5242, #5249, #5284)
  • An overflow issue in the reconciliation dialog was fixed (#5285)
  • The Wikibase manifests now properly support locally-running reconciliation endpoints (#5035)
  • The aspect ratio of Wikibase logos is now properly preserved (#5306)
  • The SQL exporter interface was improved (#5224)
  • The cell edit popup and dialogs with textbox inputs became resizable (#5330)
  • When marking a set of cells as "New" in an unreconciled column, the user is prompted for the reconciliation service to use (#4985)
  • More quality assurance checks were introduced in the Wikibase extension, such as checking for identical label and description in new Wikibase items (#4980)
  • The caching of auto-completion results in the Wikibase extension was fixed (#5190)
  • The Wikidata extension was fully renamed to "Wikibase extension" (#4525)
  • The controls of the cluster and edit dialog are greyed out while clustering is taking place (#5369)
  • The handling of unicode whitespace was improved throughout the application (#5105)
  • Our macOS packages (.DMG) are now properly signed and notarized, which should make their installation easier (#4586). Also, the presentation of the DMG image was made more user-friendly by including the customary link to the Applications folder. (#5509)
  • The parsing of the unary minus sign in GREL was fixed (#5465)
  • (From 3.7-beta3 on) The clustering dialog no longer introduces non-breaking spaces when selecting options with spaces in them (#5581)
  • (From 3.7.1 on) The display of the memory usage during project import was fixed (#5665)
  • (From 3.7.2 on) The localization in German was fixed (#5750)
  • (From 3.7.3 on) Starting openrefine.exe on Windows with Java 17 was fixed (#5583)
  • (From 3.7.3 on) Wikibase edits on deleted items are skipped and do not stall the entire batch (#5385)
  • (From 3.7.3 on) The HTML document language is aligned with the language of the interface (#5925)
  • (From 3.7.3 on) The default reconciliation types are displayed with both name and id (#5907)
  • (From 3.7.3 on) The "transpose cells across columns" operation was fixed so that it treats blank cells as null (#5229)
  • (From 3.7.6 on) A missing space was added in the layout of the text facet, by @frafra (#6071)
  • (From 3.7.6 on) Browser launching on startup was fixed for Snap-packaged OpenRefine, by @kristbaum (#6065)
  • (From 3.7.6 on) The selection of Wikibase statement merging strategies was fixed, by @wetneb (#6066)
  • (From 3.7.6 on) A resizing issue in the presence of the Wikibase extension and facets was fixed, by @Abbe98 (#6070)
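
To illustrate the idea behind handling GZIP-compressed files that lack a .gz extension, the sketch below inspects the content rather than the file name. It assumes detection by the two-byte GZIP magic number, which is a common technique but is not confirmed by these notes to be what OpenRefine does; the helper name `read_maybe_gzip` is hypothetical.

```python
import gzip

# Every GZIP stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def read_maybe_gzip(data: bytes) -> bytes:
    """Return decompressed bytes if data looks like GZIP, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data
```

Checking the content makes the result independent of how the file was named, at the cost of reading its first bytes before choosing a parser.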

3.7.6

6 months ago

This is the seventh stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New in 3.7.6

  • A missing space was added in the layout of the text facet, by @frafra (#6071)
  • Browser launching on startup was fixed for Snap-packaged OpenRefine, by @kristbaum (#6065)
  • The selection of Wikibase statement merging strategies was fixed, by @wetneb (#6066)
  • A resizing issue in the presence of the Wikibase extension and facets was fixed, by @Abbe98 (#6070)

Vulnerabilities

  • (fixed since 3.7.5) Moderate vulnerabilities in the database extension were fixed. Connecting to a malicious MySQL server could allow it to read files or execute arbitrary code on the machine running OpenRefine. The vulnerabilities were assigned the CVE-2023-41886 and CVE-2023-41887 identifiers respectively, and were reported by @nbxiglk0.
  • (fixed since 3.7.4) A moderate vulnerability in project import was fixed. Importing a maliciously crafted project could execute arbitrary code on the machine running OpenRefine. This vulnerability has been assigned the CVE-2023-37476 identifier. It was reported by Stefan Schiller from SonarSource.

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

Bug fixes

  • The layout of the dialog to select a reconciliation match was improved so that the auto-complete widget does not hide the other options (#4821)
  • Better in-tool documentation around the way the scatterplot facet detects numerical columns (#4890)
  • The detection of URLs in cell values was fixed (#4546)
  • The error message displayed when trying to add a Wikibase manifest with a manifest version that is too old or too recent was improved (#4847)
  • Errors returned by Jython expressions are more readable (#3012)
  • The ODS exporter no longer creates a default sheet "Sheet1" in the documents it creates (#4864)
  • Longer descriptions in the auto-completion widget are no longer cut off (#4988)
  • The interface for editing cell values was improved to better explain how to input dates (#3082)
  • The Windows refine.bat script was made more consistent with the Unix refine script (#4949, #5404)
  • The "Search for Match" dialog was rearranged so that the dropdown does not cover the buttons (#4945)
  • Error handling in the scatterplot facet was improved (#4893)
  • The "Collapse consecutive whitespaces" operation now handles unicode whitespace correctly (#4898)
  • (#4991)
  • The handling of GZIP-compressed files without .gz extension was improved in the importing pipeline (#547)
  • (#5153)
  • The "Add column based on this column" dialog can be submitted by pressing "Enter" in the column field (#5143)
  • The editing of redirected Wikibase items was fixed (#5162)
  • The user experience was improved in the case of incomplete Wikibase schemas (#5131)
  • The memory usage display was improved to show the used memory instead of the total memory, and was made more precise (#5222)
  • The association of labels to form inputs was improved, enhancing the accessibility of the interface (#5239, #5242, #5249, #5284)
  • An overflow issue in the reconciliation dialog was fixed (#5285)
  • The Wikibase manifests now properly support locally-running reconciliation endpoints (#5035)
  • The aspect ratio of Wikibase logos is now properly preserved (#5306)
  • The SQL exporter interface was improved (#5224)
  • The cell edit popup and dialogs with textbox inputs became resizable (#5330)
  • When marking a set of cells as "New" in an unreconciled column, the user is prompted for the reconciliation service to use (#4985)
  • More quality assurance checks were introduced in the Wikibase extension, such as checking for identical label and description in new Wikibase items (#4980)
  • The caching of auto-completion results in the Wikibase extension was fixed (#5190)
  • The Wikidata extension was fully renamed to "Wikibase extension" (#4525)
  • The controls of the cluster and edit dialog are greyed out while clustering is taking place (#5369)
  • The handling of unicode whitespace was improved throughout the application (#5105)
  • Our macOS packages (.DMG) are now properly signed and notarized, which should make their installation easier (#4586). Also, the presentation of the DMG image was made more user-friendly by including the customary link to the Applications folder. (#5509)
  • The parsing of the unary minus sign in GREL was fixed (#5465)
  • (From 3.7-beta3 on) The clustering dialog no longer introduces non-breaking spaces when selecting options with spaces in them (#5581)
  • (From 3.7.1 on) The display of the memory usage during project import was fixed (#5665)
  • (From 3.7.2 on) The localization in German was fixed (#5750)
  • (From 3.7.3 on) Starting openrefine.exe on Windows with Java 17 was fixed (#5583)
  • (From 3.7.3 on) Wikibase edits on deleted items are skipped and do not stall the entire batch (#5385)
  • (From 3.7.3 on) The HTML document language is aligned with the language of the interface (#5925)
  • (From 3.7.3 on) The default reconciliation types are displayed with both name and id (#5907)
  • (From 3.7.3 on) The "transpose cells across columns" operation was fixed so that it treats blank cells as null (#5229)
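
The unicode whitespace fixes listed above can be pictured with a small sketch: collapsing runs of whitespace has to recognise characters such as the non-breaking space (U+00A0) or the em space (U+2003), not only ASCII spaces and tabs. The Python below is an illustration only, not OpenRefine's Java code, and the function name `collapse_whitespace` is hypothetical.

```python
import re


def collapse_whitespace(value: str) -> str:
    """Replace each run of whitespace with a single ASCII space."""
    # For str patterns, \s matches Unicode whitespace in Python 3,
    # so non-breaking and em spaces are collapsed as well.
    return re.sub(r"\s+", " ", value)
```

An ASCII-only pattern such as `[ \t]+` would leave non-breaking spaces untouched, which is the kind of gap these fixes address.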

3.7.5

7 months ago

This is the sixth stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New since 3.7.4

  • A moderate vulnerability in the database extension was fixed. Connecting to a malicious MySQL server could allow it to read files or execute arbitrary code on the machine running OpenRefine. A CVE identifier for the vulnerability has been requested. The vulnerability was reported by @nbxiglk0.

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

For developers

And many bug fixes, see the full list of changes for 3.7.

3.7.4

9 months ago

This is the fifth stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New since 3.7.3

  • A moderate vulnerability in project import was fixed. Importing a maliciously crafted project could execute arbitrary code on the machine running OpenRefine. A CVE identifier for the vulnerability has been requested. The vulnerability was reported by @stefan-schiller-sonarsource.

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

For developers

And many bug fixes, see the full list of changes for 3.7.

3.7.3

10 months ago

This is the fourth stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New since 3.7.2

  • Starting openrefine.exe on Windows with Java 17 was fixed (#5583)
  • Wikibase edits on deleted items are skipped and do not stall the entire batch (#5385)
  • The HTML document language is aligned with the language of the interface (#5925)
  • The default reconciliation types are displayed with both name and id (#5907)
  • The "transpose cells across columns" operation was fixed so that it treats blank cells as null (#5229)

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

For developers

And many bug fixes, see the full list of changes for 3.7.

3.7.2

1 year ago

This is the third stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New since 3.7.1

  • The German translation was fixed and other translations were updated (#5750)

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

For developers

And many bug fixes, see the full list of changes for 3.7.

3.7.1

1 year ago

This is the second stable release of the 3.7 series. Please back up your workspace directory before installing and report any problems that you encounter.

New features

  • Most text exposed to users in OpenRefine's UI can now be translated. Some strings (generated server-side) were not translatable so far. To help translators catch up on this backlog, do not hesitate to join us on Weblate. (#5030)
  • New media files can be uploaded to Wikibase instances such as Wikimedia Commons. The wikitext of existing files can also be edited thanks to the new fields introduced. (#4682)
  • A button "Discover Wikibase instances…" was added on the dialog which lists the registered Wikibase instances (#5007), whose design was improved (#5009)
  • In the Wikibase schema editor, statements with non-standard datatypes (such as EDTF dates or musical notations) are now supported, assuming they use strings as underlying representation (#3263)
  • The Wikibase issues tab now makes it possible to locate which rows are responsible for certain issues, using facets (#5033)
  • The default throttle delay for the "Add column by fetching URLs" operation was reduced to 500ms and the error reporting for this field was improved (#5188)
  • Wikibase templates (incomplete Wikibase schemas) can be saved and shared, as a way of helping contributors use the same way of structuring data in a Wikibase instance (#5043, #5303)
  • The line-based importer now supports a custom delimiter, instead of only newlines (#4103)
  • The Excel importer can be configured to import all cells as text, disabling the use of other datatypes supported by OpenRefine (#4838)
  • The "some value" and "no value" Wikibase values can now be uploaded by OpenRefine (#5360)
  • The Excel importer will also avoid coercing cell values to OpenRefine datatypes which do not fully fit them, such as representing a date as a date with time (#5389, #5390).

GREL changes

  • Improved error handling in number formatting with the GREL toString function (#816)
  • The behaviour of the GREL function wholeText() has changed slightly in the way it handles newlines, following an upstream change in the jsoup library (jsoup issue #1636)
  • A new parent GREL function, to obtain the parent element of an XML element, was added (#5176)

For developers

And many bug fixes, see the full list of changes for 3.7.