Ssmtool Versions

Simple sentence mining tool for language learning

v0.12.0

1 month ago

New features

  • Replace Web Reader with Epub.js based reader
    • The new reader should look much better than the old one, which has been removed and is no longer accessible in this version. After upgrading, convert your ebooks to EPUB format and use the new reader.
  • Improved the Wiktionary (English) dictionary source by using the kaikki.org API. Among other things, entries now include extra information such as gender, animacy, and IPA pronunciation, depending on data availability.
  • Support for importing kaikki.org Wiktionary archives.
    • You can import any archive you find on that site, which includes Wiktionary editions with definitions in several languages. Support for archives in non-English languages is preliminary, though it should mostly work. Report any issues on GitHub or in the chatroom.
  • Go through definitions by scrolling on the definition viewer
  • Fail existing cards when adding a duplicate
  • Custom lemmatization rules: you can now add custom rules in the form of regular expressions in the config window
  • Word list importer is now implemented: you can now generate Anki cards from any text file with one word on each line
  • Auto text importer will now treat capitalized names as known to reduce useless cards, except in German and Luxembourgish.
  • Minimum window width setting by @artjomsR in https://github.com/FreeLanguageTools/vocabsieve/pull/127
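The custom lemmatization rules mentioned above are described only as regular expressions entered in the config window. The following is a minimal sketch of how such rules could behave, assuming each rule is a (pattern, replacement) pair and that the first matching rule wins. The rule list, the `apply_rules` name, and the ordering policy are illustrative assumptions, not VocabSieve's actual format.

```python
import re

# Hypothetical rules: (pattern, replacement) pairs, tried in order.
# They are illustrative only and not taken from VocabSieve's configuration.
RULES = [
    (r"(.+)ies$", r"\1y"),     # "stories" -> "story"
    (r"(.+)ing$", r"\1"),      # "walking" -> "walk"
    (r"(.+[^s])s$", r"\1"),    # "cats" -> "cat", but leaves "glass" alone
]


def apply_rules(word, rules=RULES):
    """Return the word rewritten by the first matching rule, or unchanged."""
    for pattern, replacement in rules:
        new_word, count = re.subn(pattern, replacement, word)
        if count:
            return new_word
    return word
```

Trying rules in order keeps the behavior predictable: a more specific rule placed earlier takes priority over a general one placed later.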

Other changes

  • The behavior of Ctrl-D and Ctrl-Shift-D should be more intuitive now
  • Fix image not found bug
  • Seen data processing should now be faster. You may need to rebuild your seen database in Track -> Content manager
  • Default lemma_policy is now "Lemma first, then original". Existing users are recommended to change it manually.
  • Old database migrations have been removed. If you are upgrading from before 0.10, you should first upgrade to 0.11.1.
  • Interface should no longer hang when fetching definition

Full Changelog: https://github.com/FreeLanguageTools/vocabsieve/compare/v0.11.1...v0.12.0

v0.11.1

3 months ago

New features

Bugfixes

New Contributors

v0.11.0

4 months ago

New features

  • Clipboard monitoring should now work on Wayland and on macOS, since a workaround (polling) has been implemented.
  • Word scores are now cached and are used to display word status on lookup.
  • Dictionary definition handling has been revamped. Now you can switch between several different results in each of the two definition boxes.
    • You can specify the order in which definitions are fetched, as well as what form of the word to use for each dictionary (lemma policy)
  • Pronunciation: Forvo fetching should now be able to get either mp3 or ogg. Windows or iOS users should use mp3.
  • Logging is introduced in the program. You can view the logs by going to Help -> View session logs. Please include the log when reporting an issue.
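The clipboard workaround above is described only as "polling". A minimal sketch of that idea follows, assuming the clipboard is read through a caller-supplied function. `read_clipboard`, `on_change`, `interval`, and `max_polls` are hypothetical names chosen for illustration, and this is not VocabSieve's actual implementation.

```python
import time


def poll_clipboard(read_clipboard, on_change, interval=0.2, max_polls=None):
    """Call on_change(text) each time read_clipboard() returns a new value.

    read_clipboard: hypothetical callable that returns the clipboard text.
    max_polls: stop after this many polls (None means poll forever).
    """
    last = read_clipboard()  # the initial content does not count as a change
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = read_clipboard()
        if current != last:
            last = current
            on_change(current)
        polls += 1
```

Polling trades a small delay and some repeated reads for independence from clipboard change notifications, which is the usual reason to fall back to it.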

Other changes

  • Interface has been somewhat simplified to reduce clutter. Lookup buttons have been removed but they are still accessible via Ctrl/Cmd+D and Ctrl/Cmd+Shift+D.
  • Shortcut for clearing the selected image is now Ctrl+W in order to ensure that all shortcuts are accessible with the left hand alone. In the future a different set of shortcuts will be added for right-handed operation.
  • The local HTTP API has been removed for now. A more refined API will be introduced when the need arises.
  • You are recommended to use the new manual from now on.
  • The config tool has been moved to the menu bar.

v0.10.2

8 months ago

New features

  • Option to capitalize sentences from clipboard (by @AtilioA)
  • Async audio fetching for better UI responsiveness (by @AtilioA)

Bug fixes

  • Fetching all pronunciations (by @artjomsR)
  • KOReader importer should now truncate language codes with regional suffixes. (by @billyc)
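As an illustration of the language code fix above, a regional suffix can be dropped by keeping only the part before the first separator. This is a sketch under the assumption that codes look like `en-US` or `pt_BR`; the function name is hypothetical and the real importer may handle more cases.

```python
import re


def base_language_code(code):
    """Strip a regional suffix such as "-US" or "_BR" from a language code."""
    # Split on the first hyphen or underscore and keep the leading part.
    return re.split(r"[-_]", code.strip(), maxsplit=1)[0].lower()
```

For example, `base_language_code("en-US")` gives `"en"`, while a bare code such as `"de"` is returned unchanged.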

Other changes

  • Default instance of Lingva translate is now https://lingva.lunar.icu. Existing users will need to change this manually.

v0.10.1

1 year ago

Users upgrading from before 0.10 should also view the release notes of 0.10.0. It is NOT safe to downgrade to a version before 0.10 after upgrading due to a database change.

New features

  • Implemented basic dark theme support. The theme should also look better overall especially for Windows users.

Bug fixes

  • Kindle importer should now record the correct lookup timestamps. For existing users a database change will take place to fix this.
  • KOReader vocab builder importer should now display a correct count
  • Python package should now have all dependencies listed

v0.10.0

1 year ago

It is NOT safe to downgrade to a version before 0.10 after upgrading due to a database change.

New features

  • Vocabulary tracking. From this version, VocabSieve is now able to process data from a variety of sources, including Anki, past lookups, and texts you have read (incl. video subtitles) to create a database of words you know. Parameters are configurable via the settings.
    • You can export either the exact data or simply a list of known words in JSON format for your own use.
  • Book analyzer. Based on the words you know, a book analyzer is provided to help you quickly screen for immersion content based on difficulty and ease of making flashcards.
  • The learning simulator takes into account effects of learning words on text difficulty over the course of reading.
  • Statistics. Several statistics are now viewable in the program itself, via the Statistics option on the menu bar.
  • A new KOReader importer based on its Vocabulary Builder is available. Due to the various limitations and inherent complexity of the original KOReader importer, the original is now considered deprecated and is planned to be removed two releases from now (in 0.12). To use the new importer, please ensure that your KOReader vocabulary builder is set to save context and not to add words automatically, which reduces noise.
  • Cognates are now supported as a type of dictionary. This can help make vocabulary tracking more accurate, especially if your target language shares a lot of words with languages you know.

Other changes

  • When multiple words are selected, the program will now brute-force through all combinations of lemmatized and non-lemmatized versions of each word until a definition is found. This should improve dictionary usability with phrases.
  • You can now use compressed JSON formats (.json.xz, .json.gz, .json.bz2), but NOT zip, tar, 7z, or .json.zst. A compressed file should behave the same as if you had first decompressed the JSON.
  • Kindle and KOReader imports will now import the entirety of your lookup histories automatically, before selecting any book to make flashcards from. This provides a fuller picture of the words you looked up.
    • KOReader users (regardless of using the old or the new importer) should now select the directory containing both the KOReader settings folder and the books. For most users, the easiest way to do this is to select the device directly for Kobo or the user's home folder (/storage/emulated/0) for Android.
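The brute-force phrase lookup described in the list above can be sketched as follows. It assumes a caller-supplied `lemmatize` function and a plain dict as the dictionary, and it tries the original form of each word before its lemma. These names and the search order are assumptions for illustration, not the program's actual code.

```python
from itertools import product


def lookup_phrase(words, lemmatize, dictionary):
    """Try every mix of original and lemmatized words until one is defined.

    Returns (phrase, definition) for the first hit, or None if nothing matches.
    """
    # For each word, the candidate forms are the original and its lemma,
    # deduplicated with the original first.
    options = [list(dict.fromkeys([w, lemmatize(w)])) for w in words]
    for combo in product(*options):
        phrase = " ".join(combo)
        if phrase in dictionary:
            return phrase, dictionary[phrase]
    return None
```

With n selected words there are at most 2^n candidate phrases, which stays small for the short phrases typically selected during a lookup.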

v0.9.2

1 year ago

Fixes

  • Define and Define (direct) buttons should work as they did in v0.8.3.

Changes

  • Importers will now remember the latest note from your eReader. This should prevent problems when your notes were not synchronized properly or when your eReader clock is wrong.
  • The Windows/Mac packages should now support PyMorphy2. You can expect a large lemmatization performance improvement for Russian and Ukrainian.

Note: Windows users will see a console window when opening the program. This is due to a mistake in the build process. It does not indicate a problem. Please simply ignore it for now. It will be fixed in the next release.

v0.9.1

1 year ago

This version fixes an error affecting some users upgrading from earlier versions to v0.9.0.

The notes below describe the differences compared to v0.8.3.

Existing users: the first launch will take longer than usual for a database update. This should only happen once. Do not try to stop it.

New features

  • All reader import features have been rewritten to share a single base, with a greatly improved interface. Now you can change the date and get immediate feedback on how many words you are selecting.
  • The KOReader importer works the same as before, supporting epub and fb2, and now also fb2.zip for books read in compressed form. The Kindle importer now reads both vocab.db and "My Clippings.txt", so it no longer has to go through the books to find the sentences. This also means it can support all formats that your Kindle can read. Both importers show a card review so you can tell what is going to be added to Anki, and both add tags based on the name of the book.
  • Default note type: for new users, it will add a default note type automatically named "vocabsieve-notes", and use the appropriate fields automatically to save the effort of downloading and importing a note type.
  • The lemma is now recorded in the lookup records. eReader imports are now recorded with the time at which they were first created. It is recommended that you press the "Look up" button for all of your past notes at least once (adding to Anki is not necessary) to make sure VocabSieve knows you looked up the word before.
  • There is now an indicator of how many times you have looked up the word previously based on the lemma.
  • The importer now remembers your last import date and will restore it automatically. It will also remember the path, which makes it possible to use the "Repeat last import" option, which should start an import where you last successfully imported notes.

Changes

  • The local resources manager will now remove dictionaries when rebuilding if they cannot be found.
  • Interface should no longer freeze when playing audio. This should feel more responsive.

v0.8.3

1 year ago

This version fixes a bug that caused the program not to launch in the previous release (0.8.2). Apologies for the delay. Below are the changes compared to 0.8.1.

Features

  • Better bolding: words can now be bolded directly in the text box, and when one word is bolded, all other words in the sentence with the same lemma will be bolded together. Contributed by @jonahsol
  • Greedy lemmatization: may help with lemmatization in situations where the lemmatizer coverage is low. By @jonahsol
  • Bold is now included for batch imported notes from Kindle or KOReader.
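The lemma-based bolding described in the list above can be illustrated with a short sketch: every word whose lemma matches the lemma of the chosen word gets wrapped in a bold marker. The `lemmatize` callable, the function name, and the `__word__` marker are assumptions made for this example and may differ from what the program does.

```python
import re


def bold_same_lemma(sentence, target, lemmatize, marker="__"):
    """Wrap every word sharing the target's lemma in a bold marker."""
    target_lemma = lemmatize(target.lower())

    def wrap(match):
        word = match.group(0)
        if lemmatize(word.lower()) == target_lemma:
            return f"{marker}{word}{marker}"
        return word

    # \w+ matches runs of letters and digits, leaving punctuation untouched.
    return re.sub(r"\w+", wrap, sentence)
```

Comparing lemmas rather than surface forms is what lets a single bolded word carry over to its other inflections in the same sentence.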

Changes

  • Markdown is now the default display mode. This is due to issues present with Markdown-HTML mode.

Bugfixes

  • Forvo audio fetching should work again.

v0.8.2

1 year ago

Features

  • Better bolding: words can now be bolded directly in the text box, and when one word is bolded, all other words in the sentence with the same lemma will be bolded together. Contributed by @jonahsol
  • Greedy lemmatization: may help with lemmatization in situations where the lemmatizer coverage is low. By @jonahsol
  • Bold is now included for batch imported notes from Kindle or KOReader.

Changes

  • Markdown is now the default display mode. This is due to issues present with Markdown-HTML mode.

Bugfixes

  • Forvo audio fetching should work again.