Linguistic tools for Japanese-language texts
It's been a while, but I managed to put together a new release. This release is based on the JMdict dump from December 26, 2023.
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -d <database_name> ichiran-240107.pgdump
Note: run (ichiran/mnt:add-errata)
after configuring your ichiran to use this database to get the latest database fixes.
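The two commands above can be wrapped in a small shell function so that the database name and dump file are passed as arguments. This is only a minimal sketch: the function name `restore_ichiran_dump` and the `DRY_RUN` variable are hypothetical and not part of ichiran. By default it only prints the commands, so nothing is touched until `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Hypothetical helper (not part of ichiran): wraps the createdb and
# pg_restore commands from this release note.
# DRY_RUN defaults to 1, in which case the commands are printed, not run.
restore_ichiran_dump() {
    db_name="$1"
    dump_file="$2"
    if [ -z "$db_name" ] || [ -z "$dump_file" ]; then
        echo "usage: restore_ichiran_dump <database_name> <dump_file>" >&2
        return 1
    fi
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Show what would be executed.
        echo "createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 $db_name"
        echo "pg_restore -d $db_name $dump_file"
        return 0
    fi
    # Create the database first; restore only if creation succeeded.
    createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 "$db_name" &&
        pg_restore -d "$db_name" "$dump_file"
}
```

For example, `DRY_RUN=0 restore_ichiran_dump <database_name> ichiran-240107.pgdump` would run both commands, assuming a running PostgreSQL server and a user allowed to create databases.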
Happy new year! I finally managed to put together another release...
Some segmentation bugs are fixed, き-adjectives are now supported (e.g. 幸多き), and なる is now treated as a suffix for く-adverbs (e.g. 固くなる), among other changes.
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-230122.pgdump
Note: run (ichiran/mnt:add-errata)
after configuring your ichiran to use this database to get the latest database fixes.
Apropos of nothing, another Ichiran update!
A possibly breaking change is support for the classical -す causative form (e.g. 言わす vs 言わせる).
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-170521.pgdump
Note: run (ichiran/mnt:add-errata)
after configuring your ichiran to use this database to get the latest database fixes.
Happy new year! It's yet another release of the Ichiran database!
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-040121.pgdump
Note: run (ichiran/mnt:add-errata)
after configuring your ichiran to use this database to get the latest database fixes.
This is the first release that includes Japanese municipality data from the start.
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-220720.pgdump
Note: run (ichiran/mnt:add-errata)
after configuring your ichiran to use this database to get the latest database fixes.
Here's another update, including words like 新型コロナウィルス.
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-030420.pgdump
Note: run (ichiran/mnt:with-db nil (ichiran/mnt:add-errata))
after configuring your ichiran to use this database to get the latest database fixes.
EDIT (05/01/2020): the dump has been updated because the old one had incorrect conjugations of だ
Happy new year, or rather, decade!
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-050120.pgdump
A new dictionary update that includes new words such as 令和.
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-110419.pgdump
Note: run (ichiran/mnt:with-db nil (ichiran/mnt:add-errata))
after configuring your ichiran to use this database to get the latest database fixes.
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-090119.pgdump
Note: run (ichiran/mnt:with-db nil (ichiran/mnt:add-errata))
after configuring your ichiran to use this database to get the latest database fixes.
Create a new database from this dump using the following commands:
createdb -E 'UTF8' -l 'ja_JP.utf8' -T template0 <database_name>
pg_restore -C -d <database_name> ichiran-260818.pgdump
Note: run (ichiran/mnt:with-db nil (ichiran/mnt:add-errata))
after configuring your ichiran to use this database to get the latest database fixes.