Easy to use open source fast database for search | Good alternative to Elasticsearch now | Drop-in replacement for E in the ELK soon
Released: Aug 23rd 2023
Version 6.2.12 continues the 6.2 series and addresses issues discovered after the release of 6.2.0.
TimeoutStartSec from infinity to 0 for better compatibility with CentOS 7.
searchdreplication.cpp: beggining -> beginning.
Thd_t build issue on Windows related to atomic copy restrictions.
ColumnarScan.
AF_INET error in the test.
/bulk endpoints in the manual.
Released: Aug 4th 2023
mysqldump
pseudo_sharding has been adjusted to be limited to the number of free threads. This update considerably enhances the throughput performance.
/json/pq HTTP endpoint.
upper() and lower().
count(*) queries, a precalculated value is now returned.
SELECT for making arbitrary calculations and displaying @@sysvars. Unlike before, you are no longer limited to just one calculation. Therefore, queries like select user(), database(), @@version_comment, version(), 1+1 as a limit 10 will return all the columns. Note that the optional 'limit' will always be ignored.
CREATE DATABASE stub query.
ALTER TABLE table REBUILD SECONDARY, secondary indexes are now always rebuilt, even if attributes weren't updated.
SELECT DATABASE() command. However, it will always return Manticore. This addition is crucial for integrations with various MySQL tools.
/cli_json endpoint to function as the previous /cli.
thread_stack can now be altered during runtime using the SET statement. Both session-local and daemon-wide variants are available. Current values can be accessed in the show variables output.
SHOW STATUS command.
DESC and SHOW CREATE TABLE now match that of SELECT * FROM.
P01) during various errors. This enhancement aids in identifying which parser caused an error and also obscures non-essential internal details.
sentence to show the entire sentence
strftime() function.
/bulk endpoint reports information regarding the number of processed and non-processed strings (documents) in case of an error.
CREATE TABLE operation can run at a time.
Get call, replacing the previous two-step AdvanceTo + Get calls to retrieve a value.
CheckReplaceEntry call was removed from the group sorter to expedite the calculation of aggregate functions.
CREATE TABLE options read_buffer_docs and read_buffer_hits now support k/m/g syntax.
apt/yum install manticore-language-packs. On macOS, use the command brew install manticoresoftware/tap/manticore-language-packs.
SHOW CREATE TABLE and DESC operations.
INSERT queries, new INSERT queries will fail until enough disk space becomes available.
/bulk endpoint now processes empty lines as a commit command. More info here.
count(*) is used with a single filter, queries now leverage precalculated data from secondary indexes when available, substantially speeding up query times.
/*+ SecondaryIndex(uid) */. Please note that the old syntax is no longer supported.
@ in table names has been disallowed to prevent syntax conflicts.
indexed and attribute are now regarded as a single field during INSERT, DESC, and ALTER operations.
manticore.json config.
.sph files could be corrupted ALTER. Fixed.
pre_commit error occurring when replace is replicated from multiple master nodes.
pseudo_sharding was disabled.
show index status command has been modified and now varies depending on the type of index in use.
expand_keywords option.
SNIPPETS() was called.
not_terms_only_allowed option to RT index with killed documents.
FEDERATED engine with aggregate.
rt_attr_json column was incompatible with columnar storage.
ignore_chars.
--dumpdocids command.
morphology_skip_fields.
max_packet_size check for replication commands between nodes. Additionally, the latest cluster error has been added to the status display.
MANTICORE_BUDDY_TIMEOUT (default 3 seconds) to control the daemon's wait duration for a buddy message at startup.
SHOW CREATE TABLE.
SNIPPET() function.
all()/any() is logged.
Released: Mar 15 2023
/pq HTTP endpoint to be an alias of the /json/pq HTTP endpoint.
Released: Feb 10 2023
Released: Feb 7 2023
Starting with this release, Manticore Search comes with Manticore Buddy, a sidecar daemon written in PHP that handles high-level functionality that does not require super low latency or high throughput. Manticore Buddy operates behind the scenes, and you may not even realize it is running. Although it is invisible to the end user, it was a significant challenge to make Manticore Buddy easily installable and compatible with the main C++-based daemon. This major change will allow the team to develop a wide range of new high-level features, such as shards orchestration, access control and authentication, and various integrations like mysqldump, DBeaver, Grafana mysql connector. For now it already handles SHOW QUERIES, BACKUP and Auto schema.
This release also includes more than 130 bug fixes and numerous features, many of which can be considered major.
SET GLOBAL ES_COMPAT=off.
manticore-backup for backing up and restoring Manticore instance
KILL to kill a long-running SELECT.
max_matches for aggregation queries to increase accuracy and lower response time.
Issue #822 SQL commands FREEZE/UNFREEZE to prepare a real-time/plain table for a backup.
Commit c470 New settings accurate_aggregation and max_matches_increase_threshold for controlled aggregation accuracy.
Issue #718 Support for signed negative 64-bit IDs. Note, you still can't use IDs > 2^63, but you can now use IDs in the range from -2^63 to 0.
As we recently added support for secondary indexes, things became confusing as "index" could refer to a secondary index, a full-text index, or a plain/real-time index. To reduce confusion, we are renaming the latter to "table". The following SQL/command line commands are affected by this change. Their old versions are deprecated, but still functional:
index <table name> => table <table name>,
searchd -i / --index => searchd -t / --table,
SHOW INDEX STATUS => SHOW TABLE STATUS,
SHOW INDEX SETTINGS => SHOW TABLE SETTINGS,
FLUSH RTINDEX => FLUSH TABLE,
OPTIMIZE INDEX => OPTIMIZE TABLE,
ATTACH TABLE plain TO RTINDEX rt => ATTACH TABLE plain TO TABLE rt,
RELOAD INDEX => RELOAD TABLE,
RELOAD INDEXES => RELOAD TABLES.
We are not planning to make the old forms obsolete, but to ensure compatibility with the documentation, we recommend changing the names in your application. What will be changed in a future release is the "index" to "table" rename in the output of various SQL and JSON commands.
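If you want to update statements in an application, the rename list above can be applied mechanically. The following is a minimal sketch, not an official tool: the function name `modernize` is hypothetical, it only handles the statement-prefix forms (FLUSH, OPTIMIZE, RELOAD, ATTACH) exactly as they are written in the list, and it leaves the SHOW forms, the config keyword and the searchd flags untouched.

```python
import re

# Deprecated form -> new form, taken from the rename list above.
# Patterns are anchored at the start of the statement so that text
# inside string literals further along is never touched.
_RENAMES = [
    (r"^FLUSH\s+RTINDEX\b", "FLUSH TABLE"),
    (r"^OPTIMIZE\s+INDEX\b", "OPTIMIZE TABLE"),
    (r"^RELOAD\s+INDEXES\b", "RELOAD TABLES"),
    (r"^RELOAD\s+INDEX\b", "RELOAD TABLE"),
    (r"^(ATTACH\s+TABLE\s+\S+\s+)TO\s+RTINDEX\b", r"\1TO TABLE"),
]


def modernize(statement: str) -> str:
    """Rewrite a deprecated statement to its 'table' form.

    Statements that match none of the patterns are returned unchanged
    (apart from surrounding whitespace being stripped).
    """
    result = statement.strip()
    for pattern, replacement in _RENAMES:
        result = re.sub(pattern, replacement, result, flags=re.IGNORECASE)
    return result


print(modernize("OPTIMIZE INDEX products"))
print(modernize("ATTACH TABLE plain TO RTINDEX rt"))
```

Since the old forms remain functional, a rewrite like this is optional; it mainly keeps application code aligned with the documentation.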
Queries with stateful UDFs are now forced to be executed in a single thread.
Issue #1011 Refactoring of everything related to time scheduling as a prerequisite for parallel chunks merging.
⚠️ BREAKING CHANGE: Columnar storage format has been changed. You need to rebuild those tables that have columnar attributes.
⚠️ BREAKING CHANGE: Secondary indexes file format has been changed, so if you are using secondary indexes for searching and have searchd.secondary_indexes = 1 in your configuration file, be aware that the new Manticore version will skip loading the tables that have secondary indexes. It's recommended to:
Set searchd.secondary_indexes to 0 in the configuration file.
Run ALTER TABLE <table name> REBUILD SECONDARY for each index to rebuild secondary indexes.
If you are running a replication cluster, you'll need to run ALTER TABLE <table name> REBUILD SECONDARY on all the nodes or follow this instruction with just one change: run the ALTER .. REBUILD SECONDARY instead of the OPTIMIZE.
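With many tables, the rebuild statements can be generated rather than typed by hand. This is a small sketch under assumptions: the helper name `rebuild_statements` and the table names in the usage line are hypothetical, and the identifier check is a conservative assumption of this sketch, not Manticore's actual naming rule. It only builds the statement text; running it against the server is left to your MySQL client.

```python
import re

# Conservative identifier check (an assumption of this sketch, not
# Manticore's real rule) so that table names are never interpolated blindly.
_SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def rebuild_statements(table_names):
    """Return one ALTER TABLE ... REBUILD SECONDARY statement per table."""
    statements = []
    for name in table_names:
        if not _SAFE_NAME.fullmatch(name):
            raise ValueError(f"unexpected table name: {name!r}")
        statements.append(f"ALTER TABLE {name} REBUILD SECONDARY")
    return statements


# Hypothetical table names, for illustration only.
for statement in rebuild_statements(["products", "logs"]):
    print(statement + ";")
```

In a replication cluster the same list would have to be executed on every node, as noted above.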
⚠️ BREAKING CHANGE: The binlog version has been updated, so any binlogs from previous versions will not be replayed. It is important to ensure that Manticore Search is stopped cleanly during the upgrade process. This means that there should be no binlog files in /var/lib/manticore/binlog/ except for binlog.meta after stopping the previous instance.
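The clean-stop condition can be checked before installing the new version. The sketch below assumes the default binlog path mentioned above; the function name `leftover_binlogs` is hypothetical, and the check is simply "any regular file other than binlog.meta counts as a leftover".

```python
from pathlib import Path


def leftover_binlogs(binlog_dir="/var/lib/manticore/binlog"):
    """Return the names of files in binlog_dir other than binlog.meta.

    An empty list means the previous instance appears to have been
    stopped cleanly, in the sense described above.
    """
    directory = Path(binlog_dir)
    return sorted(
        entry.name
        for entry in directory.iterdir()
        if entry.is_file() and entry.name != "binlog.meta"
    )
```

A possible use is to call `leftover_binlogs()` in an upgrade script after stopping the old instance and abort the upgrade if the returned list is not empty.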
Issue #849 SHOW SETTINGS: helper command for manticore-backup.
Issue #1007 SET GLOBAL CPUSTATS=1/0 turns on/off cpu time tracking; SHOW THREADS now doesn't show CPU statistics when the cpu time tracking is off.
Issue #1009 RT table RAM chunk segments can now be merged while the RAM chunk is being flushed.
Issue #1012 Added secondary index progress to the output of indexer.
Issue #1013 Previously a table record could be removed by Manticore from the index list if it couldn't start serving it on start. The new behaviour is to keep it in the list to try to load it on the next start.
indextool --docextract returns all the words and hits belonging to the requested document.
Commit 2b29 Environment variable dump_corrupt_meta enables dumping corrupted table meta data to the log in case searchd can't load the index.
Commit c7a3 DEBUG META can show max_matches and pseudo sharding statistics.
Commit 6bca A better error instead of the confusing "Index header format is not json, will try it as binary...".
Commit bef3 Ukrainian lemmatizer path has been changed.
Commit 4ae7 Secondary indexes statistics have been added to SHOW META.
Commit 2e7c JSON interface can now be easily visualized using Swagger Editor https://manual.manticoresearch.com/dev/Openapi#OpenAPI-specification.
select attr, count(*) from plain_index (w/o filtering) are now faster in case you are using MCL.
brew install manticoresoftware/manticore/manticoresearch manticoresoftware/manticore/manticore-extra.
text
binlog_flush = 1 has been broken all the time since Sphinx. Fixed.
got exception while reading ist stream: mkstemp(./gmb_pF6TJi) failed: 13 (Permission denied) if the searchd was started from a directory it can't write to.
Release blogpost https://manticoresearch.com/blog/manticore-search-5-0-0/
secondary_indexes = 1 either in your configuration file or using SET GLOBAL. The new functionality is supported in all operating systems except old Debian Stretch and Ubuntu Xenial.
a=1 and (b=2 or c=3) in JSON: must (AND), should (OR) and must_not (NOT) worked only on the highest level. Now they can be nested.
Content-Length). On the server's side Manticore now always processes incoming HTTP data in streaming fashion without waiting for the whole batch to be transferred as previously, which:
max_packet_size (128MB), e.g. 1GB at once.
100 Continue: now you can transfer large batches from curl (including curl libraries used by various programming languages) which by default does Expect: 100-continue and waits some time before actually sending the batch. Previously you had to add Expect: header, now it's not needed.
select * from <columnar table> are now much faster than previously, especially if there are many fields in the schema.
total_found in SHOW META and hits.total in JSON output. It is now only accurate in case you see total_relation: eq while total_relation: gte means the actual number of matching documents is greater than the total_found value you've got. To retain the previous behaviour you can use search option cutoff=0, which makes total_relation always eq.
stored_fields = (empty value) to make all fields non-stored (i.e. revert to the previous behaviour).
.meta, .sph) were in binary format, now it's just json. The new Manticore version will convert older indexes automatically, but:
WARNING: ... syntax error, unexpected TOK_IDENT
SHOW META after SELECT and it will work the same way it works via mysql. Note, previously Connection: keep-alive HTTP header was supported too, but it only caused reusing the same connection. Since this version it also makes the session stateful.
columnar_attrs = * to define all your attributes as columnar in the plain mode which is useful in case the list is long.
--new-cluster (run tool manticore_new_cluster in Linux).
127.0.0.1 instead of 0.0.0.0 in case no listen at all is specified in config. Even though in the default configuration which is shipped with Manticore Search the listen setting is specified and it's not typical to have a configuration with no listen at all, it's still possible. Previously Manticore would listen on 0.0.0.0 which is not secure, now it listens on 127.0.0.1 which is usually not exposed to the Internet.
AVG() accuracy: previously Manticore used float internally for aggregations, now it uses double which increases the accuracy significantly.
DEBUG malloc_stats support for jemalloc.
sphinxql by default. If you are used to plain format you need to add query_log_format = plain to your configuration file.
max_connections limit, which could cause "maxed out" error for non-VIP connections. Now VIP connections are not counted towards the limit. Current number of VIP connections can be also seen in SHOW STATUS and status.
/sql?mode=raw now requires escaping and returns an array.
/bulk INSERT/REPLACE/DELETE requests:
low_priority and boolean_simplify now require a value (0/1): previously you could do SELECT ... OPTION low_priority, boolean_simplify, now you need to do SELECT ... OPTION low_priority=1, boolean_simplify=1.
query_log_format=sphinxql. Previously only full-text part was logged, now it's logged as is.
⚠️ BREAKING CHANGE: because of the new structure when you upgrade to Manticore 5 it's recommended to remove old packages before you install the new ones:
yum remove manticore*
apt remove manticore*
New deb/rpm packages structure. Previous versions provided:
manticore-server with searchd (main search daemon) and all needed for it
manticore-tools with indexer and indextool
manticore including everything
manticore-all RPM as a meta package referring to manticore-server and manticore-tools
The new structure is:
manticore - deb/rpm meta package which installs all the above as dependencies
manticore-server-core - searchd and everything to run it alone
manticore-server - systemd files and other supplementary scripts
manticore-tools - indexer, indextool and other tools
manticore-common - default configuration file, default data directory, default stopwords
manticore-icudata, manticore-dev, manticore-converter didn't change much
.tgz bundle which includes all the packages
Support for Ubuntu Jammy
Support for Amazon Linux 2 via YUM repo
application/x-ndjson
ranker could be specified twice in query log
It takes 48 seconds to insert 1M PQ rules and 406 seconds to REPLACE just 40K in 10K batches.
root@perf3 ~ # mysql -P9306 -h0 -e "drop table if exists pq; create table pq (f text, f2 text, j json, s string) type='percolate';"; date; for m in `seq 1 1000`; do (echo -n "insert into pq (id,query,filters,tags) values "; for n in `seq 1 1000`; do echo -n "(0,'@f (cat | ( angry dog ) | (cute mouse)) @f2 def', 'j.json.language=\"en\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; [ $n != 1000 ] && echo -n ","; done; echo ";")|mysql -P9306 -h0; done; date; mysql -P9306 -h0 -e "select count(*) from pq"
Wed Dec 22 10:24:30 AM CET 2021
Wed Dec 22 10:25:18 AM CET 2021
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
root@perf3 ~ # date; (echo "begin;"; for offset in `seq 0 10000 30000`; do n=0; echo "replace into pq (id,query,filters,tags) values "; for id in `mysql -P9306 -h0 -NB -e "select id from pq limit $offset, 10000 option max_matches=1000000"`; do echo "($id,'@f (tiger | ( angry bear ) | (cute panda)) @f2 def', 'j.json.language=\"de\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; n=$((n+1)); [ $n != 10000 ] && echo -n ","; done; echo ";"; done; echo "commit;") > /tmp/replace.sql; date
Wed Dec 22 10:26:23 AM CET 2021
Wed Dec 22 10:26:27 AM CET 2021
root@perf3 ~ # time mysql -P9306 -h0 < /tmp/replace.sql
real 6m46.195s
user 0m0.035s
sys 0m0.008s
It takes 34 seconds to insert 1M PQ rules and 23 seconds to REPLACE them in 10K batches.
root@perf3 ~ # mysql -P9306 -h0 -e "drop table if exists pq; create table pq (f text, f2 text, j json, s string) type='percolate';"; date; for m in `seq 1 1000`; do (echo -n "insert into pq (id,query,filters,tags) values "; for n in `seq 1 1000`; do echo -n "(0,'@f (cat | ( angry dog ) | (cute mouse)) @f2 def', 'j.json.language=\"en\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; [ $n != 1000 ] && echo -n ","; done; echo ";")|mysql -P9306 -h0; done; date; mysql -P9306 -h0 -e "select count(*) from pq"
Wed Dec 22 10:06:38 AM CET 2021
Wed Dec 22 10:07:12 AM CET 2021
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
root@perf3 ~ # date; (echo "begin;"; for offset in `seq 0 10000 990000`; do n=0; echo "replace into pq (id,query,filters,tags) values "; for id in `mysql -P9306 -h0 -NB -e "select id from pq limit $offset, 10000 option max_matches=1000000"`; do echo "($id,'@f (tiger | ( angry bear ) | (cute panda)) @f2 def', 'j.json.language=\"de\"', '{\"tag1\":\"tag1\",\"tag2\":\"tag2\"}')"; n=$((n+1)); [ $n != 10000 ] && echo -n ","; done; echo ";"; done; echo "commit;") > /tmp/replace.sql; date
Wed Dec 22 10:12:31 AM CET 2021
Wed Dec 22 10:14:00 AM CET 2021
root@perf3 ~ # time mysql -P9306 -h0 < /tmp/replace.sql
real 0m23.248s
user 0m0.891s
sys 0m0.047s
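To compare the two benchmark runs above on equal terms, the headline figures can be converted to rules per second. This is plain arithmetic on the numbers quoted in the two summary lines (48 s and 34 s for 1M inserts, 406 s for 40K replaces and 23 s for 1M replaces); nothing here is measured anew, and the variable names are only illustrative.

```python
def rate(rows, seconds):
    """Throughput in rows per second."""
    return rows / seconds


# First run: 1M inserts in 48 s, 40K replaces in 406 s.
first_insert = rate(1_000_000, 48)
first_replace = rate(40_000, 406)

# Second run: 1M inserts in 34 s, 1M replaces in 23 s.
second_insert = rate(1_000_000, 34)
second_replace = rate(1_000_000, 23)

insert_speedup = second_insert / first_insert
replace_speedup = second_replace / first_replace

print(f"INSERT:  {first_insert:,.0f} -> {second_insert:,.0f} rules/s ({insert_speedup:.2f}x)")
print(f"REPLACE: {first_replace:,.0f} -> {second_replace:,.0f} rules/s ({replace_speedup:.0f}x)")
```

On these figures the insert throughput improves by roughly 1.4x, while the replace throughput improves by a factor of about 441, because the first run needed 406 seconds for only 40K rules.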
searchd. It's useful when you want to limit the RT chunks count in all your indexes to a particular number globally.
YEAR() and other timestamp functions.
rt_mem_limit of data before saving a new disk chunk to disk, and while saving was still collecting up to 10% more (aka double-buffer) to minimize possible insert suspension. If that limit was also exhausted, adding new documents was blocked until the disk chunk was fully saved to disk. The new adaptive limit is built on the fact that we have auto-optimize now, so it's not a big deal if disk chunks do not fully respect rt_mem_limit and start flushing a disk chunk earlier. So, now we collect up to 50% of rt_mem_limit and save that as a disk chunk. Upon saving we look at the statistics (how much we've saved, how many new documents have arrived while saving) and recalculate the initial rate which will be used next time. For example, if we saved 90 million documents, and another 10 million docs arrived while saving, the rate is 90%, so we know that next time we can collect up to 90% of rt_mem_limit before starting flushing another disk chunk. The rate value is calculated automatically from 33.3% to 95%.
indexer -v and --version. Previously you could still see indexer's version, but -v/--version were not supported.
MANTICORE_TRACK_RT_ERRORS useful for debugging RT segments corruption.
/var/lib/manticore/binlog/ except binlog.meta after stopping the previous instance.
show threads option format=all. It shows stack of some task info tickets, most useful for profiling needs, so if you are parsing show threads output be aware of the new column.
searchd.workers was obsoleted since 3.5.0, now it's deprecated, if you still have it in your configuration file it will trigger a warning on start. Manticore Search will start, but with a warning.
indextool --check could crash
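The adaptive rt_mem_limit flush rate described in the notes above can be illustrated with a short calculation. This is a sketch inferred from the worked example in the text (90 million documents saved, 10 million arrived while saving, rate 90%), not the daemon's actual code: the formula saved / (saved + arrived), the function names and the constants' names are assumptions; only the 50% starting point and the 33.3% to 95% bounds come from the text.

```python
MIN_RATE = 0.333      # lower bound stated in the notes (33.3%)
MAX_RATE = 0.95       # upper bound stated in the notes (95%)
INITIAL_RATE = 0.5    # first flush starts at 50% of rt_mem_limit


def next_flush_rate(saved_docs, arrived_while_saving):
    """Estimate the share of rt_mem_limit to collect before the next flush.

    Assumed formula: the fraction of documents that were already in the
    chunk when saving started, clamped to the documented bounds.
    """
    total = saved_docs + arrived_while_saving
    if total <= 0:
        return INITIAL_RATE  # no statistics yet, keep the starting rate
    rate = saved_docs / total
    return max(MIN_RATE, min(MAX_RATE, rate))


def flush_threshold_bytes(rt_mem_limit_bytes, rate):
    """RAM chunk size at which flushing to a disk chunk would start."""
    return int(rt_mem_limit_bytes * rate)


# The example from the text: 90M saved, 10M arrived while saving.
rate = next_flush_rate(90_000_000, 10_000_000)
print(rate, flush_threshold_bytes(128 * 1024 * 1024, rate))
```

The clamp captures the trade-off described above: a fast writer (many documents arriving during the save) pushes the rate down so flushing starts earlier, while a slow writer lets the RAM chunk grow closer to the full limit.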
Full support of Manticore Columnar Library. Previously Manticore Columnar Library was supported only for plain indexes. Now it's supported:
INSERT, REPLACE, DELETE, OPTIMIZE
ALTER
indextool --check
Automatic indexes compaction (#478). Finally, you don't have to call OPTIMIZE manually or via a crontask or other kind of automation. Manticore now does it for you automatically. You can set the default compaction threshold via optimize_cutoff.
Chunk snapshots and locks system revamp. These changes may be invisible from outside at first glance, but they improve the behaviour of many things happening in real-time indexes significantly. In a nutshell, previously most Manticore data manipulation operations relied on locks heavily, now we use disk chunk snapshots instead.
ALTER can add/remove a full-text field. Previously it could only add/remove an attribute.
🔬 Experimental: pseudo sharding for full-scan queries - allows parallelizing any non-full-text search query. Instead of preparing shards manually you can now just enable the new option searchd.pseudo_sharding and expect up to (number of CPU cores) times lower response time for non-full-text search queries. Note it can easily occupy all existing CPU cores, so if you care not only about latency, but about throughput too, use it with caution.
time curl -X POST -d '{"update":{"index":"idx","id":4611686018427387905,"doc":{"mode":0}}}' -H "Content-Type: application/x-ndjson" http://127.0.0.1:6358/json/bulk
real 0m43.783s
user 0m0.008s
sys 0m0.007s
time curl -X POST -d '{"update":{"index":"idx","id":4611686018427387905,"doc":{"mode":0}}}' -H "Content-Type: application/x-ndjson" http://127.0.0.1:6358/json/bulk
real 0m0.006s
user 0m0.004s
sys 0m0.001s
--replay-flags=ignore-trx-errors and --replay-flags=ignore-all-errors so one can still start searchd if the binlog is corrupted
charset_table's default value changes from 0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+401->U+451, U+451 to non_cjk
OPTIMIZE happens automatically. If you don't need it make sure to set auto_optimize=0 in section searchd in the configuration file
ondisk_attrs_default were deprecated, now they are removed
total in SHOW META, but not total_found which is the actual number of found documents.
/var/lib/manticore/binlog/ (only binlog.meta should be in the directory)
--new-cluster (run tool manticore_new_cluster in Linux).
ERROR 1064 (42000): invalid GTID, (null), the donor could become unresponsive while another node was joining
indextool --help doesn't display parameter --rotate
command_insert, command_replace and others were showing wrong metrics
charset_table for a plain index had a wrong default value
SELECT * FROM pq ORDER BY id desc LIMIT 1000 , 100 OPTION max_matches=1100 was not working previously
Maintenance release before Manticore 4
manticore_new_cluster [--force] useful for restarting a replication cluster via systemd
indexer --merge
blend_mode='trim_all'
WHERE json.a = 1
DEBUG SPLIT as a prerequisite for automatic sharding/rebalancing
indextool --dumpheader
reverse_scan is deprecated. Make sure you don't use this option in your queries since 3.6.0 since they will fail otherwise
reverse_scan has been deprecated