# file.d

`file.d` is a blazing fast tool for building data pipelines: read, process, and output events. It was primarily developed to read from files, but it also supports numerous input/action/output plugins.
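Pipelines are described in a YAML config. Below is a minimal sketch of what a single pipeline might look like, assuming the documented `pipelines > input / actions / output` layout; the parameter names (`watching_dir`, `field`) and the `--config` flag are illustrative assumptions, so check the plugin docs for the exact options.

```yaml
pipelines:
  example:                      # pipeline name is arbitrary
    input:
      type: file                # tail files under a directory
      watching_dir: /var/log    # assumed parameter name; see the file plugin docs
    actions:
      - type: json_decode       # parse the raw line as JSON
        field: message          # assumed name of the field holding the raw line
    output:
      type: stdout              # print resulting events; handy for debugging
```

With a config like this saved as `config.yaml`, the daemon would be started along the lines of `./file.d --config config.yaml` (flag name assumed).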
⚠ Although we use it in production, it hasn't reached v1.0.0 yet. Please test your pipelines carefully in dev/stage environments.
## Contributing

`file.d` is an open-source project, and contributions are very welcome! Please make sure to read our contributing guide before creating an issue or opening a PR!
## Motivation

Well, we already have several similar tools: vector, filebeat, logstash, fluentd, fluent-bit, etc. Performance tests show that the best of them achieve a throughput of roughly 100MB/s. But it's 2023: HDDs and NICs can handle a few GB/s, and CPUs can process dozens of GB/s. Are you sure 100MB/s is what we deserve? Are you sure it is fast?
## Performance

On a MacBook Pro 2017 with two physical cores, `file.d` can achieve the following throughput:

- `files > devnull` case
- `files > json decode > devnull` case

TBD: throughput on production servers.
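Each benchmark case corresponds to a pipeline config. As a rough sketch (not the actual benchmark setup), the second case is the pipeline from the introduction with its output swapped to the `devnull` plugin, which discards events and so measures pure read/decode throughput:

```yaml
output:
  type: devnull   # discard events; isolates input + decode performance
```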
## Plugins

- **Input:** dmesg, fake, file, http, journalctl, k8s, kafka
- **Action:** add_file_name, add_host, convert_date, convert_log_level, convert_utf8_bytes, debug, discard, flatten, join, join_template, json_decode, json_encode, json_extract, keep_fields, mask, modify, move, parse_es, parse_re2, remove_fields, rename, set_time, split, throttle
- **Output:** clickhouse, devnull, elasticsearch, file, gelf, kafka, postgres, s3, splunk, stdout
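To illustrate how these plugins compose, here is a hedged sketch of a Kubernetes-to-Elasticsearch pipeline using a few of the plugins listed above; the parameter names (`field`, `fields`, `endpoints`) are assumptions rather than confirmed options, so consult each plugin's documentation:

```yaml
pipelines:
  k8s_to_es:
    input:
      type: k8s                    # collect container logs from Kubernetes
    actions:
      - type: json_decode          # parse the raw log line as JSON
        field: log                 # assumed name of the field with the raw line
      - type: keep_fields          # drop everything except the listed fields
        fields:                    # assumed parameter name
          - message
          - level
          - ts
    output:
      type: elasticsearch
      endpoints:                   # assumed parameter name
        - http://elasticsearch:9200
```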
## Community

Join our community on Telegram: https://t.me/file_d_community
Generated using insane-doc