Zipkin Versions

Zipkin is a distributed tracing system

3.3.0

1 month ago

Zipkin 3.3 is maintenance only with no new features since the last release.

Notably, this raises the minimum JRE version of all libraries except core from 11 to 17. The only reason we had 11 in the past was due to Spark limitations that affected zipkin-dependencies. This was resolved by Spark 3.4, which we were recently able to upgrade to once all the libraries we use became compatible with it.

Also, we now run the Trivy security and misconfiguration scanner on every commit, in support of our new security policy. This policy was designed around the norms of our maintenance community, which is currently 100% volunteers with no dedicated paid time for the project.

We appreciate Trivy adjusting its open source code for the somewhat unique needs of tracing projects, which require running tests against old library versions. Their open-mindedness in classification policy was critical in coming up with a policy at all. We need to focus the small amount of time we have available on the most important alerts, and not the noise: now we can.

3.2.1

1 month ago

Zipkin 3.2.1 fixes a regression where libraries that improve network performance (netty-tcnative) were not included in the main zipkin jar, resulting in unpublished Docker images.

3.2.0

1 month ago

Zipkin 3.2 improves vision accessibility and language controls.

Before, there was no way to control "dark mode". Also, the color scheme lacked contrast and other features to support vision accessibility. @giaroc's first commit to zipkin knocked this out of the park, resulting in an easier to read and control UI.

before:

[Screenshot of the UI before the change, 2024-04-12]

after:

[Screenshot of the UI after the change, 2024-04-12]

Full Changelog: https://github.com/openzipkin/zipkin/compare/3.1.1..3.2.0

3.1.1

2 months ago

Zipkin 3.1.1 is a hardening release, notably polishing out some UI glitches and experience problems for Cassandra users. Thanks a lot for all the feedback and patience, as we delayed this patch until we felt confident glitches were handled in a way that would be easy to diagnose in the future!

UI Fixes

Users and maintainers have noticed a few glitches since our UI moved from the abandoned react-scripts to vite for packaging. We think we've corrected everything at this point, but please reach out if you believe we didn't.

  • Fixed our test image ghcr.io/openzipkin/zipkin-ui resulting in 404s
  • Fixed handling of the env variable ZIPKIN_UI_BASEDIR, used when zipkin is deployed behind a proxy
    • Added a new ghcr.io/openzipkin/zipkin-uiproxy image that proves this works.
    • A lot of folks pitched in here. Special thanks to @ujo-trackunit, who uses this and provided a lot of insight leading to the fix, as well as @SamTV12345, @reta and @anuraaga, who all took time out to contribute towards resolution.

Cassandra and SASI default change

When STORAGE_TYPE=cassandra3, zipkin uses a feature called SASI for search features. This was enabled by default in Cassandra 3.11+, but in 4.x it became disabled by default.

Unlike schema settings, sasi_indexes_enabled: true is not something zipkin can change. Before, we weren't logging this critical setup problem, so users upgrading from cassandra 3 to 4 had a very hard time figuring it out. We now properly log what's going on, with more context. Ideally, this will help folks correct their configuration.

Here's an example of what is logged if you use the default cassandra docker image, which has SASI disabled:

2024-03-07T08:02:47.184+08:00 ERROR [/] 83635 --- [cking-tasks-2-1] z.s.c.Schema                             : Failed to execute [CREATE CUSTOM INDEX IF NOT EXISTS ON zipkin2.span (l_service) USING 'org.apache.cassandra.index.sasi.SASIIndex'
   WITH OPTIONS = {'mode': 'PREFIX'}]: SASI indexes are disabled. Enable in cassandra.yaml to use.
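A deployment smoke test could scan the server log for this message. The following is a minimal Python sketch of that idea, not part of zipkin: the function name `find_sasi_disabled` is hypothetical, and the only thing taken from the log above is the marker text "SASI indexes are disabled".

```python
from typing import Iterable, List, Tuple

# Marker text copied from the error zipkin logs when SASI is disabled.
SASI_DISABLED_MARKER = "SASI indexes are disabled"


def find_sasi_disabled(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for each line carrying the marker.

    The statement that failed is logged across two lines, so we match on the
    marker alone instead of also requiring "ERROR" on the same line.
    """
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if SASI_DISABLED_MARKER in line
    ]


sample_log = [
    "ERROR z.s.c.Schema : Failed to execute [CREATE CUSTOM INDEX IF NOT EXISTS ...",
    "   WITH OPTIONS = {'mode': 'PREFIX'}]: SASI indexes are disabled. Enable in cassandra.yaml to use.",
]
for number, line in find_sasi_disabled(sample_log):
    print(f"line {number}: {line}")
```

A non-empty result suggests the Cassandra node needs its configuration changed before zipkin search will work.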

Build updates

While these changes won't impact end users, they do affect forks and are important.

  • We moved from long-form license headers to SPDX IDs
  • @anuraaga fixed our ServerIntegratedBenchmark

Full Changelog: https://github.com/openzipkin/zipkin/compare/3.1.0..3.1.1

3.1.0

2 months ago

Zipkin 3.1 includes our first additional features since the 3.0 platform update. Notably, gRPC span collection is enabled by default, Eureka registration includes more properties, and you can now disable the UI independently of the REST API. Those using kubernetes should have a second look at our helm chart, which was recently renovated as well!

While most won't see this, we'd like to give a special shout out to @SamTV12345 for helping renovate our javascript build. It was Sam's first change in the project and quite a big one. We'd like to thank all the users for your feedback and the continued support from our all volunteer team, notably @reta and @anuraaga who've stuck here with you so long.

Here are the changes end users might notice:

  • COLLECTOR_GRPC_ENABLED is now true by default, accepting spans from the zipkin.proto3.SpanService/Report service hosted on the same HTTP port as the normal API (default 9411)
  • Eureka registration now populates the homePageUrl and statusPageUrl fields, the latter used in the spring-cloud-netflix UI. This was thanks to upstream changes in Armeria driven by @minwoox
  • New UI_ENABLED for users who wish to expose the query API, but not host the javascript UI.
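As a rough illustration of how these settings might be applied, here is a small Python sketch that assembles a `docker run` command line. The helper `docker_run_args` is hypothetical, and the image name `ghcr.io/openzipkin/zipkin` and the value "false" for both variables are assumptions; only the variable names and port 9411 come from the notes above.

```python
import shlex
from typing import Dict, List


def docker_run_args(image: str, env: Dict[str, str], port: int = 9411) -> List[str]:
    """Build an argv list for `docker run`, passing each setting with -e."""
    args = ["docker", "run", "--rm", "-p", f"{port}:{port}"]
    for name, value in sorted(env.items()):  # sorted for a stable command line
        args += ["-e", f"{name}={value}"]
    args.append(image)
    return args


# Assumed use case: serve the query API only, with no UI and no gRPC collector.
query_only = docker_run_args(
    "ghcr.io/openzipkin/zipkin",  # assumed image name
    {"UI_ENABLED": "false", "COLLECTOR_GRPC_ENABLED": "false"},
)
print(shlex.join(query_only))
```

Leaving both variables unset keeps the 3.1 defaults, where gRPC collection shares the normal HTTP port.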

Here are the build related changes:

  • UI build now uses vite. @SamTV12345 was the MVP of this change, which eliminated a build-time CVE. This was quite a lot of work, and we're grateful for Sam's help. We also appreciate others' work on this, notably @anuraaga who advised and pitched in a test migration PR.
  • @reta switched us to SLF4J 2.0, with heaps of thanks to @wilkinsona who helped us get on the same page about what versions do what.
  • Our zipkin-slim image now includes netty tcnative libraries.

Thank folks who helped with changes you want and don't forget to star the project if you're happy with our continued efforts! If you'd like to get in touch, please chat on gitter. See you next release!

Full Changelog: https://github.com/openzipkin/zipkin/compare/3.0.6..3.1.0

3.0.6

3 months ago

Zipkin 3.0.6 updates to Armeria 1.27.1 and fixes both ES_HTTP_LOGGING and a glitch in Eureka registration.

  • Armeria 1.27.1 helped us remove code around Eureka, which is now upstream, as well as bring the server runtime to the latest Netty
  • ES_HTTP_LOGGING broke when we updated to SLF4J 2. @reta resolved this by bringing us back to the more compatible 1.7, plus a config adjustment.
  • Those using spring-cloud-sleuth were unable to discover zipkin even when it set env like EUREKA_HOSTNAME.

Eureka and spring-cloud

Skip this part unless you want to take a walk with us down troubleshooting lane!

Eureka is a service registry originally started at Netflix. Zipkin can register itself in Eureka, allowing traced services to discover its listen address and health state. So, this is an alternative to normal DNS. We added support for this in Zipkin 2.27 and have been polishing that since.

Before, we were testing Eureka integration with Armeria. Armeria doesn't use the Netflix/Eureka codebase at all, as it implements its API directly. This is great for Armeria users, as the Netflix/Eureka codebase uses a lot of antique dependencies, some not updated in 8 years. However, it isn't a good test for zipkin for the same reason.

Most Eureka users use Spring Boot 2, and most of those who use zipkin use spring-cloud-sleuth (which uses brave internally). To gain confidence that registration works in practice, we decided to update our sleuth example to use Eureka. The idea was to set a pseudo hostname in the zipkin endpoint, which would be replaced dynamically by a real endpoint from the "zipkin" application in Eureka. Then, we're all good.

But, we weren't all good. This didn't work at all, as our example used a reactive WebFlux configuration. For some reason, when a sleuth-instrumented application is using reactive, you cannot use Eureka to discover zipkin. So, we backported our sleuth example to a version that can use Eureka. Ironically, we had to go back to WebMvc which was the original canonical zipkin example! However, despite webmvc5-sleuth using the right parts, the pseudo zipkin hostname wasn't replaced.

On close inspection, the first thing we noticed was something documented, but not entirely intuitive. Documentation says to use the "service ID" as the pseudo-hostname in the zipkin URL, which would be replaced with the real hostname and port. In the case of Eureka, it seems intuitive to use the service to find instances of it, specifically the Eureka application (the EUREKA_APP_NAME of all zipkin instances). However, the "service ID" is not that, and it isn't even the instanceId in Eureka. Oddly, the "service ID" maps to the vipAddress field in Eureka, which is actually an instance's hostname! So, the strange thing is that the pseudo-hostname is actually the real hostname!

Fine, so we used the vipAddress that zipkin registered in Eureka as the quasi-hostname in the zipkin URL, but it still didn't work. Stepping through a debugger, we found that if there is a port in the hostname (e.g. zipkin-server:9411), the configuration code assumes it is not something to look up, but rather something already resolved. This led to the realization that the vipAddress having a port encoded was actually a config default bug, but a simple one to work around. In 3.0.6, when someone sets EUREKA_HOSTNAME, we also set vipAddress explicitly to avoid the default that accidentally adds a port.
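The port heuristic can be illustrated with a small Python sketch. This is not spring-cloud-sleuth's actual code: the function `resolve_zipkin_url` and the dict-based `registry` are hypothetical stand-ins for the real lookup, chosen only to show why an address that already includes a port never gets looked up.

```python
from typing import Dict, Tuple
from urllib.parse import urlsplit

# Hypothetical registry: vipAddress -> (host, port) of a registered instance.
Registry = Dict[str, Tuple[str, int]]


def resolve_zipkin_url(base_url: str, registry: Registry) -> str:
    """Replace a pseudo-hostname with a registered host and port.

    Mirrors the behavior described in the text: a URL whose authority already
    has a port is assumed to be resolved and is returned untouched.
    """
    parts = urlsplit(base_url)
    if parts.port is not None:
        return base_url  # looks resolved already, so no lookup happens
    instance = registry.get(parts.hostname or "")
    if instance is None:
        return base_url  # nothing registered under this name
    host, port = instance
    return parts._replace(netloc=f"{host}:{port}").geturl()


registry = {"zipkin-server": ("zipkin-server", 9411)}
# Without a port, the name is looked up and the registered port is added.
print(resolve_zipkin_url("http://zipkin-server/", registry))
# With a port, the URL is taken literally and the registry is never consulted.
print(resolve_zipkin_url("http://zipkin-server:9411/", registry))
```

Under this model, a vipAddress registered without a port is the only form that triggers a lookup, which is why setting it explicitly works around the problem.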

Voilà! Finally, we're all good: sleuth replaces vipAddress with that same address and also a port, and it could have only gotten that from eureka info. While it feels like a lot of work to accomplish little, people will still get the other benefits of Eureka (specifically spring-cloud-netflix use of it) including health checking and discovery of other endpoints besides the one you knew about and stuffed into the zipkin URL. While not as ideal as specifying the app name, this approach isn't completely unique to spring. Other technologies sometimes ask for "well known addresses" in order to find the rest of a cluster.

Through comments and issue links in the webmvc5-sleuth example, we containerized this hard earned experience, to save future maintainers work trying to figure it all out again. In other words, they don't have to read these release notes and can just use the working binary.

The moral of the story is: integration test things twice or three times if you can, as some behaviors are not necessarily intuitive. If you have more integrations, all the strange things will present themselves. While painful to get through all of the troubleshooting, it is definitely better to have the project bear this weight than to rely on end users to figure it out!

Follow-up

Immediately after this release, spring-cloud-sleuth released 3.1.11, which fixed WebFlux discovery with Eureka. Hence, we restored the webflux5-sleuth example, while still keeping the webmvc one. All our Eureka-compatible examples are now integration tested against a real eureka server on change, to prevent unnoticed regressions in the future.

Full Changelog: https://github.com/openzipkin/zipkin/compare/3.0.5..3.0.6

3.0.5

3 months ago

Zipkin 3.0.5 cleans up CVEs and supports Eureka authentication. We also allow those testing with Cassandra to disable SSL hostname verification. While this is a point version, quite a lot of work went into this. Please thank volunteers involved on gitter or otherwise!

Dependency updates

Most notably, this updates our docker image to use JRE 21.0.2_p13, and all recent java libraries. We audited the UI and were able to fix all CVEs identified by Trivy and used at runtime, with special thanks to @anuraaga on this. We also now test with the latest Elasticsearch, 8.12.0. This was trickier than usual due to a JRE compatibility issue, for which @reta discovered a workaround; it will be resolved when ES 8.12.1 is out. Rag and Andriy made themselves available and are the reason this release is all polished.

Eureka authentication

Zipkin 2.27 added Eureka discovery support, but we missed a spot. Eureka supports BASIC authentication via user info embedded in the service URL, e.g. http://user:password@localhost:8761/eureka/v2. This is also handled the same way in spring-cloud-netflix. By also allowing URL-embedded credentials, folks can use the same properties with zipkin as they do elsewhere.
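To make the idea of URL-embedded credentials concrete, here is a minimal Python sketch that turns such a URL into a credential-free URL plus a BASIC Authorization header value. It is an illustration only, not the code zipkin or Armeria uses, and the function name `split_basic_auth` is hypothetical.

```python
import base64
from typing import Optional, Tuple
from urllib.parse import unquote, urlsplit


def split_basic_auth(service_url: str) -> Tuple[str, Optional[str]]:
    """Return (URL without user info, Authorization header value or None)."""
    parts = urlsplit(service_url)
    if parts.username is None:
        return service_url, None  # no embedded credentials
    # User info may be percent-encoded inside a URL, so decode it first.
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # Keep only what follows the last "@" in the authority (host and port).
    host_and_port = parts.netloc.rpartition("@")[2]
    clean_url = parts._replace(netloc=host_and_port).geturl()
    return clean_url, f"Basic {token}"


clean_url, header = split_basic_auth("http://user:password@localhost:8761/eureka/v2")
print(clean_url)
print(header)
```

Stripping the user info before making the request keeps the password out of the request line, while the header carries the same credentials.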

To achieve this, and test it fully, we updated the following:

  • Our test eureka server image, ghcr.io/openzipkin/zipkin-eureka, to require authentication via EUREKA_USERNAME and EUREKA_PASSWORD
  • Our test armeria client image, ghcr.io/openzipkin/brave-example:armeria, to pass embedded credentials when looking up zipkin via EUREKA_SERVICE_URL
  • Our main code (applicable to all zipkin packaging) to use embedded credentials when registering via EUREKA_SERVICE_URL
  • Our docker-compose example to suggest how you can try the whole thing integrated.

Thanks for your patience while we added support for this option. We hope you can tell that doing it right was a lot of work, and why we didn't just "wing it" earlier!

Disabling Cassandra hostname verification

Cassandra includes a setting for disabling hostname validation when using SSL, which is helpful for self-signed certificates. You can now disable this validation by setting the env variable CASSANDRA_SSL_HOSTNAME_VALIDATION=false. Thanks to @priyavivek2307 and @ankit-gautam23 for review.

Full Changelog: https://github.com/openzipkin/zipkin/compare/3.0.4..3.0.5

3.0.4

4 months ago

Zipkin 3.0.4 fixes a packaging bug which caused the UI to not load. Thanks @jinyulei0710 for reporting!

3.0.3

4 months ago

Zipkin 3.0.3 updates its self-tracing to use the latest zipkin-reporter 3.2.1. It also enhances the Eureka example to include client tracing with Armeria services support.

3.0.2

4 months ago

Zipkin 3.0.2 removes a log warning from console output.

You may also be interested in the new homebrew formula. On macOS or Linux, you can now try zipkin via: brew install zipkin