A command-line tool that helps you ship changes to a Kubernetes namespace and understand the result
This project used to be called `kubernetes-deploy`. Check out our migration guide for more information, including details about breaking changes.
`krane` is a command line tool that helps you ship changes to a Kubernetes namespace and understand the result. At Shopify, we use it within our much-beloved, open-source Shipit deployment app.
Why not just use the standard `kubectl apply` mechanism to deploy? It is indeed a fantastic tool; `krane` uses it under the hood! However, it leaves its users with some burning questions: What just happened? Did it work?
Especially in a CI/CD environment, we need a clear, actionable pass/fail result for each deploy. Providing this was the foundational goal of `krane`, which has grown to support the following core features:
:eyes: Watches the changes you requested to make sure they roll out successfully.
:interrobang: Provides debug information for changes that failed.
:1234: Predeploys certain types of resources (e.g. ConfigMap, PersistentVolumeClaim) to make sure the latest version will be available when resources that might consume them (e.g. Deployment) are deployed.
:closed_lock_with_key: Creates Kubernetes secrets from encrypted EJSON, which you can safely commit to your repository.
:running: Runs tasks at the beginning of a deploy using bare pods (example use case: Rails migrations).
If you need the ability to render dynamic values in templates before deploying, you can use krane render. Alongside that, this repo also includes tools for running tasks and restarting deployments.
KRANE DEPLOY
KRANE GLOBAL DEPLOY
KRANE RESTART
KRANE RUN
KRANE RENDER
CONTRIBUTING
We run integration tests against these Kubernetes versions. You can find our official compatibility chart below.
Krane supports the official upstream-supported versions of Kubernetes and Ruby that are part of the compatibility matrix. Nevertheless, older releases are still likely to work.
Kubernetes version | Currently Tested? | Last officially supported in gem version |
---|---|---|
1.18 | No | 2.3.7 |
1.19 | No | 2.4.9 |
1.20 | No | 2.4.9 |
1.21 | No | 2.4.9 |
1.22 | No | 3.0.1 |
1.23 | No | 3.4.2 |
1.24 | Yes | -- |
1.25 | No | -- |
1.26 | Yes | -- |
1.27 | Yes | -- |
1.28 | Yes | -- |
gem install krane
krane deploy <app's namespace> <kube context>
Environment variables:
- `$KUBECONFIG`: points to one or multiple valid kubeconfig files that include the context you want to deploy to. File names are separated by colon for Linux and Mac, and semi-colon for Windows. If omitted, Krane will use the Kubernetes default of `~/.kube/config`.
- `$GOOGLE_APPLICATION_CREDENTIALS`: points to the credentials for an authenticated service account (required if your kubeconfig `user`'s auth provider is GCP)
Options:
Refer to `krane help` for the authoritative set of options.
- `--filenames / -f [PATHS]`: Accepts a list of directories and/or filenames to specify the set of directories/files that will be deployed. Use `-` to specify reading from STDIN.
- `--no-prune`: Skips pruning of resources that are no longer in your Kubernetes template set. Not recommended, as it allows your namespace to accumulate cruft that is not reflected in your deploy directory.
- `--global-timeout=duration`: Raise a timeout error if it takes longer than duration for any resource to deploy.
- `--selector`: Instructs krane to only prune resources which match the specified label selector, such as `environment=staging`. If you use this option, all resource templates must specify matching labels. See Sharing a namespace below.
- `--selector-as-filter`: Instructs krane to only deploy resources that are filtered by the specified labels in `--selector`. The deploy will not fail if not all resources match the labels. This is useful if you only want to deploy a subset of resources within a given YAML file. See Sharing a namespace below.
- `--no-verify-result`: Skip verification that workloads correctly deployed.
- `--protected-namespaces=default kube-system kube-public`: Fail validation if a deploy is targeted at a protected namespace.
- `--verbose-log-prefix`: Add [context][namespace] to the log prefix
NOTICE: Deploy Secret resources at your own risk. Although we will fix any reported leak vectors with urgency, we cannot guarantee that sensitive information will never be logged.
By default, krane will prune any resources in the target namespace which have the `kubectl.kubernetes.io/last-applied-configuration` annotation and are not a result of the current deployment process, on the assumption that there is a one-to-one relationship between application deployment and namespace, and that a deployment provisions all relevant resources in the namespace.
If you need to, you may specify `--no-prune` to disable all pruning behaviour, but this is not recommended.
If you need to share a namespace with resources which are managed by other tools or indeed other krane deployments, you can supply the `--selector` option, such that only resources with labels matching the selector are considered for pruning.
If you need to share a namespace with a different set of resources using the same YAML file, you can supply the `--selector` and `--selector-as-filter` options, such that only the resources that match the labels will be deployed. In each run of deploy, you can use different labels in `--selector` to deploy a different set of resources. Only the deployed resources in each run are considered for pruning.
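
The pruning rule described above can be modelled in a few lines of Ruby. This is only a sketch of the selection logic, not krane's implementation: `prune_candidates`, the `"Kind/name"` identifier format, and the sample resources are hypothetical names chosen for illustration.

```ruby
# Hypothetical model of the pruning rule; not krane's actual code.
LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"

# live:     resources currently in the namespace (parsed manifests)
# deployed: "Kind/name" identifiers produced by the current deploy
# selector: label pairs from --selector (empty when the option is not used)
def prune_candidates(live, deployed, selector = {})
  live.select do |resource|
    metadata = resource.fetch("metadata", {})
    id = "#{resource['kind']}/#{metadata['name']}"
    labels = metadata.fetch("labels", {})

    metadata.fetch("annotations", {}).key?(LAST_APPLIED) && # created via apply
      !deployed.include?(id) &&                             # not part of this deploy
      selector.all? { |key, value| labels[key] == value }   # matches --selector, if any
  end
end

LIVE = [
  { "kind" => "ConfigMap",
    "metadata" => { "name" => "old-config",
                    "annotations" => { LAST_APPLIED => "{}" },
                    "labels" => { "environment" => "staging" } } },
  { "kind" => "ConfigMap",
    "metadata" => { "name" => "other-team",
                    "annotations" => { LAST_APPLIED => "{}" },
                    "labels" => { "environment" => "production" } } },
  { "kind" => "Service",
    "metadata" => { "name" => "web",
                    "annotations" => { LAST_APPLIED => "{}" },
                    "labels" => { "environment" => "staging" } } }
].freeze

pruned = prune_candidates(LIVE, ["Service/web"], { "environment" => "staging" })
puts pruned.map { |r| "#{r['kind']}/#{r.dig('metadata', 'name')}" }
```

With the selector in place, the `other-team` ConfigMap is left alone even though it is not part of the deploy, which is the behaviour that makes sharing a namespace possible.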
All templates must be YAML formatted.
We recommend storing each app's templates in a single directory, `{app root}/config/deploy/{env}`. However, you may use multiple directories.
If you want dynamic templates, you may render ERB with `krane render` and then pipe that result to `krane deploy -f -`.
- `krane.shopify.io/timeout-override`: Override the tool's hard timeout for one specific resource. Both full ISO8601 durations and the time portion of ISO8601 durations are valid. Value must be between 1 second and 24 hours.
- `krane.shopify.io/required-rollout`: Modifies how much of the rollout needs to finish before the deployment is considered successful.
  - `full`: The deployment is successful when all pods in the new `replicaSet` are ready.
  - `none`: The deployment is successful as soon as the new `replicaSet` is created for the deployment.
  - `maxUnavailable`: The deploy is successful when minimum availability is reached in the new `replicaSet`. In other words, the number of new pods that must be ready is equal to `spec.replicas` - `strategy.RollingUpdate.maxUnavailable` (converted from percentages by rounding up, if applicable). This option is only valid for deployments that use the `RollingUpdate` strategy.
  - Percentage: The number of new pods that must be ready is equal to `spec.replicas` * Percent.
  - `full`: The deployment is successful when all pods are ready.
- `krane.shopify.io/predeployed`: Causes a Custom Resource to be deployed in the pre-deploy phase.
  - Default: `true`
  - `true`: The custom resource will be deployed in the pre-deploy phase.
- `krane.shopify.io/deploy-method-override`: Cause a resource to be deployed by the specified `kubectl` command, instead of the default `apply`.
  - Cannot be used for `PodDisruptionBudget`, since it always uses `create/replace-force`.
  - Accepted values: `create`, `replace`, and `replace-force`
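
As a worked example of the `maxUnavailable` arithmetic, the sketch below computes how many new pods must be ready. It follows the rounding stated above (percentages are rounded up); `min_ready_pods` is a hypothetical helper name, not part of krane.

```ruby
# Hypothetical helper illustrating the maxUnavailable rule; not krane's API.
# max_unavailable is either an integer (2) or a percentage string ("25%").
def min_ready_pods(replicas, max_unavailable)
  unavailable =
    if max_unavailable.is_a?(String) && max_unavailable.end_with?("%")
      percent = Integer(max_unavailable.delete_suffix("%"))
      # Integer ceiling division: percentages are rounded up, as described above.
      (replicas * percent + 99) / 100
    else
      Integer(max_unavailable)
    end
  replicas - unavailable
end

puts min_ready_pods(10, 2)     # 10 - 2
puts min_ready_pods(10, "25%") # 25% of 10 is 2.5, rounded up to 3
```

So a deployment with 10 replicas and `maxUnavailable: 25%` would be considered successful once 7 new pods are ready.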
To run a task in your cluster at the beginning of every deploy, simply include a `Pod` template in your deploy directory. `krane` will first deploy any `ConfigMap` and `PersistentVolumeClaim` resources present in the provided templates, followed by any such pods. If the command run by one of these pods fails (i.e. exits with a non-zero status), the overall deploy will fail at this step (no other resources will be deployed).
Requirements:
- The pod's name should include `<%= deployment_id %>` to ensure that a unique name will be used on every deploy (the deploy will fail if a pod with the same name already exists).
- `spec.restartPolicy` must be set to `Never` so that it will be run exactly once. We'll fail the deploy if that run exits with a non-zero status.
- `spec.activeDeadlineSeconds` should be set to a reasonable value for the performed task (not required, but highly recommended)
A simple example can be found in the test fixtures: test/fixtures/hello-cloud/unmanaged-pod-1.yml.erb.
The logs of all pods run in this way will be printed inline. If there is only one pod, the logs will be streamed in real-time. If there are multiple, they will be fetched when the pod terminates.
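
To make the requirements above concrete, here is a small sketch that renders a task pod template with Ruby's standard `erb` library. It is a stand-in for `krane render`, not how krane renders templates internally, and the pod name, image, and command are made-up placeholders.

```ruby
require "erb"
require "securerandom"
require "yaml"

# Illustrative template only; image and command are placeholders.
POD_TEMPLATE = <<~TEMPLATE
  apiVersion: v1
  kind: Pod
  metadata:
    name: db-migrate-<%= deployment_id %>
  spec:
    restartPolicy: Never
    activeDeadlineSeconds: 300
    containers:
    - name: migrate
      image: example/app:latest
      command: ["bundle", "exec", "rake", "db:migrate"]
TEMPLATE

# Renders the template with a given deployment_id and parses the resulting YAML.
def render_pod(deployment_id)
  rendered = ERB.new(POD_TEMPLATE).result_with_hash(deployment_id: deployment_id)
  YAML.safe_load(rendered)
end

# A fresh identifier per deploy yields a fresh pod name.
pod = render_pod(SecureRandom.hex(4))
puts pod.dig("metadata", "name")
puts pod.dig("spec", "restartPolicy")
```

Because the identifier changes on each deploy, the pod name never collides with a pod left over from an earlier run.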
Note: If you're a Shopify employee using our cloud platform, this setup has already been done for you. Please consult the CloudPlatform User Guide for usage instructions.
Since their data is only base64 encoded, Kubernetes secrets should not be committed to your repository. Instead, `krane` supports generating secrets from an encrypted ejson file in your template directory. Here's how to use this feature:
- Install the ejson gem: `gem install ejson`
- Generate a new keypair: `ejson keygen` (prints the keypair to stdout)
- Create the `ejson-keys` secret in your target namespace from the keypair: `kubectl create secret generic ejson-keys --from-literal=YOUR_PUBLIC_KEY=YOUR_PRIVATE_KEY --namespace=TARGET_NAMESPACE`
Warning: Do not use `apply` to create the `ejson-keys` secret. krane will fail if `ejson-keys` is prunable. This safeguard is to protect against the accidental deletion of your private keys.
- In your template directory, create `secrets.ejson` with the format shown below. The `_type` key should have the value `kubernetes.io/tls` for TLS secrets and `Opaque` for all others. The `data` key must be a JSON object, but its keys and values can be whatever you need.
{
  "_public_key": "YOUR_PUBLIC_KEY",
  "kubernetes_secrets": {
    "catphotoscom": {
      "_type": "kubernetes.io/tls",
      "data": {
        "tls.crt": "cert-data-here",
        "tls.key": "key-data-here"
      }
    },
    "monitoring-token": {
      "_type": "Opaque",
      "data": {
        "api-token": "token-value-here"
      }
    }
  }
}
- Encrypt the file: `ejson encrypt /PATH/TO/secrets.ejson`
- Deploy. The secrets are created from the data in the `kubernetes_secrets` key. The ejson file must be included in the resources passed to `--filenames`; it cannot be read through stdin.
Note: Since leading underscores in ejson keys are used to skip encryption of the associated value, `krane` will strip these leading underscores when it creates the keys for the Kubernetes secret data. For example, given the ejson data below, the `monitoring-token` secret will have keys `api-token` and `property` (not `_property`):
{
  "_public_key": "YOUR_PUBLIC_KEY",
  "kubernetes_secrets": {
    "monitoring-token": {
      "_type": "kubernetes.io/tls",
      "data": {
        "api-token": "EJ[ENCRYPTED]",
        "_property": "some unencrypted value"
      }
    }
  }
}
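
The key renaming described in the note can be sketched as follows. This is an illustration of the rule only: `secret_data_keys` is a hypothetical helper, and it assumes that exactly one leading underscore is removed per key.

```ruby
require "json"

# Hypothetical helper; assumes a single leading underscore is stripped per key.
def secret_data_keys(ejson_data)
  ejson_data.keys.map { |key| key.sub(/\A_/, "") }
end

EJSON = JSON.parse(<<~DOC)
  {
    "kubernetes_secrets": {
      "monitoring-token": {
        "_type": "kubernetes.io/tls",
        "data": {
          "api-token": "EJ[ENCRYPTED]",
          "_property": "some unencrypted value"
        }
      }
    }
  }
DOC

data = EJSON.dig("kubernetes_secrets", "monitoring-token", "data")
puts secret_data_keys(data)
```

Underscores elsewhere in a key are left untouched; only the marker at the start of the key is dropped.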
A warning about using EJSON secrets with `--selector`: when using EJSON to generate `Secret` resources and specifying a `--selector` for deployment, the labels from the selector are automatically added to the `Secret`. If the same EJSON file is deployed to the same namespace using different selectors, this will cause the resource to thrash: even if the contents of the secret were the same, the resource has different labels on each deploy.
By default, krane does not check the status of custom resources; it simply assumes that they deployed successfully. In order to meaningfully monitor the rollout of custom resources, krane supports configuring pass/fail conditions using annotations on CustomResourceDefinitions (CRDs).
Requirements:
- The custom resource must have a `status` subresource with an `observedGeneration` field.
- The `krane.shopify.io/instance-rollout-conditions` annotation must be present on the CRD that defines the custom resource.
- The `krane.shopify.io/instance-timeout` annotation can be added to the CRD that defines the custom resource to override the global default timeout for all instances of that resource. This annotation can use ISO8601 format or unprefixed ISO8601 time components (e.g. '1H', '60S').
The presence of a valid `krane.shopify.io/instance-rollout-conditions` annotation on a CRD will cause krane to monitor the rollout of all instances of that custom resource. Its value can either be `"true"` (giving you the defaults described in the next section) or a valid JSON string with the following format:
'{
  "success_conditions": [
    { "path": <JsonPath expression>, "value": <target value> }
    ... more success conditions
  ],
  "failure_conditions": [
    { "path": <JsonPath expression>, "value": <target value> }
    ... more failure conditions
  ]
}'
For all conditions, `path` must be a valid JsonPath expression that points to a field in the custom resource's status. `value` is the value that must be present at `path` in order to fulfill a condition. For a deployment to be successful, all `success_conditions` must be fulfilled. Conversely, the deploy will be marked as failed if any one of `failure_conditions` is fulfilled. `success_conditions` are mandatory, but `failure_conditions` can be omitted (the resource will simply time out if it never reaches a successful state).
In addition to `path` and `value`, a failure condition can also contain `error_msg_path` or `custom_error_msg`. `error_msg_path` is a JsonPath expression that points to a field you want to surface when a failure condition is fulfilled. For example, a status condition may expose a `message` field that contains a description of the problem it encountered. `custom_error_msg` is a string that can be used if your custom resource doesn't contain sufficient information to warrant using `error_msg_path`. Note that `custom_error_msg` has higher precedence than `error_msg_path`, so it will be used in favor of `error_msg_path` when both fields are present.
Warning: You must ensure that your custom resource controller sets `.status.observedGeneration` to match the observed `.metadata.generation` of the monitored resource once its sync is complete. If this does not happen, krane will not check success or failure conditions and the deploy will time out.
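
The decision rules above can be summarised in a short sketch. It is a simplified model rather than krane's code: instead of evaluating JsonPath expressions it looks conditions up by `type` with plain Ruby, and the method names and the choice to check failure conditions before success conditions are assumptions made for illustration.

```ruby
# Simplified stand-in for JsonPath evaluation: find a status condition by type.
def condition_status(resource, type)
  conditions = resource.dig("status", "conditions") || []
  match = conditions.find { |condition| condition["type"] == type }
  match && match["status"]
end

# Returns :in_progress, :failed or :succeeded for a custom resource.
def rollout_state(resource, success_types: ["Ready"], failure_types: ["Failed"])
  observed = resource.dig("status", "observedGeneration")
  generation = resource.dig("metadata", "generation")
  # Conditions are ignored until the controller has observed this generation.
  return :in_progress if observed.nil? || observed != generation

  # Any fulfilled failure condition fails the rollout (checked first by assumption).
  return :failed if failure_types.any? { |type| condition_status(resource, type) == "True" }
  # Every success condition must be fulfilled.
  return :succeeded if success_types.all? { |type| condition_status(resource, type) == "True" }

  :in_progress
end

RESOURCE = {
  "metadata" => { "generation" => 2 },
  "status" => {
    "observedGeneration" => 2,
    "conditions" => [
      { "type" => "Ready", "status" => "False", "message" => "resource is not ready" },
      { "type" => "Failed", "status" => "True", "message" => "resource is failed" }
    ]
  }
}.freeze

puts rollout_state(RESOURCE)
```

A resource whose `observedGeneration` lags behind `metadata.generation` stays `:in_progress` no matter what its conditions say, which is why a controller that never updates that field leads to a timeout.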
As an example, the following is the default configuration that will be used if you set `krane.shopify.io/instance-rollout-conditions: "true"` on the CRD that defines the custom resources you wish to monitor:
'{
  "success_conditions": [
    {
      "path": "$.status.conditions[?(@.type == \"Ready\")].status",
      "value": "True"
    }
  ],
  "failure_conditions": [
    {
      "path": "$.status.conditions[?(@.type == \"Failed\")].status",
      "value": "True",
      "error_msg_path": "$.status.conditions[?(@.type == \"Failed\")].message"
    }
  ]
}'
The paths defined here are based on the typical status properties as defined by the Kubernetes community. It expects the `status` subresource to contain a `conditions` array whose entries minimally specify `type`, `status`, and `message` fields.
You can see how these conditions relate to the following resource:
apiVersion: stable.shopify.io/v1
kind: Example
metadata:
  generation: 2
  name: example
  namespace: namespace
spec:
  ...
status:
  observedGeneration: 2
  conditions:
  - type: "Ready"
    status: "False"
    reason: "exampleNotReady"
    message: "resource is not ready"
  - type: "Failed"
    status: "True"
    reason: "exampleFailed"
    message: "resource is failed"
- `observedGeneration == metadata.generation`, so krane will check this resource's success and failure conditions.
- Since `$.status.conditions[?(@.type == "Ready")].status == "False"`, the resource is not considered successful yet.
- `$.status.conditions[?(@.type == "Failed")].status == "True"` means that a failure condition has been fulfilled and the resource is considered failed.
- Since `error_msg_path` is specified, krane will log the contents of `$.status.conditions[?(@.type == "Failed")].message`, which in this case is: `resource is failed`.
Let's walk through what happens when you run the `deploy` task with this directory of templates. This particular example uses ERB templates as well, so we'll use the krane render task to achieve that.
You can test this out for yourself by running the following command:
krane render -f test/fixtures/hello-cloud --current-sha 1 | krane deploy my-namespace my-k8s-cluster -f -
As soon as you run this, you'll start seeing some output being streamed to STDERR.
In this phase, we:
In this phase, we check resource statuses. For each resource listed in the previous step, we check Kubernetes for their status; in the first deploy this might show a bunch of items as "Not Found", but for the deploy of a new version, this is an example of what it could look like:
Certificate/services-foo-tls Exists
Cloudsql/foo-production Provisioned
Deployment/jobs 3 replicas, 3 updatedReplicas, 3 availableReplicas
Deployment/web 3 replicas, 3 updatedReplicas, 3 availableReplicas
Ingress/web Created
Memcached/foo-production Healthy
Pod/db-migrate-856359 Unknown
Pod/upload-assets-856359 Unknown
Redis/foo-production Healthy
Service/web Selects at least 1 pod
The next phase might be either "Predeploying priority resources" (if there are any) or "Deploying all resources". In this example we'll go through the former, as we do have predeployable resources.
This is the first phase that could modify the cluster.
In this phase we predeploy certain types of resources (e.g. `ConfigMap`, `PersistentVolumeClaim`, `Secret`, ...) to make sure the latest version will be available when resources that might consume them (e.g. `Deployment`) are deployed. This phase will be skipped if the templates don't include any resources that would need to be predeployed.
When this runs, we essentially run `kubectl apply` on those templates and periodically check the cluster for the current status of each resource so we can display error or success information. This will look different depending on the type of resource. If you're running the command described above, you should see something like this in the output:
Deploying ConfigMap/hello-cloud-configmap-data (timeout: 30s)
Successfully deployed in 0.2s: ConfigMap/hello-cloud-configmap-data
Deploying PersistentVolumeClaim/hello-cloud-redis (timeout: 300s)
Successfully deployed in 3.3s: PersistentVolumeClaim/hello-cloud-redis
Deploying Role/role (timeout: 300s)
Don't know how to monitor resources of type Role. Assuming Role/role deployed successfully.
Successfully deployed in 0.2s: Role/role
As you can see, different types of resources might have different timeout values and different success criteria; in some specific cases (such as with Role) we might not know how to confirm success or failure, so we use a higher timeout value and assume it did work.
In this phase, we deploy all resources found in the templates and prune resources that are no longer in them (you can disable pruning with `--no-prune`).
Just like in the previous phase, we essentially run `kubectl apply` on those templates and periodically check the cluster for the current status of each resource so we can display error or success information.
If pruning is enabled (which, again, is the default), any kind not listed in the blacklist that we can find in the namespace but not in the templates will be removed. A particular message about pruning will be printed in the next phase if any resource matches these criteria.
The result section will show:
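At this point the command also returns a status code:
- `0`: the deploy was successful
- `70`: one or more resources failed to deploy due to a timeout
- `1`: a failure occurred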
On timeouts: It's important to notice that a single resource timeout or a global deploy timeout doesn't necessarily mean that the operation failed. Since Kubernetes updates are asynchronous, maybe something was just too slow to return in the configured time; in those cases, usually running the deploy again might work (that should be a no-op for most - if not all - resources).
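
A CI wrapper can use the status code to tell a timeout apart from a hard failure and rerun the deploy once. The sketch below assumes that `70` is the status returned on timeout; `deploy_outcome` and `deploy_with_retry` are hypothetical helper names, and the block is expected to run one deploy and return its exit status.

```ruby
TIMEOUT_STATUS = 70 # assumption: status returned when a deploy times out

# Maps a process exit status to a deploy outcome.
def deploy_outcome(status)
  case status
  when 0 then :success
  when TIMEOUT_STATUS then :timed_out
  else :failed
  end
end

# Calls the block up to max_attempts times, retrying only after a timeout.
def deploy_with_retry(max_attempts: 2)
  outcome = :failed
  max_attempts.times do
    outcome = deploy_outcome(yield)
    break unless outcome == :timed_out
  end
  outcome
end

# Example wiring (not executed here, since it needs krane and a cluster):
#
#   outcome = deploy_with_retry do
#     system("krane", "deploy", "my-namespace", "my-k8s-cluster", "-f", "config/deploy/production")
#     $?.exitstatus
#   end
#   exit(outcome == :success ? 0 : 1)
```

Retrying only on timeouts keeps genuine failures visible while giving slow rollouts a second chance to converge.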
Ship non-namespaced resources to a cluster
krane global-deploy (accessible through the Ruby API as Krane::GlobalDeployTask) can deploy global (non-namespaced) resources such as PersistentVolume, Namespace, and CustomResourceDefinition. Its interface is very similar to krane deploy.
krane global-deploy <kube context>
$ cat my-template.yml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: testing-storage-class
  labels:
    app: krane
provisioner: kubernetes.io/no-provisioner
$ krane global-deploy my-k8s-context -f my-template.yml --selector app=krane
Options:
Refer to `krane global-deploy help` for the authoritative set of options.
- `--filenames / -f [PATHS]`: Accepts a list of directories and/or filenames to specify the set of directories/files that will be deployed. Use `-` to specify STDIN.
- `--no-prune`: Skips pruning of resources that are no longer in your Kubernetes template set. Not recommended, as it allows your namespace to accumulate cruft that is not reflected in your deploy directory.
- `--selector`: Instructs krane to only prune resources which match the specified label selector, such as `environment=staging`. If you use this option, all resource templates must specify matching labels. See Sharing a namespace below.
- `--selector-as-filter`: Instructs krane to only deploy resources that are filtered by the specified labels in `--selector`. The deploy will not fail if not all resources match the labels. This is useful if you only want to deploy a subset of resources within a given YAML file. See Sharing a namespace below.
- `--global-timeout=duration`: Raise a timeout error if it takes longer than duration for any resource to deploy.
- `--no-verify-result`: Skip verification that resources correctly deployed.
`krane restart` is a tool for restarting all of the pods in one or more deployments, stateful sets, and/or daemon sets. It triggers the restart by patching template metadata with the `kubectl.kubernetes.io/restartedAt` annotation (with the value being an RFC 3339 representation of the current time). Note this is the manner in which `kubectl rollout restart` itself triggers restarts.
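
For illustration, the patch body that such a restart applies could be built like this. It is a sketch of the payload only: `restart_patch` is a hypothetical helper, and it assumes the pod template lives at `spec.template`, as it does for a Deployment.

```ruby
require "json"
require "time"

RESTART_ANNOTATION = "kubectl.kubernetes.io/restartedAt"

# Builds a patch that stamps the pod template with the current time.
# Changing the template is what makes the controller roll out new pods.
def restart_patch(now = Time.now)
  {
    "spec" => {
      "template" => {
        "metadata" => {
          "annotations" => { RESTART_ANNOTATION => now.getutc.iso8601 }
        }
      }
    }
  }
end

puts JSON.pretty_generate(restart_patch)
```

Since the timestamp differs on every call, each restart produces a new pod template revision even when nothing else in the spec has changed.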
Option 1: Specify the deployments you want to restart
The following command will restart all pods in the `web` and `jobs` deployments:
krane restart <kube namespace> <kube context> --deployments=web jobs
Option 2: Annotate the deployments you want to restart
Add the annotation `shipit.shopify.io/restart` to all the deployments you want to target, like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
  annotations:
    shipit.shopify.io/restart: "true"
With this done, you can use the following command to restart all of them:
krane restart <kube namespace> <kube context>
Options:
Refer to `krane help restart` for the authoritative set of options.
- `--selector`: Only restarts Deployments which match the specified Kubernetes resource selector.
- `--deployments`: Restart specific Deployment resources by name.
- `--global-timeout=duration`: Raise a timeout error if it takes longer than duration for any resource to restart.
- `--no-verify-result`: Skip verification that workloads correctly restarted.
`krane run` is a tool for triggering a one-off job, such as a rake task, outside of a deploy.
- A `PodTemplate` object with field `template` containing a `Pod` specification that does not include the `apiVersion` or `kind` parameters. An example is provided in this repo in `test/fixtures/hello-cloud/template-runner.yml`.
- The `Pod` specification in that template has a container named `task-runner`.
Based on this specification, `krane run` will create a new pod with the entrypoint of the `task-runner` container overridden with the supplied arguments.
krane run <kube namespace> <kube context> --arguments=<arguments> --command=<command> --template=<template name>
Options:
- `--template=TEMPLATE`: Specifies the name of the PodTemplate to use.
- `--env-vars=ENV_VARS`: Accepts a list of environment variables to be added to the pod template. For example, `--env-vars="ENV=VAL ENV2=VAL2"` will make `ENV` and `ENV2` available to the container.
- `--command=`: Override the default command in the container image.
- `--no-verify-result`: Skip verification of pod success
- `--global-timeout=duration`: Raise a timeout error if the pod runs for longer than the specified duration
- `--arguments`: Override the default arguments for the command with a space-separated list of arguments
`krane render` is a tool for rendering ERB templates to raw Kubernetes YAML. It's useful for outputting YAML that can be passed to other tools, for validation or introspection purposes.
`krane render` does not require a running cluster or an active kubernetes context, which is nice if you want to run it in a CI environment, potentially alongside something like https://github.com/garethr/kubeval to make sure your configuration is sound.
To render all templates in your template dir, run:
krane render -f ./path/to/template/dir
To render some templates in a template dir, run krane render with the names of the templates to render:
krane render -f ./path/to/template/dir/this-template.yaml.erb
To render a template in a template dir and output it to a file, run krane render with the name of the template and redirect the output to a file:
krane render -f ./path/to/template/dir/template.yaml.erb > template.yaml
Options:
- `--filenames / -f [PATHS]`: Accepts a list of directories and/or filenames to specify the set of directories/files that will be deployed. Use `-` to specify STDIN.
- `--bindings=BINDINGS`: Makes additional variables available to your ERB templates. For example, `krane render --bindings=color=blue size=large -f some-template.yaml.erb` will expose `color` and `size` to `some-template.yaml.erb`.
- `--current-sha`: Expose the SHA as `current_sha` in ERB bindings
You can add additional variables using the `--bindings=BINDINGS` option, which can be formatted as a string, JSON string or path to a JSON or YAML file. Complex JSON or YAML data will be converted to a Hash for use in templates. To load a file, the argument should include the relative file path prefixed with an `@` sign. An argument error will be raised if the string argument cannot be parsed, the referenced file does not include a valid extension (`.json`, `.yaml` or `.yml`) or the referenced file does not exist.
# Comma separated string. Exposes 'color' and 'size'
$ krane render --bindings=color=blue,size=large
# JSON string. Exposes 'color' and 'size'
$ krane render --bindings='{"color":"blue","size":"large"}'
# Load JSON file from ./config
$ krane render --bindings='@config/production.json'
# Load YAML file from ./config (.yaml or .yml supported)
$ krane render --bindings='@config/production.yaml'
# Load multiple files via a space separated string
$ krane render --bindings='@config/production.yaml' '@config/common.yaml'
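
The three accepted formats can be told apart with a small dispatcher. The following is a rough sketch of that idea, not krane's parser: `parse_bindings` is a hypothetical name, and detecting JSON by a leading `{` is an assumption made to keep the example short.

```ruby
require "json"
require "yaml"

# Hypothetical parser for a single --bindings argument.
def parse_bindings(argument)
  if argument.start_with?("@")
    # "@path" loads bindings from a JSON or YAML file.
    path = argument[1..-1]
    raise ArgumentError, "file not found: #{path}" unless File.file?(path)

    case File.extname(path)
    when ".json" then JSON.parse(File.read(path))
    when ".yaml", ".yml" then YAML.safe_load(File.read(path))
    else raise ArgumentError, "unsupported file extension: #{path}"
    end
  elsif argument.lstrip.start_with?("{")
    # Assumption: anything starting with "{" is treated as a JSON string.
    begin
      JSON.parse(argument)
    rescue JSON::ParserError => e
      raise ArgumentError, "invalid JSON bindings: #{e.message}"
    end
  else
    # Comma separated key=value pairs.
    argument.split(",").map do |pair|
      key, value = pair.split("=", 2)
      raise ArgumentError, "cannot parse binding: #{pair}" if value.nil?
      [key, value]
    end.to_h
  end
end

p parse_bindings("color=blue,size=large")
p parse_bindings('{"color":"blue","size":"large"}')
```

Both calls produce the same Hash, so a template sees `color` and `size` regardless of which format was used on the command line.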
`krane` supports composing templates from so-called partials in order to reduce duplication in Kubernetes YAML files. Given a directory `DIR`, partials are searched for in `DIR/partials` and in `DIR/../partials`, in that order. They can be embedded in other ERB templates using the helper method `partial`. For example, if an application needs a number of different CronJob resources, one could place a template called `cron` in one of those directories and then use it in the main deployment.yaml.erb like so:
<%= partial "cron", name: "cleanup", schedule: "0 0 * * *", args: %w(cleanup), cpu: "100m", memory: "100Mi" %>
<%= partial "cron", name: "send-mail", schedule: "0 0 * * *", args: %w(send-mails), cpu: "200m", memory: "256Mi" %>
Inside a partial, parameters can be accessed as normal variables, or via a hash called `locals`. Thus, the `cron` template could look like this:
---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: cron-<%= name %>
spec:
  schedule: <%= schedule %>
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 3
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: cron-<%= name %>
            image: ...
            args: <%= args %>
            resources:
              requests:
                cpu: "<%= cpu %>"
                memory: <%= memory %>
          restartPolicy: OnFailure
Both `.yaml.erb` and `.yml.erb` file extensions are supported. Templates must refer to the bare filename (e.g. use `partial: 'cron'` to reference `cron.yaml.erb`).
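
The search order for partials can be pictured with the sketch below. It is not krane's lookup code: `find_partial` is a hypothetical helper, and trying `.yaml.erb` before `.yml.erb` within a directory is an assumption, since the text above only fixes the order of the directories.

```ruby
require "fileutils"
require "tmpdir"

PARTIAL_EXTENSIONS = [".yaml.erb", ".yml.erb"].freeze

# Looks for a partial in DIR/partials first, then in DIR/../partials.
def find_partial(template_dir, name)
  search_dirs = [
    File.join(template_dir, "partials"),
    File.join(template_dir, "..", "partials")
  ]
  search_dirs.each do |dir|
    PARTIAL_EXTENSIONS.each do |extension|
      candidate = File.expand_path(File.join(dir, name + extension))
      return candidate if File.file?(candidate)
    end
  end
  nil
end

# Demo: the same partial exists in both locations; the nearer one is found.
FOUND = Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "app", "partials"))
  FileUtils.mkdir_p(File.join(root, "partials"))
  File.write(File.join(root, "app", "partials", "cron.yml.erb"), "# app-level partial")
  File.write(File.join(root, "partials", "cron.yaml.erb"), "# shared partial")

  path = find_partial(File.join(root, "app"), "cron")
  path.sub(File.expand_path(root) + "/", "")
end

puts FOUND
```

Placing shared partials one level up lets several template directories reuse them, while a directory-local copy still takes precedence.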
Partials can be included almost everywhere in ERB templates. Note: when using a partial to insert additional key-value pairs to a map you must use YAML merge keys. For example, given a partial `p` defining two fields 'a' and 'b',
a: 1
b: 2
you cannot do this:
x: yz
<%= partial 'p' %>
hoping to get
x: yz
a: 1
b: 2
but you can do:
```yaml
<<: <%= partial 'p' %>
x: yz
```
This is a limitation of the current implementation.
We :heart: contributors! To make it easier for you and us we've written a Contributing Guide.
You can also reach out to us on our Slack channel, #krane, at https://kubernetes.slack.com. All are welcome!
Everyone is expected to follow our Code of Conduct.
The gem is available as open source under the terms of the MIT License.