dec: Deep Environmental Config

There are many too-strong opinions on the "right" way to do configuration but they primarily come down to these options:

flat files in some known format like JSON, YAML, properties files, EDN, etc.
environment variables
CLI arguments at start up
config from a data store (which ironically requires separate initial config in one of the above methods in order to access the data store)

While #4 is an interesting option in light of modern cluster-based infrastructure and tools like consul and etcd, we're going to focus on the first three mechanisms for this post: flat files, env vars and CLI arguments.

Systems should not require configuration via only one mechanism, but instead allow any or all available mechanisms in combination. This affords the operator of the system to configure it in the way that's most suitable according to their infrastructure, preferences, policies, and politics.

When using multiple mechanisms there needs to be a very clear priority in which one mechanism overrides the other. Typically env vars override flat file config, as they are more ephemeral and easy to set.

A few modern examples of systems that work this way today include:

ElasticSearch
HashiCorp tools like Consul
TICK stack e.g. Chronograf

Symmetry

A problem that immediately comes to light once we say that any individual value can be configured by any available mechanism is the representational difference between mechanisms:

Env vars and CLI args are stringly typed key value pairs
Config files support arbitrarily-deep tree structures with varying support for primitive and collection types like strings, numbers, booleans, arrays and maps

Because of these intrinsic significant structural differences we need a translation layer between KV pair and tree. If we adopt a single constraint the translation layer becomes simpler to express: All leafs in the tree are strings.

The implication is that our system which consumes the config must parse the string to obtain its expected format.

Introducing dec: Deep Environmental Config

dec is a tiny library that embraces this constraint and provides an explode function to transform KV pairs into an expected shape equivalent to a potentially deep EDN structure.

The following configurations are equivalent from the perspective of dec:


_10MY_DB_PORT=4567
_10MY_DB_HOST=database
_10MY_URL=https://my-system/


_10{:my
_10 {:db {:port "4567" :host "database}
_10  :url "https://my-system"}}

Notice how trees are serialized into KV pairs with a simple _ delimiter. (The delimiter is customizable in dec but _ is the default.)

Similarly, arrays can be represented:


_10MY_SERVER_0_HOST=serverA
_10MY_SERVER_1_HOST=serverB


_10{:my
_10 {:server [{:host "serverA"} {:host "serverB"}]}}

Schema

Due to our constraint that all values be strings, it becomes important that the system consuming the config strings have a valid way to parse and validate the strings into actual expected type.

Building a configuration schema into config consumers has some other nice properties, particularly when using clojure.spec:

Validate expected shape of config at runtime with precise error messaging describing the exact location of invalid config and expected values
Generate config example structures from schema in any format (e.g. the full list of supported env vars or EDN tree)

In the wild

To demonstrate we can look at Yetibot, an open source chat bot written in Clojure that embraces the above concepts.

Yetibot can be configured via env, edn, or a combination of both. For example, a minimal Yetibot config might be:


_10YB_ADAPTERS_MYSLACK_TYPE=Slack
_10YB_ADAPTERS_MYSLACK_TOKEN=xoxb-my-token
_10YB_DB_URL=postgresql://yetibot:yetibot@postgres:5432/yetibot

Yetibot expects all env vars to be either prefixed by YB or YETIBOT, thus the following is equivalent:


_10YETIBOT_ADAPTERS_MYSLACK_TYPE=Slack
_10YETIBOT_ADAPTERS_MYSLACK_TOKEN=xoxb-my-token
_10YETIBOT_DB_URL=postgresql://yetibot:yetibot@postgres:5432/yetibot

And since it uses dec, an edn file is also equivalent:


_10{:yetibot
_10 {:adapters {:myslack {:type "Slack" :token "xoxb-my-token"}}
_10  :db {:url "postgresql://yetibot:yetibot@postgres:5432/yetibot"}}}

Note that edn isn't a requirement; anything that can be parsed into Clojure collections would be equivalent.

Future

It might be useful to create a library that actually slurps the config and munge multiple mechanisms, using dec. Prefix handling and merging is slightly tricky. This is all currently part of yetibot.core. This library should also utilize clojure.spec to validate the expected shape of a config, provide validation, friendly error messages, and config generation.

Conclusion

There are known pros and cons to both approaches. A system should not perscribe which sets of constraints to adopt, but instead allow consumers to weigh their own tradeoffs and allow them to run things however they want.

By adopting a simple constraint and providing a very small library we can support this.

⇽