this repo has no description
1+++ 2title = "Log all the things" 3date = 2021-10-13 4draft = true 5 6[taxonomies] 7tags = [ 8 "elixir", 9 "programming", 10 "observability" 11] 12 13[[extra.thanks]] 14name = "Kai Wern Choong" 15+++ 16 17In Elixir 1.11 landed set of new features that allows for more powerful logging 18by utilising Erlang's [`logger`][erl-log] features. Here I will try to describe 19new possibilities and how You can use them to improve your logs. 20 21<!-- more --> 22 23## New log levels {#levels} 24 25Elixir gained 4 new log levels to total 8 (from most verbose to least verbose): 26 27- debug 28- info 29- **notice** * 30- warning (renamed from warn) 31- error 32- **critical** * 33- **alert** * 34- **emergency** * 35 36<small>* new levels</small> 37 38This allow to provide finer graded verbosity control, due to compatibility 39reasons, in Elixir backends we need to translate these levels back to "old" set 40of 4. The current table looks like: 41 42| Call level | What Elixir backend will see | 43| -- | -- | 44| `debug` | `debug` | 45| `info` | `info` | 46| `notice` | **`info`** * | 47| `warning` (or `warn`) | `warn` | 48| `error` | `error` | 49| `critical` | **`error`** * | 50| `alert` | **`error`** * | 51| `emergency` | **`error`** * | 52 53<small>* "translated" messages</small> 54 55We can set verbosity to all levels. This may be confusing during the transition 56period, but we cannot change the behaviour until Elixir 2 (which is not 57happening any time soon). 58 59Usage of the new levels is "obvious": 60 61```elixir 62Logger.notice("Hello") 63``` 64 65Will produce message with `notice` level of verbosity. 66 67Additionally the `logger.level` option in configuration supports 2 additional 68verbosity levels that you can use in your config: 69 70- `:all` - all messages will be logged, logically exactly the same as `:debug` 71- `:none` - no messages will be logged 72 73## Per module log level {#per-module-level} 74 75This is change that can be quite handy during debugging sessions. With this 76change we have 4 new functions in `Logger` module: 77 78- [`get_module_level/1`](https://hexdocs.pm/logger/Logger.html#get_module_level/1) 79- [`put_module_level/2`](https://hexdocs.pm/logger/Logger.html#put_module_level/2) 80- [`delete_module_level/1`](https://hexdocs.pm/logger/Logger.html#delete_module_level/1) 81- [`delete_all_module_level/0`](https://hexdocs.pm/logger/Logger.html#delete_all_module_level/0) 82 83These allow us to manipulate verbosity level on per-module basis. What is 84non-obvious and is super handy is that it allows both lowering **and raising** 85verbosity for given module. This mean that: 86 87```elixir 88require Logger 89 90Logger.configure(level: :error) 91 92defmodule Foo do 93 def run do 94 Logger.debug("I am still there") 95 end 96end 97 98Foo.run() # Does not log anything 99 100# Set `debug` level for `Foo` module only 101Logger.put_module_level(Foo, :debug) 102Foo.run() 103# `I am still there` is logged 104Logger.debug("I will not be printed") 105# Nothing got logged as top-level verbositi is still set to `:error` 106``` 107 108Of course it will not work if you decide to use [compile time purging][logger-purge] 109 110## Logger handlers {#handlers} 111 112--- 113 114**Warning!** This is not fully implemented in both - Erlang and Elixir. Writing 115your own handlers without additional knowledge can cause overload problems. 116 117--- 118 119Erlang together with their logging implementation needed to provide a way to 120ingest these logs somehow. This is done via Erlang logger handlers (in this 121article called *handlers* in contrast to Elixir backends called *backends* 122there). 123 124Handlers are modules that export at least 1 function `log/2` that takes 2 125arguments: 126 127- `log_event` which is a map with 3 fields: 128 - `:level` - verbosity level 129 - `:msg` - tuple describing message: 130 - `{:io.format(), [term()]}` - format string and list of terms that should 131 be passed to `:io_lib.format/2` function 132 - `{:report, map() | keyword()}` - report that can be formatted into string 133 by `report_cb/{1,2}` set in metadata map (see below) 134 - `{:string, :unicode.chardata()}` - raw string that should be printed as 135 a message 136 - `:meta` - map containing all metadata for given event. All keys should be 137 atoms and values can be anything. Some keys have special meaning, and some 138 of them will be populated automatically by the `Logger` macros and functions. 139 These are: 140 - `:pid` - PID of the process that fired log event 141 - `:gl` - group leader of the process that fired log event 142 - `:mfa` - tuple in form of `{module(), name :: atom(), arity :: non_neg_integer()}` 143 that describe function that fired log event 144 - `:file` - filename of file that defines the code that fired log event 145 - `:line` - line in the given file where the log event was fired 146 - `:domain` - list of atoms that can be used to describe log events 147 hierarchy which then can be used for filtering. All events fired using 148 `Logger` macros and functions will have `:elixir` prepended to their 149 domain list. 150 - `:report_cb` - function that will be used to format `{:report, map() | 151 keyword()}` messages. This can be either 1-ary function, that takes report 152 and returns `{:io.format(), [term()]}` leaving truncation and further 153 formatting up to the main formatter, or 2-ary function that takes report 154 and configuration map `%{depth: pos_integer() | :unlimited, chars_limit: 155 pos_integer() | :unlimited, single_line: boolean()}` and returns already 156 formatted `:unicode.chardata()`. More about it can be found in [separate 157 section](#structured-logging). 158 159Return value of this function is ignored. If there will be any exception raised 160when calling this function, then it will be captured and failing handler will be 161removed. This is important, as if such handler is the only one, then you can be 162left without any logging handler and miss logs. 163 164The important thing about Erlang handlers and Elixir backends is that Erlang 165handlers functions are called **within caller process** while Elixir backends 166are called in separate process. This mean that wrongly written Erlang handler 167can cause quite substantial load on application. 168 169To read on other, optional, callbacks that can be defined by Erlang handler, that 170will not be covered there, I suggest looking into [Erlang documentation][formatter_cb]. 171 172## Structured logging {#structured-logging} 173 174One of the biggest new features in the Elixir 1.11 is support for structured 175logging. This mean that the log message do not need to be free-form string, but 176instead we can pass structure, that can provide more machine-readable data for 177processing in log aggregators. In Elixir 1.11 is simple as passing map as a 178first argument to the `Logger` macros: 179 180```elixir 181Logger.info(%{ 182 status: :completed, 183 response: :ok 184}) 185``` 186 187This will produce message that looks like: 188 189```log 19014:08:46.849 [info] [response: :ok, status: :completed] 191``` 192 193As we can see, the map (called *report*) is formatted as a keyword list. This is 194default way to present the report data. Unfortunately we cannot access the 195metadata from the Elixir backends, but we have 2 ways to make these messages 196more readable for the human operator: 197 1981. Utilise [`Logger`'s translators](https://hexdocs.pm/logger/Logger.Translator.html) 1991. Using `:report_cb` field in metadata 200 2011st option is described quite good in Elixir documentation and is available 202since Elixir 1.0 as it was used to translate `error_logger` messages in old 203Erlang versions. Here I will describe the 2nd option which provide way for 204**caller** to define how report should be formatted into human-readable string. 205 206`:report_cb` accepts 2 kind of functions as an argument: 207 208- 1-ary function, that takes report as an argument and should return tuple 209 in form of `{:io.format(), [term()]}` that will be later formatted 210 respectively by the formatters. 211- 2-ary function that takes report and configuration map as an arguments and 212 should return formatted string. 213 2141st option is much easier for most use cases, as it do not force you to worry 215about handling width, depth, and multiline logs, as it will all be handled for 216you. 217 218For example, instead of doing: 219 220```elixir 221Logger.info("Started HTTP server on http://localhost:8080") 222``` 223 224We can do: 225 226```elixir 227Logger.info( 228 %{ 229 protocol: :http, 230 port: 8080, 231 address: "localhost", 232 endpoint: MyEndpoint, 233 handler: Plug.Cowboy 234 }, 235 report_cb: &__MODULE__.report_cb/1 236) 237 238# … 239 240def report_cb(%{protocol: protocol, port: port, address: address}) do 241 {"Started ~s server on ~s://~s:~B", [protocol, protocol, address, port]} 242end 243``` 244 245While the second entry seems much more verbose, with proper handler, it can 246provide much more detailed output. Just imagine that we would have handler that 247output JSON data and what information we could contain in such message: 248 249```json 250{ 251 "msg": "Started HTTP server on http://localhost:8080", 252 "metadata": { 253 "mfa": "MyMod.start/2", 254 "file": "foo.ex", 255 "line": 42 256 } 257} 258``` 259 260Now our log aggregation service need to parse `msg` field to extract all 261information that is contained there, like port, address, and protocol. With 262structured logging we can have that message available already there while 263presenting the "human readable" form as well: 264 265```json 266{ 267 "text": "Started HTTP server on http://localhost:8080", 268 "msg": { 269 "address": "localhost", 270 "port": 8080, 271 "protocol": "http", 272 "endpoint": "MyEndpoint", 273 "handler": "Plug.Cowboy" 274 }, 275 "metadata": { 276 "mfa": "MyMod.start/2", 277 "file": "foo.ex", 278 "line": 42 279 } 280} 281``` 282 283You can see there that we can have more information available in the structured 284log that would otherwise needed to be crammed somewhere into the text message, 285even if it is not important in "regular" Ops observability. 286 287This can raise a question - why not use metadata for such functionality, like it 288is available in [`LoggerJSON`][] or [`Ink`][]? The reason is that their reason 289existence is different. Metadata meant for "meta" stuff like location, tracing 290ID, but not for the information about the message itself. It is best shown on 291example. For this use Elixir's implementation of `GenServer` wrapper that 292produces error log entry on unknown message handled by default `handle_info/2`: 293 294```elixir 295Logger.error( 296 # Report 297 %{ 298 label: {GenServer, :no_handle_info}, 299 report: %{ 300 module: __MODULE__, 301 message: msg, 302 name: proc 303 } 304 }, 305 # Metadata 306 %{ 307 error_logger: %{tag: :error_msg}, 308 report_cb: &GenServer.format_report/1 309 } 310) 311``` 312 313As we can see there, the report contains informations like: 314 315- `:label` - that describes type of the event 316- `:report` - content of the "main" event 317 - `:module` - module that created the event, it is important to notice, that 318 it is also present in metadata (as part of `:mfa` key), but their meaning is 319 different. Module name here is meant for the operator to know the name of 320 the implementor that failed to handle message, while `:mfa` is meant to 321 describe the location of the code that fired the event. 322 - `:message` - the message itself that hasn't been handled. Notice, that it is 323 not stringified in any way there, it is simply passed "as is" to the 324 report. It is meant to be stringified later by the `:report_cb` function. 325 - `:name` - name of the process. Remember, similarly to `:module`, the PID of 326 the current process is part of the metadata, so in theory we could use value 327 from there, but their meaning is different (additionally this one may be an 328 atom in case if the process is locally registered with name). 329 330Metadata on the other hand contains information that will be useful for 331filtering or formatting of the event. 332 333The rule of thumb you can follow is: 334 335> If it is thing that you will want to filter on, then it probably should be 336> part of the metadata. If you want to aggregate information or just display 337> them, it should be part of the message report. 338 339## Log filtering 340 341Finally we come to first feature that is not directly accessible from the Elixir 342`Logger` API (yet). Erlang's `logger` have powerful functionality for filtering 343log messages which allows us to dynamically decide which message should, or 344should not be logged. These even can alter messages on the fly. 345 346Currently that functionality is available only via `:logger` module. It can be 347used like: 348 349```elixir 350defmodule MyFilter do 351 def filter(log_event, opts) do 352 # … 353 end 354end 355 356:logger.add_primary_filter(:my_filter, {&MyFilter.filter/2, opts}) 357# Or 358:logger.add_handler_filter(handler_id, :my_filter, {&MyFilter.filter/2, opts}) 359``` 360 361Few important things that need to be remembered when writing such filters: 362 363- It is best practice to make such functions public and define filters using 364 remote function capture, like `&__MODULE__.process_disabled/2` (so not 365 anonymous functions either). It will make such filter much easier for VM to 366 handle (it is bigger topic why it is that, I may to cover it in another post). 367- Filters are ran **within the same process that fired log event**, so it is 368 important to make such filters as fast as possible, and do not do any heavy 369 work there. 370 371Filters can be used for 2 different things: 372 373- preventing some messages from being logged 374- modifying a message 375 376While the former is much more common, I will try to describe both use cases 377there, as the latter is also quite useful. 378 379Filters are defined as 2-ary functions where 1st argument is log event, and 380second argument is any term that can be used as a configuration for filter. 381Filter should return one of these 3 values: 382 383- `:stop` - which will immediately discard message and do not run any additional 384 filters. 385- `:ignore` - which mean that given filter didn't recognise the given message 386 and leaves it up to other filters to decide on the action. If all filters 387 return `:ignore` then `:filter_default` option for the handler will be taken. 388 By default it is `:log`, which mean that message will be logged, but default 389 handler has it set to `:stop` by default, which mean, that non-matching 390 messages will be discarded. 391- Just log event (possibly modified) that will cause next filter to be called 392 with altered message. The message returned by the last filter (or in case of 393 `:ignore` return, previous filters) will be the message passed to handler. 394 395### Preventing some messages from being logged 396 397Most common use-case for filters will probably be rejecting messages that aren't 398important for us. [Erlang even prepared some useful filters][logger_filters]: 399 400- `domain` - allow filtering by metadata `:domain` field (remember as I said 401 that metadata is for filtering?). It supports multiple possible relations 402 between the log domain and defined domain. 403- `level` - allow filtering (in or out) messages depending on their level, in 404 both directions. It will allow you to filter messages with higher level for 405 some handlers. Just remember, that it will not receive messages that will not 406 pass primary/module level. 407- `progress` - filters all reports from `supervisor` and 408 `application_controller`. Simply, reduces startup/process shutdown chatter 409 that often is meaningless for most time. 410- `remote_gl` - filters messages coming from group leader on another node. 411 Useful when you want to discard/log messages coming from other nodes in 412 cluster. 413 414### Modifying a message 415 416Sometimes there is need to alter messages in the system. For example we may need 417to prevent sensitive information from being logged. When using "old" Elixir 418approach you could abuse translators, but that was error prone, as first 419successful translator was breaking pipeline, so you couldn't just smash one on 420top and then keep rest working as is. With "new" approach and structured logging 421you can just traverse the report and replace all occurrences of the unsafe data 422with anonymised data. For example: 423 424```elixir 425def filter_out_password(%{msg: {:report, report}} = event, _opts) do 426 %{event | msg: {:report, replace(report)}} 427end 428 429@filtered "[FILTERED]" 430 431defp replace(%{password: _} = map) do 432 for {k, v} <- %{map | password: @filtered}, into: %{} do 433 {k, replace(v)} 434 end 435end 436 437defp replace(%{"password" => _} = map) do 438 for {k, v} <- %{map | "password" => @filtered}, into: %{} do 439 {k, replace(v)} 440 end 441end 442 443defp replace(list) when is_list(list) do 444 for elem <- list do 445 case elem do 446 {:password, _} -> {:password, @filtered} 447 {"password", _} -> {"password", @filtered} 448 {k, v} -> {k, replace(v)} 449 other -> replace(other) 450 end 451 end 452end 453 454defp replace(other), do: other 455``` 456 457This snippet will replace all occurrences of `:password` or `"password"` with 458filtered out value. 459 460The disadvantage of such approach - it will make all messages with such fields 461allowed in case if your filter has `:filter_default` set to `:stop`. That mean, 462that if you want to make some of them rejected anyway, then you will need to 463manually add additional step to reject messages that do not fit into your 464patterns. Alternatively you can use `filter_default: :log` and then use opt-out 465logging. There currently is no way to alter the message and make other filters 466decide whether log it or not (as of OTP 24). 467 468## Summary 469 470New features and possibilities with relation to logging in Elixir 1.11 can be 471overwhelming. Fortunately all of the new features are optional and provided in 472addition to "good 'ol `Logger.info("logging")`". But for the people who works on 473the observability in BEAM (EEF Observability WG, Sentry, Logflare, etc.) it 474brings a lot of new powerful capabilities. 475 476I am thrilled to see what will people create using all that power. 477 478[erl-log]: https://erlang.org/doc/man/logger.html 479[syslog]: https://en.wikipedia.org/wiki/Syslog#Severity_level 480[`LoggerJSON`]: https://github.com/Nebo15/logger_json 481[`Ink`]: https://hex.pm/packages/ink 482[logger_filters]: https://erlang.org/doc/man/logger_filters.html 483[logger-purge]: https://hexdocs.pm/logger/Logger.html#module-application-configuration 484[formatter_cb]: https://erlang.org/doc/man/logger.html#formatter-callback-functions