mirror of
https://github.com/wezm/wezm.net.git
synced 2024-12-18 18:29:54 +00:00
Add Docker debug post
This commit is contained in:
parent
ab0d9faafe
commit
4b12ad7fef
1 changed files with 221 additions and 0 deletions
221
v2/content/posts/2023/systemd-gdb-coredumps.md
Normal file
221
v2/content/posts/2023/systemd-gdb-coredumps.md
Normal file
|
@ -0,0 +1,221 @@
|
|||
+++
|
||||
title = "Debugging a Docker Core Dump"
|
||||
date = 2023-02-25T10:39:49+10:00
|
||||
|
||||
#[extra]
|
||||
#updated = 2023-01-11T21:11:28+10:00
|
||||
+++
|
||||
|
||||
On my main machine I use an excellent cross-platform tool called [Docuum] that
|
||||
automatically cleans up unused docker images. This allows me to use Docker
|
||||
without the need to periodically wonder why I'm out of disk space, run `docker
|
||||
system prune` and recover half my disk.
|
||||
|
||||
I installed Docuum via [the AUR package][aurpkg] (although tweaked to build the
|
||||
latest Docuum release) and ran it via the bundled systemd service definition.
|
||||
This worked great for a while but some time back it started failing. Every time
|
||||
Docuum would try to check for things to clean up I'd see the following in the
|
||||
system journal:
|
||||
|
||||
<!-- more -->
|
||||
|
||||
```
|
||||
Feb 25 10:03:12 ryzen docuum[77751]: [2023-02-25 10:03:12 +10:00 INFO] Performing an initial vacuum on startup…
|
||||
Feb 25 10:03:12 ryzen kernel: audit: type=1326 audit(1677283392.831:2): auid=4294967295 uid=0 gid=0 ses=4294967295 pid=77763 comm="docker" exe="/usr/bin/docke>
|
||||
Feb 25 10:03:12 ryzen systemd[1]: Created slice Slice /system/systemd-coredump.
|
||||
Feb 25 10:03:12 ryzen systemd[1]: Started Process Core Dump (PID 77768/UID 0).
|
||||
Feb 25 10:03:13 ryzen systemd-coredump[77769]: [🡕] Process 77763 (docker) of user 0 dumped core.
|
||||
|
||||
Stack trace of thread 77763:
|
||||
#0 0x00005568dbcb5c4e n/a (docker + 0x243c4e)
|
||||
#1 0x00005568dbd35a3b n/a (docker + 0x2c3a3b)
|
||||
#2 0x00005568dbd3482f n/a (docker + 0x2c282f)
|
||||
#3 0x00005568dbd6c2ee n/a (docker + 0x2fa2ee)
|
||||
#4 0x00005568dbcfafa8 n/a (docker + 0x288fa8)
|
||||
#5 0x00005568dbcfaef1 n/a (docker + 0x288ef1)
|
||||
#6 0x00005568dbced953 n/a (docker + 0x27b953)
|
||||
#7 0x00005568dbd1eb41 n/a (docker + 0x2acb41)
|
||||
ELF object binary architecture: AMD x86-64
|
||||
Feb 25 10:03:13 ryzen docuum[77751]: [2023-02-25 10:03:13 +10:00 ERROR] Unable to list images.
|
||||
Feb 25 10:03:13 ryzen docuum[77751]: [2023-02-25 10:03:13 +10:00 INFO] Retrying in 5 seconds…
|
||||
Feb 25 10:03:13 ryzen systemd[1]: systemd-coredump@0-77768-0.service: Deactivated successfully.
|
||||
```
|
||||
|
||||
This would repeat every 5 seconds. I ignored this for a while but finally decided
|
||||
to investigate it today. To find the failing command I ran
|
||||
`coredumpctl list` then identified one of the docker crashes and ran
|
||||
`coredumpctl info` with its PID:
|
||||
|
||||
```
|
||||
$ coredumpctl info 78255
|
||||
PID: 78255 (docker)
|
||||
UID: 0 (root)
|
||||
GID: 0 (root)
|
||||
Signal: 31 (SYS)
|
||||
Timestamp: Sat 2023-02-25 10:03:23 AEST (44min ago)
|
||||
Command Line: docker image ls --all --no-trunc --format $'{{.ID}}\\t{{.Repository}}\\t{{.Tag}}\\t{{.CreatedAt}}'
|
||||
Executable: /usr/bin/docker
|
||||
Control Group: /system.slice/docuum.service
|
||||
Unit: docuum.service
|
||||
Slice: system.slice
|
||||
Boot ID: 0ac9f0dd246548949c3a90a0e7494665
|
||||
Machine ID: affcb0b7a7d1464385d65464d9be450e
|
||||
Hostname: ryzen
|
||||
Storage: /var/lib/systemd/coredump/core.docker.0.0ac9f0dd246548949c3a90a0e7494665.78255.1677283403000000.zst (inaccessible)
|
||||
Message: Process 78255 (docker) of user 0 dumped core.
|
||||
```
|
||||
|
||||
Strangely I could run the command myself (`Command Line` line) just fine. I
|
||||
figured I needed to see where in docker it was crashing. I learned how to
|
||||
access systemd coredumps with gdb and ran: `sudo coredumpctl gdb 78255`. `sudo`
|
||||
is needed because the core dump belongs to root due the crashing `docker`
|
||||
process belonging to root. This didn't yield much extra info as debug symbols
|
||||
were not present for the binary. It did identify why it crashed though:
|
||||
|
||||
> Program terminated with signal SIGSYS, Bad system call.
|
||||
|
||||
Knowing that docker was implemented in Go and that they make system calls
|
||||
manually on Linux I wondered if this was some sort of Go bug—although given
|
||||
the popularity of Docker this did seem unlikely.
|
||||
|
||||
To get more info I needed debug symbols. Arch Linux makes these available via
|
||||
[debuginfod](https://wiki.archlinux.org/title/Debuginfod) and gdb can
|
||||
automatically download debuginfo files if `DEBUGINFOD_URLS` is set. I reran
|
||||
`gdb`, telling `sudo` to pass the `DEBUGINFOD_URLS` environment variable
|
||||
through (I had already set `DEBUGINFOD_URLS=https://debuginfod.archlinux.org/`
|
||||
in my Zsh config some time ago):
|
||||
|
||||
```
|
||||
sudo --preserve-env=DEBUGINFOD_URLS coredumpctl gdb 78255
|
||||
```
|
||||
|
||||
Now there was a proper backtrace:
|
||||
|
||||
```
|
||||
#0 runtime/internal/syscall.Syscall6 () at runtime/internal/syscall/asm_linux_amd64.s:36
|
||||
#1 0x00005588e7b93c33 in syscall.RawSyscall6 (num=160, a1=7, a2=94046491589710, a3=7, a4=824634289408, a5=0, a6=0, r1=<optimized out>, r2=<optimized out>,
|
||||
errno=<optimized out>) at runtime/internal/syscall/syscall_linux.go:38
|
||||
#2 0x00005588e7c13a3b in syscall.RawSyscall (trap=160, a1=7, a2=94046491589710, a3=7, r1=<optimized out>, r2=<optimized out>, err=<optimized out>)
|
||||
at syscall/syscall_linux.go:62
|
||||
#3 0x00005588e7c1282f in syscall.Setrlimit (resource=<optimized out>, rlim=<optimized out>, err=...) at syscall/zsyscall_linux_amd64.go:1326
|
||||
#4 0x00005588e7c4a2ee in os.init.1 () at os/rlimit.go:30
|
||||
#5 0x00005588e7bd8fa8 in runtime.doInit (t=0x5588e91a7be0 <os.[inittask]>) at runtime/proc.go:6506
|
||||
#6 0x00005588e7bd8ef1 in runtime.doInit (t=0x5588e91a9900 <main.[inittask]>) at runtime/proc.go:6483
|
||||
#7 0x00005588e7bcb953 in runtime.main () at runtime/proc.go:233
|
||||
#8 0x00005588e7bfcb41 in runtime.goexit () at runtime/asm_amd64.s:1598
|
||||
#9 0x0000000000000000 in ?? ()
|
||||
```
|
||||
|
||||
So the issue seems to be a call to `setrlimit`. Looking at [the code][gocode]
|
||||
and searching the Go issue tracker didn't turn up anyone else having this
|
||||
issue, which pointed at an issue on my system.
|
||||
|
||||
I'm honestly not sure what led me to the next step but I decided to take a look
|
||||
at the Docuum service definition. I was surprised to see that it was more
|
||||
complicated than most definitions I'm used to seeing:
|
||||
|
||||
```ini
|
||||
[Unit]
|
||||
Description=LRU eviction of Docker images
|
||||
Documentation=https://github.com/stepchowfun/docuum
|
||||
DefaultDependencies=false
|
||||
After=docker.service docker.socket
|
||||
Requires=docker.service docker.socket
|
||||
|
||||
[Service]
|
||||
Type=simple
|
||||
Environment=DOCUUM_THRESHOLD=10GB
|
||||
EnvironmentFile=-/etc/default/docuum
|
||||
ExecStart=/usr/bin/docuum --threshold $DOCUUM_THRESHOLD
|
||||
ProtectSystem=full
|
||||
PrivateTmp=true
|
||||
PrivateDevices=true
|
||||
PrivateNetwork=true
|
||||
CapabilityBoundingSet=
|
||||
KeyringMode=private
|
||||
RestrictNamespaces=~cgroup ipc net mnt pid user uts
|
||||
RestrictAddressFamilies=AF_UNIX
|
||||
ReadWritePaths=/var/run/docker.sock
|
||||
DeviceAllow=
|
||||
IPAddressDeny=any
|
||||
NoNewPrivileges=true
|
||||
PrivateTmp=true
|
||||
PrivateDevices=true
|
||||
PrivateMounts=true
|
||||
PrivateUsers=true
|
||||
ProtectControlGroups=true
|
||||
ProtectSystem=strict
|
||||
ProtectHome=tmpfs
|
||||
ProtectKernelModules=true
|
||||
ProtectKernelTunables=true
|
||||
RestrictSUIDSGID=true
|
||||
SystemCallArchitectures=native
|
||||
SystemCallFilter=@system-service
|
||||
SystemCallFilter=~@privileged @resources
|
||||
RestrictRealtime=true
|
||||
LockPersonality=true
|
||||
MemoryDenyWriteExecute=true
|
||||
UMask=0077
|
||||
ProtectHostname=true
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
```
|
||||
|
||||
Suspiciously it seemed to be doing some sandboxing and filtering system calls
|
||||
(`SystemCallFilter`). A bit more research pointed me to `systemd-analyze
|
||||
syscall-filter`, which lists which system calls belong to the predefined system
|
||||
call sets (`@privileged`, `@resources`, etc.).
|
||||
|
||||
`setrlimit` was listed under `@resources`:
|
||||
|
||||
```
|
||||
@resources
|
||||
# Alter resource settings
|
||||
ioprio_set
|
||||
mbind
|
||||
migrate_pages
|
||||
move_pages
|
||||
nice
|
||||
sched_setaffinity
|
||||
sched_setattr
|
||||
sched_setparam
|
||||
sched_setscheduler
|
||||
set_mempolicy
|
||||
setpriority
|
||||
setrlimit
|
||||
```
|
||||
|
||||
The [systemd docs for `SystemCallFilter`][SystemCallFilter] also mentioned:
|
||||
|
||||
> If the first character of the list is "~", the effect is inverted: only the
|
||||
> listed system calls will result in immediate process termination
|
||||
> (deny-listing)
|
||||
|
||||
So we finally had our culprit: the service definition was denying system calls in
|
||||
the `@resources` set and at some point `docker` had started making `setrlimit` calls,
|
||||
which were resulting in termination.
|
||||
|
||||
The fix was simple enough: I removed `@resources` from the deny list, rebuilt
|
||||
the package, and then re-enabled the `docuum` service (I'd previously disabled
|
||||
it do to the constant crashes). I was pleased to see it start successfully and
|
||||
begin vacuuming up a few months of Docker image detritus.
|
||||
|
||||
### Conclusion
|
||||
|
||||
This small debugging session taught me a number of things: I learned how to
|
||||
find and list core dumps managed by systemd, how to open them in GDB with
|
||||
symbols present, and that systemd has powerful, fine-grained system call
|
||||
sandboxing.
|
||||
|
||||
I was ultimately able to resolve the issue and get Docuum working again.
|
||||
I have published my patched version of the AUR package to my personal AUR
|
||||
repo in case it's useful to anyone else:
|
||||
|
||||
<https://github.com/wezm/aur/tree/master/docuum>.
|
||||
|
||||
[aurpkg]: https://aur.archlinux.org/packages/docuum
|
||||
[Docuum]: https://github.com/stepchowfun/docuum
|
||||
[gocode]: https://github.com/golang/go/blob/203e59ad41bd288e1d92b6f617c2f55e70d3c8e3/src/syscall/zsyscall_linux_amd64.go#L1335
|
||||
[service]: https://aur.archlinux.org/cgit/aur.git/tree/docuum.service?h=docuum&id=650d2c24fe9df712e8a98dde37f3ee47d3af4e47
|
||||
[SystemCallFilter]: https://www.freedesktop.org/software/systemd/man/systemd.exec.html#System%20Call%20Filtering
|
Loading…
Reference in a new issue