Using strace and ltrace to help with troubleshooting on Linux

Both trace and trace are powerful command-line tools for debugging and troubleshooting programs in Linux: Strace captures and logs all system calls made by a process as well as received signals, while trace does the same for library calls.

If a program is acting differently than you expected, you can use these tools to see “behind the curtain” and maybe get some clues as to what’s going on.

Be warned, however. When you use any of these commands, you will end up with a lot of exit to go. Nevertheless, it can tell you a lot about how a process works and sometimes give you important insights.

trace

To run trace against a program, use a command like the one shown below—the trace command followed by the name of the program. The output below has been cropped.

$ strace who
execve(“/usr/bin/who”, [“who”], 0x7ffe889f45c0 /* 41 vars */) = 0
brk(NULL)                               = 0x55e7d6720000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff38c72b20) = -1 EINVAL (Invalid argument)
access(“/etc/ld.so.preload”, R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, “/etc/ld.so.cache”, O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, “”, {st_mode=S_IFREG|0644, st_size=60195, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 60195, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd1ba4d7000
close(3)
…

The above command runs the who ordered and reports on the system calls it makes. The beginning of each line (for example, execute, brk and arch_prctl) indicates the current system call.

To run trace against a running process to see what it is doing, use the command with the -p option followed by the process ID. Note that the trace the output starts at the 3rd line below after the process is connected.

$ strace -p 34512
strace: Process 34512 attached
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 34516
…

NOTE: You cannot redirect the output of the trace command through a pipe, but you can pipe it to a file using the -o option. In the example below, the Date the output of the command is displayed. After that, the top of the output file is displayed using the head ordered.

$ strace -o outputfile date
Tue May  3 03:33:52 PM EDT 2022
$ head -10 outputfile
execve(“/usr/bin/date”, [“date”], 0x7ffc30f3bd00 /* 41 vars */) = 0
brk(NULL)                               = 0x55e78c077000
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffe52bf2ec0) = -1 EINVAL (Invalid argument)
access(“/etc/ld.so.preload”, R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, “/etc/ld.so.cache”, O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, “”, {st_mode=S_IFREG|0644, st_size=60195, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 60195, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f589661b000
close(3)                                = 0
openat(AT_FDCWD, “/lib64/libc.so.6”, O_RDONLY|O_CLOEXEC) = 3
read(3, “177ELF21133>1PP4”..., 832) = 832

Remember you are looking at system calls, not programs. So don’t be surprised if you don’t recognize “execve”, “brk”, “arch_prctl”, etc. You can, however, consult the man pages for these system calls if you want to know more about what they do and how they work. Note that the man pages for the system calls are taken from section 2 of the man pages.

$ man execve | head -4
EXECVE(2)                  Linux Programmer’s Manual                 EXECVE(2) <==

NAME
       execve - execute program

Another useful way to use the trace command is to add the -vs option. When you do this, the output provides a summary of the system calls made ranked by percentage of time spent on each system call (most important first). As you can see in the example below, more time is spent on mmap (map files) than on any other system call.

$ strace -c date
Tue May  3 03:40:25 PM EDT 2022
% time     seconds  usecs/call     calls    errors syscall
——— —————- —————- ————- ————- ————————
 28.49    0.000106          11         9           mmap
 12.10    0.000045          11         4           openat
 10.22    0.000038           9         4           mprotect
  9.95    0.000037           6         6           newfstatat
  8.87    0.000033           8         4           pread64
  6.72    0.000025           4         6           close
  4.57    0.000017           5         3           read
  3.49    0.000013          13         1         1 access
  2.96    0.000011          11         1           munmap
  2.69    0.000010          10         1           write
  2.15    0.000008           2         3           brk
  1.88    0.000007           3         2         1 arch_prctl
  1.34    0.000005           5         1           set_tid_address
  1.34    0.000005           5         1           set_robust_list
  1.08    0.000004           4         1           lseek
  1.08    0.000004           4         1           prlimit64
  1.08    0.000004           4         1           getrandom
  0.00    0.000000           0         1           execve
——— —————- —————- ————- ————- ————————
100.00    0.000372           7        50         2 total

trace

the trace command works much the same as trace, but examines library calls. In the examples below we see that trace essentially uses the same syntax as trace.

This first example examines the library calls used by the who ordered.

$ ltrace who
strrchr(“who”, ‘/’)                              = nil
setlocale(LC_ALL, “”)                            ="en_US.UTF-8"
bindtextdomain(“coreutils”, “/usr/share/locale”) ="/usr/share/locale"
textdomain(“coreutils”)                          ="coreutils"
__cxa_atexit(0x560c6532d220, 0, 0x560c65334940, 0) = 0
getopt_long(1, 0x7ffc1f74a468, “abdlmpqrstuwHT”, 0x560c65334960, nil) = -1
setlocale(LC_TIME, nil)                          ="en_US.UTF-8"
strlen(“en_US.UTF-8”)                            = 11
__memcpy_chk(0x7ffc1f74a1f0, 0x560c65bf6370, 12, 257) = 0x7ffc1f74a1f0
strcmp(“en_US.UTF-8”, “POSIX”)                   = 21
utmpxname(0x560c65331016, 1, 0x560c65331266, 0x560c65331266) = 0
setutxent(0x560c6533101e, 0x7f2330570cc0, 5, 0)  = 1
…

The following example shows library calls for a running process:

$ ltrace -p 35243
sigemptyset(<>)                                  = 0
sigemptyset(<>)                                  = 0
sigaction(SIGINT, { nil, <>, 0x3f679854, 0x560643c72e53 }, { 0x560643c73f80, <>, 0x9018, 0x1000 }) = 0
ioctl(2, 21523, 0x7ffe3f679870)                  = 0
malloc(48)                                       = 0x56064548b020
strlen(“LINES”)                                  = 5
malloc(6)                                        = 0x56064548dfb0
strcpy(0x56064548dfb0, “LINES”)                  = 0x56064548dfb0
strlen(“LINES”)                                  = 5
malloc(6)

Sending in progress trace output to a file requires the use of the -o option.

$ ltrace -o outputfile date
Tue May  3 04:17:30 PM EDT 2022
$ head -10 outputfile
strrchr(“date”, ‘/’)                             = nil
setlocale(LC_ALL, “”)                            ="en_US.UTF-8"
bindtextdomain(“coreutils”, “/usr/share/locale”) ="/usr/share/locale"
textdomain(“coreutils”)                          ="coreutils"
__cxa_atexit(0x55e8afe3f280, 0, 0x55e8afe4e000, 0) = 0
getopt_long(1, 0x7ffdc8986948, “d:f:I::r:Rs:u”, 0x55e8afe4e920, nil) = -1
nl_langinfo(0x2006c, 0x7ffdc8986948, 0, 0)       = 0x7f3afe645435
getenv(“TZ”)                                     = nil
malloc(128)                                      = 0x55e8b05b0440
clock_gettime(0, 0x7ffdc8986740, 1, 145)         = 0

Note that the man page for library calls is from section 3 of the man pages.

$ man strrchr | head -4
STRCHR(3)                  Linux Programmer’s Manual                 STRCHR(3) <==

NAME
       strchr, strrchr, strchrnul - locate character in string
$ man strlen | head -4
STRLEN(3)                  Linux Programmer’s Manual                 STRLEN(3)

NAME
       strlen - calculate the length of a string

By using the -vs options with tracelike with traceorders the list with the longest execution times first.

$ ltrace -c pwd
/home/shs
% time     seconds  usecs/call     calls      function
——— —————- —————- ————- ——————————
 16.02    0.000428         428         1 setlocale
 10.82    0.000289          72         4 __freading
  7.93    0.000212         212         1 puts
  6.74    0.000180          90         2 fclose
  5.39    0.000144          72         2 fileno
  5.39    0.000144         144         1 getcwd
  5.35    0.000143          71         2 __fpending
  5.28    0.000141          70         2 fflush
  4.79    0.000128         128         1 free
  4.64    0.000124         124         1 bindtextdomain
  4.49    0.000120         120         1 exit_group
  4.49    0.000120         120         1 getenv
  4.19    0.000112         112         1 getopt_long
  4.00    0.000107         107         1 textdomain
  3.97    0.000106         106         1 strrchr
  3.93    0.000105         105         1 __cxa_atexit
  2.58    0.000069          69         1 __cxa_finalize
——— —————- —————- ————- ——————————
100.00    0.002672                    24 total

Installing strace and ltrace

Commands like these are used to install trace:

$ sudo apt install strace	# Debian/Ubuntu
$ sudo yum install strace	# RHEL/Centos
$ sudo dnf install strace	# Fedora

In the same way, trace will install with commands like these:

$ sudo apt install ltrace	# Debian/Ubuntu
$ sudo yum install ltrace	# RHEL/Centos
$ sudo dnf install ltrace	# Fedora

Conclude

The two trace and trace Commands provide abundant output, but can be very useful, especially once you get used to the output they provide.

Join the Network World communities on Facebook and LinkedIn to comment on topics that matter to you.

Copyright © 2022 IDG Communications, Inc.

Leave a Reply

Your email address will not be published.