*this
home about contact feed
projects
vagra vaBlo
categories
C++(7) Solaris(4) tntnet(3) vi(3) Linux(2) DeleGate(2) Postgres(2) proxy(2) cxxtools(1) regex(1) readline(1) dtrace(1) gcc(1) MeeGo(1) ssh(1) firefox(1)

Mesure times in system calls with dtrace

drace syscall times

Sometimes processes are slow without consuming much ressources. In such cases you can try to use truss (or strace on some systems) to find out what the process is waiting for, however, in multithreaded processes this might not help you that much. Also tracing the process from userland will slow down the process further what might not be an option in production einvironments.

Using dtrace will not only be less intrusive to your application flow, it can also summarize the times for you. The following script will not only summarize the times spending in system calls (/opt/DTT/procsystime would do), it will also print you a line for each system call when it retruns

#!/usr/sbin/dtrace -s

#pragma D option quiet

syscall:::entry
/ zonename == "myzone" && execname == "tntnet" /
{
        self->t=timestamp;
}

syscall:::return
/ zonename == "myzone" && self->t!=0 && execname == "tntnet" /
{
        printf("%d/%d spent %d nsecs in %s\n",
        pid, tid, timestamp - self->t, probefunc);

        @function[probefunc] = sum (timestamp - self->t);
}

For these who are familiar with awk, the syntax will not surprise you, in / / there are filters on which the code blocks apply, self is a special variable you can use to ensure coherency between threads.

...
14611/52 spent 3415416747 nsecs in pollsys
14611/52 spent 8583 nsecs in gtime
14611/52 spent 5333 nsecs in gtime
14611/52 spent 21666 nsecs in lwp_park
14611/52 spent 2798833 nsecs in nanosleep
14611/52 spent 20010167 nsecs in pollsys
14611/52 spent 18000 nsecs in read
14611/52 spent 9864917 nsecs in nanosleep
14611/52 spent 5417 nsecs in gtime
14611/53 spent 10008364573 nsecs in lwp_park
14611/53 spent 11500 nsecs in gtime
14611/53 spent 5500 nsecs in gtime
^C

  lseek                                                         29583
  fcntl                                                         34666
  close                                                         45083
  accept                                                        57417
  getpid                                                       133751
  open                                                         134834
  stat                                                         248915
  gtime                                                        282585
  read                                                        1725499
  write                                                       2817832
  nanosleep                                                  80435583
  pollsys                                                  5699637580
  lwp_park                                                35622920131

You see, that the process is spending the most time on beeing idle and waiting for requests, but tntnet is actually not a really good example since it doesn't appreas to be slow.

Write comment