Writing Fast Apps


As part of the ongoing Small Systems Performance effort, I put together this presentation for SunNetwork 9/2003.  It contains some useful tools and techniques for improving application performance under Solaris.  This web page contains some expansion of the ideas in the presentation and will be updated as additional tools and examples are available.

Ptime example

Ptime is a useful way of obtaining process cpu time information, and enabling microstate accounting while doing so.  Microstate accounting tracks each thread's transition from user to system to idle, etc, and does accounting accurately.  This is in contrast to the statistical accounting performed by the clock thread at HZ intervals.  The following program illustrates this rather nicely:
/* waste.c */
#include

main()
{
int i, s;
for(s = time(NULL); time(NULL) < s + 5; ) {
poll(0,0,1); /* sleep for a bit to line up witht clock thread */
for (i=0;i<10000;i++)
getpid(); /* waste some time */
}
}


You can see the usefulness of microstate accounting for tracking down apps that sleep for a bit and then compute for a bit when we compare the results of time(1) vs ptime(1):

cyber% cc -o waste waste.c
cyber% time ./waste
0.0u 0.0s 0:04 0% 0+0k 0+0io 0pf+0w
cyber% ptime ./waste

real 4.529
user 1.975
sys 0.637
cyber%


Future plans for Solaris are to enable microstate accounting by default to obtain more accurate accounting. Note that prstat -m will also use microstate accounting; you can also enable microstate accounting with a small program using the proc(4) interfaces.

Truss examples

truss -c will count system call during a processes lifetime and display elapsed time spent in each call:

barts@cyber:/home/barts 77% truss -c ls > /dev/null

syscall seconds calls errors
_exit .000 1
write .000 1
open .000 6
close .000 6
time .000 1
brk .000 16
stat .000 6
getpid .000 1
fstat .000 5
ioctl .000 2 2
execve .000 1
fcntl .000 1
getcontext .000 1
setustack .000 1
mmap .000 11
munmap .000 3
getrlimit .000 1
memcntl .000 2
resolvepath .000 6
getdents64 .000 4
lstat64 .000 1
fstat64 .000 3
-------- ------ ----
sys totals: .003 80 2
usr time: .003
elapsed: .020

We can also use truss to track calls to functions in shared libraries. Here we disable tracing system calls since it clutters up the output:

barts@cyber:/home/barts 82% truss -t\!all -u:malloc ls -l > /dev/null
/1: -> libc:malloc(0x100, 0x24800, 0x24800, 0x0)
/1: <- libc:malloc() = 0x26158
/1: -> libc:malloc(0x9000, 0x40, 0x83f74, 0x0)
/1: <- libc:malloc() = 0x26260
/1: -> libc:malloc(0x9000, 0x200, 0x31688, 0x26258)
/1: <- libc:malloc() = 0x31890
/1: -> libc:malloc(0x9000, 0x300, 0x3a898, 0x31888)
/1: <- libc:malloc() = 0x3aba0
/1: -> libc:malloc(0x9000, 0x400, 0x43ba8, 0x3ab98)
/1: <- libc:malloc() = 0x43fb0
/1: -> libc:malloc(0x9000, 0x500, 0x4cfb8, 0x43fa8)
/1: <- libc:malloc() = 0x4d4c0
/1: -> libc:malloc(0x9000, 0x600, 0x564c8, 0x4d4b8)
/1: <- libc:malloc() = 0x56ad0
barts@cyber:/home/barts 83%


Note that the syntax for the -u option takes some getting used to...

<more coming>