cat

Let’s start with the simplest case: when cat is run with no arguments. Such a program is gratifyingly simple in C:

#include <stdio.h>
int main() {
  int c;
  while(EOF != (c = getchar())) putchar(c);
  return 0;
}

while Go requires more lines for the same behaviour:

package main
import ("bufio";"io";"os")
func main() {
  in := bufio.NewReader(os.Stdin)
  out := bufio.NewWriter(os.Stdout)
  for {
    b,er := in.ReadByte()
    if io.EOF == er { break }
    out.WriteByte(b)
    if '\n' == b { out.Flush() }
  }
  out.Flush()
}

The standard C library flushes the output buffer when a newline character is written and when the program terminates normally. Go requires manual flushing. This adds flexibility and clarity, but means a programmer must do a bit more work to mimic C’s behaviour.

It is perhaps no coincidence that a principal designer of Go has argued that tools are weakened by the built-in concept of a line. Unlike standard C libraries, the Go libraries treat all characters equally. May this foster the development of similarly unprejudiced tools!

Reading from standard input requires care. In C, an fread() for n bytes will block until n bytes are received, possibly including newlines, or if there is an end-of-file or an error.

One might guess that Go’s Read() behaves similarly, but in fact it impatiently returns as long as at least one byte has been read. if you are running such a program interactively, then when you hit Enter, a Read() call reads up to and including the newline even if the there are fewer than n bytes.

Instead, use the io.ReadFull() function to emulate C.

The complete cat

Let’s examine a fully grown cat, namely one that treats its arguments as files to print to standard output. First, in C:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void cat(FILE* f) {
  int c;
  while(EOF != (c = getc(f))) {
    if (EOF == putchar(c)) {
      perror("cat");
      return;
    }
  }
  if (ferror(f)) {
    perror("cat");
    return;
  }
}
int main(int argc, char **argv) {
  if (argc==1) return cat(stdin), 0;
  for(int i=1; i<argc; i++) {
    if (!strcmp("-", argv[i])) {
      cat(stdin);
      continue;
    }
    FILE *f = fopen(argv[i], "r");
    if (!f) {
      perror("cat");
      continue;
    }
    cat(f);
    fclose(f);
  }
  return 0;
}

A rough Go translation:

package main
import("bufio";"os";"flag")
func main() {
  cat := func(f *os.File) {
    in := bufio.NewReader(f)
    out := bufio.NewWriter(os.Stdout)
    for done := false; !done; {
      switch b,er := in.ReadByte(); er {
      case os.EOF: done = true
      case nil:
        if nil != out.WriteByte(b) {
          println("cat:", f.Name() + ":", er.String())
          done = true
        }
      default:
        println("cat:", f.Name() + ":", er.String())
        done = true
      }
    }
    out.Flush()
  }
  flag.Parse()
  if 0 == flag.NArg() {
    cat(os.Stdin)
    return
  }
  for i := 0; i < flag.NArg(); i++ {
    if flag.Arg(i) == "-" {
      cat(os.Stdin)
      continue
    }
    f,er := os.Open(flag.Arg(i),os.O_RDONLY,0)
    if f == nil {
      println("cat:", er.String())
      continue
    }
    cat(f)
    f.Close()
  }
}

We’ve taken this opportunity to show off lexical closures and anonymous functions. The cat function definition is nested in the main function, and is anonymous until assigned to the cat variable. GNU C supports lexical closures, but not anonymous functions.

Choosing to embrace the new philosophy, we no longer flush on newline.

Flags, signals, buffers

Our cat is now complete. Some may claim we are missing command-line flags, but flags should never have been added to cat in the first place.

However, there are at least two more wrinkles. Firstly, if a signal interrupts a read, we may abort and report an EINTR error even though retrying is desirable.

Secondly, for efficiency, we should avoid bufio and do our own buffering, namely read and write blocks instead of single characters. Choosing a good block size is tricky. As discussed in the comments of system.h in the source of the GNU Coreutils, the st_blksize from a stat call usually provides a good recommendation, but in certain circumstances returns 0 or an unreasonably large number. Additionally, we must determine the minimum block size that minimizes syscall overhead. Again, the comments in system.h shed some light on this:

/* As of Mar 2009, 32KiB is determined to be the minimium
   blksize to best minimize system call overhead.
   This can be tested with this script with the results
   shown for a 1.7GHz pentium-m with 2GB of 400MHz DDR2 RAM:

   for i in $(seq 0 10); do
     size=$((8*1024**3)) #ensure this is big enough
     bs=$((1024*2**$i))
     printf "%7s=" $bs
     dd bs=$bs if=/dev/zero of=/dev/null count=$(($size/$bs)) 2>&1 |
     sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
   done

      1024=734 MB/s
      2048=1.3 GB/s
      4096=2.4 GB/s
      8192=3.5 GB/s
     16384=3.9 GB/s
     32768=5.2 GB/s
     65536=5.3 GB/s
    131072=5.5 GB/s
    262144=5.7 GB/s
    524288=5.7 GB/s
   1048576=5.8 GB/s

   Note that this is to minimize system call overhead.
   Other values may be appropriate to minimize file system
   or disk overhead.  For example on my current GNU/Linux system
   the readahead setting is 128KiB which was read using:

   file="."
   device=$(df -P --local "$file" | tail -n1 | cut -d' ' -f1)
   echo $(( $(blockdev --getra $device) * 512 ))

   However there isn't a portable way to get the above.
   In the future we could use the above method if available
   and default to io_blksize() if not.
 */

It seems the logical next step for our program would be to replace the single character I/O with 32KiB-block I/O. Let’s move on to another program instead.





Ben Lynn blynn@cs.stanford.edu