[cs615asa] HW#4

Thu Apr 18 23:03:00 EDT 2013

Hello,

I will be sending out the grades for HW#4 in a little bit.  Before doing
so, I wanted to point out a few things that I hope you learned as part
of this exercise:

The assignment is a fairly realistic task in a sysadmin's life.  That
is, we frequently have to create simple tools that wrap others, that
interact with multiple systems and which may behave differently based on
who is running them.

It is a common beginner's mistake to write a tool only for yourself.
When doing so, we hardcode filenames or private variables, and if anybody
else tried to run the command, it would inevitably fail.  This
exercise tried to lead you to _not_ do that, by explicitly noting the
possibilities offered via the environment variables.

If your submission does not use these variables, then your program
cannot be portable and is unlikely to work for users other than
yourself.
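
A minimal sketch of what this can look like in a shell script.  The
variable names BACKUP_SSH_FLAGS and BACKUP_VERBOSE are hypothetical and
only stand in for the variables named in the assignment; the two helper
functions are likewise made up for illustration.

```shell
#!/bin/sh
# BACKUP_SSH_FLAGS and BACKUP_VERBOSE are hypothetical names used for
# illustration; substitute the variables defined in the assignment.

# Print a message to stderr only if the user asked for verbose output.
log() {
    if [ -n "${BACKUP_VERBOSE:-}" ]; then
        echo "$*" >&2
    fi
}

# Print the ssh command line for a host, built from the user's own flags
# rather than from a hardcoded key file or username.
ssh_command() {
    # BACKUP_SSH_FLAGS is deliberately left unquoted so that several
    # flags split into separate words.
    echo ssh ${BACKUP_SSH_FLAGS:-} "$1"
}
```

With BACKUP_SSH_FLAGS unset, "ssh_command host1" prints "ssh host1"; a
user who needs a specific key sets the variable in their own environment
and the script itself does not change.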

As we have seen in this assignment, even though the systems we're
accessing are well defined, we cannot necessarily rely on the libraries
that we want to use to work flawlessly -- recall that we had to remove
the functionality surrounding the tagged volume because the AWS tools
had a bug.  This is something you will run into all the time.

You may have noticed that at its core, the tool you were asked to write
was not very complicated.  We could condense the main functionality into
the following pseudocode:

- identify size of the directory in question
- create a volume of double that size
- if no instance was specified, create one
- attach the volume to the instance
- write data to the remote volume using the chosen method
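
The first two steps of this list could be sketched as follows.  This
assumes that the volume size has to be given in whole gigabytes with a
minimum of 1; the function names are made up for illustration, and the
command that actually creates the volume is left out because it depends
on the AWS tools in use.

```shell
#!/bin/sh
# Print the size of a directory in kilobytes (du -k reports 1024-byte
# units).
dir_size_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Given a size in kilobytes, print twice that size in whole gigabytes,
# rounded up.  The minimum of 1 is an assumption about the volume API.
volume_size_gb() {
    kb="$1"
    gb_in_kb=$((1024 * 1024))
    size=$(( (2 * kb + gb_in_kb - 1) / gb_in_kb ))
    if [ "${size}" -lt 1 ]; then
        size=1
    fi
    echo "${size}"
}
```

For example, volume_size_gb "$(dir_size_kb /some/dir)" prints the size
to request for a volume that holds /some/dir twice over.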

The biggest increase in complexity derives from the way we choose to
wrap these commands.  This, too, is nearly universally true:  for small
tools like this, core functionality is often trivial, but getting the
surrounding logic, the error checking and multi-user implications right
takes up a lot of time and effort.

Here are a few mistakes that I saw repeatedly:

Many of you created a local .tar file of the directory to be backed up.
This is not a good idea: if I ask you to back up your entire filesystem
of several hundred gigabytes, you may not have the additional storage
space required to store an interim copy.  Secondly, if you try to create
a file under the directory you're trying to back up, you need to exclude
that file from the backup, which makes things more difficult.

Several of you also attempted to create such a file in the current
working directory, assuming that you have write permissions there.  This
is one of the many assumptions that you will learn do not hold when you
write a general purpose tool.  If you need to create a file, specify an
absolute path to write it to; otherwise your tool breaks whenever it is
run from a directory the user cannot write to.
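
If a scratch file really is needed, one common approach is sketched
below.  It assumes that mktemp(1) is available (it is not part of POSIX,
but most current systems ship it) and that TMPDIR, if set, holds an
absolute path; the "backup." prefix and the function name are arbitrary.

```shell
#!/bin/sh
# Create a uniquely named scratch file under ${TMPDIR}, falling back to
# /tmp, and print its path.
make_scratch_file() {
    mktemp "${TMPDIR:-/tmp}/backup.XXXXXX"
}

scratch=$(make_scratch_file) || {
    echo "cannot create temporary file" >&2
    exit 1
}

# Remove the scratch file when the script exits, whether it succeeded
# or failed.
trap 'rm -f "${scratch}"' EXIT
```

This does not address the storage space problem of an interim archive;
it only removes the assumption about the current working directory.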

I don't believe anybody understood how to perform the backup using
tar(1) and dd(1).  The idea was not to create a .tar file and store that
on a remote filesystem, but to write the archive to the remote volume as
raw disk space.  tar(1) creates an archive suitable for this -- it is
used to write archives to magnetic tapes as well as to raw disk space.  If
you just wanted to create a .tar file and store that on a different
filesystem, then you could simply copy the file instead of using dd(1).

dd(1) performs a byte-for-byte copy of its input to its output, and so
allows you to store the archive created with tar(1) on the volume without
requiring a filesystem to exist on the volume.  This has the advantage
that you can use any volume on any instance type without having to know
the operating system or file system.

The best way to perform this step would have been along the lines of
this command:

tar cf - directory | ssh instance "dd of=/volume"

I did not expect you all to know or even necessarily understand this
right away, but I did expect you to *ask* if you are unclear on
anything.  I do not recall anybody asking about this...
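
To try the idea without an instance, the same pipeline can be run
locally with a regular file standing in for the raw volume.  The sketch
below is only an illustration; the function names are made up, and on
the instance the target would be the device node of the attached volume,
reached through ssh.

```shell
#!/bin/sh
# Write DIR as a tar archive straight onto TARGET using dd.  TARGET
# stands in for the raw volume; a regular file is enough to try this.
# Note: the pipeline's exit status is that of dd, so a failure of tar
# needs to be detected separately.
write_archive() {
    dir="$1"
    target="$2"
    # -C changes into the parent directory so that the archive holds
    # relative paths.
    tar cf - -C "$(dirname "${dir}")" "$(basename "${dir}")" |
        dd of="${target}" 2>/dev/null
}

# List the contents of an archive stored on TARGET by reading it back
# with dd and handing the stream to tar.
list_archive() {
    dd if="$1" 2>/dev/null | tar tf -
}
```

Restoring works the same way as listing, with "tar xf -" in place of
"tar tf -".  No file system is created or mounted at any point.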

The second backup method is different -- it allows you to create
incremental backups, but requires a file system to exist on the volume.
This means that you have to make a choice of file system and handle the
situation where the specified instance cannot create the file system you
have chosen.  This can get tricky.

General programming advice:

- any function that can fail, will fail at some point; always check the
  return value or exit code of any function or command you invoke; all
  errors should be handled gracefully

- do not hardcode anything that you cannot 100% assume to be correct;
  anything that might be different for different users should either be
  a command-line option, come from a configuration file or from an
  environment variable

- use functions; logical code blocks spanning more than two or three
  screenfuls should be factored into their own functions

- break lines at around 80 characters; this makes the code more readable
  and gives you an indicator when you need to refactor into functions:
  if your indentation is so deep that you can't write any more code
  without line breaks, it's time to take that block and turn it into its
  own function

- try to re-read your code in a few weeks; can you still make
  sense of it?
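
As an illustration of the first point in this list, here is a small
sketch of checking exit codes in a shell script.  The helper names "try"
and "check_directory" are made up for this example and not taken from
any library.

```shell
#!/bin/sh
# Run a command; if it fails, report which command failed and return
# its exit status so that the caller can decide how to clean up.
try() {
    "$@"
    status=$?
    if [ "${status}" -ne 0 ]; then
        echo "command failed (${status}): $*" >&2
    fi
    return "${status}"
}

# Validate the input before doing any work on it.
check_directory() {
    [ -d "$1" ] || { echo "not a directory: $1" >&2; return 1; }
    [ -r "$1" ] || { echo "not readable: $1" >&2; return 1; }
}
```

A caller would then write something like
'check_directory "${dir}" || exit 1' before starting any work, and wrap
each later step in "try" so that a failure is reported and can be
handled instead of being silently ignored.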

Anyway, I hope in the end you all felt like you learned something from
this exercise.  As I said, tasks like these are indeed very common, and
it takes time and practice to get better at implementing them.

-Jan