


LAMSHRINK(1)                     LAM COMMANDS                     LAMSHRINK(1)




NAME

       lamshrink - Shrink a LAM universe.


SYNTAX

       lamshrink [-dhv] [-w <delay>] <nodeid>


OPTIONS

       -d            Print detailed debugging information.

       -h            Print useful information on this command.

       -v            Be verbose.

       <nodeid>      Remove the LAM node with this ID.

       -w <delay>    Notify processes on the doomed node and pause for <delay>
                     seconds before proceeding.


DESCRIPTION

       An existing LAM session, initiated by lamboot(1), can be shrunk to
       include fewer nodes with lamshrink.  Each invocation removes one node.
       At a minimum, the node ID must be given on the command line.  Once
       lamshrink completes, that node ID is invalid on all remaining nodes
       (as can be seen by running lamnodes(1)).
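
       Because each invocation removes exactly one node, shrinking a universe
       by several nodes means calling lamshrink once per node.  The following
       is a minimal sketch of such a wrapper, not part of LAM itself: the
       function name shrink_nodes and the LAMSHRINK variable are illustrative
       assumptions, and only the options documented on this page are used.

```shell
#!/bin/sh
# Hypothetical helper: remove several LAM nodes, one lamshrink call each.
# LAMSHRINK may be overridden to point at a different binary (assumption).
LAMSHRINK=${LAMSHRINK:-lamshrink}

# shrink_nodes <delay> <nodeid>...
shrink_nodes() {
    delay=$1
    shift
    for node in "$@"; do
        # -v reports progress; -w warns processes on the node and
        # pauses <delay> seconds before the node is removed.
        "$LAMSHRINK" -v -w "$delay" "$node" || {
            echo "lamshrink failed for $node" >&2
            return 1
        }
    done
}

# Example call (commented out so sourcing this file has no side effects):
# shrink_nodes 10 n3 n2
```

       After the loop finishes, lamnodes(1) can be run to confirm that the
       removed node IDs are no longer valid.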

       Existing application processes on the target node can be warned of  im-
       pending  shutdown  with  the -w option.  A LAM signal (SIGFUSE) will be
       sent to these processes and lamshrink will then  pause  for  the  given
       number  of  seconds  before  proceeding with removing the node.  By de-
       fault, SIGFUSE is ignored.  A different handler can be  installed  with
       ksignal(2).

       All application processes on all remaining nodes are always informed of
       the death of a node.  This is also  done  with  a  signal  (SIGSHRINK),
       which  by  default causes a process’s runtime route cache to be flushed
       (to remove any cached information on the dead node).  If this signal is
       re-vectored  for the purpose of fault tolerance, the old handler should
       be called at the beginning of the new handler.  The signal does not, by
       itself,  give  the  process information on which node has been removed.
       One technique for getting this information is to query the  router  for
       information  on  all  relevant  nodes using getroute(2).  The dead node
       will cause this routine to return an error.

   FAULT TOLERANCE
       If enabled with lamboot(1), LAM will watch for nodes  that  fail.   The
       procedure  for removing a node that has failed is the same as lamshrink
       after the warning step.  In particular, the SIGSHRINK signal is  deliv-
       ered.


EXAMPLES

       lamshrink -v n1
           Remove LAM on n1.  Report about important steps as they are done.

       lamshrink n30 -w 10
           Inform all processes on LAM node 30 that the node will be dead in
           10 seconds.  Wait 10 seconds and remove the node.  Operate silently.


SEE ALSO

       lamboot(1), lamnodes(1), ksignal(2), getroute(2)



LAM 7.1.1                       September, 2004                   LAMSHRINK(1)
