<navid="sidebar"class="sidebar"aria-label="Table of contents">
<divclass="sidebar-scrollbox">
<olclass="chapter"><liclass="chapter-item expanded affix "><ahref="zshguide.html">A User's Guide to the Z-Shell</a></li><liclass="chapter-item expanded "><ahref="zshguide01.html"><strongaria-hidden="true">1.</strong> A short introduction</a></li><liclass="chapter-item expanded "><ahref="zshguide02.html"><strongaria-hidden="true">2.</strong> What to put in your startup files</a></li><liclass="chapter-item expanded "><ahref="zshguide03.html"class="active"><strongaria-hidden="true">3.</strong> Dealing with basic shell syntax</a></li><liclass="chapter-item expanded "><ahref="zshguide04.html"><strongaria-hidden="true">4.</strong> The Z-Shell Line Editor</a></li><liclass="chapter-item expanded "><ahref="zshguide05.html"><strongaria-hidden="true">5.</strong> Substitutions</a></li><liclass="chapter-item expanded "><ahref="zshguide06.html"><strongaria-hidden="true">6.</strong> Completion, old and new</a></li><liclass="chapter-item expanded "><ahref="zshguide07.html"><strongaria-hidden="true">7.</strong> Modules and other bits and pieces Not written</a></li></ol>
<buttonid="sidebar-toggle"class="icon-button"type="button"title="Toggle Table of Contents"aria-label="Toggle Table of Contents"aria-controls="sidebar">
<ahref="print.html"title="Print this book"aria-label="Print this book">
<iid="print-button"class="fa fa-print"></i>
</a>
</div>
</div>
<divid="search-wrapper"class="hidden">
<formid="searchbar-outer"class="searchbar-outer">
<inputtype="search"id="searchbar"name="searchbar"placeholder="Search this book ..."aria-controls="searchresults-outer"aria-describedby="searchresults-header">
<p>Builtin commands, or builtins for short, are commands which are part of
the shell itself. Since builtins are necessary for controlling the
shell's own behaviour, introducing them actually serves as an
introduction to quite a lot of what is going on in the shell. So a fair
fraction of what would otherwise appear later in the chapter has
accumulated here, one way or another. This does make things a little
tricksy in places; count how many times I use the word `<code>subtle</code>' and
keep it for your grandchildren to see.</p>
<p>I just described one reason for builtins, but there's a simpler one:
speed. Going through the process of setting up an entirely new
environment for the command at the beginning, swapping between this
command and anything else which is being run on the computer, then
destroying it again at the end is considerable overkill if all you want
to do is, say, print out a message on the screen. So there are builtins
for this sort of thing.</p>
<p><spanid="l32"></span></p>
<h3id="321-builtins-for-printing"><aclass="header"href="#321-builtins-for-printing">3.2.1: Builtins for printing</a></h3>
<p>The commands `<code>echo</code>' and `<code>print</code>' are shell builtins; they just show
what you typed, after the shell has removed all the quoting. The
difference between the two is really historical: `<code>echo</code>' came first,
and only handled a few simple options; ksh provided `<code>print</code>', which
had more complex options and so became a different command. The
difference remains between the two commands in zsh; if you want wacky
effects, you should look to <code>print</code>. Note that there is usually also an
external command called <code>echo</code>, which may not be identical to zsh's;
there is no standard external command called <code>print</code>, but if someone has
installed one on your system, the chances are it sends something to the
printer, not the screen.</p>
<p>One special effect is `<code>print -z</code>' puts the arguments onto the editing
buffer stack, a list maintained by the shell of things you are about to
edit. Try:</p>
<pre><code> print -z print -z print This is a line
</code></pre>
<p>(it may look as if something needs quoting, but it doesn't) and hit
return three times. The first time caused everything after the first
`<code>print -z</code>' to appear for you to edit, and so on.</p>
<p>For something more useful, you can write functions that give you a line
to edit:</p>
<pre><code> fn() { print -z print The time now is $(date); }
</code></pre>
<p>Now when you type `<code>fn</code>', the line with the date appears on the command
line for you to edit. The option `<code>-s</code>' is a bit similar; the line
appears in the history list, so you will see it if you use up-arrow, but
it doesn't reappear automatically.</p>
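<p>As a small illustration of `<code>print -s</code>' (just a sketch; the function name and the command it stores are invented for the example):</p>
<pre><code> # Store a command in the history without running it; recall it
 # later with up-arrow and edit it before pressing return.
 remind() { print -s scp "$1" some-host:backups/; }
 remind notes.txt
</code></pre>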
<p>A few other useful options, some of which you've already seen, are</p>
<ul>
<li><strong><code>-r</code></strong><br/>
don't interpret special character sequences like `<code>\n</code>'</li>
<li><strong><code>-P</code></strong><br/>
use `<code>%</code>' as in prompts</li>
<li><strong><code>-n</code></strong><br/>
don't put a newline at the end in case there's more output to follow</li>
<li><strong><code>-c</code></strong><br/>
print the output in columns --- this means that `<code>print -c *</code>' has
the effect of a sort of poor person's `<code>ls</code>', only faster</li>
<li><strong><code>-l</code></strong><br/>
use one line per argument instead of one column, which is sometimes
useful for sticking lists into files, and for working out what part
of an array parameter is in each element.</li>
</ul>
<p>If you don't use the <code>-r</code> option, there are a whole lot of special
character sequences. Many of these may be familiar to you from C.</p>
<ul>
<li><strong><code>\n</code></strong><br/>
newline</li>
<li><strong><code>\t</code></strong><br/>
tab</li>
<li><strong><code>\e</code> or <code>\E</code></strong><br/>
escape character</li>
<li><strong><code>\a</code></strong><br/>
ring the bell (alarm), usually a euphemism for a hideous beep</li>
<li><strong><code>\b</code></strong><br/>
move back one character.</li>
<li><strong><code>\c</code></strong><br/>
don't print a newline --- like the <code>-n</code> option, but embedded in the
string. This alternative comes from Berkeley UNIX.</li>
<li><strong><code>\f</code></strong><br/>
form feed, the phrase for `advance to next page' from the days when
terminals were called teletypes, maybe more familiar to you as <code>^L</code></li>
<li><strong><code>\r</code></strong><br/>
carriage return --- when printed, the annoying <code>^M</code>'s you get in DOS
files, but actually rather useful with `<code>print</code>', since it will
erase everything to the start of the line. The combination of the
<code>-n</code> option and a <code>\r</code> at the start of the print string can give the
illusion of a continuously changing status line, as sketched just after this list.</li>
<li><strong><code>\v</code></strong><br/>
vertical tab, which I for one have never used (I just tried it now
and it behaved like a newline, only without assuming a carriage
return, but that's up to your terminal).</li>
</ul>
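<p>Here's a rough sketch of that status line trick; the loop body is just a stand-in for whatever work you're really doing, and the trailing spaces wipe out leftovers from longer messages:</p>
<pre><code> for file in *; do
   print -n "\rprocessing $file              "
   sleep 1        # stand-in for some real work
 done
 print ''         # finish off with a real newline
</code></pre>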
<p>In fact, you can get any of the 255 characters possible, although your
terminal may not like some or all of the ones above 127, by specifying a
number after the backslash. Normally this consists of three octal
characters, but you can use two hexadecimal characters after <code>\x</code>
instead --- so `<code>\n</code>', `<code>\012</code>' and `<code>\x0a</code>' are all newlines. `<code>\</code>'
itself escapes any other character, i.e. they appear as themselves even
if they normally wouldn't.</p>
<p>Two notes: first, don't get confused because `<code>n</code>' is the fourteenth
letter of the alphabet; printing `<code>\016</code>' (fourteen in octal) won't do
you any good. The remedy, after you discover your text is unreadable
(for VT100-like terminals including xterm), is to print `<code>\017</code>'.</p>
<p>Secondly, those backslashes can land you in real quoting difficulties.
Normally a backslash on the command line escapes the next character ---
this is a <em>different</em> form of escaping to <code>print</code>'s --- so</p>
<pre><code> print \n
</code></pre>
<p>doesn't produce a newline, it just prints out an `<code>n</code>'. So you need to
quote that. This means</p>
<pre><code> print \\
</code></pre>
<p>passes a single backslash through to <code>print</code>, and</p>
<pre><code> print \\n
</code></pre>
<p>or</p>
<pre><code> print '\n'
</code></pre>
<p>prints a newline (followed by the extra one that's usually there). To
print a real backslash, you would thus need</p>
<pre><code> print \\\\
</code></pre>
<p>Actually, you can get away with the two if there's nothing else after
--- <code>print</code> just shrugs its shoulders and outputs what it's been given
--- but that's not a good habit to get into. There are other ways of
doing this: since single quotes quote anything, including backslashes
(they are the only way of making backslashes behave like normal
characters), and since the `<code>-r</code>' option makes print treat characters
normally,</p>
<pre><code> print -r '\'
</code></pre>
<p>has the same effect. But you need to remember the two levels of quoting
for backslashes. Quotes aren't special to <code>print</code>, so</p>
<pre><code> print \'
</code></pre>
<p>is good enough for printing a quote.</p>
<p><strong><code>echotc</code></strong></p>
<p>There's an oddity called `<code>echotc</code>', which takes as its argument
`termcap' capabilities. This now lives in its own module,
<code>zsh/termcap</code>.</p>
<p>Termcap is a now rather old-fashioned way of giving the commands
necessary for performing various standard operations on terminals:
moving the cursor, clearing to the end of the line, turning on standout
mode, and so on. It has now been replaced almost everywhere by
`terminfo', a completely different way of specifying capabilities, and
by `curses', a more advanced system for manipulating objects on a
character terminal. This means that the arguments you need to give to
<code>echotc</code> can be rather hard to come by; try the <code>termcap</code> manual page;
if there are two, it's probably the one in section five which gives the
codes, i.e. `<code>man 5 termcap</code>' or `<code>man -s 5 termcap</code>' on Solaris. Otherwise
you'll have to search the web. The reason the <code>zsh</code> manual doesn't give
a list is that the shell only uses a few well-known sequences, and there
are very many others which will work with <code>echotc</code>, because the
sequences are interpreted by the terminal, not the shell.</p>
<p>This chunk gives you a flavour:</p>
<pre><code> zmodload -i zsh/termcap
echotc md
echo -n bold
echotc mr
echo -n reverse
echotc me
echo
</code></pre>
<p>First we make sure the module is loaded into the shell; on some older
operating systems, this only works if it was compiled in when zsh was
installed. The option <code>-i</code> to <code>zmodload</code> stops the shell from
complaining if the module was already loaded. This is a sensible way of
ensuring you have the right facilities available in a shell function,
since loading a module makes it available until it is explicitly
unloaded.</p>
<p>You should see `<code>bold</code>' in bold characters, and `<code>reverse</code>' in bold
reverse video. The `<code>md</code>' capability turns on bold mode; `<code>mr</code>' turns
on reverse video; `<code>me</code>' turns off both modes. A more typical zsh way
of doing this is:</p>
<pre><code> print -P '%Bbold%Sreverse%b%s'
</code></pre>
<p>which should show the same thing, but using prompt escapes --- prompts
are the most common use of special fonts. The `<code>%S</code>' is because zsh
calls reverse `standout' mode, because it does. (On a colour xterm, you
may find `bold' is interpreted as `blue'.)</p>
<p>There's a lot more you can do with <code>echotc</code> if you really try. The shell
has just acquired a way of printing terminfo sequences, predictably
called <code>echoti</code>, although it's only available on systems where zsh needs
terminfo to compile --- this happens when the termcap code is actually a
part of terminfo. The good news about this is that terminfo tends to be
better documented, so you have a good chance of finding out the
capabilities you want from the <code>terminfo</code> manual page. The <code>echoti</code>
command lives in another predictably named module, <code>zsh/terminfo</code>.</p>
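<p>If <code>echoti</code> is available on your system, the terminfo version of the earlier snippet looks something like this (a sketch, assuming your terminal's terminfo entry provides the standard <code>bold</code>, <code>smso</code> and <code>sgr0</code> capabilities):</p>
<pre><code> zmodload -i zsh/terminfo
 echoti bold
 echo -n bold
 echoti smso
 echo -n standout
 echoti sgr0
 echo
</code></pre>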
<p><spanid="l33"></span></p>
<h3id="322-other-builtins-just-for-speed"><aclass="header"href="#322-other-builtins-just-for-speed">3.2.2: Other builtins just for speed</a></h3>
<p>There are only a few other builtins which are there just to make things
go faster. Strictly, tests could go into this category, but as I
explained in the last chapter it's useful to have tests in the form</p>
<pre><code> if [[ $var1 = $var2 ]]; then
print doing something
fi
</code></pre>
<p>be treated as a special syntax by the shell, in case <code>$var1</code> or <code>$var2</code>
expands to nothing which would otherwise confuse it. This example
consists of two features described below: the test itself, between the
double square brackets, which is true if the two substituted values are
the same string, and the `<code>if</code>' construct which runs the commands in
the middle (here just the <code>print</code>) if that test was true.</p>
<p>The builtins `<code>true</code>' and `<code>false</code>' do nothing at all, except return a
command status zero or one, respectively. They're just used as
placeholders: to run a loop forever --- <code>while</code> will also be explained
in more detail later --- you use</p>
<pre><code> while true; do
print doing something over and over
done
</code></pre>
<p>since the test always succeeds.</p>
<p>A synonym for `<code>true</code>' is `<code>:</code>'; it's often used in this form to give
arguments which have side effects but which shouldn't be used ---
something like</p>
<pre><code> : ${param:=value}
</code></pre>
<p>which is a common idiom in all Bourne shell derivatives. In the
parameter expansion, <code>$param</code> is given the value <code>value</code> if it was empty
before, and left alone otherwise. Since that was the only reason for the
parameter expansion, you use <code>:</code> to ignore the argument. Actually, the
shell blithely builds the command line --- the colon, followed by
whatever the value of <code>$param</code> is, whether or not the assignment
happened --- then executes the command; it just so happens that `<code>:</code>'
takes no notice of the arguments it was given. If you're switching from
ksh, you may expect certain synonyms like this to be aliases, rather
than builtins themselves, but in zsh they are actually builtins; there
are no aliases predefined by the shell. (You can still get rid of them
using `<code>disable</code>', as described below.)</p>
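<p>For instance, a script might start with something like the following to supply defaults without disturbing values the user has already set (the parameter names here are only examples):</p>
<pre><code> # Assign defaults only if the parameters are unset or empty.
 : ${EDITOR:=vi}
 : ${TMPDIR:=/tmp}
 print "Editing with $EDITOR; temporary files go in $TMPDIR"
</code></pre>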
<p><spanid="l34"></span></p>
<h3id="323-builtins-which-change-the-shells-state"><aclass="header"href="#323-builtins-which-change-the-shells-state">3.2.3: Builtins which change the shell's state</a></h3>
<p>A more common use for builtins is that they change something inside the
shell, or report information about what's going on in the shell. There
is one vital thing to remember about external commands. It applies, too,
to other cases we'll meet where the shell `forks', literally splitting
itself into two parts, where the forked-off part behaves just like an
external command. In both of these cases, the command is in a different
<em>process</em>, UNIX's basic unit of things that run. (In fact, even Windows
knows about processes nowadays, although they interact a little bit
differently with one another.)</p>
<p>The vital thing is that no change in a separate process started by the
shell affects the shell itself. The most common case of this is the
current directory --- every process has its own current directory. You
can see this by starting a new zsh:</p>
<pre><code> % pwd # show the current directory
~
% zsh # start a new shell, which
# is a separate process
% cd tmp
% pwd # now I'm in a different
# directory...
~/tmp
% exit # leave the new shell...
% pwd # now I'm back where I was...
~
</code></pre>
<p>Hence the <code>cd</code> command must be a shell builtin, or this would happen
every time you ran it.</p>
<p>Here's a more useful example. Putting parentheses around a command asks
the shell to start a different process for it. That's useful when you
specifically <em>don't</em> want the effects propagating back:</p>
<pre><code> (cd some-other-dir; run-some-command)
</code></pre>
<p>runs the command, but doesn't change the directory the `real' shell is
in, only its forked-off `subshell'. Hence,</p>
<pre><code> % pwd
~
% (cd /; pwd)
/
% pwd
~
</code></pre>
<p>There's a more subtle case:</p>
<pre><code> cd some-other-dir | print Hello
</code></pre>
<p>Remember, the `<code>|</code>' (`pipe') connects the output of the first command
to the input of the next --- though actually no information is passed
that way in this example. In zsh, all but the last portion of the
`pipeline' thus created is run in different processes. Hence the <code>cd</code>
doesn't affect the main shell. I'll refer to it as the `parent' shell,
which is the standard UNIX language for processes; when you start
another command or fork off a subshell, you are creating `children'
(without meaning to be morbid, the children usually die first in this
case). Thus, as you would guess,</p>
<pre><code> print Hello | cd some-other-dir
</code></pre>
<p><em>does</em> have the effect of changing the directory. Note that other shells
do this differently; it is always guaranteed to work this way in zsh,
because many people rely on it for setting parameters, but many shells
have the <em>left</em> hand of the pipeline being the bit that runs in the
parent shell. If both sides of the pipe symbol are external commands of
some sort, both will of course run in subprocesses.</p>
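<p>Here's a quick demonstration of why that guarantee is useful; because the `<code>read</code>' at the end of the pipeline runs in the parent shell, the parameters it sets are still there afterwards:</p>
<pre><code> print one two three | read first rest
 print $first     # shows `one'
 print $rest      # shows `two three'
</code></pre>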
<p>There are other ways you change the state of the shell, for example by
declaring parameters of a particular type, or by telling it how to
interpret certain commands, or, of course, by changing options. Here are
the most useful, grouped in a vaguely logical fashion.</p>
<p><spanid="l35"></span></p>
<h3id="324-cd-and-friends"><aclass="header"href="#324-cd-and-friends">3.2.4: cd and friends</a></h3>
<p>You will not by now be surprised to learn that the `<code>cd</code>' command
changes directory. There is a synonym, `<code>chdir</code>', which as far as I
know no-one ever uses. (It's the same name as the system call, so if you
had been programming in C or Perl and forgot that you were now using the
shell, you might use `<code>chdir</code>'. But that seems a bit far-fetched.)</p>
<p>There are various extra features built into <code>cd</code> and <code>chdir</code>. First, if
you miss out the directory to which you want to change, you will be
taken to your home directory, although it's not as if `<code>cd ~</code>' is all
that hard to type.</p>
<p>Next, the command `<code>cd -</code>' is special: it takes you to the last
directory you were in. If you do a sequence of <code>cd</code> commands, only the
immediately preceding directory is remembered; they are not stacked up.</p>
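<p>For example (the directories are made up, and the output is shown in the same abbreviated style as the other examples in this chapter):</p>
<pre><code> % cd ~/src
 % cd /tmp
 % cd -        # back to ~/src, and cd reports where it took you
 ~/src
 % cd -        # only one directory is remembered, so back to /tmp again
 /tmp
</code></pre>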
<p>Thirdly, there is a shortcut for changing between similarly named
directories. If you type `<code>cd <old><new></code>', then the shell will look
for the first occurrence of the string `<code><old></code>' in the current
directory, and try to replace it with `<code><new></code>'. For example,</p>
<pre><code> % pwd
~/src/zsh-3.0.8/Src
% cd 0.8 1.9
~/src/zsh-3.1.9/Src
</code></pre>
<p>The <code>cd</code> command actually reported the new directory, as it usually does
if it's not entirely obvious where it's taken you.</p>
<p>Note that only the <em>first</em> match of <code><old></code> is taken. It's an easy
mistake to think you can change from
<code>/home/export1/pws/mydir1/something</code> to
<code>/home/export1/pws/mydir2/something</code> with `<code>cd 1 2</code>', but that first
`<code>1</code>' messes it up. Arguably the shell could be smarter here. Of
course, `<code>cd r1 r2</code>' will work in this case.</p>
<p><code>cd</code>'s friend `<code>pwd</code>' (print working directory) tells you what the
current working directory is; this information is also available in the
shell parameter <code>$PWD</code>, which is special and automatically updated when
the directory changes. Later, when you know all about expansion, you
will find that you can do tricks with this to refer to other
directories. For example, <code>${PWD/old/new}</code> uses the parameter
substitution mechanism to refer to a different directory with <code>old</code>
replaced by <code>new</code> --- and this time <code>old</code> can be a pattern, i.e.
something with wildcard matches in it. So if you are in the
<code>zsh-3.0.8/Src</code> directory as above and want to copy a file from the
<code>zsh-3.1.9/Src</code> directory, you have a shorthand:</p>
<pre><code> cp ${PWD/0.8/1.9}/myfile.c .
</code></pre>
<p><strong>Symbolic links</strong></p>
<p>Zsh tries to track directories across symbolic links. If you're not
familiar with these, you can think of them as a filename which behaves
like a pointer to another file (a little like Windows' shortcuts, though
UNIX has had them for much longer and they work better). You create them
like this (<code>ln</code> is not a builtin command, but its use to make symbolic
links is very standard these days):</p>
<pre><code> ln -s existing-file-name name-of-link
</code></pre>
<p>for example</p>
<pre><code> ln -s /usr/bin/ln ln
</code></pre>
<p>creates a file called <code>ln</code> in the current directory which does nothing
but point to the file <code>/usr/bin/ln</code>. Symbolic links are very good at
behaving as much like the original file as you usually want; for
example, you can run the <code>ln</code> link you've just created as if it were
<code>/usr/bin/ln</code>. They show up differently in a long file listing with
`<code>ls -l</code>', the last column showing the file they point to.</p>
<p>You can make them point to any sort of file at all, including
directories, and that is why they are mentioned here. Suppose you create
a symbolic link from your home directory to the root directory and
change into it:</p>
<pre><code> ln -s / ~/mylink
cd ~/mylink
</code></pre>
<p>If you don't know it's a link, you expect to be able to change to the
parent directory by doing `<code>cd ..</code>'. However, the operating system ---
which just has one set of directories starting from <code>/</code> and going down,
and ignores symbolic links after it has followed them, they really are
just pointers --- thinks you are in the root directory <code>/</code>. This can be
confusing. Hence zsh tries to keep track of where <em>you</em> probably think
you are, rather than where the system does. If you type `<code>pwd</code>', you
will see `<code>/home/you/mylink</code>' (wherever your home directory is), not
`<code>/</code>'; if you type `<code>cd ..</code>', you will find yourself back in your home
directory.</p>
<p>You can turn all this second-guessing off by setting the option
<code>CHASE_LINKS</code>; then `<code>cd ~/mylink; pwd</code>' will show you to be in <code>/</code>,
where changing to the parent directory has no effect; the parent of the
root directory is the root directory, except on certain slightly
psychedelic networked file systems. This does have advantages: for
example, `<code>cd ~/mylink; ls ..</code>' always lists the root directory, not
your home directory, regardless of the option setting, because <code>ls</code>
doesn't know about the links you followed, only zsh does, and it treats
the <code>..</code> as referring to the root directory. Having <code>CHASE_LINKS</code> set
allows `<code>pwd</code>' to warn you about where the system thinks you are.</p>
<p>An aside for non-UNIX-experts (over 99.9% of the population of the world
at the last count): I said `symbolic links' instead of just `links'
because there are others called `hard links'. This is what `<code>ln</code>'
creates if you don't use the <code>-s</code> option. A hard link is not so much a
pointer to a file as an alternative name for a file. If you do</p>
<pre><code> ln myfile othername
ls -l
</code></pre>
<p>where <code>myfile</code> already exists you can't tell which of <code>myfile</code> and
<code>othername</code> is the original --- and in fact the system doesn't care. You
can remove either, and the other will be perfectly happy as the name for
the file. This is pretty much how renaming files works, except that
creating the hard link is done for you in that case. Hard links have
limitations --- you can't link to directories, or to a file on another
disk partition (and if you don't know what a disk partition is, you'll
see what a limitation that can be). Furthermore, you usually want to
know which is the original and which is the link --- so for most users,
creating symbolic links is more useful. The only drawback is that
following the pointers is a tiny bit slower; if you think you can notice
the difference, you definitely ought to slow down a bit.</p>
<p>The target of a symbolic link, unlike a hard link, doesn't actually have
to exist and no checking is performed until you try to use the link. The
best thing to do is to run `<code>ls -lL</code>' when you create the link; the
<code>-L</code> part tells <code>ls</code> to follow links, and if it worked you should see
that your link is shown as having exactly the same characteristics as
the file it points to. If it is still shown as a link, there was no such
file.</p>
<p>While I'm at it, I should point out one slight oddity with symbolic
links: the name of the file linked to (the first name), if it is not an
absolute path (beginning with <code>/</code> after any <code>~</code> expansion), is treated
relative to the directory where the link is created --- not the current
directory when you run <code>ln</code>. Here:</p>
<pre><code> ln -s ../mydir ~/links/otherdir
</code></pre>
<p>the link <code>otherdir</code> will refer to <code>mydir</code> in <em>its own</em> parent directory,
i.e. <code>~/links</code> --- not, as you might think, the parent of the directory
where you were when you ran the command. What makes it worse is that the
second word, if it is not an absolute path, <em>is</em> interpreted relative to
the directory where you ran the command.</p>
<p><strong>$cdpath and AUTO_CD</strong></p>
<p>We're nowhere near the end of the magic you can do with directories yet
(and, in fact, I haven't even got to the zsh-specific parts). The next
trick is <code>$cdpath</code> and <code>$CDPATH</code>. They look a lot like <code>$path</code> and
<code>$PATH</code> which you met in the last chapter, and I mentioned them briefly
back in the last chapter in that context: <code>$cdpath</code> is an array of
directories, while <code>$CDPATH</code> is a colon-separated list behaving otherwise
like a scalar variable. They give a list of directories whose
subdirectories you may want to change into. If you use a normal cd
command (i.e. in the form `<code>cd </code><em>dirname</em>', and <em>dirname</em> does not
begin with a <code>/</code> or <code>~</code>), the shell will look through the directories in
<code>$cdpath</code> to find one which contains the subdirectory <em>dirname</em>. If
<code>$cdpath</code> isn't set, as you'd guess, it just uses the current directory.</p>
<p>Note that <code>$cdpath</code> is always searched in order, and you can put a <code>.</code>
in it to represent the current directory. If you do, the current
directory will always be searched <em>at that point</em>, not necessarily
first, which may not be what you expect. For example, let's set up some
directories:</p>
<pre><code> mkdir ~/crick ~/crick/dna
mkdir ~/watson ~/watson/dna
cdpath=(~/crick .)
cd ~/watson
cd dna
</code></pre>
<p>So I've moved to the directory <code>~/watson</code>, which contains the
subdirectory <code>dna</code>, and done `<code>cd dna</code>'. But because of <code>$cdpath</code>, the
shell will look first in <code>~/crick</code>, and find the <code>dna</code> there, and take
you to that copy of the self-reproducing directory, not the one in
<code>~/watson</code>. Most people have <code>.</code> at the start of their <code>cdpath</code> for that
reason. However, at least <code>cd</code> warns you --- if you tried it, you will
see that it prints the name of the directory it's picked in cases like
this.</p>
<p>In fact, if you don't have <code>.</code> in your <code>$cdpath</code> at all, the shell will
always look there first; there's no way of making <code>cd</code> never change to a
subdirectory of the current one, short of turning <code>cd</code> into a function.
Some shells don't do this; they use the directories in <code>$cdpath</code>, and
only those.</p>
<p>There's yet another shorthand, this time specific to zsh: the option
<code>AUTO_CD</code> which I mentioned in the last chapter. That way a command
without any arguments which is really a directory will take you to that
directory. Normally that's perfect --- you would just get a `command
not found' message otherwise, and you might as well make use of the
option. Just occasionally, however, the name of a directory clashes with
the name of a command, builtin or external, or a shell function, and
then there can be some confusion: zsh will always pick the command as
long as it knows about it, but there are cases where it doesn't, as I
described above.</p>
<p>What I didn't say in the last chapter is that <code>AUTO_CD</code> respects
<code>$cdpath</code>; in fact, it really is implemented so that `<em>dirname</em>' on its
own behaves as much like `<code>cd </code><em>dirname</em>' as is possible without tying
the shell's insides into knots.</p>
<p><strong>The directory stack</strong></p>
<p>One very useful facility that zsh inherited from the C-shell family
(traditional Korn shell doesn't have it) is the directory stack. This is
a list of directories you have recently been in. If you use the command
`<code>pushd</code>' instead of `<code>cd</code>', e.g. `<code>pushd </code><em>dirname</em>', then the
directory you are in is saved in this list, and you are taken to
<em>dirname</em>, using <code>$CDPATH</code> just as <code>cd</code> does. Then when you type
`<code>popd</code>', you are taken back to where you were. The list can be as long
as you like; you can <code>pushd</code> any number of directories, and each <code>popd</code>
will take you back through the list (this is how a `stack', or more
precisely a `last-in-first-out' stack usually operates in computer
jargon, hence the name `directory stack').</p>
<p>You can see the list --- which always starts with the current directory
--- with the <code>dirs</code> command. So, for example:</p>
<pre><code> cd ~
pushd ~/src
pushd ~/zsh
dirs
</code></pre>
<p>displays</p>
<pre><code> ~/zsh ~/src ~
</code></pre>
<p>and the next <code>popd</code> will take you back to <code>~/src</code>. If you do it, you
will see that <code>pushd</code> reports the list given by <code>dirs</code> automatically as
it goes along; you can turn this off with the option <code>PUSHD_SILENT</code>,
when you will have to rely on typing <code>dirs</code> explicitly.</p>
<p>In fact, a lot of the use of this comes not from using simple <code>pushd</code>
and <code>popd</code> combinations, but from two other features. First, `<code>pushd</code>'
on its own swaps the top two directories on the stack. Second, <code>pushd</code>
with a numeric argument preceded by a `<code>+</code>' or `<code>-</code>' can take you to
one of the other directories in the list. The command `<code>dirs -v</code>' tells
you the numbers you need; <code>0</code> is the current directory. So if you get,</p>
<pre><code> 0 ~/zsh
1 ~/src
2 ~
</code></pre>
<p>then `<code>pushd +2</code>' takes you to <code>~</code>. (A little suspension of disbelief
that I didn't just use <code>AUTO_CD</code> and type `<code>..</code>' is required here.) If
you use a <code>-</code>, it counts from the other end of the list; <code>-0</code> (with
apologies to the numerate) is the last item, i.e. the same as <code>~</code> in
this case. Some people are used to having the `<code>-</code>' and `<code>+</code>'
arguments behave the other way around; the option <code>PUSHD_MINUS</code> exists
for this.</p>
<p>Apart from <code>PUSHD_SILENT</code> and <code>PUSHD_MINUS</code>, there are a few other
relevant options. Setting <code>PUSHD_IGNORE_DUPS</code> means that if you <code>pushd</code>
to a directory which is already somewhere in the list, the duplicate
entry will be silently removed. This is useful for most human operations
--- however, if you are using <code>pushd</code> in a function or script to
remember previous directories for a future matching <code>popd</code>, this can be
dangerous and you probably want to turn it off locally inside the
function.</p>
<p><code>AUTO_PUSHD</code> means that any directory-changing command, including an
auto-cd, is treated as a <code>pushd</code> command with the target directory as
argument. Using this can make the directory stack get very long, and
there is a parameter <code>$DIRSTACKSIZE</code> which you can set to specify a
maximum length. The oldest entry (the highest number in the `<code>dirs -v</code>'
listing) is automatically removed when this length is exceeded. There is
no limit unless this is explicitly set.</p>
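<p>If you like the automatic behaviour, a typical set of startup-file lines might look like this (the option and parameter names are real; the particular values and the alias are just suggestions):</p>
<pre><code> setopt autopushd pushdignoredups pushdsilent
 DIRSTACKSIZE=16
 alias dh='dirs -v'    # list the stack with the numbers pushd understands
</code></pre>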
<p>The final <code>pushd</code> option is <code>PUSHD_TO_HOME</code>. This makes <code>pushd</code> on its
own behave like <code>cd</code> on its own in that it takes you to your home
directory, instead of swapping the top two directories. Normally a
series of `<code>pushd</code>' commands works pretty much like a series of `<code>cd -</code>' commands, always taking you to the directory you were in before, with
the obvious difference that `<code>cd -</code>' doesn't consult the directory
stack, it just remembers the previous directory automatically, and hence
it can confuse <code>pushd</code> if you just use `<code>cd -</code>' instead.</p>
<p>There's one remaining subtlety with <code>pushd</code>, and that is what happens to
the rest of the list when you bring a particular directory to the front
with something like `<code>pushd +2</code>'. Normally the list is simply cycled,
so the directories which were +3, and +4 are now right behind the new
head of the list, while the two directories which were ahead of it get
moved to the end. If the list before was:</p>
<pre><code> dir1 dir2 dir3 dir4
</code></pre>
<p>then after <code>pushd +2</code> you get</p>
<pre><code> dir3 dir4 dir1 dir2
</code></pre>
<p>That behaviour changed during the lifetime of zsh, and some of us
preferred the old behaviour, where that one directory was yanked to the
front and the rest just closed the gap:</p>
<pre><code> # Old behaviour
dir3 dir1 dir2 dir4
</code></pre>
<p>so that after a while you get a `greatest hits' group at the front of
the list. If you like this behaviour too (I feel as if I'd need to have
written papers on group theory to like the new behaviour) there is a
function <code>pushd</code> supplied with the source code, although it's short
enough to repeat here --- this is in the form for autoloading in the zsh
fashion:</p>
<pre><code> # pushd function to emulate the old zsh behaviour.
# With this, pushd +/-n lifts the selected element
# to the top of the stack instead of cycling
# the stack.
emulate -R zsh
setopt localoptions
 if [[ ARGC -eq 1 && "$1" == [+-]<-> ]]; then
setopt pushdignoredups
builtin pushd ~$1
else
builtin pushd "$@"
fi
</code></pre>
<p>The `<code>&&</code>' is a logical `and', requiring both tests to be true. The
tests are that there is exactly one argument to the function, and that
it has the form of a `<code>+</code>' or a `<code>-</code>' followed by any number (`<code><-></code>'
is a special zsh pattern to match any number, an extension of forms like
`<code><1-100></code>' which matches any number in the range 1 to 100 inclusive).</p>
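<p>The `<code><-></code>' pattern isn't specific to this function, of course; here it is in a couple of ordinary tests, just as an illustration:</p>
<pre><code> [[ +3 == [+-]<-> ]] && print matches          # prints `matches'
 [[ 3+ == [+-]<-> ]] || print "doesn't match"  # prints `doesn't match'
 [[ +200 == [+-]<1-100> ]] || print "out of range"
</code></pre>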
<p><strong>Referring to other directories</strong></p>
<p>Zsh has two ways of allowing you to refer to particular directories.
They have in common that they begin with a <code>~</code> (in very old versions of
zsh, the second form actually used an `<code>=</code>', but the current way is
much more logical).</p>
<p>You will certainly be aware, because I've made a lot of use of it, that
a `<code>~</code>' on its own or followed by a <code>/</code> refers to your own home
directory. An extension of this --- again from the C-shell, although the
Korn shell has it too in this case --- is that <code>~name</code> can refer to the
home directory of any user on the system. So if your user name is <code>pws</code>,
then <code>~</code> and <code>~pws</code> are the same directory.</p>
<p>Zsh has an extension to this; you can actually name your own
directories. This was described in <a href="zshguide02.html#init">chapter 2</a>, à
propos of prompts, since that is the major use:</p>
<pre><code> host% PS1='%~? '
~? cd zsh/Src
~/zsh/Src? zsrc=$PWD
~/zsh/Src? echo ~zsrc
/home/pws/zsh/Src
~zsrc?
</code></pre>
<p>Consult that chapter for the ways of forcing a parameter to be
recognised as a named directory.</p>
<p>There's a slightly more sophisticated way of doing this directly:</p>
<pre><code> hash -d zsrc=~/zsh/Src
</code></pre>
<p>makes <code>~zsrc</code> appear in prompts as before, and in this case there is no
parameter <code>$zsrc</code>. This is the purist's way (although very few zsh users
are purists). You can guess what `<code>unhash -d zsrc</code>' does; this works
with directories named via parameters, too, but leaves the parameter
itself alone.</p>
<p>It's possible to have a named directory with the same name as a user. In
that case `<code>~name</code>' refers to the directory you named explicitly, and
there is no easy way of getting <code>name</code>'s home directory without removing
the name you defined.</p>
<p>If you're using named directories with one of the <code>cd</code>-like commands or
<code>AUTO_CD</code>, you can set the option <code>CDABLEVARS</code> which allows you to omit
the leading <code>~</code>; `<code>cd zsrc</code>' with this option would take you to
<code>~zsrc</code>. The name is a historical artifact and now a misnomer; it really
is named directories, not parameters (i.e. variables), which are used.</p>
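<p>A quick sketch, reusing the <code>~zsrc</code> directory named above:</p>
<pre><code> setopt cdablevars
 cd zsrc       # with the option set, this behaves like `cd ~zsrc'
</code></pre>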
<p>The second way of referring to directories with <code>~</code>'s is to use numbers
instead of names: the numbers refer to directories in the directory
stack. So if <code>dirs -v</code> gives you</p>
<pre><code> 0 ~zsf
1 ~src
</code></pre>
<p>then <code>~+1</code> and <code>~-0</code> (not very mathematical, but quite logical if you
think about it) refer to <code>~src</code>. In this case, unlike pushd arguments,
you can omit the <code>+</code> and use <code>~1</code>. The option <code>PUSHD_MINUS</code> is
respected. You'll see this was used in the <code>pushd</code> function above: the
trick was that <code>~+3</code>, for example, refers to the same element as <code>pushd +3</code>, hence <code>pushd ~+3</code> pushed that directory onto the front of the list.
However, we set <code>PUSHD_IGNORE_DUPS</code>, so that the value in the old
position was removed as well, giving us the effect we wanted of simply
yanking the directory to the front with no trick cycling.</p>
<p><spanid="l36"></span></p>
<h3id="325-command-control-and-information-commands"><aclass="header"href="#325-command-control-and-information-commands">3.2.5: Command control and information commands</a></h3>
<p>Various builtins exist which control how you access commands, and which
show you information about the commands which can be run.</p>
<p>The first two are strictly speaking `precommand modifiers' rather than
commands: that means that they go before a command line and modify its
behaviour, rather than being commands in their own right. If you put
`<code>command</code>' in front of a command line, the command word (the next one
along) will be taken as the name of an external command, however it
would normally be interpreted; likewise, if you put `<code>builtin</code>' in
front, the shell will try to run the command as a builtin command.
Normally, shell functions take precedence over builtins which take
precedence over external commands. So, for example, if your printer
control system has the command `<code>enable</code>' (as many System V versions
do), which clashes with a builtin I am about to talk about, you can run
`<code>command enable lp</code>' to enable a printer; otherwise, the builtin
enable would have been run. Likewise, if you have defined <code>cd</code> to be a
function, but this time want to call the normal builtin <code>cd</code>, you can
say `<code>builtin cd mydir</code>'.</p>
<p>A common use for <code>command</code> is inside a shell function of the same name.
Sometimes you want to enhance an ordinary command by sticking some extra
stuff around it, then calling that command, so you write a shell
function of the same name. To call the command itself inside the shell
function, you use `<code>command</code>'. The following works, although it's
obviously not all that useful as it stands:</p>
<pre><code> ls() {
command ls "$@"
}
</code></pre>
<p>so when you run `<code>ls</code>', it calls the function, which calls the real
<code>ls</code> command, passing on the arguments you gave it.</p>
<p>You can gain longer lasting control over the commands which the shell
will run with the `<code>disable</code>' and `<code>enable</code>' commands. The first
normally takes builtin arguments; each such builtin will not be
recognised by the shell until you give an `<code>enable</code>' command for it. So
if you want to be able to run the external <code>enable</code> command and don't
particularly care about the builtin version, `<code>disable enable</code>' (sorry
if that's confusing) will do the trick. Ha, you're thinking, you can't
run `<code>enable enable</code>'. That's correct: some time in the dim and distant
past, `<code>builtin enable enable</code>' would have worked, but currently it
doesn't; this may change, if I remember to change it. You can list all
disabled builtins with just `<code>disable</code>' on its own --- most of the
builtins that do this sort of manipulation work like that.</p>
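<p>Putting that together, the printer example from above might look like this (assuming your system really does have an external printer-control command called <code>enable</code>):</p>
<pre><code> disable enable     # the builtin is no longer recognised...
 enable lp          # ...so this now runs the external command
 disable            # with no arguments, lists the disabled builtins
</code></pre>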
<p>You can manipulate other sets of commands with <code>disable</code> and <code>enable</code> by
giving different options: aliases with the option <code>-a</code>, functions with
<code>-f</code>, and reserved words with <code>-r</code>. The first two you probably know
about, and I'll come to them anyway, but `reserved words' need
describing. They are essentially builtin commands which have some
special syntactic meaning to the shell, including some symbols such as
`<code>{</code>' and `<code>[[</code>'. They take precedence over everything else except
aliases --- in fact, since they're syntactically special, the shell
needs to know very early on that it has found a reserved word, it's no
use just waiting until it tries to execute a command. For example, if
the shell finds `<code>[[</code>' it needs to know that everything until `<code>]]</code>'
must be treated as a test rather than as ordinary command arguments.
Consequently, you wouldn't often want to disable a reserved word, since
the shell wouldn't work properly. The most obvious reason why you might
would be for compatibility with some other shell which didn't have one.
You can get a complete list with:</p>
<pre><code> whence -wm '*' | grep reserved
</code></pre>
<p>which I'll explain below, since I'm coming to `<code>whence</code>'.</p>
<p>Furthermore, I tend to find that if I want to get rid of aliases or
functions I use the commands `<code>unalias</code>' and `<code>unfunction</code>' to get rid
of them permanently, since I always have the original definitions stored
somewhere, so these two options may not be that useful either. Disabling
builtins is definitely the most useful of the four possibilities for
<code>disable</code>.</p>
<p>External commands have to be manipulated differently. The types given
above are handled internally by the shell, so all it needs to do is
remember what code to call. With external commands, the issue instead is
how to find them. I mentioned <code>rehash</code> above, but didn't tell you that
the <code>hash</code> command, which you've already seen with the <code>-d</code> option, can
be used to tell the shell how to find an external command:</p>
<pre><code> hash foo=/path/to/foo
</code></pre>
<p>makes <code>foo</code> execute the command using the path shown (which doesn't even
have to end in `<code>foo</code>'). This is rather like an alias --- most people
would probably do this with an alias, in fact --- although a little
faster, though you're unlikely to notice the difference. You can remove
this with <code>unhash</code>. One gotcha here is that if the path is rehashed,
either by calling <code>rehash</code> or when you alter <code>$path</code>, the entire hash
table is emptied, including anything you put in in this way; so it's not
particularly useful.</p>
<p>In the midst of all this, it's useful to be able to find out what the
shell thinks a particular command name does. The command `<code>whence</code>'
tells you this; it also exists, with slightly different options, under
the names <code>where</code>, <code>which</code> and <code>type</code>, largely to provide compatibility
with other shells. I'll just stick to <code>whence</code>.</p>
<p>Its standard output isn't actually sparklingly interesting. If it's a
command somehow known to the shell internally, it gets echoed back, with
the alias expanded if it was an alias; if it's an external command it's
printed with the full path, showing where it came from; and if it's not
known the command returns status 1 and prints nothing.</p>
<p>You can make it more useful with the <code>-v</code> or <code>-c</code> options, which are
more verbose; the first prints out an information message, while the
second prints out the definitions of any functions it was asked about
(this is also the effect of using `<code>which</code>' instead of `<code>whence</code>'). A
very useful option is <code>-m</code>, which takes any arguments as patterns using
the usual zsh pattern format, in other words the same one used for
matching files. Thus</p>
<pre><code> whence -vm "*"
</code></pre>
<p>prints out every command the shell knows about, together with what it
thinks of it.</p>
<p>Note the quotes around the `<code>*</code>' --- you have to remember these
anywhere where the pattern is not to be used to generate filenames on
the command line, but instead needs to be passed to the command to be
interpreted. If this seems a rather subtle distinction, think about what
would happen if you ran</p>
<pre><code> # Oops. Better not try this at home.
# (Even better, don't do it at work either.)
whence -vm *
</code></pre>
<p>in a directory with the files `<code>foo</code>' and (guess what) `<code>bar</code>' in it.
The shell hasn't decided what command it's going to run when it first
looks at the command line; it just sees the `<code>*</code>' and expands the line
to</p>
<pre><code> whence -vm foo bar
</code></pre>
<p>which isn't what you meant.</p>
<p>There are a couple of other tricks worth mentioning: <code>-p</code> makes the
shell search your path for them, even if the name is matched as
something else (say, a shell function). So if you have <code>ls</code> defined as a
function,</p>
<pre><code> which -p ls
</code></pre>
<p>will still tell what `<code>command ls</code>' would find. Also, the option <code>-a</code>
searches for all commands; in the same example, this would show you both
the <code>ls</code> command and the <code>ls</code> function, whereas <code>whence</code> would normally
only show the function because that's the one that would be run. The
<code>-a</code> option also shows if it finds more than one external command in
your path.</p>
<p>Finally, the option <code>-w</code> is useful because it identifies the type of a
command with a single word: <code>alias</code>, <code>builtin</code>, <code>command</code>, <code>function</code>,
<code>hashed</code>, <code>reserved</code> or <code>none</code>. Most of those are obvious, with
<code>command</code> being an ordinary external command; <code>hashed</code> is an external
command which has been explicitly given a path with the <code>hash</code> builtin,
and <code>none</code> means it wasn't recognised as a command at all. Now you know
how we extracted the reserved words above.</p>
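<p>So a typical exchange might look like the following (the exact results depend on what's defined in your shell; in particular this assumes <code>ls</code> hasn't been turned into an alias or function):</p>
<pre><code> % whence -w ls cd '[[' no-such-thing
 ls: command
 cd: builtin
 [[: reserved
 no-such-thing: none
</code></pre>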
<p>A close relative of <code>whence</code> is <code>functions</code>, which applies, of course,
to shell functions; it usually lists the definitions of all functions
given as arguments, but its relatives (of which <code>autoload</code> is one)
perform various other tricks, to be described in the section on shell
functions below. Be careful with <code>function</code>, without the `s', which is
completely different and not like <code>command</code> or <code>builtin</code> --- it is
actually a keyword used to <em>define</em> a function.</p>
<p>There are various builtins for controlling the shell's parameters. You
already know how to set and use parameters, but it's a good deal more
complicated than that when you look at the details.</p>
<p><strong>Local parameters</strong></p>
<p>The principal command for manipulating the behaviour of parameters is
`<code>typeset</code>'. Its easiest usage is to declare a parameter; you just give
it a list of parameter names, which are created as scalar parameters.
You can create parameters just by assigning to them, but the major point
of `<code>typeset</code>' is that if a parameter is created that way inside a
function, the parameter is restored to its original value, or removed if
it didn't previously exist, at the end of the function --- in other
words, it has `local scope' like the variables which you declare in
most ordinary programming languages. In fact, to use the jargon it has
`dynamical' rather than `syntactic' scope, which means that the same
parameter is visible in any function called within the current one; this
is different from, say, C or FORTRAN where any function or subroutine
called wouldn't see any variable declared in the parent function.</p>
<p>The following makes this more concrete.</p>
<pre><code> var='Original value'
subfn() {
print $var
}
fn() {
print $var
typeset var='Value in function'
print $var
subfn
}
fn
print $var
</code></pre>
<p>This chunk of code prints out</p>
<pre><code> Original value
Value in function
Value in function
Original value
</code></pre>
<p>The first three chunks of the code just define the parameter <code>$var</code>, and
two functions, <code>subfn</code> and <code>fn</code>. Then we call <code>fn</code>. The first thing this
does is print out <code>$var</code>, which gives `<code>Original value</code>' since we
haven't changed the original definition. However, the <code>typeset</code> next
does that; as you see, we can assign to the parameter during the
typeset. Thus when we print <code>$var</code> out again, we get `<code>Value in function</code>'. Then <code>subfn</code> is called, which prints out the same value as
in <code>fn</code>, because we haven't changed it --- this is where C or FORTRAN
would differ, and wouldn't recognise the variable because it hadn't been
declared in that function. Finally, <code>fn</code> exits and the original value is
restored, and is printed out by the final `<code>print</code>'.</p>
<p>Note the value changes twice: first at the <code>typeset</code>, then again at the
end of <code>fn</code>. The value of <code>$var</code> at any point will be one of those two
values.</p>
<p>Although you can do assignments in a <code>typeset</code> statement, you can't
assign to arrays (I already said this in the last chapter):</p>
<pre><code> typeset var=(Doesn\'t work\!)
</code></pre>
<p>because the syntax with the parentheses is special; it only works when
the line consists of nothing but assignments. However, the shell doesn't
complain if you try to assign an array to a scalar, or vice versa; it
just silently converts the type:</p>
<pre><code> typeset var='scalar value'
var=(array value)
</code></pre>
<p>I put in the assignment in the typeset statement to rub the point in
that it creates scalars, but actually the usual way of setting up an
array in a function is</p>
<pre><code> typeset var
var=()
</code></pre>
<p>which creates an empty scalar, then converts that to an empty array.
Recent versions of the shell have `<code>typeset -a var</code>' to do that in one
go --- but you <em>still</em> can't assign to it in the same statement.</p>
<p>There are other catches associated with the fact that <code>typeset</code> and its
relatives are just ordinary commands with ordinary sets of arguments.
Consider this:</p>
<pre><code> % typeset var=`echo two words`
% print $var
two
</code></pre>
<p>What has happened to the `<code>words</code>'? The answer is that backquote
substitution, to be discussed below, splits words when not quoted. So
the <code>typeset</code> statement is equivalent to</p>
<pre><code> % typeset var=two words
</code></pre>
<p>There are two ways to get round this; first, use an ordinary assignment:</p>
<pre><code> % typeset var
% var=`echo two words`
</code></pre>
<p>which can tell a scalar assignment, and hence knows not to split words,
or quote the backquotes,</p>
<pre><code> % typeset var="`echo two words`"
</code></pre>
<p>There are three important types we haven't talked about; all of these
can only be created with <code>typeset</code> or one of the similar builtins I'll
list in a moment. They are integer types, floating point types, and
associative array types.</p>
<p><strong>Numeric parameters</strong></p>
<p>Integers are created with `<code>typeset -i</code>', or `<code>integer</code>' which is
another way of saying the same thing. They are used for arithmetic,
which the shell can do as follows:</p>
<pre><code> integer i
(( i = 3 * 2 + 1 ))
</code></pre>
<p>The double parentheses surround a complete arithmetic expression: it
behaves as if it's quoted. The expression inside can be pretty much
anything you might be used to from arithmetic in other programming
languages. One important point to note is that parameters don't need to
have the <code>$</code> in front, even when their value is being taken:</p>
<pre><code> integer i j=12
(( i = 3 * ( j + 4 ) ** 2 ))
</code></pre>
<p>Here, <code>j</code> will be replaced by 12 and <code>$i</code> gets the value 768 (sixteen
squared times three). One thing you might not recognise is the <code>**</code>,
which is the `to the power of' operator which occurs in FORTRAN and
Perl. Note that it's fine to have parentheses inside the double
parentheses --- indeed, you can even do</p>
<pre><code> (( i = (3 * ( j + 4 )) ** 2 ))
</code></pre>
<p>and the shell won't get confused because it knows that any parentheses
inside must be in balanced pairs (until you deliberately confuse it with
your buggy code).</p>
<p>You would normally use `<code>print $i</code>' to see what value had been given to
<code>$i</code>, of course, and as you would expect it gets printed out as a
decimal number. However, <code>typeset</code> allows you to specify another base
for printing out. If you do</p>
<pre><code> typeset -i 16 i
print $i
</code></pre>
<p>after the last calculation, you should see <code>16#900</code>, which means 900 in
base 16 (hexadecimal). That's the only effect the option `<code>-i 16</code>' has
on <code>$i</code> --- you can assign to it and use it in arithmetical expressions
just as normal, but when you print it out it appears in this form. You
can use this base notation for inputting numbers, too:</p>
<pre><code> (( i = 16#ff * 2#10 ))
</code></pre>
<p>which means 255 (<code>ff</code> in hexadecimal) times 2 (<code>10</code> in binary). The
shell understands C notation too, so `<code>16#ff</code>' could have been
expressed `<code>0xff</code>'.</p>
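<p>The same mechanism works for any base; for example (just an illustration, using binary this time):</p>
<pre><code> typeset -i 2 bits
 (( bits = 0x1a ))
 print $bits        # shows 2#11010
</code></pre>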
<p>Floating point variables are very similar. You can declare them with
`<code>typeset -F</code>' or `<code>typeset -E</code>'. The only difference between the two
is, again, on output; <code>-F</code> uses a fixed point notation, while <code>-E</code> uses
scientific (mnemonic: exponential) notation. The builtin `<code>float</code>' is
equivalent to `<code>typeset -E</code>' (because Korn shell does it, that's why).
Floating point expressions also work the way you are probably used to:</p>
<pre><code> typeset -E e
typeset -F f
(( e = 32/3, f = 32.0/3.0 ))
print $e $f
</code></pre>
<p>prints</p>
<pre><code> 1.000000000e+01 10.6666666667
</code></pre>
<p>Various points: the `<code>,</code>' can separate different expressions, just like
in C, so the <code>e</code> and <code>f</code> assignments are performed separately. The <code>e</code>
assignment was actually an integer division, because neither 32 nor 3 is
a floating point number, which must contain a dot. That means an integer
division was done, producing 10, which was then converted to a floating
point number only at the end. Again, this is just how grown-up languages
work, so it's no use cursing. The <code>f</code> assignment was a full floating
point performance. Floating point parameters weren't available before
version <code>3.1.7</code>.</p>
<p>Although this is really a matter for a later chapter, there is a library
of floating point functions you can load (actually it's just a way of
linking in the system mathematical library). The usual incantation is
`<code>zmodload zsh/mathfunc</code>'; you may not have `dynamic loading' of
libraries on your system, which may mean that doesn't work. If it does,
you can do things like</p>
<pre><code> (( pi = 4.0 * atan(1.0) ))
</code></pre>
<p>Broadly, all the functions which appear in most system mathematical
libraries (see the manual page for <code>math</code>) are available in zsh.</p>
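<p>Here's a slightly fuller (but still sketchy) example, assuming the
module loaded successfully; <code>sqrt</code> is one of the standard functions it
provides:</p>
<pre><code> zmodload zsh/mathfunc
 float root
 (( root = sqrt(2) ))
 print $root     # roughly 1.41421, shown in scientific notation
</code></pre>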
<p>Like all other parameters created with <code>typeset</code> or one of its cousins,
integer and floating point parameters are local to functions. You may
wonder how to create a global parameter (i.e. one which is valid outside
as well as inside the function) which has an integer or floating point
value. There's a recent addition to the shell (in version 3.1.6) which
allows this: use the flag <code>-g</code> to typeset along with any others. For
example,</p>
<pre><code> fn() {
typeset -Fg f
(( f = 42.75 ))
}
fn
print $f
</code></pre>
<p>If you try it, you will see the value of <code>$f</code> has survived beyond the
function. The <code>g</code> stands for global, obviously, although it's not quite
that simple:</p>
<pre><code> fn() {
typeset -Fg f
}
outerfn() {
typeset f='scalar value'
fn
print $f
}
outerfn
</code></pre>
<p>The function <code>outerfn</code> creates a local scalar value for <code>f</code>; that's what
<code>fn</code> sees. So it was not really operating on a `global' value, it just
didn't create a new one for the scope of <code>fn</code>. The error message comes
because it tried to preserve the value of <code>$f</code> while changing its type,
and the value wasn't a proper floating point expression. The error
message,</p>
<pre><code> fn: bad math expression: operator expected at `value'
</code></pre>
<p>comes about because assigning to numeric parameters always does an
arithmetic evaluation. Operating on `<code>scalar value</code>' it found
`<code>scalar</code>' and assumed this was a parameter, then looked for an
operator like `<code>+</code>' to come next; instead it found `<code>value</code>'. If you
want to experiment, change the string to `<code>scalar + value</code>' and set
`<code>value=42</code>', or whatever, then try again. This is a little confusing
(which is a roundabout way of saying it confused me), but consistent
with how zsh usually treats parameters.</p>
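<p>Here is that experiment written out as a sketch (the parameter name
<code>value</code> and the number 42 are just for illustration):</p>
<pre><code> fn() {
   typeset -Fg f
 }
 outerfn() {
   typeset value=42
   typeset f='scalar + value'
   fn
   print $f
 }
 outerfn
</code></pre>
<p>This time the conversion succeeds: the string is a valid arithmetic
expression, with the unset <code>scalar</code> counting as zero, so <code>fn</code> quietly
turns the existing local <code>$f</code> into a floating point 42.</p>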
<p>Actually, to a certain extent you don't need to use the integer and
floating point parameters. Any time zsh needs a numeric expression it
will force a scalar to the right value, and any time it produces a
numeric expression and assigns it to a scalar, it will convert the
result to a string. So</p>
<pre><code> typeset num=3 # This is the *string* `3'.
(( num = num + 1 )) # But this works anyway
# ($num is still a string).
</code></pre>
<p>This can be useful if you have a parameter which is sometimes a number,
sometimes a string, since zsh does all the conversion work for you.
However, it can also be confusing if you always want a number, because
zsh can't guess that for you; plus it's a little more efficient not to
have to convert back and forth; plus you lose accuracy when you do,
because if the number is stored as a string rather than in the internal
numeric representation, what you say is what you get (although zsh tends
to give you quite a lot of decimal places when converting implicitly to
strings). Anyway, I'd recommend that if you know a parameter has to be
an integer or floating point value you should declare it as such.</p>
<p>There is a builtin called <code>let</code> to handle mathematical expressions, but
since</p>
<pre><code> let "num = num + 1"
</code></pre>
<p>is equivalent to</p>
<pre><code> (( num = num + 1 ))
</code></pre>
<p>and the second form is easier and more memorable, you probably won't
need to use it. If you do, remember that (unlike BASIC) each
mathematical expression should appear as one argument in quotes.</p>
<p><strong>Associative arrays</strong></p>
<p>The one remaining major type of parameter is the associative array; if
you use Perl, you may call it a `hash', but we tend not to since that's
really a description of how it's implemented rather than what it does.
(All right, what it does is hash things. Now shut up.)</p>
<p>These have to be declared by a typeset statement --- there's no getting
round it. There are some quite eclectic builtins that produce a
filled-in associative array for you, but the only way to tell zsh you
want your very own associative array is</p>
<pre><code> typeset -A assoc
</code></pre>
<p>to create <code>$assoc</code>. As to what it does, that's best shown by example:</p>
<pre><code> typeset -A assoc
assoc=(one eins two zwei three drei)
print ${assoc[two]}
</code></pre>
<p>which prints `<code>zwei</code>'. So it works a bit like an ordinary array, but
the numeric <em>subscript</em> of an ordinary array which would have appeared
inside the square bracket is replaced by the string <em>key</em>, in this case
<code>two</code>. The array assignment was a bit deceptive; the `values' were
actually pairs, with `<code>one</code>' being the key for the value `<code>eins</code>', and
so on. The shell will complain if there is an odd number of elements in
such a list. This may also be familiar from Perl. You can assign values
one at a time:</p>
<pre><code> assoc[four]=vier
</code></pre>
<p>and also unset one key/value pair:</p>
<pre><code> unset 'assoc[one]'
</code></pre>
<p>where the quotes stop the square brackets from being interpreted as a
pattern on the command line.</p>
<p>Expansion has been held over, but you might like to know about the ways
of getting back what you put in. If you do</p>
<pre><code> print $assoc
</code></pre>
<p>you just see the values --- that's exactly the same as with an ordinary
array, where the subscripts 1, 2, 3, etc. aren't shown. Note they are in
random order --- that's the other main difference from ordinary arrays;
associative arrays have no notion of an order unless you explicitly sort
them.</p>
<p>But here the keys may be just as interesting. So there is:</p>
<pre><code> print ${(k)assoc}
print ${(kv)assoc}
</code></pre>
<p>giving (if you've followed through all the commands above):</p>
<pre><code> four two three
four vier two zwei three drei
</code></pre>
<p>which print out the keys instead of the values, and the key and value
pairs much as you entered them. You can see that, although the order of
the pairs isn't obvious, it's the same each time. From this example you
can work out how to copy an associative array into another one:</p>
<pre><code> typeset -A newass
newass=(${(kv)assoc})
</code></pre>
<p>where the `<code>(kv)</code>' is important --- as is the <code>typeset</code> just before the
assignment, otherwise <code>$newass</code> would be a badass ordinary array. You
can also prove that <code>${(v)assoc}</code> does what you would probably expect.
There are lots of other tricks, but they are mostly associated with
clever types of parameter expansion, to be described in <ahref="zshguide05.html#subst">chapter
5</a>.</p>
<p><strong>Other typeset and type tricks</strong></p>
<p>There are variants of <code>typeset</code>, some mentioned sporadically above.
There is nothing you can do with any of them that you can't do with
<code>typeset</code> --- that wasn't always the case; we've tried to improve the
orthogonality of the options. They differ in the options which are set
by default, and the additional options which are allowed. Here's a list:
<code>declare</code>, <code>export</code>, <code>float</code>, <code>integer</code>, <code>local</code>, <code>readonly</code>. I won't
confuse you by describing all in detail; see the manual.</p>
<p>If there is an odd one out, it's <code>export</code>, which not only marks a
parameter for export but has the <code>-g</code> flag turned on by default, so that
that parameter is not local to the function; in other words, it's
equivalent to <code>typeset -gx</code>. However, one holdover from the days when
the options weren't quite so logical is that <code>typeset -x</code> behaves like
<code>export</code>, in other words the <code>-g</code> flag is turned on by default. You can
fix this by unsetting the option <code>GLOBAL_EXPORT</code> --- the option only
exists for compatibility; logically it should always be unset. This is
partly because in the old days you couldn't export local parameters, so
<code>typeset -x</code> either had to turn on <code>-g</code> or turn off <code>-x</code>; that was fixed
for the 3.1.9 release, and (for example) `<code>local -x</code>' creates a local
parameter which is exported to the environment; both the parameter
itself, and the value in the environment, will be restored when the
function exits. The builtin <code>local</code> is essentially a form of <code>typeset</code>
which renounces the <code>-g</code> flag and all its works.</p>
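<p>As a hedged sketch of that `<code>local -x</code>' behaviour (assuming a recent
enough shell; <code>TESTVAR</code> is just a throwaway name, and <code>env</code> is the
standard external command for listing the environment):</p>
<pre><code> fn() {
   local -x TESTVAR='only while fn runs'
   env | grep TESTVAR      # appears in the environment here
 }
 fn
 print ${+TESTVAR}         # 0: gone again outside the function
</code></pre>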
<p>Another old restriction which has gone is that you couldn't make special
parameters, in particular <code>$PATH</code>, local to a function; you just
modified the original parameter. Now if you say `<code>typeset PATH</code>',
things happen the way you probably expect, with <code>$PATH</code> having its usual
effect, and being restored to its old value when the function exits.
Since <code>$PATH</code> is still special, though, you should make sure you assign
something to it in the function before calling external commands, else
it will be empty and no commands will be found. It's possible that you
specifically don't want some parameter you make local to have the
special property; 3.1.7 and after allow the typeset flag <code>-h</code> to hide
the specialness for that parameter, so in `<code>typeset -h PATH</code>', <code>PATH</code>
would be an ordinary variable for the duration of the enclosing
function. Internally, the same value as was previously set would
continue to be used for finding commands, but it wouldn't be exported.</p>
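<p>A minimal sketch of the `<code>-h</code>' flag in action:</p>
<pre><code> fn() {
   typeset -h PATH='not really a path'
   print $PATH             # just the ordinary local value
   ls > /dev/null && print ls was still found, though
 }
 fn
 print $PATH               # the real, special PATH is untouched
</code></pre>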
<p>The second main use of <code>typeset</code> is to set attributes for the
parameters. In this case it can operate on an existing parameter, as
well as creating a new one. For example,</p>
<pre><code> typeset -r msg='This is an important message.'
</code></pre>
<p>sets the readonly flag (-r) for the parameter <code>msg</code>. If the parameter
didn't exist, it would be created with the usual scoping rules; but if
it did exist at the current level of scoping, it would be made readonly
with the value assigned to it, meaning you can't set that particular
copy of the parameter. For obvious reasons, it's normal to assign a
value to a readonly parameter when you first declare it. Here's a
reality check on how this affects scoping:</p>
<pre><code> msg='This is an ordinary parameter'
fn() {
typeset msg='This is a local ordinary parameter'
print $msg
typeset -r msg='This is a local readonly parameter'
print $msg
msg='Watch me cause an error.'
}
fn
print $msg
msg='This version of the parameter'\
' can still be overwritten'
print $msg
</code></pre>
<p>outputs</p>
<pre><code> This is a local ordinary parameter
This is a local readonly parameter
fn:5: read-only variable: msg
This is an ordinary parameter
This version of the parameter can still be overwritten
</code></pre>
<p>Unfortunately there was a bug with this code until recently --- thirty
seconds ago, actually: the second <code>typeset</code> in <code>fn</code> incorrectly added
the readonly flag to the existing <code>msg</code> <em>before</em> attempting to set the
new value, which was wrong and inconsistent with what happens if you
create a new local parameter. Maybe it's reassuring that the shell can
get confused about local parameters, too. (I don't find it reassuring in
the slightest, since <code>typeset</code> is one of the parts of the code where I
tend to fix the bugs, but maybe you do.)</p>
<p>Anyway, when the bug is fixed, you should get the output shown, because
the first typeset created a local variable which the second typeset made
readonly, so that the final assignment caused an error. Then the <code>$msg</code>
in the function went out of scope, and the ordinary parameter, with no
readonly restriction, was visible again.</p>
<p>I mentioned another special typeset option in the previous chapter:</p>
<pre><code> typeset -T TEXINPUTS texinputs
</code></pre>
<p>to tie together the scalar <code>$TEXINPUTS</code> and the array <code>$texinputs</code> in
the same way that <code>$PATH</code> and <code>$path</code> work. This is a one-off; it's the
only time <code>typeset</code> takes exactly two parameter names on the command
line. All other uses of typeset take a list of parameters to which any
flags given are applied. See the manual for the remaining flags,
although most of the more interesting ones have been discussed.</p>
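<p>A quick sketch of a tied pair in action (the directory names are made
up):</p>
<pre><code> typeset -T TEXINPUTS texinputs
 texinputs=(. ~/tex/macros)
 print $TEXINPUTS    # something like .:/home/you/tex/macros
</code></pre>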
<p>The other thing you need to know about flags is that you use them with a
`<code>+</code>' sign to turn off the corresponding attribute. So</p>
<pre><code> typeset +r msg
</code></pre>
<p>allows you to set <code>$msg</code> again. From version <code>4.1</code>, you won't be able to
turn off the readonly attribute for a special parameter; that's because
there's too much scope for confusion, including attempting to set
constant strings in the code. For example, `<code>$ZSH_VERSION</code>' always
prints a fixed string; attempting to change that is futile.</p>
<p>The final use of typeset is to list parameters. If you type `<code>typeset</code>'
on its own, you get a complete list of parameters and their values. From
3.1.7, you can turn on the flag <code>-H</code> for a parameter, which means to
hide its value while you're doing this. This can be useful for some of
the more enormous parameters, particularly special parameters which I'll
talk about in the section in <ahref="zshguide07.html#ragbag">chapter 7</a> on
modules, which tend to swamp the display <code>typeset</code> produces.</p>
<p>You can also list parameters of a particular type, by listing the flags
you want to know about. For example,</p>
<pre><code> typeset -r
</code></pre>
<p>lists all readonly parameters. You might expect `<code>typeset +r</code>' to list
parameters which <em>don't</em> have that attribute, but actually it lists the
same parameters but without showing their value. `<code>typeset +</code>' lists
all parameters in this way.</p>
<p>Another good way of finding out about parameters is to use the special
expansion `<code>${(t)</code><em>param</em><code>}</code>', for example</p>
<pre><code> print ${(t)PATH}
</code></pre>
<p>prints `<code>scalar-export-special</code>': <code>$PATH</code> is a scalar parameter, with
the <code>-x</code> flag set, and has a special meaning to the shell. Actually,
`<code>special</code>' means something a bit more than that: it means the internal
code to get and set the parameter behaves in a way which has side
effects, either to the parameter itself or elsewhere in the shell. There
are other parameters, like <code>$HISTFILE</code>, which are used by the shell, but
which are get and set in a normal way --- they are only special in that
the value is looked at by the shell; and, after all, any old shell
function can do that, too. Contrast this with <code>$PATH</code> which has all that
paraphernalia to do with hashing commands to take care of when it's set,
as I discussed above, and I hope you'll see the difference.</p>
<p><strong>Reading into parameters</strong></p>
<p>The `<code>read</code>' builtin, as its name suggests, is the opposite to
`<code>print</code>' (there's no `<code>write</code>' command in the shell, though there is
often an external command of that name to send a message to another
user), but reading, unlike printing, requires something in the shell to
change to take the value, so unlike <code>print</code>, <code>read</code> is forced to be a
builtin. Inevitably, the values are read into a parameter. Normally they
are taken from standard input, very often the terminal (even if you're
running a script, unless you redirected the input). So the simplest case
is just</p>
<pre><code> read param
</code></pre>
<p>and if you type a line, and hit return, it will be put into <code>$param</code>,
without the final newline.</p>
<p>The <code>read</code> builtin actually does a bit of processing on the input. It
will usually strip any initial or final whitespace (spaces or tabs) from
the line read in, though any in the middle are kept. You can read a set
of values separated by whitespace just by listing the parameters to
assign them to; the last parameter gets all the remainder of the line
without it being split. Very often it's easiest just to read into an
array:</p>
<pre><code> % read -A array
this is a line typed in now, \
by me, in this space
% print ${array[1]} ${array[12]}
this space
</code></pre>
<p>(I'm assuming you're using the native zsh array format, rather than the
one set with <code>KSH_ARRAYS</code>, and shall continue to assume this.)</p>
<p>It's useful to be able to print a prompt when you want to read
something. You can do this with `<code>print -n</code>', but there's a shorthand:</p>
<pre><code> % read line'?Please enter a line: '
Please enter a line: some words
% print $line
some words
</code></pre>
<p>Note the quotes surround the `<code>?</code>' to prevent it being taken as part of
a pattern on the command line. You can quote the whole expression from
the beginning of `<code>line</code>', if you like; I just write it like that
because I know parameter names don't need quoting, because they can't
have funny characters in. It's almost logical.</p>
<p>Another useful trick with <code>read</code> is to read a single character; the
`<code>-k</code>' option does this, and in fact you can stick a number immediately
after the `<code>k</code>' which specifies a number to read. Even easier, the
`<code>-q</code>' option reads a single character and returns status 0 if it was
<code>y</code> or <code>Y</code>, and status 1 otherwise; thus you can read the answer to
yes/no questions without using a parameter at all. Note, however, that
if you don't supply a parameter, the reply gets assigned in any case to
<code>$REPLY</code> if it's a scalar --- as it is with <code>-q</code> --- or <code>$reply</code> if it's
an array --- i.e. if you specify <code>-A</code>, but no parameter name. These are
more examples of the non-special parameters which the shell uses --- it
sets <code>$REPLY</code> or <code>$reply</code>, but only in the same way you would set them;
there are no side-effects.</p>
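<p>For example, here is a minimal yes/no prompt using `<code>-q</code>'; note that
the answer comes back both in the return status and in <code>$REPLY</code>:</p>
<pre><code> print -n 'Shall I carry on? '
 if read -q; then
   print '\nCarrying on, then.'
 else
   print '\nFair enough, stopping.'
 fi
 print You typed $REPLY
</code></pre>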
<p>Like <code>print</code>, <code>read</code> has a <code>-r</code> flag for raw mode. However, this just
has one effect for <code>read</code>: without it, a <code>\</code> at the end of the line
specifies that the next line is a continuation of the current one (you
can do this when you're typing at the terminal). With it, <code>\</code> is not
treated specially.</p>
<p>Finally, a more sophisticated note about word-splitting. I said that,
when you are reading to many parameters or an array, the word is split
on whitespace. In fact the shell splits words on any of the characters
found in the (genuinely special, because it affects the shell's guts)
parameter <code>$IFS</code>, which stands for `input field separator'. By default
--- and in the vast majority of uses --- it contains space, tab, newline
and a null character (character zero: if you know that these are usually
used to mark the end of strings, you might be surprised the shell
handles these as ordinary characters, but it does, although printing
them out usually doesn't show anything). However, you can set it to any
string: enter</p>
<pre><code> fn() {
local IFS=:
read -A array
print -l $array
}
fn
</code></pre>
<p>and type</p>
<pre><code>one word:two words:three words:four
</code></pre>
<p>The shell will show you what's in the array it's read, one `word' per
line:</p>
<pre><code> one word
two words
three words
four
</code></pre>
<p>You'll see the bananas, er, words (joke for the over-thirties) have been
treated as separated by a colon, not by whitespace. Making <code>$IFS</code> local
didn't work in old versions of zsh, as with other specials; you had to
save it and restore it.</p>
<p>The <code>read</code> command in zsh doesn't let you do line editing, which some
shells do. For that, you should use the <code>vared</code> command, which runs the
line editor to edit a parameter, with the <code>-c</code> option, which allows
<code>vared</code> to create a new parameter. It also takes the option <code>-p</code> to
specify a prompt, so one of the examples above can be rewritten</p>
<pre><code> vared -c -p 'Please enter a line: ' line
</code></pre>
<p>which works rather like read but with full editing support. If you give
the option <code>-h</code> (history), you can even retrieve values from previous
command lines. It doesn't have all the formatting options of read,
however, although when reading an array (use the option <code>-a</code> with <code>-c</code>
if creating a new array) it will perform splitting.</p>
<p><strong>Other builtins to control parameters</strong></p>
<p>The remaining builtins which handle parameters can be dealt with more
swiftly.</p>
<p>The builtin <code>set</code> simply sets the special parameter which is passed as
an argument to functions or scripts, and which you access as <code>$*</code> or
<code>$@</code>, or <code>$1</code>, <code>$2</code>, etc. (Bourne-like format), or via <code>$argv</code> (csh-like
format), known however you set them as the `positional parameters':</p>
<pre><code> % set a whole load of words
% print $1
a
% print $*
a whole load of words
% print $argv[2,-2]
whole load of
</code></pre>
<p>It's exactly as if you were in a function and had called the function
with the arguments `<code>a whole load of words</code>'. Actually, set can also be
used to set shell options, either as flags, e.g. `<code>set -x</code>', or as
words after `<code>-o</code>' , e.g. `<code>set -o xtrace</code>' does the same as the
previous example. It's generally easier to use <code>setopt</code>, and the upshot
is that you need to be careful when setting arguments this way in case
they begin with a `<code>-</code>'. Putting `<code>--</code>' before the real arguments
fixes this.</p>
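<p>So, as a small illustration:</p>
<pre><code> set -- -n -e --verbose
 print $# arguments: $*    # 3 arguments: -n -e --verbose
</code></pre>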
<p>One other use of <code>set</code> is to set any array, via</p>
<pre><code> set -A any_array words to assign to any_array
</code></pre>
<p>which is equivalent to (and the standard Korn shell version of)</p>
<pre><code> any_array=(words to assign to any_array)
</code></pre>
<p>One case where the <code>set</code> version is more useful is if the name of an
array itself comes from a parameter:</p>
<pre><code> arrname=myarray
set -A $arrname words to assign
</code></pre>
<p>has no easy equivalent in the other form; the left hand side of an
ordinary assignment won't expand a parameter:</p>
<pre><code> # Doesn't work; syntax error
$arrname=(words to assign)
</code></pre>
<p>This worked in old versions of zsh, but that was on the non-standard
side. The <code>eval</code> command, described below, gives another way around
this.</p>
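<p>Here's the <code>eval</code> trick as a sketch, in case you want it before you
get there:</p>
<pre><code> arrname=myarray
 eval "${arrname}=(words to assign)"
 print $myarray      # words to assign
</code></pre>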
<p>Next comes `<code>shift</code>', which simply moves an array up one element,
deleting the original first one. Without an array name, it operates on
the positional parameters. You can also give it a number to shift other
than one, before the array name.</p>
<pre><code> shift array
</code></pre>
<p>is equivalent to</p>
<pre><code> array=(${array[2,-1]})
</code></pre>
<p>(almost --- I'll leave the subtleties here for the chapter on expansion)
which picks the second to last elements of the array and assigns them
back to the original array. Note, yet again, that <code>shift</code> operates using
the <em>name</em>, not the <em>value</em> of the array, so no `<code>$</code>' should appear in
front, otherwise you get something similar to the trick I showed for
`<code>set -A</code>'.</p>
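<p>For example:</p>
<pre><code> array=(one two three four five)
 shift 2 array
 print $array        # three four five
</code></pre>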
<p>Finally, <code>unset</code> unsets a parameter, and I already showed you could
unset a key/value pair of an associative array. There is one subtlety to
be mentioned here. Normally, <code>unset</code> just makes the parameter named
disappear off the face of the earth. However, if you call <code>unset</code> in a
function, its ghost lives on in the sense that any parameter you create
in the same name will be scoped as the original parameter was. Hence:</p>
<pre><code> var='global value'
fn() {
typeset var='local value'
unset var
var='what about this?'
}
fn
print $var
</code></pre>
<p>The final statement prints `<code>global value</code>': even though the local copy
of <code>$var</code> was unset, the shell remembers that it was local, so the
second <code>$var</code> in the function is also local and its value disappears at
the end of the function.</p>
<p><spanid="l38"></span></p>
<h3id="327-history-control-commands"><aclass="header"href="#327-history-control-commands">3.2.7: History control commands</a></h3>
<p>The easiest way to access the shell's command history is by editing it
directly. The second easiest way is to use the `<code>!</code>'-history mechanism.
Other ways of manipulating it are based around the <code>fc</code> builtin, which
probably once stood for something (according to Oliver Kiddle, `fix
command', which is as good as anything). I talked quite a bit about it
in the last chapter, and don't really have anything to add. Just note
that the two other commands based around it are <code>history</code> and <code>r</code>.</p>
<p><spanid="l39"></span></p>
<h3id="328-job-control-and-process-control"><aclass="header"href="#328-job-control-and-process-control">3.2.8: Job control and process control</a></h3>
<p>One of the major contributions of the C-shell was job control. You need
to know about foreground and background tasks, and again I introduced
these in the last chapter along with the options that control them. Here
is an introduction to the relevant builtins.</p>
<p>You start a background job in two ways. First, directly, by putting an
`<code>&</code>' after it:</p>
<pre><code> sleep 10 &
</code></pre>
<p>and secondly by starting it in the normal way (i.e. in the foreground),
then typing <code>^Z</code>, and using the <code>bg</code> command to put it in the
background. Between typing <code>^Z</code> and <code>bg</code>, the job is still there, but is
not running; it is `suspended' or `stopped' (systems use different
descriptions for the same thing), waiting for you to decide what to do
with it. In either case, the job then continues without the shell
waiting for it. It will still try and read from or write to the terminal
if that's how you started it; you need to use the shell's redirection
facilities right at the start if you want to change that; there's
nothing you can do after the job has already started.</p>
<p>By the way, `sleep' isn't a builtin. Oddly enough, you can suspend a
builtin command or sequence of commands (such as shell function) with
<code>^Z</code>, although since the shell has to continue executing your commands
as well as being suspended, it does the only thing it can do --- fork,
so that the commands you suspend are put into the background. Probably
you will only rarely do this with builtins. No other shell, so far as I
know, has this feature.</p>
<p>A job will stop if it needs to read from the terminal. You see a message
like:</p>
<pre><code> [1] + 1348 suspended (tty input) jobname and arguments
</code></pre>
<p>which means the job is suspended very much like you had just typed <code>^Z</code>.
You need to bring the job into the foreground, as described below, so
that you can type something to it.</p>
<p>By the way, the key to type to suspend a command may not be <code>^Z</code>; it
usually is, but that can be changed. Run `<code>stty -a</code>' and look for what
is listed after `<code>susp =</code>' --- probably, but not necessarily, <code>^Z</code>. So
if you want to use another character --- it must be a single character;
this is handled deep in the terminal interface, not in the shell --- you
can run</p>
<pre><code> stty susp '^]'
</code></pre>
<p>or whatever. You will note from the <code>stty</code> output that various other job
control characters can be changed similarly. The <code>stty</code> command is
external and its format for both output and input can vary quite a bit
from system to system.</p>
<p>Instead of putting the command into the background, you can bring it
back to the foreground again with <code>fg</code>. This is useful for temporarily
stopping what you are doing so you can do something else. These days you
would probably do it in another window; in the old days when people
logged in from simple terminals this was even more useful. A typical
example of this is</p>
<pre><code> more file # look at file
^Z # suspend
[1] + 8592 suspended more file # message printed
... # do something else
fg %1 # resume the `more'
</code></pre>
<p>The `<code>%</code>' is the usual way of referring to jobs. The number after it is
what appeared in square brackets with the suspended message; I don't
know why the shell doesn't use the `<code>%</code>' notation there, too. You also
see that with the `continued' message when you put something into the
background, and again at the end with the `done' message which tells
you a background job is finished. The `<code>%</code>' can take other forms; the
most common is to follow it by the name of a command, such as `<code>%more</code>'
in this case. The forms <code>%+</code> and <code>%-</code> refer to the most recent and
second most recent jobs --- the `<code>+</code>' in the `suspended' message is
telling you that the <code>more</code> job could be referred to like that.</p>
<p>Most of the job control commands will actually assume you are talking
about `<code>%+</code>' if you don't give an argument, so assuming I hadn't
started any other commands in the background, I could just have put
`<code>fg</code>' at the end of the sequence of commands above. This actually cuts
both ways: <code>fg</code> is the default operation on jobs referred to with the
`<code>%</code>' notation, so just typing `<code>%1</code>' with no command name would have
worked, too.</p>
<p>You can jog your memory about what's going on with the `<code>jobs</code>'
command. It looks like a series of messages of the form beginning with
the number in square brackets; usually the jobs will either be
`running' or `suspended'. This will tell you the numbers you need.</p>
<p>One other useful thing you can do with a job is to tell the shell to
forget about it. This is only really useful if it is already running in
the background; then you can run `<code>disown</code>' with the job identifier.
It's useful for jobs you want to continue after you've logged out, as
well as jobs that have their own windows which you can therefore control
directly. With disowned jobs, the shell doesn't warn you that they are
still there when you log out. You can actually disown a background job
when you start it by putting `<code>&|</code>' or `<code>&!</code>' at the end of the line
instead of simply `<code>&</code>'. Note that if the job was suspended when you
disowned it, it will stay disowned; this is pretty pointless, so you
probably should run `<code>bg</code>' on it first.</p>
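<p>In its simplest form that's just:</p>
<pre><code> sleep 1000 &
 disown %+       # or `disown %sleep'; no warning about it at logout now
</code></pre>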
<p>The next most likely thing you want to do with a job is kill it, or
maybe suspend it when it's already in the background and you can't just
type <code>^Z</code>. This is where the <code>kill</code> builtin comes in. There's more to
this than there is to the builtins mentioned above. First, you can use
<code>kill</code> with other processes that weren't started from the current shell.
In that case, you would use a number to identify it, with no <code>%</code> ---
that's why the <code>%</code>'s were there in the other cases. Of course, you need
to find out the number; the usual way is with the <code>ps</code> command, which is
not a builtin but which appears on all UNIX-like systems. As a stupid
example, here I start a disowned process which does very little, look
for it, then kill it:</p>
<pre><code> % sleep 60 &|
% ps -f
UID PID PPID C STIME TTY TIME CMD
pws 623 614 0 22:12 pts/0 00:00:00 zsh
pws 8613 623 0 23:12 pts/0 00:00:00 sleep 60
pws 8615 623 0 23:12 pts/0 00:00:00 ps -f
% kill 8613
% ps -f
UID PID PPID C STIME TTY TIME CMD
pws 623 614 0 22:12 pts/0 00:00:00 zsh
pws 8616 623 0 23:12 pts/0 00:00:00 ps -f
</code></pre>
<p>The process has disappeared the second time I look. Notice that in the
usual lugubrious UNIX way the shell didn't bother to tell you the
process had been killed; however, it will report an error if it failed
to send it the signal. Sending it the signal is all the shell cares
about; the shell won't warn you if the process decided it didn't want
to die when told to, so it's still a good idea to check.</p>
<p>Sometimes you want to wait for a process to exit; the <code>wait</code> builtin can
do this, and like <code>kill</code> can take a process number as well as a job
number. However, that's a bit deceptive --- you can't actually wait for
a process which wasn't started directly from the shell. Indeed, the
mechanism for waiting is all bound up with the way UNIX handles
processes; unless its parent waits for it, a process becomes a `zombie'
and hangs around until the system's foster parent, the `init' process
(always process number 1) waits for it instead. It's all a little bit
baroque, but for the shell user, wait just means you can hang on until
something you started has finished. Indeed, that's how foreground
processes work: the shell in effect uses the internal version of <code>wait</code>
to hang around until the job exits. (Well, actually that's a lie; the
system wakes it up from whatever it's doing to tell it a child has
finished, so all it has to do is doze off to wait.)</p>
<p>Furthermore, you can wait for a process even if job control isn't
running. Job control, basically anything involving those <code>%</code>'s, is only
useful when you are sitting at a terminal fiddling with commands; it
doesn't operate when you run scripts, say. Then the shell has much less
freedom in how to control its jobs, but it can still wait for a
background process, and it can still use <code>kill</code> on a process if it knows
its number. For this purpose, the shell stores the ID of the last
process started in the background in the parameter <code>$!</code>; there's
probably a good reason for the `<code>!</code>', but I don't know what it is. This
happens regardless of job control.</p>
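<p>So in a script you might see something along these lines (a sketch):</p>
<pre><code> sleep 10 &
 pid=$!
 print "Started background job with process ID $pid"
 wait $pid
 print "It has finished now"
</code></pre>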
<p><strong>Signals</strong></p>
<p>The <code>kill</code> command can do a good deal more than just kill a process.
That is the default action, which is why the command has that name. But
what it's really doing is sending a `signal' to a process. Signals are
the simplest way of communicating to another process; in fact, they are
about the only simple way if you haven't made special arrangements for
the process to read messages from you. Signal names are written like
<code>SIGINT</code>, <code>SIGTSTP</code>, <code>SIGKILL</code>; to send a particular signal to a
process, you remove the <code>SIG</code>, stick a hyphen in front, and use that as
the first argument to <code>kill</code>, e.g.:</p>
<pre><code> kill -KILL 8613
</code></pre>
<p>Some of the things you already know about are actually doing just that.
When you type <code>^C</code> to stop a process, you are actually sending it a
<code>SIGINT</code> for `interrupt', as if you had done</p>
<pre><code> kill -INT 8613
</code></pre>
<p>The usual signal sent by <code>kill</code> is not, as you might have guessed,
<code>SIGKILL</code>, but actually <code>SIGTERM</code> for `terminate'; <code>SIGKILL</code> is
stronger as the process can't block that signal, as it can with many
(we'll see how the shell can do that in a moment). It's familiar to UNIX
hackers as `<code>kill -9</code>', because all the signals also have numbers. You
can see the list of signals in zsh by doing:</p>
<pre><code> % print $signals
EXIT HUP INT QUIT ILL TRAP ABRT BUS FPE KILL USR1
SEGV USR2 PIPE ALRM TERM STKFLT CLD CONT STOP TSTP
</code></pre>
<p>I told you in the last chapter that the right way to write tests in zsh
was using the `<code>[[ ... ]]</code>' form, and why. So you can ignore the two
builtins `<code>test</code>' and `<code>[</code>', even though they're the ones that
resemble the Bourne shell. You can safely write</p>
<pre><code> if [[ $foo = '' ]]; then
print The parameter foo is empty. O, misery me.
fi
</code></pre>
<p>or</p>
<pre><code> if [[ -z $foo ]]; then
print Alack and alas, foo still has nothing in it.
fi
</code></pre>
<p>instead of monstrosities like</p>
<pre><code> if test x$foo = x; then
echo The emptiness of foo. Yet are we not all empty\?
fi
</code></pre>
<p>because even if <code>$foo</code> does expand to an empty string, which is what is
implied if the tests are true, `<code>[[ ... ]]</code>' remembers there was
something there and gets the syntax right. Rather than a builtin, this
is actually a reserved word --- in fact it has to be, to be
syntactically special --- but you probably aren't too bothered about the
difference.</p>
<p>There are two sorts of tests, both shown above: those with three
arguments, and those with two. The three-argument forms all have some
comparison in the middle; in addition to `<code>=</code>' (or `<code>==</code>', which means
the same here, and which according to the manual page we should be
using, though none of us does), there are `<code>!=</code>' (not equal), `<code><</code>',
`<code>></code>', `<code><=</code>' and `<code>>=</code>'. All these do <em>string</em> comparisons, i.e.
they compare the sort order of the strings.</p>
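<p>For instance (a minimal sketch; any two words would do):</p>
<pre><code> if [[ apple < banana ]]; then
   print apple sorts before banana
 fi
</code></pre>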
<p>Since there are better ways of sorting things in zsh, the `<code>=</code>' and
`<code>!=</code>' forms are by far the most common. Actually, they do something a
bit more than string comparison: the expression on the right can be a
pattern. The patterns understood are just the same as for matching
filenames, except that `<code>/</code>' isn't special, so it can be matched by a
`<code>*</code>'. Note that, because `<code>=</code>' and `<code>!=</code>' are treated specially by
the shell, you shouldn't quote the patterns: you might think that unless
you do, they'll be turned into file names, but they won't. So</p>
<pre><code> if [[ biryani = b* ]]; then
print Word begins with a b.
fi
</code></pre>
<p>works. If you'd written <code>'b*'</code>, including the quotes, it wouldn't have
been treated as a pattern; it would have tested for a string which was
exactly the two letters `<code>b*</code>' and nothing else. Pattern matching like
this can be very powerful. If you've done any Bourne shell programming,
you may remember the only way to use patterns there was via the
`<code>case</code>' construction: that's still in zsh (see below), and uses the
same sort of patterns, but the test form shown above is often more
useful.</p>
<p>Then there are other three-argument tests which do numeric comparison.
Rather oddly, these use letters rather than mathematical symbols:
`<code>-eq</code>', `<code>-lt</code>' and `<code>-le</code>' compare if two numbers are equal, less
than, or less than or equal, to one another. You can guess what `<code>-gt</code>'
and `<code>-ge</code>' do. Note this is the other way round to Perl, which much
more logically uses `<code>==</code>' to test for equality of numbers (not `<code>=</code>',
since that's always an assignment operator in Perl) and `<code>eq</code>' (minus
the minus) to test for equality of strings. Unfortunately we're now
stuck with it this way round. If you are only comparing numbers, it's
better to use the `<code>(( ... ))</code>' expression, because that has a proper
understanding of arithmetic. However,</p>
<pre><code> if [[ $number -gt 3 ]]; then
print Wow, that\'s big
fi
</code></pre>
<p>and</p>
<pre><code> if (( $number > 3 )); then
print Wow, that\'s STILL big
fi
</code></pre>
<p>are essentially equivalent. In the second case, the status is zero
(true) if the number in the expression was non-zero (sorry if I'm
confusing you again) and vice versa. This means that</p>
<pre><code> if (( 3 )); then
print It seems that 3 is non-zero, Watson.
fi
</code></pre>
<p>is a perfectly valid test. As in C, the test operators in arithmetic
return 1 for true and 0 for false, i.e. `<code>$number > 3</code>' is 1 if
<code>$number</code> is greater than 3 and 0 otherwise; the inversion to shell
logic, zero for true, only occurs at the final step when the expression
has been completely evaluated and the `<code>(( ... ))</code>' command returns. At
least with `<code>[[ ... ]]</code>' you don't need to worry about the extra
negation; you can simply think in logical terms (although that's hard
enough for a lot of people).</p>
<p>Finally, there are a few other odd comparisons in the three-argument
form:</p>
<pre><code> if [[ file1 -nt file2 ]]; then
print file1 is newer than file2
fi
</code></pre>
<p>does the test implied by the example; there is also `<code>-ot</code>' to test for
an older file, and there is also the little-used `<code>-ef</code>' which tests
for an `equivalent file', meaning that they refer to the same file ---
in other words, are linked; this can be a hard or a symbolic link, and
in the second case it doesn't matter which of the two is the symbolic
link. (If you were paying attention above, you'll know it can't possibly
matter in the first case.)</p>
<p>In addition to these tests, which are pretty recognisable from most
programming languages --- although you'll just have to remember that the
`<code>=</code>' family compares strings and not numbers --- there are another set
which are largely peculiar to UNIXy scripting languages. These are all
in the form of a hyphen followed by a letter as the test, which always
takes a single argument. I showed one: `<code>-z $var</code>' tests whether
`<code>$var</code>' has zero length. Its opposite is `<code>-n $var</code>' which tests for
non-zero length. Perhaps this is as good a time as any to point out that
the arguments to these commands can be any single word expression, not
just variables or filenames. You are quite at liberty to test</p>
<pre><code> if [[ -z "$var is sqrt(`print bibble`)" ]]; then
print Flying pig detected.
fi
</code></pre>
<p>if you like. In fact, the tests are so eager to make sure that they only
have a one word argument that they will treat things like arrays, which
usually return a whole set of words, as if they were in double quotes,
joining the bits with spaces:</p>
<pre><code> array=(two words)
if [[ $array = 'two words' ]]; then
print "The array \$array is OK. O, joy."
fi
</code></pre>
<p>Apart from `<code>-z</code>' and `<code>-n</code>', most of the two-argument tests are to do
with files: `<code>-e</code>' tests that the file named next exists, whatever type
of file it is (it might be a directory or something weirder); `<code>-f</code>'
tests if it exists and is a regular file (so it isn't a directory or
anything weird this time); `<code>-x</code>' tests whether you can execute it.
There are all sorts of others which are listed in the manual page for
various properties of files. Then there are a couple of others:
`<code>-o</code> <em>option</em>', which you've already met, tests whether the named
option is set, and `<code>-t</code> <em>fd</em>' tests whether the file descriptor is
attached to a terminal. A file descriptor is a number which for the shell
must be between 0 and 9 inclusive (others may exist, but you can't access
them directly); 0 is the standard input, 1 the standard output, and 2 the
channel on which errors are usually printed. Hence `<code>[[ -t 0 ]]</code>' tests
whether the input is coming from a terminal.</p>
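<p>So a script that only wants to ask questions when a human is likely to
be on the other end might say (a minimal sketch):</p>
<pre><code> if [[ -t 0 && -t 1 ]]; then
   print Looks like an interactive terminal.
 fi
</code></pre>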
<p>There are only four other things making up tests. `<code>&&</code>' and `<code>||</code>'
mean logical `and' and `or', `<code>!</code>' negates the effect of a test, and
parentheses `<code>( ... )</code>' can be used to surround a set of tests which
are to be treated as one. These are all essentially the same as in C. So</p>
<pre><code> if [[ 3 -gt 2 && ( me > you || ! -z bah ) ]]; then
print will I, won\'t I...
fi
</code></pre>
<p>will, because 3 is numerically greater than 2; the expression in
parentheses is evaluated too: `<code>me</code>' actually comes before `<code>you</code>'
in the alphabet, so the first test fails, but `<code>-z bah</code>' is false because
you gave it a non-empty string, and hence `<code>! -z bah</code>' is true. So both
sides of the `<code>&&</code>' are true and the test succeeds.</p>
<p><spanid="l44"></span></p>
<h3id="3213-handling-options-to-functions-and-scripts"><aclass="header"href="#3213-handling-options-to-functions-and-scripts">3.2.13: Handling options to functions and scripts</a></h3>
<p>It's often convenient to have your own functions and scripts process
single-letter options in the way a lot of builtin commands (as well as a
great many other UNIX-style commands) do. The shell provides a builtin
for this called `<code>getopts</code>'. This should always be called in some kind
of loop, usually a `<code>while</code>' loop. It's easiest to explain by example.</p>
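<p>The function being demonstrated isn't anything magic; here is a sketch
of a suitable one, consistent with the discussion that follows, called
<code>testopts</code> to match the example below:</p>
<pre><code> testopts() {
   local opt
   while getopts ab: opt; do
     case $opt in
       (a) print Option a set
           ;;
       (b) print Option b set to $OPTARG
           ;;
       (\?) print Unrecognised option. >&2
            return 1
            ;;
     esac
   done
   (( OPTIND > 1 )) && shift $(( OPTIND - 1 ))
   print Remaining arguments are: $*
 }
</code></pre>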
<p>There's quite a lot here if you're new to shell programming. You might
want to read the stuff on structures like <code>while</code> and <code>case</code> below and
then come back and look at this. First let's see what it does.</p>
<pre><code> % testopts -b foo -a -- args
Option b set to foo
Option a set
Remaining arguments are: args
</code></pre>
<p>Here's what's happening. `<code>getopts ab: opt</code>' is the argument to the
`<code>while</code>'. That means that the <code>getopts</code> gets run; if it succeeds
(returns status zero), then the loop after the `<code>do</code>' is run. When
that's finished, the <code>getopts</code> command is run again, and so on until it
fails (returns a non-zero status). It will do that when there are no
more options left on the command line. So the loop processes the options
one by one. Each time through, the number of the next argument to look
at is left in the parameter <code>$OPTIND</code>, so this gradually increases;
that's how <code>getopts</code> knows how far it has got.</p>
<p>The first argument to the <code>getopts</code> is `<code>ab:</code>'. That means `<code>a</code>' is an
option which doesn't take an argument, while `<code>b</code>' is an argument which
takes a single argument, signified by the colon after it. You can have
any number of single-letter (or even digit) arguments, which are
case-sensitive; for example `<code>ab:c:ABC:</code>' are six different options,
three with arguments. If the option found has an argument, that is
stored in the parameter <code>$OPTARG</code>; <code>getopts</code> then increments <code>$OPTIND</code>
by however much is necessary, which may be 2 or just 1 since `<code>-b foo</code>'
and `<code>-bfoo</code>' are both valid ways of giving the argument.</p>
<p>If an option is found, we use the `<code>case</code>' mechanism to find out what
it was. The idea of this is simple, even if the syntax has the look of
an 18th-century French chateau: the argument `<code>$opt</code>' is tested against
all of the patterns in the `<code>pattern</code>)' lines until one matches, and
the commands are executed until the next `<code>;;</code>'. It's the shell's
equivalent of C's `<code>switch</code>'. In this example, we just print out what
the <code>getopts</code> brought in. Note the last line, which is called if <code>$opt</code>
is a question mark --- it's quoted because `<code>?</code>' on its own can stand
for any single character. This is how <code>getopts</code> signals an unknown
option. If you try it, you'll see that <code>getopts</code> prints its own error
message, so ours was unnecessary: you can turn the former off by putting
a colon right at the start of the list of options, making it `<code>:ab:</code>'
here.</p>
<p>Actually, having this last pattern as an <em>un</em>quoted `<code>?</code>' isn't such a
bad idea. Suppose you add a letter to the list that <code>getopts</code> should
handle and forget to add a corresponding item in the <code>case</code> list for it.
If the last item matches any character, you will get the behaviour for
an unhandled option, which is probably the best thing to do. Otherwise
nothing in the <code>case</code> list will match, the shell will sail blithely on
to the next call to <code>getopts</code>, and when you try to use the function with
the new option you will be left wondering quite what happened to it.</p>
<p>The last piece of the <code>getopts</code> jigsaw is the next line, which tests if
<code>$OPTIND</code> is larger than 1, i.e. an option was found and <code>$OPTIND</code> was
advanced --- it is automatically set to 1 at the start of every function
or script. If it was, the `<code>shift</code>' builtin with a numeric argument,
but no array name, moves the positional parameters, i.e. the function's
arguments, to shift away the options that have been processed. The
<code>print</code> in the next line shows you that only the remaining non-option
arguments are left. You don't need to do that --- you can just start
using the remaining arguments from <code>$argv[$OPTIND]</code> on --- but it's a
pretty good way of doing it.</p>
<p>In the call, I showed a line with `<code>--</code>' in it: that's the standard
way of telling <code>getopts</code> that the options are finished; even if later
words start with a <code>-</code>, they are not treated as options. However,
<code>getopts</code> stops anyway when it reaches a word not beginning with `<code>-</code>',
so that wasn't necessary here. But it works anyway.</p>
<p>You can do all of what <code>getopts</code> does without <em>that</em> much difficulty
with a few extra lines of shell programming, of course. The best
argument for using <code>getopts</code> is probably that it allows you to group
single-letter options, e.g. `<code>-abc</code>' is equivalent to `<code>-a -b -c</code>' if
none of them was defined to have an argument. In this case, <code>getopts</code>
has to remember the position <em>inside</em> the word on the command line for
you to call it next, since the `<code>a</code>' `<code>b</code>' and `<code>c</code>' still appear on
different calls. Rather unsatisfactorily, this is hidden inside the
shell (as it is in other shells --- we haven't fixed <em>all</em> of everybody
else's bad design decisions); you can't get at it or reset it without
altering <code>$OPTIND</code>. But if you read the small print at the top of the
guide, you'll find I carefully avoided saying everything was
satisfactory.</p>
<p>While we're at it, why do blocks starting with `<code>if</code>' and `<code>then</code>' end
with `<code>fi</code>', and blocks starting with `<code>case</code>' end with `<code>esac</code>',
while those starting with `<code>while</code>' and `<code>do</code>' end with `<code>done</code>', not
`<code>elihw</code>' (perfectly pronounceable in Welsh, after all) or `<code>od</code>'?
Don't ask me.</p>
<p><spanid="l45"></span></p>
<h3id="3214-random-file-control-things"><aclass="header"href="#3214-random-file-control-things">3.2.14: Random file control things</a></h3>
<p>We're now down into the odds and ends. If you know UNIX at all, you will
already be familiar with the <code>umask</code> command and its effect on file
creation, but as it is a builtin I will describe it here. Create a file
and look at it:</p>
<pre><code> % touch tmpfile
% ls -l tmpfile
-rw-r--r-- 1 pws pws 0 Jul 19 21:19 tmpfile
</code></pre>
<p>(I've shortened the output line for the TeX version of this document.)
You'll see that the permissions are read for everyone, write-only for
the owner. How did the command (<code>touch</code>, not a builtin, creates an empty
file if there was none, or simply updates the modification time of an
existing file) know what permissions to set?</p>
<pre><code> % umask
022
% umask 077
% rm tmpfile; touch tmpfile
% ls -l tmpfile
-rw------- 1 pws pws 0 Jul 19 21:22 tmpfile
</code></pre>
<p><code>umask</code> was how it knew. It gives an octal number corresponding to the
permissions which should <em>not</em> be given to a newly created file (only
newly created files are affected; operations on existing files don't
involve <code>umask</code>). Each digit is made up of a 4 for read, 2 for write, 1
for execute, in the same order that <code>ls</code> shows for permissions: user,
then group, then everyone else. (On this Linux/GNU-based system, like
many others, users have their own groups, so both are called `<code>pws</code>'.)
So my original `022' specified that everything should be allowed except
writing for group and other, while `077' disallowed any operation by
group and other. These are the two most common settings, although here
`002' or `007' would be useful because of the way groups are specific
to users, making it easier to grant permission to specific other users
to write in my directories. (Except there aren't any other users.)</p>
<p>You can also use <code>chmod</code>-like permission changing expressions in
<code>umask</code>. So</p>
<pre><code> % umask go+rx
</code></pre>
<p>would restore group and other permissions for reading and executing,
hence returning the mask to 022. Note that because it is <em>adding</em>
permissions, just like <code>chmod</code> does, it is <em>removing</em> numbers from the
umask.</p>
<p>You might have wondered about execute permissions, since `<code>touch</code>'
didn't give any, even where it was allowed by <code>umask</code>. That's because
only operations which create executable programmes, such as running a
compiler and linker, set that bit; the normal way of opening a new file
--- internally, the UNIX <code>open</code> function, with the <code>O_CREAT</code> flag set
--- doesn't touch it. For the same reason, if you create shell scripts
which you want to be able to execute by typing the name, you have to
make them executable yourself:</p>
<pre><code> % chmod +x myscript
</code></pre>
<p>and, indeed, you can think of <code>chmod</code> as <code>umask</code>'s companion for files
which already exist. It doesn't need to be a builtin, because the files
you are operating on are external to <code>zsh</code>; <code>umask</code>, on the other hand,
operates when you create a file from within <code>zsh</code> or any child process,
so needs to be a builtin. The fact that it's inherited means you can set
<code>umask</code> before you start an editor, and files created by that editor
will reflect the permissions.</p>
<p>Note that the value set by <code>umask</code> is also inherited and used by
<code>chmod</code>. In the example of <code>chmod</code> I gave, I didn't say <em>which</em> type of
execute permission to add; <code>chmod</code> looks at my <code>umask</code> and decides based
on that --- in other words, with 022, everybody would be allowed to
execute <code>myscript</code>, while with 077, only I would, because of the 1's in
the number: (0+0+0)+(4+2+1)+(4+2+1). Of course, you can be explicit with
chmod and say `<code>chmod u+x myscript</code>' and so on.</p>
<p>Something else that may or may not be obvious: if you run a script by
passing it as an argument to the shell,</p>
<pre><code> % zsh myscript
</code></pre>
<p>what matters is <em>read</em> permission. That's what the shell's doing to the
script to find the commands, after all. Execute permission applies when
the system (or, in some cases, including zsh, the parent shell where you
typed `<code>myscript</code>') has to decide whether to find a shell to run the
script by itself.</p>
<p><spanid="l46"></span></p>
<h3id="3215-dont-watch-this-space-watch-some-other"><aclass="header"href="#3215-dont-watch-this-space-watch-some-other">3.2.15: Don't watch this space, watch some other</a></h3>
<p>Finally for builtins, some things which really belong elsewhere. There
are three commands you use to control the shell's editor. These will be
described in <ahref="zshguide04.html#zle">chapter 4</a>, where I talk all about
the editor.</p>
<p>The <code>bindkey</code> command allows you to attach a set of keystrokes to a
command. It understands an abbreviated notation for the keystrokes.</p>
<pre><code> % bindkey '^Xc' copy-prev-word
</code></pre>
<p>This binds the keystrokes consisting of <code>Ctrl</code> held down with <code>x</code>, then
<code>c</code>, to the command which copies the previous word on the line to the
current position. The commands are listed in the <code>zshzle</code> manual page.
<code>bindkey</code> can also do things with keymaps, which are a complete set of
mappings between keys and commands like the one I showed.</p>
<p>The <code>vared</code> command is an extremely useful builtin for editing a shell
variable. Usually much the easiest way to change <code>$path</code> (or <code>$PS1</code>, or
whatever) is to run `<code>vared path</code>': note the lack of a `<code>$</code>', since
otherwise you would be editing whatever <code>$path</code> was expanded to. This is
because very often you want to leave most of what's there and just
change the odd character or word. Otherwise, you would end up doing this
with ordinary parameter substitutions, which are a lot more complicated
and error prone. Editing a parameter is exactly like editing a command
line, except without the prompt at the start.</p>
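<p>So to adjust your path you would simply type the following; the current
value of the array (the one shown is only an example) appears as an
editable line, and whatever is on the line when you hit return becomes
the new value:</p>
<pre><code> % vared path
 /usr/local/bin /usr/bin /bin /home/pws/bin
</code></pre>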
<p>Finally, there is the <code>zle</code> command. This is the most mysterious, as it
offers a fairly low-level interface to the line editor; you use it to
define your own editing commands. So I'll leave this alone for now.</p>
<p><spanid="l47"></span></p>
<h3id="3216-and-also"><aclass="header"href="#3216-and-also">3.2.16: And also</a></h3>
<p>There is one more standard builtin that I haven't covered: <code>zmodload</code>,
which allows you to manipulate add-on packages for zsh. Many extras are
supplied with the shell which aren't normally loaded to keep down the
use of memory and to avoid having too many rarely used builtins, etc.,
getting in the way. In the last chapter I will talk about some of these.
To be more honest, a lot of the stuff in between actually uses these
addons, generically referred to as modules --- the line editor, zle, is
itself a separate module, though heavily dependent on the main shell ---
and you've probably forgotten I mentioned above using `<code>zmodload zsh/mathfunc</code>' to load mathematical functions.</p>
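<p>As a reminder of how that looks in practice --- the list printed by a bare
<code>zmodload</code>, which shows the modules currently loaded, will depend on your
own configuration:</p>
<pre><code> % zmodload zsh/mathfunc
 % print $(( sqrt(2) ))
 1.4142135623730951
 % zmodload
 zsh/main
 zsh/mathfunc
 zsh/zle
</code></pre>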
<p>Filename generation is exactly the same as `globbing': the expanding of
any unquoted wildcards to match files. This is only done in one
directory at a time. So for example</p>
<pre><code> print *.c
</code></pre>
<p>won't match files in a subdirectory ending in `<code>.c</code>'. However, it <em>is</em>
done on all parts of a path, so</p>
<pre><code> print */*.c
</code></pre>
<p>will match all `<code>.c</code>' files in all immediate subdirectories of the
current directory. Furthermore, zsh has an extension --- one of its most
commonly used special features --- to match files in any subdirectory at
any depth, including the current directory: use two `<code>*</code>'s as part of
the path:</p>
<pre><code> print **/*.c
</code></pre>
<p>will match `<code>prog.c</code>', `<code>version1/prog.c</code>', `<code>version2/test/prog.c</code>',
`<code>oldversion/working/saved/prog.c</code>', and so on. I will talk about
filename generation and other uses of zsh's extremely powerful patterns
at much greater length in <ahref="zshguide05.html#subst">chapter 5</a>. My main
thrust here is to fit it into other forms of expansion; the main thing
to remember is that it comes last, after everything has already been
done.</p>
<p>So although you would certainly expect this to work,</p>
<pre><code> print ~/*
</code></pre>
<p>generating all files in your home directory, you now know why: it is
first expanded to `<code>/home/pws/*</code>' (or wherever), then the shell scans
down the path until it finds a pattern, and looks in the directory it
has reached (<code>/home/pws</code>) for matching files. Furthermore,</p>
<pre><code> foo=~/
print $foo*
</code></pre>
<p>works. However, as I explained in the last chapter, you need to be
careful with</p>
<pre><code> foo=*
print ~/$foo
</code></pre>
<p>This just prints `<code>/home/pws/*</code>'. To get the `<code>*</code>' from the parameter
to be a wildcard, you need to tell the shell explicitly that's what you
want:</p>
<pre><code> foo=*
print ~/${~foo}
</code></pre>
<p>As also noted, other shells do expand the <code>*</code> as a wildcard anyway. The
zsh attitude here, as with word splitting, is that parameters should do
exactly what they're told rather than waltz off generating extra words
or expansions.</p>
<p>Be even more careful with arrays:</p>
<pre><code> foo=(*)
</code></pre>
<p>will expand the <code>*</code> immediately, in the current directory --- the
elements of the array assignment are expanded exactly like a normal
command line glob. This is often very useful, but note the difference
from scalar assignments, which do other forms of expansion, but not
globbing.</p>
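<p>Here is a minimal illustration, assuming a directory containing just the
two files <code>file1</code> and <code>file2</code>:</p>
<pre><code> % foo=*
 % print $foo
 *
 % foo=(*)
 % print $foo
 file1 file2
</code></pre>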
<p>I'll mention a few possible traps for the unwary, which might confuse
you until you are a zsh globbing guru. Firstly, parentheses actually
have two uses. Consider:</p>
<pre><code> print (foo|bar)(.)
</code></pre>
<p>The first set of parentheses means `match either <code>foo</code> or <code>bar</code>'. If
you've used <code>egrep</code>, you will probably be familiar with this. The
second, however, simply means `match only regular files'. The `<code>(.)</code>'
is called a `globbing qualifier', because it limits the scope of any
matches so far found. For example, if either or both of <code>foo</code> and <code>bar</code>
were found, but were directories, they would not now be matched. There
are many other possibilities for globbing qualifiers. For now, the
easiest way to tell if something at the end is <em>not</em> a globbing
qualifier is if it contains a `<code>|</code>'.</p>
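<p>To give a couple more examples of qualifiers you are likely to meet:
`<code>(/)</code>' restricts matches to directories, and qualifiers can be combined,
so `<code>(.x)</code>' means plain files which their owner can execute. The files
shown in the output here are, of course, only illustrative:</p>
<pre><code> % print *(/)
 bin doc src
 % print *(.x)
 myscript
</code></pre>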
<p>The second point is about forms like this:</p>
<pre><code> print file-<1-10>.dat
</code></pre>
<p>The `<code><</code>' and `<code>></code>' smell of redirection, as described next, but
actually the form `<code><</code>', optional start number, `<code>-</code>', optional finish
number, `<code>></code>' means match any positive integer in the range between the
two numbers, inclusive; if either is omitted, there is no limit on that
end, hence the cryptic but common `<code><-></code>' to match any positive integer
--- in other words, any group of decimal digits (bases other than ten
are not handled by this notation). Older versions of the shell allowed
the form `<code><></code>' as a shorthand to match any number, but the overlap
with redirection was too great, as you'll see, so this doesn't work any
more.</p>
<p>Another two cryptic symbols are the two that do negation. These only
work with the option `<code>EXTENDED_GLOB</code>' set: this is necessary to get
the most out of zsh's patterns, but it can be a trap for the unwary by
turning otherwise innocuous characters into patterns:</p>
<pre><code> print ^foo
</code></pre>
<p>This means any file in the current directory <em>except</em> the file <code>foo</code>.
One way of coming unstuck with `<code>^</code>' is something like</p>
<pre><code> stty kill ^u
</code></pre>
<p>where you would hope `<code>^u</code>' means control with `<code>u</code>', i.e. ASCII
character 21. But it doesn't, if <code>EXTENDED_GLOB</code> is set: it means `any
file in the current directory except one called `<code>u</code>' ', which is
definitely a different thing. The other negation operator isn't usually
so fraught, but it can look confusing:</p>
<pre><code> print *.c~f*
</code></pre>
<p>is a pattern of two halves; the shell tries to match `<code>*.c</code>', but
rejects any matches which also match `<code>f*</code>'. Luckily, a `<code>~</code>' right at
the end isn't special, so</p>
<pre><code> rm *.c~
</code></pre>
<p>removes all files ending in `<code>.c~</code>' --- it wouldn't be very nice if it
matched all files ending in `<code>.c</code>' and treated the final `<code>~</code>' as an
instruction not to reject any, so it doesn't. The most likely case I can
think of where you might have problems is with Emacs' numeric backup
files, which can have a `<code>~</code>' in the middle which you should quote.
There is no confusion with the directory use of `<code>~</code>', however: that
only occurs at the beginning of a word, and this use only occurs in the
middle.</p>
<p>The final oddments that don't fit into normal shell globbing are forms
with `<code>#</code>'. These also require that <code>EXTENDED_GLOB</code> be set. In the
simplest use, a `<code>#</code>' after a pattern says `match this zero or more
times'. So `<code>(foo|bar)#.c</code>' matches <code>foo.c</code>, <code>bar.c</code>, <code>foofoo.c</code>,
<code>barbar.c</code>, <code>foobarfoo.c</code>, ... With an extra <code>#</code>, the pattern before (or
single character, if it has no special meaning) must match at least
once. The other use of `<code>#</code>' is in a facility called `globbing flags',
which look like `<code>(#X)</code>' where `<code>X</code>' is some letter, possibly followed
by digits. These turn on special features from that point in the pattern
and are one of the newest features of zsh patterns; they will receive
much more space in <ahref="zshguide05.html#subst">chapter 5</a>.</p>
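<p>Just to give the flavour, here is one of the most commonly used globbing
flags, `<code>(#i)</code>', which makes the rest of the pattern case-insensitive;
remember that <code>EXTENDED_GLOB</code> must be set, and the files matched here are
only an example:</p>
<pre><code> % setopt extendedglob
 % print (#i)readme*
 README readme.txt
</code></pre>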
<p><spanid="l60"></span></p>
<h2id="37-redirection-greater-thans-and-less-thans"><aclass="header"href="#37-redirection-greater-thans-and-less-thans">3.7: Redirection: greater-thans and less-thans</a></h2>
<p>Redirection means retrieving input from some other file than the usual
one, or sending output to some other file than the usual one. The
simplest examples of these are `<code><</code>' and `<code>></code>', respectively.</p>
<pre><code> % echo 'This is an announcement' >tempfile
% cat <tempfile >newfile
% cat newfile
This is an announcement
</code></pre>
<p>Here, <code>echo</code> sends its output to the file <code>tempfile</code>; <code>cat</code> took its
input from that file and sent its output --- the same as its input ---
to the file <code>newfile</code>; the second <code>cat</code> takes its input from <code>newfile</code>
and, since its output wasn't redirected, it appeared on the terminal.</p>
<p>The other basic form of redirection is a pipe, using `<code>|</code>'. Some people
loosely refer to all redirections as pipes, but that's rather confusing.
The input and output of a pipe are <em>both</em> programmes, unlike the case
above where one end was a file. You've seen lots of examples already:</p>
<pre><code> echo foo | sed 's/foo/bar/'
</code></pre>
<p>Here, <code>echo</code> sends its output to the programme <code>sed</code>, which substitutes
foo by bar, and sends its own output to standard output. You can chain
together as many pipes as you like; once you've grasped the basic
behaviour of a single pipe, it should be obvious how that works:</p>
<pre><code> echo foo is a word |
sed 's/foo/bar/' |
sed 's/a word/an unword/'
</code></pre>
<p>runs another <code>sed</code> on the output of the first one. (You can actually
type it like that, by the way; the shell knows a pipe symbol can't be at
the end of a command.) In fact, a single <code>sed</code> will suffice:</p>
<pre><code> echo foo is a word |
sed -e 's/foo/bar/' -e 's/a word/an unword/'
</code></pre>
<p>has the same effect in this case.</p>
<p>Obviously, all three forms of redirection only work if the programme in
question expects input from standard input, and sends output to standard
output. You can't do:</p>
<pre><code> echo 'edit me' | vi
</code></pre>
<p>to edit input, since <code>vi</code> doesn't use the input sent to it; it always
deals with files. Most simple UNIX commands can be made to deal with
standard input and output, however. This is a big difference from other
operating systems, where getting programmes to talk to each other in an
automated way can be a good deal harder.</p>
<p>UNIX-like systems refer to different channels such as input, output and
error by `file descriptors', which are small integers. Usually three
are special: 0, standard input; 1, standard output; and 2, standard
error. Bourne-like shells (but not csh-like shells) allow you to refer
to a particular file descriptor, instead of standard input or output, by
putting the integer immediately before the `<code><</code>' or `<code>></code>' (no space is
allowed). What's more, if the `<code><</code>' or `<code>></code>' is followed immediately
by `<code>&</code>', a file descriptor can follow the redirection (the one before
is optional as usual). A common use is:</p>
<pre><code> % echo This message will go to standard error >&2
</code></pre>
<p>The command sends its message to standard output, file descriptor 1. As
usual, `<code>></code>' redirects standard output. This time, however, it is
redirected not to a file, but to file descriptor 2, which is standard
error. Normally this is the same device as standard output, but it can
be redirected completely separately. So:</p>
<pre><code> % { echo A message
cursh> echo An error >&2 } >file
An error
% cat file
A message
</code></pre>
<p>Apologies for the slightly unclear use of the continuation prompt
`<code>cursh></code>': this guide goes into a lot of different formats, and some
are a bit finicky about long lines in preformatted text. As pointed out
above, the `<code>>file</code>' here will redirect all output from the stuff in
braces, just as if it were a single command. However, the `<code>>&2</code>'
inside redirects the output of the second <code>echo</code> to standard error.
Since this wasn't redirected, it goes straight to the terminal.</p>
<p>Note the form in braces in the previous example --- I'm going to use
that in a few more examples. It simply sends something to standard
output, and something else to standard error; that's its only use. Apart
from that, you can treat the bit in braces as a black box --- anything
which can produce both sorts of output.</p>
<p>Sometimes you want to redirect both at once. The standard Bourne-like
way of doing this is:</p>
<pre><code> % { echo A message
cursh> echo An error >&2 } >file 2>&1
</code></pre>
<p>The `<code>>file</code>' redirects standard output from the <code>{</code><em>...</em><code>}</code> to the
file; the following <code>2>&1</code> redirects standard error to wherever standard
output happens to be at that point, which is the same file. This allows
you to copy two file descriptors to the same place. Note that the order
is important; if you swapped the two around, `<code>2>&1</code>' would copy
standard error to the initial destination of standard output, which is
the terminal, before it got around to redirecting standard output.</p>
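<p>You can see the effect of getting the order wrong by swapping the two
redirections around:</p>
<pre><code> % { echo A message
 cursh> echo An error >&2 } 2>&1 >file
 An error
 % cat file
 A message
</code></pre>
<p>Standard error was copied to the terminal, where standard output was still
pointing, and only then was standard output sent to <code>file</code>.</p>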
<p>Zsh has a shorthand for this borrowed from csh-like shells:</p>
<pre><code> % { echo A message
cursh> echo An error >&2 } >&file
</code></pre>
<p>is exactly equivalent to the form in the previous paragraph, copying
standard output and standard error to the same file. There is obviously
a clash of syntax with the descriptor-copying mechanism, but if you
don't have files whose names are numbers you won't run into it. Note
that csh-like shells don't have the descriptor-copying mechanism: the
simple `<code>>&</code>' and the same thing with pipes are the only uses of `<code>&</code>'
for redirections, and it's not possible there to refer to particular
file descriptors.</p>
<p>To copy standard error to a pipe, there are also two forms:</p>
<pre><code> % { echo A message
cursh> echo An error >&2 } 2>&1 | sed -e 's/A/I/'
I message
In error
% { echo A message
cursh> echo An error >&2 } |& sed -e 's/A/I/'
I message
In error
</code></pre>
<p>In the first case, note that the pipe is opened before the other
redirection, so that `<code>2>&1</code>' copies standard error to the pipe, not
the original standard output; you couldn't put that after the pipe in
any case, since it would refer to the `<code>sed</code>' command's output. The
second way is like csh; unfortunately, `<code>|&</code>' has a different meaning
in ksh (start a coprocess), so zsh is incompatible with ksh in this
respect.</p>
<p>You can also close a file descriptor you don't need: the form `<code>2<&-</code>'
will close standard error for the command where it appears.</p>
<p>One thing not always appreciated about redirections is that they can
occur anywhere on the command line, not just at the end.</p>
<pre><code> % >file echo foo
% cat file
foo
</code></pre>
<p><spanid="l63"></span></p>
<h3id="373-appending-here-documents-here-strings-read-write"><aclass="header"href="#373-appending-here-documents-here-strings-read-write">3.7.3: Appending, here documents, here strings, read write</a></h3>
<p>There are various other forms which use multiple `<code>></code>'s and `<code><</code>'s.
First,</p>
<pre><code> % echo foo >file
% echo bar >>file
% cat file
foo
bar
</code></pre>
<p>The `<code>>></code>' appends to the file instead of overwriting it. Note,
however, that if you use this a lot you may find there are neater ways
of doing the same thing. In this example,</p>
<pre><code> % { echo foo
cursh> echo bar } >file
% cat file
foo
bar
</code></pre>
<p>Here, `<code>cursh></code>' is a prompt from the shell that it is waiting for you
to close the `<code>{</code>' construct which executes a set of commands in the
current shell. This construct can have a redirection applied to the
entire sequence of commands: `<code>>file</code>' after the closing brace
therefore redirects the output from both <code>echo</code>s.</p>
<p>In the case of input, doubling the sign has a totally different effect.
The word after the `<code><<</code>' is not a file, but a string which will be used
to mark the end of input. Input is read until a line with only this
string is found:</p>
<pre><code> % sed -e 's/foo/bar/' <<HERE
heredoc> This line has foo in it.
heredoc> There is another foo in this one.
heredoc> HERE
This line has bar in it.
There is another bar in this one.
</code></pre>
<p>The shell prompts you with `<code>heredoc></code>' to tell you it is reading a
`here document', which is how this feature is referred to. When it
finds the final string, in this case `<code>HERE</code>', it passes everything you
have typed as input to the command as if it came from a file. The
command in this case is the stream editor, which has been told to
replace the first `<code>foo</code>' on each line with a `<code>bar</code>'. (Replacing
things with a bar is a familiar experience from the city centre of my
home town, Newcastle upon Tyne.)</p>
<p>So far, the features are standard in Bourne-like shells, but zsh has an
extension to here documents, sometimes referred to as `here strings'.</p>
<pre><code> % sed -e 's/string/nonsense/' \
><<<'This string is the entire document.'
This nonsense is the entire document.
</code></pre>
<p>Note that `<code>></code>' on the second line is a continuation prompt, not part
of the command line; it was just too long for the TeX version of this
document if I didn't split it. This is a shorthand form of `here'
document if you just want to pass a single string to standard input.</p>
<p>The final form uses both symbols: `<code><>file</code>' opens the file for reading
and writing --- but only on standard input. In other words, a programme
can now both read from and write to standard input. This isn't used all
that often, and when you do use it you should remember that you need to
open standard output explicitly to the same file:</p>
<pre><code> % echo test >/tmp/redirtest
% sed 's/e/Z/g' <>/tmp/redirtest 1>&0
% cat /tmp/redirtest
tZtst
</code></pre>
<p>As standard input (the 0) was opened for writing, you can perform the
unusual trick of copying standard output (the 1) into it. This is
generally not a particularly safe way of doing in-place editing,
however, though it seems to work fine with sed. Note that in older
versions of zsh, `<code><></code>' was equivalent to `<code><-></code>', which is a pattern
that matches any number; this was changed quite some time ago.</p>
<p><spanid="l64"></span></p>
<h3id="374-clever-tricks-exec-and-other-file-descriptors"><aclass="header"href="#374-clever-tricks-exec-and-other-file-descriptors">3.7.4: Clever tricks: exec and other file descriptors</a></h3>
<p>All Bourne-like shells have two other features. First, the `command'
<code>exec</code>, which I described above as being used to replace the shell with
the command you give after it, can be used with only redirections after
it. These redirections then apply permanently to the shell itself,
rather than temporarily to a single command. So</p>
<pre><code> exec >file
</code></pre>
<p>makes <code>file</code> the destination for standard output from that point on.
This is most useful in scripts, where it's quite common to want to
change the destination of all output.</p>
<p>The second feature is that you can use file descriptors which haven't
even been opened yet, as long as they are single digits --- in other
words, you can use numbers 3 to 9 for your own purposes. This can be
combined with the previous feature for some quite clever effects:</p>
<pre><code> exec 3>&1
# 3 refers to stdout
exec >file
# stdout goes to `file', 3 untouched
# random commands output to `file'
exec 1>&3
# stdout is now back where it was
exec 3>&-
# file descriptor 3 closed to tidy up
</code></pre>
<p>Here, file descriptor 3 has been used simply as a placeholder to
remember where standard output was while we temporarily divert it. This
is an alternative to the `<code>{</code><em>...</em><code>} >file</code>' trick. Note that you can
put more than one redirection on the <code>exec</code> line: `<code>exec 3>&1 >file</code>'
also works, as long as you keep the order the same.</p>
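<p>Zsh has a further redirection feature, controlled by the option
<code>MULTIOS</code>, which is set by default: you can redirect the same file
descriptor more than once, for example</p>
<pre><code> command-generating-output >file1 >file2
</code></pre>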
<p>where the command's output is copied to both files. This is done by a
process forked off by the shell: it simply sits waiting for input, then
copies it to all the files in its list. There's a problem in all
versions of the shell to date (currently 4.0.6): this process is
asynchronous, so you can't rely on it having finished when the shell
starts executing the next command. In other words, if you look at
<code>file1</code> or <code>file2</code> immediately after the command has finished, they may
not yet contain all the output because the forked process hasn't
finished writing to it.</p>
<p>This is really a bug, but for the time being you will have to live with
it as it's quite complicated to fix in all cases. Multios are most
useful as a shorthand in interactive use, like so much of zsh; in a
script or function it is safer to use <code>tee</code>,</p>
<pre><code> command-generating-output | tee file1 file2
</code></pre>
<p>which does the same thing, but as <code>tee</code> is handled as a synchronous
process <code>file1</code> and <code>file2</code> are guaranteed to be complete when the
pipeline exits.</p>
<p><spanid="l66"></span></p>
<h2id="38-shell-syntax-loops-subshells-and-so-on"><aclass="header"href="#38-shell-syntax-loops-subshells-and-so-on">3.8: Shell syntax: loops, (sub)shells and so on</a></h2>
<p>I've shown plenty of examples of one sort of shell structure already,
the <code>if</code> statement:</p>
<pre><code> if [[ black = white ]]; then
   print Yellow is no colour.
 fi
</code></pre>
<p>The main points are: the `<code>if</code>' itself is followed by some command
whose return status is tested; a `<code>then</code>' follows as a new command; any
number of commands may follow, as complex as you like; the whole
sequence is ended by a `<code>fi</code>' as a command on its own. You can write
the `<code>then</code>' on a new line if you like, I just happen to find it neater
to stick it where it is. If you follow the form here, remember the
semicolon before it; the <code>then</code> must start a separate command. (You can
put another command immediately after the <code>then</code> without a newline or
semicolon, though, although people tend not to.)</p>
<p>The double-bracketed test is by far the most common thing to put here in
zsh, as in ksh, but any command will do; only the status is important.</p>
<pre><code> if true; then
   print This always gets executed
 fi
 if false; then
   print This never gets executed
 fi
</code></pre>
<p>Here, <code>true</code> always returns true (status 0), while <code>false</code> always
returns false (status 1 in zsh, although some versions return status 255
--- anything nonzero will do). So the statements following the <code>print</code>s
are correct.</p>
<p>The <code>if</code> construct can be extended by `<code>elif</code>' and `<code>else</code>':</p>
<pre><code> read var
 if [[ $var = yes ]]; then
   print Read yes
 elif [[ $var = no ]]; then
   print Read no
 else
   print Read something else
 fi
</code></pre>
<p>The extension is pretty straightforward. You can have as many `<code>elif</code>'s
with different tests as you like; the code following the first test to
succeed is executed. If no test succeeded, and there is an `<code>else</code>'
(there doesn't need to be), the code following that is executed. Note
that the form of the `<code>elif</code>' is identical to that of `<code>if</code>',
including the `<code>then</code>', while the else just appears on its own.</p>
<p>The <code>while</code>-loop is quite similar to <code>if</code>. There are two differences:
the syntax uses <code>while</code>, <code>do</code> and <code>done</code> instead of <code>if</code>, <code>then</code> and
<code>fi</code>, and after the loop body is executed (if it is), the test is
evaluated again. The process stops as soon as the test is false. So</p>
<pre><code> i=0
 while (( i++ < 3 )); do
   print $i
 done
</code></pre>
<p>prints 1, then 2, then 3. As with <code>if</code>, the commands in the middle can
be any set of zsh commands, so</p>
<pre><code> i=0
 while (( i++ < 3 )); do
   if (( i & 1 )); then
     print $i is odd
   else
     print $i is even
   fi
 done
</code></pre>
<p>tells you that 1 and 3 are odd while 2 is even. Remember that the
indentation is irrelevant; it is purely there to make the structures
more easy to understand. You can write the code on a single line by
replacing all the newlines with semicolons.</p>
<p>There is also an <code>until</code> loop, which is identical to the <code>while</code> loop
except that the loop is executed until the test is true. `<code>until [[</code><em>...</em>' is equivalent to `<code>while ! [[</code><em>...</em>'.</p>
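<p>For example, the <code>while</code> loop above could equally well have been written
with <code>until</code> by reversing the sense of the test; this prints 1, 2 and 3
just as before:</p>
<pre><code> i=0
 until (( i++ >= 3 )); do
   print $i
 done
</code></pre>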
<p>Next comes the <code>for</code> loop. The normal case can best be demonstrated by
another example:</p>
<pre><code> for f in one two three; do
   print $f
 done
</code></pre>
<p>which prints out `<code>one</code>' on the first iteration, then `<code>two</code>', then
`<code>three</code>'. The <code>f</code> is set to each of the three words in turn, and the
body of the loop executed for each. It is very useful that the words
after the `<code>in</code>' may be anything you would normally have on a shell
command line. So `<code>for f in *; do</code>' will execute the body of the loop
once for each file in the current directory, with the file available as
<code>$f</code>, and you can use arrays or command substitutions or any other kind
of substitution to generate the words to loop over.</p>
<p>The <code>for</code> loop is so useful that the shell allows a shorthand that you
can use on the command line: try</p>
<pre><code> for f in *; print $f
</code></pre>
<p>and you will see the files in the current directory printed out, one per
line. This form, without the <code>do</code> and the <code>done</code>, involves less typing,
but is also less clear, so it is recommended that you only use it
interactively, not in scripts or functions. You can turn the feature off
with <code>NO_SHORT_LOOPS</code>.</p>
<p>The <code>case</code> statement is used to test a pattern against a series of
possibilities until one succeeds. It is really a short way of doing a
series of <code>if</code> and <code>elif</code> tests on the same pattern:</p>
<pre><code> read var
 case $var in
   (yes) print Read yes
   ;;
   (no) print Read no
   ;;
   (*) print Read something else
   ;;
 esac
</code></pre>
<p>is identical to the <code>if</code>/<code>elif</code>/<code>else</code> example above. The <code>$var</code> is
compared against each pattern in turn; if one matches, the code
following that is executed --- then the statement is exited; no further
matches are looked for. Hence the `<code>*</code>' at the end, which can match
anything, acts like the `<code>else</code>' of an <code>if</code> statement.</p>
<p>Note the quirks of the syntax: the pattern to test must appear in
parentheses. For historical reasons, you can miss out the left
parenthesis before the pattern. I haven't done that mainly because
unbalanced parentheses confuse the system I am using for writing this
guide. Also, note the double semicolon: this is the only use of double
semicolons in the shell. That explains the fact that if you type `<code>;;</code>'
on its own the shell will report a `parse error'; it couldn't find a
<code>case</code> to associate it with.</p>
<p>You can also use alternative patterns by separating them with a vertical
bar. Zsh allows alternatives with extended globbing anyway; but this is
actually a separate feature, which is present in other shells which
don't have zsh's extended globbing feature; it doesn't depend on the
<code>EXTENDED_GLOB</code> option:</p>
<pre><code> read var
 case $var in
   (yes|true|1) print Reply was affirmative
   ;;
   (no|false|0) print Reply was negative
   ;;
   (*) print Reply was cobblers
   ;;
 esac
</code></pre>
<p>The first `<code>print</code>' is used if the value of <code>$var</code> read in was
`<code>yes</code>', `<code>true</code>' or `<code>1</code>', and so on. Each of the separate items can
be a pattern, with any of the special characters allowed by zsh, this
time depending on the setting of the option <code>EXTENDED_GLOB</code>.</p>
<p>The <code>select</code> loop is not used all that often, in my experience. It is
only useful with interactive input (though the code may certainly appear
in a script or function):</p>
<pre><code> select var in earth air fire water; do
   print You selected $var
 done
</code></pre>
<p>This prints a menu; you must type 1, 2, 3 or 4 to select the
corresponding item; then the body of the loop is executed with <code>$var</code>
set to the value in the list corresponding to the number. To exit the
loop hit the break key (usually <code>^G</code>) or end of file (usually <code>^D</code>: the
feature is so infrequently used that currently there is a bug in the
shell that this tells you to use `<code>exit</code>' to exit, which is nonsense).
If the user entered a bogus value, then the loop is executed with <code>$var</code>
set to the empty string, though the actual input can be retrieved from
<code>$REPLY</code>. Note that the prompt printed for the user input is <code>$PROMPT3</code>,
the only use of this parameter in the shell: all normal prompt
substitutions are available.</p>
<p>There is one final type of loop which is special to zsh, unlike the
others above. This is `<code>repeat</code>'. It can be used two ways:</p>
<pre><code> % repeat 3 print Hip Hip Hooray
Hip Hip Hooray
Hip Hip Hooray
Hip Hip Hooray
</code></pre>
<p>Here, the first word after <code>repeat</code> is a count, which could be a
variable as normal substitutions are performed. The rest of the line (or
until the first semicolon) is a command to repeat; it is executed
identically each time.</p>
<p>The second form is a fully fledged loop, just like <code>while</code>:</p>
<pre><code> % repeat 3; do
repeat> print Hip Hip Hooray
repeat> done
Hip Hip Hooray
Hip Hip Hooray
Hip Hip Hooray
</code></pre>
<p>which has the identical effect to the previous one. The `<code>repeat></code>' is
the shell's prompt to show you that it is parsing the contents of a
`<code>repeat</code>' loop.</p>
<p><spanid="l69"></span></p>
<h3id="383-subshells-and-current-shell-constructs"><aclass="header"href="#383-subshells-and-current-shell-constructs">3.8.3: Subshells and current shell constructs</a></h3>
<p>More catching up with stuff you've already seen. The expression in
parentheses here:</p>
<pre><code> % (cd ~; ls)
<all the files in my home directory>
% pwd
<where I was before, not necessarily ~>
</code></pre>
<p>is run in a subshell, as if it were a script. The main difference is
that the shell inherits almost everything from the main shell in which
you are typing, including options settings, functions and parameters.
The most important thing it doesn't inherit is probably information
about jobs: if you run <code>jobs</code> in a subshell, you will get no output; you
can't use <code>fg</code> to resume a job in a subshell; you can't use `<code>kill %</code><em>n</em>' to kill a job (though you can still use the process ID); and so
on. By now you should have some feel for the effect of running in a
separate process. Running a command, or set of commands, in a different
directory, as in this example, is one quite common use for this
construct. (In zsh 4.1, you can use <code>jobs</code> in a subshell; it lists the
jobs running in the parent shell; this is because it is very useful to
be able to pipe the output of jobs into some processing loop.)</p>
<p>On the other hand, the expression in braces here:</p>
<pre><code> % {cd ~; ls}
<all the files in my home directory>
% pwd
/home/pws
</code></pre>
<p>is run in the current shell. This is what I was blathering on about in
the section on redirection. Indeed, unless you need some special effect
like redirecting a whole set of commands, you won't use the
current-shell construct. The example here would behave just the same way
if the braces were missing.</p>
<p>As you might expect, the syntax of the subshell and current-shell forms
is very similar. You can use redirection with both, just as with simple
commands, and they can appear in most places where a simple command can
appear:</p>
<pre><code> [[ $test = true ]] && {
   print Hello.
   print Well, this is exciting.
 }
</code></pre>
<p>That would be much clearer using an `<code>if</code>', but it works. For some
reason, you often find expressions of this form in system start-up files
located in the directory <code>/etc/rc.d</code> or, on older systems, in files
whose names begin with `<code>/etc/rc.</code>'. You can even do:</p>
<pre><code> if { foo=bar; [[ $foo = bar ]] }; then
   print yes
 fi
</code></pre>
<p>but that's also pretty gross.</p>
<p>One use for <code>{</code><em>...</em><code>}</code> is to make sure a whole set of commands is
executed at once. For example, if you copy a set of commands from a
script in one window and want them to be run in one go in a shell in
another window, you can do:</p>
<pre><code> % {
cursh> # now paste your commands in here...
...
cursh> }
</code></pre>
<p>and the commands will only be executed when you hit return after the
final `<code>}</code>'. This is also a workaround for some systems where cut and
paste has slightly odd effects due to the way different states of the
terminal are handled. The current-shell construct is a little bit like
an anonymous function, although it doesn't have any of the usual
features of functions --- you can't pass it arguments, and variables
declared inside aren't local to that section of code.</p>
<p><spanid="l70"></span></p>
<h3id="384-subshells-and-current-shells"><aclass="header"href="#384-subshells-and-current-shells">3.8.4: Subshells and current shells</a></h3>
<p>In case you're confused about what happens in the current shell and what
happens in a subshell, here's a summary.</p>
<p>The following are run in the current shell.</p>
<ol>
<li>All shell builtins and anything which looks like one, such as a
precommand modifier and tests with `<code>[[</code>'.</li>
<li>All complex statements and loops such as <code>if</code> and <code>while</code>. Tests and
code inside the block must both be considered separately.</li>
<li>All shell functions.</li>
<li>All files run by `<code>source</code>' or `<code>.</code>' as well as startup files.</li>
<li>The code inside a `<code>{</code><em>...</em><code>}</code>'.</li>
<li>The right hand side of a pipeline: this is guaranteed in zsh, but
don't rely on it for other shells.</li>
<li>All forms of substitution except <code>`</code><em>...</em><code>`</code>, <code>$</code>(<em>...</em>),
<code>=</code>(<em>...</em>), <code><</code>(<em>...</em>) and <code>></code>(<em>...</em>).</li>
</ol>
<p>The following are run in a subshell.</p>
<ol>
<li>All external commands.</li>
<li>Anything on the left of a pipe, i.e. all sections of a pipeline but
the last.</li>
<li>The code inside a `<code></code>(<em>...</em>)'.</li>
<li>Substitutions involving execution of code, i.e. <code>`</code><em>...</em><code>`</code>,
<code>$</code>(<em>...</em>), <code>=</code>(<em>...</em>), <code><</code>(<em>...</em>) and <code>></code>(<em>...</em>). (TCL fans note
that this is different from the `<code>[</code><em>...</em><code>]</code>' command substitution
in that language.)</li>
<li>Anything started in the background with `<code>&</code>' at the end.</li>
<li>Anything which has ever been suspended. This is a little subtle:
suppose you execute a set of commands in the current shell and
suspend it with <code>^Z</code>. Since the shell needs to return you to the
prompt, it forks a subshell to remember the commands it was
executing when you interrupted it. If you use <code>fg</code> or <code>bg</code> to
restart, the commands will stay in the subshell. This is a special
feature of zsh; most shells won't let you interrupt anything in the
current shell like that, though you can still abort it with <code>^C</code>.</li>
</ol>
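<p>Point 6 in the first list is worth a quick demonstration, since it is
something you cannot rely on in every shell: because the last part of a
pipeline runs in the current shell, a <code>read</code> at the end of a pipeline
really does set the variable for the commands that follow:</p>
<pre><code> % echo value | read var
 % print $var
 value
</code></pre>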
<p>With an alias, you can't tell where it will be executed --- you need to
find out what it expands to first. The expansion naturally takes place
in the current shell.</p>
<p>Of course, if for some reason the current set of commands is already
running in a subshell, it doesn't get magically returned to the current
shell --- so a shell builtin on the left hand side of a pipeline is
running in a subshell. However, it doesn't get an extra subshell, as an
external command would. What I mean is:</p>
<pre><code> { print Hello; cat file } |
while read line; print $line; done
</code></pre>
<p>The shell forks, producing a subshell, to execute the left hand side of
the pipeline, and that subshell forks to execute the <code>cat</code> external
command, but nothing else in that set of commands will cause a new
subshell to be created.</p>
<p>(For the curious only: actually, that's not quite true, and I already
pointed this out when I talked about command substitutions: the shell
keeps track of occasions when it is in a subshell and has no more
commands to execute. In this case it will not bother forking to create a
new process for the <code>cat</code>, it will simply replace the subshell which is
not needed any more. This can only happen in simple cases where the
shell has no clearing up to do.)</p>
<p><spanid="l71"></span></p>
<h2id="39-emulation-and-portability"><aclass="header"href="#39-emulation-and-portability">3.9: Emulation and portability</a></h2>
<p>I described the options you need to set for compatibility with ksh in
the previous chapter. Here I'm more interested in the best way of
running ksh scripts and functions.</p>
<p>First, you should remember that because of all zsh's options you can't
assume that a piece of zsh code will simply run a piece of sh or ksh
code without any extra changes. Our old friend <code>SH_WORD_SPLIT</code> is the
most common problem, but there are plenty of others. In addition to
options, there are other differences which simply need to be worked
around. I will list some of them a bit later. Generally speaking, Bourne
shell is simple enough that zsh emulates it pretty well --- although
beware in case you are using bash extensions, since to many Linux users
bash is the nearest approximation to the Bourne shell they ever come
across. Zsh makes no attempt to emulate bash, even though some of bash's
features have been incorporated.</p>
<p>To make zsh emulate ksh or sh as closely as it knows how, there are
various things you can do.</p>
<ol>
<li>
<p>Invoke zsh under the name sh or ksh, as appropriate. You can do this
by creating a symbolic link from zsh to sh or ksh. Then when zsh
starts up all the options will be set appropriately. If you are
starting that shell from another zsh, you can use the feature of zsh
that tricks a programme into thinking it has a different name:
`<code>ARGV0=sh zsh</code>' runs zsh under the name sh, just like the symbolic
link method.</p>
</li>
<li>
<p>Use `<code>emulate ksh</code>' at the top of the script or function you want
to run. In the case of a function, it is better to run `<code>emulate -L ksh</code>' since this makes sure the normal options will be restored when
the function exits; this is irrelevant for a script as the options
cannot be propagated to the process which ran the script. You can
also use the option `<code>-R</code>' after <code>emulate</code>, which forces more
options to be like ksh; these extra options are generally for user
convenience and not relevant to basic syntax, but in some cases you
may want the extra cover provided.</p>
<p>If it's possible the script may already be running under ksh, you
<li>The <code>keyword</code> option does not exist and <code>-k</code> is instead
interactivecomments. (<code>keyword</code> will not be in the next ksh
release either.)</li>
<li>Management of histories in multiple shells is different: the
history list is not saved and restored after each command. The
option <code>SHARE_HISTORY</code> appeared in 3.1.6 and is set in ksh
compatibility mode to remedy this.</li>
<li><code>\</code> does not escape editing chars (use <code>^V</code>).</li>
<li>Not all ksh bindings are set (e.g. <code><ESC>#</code>; try <code><ESC>q</code>).</li>
<li><code>#</code> in an interactive shell is not treated as a comment by
default.</li>
</ul>
</li>
<li>Built-in commands:
<ul>
<li>Some built-ins (<code>r</code>, <code>autoload</code>, <code>history</code>, <code>integer</code> ...) were
aliases in ksh.</li>
<li>There is no built-in command newgrp: use e.g. <code>alias newgrp="exec newgrp"</code></li>
<li><code>jobs</code> has no <code>-n</code> flag.</li>
<li><code>read</code> has no <code>-s</code> flag.</li>
</ul>
</li>
<li>Other idiosyncrasies:
<ul>
<li><code>select</code> always redisplays the list of selections on each loop.</li>
</ul>
</li>
</ul>
<p><spanid="l73"></span></p>
<h3id="392-making-your-own-scripts-and-functions-portable"><aclass="header"href="#392-making-your-own-scripts-and-functions-portable">3.9.2: Making your own scripts and functions portable</a></h3>
<p>There are also problems in making your own scripts and functions
available to other people, who may have different options set.</p>
<p>In the case of functions, it is always best to put `<code>emulate -L zsh</code>'
at the top of the function, which will reset the options to the default
zsh values, and then set any other necessary options. It doesn't take
the shell a great deal of time to process these commands, so try and get
into the habit of putting them in any function you think may be used by
other people. (Completion functions are a special case as the
environment is already standardised --- see <ahref="zshguide06.html#comp">chapter
6</a> for this.)</p>
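<p>As a sketch of the habit being recommended --- the function name and body
here are purely for illustration:</p>
<pre><code> showsrc() {
   emulate -L zsh
   setopt extendedglob
   # the options set above apply only within this function
   print ^*.o
 }
</code></pre>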
<p>The same applies to scripts, since if you run the script without using
the option `<code>-f</code>' to zsh the user's non-interactive startup files will
be run, and in any case the file <code>/etc/zshenv</code> will be run. We urge
system administrators not to set options unconditionally in that file
unless absolutely necessary; but they don't always listen. Hence an
<code>emulate</code> can still save a lot of grief.</p>
<p>Here are some final comments on running scripts: they apply regardless
of the problems of portability, but you should certainly also be aware
of what I was saying in the previous section.</p>
<p>You may be aware that you can force the operating system to run a script
using a particular interpreter by putting `<code>#!</code>' and the path to the
interpreter at the top of the script. For example, a zsh script could
start with</p>
<pre><code> #!/usr/local/bin/zsh
print The arguments are $*
</code></pre>
<p>assuming that zsh lives in the directory <code>/usr/local/bin</code>. Then you can
run the script under its name as if it were an ordinary command. Suppose
the script were called `<code>scriptfile</code>' and in the current directory, and
you want to run it with the arguments `<code>one two forty-three</code>'. First
you must make sure the script is executable:</p>
<pre><code> % chmod +x scriptfile
</code></pre>
<p>and then you can run it with the arguments:</p>
<pre><code> % ./scriptfile one two forty-three
The arguments are one two forty-three
</code></pre>
<p>The shell treats the first line as a comment, since it begins with a
`<code>#</code>', but note it still gets evaluated by the shell; the system simply
looks inside the file to see what's there; it doesn't change the file just
because the first line tells it to execute the shell.</p>
<p>I put the `<code>./</code>' in front to refer to the current directory because I
don't usually have that in my path --- this is for safety, to avoid
running things which happen to have names like commands simply because
they were in the current directory. But many people aren't so paranoid,
and if `<code>.</code>' is in your path, you can omit the `<code>./</code>'. Hence,
obviously, it can be anywhere else in your path: it is searched for as
an ordinary executable.</p>
<p>The shell actually provides this mechanism even on operating systems
(now few and far between in the UNIX world) that don't have the feature
built into them. The way this works is that if the shell found the file,
and it was executable, but running it didn't work, then it will look for
the <code>#!</code>, extract the name following and run (in this example)
`<code>/usr/local/bin/zsh</code><em><path></em>/scriptfile <code>one two forty-three</code>',
where <em><path></em> is the path where the file was found. This is, in fact,
pretty much what the system does if it handles it itself.</p>
<p>Some shells search for scripts using the path when they are given as
filenames at invocation, but zsh happens not to. In other words, `<code>zsh scriptfile</code>' only runs <code>scriptfile</code> in the current directory.</p>
<p>There are two other features you may want to be aware of. Both are down
to the operating system, if that is what is responsible for the `<code>#!</code>'
trick (true of all the most common UNIX-like systems at the moment).
First, you are usually allowed to supply one, but only one, argument or
option in the `<code>#!</code>' line, thus:</p>
<pre><code> #!/usr/local/bin/zsh -f
print other stuff here
</code></pre>
<p>which stops startup files other than <code>/etc/zshenv</code> from being run, but
otherwise works the same as before. If you need more options, you should
combine them in the same word. However, it's usually clearer, for
anything apart from <code>-f</code>, <code>-i</code> (which forces the shell into interactive
mode) and a few other options which need to take effect immediately, to
put a `<code>setopt</code>' line at the start of the body of the script. In a few
versions of zsh, there was an unexpected consequence of the fact that
the line would only be split once: if you accidentally left some spaces
at the end of the line (e.g. `<code>#!/usr/local/bin/zsh -f </code>') they would
be passed down to the shell, which would report an error, which was hard
to interpret. The spaces will still usually be passed down, but the
shell is now smart enough to ignore spaces in an option list.</p>
<p>The second point is that the length of the `<code>#!</code>' line which will be
evaluated is limited. Often the limit is 32 characters in total. That