---
tags:
- bash
- shell
- scripting
- grammar
- syntax
- language
---
# Basic grammar rules of Bash

Bash builds its features on top of a few basic **grammar rules**. The
code you see everywhere, the code you use, is based on those rules.
**This is a very theoretical view**, but if you're interested, it may
help you understand why things look the way they look.

If you don't know the commands used in the following examples, just
trust the explanation.

## Simple Commands

The Bash manual says:

    A simple command is a sequence of optional variable assignments followed by blank-separated words and redirections,
    and terminated by a control operator. The first word specifies the command to be executed, and is passed as argument
    zero. The remaining words are passed as arguments to the invoked command.

Sounds harder than it actually is. It is what you do daily. You enter
simple commands with parameters, and the shell executes them.

Every complex Bash operation can be split into simple commands:

    ls
    ls > list.txt
    ls -l
    LC_ALL=C ls

The last one might not be familiar. It simply adds `LC_ALL=C` to the
environment of the `ls` program. It doesn't affect your current shell.
This also works while calling functions, unless Bash runs in POSIX mode
(in which case it affects your current shell).
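
A minimal sketch of that behaviour, assuming a `bash` binary in `PATH` and a hypothetical variable named `GREETING` (it is not part of the examples above): the prefix assignment reaches only the child process, and the current shell never sees it.

```shell
#!/usr/bin/env bash
# GREETING is a hypothetical variable used only for this demonstration.
unset GREETING

# The prefix assignment is placed in the environment of the child only.
child_value=$(GREETING=hello bash -c 'printf "%s" "$GREETING"')

# Back in the current shell the variable is still unset.
parent_value=${GREETING-unset}

echo "child saw:  $child_value"
echo "parent has: $parent_value"
```

The `${GREETING-unset}` expansion substitutes the word `unset` only when the variable does not exist, which makes the difference visible.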

Every command has an exit code, a type of return status that the shell
can catch and act on. Exit codes range from 0 to 255: 0 means success,
and any other value means that something failed or that there is an
issue to report back to the calling program.
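
As a small illustration (the variable names are made up for this sketch), the special parameter `$?` holds the exit code of the most recently executed command, so it can be stored and inspected:

```shell
#!/usr/bin/env bash
# $? holds the exit code of the most recently executed command.
true
status_true=$?                       # success

# For failing commands, "||" runs its right side, where $? is still the
# exit code of the command that failed.
false     || status_false=$?         # generic failure
(exit 42) || status_custom=$?        # a subshell that exits with a chosen code

echo "true=$status_true false=$status_false custom=$status_custom"
```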

!!! info "info"

    The simple command construct is the **base** for all higher
    constructs. Everything you execute, from pipelines to functions,
    finally ends up in (many) simple commands. That's why Bash only has
    one method to [expand and execute a simple
    command](../syntax/grammar/parser_exec.md).

## Pipelines

!!! warning "FIXME"

    Missing an additional article about pipelines and pipelining

`[time [-p]] [ ! ] command [ | command2 ... ]`

**Don't get confused** by the name "pipeline". It's a grammatical name
for a construct. Such a pipeline isn't necessarily a pair of commands
whose stdout and stdin are connected via a real pipe.

A pipeline is one or more [simple commands](#simple-commands). When
there are several, they are separated by the `|` symbol, which connects
the output of each command to the input of the next. For example:

    ls /etc | wc -l

will execute `ls` on `/etc` and **pipe** the output to `wc`, which will
count the lines generated by the `ls` command. The result is the number
of directory entries in `/etc`.

The last command in the pipeline sets the exit code for the whole
pipeline. This exit code can be "inverted" by prefixing the pipeline
with an exclamation mark: an unsuccessful pipeline will exit
"successful" and vice versa. In this example, the commands in the `if`
stanza will be executed if the pattern `^root:` is **not** found in
`/etc/passwd`:

    if ! grep '^root:' /etc/passwd; then
      echo "No root user defined... eh?"
    fi

Yes, this is also a pipeline (although there is no pipe!), because the
**exclamation mark to invert the exit code** can only be used in a
pipeline. If `grep`'s exit code is 1 (FALSE) (the text was not found),
the leading `!` will "invert" the exit code: the shell sees (and acts
on) exit code 0 (TRUE), and the `then` part of the `if` stanza is
executed. One could say we checked for
"`not grep '^root:' /etc/passwd`".
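
The inversion can be observed directly through `$?`. This is only a sketch with made-up variable names; the last case shows that `!` applies to the pipeline as a whole, not to its first command:

```shell
#!/usr/bin/env bash
# "!" negates the exit status of the pipeline that follows it.
! false
negated_false=$?        # false returns 1, "!" turns it into 0

! true
negated_true=$?         # true returns 0, "!" turns it into 1

# "true | false" ends with status 1 (taken from false, the last command);
# the leading "!" inverts that pipeline status to 0.
! true | false
negated_pipeline=$?

echo "$negated_false $negated_true $negated_pipeline"
```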

The [set option pipefail](../commands/builtin/set.md#attributes)
determines how Bash reports the exit code of a pipeline. If it's set,
the exit code (`$?`) is that of the last (rightmost) command that exited
with a non-zero status, or zero if no command failed. If it's not set,
`$?` always holds the exit code of the last command (as explained
above).
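
A short sketch of the difference, using subshells with fixed exit codes as stand-ins for real commands (the variable names are illustrative only):

```shell
#!/usr/bin/env bash
without_pipefail=unset
with_pipefail=unset
rightmost_failure=unset

# Default behaviour: only the last command counts.
set +o pipefail
false | true
without_pipefail=$?

# With pipefail, a failure anywhere in the pipeline is reported.
set -o pipefail
false | true || with_pipefail=$?

# If several commands fail, the rightmost failing one wins.
(exit 3) | (exit 5) | true || rightmost_failure=$?

set +o pipefail   # restore the default
echo "$without_pipefail $with_pipefail $rightmost_failure"
```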

The shell option `lastpipe` makes Bash execute the last element of a
pipeline in the current shell environment, i.e. not in a subshell. It
only takes effect while job control is not active.
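
The practical effect shows up when the last element of a pipeline modifies a variable. The following is a sketch under the assumption that it runs as a script (where job control is normally off) on a Bash version that knows `lastpipe`; the counters are made-up names:

```shell
#!/usr/bin/env bash
# lastpipe only takes effect while job control is off (the default in scripts).
set +m

shopt -u lastpipe
count_default=0
printf '%s\n' a b c | while read -r line; do
  count_default=$((count_default + 1))
done
# The loop ran in a subshell, so the increments were lost.

shopt -s lastpipe
count_lastpipe=0
printf '%s\n' a b c | while read -r line; do
  count_lastpipe=$((count_lastpipe + 1))
done
# The loop ran in the current shell, so the variable kept its value.

echo "default: $count_default, lastpipe: $count_lastpipe"
```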

There's also an array `PIPESTATUS[]` that is set after a foreground
pipeline is executed. Each element of `PIPESTATUS[]` reports the exit
code of the respective command in the pipeline. Note: (1) it is only set
for foreground pipelines, and (2) for a higher-level structure that is
built up from pipelines, such as a list, `PIPESTATUS[]` holds the exit
statuses of the last pipeline executed.
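
Because every following command overwrites the array, it is usually copied right after the pipeline. A minimal sketch, with subshells standing in for real commands and `statuses` as an illustrative name:

```shell
#!/usr/bin/env bash
(exit 1) | (exit 2) | true

# Copy the array right away: the next command overwrites PIPESTATUS.
statuses=("${PIPESTATUS[@]}")

echo "per-command exit codes: ${statuses[*]}"
echo "exit code of the last command: ${statuses[2]}"
```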

Another thing you can do with pipelines is log their execution time.
Note that **`time` is not a command**; it is part of the pipeline
syntax:

    # time updatedb
    real    3m21.288s
    user    0m3.114s
    sys     0m4.744s
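
One way to see that `time` belongs to the grammar is to ask Bash how it classifies the word. The sketch below also captures the timing report, which the shell itself writes to stderr; the variable names are illustrative, and `TIMEFORMAT='%R'` is used only to reduce the report to the elapsed seconds:

```shell
#!/usr/bin/env bash
# "type -t" reports how Bash classifies a word.
time_kind=$(type -t time)
echo "time is a: $time_kind"

# Because it is syntax, "time" measures the whole pipeline "true | cat".
# The report comes from the shell, so the pipeline is grouped in braces
# and the group's stderr is redirected to capture it.
TIMEFORMAT='%R'
elapsed=$( { time true | cat; } 2>&1 )
echo "elapsed seconds: $elapsed"
```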

## Lists

!!! warning "FIXME"

    Missing an additional article about list operators

A list is a sequence of one or more [pipelines](#pipelines) separated by
one of the operators `;`, `&`, `&&`, or `||`, and optionally terminated
by one of `;`, `&`, or `<newline>`.

=> It's a group of **pipelines** separated or terminated by **tokens**
that all have **different meanings** for Bash.

Your whole Bash script technically is one big single list!

|Operator|Description|
|--------|-----------|
|`<PIPELINE1> <newline> <PIPELINE2>`|Newlines completely separate pipelines. The next pipeline is executed without any checks. (You enter a command and press `<RETURN>`!)|
|`<PIPELINE1> ; <PIPELINE2>`|The semicolon does what `<newline>` does: It separates the pipelines|
|`<PIPELINE1> & <PIPELINE2>`|`<PIPELINE1>` is executed **asynchronously** ("in the background"). If `<PIPELINE2>` follows, it is executed immediately after the asynchronous pipeline starts|
|`<PIPELINE1> && <PIPELINE2>`|`<PIPELINE1>` is executed and **only** if its exit code was 0 (TRUE), then `<PIPELINE2>` is executed (AND-List)|
|`<PIPELINE1>||<PIPELINE2>`|`<PIPELINE1>` is executed and **only** if its exit code was **not** 0 (FALSE), then `<PIPELINE2>` is executed (OR-List)|

**Note:** POSIX calls this construct a "compound list".
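
The operators from the table can be tried out with `true`, `false` and a subshell as stand-ins for real pipelines. This is just a sketch; all variable names are made up:

```shell
#!/usr/bin/env bash
and_result="skipped"
or_result="skipped"
background_status="none"

true  && and_result="ran"      # left side succeeded, so the right side runs
false && and_result="wrong"    # left side failed, so the right side is skipped

false || or_result="ran"       # left side failed, so the right side runs

# "&" starts the pipeline in the background; "wait" collects its exit code.
(exit 7) &
wait "$!" || background_status=$?

echo "and: $and_result, or: $or_result, background: $background_status"
```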

## Compound Commands

See also the [list of compound commands](../syntax/ccmd/intro.md).

There are two forms of compound commands:

- those that form a new syntax element using a list as a "body"
- completely independent syntax elements

Essentially, everything else that's not described in this article.

Compound commands have the following characteristics:

- they **begin** and **end** with a specific keyword or operator (e.g.
  `for ... done`)
- they can be redirected as a whole

See the following table for a short overview (no details - just an
overview):

|Compound command syntax|Description|
|-----------------------|-----------|
|`( <LIST> )`|Execute `<LIST>` in an extra subshell =\> [article](../syntax/ccmd/grouping_subshell.md)|
|`{ <LIST> ; }`|Execute `<LIST>` as a separate group (but not in a subshell) =\> [article](../syntax/ccmd/grouping_plain.md)|
|`(( <EXPRESSION> ))`|Evaluate the arithmetic expression `<EXPRESSION>` =\> [article](../syntax/ccmd/arithmetic_eval.md)|
|`[[ <EXPRESSION> ]]`|Evaluate the conditional expression `<EXPRESSION>` (aka "the new test command") =\> [article](../syntax/ccmd/conditional_expression.md)|
|`for <NAME> in <WORDS> ; do <LIST> ; done`|Executes `<LIST>` while setting the variable `<NAME>` to one of `<WORDS>` on every iteration (classic for-loop) =\> [article](../syntax/ccmd/classic_for.md)|
|`for (( <EXPR1> ; <EXPR2> ; <EXPR3> )) ; do <LIST> ; done`|C-style for-loop (driven by arithmetic expressions) =\> [article](../syntax/ccmd/c_for.md)|
|`select <NAME> in <WORDS> ; do <LIST> ; done`|Provides simple menus =\> [article](../syntax/ccmd/user_select.md)|
|`case <WORD> in <PATTERN>) <LIST> ;; ... esac`|Decisions based on pattern matching - executing `<LIST>` on match =\> [article](../syntax/ccmd/case.md)|
|`if <LIST> ; then <LIST> ; else <LIST> ; fi`|The if clause: makes decisions based on exit codes =\> [article](../syntax/ccmd/if_clause.md)|
|`while <LIST1> ; do <LIST2> ; done`|Execute `<LIST2>` while `<LIST1>` returns TRUE (exit code) =\> [article](../syntax/ccmd/while_loop.md)|
|`until <LIST1> ; do <LIST2> ; done`|Execute `<LIST2>` until `<LIST1>` returns TRUE (exit code) =\> [article](../syntax/ccmd/until_loop.md)|
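
Two of the points above, redirecting a compound command as a whole and the difference between `( <LIST> )` and `{ <LIST> ; }`, can be sketched as follows. The temporary file and the variable names are assumptions made for this example:

```shell
#!/usr/bin/env bash
outfile=$(mktemp)

# One redirection covers everything the loop prints.
for word in one two three; do
  echo "$word"
done > "$outfile"

# The same works for a brace group.
{ echo four; echo five; } >> "$outfile"

line_total=$(( $(wc -l < "$outfile") ))
rm -f "$outfile"

# ( ... ) runs in a subshell, { ...; } runs in the current shell.
scope="original"
( scope="changed in subshell" )
after_subshell=$scope
{ scope="changed in group"; }
after_group=$scope

echo "lines written: $line_total"
echo "after ( ): $after_subshell"
echo "after { }: $after_group"
```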

## Shell Function Definitions

!!! warning "FIXME"

    Missing an additional article about shell functions

A shell function definition makes a [compound
command](#compound-commands) available via a new name. When the function
runs, it has its own "private" set of positional parameters and I/O
descriptors. It acts like a script-within-the-script. Simply stated:
**You've created a new command.**

The definition is easy (one of many possibilities):

`<NAME> () <COMPOUND_COMMAND> <REDIRECTIONS>`

which is usually used with the `{...; }` compound command, and thus
looks like:

    print_help() { echo "Sorry, no help available"; }

As above, a function definition can have any [compound
command](#compound-commands) as a body. Structures like

    countme() for ((x=1;x<=9;x++)); do echo $x; done

are unusual, but perfectly valid, since the for loop construct is a
compound command!
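
To illustrate that other compound commands work as function bodies too, here is a sketch with two hypothetical functions: one uses a `for` loop as its body, the other a subshell, which keeps a `cd` from leaking into the caller:

```shell
#!/usr/bin/env bash
# A for loop as the function body (no braces needed).
count_to_three() for ((x = 1; x <= 3; x++)); do echo "$x"; done

# A subshell as the function body: the "cd" cannot affect the caller.
print_root() ( cd / && pwd )

start_dir=$PWD
counted=$(count_to_three | tr '\n' ' ')
root_dir=$(print_root)
print_root > /dev/null      # call it directly, outside a command substitution
end_dir=$PWD

echo "counted: $counted"
echo "root: $root_dir"
```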

If a **redirection** is specified, it is not performed when the
function is defined. It is performed when the function runs:

    # this will NOT perform the redirection (at definition time)
    f() { echo ok ; } > file

    # NOW the redirection will be performed (during EXECUTION of the function)
    f
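
The timing can be checked by looking for the target file between definition and call. A sketch, assuming a scratch directory from `mktemp -d` and a hypothetical function name `write_ok`:

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
target="$workdir/out.txt"     # hypothetical output file for this demo

write_ok() { echo ok; } > "$target"

# Defining the function did not create the file.
if [ -e "$target" ]; then
  exists_after_definition=yes
else
  exists_after_definition=no
fi

write_ok                      # the redirection happens now
content=$(cat "$target")

rm -rf "$workdir"
echo "exists after definition: $exists_after_definition, content: $content"
```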

Bash allows three equivalent forms of the function definition:

    NAME ()          <COMPOUND_COMMAND> <REDIRECTIONS>
    function NAME () <COMPOUND_COMMAND> <REDIRECTIONS>
    function NAME    <COMPOUND_COMMAND> <REDIRECTIONS>

The space between `NAME` and `()` is optional; usually you see it
written without the space.
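
Written out with concrete (made-up) names, the three forms look like this and all produce ordinary callable functions:

```shell
#!/usr/bin/env bash
form_posix()               { echo "posix form"; }
function form_both ()      { echo "function keyword with parentheses"; }
function form_keyword_only { echo "function keyword only"; }

output_posix=$(form_posix)
output_both=$(form_both)
output_keyword_only=$(form_keyword_only)

printf '%s\n' "$output_posix" "$output_both" "$output_keyword_only"
```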

I suggest using the first form. It's specified in POSIX and all
Bourne-like shells seem to support it.

<u>**Note:**</u> Before version `2.05-alpha1`, Bash only recognized the
definition using curly braces (`name() { ... }`). Other shells allow the
definition using **any** command (not just the compound command set).

To execute a function like a regular shell script, put it together
like this:

    #!/bin/bash
    # Add shebang

    mycmd()
    {
      # this $1 belongs to the function!
      find / -iname "$1"
    }

    # this $1 belongs to the script itself!
    mycmd "$1" # Execute command immediately after defining function

    exit 0
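
The two different `$1` values can be made visible without running `find`. In this sketch, `set --` simulates the script's arguments; `show_first`, `script-arg` and `function-arg` are placeholder names:

```shell
#!/usr/bin/env bash
show_first() {
  # Inside the function, $1 is the function's first argument.
  echo "$1"
}

# Simulate script arguments; "script-arg" is a placeholder value.
set -- script-arg

inside=$(show_first function-arg)
outside=$1                         # the script's own $1 is untouched

echo "inside: $inside, outside: $outside"
```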
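
**Just informational (1):**

For child processes, Bash passes exported function definitions along in
environment variables whose content has the form "*() ....*".

In Bash versions that predate the 2014 Shellshock fixes, something
similar to the following worked without "officially" declaring a
function. Patched versions store exported functions under specially
named variables (such as `BASH_FUNC_testfn%%`), so a plain variable like
this one is no longer imported as a function:

    $ export testfn="() { echo test; }"
    $ bash -c testfn
    test
    $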
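
The supported way to hand a function to a child Bash process is `export -f`. A minimal sketch, assuming `bash` is in `PATH`; the function names are invented for the example:

```shell
#!/usr/bin/env bash
greet() { echo "hello from the child shell"; }
local_only() { echo "never reaches the child"; }

# Mark one function for export so that child Bash processes import it.
export -f greet
child_output=$(bash -c greet)

# A function that was not exported is unknown to the child, which
# reports "command not found" (exit code 127).
not_exported_status=0
bash -c local_only 2>/dev/null || not_exported_status=$?

echo "child output: $child_output"
echo "status for the non-exported function: $not_exported_status"
```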

**Just informational (2):**

It is possible to create function names containing slashes:

    /bin/ls() {
      echo LS FAKE
    }

The elements of this name aren't subject to a path search.

Weird function names should not be used. Quote from the maintainer:

- *It was a mistake to allow such characters in function names
  (`unset` doesn't work to unset them without forcing `-f`, for
  instance). We're stuck with them for backwards compatibility, but I
  don't have to encourage their use.*

## Grammar summary

- a [simple command](#simple-commands) is just a command and its
  arguments
- a [pipeline](#pipelines) is one or more [simple
  commands](#simple-commands), possibly connected by pipes
- a [list](#lists) is one or more [pipelines](#pipelines) connected by
  special operators
- a [compound command](#compound-commands) is a [list](#lists) or a
  special command that forms a new meta-command
- a [function definition](#shell-function-definitions) makes a
  [compound command](#compound-commands) available under a new name,
  with its own positional parameters and I/O descriptors

## Examples for classification

!!! warning "FIXME"

    more...

------------------------------------------------------------------------

<u>A (very) simple command</u>

    echo "Hello world..."

<u>All of the following are simple commands</u>

    x=5

    >tmpfile

    {x}<"$x" _=${x=<(echo moo)} <&0$(cat <&"$x" >&2)

------------------------------------------------------------------------

<u>A common compound command</u>

    if [ -d /data/mp3 ]; then
      cp mymusic.mp3 /data/mp3
    fi

- the [compound command](#compound-commands) for the `if` clause
- the [list](#lists) that `if` **checks** actually contains the [simple
  command](#simple-commands) `[ -d /data/mp3 ]`
- the [list](#lists) that `if` **executes** contains a simple command
  (`cp mymusic.mp3 /data/mp3`)

Let's invert the exit code of the test command. Only one thing changes:

    if ! [ -d /data/mp3 ]; then
      cp mymusic.mp3 /data/mp3
    fi

- the [list](#lists) that `if` **checks** contains a
  [pipeline](#pipelines) now (because of the `!`)

## See also

- Internal: [List of compound commands](../syntax/ccmd/intro.md)
- Internal: [Parsing and execution of simple
  commands](../syntax/grammar/parser_exec.md)
- Internal: [Quoting and escaping](../syntax/quoting.md)
- Internal: [Introduction to expansions and
  substitutions](../syntax/expansion/intro.md)
- Internal: [Some words about words\...](../syntax/words.md)