diff --git a/docs/syntax/arith_expr.md b/docs/syntax/arith_expr.md new file mode 100644 index 0000000..9d59071 --- /dev/null +++ b/docs/syntax/arith_expr.md @@ -0,0 +1,343 @@ +# Arithmetic expressions + +![](keywords>bash shell scripting math arithmetic C calculation integer) + +Arithmetic expressions are used in several situations: + +- [arithmetic evaluation command](/syntax/ccmd/arithmetic_eval) +- [arithmetic expansion](/syntax/expansion/arith) +- [substring parameter expansion](/syntax/pe#substring_expansion) +- [the `let` builtin command](/commands/builtin/let) +- [C-style for loop](/syntax/ccmd/c_for) +- [array indexing](/syntax/arrays) +- [conditional expressions](/syntax/ccmd/conditional_expression) +- Assignment statements, and arguments to declaration commands of + variables with the integer attribute. + +These expressions are evaluated following some rules described below. +The operators and rules of arithmetic expressions are mainly derived +from the C programming language. + +This article describes the theory behind the syntax and its behaviour. +To get practical examples without big explanations, see [this page on +Greg\'s +wiki](http://mywiki.wooledge.org/BashGuide/CompoundCommands#Arithmetic_Evaluation). + +## Constants + +Mathematical constants are simply fixed values you write: `1`, `3567`, +or `4326`. Bash interprets some notations specially: + +- `0...` (leading zero) is interpreted as an **octal** value +- `0x...` is interpreted as a **hex** value +- `0X...` is also interpreted as a **hex** value +- `BASE#...` is interpreted as a number according to the **specified + base** `BASE`, e.g., `2#00111011` (see below) + +If you have a constant set in a variable, like, + + x=03254 + +this is interpreted as an octal value. 
If you want it to be interpreted +as a decimal value, you need to expand the parameter and specify base +10: + + # this is interpreted as a decimal: + echo $(( 10#$x )) + + # this is interpreted as an octal: + echo $(( x )) + + # this is an invalid digit for base 10 (the "x")...: + echo $(( 10#x )) + +## Different bases + +For a constant, the base can be specified using the form + + BASE#DIGITS... + +Regardless of the specified base, the arithmetic expressions will, if +ever displayed, be **displayed in decimal**! + +When no base is specified, the base 10 (decimal) is assumed, except when +the prefixes mentioned above (octals, hexadecimals) are present. The +specified base can range from 2 to 64. To represent digits in a +specified base greater than 10, characters other than 0 to 9 are needed +(in this order, low =\> high): + +- `0 ... 9` +- `a ... z` +- `A ... Z` +- `@` +- `_` + +Let\'s quickly invent a new number system with base 43 to show what I +mean: + + $ echo $((43#1)) + 1 + + $ echo $((43#a)) + 10 + + $ echo $((43#A)) + 36 + + $ echo $((43#G)) + 42 + + $ echo $((43#H)) + bash: 43#H: value too great for base (error token is "43#H") + +If you have no clue what a base is and why there might be other bases, +and what numbers are and how they are built, then you don\'t need +different bases. + +If you want to convert between the usual bases (octal, decimal, hex), +use [the printf command](/commands/builtin/printf) and its format +strings. + +## Shell variables + +Shell variables can of course be used as operands, even when the integer +attribute is not turned on (by `declare -i NAME`). If the variable is +empty (null) or unset, its reference evaluates to 0. If the variable\'s +value is not a plain number, that value is itself evaluated as an +arithmetic expression, so it can, for example, name another +parameter, e.g.: + + test=string + string=3 + + echo $((test)) + # will output "3"! 
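
As a minimal sketch of the rules above, the following shows that unset and empty variables count as 0, and that a variable whose value is itself an expression is evaluated before it is used. The variable names `not_set`, `empty`, and `expr_text` are made up for this illustration.

```bash
# Hypothetical variable names, chosen only for this illustration.
unset -v not_set
empty=
expr_text='2 + 3'

unset_result=$(( not_set + 1 ))     # an unset variable counts as 0
empty_result=$(( empty + 1 ))       # an empty variable counts as 0
nested_result=$(( expr_text * 2 ))  # the value "2 + 3" is evaluated first

echo "$unset_result $empty_result $nested_result"
```

The last result is 10 rather than 8 because the variable is referenced by name, so its value is evaluated as a whole before the multiplication takes place.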
+ +Of course, in the end, when it finally evaluates to something that is +**not** a valid arithmetic expression (newlines, ordinary text, \...) +then you\'ll get an error. + +When variables are referenced, the notation `1 + $X` is equivalent to +the notation `1 + X`, both are allowed. + +When variables are referenced like `$X`, the rules of [parameter +expansion](/syntax/pe) apply and are performed **before** the expression +is evaluated. Thus, a construct like `${MYSTRING:4:3}` is valid inside +an arithmetic expression. + +## Truth + +Unlike command exit and return codes, arithmetic expressions evaluate to +logical \"true\" when they are not 0. When they are 0, they evaluate to +\"false\". The [arithmetic evaluation compound +command](/syntax/ccmd/arithmetic_eval) reverses the \"truth\" of an +arithmetic expression to match the \"truth\" of command exit codes: + +- if the arithmetic expression brings up a value not 0 (arithmetic + true), it returns 0 (shell true) +- if the arithmetic expression evaluates to 0 (arithmetic false), it + returns 1 (shell false) + +That means, the following `if`-clause will execute the `else`-thread: + + if ((0)); then + echo "true" + else + echo "false" + fi + +## Operators + +### Assignment + + Operator Description + --------------------- ---------------------------------------------------------------------------------------------------- + ` = ` normal assignment + ` *= ` equivalent to ` = * `, see [calculation operators](/syntax/arith_expr#calculations) + ` /= ` equivalent to ` = / `, see [calculation operators](/syntax/arith_expr#calculations) + ` %= ` equivalent to ` = % `, see [calculation operators](/syntax/arith_expr#calculations) + ` += ` equivalent to ` = + `, see [calculation operators](/syntax/arith_expr#calculations) + ` -= ` equivalent to ` = - `, see [calculation operators](/syntax/arith_expr#calculations) + ` <<= ` equivalent to ` = << `, see [bit operations](/syntax/arith_expr#bit_operations) + ` >>= ` equivalent to ` 
= >> `, see [bit operations](/syntax/arith_expr#bit_operations) + ` &= ` equivalent to ` = & `, see [bit operations](/syntax/arith_expr#bit_operations) + ` ^= ` equivalent to ` = ^ `, see [bit operations](/syntax/arith_expr#bit_operations) + ` |= ` equivalent to ` = | `, see [bit operations](/syntax/arith_expr#bit_operations) + +### Calculations + + Operator Description + ---------- -------------------- + `*` multiplication + `/` division + `%` remainder (modulo) + `+` addition + `-` subtraction + `**` exponentiation + +### Comparisons + + Operator Description + ---------- ----------------------------------- + `<` comparison: less than + `>` comparison: greater than + `<=` comparison: less than or equal + `>=` comparison: greater than or equal + `==` equality + `!=` inequality + +### Bit operations + + Operator Description + ---------- ---------------------------- + `~` bitwise negation + `<<` bitwise shifting (left) + `>>` bitwise shifting (right) + `&` bitwise AND + `^` bitwise exclusive OR (XOR) + `|` bitwise OR + +### Logical + + Operator Description + ---------- ------------------ + `!` logical negation + `&&` logical AND + `||` logical OR + +### Misc + + ------------------------------------------------------------------------------------------------- + Operator Description + ---------------------------- -------------------------------------------------------------------- + `id++` **post-increment** of the variable `id` (not required by POSIX(r)) + + `id--` **post-decrement** of the variable `id` (not required by POSIX(r)) + + `++id` **pre-increment** of the variable `id` (not required by POSIX(r)) + + `--id` **pre-decrement** of the variable `id` (not required by POSIX(r)) + + `+` unary plus + + `-` unary minus + + ` ? : ` conditional (ternary) operator\ + \ ? 
\ : \ + + ` , ` expression list + + `( )` subexpression (to force precedence) + ------------------------------------------------------------------------------------------------- + +## Precedence + +The operator precedence is as follows (highest -\> lowest): + +- Postfix (`id++`, `id--`) +- Prefix (`++id`, `--id`) +- Unary minus and plus (`-`, `+`) +- Logical and bitwise negation (`!`, `~`) +- Exponentiation (`**`) +- Multiplication, division, remainder (`*`, `/`, `%`) +- Addition, subtraction (`+`, `-`) +- Bitwise shifts (`<<`, `>>`) +- Comparison (`<`, `>`, `<=`, `>=`) +- (In-)equality (`==`, `!=`) +- Bitwise AND (`&`) +- Bitwise XOR (`^`) +- Bitwise OR (`|`) +- Logical AND (`&&`) +- Logical OR (`||`) +- Ternary operator (` ? : `) +- Assignments (`=`, `*=`, `/=`, `%=`, `+=`, `-=`, `<<=`, `>>=`, `&=`, + `^=`, `|=`) +- Expression list operator (` , `) + +The precedence can be adjusted using subexpressions of the form +`( )` at any time. These subexpressions are always evaluated +first. + +## Arithmetic expressions and return codes + +Bash\'s overall language construct is based on exit codes or return +codes of commands or functions to be executed. `if` statements, `while` +loops, etc., they all take the return codes of commands as conditions. + +Now the problem is: The return codes (0 means \"TRUE\" or \"SUCCESS\", +not 0 means \"FALSE\" or \"FAILURE\") don\'t correspond to the meaning +of the result of an arithmetic expression (0 means \"FALSE\", not 0 +means \"TRUE\"). + +That\'s why all commands and keywords that do arithmetic operations +attempt to **translate** the arithmetical meaning into an equivalent +return code. 
This simply means: + +- if the arithmetic operation evaluates to 0 (\"FALSE\"), the return + code is not 0 (\"FAILURE\") +- if the arithmetic operation evaluates to a value other than 0 + (\"TRUE\"), the return code is 0 (\"SUCCESS\") + +This way, you can easily use arithmetic expressions (along with the +commands or keywords that evaluate them) as conditions for `if`, `while` +and all the others, including `set -e` for autoexit on error: + +``` bash +MY_TEST_FLAG=0 + +if ((MY_TEST_FLAG)); then + echo "MY_TEST_FLAG is ON" +else + echo "MY_TEST_FLAG is OFF" +fi +``` + +Beware that `set -e` can change the +runtime behavior of scripts. + +This non-equivalence of code behavior deserves some attention. Consider +what happens if `v` happens to be zero in the expressions below: + +``` bash +((v += 0)) +echo $? +``` + +1 + +(\"FAILURE\") + +``` bash +v=$((v + 0)) +echo $? +``` + +0 + +(\"SUCCESS\") + +The return code behavior is not equivalent to the arithmetic behavior, +as has been noted. + +A workaround is to use a list operation that returns true, or use the +second assignment style. + +``` bash +((v += 0)) || : +echo $? +``` + +0 + +(\"SUCCESS\") + +This change in code behavior was discovered once the script was run +under `set -e`. + +## Arithmetic expressions in Bash + +- [The C-style for-loop](/syntax/ccmd/c_for) +- [Arithmetic expansion](/syntax/expansion/arith) +- [Arithmetic evaluation compound + command](/syntax/ccmd/arithmetic_eval) +- [The \"let\" builtin command](/commands/builtin/let) diff --git a/docs/syntax/arrays.md b/docs/syntax/arrays.md new file mode 100644 index 0000000..512c631 --- /dev/null +++ b/docs/syntax/arrays.md @@ -0,0 +1,692 @@ +# Arrays + +## Purpose + +An array is a parameter that holds mappings from keys to values. Arrays +are used to store a collection of parameters into a parameter. 
Arrays +(in any programming language) are a useful and common composite data +structure, and one of the most important scripting features in Bash and +other shells. + +Here is an **abstract** representation of an array named `NAMES`. The +indexes go from 0 to 3. + + NAMES + 0: Peter + 1: Anna + 2: Greg + 3: Jan + +Instead of using 4 separate variables, multiple related variables are +grouped together into *elements* of the array, accessible by +their *key*. If you want the second name, ask for index 1 of the array +`NAMES`. + +## Indexing + +Bash supports two different types of ksh-like one-dimensional arrays. +**Multidimensional arrays are not implemented**. + +- *Indexed arrays* use non-negative integer numbers as keys. Indexed + arrays are **always sparse**, meaning indexes are not necessarily + contiguous. All syntax used for both assigning and dereferencing + indexed arrays is an [arithmetic evaluation + context](/syntax/arith_expr) (see [#Referencing](#Referencing)). As + in C and many other languages, the numerical array indexes start at + 0 (zero). Indexed arrays are the most common, useful, and portable + type. Indexed arrays were first introduced to Bourne-like shells by + ksh88. Similar, partially compatible syntax was inherited by many + derivatives including Bash. Indexed arrays always carry the `-a` + attribute. +- *Associative arrays* (sometimes known as a \"hash\" or \"dict\") use + arbitrary nonempty strings as keys. In other words, associative + arrays allow you to look up a value from a table based upon its + corresponding string label. **Associative arrays are always + unordered**, they merely *associate* key-value pairs. If you + retrieve multiple values from the array at once, you can\'t count on + them coming out in the same order you put them in. 
Associative + arrays always carry the `-A` attribute, and unlike indexed arrays, + Bash requires that they always be declared explicitly (as indexed + arrays are the default, see [declaration](#Declaration)). + Associative arrays were first introduced in ksh93, and similar + mechanisms were later adopted by Zsh and Bash version 4. These three + are currently the only POSIX-compatible shells with any associative + array support. + +## Syntax + +### Referencing + +To accommodate referring to array variables and their individual +elements, Bash extends the parameter naming scheme with a subscript +suffix. Any valid ordinary scalar parameter name is also a valid array +name: `[[:alpha:]_][[:alnum:]_]*`. The parameter name may be followed by +an optional subscript enclosed in square brackets to refer to a member +of the array. + +The overall syntax is `arrname[subscript]` - where for indexed arrays, +`subscript` is any valid arithmetic expression, and for associative +arrays, any nonempty string. Subscripts are first processed for +parameter and arithmetic expansions, and command and process +substitutions. When used within parameter expansions or as an argument +to the [unset](commands/builtin/unset) builtin, the special subscripts +`*` and `@` are also accepted which act upon arrays analogously to the +way the `@` and `*` special parameters act upon the positional +parameters. In parsing the subscript, bash ignores any text that follows +the closing bracket up to the end of the parameter name. + +With few exceptions, names of this form may be used anywhere ordinary +parameter names are valid, such as within [arithmetic +expressions](/syntax/arith_expr), [parameter expansions](/syntax/pe), +and as arguments to builtins that accept parameter names. An *array* is +a Bash parameter that has been given the `-a` (for indexed) or `-A` (for +associative) *attributes*. 
However, any regular (non-special or +positional) parameter may be validly referenced using a subscript, +because in most contexts, referring to the zeroth element of an array is +synonymous with referring to the array name without a subscript. + + # "x" is an ordinary non-array parameter. + $ x=hi; printf '%s ' "$x" "${x[0]}"; echo "${_[0]}" + hi hi hi + +The only exceptions to this rule are in a few cases where the array +variable\'s name refers to the array as a whole. This is the case for +the `unset` builtin (see [destruction](#Destruction)) and when declaring +an array without assigning any values (see [declaration](#Declaration)). + +### Declaration + +The following explicitly give variables array attributes, making them +arrays: + + Syntax Description + -------------------- ------------------------------------------------------------------------------------------------------------------------- + `ARRAY=()` Declares an **indexed** array `ARRAY` and initializes it to be empty. This can also be used to empty an existing array. + `ARRAY[0]=` Generally sets the first element of an **indexed** array. If no array `ARRAY` existed before, it is created. + `declare -a ARRAY` Declares an **indexed** array `ARRAY`. An existing array is not initialized. + `declare -A ARRAY` Declares an **associative** array `ARRAY`. This is the one and only way to create associative arrays. + +As an example, and for use below, let\'s declare our `NAMES` array as +described [above](#purpose): + + declare -a NAMES=('Peter' 'Anna' 'Greg' 'Jan') + +### Storing values + +Storing values in arrays is quite as simple as storing values in normal +variables. 
+ + Syntax Description + --------------------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + `ARRAY[N]=VALUE` Sets the element `N` of the **indexed** array `ARRAY` to `VALUE`. **`N` can be any valid [arithmetic expression](/syntax/arith_expr)**. + `ARRAY[STRING]=VALUE` Sets the element indexed by `STRING` of the **associative array** `ARRAY`. + `ARRAY=VALUE` As above. If no index is given, as a default the zeroth element is set to `VALUE`. Careful, this is even true of associative arrays - there is no error if no key is specified, and the value is assigned to string index \"0\". + `ARRAY=(E1\ E2\ ...)` Compound array assignment - sets the whole array `ARRAY` to the given list of elements indexed sequentially starting at zero. The array is unset before assignment unless the += operator is used. When the list is empty (`ARRAY=()`), the array will be set to an empty array. This method obviously does not use explicit indexes. An **associative array** can **not** be set like that! Clearing an associative array using `ARRAY=()` works. + `ARRAY=([X]=E1\ [Y]=E2\ ...)` Compound assignment for indexed arrays with index-value pairs declared individually (here for example `X` and `Y`). X and Y are arithmetic expressions. This syntax can be combined with the above - elements declared without an explicitly specified index are assigned sequentially starting at either the last element with an explicit index, or zero. + `ARRAY=([S1]=E1\ [S2]=E2\ ...)` Individual mass-setting for **associative arrays**. The named indexes (here: `S1` and `S2`) are strings. 
+ `ARRAY+=(E1\ E2\ ...)` Append to ARRAY. + `ARRAY=("${ANOTHER_ARRAY[@]}")` Copy ANOTHER_ARRAY to ARRAY, copying each element. + +As of now, arrays can\'t be exported. + +### Getting values + +\ For completeness and details on several parameter expansion +variants, see the [article about parameter expansion](/syntax/pe) and +check the notes about arrays. \ + + Syntax Description + ----------------------------------------------------------------------- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + `${ARRAY[N]}` Expands to the value of the index `N` in the **indexed** array `ARRAY`. If `N` is a negative number, it\'s treated as the offset from the maximum assigned index (can\'t be used for assignment) - 1 + `${ARRAY[S]}` Expands to the value of the index `S` in the **associative** array `ARRAY`. + `"${ARRAY[@]}" ${ARRAY[@]} "${ARRAY[*]}" ${ARRAY[*]}` Similar to [mass-expanding positional parameters](/scripting/posparams#mass_usage), this expands to all elements. If unquoted, both subscripts `*` and `@` expand to the same result, if quoted, `@` expands to all elements individually quoted, `*` expands to all elements quoted as a whole. + `"${ARRAY[@]:N:M}" ${ARRAY[@]:N:M} "${ARRAY[*]:N:M}" ${ARRAY[*]:N:M}` Similar to what this syntax does for the characters of a single string when doing [substring expansion](/syntax/pe#substring_expansion), this expands to `M` elements starting with element `N`. This way you can mass-expand individual indexes. The rules for quoting and the subscripts `*` and `@` are the same as above for the other mass-expansions. 
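
The expansions in the table above can be sketched as follows, reusing the `NAMES` array from earlier. The helper `count_args` is a hypothetical function added only to make the difference between the quoted `@` and `*` subscripts visible.

```bash
NAMES=('Peter' 'Anna' 'Greg' 'Jan')

second=${NAMES[1]}              # a single element
slice=("${NAMES[@]:1:2}")       # two elements, starting at index 1

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

at_count=$(count_args "${NAMES[@]}")    # each element becomes its own word
star_count=$(count_args "${NAMES[*]}")  # all elements are joined into one word

# With a quoted "*", elements are joined using the first character of IFS.
joined=$(IFS=,; echo "${NAMES[*]}")

echo "$second | ${slice[*]} | $at_count | $star_count | $joined"
```

In practice the quoted `@` form is usually the one to reach for when passing array elements on to a command, since it keeps elements containing whitespace intact.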
+ +For clarification: When you use the subscripts `@` or `*` for +mass-expanding, then the behaviour is exactly what it is for `$@` and +`$*` when [mass-expanding the positional +parameters](/scripting/posparams#mass_usage). You should read this +article to understand what\'s going on. + +### Metadata + + -------------------------------------------------------------------------------------------------------------------------------- + Syntax Description + --------------------- ---------------------------------------------------------------------------------------------------------- + `${#ARRAY[N]}` Expands to the **length** of an individual array member at index `N` (**stringlength**) + + `${#ARRAY[STRING]}` Expands to the **length** of an individual associative array member at index `STRING` (**stringlength**) + + `${#ARRAY[@]}`\ Expands to the **number of elements** in `ARRAY` + `${#ARRAY[*]}` + + `${!ARRAY[@]}`\ Expands to the **indexes** in `ARRAY` since BASH 3.0 + `${!ARRAY[*]}` + -------------------------------------------------------------------------------------------------------------------------------- + +### Destruction + +The [unset](commands/builtin/unset) builtin command is used to destroy +(unset) arrays or individual elements of arrays. + + -------------------------------------------------------------------------------------------------- + Syntax Description + -------------------------- ----------------------------------------------------------------------- + `unset -v ARRAY`\ Destroys a complete array + `unset -v ARRAY[@]`\ + `unset -v ARRAY[*]` + + `unset -v ARRAY[N]` Destroys the array element at index `N` + + `unset -v ARRAY[STRING]` Destroys the array element of the associative array at index `STRING` + -------------------------------------------------------------------------------------------------- + +It is best to [explicitly specify +-v](commands/builtin/unset#portability_considerations) when unsetting +variables with unset. 
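
A short sketch of the metadata and destruction syntax from the two tables above, using a throwaway array named `arr` (a name chosen for this example only). The subscripted name is single-quoted so that the shell cannot treat it as a glob pattern.

```bash
arr=(zero one two three)

unset -v 'arr[1]'          # remove one element; the array is now sparse
count=${#arr[@]}           # number of elements that remain
indexes="${!arr[*]}"       # indexes that are still in use
len=${#arr[3]}             # string length of the element at index 3

unset -v arr               # destroy the whole array
after=${#arr[@]}           # an unset array has no elements

echo "$count | $indexes | $len | $after"
```

Note that after removing index 1 the element count (3) no longer matches the highest index (3) plus one, which is why the list of indexes is the safer way to walk a sparse array.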
+ +\ Specifying unquoted array elements as arguments to any +command, such as with the syntax above **may cause [pathname +expansion](/syntax/expansion/globs) to occur** due to the presence of +glob characters. + +Example: You are in a directory with a file named `x1`, and you want to +destroy an array element `x[1]`, with + + unset x[1] + +then pathname expansion will expand to the filename `x1` and break your +processing! + +Even worse, if `nullglob` is set, your array/index will disappear. + +To avoid this, **always quote** the array name and index: + + unset -v 'x[1]' + +This applies generally to all commands which take variable names as +arguments. Single quotes preferred. \ + +## Usage + +### Numerical Index + +Numerical indexed arrays are easy to understand and easy to use. The +[Purpose](#purpose) and [Indexing](#indexing) chapters above more or +less explain all the needed background theory. + +Now, some examples and comments for you. + +Let\'s say we have an array `sentence` which is initialized as follows: + + sentence=(Be liberal in what you accept, and conservative in what you send) + +Since no special code is there to prevent word splitting (no quotes), +every word there will be assigned to an individual array element. When +you count the words you see, you should get 12. Now let\'s see if Bash +has the same opinion: + + $ echo ${#sentence[@]} + 12 + +Yes, 12. Fine. You can take this number to walk through the array. Just +**subtract 1 from the number of elements, and start your walk at 0 +(zero)**: + + ((n_elements=${#sentence[@]}, max_index=n_elements - 1)) + + for ((i = 0; i <= max_index; i++)); do + echo "Element $i: '${sentence[i]}'" + done + +You always have to remember that, it seems newbies have problems +sometimes. Please understand that **numerical array indexing begins at 0 +(zero)**! + +The method above, walking through an array by just knowing its number of +elements, only works for arrays where all elements are set, of course. 
+If one element in the middle is removed, then the calculation is +nonsense, because the number of elements doesn\'t correspond to the +highest used index anymore (we call them \"*sparse arrays*\"). + +Now, suppose that you want to replace your array `sentence` with the +values in the [previously-declared array](#purpose) `NAMES` . You might +think you could just do + + $ unset sentence ; declare -a sentence=NAMES + $ echo ${#sentence[@]} + 1 + # omit calculating max_index as above, and iterate as one-liner + $ for ((i = 0; i < ${#sentence[@]}; i++)); do echo "Element $i: '${sentence[i]}'" ; done + Element 0: 'NAMES' + +Obviously that\'s wrong. What about + + $ unset sentence ; declare -a sentence=${NAMES} + +? Again, wrong: + + $ echo ${#sentence[*]} + 1 + $ for ((i = 0; i < ${#sentence[@]}; i++)); do echo "Element $i: '${sentence[i]}'" ; done + Element 0: 'Peter' + +So what\'s the **right** way? The (slightly ugly) answer is, reuse the +enumeration syntax: + + $ unset sentence ; declare -a sentence=("${NAMES[@]}") + $ echo ${#sentence[@]} + 4 + $ for ((i = 0; i < ${#sentence[@]}; i++)); do echo "Element $i: '${sentence[i]}'" ; done + Element 0: 'Peter' + Element 1: 'Anna' + Element 2: 'Greg' + Element 3: 'Jan' + +### Associative (Bash 4) + +Associative arrays (or *hash tables*) are not much more complicated than +numerical indexed arrays. The numerical index value (in Bash a number +starting at zero) just is replaced with an arbitrary string: + + # declare -A, introduced with Bash 4 to declare an associative array + declare -A sentence + + sentence[Begin]='Be liberal in what' + sentence[Middle]='you accept, and conservative' + sentence[End]='in what you send' + sentence['Very end']=... 
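
As a rough sketch of how such an associative array can be inspected (assuming Bash 4 or later; the key `Other` is a made-up key that is deliberately never assigned), the following counts the keys, checks whether a key exists, and walks over all keys:

```bash
# Requires Bash 4 or later for declare -A.
declare -A sentence
sentence[Begin]='Be liberal in what'
sentence['Very end']='...'

n_keys=${#sentence[@]}

# ${name[key]+word} expands to "word" only if that key is set.
has_begin=${sentence[Begin]+yes}
has_other=${sentence[Other]+yes}   # "Other" was never assigned

# Quoting "${!sentence[@]}" keeps the key 'Very end' as a single word.
# The order in which the keys come out is unspecified.
for key in "${!sentence[@]}"; do
    printf '%s => %s\n' "$key" "${sentence[$key]}"
done
```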
+ +[**Beware:**]{.underline} don\'t rely on the fact that the elements are +ordered in memory like they were declared, it could look like this: + + # output from 'set' command + sentence=([End]="in what you send" [Middle]="you accept, and conservative " [Begin]="Be liberal in what " ["Very end"]="...") + +This effectively means, you can get the data back with +`"${sentence[@]}"`, of course (just like with numerical indexing), but +you can\'t rely on a specific order. If you want to store ordered data, +or re-order data, go with numerical indexes. For associative arrays, you +usually query known index values: + + for element in Begin Middle End "Very end"; do + printf "%s" "${sentence[$element]}" + done + printf "\n" + +**A nice code example:** Checking for duplicate files using an +associative array indexed with the SHA sum of the files: + + # Thanks to Tramp in #bash for the idea and the code + + unset flist; declare -A flist; + while read -r sum fname; do + if [[ ${flist[$sum]} ]]; then + printf 'rm -- "%s" # Same as >%s<\n' "$fname" "${flist[$sum]}" + else + flist[$sum]="$fname" + fi + done < <(find . -type f -exec sha256sum {} +) >rmdups + +### Integer arrays + +Any type attributes applied to an array apply to all elements of the +array. If the integer attribute is set for either indexed or associative +arrays, then values are considered as arithmetic for both compound and +ordinary assignment, and the += operator is modified in the same way as +for ordinary integer variables. + + ~ $ ( declare -ia 'a=(2+4 [2]=2+2 [a[2]]="a[2]")' 'a+=(42 [a[4]]+=3)'; declare -p a ) + declare -ai a='([0]="6" [2]="4" [4]="7" [5]="42")' + +`a[0]` is assigned to the result of `2+4`. `a[2]` gets the result of +`2+2`. The last index in the first assignment is the result of `a[2]`, +which has already been assigned as `4`, and its value is also given +`a[2]`. 
+ +This shows that even though any existing arrays named `a` in the current +scope have already been unset by using `=` instead of `+=` in the +compound assignment, arithmetic variables within keys can self-reference +any elements already assigned within the same compound-assignment. With +integer arrays this also applies to expressions to the right of the `=`. +(See [evaluation order](#evaluation_order), the right side of an +arithmetic assignment is typically evaluated first in Bash.) + +The second compound assignment argument to `declare` uses `+=`, so it +appends after the last element of the existing array rather than +deleting it and creating a new array, so `a[5]` gets `42`. + +Lastly, the element whose index is the value of `a[4]` (`4`), gets `3` +added to its existing value, making `a[4]` == `7`. Note that having the +integer attribute set this time causes `+=` to add, rather than append a +string, as it would for a non-integer array. + +The single quotes force the assignments to be evaluated in the +environment of `declare`. This is important because attributes are only +applied to the assignment after assignment arguments are processed. +Without them the `+=` compound assignment would have been invalid, and +strings would have been inserted into the integer array without +evaluating the arithmetic. A special-case of this is shown in the next +section. + +Bash declaration commands are really keywords in disguise. They +magically parse arguments to determine whether they are in the form of a +valid assignment. If so, they are evaluated as assignments. If not, they +undergo normal argument expansion before being passed to the builtin +which evaluates the resulting string as an assignment (somewhat like +`eval`, but there are differences.) **Todo:** Discuss this in detail. + +### Indirection + +Arrays can be expanded indirectly using the indirect parameter expansion +syntax. 
Parameters whose values are of the form: `name[index]`, +`name[@]`, or `name[*]` when expanded indirectly produce the expected +results. This is mainly useful for passing arrays (especially multiple +arrays) by name to a function. + +This example is an \"isSubset\"-like predicate which returns true if all +key-value pairs of the array given as the first argument to isSubset +correspond to a key-value of the array given as the second argument. It +demonstrates both indirect array expansion and indirect key-passing +without eval using the aforementioned special compound assignment +expansion. + + isSubset() { + local -a 'xkeys=("${!'"$1"'[@]}")' 'ykeys=("${!'"$2"'[@]}")' + set -- "${@/%/[key]}" + + (( ${#xkeys[@]} <= ${#ykeys[@]} )) || return 1 + + local key + for key in "${xkeys[@]}"; do + [[ ${!2+_} && ${!1} == ${!2} ]] || return 1 + done + } + + main() { + # "a" is a subset of "b" + local -a 'a=({0..5})' 'b=({0..10})' + isSubset a b + echo $? # true + + # "a" contains a key not in "b" + local -a 'a=([5]=5 {6..11})' 'b=({0..10})' + isSubset a b + echo $? # false + + # "a" contains an element whose value != the corresponding member of "b" + local -a 'a=([5]=5 6 8 9 10)' 'b=({0..10})' + isSubset a b + echo $? # false + } + + main + +This script is one way of implementing a crude multidimensional +associative array by storing array definitions in an array and +referencing them through indirection. The script takes two keys and +dynamically calls a function whose name is resolved from the array. + + callFuncs() { + # Set up indirect references as positional parameters to minimize local name collisions. + set -- "${@:1:3}" ${2+'a["$1"]' "$1"'["$2"]'} + + # The only way to test for set but null parameters is unfortunately to test each individually. 
+ local x + for x; do + [[ $x ]] || return 0 + done + + local -A a=( + [foo]='([r]=f [s]=g [t]=h)' + [bar]='([u]=i [v]=j [w]=k)' + [baz]='([x]=l [y]=m [z]=n)' + ) ${4+${a["$1"]+"${1}=${!3}"}} # For example, if "$1" is "bar" then define a new array: bar=([u]=i [v]=j [w]=k) + + ${4+${a["$1"]+"${!4-:}"}} # Now just lookup the new array. for inputs: "bar" "v", the function named "j" will be called, which prints "j" to stdout. + } + + main() { + # Define functions named {f..n} which just print their own names. + local fun='() { echo "$FUNCNAME"; }' x + + for x in {f..n}; do + eval "${x}${fun}" + done + + callFuncs "$@" + } + + main "$@" + +## Bugs and Portability Considerations + +- Arrays are not specified by POSIX. One-dimensional indexed arrays + are supported using similar syntax and semantics by most Korn-like + shells. +- Associative arrays are supported via `typeset -A` in Bash 4, Zsh, + and Ksh93. +- In Ksh93, arrays whose types are not given explicitly are not + necessarily indexed. Arrays defined using compound assignments which + specify subscripts are associative by default. In Bash, associative + arrays can *only* be created by explicitly declaring them as + associative, otherwise they are always indexed. In addition, ksh93 + has several other compound structures whose types can be determined + by the compound assignment syntax used to create them. +- In Ksh93, using the `=` compound assignment operator unsets the + array, including any attributes that have been set on the array + prior to assignment. In order to preserve attributes, you must use + the `+=` operator. However, declaring an associative array, then + attempting an `a=(...)` style compound assignment without specifying + indexes is an error. I can\'t explain this + inconsistency.` $ ksh -c 'function f { typeset -a a; a=([0]=foo [1]=bar); typeset -p a; }; f' # Attribute is lost, and since subscripts are given, we default to associative. 
+ typeset -A a=([0]=foo [1]=bar) + $ ksh -c 'function f { typeset -a a; a+=([0]=foo [1]=bar); typeset -p a; }; f' # Now using += gives us the expected results. + typeset -a a=(foo bar) + $ ksh -c 'function f { typeset -A a; a=(foo bar); typeset -p a; }; f' # On top of that, the reverse does NOT unset the attribute. No idea why. + ksh: f: line 1: cannot append index array to associative array a + ` +- Only Bash and mksh support compound assignment with mixed explicit + subscripts and automatically incrementing subscripts. In ksh93, in + order to specify individual subscripts within a compound assignment, + all subscripts must be given (or none). Zsh doesn\'t support + specifying individual subscripts at all. +- Appending to a compound assignment is a fairly portable way to + append elements after the last index of an array. In Bash, this also + sets append mode for all individual assignments within the compound + assignment, such that if a lower subscript is specified, subsequent + elements will be appended to previous values. In ksh93, it causes + subscripts to be ignored, forcing appending everything after the + last element. (Appending has different meaning due to support for + multi-dimensional arrays and nested compound datastructures.) + ` $ ksh -c 'function f { typeset -a a; a+=(foo bar baz); a+=([3]=blah [0]=bork [1]=blarg [2]=zooj); typeset -p a; }; f' # ksh93 forces appending to the array, disregarding subscripts + typeset -a a=(foo bar baz '[3]=blah' '[0]=bork' '[1]=blarg' '[2]=zooj') + $ bash -c 'function f { typeset -a a; a+=(foo bar baz); a+=(blah [0]=bork blarg zooj); typeset -p a; }; f' # Bash applies += to every individual subscript. + declare -a a='([0]="foobork" [1]="barblarg" [2]="bazzooj" [3]="blah")' + $ mksh -c 'function f { typeset -a a; a+=(foo bar baz); a+=(blah [0]=bork blarg zooj); typeset -p a; }; f' # Mksh does like Bash, but clobbers previous values rather than appending. 
+ set -A a + typeset a[0]=bork + typeset a[1]=blarg + typeset a[2]=zooj + typeset a[3]=blah + ` +- In Bash and Zsh, the alternate value assignment parameter expansion + (`${arr[idx]:=foo}`) evaluates the subscript twice, first to + determine whether to expand the alternate, and second to determine + the index to assign the alternate to. See [evaluation + order](#evaluation_order). + ` $ : ${_[$(echo $RANDOM >&2)1]:=$(echo hi >&2)} + 13574 + hi + 14485 + ` +- In Zsh, arrays are indexed starting at 1 in its default mode. + Emulation modes are required in order to get any kind of + portability. +- Zsh and mksh do not support compound assignment arguments to + `typeset`. +- Ksh88 didn\'t support modern compound array assignment syntax. The + original (and most portable) way to assign multiple elements is to + use the `set -A name arg1 arg2 ...` syntax. This is supported by + almost all shells that support ksh-like arrays except for Bash. + Additionally, these shells usually support an optional `-s` argument + to `set` which performs lexicographic sorting on either array + elements or the positional parameters. Bash has no built-in sorting + ability other than the usual comparison operators. + ` $ ksh -c 'set -A arr -- foo bar bork baz; typeset -p arr' # Classic array assignment syntax + typeset -a arr=(foo bar bork baz) + $ ksh -c 'set -sA arr -- foo bar bork baz; typeset -p arr' # Native sorting! + typeset -a arr=(bar baz bork foo) + $ mksh -c 'set -sA arr -- foo "[3]=bar" "[2]=baz" "[7]=bork"; typeset -p arr' # Probably a bug. I think the maintainer is aware of it. + set -A arr + typeset arr[2]=baz + typeset arr[3]=bar + typeset arr[7]=bork + typeset arr[8]=foo + ` +- Evaluation order for assignments involving arrays varies + significantly depending on context. Notably, the order of evaluating + the subscript or the value first can change in almost every shell + for both expansions and arithmetic variables. See [evaluation + order](#evaluation_order) for details. 
+- Bash 4.1.\* and below cannot use negative subscripts to address + array indexes relative to the highest-numbered index. You must use + the subscript expansion, i.e. `"${arr[@]:(-n):1}"`, to expand the + nth-last element (or the next-highest indexed after `n` if `arr[n]` + is unset). In Bash 4.2, you may expand (but not assign to) a + negative index. In Bash 4.3, ksh93, and zsh, you may both assign and + expand negative offsets. +- ksh93 also has an additional slice notation: `"${arr[n..m]}"` where + `n` and `m` are arithmetic expressions. These are needed for use + with multi-dimensional arrays. +- Assigning or referencing negative indexes in mksh causes + wrap-around. The max index appears to be `UINT_MAX`, which would be + addressed by `arr[-1]`. +- So far, Bash\'s `-v var` test doesn\'t support individual array + subscripts. You may supply an array name to test whether an array is + defined, but can\'t check an element. ksh93\'s `-v` supports both. + Other shells lack a `-v` test. + +### Bugs + +- **Fixed in 4.3** Bash 4.2.\* and earlier considers each chunk of a + compound assignment, including the subscript for globbing. The + subscript part is considered quoted, but any unquoted glob + characters on the right-hand side of the `[...]=` will be clumped + with the subscript and counted as a glob. Therefore, you must quote + anything on the right of the `=` sign. This is fixed in 4.3, so that + each subscript assignment statement is expanded following the same + rules as an ordinary assignment. This also works correctly in ksh93. + `$ touch '[1]=a'; bash -c 'a=([1]=*); echo "${a[@]}"' + [1]=a + ` mksh has a similar but even worse problem in that the entire + subscript is considered a glob. + `$ touch 1=a; mksh -c 'a=([123]=*); print -r -- "${a[@]}"' + 1=a + ` +- **Fixed in 4.3** In addition to the above globbing issue, + assignments preceding \"declare\" have an additional effect on brace + and pathname expansion. 
`$ set -x; foo=bar declare arr=( {1..10} ) + + foo=bar + + declare 'arr=(1)' 'arr=(2)' 'arr=(3)' 'arr=(4)' 'arr=(5)' 'arr=(6)' 'arr=(7)' 'arr=(8)' 'arr=(9)' 'arr=(10)' + + $ touch xy=foo + + touch xy=foo + $ declare x[y]=* + + declare 'x[y]=*' + $ foo=bar declare x[y]=* + + foo=bar + + declare xy=foo + ` Each word (the entire assignment) is subject to globbing and brace + expansion. This appears to trigger the same strange expansion mode + as `let`, `eval`, other declaration commands, and maybe more. +- **Fixed in 4.3** Indirection combined with another modifier expands + arrays to a single word. + `$ a=({a..c}) b=a[@]; printf '<%s> ' "${!b}"; echo; printf '<%s> ' "${!b/%/foo}"; echo + + + ` +- **Fixed in 4.3** Process substitutions are evaluated within array + indexes. Zsh and ksh don\'t do this in any arithmetic context. + `# print "moo" + dev=fd=1 _[1<(echo moo >&2)]= + + # Fork bomb + ${dev[${dev='dev[1>(${dev[dev]})]'}]} + ` + +### Evaluation order + +Here are some of the nasty details of array assignment evaluation order. +You can use this [testcase code](https://gist.github.com/ormaaj/4942297) +to generate these results. + + Each testcase prints evaluation order for indexed array assignment + contexts. Each context is tested for expansions (represented by digits) and + arithmetic (letters), ordered from left to right within the expression. 
The + output corresponds to the way evaluation is re-ordered for each shell: + + a[ $1 a ]=${b[ $2 b ]:=${c[ $3 c ]}} No attributes + a[ $1 a ]=${b[ $2 b ]:=c[ $3 c ]} typeset -ia a + a[ $1 a ]=${b[ $2 b ]:=c[ $3 c ]} typeset -ia b + a[ $1 a ]=${b[ $2 b ]:=c[ $3 c ]} typeset -ia a b + (( a[ $1 a ] = b[ $2 b ] ${c[ $3 c ]} )) No attributes + (( a[ $1 a ] = ${b[ $2 b ]:=c[ $3 c ]} )) typeset -ia b + a+=( [ $1 a ]=${b[ $2 b ]:=${c[ $3 c ]}} [ $4 d ]=$(( $5 e )) ) typeset -a a + a+=( [ $1 a ]=${b[ $2 b ]:=c[ $3 c ]} [ $4 d ]=${5}e ) typeset -ia a + + bash: 4.2.42(1)-release + 2 b 3 c 2 b 1 a + 2 b 3 2 b 1 a c + 2 b 3 2 b c 1 a + 2 b 3 2 b c 1 a c + 1 2 3 c b a + 1 2 b 3 2 b c c a + 1 2 b 3 c 2 b 4 5 e a d + 1 2 b 3 2 b 4 5 a c d e + + ksh93: Version AJM 93v- 2013-02-22 + 1 2 b b a + 1 2 b b a + 1 2 b b a + 1 2 b b a + 1 2 3 c b a + 1 2 b b a + 1 2 b b a 4 5 e d + 1 2 b b a 4 5 d e + + mksh: @(#)MIRBSD KSH R44 2013/02/24 + 2 b 3 c 1 a + 2 b 3 1 a c + 2 b 3 c 1 a + 2 b 3 c 1 a + 1 2 3 c a b + 1 2 b 3 c a + 1 2 b 3 c 4 5 e a d + 1 2 b 3 4 5 a c d e + + zsh: 5.0.2 + 2 b 3 c 2 b 1 a + 2 b 3 2 b 1 a c + 2 b 1 a + 2 b 1 a + 1 2 3 c b a + 1 2 b a + 1 2 b 3 c 2 b 4 5 e + 1 2 b 3 2 b 4 5 + +## See also + +- [Parameter expansion](/syntax/pe) (contains sections for arrays) +- [classic_for](/syntax/ccmd/classic_for) (contains some examples to + iterate over arrays) +- [declare](/commands/builtin/declare) +- [BashFAQ 005 - How can I use array + variables?](http://mywiki.wooledge.org/BashFAQ/005) - A very + detailed discussion on arrays with many examples. +- [BashSheet - Arrays](http://mywiki.wooledge.org/BashSheet#Arrays) - + Bashsheet quick-reference on Greycat\'s wiki. + +\
vim: set fenc=utf-8 ff=unix ts=4 sts=4 sw=4 ft=dokuwiki et +wrap lbr: \ diff --git a/docs/syntax/basicgrammar.md b/docs/syntax/basicgrammar.md new file mode 100644 index 0000000..ec9b616 --- /dev/null +++ b/docs/syntax/basicgrammar.md @@ -0,0 +1,332 @@ +# Basic grammar rules of Bash + +![](keywords>bash shell scripting grammar syntax language) + +Bash builds its features on top of a few basic **grammar rules**. The +code you see everywhere, the code you use, is based on those rules. +However, **this is a very theoretical view**, but if you\'re interested, +it may help you understand why things look the way they look. + +If you don\'t know the commands used in the following examples, just +trust the explanation. + +## Simple Commands + +Bash manual says: + + A simple command is a sequence of optional variable assignments followed by blank-separated words and redirections, + and terminated by a control operator. The first word specifies the command to be executed, and is passed as argument + zero. The remaining words are passed as arguments to the invoked command. + +Sounds harder than it actually is. It is what you do daily. You enter +simple commands with parameters, and the shell executes them. + +Every complex Bash operation can be split into simple commands: + + ls + ls > list.txt + ls -l + LC_ALL=C ls + +The last one might not be familiar. That one simply adds \"`LC_ALL=C`\" +to the environment of the `ls` program. It doesn\'t affect your current +shell. This also works while calling functions, unless Bash runs in +POSIX(r) mode (in which case it affects your current shell). + +Every command has an exit code. It\'s a type of return status. The shell +can catch it and act on it. Exit code range is from 0 to 255, where 0 +means success, and the rest mean either something failed, or there is an +issue to report back to the calling program. + +\ The simple command construct is the +**base** for all higher constructs. 
Everything you execute, from +pipelines to functions, finally ends up in (many) simple commands. +That\'s why Bash only has one method to [expand and execute a simple +command](/syntax/grammar/parser_exec). \ + +## Pipelines + +FIXME Missing an additional article about pipelines and pipelining + +`[time [-p]] [ ! ] command [ | command2 ... ]` + +**Don\'t get confused** about the name \"pipeline.\" It\'s a grammatical +name for a construct. Such a pipeline isn\'t necessarily a pair of +commands where stdout/stdin is connected via a real pipe. + +Pipelines are one or more [simple +commands](basicgrammar#simple_commands) (separated by the `|` symbol, +which connects their input and output), for example: + + ls /etc | wc -l + +will execute `ls` on `/etc` and **pipe** the output to `wc`, which will +count the lines generated by the ls command. The result is the number of +directory entries in /etc. + +The last command in the pipeline will set the exit code for the +pipeline. This exit code can be \"inverted\" by prefixing an exclamation +mark to the pipeline: An unsuccessful pipeline will exit \"successful\" +and vice versa. In this example, the commands in the if stanza will be +executed if the pattern \"\^root:\" is **not** found in `/etc/passwd`: + + if ! grep '^root:' /etc/passwd; then + echo "No root user defined... eh?" + fi + +Yes, this is also a pipeline (although there is no pipe!), because the +**exclamation mark to invert the exit code** can only be used in a +pipeline. If `grep`\'s exit code is 1 (FALSE) (the text was not found), +the leading `!` will \"invert\" the exit code, and the shell sees (and +acts on) exit code 0 (TRUE) and the `then` part of the `if` stanza is +executed. One could say we checked for +\"`not grep "^root" /etc/passwd`\". + +The [set option pipefail](/commands/builtin/set#attributes) determines +how Bash reports the exit code of a pipeline. 
If it\'s +set, then the exit code (`$?`) is that of the last command that exits +with a non-zero status; if none fail, it\'s zero. If it\'s not set, then `$?` +always holds the exit code of the last command (as explained above). + +The shell option `lastpipe` will execute the last element in a pipeline +construct in the current shell environment, i.e. not a subshell. + +There\'s also an array `PIPESTATUS[]` that is set after a foreground +pipeline is executed. Each element of `PIPESTATUS[]` reports the exit +code of the respective command in the pipeline. Note: (1) it\'s only set for +foreground pipelines and (2) for a higher-level structure that is built up from +pipelines, like a list, `PIPESTATUS[]` holds the exit status of the last +pipeline command executed. + +Another thing you can do with pipelines is log their execution time. +Note that **`time` is not a command**; it is part of the pipeline +syntax: + + # time updatedb + real 3m21.288s + user 0m3.114s + sys 0m4.744s + +## Lists + +FIXME Missing an additional article about list operators + +A list is a sequence of one or more [pipelines](basicgrammar#pipelines) +separated by one of the operators `;`, `&`, `&&`, or `||`, and +optionally terminated by one of `;`, `&`, or `<newline>`. + +=\> It\'s a group of **pipelines** separated or terminated by **tokens** +that all have **different meanings** for Bash. + +Your whole Bash script technically is one big single list! + + Operator Description + ------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + ` ` Newlines completely separate pipelines. The next pipeline is executed without any checks. (You enter a command and press ``!) + ` ; ` The semicolon does what `` does: It separates the pipelines + ` & ` The pipeline in front of the `&` is executed **asynchronously** (\"in the background\"). 
If a pipeline follows this, it is executed immediately after the async pipeline starts + ` && ` `` is executed and **only** if its exit code was 0 (TRUE), then `` is executed (AND-List) + ` || ` `` is executed and **only** if its exit code was **not** 0 (FALSE), then `` is executed (OR-List) + +**Note:** POSIX calls this construct a \"compound list\". + +## Compound Commands + +See also the [list of compound commands](/syntax/ccmd/intro). + +There are two forms of compound commands: + +- form a new syntax element using a list as a \"body\" +- completely independent syntax elements + +Essentially, everything else that\'s not described in this article. +Compound commands have the following characteristics: + +- they **begin** and **end** with a specific keyword or operator (e.g. + `for ... done`) +- they can be redirected as a whole + +See the following table for a short overview (no details - just an +overview): + + Compound command syntax Description + ------------------------------------------------------------ --------------------------------------------------------------------------------------------------------------------------------------------------------- + `( )` Execute `` in an extra subshell =\> [article](/syntax/ccmd/grouping_subshell) + `{ ; }` Execute `` as separate group (but not in a subshell) =\> [article](/syntax/ccmd/grouping_plain) + `(( ))` Evaluate the arithmetic expression `` =\> [article](/syntax/ccmd/arithmetic_eval) + `[[ ]]` Evaluate the conditional expression `` (aka \"the new test command\") =\> [article](/syntax/ccmd/conditional_expression) + `for in ; do ; done` Executes `` while setting the variable `` to one of `` on every iteration (classic for-loop) =\> [article](/syntax/ccmd/classic_for) + `for (( ; ; )) ; do ; done` C-style for-loop (driven by arithmetic expressions) =\> [article](/syntax/ccmd/c_for) + `select in ; do ; done` Provides simple menus =\> [article](/syntax/ccmd/user_select) + `case in ) ;; ... 
esac` Decisions based on pattern matching - executing `` on match =\> [article](/syntax/ccmd/case) + `if ; then ; else ; fi` The if clause: makes decisions based on exit codes =\> [article](/syntax/ccmd/if_clause) + `while ; do ; done` Execute `` while `` returns TRUE (exit code) =\> [article](/syntax/ccmd/while_loop) + `until ; do ; done` Execute `` until `` returns TRUE (exit code) =\> [article](/syntax/ccmd/until_loop) + +## Shell Function Definitions + +FIXME Missing an additional article about shell functions + +A shell function definition makes a [compound +command](basicgrammar#compound_commands) available via a new name. When +the function runs, it has its own \"private\" set of positional +parameters and I/O descriptors. It acts like a script-within-the-script. +Simply stated: **You\'ve created a new command.** + +The definition is easy (one of many possibilities): + +` () ` + +which is usually used with the `{...; }` compound command, and thus +looks like: + + print_help() { echo "Sorry, no help available"; } + +As above, a function definition can have any [compound +command](basicgrammar#compound_commands) as a body. Structures like + + countme() for ((x=1;x<=9;x++)); do echo $x; done + +are unusual, but perfectly valid, since the for loop construct is a +compound command! + +If **redirection** is specified, the redirection is not performed when +the function is defined. It is performed when the function runs: + + # this will NOT perform the redirection (at definition time) + f() { echo ok ; } > file + + # NOW the redirection will be performed (during EXECUTION of the function) + f + +Bash allows three equivalent forms of the function definition: + + NAME () + function NAME () + function NAME + +The space between `NAME` and `()` is optional, usually you see it +without the space. + +I suggest using the first form. It\'s specified in POSIX and all +Bourne-like shells seem to support it. 
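The three forms can be compared side by side. This is only a minimal sketch: the function names are made up for illustration, and the `function` keyword forms assume the script runs under Bash rather than a strictly POSIX shell.

```shell
#!/usr/bin/env bash
# Three ways to define a function in Bash (illustrative names only).

greet_posix() { echo "hello from $FUNCNAME"; }          # NAME ()           (POSIX form)
function greet_both() { echo "hello from $FUNCNAME"; }  # function NAME ()
function greet_ksh { echo "hello from $FUNCNAME"; }     # function NAME

# All three are called the same way, like any other command.
greet_posix
greet_both
greet_ksh
```

Each call prints the name of the function that ran, which shows that the definition style makes no difference to how the function is invoked.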
+ +[**Note:**]{.underline} Before version `2.05-alpha1`, Bash only +recognized the definition using curly braces (`name() { ... }`); other +shells allow the definition using **any** command (not just the compound +command set). + +To execute a function like a regular shell script you put it together +like this: + + #!/bin/bash + # Add shebang + + mycmd() + { + # this $1 belongs to the function! + find / -iname "$1" + } + + # this $1 belongs to the script itself! + mycmd "$1" # Execute command immediately after defining function + + exit 0 + +**Just informational(1):** + +Internally, for forking, Bash stores function definitions in environment +variables. Variables with the content \"*() \....*\". + +Something similar to the following works without \"officially\" +declaring a function: + + $ export testfn="() { echo test; }" + $ bash -c testfn + test + $ + +Note that this only works in Bash versions that predate the 2014 +\"Shellshock\" fixes; patched versions only import functions from +specially named `BASH_FUNC_name%%` environment variables. + +**Just informational(2):** + +It is possible to create function names containing slashes: + + /bin/ls() { + echo LS FAKE + } + +The elements of this name aren\'t subject to a path search. + +Weird function names should not be used. Quote from the maintainer: + +- * It was a mistake to allow such characters in function names + (\`unset\' doesn\'t work to unset them without forcing -f, for + instance). We\'re stuck with them for backwards compatibility, but I + don\'t have to encourage their use. 
* + +## Grammar summary + +- a [simple command](basicgrammar#simple_commands) is just a command + and its arguments +- a [pipeline](basicgrammar#pipelines) is one or more [simple + commands](basicgrammar#simple_commands), possibly connected by pipes +- a [list](basicgrammar#lists) is one or more + [pipelines](basicgrammar#pipelines) connected by special operators +- a [compound command](basicgrammar#compound_commands) is a + [list](basicgrammar#lists) or a special command that forms a new + meta-command +- a [function definition](basicgrammar#shell_function_definitions) + makes a [compound command](basicgrammar#compound_commands) available + under a new name, with a separate environment + +## Examples for classification + +FIXME more\... + +------------------------------------------------------------------------ + +[A (very) simple command]{.underline} + + echo "Hello world..." + +[All of the following are simple commands]{.underline} + + x=5 + + >tmpfile + + {x}<"$x" _=${x=<(echo moo)} <&0$(cat <&"$x" >&2) + +------------------------------------------------------------------------ + +[A common compound command]{.underline} + + if [ -d /data/mp3 ]; then + cp mymusic.mp3 /data/mp3 + fi + +- the [compound command](basicgrammar#compound_commands) for the `if` + clause +- the [list](basicgrammar#lists) that `if` **checks** actually + contains the [simple command](basicgrammar#simple_commands) + `[ -d /data/mp3 ]` +- the [list](basicgrammar#lists) that `if` **executes** contains a + simple command (`cp mymusic.mp3 /data/mp3`) + +Let\'s invert the test command\'s exit code; only one thing changes: + + if ! 
[ -d /data/mp3 ]; then + cp mymusic.mp3 /data/mp3 + fi + +- the [list](basicgrammar#lists) that `if` **checks** contains a + [pipeline](basicgrammar#pipelines) now (because of the `!`) + +## See also + +- Internal: [List of compound commands](/syntax/ccmd/intro) +- Internal: [Parsing and execution of simple + commands](/syntax/grammar/parser_exec) +- Internal: [Quoting and escaping](/syntax/quoting) +- Internal: [Introduction to expansions and + substitutions](/syntax/expansion/intro) +- Internal: [Some words about words\...](/syntax/words) diff --git a/docs/syntax/ccmd/arithmetic_eval.md b/docs/syntax/ccmd/arithmetic_eval.md new file mode 100644 index 0000000..4be96bf --- /dev/null +++ b/docs/syntax/ccmd/arithmetic_eval.md @@ -0,0 +1,30 @@ +# Arithmetic evaluation (command) + +## Synopsis + + (( )) + +## Description + +This command evaluates the [arithmetic expression](/syntax/arith_expr) +``. + +If the expression evaluates to 0 then the exit code of the expression is +set to 1 (`FALSE`). If the expression evaluates to something else than +0, then the exit code of the expression is set to 0 (`TRUE`). For this +return code mapping, please see [this +section](/syntax/arith_expr#arithmetic_expressions_and_return_codes). + +The functionality basically is equivalent to what the [`let` builtin +command](/commands/builtin/let) does. The arithmetic evaluation compound +command should be preferred. 
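A small sketch of this return-code mapping. The helper name `arith_status` is made up for this illustration, and the `|| status=$?` idiom is only there so the sketch also behaves under `set -e`:

```shell
#!/usr/bin/env bash
# (( )) maps the arithmetic result to an exit code:
# a result of 0 gives exit code 1 (FALSE), anything else gives 0 (TRUE).

arith_status() {
    local status=0
    # Evaluate the expression passed as $1 and capture the exit code of (( )).
    (( $1 )) || status=$?
    echo "$status"
}

arith_status '0'        # result 0        -> prints 1
arith_status '5 - 5'    # result 0        -> prints 1
arith_status '42'       # result non-zero -> prints 0
arith_status '3 > 2'    # comparison true -> prints 0
```

The inversion looks odd at first, but it is what lets `(( ))` be used directly as a condition in `if` and `while`.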
+ +## Examples + +## Portability considerations + +## See also + +- Internal: [arithmetic expressions](/syntax/arith_expr) +- Internal: [arithmetic expansion](/syntax/expansion/arith) +- Internal: [The `let` builtin command](/commands/builtin/let) diff --git a/docs/syntax/ccmd/c_for.md b/docs/syntax/ccmd/c_for.md new file mode 100644 index 0000000..c6bd9e2 --- /dev/null +++ b/docs/syntax/ccmd/c_for.md @@ -0,0 +1,239 @@ +# The C-style for-loop + +## Synopsis + + for (( ; ; )); do + + done + + # as a special case: without semicolon after ((...)) + for (( ; ; )) do + + done + + # alternative, historical and undocumented syntax + for (( ; ; )) { + + } + +## Description + +The C-style for-loop is a [compound +command](syntax/basicgrammar#compound_commands) derived from the +equivalent ksh88 feature, which is in turn derived from the C \"for\" +keyword. Its purpose is to provide a convenient way to evaluate +arithmetic expressions in a loop, plus initialize any required +arithmetic variables. It is one of the main \"loop with a counter\" +mechanisms available in the language. + +The `((;;))` syntax at the top of the loop is not an ordinary +[arithmetic compound command](syntax/ccmd/arithmetic_eval), but is part +of the C-style for-loop\'s own syntax. The three sections separated by +semicolons are [arithmetic expression](/syntax/arith_expr) contexts. +Each time one of the sections is to be evaluated, the section is first +processed for: brace, parameter, command, arithmetic, and process +substitution/expansion as usual for arithmetic contexts. When the loop +is entered for the first time, `` is evaluated, then `` is +evaluated and checked. If `` is true, then the loop body is +executed. After the first and all subsequent iterations, `` is +skipped, `` is evaluated, then `` is evaluated and checked +again. This process continues until `` is false. + +- `` is to **initialize variables** before the first run. +- `` is to **check** for a termination condition. 
This is + always the last section to evaluate prior to leaving the loop. +- `` is to **change** conditions after every iteration. For + example, incrementing a counter. + +:!: If one of these arithmetic expressions in the for-loop is empty, it +behaves as if it would be 1 (**TRUE** in arithmetic context). + +:!: Like all loops (Both types of `for`-loop, `while` and `until`), this +loop can be: + +- Terminated (broken) by the [break](commands/builtin/continuebreak) + builtin, optionally as `break N` to break out of `N` levels of + nested loops. +- Forced immediately to the next iteration using the + [continue](commands/builtin/continuebreak) builtin, optionally as + the `continue N` analog to `break N`. + +The equivalent construct using a [while loop](syntax/ccmd/while_loop) +and the [arithmetic expression compound +command](/syntax/ccmd/arithmetic_eval) would be structured as: + + (( )) + while (( )); do + + (( )) + done + +The equivalent `while` construct isn\'t exactly the same, because both, +the `for` and the `while` loop behave differently in case you use the +[continue](commands/builtin/continuebreak) command. + +### Alternate syntax + +Bash, Ksh93, Mksh, and Zsh also provide an alternate syntax for the +`for` loop - enclosing the loop body in `{...}` instead of +`do ... done`: + + for ((x=1; x<=3; x++)) + { + echo $x + } + +This syntax is **not documented** and shouldn\'t be used. I found the +parser definitions for it in 1.x code, and in modern 4.x code. My guess +is that it\'s there for compatibility reasons. Unlike the other +aforementioned shells, Bash does not support the analogous syntax for +[case..esac](syntax/ccmd/case#portability_considerations). + +### Return status + +The return status is that of the last command executed from ``, or +`FALSE` if any of the arithmetic expressions failed. + +## Alternatives and best practice + +\
TODO: Show some alternate usages involving +functions and local variables for initialization.\ + +## Examples + +### Simple counter + +A simple counter, the loop iterates 101 times (\"0\" to \"100\" are 101 +numbers -\> 101 runs!), and everytime the variable `x` is set to the +current value. + +- It **initializes** `x = 0` +- Before every iteration it **checks** if `x ≤ 100` +- After every iteration it **changes** `x++` + +```{=html} + +``` + for ((x = 0 ; x <= 100 ; x++)); do + echo "Counter: $x" + done + +### Stepping counter + +This is the very same counter (compare it to the simple counter example +above), but the **change** that is made is a `x += 10`. That means, it +will count from 0 to 100, but with a **step of 10**. + + for ((x = 0 ; x <= 100 ; x += 10)); do + echo "Counter: $x" + done + +### Bits analyzer + +This example loops through the bit-values of a Byte, beginning from 128, +ending at 1. If that bit is set in the `testbyte`, it prints \"`1`\", +else \"`0`\" =\> it prints the binary representation of the `testbyte` +value (8 bits). + + #!/usr/bin/env bash + # Example written for http://wiki.bash-hackers.org/syntax/ccmd/c_for#bits_analyzer + # Based on TheBonsai's original. + + function toBin { + typeset m=$1 n=2 x='x[(n*=2)>m]' + for ((x = x; n /= 2;)); do + printf %d $(( m & n && 1)) + done + } + + function main { + [[ $1 == +([0-9]) ]] || return + typeset result + if (( $(ksh -c 'printf %..2d $1' _ "$1") == ( result = $(toBin "$1") ) )); then + printf '%s is %s in base 2!\n' "$1" "$result" + else + echo 'Oops, something went wrong with our calculation.' >&2 + exit 1 + fi + } + + main "${1:-123}" + + # vim: set fenc=utf-8 ff=unix ft=sh : + +\
+ + testbyte=123 + for (( n = 128 ; n >= 1 ; n /= 2 )); do + if (( testbyte & n )); then + printf %d 1 + else + printf %s 0 + fi + done + echo + +\ + +Why that one begins at 128 (highest value, on the left) and not 1 +(lowest value, on the right)? It\'s easier to print from left to +right\... + +We arrive at 128 for `n` through the recursive arithmetic expression +stored in `x`, which calculates the next-greatest power of 2 after `m`. +To show that it works, we use ksh93 to double-check the answer, because +it has a built-in feature for `printf` to print a representation of any +number in an arbitrary base (up to 64). Very few languages have that +ability built-in, even things like Python. + +### Up, down, up, down\... + +This counts up and down from `0` to `${1:-5}`, `${2:-4}` times, +demonstrating more complicated arithmetic expressions with multiple +variables. + + for (( incr = 1, n=0, times = ${2:-4}, step = ${1:-5}; (n += incr) % step || (incr *= -1, --times);)); do + printf '%*s\n' "$((n+1))" "$n" + done + +\ \~ \$ bash \<(xclip -o) 1 + + 2 + 3 + 4 + 5 + 4 + 3 + 2 + +1 0 1 + + 2 + 3 + 4 + 5 + 4 + 3 + 2 + +1 \ + +## Portability considerations + +- C-style for loops aren\'t POSIX. They are available in Bash, ksh93, + and zsh. All 3 have essentially the same syntax and behavior. +- C-style for loops aren\'t available in mksh. + +## Bugs + +- *Fixed in 4.3*. 
~~There appears to be a bug as of Bash 4.2p10 in + which command lists can\'t be distinguished from the for loop\'s + arithmetic argument delimiter (both semicolons), so command + substitutions within the C-style for loop expression can\'t contain + more than one command.~~ + +## See also + +- Internal: [Arithmetic expressions](/syntax/arith_expr) +- Internal: [The classic for-loop](/syntax/ccmd/classic_for) +- Internal: [The while-loop](/syntax/ccmd/while_loop) diff --git a/docs/syntax/ccmd/case.md b/docs/syntax/ccmd/case.md new file mode 100644 index 0000000..06420d0 --- /dev/null +++ b/docs/syntax/ccmd/case.md @@ -0,0 +1,161 @@ +# The case statement + +## Synopsis + + case in + [(] ) ;; # or ;& or ;;& in Bash 4 + [(] ) ;; + [(] | ) ;; + ... + [(] ) [;;] + esac + +## Description + +The `case`-statement can execute commands based on a [pattern +matching](/syntax/pattern) decision. The word `` is matched +against every pattern `` and on a match, the associated +[list](/syntax/basicgrammar#lists) `` is executed. Every +command list is terminated by `;;`. This rule is optional for the very +last command list (i.e., you can omit the `;;` before the `esac`). Every +`` is separated from its associated `` by a `)`, and +is optionally preceded by a `(`. + +Bash 4 introduces two new action terminators. The classic behavior using +`;;` is to execute only the list associated with the first matching +pattern, then break out of the `case` block. The `;&` terminator causes +`case` to also execute the next block without testing its pattern. The +`;;&` operator is like `;;`, except the case statement doesn\'t +terminate after executing the associated list - Bash just continues +testing the next pattern as though the previous pattern didn\'t match. +Using these terminators, a `case` statement can be configured to test +against all patterns, or to share code between blocks, for example. 
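A minimal sketch of how the three terminators interact. The function name `classify` and its patterns are hypothetical, chosen only to make the control flow visible, and the code assumes Bash 4 or later:

```shell
#!/usr/bin/env bash
# Requires Bash 4 or later for ;& and ;;&.

classify() {
    local out=()
    case $1 in
        [0-9]*)           out+=(digit-first) ;;&  # keep testing the remaining patterns
        *[aeiou]*)        out+=(has-vowel)   ;&   # run the next list without testing its pattern
        fallthrough-only) out+=(next-block)  ;;   # classic terminator: leave the case statement
        *)                out+=(default)
    esac
    echo "${out[*]}"
}

classify 9a    # prints: digit-first has-vowel next-block
classify 7     # prints: digit-first default
classify xyz   # prints: default
```

For `9a`, the first list runs and `;;&` resumes pattern testing, the second pattern matches and `;&` falls into the third list untested, and that list's `;;` ends the statement before the `*` pattern is reached.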
+ +The word `` is expanded using *tilde*, *parameter* and *variable +expansion*; *arithmetic*, *command* and *process substitution*; and +*quote removal*. **No word splitting, brace, or pathname expansion is +done**, which means you can leave expansions unquoted without problems: + + var="test word" + + case $var in + ... + esac + +This is similar to the behavior of the [conditional expression command +(\"new test command\")](/syntax/ccmd/conditional_expression) (also no +word splitting for expansions). + +Unlike the C-case-statement, only the matching list and nothing else is +executed. If more patterns match the word, only the first match is +taken. (**Note** the comment about Bash v4 changes above.) + +Multiple `|`-delimited patterns can be specified for a single block. +This is a POSIX-compatible equivalent to the `@(pattern-list)` extglob +construct. + +The `case` statement is one of the most difficult commands to indent +clearly, and people frequently ask about the most \"correct\" style. +Just do your best - there are many variations of indenting style for +`case` and no real agreed-upon best practice. + +## Examples + +Another one of my stupid examples\... + + printf '%s ' 'Which fruit do you like most?' + read -${BASH_VERSION+e}r fruit + + case $fruit in + apple) + echo 'Mmmmh... I like those!' + ;; + banana) + echo 'Hm, a bit awry, no?' + ;; + orange|tangerine) + echo $'Eeeks! I don\'t like those!\nGo away!' + exit 1 + ;; + *) + echo "Unknown fruit - sure it isn't toxic?" + esac + +Here\'s a practical example showing a common pattern involving a `case` +statement. If the first argument is one of a valid set of alternatives, +then perform some sysfs operations under Linux to control a video +card\'s power profile. Otherwise, show a usage synopsis, and print the +current power profile and GPU temperature. 
+ +``` bash +# Set radeon power management +function clk { + typeset base=/sys/class/drm/card0/device + [[ -r ${base}/hwmon/hwmon0/temp1_input && -r ${base}/power_profile ]] || return 1 + + case $1 in + low|high|default) + printf '%s\n' "temp: $(<${base}/hwmon/hwmon0/temp1_input)C" "old profile: $(<${base}/power_profile)" + echo "$1" >${base}/power_profile + echo "new profile: $(<${base}/power_profile)" + ;; + *) + echo "Usage: $FUNCNAME [ low | high | default ]" + printf '%s\n' "temp: $(<${base}/hwmon/hwmon0/temp1_input)C" "current profile: $(<${base}/power_profile)" + esac +} +``` + +A template for experiments with `case` logic, showing shared code +between blocks using `;&`, and the non-short-circuiting `;;&` operator: + +``` bash +#!/usr/bin/env bash + +f() { + local -a "$@" + local x + + for x; do + case $x in + $1) + local "$x"'+=(1)' ;;& + $2) + local "$x"'+=(2)' ;& + $3) + local "$x"'+=(3)' ;; + $1|$2) + local "$x"'+=(4)' + esac + IFS=, local -a "$x"'=("${x}: ${'"$x"'[*]}")' + done + + for x; do + echo "${!x}" + done +} + +f a b c + +# output: +# a: 1,4 +# b: 2,3 +# c: 3 +``` + +## Portability considerations + +- Only the `;;` delimiter is specified by POSIX. +- zsh and mksh use the `;|` control operator instead of Bash\'s `;;&`. + mksh has `;;&` for Bash compatibility (undocumented). +- ksh93 has the `;&` operator, but no `;;&` or equivalent. +- ksh93, mksh, zsh, and posh support a historical syntax where open + and close braces may be used in place of `in` and `esac`: + `case word { x) ...; };`. This is similar to the alternate form Bash + supports for its [for loops](/syntax/ccmd/classic_for), but Bash + doesn\'t support this syntax for `case..esac`.
+ +## See also + +- [POSIX case conditional + construct](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_04_05) diff --git a/docs/syntax/ccmd/classic_for.md b/docs/syntax/ccmd/classic_for.md new file mode 100644 index 0000000..9490af8 --- /dev/null +++ b/docs/syntax/ccmd/classic_for.md @@ -0,0 +1,188 @@ +# The classic for-loop + +## Synopsis + + for ; do + + done + + for in ; do + + done + +alternative, historical and undocumented syntax [^1] + + for ; { + + } + + for in ; { + + } + +## Description + +For every word in ``, one iteration of the loop is performed and +the variable `` is set to the current word. If no \"`in `\" +is present to supply a word list, then the positional parameters +(`"$@"`) are used (the arguments to the script or function). In this +case (and only in this case), the semicolon between the variable name +and the `do` is optional. + +If you use the loop-variable inside the for-loop and it can contain +spaces, you need to quote it, since normal word-splitting procedures +apply. + +:!: Like all loops (both `for`-loops, `while` and `until`), this loop +can be + +- terminated (broken) by the `break` command, optionally as `break N` + to break `N` levels of nested loops +- forced to immediately do the next iteration using the `continue` + command, optionally as `continue N` analogous to `break N` + +Bash knows an alternative syntax for the `for` loop, enclosing the loop +body in `{...}` instead of `do ... done`: + +``` bash +for x in 1 2 3 +{ + echo $x +} +``` + +This syntax is **not documented** and should not be used. I found the +parser definitions for it in 1.x code, and in modern 4.x code. My guess +is that it\'s there for compatibility reasons. This syntax is not +specified by POSIX(R). + +### Return status + +The return status is the one of the last command executed in `` or +`0` (`TRUE`), if the item list `` evaluates to nothing (i.e.: +\"is empty\"!).
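
The return status rule can be checked with a small sketch. Both function names below are hypothetical and exist only for this illustration.

```bash
# status_of_empty_loop: the word list expands to nothing, so the body never runs.
status_of_empty_loop() {
    local -a words=()
    local word
    for word in "${words[@]}"; do
        false
    done
    echo "$?"              # prints 0, since no command of the body was executed
}

# status_of_last_body: the loop status is that of the last command executed in the body.
status_of_last_body() {
    local word
    for word in a b c; do
        [[ $word != c ]]   # fails only in the last iteration
    done
    echo "$?"              # prints 1
}

status_of_empty_loop
status_of_last_body
```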
+ +## Examples + +### Iterate over array elements + +With some array syntax (see [arrays](/syntax/arrays)) you can easily +\"feed\" the for-loop to iterate over all elements in an array (by +mass-expanding all elements): + +``` bash +for element in "${myarray[@]}"; do + echo "Element: $element" +done +``` + +Another way is to mass-expand all used indexes and access the array by +index: + +``` bash +for index in "${!myarray[@]}"; do + echo "Element[$index]: ${myarray[$index]}" +done +``` + +### List positional parameters + +You can use this +[function](/syntax/basicgrammar#shell_function_definitions) to test how +arguments to a command will be interpreted and parsed, and finally used: + +``` bash +argtest() { + n=1 + for arg; do + echo "Argument $((n++)): \"$arg\"" + done +} +``` + +### Loop through a directory + +Since pathname expansion will expand all filenames to separate words, +regardless of spaces, you can use the for-loop to iterate through +filenames in a directory: + +``` bash +for fn in *; do + if [ -h "$fn" ]; then + echo -n "Symlink: " + elif [ -d "$fn" ]; then + echo -n "Dir: " + elif [ -f "$fn" ]; then + echo -n "File: " + else + echo -n "Unknown: " + fi + echo "$fn" +done +``` + +Stupid example, I know ;-) + +### Loop over lines of output + +To be complete: You can change the internal field separator (IFS) to a +newline and thus make a for-loop iterate over lines instead of words: + +``` bash +IFS=$'\n' +for f in $(ls); do + echo "$f" +done +``` + +This is just an example. In *general* + +- it\'s not a good idea to parse `ls(1)` output +- the [while loop](/syntax/ccmd/while_loop) (using the `read` command) + is a better choice to iterate over lines + +### Nested for-loops + +It\'s of course possible to use another for-loop as ``.
Here, +counting from 0 to 99 in a weird way: + +``` bash +for x in 0 1 2 3 4 5 6 7 8 9; do + for y in 0 1 2 3 4 5 6 7 8 9; do + echo $x$y + done +done +``` + +### Loop over a number range + +Beginning in Bash 4, you can also use \"sequence expression\" form of +[brace expansion](/syntax/expansion/brace) syntax when looping over +numbers, and this form does not create leading zeroes unless you ask for +them: + +``` bash +# 100 numbers, no leading zeroes +for x in {0..99}; do + echo $x +done +``` + +``` bash +# Every other number, width 3 +for x in {000..99..2}; do + echo $x +done +``` + +WARNING: the entire list is created before looping starts. If your list +is huge this may be an issue, but no more so than for a glob that +expands to a huge list. + +## Portability considerations + +## See also + +- [c_for](/syntax/ccmd/c_for) + +[^1]: diff --git a/docs/syntax/ccmd/conditional_expression.md b/docs/syntax/ccmd/conditional_expression.md new file mode 100644 index 0000000..b1db4b5 --- /dev/null +++ b/docs/syntax/ccmd/conditional_expression.md @@ -0,0 +1,211 @@ +# The conditional expression + +## Synopsis + + [[ ]] + +## Description + +The conditional expression is meant as the modern variant of the +[classic test command](/commands/classictest). Since it is **not** a +normal command, Bash doesn\'t need to apply the normal commandline +parsing rules like recognizing `&&` as [command +list](/syntax/basicgrammar#lists) operator. + +The testing features basically are the same (see the lists for [classic +test command](/commands/classictest)), with some additions and +extensions. 
+ + ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + Operator Description + -------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------- + `( )` Used to group expressions, to influence precedence of operators + + ` && ` `TRUE` if ``**and**`` are `TRUE` (do **not** use `-a`!) + + ` || ` `TRUE` if ``**or**`` is `TRUE` (do **not** use `-o`!) + + ` == ` `` is checked against the pattern `` - `TRUE` on a match\ + *But note¹, quoting the pattern forces a literal comparison.* + + ` = ` equivalent to the `==` operator + + ` != ` `` is checked against the pattern `` - `TRUE` on **no match** + + ` =~ ` `` is checked against the [extended regular expression](https://en.wikipedia.org/wiki/Regular_expression#POSIX_extended) `` - `TRUE` on a match + + See the [classic test operators](/commands/classictest#file_tests) Do **not** use the `test`-typical operators `-a` and `-o` for AND and OR. + + See also [arithmetic comparisons](/syntax/arith_expr#comparisons) Using `(( ))`, the [arithmetic expression compound command](/syntax/ccmd/arithmetic_eval) + ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + +When the `==` and `!=` operators are used, the string to the right of +the operator is considered a pattern and matched according to the rules +of [Pattern Matching](/syntax/pattern). If the shell option +`nocasematch` is enabled, the match is performed without regard to the +case of alphabetic characters. 
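
A short sketch of how `==` treats its right-hand side, assuming default shell options at the start. The helper name `matches` and the sample file names are invented for this example.

```bash
# matches is a hypothetical helper: succeed if $1 matches the pattern in $2.
matches() {
    [[ $1 == $2 ]]          # unquoted right-hand side: treated as a pattern
}

matches report.txt '*.txt'  && echo 'pattern match'
[[ report.txt == "*.txt" ]] || echo 'quoted right-hand side is compared literally'
matches REPORT.TXT '*.txt'  || echo 'case matters by default'

shopt -s nocasematch
matches REPORT.TXT '*.txt'  && echo 'nocasematch ignores case'
shopt -u nocasematch
```

All four messages are printed. The quoting in the call (`'*.txt'`) only protects the pattern from pathname expansion on the way into the function; inside `[[ ]]` it is the unquoted `$2` that makes it a pattern.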
+ +¹Any part of the pattern may be quoted to force it to be matched as a +literal string. + +When the operators `<` and `>` are used (string collation order), the +test happens using the current locale when the `compat` level is greater +than \"40\". + +Operator precedence (highest =\> lowest): + +- `( )` +- `! ` +- ` && ` +- ` || ` + +Do **not** use the `test`-typical operators `-a` and `-o` for AND and +OR, they are not known to the conditional expression. Instead, use the +operators `&&` and `||`. + +### Word splitting + +[Word splitting](/syntax/expansion/wordsplit) and [pathname +expansion](/syntax/expansion/globs) are not performed in the expression +you give. That means, a variable containing spaces can be used without +quoting: + + sentence="Be liberal in what you accept, and conservative in what you send" + checkme="Be liberal in what you accept, and conservative in what you send" + if [[ $sentence == $checkme ]]; then + echo "Matched...!" + else + echo "Sorry, no match :-(" + fi + +Compare that to the [classic test command](/commands/classictest), where +word splitting is done (because it\'s a normal command, not something +special): + + sentence="Be liberal in what you accept, and conservative in what you send" + checkme="Be liberal in what you accept, and conservative in what you send" + if [ "$sentence" == "$checkme" ]; then + echo "Matched...!" + else + echo "Sorry, no match :-(" + fi + +You need to quote that variable reference in the classic test command, +since (due to the spaces) the word splitting will break it otherwise! + +### Regular Expression Matching + +Using the operator `=~`, the left hand side operand is matched against +the **extended regular expression (ERE)** on the right hand side. + +This is consistent with matching against patterns: Every quoted part of +the regular expression is taken literally, even if it contains regular +expression special characters. 
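
As a hedged illustration of the quoting rule for `=~`, assuming Bash 3.2 or later with no `compat31` setting in effect: a quoted dot only matches a literal dot, while an unquoted dot matches any character. The function names and sample strings are made up.

```bash
# is_dotted: the dot is quoted, so it must be a literal "."
is_dotted() {
    [[ $1 =~ ^[0-9]+"."[0-9]+$ ]]
}

# is_any_sep: the dot is unquoted (it comes from an unquoted variable),
# so it keeps its regex meaning "any single character"
is_any_sep() {
    local re='^[0-9]+.[0-9]+$'
    [[ $1 =~ $re ]]
}

is_dotted 1.5  && echo '1.5 matches the literal-dot regex'
is_dotted 1x5  || echo '1x5 does not match the literal-dot regex'
is_any_sep 1x5 && echo '1x5 matches when the dot is a regex metacharacter'
```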
+ +Best practise is to put the regular expression to match against into a +variable. This is to avoid shell parsing errors on otherwise valid +regular expressions. + + REGEX="^[[:upper:]]{2}[[:lower:]]*$" + + # Test 1 + STRING=Hello + if [[ $STRING =~ $REGEX ]]; then + echo "Match." + else + echo "No match." + fi + # ==> "No match." + + # Test 2 + STRING=HEllo + if [[ $STRING =~ $REGEX ]]; then + echo "Match." + else + echo "No match." + fi + # ==> "Match." + +The interpretation of quoted regular expression special characters can +be influenced by setting the `compat31` and `compat32` shell options +(`compat*` in general). See [shell_options](/internals/shell_options). + +#### The special BASH_REMATCH array variable + +An array variable whose members are assigned by the `=~` binary operator +to the `[[` conditional command. + +The element with index 0 is the portion of the string matching the +entire regular expression. The element with index n is the portion of +the string matching the nth parenthesized subexpression. + +See [BASH_REMATCH](/syntax/shellvars#bash_rematch). + +Example: + + if [[ "The quick, red fox" =~ ^The\ (.*),\ (.*)\ fox$ ]]; then + echo "${BASH_REMATCH[0]} is ${BASH_REMATCH[1]} and ${BASH_REMATCH[2]}."; + fi + + ==> The quick, red fox is quick and red. + +### Behaviour differences compared to the builtin test command + +As of Bash 4.1 alpha, the test primaries \'\<\' and \'\>\' (compare +strings lexicographically) use the current locale settings, while the +same primaries for the builtin test command don\'t. This leads to the +following situation where they behave differently: + + $ ./cond.sh + [[ ' 4' < '1' ]] --> exit 1 + [[ 'step+' < 'step-' ]] --> exit 1 + [ ' 4' \< '1' ] --> exit 0 + [ 'step+' \< 'step-' ] --> exit 0 + +This difference won\'t be aligned. The conditional expression continues to respect +the locale, as introduced with 4.1-alpha; the builtin `test`/`[` command +continues to behave differently.
+ +### Implicit arithmetic context + +When you use a numeric comparison, the arguments are evaluated as an +arithmetic expression. The arithmetic expression must be quoted if it +both contains whitespace and is not the result of an expansion. + + [[ 'i=5, i+=2' -eq 3+4 ]] && echo true # prints true. + +## Examples + +## Portability considerations + +- `[[ ... ]]` functionality isn\'t specified by POSIX(R), though it\'s + a reserved word +- Amongst the major \"POSIX-shell superset languages\" (for lack of a + better term) which do have `[[`, the test expression compound + command is one of the most portable non-POSIX features. Aside + from the `=~` operator, almost every major feature is consistent + between Ksh88, Ksh93, mksh, Zsh, and Bash. Ksh93 also adds a large + number of unique pattern matching features not supported by other + shells including support for several different regex dialects, which + are invoked using a different syntax from Bash\'s `=~`, though `=~` + is still supported by ksh and defaults to ERE. +- As an extension to POSIX ERE, most GNU software supports + backreferences in ERE, including Bash. According to POSIX, only BRE + is supposed to support them. This requires Bash to be linked against + glibc, so it won\'t necessarily work on all platforms. For example, + `$(m='(abc(def))(\1)(\2)'; [[ abcdefabcdefdef =~ $m ]]; printf '<%s> ' $? "${BASH_REMATCH[@]}" )` + will give `<0> <abcdefabcdefdef> <abcdef> <def> <abcdef> <def>`. +- the `=~` (regex) operator was introduced in Bash 3.0, and its + behaviour changed in Bash 3.2: since 3.2, quoted strings and + substrings are matched as literals by default.
+- the behaviour of the `<` and `>` operators (string collation order) + has changed since Bash 4.0 + +## See also + +- Internal: [pattern matching language](/syntax/pattern) +- Internal: [the classic test command](/commands/classictest) +- Internal: [the if-clause](/syntax/ccmd/if_clause) +- [What is the difference between test, \[ and \[\[ + ?](http://mywiki.wooledge.org/BashFAQ/031) - BashFAQ 31 - Greg\'s + wiki. diff --git a/docs/syntax/ccmd/grouping_plain.md b/docs/syntax/ccmd/grouping_plain.md new file mode 100644 index 0000000..070f6fb --- /dev/null +++ b/docs/syntax/ccmd/grouping_plain.md @@ -0,0 +1,71 @@ +# Grouping commands + +## Synopsis + + { ; } + + { + + } + +## Description + +The [list](/syntax/basicgrammar#lists) `` is simply executed in +the **current** shell environment. The list must be terminated with a +**newline** or **semicolon**. For parsing reasons, the curly braces must +be separated from `` by a **semicolon** and **blanks** if they\'re +in the same line! [^1][^2] + +This is known as a **group command**. The return status is the [exit +status (exit code)](/scripting/basics#exit_codes) of the list. 
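
A minimal sketch of both points, the current shell environment and the return status. The variable names are arbitrary, and the subshell form `( ... )` is shown only for contrast.

```bash
counter=0
group_status=0

# The group runs in the current shell; its status is that of the last command (false).
{ counter=$((counter + 1)); false; } || group_status=$?

# A subshell gets a copy of the environment, so this increment is lost.
( counter=$((counter + 1)) )

echo "counter=$counter status=$group_status"   # counter=1 status=1
```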
+ +The input and output **file descriptors** are shared by all commands in the group, so a redirection applied to the group collects the output of every command in it: + + { + echo "PASSWD follows" + cat /etc/passwd + echo + echo "GROUPS follows" + cat /etc/group + } >output.txt + +This compound command is also usually the body of a [function +definition](/syntax/basicgrammar#shell_function_definitions), though not +the only compound command that\'s valid there: + + print_help() { + echo "Options:" + echo "-h This help text" + echo "-f FILE Use config file FILE" + echo "-u USER Run as user USER" + } + +## Examples + +### A Try-Catch block + + try_catch() { + { # Try-block: + eval "$@" + } || + { # Catch-block: + echo "An error occurred" + return 1 + } + } + +## Portability considerations + +## See also + +- [grouping commands in a subshell](/syntax/ccmd/grouping_subshell) + +[^1]: Actually any properly terminated compound command will work + without extra separator (also in some other shells), **example**: + `{ while sleep 1; do echo ZzZzzZ; done }` is valid. But this is not + documented, in fact the documentation explicitly says that a + semicolon or a newline must separate the enclosed list. \-- thanks + `geirha` at Freenode + +[^2]: The main reason is the fact that in shell grammar, the curly + braces are not control operators but reserved words \-- TheBonsai diff --git a/docs/syntax/ccmd/grouping_subshell.md b/docs/syntax/ccmd/grouping_subshell.md new file mode 100644 index 0000000..5fa6dc9 --- /dev/null +++ b/docs/syntax/ccmd/grouping_subshell.md @@ -0,0 +1,35 @@ +# Grouping commands in a subshell + +## Synopsis + + ( ) + +## Description + +The [list](/syntax/basicgrammar#lists) `` is executed in a +separate shell - a subprocess. No changes to the environment (variables +etc\...) are reflected in the \"main shell\". + +## Examples + +Execute a command in a different directory. + +``` bash +echo "$PWD" +( cd /usr; echo "$PWD" ) +echo "$PWD" # Still in the original directory.
+``` + +## Portability considerations + +- The subshell compound command is specified by POSIX. +- Avoid ambiguous syntax. + +``` bash +(((1+1))) # Equivalent to: (( (1+1) )) +``` + +## See also + +- [grouping commands](/syntax/ccmd/grouping_plain) +- [Subshells on Greycat\'s wiki](http://mywiki.wooledge.org/SubShell) diff --git a/docs/syntax/ccmd/if_clause.md b/docs/syntax/ccmd/if_clause.md new file mode 100644 index 0000000..e6bed60 --- /dev/null +++ b/docs/syntax/ccmd/if_clause.md @@ -0,0 +1,91 @@ +# The if-clause + +## Synopsis + + if ; then + + fi + + if ; then + + else + + fi + + if ; then + + elif ; then + + else + + fi + +## Description + +The `if`-clause can control the script\'s flow (what\'s executed) by +looking at the exit codes of other commands. + +All commandsets `` are interpreted as [command +lists](/syntax/basicgrammar#lists), thus they can contain the whole +palette from [simple commands](/syntax/basicgrammar#simple_commands) +over [pipelines](/syntax/basicgrammar#pipelines) to [compound +commands](/syntax/basicgrammar#compound_commands) (and their +combination) as condition. + +### Operation + +The **`if `** commands are executed. If the exit code was 0 (TRUE), +the **`then `** commands are executed and the `if`-clause finishes. +Otherwise, the **`elif `** commands are executed in turn; as soon as +one of them succeeds, its **`then `** commands are executed and the +`if`-clause finishes. If every condition down to the last one fails, the +**`else `** commands are executed. + +Basically, the `elif` clauses are just additional conditions to test +(like a chain of conditions) if the very first condition failed. If all +of the conditions fail, the `else` commands are executed, otherwise the +commands of the first condition that succeeded. + +## Examples + +**Check if a specific user exists in /etc/passwd :-)** + + if grep ^myuser: /etc/passwd >/dev/null 2>&1; then + echo "Yes, it seems I'm real" + else + echo "Uh - am I a ghost?" + fi + +**Mount with check** + + if !
mount /mnt/backup >/dev/null 2>&1; then + echo "FATAL: backup mount failed" >&2 + exit 1 + fi + +**Multiple commands as condition** + +It\'s perfectly valid to do: + + if echo "I'm testing!"; [ -e /some/file ]; then + ... + fi + +The exit code that dictates the condition\'s value is the exit code of +the very last command executed in the condition-list (here: The +`[ -e /some/file ]`) + +**A complete pipe as condition** + +A complete pipe can also be used as condition. It\'s very similar to the +example above (multiple commands): + + if echo "Hello world!" | grep -i hello >/dev/null 2>&1; then + echo "You just said 'hello', yeah?" + fi + +## Portability considerations + +## See also + +- Internal: [the classic test command](/commands/classictest) diff --git a/docs/syntax/ccmd/intro.md b/docs/syntax/ccmd/intro.md new file mode 100644 index 0000000..f26f98a --- /dev/null +++ b/docs/syntax/ccmd/intro.md @@ -0,0 +1,35 @@ +# Bash compound commands + +The main part of Bash\'s syntax are the so-called **compound commands**. +They\'re called like that because they use \"real\" commands ([simple +commands](/syntax/basicgrammar#simple_commands) or +[lists](/syntax/basicgrammar#lists)) and knit some intelligence around +them. That is what the essential \"Bash language\" is made of. + +## Command grouping + +- grouping: [command grouping](grouping_plain) +- grouping again: [command grouping in a subshell](grouping_subshell) + +## Conditional reactions + +Note that conditionals can also be scripted using +[list](/syntax/basicgrammar#lists), which are syntax elements, not +commands. 
+ +- the \"new\" test command: [conditional + expression](conditional_expression) +- if-clause: [conditional branching](if_clause) +- case statement: [pattern-based branching](case) + +## Loops + +- [classic for-loop](classic_for) +- [C-style for-loop](c_for) +- [while loop](while_loop) +- [until loop](until_loop) + +## Misc + +- math: [arithmetic evaluation](arithmetic_eval) +- menus: [user selections](user_select) diff --git a/docs/syntax/ccmd/until_loop.md b/docs/syntax/ccmd/until_loop.md new file mode 100644 index 0000000..78737af --- /dev/null +++ b/docs/syntax/ccmd/until_loop.md @@ -0,0 +1,38 @@ +# The until loop + +## Synopsis + + until ; do + + done + +## Description + +The until-loop is relatively simple in what it does: it executes the +[command list](/syntax/basicgrammar#lists) `` and if the exit +code of it was **not** 0 (FALSE) it executes ``. This happens +again and again until `` returns TRUE. + +This is exactly the opposite of the [while +loop](/syntax/ccmd/while_loop). + +:!: Like all loops (both `for`-loops, `while` and `until`), this loop +can be + +- terminated (broken) by the `break` command, optionally as `break N` + to break `N` levels of nested loops +- forced to immediately do the next iteration using the `continue` + command, optionally as `continue N` analog to `break N` + +### Return status + +The return status is the one of the last command executed in ``, +or `0` (`TRUE`) if none was executed. 
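
A minimal sketch of an `until` loop; `count_up` is a hypothetical helper written only for this illustration.

```bash
# count_up: print the numbers 1..$1, looping until the counter exceeds the limit.
count_up() {
    local i=1
    until (( i > $1 )); do   # the body runs while this condition is FALSE
        echo "$i"
        (( i++ ))
    done
}

count_up 3    # prints 1, 2 and 3 on separate lines
count_up 0    # the body is never executed; the loop returns 0
```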
+ +## Examples + +## Portability considerations + +## See also + +- Internal: [The while loop](/syntax/ccmd/while_loop) diff --git a/docs/syntax/ccmd/user_select.md b/docs/syntax/ccmd/user_select.md new file mode 100644 index 0000000..fd0bb42 --- /dev/null +++ b/docs/syntax/ccmd/user_select.md @@ -0,0 +1,82 @@ +# User selections + +## Synopsis + + select ; do + + done + + select in ; do + + done + + # alternative, historical and undocumented syntax + + select + { + + } + + select in + { + + } + +## Description + +This compound command provides a kind of menu. The user is prompted with +a *numbered list* of the given words, and is asked to input the index +number of the word. If a word was selected, the variable `` is set +to this word, and the [list](/syntax/basicgrammar#lists) `` is +executed. + +If no `in ` is given, then the positional parameters are taken as +words (as if `in "$@"` was written). + +Regardless of the functionality, the *number* the user entered is saved +in the variable `REPLY`. + +Bash knows an alternative syntax for the `select` command, enclosing the +loop body in `{...}` instead of `do ... done`: + + select x in 1 2 3 + { + echo $x + } + +This syntax is **not documented** and should not be used. I found the +parser definitions for it in 1.x code, and in modern 4.x code. My guess +is that it\'s there for compatibility reasons. This syntax is not +specified by POSIX(R).
+ +## Examples + +``` bash +# select in ; do +# +# done + + +# meaning e.g.: + +clear +echo +echo hit number key 1 2 or 3 then ENTER-key +echo ENTER alone is an empty choice and will loop endlessly until Ctrl-C or Ctrl-D +echo + +select OPTIONX in beer whiskey wine liquor ; do + + echo you ordered a $OPTIONX + break # break avoids endless loop -- second line to be executed always + +done + +# place some if else fi business here +# and explain how it makes sense that $OPTIONX is red but OPTIONX is black +# even though both are variables +``` + +## Portability considerations + +## See also diff --git a/docs/syntax/ccmd/while_loop.md b/docs/syntax/ccmd/while_loop.md new file mode 100644 index 0000000..590d21c --- /dev/null +++ b/docs/syntax/ccmd/while_loop.md @@ -0,0 +1,41 @@ +# The while-loop + +## Synopsis + + while ; do + + done + +## Description + +The while-loop is relatively simple in what it does: it executes the +[command list](/syntax/basicgrammar#lists) `` and if the exit +code of it was 0 (TRUE) it executes ``. This happens again and +again until `` returns FALSE. + +This is exactly the opposite of the [until +loop](/syntax/ccmd/until_loop). + +:!: Like all loops (both `for`-loops, `while` and `until`), this loop +can be + +- terminated (broken) by the `break` command, optionally as `break N` + to break `N` levels of nested loops +- forced to immediately do the next iteration using the `continue` + command, optionally as `continue N` analog to `break N` + +### Return status + +The return status is the one of the last command executed in ``, +or `0` (`TRUE`) if none was executed. 
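
A minimal sketch of a `while` loop driven by `read`; `sum_lines` is a hypothetical helper and the input values are arbitrary.

```bash
# sum_lines: add up the lines of stdin that consist only of digits.
sum_lines() {
    local line total=0
    while IFS= read -r line; do                 # the loop ends when read fails at end of input
        [[ $line =~ ^[0-9]+$ ]] || continue     # skip anything that is not a plain number
        total=$(( total + 10#$line ))           # 10# forces base 10, so "08" is not taken as octal
    done
    echo "$total"
}

printf '%s\n' 4 x 10 28 | sum_lines    # prints 42
```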
+ +## Examples + +## Portability considerations + +## See also + +- Internal: [The until loop](/syntax/ccmd/until_loop) +- Internal: [code examples of the read builtin + command](/commands/builtin/read#code_examples) to see how you can + loop over lines diff --git a/docs/syntax/expansion/arith.md b/docs/syntax/expansion/arith.md new file mode 100644 index 0000000..257f665 --- /dev/null +++ b/docs/syntax/expansion/arith.md @@ -0,0 +1,75 @@ +# Arithmetic expansion + + $(( )) + + $[ ] + +The [arithmetic expression](/syntax/arith_expr) `` is +evaluated and expands to the result. The output of the arithmetic +expansion is guaranteed to be one word and a digit in Bash. + +Please **do not use the second form `$[ ... ]`**! It\'s deprecated. The +preferred and standardized form is `$(( ... ))`! + +Example + +``` bash +function printSum { + typeset -A args + typeset name + for name in first second; do + [[ -t 0 ]] && printf 'Enter %s positive integer: ' "$name" >&2 + read -r ${BASH_VERSION+-e} "args[$name]" + [[ ${args[$name]} == +([[:digit:]]) ]] || return 1 # Validation is extremely important whenever user input is used in arithmetic. + done + printf 'The sum is %d.' $((${args[first]} + ${args[second]})) +} +``` + +**Note** that in Bash you don\'t need the arithmetic expansion to check +for the boolean value of an arithmetic expression. This can be done +using the [arithmetic evaluation compound +command](/syntax/ccmd/arithmetic_eval): + +``` bash +printf %s 'Enter a number: ' >&2 +read -r number +if ((number == 1234)); then + echo 'Good guess' +else + echo 'Haha... :-P' +fi +``` + +**Variables** used inside the arithmetic expansion, as in all arithmetic +contexts, can be used with or without variable expansion: + +``` bash +x=1 + +echo $((x)) # Good. +echo $(($x)) # Ok. Avoid expansions within arithmetic. Use variables directly. +echo $(("$x")) # Error. There is no quote-removal in arithmetic contexts. It expands to $(("1")), which is an invalid arithmetic expression. 
+echo $((x[0])) # Good. +echo $((${x[0]})) # Ok. Nested expansion again. +echo $((${x[$((${x[!$x]}-$x))]})) # Same as above but more ridiculous. +echo $(($x[0])) # Error. This expands to $((1[0])), an invalid expression. +``` + +## Bugs and Portability considerations + +- The original Bourne shell doesn\'t have arithmetic expansions. You + have to use something like `expr(1)` within backticks instead. Since + `expr` is horrible (as are backticks), and arithmetic expansion is + required by POSIX, you should not worry about this, and preferably + fix any code you find that\'s still using `expr`. + +## See also + +- [arithmetic expressions](/syntax/arith_expr) +- [arithmetic evaluation compound + command](/syntax/ccmd/arithmetic_eval) +- [Introduction to expansion and + substitution](/syntax/expansion/intro) +- [POSIX + definition](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04) diff --git a/docs/syntax/expansion/brace.md b/docs/syntax/expansion/brace.md new file mode 100644 index 0000000..f4295c8 --- /dev/null +++ b/docs/syntax/expansion/brace.md @@ -0,0 +1,272 @@ +# Brace expansion + +![](keywords>bash shell scripting expansion substitution text list brace) + + {string1,string2,...,stringN} + {..} + + {....} (Bash 4) + + {........} + + {........} + + {........} + +Brace expansion is used to generate arbitrary strings. The specified +strings are used to generate **all possible combinations** with the +optional surrounding prefixes and suffixes. + +Usually it\'s used to generate mass-arguments for a command, that follow +a specific naming-scheme. + +:!: It is the very first step in expansion-handling, it\'s important to +understand that. When you use + + echo {a,b}$PATH + +then the brace expansion **does not expand the variable** - this is done +in a **later step**. 
Brace expansion just turns it into: + + echo a$PATH b$PATH + +Another common pitfall is to assume that a range like `{1..200}` can be +expressed with variables using `{$a..$b}`. Due to what I described +above, it **simply is not possible**, because it\'s the very first step +in doing expansions. A possible way to achieve this, if you really +can\'t handle this in another way, is using the `eval` command, which +basically evaluates a commandline twice: `eval echo {$a..$b}`. For +instance, when embedded inside a for loop: +`for i in $(eval echo {$a..$b})`. This requires that the entire command +be properly escaped to avoid unexpected expansions. If the sequence +expansion is to be assigned to an array, another method is possible +using [declaration commands](/commands/builtin/declare): +`declare -a 'pics=(img{'"$a..$b"'}.png)'; mv "${pics[@]}" ../imgs`. This +is significantly safer, but one must still be careful to control the +values of \$a and \$b. Both the exact quoting, and explicitly including +\"-a\" are important. + +The brace expansion is present in two basic forms, **string lists** and +**ranges**. + +It can be switched on and off at runtime by using the `set` builtin +and the option `-B` and `+B` or the long option `braceexpand`. If brace +expansion is enabled, the stringlist in `SHELLOPTS` contains +`braceexpand`.
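
Switching brace expansion off and on again can be sketched like this; the variable names and the sample word are arbitrary.

```bash
set -B                           # brace expansion on (the Bash default)
with_braces=$(echo {a,b}c)

set +B                           # brace expansion off: the braces stay literal
without_braces=$(echo {a,b}c)

set -B                           # switch it back on
echo "$with_braces | $without_braces"    # ac bc | {a,b}c
```

The long form `set -o braceexpand` / `set +o braceexpand` has the same effect.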
+ +## String lists + + {string1,string2,...,stringN} + +Without the optional prefix and suffix strings, the result is just a +space-separated list of the given strings: + + $ echo {I,want,my,money,back} + I want my money back + +With prefix or suffix strings, the result is a space-separated list of +**all possible combinations** of prefix or suffix specified strings: + + $ echo _{I,want,my,money,back} + _I _want _my _money _back + + $ echo {I,want,my,money,back}_ + I_ want_ my_ money_ back_ + + $ echo _{I,want,my,money,back}- + _I- _want- _my- _money- _back- + +The brace expansion is only performed, if the given string list is +really a **list of strings**, i.e., if there is a minimum of one \"`,`\" +(comma)! Something like `{money}` doesn\'t expand to something special, +it\'s really only the text \"`{money}`\". + +## Ranges + + {..} + +Brace expansion using ranges is written giving the startpoint and the +endpoint of the range. This is a \"sequence expression\". The sequences +can be of two types + +- integers (optionally zero padded, optionally with a given increment) +- characters + +```{=html} + +``` + $ echo {5..12} + 5 6 7 8 9 10 11 12 + + $ echo {c..k} + c d e f g h i j k + +When you mix these both types, brace expansion is **not** performed: + + $ echo {5..k} + {5..k} + +When you zero pad one of the numbers (or both) in a range, then the +generated range is zero padded, too: + + $ echo {01..10} + 01 02 03 04 05 06 07 08 09 10 + +There\'s a chapter of Bash 4 brace expansion changes at [the end of this +article](#new_in_bash_4.0). + +Similar to the expansion using stringlists, you can add prefix and +suffix strings: + + $ echo 1.{0..9} + 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 + + $ echo ---{A..E}--- + ---A--- ---B--- ---C--- ---D--- ---E--- + +## Combining and nesting + +When you combine more brace expansions, you effectively use a brace +expansion as prefix or suffix for another one. 
Let\'s generate all +possible combinations of uppercase letters and digits: + + $ echo {A..Z}{0..9} + A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 C0 C1 C2 C3 C4 C5 C6 + C7 C8 C9 D0 D1 D2 D3 D4 D5 D6 D7 D8 D9 E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 F0 F1 F2 F3 + F4 F5 F6 F7 F8 F9 G0 G1 G2 G3 G4 G5 G6 G7 G8 G9 H0 H1 H2 H3 H4 H5 H6 H7 H8 H9 I0 + I1 I2 I3 I4 I5 I6 I7 I8 I9 J0 J1 J2 J3 J4 J5 J6 J7 J8 J9 K0 K1 K2 K3 K4 K5 K6 K7 + K8 K9 L0 L1 L2 L3 L4 L5 L6 L7 L8 L9 M0 M1 M2 M3 M4 M5 M6 M7 M8 M9 N0 N1 N2 N3 N4 + N5 N6 N7 N8 N9 O0 O1 O2 O3 O4 O5 O6 O7 O8 O9 P0 P1 P2 P3 P4 P5 P6 P7 P8 P9 Q0 Q1 + Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 S0 S1 S2 S3 S4 S5 S6 S7 S8 + S9 T0 T1 T2 T3 T4 T5 T6 T7 T8 T9 U0 U1 U2 U3 U4 U5 U6 U7 U8 U9 V0 V1 V2 V3 V4 V5 + V6 V7 V8 V9 W0 W1 W2 W3 W4 W5 W6 W7 W8 W9 X0 X1 X2 X3 X4 X5 X6 X7 X8 X9 Y0 Y1 Y2 + Y3 Y4 Y5 Y6 Y7 Y8 Y9 Z0 Z1 Z2 Z3 Z4 Z5 Z6 Z7 Z8 Z9 + +Hey.. that **saves you writing** 260 strings! + +Brace expansions can be nested, but too much nesting usually makes you +lose track a bit ;-) + +Here\'s a sample to generate the alphabet, first the uppercase letters, +then the lowercase ones: + + $ echo {{A..Z},{a..z}} + A B C D E F G H I J K L M N O P Q R S T U V W X Y Z a b c d e f g h i j k l m n o p q r s t u v w x y z + +## Common use and examples + +### Massdownload from the Web + +In this example, `wget` is used to download documentation that is split +over several numbered webpages. + +`wget` won\'t see your braces. It will see **6 different URLs** to +download. + + wget http://docs.example.com/documentation/slides_part{1,2,3,4,5,6}.html + +Of course it\'s possible, and even easier, to do that with a sequence: + + wget http://docs.example.com/documentation/slides_part{1..6}.html + +### Generate a subdirectory structure + +Your life is hard? Let\'s ease it a bit - that\'s what shells are here +for. + + mkdir /home/bash/test/{foo,bar,baz,cat,dog} + +### Generate numbers with a prefix 001 002 \... 
+ +- Using a prefix: + +```{=html} + +``` + for i in 0{1..9} 10; do printf "%s\n" "$i";done + +If you need to create words with the number embedded, you can use nested +brace: + + printf "%s\n" img{00{1..9},0{10..99},{100..999}}.png + +- Formatting the numbers with printf: + +```{=html} + +``` + echo $(printf "img%02d.png " {1..99}) + +See the [text below](#news_in_bash_4.0) for a new Bash 4 method. + +### Repeating arguments or words + + somecommand -v -v -v -v -v + +Can be written as + + somecommand -v{,,,,} + +\...which is a kind of a hack, but hey, it works. + +\
+ +#### More fun + +The most optimal possible brace expansion to expand n arguments of +course consists of n\'s prime factors. We can use the \"factor\" program +bundled with GNU coreutils to emit a brace expansion that will expand +any number of arguments. + + function braceify { + [[ $1 == +([[:digit:]]) ]] || return + typeset -a a + read -ra a < <(factor "$1") + eval "echo $(printf '{$(printf ,%%.s {1..%s})}' "${a[@]:1}")" + } + + printf 'eval printf "$arg"%s' "$(braceify 1000000)" + +\"Braceify\" generates the expansion code itself. In this example we +inject that output into a template which displays the most terse brace +expansion code that would expand `"$arg"` 1,000,000 times if evaluated. +In this case, the output is: + + eval printf "$arg"{,,}{,,}{,,}{,,}{,,}{,,}{,,,,,}{,,,,,}{,,,,,}{,,,,,}{,,,,,}{,,,,,} + +\ + +## New in Bash 4.0 + +### Zero padded number expansion + +Prefix either of the numbers in a numeric range with `0` to pad the +expanded numbers with the correct amount of zeros: + + $ echo {0001..5} + 0001 0002 0003 0004 0005 + +### Increment + +It is now possible to specify an increment using ranges: + + {....} + +`` is numeric, you can use a negative integer but the correct sign +is deduced from the order of `` and `` anyways. + + $ echo {1..10..2} + 1 3 5 7 9 + $ echo {10..1..2} + 10 8 6 4 2 + +Interesting feature: The increment specification also works for +letter-ranges: + + $ echo {a..z..3} + a d g j m p s v y + +## See also + +- [Introduction to expansion and + substitution](/syntax/expansion/intro) diff --git a/docs/syntax/expansion/cmdsubst.md b/docs/syntax/expansion/cmdsubst.md new file mode 100644 index 0000000..db304f2 --- /dev/null +++ b/docs/syntax/expansion/cmdsubst.md @@ -0,0 +1,138 @@ +# Command substitution + +![](keywords>bash shell scripting expansion substitution text variable output execute stdout save result return value) + + $( ) + + ` ` + +The command substitution expands to the output of commands. 
These +commands are executed in a subshell, and their `stdout` data is what the +substitution syntax expands to. + +All **trailing** newlines are removed (below is an example for a +workaround). + +In later steps, **if not quoted**, the results undergo [word +splitting](/syntax/expansion/wordsplit) and [pathname +expansion](/syntax/expansion/globs). You have to remember that, because +the word splitting will also remove embedded newlines and other `IFS` +characters and break the results up into several words. Also you\'ll +probably get unexpected pathname matches. **If you need the literal +results, quote the command substitution!** + +The second form `` `COMMAND` `` is more or less obsolete for Bash, since +it has some trouble with nesting (\"inner\" backticks need to be +escaped) and escaping characters. Use `$(COMMAND)`, it\'s also POSIX! + +When you [call an explicit subshell](/syntax/ccmd/grouping_subshell) +`(COMMAND)` inside the command substitution `$()`, then take care, this +way is **wrong**: + + $((COMMAND)) + +Why? because it collides with the syntax for [arithmetic +expansion](/syntax/expansion/arith). You need to separate the command +substitution from the inner `(COMMAND)`: + + $( (COMMAND) ) + +## Specialities + +When the inner command is only an input redirection, and nothing else, +for example + + $( bash shell scripting expansion substitution text variable filename macro wildcard) + +Before executing your commands, Bash checks whether there are any syntax +elements in the command line that should be interpreted rather than +taken literally. After splitting the command line into tokens (words), +Bash scans for these special elements and interprets them, resulting in +a changed command line: the elements are said to be **expanded** to or +**substituted** to **new text and maybe new tokens** (words). 
+ +The most simple example of this behaviour is a referenced variable: + + mystring="Hello world" + echo "$mystring" + +The `echo` program definitely doesn\'t care about what a shell variable +is. It is Bash\'s job to deal with the variable. Bash **expands** the +string \"`$mystring`\" to \"`Hello world`\", so that `echo` will only +see `Hello world`, not the variable or anything else! + +After all these expansions and substitutions are done, all quotes that +are not meant literally (i.e., [the quotes that marked contiguous +words](/syntax/quoting), as part of the shell syntax) are removed from +the commandline text, so the called program won\'t see them. This step +is called **quote-removal**. + +## Overview + +Saw a possible expansion syntax but don\'t know what it is? Here\'s a +small list. + +- [Parameter expansion](/syntax/pe) (it has its own [overview + section](/syntax/pe#overview)) + - `$WORD` + - `${STUFF...}` +- [Pathname expansion](/syntax/expansion/globs) + - `*.txt` + - `page_1?.html` +- [Arithmetic expansion](/syntax/expansion/arith) + - `$(( EXPRESSION ))` + - `$[ EXPRESSION ]` +- [Command substitution](/syntax/expansion/cmdsubst) + - `$( COMMAND )` + - `` ` COMMAND ` `` +- [Tilde expansion](/syntax/expansion/tilde) + - `~` + - `~+` + - `~-` +- [Brace expansion](/syntax/expansion/brace) + - `{X,Y,Z}` + - `{X..Y}` + - `{X..Y..Z}` +- [Process substitution](/syntax/expansion/proc_subst) + - `<( COMMAND )` + - `>( COMMAND )` + +## Order + +Bash performs expansions and substitutions in a defined order. This +explains why globbing (pathname expansion), for example, is safe to use +on filenames with spaces (because it happens **after** the final word +splitting!). 
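The claim about globbing and filenames with spaces can be checked directly. The sketch below is an illustration only; the scratch directory and the file name `two words.txt` are made up for the demonstration.

```bash
# Create a scratch directory containing one file whose name has a space.
tmpdir=$(mktemp -d)
touch "$tmpdir/two words.txt"

# The glob is expanded after word splitting, so the matching filename
# stays a single word even though it contains a space.
matches=("$tmpdir"/*.txt)
count=${#matches[@]}
printf '%s\n' "$count"   # prints "1"

rm -r "$tmpdir"
```

If pathname expansion happened before word splitting, the array would hold two elements instead of one.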
+ +The order is (from first to last): + +- [Brace expansion](/syntax/expansion/brace) +- [Tilde expansion](/syntax/expansion/tilde) +- The following expansions happen at the same time, in a left-to-right + fashion on the commandline (see below) + - [Parameter expansion](/syntax/pe) + - [Arithmetic expansion](/syntax/expansion/arith) + - [Command substitution](/syntax/expansion/cmdsubst) +- [Word splitting](/syntax/expansion/wordsplit) +- [Pathname expansion](/syntax/expansion/globs) + +[Process substitution](/syntax/expansion/proc_subst) is performed +**simultaneously** with [parameter expansion](/syntax/pe), [command +substitution](/syntax/expansion/cmdsubst) and [arithmetic +expansion](/syntax/expansion/arith). It is only performed when the +underlying operating system supports it. + +The 3 steps [parameter expansion](/syntax/pe), [arithmetic +expansion](/syntax/expansion/arith) and [command +substitution](/syntax/expansion/cmdsubst) happen at the same time in a +left-to-right fashion on the commandline. This means + + i=1 + echo $i $((i++)) $i + +will output `1 1 2` and not `1 1 1`. diff --git a/docs/syntax/expansion/proc_subst.md b/docs/syntax/expansion/proc_subst.md new file mode 100644 index 0000000..47d951d --- /dev/null +++ b/docs/syntax/expansion/proc_subst.md @@ -0,0 +1,184 @@ +# Process substitution + +![](keywords>bash shell scripting expansion substitution text stdin stdout save capture) + +Process substitution is a form of redirection where the input or output +of a process (some sequence of commands) appears as a temporary file. + + <( ) + + >( ) + +Process substitution is performed **simultaneously** with [parameter +expansion](/syntax/pe), [command +substitution](/syntax/expansion/cmdsubst) and [arithmetic +expansion](/syntax/expansion/arith). + +The [command list](/syntax/basicgrammar#lists) `` is executed and +its + +- standard output filedescriptor in the `<( ... )` form or +- standard input filedescriptor in the `>( ... 
)` form + +is connected to a FIFO or a file in `/dev/fd/`. The filename (where the +filedescriptor is connected) is then used as a substitution for the +`<(...)`-construct. + +This makes it possible, for example, to pass data to a command that +can\'t be reached by pipelining (one that expects its data from a file +rather than from `stdin`). + +### Scope + +\ Note: According to multiple comments and sources, the +scope of process substitution file descriptors is **not** stable, +guaranteed, or specified by bash. Newer versions of bash (5.0+) seem to +have a shorter scope, and the substitution\'s scope seems to be shorter +than function scope. See +[stackexchange](https://unix.stackexchange.com/questions/425456/conditional-process-substitution) +and +[stackoverflow](https://stackoverflow.com/questions/46660020/bash-what-is-the-scope-of-the-process-substitution); +the latter discussion contains a script that can test the scoping +behavior case-by-case \ + +If a process substitution is expanded as an argument to a function, +expanded to an environment variable during calling of a function, or +expanded to any assignment within a function, the process substitution +will be \"held open\" for use by any command within the function or its +callees, until the function in which it was set returns. If the same +variable is set again within a callee, unless the new variable is local, +the previous process substitution is closed and will be unavailable to +the caller when the callee returns. + +In essence, process substitutions expanded to variables within functions +remain open until the function in which the process substitution occurred +returns - even when assigned to locals that were set by a function\'s +caller. Dynamic scope doesn\'t protect them from closing. + +## Examples + +This code is useless, but it demonstrates how it works: + +``` bash +$ echo <(ls) +/dev/fd/63 +``` + +The **output** of the `ls`-program can then be accessed by reading the +file `/dev/fd/63`. 
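To see that the substituted filename really can be read like a file, the inner command's output can be read back through it. This is a small sketch, assuming a system where Bash can provide `/dev/fd` entries or named pipes; the variable name `content` and the sample text are illustrative.

```bash
# <(...) expands to a filename such as /dev/fd/63; cat opens that name
# like any other file and reads what the inner command wrote to stdout.
content=$(cat <(printf 'one\ntwo\n'))

printf '%s\n' "$content"
# prints:
# one
# two
```

Here `cat` never learns that a process is involved; it only receives a path as its argument.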
+ +Consider the following: + +``` bash +diff <(ls "$first_directory") <(ls "$second_directory") +``` + +This will compare the contents of each directory. In this command, each +*process* is *substituted* for a *file*, and diff doesn\'t see \<(bla), +it sees two files, so the effective command is something like + +``` bash +diff /dev/fd/63 /dev/fd/64 +``` + +where those files are written to and destroyed automatically. + +### Avoiding subshells + +\ See Also: +[BashFAQ/024](http://mywiki.wooledge.org/BashFAQ/024) \-- *I set +variables in a loop that\'s in a pipeline. Why do they disappear after +the loop terminates? Or, why can\'t I pipe data to read?* \ + +One of the most common uses for process substitutions is to avoid the +final subshell that results from executing a pipeline. The following is +a **wrong** piece of code to count all files in `/etc`: + +``` bash +counter=0 + +find /etc -print0 | while IFS= read -rd '' _; do + ((counter++)) +done + +echo "$counter files" # prints "0 files" +``` + +Due to the pipe, the `while read; do ... done` part is executed in a +subshell (in Bash, by default), which means `counter` is only +incremented within the subshell. When the pipeline finishes, the +subshell is terminated, and the `counter` visible to `echo` is still at +\"0\"! + +Process substitution helps us avoid the pipe operator (the reason for +the subshell): + +``` bash +counter=0 + +while IFS= read -rN1 _; do + ((counter++)) +done < <(find /etc -printf ' ') + +echo "$counter files" +``` + +This is the normal input file redirection `< FILE`, just that the `FILE` +in this case is the result of process substitution. It\'s important to +note that the space between the two `<` characters is required in order +to disambiguate the syntax from +[here documents](/syntax/redirection#here_documents). + +``` bash +: < <(COMMAND) # Good. +: <<(...) # Wrong. Will be parsed as a heredoc. Bash fails when it comes across the unquoted metacharacter ''('' +: ><(...) 
# Technically valid but pointless syntax. 
Bash opens the pipe for writing, while the commands within the process substitution have their stdout connected to the pipe. +``` + +### Process substitution assigned to a parameter + +This example demonstrates how process substitutions can be made to +resemble \"passable\" objects. This results in converting the output of +`f`\'s argument to uppercase. + +``` bash +f() { + cat "$1" >"$x" +} + +x=>(tr '[:lower:]' '[:upper:]') f <(echo 'hi there') +``` + +See the above section on [#scope](#scope) + +## Bugs and Portability Considerations + +- Process substitution is not specified by POSIX. +- Process substitution is disabled completely in Bash POSIX mode. +- Process substitution is implemented by Bash, Zsh, Ksh{88,93}, but + not (yet) pdksh derivatives (mksh). Coprocesses may be used instead. +- Process substitution is supported only on systems that support + either named pipes (FIFO - a [special + file](/dict/terms/special_file)) or the `/dev/fd/*` method for + accessing open files. If the system doesn\'t support `/dev/fd/*`, + Bash falls back to creating named pipes. Note that not all shells + that support process substitution have that fallback. +- Bash evaluates process substitutions within array indices, but not + other arithmetic contexts. Ksh and Zsh do not. 
(Possible Bug) + +``` bash +# print "moo" +dev=fd=1 _[1<(echo moo >&2)]= +# fork bomb +${dev[${dev='dev[1>(${dev[dev]})]'}]} +``` + +- Issues with wait, race conditions, etc: + + +## See also + +- Internal: [Introduction to expansion and + substitution](/syntax/expansion/intro) +- Internal: [Bash in the process tree](/scripting/processtree) + (subshells) +- Internal: [Redirection](/syntax/redirection) diff --git a/docs/syntax/expansion/tilde.md b/docs/syntax/expansion/tilde.md new file mode 100644 index 0000000..5d281bf --- /dev/null +++ b/docs/syntax/expansion/tilde.md @@ -0,0 +1,110 @@ +# Tilde expansion + +![](keywords>bash shell scripting expansion substitution tilde home homedir shortcut) + + ~ + ~/... + + ~NAME + ~NAME/... + + ~+ + ~+/... + + ~- + ~-/... + +The tilde expansion is used to expand to several specific pathnames: + +- home directories +- current working directory +- previous working directory + +Tilde expansion is only performed when the tilde-construct is at the +beginning of a word, or a separate word. + +If there\'s nothing to expand, i.e., in case of a wrong username or any +other error condition, the tilde construct is not replaced, it stays +what it is. + +Tilde expansion is also performed every time a variable is assigned: + +- after the **first** `=`: `TARGET=~moonman/share` +- after **every** `:` (colon) in the assigned value: + `TARGET=file:~moonman/share` + +\ As of now (Bash 4.3-alpha) the following constructs +**also** work, though they are not variable assignments: + + echo foo=~ + echo foo=:~ + +I don\'t know yet whether this is a bug or intended. \ + +This way you can correctly use the tilde expansion in your +[PATH](/syntax/shellvars#PATH): + + PATH=~/mybins:~peter/mybins:$PATH + +**Spaces in the referenced paths?** A construct like\... + + ~/"my directory" + +\...is perfectly valid and works! 
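The assignment and quoting rules above can be tried out in a few lines. This is a sketch under assumptions: `HOME` is overridden with a made-up path (`/home/demo`) so that the results are predictable, and the variable names are purely illustrative.

```bash
# Hypothetical home directory so the expansions are predictable.
HOME=/home/demo

# In an assignment, a tilde is expanded after the "=" and after every ":".
search_path=~/bin:~/tools

# Only the tilde-prefix must be unquoted; the rest of the word may be quoted.
spaced=~/"my directory"

# A quoted tilde is not expanded at all.
literal="~/bin"

printf '%s\n' "$search_path" "$spaced" "$literal"
# prints:
# /home/demo/bin:/home/demo/tools
# /home/demo/my directory
# ~/bin
```

The last line shows the most common mistake: once the tilde itself is inside quotes, it is just an ordinary character.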
+ +## Home directory + + ~ + ~ + +This form expands to the home directory of the current user (`~`) or the +home directory of the given user (`~`). + +If the given user doesn\'t exist (or if their home directory isn\'t +determinable, for some reason), it doesn\'t expand to something else, it +stays what it is. The requested home directory is found by asking the +operating system for the associated home directory for ``. + +To find the home directory of the current user (`~`), Bash uses the +following order of precedence: + +- expand to the value of [HOME](/syntax/shellvars#HOME) if it\'s + defined +- expand to the home directory of the user executing the shell + (operating system) + +That means, the variable `HOME` can override the \"real\" home +directory, at least regarding tilde expansion. + +## Current working directory + + ~+ + +This expands to the value of the [PWD](/syntax/shellvars#PWD) variable, +which holds the current working directory: + + echo "CWD is $PWD" + +is equivalent to (note it **must** be a separate word!): + + echo "CWD is" ~+ + +## Previous working directory + + ~- + +This expands to the value of the [OLDPWD](/syntax/shellvars#OLDPWD) +variable, which holds the previous working directory (the one before the +last `cd`). If `OLDPWD` is unset (the directory was never changed), it is +not expanded. + + $ pwd + /home/bash + $ cd /etc + $ echo ~- + /home/bash + +## See also + +- Internal: [Introduction to expansion and + substitution](/syntax/expansion/intro) diff --git a/docs/syntax/expansion/wordsplit.md b/docs/syntax/expansion/wordsplit.md new file mode 100644 index 0000000..ccfcc9b --- /dev/null +++ b/docs/syntax/expansion/wordsplit.md @@ -0,0 +1,52 @@ +# Word splitting + +FIXME to be continued! + +Word splitting occurs once any of the following expansions are done (and +only then!) 
+ +- [Parameter expansion](/syntax/pe) +- [Command substitution](/syntax/expansion/cmdsubst) +- [Arithmetic expansion](/syntax/expansion/arith) + +Bash will scan the results of these expansions for special `IFS` +characters that mark word boundaries. This is only done on results that +are **not double-quoted**! + +## Internal Field Separator IFS + +The `IFS` variable holds the characters that Bash sees as word +boundaries in this step. The default contains the characters + +- \ +- \ +- \ + +These characters are also assumed when IFS is **unset**. When `IFS` is +**empty** (nullstring), no word splitting is performed at all. + +## Behaviour + +The results of the expansions mentioned above are scanned for +`IFS`-characters. If **one or more** (in a sequence) of them is found, +the expansion result is split at these positions into multiple words. + +This doesn\'t happen when the expansion results were **double-quoted**. + +When a null-string (e.g., something that before expanded to +\>\>nothing\<\<) is found, it is removed, unless it is quoted (`''` or +`""`). + +[**Again note:**]{.underline} Without any expansion beforehand, Bash +won\'t perform word splitting! In this case, the initial token parsing +is solely responsible. + +## See also + +- [Introduction to expansion and + substitution](/syntax/expansion/intro) +- [Quoting and escaping](/syntax/quoting) +- [WordSplitting](http://mywiki.wooledge.org/WordSplitting), + [IFS](http://mywiki.wooledge.org/IFS), and + [DontReadLinesWithFor](http://mywiki.wooledge.org/DontReadLinesWithFor) - + Greg\'s wiki diff --git a/docs/syntax/grammar/parser_exec.md b/docs/syntax/grammar/parser_exec.md new file mode 100644 index 0000000..2385cd5 --- /dev/null +++ b/docs/syntax/grammar/parser_exec.md @@ -0,0 +1,125 @@ +FIXME work in progress\... 
+ +# Parsing and execution + +![](keywords>bash shell scripting syntax language behaviour executing execution) + +Nearly everything in [Bash grammar](/syntax/basicgrammar) can be broken +down to a \"simple command\". The only thing Bash has to expand, +evaluate and execute is the simple command. + +## Simple command expansion + +\
+ +- +- + +\ + +This step happens after the initial command line splitting. + +The expansion of a simple command is done in four steps (interpreting +the simple command **from left to right**): + +1. The words the parser has marked as **variable assignments** and + **redirections** are saved for later processing. + - variable assignments precede the command name and have the form + `WORD=WORD` + - redirections can appear anywhere in the simple command +2. The rest of the words are [expanded](/syntax/expansion/intro). If + any words remain after expansion, the first word is taken to be the + **name of the command** and the remaining words are the + **arguments**. +3. [Redirections](/syntax/redirection) are performed. +4. The text after the `=` in each variable assignment undergoes [tilde + expansion](/syntax/expansion/tilde), [parameter + expansion](/syntax/pe), [command + substitution](/syntax/expansion/cmdsubst), [arithmetic + expansion](/syntax/expansion/arith), and quote removal before being + assigned to the variable. + +If **no command name** results after expansion: + +- The variable assignments affect the **current shell** environment. + - This is what happens when you enter only a variable assignment + at the command prompt. + - Assignment to readonly variables causes an error and the command + exits non-zero. +- Redirections are performed, but do not affect the current shell + environment. + - that means, a `> FILE` without any command **will** be + performed: the `FILE` will be created! 
+- The command exits + - with an exit code indicating the redirection error, if any + - with the exit code of the last command-substitution parsed, if + any + - with exit code 0 (zero) if no redirection error happened and no + command substitution was done + +Otherwise, if a command name results: + +- The variables saved and parsed are added to the environment of the + executed command (and thus do not affect the current environment) + - Assignment to readonly variables causes an error and the command + exits with a non-zero error code. + - **Assignment errors** in non-POSIX modes cause the *enclosing + commands (e.g. loops) to completely terminate* + - **Assignment errors** in (non-interactive) POSIX mode cause *the + entire script to terminate* + +The behavior regarding the variable assignment errors can be tested: +\
\ + +**[This one exits the script completely]{.underline}** + + #!/bin/sh + # This shell runs in POSIX mode! + + echo PRE + + # The following is an assignment error, since there is no digit '9' + # for a base eight number! + foo=$((8#9)) + + echo POST + +**[This one terminates only the enclosing compound command (the +`{ ...; }`):]{.underline}** + + #!/bin/bash + # This shell runs in native Bash-mode! + + echo PRE + + # The following is an assignment error! + # The "echo TEST" won't be executed, since the { ...; } is terminated + { foo=$((8#9)); echo TEST; } + + echo POST + +## Simple command execution + +If a parsed simple command contains no slashes, the shell attempts to +locate and execute it: + +- shell functions +- shell builtin commands +- check own hash table +- search along `PATH` + +As of Bash Version 4, when a command search fails, the shell executes a +shell function named `command_not_found_handle()` using the failed +command as arguments. This can be used to provide user friendly messages +or install software packages etc. Since this function runs in a separate +execution environment, you can\'t really influence the main shell with +it (changing directory, setting variables). + +FIXME to be continued + +## See also + +- Internal: [Redirection](/syntax/redirection) +- Internal: [Introduction to expansions and + substitutions](/syntax/expansion/intro) diff --git a/docs/syntax/keywords/coproc.md b/docs/syntax/keywords/coproc.md new file mode 100644 index 0000000..fa45004 --- /dev/null +++ b/docs/syntax/keywords/coproc.md @@ -0,0 +1,270 @@ +# The coproc keyword + +## Synopsis + + coproc [NAME] command [redirections] + +## Description + +Bash 4.0 introduced *coprocesses*, a feature certainly familiar to ksh +users. The `coproc` keyword starts a command as a background job, +setting up pipes connected to both its stdin and stdout so that you can +interact with it bidirectionally. Optionally, the co-process can have a +name `NAME`. 
If `NAME` is given, the command that follows **must be a +compound command**. If no `NAME` is given, then the command can be +either simple or compound. + +The process ID of the shell spawned to execute the coprocess is +available through the value of the variable named by `NAME` followed by +a `_PID` suffix. For example, the variable name used to store the PID of +a coproc started with no `NAME` given would be `COPROC_PID` (because +`COPROC` is the default `NAME`). [wait](/commands/builtin/wait) may be +used to wait for the coprocess to terminate. Additionally, coprocesses +may be manipulated through their `jobspec`. + +### Return status + +The return status of a coprocess is the exit status of its command. + +### Redirections + +The optional redirections are applied after the pipes have been set up. +Some examples: + +``` bash +# redirecting stderr in the pipe +$ coproc { ls thisfiledoesntexist; read; } 2>&1 +[2] 23084 +$ IFS= read -ru ${COPROC[0]} x; printf '%s\n' "$x" +ls: cannot access thisfiledoesntexist: No such file or directory +``` + +``` bash +#let the output of the coprocess go to stdout +$ { coproc mycoproc { awk '{print "foo" $0;fflush()}'; } >&3; } 3>&1 +[2] 23092 +$ echo bar >&${mycoproc[1]} +$ foobar +``` + +Here we need to save the previous file descriptor of stdout, because by +the time we redirect the fds of the coprocess, stdout has already been +redirected to the pipe. + +### Pitfalls + +#### Avoid the final pipeline subshell + +The traditional Ksh workaround to avoid the subshell when doing +`command | while read` is to use a coprocess. Unfortunately, Bash\'s +behavior differs. 
+ +In Ksh you would do: + +``` bash +# ksh93 or mksh/pdksh derivatives +ls |& # start a coprocess +while IFS= read -rp file; do print -r -- "$file"; done # read its output +``` + +In bash: + +``` bash +#DOESN'T WORK +$ coproc ls +[1] 23232 +$ while IFS= read -ru ${COPROC[0]} line; do printf '%s\n' "$line"; done +bash: read: line: invalid file descriptor specification +[1]+ Done coproc COPROC ls +``` + +By the time we start reading from the output of the coprocess, the file +descriptor has been closed. + +See [this FAQ entry on Greg\'s +wiki](http://mywiki.wooledge.org/BashFAQ/024) for other pipeline +subshell workarounds. + +#### Buffering + +In the first example, we used GNU awk\'s `fflush()` function. As always, +when you use pipes the I/O operations are buffered. Let\'s see what +happens with `sed`: + +``` bash +$ coproc sed s/^/foo/ +[1] 22981 +$ echo bar >&${COPROC[1]} +$ read -t 3 -ru ${COPROC[0]} _; (( $? > 127 )) && echo "nothing read" +nothing read +``` + +Even though this example is the same as the first `awk` example, the +`read` doesn\'t return any data because the output is waiting in a buffer. + +See [this FAQ entry on Greg\'s +wiki](http://mywiki.wooledge.org/BashFAQ/009) for some workarounds and +more information on buffering issues. + +#### background processes + +A coprocess\' file descriptors are accessible only to the process from +which the `coproc` was started. They are not inherited by subshells. + +Here is a not-so-meaningful illustration. Suppose we want to +continuously read the output of a coprocess and `echo` the result: + +``` bash +#NOT WORKING +$ coproc awk '{print "foo" $0;fflush()}' +[2] 23100 +$ while IFS= read -ru ${COPROC[0]} x; do printf '%s\n' "$x"; done & +[3] 23104 +bash: line 243: read: 61: invalid file descriptor: Bad file descriptor +``` + +This fails because the file descriptors created by the parent are not +available to the subshell created by &. 
+ +A possible workaround: + +``` bash +#WARNING: for illustration purpose ONLY +# this is not the way to make the coprocess print its output +# to stdout, see the redirections above. +$ coproc awk '{print "foo" $0;fflush()}' +[2] 23109 +$ exec 3<&${COPROC[0]} +$ while IFS= read -ru 3 x; do printf '%s\n' "$x"; done & +[3] 23110 +$ echo bar >&${COPROC[1]} +$ foobar +``` + +Here, fd 3 is inherited. + +## Examples + +### Anonymous Coprocess + +Unlike ksh, Bash doesn\'t have true anonymous coprocesses. Instead, Bash +assigns FDs to a default array named `COPROC` if no `NAME` is supplied. +Here\'s an example: + +``` bash +$ coproc awk '{print "foo" $0;fflush()}' +[1] 22978 +``` + +This command starts in the background, and `coproc` returns immediately. +Two new file descriptors are now available via the `COPROC` array. We +can send data to our command: + +``` bash +$ echo bar >&${COPROC[1]} +``` + +And then read its output: + +``` bash +$ IFS= read -ru ${COPROC[0]} x; printf '%s\n' "$x" +foobar +``` + +When we don\'t need our command anymore, we can kill it via its pid: + + $ kill $COPROC_PID + $ + [1]+ Terminated coproc COPROC awk '{print "foo" $0;fflush()}' + +### Named Coprocess + +Using a named coprocess is simple. We just need a compound command (like +when defining a function), and the resulting FDs will be assigned to the +indexed array `NAME` we supply instead. 
+ +``` bash +$ coproc mycoproc { awk '{print "foo" $0;fflush()}' ;} +[1] 23058 +$ echo bar >&${mycoproc[1]} +$ IFS= read -ru ${mycoproc[0]} x; printf '%s\n' "$x" +foobar +$ kill $mycoproc_PID +$ +[1]+ Terminated coproc mycoproc { awk '{print "foo" $0;fflush()}'; } +``` + +### Redirecting the output of a script to a file and to the screen + +``` bash +#!/bin/bash +# we start tee in the background +# redirecting its output to the stdout of the script +{ coproc tee { tee logfile ;} >&3 ;} 3>&1 +# we redirect stdout and stderr of the script to our coprocess +exec >&${tee[1]} 2>&1 +``` + +## Portability considerations + +- The `coproc` keyword is not specified by POSIX(R) +- The `coproc` keyword appeared in Bash version 4.0-alpha +- The `-p` option to Bash\'s `print` loadable is a NOOP and not + connected to Bash coprocesses in any way. It is only recognized as + an option for ksh compatibility, and has no effect. +- The `-p` option to Bash\'s `read` builtin conflicts with that of all + kshes and zsh. The equivalent in those shells is to add a `\?prompt` + suffix to the first variable name argument to `read`, i.e., if the + first variable name given contains a `?` character, the remainder of + the argument is used as the prompt string. Since this feature is + pointless and redundant, I suggest not using it in either shell. + Simply precede the `read` command with a `printf %s prompt >&2`. + +### Other shells + +ksh93, mksh, zsh, and Bash all support something called \"coprocesses\" +which all do approximately the same thing. ksh93 and mksh have virtually +identical syntax and semantics for coprocs. A *list* operator: `|&` is +added to the language which runs the preceding *pipeline* as a coprocess +(This is another reason not to use the special `|&` pipe operator in +Bash \-- its syntax is conflicting). The `-p` option to the `read` and +`print` builtins can then be used to read and write to the pipe of the +coprocess (whose FD isn\'t yet known). 
Special redirects are added to +move the last spawned coprocess to a different FD: `<&p` and `>&p`, at +which point it can be accessed at the new FD using ordinary redirection, +and another coprocess may then be started, again using `|&`. + +zsh coprocesses are very similar to ksh except in the way they are +started. zsh adds the shell reserved word `coproc` to the pipeline +syntax (similar to the way Bash\'s `time` keyword works), so that the +pipeline that follows is started as a coproc. The coproc\'s input and +output FDs can then be accessed and moved using the same `read`/`print` +`-p` and redirects used by the ksh shells. + +It is unfortunate that Bash chose to go against existing practice in +their coproc implementation, especially considering it was the last of +the major shells to incorporate this feature. However, Bash\'s method +accomplishes the same without requiring nearly as much additional +syntax. The `coproc` keyword is easy enough to wrap in a function such +that it takes Bash code as an ordinary argument and/or stdin like +`eval`. Coprocess functionality in other shells can be similarly wrapped +to create a `COPROC` array automatically. + +### Only one coprocess at a time + +The title says it all, complain to the bug-bash mailing list if you want +more. See + for +more details + +The ability to use multiple coprocesses in Bash is considered +\"experimental\". Bash will throw an error if you attempt to start more +than one. This may be overridden at compile-time with the +`MULTIPLE_COPROCS` option. However, at this time there are still issues +\-- see the above mailing list discussion. 
+ +## See also + +- [Anthony Thyssen\'s Coprocess + Hints](http://www.ict.griffith.edu.au/anthony/info/shell/co-processes.hints) - + excellent summary of everything around the topic diff --git a/docs/syntax/pattern.md b/docs/syntax/pattern.md new file mode 100644 index 0000000..e292ad5 --- /dev/null +++ b/docs/syntax/pattern.md @@ -0,0 +1,163 @@ +# Patterns and pattern matching + +![](keywords>bash shell scripting glob globbing wildcards filename pattern matching) + +A pattern is a **string description**. Bash uses them in various ways: + +- [Pathname expansion](/syntax/expansion/globs) (Globbing - matching + filenames) +- Pattern matching in [conditional + expressions](/syntax/ccmd/conditional_expression) +- [Substring removal](/syntax/pe#substring_removal) and [search and + replace](/syntax/pe#search_and_replace) in [Parameter + Expansion](/syntax/pe) +- Pattern-based branching using the [case command](/syntax/ccmd/case) + +The pattern description language is relatively easy. Any character +that\'s not mentioned below matches itself. The `NUL` character may not +occur in a pattern. If special characters are quoted, they\'re matched +literally, i.e., without their special meaning. + +Do **not** confuse patterns with ***regular expressions***, because they +share some symbols and do similar matching work. + +## Normal pattern language + + Sequence Description + ---------- ---------------------------------------------------------------------------------------------------------------- + `*` Matches **any string**, including the null string (empty string) + `?` Matches any **single character** + `X` Matches the character `X` which can be any character that has no special meaning + `\X` Matches the character `X`, where the character\'s special meaning is stripped by the backslash + `\\` Matches a backslash + `[...]` Defines a pattern **bracket expression** (see below). Matches any of the enclosed characters at this position. 
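As a rough sketch of the sequences in the table above, the following uses the `[[ string == pattern ]]` conditional expression. The helper name `matches` and the sample strings are only assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "yes" if the string matches the pattern.
# The pattern is expanded unquoted on the right-hand side of ==,
# so * and ? keep their special meaning.
matches() {
    local string=$1 pattern=$2
    if [[ $string == $pattern ]]; then
        echo yes
    else
        echo no
    fi
}

matches "Hello world" 'Hello*'   # yes: * matches any string
matches "Hello"       'Hello*'   # yes: * also matches the empty string
matches "cat"         'c?t'      # yes: ? matches exactly one character
matches "ct"          'c?t'      # no:  ? needs one character

# Quoting strips the special meaning: "*" only matches a literal asterisk.
[[ "axb" == a"*"b ]] && echo yes || echo no   # no
[[ "a*b" == a"*"b ]] && echo yes || echo no   # yes
```

A pattern always has to match the whole string, which is why `c?t` does not match `ct`.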
+ +### Bracket expressions + +The bracket expression `[...]` mentioned above has some useful +applications: + + Bracket expression Description + ---------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + `[XYZ]` The \"normal\" bracket expression, matching either `X`, `Y` or `Z` + `[X-Z]` A range expression: Matching all the characters from `X` to `Z` (your current **locale** defines how the characters are **sorted**!) + `[[:class:]]` Matches all the characters defined by a [POSIX(r) character class](https://pubs.opengroup.org/onlinepubs/009696899/basedefs/xbd_chap07.html#tag_07_03_01): `alnum`, `alpha`, `ascii`, `blank`, `cntrl`, `digit`, `graph`, `lower`, `print`, `punct`, `space`, `upper`, `word` and `xdigit` + `[^...]` A negating expression: It matches all the characters that are **not** in the bracket expression + `[!...]` Equivalent to `[^...]` + `[]...]` or `[-...]` Used to include the characters `]` and `-` into the set, they need to be the first characters after the opening bracket + `[[=C=]]` Matches any character that is equivalent to the collation weight of `C` (current locale!)
+ `[[.SYMBOL.]]` Matches the collating symbol `SYMBOL` + +### Examples + +Some simple examples using normal pattern matching: + +- Pattern `"Hello world"` matches + - `Hello world` +- Pattern `[Hh]"ello world"` matches + - =\> `Hello world` + - =\> `hello world` +- Pattern `Hello*` matches (for example) + - =\> `Hello world` + - =\> `Helloworld` + - =\> `HelloWoRlD` + - =\> `Hello` +- Pattern `Hello world[[:punct:]]` matches (for example) + - =\> `Hello world!` + - =\> `Hello world.` + - =\> `Hello world+` + - =\> `Hello world?` +- Pattern + `[[.backslash.]]Hello[[.vertical-line.]]world[[.exclamation-mark.]]` + matches (using [collation + symbols](https://pubs.opengroup.org/onlinepubs/009696899/basedefs/xbd_chap07.html#tag_07_03_02_04)) + - =\> `\Hello|world!` + +## Extended pattern language + +If you set the [shell option](/internals/shell_options) `extglob`, Bash +understands some powerful patterns. A `` is one or more +patterns, separated by the pipe-symbol (`PATTERN|PATTERN`). + + --------------------- ------------------------------------------------------------ + `?()` Matches **zero or one** occurrence of the given patterns + `*()` Matches **zero or more** occurrences of the given patterns + `+()` Matches **one or more** occurrences of the given patterns + `@()` Matches **one** of the given patterns + `!()` Matches anything **except** one of the given patterns + --------------------- ------------------------------------------------------------ + +### Examples + +**[Delete all but one specific file]{.underline}** + + rm -f !(survivior.txt) + +## Pattern matching configuration + +### Related shell options + + option classification description + ------------------- ------------------------------------- ------------------------------------------------------------------------------- + `dotglob` [globbing](/syntax/expansion/globs) see [Pathname expansion customization](/syntax/expansion/globs#Customization) + `extglob` global enable/disable extended pattern 
matching language, as described above + `failglob` [globbing](/syntax/expansion/globs) see [Pathname expansion customization](/syntax/expansion/globs#Customization) + `nocaseglob` [globbing](/syntax/expansion/globs) see [Pathname expansion customization](/syntax/expansion/globs#Customization) + `nocasematch` pattern/string matching perform pattern matching without regarding the case of individual letters + `nullglob` [globbing](/syntax/expansion/globs) see [Pathname expansion customization](/syntax/expansion/globs#Customization) + `globasciiranges` [globbing](/syntax/expansion/globs) see [Pathname expansion customization](/syntax/expansion/globs#Customization) + +## Bugs and Portability considerations + +\* Counter-intuitively, only the `[!chars]` syntax for negating a +character class is specified by POSIX for shell pattern matching. +`[^chars]` is merely a commonly-supported extension. Even dash supports +`[^chars]`, but not posh. + +\* All of the extglob quantifiers supported by bash were supported by +ksh88. The set of extglob quantifiers supported by ksh88 are identical +to those supported by Bash, mksh, ksh93, and zsh. + +\* mksh does not support POSIX character classes. Therefore, character +ranges like `[0-9]` are somewhat more portable than an equivalent POSIX +class like `[:digit:]`. + +\* Bash uses a custom runtime interpreter for pattern matching. (at +least) ksh93 and zsh translate patterns into regexes and then use a +regex compiler to emit and cache optimized pattern matching code. This +means Bash may be an order of magnitude or more slower in cases that +involve complex back-tracking (usually that means extglob quantifier +nesting). You may wish to use Bash\'s regex support (the `=~` operator) +if performance is a problem, because Bash will use your C library regex +implementation rather than its own pattern matcher. 
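To make the last point more concrete, here is a minimal sketch that expresses the same check once as an extglob pattern and once as a regular expression for `=~`. The test string and the variable names are hypothetical, and no claim about the actual speed difference is made here.

```bash
#!/usr/bin/env bash
shopt -s extglob

input="foofoofoobar"   # hypothetical test string

# extglob pattern: one or more "foo", followed by "bar"
if [[ $input == +(foo)bar ]]; then
    glob_result=match
else
    glob_result=nomatch
fi

# Roughly equivalent ERE. A pattern must match the whole string,
# a regex does not, so ^ and $ are added explicitly.
regex='^(foo)+bar$'
if [[ $input =~ $regex ]]; then
    regex_result=match
else
    regex_result=nomatch
fi

echo "extglob: $glob_result, regex: $regex_result"
```

Keeping the regex in a variable and expanding it unquoted avoids quoting problems on the right-hand side of `=~`.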
+ +TODO: describe the pattern escape bug + + +### ksh93 extras + +ksh93 supports some very powerful pattern matching features in addition +to those described above. + +\* ksh93 supports arbitrary quantifiers just like ERE using the +`{from,to}(pattern-list)` syntax. `{2,4}(foo)bar` matches between 2-4 +\"foo\"\'s followed by \"bar\". `{2,}(foo)bar` matches 2 or more +\"foo\"\'s followed by \"bar\". You can probably figure out the rest. So +far, none of the other shells support this syntax. + +\* In ksh93, a `pattern-list` may be delimited by either `&` or `|`. `&` +means \"all patterns must be matched\" instead of \"any pattern\". For +example, `[[ fo0bar == @(fo[0-9]&+([[:alnum:]]))bar ]]` would be true +while `[[ f00bar == @(fo[0-9]&+([[:alnum:]]))bar ]]` is false, because +all members of the and-list must be satisfied. No other shell supports +this so far, but you can simulate some cases in other shells using +double extglob negation. The aforementioned ksh93 pattern is equivalent +in Bash to: `[[ fo0bar == !(!(fo[0-9])|!(+([[:alnum:]])))bar ]]`, which +is technically more portable, but ugly. + +\* ksh93\'s [printf](commands/builtin/printf) builtin can translate from +shell patterns to ERE and back again using the `%R` and `%P` format +specifiers respectively. + +TODO: `~()` (and regex), `.sh.match`, backrefs, special `${var/.../...}` +behavior, `%()` diff --git a/docs/syntax/pe.md b/docs/syntax/pe.md new file mode 100644 index 0000000..db9d535 --- /dev/null +++ b/docs/syntax/pe.md @@ -0,0 +1,1056 @@ +# Parameter expansion + +![](keywords>bash shell scripting expansion substitution text variable parameter mangle substitute change check defined null array arrays) + +## Introduction + +One core functionality of Bash is to manage **parameters**. A parameter +is an entity that stores values and is referenced by a **name**, a +**number** or a **special symbol**. 
+ +- parameters referenced by a name are called **variables** (this also + applies to [arrays](/syntax/arrays)) +- parameters referenced by a number are called **positional + parameters** and reflect the arguments given to a shell +- parameters referenced by a **special symbol** are auto-set + parameters that have different special meanings and uses + +**Parameter expansion** is the procedure to get the value from the +referenced entity, like expanding a variable to print its value. On +expansion time you can do very nasty things with the parameter or its +value. These things are described here. + +**If you saw** some parameter expansion syntax somewhere, and need to +check what it can be, try the overview section below! + +**Arrays** can be special cases for parameter expansion, every +applicable description mentions arrays below. Please also see the +[article about arrays](/syntax/arrays). + +For a more technical view what a parameter is and which types exist, +[see the dictionary entry for \"parameter\"](/dict/terms/parameter). + +## Overview + +Looking for a specific syntax you saw, without knowing the name? + +- [Simple usage](#simple_usage) + - `$PARAMETER` + - `${PARAMETER}` +- [Indirection](#indirection) + - `${!PARAMETER}` +- [Case modification](#case_modification) + - `${PARAMETER^}` + - `${PARAMETER^^}` + - `${PARAMETER,}` + - `${PARAMETER,,}` + - `${PARAMETER~}` + - `${PARAMETER~~}` +- [Variable name expansion](#variable_name_expansion) + - `${!PREFIX*}` + - `${!PREFIX@}` +- [Substring removal](#substring_removal) (also for **filename + manipulation**!) 
+ - `${PARAMETER#PATTERN}` + - `${PARAMETER##PATTERN}` + - `${PARAMETER%PATTERN}` + - `${PARAMETER%%PATTERN}` +- [Search and replace](#search_and_replace) + - `${PARAMETER/PATTERN/STRING}` + - `${PARAMETER//PATTERN/STRING}` + - `${PARAMETER/PATTERN}` + - `${PARAMETER//PATTERN}` +- [String length](#string_length) + - `${#PARAMETER}` +- [Substring expansion](#substring_expansion) + - `${PARAMETER:OFFSET}` + - `${PARAMETER:OFFSET:LENGTH}` +- [Use a default value](#use_a_default_value) + - `${PARAMETER:-WORD}` + - `${PARAMETER-WORD}` +- [Assign a default value](#assign_a_default_value) + - `${PARAMETER:=WORD}` + - `${PARAMETER=WORD}` +- [Use an alternate value](#use_an_alternate_value) + - `${PARAMETER:+WORD}` + - `${PARAMETER+WORD}` +- [Display error if null or unset](#display_error_if_null_or_unset) + - `${PARAMETER:?WORD}` + - `${PARAMETER?WORD}` + +## Simple usage + +`$PARAMETER` + +`${PARAMETER}` + +The easiest form is to just use a parameter\'s name within braces. This +is identical to using `$FOO` like you see it everywhere, but has the +advantage that it can be immediately followed by characters that would +be interpreted as part of the parameter name otherwise. Compare these +two expressions (`WORD="car"` for example), where we want to print a +word with a trailing \"s\": + + echo "The plural of $WORD is most likely $WORDs" + echo "The plural of $WORD is most likely ${WORD}s" + +[Why does the first one fail?]{.underline} It prints nothing, because a +parameter (variable) named \"`WORDs`\" is undefined and thus printed as +\"\" (*nothing*). Without using braces for parameter expansion, Bash +will interpret the sequence of all valid characters from the introducing +\"`$`\" up to the last valid character as name of the parameter. When +using braces you just force Bash to **only interpret the name inside +your braces**. 
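A small sketch of where the parameter name ends, assuming a variable `WORD` and a deliberately unset, hypothetical variable `WORD_list`:

```bash
#!/usr/bin/env bash
WORD="car"
unset WORD_list          # make sure this (hypothetical) variable does not exist

a="$WORD_list"           # Bash looks up the variable WORD_list: empty
b="${WORD}_list"         # the braces end the name after WORD: car_list
c="$WORD.txt"            # "." cannot be part of a name, so it ends the name: car.txt

printf '[%s]\n' "$a" "$b" "$c"
```

The underscore is a valid name character and therefore needs the braces, while the dot does not.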
+ +Also, please remember, that **parameter names are** (like nearly +everything in UNIX(r)) **case sensitive!** + +The second form with the curly braces is also needed to access +positional parameters (arguments to a script) beyond `$9`: + + echo "Argument 1 is: $1" + echo "Argument 10 is: ${10}" + +### Simple usage: Arrays + +See also the [article about general array syntax](/syntax/arrays) + +For arrays you always need the braces. The arrays are expanded by +individual indexes or mass arguments. An individual index behaves like a +normal parameter, for the mass expansion, please read the article about +arrays linked above. + +- \${array\[5\]} +- \${array\[\*\]} +- \${array\[@\]} + +## Indirection + +`${!PARAMETER}` + +In some cases, like for example + + ${PARAMETER} + + ${PARAMETER:0:3} + +you can instead use the form + + ${!PARAMETER} + +to enter a level of indirection. The referenced parameter is not +`PARAMETER` itself, but the parameter whose name is stored as the value +of `PARAMETER`. If the parameter `PARAMETER` has the value \"`TEMP`\", +then `${!PARAMETER}` will expand to the value of the parameter named +`TEMP`: + + read -rep 'Which variable do you want to inspect? ' look_var + + printf 'The value of "%s" is: "%s"\n' "$look_var" "${!look_var}" + +Of course the indirection also works with special variables: + + # set some fake positional parameters + set one two three four + + # get the LAST argument ("#" stores the number of arguments, so "!#" will reference the LAST argument) + echo ${!#} + +You can think of this mechanism as being roughly equivalent to taking +any parameter expansion that begins with the parameter name, and +substituting the `!PARAMETER` part with the value of PARAMETER. + + echo "${!var^^}" + # ...is equivalent to + eval 'echo "${'"$var"'^^}"' + +It was an unfortunate design decision to use the `!` prefix for +indirection, as it introduces parsing ambiguity with other parameter +expansions that begin with `!`. 
Indirection is not possible in +combination with any parameter expansion whose modifier requires a +prefix to the parameter name. Specifically, indirection isn\'t possible +on the `${!var@}`, `${!var*}`, `${!var[@]}`, `${!var[*]}`, and `${#var}` +forms. This means the `!` prefix can\'t be used to retrieve the indices +of an array, the length of a string, or number of elements in an array +indirectly (see [syntax/arrays#indirection](syntax/arrays#indirection) +for workarounds). Additionally, the `!`-prefixed parameter expansion +conflicts with ksh-like shells which have the more powerful +\"name-reference\" form of indirection, where the exact same syntax is +used to expand to the name of the variable being referenced. + +Indirect references to [array names](/syntax/arrays) are also possible +since the Bash 3 series (exact version unknown), but undocumented. See +[syntax/arrays#indirection](syntax/arrays#indirection) for details. + +Chet has added an initial implementation of the ksh `nameref` +declaration command to the git devel branch. (`declare -n`, `local -n`, +etc, will be supported). This will finally address many issues around +passing and returning complex datatypes to/from functions. + +## Case modification + +`${PARAMETER^}` + +`${PARAMETER^^}` + +`${PARAMETER,}` + +`${PARAMETER,,}` + +`${PARAMETER~}` + +`${PARAMETER~~}` + +These expansion operators modify the case of the letters in the expanded +text. + +The `^` operator modifies the first character to uppercase, the `,` +operator to lowercase. When using the double-form (`^^` and `,,`), all +characters are converted. + +\ + +The (**currently undocumented**) operators `~` and `~~` reverse the case +of the given text (in `PARAMETER`).`~` reverses the case of first letter +of words in the variable while `~~` reverses case for all. Thanks to +`Bushmills` and `geirha` on the Freenode IRC channel for this finding. 
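The documented operators `^`, `^^`, `,` and `,,` can be sketched as follows. The variable names and values are made up for this example, and Bash 4 or newer is assumed.

```bash
#!/usr/bin/env bash
name="john smith"
shout="HELLO WORLD"

first_upper=${name^}     # John smith
all_upper=${name^^}      # JOHN SMITH
first_lower=${shout,}    # hELLO WORLD
all_lower=${shout,,}     # hello world

printf '%s\n' "$first_upper" "$all_upper" "$first_lower" "$all_lower"
```

Only the first character of the whole value is changed by the single forms, not the first character of every word.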
+ +\ + +[**Example: Rename all `*.txt` filenames to lowercase**]{.underline} + + for file in *.txt; do + mv "$file" "${file,,}" + done + +[**Note:**]{.underline} Case modification is a handy feature you can +apply to a name or a title. Or is it? Case modification was an important +aspect of the Bash 4 release. Bash version 4, RC1 would perform word +splitting, and then case modification, resulting in title case (where +every word is capitalized). It was decided to apply case modification to +values, not words, for the Bash 4 release. Thanks Chet. + +### Case modification: Arrays + +Case modification can be used to create the proper capitalization for +names or titles. Just assign it to an array: + +`declare -a title=(my hello world john smith)` + +For [array](/syntax/arrays) expansion, the case modification applies to +**every expanded element, no matter if you expand an individual index or +mass-expand** the whole array using `@` or `*` subscripts. Some +examples: + +Assume: `array=(This is some Text)` + +- `echo "${array[@],}"` + - =\> `this is some text` +- `echo "${array[@],,}"` + - =\> `this is some text` +- `echo "${array[@]^}"` + - =\> `This Is Some Text` +- `echo "${array[@]^^}"` + - =\> `THIS IS SOME TEXT` + +```{=html} + +``` + * ''echo "${array[2]^^}"'' + * => ''SOME'' + +## Variable name expansion + +`${!PREFIX*}` + +`${!PREFIX@}` + +This expands to a list of all set **variable names** beginning with the +string `PREFIX`. The elements of the list are separated by the first +character in the `IFS`-variable (\ by default). + +This will show all defined variable names (not values!) beginning with +\"BASH\": + + $ echo ${!BASH*} + BASH BASH_ARGC BASH_ARGV BASH_COMMAND BASH_LINENO BASH_SOURCE BASH_SUBSHELL BASH_VERSINFO BASH_VERSION + +This list will also include [array names](/syntax/arrays). 
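A minimal sketch of looping over the expanded names, assuming three hypothetical variables that share the prefix `myapp_`:

```bash
#!/usr/bin/env bash
# hypothetical variables sharing the prefix "myapp_"
myapp_host=localhost
myapp_port=8080
myapp_paths=(/tmp /var/tmp)   # array names are listed, too

# Inside double quotes, the @ form expands to one word per variable name.
names=("${!myapp_@}")

for name in "${names[@]}"; do
    printf 'found variable: %s\n' "$name"
done
printf '%d names found\n' "${#names[@]}"
```

Only the names are collected here; the values could then be read through indirection (`${!name}`).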
+ +## Substring removal + +`${PARAMETER#PATTERN}` + +`${PARAMETER##PATTERN}` + +`${PARAMETER%PATTERN}` + +`${PARAMETER%%PATTERN}` + +This one can **expand only a part** of a parameter\'s value, **given a +pattern to describe what to remove** from the string. The pattern is +interpreted just like a pattern to describe a filename to match +(globbing). See [Pattern matching](/syntax/pattern) for more. + +Example string (*just a quote from a big man*): + + MYSTRING="Be liberal in what you accept, and conservative in what you send" + +### From the beginning + +`${PARAMETER#PATTERN}` and `${PARAMETER##PATTERN}` + +This form removes the text matching the described [pattern](/syntax/pattern), trying +to **match it from the beginning of the string**. The operator \"`#`\" +will try to remove the shortest text matching the pattern, while +\"`##`\" tries to do it with the longest text matching. Look at the +following examples to get the idea (matched text ~~struck through~~, +remember it will be removed!): + + Syntax Result + -------------------- ---------------------------------------------------------------------- + `${MYSTRING#*in}` ~~Be liberal in~~ what you accept, and conservative in what you send + `${MYSTRING##*in}` ~~Be liberal in what you accept, and conservative in~~ what you send + +### From the end + +`${PARAMETER%PATTERN}` and `${PARAMETER%%PATTERN}` + +In the second form everything will be the same, except that Bash now +tries to match the pattern from the end of the string: + + Syntax Result + -------------------- ---------------------------------------------------------------------- + `${MYSTRING%in*}` Be liberal in what you accept, and conservative ~~in what you send~~ + `${MYSTRING%%in*}` Be liberal ~~in what you accept, and conservative in what you send~~ + +Note that the second form (`%%in*`) reduces any value that begins with `in` +to the empty string, because the longest match from the end then covers +the whole value.
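The four operators can be tried with the example string from above. This is only a sketch: the result variable names and the second test value `in the house` are assumptions for illustration.

```bash
#!/usr/bin/env bash
MYSTRING="Be liberal in what you accept, and conservative in what you send"

short_front=${MYSTRING#*in}    # shortest match from the beginning removed
long_front=${MYSTRING##*in}    # longest match from the beginning removed
short_back=${MYSTRING%in*}     # shortest match from the end removed
long_back=${MYSTRING%%in*}     # longest match from the end removed

# The brackets make leading and trailing spaces visible.
printf '[%s]\n' "$short_front" "$long_front" "$short_back" "$long_back"

# A value that itself begins with "in" is removed completely by %%in*
value="in the house"
printf '[%s]\n' "${value%%in*}"   # prints []
```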
+ +### Common use + +[**How the heck does that help to make my life easier?**]{.underline} + +Well, maybe the most common use for it is to **extract parts of a +filename**. Just look at the following list with examples (removed text +~~struck through~~): + +- **Get name without extension** + - `${FILENAME%.*}` + - =\> bash_hackers~~.txt~~ +- **Get extension** + - `${FILENAME##*.}` + - =\> ~~bash_hackers.~~txt +- **Get directory name** + - `${PATHNAME%/*}` + - =\> /home/bash~~/bash_hackers.txt~~ +- **Get filename** + - `${PATHNAME##*/}` + - =\> ~~/home/bash/~~bash_hackers.txt + +These are the syntaxes for filenames with a single extension. Depending +on your needs, you might need to adjust shortest/longest match. + +### Substring removal: Arrays + +As for most parameter expansion features, working on +[arrays](/syntax/arrays) **will handle each expanded element**, for +individual expansion and also for mass expansion. + +Simple example, removing a trailing `is` from all array elements (on +expansion): + +Assume: `array=(This is a text)` + +- `echo "${array[@]%is}"` + - =\> `Th a text` + - (it was: `This is a text`) + +All other variants of this expansion behave the same. + +## Search and replace + +`${PARAMETER/PATTERN/STRING}` + +`${PARAMETER//PATTERN/STRING}` + +`${PARAMETER/PATTERN}` + +`${PARAMETER//PATTERN}` + +This one can substitute (*replace*) a substring [matched by a +pattern](/syntax/pattern), on expansion time. The matched substring will +be entirely removed and the given string will be inserted. Again some +example string for the tests: + + MYSTRING="Be liberal in what you accept, and conservative in what you send" + +The two main forms only differ in **the number of slashes** after the +parameter name: `${PARAMETER/PATTERN/STRING}` and +`${PARAMETER//PATTERN/STRING}` + +The first one (*one slash*) is to only substitute **the first +occurrence** of the given pattern, the second one (*two slashes*) is to +substitute **all occurrences** of the pattern.
+ +First, let\'s try to say \"happy\" instead of \"conservative\" in our +example string (replaced text ~~struck through~~): + + ${MYSTRING//conservative/happy} + +=\> +Be liberal in what you accept, and ~~conservative~~happy in what you send + +Since there is only one \"conservative\" in that example, it really +doesn\'t matter which of the two forms we use. + +Let\'s play with the word \"in\", I don\'t know if it makes any sense, +but let\'s substitute it with \"by\". + +[**First form: Substitute first occurrence**]{.underline} + + ${MYSTRING/in/by} + +=\> Be liberal ~~in~~by what you accept, and conservative in what you send + +[**Second form: Substitute all occurrences**]{.underline} + + ${MYSTRING//in/by} + +=\> +Be liberal ~~in~~by what you accept, and conservative ~~in~~by what you send + +[**Anchoring**]{.underline} Additionally you can \"anchor\" an +expression: A `#` (hashmark) will indicate that your expression is +matched against the beginning portion of the string, a `%` +(percent-sign) will do it for the end portion. + + MYSTRING=xxxxxxxxxx + echo ${MYSTRING/#x/y} # RESULT: yxxxxxxxxx + echo ${MYSTRING/%x/y} # RESULT: xxxxxxxxxy + +If the replacement part is completely omitted, the matches are replaced +by the nullstring, i.e., they are removed. This is equivalent to +specifying an empty replacement: + + echo ${MYSTRING//conservative/} + # is equivalent to + echo ${MYSTRING//conservative} + +### Search and replace: Arrays + +This parameter expansion type applied to [arrays](/syntax/arrays) +**applies to all expanded elements**, no matter if an individual element +is expanded, or all elements using the mass expansion syntaxes. + +A simple example, changing the (lowercase) letter `t` to `d`: + +Assume: `array=(This is a text)` + +- `echo "${array[@]/t/d}"` + - =\> `This is a dext` +- `echo "${array[@]//t/d}"` + - =\> `This is a dexd` + +## String length + +`${#PARAMETER}` + +When you use this form, the length of the parameter\'s value is +expanded.
Again, a quote from a big man, to have a test text: + + MYSTRING="Be liberal in what you accept, and conservative in what you send" + +Using echo `${#MYSTRING}`\... + +=\> `64` + +The length is reported in characters, not in bytes. Depending on your +environment this may not always be the same (multibyte-characters, like +in UTF8 encoding). + +There\'s not much to say about it, mh? + +### (String) length: Arrays + +For [arrays](/syntax/arrays), this expansion type has two meanings: + +- For **individual** elements, it reports the string length of the + element (as for every \"normal\" parameter) +- For the **mass subscripts** `@` and `*` it reports the number of set + elements in the array + +Example: + +Assume: `array=(This is a text)` + +- `echo ${#array[1]}` + - =\> 2 (the word \"is\" has a length of 2) +- `echo ${#array[@]}` + - =\> 4 (the array contains 4 elements) + +[**Attention:**]{.underline} The number of used elements does not need +to conform to the highest index. Sparse arrays are possible in Bash, +that means you can have 4 elements, but with indexes 1, 7, 20, 31. **You +can\'t loop through such an array with a counter loop based on the +number of elements!** + +## Substring expansion + +`${PARAMETER:OFFSET}` + +`${PARAMETER:OFFSET:LENGTH}` + +This one can expand only a **part** of a parameter\'s value, given a +**position to start** and maybe a **length**. If `LENGTH` is omitted, +the parameter will be expanded up to the end of the string. If `LENGTH` +is negative, it\'s taken as a second offset into the string, counting +from the end of the string. + +`OFFSET` and `LENGTH` can be **any** [arithmetic +expression](/syntax/arith_expr). **Take care:** The `OFFSET` starts at +0, not at 1! 
+ +Example string (a quote from a big man): +`MYSTRING="Be liberal in what you accept, and conservative in what you send"` + +### Using only Offset + +In the first form, the expansion is used without a length value, note +that the offset 0 is the first character (text that is cut off is +~~struck through~~ in the results): + + echo ${MYSTRING:35} + +=\> +~~Be liberal in what you accept, and~~ conservative in what you send + +### Using Offset and Length + +In the second form we also give a length value: + + echo ${MYSTRING:35:12} + +=\> +~~Be liberal in what you accept, and~~ conservative ~~in what you send~~ + +### Negative Offset Value + +If the given offset is negative, it\'s counted from the end of the +string, i.e. an offset of -1 is the last character. In that case, the +length still counts forward, of course. There is one thing to watch out for when +using a negative offset: You need to separate the (negative) number from +the colon: + + ${MYSTRING: -10:5} + ${MYSTRING:(-10):5} + +Why? Because otherwise `:-` is interpreted as the parameter expansion syntax to [use +a default value](/syntax/pe#use_a_default_value). + +### Negative Length Value + +If the `LENGTH` value is negative, it\'s used as offset from the end of +the string. The expansion happens from the first to the second offset +then: + + echo "${MYSTRING:11:-17}" + +=\> +~~Be liberal~~ in what you accept, and conservative ~~in what you send~~ + +This works since Bash 4.2-alpha, see also +[bashchanges](/scripting/bashchanges).
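The variants described in this section, put together in one sketch. The result variable names and the helper variable `start` are assumptions for illustration, and the negative length needs Bash 4.2 or newer.

```bash
#!/usr/bin/env bash
MYSTRING="Be liberal in what you accept, and conservative in what you send"

from_offset=${MYSTRING:35}         # conservative in what you send
with_length=${MYSTRING:35:12}      # conservative
neg_offset=${MYSTRING: -4}         # send (note the space before the minus sign)
neg_length=${MYSTRING:11:-17}      # in what you accept, and conservative

# OFFSET and LENGTH may be arithmetic expressions
start=30
computed=${MYSTRING:start+5:6*2}   # offset 35, length 12: conservative

printf '%s\n' "$from_offset" "$with_length" "$neg_offset" "$neg_length" "$computed"
```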
+ +### Substring/Element expansion: Arrays + +For [arrays](/syntax/arrays), this expansion type has again 2 meanings: + +- For **individual** elements, it expands to the specified substring + (as for every "normal" parameter) +- For the **mass subscripts** `@` and `*` it mass-expands individual + array elements denoted by the 2 numbers given (*starting element*, + *number of elements*) + +Example: + +Assume: `array=(This is a text)` + +- `echo ${array[0]:2:2}` + - =\> `is` (the \"is\" in \"This\", array element 0) +- `echo ${array[@]:1:2}` + - =\> `is a` (from element 1 inclusive, 2 elements are expanded, + i.e. element 1 and 2) + +## Use a default value + +`${PARAMETER:-WORD}` + +`${PARAMETER-WORD}` + +If the parameter `PARAMETER` is unset (never was defined) or null +(empty), this one expands to `WORD`, otherwise it expands to the value +of `PARAMETER`, as if it just was `${PARAMETER}`. If you omit the `:` +(colon), like shown in the second form, the default value is only used +when the parameter was **unset**, not when it was empty. + + echo "Your home directory is: ${HOME:-/home/$USER}." + echo "${HOME:-/home/$USER} will be used to store your personal data." + +If `HOME` is unset or empty, everytime you want to print something +useful, you need to put that parameter syntax in. + + #!/bin/bash + + read -p "Enter your gender (just press ENTER to not tell us): " GENDER + echo "Your gender is ${GENDER:-a secret}." + +It will print \"Your gender is a secret.\" when you don\'t enter the +gender. Note that the default value is **used on expansion time**, it is +**not assigned to the parameter**. + +### Use a default value: Arrays + +For [arrays](/syntax/arrays), the behaviour is very similar. Again, you +have to make a difference between expanding an individual element by a +given index and mass-expanding the array using the `@` and `*` +subscripts. 
+ +- For individual elements, it\'s the very same: If the expanded + element is `NULL` or unset (watch the `:-` and `-` variants), the + default text is expanded +- For mass-expansion syntax, the default text is expanded if the array + - contains no element or is unset (the `:-` and `-` variants mean + the **same** here) + - contains only elements that are the nullstring (the `:-` + variant) + +In other words: The basic meaning of this expansion type is applied as +consistent as possible to arrays. + +Example code (please try the example cases yourself): + + + #### + # Example cases for unset/empty arrays and nullstring elements + #### + + + ### CASE 1: Unset array (no array) + + # make sure we have no array at all + unset array + + echo ${array[@]:-This array is NULL or unset} + echo ${array[@]-This array is NULL or unset} + + ### CASE 2: Set but empty array (no elements) + + # declare an empty array + array=() + + echo ${array[@]:-This array is NULL or unset} + echo ${array[@]-This array is NULL or unset} + + + ### CASE 3: An array with only one element, a nullstring + array=("") + + echo ${array[@]:-This array is NULL or unset} + echo ${array[@]-This array is NULL or unset} + + + ### CASE 4: An array with only two elements, a nullstring and a normal word + array=("" word) + + echo ${array[@]:-This array is NULL or unset} + echo ${array[@]-This array is NULL or unset} + +## Assign a default value + +`${PARAMETER:=WORD}` + +`${PARAMETER=WORD}` + +This one works like the [using default +values](/syntax/pe#use_a_default_value), but the default text you give +is not only expanded, but also **assigned** to the parameter, if it was +unset or null. Equivalent to using a default value, when you omit the +`:` (colon), as shown in the second form, the default value will only be +assigned when the parameter was **unset**. + + echo "Your home directory is: ${HOME:=/home/$USER}." + echo "$HOME will be used to store your personal data." 
+ +After the first expansion here (`${HOME:=/home/$USER}`), `HOME` is set +and usable. + +Let\'s change our code example from above: + + #!/bin/bash + + read -p "Enter your gender (just press ENTER to not tell us): " GENDER + echo "Your gender is ${GENDER:=a secret}." + echo "Ah, in case you forgot, your gender is really: $GENDER" + +### Assign a default value: Arrays + +For [arrays](/syntax/arrays) this expansion type is limited. For an +individual index, it behaves like for a \"normal\" parameter, the +default value is assigned to this one element. The mass-expansion +subscripts `@` and `*` **can not be used here** because it\'s not +possible to assign to them! + +## Use an alternate value + +`${PARAMETER:+WORD}` + +`${PARAMETER+WORD}` + +This form expands to nothing if the parameter is unset or empty. If it +is set, it does not expand to the parameter\'s value, **but to some text +you can specify**: + + echo "The Java application was installed and can be started.${JAVAPATH:+ NOTE: JAVAPATH seems to be set}" + +The above code will simply add a warning if `JAVAPATH` is set (because +it could influence the startup behaviour of that imaginary application). + +Some more unrealistic example\... Ask for some flags (for whatever +reason), and then, if they were set, print a warning and also print the +flags: + + #!/bin/bash + + read -p "If you want to use special flags, enter them now: " SPECIAL_FLAGS + echo "The installation of the application is finished${SPECIAL_FLAGS:+ (NOTE: there are special flags set: $SPECIAL_FLAGS)}." + +If you omit the colon, as shown in the second form +(`${PARAMETER+WORD}`), the alternate value will be used if the parameter +is set (and it can be empty)! You can use it, for example, to complain +if variables you need (and that can be empty) are undefined: + + # test that with the three stages: + + # unset foo + # foo="" + # foo="something" + + if [[ ${foo+isset} = isset ]]; then + echo "foo is set..." + else + echo "foo is not set..." 
+ fi + +### Use an alternate value: Arrays + +Similar to the cases for [arrays](/syntax/arrays) to expand to a default +value, this expansion behaves like for a \"normal\" parameter when using +individual array elements by index, but reacts differently when using +the mass-expansion subscripts `@` and `*`: + +- For individual elements, it\'s the very same: If the expanded element is **not** NULL or unset (watch the `:+` and `+` variants), the alternate text is expanded +- For mass-expansion syntax, the alternate text is expanded if the array + - contains elements where at least one element is **not** a nullstring (the `:+` and `+` variants mean the same here) + - contains **only** elements that are **not** the nullstring (the `:+` variant) + +For some cases to play with, please see the code examples in the +[description for using a default value](#use_a_default_valuearrays). + +## Display error if null or unset + +`${PARAMETER:?WORD}` + +`${PARAMETER?WORD}` + +If the parameter `PARAMETER` is set/non-null, this form will simply +expand it. Otherwise, the expansion of `WORD` will be appended to +an error message: + + $ echo "The unset parameter is: ${p_unset?not set}" + bash: p_unset: not set + +After printing this message, + +- an interactive shell has `$?` set to a non-zero value +- a non-interactive shell exits with a non-zero exit code + +The meaning of the colon (`:`) is the same as for the other parameter +expansion syntaxes: It specifies if + +- only unset or +- unset and empty parameters + +are taken into account. + +## Code examples + +### Substring removal + +Removing the first or last 6 characters from a text string: + + STRING="Hello world" + + # only print 'Hello' + echo "${STRING%??????}" + + # only print 'world' + echo "${STRING#??????}" + + # store it into the same variable + STRING=${STRING#??????} + +## Bugs and Portability considerations + +- **Fixed in 4.2.36** + ([patch](ftp://ftp.cwru.edu/pub/bash/bash-4.2-patches/bash42-036)).
+ Bash doesn\'t follow either POSIX or its own documentation when + expanding either a quoted `"$@"` or `"${arr[@]}"` with an adjacent + expansion. `"$@$x"` expands in the same way as `"$*$x"` - i.e. all + parameters plus the adjacent expansion are concatenated into a + single argument. As a workaround, each expansion needs to be quoted + separately. Unfortunately, this bug took a very long time to + notice.`~ $ set -- a b c; x=foo; printf '<%s> ' "$@$x" "$*""$x" "$@""$x" + + ` + +```{=html} + +``` +- Almost all shells disagree about the treatment of an unquoted `$@`, + `${arr[@]}`, `$*`, and `${arr[*]}` when + [IFS](http://mywiki.wooledge.org/IFS) is set to null. POSIX is + unclear about the expected behavior. A null IFS causes both [word + splitting](/syntax/expansion/wordsplit) and [pathname + expansion](/syntax/expansion/globs) to behave randomly. Since there + are few good reasons to leave `IFS` set to null for more than the + duration of a command or two, and even fewer to expand `$@` and `$*` + unquoted, this should be a rare issue. **Always quote + them**!`touch x 'y z' + for sh in bb {{d,b}a,{m,}k,z}sh; do + echo "$sh" + "$sh" -s a 'b c' d \* ' $* + echo + printf "<%s> " $@ + echo + EOF + ``bb + + + dash + + + bash + + + mksh + + + ksh + + + zsh + + + `When `IFS` is set to a non-null value, or unset, all shells behave + the same - first expanding into separate args, then applying + pathname expansion and word-splitting to the results, except for + zsh, which doesn\'t do pathname expansion in its default mode. 
+ +```{=html} + +``` +- Additionally, shells disagree about various wordsplitting behaviors, + the behavior of inserting delimiter characters from IFS in `$*`, and + the way adjacent arguments are concatenated, when IFS is modified in + the middle of expansion through + side-effects.`for sh in bb {{d,b}a,po,{m,}k,z}sh; do + printf '%-4s: ' "$sh" + "$sh" ' ${*}${IFS=}${*}${IFS:=-}"${*}" + echo + EOF + ``bb : + dash: + bash: + posh: + mksh: + ksh : + zsh : + `ksh93 and mksh can additionally achieve this side effect (and + others) via the `${ cmds;}` expansion. I haven\'t yet tested every + possible side-effect that can affect expansion halfway through + expansion that way. + +```{=html} + +``` +- As previously mentioned, the Bash form of indirection by prefixing a + parameter expansion with a `!` conflicts with the same syntax used + by mksh, zsh, and ksh93 for a different purpose. Bash will + \"slightly\" modify this expansion in the next version with the + addition of namerefs. + +```{=html} + +``` +- Bash (and most other shells) don\'t allow .\'s in identifiers. In + ksh93, dots in variable names are used to reference methods (i.e. + \"Discipline Functions\"), attributes, special shell variables, and + to define the \"real value\" of an instance of a class. + +```{=html} + +``` +- In ksh93, the `_` parameter has even more uses. It is used in the + same way as `self` in some object-oriented languages; as a + placeholder for some data local to a class; and also as the + mechanism for class inheritance. In most other contexts, `_` is + compatible with Bash. + +```{=html} + +``` +- Bash only evaluates the subscripts of the slice expansion + (`${x:y:z}`) if the parameter is set (for both nested expansions and + arithmetic). For ranges, Bash evaluates as little as possible, i.e., + if the first part is out of range, the second won\'t be evaluated. + ksh93 and mksh always evaluate the subscript parts even if the + parameter is unset. 
+ ` $ bash -c 'n="y[\$(printf yo >&2)1]" m="y[\$(printf jo >&2)1]"; x=(); echo "${x[@]:n,6:m}"' # No output + $ bash -c 'n="y[\$(printf yo >&2)1]" m="y[\$(printf jo >&2)1]"; x=([5]=hi); echo "${x[@]:n,6:m}"' + yo + $ bash -c 'n="y[\$(printf yo >&2)1]" m="y[\$(printf jo >&2)1]"; x=([6]=hi); echo "${x[@]:n,6:m}"' + yojo + $ bash -c 'n="y[\$(printf yo >&2)1]" m="y[\$(printf jo >&2)1]"; x=12345; echo "${x:n,5:m}"' + yojo + $ bash -c 'n="y[\$(printf yo >&2)1]" m="y[\$(printf jo >&2)1]"; x=12345; echo "${x:n,6:m}"' + yo + ` + +### Quote Nesting + +- In most shells, when dealing with an \"alternate\" parameter + expansion that expands to multiple words, and nesting such + expansions, not all combinations of nested quoting are possible. + +```{=html} + +``` + # Bash + $ typeset -a a=(meh bleh blerg) b + $ IFS=e + $ printf "<%s> " "${b[@]-"${a[@]}" "${a[@]}"}"; echo # The entire PE is quoted so Bash considers the inner quotes redundant. + + $ printf "<%s> " "${b[@]-${a[@]} ${a[@]}}"; echo # The outer quotes cause the inner expansions to be considered quoted. + + $ b=(meep beep) + $ printf "<%s> " "${b[@]-"${a[@]}" "${a[@]}"}" "${b[@]-${a[@]} ${a[@]}}"; echo # Again no surprises. Outer quotes quote everything recursively. + + +Now let\'s see what can happen if we leave the outside unquoted. + + # Bash + $ typeset -a a=(meh bleh blerg) b + $ IFS=e + $ printf "<%s> " ${b[@]-"${a[@]}" "${a[@]}"}; echo # Inner quotes make inner expansions quoted. + + $ printf "<%s> " ${b[@]-${a[@]} ${a[@]}}; echo # No quotes at all wordsplits / globs, like you'd expect. + + +This all might be intuitive, and is the most common implementation, but +this design sucks for a number of reasons. For one, it means Bash makes +it absolutely impossible to expand any part of the inner region +*unquoted* while leaving the outer region quoted. Quoting the outer +forces quoting of the inner regions recursively (except nested command +substitutions of course).
Word-splitting is necessary to split words of +the inner region, which cannot be done together with outer quoting. +Consider the following (only slightly far-fetched) code: + + # Bash (non-working example) + + unset -v IFS # make sure we have a default IFS + + if some crap; then + typeset -a someCmd=(myCmd arg1 'arg2 yay!' 'third*arg*' 4) + fi + + someOtherCmd=mycommand + typeset -a otherArgs=(arg3 arg4) + + # What do you think the programmer expected to happen here? + # What do you think will actually happen... + + "${someCmd[@]-"$someOtherCmd" arg2 "${otherArgs[@]}"}" arg5 + +This final line is perhaps not the most obvious, but I\'ve run into +cases were this type of logic can be desirable and realistic. We can +deduce what was intended: + +- If `someCmd` is set, then the resulting expansion should run the + command: `"myCmd" "arg1" "arg2 yay!" "third*arg*" "4" "arg5"` +- Otherwise, if `someCmd` is not set, expand `$someOtherCmd` and the + inner args, to run a different command: + `"mycommand" "arg2" "arg3" "arg4" "arg5"`. + +Unfortunately, it is impossible to get the intended result in Bash (and +most other shells) without taking a considerably different approach. The +only way to split the literal inner parts is through word-splitting, +which requires that the PE be unquoted. But, the only way to expand the +outer expansion correctly without word-splitting or globbing is to quote +it. Bash will actually expand the command as one of these: + + # The quoted PE produces a correct result here... + $ bash -c 'typeset -a someCmd=(myCmd arg1 "arg2 yay!" "third*arg*" 4); printf "<%s> " "${someCmd[@]-"$someOtherCmd" arg2 "${otherArgs[@]}"}" arg5; echo' + <4> + + # ...but in the opposite case the first 3 arguments are glued together. There are no workarounds. + $ bash -c 'typeset -a otherArgs=(arg3 arg4); someOtherCmd=mycommand; printf "<%s> " "${someCmd[@]-"$someOtherCmd" arg2 "${otherArgs[@]}"}" arg5; echo' + + + # UNLESS! 
we unquote the outer expansion allowing the inner quotes to + # affect the necessary parts while allowing word-splitting to split the literals: + $ bash -c 'typeset -a otherArgs=(arg3 arg4); someOtherCmd=mycommand; printf "<%s> " ${someCmd[@]-"$someOtherCmd" arg2 "${otherArgs[@]}"} arg5; echo' + + + # Success!!! + $ bash -c 'typeset -a someCmd=(myCmd arg1 "arg2 yay!" "third*arg*" 4); printf "<%s> " ${someCmd[@]-"$someOtherCmd" arg2 "${otherArgs[@]}"} arg5; echo' + <4> + + # ...Ah f^^k. (again, no workaround possible.) + +#### The ksh93 exception + +To the best of my knowledge, ksh93 is the only shell that acts +differently. Rather than forcing nested expansions into quoting, a quote +at the beginning and end of the nested region will cause the quote state +to reverse itself within the nested part. I have no idea whether it\'s +an intentional or documented effect, but it does solve the problem and +consequently adds a lot of potential power to these expansions. + +All we need to do is add two extra double-quotes: + + # ksh93 passing the two failed tests from above: + + $ ksh -c 'otherArgs=(arg3 arg4); someOtherCmd="mycommand"; printf "<%s> " "${someCmd[@]-""$someOtherCmd" arg2 "${otherArgs[@]}""}" arg5; echo' + + + $ ksh -c 'typeset -a someCmd=(myCmd arg1 "arg2 yay!" "third*arg*" 4); printf "<%s> " "${someCmd[@]-""$someOtherCmd" arg2 "${otherArgs[@]}""}" arg5; echo' + <4> + +This can be used to control the quote state of any part of any expansion +to an arbitrary depth. Sadly, it is the only shell that does this and +the difference may introduce a possible compatibility problem. 
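
For Bash, the \"considerably different approach\" mentioned earlier can be as simple as moving the decision out of the expansion and into an ordinary test. The following is only a minimal sketch under that assumption: it reuses the hypothetical `someCmd`, `someOtherCmd` and `otherArgs` names from the example above, and `cmd` is a helper array introduced here for illustration.

```bash
#!/bin/bash
# Sketch only: someCmd, someOtherCmd and otherArgs are the hypothetical
# names from the example above; cmd is a helper array added here.
unset -v IFS someCmd
someOtherCmd=mycommand
typeset -a otherArgs=(arg3 arg4)

# Decide with a plain test instead of a nested parameter expansion.
# An unset array and an empty array both count as "no command given".
if (( ${#someCmd[@]} > 0 )); then
    typeset -a cmd=("${someCmd[@]}")
else
    typeset -a cmd=("$someOtherCmd" arg2 "${otherArgs[@]}")
fi
cmd+=(arg5)

# Show the resulting words, one pair of angle brackets per word.
printf '<%s> ' "${cmd[@]}"; echo

# To actually run the command: "${cmd[@]}"
```

Because every word is stored as its own array element, no word-splitting or globbing is needed to separate them, and the final `"${cmd[@]}"` can stay fully quoted.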
+ +## See also + +- Internal: [Introduction to expansion and + substitution](/syntax/expansion/intro) +- Internal: [Arrays](/syntax/arrays) +- Dictionary, internal: [parameter](/dict/terms/parameter) diff --git a/docs/syntax/quoting.md b/docs/syntax/quoting.md new file mode 100644 index 0000000..46ae4c6 --- /dev/null +++ b/docs/syntax/quoting.md @@ -0,0 +1,277 @@ +# Quotes and escaping + +![](keywords>bash shell scripting quoting quotes escape backslash marks singlequotes doublequotes single double) + +Quoting and escaping are important, as they influence the way Bash acts +upon your input. There are three recognized types: + +- **per-character escaping** using a backslash: `\$stuff` +- **weak quoting** with double-quotes: `"stuff"` +- **strong quoting** with single-quotes: `'stuff'` + +All three forms have the very same purpose: **They give you general +control over parsing, expansion and expansion results.** + +Besides these basic variants, there are some special quoting methods +(like interpreting ANSI-C escapes in a string) you\'ll meet below. + +**ATTENTION:** The quote characters (`"`, double quote and `'`, +single quote) are syntax elements that influence parsing. They are not +related to quote characters passed as text to the command line! The +syntax quotes are removed before the command is called! Example: + + ### NO NO NO: this passes three strings: + ### (1) "my + ### (2) multiword + ### (3) argument" + MYARG="\"my multiword argument\"" + somecommand $MYARG + + ### THIS IS NOT (!) THE SAME AS ### + somecommand "my multiword argument" + + ### YOU NEED ### + MYARG="my multiword argument" + somecommand "$MYARG" + +## Per-character escaping + +Per-character escaping is useful in expansions and substitutions.
In +general, a character that has a special meaning to Bash, like the +dollar-sign (`$`), can be masked to not have a special meaning using the +backslash: + + echo \$HOME is set to \"$HOME\" + +- `\$HOME` won\'t expand because it\'s not in variable-expansion + syntax anymore +- The backslash changes the quotes into literals - otherwise Bash + would interpret them + +The sequence `\<newline>` (an unquoted backslash, followed by a +`<newline>` character) is interpreted as **line continuation**. It is +removed from the input stream and thus effectively ignored. Use it to +beautify your code: + + # escape_sed() + # read a stream from stdin and escape characters in text that could be interpreted as + # special characters by sed + escape_sed() { + sed \ + -e 's/\//\\\//g' \ + -e 's/\&/\\\&/g' + } + +The backslash can be used to mask every character that has a special +meaning to Bash. [Exception:]{.underline} Inside a single-quoted string +(see below). + +## Weak quoting + +Inside a weak-quoted string there\'s **no special interpretation of**: + +- spaces as word-separators (on initial command line splitting and on + [word splitting](/syntax/expansion/wordsplit)!) +- single-quotes to introduce strong-quoting (see below) +- characters for pattern matching +- tilde expansion +- pathname expansion +- process substitution + +Everything else, especially [parameter expansion](/syntax/pe), is +performed! + + ls -l "*" + +Will not be expanded. `ls` gets the literal `*` as argument. It will, +unless you have a file named `*`, spit out an error. + + echo "Your PATH is: $PATH" + +Will work as expected. `$PATH` is expanded, because it\'s double (weak) +quoted.
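
The rules for weak quoting listed above can be checked with a few lines of shell. This is just an illustrative sketch: the variable names and the `count_args` helper are made up for this example and are not part of any real script.

```bash
#!/bin/bash
# "name", "spaced" and count_args are hypothetical, used only for illustration.
name="world"

# Helper: print how many arguments were received.
count_args() { printf '%s\n' "$#"; }

# Parameter expansion, command substitution and arithmetic expansion
# are all still performed inside double quotes.
expanded="Hello $name, $(printf '%s' sum) is $((1 + 1))"

# Glob characters and the tilde stay literal inside double quotes.
literal="* and ~ stay literal"

# Embedded spaces do not split a quoted string: it stays one argument.
spaced="a  b   c"
quoted_count=$(count_args "$spaced")
# Unquoted, the same value is subject to word splitting (default IFS assumed).
unquoted_count=$(count_args $spaced)

printf '%s\n' "$expanded" "$literal" "$quoted_count" "$unquoted_count"
```

With a default `IFS`, the quoted variant is counted as one argument and the unquoted one as three.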
+ +If a backslash in double quotes (\"weak quoting\") occurs, there are 2 +ways to deal with it: + +- if the backslash is followed by a character that would have a + special meaning even inside double-quotes, the backslash is removed + and the following character loses its special meaning +- if the backslash is followed by a character without special meaning, + the backslash is not removed + +In particular this means that `"\$"` will become `$`, but `"\x"` will +become `\x`. + +## Strong quoting + +Strong quoting is very easy to explain: + +Inside a single-quoted string **nothing** is interpreted, except the +single-quote that closes the string. + + echo 'Your PATH is: $PATH' + +`$PATH` won\'t be expanded, it\'s interpreted as ordinary text because +it\'s surrounded by strong quotes. + +In practise that means, to produce a text like `Here's my test...` as a +single-quoted string, you have to leave and re-enter the single quoting +to get the character \"`'`\" as literal text: + + # WRONG + echo 'Here's my test...' + + # RIGHT + echo 'Here'\''s my test...' + + # ALTERNATIVE: It's also possible to mix-and-match quotes for readability: + echo "Here's my test" + +## ANSI C like strings + +Bash provides another quoting mechanism: Strings that contain ANSI +C-like escape sequences.
The syntax is: + + $'string' + +where the following escape sequences are decoded in `string`: + + Code Meaning + -------------- ------------------------------------------------------------------------------------------------------------------------------------- + `\"` double-quote + `\'` single-quote + `\\` backslash + `\a` terminal alert character (bell) + `\b` backspace + `\e` escape (ASCII 033) + `\E` escape (ASCII 033) **\\E is non-standard** + `\f` form feed + `\n` newline + `\r` carriage return + `\t` horizontal tab + `\v` vertical tab + `\cx` a control-x character, for example, `$'\cZ'` to print the control sequence composed of Ctrl-Z (`^Z`) + `\uXXXX` Interprets `XXXX` as a hexadecimal number and prints the corresponding character from the character set (4 digits) (Bash 4.2-alpha) + `\UXXXXXXXX` Interprets `XXXXXXXX` as a hexadecimal number and prints the corresponding character from the character set (8 digits) (Bash 4.2-alpha) + `\nnn` the eight-bit character whose value is the octal value nnn (one to three digits) + `\xHH` the eight-bit character whose value is the hexadecimal value HH (one or two hex digits) + +This is especially useful when you want to pass special characters as +arguments to some programs, like passing a newline to sed. + +The resulting text is treated as if it were **single-quoted**. No +further expansion happens. + +The `$'...'` syntax comes from ksh93, but is portable to most modern +shells including pdksh. A +[specification](http://austingroupbugs.net/view.php?id=249#c590) for it +was accepted for SUS issue 7. There are still some stragglers, such as +most ash variants including dash (except busybox built with \"bash +compatibility\" features). + +## I18N/L10N + +A dollar-sign followed by a double-quoted string, for example + + echo $"generating database..." + +means I18N. If there is a translation available for that string, it is +used instead of the given text.
If not, or if the locale is `C`/`POSIX`, +the dollar sign is simply ignored, which results in a normal double +quoted string. + +If the string was replaced (translated), the result is double quoted. + +In case you\'re a C programmer: The purpose of `$"..."` is the same as +for `gettext()` or `_()`. + +For useful examples to localize your scripts, please see [Appendix I of +the Advanced Bash Scripting +Guide](http://tldp.org/LDP/abs/html/localization.html). + +**Attention:** There is a security hole. Please read [the gettext +documentation](http://www.gnu.org/software/gettext/manual/html_node/bash.html) + +## Common mistakes + +### String lists in for-loops + +The [classic for loop](/syntax/ccmd/classic_for) uses a list of words to +iterate through. The list can also be in a variable: + + mylist="DOG CAT BIRD HORSE" + +**[WRONG]{.underline}** way to iterate through this list: + + for animal in "$mylist"; do + echo $animal + done + +Why? Due to the double-quotes, technically, the expansion of `$mylist` +is seen as **one word**. The for loop iterates exactly one time, with +`animal` set to the whole list. + +**[RIGHT]{.underline}** way to iterate through this list: + + for animal in $mylist; do + echo $animal + done + +### Working out the test-command + +The command `test` or `[ ... ]` ([the classic test +command](/commands/classictest)) is an ordinary command, so ordinary +syntax rules apply. Let\'s take string comparison as an example: + + [ WORD = WORD ] + +The `]` at the end is a convenience; if you type `which [` you will see +that there is in fact a binary file with that name. So if we were +writing this as a test command it would be: + + test WORD = WORD + +When you compare variables, it\'s wise to quote them. Let\'s create a +test string with spaces: + + mystring="my string" + +And now check that string against the word \"testword\": + + [ $mystring = testword ] # WRONG! + +This fails! These are too many arguments for the string comparison test. 
+After expansion is performed, you really execute: + + [ my string = testword ] + test my string = testword + +Which is wrong, because `my` and `string` are two separate arguments. + +So what you really want to do is: + + [ "$mystring" = testword ] # RIGHT! + + test 'my string' = testword + +Now the command has three parameters, which makes sense for a binary +(two argument) operator. + +**[Hint:]{.underline}** Inside the [conditional +expression](/syntax/ccmd/conditional_expression) (`[[ ]]`) Bash doesn\'t +perform word splitting, and thus you don\'t need to quote your variable +references - they are always seen as \"one word\". + +## See also + +- Internal: [Some words about words\...](/syntax/words) +- Internal: [Word splitting](/syntax/expansion/wordsplit) +- Internal: [Introduction to expansions and + substitutions](/syntax/expansion/intro) + +```{=html} + +``` +- External: [Grymore: + Shellquoting](http://www.grymoire.com/Unix/Quote.html) diff --git a/docs/syntax/redirection.md b/docs/syntax/redirection.md new file mode 100644 index 0000000..f8bba2b --- /dev/null +++ b/docs/syntax/redirection.md @@ -0,0 +1,251 @@ +# Redirection + +\Fix me: To be continued\\ +Redirection makes it possible to control where the output of a command +goes to, and where the input of a command comes from. It\'s a mighty +tool that, together with pipelines, makes the shell powerful. The +redirection operators are checked whenever a [simple command is about to +be executed](/syntax/grammar/parser_exec). + +Under normal circumstances, there are 3 files open, accessible by the +file descriptors 0, 1 and 2, all connected to your terminal: + + Name FD Description + ---------- ---- -------------------------------------------------------- + `stdin` 0 standard input stream (e.g. keyboard) + `stdout` 1 standard output stream (e.g. 
monitor) + `stderr` 2 standard error output stream (usually also on monitor) + +The terms \"monitor\" and \"keyboard\" refer to the +same device, the **terminal** here. Check your preferred UNIX(r)-FAQ for +details, I\'m too lazy to explain what a terminal is ;-) + +Both `stdout` and `stderr` are output file descriptors. Their +difference is the **convention** that a program outputs payload on +`stdout` and diagnostic- and error-messages on `stderr`. If you write a +script that outputs error messages, please make sure you follow this +convention! + +Whenever you **name** such a filedescriptor, i.e. you want to redirect +this descriptor, you just use the number: + + # this executes the cat-command and redirects its error messages (stderr) to the bit bucket + cat some_file.txt 2>/dev/null + +Whenever you **reference** a descriptor, to point to its current target +file, then you use a \"`&`\" followed by the descriptor number: + + # this executes the echo-command and redirects its normal output (stdout) to the standard error target + echo "There was an error" 1>&2 + +The redirection operation can be **anywhere** in a simple command, so +these examples are equivalent: + + cat foo.txt bar.txt >new.txt + cat >new.txt foo.txt bar.txt + >new.txt cat foo.txt bar.txt + +Every redirection operator takes one or two +words as operands. If you have to use operands (e.g. filenames to +redirect to) that contain spaces you **must** quote them! + +## Valid redirection targets and sources + +This syntax is recognized whenever a `TARGET` or a `SOURCE` +specification (as in the detailed descriptions below) is used. + + Syntax Description + ---------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------- + `FILENAME` references a normal, ordinary filename from the filesystem (which can of course be a FIFO, too.
Simply everything you can reference in the filesystem) + `&N` references the current target/source of the filedescriptor `N` (\"duplicates\" the filedescriptor) + `&-` closes the redirected filedescriptor, useful instead of `> /dev/null` constructs (`>&-`) + `/dev/fd/N` duplicates the filedescriptor `N`, if `N` is a valid integer + `/dev/stdin` duplicates filedescriptor 0 (`stdin`) + `/dev/stdout` duplicates filedescriptor 1 (`stdout`) + `/dev/stderr` duplicates filedescriptor 2 (`stderr`) + `/dev/tcp/HOST/PORT` assuming `HOST` is a valid hostname or IP address, and `PORT` is a valid port number or service name: redirect from/to the corresponding TCP socket + `/dev/udp/HOST/PORT` assuming `HOST` is a valid hostname or IP address, and `PORT` is a valid port number or service name: redirect from/to the corresponding UDP socket + +If a target/source specification fails to open, the whole redirection +operation fails. Avoid referencing file descriptors above 9, since you +may collide with file descriptors Bash uses internally. + +## Redirecting output + + N > TARGET + +This redirects the file descriptor number `N` to the target `TARGET`. If +`N` is omitted, `stdout` is assumed (FD 1). The `TARGET` is +**truncated** before writing starts. + +If the option `noclobber` is set with [the set +builtin](/commands/builtin/set), the redirection will fail +when `TARGET` names a regular file that already exists. You can manually +override that behaviour by forcing overwrite with the redirection +operator `>|` instead of `>`. + +## Appending redirected output + + N >> TARGET + +This redirects the file descriptor number `N` to the target `TARGET`. If +`N` is omitted, `stdout` is assumed (FD 1). The `TARGET` is **not +truncated** before writing starts. + +## Redirecting output and error output + + &> TARGET + + >& TARGET + +This special syntax redirects both `stdout` and `stderr` to the +specified target.
It\'s **equivalent** to + + > TARGET 2>&1 + +Since Bash4, there\'s `&>>TARGET`, which is equivalent to +`>> TARGET 2>&1`. + +\This syntax is deprecated and should not be +used. See the page about [obsolete and deprecated +syntax](/scripting/obsolete).\ + +## Appending redirected output and error output + +To append the cumulative redirection of `stdout` and `stderr` to a file +you simply do + + >> FILE 2>&1 + + &>> FILE + +## Transporting stdout and stderr through a pipe + + COMMAND1 2>&1 | COMMAND2 + + COMMAND1 |& COMMAND2 + +## Redirecting input + + N < SOURCE + +The input descriptor `N` uses `SOURCE` as its data source. If `N` is +omitted, filedescriptor 0 (`stdin`) is assumed. + +## Here documents + +\ + + <It seems that here-documents (tested on versions +`1.14.7`, `2.05b` and `3.1.17`) are correctly terminated when there is +an EOF before the end-of-here-document tag. The reason is unknown, but +it seems to be done on purpose. Bash 4 introduced a warning message when +end-of-file is seen before the tag is reached.\ + +## Here strings + + <<< WORD + +The here-strings are a variation of the here-documents. The word `WORD` +is taken for the input redirection: + + cat <<< "Hello world... $NAME is here..." + +Just beware to quote the `WORD` if it contains spaces. Otherwise the +rest will be given as normal parameters. + +The here-string will append a newline (`\n`) to the data. + +## Multiple redirections + +More redirection operations can occur in a line of course. The order is +**important**! They\'re evaluated from **left to right**. If you want to +redirect both, `stderr` and `stdout` to the same file (like `/dev/null`, +to hide it), this is **the wrong way**: + +``` bash +# { echo OUTPUT; echo ERRORS >&2; } is to simulate something that outputs to STDOUT and STDERR +# you can test with it +{ echo OUTPUT; echo ERRORS >&2; } 2>&1 1>/dev/null +``` + +Why? 
Relatively easy: + +- initially, `stdout` points to your terminal (you read it) +- same applies to `stderr`, it\'s connected to your terminal +- `2>&1` redirects `stderr` away from the terminal to the target for + `stdout`: **the terminal** (again\...) +- `1>/dev/null` redirects `stdout` away from your terminal to the file + `/dev/null` + +What remains? `stdout` goes to `/dev/null`, `stderr` still (or better: +\"again\") goes to the terminal. You have to swap the order to make it +do what you want: + +``` bash +{ echo OUTPUT; echo ERRORS >&2; } 1>/dev/null 2>&1 +``` + +## Examples + +How to make a program quiet (assuming all output goes to `STDOUT` and +`STDERR`? + + command >/dev/null 2>&1 + +## See also + +- Internal: [Illustrated Redirection + Tutorial](/howto/redirection_tutorial) +- Internal: [The noclobber + option](/commands/builtin/set#tag_noclobber) +- Internal: [The exec builtin command](/commands/builtin/exec) +- Internal: [Simple commands parsing and + execution](/syntax/grammar/parser_exec) +- Internal: [Process substitution + syntax](/syntax/expansion/proc_subst) +- Internal: [Obsolete and deprecated syntax](/scripting/obsolete) +- Internal: [Nonportable syntax and command + uses](/scripting/nonportable) diff --git a/docs/syntax/shellvars.md b/docs/syntax/shellvars.md new file mode 100644 index 0000000..d6f3133 --- /dev/null +++ b/docs/syntax/shellvars.md @@ -0,0 +1,1437 @@ +# Special parameters and shell variables + +## Special Parameters + + -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + parameter character expansion 
description + ----------- ------------------ ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + `*` asterisk The positional parameters starting from the first. When used inside doublequotes (see [quoting](/syntax/quoting)), like `"$*"`, it expands to all positional parameters *as one word*, delimited by the first character of the `IFS` variable (a space in this example): `"$1 $2 $3 $4"`.\ + If `IFS` is unset, the delimiter used will be always a space, if `IFS` is NULL, the delimiter will be nothing, which effectively concatenates all the positional parameters without any delimiter.\ + When used unquoted, it will just expand to the strings, one by one, not preserving the word boundaries (i.e. word splitting will split the text again, if it contains `IFS` characters.\ + See also the [scripting article about handling positional parameters](/scripting/posparams). + + `@` at-sign The positional parameters starting from the first. When used inside doublequotes (see [quoting](/syntax/quoting)), like `"$@"`, it expands all positional parameters *as separate words*: `"$1" "$2" "$3" "$4"`\ + Without doublequotes, the behaviour is like the one of `*` without doublequotes.\ + See also the [scripting article about handling positional parameters](/scripting/posparams). + + `#` hash mark Number of positional parameters (decimal)\ + See also the [scripting article about handling positional parameters](/scripting/posparams). 
+ + `?` question mark Status of the most recently executed foreground-pipeline (exit/return code) + + `-` dash Current option flags set by the shell itself, on invocation, or using the [set builtin command](/commands/builtin/set). It\'s just a set of characters, like `himB` for `h`, `i`, `m` and `B`. + + `$` dollar-sign The process ID (PID) of the shell. In an [explicit subshell](/syntax/ccmd/grouping_subshell) it expands to the PID of the current \"main shell\", not the subshell. This is different from `$BASHPID`! + + `!` exclamation mark The process ID (PID) of the most recently executed background pipeline (like started with `command &`) + + `0` zero The name of the shell or the shell script (filename). Set by the shell itself.\ + If Bash is started with a filename to execute (script), it\'s set to this filename. If started with the `-c ` option (commandline given as argument), then `$0` will be the first argument after the given ``. Otherwise, it is set to the string given on invocation for `argv[0]`.\ + Unlike popular belief, `$0` is *not a positional parameter*. + + `_` underscore A kind of catch-all parameter. Directly after shell invocation, it\'s set to the filename used to invoke Bash, or the absolute or relative path to the script, just like `$0` would show it. Subsequently, expands to the last argument to the previous command. Placed into the environment when executing commands, and set to the full pathname of these commands. When checking mail, this parameter holds the name of the mail file currently being checked. 
+ -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + +## Shell Variables + +### BASH + + Variable: `BASH` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +Expands to the full file name used to invoke the current instance of +Bash. + +### BASHOPTS + + Variable: `BASHOPTS` Since: 4.1-alpha + -------------- ----------------- ------------ ----------- + Type: normal variable Read-only: yes + Set by Bash: yes Default: n/a + +A colon-separated list of enabled shell options. + +Each word in the list is a valid argument for the `-s` option to the +[shopt builtin command](/commands/builtin/shopt). The options appearing +in `BASHOPTS` are those reported as on by `shopt`. If this variable is +in the environment when Bash starts up, each shell option in the list +will be enabled before reading any startup files. + +Example content: + + cmdhist:expand_aliases:extquote:force_fignore:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath + +This variable is read-only. + +### BASHPID + + Variable: `BASHPID` Since: 4.0-alpha + -------------- ------------------ ------------ ----------- + Type: integer variable Read-only: yes + Set by Bash: yes Default: n/a + +Always expands to the process ID of the current Bash process. This +differs from the special parameter `$` under certain circumstances, such +as subshells that do not require Bash to be re-initialized. 
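The difference between `$$` and `BASHPID` is easiest to see inside a subshell. The following is a minimal sketch, assuming Bash 4.0 or newer; the variable names `main_pid`, `sub_dollar` and `sub_bashpid` are made up for this illustration:

```bash
# PID of the main shell
main_pid=$$

# A command substitution runs in a subshell:
# $$ still reports the main shell there ...
sub_dollar=$(echo "$$")

# ... while BASHPID reports the subshell's own process ID
sub_bashpid=$(echo "$BASHPID")

echo "main: $main_pid, \$\$ in subshell: $sub_dollar, BASHPID in subshell: $sub_bashpid"
```

The first two numbers should be identical, the third one should differ.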
+ +### BASH_ALIASES + + Variable: `BASH_ALIASES` Since: unknown + -------------- ------------------- ------------ --------- + Type: associative array Read-only: no + Set by Bash: yes Default: n/a + +An associative array variable whose members correspond to the internal +list of aliases as maintained by the alias builtin. Elements added to +this array appear in the alias list; unsetting array elements cause +aliases to be removed from the alias list. + +The associative key is the name of the alias as used with the [alias +builtin command](/commands/builtin/alias). + +### BASH_ARGC + + Variable: `BASH_ARGC` Since: 3.0 + -------------- --------------------------------- ------------ ----- + Type: integer indexed array Read-only: no + Set by Bash: only in extended debugging mode Default: n/a + +An array variable whose values are the number of parameters in each +frame of the current Bash execution call stack. + +The number of parameters to the current subroutine (shell function or +script executed with [`.` or `source` builtin +command](/commands/builtin/source)) is at the top of the stack. When a +subroutine is executed, the number of parameters passed is pushed onto +`BASH_ARGC`. + +### BASH_ARGV + + Variable: `BASH_ARGV` Since: 3.0 + -------------- --------------------------------- ------------ ----- + Type: integer indexed array Read-only: no + Set by Bash: only in extended debugging mode Default: n/a + +An array variable containing all of the parameters in the current Bash +execution call stack. + +The final parameter of the last subroutine call is at the top of the +stack; the first parameter of the initial call is at the bottom. When a +subroutine is executed, the parameters supplied are pushed onto +`BASH_ARGV`. 
+ +### BASH_ARGV0 + + Variable: `BASH_ARGV0` Since: 5.0-alpha + -------------- -------------- ------------ -------------- + Type: string Read-only: no + Set by Bash: yes Default: same as `$0` + +Expands to the name of the shell or shell script - as the special +parameter `$0` does. Assigning to `BASH_ARGV0` causes the value to be +assigned to `$0`. + +If this parameter is unset, it loses its special properties, even if +subsequently reset. + +### BASH_CMDS + + Variable: `BASH_CMDS` Since: unknown + -------------- ------------------- ------------ --------- + Type: associative array Read-only: no + Set by Bash: yes Default: n/a + +An associative array variable whose members correspond to the internal +hash table of commands as maintained by the [hash builtin +command](/commands/builtin/hash). Elements added to this array appear in +the hash table; unsetting array elements causes commands to be removed +from the hash table. + +The associative key is the name of the command as used with the [hash +builtin command](/commands/builtin/hash). + +### BASH_COMMAND + + Variable: `BASH_COMMAND` Since: 3.0 + -------------- ----------------- ------------ ----- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +The command currently being executed or about to be executed, unless the +shell is executing a command as the result of a trap, in which case it +is the command executing at the time of the trap. + +### BASH_COMPAT + + Variable: `BASH_COMPAT` Since: 4.3-alpha + -------------- ----------------- ------------ ----------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The value is used to set the shell\'s compatibility level. The value may +be a decimal number (e.g., `4.2`) or an integer (e.g., `42`) +corresponding to the desired compatibility level. If `BASH_COMPAT` is +unset or set to the empty string, the compatibility level is set to the +default for the current version.
If `BASH_COMPAT` is set to a value that +is not one of the valid compatibility levels, the shell prints an error +message and sets the compatibility level to the default for the current +version. The valid compatibility levels correspond to the compatibility +options accepted by the shopt builtin. The current version is also a +valid value. + +### BASH_EXECUTION_STRING + + Variable: `BASH_EXECUTION_STRING` Since: 3.0 + -------------- ------------------------- ------------ ----- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +The command argument to the `-c` invocation option. + +### BASH_LINENO + + Variable: `BASH_LINENO` Since: 3.0 + -------------- ----------------------- ------------ ----- + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable whose members are the line numbers in source files +corresponding to each member of `FUNCNAME`. + +`${BASH_LINENO[$i]}` is the line number in the source file where +`${FUNCNAME[$i]}` was called. The corresponding source file name is +`${BASH_SOURCE[$i+1]}`. Use `LINENO` to obtain the current line number. + +### BASH_REMATCH + + Variable: `BASH_REMATCH` Since: 3.0 + -------------- ----------------------- ------------ ----- + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable whose members are assigned by the `=~` binary operator +to the `[[` conditional command. + +The element with index 0 is the portion of the string matching the +entire regular expression. The element with index `n` is the portion of +the string matching the nth parenthesized subexpression. + +Before Bash version 5.1-alpha this variable was readonly.
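To illustrate how the elements of `BASH_REMATCH` are filled, here is a small sketch. The input string, the regular expression and the variable names (`input`, `re`, `whole`, `major`, `minor`) are assumptions chosen only for this example:

```bash
input="version 5.2"

# Two parenthesized subexpressions: the digits before and after the dot.
# Keeping the regex in a variable avoids quoting surprises inside [[ ]].
re='([0-9]+)\.([0-9]+)'

if [[ $input =~ $re ]]; then
    whole=${BASH_REMATCH[0]}   # text matched by the entire expression
    major=${BASH_REMATCH[1]}   # first parenthesized subexpression
    minor=${BASH_REMATCH[2]}   # second parenthesized subexpression
    echo "whole=$whole major=$major minor=$minor"
fi
```

Since the array is overwritten by every `=~` match, it is usually a good idea to copy the elements you need right after the test.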
+ +### BASH_SOURCE + + Variable: `BASH_SOURCE` Since: 3.0 + -------------- ----------------------- ------------ ----- + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable whose members are the source filenames corresponding +to the elements in the `FUNCNAME` array variable. + +### BASH_SUBSHELL + + Variable: `BASH_SUBSHELL` Since: 3.0 + -------------- ----------------- ------------ ----- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +Incremented by one each time a subshell or subshell environment is +spawned. The initial value is 0. + +### BASH_VERSINFO + + Variable: `BASH_VERSINFO` Since: 2.0 + -------------- ----------------------- ------------ ----- + Type: integer indexed array Read-only: yes + Set by Bash: yes Default: n/a + +A readonly array variable whose members hold version information for +this instance of Bash. The values assigned to the array members are as +follows: + + -------------------- ---------------------------------------- + BASH_VERSINFO\[0\] The major version number (the release) + BASH_VERSINFO\[1\] The minor version number (the version) + BASH_VERSINFO\[2\] The patch level + BASH_VERSINFO\[3\] The build version + BASH_VERSINFO\[4\] The release status (e.g., beta1) + BASH_VERSINFO\[5\] The value of `MACHTYPE` + -------------------- ---------------------------------------- + +### BASH_VERSION + + Variable: `BASH_VERSION` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +Expands to a string describing the version of this instance of Bash. + +Since Bash 2.0 it includes the shell\'s \"release status\" (alpha\[N\], +beta\[N\], release). 
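Because `BASH_VERSINFO` holds the version components as separate array members, it is usually easier to compare than the `BASH_VERSION` string. A minimal sketch of a version check follows; the threshold 4.3 and the variable name `new_enough` are arbitrary choices for this illustration:

```bash
# Compare major and minor version numerically.
# 4.3 is only an example threshold.
if (( BASH_VERSINFO[0] > 4 || ( BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 3 ) )); then
    new_enough=yes
else
    new_enough=no
fi

echo "Bash ${BASH_VERSION}: at least 4.3? ${new_enough}"
```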
+ +### CHILD_MAX + + Variable: `CHILD_MAX` Since: 4.3-alpha + -------------- ----------------- ------------ ----------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +Set the number of exited child status values for the shell to remember. +Bash will not allow this value to be decreased below a POSIX-mandated +minimum, and there is a maximum value (currently 8192) that this may not +exceed. The minimum value is system-dependent. + +### COMP_CWORD + + Variable: `COMP_CWORD` Since: unknown + -------------- --------------------------------------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: only for programmable completion facilities Default: n/a + +An index into `COMP_WORDS` of the word containing the current cursor +position. + +### COMP_KEY + + Variable: `COMP_KEY` Since: unknown + -------------- --------------------------------------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: only for programmable completion facilities Default: n/a + +The key (or final key of a key sequence) used to invoke the current +completion function. + +### COMP_LINE + + Variable: `COMP_LINE` Since: unknown + -------------- --------------------------------------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: only for programmable completion facilities Default: n/a + +The current command line. + +### COMP_POINT + + Variable: `COMP_POINT` Since: unknown + -------------- --------------------------------------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: only for programmable completion facilities Default: n/a + +The index of the current cursor position relative to the beginning of +the current command. If the current cursor position is at the end of the +current command, the value of this variable is equal to `${#COMP_LINE}`. 
+ +### COMP_TYPE + + Variable: `COMP_TYPE` Since: unknown + -------------- --------------------------------------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: only for programmable completion facilities Default: n/a + +Set to an integer value corresponding to the type of completion +attempted that caused a completion function to be called: + + ------- --------------------------------------------------- + `TAB` normal completion + `?` listing completions after successive tabs + `!` listing alternatives on partial word completion + `@` to list completions if the word is not unmodified + `%` for menu completion + ------- --------------------------------------------------- + +The integer value is the character code (ASCII) of the listed character, +for example 9 for `TAB` and 63 for `?`. + +### COMP_WORDBREAKS + + Variable: `COMP_WORDBREAKS` Since: unknown + -------------- ------------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +Reports the set of characters that the readline library treats as word +separators when performing word completion. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. + +### COMP_WORDS + + Variable: `COMP_WORDS` Since: unknown + -------------- --------------------------------------------- ------------ --------- + Type: integer indexed array Read-only: no + Set by Bash: only for programmable completion facilities Default: n/a + +An array variable consisting of the individual words in the current +command line. The line is split into words as readline would split it, +using `COMP_WORDBREAKS` as described above. + +### COPROC + + Variable: `COPROC` Since: unknown + -------------- ----------------------- ------------ --------- + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable created to hold the file descriptors for output from +and input to an unnamed coprocess.
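The two members of `COPROC` can be used like ordinary file descriptors. The sketch below starts `cat` as an unnamed coprocess only because it echoes its input back unchanged and without buffering; the variable name `reply` is an assumption made for this example:

```bash
# Start an unnamed coprocess; Bash fills the COPROC array
coproc cat

# COPROC[1] is connected to the coprocess' standard input
echo "hello" >&"${COPROC[1]}"

# COPROC[0] is connected to the coprocess' standard output
read -r reply <&"${COPROC[0]}"
echo "coprocess answered: $reply"

# Stop the coprocess; its PID is available in COPROC_PID
kill "$COPROC_PID"
```

Commands that buffer their output when not writing to a terminal may not answer until their input is closed, so a `read` like the one above can block with such commands.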
+ +### DIRSTACK + + Variable: `DIRSTACK` Since: unknown + -------------- ----------------------- ------------ --------- + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable containing the current contents of the directory +stack. + +Directories appear in the stack in the order they are displayed by the +dirs builtin. Assigning to members of this array variable may be used to +modify directories already in the stack, but the pushd and popd builtins +must be used to add and remove directories. + +Assignment to this variable will not change the current directory. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. + +### EPOCHREALTIME + + Variable: `EPOCHREALTIME` Since: 5.0-alpha + -------------- ------------------ ------------ ----------- + Type: integer variable Read-only: no + Set by Bash: yes Default: n/a + +Expands to the number of seconds since the Unix epoch as a floating point +value with microsecond granularity. + +Assignments to this parameter are ignored. If this parameter is unset, +it loses its special properties, even if it is subsequently reset. + +### EPOCHSECONDS + + Variable: `EPOCHSECONDS` Since: 5.0-alpha + -------------- ------------------ ------------ ----------- + Type: integer variable Read-only: no + Set by Bash: yes Default: n/a + +Expands to the number of seconds since the Unix epoch. + +Assignments to this parameter are ignored. If this parameter is unset, +it loses its special properties, even if it is subsequently reset. + +### EUID + + Variable: `EUID` Since: unknown + -------------- ------------------ ------------ --------- + Type: integer variable Read-only: yes + Set by Bash: yes Default: n/a + +Expands to the effective user ID of the current user, initialized at +shell startup. + +:!: Do not rely on this variable when security is a concern.
+ +### FUNCNAME + + Variable: `FUNCNAME` Since: 2.04 + -------------- ----------------------------- ------------ ------ + Type: integer indexed array Read-only: no + Set by Bash: only inside shell functions Default: n/a + +An array variable containing the names of all shell functions currently +in the execution call stack. + +The element with index 0 is the name of any currently-executing shell +function. The bottom-most element (the one with the highest index) is +\"main\". + +This variable can be used with `BASH_LINENO` and `BASH_SOURCE`: Each +element of `FUNCNAME` has corresponding elements in `BASH_LINENO` and +`BASH_SOURCE` to describe the call stack. For instance, +`${FUNCNAME[$i]}` was called from the file `${BASH_SOURCE[$i+1]}` at +line number `${BASH_LINENO[$i]}`. The [caller builtin +command](/commands/builtin/caller) displays the current call stack using +this information. + +This variable exists only when a shell function is executing. + +Assignments to this parameter have no effect and return an error status. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. + +### GROUPS + + Variable: `GROUPS` Since: 2.01 + -------------- ----------------------- ------------ ------ + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable containing the list of groups of which the current +user is a member. + +Assignments to this parameter have no effect and return an error status. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. + +### HISTCMD + + Variable: `HISTCMD` Since: 1.14.0 + -------------- ------------------ ------------ -------- + Type: integer variable Read-only: no + Set by Bash: yes Default: n/a + +Expands to the history number (index in the history list) of the current +command. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. 
+ +### HOSTNAME + + Variable: `HOSTNAME` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +Automatically set to the name of the current host. + +### HOSTTYPE + + Variable: `HOSTTYPE` Since: unknown + -------------- ----------------- ------------ ------------------ + Type: normal variable Read-only: no + Set by Bash: yes Default: system-dependent + +Automatically set to a string that uniquely describes the type of +machine on which Bash is executing. + +Example content: + + x86_64 + +### LINENO + + Variable: `LINENO` Since: unknown + -------------- ------------------ ------------ --------- + Type: integer variable Read-only: no + Set by Bash: yes Default: n/a + +Each time this parameter is referenced, the shell substitutes a decimal +number representing the current sequential line number (starting with 1) +within a script or function. + +When not in a script or function, the value substituted is not +guaranteed to be meaningful. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. + +### MACHTYPE + + Variable: `MACHTYPE` Since: unknown + -------------- ----------------- ------------ ------------------ + Type: normal variable Read-only: no + Set by Bash: yes Default: system-dependent + +Automatically set to a string that fully describes the system type on +which Bash is executing, in the standard GNU \"cpu-company-system\" +format. + +Example content: + + x86_64-unknown-linux-gnu + +### MAPFILE + + Variable: `MAPFILE` Since: unknown + -------------- ----------------------- ------------ --------- + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable created to hold the text read by the [mapfile builtin +command](/commands/builtin/mapfile) when no variable name is supplied. 
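A short sketch of the default `MAPFILE` array in action, assuming Bash 4.0 or newer. The input lines and the variable names `count` and `second` are made up for the example:

```bash
# No array name given, so the lines end up in MAPFILE.
# -t strips the trailing newline from each line.
mapfile -t < <(printf '%s\n' alpha beta gamma)

count=${#MAPFILE[@]}    # number of lines read
second=${MAPFILE[1]}    # indexes start at 0

echo "read $count lines, the second one is: $second"
```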
+ +### OLDPWD + + Variable: `OLDPWD` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +The previous working directory as set by the cd command. + +### OPTARG + + Variable: `OPTARG` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +The value of the last option argument processed by the [getopts builtin +command](/commands/builtin/getopts). + +### OPTIND + + Variable: `OPTIND` Since: unknown + -------------- ------------------ ------------ --------- + Type: integer variable Read-only: no + Set by Bash: yes Default: n/a + +The index of the next argument to be processed by the [getopts builtin +command](/commands/builtin/getopts). + +### OSTYPE + + Variable: `OSTYPE` Since: unknown + -------------- ----------------- ------------ ------------------ + Type: normal variable Read-only: no + Set by Bash: yes Default: system-dependent + +Automatically set to a string that describes the operating system on +which Bash is executing. + +Example content: + + linux-gnu + +### PIPESTATUS + + Variable: `PIPESTATUS` Since: 2.0 + -------------- ----------------------- ------------ ----- + Type: integer indexed array Read-only: no + Set by Bash: yes Default: n/a + +An array variable containing a list of exit status values from the +processes in the most-recently-executed foreground pipeline (which may +contain only a single command). + +### PPID + + Variable: `PPID` Since: unknown + -------------- ------------------ ------------ --------- + Type: integer variable Read-only: yes + Set by Bash: yes Default: n/a + +The process ID of the shell\'s parent process. 
+ +### PWD + + Variable: `PWD` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +The current working directory as set by the [cd builtin +command](/commands/builtin/cd). + +### RANDOM + + Variable: `RANDOM` Since: unknown + -------------- ------------------ ------------ --------- + Type: integer variable Read-only: no + Set by Bash: yes Default: n/a + +Each time this parameter is referenced, a random integer between 0 and +32767 is generated. The sequence of random numbers may be initialized by +assigning a value to `RANDOM`. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. + +### READLINE_LINE + + Variable: `READLINE_LINE` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +The contents of the readline line buffer, for use with `bind -x`. + +### READLINE_POINT + + Variable: `READLINE_POINT` Since: unknown + -------------- ------------------ ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +The position of the insertion point in the readline line buffer, for use +with `bind -x`. + +### REPLY + + Variable: `REPLY` Since: unknown + -------------- ------------------------------------------------------------ ------------ --------- + Type: normal variable Read-only: no + Set by Bash: only by the [read builtin command](/commands/builtin/read) Default: n/a + +Set to the line of input read by the [read builtin +command](/commands/builtin/read) when no arguments are supplied that +name target variables. + +### SECONDS + + Variable: `SECONDS` Since: unknown + -------------- ------------------ ------------ --------- + Type: integer variable Read-only: no + Set by Bash: yes Default: n/a + +Each time this parameter is referenced, the number of seconds since +shell invocation is returned. 
If a value is assigned to SECONDS, the +value returned upon subsequent references is the number of seconds since +the assignment plus the value assigned. + +If this parameter is unset, it loses its special properties, even if it +is subsequently reset. + +### SHELLOPTS + + Variable: `SHELLOPTS` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: yes + Set by Bash: yes Default: n/a + +A colon-separated list of enabled shell options. Each word in the list +is a valid argument for the `-o` option to the [set builtin +command](/commands/builtin/set). The options appearing in `SHELLOPTS` +are those reported as on by `set -o`. + +If this variable is in the environment when Bash starts up, each shell +option in the list will be enabled before reading any startup files. + +### SHLVL + + Variable: `SHLVL` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +Incremented by one each time an instance of Bash is started. + +### UID + + Variable: `UID` Since: unknown + -------------- ------------------ ------------ --------- + Type: integer variable Read-only: yes + Set by Bash: yes Default: n/a + +Expands to the user ID of the current user, initialized at shell +startup. + +:!: Do not rely on this variable when security is a concern. + +### BASH_ENV + + Variable: `BASH_ENV` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If this parameter is set when Bash is executing a shell script, its +value is interpreted as a filename containing commands to initialize the +shell, as in `~/.bashrc`. The value of `BASH_ENV` is subjected to + +- [parameter expansion](/syntax/pe) +- [command substitution](/syntax/expansion/cmdsubst) +- [arithmetic expansion](/syntax/expansion/arith) + +before being interpreted as a file name. 
+ +`PATH` is not used to search for the resultant file name. + +### BASH_XTRACEFD + + Variable: `BASH_XTRACEFD` Since: 4.1-alpha + -------------- ----------------- ------------ ----------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If set to an integer corresponding to a valid file descriptor, Bash will +write the trace output generated when `set -x` is enabled to that file +descriptor. + +The file descriptor is closed when `BASH_XTRACEFD` is unset or assigned +a new value. + +Unsetting `BASH_XTRACEFD` or assigning it the empty string causes the +trace output to be sent to the standard error. Note that setting +`BASH_XTRACEFD` to 2 (the standard error file descriptor) and then +unsetting it will result in the standard error being closed. + +### CDPATH + + Variable: `CDPATH` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The search path for the [cd builtin command](/commands/builtin/cd). + +This is a colon-separated list of directories in which the shell looks +for destination directories specified by the `cd` command. + +Example content: + + .:~:/usr + +### COLUMNS + + Variable: `COLUMNS` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: on `SIGWINCH` Default: n/a + +Used by the select compound command to determine the terminal width when +printing selection lists. Automatically set upon receipt of a +`SIGWINCH`. + +### COMPREPLY + + Variable: `COMPREPLY` Since: unknown + -------------- ----------------------- ------------ --------- + Type: integer indexed array Read-only: no + Set by Bash: no Default: n/a + +An array variable from which Bash reads the possible completions +generated by a shell function invoked by the programmable completion +facility. 
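To show how `COMPREPLY` works together with `COMP_WORDS` and `COMP_CWORD`, here is a minimal sketch of a completion function. The command name `mytool`, its subcommands and the function name `_mytool_complete` are purely hypothetical:

```bash
# Hypothetical completion function for a hypothetical command "mytool"
_mytool_complete() {
    # The word that is currently being completed
    local cur=${COMP_WORDS[COMP_CWORD]}

    # Offer those subcommands that start with the current word;
    # Bash reads the candidates from COMPREPLY when the function returns
    COMPREPLY=( $(compgen -W "start stop status" -- "$cur") )
}

# Register the function for the command name
complete -F _mytool_complete mytool
```

With this in place, typing `mytool sta` followed by the completion key should offer `start` and `status`.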
+ +### EMACS + + Variable: `EMACS` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If Bash finds this variable in the environment when the shell starts +with value \"t\", it assumes that the shell is running in an Emacs shell +buffer and disables line editing. + +### ENV + + Variable: `ENV` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +Similar to `BASH_ENV`: Used when the shell is invoked in POSIX(r) mode. + +### FCEDIT + + Variable: `FCEDIT` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The default editor for the [fc builtin command](/commands/builtin/fc). + +### FIGNORE + + Variable: `FIGNORE` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +A colon-separated list of suffixes to ignore when performing filename +completion. A filename whose suffix matches one of the entries in +`FIGNORE` is excluded from the list of matched filenames. + +Example content: + + .o:~ + +### FUNCNEST + + Variable: `FUNCNEST` Since: 4.2-alpha + -------------- ----------------- ------------ ----------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If set to a numeric value greater than 0, defines a maximum function +nesting level. Function invocations that exceed this nesting level will +cause the current command to abort. 
+ +Negative values, 0 or non-numeric assignments have the effect as if +`FUNCNEST` was unset or empty: No nest control + +### GLOBIGNORE + + Variable: `GLOBIGNORE` Since: 2.0 + -------------- ----------------- ------------ ----- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +A colon-separated list of patterns defining the set of filenames to be +ignored by pathname expansion. If a filename matched by a pathname +expansion pattern also matches one of the patterns in `GLOBIGNORE`, it +is removed from the list of matches. + +### HISTCONTROL + + Variable: `HISTCONTROL` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +A colon-separated list of values controlling how commands are saved on +the history list: + + --------------- ------------------------------------------------------------------------------------------------------------ + `ignorespace` lines which begin with a space character are not saved in the history list + `ignoredups` don\'t save lines matching the previous history entry + `ignoreboth` short for `ignorespace:ignoredups` + `erasedups` remove all previous lines matching the current line from the history list before the current line is saved + --------------- ------------------------------------------------------------------------------------------------------------ + +Any value not in the above list is ignored. + +If `HISTCONTROL` is unset, or does not include a valid value, all lines +read by the shell parser are saved on the history list, subject to the +value of `HISTIGNORE`. The second and subsequent lines of a multi-line +compound command are not tested, and are added to the history regardless +of the value of `HISTCONTROL`. 
+ +### HISTFILE + + Variable: `HISTFILE` Since: unknown + -------------- ----------------- ------------ --------------------------- + Type: normal variable Read-only: no + Set by Bash: if unset Default: \'\' \~/.bash_history\'\' + +The name of the file in which command history is saved. + +If unset, the command history is not saved when an interactive shell +exits. + +### HISTFILESIZE + + Variable: `HISTFILESIZE` Since: unknown + -------------- ----------------- ------------ ------------ + Type: normal variable Read-only: no + Set by Bash: if unset Default: `HISTSIZE` + +The maximum number of lines contained in the history file. + +When this variable is assigned a value, the history file is truncated, +if necessary, by removing the oldest entries, to contain no more than +the given number of lines. If the given number of lines is 0 (zero), the +file is truncated to zero size. Non-numeric values and numeric values +less than zero inhibit truncation. + +The history file is also truncated to this size after writing it when an +interactive shell exits. + +### HISTIGNORE + + Variable: `HISTIGNORE` Since: 2.0 + -------------- ----------------- ------------ ----- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +A colon-separated list of patterns used to decide which command lines +should be saved on the history list. Each pattern is anchored at the +beginning of the line and must match the complete line (no implicit +\'\*\' is appended). + +Each pattern is tested against the line after the checks specified by +`HISTCONTROL` are applied. + +In addition to the normal shell pattern matching characters, \"&\" +matches the previous history line. \"&\" may be escaped using a +backslash; the backslash is removed before attempting a match. + +The second and subsequent lines of a multi-line compound command are not +tested, and are added to the history regardless of the value of +`HISTIGNORE`. 
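As an illustration of how `HISTCONTROL` and `HISTIGNORE` can be combined, a startup file might contain something like the following sketch. The chosen values are only an example, not a recommendation:

```bash
# Do not save lines starting with a space or repeating the previous entry
HISTCONTROL=ignoreboth

# Additionally skip a bare "ls", "ls" with arguments and a bare "history".
# Each pattern has to match the complete command line, which is why
# "ls" and "ls *" are listed separately.
HISTIGNORE='ls:ls *:history'
```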
+ +### HISTSIZE + + Variable: `HISTSIZE` Since: unknown + -------------- ----------------- ------------ ----------------------------------- + Type: normal variable Read-only: no + Set by Bash: if unset Default: set at compile time (default 500) + +The number of commands to remember in the command history. + +If the number is set to 0 (zero), then the history list is disabled. If +the number is set to any negative number, then the history list is +unlimited. + +### HISTTIMEFORMAT + + Variable: `HISTTIMEFORMAT` Since: unknown + -------------- ------------------ ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If this variable is set and not null, its value is used as a format +string for `strftime(3)` to print the time stamp associated with each +history entry displayed by the history builtin. + +If this variable is set, time stamps are written to the history file so +they may be preserved across shell sessions. This uses the history +comment character to distinguish timestamps from other history lines. + +### HOME + + Variable: `HOME` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The home directory of the current user. + +The default argument for the [cd builtin command](/commands/builtin/cd). + +The value of this variable is also used when performing [tilde +expansion](/syntax/expansion/tilde). + +### HOSTFILE + + Variable: `HOSTFILE` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +Contains the name of a file in the same format as `/etc/hosts` that +should be read when the shell needs to complete a hostname. + +The list of possible hostname completions may be changed while the shell +is running. 
The next time hostname completion is attempted after the +value is changed, Bash adds the contents of the new file to the existing +list. + +If `HOSTFILE` is set, but has no value, or does not name a readable +file, Bash attempts to read `/etc/hosts` to obtain the list of possible +hostname completions. + +When `HOSTFILE` is unset, the hostname list is cleared. + +### IFS + + Variable: `IFS` Since: unknown + -------------- ----------------- ------------ ------------------------- + Type: normal variable Read-only: no + Set by Bash: no Default: `<space><tab><newline>` + +The Internal Field Separator that is used for word splitting after +expansion and to split lines into words with the read builtin command. + +### IGNOREEOF + + Variable: `IGNOREEOF` Since: unknown + -------------- ----------------- ------------ ------------------- + Type: normal variable Read-only: no + Set by Bash: no Default: 10 (when invalid) + +Controls the action of an interactive shell on receipt of an `EOF` +character (e.g. by Ctrl-D) as the sole input. + +If set, the value is the number of consecutive EOF characters which must +be typed as the first characters on an input line before Bash exits. + +If the variable exists but does not have a numeric value, or has no +value, the default value is 10. + +If it does not exist, `EOF` signifies the end of input to the shell. + +### INPUTRC + + Variable: `INPUTRC` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The filename for the readline startup file, overriding the default of +`~/.inputrc`. + +### LANG + + Variable: `LANG` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +Used to determine the locale category for any category not specifically +selected with a variable starting with `LC_`.
+ +### LC_ALL + + Variable: `LC_ALL` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +This variable overrides the value of `LANG` and any other `LC_` variable +specifying a locale category. + +### LC_COLLATE + + Variable: `LC_COLLATE` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +This variable determines the collation order used when sorting the +results of pathname expansion, and determines the behavior of range +expressions, equivalence classes, and collating sequences within +pathname expansion and pattern matching. + +### LC_CTYPE + + Variable: `LC_CTYPE` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +This variable determines the interpretation of characters and the +behavior of character classes within pathname expansion and pattern +matching. + +### LC_MESSAGES + + Variable: `LC_MESSAGES` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +This variable determines the locale used to translate double-quoted +strings preceded by a `$`. + +### LC_NUMERIC + + Variable: `LC_NUMERIC` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +This variable determines the locale category used for number formatting. + +### LINES + + Variable: `LINES` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: on `SIGWINCH` Default: n/a + +Used by the select compound command to determine the column length for +printing selection lists. Automatically set upon receipt of a +`SIGWINCH`.
+ +### MAIL + + Variable: `MAIL` Since: unknown + -------------- ----------------- ------------ ------------------ + Type: normal variable Read-only: no + Set by Bash: no Default: system-dependent + +If this parameter is set to a file or directory name and the `MAILPATH` +variable is not set, Bash informs the user of the arrival of mail in the +specified file or Maildir-format directory. + +### MAILCHECK + + Variable: `MAILCHECK` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: 60 + +Specifies how often (in seconds) Bash checks for mail. + +When it is time to check for mail, the shell does so before displaying +the primary prompt. + +If this variable is unset, or set to a value that is not a number +greater than or equal to zero, the shell disables mail checking. + +### MAILPATH + + Variable: `MAILPATH` Since: unknown + -------------- ----------------- ------------ ------------------ + Type: normal variable Read-only: no + Set by Bash: no Default: system-dependent + +A colon-separated list of file names to be checked for mail. + +The message to be printed when mail arrives in a particular file may be +specified by separating the file name from the message with a \'?\' +(question mark). + +When used in the text of the message, `$_` expands to the name of the +current mailfile. + +Example content: + + /var/mail/bfox?"You have mail":~/shell-mail?"$_ has mail!" + +### OPTERR + + Variable: `OPTERR` Since: unknown + -------------- ----------------- ------------ -------------------- + Type: normal variable Read-only: no + Set by Bash: yes Default: 1 (set on startup) + +If set to the value 1, Bash displays error messages generated by the +[getopts builtin command](/commands/builtin/getopts). + +`OPTERR` is initialized to 1 each time the shell is invoked or a shell +script is executed.
+ +### PATH + + Variable: `PATH` Since: unknown + -------------- ----------------- ------------ ---------------------------------------- + Type: normal variable Read-only: no + Set by Bash: no Default: system-dependent (set on compile time) + +The search path for commands. This is a colon-separated list of +directories in which the shell looks for commands. + +A zero-length (null) directory name in the value of `PATH` indicates the +current directory. + +A null directory name may appear as two adjacent colons, or as an +initial or trailing colon. + +There can be a static path compiled in for use in a restricted shell. + +### POSIXLY_CORRECT + + Variable: `POSIXLY_CORRECT` Since: unknown + -------------- ------------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If this variable is in the environment when Bash starts, the shell +enters posix mode before reading the startup files, as if the `--posix` +invocation option had been supplied. + +If it is set while the shell is running, Bash enables posix mode, as if +the command `set -o posix` had been executed. + +### PROMPT_COMMAND + + Variable: `PROMPT_COMMAND` Since: unknown + -------------- ------------------ ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If set, the value is executed as a command prior to issuing each primary +prompt. + +### PROMPT_COMMANDS + + Variable: `PROMPT_COMMANDS` Since: 5.1-alpha + -------------- ----------------------- ------------ ----------- + Type: integer indexed array Read-only: no + Set by Bash: no Default: n/a + +If set, each element is executed as a command prior to issuing each +primary prompt (like `PROMPT_COMMAND`, just as array). 
+ +### PROMPT_DIRTRIM + + Variable: `PROMPT_DIRTRIM` Since: unknown + -------------- ------------------ ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If set to a number greater than zero, the value is used as the number of +trailing directory components to retain when expanding the `\w` and `\W` +prompt string escapes. + +Characters removed are replaced with an ellipsis. + +### PS0 + + Variable: `PS0` Since: 4.4.0 + -------------- ----------------- ------------ -------------- + Type: normal variable Read-only: no + Set by Bash: if unset Default: \"\'\'\'\'\" + +Expanded and displayed by interactive shells after reading a complete +command but before executing it. + +### PS1 + + Variable: `PS1` Since: unknown + -------------- ----------------- ------------ -------------------------- + Type: normal variable Read-only: no + Set by Bash: if unset Default: \"\'\'\\s-\\v\\\$ \'\'\" + +The value of this parameter is expanded and used as the primary prompt +string. See [Controlling the +Prompt](https://www.gnu.org/software/bash/manual/bash.html#Controlling-the-Prompt). + +### PS2 + + Variable: `PS2` Since: unknown + -------------- ----------------- ------------ ----------------- + Type: normal variable Read-only: no + Set by Bash: if unset Default: \"\'\'\> \'\'\" + +The value of this parameter is expanded as with PS1 and used as the +secondary prompt string. + +### PS3 + + Variable: `PS3` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The value of this parameter is used as the prompt for the select +command. 
+ +### PS4 + + Variable: `PS4` Since: unknown + -------------- ----------------- ------------ ---------------- + Type: normal variable Read-only: no + Set by Bash: if unset Default: \"\'\'+ \'\'\" + +The value of this parameter is expanded as with `PS1` and the value is +printed before each command Bash displays during an execution trace. The +first character of `PS4` is replicated multiple times, as necessary, to +indicate multiple levels of indirection. + +### SHELL + + Variable: `SHELL` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The full pathname to the shell is kept in this environment variable. If +it is not set when the shell starts, Bash assigns the full pathname of +the current user\'s login shell. + +### SRANDOM + + Variable: `SRANDOM` Since: 5.1-alpha + -------------- ----------------- ------------ ----------- + Type: normal variable Read-only: no + Set by Bash: yes Default: n/a + +A variable that delivers a 32bit random number. The random number +generation uses platform specific generators in the background and a +builtin fallback generator. + +### TIMEFORMAT + + Variable: `TIMEFORMAT` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The value of this parameter is used as a format string specifying how +the timing information for pipelines prefixed with the time reserved +word should be displayed. + +The % character introduces an escape sequence that is expanded to a time +value or other information. 
The escape sequences and their meanings are +as follows; the brackets denote optional portions: + + ------------ ---------------------------------------------- + `%%` a literal `%` (percent sign) + `%[p][l]R` elapsed time in seconds + `%[p][l]U` number of CPU seconds spent in user mode + `%[p][l]S` number of CPU seconds spent in system mode + `%P` CPU percentage, computed as `(%U + %S) / %R` + ------------ ---------------------------------------------- + +The optional modifiers (p and l) are: + + ----- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + `p` A digit specifying the precision. A value of 0 causes no decimal point or fraction to be output. At most three digits after the decimal point are shown. If not specified, the value 3 is used. + `l` A longer format, including minutes, of the form MMmSS.FFs. The value of p determines whether or not the fraction is included. + ----- ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- + +If this variable is not set, Bash acts as if it had the value + + $'\nreal\t%3lR\nuser\t%3lU\nsys\t%3lS' + +If the value is null, no timing information is displayed. + +A trailing newline is added when the format string is displayed. + +### TMOUT + + Variable: `TMOUT` Since: 2.05b + -------------- ----------------- ------------ ------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If set to a value greater than zero, `TMOUT` is treated as the default +timeout for the [read builtin command](/commands/builtin/read). + +The [select command](/commands/builtin/select) terminates if input does +not arrive after `TMOUT` seconds when input is coming from a terminal.
+ +In an interactive shell, the value is interpreted as the number of +seconds to wait for input after issuing the primary prompt. Bash +terminates after waiting for that number of seconds if input does not +arrive. + +### TMPDIR + + Variable: `TMPDIR` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +If set, Bash uses its value as the name of a directory in which Bash +creates temporary files for the shell\'s use. + +### auto_resume + + Variable: `auto_resume` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +This variable controls how the shell interacts with the user and job +control. If this variable is set, single word simple commands without +redirections are treated as candidates for resumption of an existing +stopped job. There is no ambiguity allowed; if there is more than one +job beginning with the string typed, the job most recently accessed is +selected. The name of a stopped job, in this context, is the command +line used to start it. If set to the value exact, the string supplied +must match the name of a stopped job exactly; if set to substring, the +string supplied needs to match a substring of the name of a stopped job. +The substring value provides functionality analogous to the %? job +identifier. + +If set to any other value, the supplied string must be a prefix of a +stopped job\'s name; this provides functionality analogous to the +`%string` job identifier. + +### histchars + + Variable: `histchars` Since: unknown + -------------- ----------------- ------------ --------- + Type: normal variable Read-only: no + Set by Bash: no Default: n/a + +The two or three characters which control history expansion and +tokenization. 
+ +The first character is the history expansion character, the character +which signals the start of a history expansion, normally \'!\' +(exclamation mark). + +The second character is the quick substitution character, which is used +as shorthand for re-running the previous command entered, substituting +one string for another in the command. The default is \'\^\' (caret). + +The optional third character is the character which indicates that the +remainder of the line is a comment when found as the first character of +a word, normally \'#\' (hash mark). The history comment character causes +history substitution to be skipped for the remaining words on the line. +It does not necessarily cause the shell parser to treat the rest of the +line as a comment. diff --git a/docs/syntax/words.md b/docs/syntax/words.md new file mode 100644 index 0000000..d45d16d --- /dev/null +++ b/docs/syntax/words.md @@ -0,0 +1,171 @@ +# Words\... + +![](keywords>bash shell scripting token words split splitting recognition) + +FIXME This article needs a review, it covers two topics (command line +splitting and word splitting) and mixes both a bit too much. But in +general, it\'s still usable to help understand this behaviour, it\'s +\"wrong but not wrong\". + +One fundamental principle of Bash is to recognize words entered at the +command prompt, or under other circumstances like variable-expansion. + +## Splitting the commandline + +Bash scans the command line and splits it into words, usually to put the +parameters you enter for a command into the right C-memory (the `argv` +vector) to later correctly call the command. These words are recognized +by splitting the command line at the special character position, +**Space** or **Tab** (the manual defines them as **blanks**). For +example, take the echo program. It displays all its parameters separated +by a space.
When you enter an echo command at the Bash prompt, Bash will +look for those special characters, and use them to separate the +parameters. + +You don\'t know what I\'m talking about? I\'m talking about this: + + $ echo Hello little world + Hello little world + +In other words, something you do (and Bash does) every day. The +characters where Bash splits the command line (SPACE, TAB i.e. blanks) +are recognized as delimiters. There is no null argument generated when +you have 2 or more blanks in the command line. **A sequence of several +blank characters is treated as a single blank.** Here\'s an example: + + $ echo Hello little world + Hello little world + +Bash splits the command line at the blanks into words, then it calls +echo with **each word as an argument**. In this example, echo is called +with three arguments: \"`Hello`\", \"`little`\" and \"`world`\"! + +[Does that mean we can\'t echo more than one Space?]{.underline} Of +course not! Bash treats blanks as special characters, but there are two +ways to tell Bash not to treat them specially: **Escaping** and +**quoting**. + +Escaping a character means to **take away its special meaning**. Bash +will use an escaped character as text, even if it\'s a special one. +Escaping is done by preceding the character with a backslash: + + $ echo Hello\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ little\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ world + Hello little world + +None of the escaped spaces will be used to perform word splitting. Thus, +echo is called with one argument: \"`Hello little world`\". + +Bash has a mechanism to \"escape\" an entire string: **Quoting**.
In the +context of command-splitting, which this section is about, it doesn\'t +matter which kind of quoting you use: weak quoting or strong quoting, +both cause Bash to not treat spaces as special characters: + + $ echo "Hello little world" + Hello little world + + $ echo 'Hello little world' + Hello little world + +[What is it all about now?]{.underline} Well, for example imagine a +program that expects a filename as an argument, like cat. Filenames can +have spaces in them: + + $ ls -l + total 4 + -rw-r--r-- 1 bonsai bonsai 5 Apr 18 18:16 test file + + $ cat test file + cat: test: No such file or directory + cat: file: No such file or directory + + $ cat test\ file + m00! + + $ cat "test file" + m00! + +If you enter that on the command line with Tab completion, that will +take care of the spaces. But Bash also does another type of splitting. + +## Word splitting + +For a more technical description, please read the [article about word +splitting](/syntax/expansion/wordsplit)! + +The first kind of splitting is done to parse the command line into +separate tokens. This is what was described above, it\'s a pure +**command line parsing**. + +After the command line has been split into words, Bash will perform +expansion, if needed - variables that occur in the command line need to +be expanded (substituted by their value), for example. This is where the +second type of word splitting comes in - several expansions undergo +**word splitting** (but others do not). + +Imagine you have a filename stored in a variable: + + MYFILE="test file" + +When this variable is used, its occurrence will be replaced by its +content. + + $ cat $MYFILE + cat: test: No such file or directory + cat: file: No such file or directory + +Though this is another step where spaces make things difficult, +**quoting** is used to work around the difficulty. Quotes also affect +word splitting: + + $ cat "$MYFILE" + m00!
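The effect of quoting on word splitting can also be made visible by counting arguments instead of reading error messages. The following is a minimal sketch; `count_args` is a hypothetical helper introduced only for this illustration, and the default value of `IFS` is assumed:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

MYFILE="test file"

count_args $MYFILE      # unquoted: the expansion is split at the space
count_args "$MYFILE"    # quoted: the expansion stays a single word
```

With the default `IFS`, the first call prints 2 and the second prints 1, which mirrors the two `cat` invocations above.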
+ +## Example + +Let\'s follow an unquoted command through these steps, assuming that the +variable is set: + + MYFILE="THE FILE.TXT" + +and the command line is: + + echo The file is named $MYFILE + +The parser will scan for blanks and mark the relevant words (\"splitting +the command line\"): + + Initial command line splitting: + --------------------------------- -------- -------- -------- --------- ----------- + Word 1 Word 2 Word 3 Word 4 Word 5 Word 6 + `echo` `The` `file` `is` `named` `$MYFILE` + +Since a [parameter/variable expansion](/syntax/pe) is part of that command +line, Bash performs the substitution, and then the [word +splitting](/syntax/expansion/wordsplit) on the results: + + Word splitting after substitution: + ------------------------------------ -------- -------- -------- --------- -------- ------------ + Word 1 Word 2 Word 3 Word 4 Word 5 Word 6 Word 7 + `echo` `The` `file` `is` `named` `THE` `FILE.TXT` + +Now let\'s imagine we quoted `$MYFILE`, the command line now looks like: + + echo The file is named "$MYFILE" + + Word splitting after substitution (quoted!): + ---------------------------------------------- -------- -------- -------- --------- ---------------- + Word 1 Word 2 Word 3 Word 4 Word 5 Word 6 + `echo` `The` `file` `is` `named` `THE FILE.TXT` + +## See also + +- Internal: [Quoting and character escaping](/syntax/quoting) +- Internal: [Word splitting](/syntax/expansion/wordsplit) +- Internal: [Introduction to expansions and + substitutions](/syntax/expansion/intro) +- External: [Grymoire: + Shell quoting](http://www.grymoire.com/Unix/Quote.html)