mirror of
https://github.com/flokoe/bash-hackers-wiki.git
synced 2024-11-01 14:53:06 +01:00
222 lines
9.2 KiB
Markdown
222 lines
9.2 KiB
Markdown
|
# The mapfile builtin command
|
||
|
|
||
|
## Synopsis
|
||
|
|
||
|
mapfile [-n COUNT] [-O ORIGIN] [-s COUNT] [-t] [-u FD] [-C CALLBACK] [-c QUANTUM] [ARRAY]
|
||
|
|
||
|
readarray [-n COUNT] [-O ORIGIN] [-s COUNT] [-t] [-u FD] [-C CALLBACK] [-c QUANTUM] [ARRAY]
|
||
|
|
||
|
## Description
|
||
|
|
||
|
This builtin is also accessible using the command name `readarray`.
|
||
|
|
||
|
`mapfile` is one of the two builtin commands primarily intended for
|
||
|
handling standard input (the other being `read`). `mapfile` reads lines
|
||
|
of standard input and assigns each to the elements of an indexed array.
|
||
|
If no array name is given, the default array name is `MAPFILE`. The
|
||
|
target array must be a \"normal\" integer indexed array.
|
||
|
|
||
|
`mapfile` returns success (0) unless an invalid option is given or the
|
||
|
given array `ARRAY` is set readonly.
|
||
|
|
||
|
Option Description
|
||
|
--------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------
|
||
|
`-c QUANTUM` Specifies the number of lines that have to be read between every call to the callback specified with `-C`. The default QUANTUM is 5000
|
||
|
`-C CALLBACK` Specifies a callback. The string `CALLBACK` can be any shell code, the index of the array that will be assigned, and the line is appended at evaluation time.
|
||
|
`-n COUNT` Reads at most `COUNT` lines, then terminates. If `COUNT` is 0, then all lines are read (default).
|
||
|
`-O ORIGIN` Starts populating the given array `ARRAY` at the index `ORIGIN` rather than clearing it and starting at index 0.
|
||
|
`-s COUNT` Discards the first `COUNT` lines read.
|
||
|
`-t` Remove any trailing newline from a line read, before it is assigned to an array element.
|
||
|
`-u FD` Read from filedescriptor `FD` rather than standard input.
|
||
|
|
||
|
While `mapfile` isn\'t a common or portable shell feature, it\'s
|
||
|
functionality will be familiar to many programmers. Almost all
|
||
|
programming languages (aside from shells) with support for compound
|
||
|
datatypes like arrays, and which handle open file objects in the
|
||
|
traditional way, have some analogous shortcut for easily reading all
|
||
|
lines of some input as a standard feature. In Bash, `mapfile` in itself
|
||
|
can\'t do anything that couldn\'t already be done using read and a loop,
|
||
|
and if portability is even a slight concern, should never be used.
|
||
|
However, it does *significantly* outperform a read loop, and can make
|
||
|
for shorter and cleaner code - especially convenient for interactive
|
||
|
use.
|
||
|
|
||
|
## Examples
|
||
|
|
||
|
Here\'s a real-world example of interactive use borrowed from Gentoo
|
||
|
workflow. Xorg updates require rebuilding drivers, and the
|
||
|
Gentoo-suggested command is less than ideal, so let\'s Bashify it. The
|
||
|
first command produces a list of packages, one per line. We can read
|
||
|
those into the array named \"args\" using `mapfile`, stripping trailing
|
||
|
newlines with the \'-t\' option. The resulting array is then expanded
|
||
|
into the arguments of the \"emerge\" command - an interface to Gentoo\'s
|
||
|
package manager. This type of usage can make for a safe and effective
|
||
|
replacement for xargs(1) in certain situations. Unlike xargs, all
|
||
|
arguments are guaranteed to be passed to a single invocation of the
|
||
|
command with no wordsplitting, pathname expansion, or other monkey
|
||
|
business.
|
||
|
|
||
|
# eix --only-names -IC x11-drivers | { mapfile -t args; emerge -av1 "${args[@]}" <&1; }
|
||
|
|
||
|
Note the use of command grouping to keep the emerge command inside the
|
||
|
pipe\'s subshell and within the scope of \"args\". Also note the unusual
|
||
|
redirection. This is because the -a flag makes emerge interactive,
|
||
|
asking the user for confirmation before continuing, and checking with
|
||
|
isatty(3) to abort if stdin isn\'t pointed at a terminal. Since stdin of
|
||
|
the entire command group is still coming from the pipe even though
|
||
|
mapfile has read all available input, we just borrow FD 1 as it just so
|
||
|
happens to be pointing where we want it. More on this over at greycat\'s
|
||
|
wiki: <http://mywiki.wooledge.org/BashFAQ/024>
|
||
|
|
||
|
### The callback
|
||
|
|
||
|
This is one of the more unusual features of a Bash builtin. As far as
|
||
|
I\'m able to tell, the exact behavior is as follows: If defined, as each
|
||
|
line is read, the code contained within the string argument to the -C
|
||
|
flag is evaluated and executed *before* the assignment of each array
|
||
|
element. There are no restrictions to this string, which can be any
|
||
|
arbitrary code, however, two additional \"words\" are automatically
|
||
|
appended to the end before evaluation: the index, and corresponding line
|
||
|
of data to be assigned to the next array element. Since all this happens
|
||
|
before assignment, the callback feature cannot be used to modify the
|
||
|
element to be assigned, though it can read and modify any array elements
|
||
|
already assigned.
|
||
|
|
||
|
A very simple example might be to use it as a kind of progress bar. This
|
||
|
will print a dot for each line read. Note the escaped comment to hide
|
||
|
the appended words from printf.
|
||
|
|
||
|
$ printf '%s\n' {1..5} | mapfile -c 1 -C 'printf . \#' )
|
||
|
.....
|
||
|
|
||
|
Really, the intended usage is for the callback to just contain the name
|
||
|
of a function, with the extra words passed to it as arguments. If
|
||
|
you\'re going to use callbacks at all, this is probably the best way
|
||
|
because it allows for easy access to the arguments with no ugly \"code
|
||
|
in a string\".
|
||
|
|
||
|
$ foo() { echo "|$1|"; }; mapfile -n 11 -c 2 -C 'foo' <file
|
||
|
|2|
|
||
|
|4|
|
||
|
etc..
|
||
|
|
||
|
For the sake of completeness, here are some more complicated examples
|
||
|
inspired by a question asked in #bash - how to prepend something to
|
||
|
every line of some input, and then output even and odd lines to separate
|
||
|
files. This is far from the best possible answer, but hopefully
|
||
|
illustrates the callback behavior:
|
||
|
|
||
|
$ { printf 'input%s\n' {1..10} | mapfile -c 1 -C '>&$(( (${#x[@]} % 2) + 3 )) printf -- "%.sprefix %s"' x; } 3>outfile0 4>outfile1
|
||
|
$ cat outfile{0,1}
|
||
|
prefix input1
|
||
|
prefix input3
|
||
|
prefix input5
|
||
|
prefix input7
|
||
|
prefix input9
|
||
|
prefix input2
|
||
|
prefix input4
|
||
|
prefix input6
|
||
|
prefix input8
|
||
|
prefix input10
|
||
|
|
||
|
Since redirects are syntactically allowed anywhere in a command, we put
|
||
|
it before the printf to stay out of the way of additional arguments.
|
||
|
Rather than opening \"outfile\<n\>\" for appending on each call by
|
||
|
calculating the filename, open an FD for each first and calculate which
|
||
|
FD to send output to by measuring the size of x mod 2. The zero-width
|
||
|
format specification is used to absorb the index number argument.
|
||
|
|
||
|
Another variation might be to add each of these lines to the elements of
|
||
|
separate arrays. I\'ll leave dissecting this one as an exercise for the
|
||
|
reader. This is quite the hack but illustrates some interesting
|
||
|
properties of printf -v and mapfile -C (which you should probably never
|
||
|
use in real code).
|
||
|
|
||
|
$ y=( 'odd[j]' 'even[j++]' ); printf 'input%s\n' {1..10} | { mapfile -tc 1 -C 'printf -v "${y[${#x[@]} % 2]}" -- "%.sprefix %s"' x; printf '%s\n' "${odd[@]}" '' "${even[@]}"; }
|
||
|
prefix input1
|
||
|
prefix input3
|
||
|
prefix input5
|
||
|
prefix input7
|
||
|
prefix input9
|
||
|
|
||
|
prefix input2
|
||
|
prefix input4
|
||
|
prefix input6
|
||
|
prefix input8
|
||
|
prefix input10
|
||
|
|
||
|
This example based on yet another #bash question illustrates mapfile in
|
||
|
combination with read. The sample input is the heredoc to `main`. The
|
||
|
goal is to build a \"struct\" based upon records in the input file made
|
||
|
up of the numbers following the colon on each line. Every 3rd line is a
|
||
|
key followed by 2 corresponding fields. The showRecord function takes a
|
||
|
key and returns the record.
|
||
|
|
||
|
#!/usr/bin/env bash
|
||
|
|
||
|
showRecord() {
|
||
|
printf 'key[%d] = %d, %d\n' "$1" "${vals[@]:keys[$1]*2:2}"
|
||
|
}
|
||
|
|
||
|
parseRecords() {
|
||
|
trap 'unset -f _f' RETURN
|
||
|
_f() {
|
||
|
local x
|
||
|
IFS=: read -r _ x
|
||
|
((keys[x]=n++))
|
||
|
}
|
||
|
local n
|
||
|
|
||
|
_f
|
||
|
mapfile -tc2 -C _f "$1"
|
||
|
eval "$1"'=("${'"$1"'[@]##*:}")' # Return the array with some modification
|
||
|
}
|
||
|
|
||
|
main() {
|
||
|
local -a keys vals
|
||
|
parseRecords vals
|
||
|
showRecord "$1"
|
||
|
}
|
||
|
|
||
|
main "$1" <<-"EOF"
|
||
|
fabric.domain:123
|
||
|
routex:1
|
||
|
routey:2
|
||
|
fabric.domain:321
|
||
|
routex:6
|
||
|
routey:4
|
||
|
EOF
|
||
|
|
||
|
For example, running `scriptname 321` would output `key[321] = 6, 4`.
|
||
|
Every 2 lines read by `mapfile`, the function `_f` is called, which
|
||
|
reads one additional line. Since the first line in the file is a key,
|
||
|
and `_f` is responsible for the keys, it gets called first so that
|
||
|
`mapfile` starts by reading the second line of input, calling `_f` with
|
||
|
each subsequent 2 iterations. The RETURN trap is unimportant.
|
||
|
|
||
|
## Bugs
|
||
|
|
||
|
- Early implementations were buggy. For example, `mapfile` filling the
|
||
|
readline history buffer with calls to the `CALLBACK`. This was fixed
|
||
|
in 4.1 beta.
|
||
|
- `mapfile -n` reads an extra line beyond the last line assigned to
|
||
|
the array, through Bash. [Fixed in
|
||
|
4.2.35](ftp://ftp.gnu.org/gnu/bash/bash-4.2-patches/bash42-035).
|
||
|
- `mapfile` callbacks could cause a crash if the variable being
|
||
|
assigned is manipulated in certain ways.
|
||
|
<https://lists.gnu.org/archive/html/bug-bash/2013-01/msg00039.html>.
|
||
|
Fixed in 4.3.
|
||
|
|
||
|
## To Do
|
||
|
|
||
|
- Create an implementation as a shell function that\'s portable
|
||
|
between Ksh, Zsh, and Bash (and possibly other bourne-like shells
|
||
|
with array support).
|
||
|
|
||
|
## See also
|
||
|
|
||
|
- [arrays](/syntax/arrays)
|
||
|
- [read](/commands/builtin/read) - If you don\'t know about this yet,
|
||
|
why are you reading this page?
|
||
|
- <http://mywiki.wooledge.org/BashFAQ/001> - It\'s FAQ 1 for a reason.
|