# Pathname expansion (globbing)
## General
Unlike on other platforms you may have seen, on UNIX®, the shell is
responsible for interpreting and expanding globs ("filename
wildcards"). A called program will never see the glob itself; it will
only see the expanded filenames as its arguments (here, all filenames
matching `*.log`):

    grep "changes:" *.log

The base syntax for the pathname expansion is the [pattern
matching](../../syntax/pattern.md) syntax. The pattern you describe is matched
against all existing filenames and the matching ones are substituted.
Since this substitution happens **after [word
splitting](../../syntax/expansion/wordsplit.md)**, all resulting filenames are
literal and treated as separate words, no matter how many spaces or
other `IFS`-characters they contain.
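
The difference between globbing and word splitting can be checked with a small experiment. This is only a sketch: the scratch directory and the file names are made up for illustration, and it assumes the default `IFS` and default shell options.

```bash
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Two files; one name contains spaces.
touch "plain.log" "name with spaces.log"

# Pathname expansion: each matching filename becomes exactly one word.
set -- *.log
glob_count=$#

# Contrast: an unquoted variable expansion IS subject to word splitting.
names="name with spaces.log"
set -- $names
split_count=$#

echo "glob produced $glob_count words, unquoted variable produced $split_count"

# Clean up the scratch directory.
cd / && rm -rf -- "$tmpdir"
```

The glob yields two words (one per file), while the unquoted variable is split into three.
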
## Normal behaviour
- with [the set command](../../commands/builtin/set.md) (`-f`, `noglob`) you
can entirely disable pathname expansion
- when matching a pathname, the slash-character (`/`) always needs to
be matched explicitly
- the dot at the beginning of a filename must be matched explicitly
(also one following a `/` in the glob)
- a glob that doesn't match a filename is unchanged and remains what
it is
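
The rules above can be tried out in a throwaway directory. The following is a minimal sketch, assuming default shell options (no `dotglob`, `nullglob` or `failglob`); the directory layout and variable names are hypothetical.

```bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
mkdir sub
touch visible.txt .hidden.txt sub/nested.txt

# A leading dot must be matched explicitly: * skips .hidden.txt
set -- *.txt
default_match="$*"        # visible.txt

set -- .*.txt
dot_match="$*"            # .hidden.txt

# The slash must be matched explicitly: * never crosses a directory boundary
set -- */*.txt
slash_match="$*"          # sub/nested.txt

# A glob that matches nothing stays as the literal pattern
set -- *.nomatch
unmatched="$*"            # *.nomatch

# set -f (noglob) turns pathname expansion off entirely
set -f
set -- *.txt
noglob_result="$*"        # *.txt (literal, not expanded)
set +f

# Clean up the scratch directory.
cd / && rm -rf -- "$tmpdir"
```
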
## Customization
- when the shell option `nullglob` is set, non-matching globs are
removed, rather than preserved
- when the shell option `failglob` is set, non-matching globs produce
an error message and the current command is not executed
- when the shell option `nocaseglob` is set, the match is performed
case-insensitive
- when the shell option `dotglob` is set, wildcard-characters can
match a dot at the beginning of a filename
- when the shell option `dirspell` is set, Bash attempts spelling
correction on directory names during word completion (this affects
completion, not the expansion of globs)
- when the shell option `globstar` is set, the glob `**` will
recursively match all files and directories. This glob isn't
"configurable", i.e. you **can't** do something like `**.c` to
recursively get all `*.c` filenames; use `**/*.c` for that.
- when the shell option `globasciiranges` is set, the bracket-range
globs (e.g. `[A-Z]`) use C locale order rather than the configured
locale's order (i.e. `ABC...abc...` instead of e.g. `AaBbCc...`);
available since 4.3-alpha
- the variable [GLOBIGNORE](../../syntax/shellvars.md#GLOBIGNORE) can be set
to a colon-separated list of patterns; filenames matching any of these
patterns are removed from the list of matches before it is returned
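
A few of these settings can be compared side by side. The sketch below assumes Bash 4.0 or newer (for `globstar`) and otherwise default shell options; the file layout and variable names are invented for the example.

```bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
mkdir -p src/lib
touch main.c test.c src/util.c src/lib/deep.c README.TXT .config

# globstar: **/ descends through zero or more directories
shopt -s globstar
set -- **/*.c
globstar_count=$#         # main.c, test.c, src/util.c, src/lib/deep.c
shopt -u globstar

# nocaseglob: *.txt also matches README.TXT
shopt -s nocaseglob
set -- *.txt
nocase_match="$*"
shopt -u nocaseglob

# dotglob: * now includes .config (but never . or ..)
set -- *
plain_count=$#
shopt -s dotglob
set -- *
dotglob_count=$#
shopt -u dotglob

# GLOBIGNORE: matching names are dropped from the result
GLOBIGNORE='test.c'
set -- *.c
globignore_match="$*"
unset GLOBIGNORE

echo "globstar: $globstar_count, nocaseglob: $nocase_match"
echo "plain *: $plain_count, dotglob *: $dotglob_count, GLOBIGNORE: $globignore_match"

# Clean up the scratch directory.
cd / && rm -rf -- "$tmpdir"
```

Each option is switched off again right after use, so the individual effects do not interfere with one another.
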
### nullglob
Normally, when a glob matches no existing filename, no
pathname expansion is performed, and the glob is <u>**not**</u>
removed:

    $ echo "Textfiles here:" *.txt
    Textfiles here: *.txt

In this example, no files matched the pattern, so the glob was left
intact (a literal asterisk, followed by dot-txt).
This can be very annoying, for example when you drive a
[for-loop](../../syntax/ccmd/classic_for.md) using the pathname expansion:

    for filename in *.txt; do
      echo "=== BEGIN: $filename ==="
      cat "$filename"
      echo "=== END: $filename ==="
    done

When no file name matches the glob, the loop will not only output
nonsensical text ("`BEGIN: *.txt`"), but will also make the `cat`-command
fail with an error, since no file named `*.txt` exists.
Now, when the shell option `nullglob` is set, Bash will remove the
entire glob from the command line. In case of the for-loop here, not
even one iteration will be done. It just won't run.

So in our first example:

    $ shopt -s nullglob
    $ echo "Textfiles here:" *.txt
    Textfiles here:

and the glob is gone.
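
The effect on a loop can be measured by counting iterations in an empty directory. This is a minimal sketch with hypothetical counter names; the last loop shows one common alternative that avoids changing shell options, assuming that skipping non-existent names is acceptable for the task at hand.

```bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1    # empty directory: *.txt matches nothing

# Default behaviour: the loop runs once with the literal pattern
without_nullglob=0
for filename in *.txt; do
    without_nullglob=$((without_nullglob + 1))
done

# With nullglob: the glob expands to nothing, so the body never runs
shopt -s nullglob
with_nullglob=0
for filename in *.txt; do
    with_nullglob=$((with_nullglob + 1))
done
shopt -u nullglob         # restore the default afterwards

# Alternative without nullglob: skip names that do not exist
guarded=0
for filename in *.txt; do
    [ -e "$filename" ] || continue
    guarded=$((guarded + 1))
done

echo "default: $without_nullglob, nullglob: $with_nullglob, guarded: $guarded"

# Clean up the scratch directory.
cd / && rm -rf -- "$tmpdir"
```

Switching `nullglob` off again after the loop is deliberate: while it is set, a command like `ls *.txt` with no matches turns into a plain `ls` and lists everything.
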
### Glob characters
- `*` - means 'match any number of characters' (including none). '/' is
not matched (and depending on your settings, things like a leading '.'
may or may not be matched, see above)
- `?` - means 'match any single character'
- `[abc]` - match any of the characters listed. This syntax also
supports ranges, like `[0-9]`
For example, to match something beginning with either 'S' or 'K'
followed by two digits, followed by at least 3 more characters:

    [SK][0-9][0-9]???*

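
To see which names this pattern accepts, it can be run against a handful of test files. The names below are invented for illustration, and default shell options are assumed.

```bash
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Candidates: the first three should match, the rest should not.
touch S12abc K99xyz123 K00___    # S or K, two digits, three or more characters
touch S1abcd                     # only one digit after the S
touch X12abc                     # wrong first letter
touch S12ab                      # only two characters after the digits

set -- [SK][0-9][0-9]???*
match_count=$#
matches=" $* "                   # padded with spaces for the membership checks

echo "matched $match_count names:$matches"

# Clean up the scratch directory.
cd / && rm -rf -- "$tmpdir"
```
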
## See also
- [Introduction to expansion and
substitution](../../syntax/expansion/intro.md)
- [pattern matching syntax](../../syntax/pattern.md)
- [the set builtin command](../../commands/builtin/set.md)
- [the shopt builtin command](../../commands/builtin/shopt.md)
- [list of shell options](../../internals/shell_options.md)