2023-07-05 11:10:03 +02:00
|
|
|
# Dissect a bad oneliner
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
$ ls *.zip | while read i; do j=`echo $i | sed 's/.zip//g'`; mkdir $j; cd $j; unzip ../$i; cd ..; done
|
|
|
|
```
|
|
|
|
|
|
|
|
This is an actual one-liner someone asked about in `#bash`. **There are
|
2024-03-30 19:22:45 +01:00
|
|
|
several things wrong with it. Let's break it down!**
|
2023-07-05 11:10:03 +02:00
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
$ ls *.zip | while read i; do ...; done
|
|
|
|
```
|
|
|
|
|
|
|
|
(Please read <http://mywiki.wooledge.org/ParsingLs>.) This command
|
|
|
|
executes `ls` on the expansion of `*.zip`. Assuming there are filenames
|
2024-10-08 06:00:17 +02:00
|
|
|
in the current directory that end in '.zip', ls will give a
|
2023-07-05 11:10:03 +02:00
|
|
|
human-readable list of those names. The output of ls is not for parsing.
|
|
|
|
But in sh and bash alike, we can loop safely over the glob itself:
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
$ for i in *.zip; do j=`echo $i | sed 's/.zip//g'`; mkdir $j; cd $j; unzip ../$i; cd ..; done
|
|
|
|
```
|
|
|
|
|
2024-03-30 19:22:45 +01:00
|
|
|
Let's break it down some more!
|
2023-07-05 11:10:03 +02:00
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
j=`echo $i | sed 's/.zip//g'` # where $i is some name ending in '.zip'
|
|
|
|
```
|
|
|
|
|
|
|
|
The goal here seems to be get the filename without its `.zip` extension.
|
|
|
|
In fact, there is a POSIX(r)-compliant command to do this: `basename`
|
|
|
|
The implementation here is suboptimal in several ways, but the only
|
2024-10-08 06:00:17 +02:00
|
|
|
thing that's genuinely error-prone with this is "`echo $i`". Echoing
|
2023-07-05 11:10:03 +02:00
|
|
|
an *unquoted* variable means
|
2024-01-29 02:01:50 +01:00
|
|
|
[wordsplitting](../syntax/expansion/wordsplit.md) will take place, so any
|
2023-07-05 11:10:03 +02:00
|
|
|
whitespace in `$i` will essentially be normalized. In `sh` it is
|
|
|
|
necessary to use an external command and a subshell to achieve the goal,
|
|
|
|
but we can eliminate the pipe (subshells, external commands, and pipes
|
|
|
|
carry extra overhead when they launch, so they can really hurt
|
2024-03-30 19:22:45 +01:00
|
|
|
performance in a loop). Just for good measure, let's use the more
|
2024-01-29 02:01:50 +01:00
|
|
|
readable, [modern](../syntax/expansion/cmdsubst.md) `$()` construct instead
|
2023-07-05 11:10:03 +02:00
|
|
|
of the old style backticks:
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
sh $ for i in *.zip; do j=$(basename "$i" ".zip"); mkdir $j; cd $j; unzip ../$i; cd ..; done
|
|
|
|
```
|
|
|
|
|
2024-03-30 20:09:26 +01:00
|
|
|
In Bash we don't need the subshell or the external basename command.
|
2023-07-05 11:10:03 +02:00
|
|
|
See [Substring removal with parameter
|
2024-01-29 02:01:50 +01:00
|
|
|
expansion](../syntax/pe.md#substring_removal):
|
2023-07-05 11:10:03 +02:00
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
bash $ for i in *.zip; do j="${i%.zip}"; mkdir $j; cd $j; unzip ../$i; cd ..; done
|
|
|
|
```
|
|
|
|
|
2024-03-30 19:22:45 +01:00
|
|
|
Let's keep going:
|
2023-07-05 11:10:03 +02:00
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
$ mkdir $j; cd $j; ...; cd ..
|
|
|
|
```
|
|
|
|
|
|
|
|
As a programmer, you **never** know the situation under which your
|
|
|
|
program will run. Even if you do, the following best practice will never
|
|
|
|
hurt: When a following command depends on the success of a previous
|
2024-10-08 06:00:17 +02:00
|
|
|
command(s), check for success! You can do this with the "`&&`"
|
2023-07-05 11:10:03 +02:00
|
|
|
conjunction, that way, if the previous command fails, bash will not try
|
2024-03-30 19:22:45 +01:00
|
|
|
to execute the following command(s). It's fully POSIX(r). Oh, and
|
2024-01-29 02:01:50 +01:00
|
|
|
remember what I said about [wordsplitting](../syntax/expansion/wordsplit.md)
|
2024-03-30 20:09:26 +01:00
|
|
|
in the previous step? Well, if you don't quote `$j`, wordsplitting can
|
2023-07-05 11:10:03 +02:00
|
|
|
happen again.
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
$ mkdir "$j" && cd "$j" && ... && cd ..
|
|
|
|
```
|
|
|
|
|
2024-04-01 06:10:32 +02:00
|
|
|
That's almost right, but there's one problem -- what happens if `$j`
|
2023-07-05 11:10:03 +02:00
|
|
|
contains a slash? Then `cd ..` will not return to the original
|
2024-03-30 19:22:45 +01:00
|
|
|
directory. That's wrong! `cd -` causes cd to return to the previous
|
|
|
|
working directory, so it's a much better choice:
|
2023-07-05 11:10:03 +02:00
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
$ mkdir "$j" && cd "$j" && ... && cd -
|
|
|
|
```
|
|
|
|
|
|
|
|
(If it occurred to you that I forgot to check for success after cd -,
|
2024-10-08 06:00:17 +02:00
|
|
|
good job! You could do this with `{ cd - || break; }`, but I'm going to
|
2024-03-30 19:22:45 +01:00
|
|
|
leave that out because it's verbose and I think it's likely that we
|
2023-07-05 11:10:03 +02:00
|
|
|
will be able to get back to our original working directory without a
|
|
|
|
problem.)
|
|
|
|
|
|
|
|
So now we have:
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
sh $ for i in *.zip; do j=$(basename "$i" ".zip"); mkdir "$j" && cd "$j" && unzip ../$i && cd -; done
|
|
|
|
```
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
bash $ for i in *.zip; do j="${i%.zip}"; mkdir "$j" && cd "$j" && unzip ../$i && cd -; done
|
|
|
|
```
|
|
|
|
|
2024-03-30 19:22:45 +01:00
|
|
|
Let's throw the `unzip` command back in the mix:
|
2023-07-05 11:10:03 +02:00
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
mkdir "$j" && cd "$j" && unzip ../$i && cd -
|
|
|
|
```
|
|
|
|
|
2024-03-30 19:22:45 +01:00
|
|
|
Well, besides word splitting, there's nothing terribly wrong with this.
|
2023-07-05 11:10:03 +02:00
|
|
|
Still, did it occur to you that unzip might already be able to target a
|
2024-03-30 20:09:26 +01:00
|
|
|
directory? There isn't a standard for the `unzip` command, but all the
|
2024-10-08 06:00:17 +02:00
|
|
|
implementations I've seen can do it with the -d flag. So we can drop
|
2023-07-05 11:10:03 +02:00
|
|
|
the cd commands entirely:
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
$ mkdir "$j" && unzip -d "$j" "$i"
|
|
|
|
```
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
sh $ for i in *.zip; do j=$(basename "$i" ".zip"); mkdir "$j" && unzip -d "$j" "$i"; done
|
|
|
|
```
|
|
|
|
|
2024-10-08 06:00:17 +02:00
|
|
|
```bash
|
2023-07-05 11:10:03 +02:00
|
|
|
bash $ for i in *.zip; do j="${i%.zip}"; mkdir "$j" && unzip -d "$j" "$i"; done
|
|
|
|
```
|
|
|
|
|
2024-03-30 19:22:45 +01:00
|
|
|
There! That's as good as it gets.
|