yanbin's Blog

Simple Shell Programming FAQ (0)

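0. How can a sed command use the value of a shell script variable?

Enclose the expression in double quotes ("). Leaving the expression unquoted also expands variables, but it only works when the resulting expression contains no whitespace or glob characters.

'expression' is the usual way to pass an expression to sed.

"expression" can be used to pass the expression argument as well.

How the shell treats arguments:

1) Characters inside single quotes (') are taken literally. $ is just the character $ and does not expand a variable;

2) Inside double quotes ("), $ expands a variable's value and \ escapes the next character; most other characters are still taken literally;

3) An argument with no quotes at all is treated much like a double-quoted one, except that the shell also splits it on whitespace and expands glob patterns, so an unquoted space breaks the expression into several arguments.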

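# sed receives $ literally, so here it matches a literal '$' character.
sed 's/A $foobar value/foobar/g' foobar.txt

# $foobar expands to the value of the variable foobar. The escaped \$ matches a literal '$'.
# Variable expansion and escape processing are already done by the time the argument reaches sed.
# It is the shell, not sed, that performs them.
sed "s/A $foobar \$value/foobar/g" foobar.txt

# Unquoted, the spaces must be escaped to keep the expression in one argument.
# This equals the double-quoted form only if the value of $foobar has no whitespace or glob characters.
sed s/A\ $foobar\ \$value/foobar/g foobar.txt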
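The difference between the two quoting styles can be checked with a small self-contained sketch. The scratch file, its two lines, and the value of foobar are made up for illustration; only the quoting behavior comes from the text above.

```shell
#!/bin/sh
# Hypothetical demo data: two lines that each contain a literal '$'.
tmp=$(mktemp)
printf '%s\n' 'A bar $value here' 'A $foobar value here' > "$tmp"

foobar=bar   # hypothetical shell variable

# Single quotes: sed receives "$foobar" literally, so only the second line matches.
single=$(sed 's/A $foobar value/X/g' "$tmp")

# Double quotes: the shell expands $foobar to "bar" and turns \$ into "$",
# so sed receives s/A bar $value/X/g and only the first line matches.
double=$(sed "s/A $foobar \$value/X/g" "$tmp")

printf 'single quotes:\n%s\n\n' "$single"
printf 'double quotes:\n%s\n' "$double"

rm -f "$tmp"
```

Prefixing a command with echo (for example echo sed "s/A $foobar \$value/X/g") is a quick way to see the argument sed actually receives after the shell has processed it.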

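1. How can sed delete one or more lines directly from a file?

Use sed's d command.

# People unfamiliar with sed often write code like this. It is error-prone and wastes resources.
cat foobar.txt | sed 's/pattern to match//gp' > tmp_file.txt
mv tmp_file.txt foobar.txt

# sed's d command together with the -i option edits the file in place.
sed -i '/pattern to match/d' foobar.txt

# Only print the lines that did not match (that is, were not deleted), without modifying the file.
sed '/pattern to match/d' foobar.txt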

-i[SUFFIX], --in-place[=SUFFIX]
edit files in place (makes backup if SUFFIX supplied)

# -i takes an optional suffix; when it is given, sed backs up the file before modifying it.
sed -i.bak '/pattern to match/d' foobar.txt
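A minimal sketch of in-place deletion with a backup, run against a scratch file. The file contents and the pattern "drop" are invented for the demo; the last command additionally assumes that a line-number range such as 2,3 is an acceptable address for d, which is standard sed behavior.

```shell
#!/bin/sh
# Hypothetical demo data in a scratch directory.
tmp_dir=$(mktemp -d)
file="$tmp_dir/foobar.txt"
printf '%s\n' 'keep 1' 'drop me' 'keep 2' 'drop me too' > "$file"

# Delete every line containing "drop" in place; the original is kept as foobar.txt.bak.
sed -i.bak '/drop/d' "$file"

remaining=$(cat "$file")
backup_lines=$(grep -c '' "$file.bak")   # number of lines in the backup

# Deleting by line number: print the backup without lines 2 through 3 (no -i, file untouched).
range_deleted=$(sed '2,3d' "$file.bak")

printf 'after sed -i.bak:\n%s\n\n' "$remaining"
printf 'backup still has %s lines\n\n' "$backup_lines"
printf 'backup without lines 2-3:\n%s\n' "$range_deleted"

rm -rf "$tmp_dir"
```

Writing the suffix directly after -i (as in -i.bak) is the form shown in the man excerpt above.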

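2. How to find the lines that appear in one file but not in the other?

Use the diff command with --LTYPE-line-format="" options; the lines of both files must already be sorted.

# Print the lines that appear in file1 but not in file2.
diff --new-line-format="" --unchanged-line-format="" file1 file2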

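How it works:

The --new-line-format, --unchanged-line-format, and --old-line-format options control how diff formats three groups of lines in its output:

lines added (in file2), unchanged lines, and lines deleted (from file2).

Setting an option to "" means the lines of that group are not printed.

(a) --new-line-format="": lines that appear only in file2 are not printed;

(b) --unchanged-line-format="": lines that appear in both files are not printed;

(c) --new-line-format="" is given but --old-line-format is not, so diff falls back to its default and prints the old lines, which appear only in file1, as they are.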

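Put the other way around, assume a new_file and an old_file, both sorted:

# Print the lines that appear in new_file but not in old_file.
# --old-line-format="" is given but --new-line-format is not, so diff prints the new lines as they are, without the leading '>'.
diff --old-line-format="" --unchanged-line-format="" old_file new_file

If the files are not sorted, they can be sorted on the fly with <(sort file1) (process substitution, available in bash, zsh, and ksh):

diff --new-line-format="" --unchanged-line-format="" <(sort file1) <(sort file2)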
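Both directions can be tried with the sketch below. It assumes GNU diff (the --LTYPE-line-format options are GNU extensions), and the file names and fruit lists are invented. Instead of process substitution it sorts into scratch files, which has the same effect and also runs in a plain sh.

```shell
#!/bin/sh
# Hypothetical demo data: two unsorted files sharing some lines.
tmp_dir=$(mktemp -d)
printf '%s\n' cherry apple banana > "$tmp_dir/file1"
printf '%s\n' banana date apple   > "$tmp_dir/file2"

# Sort first; this stands in for <(sort file1) and <(sort file2).
sort "$tmp_dir/file1" > "$tmp_dir/file1.sorted"
sort "$tmp_dir/file2" > "$tmp_dir/file2.sorted"

# diff exits with status 1 when the files differ, hence the "|| true".
only_in_file1=$(diff --new-line-format="" --unchanged-line-format="" \
    "$tmp_dir/file1.sorted" "$tmp_dir/file2.sorted") || true
only_in_file2=$(diff --old-line-format="" --unchanged-line-format="" \
    "$tmp_dir/file1.sorted" "$tmp_dir/file2.sorted") || true

echo "only in file1: $only_in_file1"
echo "only in file2: $only_in_file2"

rm -rf "$tmp_dir"
```

Swapping which of --new-line-format and --old-line-format is emptied is all it takes to switch direction; the order of the two file arguments can stay the same.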

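3. How to sort by a specified key (certain characters) with sort? How to deduplicate by the first N characters of each line with uniq?

Use sort's --key option and uniq's -w option.

# Sort using characters 1 through 15 of each line as the key.
# In F.C, F is the field number and C the character position within it, both counted from 1;
# a plain 1,15 would mean fields 1 through 15. With GNU sort the character offset may run past the end of field 1.
sort --key 1.1,1.15 foobar.txt
# uniq deduplicates on the first 12 characters of each line.
sort --key 1.1,1.15 foobar.txt | uniq -w 12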

sort:

-k, --key=KEYDEF sort via a key; KEYDEF gives location and type

KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position

uniq:

-w, --check-chars=N compare no more than N characters in lines
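As a rough illustration of the two options together, the sketch below sorts made-up log lines by their 10-character date prefix and then keeps one line per date. It assumes GNU coreutils (uniq -w is a GNU option); the data and the key width of 10 are chosen for the demo, and LC_ALL=C is set only to make the ordering independent of the locale.

```shell
#!/bin/sh
# Hypothetical demo data: each line starts with a 10-character date.
tmp=$(mktemp)
printf '%s\n' \
    '2024-01-02 10:00 beta' \
    '2024-01-01 09:00 alpha' \
    '2024-01-02 08:00 gamma' \
    '2024-01-01 12:00 delta' > "$tmp"

# Key = characters 1 through 10 of field 1 (the date).
# Lines with equal keys are ordered by comparing the whole line.
sorted=$(LC_ALL=C sort --key 1.1,1.10 "$tmp")

# Keep only the first line of each run that shares the same first 10 characters.
deduped=$(LC_ALL=C sort --key 1.1,1.10 "$tmp" | uniq -w 10)

printf 'sorted:\n%s\n\n' "$sorted"
printf 'one line per date:\n%s\n' "$deduped"

rm -f "$tmp"
```

uniq only compares adjacent lines, which is why the input is sorted on the same prefix first.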

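4. How to stop a program when it runs past a time limit?

The timeout program: run a command with a time limit.

# After 5 seconds, send SIGTERM to command to end its execution.
timeout 5 /path/to/slow/command with options

If the command times out and --preserve-status is not given, timeout exits with status 124.

timeout supports more flexible usage; man timeout describes the other options.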
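The exit statuses can be observed with a short sketch, assuming GNU coreutils timeout. Here sleep 5 merely stands in for a slow command, and the 1-second limit is chosen to keep the demo quick.

```shell
#!/bin/sh
# A command that exceeds the limit: timeout sends SIGTERM after 1 second.
status=0
timeout 1 sleep 5 || status=$?
echo "timed out, exit status: $status"          # 124 signals that the limit was hit

# A command that finishes in time keeps its own exit status.
fast_status=0
timeout 5 true || fast_status=$?
echo "finished in time, exit status: $fast_status"

# With --preserve-status, the status of the killed command is reported instead of 124.
preserved=0
timeout --preserve-status 1 sleep 5 || preserved=$?
echo "with --preserve-status: $preserved"
```

Checking for 124 lets a script tell a timeout apart from an ordinary failure of the command itself.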