One-liners

From Freephile Wiki
Sometimes one-liners are so cool, you just want to remember them.  And good one-liners can also teach you the intricacies and features of the [[Bash]] shell.  Although there are better sites on the Internet for [http://www.bashoneliners.com/ finding one-liners], [http://www.catonmat.net/series/bash-one-liners-explained understanding one-liners] or [http://uni.xkcd.com/ playing on the command line], we'd still like to illustrate a few here.


==Find big files or directories==
[[File:Anas platyrhynchos (mixed pair) (32428014687).jpg|alt=ducks|thumb|ducks]]
Help! I'm out of disk space. How do I find out where the big files or directories are that are consuming all storage?
 
<source lang="bash">du -cks -- * | sort -rn | head</source>
<code>du -cks</code> is short for <code>--total --block-size=1K --summarize</code>, and the double dash marks the end of options, so that filenames beginning with a dash are not misread as flags. The asterisk is the glob character that matches 'everything in this directory', so each file and directory in the current working directory is summarized. This is then piped to <code>sort</code> with the reverse, numeric options and then piped to <code>head</code> for showing just the top 10. Adjust to taste.
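If you prefer human-readable sizes, the same idea works with GNU <code>sort</code>'s <code>-h</code> option, which understands the K/M/G suffixes that <code>du -h</code> emits (an assumption: this requires GNU coreutils 7.5 or later):
<source lang="bash">
du -hs -- * | sort -rh | head
</source>
The trade-off is that human-readable output is harder to post-process numerically than raw 1K blocks.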
 
==Mount remote filesystem==
[[SSHFS]] is a great tool for mounting remote filesystems so that you can use your local tools on them.  This example supplies a complex SSH command, including port-forwarding at the same time, to the SSHFS tool.
<source lang="bash">sshfs -o idmap=user -o ssh_command='ssh -t -i /home/greg/.ssh/eQualityTech-Test.pem -o IdentitiesOnly=true -o ForwardAgent=true -L 127.0.0.1:43306:10.0.50.53:3306 centos@ec2-52-203-160-83.compute-1.amazonaws.com ssh -A' centos@10.0.50.161:/ /mnt/es1</source>
 
==Compare two wikis for extensions and skins==
This one-liner invokes the API of two wikis, asking the siteinfo meta query for the general, extensions and skins properties in JSON format.  Since that data is returned without any newlines, we use <code>jq</code> to pretty-print the JSON output.  Then it's an easy <code>meld</code> or <code>diff</code> to compare them.  The <code>--silent</code> option to <code>curl</code> just suppresses the connection and retrieval metadata, while <code>-L</code> is customary to follow redirects.
<source lang="bash">
A='https://freephile.org/' B='https://www.mediawiki.org/' API='w/api.php?action=query&meta=siteinfo&siprop=general%7Cextensions%7Cskins&format=json' meld <(curl --silent -L "${A}${API}" | jq '.') <(curl --silent -L "${B}${API}" | jq '.')</source>
 
==Perl edit==
Sometimes you want to make a bunch of changes (substitutions) of the same text across multiple files.  Like changing a product name across multiple pages of documentation.  With a one-line perl command, you can do just that.  Furthermore, the example below uses an <code>ls</code> command to select which files to operate on -- giving you even more powerful control over your one-line edit.
<source lang="perl">
perl -p -i -e "s/lemons/lemonade/" $(/bin/ls my/life*)
</source>
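Before editing in place, it is worth previewing the result: dropping <code>-i</code> makes <code>perl -p</code> print the transformed text to STDOUT and leaves the files untouched (the <code>my/life*</code> path here is just the illustrative one from above):
<source lang="bash">
# preview only: without -i, nothing is written back to the files
perl -p -e "s/lemons/lemonade/" my/life*
</source>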
 
==Free Memory==
Use <code>echo</code> to output the result of a sub-shell, and a few extra characters (' - + p'), which is then piped to the (reverse-polish) desk calculator.  Con<code>cat</code>enate the /proc/meminfo file, printing it on STDOUT. Using extended-regex <code>grep</code>, we search for lines of output that begin with "MemFree", "Cached" or "Writeback" followed by the colon character.  Piping to <code>awk</code>, we can print out the string in position 2 of each line.  Those values are ultimately processed in the calculator: <code>-</code> pops the last two numbers off the stack and subtracts (Cached minus Writeback), <code>+</code> adds that result to the first number (MemFree), and <code>p</code> prints the total.<ref>[http://www.computerweekly.com/feature/Write-through-write-around-write-back-Cache-explained Cache explained]</ref>
<source lang="bash">
echo $(cat /proc/meminfo | egrep '^(MemFree|Cached|Writeback):' | awk '{print $2}') - + p | dc
</source>
Result:
<pre>
3033240
</pre>
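The pipeline can also collapse into a single <code>awk</code> invocation that computes the same figure (MemFree + Cached - Writeback) without <code>grep</code> or <code>dc</code>:
<source lang="bash">
awk '/^MemFree:/ {free=$2} /^Cached:/ {cached=$2} /^Writeback:/ {wb=$2} END {print free + cached - wb}' /proc/meminfo
</source>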


==Size of Graphical Desktop (X Window System)==
So you think your graphical desktop is slowing things down compared to using a pure console-based system.  Short of logging in to single-user mode, how much memory does the graphical desktop consume?  Since everything is a file, we can look in the folder for processes (/proc), and specifically the folder created for the process id of "X" ([http://x.org X.org]).  <code>grep</code>ping for the line starting with 'VmSize', we can see the virtual memory size of our graphical desktop.
<source lang="bash">grep ^VmSize /proc/$(pidof X)/status</source>
Result:
<pre>
VmSize:  158212 kB
</pre>
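Note that VmSize counts mapped virtual address space; if you care about RAM actually in use, the resident set (VmRSS) line in the same status file is often more telling. As a self-contained demonstration, this reads the status file of the <code>grep</code> process itself:
<source lang="bash">
grep -E '^Vm(Size|RSS)' /proc/self/status
</source>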
==Delete old stuff==
You stumble upon a directory full of backups, which is great.  But you also realize that nobody set up <code>logrotate</code> or any other tool to prune old content.  Maybe that's because these backups are produced manually, say during upgrades, and so they are also deleted manually.  What's a quick one-liner to remove old files?  Use the <code>-mtime</code> (modification time) option to <code>find</code> combined with the <code>-exec</code> option to execute <code>rm</code> (remove) on said files.
<source lang="bash">
# Make sure we've got backups; look for recent files
sudo ls -al /backups
# list everything in the backups folder that's older than 30 days
sudo find /backups -mtime +30 -ls
# OK, delete those files
sudo find /backups -mtime +30 -exec rm {} \;
</source>
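If your <code>find</code> is GNU (the <code>-delete</code> action is an extension, not POSIX), you can prune without spawning an <code>rm</code> process per file; adding <code>-type f</code> keeps directories out of harm's way:
<source lang="bash">
sudo find /backups -type f -mtime +30 -delete
</source>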
==Reports with Find==
Want to see all the <code>.htaccess</code> files in your webroot and see what they do?  You can use <code>-exec bash -c</code> to perform multiple commands with one exec (you can also use multiple <code>-exec</code> options in find).  The example below echoes the name of the found file, then <code>cat</code>s it with numbered lines. Note that the underscore is a throwaway value (it could be any text, such as 'foobar') which consumes the first positional argument ($0) to <code>bash -c</code>, making it "more readable" to reference our found filename as $1 (since $0 is commonly understood to refer to the script itself).
<source lang="bash">
# All give similar output
find _mw -name .htaccess -exec bash -c 'echo -e "\n$1\n"; cat -n "$1"' _ '{}' \;
find _mw -name .htaccess -exec bash -c 'echo -e "\n$0\n"; cat -n "$0"' '{}' \;
find _mw -name .htaccess -exec bash -c 'echo -e "\n$0$1\n"; cat -n "$1"' 'Reporting on '  '{}' \;
find _mw -name .htaccess -exec echo -e "\nReporting on " '{}' "\n" \; -exec cat -n '{}' \;
</source>
<ref>https://stackoverflow.com/questions/5119946/find-exec-with-multiple-commands</ref>
And this one reports on the variables defined in your /opt/conf-meza/public "config" directory:
<source lang="bash">
find . \( -name '*yml' -o -name '*php' \) -exec bash -c 'echo -e "\n$0\n"; grep --perl-regexp --only-matching "^\s*(\\\$[^\[ ]+)" '{}' | sed -e "s/^[[:space:]]*//" | sort -u ' '{}' \;
</source>
==Split a big file==
Say you have a file with 50,000 lines in it, which becomes unwieldy to deal with in a spreadsheet or otherwise.  You can easily split the file into segments with the <code>split</code> command. By default it uses alphabetic suffixes (little_file.aa, little_file.ab, etc.). If you add the option <code>--numeric-suffixes</code>, then you'll end up with little_file.00, little_file.01, etc.  If you would like to re-add the original suffix, then you must use the option called <code>--additional-suffix</code>.
The following command takes BIG_FILE.txt and for every 10,000 lines of that file, it generates new files called 'little_file.00.txt', 'little_file.01.txt', 'little_file.02.txt', and so on.
<source lang="bash">
split --lines=10000 --numeric-suffixes --additional-suffix='.txt' BIG_FILE.txt little_file.
</source>
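Reassembling later is easy, because the zero-padded numeric suffixes sort in order when the shell expands the glob (REJOINED.txt is an illustrative name):
<source lang="bash">
# Round-trip check: the glob expands the chunks in numeric order
cat little_file.*.txt > REJOINED.txt
cmp BIG_FILE.txt REJOINED.txt && echo 'round-trip OK'
</source>
If you expect more than 100 chunks, the default two-character suffix may be exhausted; widening it with <code>-a 3</code> should cover that case.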
== "BASH is better" ==
Most job postings that focus on DevOps have requirements for [[Python]], [[Go]] programming or some other programming language. I disagree that a [[DevOps]] Engineer should also be a programmer. I prioritize quality and workmanship (craftsmanship) which is '''informed''' by broad experience, but '''honed''' by specialization. As a construction analogy, I prefer individual skilled trades over the general handyman approach. Simply put: DevOps is DevOps; it is not programming.  Worse, the requirement for the hot language of the day is an even bigger tell-tale sign that the company is either posturing or doesn't know what it's doing. Back when I first learned Perl (which isn't the hot new language anymore), there was a hilarious t-shirt that said "Be careful or I'll replace you with a line of code"<ref>Dave Jacoby agrees with me on the broad point that (programming) languages are just different domain dialects, and also cites the ThinkGeek t-shirt phrase "Go Away Or I Will Replace You With a Small Shell Script"
https://jacoby.github.io/2021/11/16/i-will-replace-you-with-a-small-shell-script.html</ref>. Although you ''could'' write the Python example more concisely, it is a real-world example of code that I found that does the same thing as 5 lines of BASH that I wrote.
[[BASH]] code to concatenate [[certbot]] certificates:<syntaxhighlight lang="bash">
#!/bin/bash
# $RENEWED_DOMAINS will contain a space-delimited list of renewed
# certificate domains (for example, "example.com www.example.com")
# loop through a dynamic list of directories in 'live'
# for SITE in $(find /etc/letsencrypt/live -mindepth 1 -maxdepth 1 -type d -exec basename {} \;)
# $RENEWED_LINEAGE will contain the live subdirectory
for SITE in $RENEWED_DOMAINS
do
        # move to the correct Let's Encrypt directory
        cd "$RENEWED_LINEAGE"
        # cat files to make a combined .pem for haproxy
        cat fullchain.pem privkey.pem > "/etc/haproxy/certs/${SITE}.pem"
done
# reload haproxy
# systemctl reload haproxy
</syntaxhighlight>Python code to concatenate certbot certificates:<syntaxhighlight lang="python3">
#!/usr/bin/env python3
import os
import re
import sys
# Certbot sets an environment variable RENEWED_LINEAGE, which points to the
# path of the renewed certificate. We use that path to determine and find
# the files for the currently renewed certificate.
lineage = os.environ.get('RENEWED_LINEAGE')
# If nothing renewed, exit
if not lineage:
    sys.exit()
# From the lineage, we strip the 'domain name', which is the last part
# of the path.
result = re.match(r'.*/live/(.+)$', lineage)
# If we cannot recognize the path, we exit with 1
if not result:
    sys.exit(1)
# Extract the domain name
domain = result.group(1)
# Define a path for HAproxy where you want to write the .pem file.
deploy_path = "/etc/haproxy/ssl/" + domain + ".pem"
# The source files can be found in below paths, constructed with the lineage
# path
source_key = lineage + "/privkey.pem"
source_chain = lineage + "/fullchain.pem"
# HAproxy requires the key and chain combined in one .pem file
with open(deploy_path, "w") as deploy, \
        open(source_key, "r") as key, \
        open(source_chain, "r") as chain:
    deploy.write(key.read())
    deploy.write(chain.read())
# Here you can add your service reload command. Which will be executed after
# every renewal, which is fine if you only have a few domains.
# Alternative is to add the reload to the --post-hook. In that case it is only
# run once after all renewals. That would be the use-case if you have a large
# number of different certificates served by HAproxy.
</syntaxhighlight>
{{References}}


[[Category:Bash]]
[[Category:System Administration]]
[[Category:Files]]

Latest revision as of 14:58, 9 December 2024