Shell Fundamentals


You’ll need the shell. Here’s enough to make you useful.

FUNDAMENTAL COMMANDS

I’ve compiled a collection of relevant concepts, commands, flags, and approaches for common data-centric uses of the shell, as well as fundamentals you should know.

Directories

Present Working Directory

pwd: Reveals which directory you are currently working in, pwd = present working directory, easy right?

Viewing Directory Contents

ls: Reveals the contents of your pwd

Changing Directories

cd: ‘change directory’, use this to change your pwd

cd .. : Moves you up one level in your folder hierarchy

Example:

home/users/me cd.. –> pwd –> home/users

Copying Files

cp: Copy files, use in the form of cp filename.txt newfilename.txt

You can specify a new directory by specifying in the form of cp filename.txtnewfilename.txt newdirectory

Moving Files

mv: Move file, use in the form of mv filename.txt newdirectory

You can also use this to rename a file, but, this is dangerous because you run the risk of writing over a file with the same name if it already exists.

Deleting Files

rm: remove, use this to delete files.

Run the command in the form of rm filename.txt

You can string multiple rm commands by simply adding filenames and folder hierarchies, Example:

rm filename1.txt filename2.txt /home/users/me/filename3.txt

Deleting Folders

rmdir: removes entire directory, but you have to remove individual files first as a safety feature to prevent a massive loss of data unintentionally

Creating Directories

mkdir: creates a new directory, Example:

mkdir /home/users/me/newdirectory

Viewing and Manipulating Data

Viewing File Contents

cat: displays the contents of a file you specify.

Cat is short for ‘concatenate’ Example:

cat hello.txt —> Hello World!

less: Returns a page worth of information from a file. Use spacebar to view the next page, :n to see a second file if it was specifed, or q to end viewing

head: Returns a small sample of the start of your text file, or if the contents are sufficiently small, it just displays them all

tail: same function as head, but returns ending entries in a file

Shortcuts

Tab auto-completion: Shell has the ability toautocomplete your request, you can simply enter enough of a title to locate it,assuming no conflicting matching titles, and press tab to autocomplete.

This can be extended to include a shortened directory and a shortened filename within that directory. Example: head h <tab> should autocomplete to home

-n will allow us to specify a number of lines to be returned from a file. Example: head -n 15 filename.txt would return the first 15 lines within filename.txt

-R: Yields a recursive list of all subdirectories

-F: places a ‘/’ key following directories, and a ‘*’ after executable programs

Help

man: Returns helpful information about commands, such as man ls would yield a brief synopsis/description

history: Returns a list of recently entered commands into the shell. You can utilize this to automate commands in the list by accessing them via !12 where 12 would indicate the 12th line of the history.

Selecting Columns

cut: Returns columns of a file, structured as follows:

cut -f 1-3, 5-7 -d , filename.csv

This would return columns 1,2,3,5,6,7 with -d signifying a delimiter, which we defined as a comma ‘,’.

Merging Columns

paste: Similar to cut, selects columns, but merges them into a new set of columns.

Regular Expressions

grep: Allows us to search for, and return, lines in a specified document that contain a matching word/string/defined regex. Example: grep ProductA user/me/FinancialStatement.csv would return all lines containing the keyworkd “ProductA”.

Useful flags for a grep call include:

-i: Ignores upper/lower case

-n: Returns the lines with matches

-l: Returns the file names with matches

-c: Returns the number of matches

-v: Returns the number of non-matches

-wc: Returns a word count of matches

-l: Returns a count of lines

-w: Returns a count of words

-c: Returns a count of characters

Pipe

|: Allows us to extract data, and process it with another command.

Example:

tail -10 /user/me/20favoritebooks.txt | head -5 is a convuluted way to first get the bottom 10 entries of my list of 20 favorite books, and then return the upper 5 of that bottom 10.

With piping, you’ll be able to use an output as an input for another command. It’s important to note you’re not only limited to a single input/output relationship, you can string together many input-output-input-output commands in a single call by simply extending with a | pipe

Wildcards

You can simply specify multiple files as we have earlier, but writing multiple file names. However, we can use a wildcard such as * to engage in pattern matching across multiple file names. Example:

head -5 user/me/yearlyfinances/financial20* to return the top 5 lines from every financial record from the year 2000 on.

Other Wildcards:

*: Matches a pattern of any length/composition after the * symbol. A-* would return A-Z or A-8 or A-asojhdoashd12345

[]: Matches characters inside a bracket, ba[td] would match bat or bad

{}: Matches patterns inside, such as {200*, 300*, 400*}, returning only values from 2000-2009, 3000-3009, 4000-4009.

?: Matches a single character. Andre? would return Andrew, Andrea, but not Andrews or Andreas.

Sorting Lines

sort: Allows us to engage in ordering.

We can use a few flags to specify the kind of ordering we’d prefer.

-n: Sorts numerically

-r: Sorts reversed

-f: Ignore upper/lower case

Handling Duplicates

uniq: Removes duplicate lines, but only if they are directly adjacent.

Written on November 30, 2017