Shell Fundamentals
You’ll need the shell. Here’s enough to make you useful.
FUNDAMENTAL COMMANDS
I’ve compiled a collection of relevant concepts, commands, flags, and approaches for common data-centric uses of the shell, as well as fundamentals you should know.
Directories
Present Working Directory
pwd
: Prints the directory you are currently working in. pwd is short for ‘print working directory’, though you will also hear it called the present working directory.
Viewing Directory Contents
ls
: Reveals the contents of your pwd
Changing Directories
cd
: ‘change directory’, use this to change your pwd
cd ..
: Moves you up one level in your folder hierarchy
Example:
home/users/me
cd .. –> pwd –> home/users
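To see pwd, cd, and ls working together, here is a minimal sketch. It builds a throwaway copy of the home/users/me layout using mktemp and mkdir -p (both are my additions for the demo, not part of the walkthrough above), so nothing on your real system is touched.

```shell
# Build a scratch directory tree that mirrors the example above.
demo=$(mktemp -d)
mkdir -p "$demo/home/users/me"

cd "$demo/home/users/me"
pwd    # prints a path ending in home/users/me

cd ..  # note the space between cd and ..
pwd    # prints a path ending in home/users

ls     # prints: me
```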
Copying Files
cp
: Copies a file. Use it in the form cp filename.txt newfilename.txt
To copy into a different directory, give the directory as the destination: cp filename.txt newdirectory
Moving Files
mv
: Moves a file. Use it in the form mv filename.txt newdirectory
You can also use this to rename a file (mv oldname.txt newname.txt), but be careful: if a file with the new name already exists, it is overwritten without warning.
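One way to guard against that overwrite is to check whether the target exists before renaming. The sketch below is only an illustration: the file names notes.txt and report.txt are made up, and the [ -e ... ] test is my addition. If you are typing commands by hand, mv -i does a similar job by asking before it overwrites anything.

```shell
# Scratch directory with two hypothetical files.
demo=$(mktemp -d)
cd "$demo"
printf 'draft\n' > notes.txt
printf 'keep me\n' > report.txt

# A plain "mv notes.txt report.txt" would silently replace report.txt.
# Checking first leaves the existing file alone.
if [ -e report.txt ]; then
    echo "report.txt already exists, not renaming"
else
    mv notes.txt report.txt
fi

cat report.txt   # prints: keep me
```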
Deleting Files
rm
: remove, use this to delete files.
Run the command in the form of rm filename.txt
You can delete several files with a single rm
command by listing their filenames or paths one after another. Example:
rm filename1.txt filename2.txt /home/users/me/filename3.txt
Deleting Folders
rmdir
: Removes a directory, but only if it is empty. You have to remove the files inside it first,
a safety feature that prevents you from unintentionally losing a large amount of data
Creating Directories
mkdir
: creates a new directory, Example:
mkdir /home/users/me/newdirectory
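Here is a short sketch of how mkdir, rm, and rmdir fit together. It runs in a scratch directory, and touch (used only to create empty files for the demo) is my addition.

```shell
demo=$(mktemp -d)
cd "$demo"

mkdir newdirectory
touch newdirectory/filename1.txt newdirectory/filename2.txt

# rmdir refuses because the directory still holds files.
rmdir newdirectory 2>/dev/null || echo "rmdir refused: directory not empty"

# One rm call can take several filenames.
rm newdirectory/filename1.txt newdirectory/filename2.txt

# The directory is empty now, so rmdir succeeds.
rmdir newdirectory
```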
Viewing and Manipulating Data
Viewing File Contents
cat
: Displays the contents of a file you specify.
cat is short for ‘concatenate’. Example:
cat hello.txt
—> Hello World!
less
: Shows a file one page at a time. Use spacebar
to view the next page, :n
to move on to the next
file if you specified more than one, or q
to quit
head
: Returns the first lines of a file (10 by default), or if the file is shorter than that, it just
displays the whole thing
tail
: Same idea as head, but returns the last lines of a file
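A quick way to see the difference between head and tail is to run both on a file with numbered lines. This sketch assumes seq is available to generate the numbers 1 to 20; the file name numbers.txt is made up.

```shell
demo=$(mktemp -d)
cd "$demo"
seq 1 20 > numbers.txt   # one number per line, 1 through 20

head numbers.txt   # prints lines 1 through 10
tail numbers.txt   # prints lines 11 through 20
```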
Shortcuts
Tab auto-completion: The shell can autocomplete file and directory names for you. Type enough of a name to identify it, assuming no other names match what you typed, and press tab to complete it.
This can be extended to include a shortened directory and a shortened filename within that directory. Example:
head h <tab>
should autocomplete to home
-n
: Used with head or tail, lets us specify the number of lines to be returned from a file. Example:
head -n 15 filename.txt
would return the first 15 lines within filename.txt
-R
: Used with ls, lists the contents of all subdirectories recursively
-F
: Used with ls, places a ‘/’ after directory names, and a ‘*’ after executable programs
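The flags above are easier to remember once you have seen their output. This is a small sketch in a scratch directory; the names data, notes.txt, and run.sh are invented, and chmod +x (my addition) is only there to give -F an executable to mark.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir data
touch data/inner.txt
printf 'a\nb\nc\nd\n' > notes.txt
touch run.sh
chmod +x run.sh   # mark run.sh as executable so -F has something to flag

head -n 2 notes.txt   # prints a, then b
ls -F                 # prints data/, notes.txt, run.sh*
ls -R                 # lists this directory, then the contents of data
```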
Help
man
: Displays the manual page for a command, such as man ls
which would show a synopsis, a description, and the available flags for ls
history
: Returns a list of commands recently entered into the shell. You can re-run a command from the list with !12
where 12 is the number shown next to that command in the history.
Selecting Columns
cut
: Returns columns of a file, structured as follows:
cut -f 1-3,5-7 -d , filename.csv
This would return columns 1,2,3,5,6,7 with -d
signifying a delimiter, which we defined as a comma ‘,’.
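Here is a small runnable version of the same idea. The CSV contents are invented for the demo. Note that the field list after -f must not contain spaces.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'id,name,city,zip,amount\n1,Ann,Oslo,0150,20\n2,Bo,Rome,00100,35\n' > filename.csv

# Keep fields 1, 2, and 5 of a comma-delimited file.
cut -d , -f 1-2,5 filename.csv
# prints:
# id,name,amount
# 1,Ann,20
# 2,Bo,35
```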
Merging Columns
paste
: The counterpart to cut. Instead of selecting columns, it joins files line by line, placing their contents side by side as the columns of a single output.
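A minimal sketch of paste, assuming two made-up single-column files, names.txt and amounts.txt. By default paste separates the columns with a tab; -d , switches that to a comma.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'Ann\nBo\n' > names.txt
printf '20\n35\n' > amounts.txt

# Line 1 of each file becomes row 1, line 2 becomes row 2, and so on.
paste -d , names.txt amounts.txt
# prints:
# Ann,20
# Bo,35
```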
Regular Expressions
grep
: Allows us to search for, and return, lines in a specified document that contain a matching word/string/defined regex. Example:
grep ProductA user/me/FinancialStatement.csv
would return all lines containing the keyword “ProductA”.
Useful flags for a grep
call include:
-i
: Ignores upper/lower case
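-n
: Prints the line number alongside each matching line
-l
: Returns the names of files that contain matches
-c
: Returns the number of matching lines
-v
: Inverts the match, returning the lines that do not match
wc
: Not a grep flag but a separate ‘word count’ command, handy for counting what grep returns. It has flags of its own:
-l
: Returns a count of lines
-w
: Returns a count of words
-c
: Returns a count of bytes (use -m to count characters)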
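To check your understanding of these flags, here is a sketch run against a tiny made-up FinancialStatement.csv. The file contents are invented; the flags are the standard grep and wc ones.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'ProductA,10\nproducta,5\nProductB,7\nProductA,3\n' > FinancialStatement.csv

grep ProductA FinancialStatement.csv      # the 2 lines containing ProductA
grep -i producta FinancialStatement.csv   # 3 lines, case is ignored
grep -n ProductA FinancialStatement.csv   # prints 1:ProductA,10 and 4:ProductA,3
grep -c ProductA FinancialStatement.csv   # prints 2
grep -v ProductA FinancialStatement.csv   # prints producta,5 and ProductB,7
grep -l ProductA FinancialStatement.csv   # prints FinancialStatement.csv

wc -l FinancialStatement.csv              # reports 4 lines, followed by the file name
```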
Pipe
|
: Allows us to extract data, and process it with another command.
Example:
tail -n 10 /user/me/20favoritebooks.txt | head -n 5
is a convoluted way to first get the bottom 10 entries of my list of 20 favorite books, and then return the top 5 of that bottom 10.
With piping, the output of one command becomes the input of the next. You’re not limited to a single input/output pair either: you can chain as many commands as you need in a single call by adding another |
pipe
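Here is the favorite-books pipeline as something you can run. As a stand-in for real titles, the sketch fills the file with the numbers 1 to 20 using seq (my assumption), which makes it easy to see which lines survive each stage.

```shell
demo=$(mktemp -d)
cd "$demo"
seq 1 20 > 20favoritebooks.txt   # placeholder list: entry N is just the number N

# Bottom 10 entries (11-20), then the top 5 of those.
tail -n 10 20favoritebooks.txt | head -n 5
# prints 11, 12, 13, 14, 15 on separate lines

# Pipes chain: a third command counts the lines that came through.
tail -n 10 20favoritebooks.txt | head -n 5 | wc -l
```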
Wildcards
You can specify multiple files as we have earlier, by writing out each file name. However, we can use a wildcard
such as *
to engage in pattern matching across multiple file names.
Example:
head -5 user/me/yearlyfinances/financial20*
returns the top 5 lines from every file whose name starts with financial20, which here means every financial record from the year 2000 on.
Other Wildcards:
*
: Matches any run of characters, of any length (including none), in place of the *
symbol. A-*
would match A-Z
or A-8
or A-asojhdoashd12345
[]
: Matches any single one of the characters inside the brackets, ba[td]
would match bat
or bad
{}
: Matches any of the comma-separated patterns inside (written without spaces), such as {200*,300*,400*}
, which matches names starting with 200, 300, or 400, for example values from 2000-2009, 3000-3009, 4000-4009.
?
: Matches a single character. Andre?
would return Andrew
, Andrea
, but not Andrews
or Andreas
.
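You can try each wildcard safely in a scratch directory. The file names below are made up to match the examples above, and touch is only used to create empty files. The brace form is a feature of shells like bash and zsh rather than of every shell, so treat that last line as an assumption about your environment.

```shell
demo=$(mktemp -d)
cd "$demo"
touch financial1999.csv financial2000.csv financial2001.csv
touch bat bad bag Andrew Andrea Andrews

ls financial20*   # financial2000.csv and financial2001.csv, not financial1999.csv
ls ba[td]         # bad and bat, not bag
ls Andre?         # Andrea and Andrew, not Andrews

# Brace expansion (bash/zsh): expands to financial1999.csv financial2000.csv
echo financial{1999,2000}.csv
```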
Sorting Lines
sort
: Sorts the lines of a file, in alphabetical order by default.
We can use a few flags to specify the kind of ordering we’d prefer.
-n
: Sorts numerically
-r
: Reverses the sort order
-f
: Ignores upper/lower case
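The difference between the default ordering and -n is easiest to see with numbers of different lengths. A minimal sketch, using a made-up numbers.txt:

```shell
demo=$(mktemp -d)
cd "$demo"
printf '10\n9\n100\n' > numbers.txt

sort numbers.txt         # text order: 10, 100, 9
sort -n numbers.txt      # numeric order: 9, 10, 100
sort -n -r numbers.txt   # numeric, reversed: 100, 10, 9
```

Without -n, sort compares the lines character by character, which is why 100 lands before 9.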
Handling Duplicates
uniq
: Removes duplicate lines, but only if they are directly adjacent.
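Because uniq only looks at neighboring lines, it is usually paired with sort. The sketch below, using a made-up fruit.txt, shows what happens with and without sorting first.

```shell
demo=$(mktemp -d)
cd "$demo"
printf 'apple\nbanana\napple\napple\n' > fruit.txt

uniq fruit.txt          # apple, banana, apple: only the adjacent pair is collapsed
sort fruit.txt | uniq   # apple, banana: sorting first puts all duplicates side by side
```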