Title: UNIX
1. UNIX/Linux Fundamentals for HPC
- A Short-Series Presentation by
- Evan Lee Turner
- David Carver
- January 24, 2008
2. Purpose of This Lecture
- Fundamentals of using UNIX and UNIX-like systems
- History of UNIX and Linux
- Basic system commands
- Data Management
- Constructing basic shell scripts
3. History of UNIX
- "...the number of UNIX installations has grown to
10, with more expected..." - Dennis Ritchie and Ken Thompson, June 1972
- "... When BTL withdrew from the project, they
needed to rewrite an operating system (OS) in
order to play space war on another smaller
machine (a DEC PDP-7 Programmed Data Processor
with 4K memory for user programs). The result was
a system which a punning colleague called UNICS
(UNiplexed Information and Computing Service)--an
'emasculated Multics'; no one recalls whose idea
the change to UNIX was."
- "A few years later, a colleague named Dennis
Ritchie suggested that they rewrite UNIX using the
C language, which Dennis had recently developed
from a language called B." - Graham Glass and King Ables, 2006
4. Early Movers and Shakers
- Dennis Ritchie and Ken Thompson
- PDP-11
5. And then there was C
- "The idea that an operating system could be
written in a high level language was an unusual
approach at that time, since most people felt
that assembly language was the only language fast
enough for such an important component of a
computer system" - Glass and Ables, 2006
6. Bringing UNIX to the Desktop
- There were two different versions of UNIX: SYSV
and BSD
- UNIX was very expensive
- Microsoft DOS was the mainstream OS
- MINIX tried, but was not a full port
- An open source solution was needed!
7. Linux 0.02: October 5, 1991
- "Do you pine for the nice days of minix-1.1, when
men were men and wrote their own device drivers?
Are you without a nice project and just dying to
cut your teeth on a OS you can try to modify for
your needs? Are you finding it frustrating when
everything works on minix? No more all-nighters
to get a nifty program working? Then this post
might be just for you :-)" - Linus Benedict Torvalds
- "I still maintain the point that designing a
monolithic kernel in 1991 is a fundamental error.
Be thankful you are not my student. You would not
get a high grade for such a design :-)" (Andrew
Tanenbaum to Linus Torvalds)
8. 1990s Movers and Shakers
- Richard Stallman, father of the GNU Project
- Linus Torvalds, Linux kernel
9. Linux Packaging
- Linus and his group developed a Linux kernel,
which is the core of the operating system, but
most distributions of Linux included the GNU
software utilities developed by the Free Software
Foundation in their distribution package.
- What most people refer to as Linux is mostly
utilities and applications from the GNU project.
- The GNU utilities are also available on non-Linux
systems. Even Windows!
10. GNU Project
- The GNU Project was launched in 1984 to develop
a complete Unix-like operating system which is
free software: the GNU system.
- The GNU Hurd is the GNU project's replacement for
the Unix kernel. The Hurd is a collection of
servers that run on the Mach microkernel to
implement file systems, network protocols, file
access control, and other features that are
implemented by the Unix kernel or similar kernels
(such as Linux). As of 2002, development in general
had not met expectations, and there were still
bugs and missing features.
- "Developing a whole system is a very large
project. To bring it into reach, I decided to
adapt and use existing pieces of free software
wherever that was possible. For example, I
decided to use the X Window System rather than
writing another window system for GNU." - Richard Stallman
- www.gnu.org and Richard Stallman
11. Why UNIX/Linux is Still Used
- 30 years of development
- Many academic, scientific, and system tools
- Open Source
- System Stability
- Lightweight
- Easy Development
- Linux kernel, GNU software, and the X Window System are
free software
12. The Basics
- Command-Line
- Interaction with UNIX/Linux is based on
entering commands at a text terminal
- Oftentimes there are no warnings with commands,
and no undo
- The Shell
- The user environment that enables interaction
with the kernel, or lower-level OS.
- Windows Explorer would be a shell for Microsoft
Windows.
13. Common Shells
- sh: The original UNIX shell, still
located in /bin/sh
- bash: A UNIX shell written for the GNU
Project and installed on most Linux systems.
- csh: C Shell, modeled after the C
programming language used by UNIX systems
- tcsh: C Shell with modern improvements such
as filename completion
- echo $SHELL: Displays what shell your account
is using.
- chsh: Changes your shell
14. Before We Go Further
- Read the Manual.
- man command
- man section command
- man -k keyword (search all manuals based on
keyword)
- Most commands have a built-in UNIX manual, even
the man command!
- Commands without manuals have help too, with the -h,
--help, or /? option.
- Google is your friend
15. The Manual
- The manual pages are divided into eight sections
depending on type of command.
- 1 commands and applications
- 2 system calls
- 3 C library functions
- 4 special files
- 5 file formats
- 6 games
- 7 misc.
- 8 system administration utilities
16. Why Sections are Important
17. Conventions for this Lecture
- This lecture is too short to give you all of the
options, so look at the manual for specific
syntax for commands.
- Commands will be in bold, options will be in
italics.
- command arguments
- Output will be shown in its own bordered table
18. Command Conventions
- In help files and manuals, commands will have
required input and optional input
- cp [OPTION] source destination
- Optional arguments are in brackets, required
arguments are not.
- cp -R or cp --recursive
- Short options start with -, long options with --
19. Directories
- What is a working directory?
- The directory your shell is currently associated
with. At any time in the system your login is
associated with a directory
- pwd: View the path of your working directory
- ls: View the contents of your working directory
20. Whose Path Is It Anyway?
- UNIX treats the directory structure as a
hierarchy of individual paths
- / (root directory)
- /usr
- /home
- /home/david
- /dev
- /bin
21. Finding Your Home
- Each user has a home directory which can be reached
with
- cd
- cd ~david
- cd $HOME
- The tilde character tells the shell to
expand to the path of a home directory before the cd
command runs
- $HOME refers to an environment variable which
contains the path for home.
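A minimal sketch of the equivalent ways to name the home directory, assuming a POSIX-style shell with HOME set. The variable names start_dir, home_via_cd, and home_via_tilde are made up for this illustration.

```shell
#!/bin/bash
# Remember where we started so we can return afterwards.
start_dir=$(pwd)

# cd with no argument changes to the directory stored in $HOME.
cd
home_via_cd=$(pwd)

# A bare tilde is expanded by the shell to the value of $HOME.
home_via_tilde=~

echo "cd went to:       $home_via_cd"
echo "tilde expands to: $home_via_tilde"
echo "HOME contains:    $HOME"

# Go back to the starting directory.
cd "$start_dir"
```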
22. More File Commands
- cd directory
- Change your current working directory to the
new path
- ls -a: Show hidden files
- Hidden files are files that begin with a period
(.) in the filename
- mv: Moves or renames a file
- cp: Copies files or directories
- rm: Removes files and directories
- rm -rf: Removes everything with no warnings
- rm -rf: Most dangerous command you can run!
23. Recursive Directories
- Oftentimes a manual will refer to recursive
actions on directories. This means to perform an
action on the given directory and recursively on
all subdirectories.
- cp -R source destination
- Copy recursively all directories under source
to destination
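As a hedged illustration of a recursive copy, the sketch below builds a small throwaway tree and copies it with cp -R. The directory and file names are invented for the example, and mktemp -d is assumed to be available.

```shell
#!/bin/bash
# Work in a throwaway directory so nothing important is touched.
workdir=$(mktemp -d)

# Build a small tree: source/ with nested subdirectories.
mkdir -p "$workdir/source/sub/deeper"
echo "top"    > "$workdir/source/top.txt"
echo "nested" > "$workdir/source/sub/deeper/nested.txt"

# Copy source and everything beneath it to destination.
cp -R "$workdir/source" "$workdir/destination"

# The nested file was copied along with its directories.
copied=$(cat "$workdir/destination/sub/deeper/nested.txt")
echo "copied file contains: $copied"

# rm -rf removes the whole tree without asking, so double-check the path.
rm -rf "$workdir"
```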
24. The Bit Bucket
- /dev/null
- Throw items away into /dev/null and they will be
gone forever. Good place to redirect trash
output to.
- Other interesting files on the system
- /dev/random
- Pseudo-random number generator
- /dev/zero
- Endless source of zeros. Very fast, so use with caution;
otherwise you may get a nasty email from the
administrator.
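A small sketch of how these special files behave, assuming a Linux-style system where head -c is available. The variable names are illustrative only.

```shell
#!/bin/bash
# Output sent to /dev/null is discarded; nothing is printed here.
echo "this text is thrown away" > /dev/null

# Read exactly 1024 zero bytes from /dev/zero and count them.
# Always bound the read (here with head -c): /dev/zero never runs out.
zero_count=$(head -c 1024 /dev/zero | wc -c)
echo "bytes read from /dev/zero: $zero_count"

# Reading /dev/null hits end-of-file immediately, so zero bytes arrive.
null_count=$(wc -c < /dev/null)
echo "bytes read from /dev/null: $null_count"
```

Writing an unbounded stream from /dev/zero into a file is what fills a disk, which is why the read is capped in this sketch.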
25. Relative vs Absolute Path
- Commands expect you to give them a path to a
file. Most commands will let you provide a file
with a relative path, that is, a path relative to your
working directory.
- ../directory: .. refers to the parent of the
working directory
- ./executable: . means this directory, i.e. the
working directory
- Absolute, or full, paths are complete. An easy
way to tell whether a path is absolute: does it
begin with the / character?
- /home/user/directory/executable
- (a full path to file executable)
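The following sketch reads the same file through an absolute path, a ./ path, and a ../ path. The project/data layout is hypothetical and created in a temporary directory.

```shell
#!/bin/bash
start_dir=$(pwd)
workdir=$(mktemp -d)
mkdir -p "$workdir/project/data"
echo "hello" > "$workdir/project/data/file.txt"

# Absolute path: begins with / and works from any working directory.
abs_read=$(cat "$workdir/project/data/file.txt")

# Relative paths are resolved against the working directory.
cd "$workdir/project/data"
rel_read=$(cat ./file.txt)            # . is the working directory
cd ..                                 # .. is the parent directory
now_in=$(basename "$(pwd)")           # we are now in project/
via_parent=$(cat ../project/data/file.txt)

echo "absolute: $abs_read, relative: $rel_read, via ..: $via_parent (in $now_in)"

cd "$start_dir"
rm -rf "$workdir"
```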
26. Poking Around in Home
- How much space do I have?
- quota
- Command to see all quotas for your
directories, if any.
- How much space am I taking up?
- du
- Command to find out how much space a folder or
directory uses.
- df
- Display space information for the entire
system
27. Helpful Hints on Space
- Almost all commands that deal with file space
will display information in kilobytes or bytes.
Nobody finds this useful.
- Many commands support a -h option for
human-readable formatting.
- ls -lh
- Displays the working directory files with a
long listing format, using human-readable
notation for space
28. Representing Space
- Bit: either a 1 or 0
- Byte: 8 bits, e.g. 0000 1111, or 0x0F, or 15
- KB, Kilobyte = 1024 Bytes
- MB, Megabyte = 1024 KB
- GB, Gigabyte = 1024 MB
- TB, Terabyte = 1024 GB
- PB, Petabyte = 1024 TB
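The unit ladder can be checked with shell arithmetic. This is a quick sketch using the 1024-based convention from the list; the variable names are made up for the example.

```shell
#!/bin/bash
# Each unit is 1024 times the previous one.
kb=1024
mb=$((kb * 1024))
gb=$((mb * 1024))
tb=$((gb * 1024))

echo "1 KB = $kb bytes"
echo "1 MB = $mb bytes"
echo "1 GB = $gb bytes"
echo "1 TB = $tb bytes"

# The bit pattern 0000 1111 written in hexadecimal, printed in decimal.
byte_value=$((0x0F))
echo "0x0F = $byte_value"
```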
29. Permissions
- The *NIX systems are multi-user environments
where many users run programs and share data.
Files and directories have three levels of
permissions: World, Group, and User.
- The types of permissions a file can contain are
read (r), write (w), and execute (x).
30. Permissions Cont.
- File permissions are arranged in three groups of
three characters.
- In this example the owner can read and write a
file, while others have read access
31. Changing Permissions
- chmod
- Change permissions on a file or directory
- chown
- Change file ownership to another user
- Both commands support -R for recursion.
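A hedged sketch of chmod in both numeric and symbolic form, reading the result back from ls -l. The file name report.txt is hypothetical, and chown is left out because it normally needs superuser rights.

```shell
#!/bin/bash
workdir=$(mktemp -d)
touch "$workdir/report.txt"

# 644 = owner read+write (6), group read (4), world read (4).
chmod 644 "$workdir/report.txt"
# The first ten characters of ls -l are the file type plus three rwx groups.
perms_644=$(ls -l "$workdir/report.txt" | cut -c1-10)
echo "after chmod 644:  $perms_644"

# Symbolic form: remove read permission from group (g) and others (o).
chmod go-r "$workdir/report.txt"
perms_private=$(ls -l "$workdir/report.txt" | cut -c1-10)
echo "after chmod go-r: $perms_private"

rm -rf "$workdir"
```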
32. All About Me
- Every userid corresponds to a unique user or
system process
- whoami: Returns the userid of the current
user
- passwd: Change password
- What is my group? support!
33. Everyone Else
- who: Show all other users logged in
- finger: Show detailed information about a
user
34. What Everyone Else is Up To
- top
- Show a detailed, continuously refreshed description of
running processes on a system.
- uptime
- Show the system load and how long the
system has been up.
- Load is a number based on utilization of the CPUs
of the system.
- A load of 1 indicates full load for one CPU.
35. Working With Programs
- Commands or programs on the system are identified
by their filename and by a process ID which is a
unique identifier.
- ps: Display process information on
the system
- kill pid: Terminates the process with that process ID
- ^C (Control-C): Terminates the running
program
- ^D (Control-D): Terminates your session.
- Only you and the superuser (root) have permission
to kill processes you own.
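A minimal sketch of finding and terminating one of your own processes. It assumes ps accepts the -p option (as POSIX ps does) and uses sleep as a stand-in for a long-running program.

```shell
#!/bin/bash
# Start a long-running command in the background; $! holds its process ID.
sleep 60 &
pid=$!
echo "started sleep with pid $pid"

# Show the process table entry for just that pid.
ps -p "$pid"

# Ask the process to terminate (kill sends SIGTERM by default).
kill "$pid"

# Wait for it to exit; a process ended by a signal reports a nonzero status.
wait "$pid"
status=$?
echo "sleep exited with status $status"
```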
36. Advanced Program Options
- Oftentimes we must run a command in the
background with the ampersand character (&)
- command options &
- Runs command in background, prompt returns
immediately.
- *: Wildcard that matches zero or more characters
- cp * destination
- Copy everything to destination.
- This option can get you into trouble if misused
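The sketch below shows wildcard expansion and a background job together. File names are invented, and everything happens in a temporary directory so a stray * cannot touch real files.

```shell
#!/bin/bash
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/dst"
touch "$workdir/src/a.txt" "$workdir/src/b.txt" "$workdir/src/notes.log"

# The shell expands *.txt to every matching name before cp runs.
cp "$workdir/src"/*.txt "$workdir/dst"
txt_copied=$(ls "$workdir/dst" | wc -l)
echo "files copied with *.txt: $txt_copied"

# A bare * matches every non-hidden name; echo it first to see the expansion.
echo "* would expand to:" "$workdir/src"/*

# Run a command in the background with &, keep working, then wait for it.
sleep 1 &
bg_pid=$!
echo "prompt is free while pid $bg_pid runs"
wait "$bg_pid"
bg_status=$?

rm -rf "$workdir"
```

Echoing a pattern before handing it to cp or rm is a cheap way to avoid the trouble the slide warns about.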
37. Editing Files
- emacs vs vi
- Among the largest nerd battles in history.
emacs relies heavily on key chords (multiple
keystrokes), while vi is mode based (editor mode
vs command mode).
- vi users tend to enter and exit the editor
repeatedly, and use the Unix shell for complex
tasks, whereas emacs users usually remain within
the editor and use emacs itself for complex tasks
- pico (nano)
- Editor originally used for the email client pine;
a simple no-frills editor which resembles Notepad
for Windows.
38. Input and Output
- Programs and commands can take input and
produce output. These are called streams. UNIX
programming is oftentimes stream based.
- Programs also have an error output. We will see
later how to catch the error output.
- STDIN: standard input, or input from the
keyboard
- STDOUT: standard output, or output to the
screen
- STDERR: standard error, error output which is
sent to the screen.
39. File Redirection
- Oftentimes we want to save output (stdout) from a
program to a file. This can be done with the
redirection operator.
- myprogram > myfile
- Using the > operator we redirect the output
from myprogram to file myfile
- Similarly, we can append the output to a file
instead of overwriting it with a double >>
- myprogram >> myfile
- Using the >> operator we append the output
from myprogram to file myfile
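To make the difference between overwriting and appending concrete, here is a small sketch in which echo stands in for myprogram and the file lives in a temporary directory.

```shell
#!/bin/bash
workdir=$(mktemp -d)
outfile="$workdir/myfile"

# > creates the file, or overwrites it if it already exists.
echo "first run"  > "$outfile"
echo "second run" > "$outfile"
lines_after_overwrite=$(wc -l < "$outfile")

# >> appends to the end of the file instead.
echo "third run" >> "$outfile"
lines_after_append=$(wc -l < "$outfile")

# The file now holds "second run" followed by "third run".
cat "$outfile"
first_line=$(head -n 1 "$outfile")

rm -rf "$workdir"
```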
40. Input Redirection
- Input can also be given to a command from a file
instead of typing it at the keyboard, which would
be impractical.
- cat programinput | mycommand
- This command series starts with the command
cat, which prints a file to the screen.
- programinput is printed to stdout, which is
passed to the command mycommand as its input.
- mycommand < programinput
- Does the same thing without cat.
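A short sketch of two ways to feed a file to a command's standard input: through cat and a pipe (|), or directly with the < operator. Here tr stands in for mycommand, and the file contents are invented.

```shell
#!/bin/bash
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/programinput"

# Form 1: cat prints the file and the pipe feeds it to the command.
via_pipe=$(cat "$workdir/programinput" | tr 'a-z' 'A-Z')

# Form 2: < connects the file directly to the command's standard input.
via_redirect=$(tr 'a-z' 'A-Z' < "$workdir/programinput")

echo "$via_pipe"
echo "$via_redirect"

rm -rf "$workdir"
```

Both forms give the command identical input; the second simply avoids starting an extra process.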
41. Redirecting stderr
- Performing a normal redirection will not redirect
stderr. In Bash, this can be accomplished with
2>
- command 2> file1
- Or, one can merge stderr to stdout (most popular)
with 2>&1
- command > file 2>&1
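The sketch below separates and then merges the two output streams. The function noisy is a hypothetical stand-in for any command that writes to both stdout and stderr.

```shell
#!/bin/bash
workdir=$(mktemp -d)

# A stand-in command that writes one line to each stream.
noisy() {
    echo "normal output"
    echo "error output" >&2
}

# > takes stdout and 2> takes stderr, each into its own file.
noisy > "$workdir/stdout.txt" 2> "$workdir/stderr.txt"

# 2>&1, written after the > redirection, merges stderr into the same file.
noisy > "$workdir/both.txt" 2>&1

stdout_only=$(cat "$workdir/stdout.txt")
stderr_only=$(cat "$workdir/stderr.txt")
both_lines=$(wc -l < "$workdir/both.txt")
echo "stdout file: $stdout_only"
echo "stderr file: $stderr_only"
echo "merged file has $both_lines lines"

rm -rf "$workdir"
```

Order matters: placing 2>&1 before the > redirection would point stderr at the terminal instead of the file.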
42. Pipes
- Using a pipe operator (|), commands can be
linked together. The pipe will link the standard
output from one command to the standard input of
another.
- Very helpful for searching files
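As an illustration of chaining commands with the pipe operator (|), the sketch below counts and filters a small invented list. Only standard utilities (printf, sort, uniq, grep, wc, head) are used.

```shell
#!/bin/bash
# Each | feeds the stdout of the command on its left into the stdin
# of the command on its right.
# sort groups equal lines, uniq -c counts them, sort -rn ranks the counts.
top_entry=$(printf 'vi\nemacs\nvi\nnano\nvi\n' | sort | uniq -c | sort -rn | head -n 1)
echo "most common entry:$top_entry"

# grep as a filter in the middle of a pipeline: keep lines containing "sh".
shell_count=$(printf 'bash\ncsh\ntcsh\nvi\n' | grep sh | wc -l)
echo "names containing sh: $shell_count"
```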
43. Searching
- A large majority of activity on UNIX systems
involves searching for files and information.
- find: Utility to find files
- grep: The best utility ever written for
UNIX; searches for patterns inside files and
returns each matching line
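A hedged sketch of find and grep working on a small invented set of job logs. The run1/run2 layout and the log contents are assumptions made for the example.

```shell
#!/bin/bash
workdir=$(mktemp -d)
mkdir -p "$workdir/run1" "$workdir/run2"
printf 'step 1 ok\nstep 2 ERROR: diverged\n' > "$workdir/run1/job.log"
printf 'step 1 ok\nstep 2 ok\n'              > "$workdir/run2/job.log"
touch "$workdir/run2/input.dat"

# find walks the directory tree and prints names matching the pattern.
log_count=$(find "$workdir" -name '*.log' | wc -l)
echo "log files found: $log_count"

# grep prints each line that contains the pattern.
error_line=$(grep ERROR "$workdir/run1/job.log")
echo "matching line: $error_line"

# grep -l lists only the names of files that contain a match.
files_with_errors=$(grep -l ERROR "$workdir"/*/job.log)
echo "files with errors: $files_with_errors"

rm -rf "$workdir"
```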
44. Packing Files
- When creating backups of files, or transferring
to other hosts, files must be packed into larger
files. This is needed for ease of manipulation,
transfer speeds, and file management.
- tar
- Create or extract a packed file.
- tar stands for tape archive.
45. Compressing Files
- Compressing files can save file space at the
expense of CPU time to compress and decompress
files.
- Compression works well for text files, but not as
well for binary files with random data such as
float values.
- Compression commands
- gzip, gunzip, bzip2, bunzip2
46. Using tar to Create Compressed Files
- tar will create compressed files for you.
- tar -czvf mytarfile.tar.gz directory
- Creates a compressed file named
mytarfile.tar.gz containing all of the files
in the directory named directory
- tar -xzvf mytarfile.tar.gz
- Extracts all directories and files inside
the file mytarfile.tar.gz into the working
directory
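A round-trip sketch of creating, listing, and extracting a gzip-compressed tar file. It assumes a tar that supports the z and -C options (GNU and BSD tar both do); the file names besides mytarfile.tar.gz are invented.

```shell
#!/bin/bash
start_dir=$(pwd)
workdir=$(mktemp -d)
mkdir -p "$workdir/directory/results" "$workdir/restore"
echo "sample data" > "$workdir/directory/results/data.txt"

# c = create, z = gzip, f = archive file name. -C changes directory first,
# so the archive stores the relative path directory/...
tar -czf "$workdir/mytarfile.tar.gz" -C "$workdir" directory

# t = list the contents without extracting anything.
tar -tzf "$workdir/mytarfile.tar.gz"

# x = extract into the working directory.
cd "$workdir/restore"
tar -xzf "$workdir/mytarfile.tar.gz"
restored=$(cat directory/results/data.txt)
echo "restored file contains: $restored"

cd "$start_dir"
rm -rf "$workdir"
```

Listing with t before extracting shows where the files will land, which helps avoid overwriting something in the working directory.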
47. Testing Compression
- Using an example from a dataset of visual MRI
binary data that is used for an application
called Freesurfer, three different compression
methods will be tested. The dataset contains a
mix of binary and text data.
- The collected data set, which includes 128
individual 180KB binary files, is 42MB in
uncompressed form.
48. The Dataset is Compressed using bzip, gzip, and compress (Z)
49. Results
- The test shows that bzip compression is the most
efficient at the expense of the most CPU time.
gzip is generally a good all-around compression
algorithm because it gives decent performance
with an average CPU load.
- It is a good idea to test your own dataset.
50. Connecting to Another Machine
- Secure Shell vs Remote Shell
- ssh is an encrypted remote login program that is
safe to use across non-secure networks.
- rsh is a non-encrypted counterpart of ssh which is
only used between sites that are inside secure
networks. rsh will provide faster file transfer
speeds.
- rsh and ssh take similar arguments
- ssh userid@hostname
- rsh userid@hostname
51. Copying Files to Remote Hosts
- Copy local file lfile to rfile on remote machine
rsys
- scp lfile rsys:rfile
- -p preserves modification time, access time and
mode from original
- scp -p lfile rsys:rfile
- Copy rfile from remote machine rsys to local file
lfile
- scp -p rsys:rfile lfile
52. Running Commands on a Remote Host
- Commands can be executed on a remote host with
ssh and rsh
- rsh userid@hostname ls
- Run ls on remote host hostname
53. Advanced Movement
- tar -cf - foldertoarchive | bzip2 | rsh archive.tacc.utexas.edu "cat > $ARCHIVE/myfile.tar.bz2"
- This statement creates a bzip tar file and sends
it to the remote host archive.tacc.utexas.edu
- What is nifty about this?
- Since the command is inline, no backup
copy is created on the local host.
54. Basic Shell Scripts
- Many times it is helpful to create a script of
commands to run instead of typing them in
individually. Scripts can be made to aid in
post-processing, system administration, and to
automate menial tasks
- #!/bin/bash
- First statement inside a script; names the
shell that will run the script
- #
- Says that what follows is a comment and should not be
executed
55. Variables
- By convention system variables are capitalized
- $HOME: Location of the home directory
- $OLDPWD: Location of the previous working
directory
- $PATH: Locations to look inside for
executable files
- Setting system variables differs by shell.
- bash uses export
- csh uses setenv
- User defined variables in scripts are lower-case
by convention
- myvariable=10: Sets myvariable to 10
- echo $myvariable: Prints myvariable
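A brief sketch of setting a variable and of what export changes, assuming sh is available to act as a child process. The names child_before and child_after are illustrative.

```shell
#!/bin/bash
# A user-defined variable: no spaces around =, and no $ when assigning.
myvariable=10
echo "myvariable is $myvariable"

# Without export the variable is visible only to this shell.
child_before=$(sh -c 'echo "${myvariable:-unset}"')

# export places it in the environment inherited by child processes.
export myvariable
child_after=$(sh -c 'echo "${myvariable:-unset}"')

echo "child saw before export: $child_before"
echo "child saw after export:  $child_after"
```

The csh equivalent of the export step would be setenv, which is not shown here.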
56. My Environment
- View all system variables with the command env
- Depending on shell, startup commands can be
managed with the files .profile for bash and
.cshrc for the C shell
57. Conditionals
- if condition
- then
- condition is zero (true - 0): execute all
commands up to the else statement
- else
- condition is not true: execute all
commands up to fi
- fi
58. Multilevel Conditionals
- if condition
- then
- condition is zero (true - 0): execute all
commands up to the elif statement
- elif condition1
- then
- condition1 is zero (true - 0): execute all
commands up to the next elif statement
- elif condition2
- then
- condition2 is zero (true - 0): execute all
commands up to the else statement
- else
- None of the above (condition, condition1,
condition2) are true, i.e. all of them are
nonzero or false: execute all commands up to fi
- fi
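A runnable sketch of the if / elif / else form. classify_load is a hypothetical helper that compares an integer load figure against a CPU count; [ ] is used as the condition because it exits with 0 when the comparison is true.

```shell
#!/bin/bash
# Label a load value relative to the number of CPUs (integers only).
classify_load() {
    load=$1
    cpus=$2
    if [ "$load" -lt "$cpus" ]
    then
        echo "light"
    elif [ "$load" -eq "$cpus" ]
    then
        echo "full"
    else
        echo "overloaded"
    fi
}

light_result=$(classify_load 1 4)
full_result=$(classify_load 4 4)
over_result=$(classify_load 9 4)
echo "$light_result $full_result $over_result"

# Any command can be the condition: exit status 0 counts as true.
if true
then
    true_branch="taken"
fi
```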
59. Performing Loops
- Loops are statements that are repeated until a
condition is met.
- for variable_name in list
- do
- execute once for each item in the list until the
list is finished (repeating all statements
between do and done)
- done
- for i in 1 2 3 4 5
- do
- echo "Welcome $i times"
- done
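The for loop from the slide as a runnable sketch, with a running total added (the variable total is not part of the original example) to show that the body executes once per list item.

```shell
#!/bin/bash
total=0
# The body between do and done runs once for each word in the list.
for i in 1 2 3 4 5
do
    echo "Welcome $i times"
    total=$((total + i))
done
echo "sum of the list: $total"
```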
60. While Loop
- while condition
- do
- command1
- command2
- command3 ...
- done
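A minimal while loop sketch: a countdown whose condition is re-checked before every pass. The variable names count and iterations are chosen for illustration.

```shell
#!/bin/bash
count=3
iterations=0
# [ ... ] exits with 0 (true) while count is greater than zero.
while [ "$count" -gt 0 ]
do
    echo "countdown: $count"
    count=$((count - 1))
    iterations=$((iterations + 1))
done
echo "loop ran $iterations times"
```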
61. Putting it Together
- #!/bin/bash
- # my first script
- # scp replacement
- remotefile=labresults
- myserver=home.utexas.edu
- mylsinfo=`ssh $myserver ls -la $remotefile 2>&1`
- ismissing=`echo $mylsinfo | grep ERROR`
- if [ "$ismissing" ]
- then
- echo "$remotefile not found! Exiting!"
- else
- ssh -n $myserver "cat < $remotefile" > localfile
- fi
62.
- mylsinfo=`ssh $myserver ls -la $remotefile 2>&1`
- Backticks are used to place output from a
command into a variable
- if [ "$ismissing" ]
- Is ismissing set (has a value)?
- If so then the expression is true, otherwise
false
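A local sketch of the two idioms explained here, with printf standing in for the remote ssh call so it runs without a server. The listing text is invented; $( ) is shown alongside backticks because both capture a command's output.

```shell
#!/bin/bash
# Backticks and $( ) both capture a command's stdout in a variable.
listing=`printf 'results.dat\nERROR: no such file\n'`
ismissing=$(echo "$listing" | grep ERROR)

# [ "$var" ] is true when the variable holds a non-empty string.
if [ "$ismissing" ]
then
    verdict="missing"
else
    verdict="found"
fi
echo "verdict: $verdict"

# grep prints nothing when there is no match, so the variable stays empty.
clean=$(echo "results.dat" | grep ERROR)
if [ "$clean" ]
then
    clean_verdict="missing"
else
    clean_verdict="found"
fi
echo "clean verdict: $clean_verdict"
```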
63. What We're Getting at Here
- Tools have been written to interact with the tape
archive system, which will be announced this
week. This is in conjunction with the
announcement for the archive replacement system,
ranch.
- module load sinc
- sinc: Archive data to tape
- unsinc: Unpack data from tape
- rls: Remote ls
64. Remote LS
- #!/bin/bash
- myls=$( (echo $ARCHIVER | grep archive >/dev/null && echo "/etc/dmf/dmbase/bin/dmls") || (echo $ARCHIVER | grep ranch >/dev/null && echo "/opt/SUNWsamfs/bin/sls") )
- if [ "$myls" ]
- then
- rsh $ARCHIVER -n "$myls $*"
- else
- echo "archiver not found"
- exit 1
- fi
65. References
- Graham Glass and King Ables, UNIX for Programmers
and Users, 2nd ed., 1999
- Mark G. Sobell, A Practical Guide to UNIX System
V, 1985
- Amir Afzal, UNIX Unbounded: A Beginning Approach, 3rd ed.,
2000
- http://www.english.uga.edu/hc/unixhistoryrev.html
- https://netfiles.uiuc.edu/rhasan/linux/