 | The concept of process
 | A process is a program in execution. |
 | Each process has a unique identifier, which we refer to as
"process id". |
 | Some process identifiers are reserved for special purposes.
 | 0 for the scheduler, that is, the code for kernel activity.
 | The scheduler decides which processes to run and which should
wait. |
|
 | 1 for init, which is created by the kernel after booting.
 | init brings the UNIX system to working condition, and may refer
to rc scripts in the /etc directory. |
 | The init program file is in either /etc or /sbin. |
 | Now let's take a look at what Linux will do. |
|
 | 2 for page daemon.
 | Sometimes called pager. |
 | A kernel process that supports virtual memory. |
|
|
|
 | Process information
 | getpid
 | Get the process id. |
|
 | getppid
 | Get the parent process id. |
|
 | getuid
 | Get the user id of a process. |
|
 | geteuid
 | Get the effective user id of a process. |
|
 | getgid
 | Get the group id of a process. |
|
 | getegid
 | Get the effective group id of a process. |
|
 | Now let's write a program to print out this information. |
|
 | Process creation
 | fork function
 | The fork function is the only way to create a process in a UNIX
environment. |
 | The fork function is called once, but returns twice. |
 | A child process is created by calling fork. The two processes
are identical except for the return value.
 | In the parent process, fork returns the child's process id. |
 | In the child process, fork returns 0. |
|
 | The two processes continue execution after calling fork. |
 | The two processes DO NOT share data -- they are two copies of the
same program. |
 | Now let's try the textbook example. fork1.c
 | The variables are different copies. |
 | The process ids are different. |
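 | When we run it interactively, "before fork" appears only once,
because standard output to a terminal is line buffered and the
newline flushes it before the fork. |
 | When the output is redirected to a disk file, "before fork"
appears twice. The reason is that standard output to a file is
fully buffered, so the unflushed contents of the standard I/O
buffer are copied from the parent to the child, and each process
later flushes its own copy. |
 | The output from write appears only once, since the write system
call is not buffered by the standard I/O library. |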
|
|
 | Information sharing
 | Some information is shared between the parent and the child.
 | Open file descriptors and file offsets. In that case two processes
can share an open file. |
 | Various user and group ids. |
 | Working directory. |
 | Environment. |
 | Resource limits. |
 | Refer to table on page 192 for a complete list of entries that
the child inherits from the parent. |
|
 | Some data are different between parent and child.
 | Process id. |
 | Return value from fork. |
 | Refer to table on page 192 for a complete list of entries that
the child does not inherit from the parent. |
|
|
 | Purpose of process creation
 | To create a duplicate copy.
 | This is done by placing different sections of code after
checking the return value of the fork. |
|
 | To run a different program.
 | This is usually done by calling one of the "exec" functions
after the fork. |
 | The fork and the exec can be combined (sometimes called spawn)
to improve efficiency. |
|
|
 | vfork function
 | To run a program using the child. Notice that the child runs in
the address space of the parent, so no memory copying is
necessary. |
 | The child runs first, and the parent is suspended until the
child calls exec or _exit. |
 | The textbook example
vfork1.c
 | Notice the values the parent prints out. |
 | What will happen if we replace the _exit with exit? |
|
|
|
 | Process termination
 | exit function
 | To terminate a process with an exit code. |
 | Notice that exit is a library function, whereas _exit is a system
call. |
|
 | Normal termination
 | The main program returns. |
 | The program calls exit. |
 | The program calls _exit. |
|
 | Abnormal termination
 | The program calls abort. |
 | The process receives a signal that terminates it. |
|
 | No matter how a process terminates, the same code in the kernel is
executed. It does the following.
 | Close all open file descriptors. |
 | Release memory. |
 | Release the process table entry, once the parent has fetched the
termination status. |
|
 | The exit code
 | The exit code lets the child notify the parent about its
execution status. |
 | The process reports an exit status, to which the kernel might add
extra information. The result is called the termination status. |
 | The parent can obtain the termination status from the wait
family of functions. |
|
 | Anomaly
 | When the parent terminates before the child, the init process
becomes the parent of this orphan process. |
 | When the child terminates before the parent and the parent has
not yet waited for it, the child becomes a zombie.
 | A zombie is a dead entity, but not completely dead. :-) |
 | The child must leave sufficient information in the process
table so that later when its parent wants to fetch its status,
it is able to do so. |
 | The information a zombie keeps in the process table includes
process id, termination status, and accounting information. |
 | One can use ps to find out the status of all processes,
including zombies. |
|
|
|
 | Process synchronization
 | wait families
 | wait
 | A process can call wait to wait for a child process to
complete. |
 | The caller passes wait a pointer to an integer, which receives
the termination status. |
 | The wait function blocks if no child has terminated yet. |
 | The return value is the process id of the child process. |
|
 | waitpid
 | A process can call waitpid to wait for a particular child
process to complete. |
 | With the WNOHANG option, waitpid can be non-blocking. |
|
 | Termination status
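 | There is a set of macros to retrieve information from the
termination status.
 | WIFEXITED
 | true if the child terminated normally. |
 | WEXITSTATUS tells us the actual exit code. |
|
 | WIFSIGNALED
 | true if the child process was terminated by a signal that it
did not catch. |
 | WTERMSIG gives the number of the signal that caused the
termination. |
 | WCOREDUMP tells us whether a core file was generated (not
available on every system). |
|
 | WIFSTOPPED
 | true if the process is stopped. |
 | WSTOPSIG gives the number of the signal that stopped the
child. |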
|
|
|
 | Now we try the textbook example wait1.c
 | Can we generate the coredump file? |
 | Three cases are tested in this example.
 | normal termination with exit. |
 | abnormal termination with abort. |
 | abnormal termination with arithmetic exception. |
|
|
 | Another textbook example fork2.c.
 | The first child exits before the second, so that init will
adopt the second child. |
 | This is useful when the parent does not want to wait for the
child and we do not want the child to become a zombie either. |
|
|
|
 | Race condition
 | Multiple processes running simultaneously can result in very
strange errors. |
 | If the correctness of a program depends on the execution order of
its constituent processes, then we have a race condition. |
 | A race condition is difficult to debug since the error may not
appear when we want it to. |
 | The (proc/fork2.c) example has a race condition.
 | We cannot guarantee that the first child will exit before the
second child runs. |
 | If the second child runs first, its parent is still the first
child, not init. |
|
 | Textbook example tellwait1.c
 | The stdout is specifically changed to be unbuffered. |
 | The outputs from parent and child are mingled together. |
|
 | Process synchronization
 | To avoid race conditions, we need to synchronize the processes. |
 | tellwait1.c gives an example of process synchronization. We will
discuss its implementation when we cover IPC and signals. |
|
|
 | Exec family
 | The exec family runs a specified program, which replaces the image
of the calling process. |
 | In Linux only execve is a system call, and all the others are
library functions built on top of execve. See the figure on page 211
for details. |
 | Here is a list of all the functions: execl, execv, execle, execve,
execlp, and execvp.
|
 | Here is the trick (page 209).
 | program filename
 | nothing means the program is given as a pathname. |
 | p means the program is given as a filename and is searched for
in the directories listed in PATH. |
|
 | argument passing
 | l means the arguments are passed as an argument list. |
 | v means the arguments are passed as a pointer array (vector). |
|
 | environment
 | nothing means the environment is taken from the environ
variable. |
 | e means the environment is taken from the environment pointer
array passed by the caller. |
|
|
 | When the program is given as a filename
 | No slash found
 | Search for it in the directories listed in PATH. |
|
 | Slash found
 | Treated as a pathname. |
|
|
 | Try the textbook example exec1.c
 | The echoall.c echoes all command line arguments. |
 | The main program runs the echoall program. |
|
|
 | Interpreter file
 | An interpreter file is the input to an interpreter. |
 | An interpreter file is a text file, not a binary executable. |
 | An interpreter file starts with "#!", followed by the pathname of
the interpreter, then by the optional arguments. |
 | When the interpreter is a shell (in most cases it is), the interpreter
file is usually called a shell script -- namely a script that will be
run by the shell. |
 | Try the textbook example.
 | We write an interpreter file testinterp, which will execute the
echoarg program.
 | Notice that the kernel inserts the pathname of the interpreter
file after the optional argument from the "#!" line, ahead of the
arguments given to exec, and passes the result to the interpreter
(in this case, echoarg). |
|
 | Now we compile and run exec2.c.
 | The exec2 executes the testinterp, which executes the echoarg. |
 | Notice that the prompt disappeared! |
|
 | Now try to use awk as the interpreter.
 | awk is a very useful script
interpreter. The basic syntax is "awk -f file", where
file is the name of the awk script. |
 | Now we write an awk script to print the first word of every
line. |
 | Now we try the textbook awk script that prints all the
arguments.
 | There are several files here.
 | The interpreter program awk. |
 | The interpreter file awkexample. |
|
 | When awkexample is invoked from a shell, the shell
creates a process, which executes the interpreter file. |
 | The kernel then executes the interpreter (awk) named in the
interpreter file, and the '-f' option passes the pathname of
the interpreter file as an argument, along with the other
command line arguments from the shell. |
 | Notice that awk is given five parameters: the '-f'
from the interpreter file, the pathname of the
interpreter file supplied by the kernel, and the three
arguments from the command line that were passed to the
interpreter file awkexample. |
 | See page 220 for a complete illustration. |
|
|
 | Reasons for using interpreter files.
 | Hide the fact that a command is a script, not a binary
executable. |
 | Efficiency gain. |
 | Write scripts for shells other than sh. |
|
|
|
 | system function
 | A simple way to use system facilities, just like typing into a
shell. |
 | The return value tells whether the command was executed successfully
or not. |
 | system takes a command string and passes it to /bin/sh
for execution. Now examine the following implementation (system.c).
 | First we create a new process by fork. |
 | Then we use execl to execute /bin/sh and ask it to run the command
string for us. The command string is passed to /bin/sh by way of -c
option. This option directs /bin/sh to take the command from the
string immediately after '-c' option. |
 | Finally the process that called system waits for sh to complete. |
 | Note that the child calls _exit instead of exit if execl fails, so
that standard I/O buffers copied from the parent are not flushed a
second time. |
|
 | Now we try the textbook example (systest1.c). |
 | system and setuid programs
 | A setuid program should never use system, since the effective uid
can be carried into the child process. |
 | Consider two files tsys and printuids.
 | tsys is a program that uses system to run the command given to
it. |
 | printuids is a program that prints the real and effective uids. |
|
 | If tsys has the setuid bit on, then the printuids process will
also run with the effective uid of the owner of tsys. |
|
|
 | Process accounting
 | The superuser can turn the accounting on and record the accounting
information into a file. |
 | This information can be retrieved from the file by simple file I/O. |
|
 | Process times
 | The command time reports the time usage. |
 | The system call times retrieves the
time usage of a process and its children. |
 | The function returns the elapsed wall clock time, in clock ticks
measured from an arbitrary point in the past, each time it is
called. |
 | Now try the textbook example (times1.c).
 | The function pr_times computes the difference between two tms
records and reports the values. |
|
|