Processes in an operating system
A running program is called a process. The operating system can summarize everything about the program at a given point in its execution: the contents of its memory, the values of all CPU registers (including the program counter and stack pointer), and any open IO resources such as files. Early OSes loaded a process's code and data into memory eagerly, all at once. Modern ones load them lazily, bringing pieces in only when they are needed. This relies on paging and swapping.
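A process is in one of three basic states (real kernels track a few more, as the xv6 enum further down shows)
- Running: Currently executing on a CPU.
- Blocked: Waiting for an event, such as IO on a file descriptor, and not eligible to run until it happens.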
- Ready: Ready to execute but waiting for its turn.
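The transitions between these three states can be sketched as a tiny state machine. This is only an illustration: the enum names, the event names and the transition() helper are made up for this note and are not taken from xv6 or any real kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three basic states from the list above (illustrative names). */
enum basic_state { ST_READY, ST_RUNNING, ST_BLOCKED };

/* Events that move a process between states (hypothetical names). */
enum sched_event { EV_SCHEDULED, EV_DESCHEDULED, EV_IO_START, EV_IO_DONE };

/* Applies `ev` to `*s`. Returns true and updates *s if the event is legal
   in the current state, otherwise returns false and leaves *s unchanged. */
bool transition(enum basic_state *s, enum sched_event ev)
{
    assert(s != NULL);
    switch (*s) {
    case ST_READY:
        if (ev == EV_SCHEDULED)   { *s = ST_RUNNING; return true; }
        break;
    case ST_RUNNING:
        if (ev == EV_DESCHEDULED) { *s = ST_READY;   return true; }
        if (ev == EV_IO_START)    { *s = ST_BLOCKED; return true; }
        break;
    case ST_BLOCKED:
        /* A blocked process becomes ready, not running, when its IO completes. */
        if (ev == EV_IO_DONE)     { *s = ST_READY;   return true; }
        break;
    }
    return false;
}
```

Note that there is no direct edge from Blocked to Running in this sketch: once the IO finishes, the process still has to wait for its turn in the Ready state.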
Data structures involved
In xv6, the proc struct stores everything the kernel tracks about a process. First up is the set of saved registers, which a context switch saves and restores.
struct context {
    int eip;
    int esp;
    int ebx;
    int ecx;
    int edx;
    int esi;
    int edi;
    int ebp;
};
Next comes the state of the process, which the scheduler consults and updates whenever the process is scheduled or descheduled.
enum proc_state { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };
The actual struct also holds the start and size of the memory region that the process occupies, alongside the kernel stack, the process ID, the parent (if any), the channel it is sleeping on, a killed flag, open files, the current working directory, the saved registers, and the trapframe.
struct proc {
    char *mem;                  // Start of process memory
    uint sz;                    // Size of process memory
    char *kstack;               // Bottom of kernel stack for this process
    enum proc_state state;      // Process state
    int pid;                    // Process ID
    struct proc *parent;        // Parent process
    void *chan;                 // If !zero, sleeping on chan
    int killed;                 // If !zero, has been killed
    struct file *ofile[NOFILE]; // Open files
    struct inode *cwd;          // Current directory
    struct context context;     // Switch here to run process
    struct trapframe *tf;       // Trap frame for the current interrupt
};
Such an entry is sometimes called a Process Control Block (PCB).
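To make the role of the UNUSED and EMBRYO states concrete, here is a rough user-space sketch of how a kernel might keep these entries in a fixed-size table and hand out a free slot when a process is created. It is an assumption-laden simplification, not xv6's actual code: struct mini_proc, ptable, NPROC = 64, alloc_slot() and free_slot() are all names invented for this note, and there is no locking.

```c
#include <assert.h>
#include <stddef.h>

#define NPROC 64  /* table size, an arbitrary choice for this sketch */

enum proc_state { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

/* Trimmed-down stand-in for the struct proc shown above. */
struct mini_proc {
    enum proc_state state;
    int pid;
};

/* Static storage is zero-initialised, so every slot starts out UNUSED. */
static struct mini_proc ptable[NPROC];
static int next_pid = 1;

/* Claims the first UNUSED slot and marks it EMBRYO (being set up).
   Returns NULL if the table is full. */
struct mini_proc *alloc_slot(void)
{
    for (size_t i = 0; i < NPROC; i++) {
        if (ptable[i].state == UNUSED) {
            ptable[i].state = EMBRYO;
            ptable[i].pid = next_pid++;
            return &ptable[i];
        }
    }
    return NULL;
}

/* Returns a slot to the pool so a later alloc_slot() can reuse it. */
void free_slot(struct mini_proc *p)
{
    assert(p != NULL);
    p->state = UNUSED;
    p->pid = 0;
}
```

A fixed table keeps the lookup trivial at the cost of a hard limit on the number of processes. A real kernel would also have to protect the table with a lock.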
Process related syscalls
The fork() system call creates a new process in Linux. It returns twice: once in the parent
process (where it returns the PID of the child) and once in the child process (where it
returns 0). The child starts as a copy of the parent, so the only register that differs right
after the call is the one holding the return value (rax on x86-64): it holds the child's PID
in the parent and 0 in the child. On failure, fork() returns -1 and no child is created.
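A small sketch of the two return values, using only standard POSIX calls. The helper name fork_returns_child_pid() is made up for this note. The child sends its own getpid() through a pipe so the parent can compare it with what fork() returned.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the value fork() gave the parent equals the child's own PID,
   0 if it does not, and -1 on any error. */
int fork_returns_child_pid(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {                       /* fork failed, no child exists */
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                      /* child: fork() returned 0 */
        pid_t self = getpid();
        close(fds[0]);
        ssize_t w = write(fds[1], &self, sizeof self);
        _exit(w == (ssize_t)sizeof self ? 0 : 1);
    }

    /* parent: fork() returned the child's PID, which is always positive */
    assert(pid > 0);
    close(fds[1]);
    pid_t reported = 0;
    ssize_t r = read(fds[0], &reported, sizeof reported);
    close(fds[0]);
    waitpid(pid, NULL, 0);               /* reap the child so it is not left a zombie */
    if (r != (ssize_t)sizeof reported)
        return -1;
    return reported == pid;
}
```

The child calls _exit() rather than returning, so it never continues into the parent's code path.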
The wait() system call blocks the parent until one of its children has finished execution.
This is useful for syncing. If it is called from a process that has no unwaited-for children,
it fails immediately and returns -1 with errno set to ECHILD. There's also waitpid(), which
takes the PID of a child process and waits for that one specifically. With two or more
children, the order in which wait() reports them is unspecified. Both calls take a pointer to
an int where they store the termination status of the child (decoded with macros such as
WIFEXITED and WEXITSTATUS). If the pointer is NULL, the status is discarded.
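As a sketch of reaping several children, the hypothetical helper below (spawn_and_reap() is a made-up name) forks n children that exit with codes 1..n and then calls wait() in a loop until it fails with ECHILD. It assumes the calling process has no other children and does not retry on EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawns n children with exit codes 1..n and reaps all of them.
   Returns the sum of the collected exit codes, or -1 on error. */
int spawn_and_reap(int n)
{
    assert(n >= 0 && n <= 100);          /* exit codes must fit in 8 bits */
    int failed = 0;

    for (int i = 1; i <= n; i++) {
        pid_t pid = fork();
        if (pid < 0) {                   /* stop spawning, but still reap below */
            failed = 1;
            break;
        }
        if (pid == 0)
            _exit(i);                    /* child i exits with code i */
    }

    int sum = 0;
    int status;
    /* wait() hands back children in whatever order they terminate. */
    while (wait(&status) > 0) {
        if (WIFEXITED(status))
            sum += WEXITSTATUS(status);
    }
    /* The loop ends when wait() returns -1. ECHILD means nothing is left to reap. */
    if (failed || errno != ECHILD)
        return -1;
    return sum;
}
```

Summing the codes sidesteps the unspecified order: the result is the same no matter which child is reported first.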
The exec() family of calls (execve() and its wrappers) is useful when you want to replace the
program running in the current process without creating a new process. It replaces the code
and data segments of the current process, resets the registers and reinitializes the stack,
heap and other memory, while the PID stays the same. The first argument names the binary to
run, and the remaining arguments are passed to that binary as its argv.
A successful call to exec() never returns.
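The usual fork-then-exec pattern can be sketched like this. The helper name run_shell() is made up for this note, and the sketch assumes a POSIX shell is installed at /bin/sh. The exit code 127 on exec failure follows the shell convention for "command not found".

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `sh -c cmd` in a child process and returns its exit code,
   or -1 if the child could not be created or did not exit normally. */
int run_shell(const char *cmd)
{
    assert(cmd != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* child: replace this process image with /bin/sh */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                      /* reached only if execl() failed */
    }

    /* parent: wait for the child that is now running the shell */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell("exit 3") is expected to return 3. The line after execl() only runs when the call fails, which is the flip side of a successful exec() never returning.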
