Processes in an operating system
Basics of how processes in an operating system work.
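A running program is called a process. The operating system's record of a process summarizes everything about the program at a given point in its execution: the memory it can address, the contents of all CPU registers (including the program counter and stack pointer), and any open I/O resources such as files. Early operating systems loaded a program's code and data into memory eagerly, before it started running; modern ones load them lazily, bringing pages in only when they are needed. This relies on paging and swapping.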
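A process is in one of three basic states (real kernels track more; the xv6 enum further down has six):
- Running: Currently executing instructions on a CPU.
- Blocked: Waiting for an event, such as I/O on a file descriptor, and not runnable until it completes.
- Ready: Able to run but waiting for its turn on the CPU.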
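The moves between these three states can be sketched as a small transition check. This is only an illustration of the simple three-state model: the names `pstate` and `can_transition` are made up for this example and do not come from any real kernel.

```c
#include <stdbool.h>

/* Hypothetical names for the three basic states described above. */
enum pstate { P_READY, P_RUNNING, P_BLOCKED };

/* Returns true if the simple three-state model allows moving from
 * `from` to `to`. */
bool can_transition(enum pstate from, enum pstate to)
{
    switch (from) {
    case P_READY:
        return to == P_RUNNING;                  /* scheduler picks the process */
    case P_RUNNING:
        return to == P_READY || to == P_BLOCKED; /* descheduled, or starts waiting */
    case P_BLOCKED:
        return to == P_READY;                    /* the awaited event completed */
    }
    return false;
}
```

Note that a blocked process never goes straight back to running: once its event completes it becomes ready and has to wait for the scheduler like everyone else.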
Data structures involved
In xv6, the proc struct stores most of what the kernel knows about a process. First up is the set of saved registers, kept in struct context.
// Registers saved and restored when switching between processes
struct context {
    int eip;
    int esp;
    int ebx;
    int ecx;
    int edx;
    int esi;
    int edi;
    int ebp;
};
Next, we store the scheduling state the process is currently in.
enum proc_state { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };
The actual struct also holds the start and size of the memory region that the process occupies, alongside the kernel stack, the process ID, the parent (if any), the open files, the current working directory, the saved registers, and the trap frame.
struct proc {
    char *mem;                   // Start of process memory
    uint sz;                     // Size of process memory
    char *kstack;                // Bottom of kernel stack for this process
    enum proc_state state;       // Process state
    int pid;                     // Process ID
    struct proc *parent;         // Parent process
    void *chan;                  // If !zero, sleeping on chan
    int killed;                  // If !zero, has been killed
    struct file *ofile[NOFILE];  // Open files
    struct inode *cwd;           // Current directory
    struct context context;      // Switch here to run process
    struct trapframe *tf;        // Trap frame for the current interrupt
};
Such an entry is also sometimes called a Process Control Block (PCB).
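To make the role of the state field concrete, here is a minimal sketch of how a kernel might hand out entries from a fixed table of such blocks. It is loosely inspired by the idea of a process table and is not xv6's actual code: `mini_proc`, `ptable`, `alloc_slot` and the value of `NPROC` are all assumptions made for this example, and a real kernel would also take a lock and set up the kernel stack.

```c
#include <stddef.h>

#define NPROC 64  /* assumed table size for this sketch */

enum proc_state { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

/* Trimmed-down stand-in for struct proc: only the fields this sketch needs. */
struct mini_proc {
    enum proc_state state;
    int pid;
};

/* Static storage is zero-initialised, and UNUSED is 0, so every slot
 * starts out UNUSED. */
static struct mini_proc ptable[NPROC];
static int next_pid = 1;

/* Claim the first UNUSED slot, or return NULL if the table is full. */
struct mini_proc *alloc_slot(void)
{
    for (size_t i = 0; i < NPROC; i++) {
        if (ptable[i].state == UNUSED) {
            ptable[i].state = EMBRYO;    /* being set up, not yet runnable */
            ptable[i].pid = next_pid++;
            return &ptable[i];
        }
    }
    return NULL;
}
```

Once the rest of the entry is filled in, the state would move on to RUNNABLE so the scheduler can pick it up.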
Process related syscalls
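The fork() system call creates a new process in Linux, a near-identical copy of the caller. It returns twice: once in the parent process, where it returns the PID of the child, and once in the child process, where it returns 0. (If it fails, it returns -1 in the parent and no child is created.) The child resumes with the same register contents as the parent; the only register that differs is the one holding the return value, rax on x86-64, which is set to the child's PID in the parent and to 0 in the child.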
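The "returns twice" behaviour is easiest to see by branching on the return value. The following is a minimal sketch for a POSIX system; the helper name `spawn_hello` is made up for this example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Fork a child that prints one line and exits.
 * Returns the child's PID in the parent, or -1 if fork() failed. */
pid_t spawn_hello(void)
{
    fflush(stdout);             /* avoid duplicating buffered output in the child */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: fork() returned 0. */
        printf("child: fork() returned 0, my pid is %d\n", (int)getpid());
        fflush(stdout);
        _exit(0);               /* leave without running the parent's code */
    }
    if (pid > 0) {
        /* Parent: fork() returned the child's PID. */
        printf("parent: fork() returned %d\n", (int)pid);
    }
    return pid;
}
```

The two lines can appear in either order, since the scheduler decides whether parent or child runs first.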
The wait() system call makes the parent block until one of its child processes has finished execution, which is useful for synchronization. If it is called from a process that has no unwaited-for children, it fails immediately, returning -1 with errno set to ECHILD. There is also waitpid(), which takes the PID of a child process and waits for that one specifically. With two or more children, the order in which wait() reports them is unspecified. Both calls take a pointer to an int in which they store the exit status of the child process; if the pointer is NULL, the status is discarded.
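A small sketch of waiting for a specific child and reading its status, assuming a POSIX system. The helper names `run_and_reap` and `no_children_left` are invented for this example; WIFEXITED and WEXITSTATUS are the standard macros for decoding the status int.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with `code`, wait for it, and return the exit
 * code the parent observed (or -1 on any failure). */
int run_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                      /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) != pid)  /* block until this child ends */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Returns 1 if wait() fails with ECHILD, i.e. the caller has no
 * unwaited-for children. Passing NULL discards the status.
 * Note: if a child is still running, this call blocks until it ends. */
int no_children_left(void)
{
    errno = 0;
    return wait(NULL) == -1 && errno == ECHILD;
}
```

Calling run_and_reap(7) should hand back 7 in the parent, and once every child has been reaped, no_children_left() reports that there is nothing left to wait for.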
The exec() family of system calls is useful when you want to replace the program running in the current process with another one without creating a new process. It replaces the code and data segments of the current process, resets the registers, and reinitializes the stack and heap. The first argument names the binary to run; the remaining arguments supply the argument list passed to that binary (and, in some variants, its environment). A successful call to exec() never returns; it returns (with -1) only if it fails.
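Here is a minimal sketch of exec in use, assuming a POSIX system. exec itself creates no process; the fork() here is only so that the calling program survives to report what happened. The helper name `run_program` is made up, and using 127 as the "exec failed" code is a convention borrowed from shells, not something exec mandates.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] (looked up on PATH) in a child process.
 * Returns the program's exit code, 127 if exec failed in the child,
 * or -1 if fork or waitpid failed. */
int run_program(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);  /* on success this never returns */
        _exit(127);             /* reached only if exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing the NULL-terminated array { "sh", "-c", "exit 3", NULL } should give back 3, while a path that does not exist falls through to the _exit(127) line because execvp() returned.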