Linux Process & System Management - DevOps Process Control & Automation

Published: December 2023 | Topic: System Administration & Automation for DevOps

Mastering process and system management is crucial for DevOps engineers. You need to understand how Linux manages processes, how to monitor system resources, schedule automated tasks, and control system startup. These skills are essential for maintaining stable, performant production systems and automating operational tasks.

The Linux Process Model

In Linux, every running program, from system daemons to your login shell, is a process. Understanding processes is fundamental to system administration:

Process: A running instance of a program
PID: Process ID - unique number identifying each process
PPID: Parent Process ID - the process that created this process
UID/GID: User/Group ID of the process owner
Process States: Running, Sleeping, Stopped, Zombie
Process Hierarchy: Tree structure starting from init/systemd (PID 1)

Process Hierarchy Visualization

systemd/init (PID 1): Parent of all processes
├── sshd (PID 456): SSH daemon
├── nginx (PID 789): Web server
└── bash (PID 1234): User shell
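
The hierarchy can be traced from any process back toward PID 1. The following is a minimal sketch, assuming a procps-style `ps` that accepts `-o` and `-p`; inside a container the chain may end before it reaches systemd, and the variable names are illustrative only.

```shell
# Walk from the current shell up through its ancestors.
pid=$$
chain=""
while [ "${pid:-0}" -gt 0 ]; do
  name=$(ps -o comm= -p "$pid")
  chain="${chain}${name}(${pid}) "
  [ "$pid" -eq 1 ] && break
  # The parent PID comes back padded with spaces, so strip them.
  pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
done
echo "ancestry: $chain"
```

Each step asks `ps` for the parent of the previous PID, which is the same relationship `pstree` draws as a tree.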

1. Understanding Processes & Jobs

What are Processes and Jobs?

A process is a running program instance managed by the kernel. A job is a process that's managed by your shell (with job control). Every process has:

Process Attributes

PID: Unique Process ID
PPID: Parent Process ID
UID/GID: User/Group ownership
Priority: Nice value (-20 to 19)
State: Running, Sleeping, etc.
Memory Usage: RSS, VSZ
CPU Usage: %CPU time

Process States

R Running or Runnable
S Sleeping (interruptible)
D Uninterruptible Sleep
T Stopped by signal
Z Zombie (terminated, but not yet reaped by its parent)
X Dead (should never be seen)
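
These attributes and states can be read directly with `ps`. A small sketch, assuming the procps version of `ps` found on most distributions; the variable names are hypothetical.

```shell
# Attributes of the current shell; "=" after a column name hides its header.
read -r cur_pid cur_ppid cur_nice cur_state cur_rss <<EOF
$(ps -o pid= -o ppid= -o ni= -o stat= -o rss= -p "$$")
EOF
echo "pid=$cur_pid ppid=$cur_ppid nice=$cur_nice state=$cur_state rss_kb=$cur_rss"

# Count processes per state using the first letter of the STAT column.
state_counts=$(ps -eo stat= | cut -c1 | sort | uniq -c)
echo "$state_counts"
```

On a healthy system most processes show up as S (sleeping); a growing count of Z or D entries is usually worth investigating.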

Process Creation: fork() and exec()

Linux creates processes through two system calls:

fork() - Create Copy

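Creates a duplicate of the current process (the child). Parent and child initially run the same code; fork() returns the child's PID to the parent and 0 to the child, which is how each side knows its role.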

                        # Parent process continues

                        Parent PID: 1000

                        Child PID: 1001

                        # Both processes run from same point

exec() - Replace Image

Replaces current process with a new program. Process continues with new program code.

                        # Child process loads new program

                        bash → exec() → ls

                        # Same PID (1001), new program
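
The fork-then-exec pattern can be observed from the shell itself. This is a rough sketch rather than a description of any particular shell's internals: it assumes a POSIX-style `sh` is on the PATH, and uses a helper's `$PPID` to reveal the PID of the forked child.

```shell
parent_pid=$$
report=$(
  # This command substitution runs in a forked child of the shell.
  # A helper process's $PPID reveals that child's PID in a portable way.
  sh -c 'echo "child_pid=$PPID"'
  # exec replaces the child with a new program; the PID stays the same.
  exec sh -c 'echo "after_exec_pid=$$"'
)
echo "parent_pid=$parent_pid"
echo "$report"
```

The two child lines report the same PID, which differs from the parent's: the fork created a new process, and exec swapped its program without creating another one.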

Viewing Process Information

                # Basic process listing

                $ ps

                PID TTY TIME CMD

                1234 pts/0 00:00:00 bash

                5678 pts/0 00:00:01 python

                # Detailed process info

                $ ps -ef

                # Full format listing

                # Show process tree

                $ pstree

                # or

                $ ps -ejH

                # Find process by name

                $ pgrep nginx

                $ pidof nginx

2. Foreground vs Background Processes

What are Foreground and Background Processes?

Shells can run processes in two modes:

Foreground Processes

Runs in current terminal
Blocks shell input until complete
Receives keyboard signals (Ctrl+C, Ctrl+Z)
Standard input/output connected to terminal
Use when: Interactive programs, need user input

                        $ vim file.txt

                        # Shell waits until vim exits

                        # Ctrl+C to interrupt

Background Processes

Runs without blocking the shell (it is still attached to the terminal session)
Shell immediately returns prompt
Doesn't receive keyboard input
Output may still appear in terminal
Use when: Long-running tasks, don't need interaction

                        $ tar -czf backup.tar.gz /data &

                        # [1] 12345 - Job number and PID

                        # Shell returns immediately
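
In scripts, background processes are usually paired with `$!` and `wait` so the script can collect their results. A minimal sketch, with `sleep` standing in for real work and hypothetical variable names:

```shell
# Launch two tasks in the background; $! holds the PID of the most recent one.
( sleep 1; exit 0 ) &
first_pid=$!
( sleep 1; exit 3 ) &
second_pid=$!

# wait blocks until the given PID exits and returns that process's exit status.
first_status=0
wait "$first_pid" || first_status=$?
second_status=0
wait "$second_pid" || second_status=$?
echo "first exited with $first_status, second with $second_status"
```

Both tasks run concurrently, so the whole script takes about one second rather than two.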

Job Control Commands

Starting Jobs

                        $ long_task.sh &

                        # Start in background

                        # Press Ctrl+Z while a job runs in the foreground (a keystroke, not a command)

                        # Suspend foreground job

                        # [1]+ Stopped long_task.sh

Managing Jobs

                        $ jobs

                        # List shell jobs

                        [1] Running tar -czf backup.tar.gz /data &

                        [2]- Stopped vim file.txt

                        [3]+ Stopped top

                        $ fg %1

                        # Bring job 1 to foreground

                        $ bg %2

                        # Resume job 2 in background

Disowning Jobs

                        # Keep job running after logout

                        $ long_task.sh &

                        $ disown %1

                        # or start with nohup

                        $ nohup long_task.sh &

                        # Output goes to nohup.out

Practical Job Control Workflow

                    # Start a backup in background

                    $ tar -czf /backup/data.tar.gz /data &

                    [1] 12345

                    # Check job status

                    $ jobs -l

                    [1]+ 12345 Running tar -czf /backup/data.tar.gz /data &

                    # Start another task, then suspend it

                    $ find / -name "*.log" 2>/dev/null

                    # Too slow, press Ctrl+Z

                    [2]+ Stopped find / -name "*.log" 2>/dev/null

                    # Resume find in background

                    $ bg %2

                    [2]+ find / -name "*.log" 2>/dev/null &

                    # Bring backup to foreground to monitor

                    $ fg %1

                    # tar now runs in the foreground (add -v to the tar command to see files as they are archived)

                    # Press Ctrl+Z to suspend again

                    # Kill the find job

                    $ kill %2
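
Outside an interactive shell there are no job numbers, but the same suspend and resume behaviour is available by sending signals to a PID. A sketch under the assumption that procps `ps` is available; `sleep 30` stands in for a long-running job.

```shell
sleep 30 &
pid=$!

kill -STOP "$pid"            # like Ctrl+Z (SIGTSTP), but SIGSTOP cannot be caught or ignored
sleep 1
stopped_state=$(ps -o stat= -p "$pid" | cut -c1)

kill -CONT "$pid"            # like bg: the process resumes where it left off
sleep 1
resumed_state=$(ps -o stat= -p "$pid" | cut -c1)

kill "$pid"                  # default signal is SIGTERM, a polite request to exit
wait "$pid" 2>/dev/null || true
echo "while stopped: $stopped_state, after resume: $resumed_state"
```

While stopped, the process shows state T in `ps`; after SIGCONT it returns to its normal sleeping or running state.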

3. Process Monitoring: ps, top, htop

What is Process Monitoring?

Monitoring tools help you understand system resource usage, identify performance bottlenecks, and troubleshoot process issues. They are essential for maintaining healthy systems.

Essential Monitoring Commands

ps - Process Status

The classic process viewer. Many output formats available.

                        $ ps aux

                        # BSD style: user-oriented

                        USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

                        $ ps -ef

                        # Unix style: full listing

                        $ ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu | head

                        # Custom format, sorted by CPU

top - Interactive Process Viewer

Real-time process monitoring with interactive controls.

                        # Launch top

                        $ top

                        # Top keyboard shortcuts:

                        h - Help

                        P - Sort by CPU (default)

                        M - Sort by memory

                        N - Sort by PID

                        k - Kill process (enter PID)

                        r - Renice process

                        1 - Toggle CPU cores

                        q - Quit

htop - Enhanced top

Colorful, user-friendly process viewer with more features.

                        # Install if not available

                        $ sudo apt install htop  # Debian/Ubuntu

                        $ sudo yum install htop  # RHEL/CentOS (requires the EPEL repository; use dnf on newer releases)

                        $ htop

                        # Features:

                        • Color-coded display

                        • Mouse support

                        • Tree view (F5)

                        • Filter processes (F4)

                        • Kill with F9

                        • Customize with F2

Advanced Monitoring Tools

System Resource Monitoring

                        # Memory usage

                        $ free -h

                        # or

                        $ cat /proc/meminfo

                        # CPU information

                        $ lscpu

                        $ cat /proc/cpuinfo

                        # Disk I/O statistics

                        $ iostat -x 1

                        $ iotop

                        # Network connections

                        $ ss -tulpn

                        $ netstat -tulpn  # legacy equivalent from the net-tools package

Process Priority: nice & renice

Control process CPU scheduling priority (-20 highest, 19 lowest).

                        # Start process with low priority

                        $ nice -n 19 compression.sh

                        # Nice value 19 (lowest priority)

                        # Start process with high priority (negative nice values require root)

                        $ sudo nice -n -20 important_task.sh

                        # Nice value -20 (highest priority)

                        # Change priority of running process

                        $ renice -n 10 -p 1234

                        # Change PID 1234 to nice 10

                        # Change priority of all processes for user

                        $ renice -n 5 -u alice
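
The effect of `nice` is easy to verify, because a child inherits the adjusted value. A small sketch, assuming GNU coreutils `nice`, which prints the current niceness when run without a command:

```shell
# `nice` with no arguments prints the niceness it is running at.
base=$(nice)
# Running `nice` under `nice -n 10` shows the adjusted value the child inherits.
lowered=$(nice -n 10 nice)
echo "current niceness: $base, child started with nice -n 10: $lowered"
```

The adjustment is relative to the caller's niceness and is capped at 19, so a script that is already niced passes an even lower priority to its children.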

Monitoring Best Practices for DevOps

Establish baselines: Know normal resource usage patterns
Monitor key metrics: CPU, memory, disk I/O, network
Set up alerts: For high resource usage or process failures
Use process trees: Understand parent-child relationships
Check for zombies: Processes stuck in zombie state
Monitor system load: Load average over 1, 5, 15 minutes
Track process lifetimes: Sudden deaths or long runtimes
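
As one example of the load-average check in the list above, the sketch below compares the 1-minute load with the number of online CPUs. Treating "load greater than CPU count" as overloaded is a common rule of thumb rather than a hard limit, and the variable names are hypothetical.

```shell
# /proc/loadavg starts with the 1-, 5-, and 15-minute load averages.
read -r load1 load5 load15 rest < /proc/loadavg
cpus=$(getconf _NPROCESSORS_ONLN)

# awk handles the floating-point comparison that plain sh cannot do.
if awk -v l="$load1" -v c="$cpus" 'BEGIN { exit !(l > c) }'; then
  load_status="overloaded"
else
  load_status="ok"
fi
echo "load: $load1 $load5 $load15 on $cpus CPUs -> $load_status"
```

A script like this could run from cron and raise an alert when the status changes, which ties the baseline and alerting items together.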

4. Scheduling Tasks with cron and at

What is Task Scheduling?

Linux provides tools to run tasks automatically at specific times. cron schedules recurring tasks, while at schedules one-time tasks.

cron - Recurring Task Scheduler

Understanding cron Syntax

* * * * * Command to Execute
│ │ │ │ │
│ │ │ │ └── Day of Week (0-7; 0 and 7 both mean Sunday)
│ │ │ └──── Month (1-12)
│ │ └────── Day of Month (1-31)
│ └──────── Hour (0-23)
└────────── Minute (0-59)

cron Special Characters

                        # * = any value

                        0 * * * * command

                        # Every hour at minute 0

                        # , = value list separator

                        0 0,12 * * * command

                        # Twice daily at midnight and noon

                        # - = range of values

                        0 9-17 * * * command

                        # Every hour 9am-5pm

                        # / = step values

                        */15 * * * * command

                        # Every 15 minutes

cron Shortcuts

                        # Common predefined schedules

                        @hourly command

                        # 0 * * * *

                        @daily command

                        # 0 0 * * * (midnight)

                        @weekly command

                        # 0 0 * * 0 (Sunday midnight)

                        @monthly command

                        # 0 0 1 * * (1st of month)

                        @yearly command

                        # 0 0 1 1 * (Jan 1st)

                        @reboot command

                        # Run at system startup

Managing cron Jobs

User crontabs

Each user has their own crontab file.

                        # Edit your crontab

                        $ crontab -e

                        # List your cron jobs

                        $ crontab -l

                        # Remove all your cron jobs

                        $ crontab -r

                        # Load crontab from file

                        $ crontab mycron.txt

System crontabs

System-wide cron jobs in /etc/cron.* directories.

                        # System crontab file

                        /etc/crontab

                        # Directory for hourly scripts

                        /etc/cron.hourly/

                        # Directory for daily scripts

                        /etc/cron.daily/

                        # Directory for weekly scripts

                        /etc/cron.weekly/

                        # Directory for monthly scripts

                        /etc/cron.monthly/

                        # Additional cron files

                        /etc/cron.d/

at - One-time Task Scheduler

                # Schedule job to run at specific time

                $ at 11:30 PM tomorrow

                at> /path/to/script.sh

                at> <EOT>

                # Press Ctrl+D to finish; at prints <EOT> and confirms the job

                job 1 at Sat Dec  2 23:30:00 2023

                # Schedule in 2 hours

                $ at now + 2 hours

                at> echo "Task complete" | mail -s "Reminder" user@example.com

                # List scheduled jobs

                $ atq

                # or

                $ at -l

                # Remove scheduled job

                $ atrm 1

                # Remove job number 1

Practical cron Examples for DevOps

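# Backup database daily at 2 AM (credentials are read from a root-only option file such as /root/.my.cnf, not passed on the command line)
0 2 * * * /usr/bin/mysqldump --defaults-extra-file=/root/.my.cnf database > /backup/db-$(date +\%Y\%m\%d).sql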

# Monitor disk space every hour
0 * * * * /usr/local/bin/check_disk_space.sh

# Rotate logs weekly on Sunday at 3 AM
0 3 * * 0 /usr/sbin/logrotate /etc/logrotate.conf

# Sync data between servers every 15 minutes during business hours
*/15 9-17 * * 1-5 /usr/bin/rsync -avz /data/ user@backup:/backup/

# Run maintenance script on first day of month at midnight
0 0 1 * * /usr/local/bin/monthly_maintenance.sh
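
The hourly entry above calls a check_disk_space.sh script that is not shown. One possible shape for such a script is sketched here; the 90% threshold, the function name, and the choice of checking only `/` are assumptions for illustration.

```shell
# usage_pct MOUNTPOINT: print the used-space percentage as a bare integer.
usage_pct() {
  # -P forces one line per filesystem; column 5 is the capacity, e.g. "45%".
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

threshold=90   # assumed alert level
pct=$(usage_pct /)
if [ "$pct" -ge "$threshold" ]; then
  echo "WARNING: / is ${pct}% full" >&2
else
  echo "OK: / is ${pct}% full"
fi
```

Because cron mails whatever a job prints, writing the warning to stderr and staying quiet otherwise would turn this into a simple alert.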

⚠️ cron Security & Best Practices

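Use absolute paths: cron runs jobs with a minimal PATH
Handle output: cron mails anything a job prints to the job's owner; redirect it to a log file, or to /dev/null (> /dev/null 2>&1) only if the output is truly disposable
Test commands manually: Before adding to cron
Check cron logs: /var/log/cron (RHEL/CentOS) or /var/log/syslog (Debian/Ubuntu)
Use locking: Prevent overlapping executions with flock
Restrict access: Control who may use crontab with /etc/cron.allow and /etc/cron.deny
Monitor cron jobs: A job that fails silently can leave backups or cleanups undone for days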
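
The locking advice can be sketched with `flock` from util-linux. The wrapper name `run_once`, the lock path, and the exit code 75 are hypothetical choices for this example.

```shell
# run_once LOCKFILE COMMAND...: run COMMAND only if nobody else holds LOCKFILE.
run_once() {
  lockfile=$1
  shift
  (
    # -n makes flock fail immediately instead of waiting for the lock.
    flock -n 9 || { echo "skipped: previous run still active" >&2; exit 75; }
    "$@"
  ) 9>"$lockfile"
}

run_once "${TMPDIR:-/tmp}/nightly-backup.lock" echo "backup would run here"
```

For a single command in a crontab line, the shorter form `flock -n /path/to/lockfile command` achieves the same thing without a wrapper function.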

5. System Startup, Boot Targets, and Runlevels

What is the Linux Boot Process?

The Linux boot process involves multiple stages that initialize hardware, load the kernel, and start system services. Understanding this process is crucial for troubleshooting boot issues.

Linux Boot Process Stages

BIOS/UEFI: Hardware initialization, POST and device detection
→
Bootloader (GRUB2): Loads kernel and initramfs
→
Kernel (Linux kernel): Hardware initialization, mounts root filesystem
→
Init System (systemd or SysV init): Starts system services, reaches target/runlevel

systemd vs SysV Init

systemd (Modern)

Parallel startup (faster boot)
Socket activation
Dependency-based startup
Uses targets instead of runlevels
Logging via journald
Default on modern distributions

                        $ systemctl list-units

                        $ systemctl status nginx

                        $ systemctl enable nginx

SysV Init (Traditional)

Sequential startup
Uses runlevels (0-6)
Scripts in /etc/init.d/
Symbolic links in /etc/rc*.d/
Still found on older systems
Simpler but slower

                        $ service nginx status

                        $ /etc/init.d/nginx restart

                        $ chkconfig nginx on

systemd Targets (Modern Runlevels)

Target	Purpose	Traditional Runlevel	Description
`poweroff.target`	Shutdown	0	System shutdown
`rescue.target`	Single User	1	Single-user mode for recovery (emergency.target is even more minimal)
`multi-user.target`	Multi-user	3	Text-based multi-user
`graphical.target`	Graphical	5	Graphical multi-user
`reboot.target`	Reboot	6	System reboot

Managing systemd Services

Service Management

                        # Start/stop services

                        $ sudo systemctl start nginx

                        $ sudo systemctl stop nginx

                        $ sudo systemctl restart nginx

                        $ sudo systemctl reload nginx

                        # Enable/disable at boot

                        $ sudo systemctl enable nginx

                        $ sudo systemctl disable nginx

Service Status & Info

                        # Check service status

                        $ systemctl status nginx

                        $ systemctl is-active nginx

                        $ systemctl is-enabled nginx

                        # View service logs

                        $ journalctl -u nginx

                        $ journalctl -u nginx --since "1 hour ago"

System Control

                        # Change system target

                        $ sudo systemctl isolate multi-user.target

                        # Switch to multi-user mode

                        # Power management

                        $ sudo systemctl reboot

                        $ sudo systemctl poweroff

                        $ sudo systemctl suspend

Boot Process Troubleshooting

Common Boot Issues & Solutions

                    # View boot messages

                    $ dmesg | less

                    $ journalctl -b

                    # Current boot

                    $ journalctl -b -1

                    # Previous boot

                    # Check failed services

                    $ systemctl --failed

                    $ systemctl list-units --state=failed

                    # Emergency/Rescue mode

                    # At the GRUB menu, press 'e' and append to the line starting with 'linux', then boot with Ctrl+X:

                    systemd.unit=rescue.target

                    # or for emergency shell:

                    systemd.unit=emergency.target

                    # Check boot time performance

                    $ systemd-analyze

                    $ systemd-analyze blame

                    $ systemd-analyze critical-chain

Complete DevOps Process Management Workflow

End-to-End Process Management

A typical DevOps workflow for managing application processes:

                # 1. Deploy application with proper service file

                $ sudo cp myapp.service /etc/systemd/system/

                $ sudo systemctl daemon-reload

                $ sudo systemctl enable myapp

                $ sudo systemctl start myapp

                # 2. Monitor application performance

                $ watch -n 1 'ps aux | grep myapp | grep -v grep'

                $ htop -p "$(pgrep -d, myapp)"

                # 3. Set up automated health checks

                # In /etc/cron.d/myapp-monitor:

                */5 * * * * root /usr/local/bin/check_myapp.sh

                # 4. Configure log rotation

                # In /etc/logrotate.d/myapp:

                /var/log/myapp/*.log {

                  daily

                  rotate 30

                  compress

                  delaycompress

                  missingok

                  notifempty

                  create 644 myapp myapp

                  postrotate

                    systemctl reload myapp

                  endscript

                }

                # 5. Set up automated restarts on failure

                # In /etc/systemd/system/myapp.service:

                [Unit]

                StartLimitIntervalSec=60

                StartLimitBurst=3

                [Service]

                Restart=on-failure

                RestartSec=10s
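
Step 1 of the workflow copies a myapp.service file that is not shown. The sketch below writes one plausible version to a temporary directory so it can be inspected before installation. The description, user, and ExecStart path are hypothetical, and the StartLimit settings are placed in [Unit], which is where current systemd versions expect them.

```shell
unit_dir=$(mktemp -d)
cat > "$unit_dir/myapp.service" <<'EOF'
[Unit]
Description=My application (hypothetical example)
After=network.target
StartLimitIntervalSec=60
StartLimitBurst=3

[Service]
# Hypothetical user and binary path; adjust to the real application
User=myapp
ExecStart=/usr/local/bin/myapp
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
EOF
echo "wrote $unit_dir/myapp.service"
# Optional lint on hosts that have systemd:
#   systemd-analyze verify "$unit_dir/myapp.service"
```

With these values systemd restarts the service 10 seconds after a failure, and stops retrying if it has to start it more than 3 times within 60 seconds.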

Essential Commands Cheat Sheet

Process Management

                    $ ps aux               # List all processes

                    $ top                  # Interactive process viewer

                    $ htop                 # Enhanced process viewer

                    $ pstree               # Process tree

                    $ kill -9 PID          # Force kill process (last resort; try plain kill first)

                    $ killall process_name # Kill by name

Job Control

                    $ command &            # Start in background

                    $ jobs                # List jobs

                    $ fg %1              # Bring job 1 to foreground

                    $ bg %2              # Resume job 2 in background

                    Ctrl+Z               # Suspend foreground job (keystroke, not a command)

Task Scheduling

                    $ crontab -e          # Edit cron jobs

                    $ crontab -l          # List cron jobs

                    $ at 10:30            # Schedule one-time job

                    $ atq                 # List scheduled jobs

                    $ atrm 1              # Remove job 1

System Control

                    $ systemctl start service # Start service

                    $ systemctl stop service  # Stop service

                    $ systemctl restart service # Restart service

                    $ systemctl status service # Service status

                    $ systemctl enable service # Enable at boot

Practice Scenarios for DevOps Engineers

A web application is consuming 100% CPU. How would you identify the problematic process and safely restart it?
You need to schedule a database backup every night at 2 AM that takes 3 hours to complete. How would you ensure it doesn't overlap with other jobs?
A critical service keeps crashing. How would you configure it to automatically restart and alert you when it fails?
You suspect a memory leak in an application. What tools would you use to monitor memory usage over time?
How would you gracefully stop a long-running data processing job and resume it later?
A server won't boot after a configuration change. What steps would you take to troubleshoot and recover?
You need to run a resource-intensive report during off-hours without affecting production. How would you manage its priority?
How would you set up monitoring to alert you when disk space reaches 90% or when critical services stop?

Key Takeaways

Everything is a process: Understanding processes is fundamental to Linux administration
Job control enables multitasking: Manage foreground/background processes effectively
Monitoring is proactive: Use ps, top, htop to identify issues before they cause problems
Automate with cron: Schedule repetitive tasks to run automatically
Master systemd: Modern Linux uses systemd for service management
Understand the boot process: Essential for troubleshooting startup issues
Prioritize processes: Use nice/renice to manage CPU allocation
Implement proper service management: Enable, start, stop, and monitor services correctly

Mastering process and system management will make you more effective at maintaining stable, performant Linux systems in DevOps environments. These skills enable you to proactively monitor systems, automate routine tasks, and quickly resolve issues when they arise.

Saturday, December 13, 2025