Linux Process & System Management
Published: December 2023 | Topic: System Administration & Automation for DevOps
Mastering process and system management is crucial for DevOps engineers. You need to understand how Linux manages processes, how to monitor system resources, schedule automated tasks, and control system startup. These skills are essential for maintaining stable, performant production systems and automating operational tasks.
The Linux Process Model
In Linux, every running program is a process, so understanding processes is fundamental to system administration:
- Process: A running instance of a program
- PID: Process ID - unique number identifying each process
- PPID: Parent Process ID - the process that created this process
- UID/GID: User/Group ID of the process owner
- Process States: Running, Sleeping, Stopped, Zombie
- Process Hierarchy: Tree structure starting from init/systemd (PID 1)
Process Hierarchy Visualization
1. Understanding Processes & Jobs
What are Processes and Jobs?
A process is a running program instance managed by the kernel. A job is a process that's managed by your shell (with job control). Every process has:
Process Attributes
- PID: Unique Process ID
- PPID: Parent Process ID
- UID/GID: User/Group ownership
- Priority: Nice value (-20 to 19)
- State: Running, Sleeping, etc.
- Memory Usage: RSS, VSZ
- CPU Usage: %CPU time
Process States
- R: Running or runnable
- S: Sleeping (interruptible)
- D: Uninterruptible sleep (usually waiting on I/O)
- T: Stopped by a job-control signal
- Z: Zombie (terminated but not yet reaped by its parent)
- X: Dead (should never be seen)
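To see these attributes on a live process, one option is to read them from /proc. The sketch below inspects the current shell itself and assumes a Linux system with /proc mounted; the variable names are illustrative.

```shell
#!/bin/sh
# Read the state and parent PID of the current shell from /proc.
# Assumes Linux with /proc mounted; $$ expands to this shell's PID.
status_file="/proc/$$/status"

# Each line of the status file is "Key:<tab>value ..."; field 2 is the value.
proc_state=$(awk '/^State:/ { print $2 }' "$status_file")
proc_ppid=$(awk '/^PPid:/ { print $2 }' "$status_file")

echo "PID:   $$"
echo "PPID:  $proc_ppid"
echo "State: $proc_state"
```

The state is reported as one of the single-letter codes listed above. The same information is available for any other process by replacing $$ with its PID.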
Process Creation: fork() and exec()
Linux creates processes through two system calls:
fork() - Create Copy
Creates a duplicate of the current process (the child). Parent and child initially run the same code; fork() returns 0 in the child and the child's PID in the parent.
Parent PID: 1000
Child PID: 1001
# Both processes run from same point
exec() - Replace Image
Replaces the current process image with a new program. The PID stays the same, but the process now runs the new program's code.
bash → exec() → ls
# Same PID (1001), new program
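The claim that exec() keeps the PID can be checked from the shell. This is a minimal sketch, not a definitive test: it starts a child shell, prints its PID, then uses the shell's exec builtin to replace that shell with a second one, which prints its own PID.

```shell
#!/bin/sh
# fork: the outer shell creates a child process to run `sh -c ...`.
# exec: inside the child, `exec` replaces the shell with a new program
# without creating another process, so the PID should not change.
out=$(sh -c 'echo "$$"; exec sh -c "echo \$\$"')

pid_before=$(printf '%s\n' "$out" | sed -n 1p)
pid_after=$(printf '%s\n' "$out" | sed -n 2p)

echo "PID before exec: $pid_before"
echo "PID after exec:  $pid_after"
```

Both lines should show the same number, which differs from the PID of the script itself because the child was created by fork().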
Viewing Process Information
# Processes in the current terminal session
$ ps
PID TTY TIME CMD
1234 pts/0 00:00:00 bash
5678 pts/0 00:00:01 python
# Detailed process info
$ ps -ef
# Full format listing
# Show process tree
$ pstree
# or
$ ps -ejH
# Find process by name
$ pgrep nginx
$ pidof nginx
2. Foreground vs Background Processes
What are Foreground and Background Processes?
Shells can run processes in two modes:
Foreground Processes
- Runs in current terminal
- Blocks shell input until complete
- Receives keyboard signals (Ctrl+C, Ctrl+Z)
- Standard input/output connected to terminal
- Use when: Interactive programs, need user input
$ vim file.txt
# Shell waits until vim exits
# Ctrl+C sends SIGINT to the foreground process
Background Processes
- Runs independently of terminal
- Shell immediately returns prompt
- Doesn't receive keyboard input
- Output may still appear in terminal
- Use when: Long-running tasks, don't need interaction
$ long_task.sh &
# [1] 12345 - Job number and PID
# Shell returns immediately
Job Control Commands
Starting Jobs
$ long_task.sh &
# Start in background
$ Ctrl+Z
# Suspend foreground job
# [1]+ Stopped long_task.sh
Managing Jobs
$ jobs
# List shell jobs
[1] Running tar -czf backup.tar.gz /data &
[2]- Stopped vim file.txt
[3]+ Stopped top
$ fg %1
# Bring job 1 to foreground
$ bg %2
# Resume job 2 in background
Disowning Jobs
# Remove a job from the shell's job table so the shell does not send it SIGHUP on exit
$ long_task.sh &
$ disown %1
# or start with nohup
$ nohup long_task.sh &
# Output goes to nohup.out
Practical Job Control Workflow
# Start backup in background
$ tar -czf /backup/data.tar.gz /data &
[1] 12345
# Check job status
$ jobs -l
[1]+ 12345 Running tar -czf /backup/data.tar.gz /data &
# Start another task, then suspend it
$ find / -name "*.log" 2>/dev/null
# Too slow, press Ctrl+Z
[2]+ Stopped find / -name "*.log" 2>/dev/null
# Resume find in background
$ bg %2
[2]+ find / -name "*.log" 2>/dev/null &
# Bring backup to foreground to monitor
$ fg %1
# Now monitoring tar output
# Press Ctrl+Z to suspend again
# Kill the find job
$ kill %2
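Interactive job control (jobs, fg, bg) is normally unavailable inside scripts, but the same idea carries over using `&`, `$!` and `wait`. The following is a small sketch in which `sleep 1` stands in for a real long-running task:

```shell
#!/bin/sh
# Start a task in the background; $! holds the PID of the most recent background job.
sleep 1 &
bg_pid=$!
echo "started background task with PID $bg_pid"

# The script keeps running while the task works.
echo "doing other work in the meantime"

# wait blocks until the task exits and returns its exit status.
wait "$bg_pid"
bg_status=$?
echo "background task exited with status $bg_status"
```

Checking the status returned by `wait` is what lets a script notice that a background task failed.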
3. Process Monitoring: ps, top, htop
What is Process Monitoring?
Monitoring tools help you understand system resource usage, identify performance bottlenecks, and troubleshoot process issues. Essential for maintaining healthy systems.
Essential Monitoring Commands
ps - Process Status
The classic process viewer. Many output formats available.
$ ps aux
# BSD style: user-oriented
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
$ ps -ef
# Unix style: full listing
$ ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu | head
# Custom format, sorted by CPU
top - Interactive Process Viewer
Real-time process monitoring with interactive controls.
$ top
# Top keyboard shortcuts:
h - Help
P - Sort by CPU (default)
M - Sort by memory
N - Sort by PID
k - Kill process (enter PID)
r - Renice process
1 - Toggle CPU cores
q - Quit
htop - Enhanced top
Colorful, user-friendly process viewer with more features.
$ sudo apt install htop # Debian/Ubuntu
$ sudo yum install htop # RHEL/CentOS
$ htop
# Features:
• Color-coded display
• Mouse support
• Tree view (F5)
• Filter processes (F4)
• Kill with F9
• Customize with F2
Advanced Monitoring Tools
System Resource Monitoring
# Memory usage
$ free -h
# or
$ cat /proc/meminfo
# CPU information
$ lscpu
$ cat /proc/cpuinfo
# Disk I/O statistics
$ iostat -x 1
$ iotop
# Network connections
$ ss -tulpn
$ netstat -tulpn
Process Priority: nice & renice
Control process CPU scheduling priority (-20 highest, 19 lowest).
# Start process with low priority
$ nice -n 19 compression.sh
# Nice value 19 (lowest priority)
# Start process with high priority (negative nice values require root)
$ sudo nice -n -20 important_task.sh
# Nice value -20 (highest priority)
# Change priority of running process
$ renice -n 10 -p 1234
# Change PID 1234 to nice 10
# Change priority of all processes for user
$ renice -n 5 -u alice
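A quick way to confirm that an adjustment took effect is to let `nice` report the value itself. This sketch assumes GNU coreutils (or BusyBox) `nice`, which prints the current niceness when run without arguments:

```shell
#!/bin/sh
# With no arguments, nice prints the niceness of the current process.
base=$(nice)

# Run nice under an adjustment of +5 so it reports the adjusted value.
adjusted=$(nice -n 5 nice)

echo "current niceness:            $base"
echo "niceness under 'nice -n 5':  $adjusted"
```

The adjustment is relative to the caller's niceness and is capped at 19, so the second value is normally the first plus five.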
Monitoring Best Practices for DevOps
- Establish baselines: Know normal resource usage patterns
- Monitor key metrics: CPU, memory, disk I/O, network
- Set up alerts: For high resource usage or process failures
- Use process trees: Understand parent-child relationships
- Check for zombies: Processes stuck in zombie state
- Monitor system load: Load average over 1, 5, 15 minutes
- Track process lifetimes: Sudden deaths or long runtimes
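Two of these checks, zombies and load average, are easy to script. The sketch below is one possible approach; `count_zombies` is a hypothetical helper name, and the sample state codes are made up for illustration. It assumes Linux for /proc/loadavg.

```shell
#!/bin/sh
# Count zombies given process state codes on stdin, one per line
# (the format produced by `ps -eo stat=`).
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Sample state codes for illustration; two of them are zombies.
sample_count=$(printf '%s\n' S Z R+ Z+ Ss | count_zombies)
echo "zombies in sample: $sample_count"

# Load averages over 1, 5 and 15 minutes are the first three fields of /proc/loadavg.
if [ -r /proc/loadavg ]; then
    read -r load1 load5 load15 _ < /proc/loadavg
    echo "load average: $load1 (1m) $load5 (5m) $load15 (15m)"
fi
```

On a live system the helper would be fed real data with `ps -eo stat= | count_zombies`.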
4. Scheduling Tasks with cron and at
What is Task Scheduling?
Linux provides tools to automate tasks to run at specific times. cron schedules recurring tasks, while at schedules one-time tasks.
cron - Recurring Task Scheduler
Understanding cron Syntax
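A crontab entry consists of five time fields followed by the command: minute (0-59), hour (0-23), day of month (1-31), month (1-12) and day of week (0-7, where both 0 and 7 mean Sunday). As a rough illustration, the sketch below splits an entry into those fields; the variable names are illustrative.

```shell
#!/bin/sh
# Split a crontab entry into its five time fields and the command.
entry='*/15 9-17 * * 1-5 /usr/bin/rsync -avz /data/ user@backup:/backup/'

# read splits on whitespace; the last variable receives the rest of the line.
read -r minute hour day_of_month month day_of_week cron_command <<EOF
$entry
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $day_of_month"
echo "month:        $month"
echo "day of week:  $day_of_week"
echo "command:      $cron_command"
```

Read this way, the entry means: every 15 minutes, during hours 9 through 17, Monday to Friday.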
cron Special Characters
# * = any value
0 * * * * command
# Every hour at minute 0
# , = value list separator
0 0,12 * * * command
# Twice daily at midnight and noon
# - = range of values
0 9-17 * * * command
# Every hour 9am-5pm
# / = step values
*/15 * * * * command
# Every 15 minutes
cron Shortcuts
@hourly command
# 0 * * * *
@daily command
# 0 0 * * * (midnight)
@weekly command
# 0 0 * * 0 (Sunday midnight)
@monthly command
# 0 0 1 * * (1st of month)
@yearly command
# 0 0 1 1 * (Jan 1st)
@reboot command
# Run at system startup
Managing cron Jobs
User crontabs
Each user has their own crontab file.
# Edit your crontab
$ crontab -e
# List your cron jobs
$ crontab -l
# Remove all your cron jobs (no confirmation prompt; back up with crontab -l first)
$ crontab -r
# Load crontab from file
$ crontab mycron.txt
System crontabs
System-wide cron jobs in /etc/cron.* directories.
# System crontab file
/etc/crontab
# Directory for hourly scripts
/etc/cron.hourly/
# Directory for daily scripts
/etc/cron.daily/
# Directory for weekly scripts
/etc/cron.weekly/
# Directory for monthly scripts
/etc/cron.monthly/
# Additional cron files
/etc/cron.d/
at - One-time Task Scheduler
# Schedule for a specific time
$ at 11:30 PM tomorrow
at> /path/to/script.sh
at> <EOT>
# Press Ctrl+D to finish; at prints <EOT> and the job summary
job 1 at Sat Dec 2 23:30:00 2023
# Schedule in 2 hours
$ at now + 2 hours
at> echo "Task complete" | mail -s "Reminder" user@example.com
# List scheduled jobs
$ atq
# or
$ at -l
# Remove scheduled job
$ atrm 1
# Remove job number 1
Practical cron Examples for DevOps
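# Back up database daily at 2 AM
# (PASSWORD is a placeholder; prefer credentials in ~/.my.cnf over the command line)
0 2 * * * /usr/bin/mysqldump -u root -pPASSWORD database > /backup/db-$(date +\%Y\%m\%d).sql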
# Monitor disk space every hour
0 * * * * /usr/local/bin/check_disk_space.sh
# Rotate logs weekly on Sunday at 3 AM
0 3 * * 0 /usr/sbin/logrotate /etc/logrotate.conf
# Sync data between servers every 15 minutes during business hours
*/15 9-17 * * 1-5 /usr/bin/rsync -avz /data/ user@backup:/backup/
# Run maintenance script on first day of month at midnight
0 0 1 * * /usr/local/bin/monthly_maintenance.sh
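The scripts named in these entries are not shown in this post. As a hedged illustration of what a check such as check_disk_space.sh might contain, the sketch below reads the usage of one filesystem with `df -P`; the 90% threshold and the mount point are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch of a disk space check; threshold and mount point are illustrative.
threshold=90
mount_point=/

# df -P prints POSIX-format output; column 5 of the second line is the usage percentage.
usage=$(df -P "$mount_point" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')

if [ "$usage" -ge "$threshold" ]; then
    echo "WARNING: $mount_point is ${usage}% full (threshold ${threshold}%)"
else
    echo "OK: $mount_point is ${usage}% full"
fi
```

Run from cron, the warning branch would typically also send a notification or exit with a non-zero status so that a monitoring system can pick it up.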
⚠️ cron Security & Best Practices
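- Use absolute paths: cron runs jobs with a minimal PATH
- Redirect output: cron mails any output to the job owner; send it to a log file, or to /dev/null with > /dev/null 2>&1 if it is truly not needed
- Test commands manually: Run them by hand before adding them to cron
- Check cron logs: /var/log/cron (RHEL/CentOS) or /var/log/syslog (Debian/Ubuntu)
- Use locking: Prevent overlapping executions with flock
- Restrict access: Control who may use crontab with /etc/cron.allow and /etc/cron.deny
- Monitor cron jobs: A job that fails silently can go unnoticed, so alert on non-zero exit codes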
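The locking advice can be tried out in isolation. This is a small demonstration under stated assumptions, not a production wrapper: it assumes `flock` from util-linux is installed, uses a throwaway temp file as the lock, and lets `sleep 3` stand in for a long-running job.

```shell
#!/bin/sh
# Demonstrate flock -n: a second run is skipped while the first holds the lock.
lock_file=$(mktemp)

# First "job": holds the lock for 3 seconds.
flock -n "$lock_file" sleep 3 &
sleep 1   # give the first job time to acquire the lock

# Second "job": -n makes flock fail immediately instead of waiting for the lock.
if flock -n "$lock_file" true; then
    second_run="ran"
else
    second_run="skipped"
fi
echo "second run: $second_run"

# Clean up once the first job has finished.
wait
rm -f "$lock_file"
```

In a crontab the same pattern would prefix the real command, for example `flock -n /var/lock/myjob.lock /usr/local/bin/myjob.sh`, where both paths are hypothetical.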
5. System Startup, Boot Targets, and Runlevels
What is the Linux Boot Process?
The Linux boot process involves multiple stages that initialize hardware, load the kernel, and start system services. Understanding this process is crucial for troubleshooting boot issues.
Linux Boot Process Stages
- Firmware (BIOS/UEFI): Hardware initialization, POST and device detection
- Bootloader (GRUB2): Loads kernel and initramfs
- Linux kernel: Hardware initialization, mounts root filesystem
- Init system, systemd (or SysV init): Starts system services, reaches target/runlevel
systemd vs SysV Init
systemd (Modern)
- Parallel startup (faster boot)
- Socket activation
- Dependency-based startup
- Uses targets instead of runlevels
- Logging via journald
- Default on modern distributions
$ systemctl status nginx
$ systemctl enable nginx
SysV Init (Traditional)
- Sequential startup
- Uses runlevels (0-6)
- Scripts in /etc/init.d/
- Symbolic links in /etc/rc*.d/
- Still found on older systems
- Simpler but slower
$ /etc/init.d/nginx restart
$ chkconfig nginx on
systemd Targets (Modern Runlevels)
| Target | Purpose | Traditional Runlevel | Description |
|---|---|---|---|
| poweroff.target | Shutdown | 0 | System shutdown |
| rescue.target | Single User | 1 | Single-user mode for recovery |
| multi-user.target | Multi-user | 3 | Text-based multi-user |
| graphical.target | Graphical | 5 | Graphical multi-user |
| reboot.target | Reboot | 6 | System reboot |
Managing systemd Services
Service Management
# Start/stop/restart/reload
$ sudo systemctl start nginx
$ sudo systemctl stop nginx
$ sudo systemctl restart nginx
$ sudo systemctl reload nginx
# Enable/disable at boot
$ sudo systemctl enable nginx
$ sudo systemctl disable nginx
Service Status & Info
# Check status
$ systemctl status nginx
$ systemctl is-active nginx
$ systemctl is-enabled nginx
# View service logs
$ journalctl -u nginx
$ journalctl -u nginx --since "1 hour ago"
System Control
# Change target
$ sudo systemctl isolate multi-user.target
# Switch to multi-user mode
# Power management
$ sudo systemctl reboot
$ sudo systemctl poweroff
$ sudo systemctl suspend
Boot Process Troubleshooting
Common Boot Issues & Solutions
# View boot messages
$ dmesg | less
$ journalctl -b
# Current boot
$ journalctl -b -1
# Previous boot
# Check failed services
$ systemctl --failed
$ systemctl list-units --state=failed
# Emergency/Rescue mode
# At GRUB menu, press 'e' to edit, add:
systemd.unit=rescue.target
# or for emergency shell:
systemd.unit=emergency.target
# Check boot time performance
$ systemd-analyze
$ systemd-analyze blame
$ systemd-analyze critical-chain
Complete DevOps Process Management Workflow
End-to-End Process Management
A typical DevOps workflow for managing application processes:
# 1. Deploy the application as a systemd service
$ sudo cp myapp.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable myapp
$ sudo systemctl start myapp
# 2. Monitor application performance
$ watch -n 1 'ps aux | grep myapp | grep -v grep'
$ htop -p "$(pgrep -d, myapp)"
# 3. Set up automated health checks
# In /etc/cron.d/myapp-monitor:
*/5 * * * * root /usr/local/bin/check_myapp.sh
# 4. Configure log rotation
# In /etc/logrotate.d/myapp:
/var/log/myapp/*.log {
daily
rotate 30
compress
delaycompress
missingok
notifempty
create 644 myapp myapp
postrotate
systemctl reload myapp
endscript
}
# 5. Set up automated restarts on failure
# In /etc/systemd/system/myapp.service:
# Start rate limiting is configured in [Unit], not [Service]
[Unit]
StartLimitIntervalSec=60
StartLimitBurst=3
[Service]
Restart=on-failure
RestartSec=10s
Essential Commands Cheat Sheet
Process Management
$ ps aux # List all processes
$ top # Interactive process viewer
$ htop # Enhanced process viewer
$ pstree # Process tree
$ kill -9 PID # Force kill process
$ killall process_name # Kill by name
Job Control
$ jobs # List jobs
$ fg %1 # Bring job 1 to foreground
$ bg %2 # Resume job 2 in background
$ Ctrl+Z # Suspend foreground job
Task Scheduling
$ crontab -e # Edit cron jobs
$ crontab -l # List cron jobs
$ at 10:30 # Schedule one-time job
$ atq # List scheduled jobs
$ atrm 1 # Remove job 1
System Control
$ systemctl start service # Start service
$ systemctl stop service # Stop service
$ systemctl restart service # Restart service
$ systemctl status service # Service status
$ systemctl enable service # Enable at boot
Practice Scenarios for DevOps Engineers
- A web application is consuming 100% CPU. How would you identify the problematic process and safely restart it?
- You need to schedule a database backup every night at 2 AM that takes 3 hours to complete. How would you ensure it doesn't overlap with other jobs?
- A critical service keeps crashing. How would you configure it to automatically restart and alert you when it fails?
- You suspect a memory leak in an application. What tools would you use to monitor memory usage over time?
- How would you gracefully stop a long-running data processing job and resume it later?
- A server won't boot after a configuration change. What steps would you take to troubleshoot and recover?
- You need to run a resource-intensive report during off-hours without affecting production. How would you manage its priority?
- How would you set up monitoring to alert you when disk space reaches 90% or when critical services stop?
Key Takeaways
- Every running program is a process: Understanding processes is fundamental to Linux administration
- Job control enables multitasking: Manage foreground/background processes effectively
- Monitoring is proactive: Use ps, top, htop to identify issues before they cause problems
- Automate with cron: Schedule repetitive tasks to run automatically
- Master systemd: Modern Linux uses systemd for service management
- Understand the boot process: Essential for troubleshooting startup issues
- Prioritize processes: Use nice/renice to manage CPU allocation
- Implement proper service management: Enable, start, stop, and monitor services correctly
Mastering process and system management will make you more effective at maintaining stable, performant Linux systems in DevOps environments. These skills enable you to proactively monitor systems, automate routine tasks, and quickly resolve issues when they arise.