Process & System Management: The DevOps Control Center
Learn how to control, monitor, and schedule processes like a professional system administrator.
📅 Published: Feb 2026
⏱️ Estimated Reading Time: 18 minutes
🏷️ Tags: Process Management, System Monitoring, Cron Jobs, Linux Administration, DevOps
🧠 Understanding Processes & Jobs: The Heart of Linux
What Exactly is a Process?
Think of a process as a running instance of a program. When you start any application in Linux, whether it's a simple command like ls or a complex server like Apache, Linux creates a process for it. Here's a simple analogy:
Program = A recipe book (the instructions)
Process = Actually cooking the recipe (the execution)
Every process has:
PID (Process ID) - Unique number identifying the process
PPID (Parent Process ID) - The process that started this one
Owner - Which user is running it
Priority - How important it is for CPU time
State - Running, sleeping, stopped, or zombie
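To see these attributes on a real process, a minimal sketch (assuming a procps-style ps, as shipped on most Linux distributions) can query the current shell, whose PID is available as $$. The variable names shell_pid and parent_pid are illustrative only.

```bash
# Show PID, parent PID, owner, niceness, state, and command name for this shell.
ps -o pid,ppid,user,ni,stat,comm -p "$$"

# Capture single fields for use in a script.
# A trailing "=" after a column name suppresses the header line.
shell_pid=$(ps -o pid= -p "$$" | tr -d ' ')
parent_pid=$(ps -o ppid= -p "$$" | tr -d ' ')
echo "shell PID: $shell_pid, parent PID: $parent_pid"
```

The same -o column list works with any PID, so it is a quick way to check who owns a process and what started it.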
The Parent-Child Relationship
Linux processes are organized in a family tree. When you start a process, it becomes the child of whatever started it (usually your shell). This parent-child relationship is crucial for understanding how Linux manages processes.
    # See the process hierarchy (parent-child relationships drawn as a tree)
    pstree

    # Same tree, with PIDs shown
    pstree -p
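Real-world example: When you start a web server (like Nginx), it typically starts multiple worker processes. The main Nginx process is the parent, and the workers are its children. When the parent shuts down normally, it stops its workers as well. If the parent is killed abruptly, the children are not killed automatically: they become orphans and are adopted by init/systemd.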
Types of Processes
Foreground Processes - Run in your terminal, block input until done
Background Processes - Run without blocking your terminal's input (they still belong to the terminal session unless detached with nohup or disown)
Daemons - System service processes that run in background
Zombies - Processes that have finished but haven't been cleaned up
Orphans - Processes whose parent has died (adopted by init/systemd)
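The orphan case can be observed directly. The following is a small sketch, not a production pattern: a short-lived sh starts a sleep in the background and exits, leaving the sleep without its original parent. The variable names are illustrative, and the adopting process is typically PID 1 but may be another designated "subreaper" depending on the environment.

```bash
# The inner sh prints the PID of its background child and exits immediately,
# so the sleep process loses its parent.
orphan_pid=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo $!')

sleep 1  # give the kernel a moment to reparent the orphan

# Ask ps who the parent is now (often 1, i.e. init/systemd).
new_parent=$(ps -o ppid= -p "$orphan_pid" | tr -d ' ')
echo "orphan $orphan_pid was adopted by PID $new_parent"

# Clean up the demo process.
kill "$orphan_pid"
```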
🎭 Foreground vs Background Processes
Foreground Processes: The Attention Grabbers
When you run a command normally, it runs in the foreground. Your terminal is "locked" to that command - you can't type anything else until it finishes. This is perfect for commands that need your attention or produce output you want to watch.
    # These run in the foreground (you wait for them)
    ls -la
    vim file.txt
    ping google.com
    tail -f logfile.txt
When to use foreground: Interactive tasks, debugging, watching live output.
Background Processes: The Multitaskers
Background processes let you run commands without tying up your terminal. You start them, they run independently, and you get your prompt back immediately.
    # Start a process in the background
    sleep 60 &
    # The & symbol sends it to the background

    # Output:
    # [1] 12345
    # [1]   = Job number
    # 12345 = PID (Process ID)
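In scripts, where job numbers are less convenient, the PID of the most recent background command is available in the special variable $!. A minimal sketch (the variable names bg_pid and status are illustrative):

```bash
# Start a short background task and remember its PID.
sleep 2 &
bg_pid=$!
echo "started background PID $bg_pid"

# ... other work could happen here while the task runs ...

# Block until the task finishes and collect its exit status.
wait "$bg_pid"
status=$?
echo "background task exited with status $status"
```

Using wait with an explicit PID lets a script run several tasks in parallel and still check each one's result.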
Controlling Jobs: jobs, fg, bg
Linux keeps track of your background processes as "jobs". Here's how to manage them:
    # Start a long-running process
    tar -czf backup.tar.gz /home/user/documents &
    # Returns: [1] 23456

    # Check background jobs
    jobs
    # Output: [1]+ Running tar -czf backup.tar.gz ...

    # Bring background job to foreground
    fg %1
    # Now the tar command runs in the foreground

    # Suspend a foreground job
    # Press Ctrl+Z while it's running
    # It stops (suspends) the job

    # Send suspended job to background
    bg %1
    # Continues the job in the background

    # List all jobs with PIDs
    jobs -l
Common scenario: You start a big file transfer, realize you need your terminal back:
Press Ctrl+Z to suspend it
Type bg to send it to the background
You get your prompt back while the transfer continues
Later, use fg to bring it back to check progress
kill: Gracefully Stopping Processes
The kill command sends signals to processes. Despite its name, it doesn't necessarily "kill" them: the default signal (SIGTERM) asks the process to terminate and gives it a chance to clean up first.
    # List available signals
    kill -l

    # Graceful termination (default)
    kill 12345
    # Same as: kill -TERM 12345 or kill -15 12345

    # Force kill (use as last resort)
    kill -9 12345
    # -9 = SIGKILL, cannot be ignored

    # Kill by job number
    kill %1

    # Kill by process name
    pkill firefox
    killall chrome
Important distinction:
kill -15 (TERM) = "Please shut down nicely" (process can clean up)
kill -9 (KILL) = "Die now!" (forceful, may leave temporary files)
Best practice: Always try kill (or kill -15) first. Only use kill -9 if a process is truly stuck and won't respond to polite requests.
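The difference matters because a process can catch SIGTERM and tidy up, while SIGKILL gives it no such chance. The sketch below illustrates this with a hypothetical worker (a subshell) that removes its temporary file when asked to terminate; the names tmpfile and worker are made up for the example.

```bash
tmpfile=$(mktemp)

# Hypothetical worker: loops forever, but cleans up when it receives SIGTERM.
(
  trap 'rm -f "$tmpfile"; exit 0' TERM
  while :; do sleep 1; done
) &
worker=$!

sleep 1                 # let the worker install its trap
kill -TERM "$worker"    # polite request; the trap handler runs
wait "$worker"

if [ -e "$tmpfile" ]; then
  echo "temp file left behind"
else
  echo "temp file cleaned up"
fi
```

Had the worker been stopped with kill -9 instead, the trap would never run and the temporary file would remain on disk.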
📊 Process Monitoring: Keeping an Eye on Everything
ps: The Process Snapshot
ps shows a snapshot of current processes. It's like taking a photo of what's running right now.
    # Basic view
    ps
    # Shows your processes in the current terminal

    # View all processes
    ps aux
    # a = all users, u = user format, x = include non-terminal processes

    # View as tree (parent-child relationships)
    ps auxf
    # f = forest/tree view

    # Custom columns
    ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head -10
    # Shows: PID, Parent PID, Command, Memory %, CPU %, sorted by memory

    # Find a specific process
    ps aux | grep nginx
Understanding ps aux output:
    USER       PID %CPU %MEM    VSZ   RSS TTY  STAT START   TIME COMMAND
    root         1  0.0  0.1 169000  5500 ?    Ss   Jan01   0:12 /sbin/init
    www-data   456  0.2  2.1 245000 85000 ?    S    Jan01  12:34 nginx: worker
USER: Who owns the process
PID: Process ID
%CPU/%MEM: Resource usage
VSZ: Virtual memory size (in KiB)
RSS: Resident set size, the physical memory actually in use (in KiB)
TTY: Terminal associated (? means no terminal - daemon)
STAT: Process state
TIME: Total CPU time used
COMMAND: What's running
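Because these columns are plain text, they combine well with awk. As an illustration (a sketch assuming procps ps, which reports RSS in KiB), the following sums resident memory per user and prints the five largest totals in MiB. The variable name mem_by_user is illustrative.

```bash
# "user=" and "rss=" select two columns and suppress the header line.
mem_by_user=$(ps -eo user=,rss= |
  awk '{ rss[$1] += $2 }
       END { for (u in rss) printf "%s %.1f\n", u, rss[u] / 1024 }' |
  sort -k2 -nr | head -5)

echo "USER MiB"
echo "$mem_by_user"
```

Note that shared pages are counted once per process, so per-user RSS totals tend to overstate real memory consumption.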
top: The Live Dashboard
While ps is a snapshot, top is a live updating dashboard of your system. It's like watching security camera footage instead of looking at a photo.
    # Start top
    top

    # Inside top:
    # q = quit
    # k = kill process (enter PID)
    # r = renice process (change priority)
    # M = sort by memory
    # P = sort by CPU
    # 1 = show all CPU cores
    # h = help

    # Batch mode (for scripts)
    top -bn1
    # One snapshot, non-interactive

    # Monitor a specific process
    top -p 12345
Understanding top's header:
    top - 14:30:00 up 30 days,  3:15,  1 user,  load average: 0.50, 0.75, 1.00
    Tasks: 125 total,   1 running, 124 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  5.2 us,  1.5 sy,  0.0 ni, 93.3 id,  0.0 wa,  0.0 hi,  0.0 si
    MiB Mem :   7845.2 total,   1024.5 free,   2048.3 used,   4772.4 buff/cache
    MiB Swap:   2048.0 total,   2048.0 free,      0.0 used.   5120.2 avail Mem
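Load average: Average number of processes running or waiting to run over the last 1, 5, and 15 minutes. A value of 1.00 means one CPU core is fully busy, so compare it to the number of cores (4.00 on a 4-core machine means fully loaded)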
Tasks: Process counts by state
%Cpu: User, system, idle time percentages
Mem/Swap: Memory usage
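Since the load average only makes sense relative to the number of cores, a quick check can divide one by the other. This is a minimal sketch that assumes a Linux system with /proc/loadavg and the coreutils nproc command; the variable names are illustrative.

```bash
# Number of CPU cores available to this process.
cores=$(nproc)

# First field of /proc/loadavg is the 1-minute load average.
read -r load1 _ < /proc/loadavg

# Shell arithmetic is integer-only, so let awk do the division.
per_core=$(awk -v l="$load1" -v c="$cores" 'BEGIN { printf "%.2f", l / c }')

echo "1-minute load $load1 across $cores core(s) = $per_core per core"
```

A per-core value that stays above 1.00 suggests processes are queuing for CPU time.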
htop: The Supercharged top
htop is like top but with colors, mouse support, and better visual layout. It's not always installed by default, but worth adding.
    # Install htop
    sudo apt install htop   # Ubuntu/Debian
    sudo yum install htop   # CentOS/RHEL (may require the EPEL repository)

    # Run it
    htop

    # Features:
    # Mouse click to select processes
    # Color-coded resource usage
    # Tree view toggle (F5)
    # Search (F3)
    # Filter (F4)
    # Kill process (F9)
    # Nice/renice (F7/F8)
When to use htop: When you need to quickly understand what's happening on a system. The colors make high-CPU processes stand out immediately.
nice and renice: Setting Priorities
In Linux, not all processes are equal. Some need immediate attention (like your web server responding to requests), while others can wait (like a background backup).
Nice values range from -20 (highest priority) to 19 (lowest priority). Default is 0.
    # Start a process with low priority
    nice -n 19 big_backup_script.sh
    # Runs with niceness 19 (lowest priority)

    # Start a process with high priority
    sudo nice -n -10 critical_process
    # Needs root for negative nice values

    # Change the priority of a running process
    renice -n 10 -p 12345
    # Makes PID 12345 lower priority

    # Change the priority of all processes for a user
    renice -n 5 -u username
Real-world use: Running a CPU-intensive report generation during off-hours:
    # Don't interfere with production traffic
    nice -n 19 generate_reports.sh &
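One detail worth verifying for yourself: the value passed to nice -n is added to the niceness of the calling shell, and children inherit the result. A small sketch, assuming procps ps (the variable names base and child are illustrative):

```bash
# Niceness of the current shell (usually 0).
base=$(ps -o ni= -p "$$" | tr -d ' ')

# Launch a child with an adjustment of +5 and have it report its own niceness.
# The single quotes make $$ expand inside the child, not in this shell.
child=$(nice -n 5 sh -c 'ps -o ni= -p $$' | tr -d ' ')

echo "shell niceness: $base, child niceness: $child"
```

On a shell running at the default niceness of 0, the child reports 5.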
⏰ Scheduling Tasks: cron and at
cron: The Time-Based Scheduler
cron is Linux's job scheduler. It runs commands at specific times, dates, or intervals. Think of it as your personal assistant who never sleeps.
The crontab file contains your scheduled jobs:
    # Edit your crontab
    crontab -e

    # List your scheduled jobs
    crontab -l

    # Remove all your scheduled jobs (no confirmation prompt)
    crontab -r
Understanding cron Syntax
Cron uses a specific format:
    * * * * * command_to_execute
    │ │ │ │ │
    │ │ │ │ └── Day of week (0-7; both 0 and 7 mean Sunday)
    │ │ │ └──── Month (1-12)
    │ │ └────── Day of month (1-31)
    │ └──────── Hour (0-23)
    └────────── Minute (0-59)
Examples:
    # Run every minute
    * * * * * /path/to/script.sh

    # Run at 2:30 AM daily
    30 2 * * * /backup/daily.sh

    # Run at 4 PM every Monday
    0 16 * * 1 /report/weekly.sh

    # Run every 5 minutes
    */5 * * * * /monitor/check.sh

    # Run at 10:15 on the 1st of each month
    15 10 1 * * /billing/monthly.sh
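The */N step syntax is easy to misread as "N minutes after the previous run". It actually selects every Nth value starting from the beginning of the field's range. The helper below, expand_step, is a hypothetical illustration (not part of cron) that lists the values a step expression selects.

```bash
# expand_step STEP MIN MAX
# Prints the values matched by "*/STEP" in a field whose range is MIN..MAX.
expand_step() {
  seq "$2" "$1" "$3" | paste -sd, -
}

echo "*/5  in the minute field: $(expand_step 5 0 59)"
echo "*/15 in the minute field: $(expand_step 15 0 59)"
echo "*/40 in the minute field: $(expand_step 40 0 59)"
```

The last line shows why steps that don't divide the range evenly are surprising: */40 runs at minute 0 and minute 40 of every hour, so the gaps alternate between 40 and 20 minutes.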
Special cron Strings
For common intervals, you can use special strings:
    @reboot  /script/startup.sh    # Run at system startup
    @yearly  /backup/yearly.sh     # Same as 0 0 1 1 *
    @monthly /backup/monthly.sh    # Same as 0 0 1 * *
    @weekly  /backup/weekly.sh     # Same as 0 0 * * 0
    @daily   /backup/daily.sh      # Same as 0 0 * * *
    @hourly  /monitor/hourly.sh    # Same as 0 * * * *
System-wide cron Jobs
System administrators can also schedule jobs in these directories:
    /etc/cron.hourly/     # Runs hourly
    /etc/cron.daily/      # Runs daily
    /etc/cron.weekly/     # Runs weekly
    /etc/cron.monthly/    # Runs monthly
Just put executable scripts in these directories (no cron syntax needed).
at: The One-Time Scheduler
While cron is for repeating tasks, at is for one-time scheduling.
    # Schedule a job for 2:30 PM today
    echo "backup.sh" | at 14:30

    # Schedule for tomorrow
    at 10:00 tomorrow
    # Then type commands, Ctrl+D to finish

    # Schedule in 2 hours
    at now + 2 hours
    # Enter commands, Ctrl+D

    # List pending jobs
    atq

    # Remove a job
    atrm job_number
When to use at: When you need to run something once at a specific time, like "reboot the server at 3 AM when no one's using it."
🚀 System Startup, Boot Targets, and Runlevels
Understanding System Boot Process
When you start a Linux machine, it goes through several stages:
BIOS/UEFI - Hardware initialization
Bootloader (GRUB) - Loads the kernel
Kernel - Initializes hardware, mounts root filesystem
init/systemd - First process (PID 1), starts all other processes
Runlevel/Target - Determines what services to start
systemd: The Modern Init System
Most modern Linux distributions use systemd (pronounced "system-dee") as their init system. It replaces the older SysV init system.
    # Check if the system is using systemd
    ps -p 1
    # If it shows "systemd", you're using systemd

    # System status
    systemctl status

    # List all units (services, mounts, sockets, etc.)
    systemctl list-units
Boot Targets (Like Runlevels)
In systemd, targets are groups of services that should be started together. They're similar to the old runlevels but more flexible.
    # List available targets
    systemctl list-units --type=target

    # Check current default target
    systemctl get-default

    # Change default target
    sudo systemctl set-default multi-user.target
    # Changes default to multi-user (no GUI)

    # Switch to a different target immediately
    sudo systemctl isolate graphical.target
    # Starts the GUI

    # Rescue mode (single-user, for recovery)
    sudo systemctl rescue

    # Emergency mode (bare minimum, for fixing serious issues)
    sudo systemctl emergency
Common targets:
multi-user.target - Normal server mode (text only)
graphical.target - Desktop mode with GUI
rescue.target - Single-user mode for recovery
emergency.target - Bare minimum for fixing serious issues
Managing Services with systemctl
Services are programs that run in the background (daemons). systemctl manages them.
    # Start a service
    sudo systemctl start nginx

    # Stop a service
    sudo systemctl stop nginx

    # Restart a service
    sudo systemctl restart nginx

    # Reload configuration (without a full restart)
    sudo systemctl reload nginx

    # Check service status
    systemctl status nginx

    # Enable service to start at boot
    sudo systemctl enable nginx

    # Disable service from starting at boot
    sudo systemctl disable nginx

    # See if service is enabled
    systemctl is-enabled nginx

    # View service logs
    sudo journalctl -u nginx
    sudo journalctl -u nginx --since "1 hour ago"
The Old Way: SysV Init (Still Good to Know)
Some older systems or certain distributions still use SysV init. The commands are different:
    # Start/stop/restart
    sudo service nginx start
    sudo /etc/init.d/nginx start   # Alternative

    # Check runlevel
    runlevel
    # Output: N 5
    # First char: Previous runlevel (N = none)
    # Second char: Current runlevel (5 = graphical)

    # Change runlevel (requires root)
    sudo init 3   # Switch to multi-user mode
    sudo init 5   # Switch to graphical mode
Runlevels explained (Red Hat-style convention; Debian-based systems traditionally treat runlevels 2-5 identically):
0: Halt (shutdown)
1: Single-user mode (recovery)
2: Multi-user without networking
3: Multi-user with networking (server default)
4: Unused/custom
5: Multi-user with GUI (desktop default)
6: Reboot
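On systemd machines, each of these runlevels is mapped to a target for backward compatibility (through the runlevelN.target aliases). The function below is a small lookup sketch of that mapping; runlevel_to_target is a hypothetical helper name, not a systemd command.

```bash
# runlevel_to_target N
# Prints the systemd target that corresponds to SysV runlevel N.
runlevel_to_target() {
  case "$1" in
    0)     echo "poweroff.target" ;;
    1)     echo "rescue.target" ;;
    2|3|4) echo "multi-user.target" ;;
    5)     echo "graphical.target" ;;
    6)     echo "reboot.target" ;;
    *)     echo "unknown runlevel: $1" >&2; return 1 ;;
  esac
}

for level in 0 1 3 5 6; do
  echo "runlevel $level -> $(runlevel_to_target "$level")"
done
```

This is why a command like init 3 still works on a systemd host: it is translated into a switch to multi-user.target.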
🎯 Real-World DevOps Scenarios
Scenario 1: Troubleshooting High CPU Usage
Problem: Server is slow, users complaining.
    # Step 1: Check system load
    uptime
    # If load > number of CPU cores, there's a problem

    # Step 2: Find top CPU consumers
    top
    # Or: ps aux --sort=-%cpu | head -10

    # Step 3: Investigate a specific process
    strace -p 12345   # See what system calls it's making
    lsof -p 12345     # See what files it has open

    # Step 4: Kill if necessary
    kill 12345
    # If it doesn't respond:
    kill -9 12345
Scenario 2: Scheduling Database Backups
Problem: Need daily backup of database at 2 AM.
    # Create backup script
    cat > /usr/local/bin/backup-db.sh << 'EOF'
    #!/bin/bash
    DATE=$(date +%Y%m%d)
    # NOTE: a password on the command line is visible in ps output;
    # a ~/.my.cnf credentials file is a safer alternative.
    mysqldump -u root -pPASSWORD database > "/backup/db-$DATE.sql"
    gzip "/backup/db-$DATE.sql"
    # Delete compressed backups older than 30 days
    find /backup -name "*.gz" -mtime +30 -delete
    EOF
    chmod +x /usr/local/bin/backup-db.sh

    # Schedule with cron
    crontab -e
    # Add: 0 2 * * * /usr/local/bin/backup-db.sh

    # Test immediately
    at now
    # Type: /usr/local/bin/backup-db.sh
    # Ctrl+D to run
Scenario 3: Managing Web Server During Deployment
Problem: Deploy new code without downtime.
    # During deployment script:

    # 1. Check current status
    systemctl status nginx

    # 2. Graceful reload (keep serving during config update)
    sudo systemctl reload nginx

    # 3. If new code needs a full restart:
    #    take the server out of the load balancer first
    sudo systemctl stop nginx

    # Deploy code (the trailing /. copies the contents of new_code,
    # not the directory itself)
    cp -r new_code/. /var/www/html/

    # Start server
    sudo systemctl start nginx

    # Verify it's running
    systemctl status nginx
    curl -I http://localhost
Scenario 4: Creating a System Monitor Script
    cat > /usr/local/bin/system-monitor.sh << 'EOF'
    #!/bin/bash
    echo "=== System Monitor ==="
    echo "Time: $(date)"
    echo

    # CPU and Load
    echo "CPU Load:"
    uptime
    echo

    # Memory
    echo "Memory Usage:"
    free -h
    echo

    # Disk
    echo "Disk Usage:"
    df -h / /home
    echo

    # Top Processes
    echo "Top 5 CPU Processes:"
    ps aux --sort=-%cpu | head -6
    echo
    echo "Top 5 Memory Processes:"
    ps aux --sort=-%mem | head -6
    echo

    # Services
    echo "Critical Services:"
    for service in nginx mysql redis; do
        systemctl is-active "$service" >/dev/null && \
            echo "✓ $service is running" || \
            echo "✗ $service is NOT running"
    done
    EOF
    chmod +x /usr/local/bin/system-monitor.sh

    # Run every 5 minutes
    (crontab -l 2>/dev/null; echo "*/5 * * * * /usr/local/bin/system-monitor.sh >> /var/log/system-monitor.log") | crontab -
💡 Best Practices for Process Management
1. Never Use kill -9 First
Always try graceful shutdown first. kill -9 can corrupt data and leave temporary files.
2. Monitor Zombie Processes
Zombie processes (state "Z") are finished but not cleaned up. Too many zombies indicate a problem with parent processes not cleaning up children.
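    # Check for zombies (STAT column starts with Z)
    # A plain grep 'Z' would also match usernames, commands, and grep itself
    ps -eo pid,ppid,stat,cmd | awk 'NR == 1 || $3 ~ /^Z/'

    # Count zombies
    ps -eo stat= | grep -c '^Z'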
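To see what a zombie looks like, one can be created on purpose. This is a demonstration sketch only, assuming procps ps (for the --ppid option): a shell starts a short-lived child and then replaces itself with sleep, which never collects the child's exit status. The variable names parent and zombies are illustrative.

```bash
# The sh process starts "sleep 1" as a child, then execs into "sleep 5".
# "sleep 5" keeps the same PID but never calls wait(), so once the child
# finishes it stays in the process table as a zombie.
sh -c 'sleep 1 & exec sleep 5' &
parent=$!

sleep 2  # by now the child has exited

# List children of $parent whose state starts with Z.
zombies=$(ps -o pid=,stat= --ppid "$parent" | awk '$2 ~ /^Z/ { print $1 }')
echo "zombie children of $parent: ${zombies:-none}"

# When the parent exits, the zombie is adopted and reaped.
wait "$parent"
```

A zombie holds no memory or CPU, only a process table entry, which is why the fix is always on the parent's side.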
3. Use nohup for Long-Running Jobs
If you want a process to keep running after you log out:
    nohup long_script.sh > output.log 2>&1 &
    # nohup = No Hang Up (immune to terminal disconnects)
4. Set Proper Process Limits
Prevent a single process from consuming all resources:
    # Set limits for a script
    ulimit -n 2048   # Max open files
    ulimit -u 500    # Max processes for user
    ulimit -t 3600   # Max CPU time (seconds)

    # Run a script with limits
    ulimit -Sv 500000   # ~500MB virtual memory limit (value is in KiB)
    ./memory_hungry_script.sh
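Limits set with ulimit apply to the current shell and everything it starts afterwards, and a lowered hard limit cannot be raised again without privileges. Running ulimit inside a subshell keeps the change contained. A minimal sketch (the value 64 and the variable names are arbitrary choices for the example):

```bash
before=$(ulimit -n)

# Command substitution runs in a subshell, so the lowered limit
# applies only to commands inside the $( ... ).
inner=$(ulimit -n 64; ulimit -n)

after=$(ulimit -n)

echo "open-file limit before: $before, inside subshell: $inner, after: $after"
```

The same pattern works for launching a real workload: ( ulimit -n 64; ./some_command ) restricts only that command.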
5. Log Your cron Jobs
Always redirect cron job output to logs:
    # In crontab:
    0 2 * * * /backup/script.sh >> /var/log/backup.log 2>&1
    # Logs both stdout and stderr
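Plain redirection records the output but not when the job ran or whether it succeeded. One possible approach is a small wrapper that adds timestamps and the exit code. The function log_run below is a hypothetical helper written for this example, not a standard tool.

```bash
# log_run LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appending its output plus start/end markers to LOGFILE.
log_run() {
  log_file=$1
  shift
  printf '%s START %s\n' "$(date '+%F %T')" "$*" >> "$log_file"
  "$@" >> "$log_file" 2>&1
  run_rc=$?
  printf '%s END rc=%s\n' "$(date '+%F %T')" "$run_rc" >> "$log_file"
  return "$run_rc"
}

# Demo with a command that prints a line and fails with status 3.
demo_log=$(mktemp)
rc=0
log_run "$demo_log" sh -c 'echo hello; exit 3' || rc=$?
cat "$demo_log"
echo "wrapper returned $rc"
```

In a crontab, the function would live in a script that the cron entry calls, so that every run leaves a start line, the job's output, and an end line with the exit code.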
📋 Quick Reference Cheat Sheet
| Task | Command | Example |
|---|---|---|
| List processes | ps aux | ps aux \| grep nginx |
| Live monitoring | top | top then M to sort by memory |
| Enhanced monitor | htop | htop (install if needed) |
| Background job | & | sleep 60 & |
| List jobs | jobs | jobs -l |
| Bring job to foreground | fg | fg %1 |
| Resume job in background | bg | bg %1 |
| Suspend job | Ctrl+Z | (While process running) |
| Kill process | kill | kill 12345 |
| Force kill | kill -9 | kill -9 12345 (last resort) |
| Kill by name | pkill | pkill firefox |
| Set priority | nice | nice -n 19 bigjob.sh |
| Change priority | renice | renice -n 10 -p 12345 |
| Edit cron jobs | crontab -e | crontab -e |
| List cron jobs | crontab -l | crontab -l |
| Schedule once | at | echo "cmd" \| at 14:30 |
| Start service | systemctl start | sudo systemctl start nginx |
| Stop service | systemctl stop | sudo systemctl stop nginx |
| Service status | systemctl status | systemctl status nginx |
| Enable service | systemctl enable | sudo systemctl enable nginx |
| Service logs | journalctl -u | sudo journalctl -u nginx -f |
| Default target | systemctl get-default | systemctl get-default |
| Change target | systemctl set-default | sudo systemctl set-default multi-user.target |
🚀 Practice Exercises
Exercise 1: Create a Process Monitor
    # Create a script that monitors and logs process activity
    cat > process-watch.sh << 'EOF'
    #!/bin/bash
    LOG_FILE="/var/log/process-watch.log"

    echo "=== Process Watch: $(date) ===" >> "$LOG_FILE"
    echo "Top 5 CPU:" >> "$LOG_FILE"
    ps aux --sort=-%cpu | head -6 >> "$LOG_FILE"
    echo >> "$LOG_FILE"
    echo "Top 5 Memory:" >> "$LOG_FILE"
    ps aux --sort=-%mem | head -6 >> "$LOG_FILE"
    echo "---" >> "$LOG_FILE"

    # Check for zombie processes (STAT starts with Z).
    # Matching on the STAT column avoids counting grep itself or any
    # command that merely contains the letter Z.
    ZOMBIES=$(ps -eo stat= | grep -c '^Z')
    if [ "$ZOMBIES" -gt 0 ]; then
        echo "WARNING: $ZOMBIES zombie processes found!" >> "$LOG_FILE"
        ps -eo pid,ppid,stat,cmd | awk '$3 ~ /^Z/' >> "$LOG_FILE"
    fi
    EOF
    chmod +x process-watch.sh

    # Writing to /var/log normally requires root
    sudo ./process-watch.sh
    cat /var/log/process-watch.log
Exercise 2: Schedule System Health Checks
    # Create a health check script
    cat > /usr/local/bin/health-check.sh << 'EOF'
    #!/bin/bash
    THRESHOLD=80
    ALERT_EMAIL="admin@example.com"

    # Check disk
    DISK_USAGE=$(df / | awk 'NR==2{print $5}' | sed 's/%//')
    if [ "$DISK_USAGE" -gt "$THRESHOLD" ]; then
        echo "High disk usage: ${DISK_USAGE}%" | mail -s "Disk Alert" "$ALERT_EMAIL"
    fi

    # Check memory (used / total, as a whole-number percentage)
    MEM_USAGE=$(free | awk '/^Mem:/{printf "%.0f", $3/$2*100}')
    if [ "$MEM_USAGE" -gt "$THRESHOLD" ]; then
        echo "High memory usage: ${MEM_USAGE}%" | mail -s "Memory Alert" "$ALERT_EMAIL"
    fi

    # Check critical services
    for service in ssh nginx mysql; do
        if ! systemctl is-active --quiet "$service"; then
            echo "Service $service is down!" | mail -s "Service Alert" "$ALERT_EMAIL"
            # Try to restart
            systemctl restart "$service"
        fi
    done
    EOF
    chmod +x /usr/local/bin/health-check.sh

    # Schedule every 5 minutes
    (crontab -l 2>/dev/null; echo "*/5 * * * * /usr/local/bin/health-check.sh") | crontab -
Exercise 3: Practice Process Control
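    # Job numbers (%1, %2) belong to a single shell, so run both commands
    # in the SAME terminal. Wrapping each loop in `bash -c` makes it one job
    # that Ctrl+Z can suspend as a whole.

    # Start a long process, then suspend it with Ctrl+Z
    bash -c 'while true; do echo "Process 1: $(date)"; sleep 2; done'

    # Start another, then suspend it with Ctrl+Z as well
    bash -c 'while true; do echo "Process 2: $(date)"; sleep 3; done'

    # Practice:
    # 1. List jobs with `jobs -l`
    # 2. Resume the first in the background with `bg %1`
    # 3. Bring the second to the foreground with `fg %2`, then press Ctrl+Z again
    # 4. Change priority with `renice -n 10 -p $(pgrep -f "Process 2")`
    # 5. Kill one with `kill %1`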
🔗 Master System Management with Hands-on Labs
Process and system management skills are critical for keeping servers healthy and applications running smoothly. The best way to learn is through hands-on practice with real scenarios.
👉 Practice process management, service control, and scheduling in our interactive labs at:
https://devops.trainwithsky.com/
Our platform provides:
Real Linux systems to practice on
Common production scenarios
Guided exercises with increasing complexity
Instant feedback and solutions
Progress from basic monitoring to advanced tuning
Common Questions Answered
Q: What's the difference between a daemon and a regular process?
A: Daemons are background service processes (usually started at boot, no controlling terminal). Regular processes are typically user-started applications.
Q: When should I use systemctl vs service?
A: Use systemctl on modern systems (Ubuntu 16.04+, CentOS 7+). Use service on older systems or when you need compatibility.
Q: How do I make a process survive terminal logout?
A: Use nohup command & or disown after starting, or better yet, create a systemd service.
Q: What's a zombie process and should I worry?
A: Zombies are processes that have finished but whose parent hasn't cleaned them up. A few zombies are normal, but many indicate a problem.
Q: How do I find which cron job is running a process?
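A: Trace the process's ancestry: pstree -s -p PID shows the chain of parent processes, and a job launched by cron will have the cron daemon (cron or crond) among its ancestors. The cron log (for example, grep CRON /var/log/syslog on Debian/Ubuntu) then shows which command was started and when.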
Q: Why is my cron job not working?
A: Common issues: PATH differences, permission problems, missing output redirection. Always test commands manually first, then add to cron with full paths and logging.
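PATH differences in particular can be reproduced without waiting for cron to fire. The sketch below runs a command under a stripped-down environment similar to the one cron typically provides (a short PATH, /bin/sh as the shell, and little else). The helper name run_like_cron is hypothetical, and the exact environment varies between cron implementations.

```bash
# run_like_cron 'COMMAND STRING'
# Runs the command with a minimal, cron-like environment.
run_like_cron() {
  env -i \
    HOME="${HOME:-/}" \
    LOGNAME="${LOGNAME:-$(id -un)}" \
    PATH=/usr/bin:/bin \
    SHELL=/bin/sh \
    /bin/sh -c "$1"
}

# Single quotes keep $PATH from expanding in the current shell.
run_like_cron 'echo "PATH seen by the job: $PATH"'
```

If a script works in your terminal but fails under run_like_cron, the usual culprits are commands that live outside /usr/bin and /bin or variables set only in your login profile.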
Having trouble with process management or scheduling? Share your specific challenge in the comments below! 💬