This eBook includes the following formats, accessible from your Account page after purchase:
EPUB: The open industry format known for its reflowable content and usability on supported mobile devices.
PDF: The popular standard, used most often with the free Acrobat® Reader® software.
This eBook requires no passwords or activation to read. We customize your eBook by discreetly watermarking it with your name, making it uniquely yours.
“If you’re a developer trying to figure out why your application is not responding at 3 am, you need this book! This is now my go-to book when diagnosing production issues. It has saved me hours in troubleshooting complicated operations problems.”
–Trotter Cashion, cofounder, Mashion
DevOps can help developers, QAs, and admins work together to solve Linux server problems far more rapidly, significantly improving IT performance, availability, and efficiency. To gain these benefits, however, team members need common troubleshooting skills and practices.
In DevOps Troubleshooting: Linux Server Best Practices, award-winning Linux expert Kyle Rankin brings together all the standardized, repeatable techniques your team needs to stop finger-pointing, collaborate effectively, and quickly solve virtually any Linux server problem. Rankin walks you through using DevOps techniques to troubleshoot everything from boot failures and corrupt disks to lost email and downed websites. You’ll master indispensable skills for diagnosing high-load systems and network problems in production environments.
Preface
Acknowledgments
About the Author
Chapter 1: Troubleshooting Best Practices
Divide the Problem Space
Practice Good Communication When Collaborating
Favor Quick, Simple Tests over Slow, Complex Tests
Favor Past Solutions
Document Your Problems and Solutions
Know What Changed
Understand How Systems Work
Use the Internet, but Carefully
Resist Rebooting
Chapter 2: Why Is the Server So Slow? Running Out of CPU, RAM, and Disk I/O
System Load
Diagnose Load Problems with top
Troubleshoot High Load after the Fact
Chapter 3: Why Won’t the System Boot? Solving Boot Problems
The Linux Boot Process
BIOS Boot Order
Fix GRUB
Disable Splash Screens
Can’t Mount the Root File System
Can’t Mount Secondary File Systems
Chapter 4: Why Can’t I Write to the Disk? Solving Full or Corrupt Disk Issues
When the Disk Is Full
Out of Inodes
The File System Is Read-Only
Repair Corrupted File Systems
Repair Software RAID
Chapter 5: Is the Server Down? Tracking Down the Source of Network Problems
Server A Can’t Talk to Server B
Troubleshoot Slow Networks
Packet Captures
Chapter 6: Why Won’t the Hostnames Resolve? Solving DNS Server Issues
DNS Client Troubleshooting
DNS Server Troubleshooting
Chapter 7: Why Didn’t My Email Go Through? Tracing Email Problems
Trace an Email Request
Understand Email Headers
Problems Sending Email
Problems Receiving Email
Chapter 8: Is the Website Down? Tracking Down Web Server Problems
Is the Server Running?
Test a Web Server from the Command Line
HTTP Status Codes
Parse Web Server Logs
Get Web Server Statistics
Solve Common Web Server Problems
Chapter 9: Why Is the Database Slow? Tracking Down Database Problems
Search Database Logs
Is the Database Running?
Get Database Metrics
Identify Slow Queries
Chapter 10: It’s the Hardware’s Fault! Diagnosing Common Hardware Problems
The Hard Drive Is Dying
Test RAM for Errors
Network Card Failures
The Server Is Too Hot
Power Supply Failures
Index