Introduction

When working on performance issues in Java applications, one often needs to analyze thread dumps. In tightly controlled environments you may not have access to the usual tools for generating them, such as jstack. In that case, the dumps are typically generated by sending SIGQUIT (kill -3) to the Java process. The JVM writes the dump to its standard output, which under Tomcat ends up in catalina.out. This blog post describes an efficient way to extract these thread dumps with an AWK script, making the analysis smoother and faster.

Taking Thread Dumps with kill -3

In production environments where you may not have access to tools like jstack, you can generate a thread dump by sending a SIGQUIT signal using the kill -3 command. Here’s how to do it:

  1. Identify the Java process ID (PID): Use the ps command to find the PID of the Java process running your application.

    ps aux | grep java
    
  2. Send SIGQUIT with kill -3, replacing <PID> with the process ID found in step 1:

    kill -3 <PID>
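
A single dump is only a snapshot, so it is common to take a few of them some seconds apart and compare them. The sketch below wraps step 2 in a small bash helper for that purpose. The function name take_thread_dumps and the defaults (three dumps, ten seconds apart) are illustrative choices, not part of the procedure above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send SIGQUIT (kill -3) to a JVM several times.
# usage: take_thread_dumps <PID> [count] [interval-seconds]
take_thread_dumps() {
    local pid="$1" count="${2:-3}" interval="${3:-10}" i
    for ((i = 1; i <= count; i++)); do
        # By default a HotSpot JVM answers SIGQUIT by printing a thread dump
        # to its standard output and then keeps running.
        kill -3 "$pid" || return 1
        echo "thread dump $i/$count requested from PID $pid"
        # Pause between dumps, but not after the last one.
        if ((i < count)); then
            sleep "$interval"
        fi
    done
}
```

With the PID from step 1, a call could look like this:

    take_thread_dumps <PID> 3 10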
    

Extracting Thread Dumps with AWK

The catalina.out file usually contains other runtime logging, making it cumbersome to isolate thread dumps manually. To streamline the extraction process, you can use the following AWK script. This script identifies and isolates thread dumps into separate files, making it easier to analyze them.

Script Overview

Below is the AWK script that accomplishes this task. It reads catalina.out in a single pass and opens each output file only once rather than once per line, which keeps it fast even with very large catalina.out files.

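awk '
# A line holding only a timestamp (YYYY-MM-DD HH:MM:SS) starts a new thread dump.
/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$/ {
    if (td) close(outFile)    # the previous dump never reached its end marker
    n++; td = 1
    outFile = sprintf("jira_threads.%06d.txt", n)
}
# While inside a dump, append every line to the current output file.
td { print $0 >> outFile }
# "object space" directly after the PSPermGen line ends the dump, so close its file.
/object space/ && lastLine ~ /PSPermGen/ { if (td) close(outFile); td = 0 }
{ lastLine = $0 }' catalina.out

For convenience, here’s a one-liner version of the same script:

awk '/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$/ { if (td) close(outFile); n++; td=1; outFile=sprintf("jira_threads.%06d.txt", n) } td { print $0 >> outFile } /object space/ && lastLine ~ /PSPermGen/ { if (td) close(outFile); td=0 } { lastLine=$0 }' catalina.out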
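
To try the splitting without touching a real log, the following sketch builds a tiny catalina.out in a scratch directory and runs the same splitting rules against it. The directory name threaddump-demo and every line of the sample log are invented for illustration and heavily shortened. The timestamp pattern is spelled out without {n} repetitions here, because some awk implementations (older mawk releases, for instance) do not support them.

```shell
#!/usr/bin/env bash
set -eu

# Scratch directory (hypothetical name) so that no real log is touched.
demo_dir=threaddump-demo
mkdir -p "$demo_dir"
rm -f "$demo_dir"/jira_threads.*.txt

# Invented, heavily shortened sample: two dumps surrounded by ordinary log lines.
cat > "$demo_dir/catalina.out" <<'EOF'
INFO: Server startup in 1234 ms
2024-01-15 10:00:00
Full thread dump Java HotSpot(TM) 64-Bit Server VM:
"main" prio=10 tid=0x01 nid=0x1 runnable
Heap
 PSPermGen       total 21248K, used 6500K
  object space 21248K, 30% used
INFO: ordinary log line between dumps
2024-01-15 10:00:10
Full thread dump Java HotSpot(TM) 64-Bit Server VM:
"main" prio=10 tid=0x01 nid=0x1 waiting on condition
Heap
 PSPermGen       total 21248K, used 6600K
  object space 21248K, 31% used
INFO: trailing log line
EOF

# Run the splitter inside the scratch directory; the output files are created there.
(
    cd "$demo_dir"
    awk '
    # A line holding only a timestamp starts a new dump.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
        if (td) close(outFile)
        n++; td = 1
        outFile = sprintf("jira_threads.%06d.txt", n)
    }
    # Copy every line of the current dump to its file.
    td { print $0 >> outFile }
    # The PSPermGen / object space pair marks the end of the dump.
    /object space/ && lastLine ~ /PSPermGen/ { if (td) close(outFile); td = 0 }
    { lastLine = $0 }' catalina.out
)

# List what was produced.
ls "$demo_dir"
```

Each dump lands in its own numbered file, and the INFO lines around the dumps are left out.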

Logic Explained

  1. Thread Dump Identification: The script begins by identifying the start of a new thread dump. It looks for lines that consist of nothing but a timestamp in the form YYYY-MM-DD HH:MM:SS, which is how the JVM introduces each thread dump. When such a line is found, it prepares to write to a new output file.

  2. Efficient File Handling: For each identified thread dump, the script opens a new file once and appends all lines of that dump to it. The file is not reopened for every line, which avoids the overhead of repeatedly opening and closing files.

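  3. Termination Condition: The script stops capturing when it sees a line containing “object space” directly after a line containing “PSPermGen”. This pair closes the heap summary that the JVM prints after the thread stacks on Java 7 and earlier with the parallel collector, so the ordinary log lines that follow are left out. Java 8 and later no longer have a PermGen, so on those JVMs the marker never matches and capturing simply continues until the next dump starts.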
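
Since the end marker depends on the JVM version and garbage collector, it can help to make it configurable. The variant below is a sketch under one assumption: on Java 8 and later the heap summary typically ends with a "class space" line directly below a "Metaspace" line, so those two patterns are used as defaults. Check the end of a dump in your own log before relying on them. The function name split_thread_dumps and the variable names prev and last are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: split thread dumps using a configurable end marker.
# usage: split_thread_dumps <logfile> [prev-pattern] [last-pattern]
# The defaults are an assumption about Java 8+ heap summaries.
split_thread_dumps() {
    local log="$1" prev="${2:-Metaspace}" last="${3:-class space}"
    awk -v prev="$prev" -v last="$last" '
    # A line holding only a timestamp starts a new dump.
    /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
        if (td) close(outFile)
        n++; td = 1
        outFile = sprintf("jira_threads.%06d.txt", n)
    }
    # Copy every line of the current dump to its file.
    td { print $0 >> outFile }
    # The dump ends when a line matching "last" directly follows one matching "prev".
    td && $0 ~ last && lastLine ~ prev { close(outFile); td = 0 }
    { lastLine = $0 }' "$log"
}
```

The first call below uses the assumed Java 8+ defaults, the second reproduces the PermGen-based behaviour described above:

    split_thread_dumps catalina.out
    split_thread_dumps catalina.out PSPermGen 'object space'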

Analyzing the Extracted Thread Dumps

For comprehensive analysis of the extracted thread dumps, you can use the following tools:

  • FastThread: This tool provides in-depth insights and visualizations, helping you pinpoint performance issues and deadlocks more effectively. Simply upload the extracted thread dump files for analysis.
  • Drauf’s Watson: A viewer that allows you to visualize and interpret thread dumps easily. Drop the extracted thread dump files into Watson for a comprehensive analysis.

By using these tools, you can gain valuable insights into your application’s performance and identify the root causes of issues more effectively.
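
Before handing the files to one of these tools, a quick pass on the command line can already show which dumps deserve a closer look. The sketch below counts thread states per extracted file. It assumes the files contain the usual HotSpot "java.lang.Thread.State:" lines and uses the jira_threads.*.txt names produced by the script above. The function name summarize_thread_states is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<file> <state> <count>" for every extracted dump.
# usage: summarize_thread_states <file>...
summarize_thread_states() {
    awk '
    # HotSpot prints one such line per Java thread, for example:
    #    java.lang.Thread.State: TIMED_WAITING (sleeping)
    $1 == "java.lang.Thread.State:" { count[FILENAME " " $2]++ }
    END { for (key in count) print key, count[key] }' "$@" | LC_ALL=C sort
}
```

A call over all extracted files could look like this:

    summarize_thread_states jira_threads.*.txt

A dump with an unusually high number of BLOCKED threads is often a good first candidate for closer inspection.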

Conclusion

Using this AWK script can greatly simplify and speed up the process of extracting and analyzing thread dumps from catalina.out.

I hope this information is helpful for someone out there. 🍻