Introduction to HANGFG
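Although some problems cannot be foreseen, in many cases a problem can be avoided if the early signs are spotted in time.
And when a problem does occur, it is useful to collect information about it after the event.
HANGFG is one of the recommended tools for collecting this kind of diagnostic data.
HANGFG (Hang File Generator) is a set of Unix shell scripts that automates the generation and collection of hanganalyze and systemstate trace files.
HANGFG generates and collects hang trace files according to the impact that taking diagnostic traces has on a system that is already in a degraded state.
The overall decision about how much impact is acceptable is left to the user, because the impact level is passed to the tool as an argument.
HANGFG can also make this decision for the user when light or medium impact (option 1 or 2) is selected: it checks whether a hanganalyze level 4 trace can be taken with minimal impact before collecting one.
HANGFG is RAC aware and runs in both RAC and non-RAC environments.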
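One of the main difficulties in diagnosing a database hang is collecting the necessary diagnostic data while the problem is actually happening.
Because it takes time to recognize the problem, decide which data to collect, and work out how to collect it, the necessary diagnostics are rarely captured.
Often the problem has already passed, or the database had to be shut down to clear it.
HANGFG automates the generation and collection of hang diagnostic traces.
During a database hang the user only needs to run one diagnostic program, HANGFG.
HANGFG issues the commands that generate the traces, so the user does not need to know the individual cryptic commands or issue them against the database by hand.
Before issuing any command that might degrade the system further, HANGFG also checks how badly performance has already degraded.
Finally, HANGFG collects every trace file generated by these commands, together with any other Oracle trace files updated since the first hang diagnostic was taken.
This greatly increases the chance of collecting all the necessary hang-related traces in one pass, which reduces the back-and-forth requests for additional diagnostic data between the user and Oracle Support.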
Supported platforms
Solaris
Linux
HP-UX
AIX
Tru64
Downloading HANGFG
HANGFG User Guide (Doc ID 362094.1)
Alternatively, follow the 《IT小Chen》 WeChat official account and send the keyword "HANGFG" in a private message to get a download link.
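Installing HANGFG
HANGFG can be downloaded as a tar file from MetaLink or through the 《IT小Chen》 official account. Copy the tar file to the directory where you want to install HANGFG and issue the following command: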
tar xvf hangfg.tar
This creates a directory named hangfg and extracts the hangfg files into it.
Then use the chmod command to give these files execute permission.
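The two installation steps can be combined into a small helper. This is only a sketch: the function name install_hangfg and its two parameters are made up for illustration and are not part of HANGFG, and it assumes the tar file unpacks into a hangfg subdirectory as described above.

```shell
#!/bin/sh
# Sketch: unpack hangfg.tar into a target directory and make the scripts executable.
# install_hangfg is a hypothetical helper name, not something shipped with HANGFG.
install_hangfg() {
    tarball=$1      # path to hangfg.tar
    target=$2       # directory that will contain the hangfg/ subdirectory
    [ -f "$tarball" ] || { echo "tar file not found: $tarball" >&2; return 1; }
    # Make the tar path absolute so it is still valid after the cd below.
    case $tarball in
        /*) ;;
        *) tarball=$PWD/$tarball ;;
    esac
    mkdir -p "$target" || return 1
    # Extract in a subshell so the caller's working directory is unchanged.
    ( cd "$target" && tar xf "$tarball" ) || return 1
    # Give the owner execute permission on every shell script in hangfg/.
    chmod u+x "$target"/hangfg/*.sh
}
```

For example, `install_hangfg ./hangfg.tar "$HOME/tools"` would leave the scripts in `$HOME/tools/hangfg`.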
Uninstalling HANGFG
To uninstall HANGFG, issue the following command from the directory that contains the hangfg directory:
rm -rf hangfg
Running HANGFG
To run HANGFG standalone, change to the hangfg directory and issue the following command:
./hangfg.sh
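The script takes one argument, which controls the number and level of detail of the hang trace files generated.
ARG1 = level of impact (1, 2 or 3).
ARG1 = 1:
Light impact on the system.
This option collects 2 hanganalyze level 3 traces and then determines whether it can also collect 1 hanganalyze level 4 trace with minimal impact on the system.
If so, it collects the hanganalyze level 4 trace. If not, it collects no additional trace file.
ARG1 = 2:
Medium impact on the system (the default).
This option collects 1 hanganalyze level 3 trace and then determines whether it can also collect 2 hanganalyze level 4 traces with minimal impact on the system.
If so, it collects the 2 additional hanganalyze level 4 traces.
If not, it collects 1 additional hanganalyze level 3 trace.
This option also collects 1 systemstate level 258 trace.
ARG1 = 3:
Heavy impact on the system.
This option collects 2 hanganalyze level 4 traces and 2 systemstate level 258 traces.
If no argument is given, the script runs with the default value of 2, which collects trace files with a medium impact on the system.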
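Example 1:
./hangfg.sh 3
This starts the tool and collects the traces with the heaviest impact on the system (2 hanganalyze level 4 traces and 2 systemstate dumps).
Example 2:
./hangfg.sh
Calling HANGFG without an argument uses the default of medium impact (1 hanganalyze level 3 trace, then additional hanganalyze level 3 or level 4 traces, and finally 1 systemstate dump).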
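To make the three impact levels and the default easier to compare, here is a small lookup sketch. It only prints the plan described above and does not reproduce the internals of hangfg.sh. The function name plan_traces and the wording of its output are my own.

```shell
#!/bin/sh
# Sketch only: print the trace plan described in the text for an impact level.
# plan_traces is a hypothetical name; hangfg.sh itself is not called here.
plan_traces() {
    level=${1:-2}    # no argument means medium impact, as in the text
    case $level in
        1) echo "2 x hanganalyze 3; then 1 x hanganalyze 4 if impact is minimal" ;;
        2) echo "1 x hanganalyze 3; then 2 x hanganalyze 4 or 1 x hanganalyze 3; 1 x systemstate 258" ;;
        3) echo "2 x hanganalyze 4; 2 x systemstate 258" ;;
        *) echo "invalid impact level: $level (expected 1, 2 or 3)" >&2; return 1 ;;
    esac
}
```

Calling `plan_traces` with no argument prints the same plan as `plan_traces 2`, mirroring the default behaviour of `./hangfg.sh`.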
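RAC environments:
HANGFG can also be used in RAC environments.
If the node you are running on is part of a RAC cluster, the hanganalyze and systemstate dump commands are issued with the -g option.
If the node where you are running hangfg is itself hung, you are notified and asked to run hangfg from another node in the cluster.
HANGFG finds all the files on all nodes in the cluster and lists them, together with the node name, in the file hangfiles.out.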
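HANGFG bundles the trace files from all nodes into one compressed tar file named hfiles_MMDDYYHHMI.tar.Z, where MMDDYYHHMI is the date and time the file was generated. (The README and the test run in this post show the name as hfiles.tar.Z.)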
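The difference between the cluster-wide and the local form of the command can be sketched as follows. The `-g all` form is taken from the commented-out line in haLocalLevel.sh shown later in this post. The detection rule (looking for olsnodes or lsnodes on the PATH) is an assumption based on the hangfg.log excerpt, and both function names are hypothetical.

```shell
#!/bin/sh
# Assumption: a node counts as clustered if olsnodes or lsnodes is on the PATH.
# The hangfg.log excerpt shows the tool looking for both; its real rule may differ.
cluster_tools_present() {
    command -v olsnodes >/dev/null 2>&1 || command -v lsnodes >/dev/null 2>&1
}

# Print the oradebug line for a hanganalyze at the given level.
# Scope "rac" adds "-g all"; any other scope targets the local instance only.
hanganalyze_cmd() {
    scope=$1
    level=$2
    if [ "$scope" = "rac" ]; then
        echo "oradebug -g all hanganalyze $level"
    else
        echo "oradebug hanganalyze $level"
    fi
}
```

A caller could pick the scope with `if cluster_tools_present; then scope=rac; else scope=local; fi` and then print the line with `hanganalyze_cmd "$scope" 3`.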
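Output files
HANGFG creates the hang trace files in the respective udump and bdump directories.
In addition, the following files are generated in the directory where hangfg is run.
(1) hangfiles.out:
This file contains a list of all the files generated during the hang.
This is the list of files you need to send to Support for analysis.
(2) hangfg.log
This file contains the log of the hangfg program.
It records all actions taken by hangfg and is useful for debugging any issues that arise.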
(3) hfiles_MMDDYYHHMI.tar.Z
This file is a compressed tarball of all the hang traces collected by the hangfg program. It also contains the file hangfiles.out.
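Before sending the tarball to Support it is worth checking what it actually contains. The sketch below lists the archive members without extracting them. The function name list_hfiles is hypothetical, the default file name hfiles.tar.Z is the one produced in the test run in this post, and decompressing with gzip is my own choice (gzip can read files written by compress).

```shell
#!/bin/sh
# Sketch: list the members of the compressed tarball produced by hangfg.
# list_hfiles is a hypothetical helper; gzip -dc is used because it can
# also decompress the .Z format written by compress.
list_hfiles() {
    archive=${1:-hfiles.tar.Z}
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
    # Read from stdin so gzip detects the format from the data, not the suffix.
    gzip -dc < "$archive" | tar tf -
}
```

Run it as `list_hfiles` in the hangfg directory, or pass the path of the archive as the first argument.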
Test procedure
Operating system: Oracle Linux Server 7.5
Database: Oracle 19.22, single instance
Upload hangfg.tar and extract it:
[oracle@cjc-db-02 tools]$ tar -xvf hangfg.tar
List the files:
[oracle@cjc-db-02 tools]$ ls -lrth hangfg
total 72K
-rwxr-xr-x 1 oracle oinstall 58 Apr 9 2007 test_rconnection.sh
-rwxr-xr-x 1 oracle oinstall 69 Apr 9 2007 test_rconnection2.sh
-rwxr-xr-x 1 oracle oinstall 86 Apr 9 2007 test_connection.sh
-rwxr-xr-x 1 oracle oinstall 366 Apr 9 2007 hq.sh
-rwxr-xr-x 1 oracle oinstall 305 Apr 9 2007 haLocalLevel.sh
-rwxr-xr-x 1 oracle oinstall 312 Apr 9 2007 haLevel.sh
-rwxr-xr-x 1 oracle oinstall 285 Oct 25 2012 ss.sh
-rwxr-xr-x 1 oracle oinstall 279 Oct 25 2012 ssLocal.sh
-rwxr-xr-x 1 oracle oinstall 30K Oct 25 2012 hangfg.sh
-rw-r--r-- 1 oracle oinstall 4.1K Oct 25 2012 README
View the README
The README describes the HANGFG tool in detail. The original text follows:
[oracle@cjc-db-02 tools]$ more hangfg/README
######################################################################
HANGFG (Hang file generator) is a series of unix shell scripts used
to automate the generation and collection of hanganalyze and
systemstate trace files. HANGFG generates and collects hang trace
files based on the impact of taking diagnostic traces on a system
which is already in a degraded state. The overall decision on what
level of impact the user can afford is left up to the user when he
runs HANGFG, as the level of impact is passed in as an argument to
the tool. HANGFG is also capable of making this decision for the user
if the user selects light impact (option 1) as an argument to the tool.
HANGFG is RAC aware and can run in either a RAC or non RAC environment.
######################################################################
INSTALLATION:
Once the tool has been downloaded to the directory you wish to install,
untar the hangfg.tar file. This will create a directory called hangfg.
The hangfg files are then untared into this new directory. Next, make
sure to change the file permissions on these files to execute by using
the chmod command.
######################################################################
RUNNING HANGFG:
If running hangfg on a RAC cluster you must run hangfg as the oracle
user. To run the HANGFG utility, execute the hangfg.sh shell script.
This script takes 1 argument which controls the number and level of
detail of the hang trace files generated.
ARG1 = level of impact (1, 2 or 3).
1) Light impact on system. This option collects 2 hanganlyze level 3
traces and then determines whether it can also collect 1 hanganalyze
level 4 trace with minimal impact to the system. If so, it collects
the hanganalyze level 4 trace. If not, it does not collect an
additional trace file.
2) Medium impact on system (default value). This option collects 1
hanganlyze level 3 trace and then determines whether it can also
collect 2 hanganalyze level 4 traces with minimal impact to the
system. If so, it collects the 2 additional hanganalyze level 4
traces. If not, it collects 1 additional hanganalyze level 3 trace.
This option also collects 1 systemstate level 258 trace.
3) Heavy impact on system. This option collects 2 hanganalyze level 4
traces and 2 systemstate level 258 traces.
If you do not enter any argument the script runs with a default value
of 2 meaning collect trace files with a medium impact on the system.
Example 1:
./hangfg.sh 3
This would start the tool and collect traces with the heaviest impact
on the system ( 2 hanganalyze level 4 traces and 2 systemstate dumps).
Example 2:
./hangfg.sh
This would use the default values of 2 and collect traces with a
medium impact on the system.
######################################################################
RAC ENVIRONMENTS:
HANGFG is rac aware. If the node you are running on is part of a RAC
cluster the hanganalyze and systemstate dump commands are issued with
the -g option. If the current node where you are running hangfg is
hung you will be given a notification and asked to run hangfg from
another node in the cluster. HANGFG finds all files on all nodes in
the cluster and lists these files along with nodename in the file
hangfiles.out. HANGFG bundles all the trace files from all the nodes
and puts them into one compressed tar file hfiles.tar.Z.
######################################################################
OUTPUT FILES:
HANGFG creates hang trace files in the respective udump and bdump
directories. In addition the following files are also generated in the
directory where you are running hangfg.
1) hangfiles.out. This file contains a list of all files generated
during the hang. This is the list of files you need to send into
support for analysis.
2) hangfg.log. This file contains a log of the hangfg program. This
file is useful for debugging any issues they may arise along with
logging all actions associated with the hangfg program.
3) hfiles.tar.Z This file contains a compressed tarball of all hang
traces collected by the hangfg program. It also contains the file
hangfiles.out.
Test:
./hangfg.sh 1
The console output is as follows:
Starting Hang File Generator V 1.2.0 on Sun May 25 18:24:24 CST 2025
HANGFG - Written by Carl Davis, Center of Expertise, Oracle Corporation
rm: cannot remove ‘*.tmp’: No such file or directory
Searching for udump/bdump...
/oracle/db/product/19.0.0/rdbms/log
/oracle/db/product/19.0.0/rdbms/log
Database connection established.
Skipping remote node file collection...
Treating collection as single node (non RAC).
Please wait. File operations in progress...
Processing Light Impact Hang Trace Collection...
Starting HangAnalyze Trace. Please Wait...
Statement processed.
Statement processed.
Hang Analysis in /oracle/db/diag/rdbms/chen/chen/trace/chen_ora_5913.trc
Please wait. File operations in progress...
Starting HangAnalyze Trace. Please Wait...
Statement processed.
Statement processed.
Hang Analysis in /oracle/db/diag/rdbms/chen/chen/trace/chen_ora_6044.trc
Please wait. File operations in progress...
HA Filename=/oracle/db/diag/rdbms/chen/chen/trace/chen_ora_6044.trc
blockers=
Blocker info missing. Cannot continue with level 4 dump.
Copying files to hangFileArchive...
Creating tarball of all hangfiles...
hangFileArchive/
Program hangfg terminated successfully.
More information contained in hangfg.log
List the generated files:
[oracle@cjc-db-02 hangfg]$ ls -lrth
......
drwxr-xr-x 2 oracle oinstall 6 May 25 18:24 hangFileArchive
-rw-r--r-- 1 oracle oinstall 402 May 25 18:24 hq.tmp
-rw-r--r-- 1 oracle oinstall 0 May 25 18:24 conn.test
-rw-r--r-- 1 oracle oinstall 115 May 25 18:26 hanganalyze.tmp
-rw-r--r-- 1 oracle oinstall 253 May 25 18:26 hfiles.tar.Z
-rw-r--r-- 1 oracle oinstall 2.1K May 25 18:26 hangfg.log
The archive directory (empty in this test) and the log are as follows:
[oracle@cjc-db-02 hangfg]$ ls -lrth hangFileArchive/
total 0
[oracle@cjc-db-02 hangfg]$ cat hangfg.log
Sun May 25 18:24:24 CST 2025 Program hangfg.sh started on host cjc-db-02 for db instance chen
Sun May 25 18:24:34 CST 2025 Database connection established.
Sun May 25 18:24:34 CST 2025 Directory udump found =/oracle/db/product/19.0.0/rdbms/log
Sun May 25 18:24:34 CST 2025 Directory bdump found =/oracle/db/product/19.0.0/rdbms/log
Sun May 25 18:24:34 CST 2025 Looking for file markers prior to trace generation...
Sun May 25 18:24:34 CST 2025 Latest file in udump= /oracle/db/product/19.0.0/rdbms/log/hang.fm
Sun May 25 18:24:34 CST 2025 Latest file in bdump= /oracle/db/product/19.0.0/rdbms/log/hang.fm
Sun May 25 18:24:34 CST 2025 Looking for ssh...
Sun May 25 18:24:34 CST 2025 ssh found.
Sun May 25 18:24:34 CST 2025 Looking for lsnodes...
Sun May 25 18:24:34 CST 2025 lsnodes not found.
Sun May 25 18:24:34 CST 2025 Looking for olsnodes...
Sun May 25 18:24:34 CST 2025 olsnodes not found.
Sun May 25 18:24:34 CST 2025 Skipping remote node file collection.
Sun May 25 18:24:34 CST 2025 Treating collection as single node (non RAC).
Sun May 25 18:25:34 CST 2025 Processing Light Impact Hang Trace Collection...
Sun May 25 18:25:34 CST 2025 Starting hanganalyze Level 3 trace.
Sun May 25 18:25:35 CST 2025 Completed hanganalyze Level 3 trace.
Sun May 25 18:26:05 CST 2025 Starting hanganalyze Level 3 trace.
Sun May 25 18:26:05 CST 2025 Completed hanganalyze Level 3 trace.
Sun May 25 18:26:35 CST 2025 Testing for feasability to take HA level 4 trace...
Sun May 25 18:26:35 CST 2025 HA Filename=/oracle/db/diag/rdbms/chen/chen/trace/chen_ora_6044.trc
Sun May 25 18:26:35 CST 2025 Blockers=
Sun May 25 18:26:35 CST 2025 Blocker info missing. Cannot continue with level 4 dump.
Sun May 25 18:26:35 CST 2025 Searching udump on local node for hang trace files...
Sun May 25 18:26:35 CST 2025 Searching bdump on local node for hang trace files...
Sun May 25 18:26:35 CST 2025 Copying files to hangFileArchive...
Sun May 25 18:26:35 CST 2025 Creating tarball of all hangfiles...
Sun May 25 18:26:35 CST 2025 Program hangfg terminated successfully...
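Two things are worth pulling out of this log: which trace file the feasibility check looked at, and whether the level 4 dump was skipped. A minimal sketch, assuming the log lines keep the format shown above. The function names are hypothetical.

```shell
#!/bin/sh
# Print the trace file named on each "HA Filename=" line of a hangfg log.
ha_trace_files() {
    sed -n 's/.*HA Filename=//p' "${1:-hangfg.log}"
}

# Succeed if the log shows that the hanganalyze level 4 dump was skipped.
level4_skipped() {
    grep -q 'Cannot continue with level 4 dump' "${1:-hangfg.log}"
}
```

Both functions default to hangfg.log in the current directory, so `ha_trace_files` and `level4_skipped && echo "level 4 not taken"` can be run straight after the tool finishes.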
Script analysis:
hangfg.sh ultimately calls the haLocalLevel.sh script to run the hanganalyze command.
[oracle@cjc-db-02 hangfg]$ more haLocalLevel.sh
#!/bin/sh
#oradebug -g all hanganalyze $1
sqlplus -S /nolog <<END
connect / as sysdba
set echo off
set feedback off
set arraysize 4
set pagesize 0
set pause off
set linesize 200
set verify off
set head off
spool hanganalyze.tmp
oradebug setmypid
oradebug unlimit
oradebug hanganalyze $1
spool off
END
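The core of the script is just three oradebug commands, so the same trace can be taken by hand. The sketch below only prints the SQL*Plus input. The function name hanganalyze_sql is hypothetical, and the formatting and spool settings of the original script are left out for brevity.

```shell
#!/bin/sh
# Sketch: print the SQL*Plus input for one local hanganalyze at a given level,
# using the same oradebug commands as haLocalLevel.sh.
hanganalyze_sql() {
    level=${1:-3}
    # Reject anything that is not a plain number before building the commands.
    case $level in
        ''|*[!0-9]*) echo "level must be a number: $level" >&2; return 1 ;;
    esac
    cat <<EOF
connect / as sysdba
oradebug setmypid
oradebug unlimit
oradebug hanganalyze $level
EOF
}

# Usage (requires a local Oracle environment and OS authentication):
#   hanganalyze_sql 3 | sqlplus -S /nolog
```

Printing the commands instead of running them makes it easy to review exactly what will be sent to the database before piping it into sqlplus.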
Reference:
HANGFG User Guide (Doc ID 362094.1)
###chenjuchao 20250525###