Monday, December 30, 2019

Oracle TFA Installation in a 2-Node Cluster

Oracle TFA - Issue and fix

1. Install TFA from one cluster node (node2 in the transcript below); the installer also installs it on the other node (node1).



       

[root@node2 TFA]# ./ahf_setup

AHF Installer for Platform Linux Architecture x86_64

AHF Installation Log : /tmp/ahf_install_382363_2020_04_30-08_32_06.log

Starting Autonomous Health Framework (AHF) Installation

AHF Version: 20.1.2 Build Date: 202004031134

Default AHF Location : /opt/oracle.ahf

Do you want to install AHF at [/opt/oracle.ahf] ? [Y]|N :

AHF Location : /opt/oracle.ahf

AHF Data Directory stores diagnostic collections and metadata.
AHF Data Directory requires at least 5GB (Recommended 10GB) of free space.

Choose Data Directory from below options :

1. /u01/app/grid [Free Space : 9323 MB]
2. Enter a different Location

Choose Option [1 - 2] : 2

Please Enter AHF Data Directory : /depot/TFA_DATA

AHF Data Directory : /depot/TFA_DATA/oracle.ahf/data

Do you want to add AHF Notification Email IDs ? [Y]|N : n

AHF will also be installed/upgraded on these Cluster Nodes :

1. node1

The AHF Location and AHF Data Directory must exist on the above nodes
AHF Location : /opt/oracle.ahf
AHF Data Directory : /depot/TFA_DATA/oracle.ahf/data

Do you want to install/upgrade AHF on Cluster Nodes ? [Y]|N :

Extracting AHF to /opt/oracle.ahf

Configuring TFA Services

Discovering Nodes and Oracle Resources

Do you want us to store the Password for Cells in Oracle Wallet: [Y]|N

Is password same for all Cells: [Y]|N

Please Enter Password for Cell:
Please Confirm Password for Cell:

Both Passwords should be the same...!!!

Please Enter Password for Cell:
Please Confirm Password for Cell:

Verifying Password...

.----------------------------------.
|  | EXADATA CELL | CURRENT STATUS |
+--+--------------+----------------+
'--+--------------+----------------'


Not generating certificates as GI discovered

Starting TFA Services

.-------------------------------------------------------------------------------------.
| Host              | Status of TFA | PID  | Port | Version    | Build ID             |
+-------------------+---------------+------+------+------------+----------------------+
| node2             | RUNNING       | 1077 | 5000 | 20.1.2.0.0 | 20120020200403113404 |
'-------------------+---------------+------+------+------------+----------------------'

Running TFA Inventory...

Adding default users to TFA Access list...

.--------------------------------------------------------------------------.
|                       Summary of AHF Configuration                       |
+-----------------+--------------------------------------------------------+
| Parameter       | Value                                                  |
+-----------------+--------------------------------------------------------+
| AHF Location    | /opt/oracle.ahf                                        |
| TFA Location    | /opt/oracle.ahf/tfa                                    |
| Exachk Location | /opt/oracle.ahf/exachk                                 |
| Data Directory  | /depot/TFA_DATA/oracle.ahf/data                        |
| Repository      | /depot/TFA_DATA/oracle.ahf/data/repository             |
| Diag Directory  | /depot/TFA_DATA/oracle.ahf/data/node2/diag             |
'-----------------+--------------------------------------------------------'


Starting exachk daemon from AHF ...

AHF install completed on node2

Installing AHF on Remote Nodes :

AHF will be installed on node1, Please wait.

Installing AHF on node1 :

[node1] Copying AHF Installer

[node1] Running AHF Installer

AHF binaries are available in /opt/oracle.ahf/bin

AHF is successfully installed

Moving /tmp/ahf_install_382363_2020_04_30-08_32_06.log to /depot/TFA_DATA/oracle.ahf/data/node2/diag/ahf/
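
The installer states that the AHF Data Directory needs at least 5GB free and must exist on every node. A minimal sketch of a pre-check that could be run on each node before starting the installer; has_free_mb is a hypothetical helper name, not part of AHF, and 5120 MB is simply the 5GB minimum from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of AHF): succeed if DIR exists and has
# at least MIN_MB megabytes of free space.
has_free_mb() {
  local dir=$1 min_mb=$2 free_mb
  # df -Pk prints a header plus one POSIX-format line; column 4 is available KB.
  free_mb=$(df -Pk "$dir" | awk 'NR == 2 { print int($4 / 1024) }')
  # An empty result means df failed, for example because DIR does not exist.
  [ -n "$free_mb" ] && [ "$free_mb" -ge "$min_mb" ]
}

# Example (run on every node):
#   has_free_mb /depot/TFA_DATA 5120 || echo "not enough space in /depot/TFA_DATA"
```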

       
 


2. If the installation succeeds, run the command below to check the status on all nodes.


       

[root@node2 TFA]# tfactl status

.---------------------------------------------------------------------------------------------------------.
| Host              | Status of TFA | PID   | Port | Version    | Build ID             | Inventory Status |
+-------------------+---------------+-------+------+------------+----------------------+------------------+
| node2             | RUNNING       |  1077 | 5000 | 20.1.2.0.0 | 20120020200403113404 | RUNNING          |
| node1             | RUNNING       | 28675 | 5000 | 20.1.2.0.0 | 20120020200403113404 | RUNNING          |
'-------------------+---------------+-------+------+------------+----------------------+------------------'

       
 
Issue: Inventory Status shows STOPPED on node1.
       

 [root@node1 ssh]# tfactl status

.----------------------------------------------------------------------------------------------------------.
| Host              | Status of TFA | PID    | Port | Version    | Build ID             | Inventory Status |
+-------------------+---------------+--------+------+------------+----------------------+------------------+
| node1             | RUNNING       | 101416 | 5000 | 20.1.2.0.0 | 20120020200403113404 | STOPPED          |
| node2             | RUNNING       |  89841 | 5000 | 20.1.2.0.0 | 20120020200403113404 | RUNNING          |
'-------------------+---------------+--------+------+------------+----------------------+------------------'
[root@node1 ssh]#
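
A stopped inventory is easy to overlook in a wide table. The sketch below filters tfactl status output for hosts whose Inventory Status column is anything other than RUNNING. The function name inventory_not_running is hypothetical, and the parsing assumes the seven-column table layout shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `tfactl status` output on stdin and print each
# host whose "Inventory Status" (7th table column) is not RUNNING.
inventory_not_running() {
  awk -F'|' '
    # Data rows split into at least 8 fields; border lines contain no pipes.
    NF >= 8 {
      host = $2; inv = $8
      gsub(/^[ \t]+|[ \t]+$/, "", host)   # trim padding around the host name
      gsub(/^[ \t]+|[ \t]+$/, "", inv)    # trim padding around the status
      # Skip the header row and tables without an inventory column.
      if (host != "Host" && inv != "" && inv != "RUNNING") print host
    }
  '
}

# Example:
#   tfactl status | inventory_not_running
```

With the output above, this would print node1 only.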

       
 

After several rounds of troubleshooting, we noticed that root login over SSH (PermitRootLogin) was not allowed.

1. Change the settings below in /etc/ssh/sshd_config on both nodes (node1 and node2):

PasswordAuthentication yes
PermitRootLogin yes


2. Restart the sshd service

service sshd restart

3. Uninstall TFA and install it again
       

[root@node2 ssh]# tfactl uninstall
Starting AHF Uninstall
NOTE : Uninstalling does not return all the space used by the AHF repository
TFA-00104 Cannot establish connection with TFA Server. Please check TFA Certificates
TFA-00104 Cannot establish connection with TFA Server. Please check TFA Certificates
AHF will be uninstalled on:
node2
node1 node2

Do you want to continue with AHF uninstall ? [Y]|N :

Stopping AHF service on local node node2...
Stopping TFA Support Tools...


TFA-00002 Oracle Trace File Analyzer (TFA) is not running
Stopping exachk scheduler ...
Removing exachk cache discovery....
No exachk cache discovery found.





Removed exachk from inittab



Stopping and removing AHF in node1...
TFA-00002 Oracle Trace File Analyzer (TFA) is not running
Removing exachk cache discovery....
No exachk cache discovery found.





Removed exachk from inittab


Successfully uninstalled AHF on node node1
Removing AHF setup on node2:
Removing /etc/rc.d/rc0.d/K17init.tfa
Removing /etc/rc.d/rc1.d/K17init.tfa
Removing /etc/rc.d/rc2.d/K17init.tfa
Removing /etc/rc.d/rc4.d/K17init.tfa
Removing /etc/rc.d/rc6.d/K17init.tfa
Removing /etc/init.d/init.tfa...
Removing /opt/oracle.ahf/jre
Removing /opt/oracle.ahf/common
Removing /opt/oracle.ahf/bin
Removing /opt/oracle.ahf/python
Removing /opt/oracle.ahf/analyzer
Removing /opt/oracle.ahf/tfa
Removing /opt/oracle.ahf/ahf
Removing /opt/oracle.ahf/exachk
Removing /opt/oracle.ahf/data/node2
Removing /opt/oracle.ahf/install.properties

[root@node2 ssh]#
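
Before reinstalling, it may be worth confirming that the sshd_config changes are in place on both nodes. The sketch below reads the first uncommented value of a directive from a config file, since sshd uses the first value it finds for each keyword. sshd_setting is a hypothetical helper; it assumes the plain "Keyword value" form and ignores Match blocks and Include files. For the effective configuration, sshd -T run as root is the more reliable check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first uncommented value of KEY in an
# sshd_config-style FILE. Keywords are matched case-insensitively.
sshd_setting() {
  local file=$1 key=$2
  # Commented lines start with "#", so their first field never equals KEY.
  awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file"
}

# Example (run on node1 and node2):
#   sshd_setting /etc/ssh/sshd_config PermitRootLogin
#   sshd_setting /etc/ssh/sshd_config PasswordAuthentication
```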

       
 

apt-key warning when running sudo apt update

Issue: an apt-key warning appears when running sudo apt update. Update the file below: cat /etc/apt/sources.list.d/download_docker_com_linux_ubuntu.list ...