Managing Complete Link Failure
If all available links for a domain on the I1 network fail, domain consoles and domain-initiated DR can still complete successfully because they can operate over a backup mechanism called the IOSRAM, which is not discussed here. Services that become unavailable when all links fail include time synchronization, SC-initiated DR, message logging from the domain, and network booting from the SC.
If all links on the I2 network fail, file synchronization between SCs is the most affected function. Heartbeat and failover still function because they do not rely solely on the I2 network.
When troubleshooting problems concerning the MAN network, it might be desirable to force a switch to another path. Perhaps intermittent problems on a given network path are causing a degradation in connectivity, but not enough to induce an automatic path switch. MAN provides a method for forcing a path change through ndd.
I1 Network Path Switch
For the I1 network, you can force a path switch from either the domain or the SC. Switchover from the domain is preferred because dman drives path recovery on the I1 network. Therefore, if the change is made to dman on the domain, it is immediately propagated to scman on the SC.
To perform the path switch, use the ndd commands:
domainA# ndd /dev/dman man_pathgroups_report
MAN Pathgroup report: (* == failed)
==================================================================
Interface  Destination  Active Path  Alternate Paths
dman0      Master SSC   eri0         eri0 exp 6, eri2 exp 12, eri3 exp 14
The report lists the available paths. Next, set the desired active path with the following command, which produces no output.
domainA# ndd -set /dev/dman man_set_active_path 0 0 2
The first parameter is the instance of the driver, the second is the number of the domain, and the third is the eri instance to make active. On the domain side, the first and second values are always 0, so the third value passed to man_set_active_path is the one that matters.
In the preceding example, eri2 is made active. The path change is propagated immediately to scman0. Furthermore, the path that was previously active is marked failed (*). (If that NIC is in fact healthy, dman clears the failed flag as part of its normal link policing.) Performing a manual path failover on the scman0 driver from the SC is similar but is not discussed here.
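The parameter rules above can be captured in a small wrapper. The following is a minimal sketch, not a supported tool: the function name i1_switch_cmd is hypothetical, and the function only prints the ndd command rather than running it, so the command can be reviewed before it is executed on the domain.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAN): print, but do not run, the ndd
# command that makes the given eri instance the active I1 path on a domain.
i1_switch_cmd() {
    eri_instance=$1
    # Accept only a non-negative integer such as 0, 2, or 3.
    case $eri_instance in
        ''|*[!0-9]*)
            echo "usage: i1_switch_cmd <eri-instance>" >&2
            return 1
            ;;
    esac
    # On the domain side, the driver instance and domain number are always 0.
    echo "ndd -set /dev/dman man_set_active_path 0 0 $eri_instance"
}

# Example: build the command that makes eri2 the active path.
i1_switch_cmd 2
```

Check the man_pathgroups_report output first to confirm that the chosen eri instance appears in the list of alternate paths.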
I2 Network Path Switch
The I2 network has similar manual path switch commands.
To perform the path switch, use the ndd commands:
sc# ndd /dev/scman man_pathgroups_report
MAN Pathgroup report: (* == failed)
=========================================================================
Interface  Destination          Active Path  Alternate Paths
-----------------------------------------------------------------------------
scman      C 8:0:20:be:f4:ed    eri8         eri8 exp 6, eri14 exp 12, eri16 exp 14
scman0     A 8:0:20:b7:2f:20    eri4         eri4 exp 2
scman1     Other SSC            eri0 exp 0, hme1 exp 0
The report lists the available paths. Next, set the desired active path with the following command, which produces no output.
sc# ndd -set /dev/scman man_set_active_path 1 0 1
The first parameter is the instance of the driver, the second is the number of the domain, and the third selects the NIC to make active. For the I2 network, the first parameter is always 1 and the second is always 0, so the third value passed to man_set_active_path is the one that matters (0 = eri0, 1 = hme1).
In the preceding example, hme1 is made active. There is a small delay while the scman drivers on the main and spare SCs converge on the new active path. As with the I1 network, the previously active path is marked failed (*).
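The NIC numbering for the I2 network can be wrapped the same way. The following is a minimal sketch under the mapping given above (0 = eri0, 1 = hme1): the function name i2_switch_cmd is hypothetical, and the function only prints the ndd command so that it can be reviewed before it is run on the SC.

```shell
#!/bin/sh
# Hypothetical helper (not part of MAN): print, but do not run, the ndd
# command that makes the named NIC the active I2 path.
i2_switch_cmd() {
    # Translate the NIC name to the index that man_set_active_path expects.
    case $1 in
        eri0) nic_index=0 ;;
        hme1) nic_index=1 ;;
        *)
            echo "usage: i2_switch_cmd eri0|hme1" >&2
            return 1
            ;;
    esac
    # For the I2 network, the first parameter is always 1 and the second is 0.
    echo "ndd -set /dev/scman man_set_active_path 1 0 $nic_index"
}

# Example: build the command that makes hme1 the active path.
i2_switch_cmd hme1
```

Accepting the NIC name rather than a bare number makes it harder to pass the wrong index by mistake.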
Typical network troubleshooting techniques such as using ping, netstat, and snoop are effective for debugging problems with the I1 and I2 MAN networks.