Skip to main content

Highway stuck on startup when IPsec is enabled

Highway process can get stuck on startup due to DNS related race conditions for IPSec tunnels.

Issue ID: PLUGIN-2550

Last Updated: 2024-08-05

Introduced in Plugin Version: 128T-ipsec-client-3.6.1

Problem

The IPSec client plugin (3.6.1) attempted to correct a run time race condition between IPSec tunnel starting before DNS was fully operational in the IPSec namespace. Two changes were made to resolve this issue:

  1. A verification step was added to ensure successful DNS resolution before starting IPsec tunnel.
  2. A watch dog service was added to monitor IPSec tunnel services and restart them every 30 seconds if not.

During the initial boot process of a router with the IPSec plugin enabled, the WAN interfaces can take some time to fully come up, creating two interactions with the IPSec controller:

  • Each call to verify DNS resolution can take up to 6 mins 40 seconds to timeout since the network is unreachable.
note

During normal operation this verification process ranges from a few hundred milliseconds to a few seconds.

  • The watchdog, in an attempt to start the failed IPsec tunnel service, initiates another verify DNS resolution API call.

Due to the delayed initialization of the WAN interfaces, the IPSec controller would get backed up with health-check tasks. This backlog prevented the forwarding plane from initializing the device interfaces. In the instance of the reproduction, there was a backlog of 9 hours before the API call from the forwarding plane could be processed. This was verified by letting the reproduction system stay in the stuck state for more than 9 hours. The device interfaces eventually updated the operational status of the Broadband interfaces.

Release Notes

Resolved the startup race condition by enforcing stricter default timeouts for DNS operations.

Severity

Details

The potential impact of a software defect if encountered. Severity levels are:

  • Critical: Could severely affect service, capacity/traffic, and maintenance capabilities. May have a prolonged impact to the entire system.
  • Major: Could seriously affect system operation, maintenance, administration and related tasks.
  • Minor: Would not significantly impair the functioning or affect service.

Major

Status

Resolved

Resolved In

IPSEC Client Plugin Release 3.6.2

Product

SSR

Functional Area

IPSec Client Plugin

Workaround

Details

Juniper may provide a method to temporarily circumvent a problem; workarounds do not exist for all issues.

Disable IPSec client

While the issue does not happen in every environment, one option is to temporarily disable the IPSec plugin for planned reboot of devices.

Monitor and restart the IPSec controller

Once the system is in the problem state, the following steps can be performed from the linux shell to unblock the highway process:

  • systemctl stop ipsec-controller
  • Wait for the interfaces and peers over WAN interfaces to come up
  • touch /var/lib/128technology/plugins/ipsec/config.json
    • This will restart the IPSec controller and other necessary services