As a network consultant, I’ve encountered numerous scenarios where routing stability is critical. One of the key features I’ve come to appreciate is BGP Graceful Restart (GR). It’s a game-changer, allowing networks to maintain their integrity even during disruptions. Let’s dive into what BGP Graceful Restart is, why it matters, and how to leverage it effectively in your MikroTik environment.
What is BGP Graceful Restart?
Imagine you’re on a bus journey, and the driver suddenly needs to take a break. Instead of pulling over abruptly, they signal for another driver to take over seamlessly, ensuring that you and the other passengers continue your journey without a hitch. This smooth transition is akin to BGP Graceful Restart. It allows a BGP router to restart its control plane while its peers keep the routes they learned from it and the forwarding plane keeps moving packets, so data traffic continues to flow.
How BGP Graceful Restart Works: Step by Step
Graceful Restart (GR) allows a BGP router to undergo a restart without disrupting the flow of routing information. Here’s a detailed step-by-step breakdown of how it works:
1. Detecting a Restart
Graceful Restart Capability: When two BGP routers establish a session, they exchange their capabilities. If both routers support Graceful Restart, they acknowledge this during the BGP capability exchange phase.
Router Initiates Restart: At some point, one of the routers (let’s call it the “Restarting Router”) needs to restart, either due to maintenance or failure. However, since it advertised the GR capability, its peers know that it can preserve its forwarding state across the restart.
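To make the capability exchange more concrete, here is a minimal Python sketch of how the Graceful Restart capability (code 64 in RFC 4724) could be packed and parsed. The class and function names are my own for illustration and are not taken from any router or library; the layout shown is 4 bits of restart flags, a 12-bit restart time, then one 4-byte entry per address family.

```python
import struct
from dataclasses import dataclass

GR_CAPABILITY_CODE = 64  # Graceful Restart capability code (RFC 4724)


@dataclass
class GracefulRestartCapability:
    restart_state: bool  # R bit: set when the speaker has just restarted
    restart_time: int    # seconds, carried in a 12-bit field (0..4095)
    families: list       # list of (afi, safi, forwarding_preserved) tuples


def encode_gr_capability(cap):
    """Pack the capability as code, length, then the capability value."""
    if not 0 <= cap.restart_time <= 0x0FFF:
        raise ValueError("restart_time must fit in 12 bits")
    flags = 0x8 if cap.restart_state else 0x0
    body = struct.pack("!H", (flags << 12) | cap.restart_time)
    for afi, safi, preserved in cap.families:
        # The F bit (0x80) says forwarding state was kept for this family.
        body += struct.pack("!HBB", afi, safi, 0x80 if preserved else 0x00)
    return struct.pack("!BB", GR_CAPABILITY_CODE, len(body)) + body


def decode_gr_capability(data):
    """Parse bytes produced by encode_gr_capability back into the dataclass."""
    code, length = struct.unpack("!BB", data[:2])
    if code != GR_CAPABILITY_CODE:
        raise ValueError("not a Graceful Restart capability")
    body = data[2:2 + length]
    (head,) = struct.unpack("!H", body[:2])
    families = []
    for off in range(2, len(body), 4):
        afi, safi, fam_flags = struct.unpack("!HBB", body[off:off + 4])
        families.append((afi, safi, bool(fam_flags & 0x80)))
    restart_state = bool((head >> 12) & 0x8)
    return GracefulRestartCapability(restart_state, head & 0x0FFF, families)
```

For example, a router advertising a 120-second restart time for IPv4 unicast (AFI 1, SAFI 1) with forwarding state preserved would build GracefulRestartCapability(False, 120, [(1, 1, True)]) and pass it to encode_gr_capability.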
2. Notifying Peers
No Explicit Notification Needed: The Restarting Router does not have to send a special message as it goes down. Its peers notice that the TCP session has failed, and because the Graceful Restart capability was negotiated when the session came up, they treat the loss as a graceful restart rather than an ordinary failure.
Peers Retain Routes: The peers (referred to as the “Non-Restarting Routers”) do not discard the Restarting Router’s routes immediately. Instead, they keep those routes, mark them as stale, and start a restart timer based on the restart time the Restarting Router advertised.
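The peer-side behaviour in this step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the HelperRib class, its method names, and the use of plain numbers for time are all hypothetical, and a real BGP implementation tracks far more state.

```python
from dataclasses import dataclass, field


@dataclass
class HelperRib:
    # prefix -> {"peer": str, "stale": bool}
    routes: dict = field(default_factory=dict)
    # peer -> absolute time at which the restart timer expires
    restart_deadline: dict = field(default_factory=dict)

    def learn(self, prefix, peer):
        """Record a route received from a peer."""
        self.routes[prefix] = {"peer": peer, "stale": False}

    def on_session_down(self, peer, now, restart_time):
        """Keep the peer's routes, flag them stale, and arm the restart timer."""
        for route in self.routes.values():
            if route["peer"] == peer:
                route["stale"] = True
        self.restart_deadline[peer] = now + restart_time

    def on_timer_check(self, peer, now, session_established):
        """Flush stale routes if the peer did not return before the deadline."""
        deadline = self.restart_deadline.get(peer)
        if deadline is None or session_established or now < deadline:
            return 0
        doomed = [p for p, r in self.routes.items()
                  if r["peer"] == peer and r["stale"]]
        for prefix in doomed:
            del self.routes[prefix]
        del self.restart_deadline[peer]
        return len(doomed)
```

The return value of on_timer_check is the number of routes flushed, so a caller can tell whether the restart timer actually removed anything.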
3. Maintaining Routing Information
Stale Routes in Use: The Non-Restarting Routers continue to use the stale routes from the Restarting Router. This prevents any immediate impact on routing decisions and avoids potential traffic loss.
Forwarding Plane Continuity: The router’s forwarding plane continues to forward packets using the last known good routes while the control plane (BGP process) is being restarted. This ensures that traffic is not dropped during the restart.
4. Restarting the BGP Process
BGP Session Re-establishment: After the Restarting Router completes its restart process, it re-establishes the BGP session with its peers. This involves re-synchronizing its routing table and exchanging route updates.
Route Validation: As the Restarting Router sends its route updates, the Non-Restarting Routers compare the new routes with the stale ones and discard any stale routes that have been superseded by new updates.
5. Convergence and Cleanup
Clearing Stale Routes: Once the Restarting Router has finished re-sending its routes, which it signals with an End-of-RIB marker, any stale routes that were not refreshed are removed from the routing table.
Graceful Convergence: The network converges gracefully with minimal disruption, as new routes take over and traffic continues flowing smoothly.
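Steps 4 and 5 can be sketched from the peer's point of view as follows. This is a rough illustration rather than a full implementation: the RIB is a plain dictionary, the function names are hypothetical, and the End-of-RIB check covers only the IPv4 unicast form, which is an UPDATE message with no withdrawn routes, no path attributes, and no prefixes.

```python
BGP_HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type
BGP_UPDATE = 2       # message type code for UPDATE


def is_ipv4_end_of_rib(message):
    """True for an UPDATE that carries nothing: the IPv4 unicast End-of-RIB."""
    if len(message) != BGP_HEADER_LEN + 4 or message[18] != BGP_UPDATE:
        return False
    # Withdrawn Routes Length and Total Path Attribute Length are both zero.
    return message[19:23] == b"\x00\x00\x00\x00"


def apply_update(rib, prefix, next_hop):
    """A fresh announcement replaces the stale entry for the same prefix."""
    rib[prefix] = {"next_hop": next_hop, "stale": False}


def purge_stale(rib):
    """After End-of-RIB, drop every route that was never refreshed."""
    stale = sorted(p for p, r in rib.items() if r["stale"])
    for prefix in stale:
        del rib[prefix]
    return stale
```

A caller would run apply_update for each announcement received after the session comes back, then call purge_stale once is_ipv4_end_of_rib returns True for an incoming message.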
6. Completion of Graceful Restart
Session Resumption: The Restarting Router is now fully operational and in sync with its peers, having completed the BGP Graceful Restart without impacting network traffic.
Traffic Stability: Throughout the process, traffic was forwarded seamlessly, minimizing packet loss, latency, or downtime.
BGP with Graceful Restart vs. Without Graceful Restart
With Graceful Restart (GR)
When a BGP router undergoes a restart while Graceful Restart is enabled, several key processes occur:
Route Retention: The BGP session itself goes down during the restart, but peers that negotiated GR keep the routes they learned from the restarting router instead of discarding them. The forwarding plane on the restarting router keeps moving packets in the meantime.
Stale Route Information: Peers continue to forward traffic using the retained routes, which they mark as stale. This is crucial during the restart period, as it prevents immediate route withdrawal and maintains network stability.
Route Recalculation: Upon restart, the router does not immediately recalculate its routing table from scratch. Instead, it temporarily holds onto the routes it was previously advertising, reducing the potential for routing loops or blackholes.
Peer Awareness: The router does not need to announce the restart as it happens. Because GR was negotiated when the session was established, peers that see the session drop know the router is temporarily unavailable for route updates and keep using the existing routes until it returns or the restart timer expires.
Session Re-establishment: After the router completes its restart, it can quickly re-establish its BGP session with peers, allowing it to synchronize any new routing information and update its table as necessary without dropping traffic.
Impact: Minimal disruption in routing and data flow; quick recovery and synchronization with peers.
Without Graceful Restart
When a BGP router restarts without GR enabled, the following events occur:
Session Termination: The existing BGP sessions with peers are terminated, leading to the loss of TCP connections. This results in an immediate disruption of routing information exchanges.
Route Withdrawal: Peers immediately discard all routes they learned from the router, which has the same effect as the router withdrawing everything it previously advertised. This can cause transient routing instability, as other routers must detect the route loss and recalculate their routing tables accordingly.
Full Route Recalculation: Upon restart, the router must perform a full recalculation of its routing table. This process can be time-consuming, especially in larger networks, leading to potential delays in traffic delivery.
Traffic Impact: During the period when the router is restarting and recalculating its routes, packets may be dropped, leading to increased latency and potential network outages for users relying on those routes.
Increased Load on Peers: Other routers in the network may experience increased load due to the need to update their routing tables based on the withdrawal of routes. This can create cascading effects, especially in a large topology.
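The contrast between the two cases can be reduced to a small Python sketch of what a peer does the moment the session drops. The function name and the dictionary layout are assumptions made for illustration only.

```python
def handle_session_loss(rib, peer, gr_negotiated):
    """Return (new_rib, withdrawals) after the session to `peer` is lost.

    rib maps prefix -> {"peer": str, "stale": bool}. Withdrawals are the
    prefixes this router must now withdraw from its own neighbours.
    """
    kept, withdrawals = {}, []
    for prefix, route in rib.items():
        if route["peer"] != peer:
            kept[prefix] = dict(route)               # unrelated route, untouched
        elif gr_negotiated:
            kept[prefix] = {**route, "stale": True}  # retained and marked stale
        else:
            withdrawals.append(prefix)               # removed and propagated
    return kept, sorted(withdrawals)
```

With GR negotiated the withdrawal list stays empty, so nothing cascades to other routers; without it, every prefix learned from the lost peer has to be withdrawn downstream.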
In my experience with MikroTik, the implementation of BGP Graceful Restart in RouterOS v7.x is robust and user-friendly. Notably, GR is enabled by default, which simplifies the configuration process. There’s no need to toggle any settings; just ensure your peers also support this feature for optimal performance.
Furthermore, it’s worth noting that major peers, like Google, are now mandating that their partners support BGP Graceful Restart. This push highlights the increasing importance of GR in maintaining stable and efficient interconnections across the internet.
[Screenshot: MikroTik RouterOS v7.x local BGP capabilities]
[Screenshot: MikroTik RouterOS v7.x remote BGP capabilities]
Note: Currently there is no option to set the restart time in MikroTik RouterOS.
Advantages of Graceful Restart
Reduced Downtime: By letting peers keep a router’s routes while it restarts, GR minimizes potential downtime.
Maintained Traffic Flow: Network traffic can continue smoothly, preventing disruptions that could lead to customer dissatisfaction.
Enhanced Stability: Graceful Restart contributes to a more stable network environment, especially during planned maintenance or unexpected failures.
Comparison Table of BGP with and without Graceful Restart
| Feature | BGP with Graceful Restart | BGP without Graceful Restart |
|---|---|---|
| Session Handling | Session drops, but peers retain the router's routes until it returns | Session drops and peers discard the router's routes |
| Routing Table Stability | Uses stale routes during restart, minimizing disruption | Withdraws all routes, leading to instability |
| Impact on Peers | Peers continue to use the advertised routes without interruption | Peers experience immediate route withdrawal |
| Traffic Flow | Data packets continue flowing without delay | Potential packet loss and increased latency |
| Route Recalculation | Minimal; holds onto previously advertised routes | Full recalculation required after restart |
| Recovery Time | Quick recovery; sessions re-establish rapidly | Longer recovery; peers must re-establish routes |
| Notification to Peers | GR capability is negotiated in advance, so peers recognize the restart | No negotiation; peers treat the disconnection as a failure |
| Load on Network | Reduces load on network during failover | Increases load as peers update their routing tables |
| Routing Loops | Low risk of routing loops during restart | Higher risk of routing loops due to immediate withdrawal |
| Configuration Complexity | Typically simple, as it is often enabled by default (e.g., MikroTik, modern Cisco) | Requires careful planning to manage disruptions |
| Usage in Modern Networks | Increasingly mandated by major peers like Google | Not recommended for networks that prioritize uptime |
Table 1.1 – Comparison of BGP with and without Graceful Restart
Summary
Key Benefits of Graceful Restart: The table highlights how Graceful Restart contributes to network stability, minimal downtime, and improved traffic flow.
Considerations for Network Engineers: Understanding the differences can guide network engineers in their configuration choices, especially in environments where uptime is critical.
How to Enable Graceful Restart in Cisco IOS, Junos OS and Huawei Routers
Enabling Graceful Restart in Cisco IOS
In Cisco IOS, enabling Graceful Restart for BGP is straightforward. Follow these steps:
1. Access the Router Configuration: Enter global configuration mode on your Cisco router.
configure terminal
2. Enter BGP Configuration Mode: Specify the BGP router configuration.
router bgp [your_AS_number]
3. Enable Graceful Restart: Use the following command to enable Graceful Restart. The graceful-restart command makes the router advertise the Graceful Restart capability to its peers, so they preserve its routes during a restart. Because capabilities are exchanged when a session is established, existing sessions must be reset before the change takes effect.
bgp graceful-restart
4. Specify the Restart Time: Optionally, you can set a maximum time for the graceful restart period using:
bgp graceful-restart restart-time [seconds]
5. Exit Configuration Mode: Exit the configuration and save your changes.
end
write memory
Enabling Graceful Restart in Junos OS
1. Access Configuration Mode: First, access your Juniper device and enter configuration mode:
configure
2. Enter the Routing Protocol Configuration: Navigate to the BGP hierarchy level:
edit protocols bgp
3. Enable Graceful Restart: Use the following command to enable Graceful Restart for BGP. Junos also expects graceful restart to be enabled globally with "set routing-options graceful-restart" from the top of the hierarchy, so confirm that statement is present as well:
set graceful-restart
4. Set the Graceful Restart Time (Optional): Specify the restart time, in seconds, that the router advertises to its peers:
set graceful-restart restart-time [seconds]
5. Configure the Helper Mode (Optional): Junos normally assists restarting peers by default, so this step is often unnecessary. If your release provides an explicit helper statement, configure it as follows, checking the exact syntax against the CLI reference for your release:
set graceful-restart helper
6. Commit the Configuration: Once the changes are complete, commit the configuration and exit configuration mode:
commit and-quit
Enabling Graceful Restart in Huawei Routers
For Huawei routers, enabling Graceful Restart can be done through the following steps:
1. Access the Router Configuration: Enter the system view on your Huawei router.
system-view
2. Enter BGP Configuration Mode: Specify the BGP instance.
bgp [your_AS_number]
3. Enable Graceful Restart: Use the following command to enable Graceful Restart:
graceful-restart
4. Configure Restart Time (Optional): You can set a specific duration for the GR period with:
graceful-restart timer restart [seconds]
5. Exit Configuration Mode: Return to the user view and save your configuration changes.
return
save
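If you manage several platforms, the command sequences can be generated from a small script. The Python sketch below is one possible way to do that; the gr_config function and the vendor labels are hypothetical, and the command strings reflect my understanding of each CLI, so verify them against the vendor documentation for your software release before pushing anything to a router.

```python
def gr_config(vendor, asn, restart_time=None):
    """Return the CLI lines that enable BGP Graceful Restart for a vendor.

    vendor is one of "cisco-ios", "junos", or "huawei" (labels chosen for
    this sketch). restart_time is optional and given in seconds.
    """
    if vendor == "cisco-ios":
        lines = ["configure terminal", f"router bgp {asn}", "bgp graceful-restart"]
        if restart_time is not None:
            lines.append(f"bgp graceful-restart restart-time {restart_time}")
        lines += ["end", "write memory"]
    elif vendor == "junos":
        # Junos enables graceful restart globally; the AS number is not needed.
        lines = ["configure", "set routing-options graceful-restart"]
        if restart_time is not None:
            lines.append(
                f"set protocols bgp graceful-restart restart-time {restart_time}"
            )
        lines.append("commit and-quit")
    elif vendor == "huawei":
        lines = ["system-view", f"bgp {asn}", "graceful-restart"]
        if restart_time is not None:
            lines.append(f"graceful-restart timer restart {restart_time}")
        lines += ["return", "save"]
    else:
        raise ValueError(f"unsupported vendor: {vendor}")
    return lines
```

Calling print("\n".join(gr_config("cisco-ios", 65000, 120))) produces a block you can review and paste into a terminal session; AS 65000 is just a private-range placeholder.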
Conclusion
BGP Graceful Restart is an invaluable feature for maintaining network stability. By ensuring that routes and forwarding persist even while a BGP session is being restarted, it allows network professionals like us to provide reliable services to our clients.
In my journey as a network consultant, I’ve seen firsthand how GR can make a significant difference in uptime and performance. So, if you haven’t checked if your peers support Graceful Restart yet, now’s the time. Your network will thank you for it!