Path Protection and Restoration Methods

on network Net8   


Network Description

This network appears in the dissertation by Yu Liu (advisor: D. Tipper), in which several path protection and restoration methods are explained and compared. It is the largest network given in the document: 50 nodes and 82 spans/links. A single (bi-directional) demand is offered between each pair of nodes, a total of 1225 demands. Each demand is assigned a least-cost working path, where each link/span is assigned unit cost. Since this is a one-layer design, we do not distinguish between links and spans. 

Having at most one unit of demand between a pair of nodes precludes examining restoration methods that may use multiple backup up. This will be considered in detail for smaller networks.


Solution Techniques for Link Failure, Path Protection, with Link Diversity

Network restoration problems can be characterized by the following attributes: 

  • failure type: (link/span, node)
  • protection path type: (path, subpath, link)
  • protection path diversity: (link/span, node)


Here are some attributes to distinguish solution methods: 

  • feasible solution?: (yes, no)
  • working/protect: (independent, joint)
  • protection path dependency on fault isolation: (fault-independent, fault-dependent)
  • protection path selection: (predetermined, dynamic, finite)
  • spare capacity: (dedicated, shared)


 The follow table gives a short description of each method. More complete information can be found in Yu Liu's dissertation. The following key words will be used to distinguish the type of mesh restoration/protection problem: 

Solution Method Attributes Description
LP feasible=no, w/p=independent, dependency=fault-independent, selection=dynamic, spare=shared Linear programming relaxation of shared spare capacity protection for link-failures: for each flow using a single backup path that is link-and-tandem-node diverse from the working path. The network cost is a lower bound on the optimal spare cost for this type of restoration.
BB feasible=yes, w/p=independent, dependency=fault-independent, selection=finite, spare=shared Branch-and-Bound: using a predetermined set of candidate backup paths for each flow, use branch and bound to determine the minimal network cost. This is an close upper bound on the optimal spare cost, but normally requires too much execution time for large networks.
SA feasible=yes, w/p=independent, dependency=fault-independent, selection=finite, spare=shared Simulated Annealing: using a simulated annealing method with a predetermined set of paths
SSR feasible=yes, w/p=independent, dependency=fault-independent, selection=dynamic, spare=shared Successive Survivable Routing: Multiple iterations of the SR method
SR feasible=yes, w/p=independent, dependency=fault-independent, selection=dynamic, spare=shared Survivable Routing: Calculating the least-cost backup path for an ordered list of demands, where spare link costs are updated after each demand's backup path is assigned
SPI feasible=yes, w/p=independent, dependency=fault-independent, selection=dynamic, spare=shared Sharing with Partial Information: Similar to SR but using a simplified estimate of spare bandwidth sharing between failures
RAFT feasible=yes, w/p=independent, dependency=fault-independent, selection=predetermined, spare=shared Resource Aggregation for Fault Tolerant: Least hop backup path predetermined for each demand


In the category of 100% recovery from span/link failures, Dr. Liu cites the following results: 

LP BB SA SSR Min SSR Max SR Min SR Max SPI Min SPI Max RAFT
_ _ _ 2784 2827 2796 2835 3643 3896 3758


In the category of 100% recovery from node failures, the results are: 

LP BB SA SSR Min SSR Max SR Min SR Max SPI Min SPI Max RAFT
_ _ _ 2860 2900 2882 2947 3697 3890 3845

 


Comparison with St*Mesh

St*Mesh is based on a constraint programming approach to solving a broad range of mesh restoration problems. Although somewhat slowered than most heuristic algorithms, the constraint programming approach permitted a more direct modeling of network constraints, such as QoS requirements and spare capacity limitations. It was motivated, in part, to address the "demand order-bias" limitations of greedy heuristic algorithms such as Box-Nathan and SSR, where the order of demands strongly affects the design results. Primarily, though, it attempted to allow express links to be included in the link topology, which greedy heuristics could only accomodate in limited fashion with post-optimizations. Early attempts to use constraint programming on spare capacity design with significant cost modularities produced designs that were cost-effective but not spare-capacity efficient. St*Mesh represents an attempt to retain the advantages of the constraint programming methodology while being more competitive with current techniques on less-constrained spare capacity allocation problems. A design goal for St*Mesh was to be competitive (in solution quality) with the best greedy heuristics for non-expressed topologies.   

Like SSR, St*Mesh iterates multiple times through the list of demands. St*Mesh uses constraint programming heuristics to determine the order in which demands are processed, rather than the random ordering in SSR. A more detailed description of St*Mesh will be provided at a later date. 

Although we don't have the LP, BB, and SA results for this network, St*Mesh will typically produce results between BB and SSR for a sufficient number of iterations. The following table shows the relationship between iteration count and network cost for St*Mesh, as compared to SSR. 

Link Failure
LP BB SA St*Mesh(100) SSR Min St*Mesh Min(20) St*Mesh Max(20) St*Mesh Min(5) SSR Max St*Mesh Max(5)
_ _ _ 2767 2784 2788 2796 2826 2827 2834


St*Mesh results fall above and below those for SSR depending on the number of iterations. For networks smaller than Net8, a post-optimization routine can be enabled in St*Mesh that improves the 5-iteration results to the range of the 20-iteration results.  

Node failure has a larger impact on the network than link failure, as a node failure will cause all adjacent links to fail. So we would expect more spare capacity would be required for network restoration. But 100% recovery is not possible, as demands that originate or terminate from a failed node cannot be recovered. How should we account for these demands? Following Yu Liu's approach, we remove any demands, for a given node failure, that originate or terminate from that node. Since these demands do contribute spare capacity requirements to the link failure case, their removal partially offsets the increase in spare capacity associated with node failure restoration. 

Node Failure
BB St*Mesh(100) St*Mesh Min(20) SSR Min St*Mesh Max(20) SSR Max St*Mesh Min(5) St*Mesh Max(5)
_ 2833 2850 2860 2865 2900 2905 2923

 


St*Mesh Applied to Other Mesh Restorations Mechanisms

To be added for link failure problem: 1+1 dedicated path, 1+1 dedicated link, SBPP (link-diversity), SBPP (link and tandem-node diversity), MPLS LSP Recalculation, MPLS FRR (link diversity), and MPLS FRR (link and next node diversity). SBPP stands for Shared Backup Path Protection, where a single backup path is determined for each aggregate node-pair demand. The link-failure results above (for St*Mesh, SSR, SR, and SPI) are solutions to SBPP (link-diversity). The (link and tandem-node diversity) results will be added for comparison, as 1+1 and MPLS LSP calculations find backup paths that are tandem-node disjoint from the working path as well as link-disjoint.