12/03/2018

FB: Engineering Egress with Edge Fabric

https://research.fb.com/wp-content/uploads/2017/08/sigcomm17-final177-2billion.pdf?

PR (peering router) BGP connection types:
  • Transit: private link with dedicated b/w
  • Peers:
    • private peer: dedicated PNI (private network interconnect); roughly the same as transit?
    • public: via the public (IXP) fabric
    • route server: routes learned indirectly via the RS, traffic still crosses the public fabric
How prefixes are preferred:
  • Prefer peer routes over transit (via local_pref), with as_path length as the tiebreaker
    • Ingress and egress traffic over the same path
  • If still tied, private peer > public peer > route server peer, encoded in MED
    • to avoid cross-congestion over the shared public fabric
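
The preference order above can be sketched as a sort key. This is only an illustration, not the PRs' actual configuration: the `Route` fields, the local_pref numbers and the MED values per peer type are assumptions made up for the example, and it assumes MED is compared across different neighbors.

```python
from dataclasses import dataclass

# Hypothetical MED encoding of the peer type: lower MED wins.
PEER_TYPE_MED = {"private": 0, "public": 10, "route_server": 20, "transit": 30}


@dataclass(frozen=True)
class Route:
    prefix: str
    neighbor: str
    peer_type: str   # "private", "public", "route_server" or "transit"
    as_path: tuple   # AS numbers, nearest neighbor first

    @property
    def local_pref(self) -> int:
        # Assumed values: any peer route beats any transit route.
        return 100 if self.peer_type == "transit" else 200

    @property
    def med(self) -> int:
        return PEER_TYPE_MED[self.peer_type]


def preference_key(route: Route):
    # Higher local_pref first, then shorter AS path, then lower MED.
    return (-route.local_pref, len(route.as_path), route.med)


def best_route(routes):
    return min(routes, key=preference_key)


routes = [
    Route("203.0.113.0/24", "transit-a", "transit", (64500,)),
    Route("203.0.113.0/24", "ix-peer", "public", (64501, 64510)),
    Route("203.0.113.0/24", "pni-peer", "private", (64502, 64510)),
]
print(best_route(routes).neighbor)  # pni-peer
```

The transit route loses despite its shorter AS path because local_pref is compared first; the two peer routes tie on AS path length and the MED decides in favor of the private peer.
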
BGP multi-path vs. ECMP
  • Cisco BGP multipath doc
  • Path attributes that must match the best path for a route to be a multipath candidate
    • Weight
    • local_pref
    • as_path length
    • origin
    • MED
    • one of these:
      • neighbor AS or sub-AS
      • as_path
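
The checklist above can be written as a small predicate. This is a sketch of the rule as listed in these notes, not router code: the `Path` type and the `require_same_as_path` flag (choosing between the two "one of these" variants) are hypothetical names for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    weight: int
    local_pref: int
    as_path: tuple   # AS numbers, nearest neighbor first
    origin: str      # "igp", "egp" or "incomplete"
    med: int


def multipath_eligible(best: Path, cand: Path,
                       require_same_as_path: bool = False) -> bool:
    # All of these attributes must equal those of the best path.
    same_attrs = (
        cand.weight == best.weight
        and cand.local_pref == best.local_pref
        and len(cand.as_path) == len(best.as_path)
        and cand.origin == best.origin
        and cand.med == best.med
    )
    if not same_attrs:
        return False
    if require_same_as_path:
        # Stricter variant: the whole AS path must be identical.
        return cand.as_path == best.as_path
    # Default variant: only the neighboring AS must match.
    return cand.as_path[:1] == best.as_path[:1]


best = Path(0, 200, (64501, 64510), "igp", 0)
same_neighbor = Path(0, 200, (64501, 64520), "igp", 0)
other_neighbor = Path(0, 200, (64502, 64510), "igp", 0)

print(multipath_eligible(best, same_neighbor))                             # True
print(multipath_eligible(best, same_neighbor, require_same_as_path=True))  # False
print(multipath_eligible(best, other_neighbor))                            # False
```
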
BGP limitation
  • Not capacity-aware, and ECMP splits evenly
    • links of unequal capacity get equal load
  • Static BGP policy likely optimizes traffic, but
    • shorter as_path != better performance
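
A quick arithmetic check of the first limitation: an even ECMP split ignores link capacity. The demand and capacity figures are made up for the example.

```python
def ecmp_utilization(demand_gbps, capacities_gbps):
    # ECMP gives every link the same share, whatever its capacity.
    share = demand_gbps / len(capacities_gbps)
    return [share / cap for cap in capacities_gbps]


# 60 Gbps of demand over a 100 Gbps link and a 20 Gbps link.
util = ecmp_utilization(60.0, [100.0, 20.0])
print(util)  # [0.3, 1.5]
```

The large link sits at 30% while the small one is asked to carry 150% of its capacity, even though the two links together have plenty of headroom.
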
Avoid congestion
  • Input:
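    • All routes per prefix via BMP (BGP Monitoring Protocol); BGP itself only advertises the 1 best path
    • controller does its own best-path selection
    • sFlow, IPFIX: traffic info
    • SNMP: interface info
  • Output: BGP updates that override the PR's choice by using a higher local_pref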
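
The input/output loop above might look roughly like this. It is a greedy sketch for illustration, not the allocation algorithm from the paper: the function name, the 95% threshold, the local_pref value of 1000 and the data shapes are all assumptions.

```python
def compute_overrides(routes, traffic, capacity,
                      threshold=0.95, override_pref=1000):
    """routes:   prefix -> interfaces ordered by BGP preference (from BMP)
    traffic:  prefix -> Gbps (from sFlow/IPFIX)
    capacity: interface -> Gbps (from SNMP)"""
    # Project the load as if every prefix used its most-preferred route.
    load = {iface: 0.0 for iface in capacity}
    for prefix, ifaces in routes.items():
        load[ifaces[0]] += traffic.get(prefix, 0.0)

    overrides = []
    # Walk prefixes from heaviest to lightest.
    by_rate = sorted(routes, key=lambda p: traffic.get(p, 0.0), reverse=True)
    for prefix in by_rate:
        primary = routes[prefix][0]
        if load[primary] <= threshold * capacity[primary]:
            continue  # preferred interface is not overloaded
        rate = traffic.get(prefix, 0.0)
        for alt in routes[prefix][1:]:
            if load[alt] + rate <= threshold * capacity[alt]:
                load[primary] -= rate
                load[alt] += rate
                # Would be announced to the PR as a BGP update.
                overrides.append(
                    {"prefix": prefix, "via": alt, "local_pref": override_pref}
                )
                break
    return overrides


routes = {
    "198.51.100.0/24": ["pni", "transit"],
    "203.0.113.0/24": ["pni", "transit"],
    "192.0.2.0/24": ["pni"],
}
traffic = {"198.51.100.0/24": 6.0, "203.0.113.0/24": 3.0, "192.0.2.0/24": 4.0}
capacity = {"pni": 10.0, "transit": 100.0}

print(compute_overrides(routes, traffic, capacity))
# [{'prefix': '198.51.100.0/24', 'via': 'transit', 'local_pref': 1000}]
```

Here the PNI is projected at 13 Gbps against 10 Gbps of capacity, so the heaviest prefix is shifted to transit and the remaining 7 Gbps fits under the threshold.
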
Performance-based routing
  • servers set DSCP
  • PBR on PRs, 1 DSCP - 1 route (table?)
  • PR ISIS-SR/MPLS to ASW
  • eBPF - extended Berkeley Packet Filter
    • changes packets egressing the server
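
The "1 DSCP - 1 route table" idea can be pictured as a lookup keyed on the DSCP bits. This is a toy model only: the DSCP values and table names are invented, and real PBR is router configuration rather than Python.

```python
# Hypothetical mapping: each DSCP value selects one routing table on the PR.
DSCP_TO_TABLE = {0: "default", 12: "alt-path-1", 14: "alt-path-2"}


def dscp_from_tos(tos_byte: int) -> int:
    # DSCP is the upper six bits of the IPv4 ToS / IPv6 Traffic Class byte.
    return tos_byte >> 2


def select_table(dscp: int) -> str:
    # Unknown markings fall back to the default table.
    return DSCP_TO_TABLE.get(dscp, "default")


print(select_table(dscp_from_tos(0x30)))  # alt-path-1
print(select_table(dscp_from_tos(0xB8)))  # default
```
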
to be continued.....
