

      An Introduction to the OSI Networking Model

      Computer networking is a complicated subject, with many interconnected layers and interactions. To help developers and engineers understand how the various networking components work together, several conceptual models have been developed. The
      Open Systems Interconnection (OSI) Model is a popular model that divides the networking stack into seven layers. This guide explains the OSI Model and describes each layer. It also lists the tools available for each layer and contrasts the OSI Model with the competing
      Internet Protocol suite.

      What is the OSI Model?

      The OSI Model provides a method for understanding how end-to-end internet communications work. It deconstructs the networking process into seven layers, each representing a different step of the transmission chain. Each layer has its own function and is responsible for well-defined tasks. Most user data passes through each layer upon both ingress and egress.

      The OSI model was originally developed in the 1970s and 1980s under the oversight of the International Organization for Standardization (ISO). It is formalized in the ITU-T series of
      X.200 recommendations. The model is mainly conceptual in nature and models the network at a high level of abstraction. It is designed to encourage a shared consensus of network standards and interoperability. While it has never been fully applied, it has gained popularity as a good educational model.

      The OSI Model originally included a number of network protocols to implement each of the different layers. However, these protocols were determined to be too complex and difficult to implement. They also involved too drastic a change to established practices. Therefore, they were never adopted, and protocols from the Internet Protocol (IP) suite were used instead. The standard network protocols in use today show more complexity and do not perfectly align with the OSI model.

      The seven layers are numbered from lowest to highest. The highest layer is closest to the user applications, while the lowest relates to physical transmission. User data passes sequentially from the highest layer down through the lower layers until the device transmits it externally.

      The OSI model encourages a strict encapsulation model. Data from a higher layer becomes part of the lower layer message. The data packet received from the higher layer is known as a service data unit (SDU). The lower layer prepends a header to the SDU. In some cases, it might also append a footer. The header and footer contain information intended for the peer layer of the receiving device. After the additional information is concatenated to the original packet from the higher layer, the message is called a protocol data unit (PDU). The PDU is designed to be processed at the same layer on the destination node. This continues until the data reaches the physical layer. At this point, it is converted to a bitstream and physically transmitted to the receiver.

      On the incoming side, the order is reversed. Traffic is first received at the physical layer. It then passes upward one layer at a time. At each layer, the receiving layer reviews the information in the header and removes the encapsulating material. If necessary, the packet is then passed to the higher layer. This process continues until the packet is completely consumed.
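
      The encapsulation and decapsulation steps described above can be sketched in a few lines of Python. This is only an illustration: the layer names and the bracketed text headers are hypothetical stand-ins for real protocol headers, and footers are omitted.

```python
# Hypothetical layers and header format, chosen only for illustration.
LAYERS = ["transport", "network", "data-link"]


def encapsulate(payload: bytes) -> bytes:
    """Wrap the payload on the way down the stack, highest layer first."""
    pdu = payload
    for layer in LAYERS:
        sdu = pdu                        # what this layer receives from above
        header = f"[{layer}]".encode()
        pdu = header + sdu               # header + SDU = this layer's PDU
    return pdu


def decapsulate(pdu: bytes) -> bytes:
    """Strip the headers on the way up the stack, lowest layer first."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not pdu.startswith(header):
            raise ValueError(f"missing {layer} header")
        pdu = pdu[len(header):]          # remove the encapsulating material
    return pdu


frame = encapsulate(b"hello")
print(frame)               # b'[data-link][network][transport]hello'
print(decapsulate(frame))  # b'hello'
```

      Each layer only inspects its own header, which is why the receiving side can unwrap the message in the reverse order.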

      The OSI Layer Architecture

      The seven layers, from lowest to highest, are listed below. Each layer is described in a separate section later in this guide.

      1. Physical Layer
      2. Data Link Layer
      3. Network Layer
      4. Transport Layer
      5. Session Layer
      6. Presentation Layer
      7. Application Layer

      The mnemonic “All People Seem To Need Data Processing” can be used to remember the layers from highest to lowest. Not all data flows begin at the application layer. Lower layers negotiate automatically after they are configured, even if they are not serving any higher-layer application. Additionally, packets might only be partially processed by intermediate devices. For example, a core router examines packets at the network layer. It then forwards the packet, sending it back to the data link and physical layers to be transmitted.

      Each of the seven layers within the OSI Model has its own set of responsibilities. The layers are numbered from the lowest layer, the physical layer, to the high-level application layer. Egress data passes from higher to lower layers. Ingress data travels in the opposite direction, from the lowest layer to the upper layers.

      Layer 1: The Physical Layer

      The lowest layer is responsible for transmitting data to another device using some type of physical medium. It handles characteristics of the physical connection between nodes. All networked devices, from high-end network routers, mobile phones, and laptops down to simple repeaters, must use physical layer technologies to communicate with other devices. The physical layer converts data packets into a signal representing a stream of bits.

      This signal can be transmitted using a variety of techniques, including electrical, optical, and wireless encoding. Some examples of physical layer technologies include Wi-Fi, Ethernet, USB, and SONET/SDH. This layer is usually implemented in hardware on a chip, rather than in software. Physical layer standards usually include hardware specifications for the pin layout, cable attributes, and data encoding. However, some attributes might be software controlled, including physical duplex and framing.

      Physical layer protocols are responsible for implementing the following functionality:

      • Voltage levels
      • Physical data rates
      • Physical connector specifications
      • Maximum transmission length
      • Modulation or channel access
      • Framing and bit stuffing
      • Signal timing and frequency
      • Transmission mode/duplex
      • Auto-negotiation

      Many transmission standards specify details for both the physical and data link layers. The Ethernet standard is a good example.

      Physical Layer Tools

      In a lab environment, a multimeter or oscilloscope can verify signal quality and compliance. In a real-world setting, physical layer problems are difficult to debug with software alone. A trial-and-error process of swapping out cables, connectors, and physical ports is often required. If a cable is flaky or defective, replace it.

      Layer 2: The Data Link Layer

      The data link layer is responsible for transferring data between two nodes that are either directly connected or lie within the same network. To send data to a different network, network layer functionality is required. Layer two protocols can often detect, and in some cases correct, physical layer errors using checksums and bit correction algorithms. At the data link layer, data is transported inside a frame. A network switch is an example of a data link layer device.

      Layer two specifications explain how to establish a connection and transmit data to another node. The Institute of Electrical and Electronics Engineers (IEEE) organization defines many of the data link specifications under the
      IEEE 802 family of standards. Some of these standards include Ethernet, Wireless LAN, Bluetooth, and Radio, while non-802 standards include the Point-to-Point Protocol (PPP) and Frame Relay. Unlike IP addresses, layer two addresses occupy a flat addressing space. This means the addresses are not hierarchical or routable.

      The IEEE 802 specifications can be further subdivided into two sub-layers, each with its own responsibilities.

      • Logical Link Control (LLC): This is the higher of the two layers. It acts as an interface between the network layer and the MAC layer. It encapsulates higher-layer protocols, and handles flow control, multiplexing, and error detection. However, some of those functions might also be handled at higher layers.
      • Medium Access Control (MAC): The MAC layer is closely entwined with the physical layer. The MAC layer controls network access, frame synchronization, byte/bit stuffing, and link addressing. It encapsulates data from the LLC layer into the appropriate format for the link layer protocol. It also adds and removes a frame checksum to help identify erroneous frames and implements collision detection.
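
      The frame checksum added by the MAC sub-layer can be illustrated with a short Python sketch. It uses the CRC-32 function from the standard zlib module. The helper names and the byte order of the trailer are simplifications chosen for this example and do not reproduce the exact on-the-wire Ethernet format.

```python
import zlib


def add_checksum(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 trailer to the payload (simplified frame)."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "big")


def is_valid(frame: bytes) -> bool:
    """Recompute the CRC-32 over the payload and compare it with the trailer."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


frame = add_checksum(b"example payload")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip a single bit

print(is_valid(frame))      # True
print(is_valid(corrupted))  # False
```

      The receiver performs the same calculation and discards any frame where the two values differ.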

      Data Link Layer Tools

      For a complete analysis, a packet capture tool such as
      Wireshark can capture and analyze the frames. However, many Linux commands allow users to examine interface statistics for packet counts and errors. The ip link command displays information about the network interfaces on the server. The command output includes the state, MTU, and MAC address of each link. See the
      Ubuntu ip command man page for more information.

      1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
      2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
          link/ether f2:3c:93:15:ce:03 brd ff:ff:ff:ff:ff:ff

      The nast utility is a packet sniffer for analyzing LAN traffic. It is not pre-installed, so users must install it using apt.

      Run nast with sudo privileges and terminate it using the Ctrl+C key combination. Specify the interface to listen on using the -i option. The
      Ubuntu nast man page includes more details.

      Sniffing on:
      - Device:	eth0
      - MAC address:	F2:3C:93:15:CE:03
      - IP address:
      - Netmask:
      - Promisc mode:	Set
      - Filter:	None
      - Logging:	None
      ---[ TCP ]----------------------------------------------------------- ->
      TTL: 64 	Window: 501	Version: 4	Length: 112
      FLAGS: ---PA--	SEQ: 855325394 - ACK: 3719741052
      Packet Number: 1
      ---[ TCP ]----------------------------------------------------------- ->
      TTL: 64 	Window: 501	Version: 4	Length: 124
      FLAGS: ---PA--	SEQ: 855325454 - ACK: 3719741052
      Packet Number: 2
      Packets Received:		35805
      Packets Dropped by kernel:	14803

      To list the configuration and capabilities of each network interface, use the ip netconf command.

      inet lo forwarding off rp_filter off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
      inet eth0 forwarding off rp_filter loose mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
      inet all forwarding off rp_filter loose mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
      inet default forwarding off rp_filter loose mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
      inet6 lo forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
      inet6 eth0 forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
      inet6 all forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
      inet6 default forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off

      Layer 3: The Network Layer

      The network layer lies at the heart of the OSI network stack. It is responsible for addressing packets and routing them across the internet. Layer three data units are known as packets. The network layer allows packets to flow across non-adjacent networks. Most routers are network layer devices, although some also implement higher layer functions.

      Layer three protocols use the packet destination address to determine the best egress interface for the data. Before reaching its destination, a packet might be routed through many nodes. A path consists of all the routers a packet must pass through to reach a specific destination. Each network device a packet transits through is known as a hop. At each hop, the network layer processes the packet. If the packet has reached its final destination, the data is sent to the transport layer. Otherwise, the packet is sent back to the data link layer, which wraps it in a new frame header and footer and forwards it to the next hop.
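
      One common way to choose the egress interface is a longest-prefix match against the routing table. The following Python sketch shows the idea using the standard ipaddress module. The routes, interface names, and documentation-range addresses are hypothetical.

```python
import ipaddress

# Hypothetical routing table: (destination network, egress interface).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),       # default route
    (ipaddress.ip_network("192.0.2.0/24"), "eth1"),
    (ipaddress.ip_network("192.0.2.128/25"), "eth2"),
]


def egress_interface(destination: str) -> str:
    """Return the interface of the most specific route containing the address."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, iface) for net, iface in ROUTES if addr in net]
    # The route with the longest prefix is the most specific match.
    _, iface = max(matches, key=lambda match: match[0].prefixlen)
    return iface


print(egress_interface("192.0.2.200"))   # eth2
print(egress_interface("192.0.2.10"))    # eth1
print(egress_interface("198.51.100.1"))  # eth0
```

      Because the table contains a default route, every destination matches at least one entry in this sketch.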

      The network layer is responsible for breaking down packets that are too large for the lower layer links into smaller pieces. This process is called fragmentation. At the destination end, the network layer reassembles the fragments back into the original packet. Protocols at the network layer are not required to be reliable, although some protocols might report and retransmit missing packets. Network layer protocols are generally connectionless. Connections and sessions are managed by the higher layers.
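
      Fragmentation and reassembly can be sketched as follows. This simplified Python example tracks a byte offset and a "more fragments" flag for each piece. The function names are hypothetical, and real IPv4 headers encode these fields differently.

```python
def fragment(payload: bytes, max_size: int) -> list:
    """Split a payload into (offset, more_fragments, data) tuples."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    fragments = []
    for offset in range(0, len(payload), max_size):
        data = payload[offset:offset + max_size]
        more = offset + max_size < len(payload)  # False only for the last piece
        fragments.append((offset, more, data))
    return fragments


def reassemble(fragments: list) -> bytes:
    """Rebuild the payload, even if the fragments arrived out of order."""
    ordered = sorted(fragments, key=lambda frag: frag[0])
    return b"".join(data for _, _, data in ordered)


pieces = fragment(b"0123456789", 4)
print(pieces)  # [(0, True, b'0123'), (4, True, b'4567'), (8, False, b'89')]
print(reassemble(list(reversed(pieces))))  # b'0123456789'
```

      The offsets allow the destination to restore the original order without relying on arrival order.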

      Many well-known network protocols operate at the network layer, including the following:

      • The Internet Protocol (IP). This protocol specifies the addressing format for the internet.
      • Routing protocols including Border Gateway Protocol (BGP) and Open Shortest Path First (OSPF). These protocols are responsible for determining the best path to the final destination.
      • The Multiprotocol Label Switching (MPLS) protocol. In reality, MPLS is a multi-layer protocol. It sits between the data link and network layers and is often described as a layer 2.5 protocol.
      • The various Internet Control Message Protocol (ICMP) control messages, and related applications like ping and traceroute.
      • Multicast standards, including the Internet Group Management Protocol (IGMP).

      Network Layer Tools

      The ip command is also quite useful for network layer problems. The ip addr show command displays the IP address associated with each interface.

      1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
          link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
          inet scope host lo
             valid_lft forever preferred_lft forever
          inet6 ::1/128 scope host
             valid_lft forever preferred_lft forever
      2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
          link/ether f2:3c:93:15:ce:03 brd ff:ff:ff:ff:ff:ff
          inet brd scope global eth0
             valid_lft forever preferred_lft forever
          inet6 2a01:7e00::f03c:93ff:fe15:ce03/64 scope global dynamic mngtmpaddr noprefixroute
             valid_lft 5316sec preferred_lft 1716sec
          inet6 fe80::f03c:93ff:fe15:ce03/64 scope link
             valid_lft forever preferred_lft forever

      The ping and traceroute commands can determine whether a destination is reachable and track the path the packet follows to reach it. These commands can be used with either the name of a router or an IP address. Terminate the command using the Ctrl+C key combination.

      PING (2620:0:862:ed1a::1)) 56 data bytes
      64 bytes from (2620:0:862:ed1a::1): icmp_seq=1 ttl=55 time=6.45 ms
      64 bytes from (2620:0:862:ed1a::1): icmp_seq=2 ttl=55 time=6.41 ms
      64 bytes from (2620:0:862:ed1a::1): icmp_seq=3 ttl=55 time=6.41 ms
      64 bytes from (2620:0:862:ed1a::1): icmp_seq=4 ttl=55 time=6.55 ms
      64 bytes from (2620:0:862:ed1a::1): icmp_seq=5 ttl=55 time=6.40 ms
      64 bytes from (2620:0:862:ed1a::1): icmp_seq=6 ttl=55 time=6.68 ms
      --- ping statistics ---
      6 packets transmitted, 6 received, 0% packet loss, time 5008ms
      rtt min/avg/max/mdev = 6.398/6.483/6.678/0.101 ms

      To view the contents of the system routing table, use the ip route show command. The ip neighbor show and ip nexthop show commands are also often useful.

      default via dev eth0 proto static dev eth0 proto kernel scope link src

      Layer 4: The Transport Layer

      The transport layer works in conjunction with the network layer to coordinate data transfer between the host and the destination. While the network layer is more concerned with addressing and routing, the transport layer is responsible for segmenting and ordering the data. It must collect and interleave packets from many different higher-level protocols. It must also associate these packets with the correct session. On the receiving side, the transport layer reassembles the packets and detects any missing segments. Some transport layer protocols also handle quality of service, congestion avoidance, reliability, and packet retransmission.

      Transport layer protocols are either connection-oriented or connectionless. The two most important transport protocols are the
      Transmission Control Protocol (TCP) and the
      User Datagram Protocol (UDP). Transport layer data units are sent from, and received on, a specific port. The full destination address consists of both an IP address and a port number. For ease of use, many protocols are associated with a specific, well-known port.

      • Transmission Control Protocol: TCP is a robust connection-oriented protocol. It implements reliability and error-checking and guarantees packets are delivered in order. It is used for applications that cannot tolerate corrupted or missing packets, such as file transfers and email. TCP segments data based on the maximum transmission unit (MTU) of the egress interface. Some portions of the TCP specification, including the graceful close technique, better align with the session layer of the OSI model.
      • User Datagram Protocol: UDP is a connectionless, lightweight protocol that is far less complex than TCP. Unlike TCP, UDP does not segment packets. It is not necessarily reliable and does not retransmit packets. It is a best effort option for performance-oriented applications that can tolerate missing or corrupted packets. UDP is a good choice for streaming video and applications using built-in buffering mechanisms.
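
      To show how ports appear in a transport layer data unit, the following Python sketch builds and parses the 8-byte UDP header (source port, destination port, length, and checksum). The function names and port values are hypothetical, and the checksum is left at zero for simplicity.

```python
import struct

# Four 16-bit fields in network byte order:
# source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")


def build_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend a UDP header to the payload."""
    length = UDP_HEADER.size + len(payload)  # the length covers header and data
    checksum = 0  # zero means "no checksum computed" for UDP over IPv4
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_datagram(datagram: bytes) -> dict:
    """Extract the ports and payload from a datagram."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(datagram)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "payload": datagram[UDP_HEADER.size:length],
    }


datagram = build_datagram(40000, 53, b"query")
print(len(datagram))             # 13
print(parse_datagram(datagram))  # {'src_port': 40000, 'dst_port': 53, 'payload': b'query'}
```

      The header contains no sequence numbers or acknowledgment fields, which reflects the lightweight, connectionless nature of UDP.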

      The Transport Layer Security (TLS) protocol somewhat aligns with the OSI transport layer, but it also provides features from the higher layers.

      Transport Layer Tools

      There is no generic transport layer monitoring tool for Linux. Instead, tools are available for specific protocols. For TCP, the tcptrack utility displays a list of current sessions. tcptrack does not come preinstalled, so install it using apt.

      sudo apt install tcptrack

      Use the -i option and the name of the interface to see all connections active on the interface. There is no corresponding UDP equivalent because UDP is connectionless. The
      Ubuntu tcptrack man page provides full usage instructions. Terminate the command using the Ctrl+C key combination.

      Client                Server                State        Idle A Speed
                                                  ESTABLISHED  0s     10 KB/s

      tcpdump is a packet analyzer for monitoring outgoing and incoming packets on a specific interface. The -i option indicates the interface to listen on. If no interface is specified, tcpdump selects a default, which is eth0 on this system. It can also monitor UDP packets. tcpdump is also able to decode packets at layers below the transport layer, and a separate option allows users to view the Ethernet headers. Consult the
      Ubuntu tcpdump man page for a list of options. Terminate the command using the Ctrl+C key combination.

      tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
      listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
      18:52:14.806270 IP testworkstation.ssh > Flags [P.], seq 866780550:866780658, ack 3719759268, win 501, options [nop,nop,TS val 3917578569 ecr 3770283712], length 108

      Layer 5: The Session Layer

      The session layer is relatively lightweight. It is used to establish and maintain ongoing sessions of longer duration between two systems. It handles the negotiation of the connection and closes it when no longer required. The session layer often manages user authentication during the establishment phase. Sometimes the session layer provides a way to suspend, restart, or resume a session. Network sockets operate at this layer, and protocols including FTP and DNS make substantial use of session layer functionality. It is also heavily used by streaming services, and web/video conferencing. For some services, session layer protocols use flow control for proper synchronization.

      Session Layer Tools

      In many applications, the session layer is bundled together with the presentation and application layers, and all three are managed as a single unit. Therefore, there are no generic tools for the session layer or any of the higher layers. Instead, users must rely on application-specific tools. For instance, the
      FileZilla FTP application provides logs and a debug menu to help resolve FTP connectivity problems at the session level.

      Layer 6: The Presentation Layer

      The presentation layer is responsible for translating content between the application layer and the lower layers. It handles data formatting and translation, including data compression/decompression, encoding, and encryption. For some higher layer applications, the presentation layer might also handle graphics and operating system specific tasks. In most modern applications, the presentation and application layers are tightly integrated.

      An example of a protocol residing at the presentation layer is Multipurpose Internet Mail Extensions (MIME), which formats email messages. The Transport Layer Security (TLS) encryption protocol is also often placed at the presentation layer because it encrypts application data.

      Layer 7: The Application Layer

      The application layer is the highest layer and the one closest to the end user of most software applications. There is a tendency to think of this layer as being equivalent to the application itself, but user applications actually sit above this layer and interact with it directly. Many application layer protocols tend to be closely bound to the client software. They manage tasks including message handling, printer access, and database access. Some examples of application layer protocols include the following:

      • Hypertext Transfer Protocol (HTTP)
      • Simple Mail Transfer Protocol (SMTP)
      • Telnet
      • Secure Shell (SSH)
      • File Transfer Protocol (FTP)
      • Simple Network Management Protocol (SNMP)
      • Domain Name System (DNS)

      End-to-End Processing Using the OSI Model

      It is possible to use the OSI Model to explain how a user request passes from a client application down to the physical layer. For instance, depending on the web application, the following steps might occur when browsing the internet.

      1. The web browser client interacts with an application protocol at the application layer. The user request is translated to either an HTTP or HTTPS message. The DNS protocol is used to convert the domain name into an IP address.
      2. If HTTPS is used, the presentation layer encrypts the outgoing request using a TLS socket. If necessary, the data is encoded or translated to a different character set.
      3. At the session layer, a session is established to send and receive the HTTP/HTTPS messages. In most cases, the session layer opens a TCP session because web browsing requires reliable transmission. However, some streaming applications might opt for a UDP session.
      4. The transport layer TCP protocol initiates a connection to the destination server. When the session is operational, it transmits the packets in their original order and ensures all packets are sent and received. UDP sends all packets out in a best effort manner without a direct connection and does not wait for any acknowledgments. If necessary, the data packets are segmented into smaller packets. The transport protocol forwards all outgoing packets to the network layer.
      5. At the network layer, the routing protocols decide what egress interface to use based on the destination address. The data, including the address information, is encapsulated inside an IP packet. The packet is then forwarded to the data link layer.
      6. The data link layer converts the IP packets to frames, which might result in further fragmentation. It builds the frames based on the data link protocol being used.
      7. At the physical layer, the frames are converted to a stream of bits and transmitted onto the carrier media.

      Drawbacks of the OSI Model

      The OSI Model is useful as a tool for understanding networks. However, it has a number of drawbacks.

      • OSI is very complex, with too many layers. Some of the layers are much more significant and important than others.
      • There are too many OSI standards documents and recommendations.
      • The model does not reflect the real world network structure. In many cases, actual network models span multiple layers and do not align with the boundaries of the OSI layers.
      • The OSI protocols were not widely implemented and the model does not map very well to the protocols in use today.

      A Comparison Between the OSI Model and the Internet Protocol Suite

      The Internet Protocol suite is an alternative to the OSI model. The IP suite has four layers.

      1. Application layer: This maps to the OSI application and presentation layers and much of the session layer.
      2. Transport Layer: This includes some parts of the OSI session layer as well as the transport layer. The TCP and UDP protocols are part of this layer.
      3. Internet Layer: This layer closely matches the OSI network layer definition and includes the IP protocol.
      4. Link Layer: This encompasses both the physical and data link layers of the OSI model.
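
      The mapping between the two models can be written down as a small lookup table. The Python sketch below is only a summary of the list above. The dictionary and function names are hypothetical, and the session layer is shown against two IP suite layers because its functions are split between them.

```python
# OSI layer -> IP suite layer(s), following the comparison above.
OSI_TO_IP_SUITE = {
    "Application": ["Application"],
    "Presentation": ["Application"],
    "Session": ["Application", "Transport"],
    "Transport": ["Transport"],
    "Network": ["Internet"],
    "Data Link": ["Link"],
    "Physical": ["Link"],
}


def osi_layers_for(ip_suite_layer: str) -> list:
    """List the OSI layers that a given IP suite layer covers."""
    return [osi for osi, ip_layers in OSI_TO_IP_SUITE.items()
            if ip_suite_layer in ip_layers]


print(osi_layers_for("Link"))       # ['Data Link', 'Physical']
print(osi_layers_for("Transport"))  # ['Session', 'Transport']
```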

      The IP suite is considered less prescriptive and more flexible, and better reflects actual usage. Core protocols such as TCP and IP, along with the main routing protocols, belong to the IP suite. However, the IP suite is not as informative, conceptual, or comprehensive as the OSI model, and is not as widely used as a teaching aid. To properly understand networking concepts, engineers should familiarize themselves with both models.


      The OSI Model is a framework for understanding network communications. It breaks the network stack down into seven layers. The layers range from the low-level physical layer up to the application layer residing closest to a computer user. At the heart of the model are the mid-level network and transport layers. The network layer addresses and routes packets, while the transport layer establishes and maintains a connection with a far-end device.

      Although the OSI Model is a handy learning model, it is relatively abstract and does not always reflect real-world behavior. The OSI-based protocols were never really implemented, and most commonly used network protocols are more closely related to the IP suite. However, the OSI Model is integral to many networking methods, and many of the common networking tools still map to the different OSI layers.


      An Explanation of BGP Networking

      Routing protocols are crucial for proper network engineering and design. They allow traffic to be quickly routed across a network from source to destination. The
      Border Gateway Protocol (BGP) is the universally accepted routing protocol for the internet backbone. Although BGP is incredibly powerful and has many advantages, it is also complex and has some drawbacks. This guide explains what BGP is and how BGP routing works. It also discusses the advantages and disadvantages of BGP and its role in many recent major network outages.

      What is the BGP Protocol?

      BGP is designed to quickly route traffic from source to destination across the entire internet. It enforces the use of standardized routing messages and practices that are shared between each internet service provider (ISP). For each packet, BGP determines the best route to the final destination. It then transmits the message using the appropriate interface. This process is repeated at every node along the route until the packet arrives at its destination. BGP routing information is exchanged between connected neighbor interfaces, which are called peers. Large ISPs must use BGP to connect to the internet backbone, but smaller networks might also choose to deploy it internally.

      BGP was first roughly sketched out in 1989 and initially deployed in 1994. It quickly became the de facto standard for routing traffic across domains. The current version of BGP is version 4 (BGP4), and its specification is described in
      RFC 4271. The current implementation corrects some errors and ambiguities, and introduces support for Classless Inter-Domain Routing (CIDR) and route aggregation. There is also a BGP implementation for IPv6 as first described in
      RFC 2545, along with several extensions and optimizations.

      BGP is an example of an exterior gateway protocol, as opposed to an interior gateway protocol. Exterior gateway protocols connect large autonomous systems (AS), which might represent an internet service provider (ISP) or an application provider like Netflix. Interior gateway protocols, including Open Shortest Path First (OSPF) and Intermediate System to Intermediate System (IS-IS), are designed for use inside a single AS. BGP is said to create a “network of networks” out of the different autonomous systems.

      Each AS is allowed to choose its own interior gateway protocol but requires a shared mechanism for sending traffic to another system. BGP provides a successful solution, and virtually all providers now use it to reduce interoperability problems. Consequently, BGP is the only remaining exterior gateway protocol in widespread use.

      BGP constructs routing tables describing the next hop and interface for each external network. The networks are defined based on their IP address and network mask. For instance, the network might have its own entry in the BGP routing table. Any traffic to any address in this subnet is routed the same way. BGP is considered a path-vector protocol. It considers each route to be a sequence of autonomous systems to transit and not a series of hops. BGP also provides a mechanism to add, delete, and change information from the routing table.
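
      The idea of treating a route as a sequence of autonomous systems can be illustrated with a small Python sketch. It keeps the advertisement with the shortest AS path for each network. The prefixes and AS numbers are hypothetical, and real BGP implementations weigh other attributes as well, so this is a deliberately reduced model.

```python
# Hypothetical advertisements: (network prefix, AS path to reach it).
ADVERTISEMENTS = [
    ("203.0.113.0/24", [64496, 64498, 64499]),
    ("203.0.113.0/24", [64497, 64499]),
    ("198.51.100.0/24", [64496, 64500]),
]


def best_paths(advertisements: list) -> dict:
    """Keep the advertisement with the shortest AS path for each prefix."""
    best = {}
    for prefix, as_path in advertisements:
        current = best.get(prefix)
        if current is None or len(as_path) < len(current):
            best[prefix] = as_path
    return best


table = best_paths(ADVERTISEMENTS)
print(table["203.0.113.0/24"])   # [64497, 64499]
print(table["198.51.100.0/24"])  # [64496, 64500]
```

      Note that the comparison counts autonomous systems, not individual router hops inside each AS.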

      BGP uses the Transmission Control Protocol (TCP) as its transport mechanism. It is officially assigned TCP port 179. TCP provides a reliable connection, packet retransmission, and fragmentation/reassembly services. This eliminates any requirement for BGP to implement these features. Each BGP message contains a header consisting of a marker field of all ones, the message length, and the BGP message type.
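
      The header layout described above can be sketched with the Python struct module. In BGP4 the marker is 16 bytes, the length is a 2-byte field that covers the whole message including the header, and the type is a single byte. The function names below are hypothetical.

```python
import struct

MARKER = b"\xff" * 16                 # marker field of all ones
HEADER = struct.Struct("!16sHB")      # marker, length, type = 19 bytes
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def build_message(msg_type: int, body: bytes = b"") -> bytes:
    """Prepend a BGP header to an already-encoded message body."""
    return HEADER.pack(MARKER, HEADER.size + len(body), msg_type) + body


def parse_header(data: bytes) -> tuple:
    """Return (total length, message type name) from the start of a message."""
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != MARKER:
        raise ValueError("marker field is not all ones")
    return length, MESSAGE_TYPES.get(msg_type, "UNKNOWN")


keepalive = build_message(4)          # a keepalive consists of the header only
print(len(keepalive))                 # 19
print(parse_header(keepalive))        # (19, 'KEEPALIVE')
```

      Because TCP delivers a byte stream rather than discrete messages, the length field is what allows the receiver to find the boundary of each BGP message.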

      How does a BGP Session Work?

      BGP is responsible for receiving and transmitting network updates and constructing a routing table from these updates. A BGP interface establishes a point-to-point connection with its BGP peer, which is a neighboring interface. Peering information for each BGP interface is configured manually and used when negotiating the connection. The configuration must include the expected AS and IP address of the BGP peer. The following example demonstrates how to configure a BGP session on a Nokia 7750 router.

      group "To_AS_30000"
          connect-retry 20
          hold-time 90
          keepalive 30
          local-preference 100
          peer-as 30000
              description "To_Router C - EBGP Peer"
              connect-retry 20
              hold-time 90
              keepalive 30
              peer-as 30000

      To establish, monitor, and maintain these connections, BGP implements a finite state machine (FSM). The BGP FSM changes the state of the connection based on a combination of internal and external events. Connections are established and unicast messages are sent using TCP port 179. When the connection is established, BGP sends keepalive messages at a configurable rate.
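
      A heavily simplified version of this state machine can be expressed as a transition table. The state names below follow the BGP4 specification, but the event names are hypothetical labels, and the sketch treats every unexpected event as an error that returns the session to Idle. Timers and many transitions from the specification are left out.

```python
# (current state, event) -> next state. Simplified, not a complete FSM.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_connected"): "OpenSent",        # OPEN message is sent
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_connected"): "OpenSent",
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state: str, event: str) -> str:
    """Look up the next state; anything unexpected resets the session."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events: list, state: str = "Idle") -> str:
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = next_state(state, event)
    return state


handshake = ["start", "tcp_connected", "open_received", "keepalive_received"]
print(run(handshake))                           # Established
print(run(handshake + ["hold_timer_expired"]))  # Idle
```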

      There are two types of BGP connections. The type of connection depends on whether the peers are in the same AS or not.

      • Exterior Border Gateway Protocol (eBGP): eBGP connects interfaces in different autonomous systems. Routes learned from an eBGP peer are advertised to both iBGP and eBGP peers. eBGP can also run between two peers that are not directly connected using a virtual private network (VPN) tunnel.
      • Interior Border Gateway Protocol (iBGP): This runs between two peers within the same AS. Routes learned from an iBGP peer are not passed on to other iBGP peers. They are only advertised outside the AS through eBGP peers. As a result, all iBGP routers within an AS are required to be connected in a full mesh. However, several optimizations and extensions can somewhat relieve this demand.

      Each AS is assigned an autonomous system number (ASN) to identify it on the network. This designates it as an independent network with its own routing policies and helps BGP decide whether to use iBGP or eBGP. An AS controls a set of IP addresses and is allowed to advertise and aggregate these addresses.
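      As a rough illustration of how the configured peer address and ASN are used, the following sketch checks an incoming session against a table of configured peers and then classifies it as iBGP or eBGP. The local ASN, the peer table, and the function name are all hypothetical. The addresses come from documentation ranges.

```python
import ipaddress

LOCAL_ASN = 64500  # hypothetical local AS number

# Hypothetical manual configuration: peer address -> expected peer ASN.
CONFIGURED_PEERS = {
    ipaddress.ip_address("192.0.2.1"): 30000,
    ipaddress.ip_address("192.0.2.2"): 64500,
}


def classify_peer(source_ip: str, asn_in_open: int) -> str:
    """Validate a peer against the configuration and return the session type."""
    address = ipaddress.ip_address(source_ip)
    expected_asn = CONFIGURED_PEERS.get(address)
    if expected_asn is None:
        raise ValueError(f"{address} is not a configured peer")
    if asn_in_open != expected_asn:
        raise ValueError(
            f"peer {address} announced AS {asn_in_open}, expected {expected_asn}"
        )
    # Same ASN on both sides means an interior session, otherwise exterior.
    return "iBGP" if asn_in_open == LOCAL_ASN else "eBGP"
```

      A mismatch between the configured and announced ASN is rejected here, which mirrors the misconfiguration errors that prevent a session from establishing.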

      How Does the BGP State Machine Operate?

      BGP uses a finite state machine to establish and maintain connections. The connection state changes in response to incoming messages, errors, lack of responses from the peer, and internal timers. There are four types of BGP messages, and the FSM has six states. Here are the BGP message types.

      • Open: These messages are sent after the TCP session is established. They communicate information about the local BGP configuration and negotiate shared parameters.
      • Keepalive: A keepalive message tells the peer the session is still active. The frequency of the keepalive messages is typically configurable on a per-peer basis.
      • Update: An update communicates new routing information to the peer. When a BGP session is initially established, local routing information is sent to the peer. In an ongoing session, information about new and withdrawn routes is also sent using this message type.
      • Notification: A notification advises the peer about errors or connectivity problems. When a peer receives the message, it terminates the BGP connection.

      The BGP finite state machine includes the following six states.

      • Idle: A BGP connection begins in this state and returns to this state if errors occur or the connection times out. While in this state, BGP initiates a connection and listens for a response or request for connection from the peer. After the connection establishment phase begins, the FSM moves the session to the Connect state. A session normally leaves Idle only for Connect. The one exception is a peer configured for passive connection establishment, which moves to Active and waits for the remote side to connect.
      • Connect: This is a transitional state that occurs during TCP negotiation after a message is received from the peer. The Connect state attempts to complete a three-way TCP handshake with the peer and establish a TCP session. After a successful handshake, BGP sends an Open message and moves to OpenSent. If the TCP session is not immediately established, the state changes to Active. Any errors cause the state machine to transition back to Idle.
      • Active: This confusingly named state allows for another attempt at establishing a TCP session. If the new attempt is successful, the session sends an Open message and proceeds to the OpenSent state. If it is unsuccessful and the timer expires, it returns to either the Connect or Idle state. In the case of network issues, congestion, or unstable links, a connection can wind up cycling between Idle, Connect, and Active.
      • OpenSent: When the TCP session is established, the BGP FSM enters the OpenSent state and listens for an Open message from the peer. When it receives this message, it validates the message. If the validation is successful, the FSM sets the hold time and moves to the OpenConfirm state. If there are errors, it notifies the peer and resets the connection back to Idle.
      • OpenConfirm: In this short-lived state, BGP listens for a keepalive message from the peer. If it receives one before the hold timer expires, the session transitions to Established. If a keepalive is not received or another error occurs, the session moves back to Idle.
      • Established: When a BGP session reaches this final state, it becomes fully operational. When the Established state is reached, the peers send Update messages to advertise information about the routes in their database. If any errors occur, the session moves back to the Idle state.
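      The six states above can be summarized as a small transition table. The following is a simplified sketch that follows the descriptions in this list, not the complete state machine from RFC 4271. The event names are hypothetical labels, and any event that is not listed for a state is treated as an error that returns the session to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,
    (State.ACTIVE, "retry_timer_expired"): State.CONNECT,
    (State.OPEN_SENT, "open_valid"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Return the next state; unlisted events fall back to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)


def run(events: list[str], state: State = State.IDLE) -> State:
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state
```

      For example, the sequence `start`, `tcp_established`, `open_valid`, `keepalive_received` walks a session from Idle to Established, while a Notification received in any state drops it back to Idle.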

      Figure: BGP Finite State Machine, a state diagram illustrating all possible states and transitions, courtesy of Wikipedia.

      When BGP detects an error, it sends a Notification message. This message causes the connection to close and moves the BGP session back to the Idle state. Here are some examples of errors that can return the connection to the Idle state.

      • TCP port 179 is closed or fails to open. This can occur if the underlying physical interface is not operational or if the BGP interface is disabled or not fully configured.
      • No dynamic TCP port numbered above 1023 is available to serve as the ephemeral port for the local end of the connection.
      • The BGP peer address and ASN are not configured correctly.
      • Network congestion or high latency.
      • A flapping network interface or connection.
      • A session times out due to a lack of keepalive messages.

      How Does BGP Build its Routing Information Base?

      BGP routing is extremely complex and some aspects are vendor specific. A complete discussion of the full intricacies of BGP routing is beyond the scope of this introductory guide. However, certain basic principles apply in all cases.

      Each BGP interface receives Update messages from its peer indicating reachable destinations for a specific BGP path. The reachable networks are communicated in the form of Network Layer Reachability Information (NLRI). Each NLRI entry contains a prefix length and a prefix. An Update message advertises a single set of path attributes, but it can list several prefixes that share those attributes. For instance, the NLRI entry /24, 192.0.2 describes the network 192.0.2.0/24.
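      The length-and-prefix form of an NLRI entry can be made concrete with a short sketch. It assumes the IPv4 encoding from RFC 4271: one byte for the prefix length in bits, followed by only as many prefix bytes as that length requires. The function names are hypothetical, and the sketch handles IPv4 only.

```python
import ipaddress


def encode_nlri(prefix: str) -> bytes:
    """Encode one IPv4 prefix as a length byte plus the minimum prefix bytes."""
    network = ipaddress.ip_network(prefix)
    if network.version != 4:
        raise ValueError("this sketch only handles IPv4 prefixes")
    octets = (network.prefixlen + 7) // 8
    return bytes([network.prefixlen]) + network.network_address.packed[:octets]


def decode_nlri(data: bytes) -> list[str]:
    """Decode a run of back-to-back NLRI entries into prefix strings."""
    prefixes = []
    offset = 0
    while offset < len(data):
        length = data[offset]
        if length > 32:
            raise ValueError(f"invalid IPv4 prefix length {length}")
        octets = (length + 7) // 8
        raw = data[offset + 1 : offset + 1 + octets]
        if len(raw) != octets:
            raise ValueError("truncated NLRI entry")
        # Pad the shortened prefix back out to four bytes.
        padded = raw + bytes(4 - octets)
        prefixes.append(f"{ipaddress.IPv4Address(padded)}/{length}")
        offset += 1 + octets
    return prefixes
```

      Encoding 192.0.2.0/24 yields four bytes: the length 24 followed by 192, 0, and 2, which matches the entry shown above.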

      The Update also contains path attributes, including the inter-AS path to the network. A BGP route contains a list of the autonomous systems a packet must transit to reach the destination. This information helps detect and avoid routing loops. The path attributes must also include the next hop information, and a legacy attribute for the route origin.
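      The loop-avoidance role of the AS path can be shown with a minimal sketch. It assumes the usual behavior that a router rejects any route whose path already contains its own ASN and prepends its ASN when advertising to an eBGP peer. The function names and AS numbers are hypothetical.

```python
def accept_route(local_asn: int, as_path: list[int]) -> bool:
    """Reject a route whose AS path already contains the local AS (a loop)."""
    return local_asn not in as_path


def path_for_ebgp_peer(local_asn: int, as_path: list[int]) -> list[int]:
    """Return the AS path to advertise to an eBGP peer: local ASN prepended."""
    return [local_asn] + as_path
```

      A route with the path 64501, 64502 is acceptable to AS 64500, but a route with the path 64501, 64500, 64502 has already passed through AS 64500 and is discarded.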

      BGP receives a number of NLRI updates from each peer. In return, it also sends out its list of known networks to each of its peers. This strategy allows each peer to effectively calculate the best route to each network.

      For each peer, BGP maintains a conceptual base of routing information. The incoming adjacent routing information base (Adj-RIB-In) stores the NLRI updates received from that neighbor. Only one route to a given destination is stored in each Adj-RIB-In. The complementary Adj-RIB-Out is an outgoing information base containing the NLRIs to send to the peer. The Adj-RIB-Out is also influenced by routing policy and other factors, but only viable routes are allowed.

      BGP collates the information received from all peers into its local routing information base (Loc-RIB). The Loc-RIB is the master BGP routing table. It includes the best BGP route to each advertised network, independent of other routing protocols. The BGP protocol specification lists some factors to consider when selecting routes. However, individual vendors can also incorporate other information.

      Each route in the Loc-RIB must be resolvable and reachable. Additionally, the Loc-RIB chooses routes learned from external peers over any routes learned from iBGP. Other factors might include the cost/throughput of the route, the number of autonomous systems to traverse, community membership, route preferences, and system routing policies. The values of the AS and the IP address can potentially be used as tiebreakers.
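      The selection process can be approximated with a sort key. The sketch below is a heavily simplified assumption, not the full decision process from RFC 4271 or any vendor: it prefers the highest local preference, then the shortest AS path, then eBGP over iBGP, and finally the lowest peer address as a tiebreaker. The `Route` class and its fields are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple[int, ...]
    local_pref: int
    from_ebgp: bool
    peer_address: str


def preference_key(route: Route) -> tuple[int, int, int, int]:
    """Smaller keys are better, so local preference is negated."""
    return (
        -route.local_pref,
        len(route.as_path),
        0 if route.from_ebgp else 1,
        int(ipaddress.ip_address(route.peer_address)),
    )


def build_loc_rib(adj_ribs_in: list[list[Route]]) -> dict[str, Route]:
    """Pick one best route per prefix from all per-peer Adj-RIB-In tables."""
    loc_rib: dict[str, Route] = {}
    for adj_rib_in in adj_ribs_in:
        for route in adj_rib_in:
            best = loc_rib.get(route.prefix)
            if best is None or preference_key(route) < preference_key(best):
                loc_rib[route.prefix] = route
    return loc_rib
```

      Real implementations consider more attributes, such as origin and MED, and check that the next hop is reachable before a route becomes a candidate.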

      BGP is only one source of routing information for a system. BGP submits the candidate routes from the Loc-RIB to the main routing table, which combines routes from all routing sources. The BGP route is not necessarily selected. For example, in a local network an OSPF route or a static route might be chosen instead. Routing policies also dictate which route is used. For instance, all outgoing packets of a certain type might have to be sent to a traffic management system first. Each router is responsible for its own routing decisions, and the algorithms might vary from vendor to vendor.
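      To illustrate how a BGP route can lose to another routing source, the sketch below picks one route per prefix using a per-protocol preference value, where a lower value wins. The preference numbers are hypothetical and chosen only for illustration. Actual defaults, and the name of this setting, differ between vendors.

```python
# Hypothetical per-source preference values; lower is more preferred.
ROUTE_PREFERENCE = {"static": 5, "ospf": 10, "bgp": 170}


def install_route(candidates: dict[str, str]) -> tuple[str, str]:
    """Choose the route to install for one prefix.

    `candidates` maps a routing source name to the next hop it proposes.
    Returns the winning (source, next hop) pair.
    """
    if not candidates:
        raise ValueError("no candidate routes for this prefix")
    source = min(candidates, key=lambda name: ROUTE_PREFERENCE[name])
    return source, candidates[source]
```

      When both OSPF and BGP offer a route to the same prefix, the OSPF route is installed under these assumed values. The BGP route is used only if it is the sole candidate.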

      The routing topology is never static. New networks are always being built and removed, and interfaces can fail. New information from a peer forces BGP to recalculate the Adj-RIB-In. In many cases, this might cause a new best route to be installed in the Loc-RIB. This can consequently result in changes to the system routing table. In the event of a failure, traffic can quickly switch over to a less efficient secondary route. If there are no longer any routes to a destination, it is withdrawn from the BGP routing table.

      BGP Extensions and Optimizations

      Over the years, several enhancements and optimizations have been proposed to improve BGP performance and reduce memory requirements. BGP does not require any of these extensions, and system administrators can choose whether to use them or not. Here is a list of some of the most significant BGP options.

      • BGP Communities: Communities allow common policies to be applied to a set of prefixes. A community is indicated through the use of a common attribute tag. Official well-known communities mark routes that should not be advertised or exported. Communities also allow geographic restrictions and help defend against denial-of-service attacks.
      • BGP Confederations: A BGP confederation contains more than one AS. The AS networks within the confederation exchange information as if they were all part of the same iBGP network. Only the ID of the confederation is advertised to the rest of the internet. A confederation makes it easier for an ISP to administer large networks.
      • Multi-exit Discriminators (MED): The MED is sent to peers in a neighboring AS to advertise the preferred entry point for inbound traffic when several links connect the two networks.
      • Multiprotocol BGP (MBGP/MP-BGP): MBGP can simultaneously carry information for multiple routing protocols and address families, including IPv6 and L3VPN. It supports both unicast and multicast addresses and permits separate routing tables for multicast addresses. MBGP is defined in RFC 4760.
      • Route Reflectors (RR): Route reflectors reduce the number of connections inside an AS and eliminate the requirement for a full mesh. An RR acts as the central point for a cluster of routers within the AS. All the route reflectors peer with each other, but the remaining routers only peer with the RR in their cluster. The net effect is to increase scalability and reduce the number of iBGP sessions within the network.

      Unfortunately both route reflectors and confederations can increase convergence time and introduce inefficient routes, especially when used together.
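      The scaling benefit of route reflectors is easy to quantify. The sketch below compares the number of iBGP sessions in a full mesh with a route reflector design, assuming the layout described in the list above: the reflectors peer with each other in a full mesh, and every other router peers with exactly one reflector. The function names are hypothetical.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions when every router peers with every other."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(reflectors: int, clients: int) -> int:
    """Sessions when reflectors are fully meshed and each client has one RR."""
    return full_mesh_sessions(reflectors) + clients
```

      For an AS with 100 routers, a full mesh needs 4950 sessions. With 4 reflectors and 96 clients, the same AS needs 102.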

      The Advantages and Disadvantages of BGP

      BGP has become the default exterior routing protocol due to its many advantages. Here are some of the main benefits of BGP:

      • It provides a comprehensive and unified standard for the entire internet. BGP enables routers in different ISPs to communicate, thus reducing interoperability and compliance issues.
      • BGP is highly scalable and is able to store hundreds of thousands of routes.
      • The protocol does an efficient job of computing the best next hop for a given destination. Its internal algorithms and route selection process are highly optimized.
      • Several optimizations and extensions are available to maximize the performance of BGP and reduce its memory demands.
      • BGP conserves network bandwidth by minimizing the number of updates and the volume of network traffic. The BGP finite state machine is clear and simple and most sessions are negotiated quickly.
      • It is able to quickly handle network failures and reroute traffic.

      Unfortunately, BGP has several drawbacks. Some of these problems are inherent in the original RFC, while others have developed due to the rapid growth of internet networks. Here are some of the most serious disadvantages of BGP:

      • BGP relies on manual configuration, which has the potential to introduce problems. Incorrect BGP configuration has been the root cause of several large-scale internet outages.
      • Sometimes the BGP route and the best route differ. This can be due to factors such as congestion and cost. So BGP is prone to suboptimal routing, particularly when extensions are used.
      • BGP is prone to stability problems in the presence of rapidly flapping interfaces or continually rebooting routers. This can lead to cascading failures that spread out to other networks and cause instability in the rest of the AS or across the internet backbone. BGP instability can be partially alleviated by route flap damping. Damping ignores a problematic interface for a predefined interval of time. Unfortunately, this can also increase convergence times for updates.
      • The global BGP routing tables are very large and continue to grow rapidly. iBGP requires a full mesh of connections, which is memory intensive to store. Efforts have been made to reduce the number of routes through aggregation and routing policies. The CPU demands of a large network can also be very large, especially when the router has many iBGP peers. Smaller and older routers might not be able to keep up with the large volume of updates.
      • BGP cannot detect congestion. This can result in the selection of suboptimal routes.
      • BGP is subject to security issues like BGP hijacking. This occurs when attackers distribute false routing information to misdirect traffic. BGP hijacking takes advantage of the trust-based system between the autonomous systems. This problem has been reduced through the use of a Resource Public Key Infrastructure (RPKI). RPKI lets the holder of an address block cryptographically authorize which AS can originate routes for those addresses, so peers can reject announcements from any other origin.
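      The RPKI check mentioned in the last item can be sketched as a lookup against a list of authorizations. This is a simplified illustration modeled on the route origin validation outcomes in RFC 6811 (valid, invalid, not found). It skips all of the cryptography, and the `Roa` class, its fields, and the sample data are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Roa:
    """A simplified route origin authorization."""
    prefix: str       # address block the holder has authorized
    max_length: int   # longest prefix length that may be announced
    asn: int          # AS allowed to originate the prefix


def validate_origin(prefix: str, origin_asn: int, roas: list[Roa]) -> str:
    """Return 'valid', 'invalid', or 'not-found' for an announcement."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_network = ipaddress.ip_network(roa.prefix)
        if announced.version != roa_network.version:
            continue
        if not announced.subnet_of(roa_network):
            continue
        covered = True  # at least one ROA covers this address space
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    return "invalid" if covered else "not-found"
```

      An announcement from the wrong AS, or one that is more specific than the authorization allows, is marked invalid. An announcement for address space with no authorization at all is reported as not found rather than rejected outright.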

      How Can BGP Cause Network Outages?

      BGP problems are commonly the root cause of major network outages, due to the complexity and challenges of the protocol. DNS problems can also cause network outages, but routing issues tend to be more complex and more difficult to debug and resolve. Some extensions and optimizations can interact to produce negative results.

      Problems can occur during configuration or when updating routers. A large number of route flaps can overwhelm a network with updates, leading to router failures. These failures intensify the original problem, and can subsequently spread to the wider internet. Several major outages at Tier 1 ISPs and services have been caused by cascading failures that eventually took down the entire network.

      Certain misconfiguration problems can be particularly severe. One ISP configured its network so all inbound and outbound traffic had to pass through one particular node, which soon failed under the pressure. It is also possible to accidentally drop all traffic destined for another AS. This could, for example, result in all users of the ISP being unable to access Google. In some cases, these failures can spread outside the AS and affect other networks. On more than one occasion, an organization has wiped all traces of its BGP interfaces off the internet with a bad configuration change.

      BGP problems can be very difficult to fix, whether the problem is a routing update from a telecom vendor or an internal configuration change. Only a certain number of highly skilled and trained people typically understand BGP and know how to debug and fix problems. In most cases, the only reasonable response is to roll back the router software or BGP configuration to an earlier version. Engineers can then investigate the problem offline while the network recovers. Unfortunately, it can still take a while for the network to come back online because BGP updates take some time to propagate. Networks that are already unstable or in flux are especially prone to this type of problem.

      BGP errors can also cause DNS failures, so the two major risk factors in a network can combine to amplify problems. A large number of BGP updates can cause DNS requests to fail and overload the DNS servers. This can potentially affect other services and lead to a network storm that decreases performance across the entire internet.


      Conclusion

      The Border Gateway Protocol (BGP) is the main routing system for the internet backbone and is used by every major service provider. BGP provides a map for routers to funnel packets from their original source to their final destination. A BGP session is established with a peer interface through the use of a finite state machine and standard message types. The peer can be in the same autonomous system or in a different autonomous system.

      BGP sends information about the reachable networks it knows to all of its peers. Based on the routes it receives in return, it constructs a local routing information base consisting of the best route to each network. In turn, BGP sends its selected routes to the main system routing base, where they can potentially be selected as the best route. BGP is a scalable and efficient standard, but suffers from high memory requirements and security issues, and is prone to serious misconfiguration problems. For full details about BGP, including state machine details and messaging formats, consult the IETF RFC 4271.


      Practical Kubernetes Networking: How to Use Kubernetes Services to Expose Your App

      How to Join

      This Tech Talk is free and open to everyone. Register below to get a link to join the live stream or receive the video recording after it airs.

      Date: September 22, 2021
      Time: 11 a.m.–12 p.m. ET / 3–4 p.m. GMT

      About the Talk

      You’ve deployed an application and a few microservices into your Kubernetes cluster and now you’re wondering how to configure the workloads so that they communicate with one another, and more importantly, how to expose your application to the internet. This talk will show you how to use Kubernetes Services to enable communication between your workloads and how to configure your application so it can accept traffic from the internet.

      What You’ll Learn

      • The difference between the kinds of Kubernetes Services
      • How to use the ClusterIP service to enable internal communication between workloads
      • How to use the LoadBalancer service to expose an application so it is reachable from the internet

      This Talk Is Designed For

      • Anyone running containerized workloads in a non-Kubernetes environment
      • Anyone looking to gradually migrate to Kubernetes
      • Anyone interested in how Kubernetes microservices communicate with one another


      Prerequisites

      • You have containerized an application or microservice
      • You have basic knowledge of containers and Kubernetes
      • You are familiar with Kubernetes Deployments


      Resources

      Kubernetes Docs: Services
      Networking Best Practices To Power Your Kubernetes Deployment
      Using a Service to Expose Your App

      Kubernetes in minutes, on DigitalOcean

      DigitalOcean Kubernetes (DOKS) is a managed Kubernetes service that lets you deploy Kubernetes clusters without the complexities of handling the control plane and containerized infrastructure. Clusters are compatible with standard Kubernetes toolchains and integrate natively with DigitalOcean Load Balancers and block storage volumes.

      DigitalOcean Kubernetes is designed for you and your small business. Start small at just $10 per month, and scale up and save with our free control plane and inexpensive bandwidth.
