TRUMP/RBCC Operation

 

As discussed above, the proposed congestion control mechanism is divided into the parts in the source and destination (TRUMP flow control) and in each router (RBCC). Together, these use the congestion information in each packet and in each Allocated Bandwidth Table to control congestion.

I will examine each part of TRUMP and RBCC in turn, describing its operation in detail. I will introduce some pseudo-code, which is based heavily on the RBCC code implemented in the REAL simulator. The pseudo-code will have a C syntax.

Source Transmission

TRUMP, as a transport protocol, provides not only flow-control, but error correction via packet retransmission and selection of the type of connection required between the source and destination. This report will concentrate on the flow-control aspects of TRUMP. For more details on TRUMP, the reader should refer to [Toomey 93] (which is slightly dated), or contact me for a more up-to-date description of the protocol.

A TRUMP transport layer can begin transmission either by performing a two-way handshake with the destination transport layer, or by starting immediately with data transmission. The current implementation of TRUMP always starts with a two-way handshake:

  1. Transmit connection setup packets to the destination with an inter-packet separation of S seconds.
  2. If no setup acknowledgment packet is received after T seconds, abort the connection. Stop.
  3. If a setup acknowledgment packet is received, discontinue setup packet transmission and start transmitting data packets.
In the current implementation, S is 1 second and T is 30 seconds.

During data transmission, TRUMP transmits data packets spaced by an inter-packet delay, whose value is calculated as follows:

where is obtained from the Return_Rate field from either the connection setup
acknowledgment packet or the last data acknowledgment packet. A single timer set to this delay prompts the source to transmit the next data packet.

With an initial inter-packet delay set, TRUMP performs the following code:

  while (there are still packets left to transmit) {

      make_pkt(pkt);
                                        /* Initialise many packet */
                                        /* fields including */
      pkt->seq_no = s++;                /* the sequence number, */
      pkt->desired_rate= pkt->rate=     /* the desired rate and */
                           infinity();  /* initial rate values, */
      pkt->bottlenode = node;           /* the bottlenode */

                                        /* Initialise fields to */
                                        /* hold information about */
                                        /* the ack flow, such as */
      pkt->rtn_rate = ack_rate;         /* the measured ack rate */
      pkt->rtn_bottlenode=              /* and the ack bottleneck */
                        ack_bottlenode;
      send_pkt(pkt);                    /* Send the packet off */
      set_timer(delay);                 /* Perform inter-pkt delay */
  }

Packet Routing

The packet has now entered the network. To reach its destination, it must be forwarded by one or more routers. When the packet reaches a router, the router performs the following operations:

  1. Determine the output interface on which the packet must be retransmitted.
  2. Decide to update any Allocated Bandwidth Tables based on the values of the five congestion fields in the packet.
  3. Decide to change the Rate and Bottlenode fields in the packet based on the information in the output interface's Allocated Bandwidth Table.
  4. Queue the packet on the output interface, or drop the packet if there is no room in the output queuegif.

Operations 1 and 4 must be done in any router. Operations 2 and 3 are part of the RBCC scheme. To these two we can add a third operation, the actual update of the Allocated Bandwidth Tables. Let us look at each of these three operations in turn.

Deciding to Update the ABTs

The router's Allocated Bandwidth Tables do not need to be updated on every packet arrival. For every packet, there are two ABTs which could be updated. The first ABT is for the output interface to which the packet is destined. The second ABT belongs to the interface which retransmits packets coming in the opposite direction (i.e destination to source). RBCC updates these ABTs on the conditions outlined below.

The first/forward ABT can be updated when:

The second/reverse ABT can be updated when:

Here is the pseudo-code for the above conditions:

  c is a pointer to the flow's entry in the ABT;
  if (this is a new traffic flow) {
    add an entry to the forward ABT, make c point to the new entry;
    c->desired_rate= pkt->desired_rate;

    /* Prevent infinite rate values in the ABT */
    if (pkt->rate== infinity()) {
          c->rate = 0.0; c->bottlenode = node;
    } else { c->rate = pkt->rate; c->bottlenode = pkt->bottlenode; }
    recalc_rate_table();
  }

  if (pkt->desired_rate != c->desired_rate) {
    c->desired_rate = pkt->desired_rate;
    recalc_rate_table();
  }

  if ((pkt->bottlenode == c->bottlenode) && (pkt->rate != c->rate)) {
    c->rate = pkt->rate;
    recalc_rate_table();
  }

  if (pkt->rate < c->rate) {
    c->rate = pkt->rate;
    c->bottlenode = pkt->bottlenode;
    recalc_rate_table(node, qnum, line_speeds[node][dest]);
  }

  /* Now test the reverse ABT conditions */
  if (pkt->rtn_rate < rev_c->rate or
      (pkt->rtn_bottlenode==rev_c->bottlenode and
                          pkt->rtn_rate != rev_c->rate)) {
    rev_c->rate = pkt->rtn_rate;
    rev_c->bottlenode = pkt->rtn_bottlenode;
    recalc_rate_table();
  }

Changing the Packet's Rate Field

There are two instances when a router can change a packet's Rate and Bottle_node fields. If the router has allocated less bandwidth than that indicated by the Rate field, then the Rate field can be lowered and the Bottle_node field updated to the value in the ABT.

When the router is itself the bottleneck on the traffic flow, the Rate field can be overwritten by the Rate field in the ABT. This has the effect of increasing the bandwidth available to the traffic flow when there is excess bandwidth at the router (another traffic flow may have terminated). Here is the pseudo-code for these two operations:

  /* Now update the rate field if necessary) */
  if ((pkt->rate > c->rate) || (pkt->bottlenode== node)) {
    pkt->rate = c->rate;
    pkt->bottlenode = c->bottlenode;
  }

Updating the ABTs

Updating an Allocated Bandwidth Table given the information already in the table and the congestion information in a packet is not trivial. We have considered the conditions which might prompt recalculation of the information in an ABT. Let us now consider the work involved.

There are several criteria which must be taken into account when updating an ABT:

The ABT update algorithm used by RBCC is:

The ABT update algorithm used by RBCC is:

  1. If there are no entries in the ABT, there is nothing to do. Stop.
  2. If all the traffic flows in the ABT are bottlenecked by this router, then
  3. Update the Rate entries by dividing the output interface's bandwidth evenly across all flows, without exceeding any flow's desired rate. Stop.
  4. Otherwise, calculate H, the highest rate in the ABT for a traffic flow not bottlenecked by this router.
  5. Find S, the sum of all the rates in the ABT. For those flows bottlenecked by this router, substitute H for their rate. This ensures that the flows bottlenecked by this router are treated as fairly as other flows in steps 6 and 7.
  6. Find the ratio .
  7. If R < 1, then there is less bandwidth on the output interface than required by all flows. Mark all flows as being bottlenecked by this router, and perform step 3.
  8. Otherwise, the output interface has excess capacity. Update some Rate entries by dividing the excess capacity evenly across all flows bottlenecked by this router, without exceeding any flow's desired rate. Stop.

The pseudo-code for the algorithm is below. Note that some of the fields in the ABT are updated by the update decision pseudo-code before this code is executed.

  /* Firstly, calculate H, the highest rate allocated to a connection */
  /* not bottlenecked by us. Also count the # of conns and the # of */
  /* conns bottled by us */
  for (H=0.0, c = head_conv_list; c; c = c->next) {
    conncnt++;
    if (c->rate > c->desired_rate) c->rate= c->desired_rate;
    if (c->bottlenode == node) bottlecnt++;
    else if ((c->rate != infinity()) && (c->rate > H)) H = c->rate;
  }
  if (conncnt==0) return;       /* nothing to do */

  /* If H remains zero, divide bandwidth */
  /* equally to all connections. */
  /* They are all bottled by us */
  /* The ABT list is ordered by desired rate */
  if (H == 0.0) {
    for (c = head_conv_list; c; c = c->next) {
        excess= bw / (1.0 * conncnt);
        c->rate = min(c->desired_rate, excess);
                                /* Subtract allocation from bandwidth */
        bw -= c->rate; conncnt--;
     }
    return;
  }

  /* Find the sum of rates allocated by us. For connections which are */
  /* bottlenecked by us, use the highest rate H above. */
  for (sum = 0.0, c = head_conv_list; c; c = c->next) {
    if (c->bottlenode == node) c->rate = min(H, c->desired_rate);
    sum += c->rate;
  }

  /* Now calculate the ratio of available bw to desired bw */
  ratio = bw / sum;

  /* If ratio is below 1.0, there is not enough bandwidth */
  /* so lower all the allocated bandwidths by the ratio. */
  /* If ratio is greater than zero, divide the excess bandwidth */
  /* amongst those connections bottlenecked by us. */
  if (ratio <= 1.0)
    for (c = head_conv_list; c; c = c->next) {
       c->rate *= ratio; c->bottlenode = node;
    }
  else {
    bw -= sum;                  /* Leave the excess in bandwidth */
    for (c = head_conv_list; bottlecnt && c; c = c->next)
        if (c->bottlenode==node) {
          excess= bw / (1.0 * bottlecnt);
          c->rate = min(c->desired_rate, excess);
          bw -= c->rate; bottlecnt--;
       }
  }

Packet Reception by the Destination

After being forwarded by all the routers between the source and destination, and having the Rate and Bottle_node fields set to reflect the bandwidth available to the corresponding traffic flow, the packet reaches the destination. In the current implementation of TRUMP in REAL, the packet is converted into an acknowledgment packet, the values in the Rate and Bottle_node fields are copied into the Return_Rate and Return_Bottle_node fields, respectively; the packet's size is changed and the packet is returned to the source. Here is the pseudo-code:

  pkt->type= ACK;
  pkt->dest= pkt->source;
  pkt->source= node;            /* We are now the packet's source */
                                /* Return the measured rate */
  pkt->rtn_rate= pkt->rate;
  pkt->rtn_bottlenode= pkt->bottlenode;

                                /* Guesstimate desired rate */
  pkt->desired_rate= pkt->rate=
        pkt->rate * ack_size / pkt->size ;
  pkt->bottlenode= node;
  pkt->size= ack_size;  

  send_pkt(pkt);

Note in particular the method used to determine the desired rate for the flow of acknowledgments back to the source:

This has the effect of limiting the desired rate of acknowledgment packets to a fraction of the available bandwidth for data packets, with the fraction based on the ratio between acknowledgment packet sizes and data packet sizes. When I get selective retransmission working correctly in TRUMP, the ratio will have to be divided again by the number of data packets that are acknowledged in each acknowledgment packet.

The assumption behind the equation above is that the data traffic flow is the limiting factor on the acknowledgment traffic flow. In some circumstances, it may be the other way around. I do not take into account this reverse situation at the moment.

I also do not separate the transmission of acknowledgment packets with any inter-packet delay at the moment; I trust that the spacing of data packets from the source will correctly space the acknowledgment packets from the destination. This is probably a glaring omission.

Packet Reception by the Source

As with the data packet, the acknowledgment packet is forwarded by several routers before being received at the source. As with the destination, the values of the Rate and Bottle_node fields are reused as the Return_Rate and Return_Bottle_node fields for future data packet transmission.

The Return_Rate field from the acknowledgment packet is used to recalculate the inter-packet delay, using the same equation as before:

The new delay takes effect after the next packet transmission.

Connection Termination

Correct calculation of a router's ABTs depends upon the router knowing about all the traffic flows through it. It can deduce the start of a new traffic flow by its packets, but it cannot easily deduce the end of a traffic flow from a lack of packets; the flow may have been slowed down by a bottleneck somewhere.

To alleviate this problem, the TRUMP protocol has been designed to perform a two-way handshake at the end of a traffic flow, with teardown packets. RBCC as implemented detects these packets, removes the corresponding entry for the traffic flow from the ABT and then recalculates the table:

  switch (pkt->type) {
    case TEARDOWN:
      delete connection;
      recalc_rate_table();
   }

A TRUMP source sends teardown packets every second once there are no packets left to send, and stops sending packets after it has received one teardown packet.

  sendteardown is initially 1;
  switch (pkt->type) {
    case TEARDOWN:
      sendteardown=0;
      break;
  }

  if (sendteardown) {
    /* We have successfully sent all our packets, */
    /* now send a teardown packet */
    make_pkt(pkt);
    pkt->seq_no= s;
    pkt->type= TEARDOWN;
    pkt->gen_time= runtime();
    pkt->gs= 0;
    pkt->desired_rate= pkt->rate= infinity(); /* Set initial rates */
    pkt->bottlenode= node;
    pkt->rtn_rate= ack_rate;
    pkt->rtn_bottlenode= ack_bottlenode;
    send_pkt(pkt);
    delay= 1.0e+6;              /* Send teardowns once a second */
  }

Upon receipt of a teardown packet, a TRUMP destination immediately reflects it back to the source.

  switch (pkt->type) {
    case TEARDOWN:
      /* send a teardown by changing the packet's params */
      pkt->dest = pkt->source;
      pkt->source = node;
      pkt->size = ack_size;     
      send_pkt(pkt);
      break;
  }



Warren Toomey
Fri Mar 15 10:43:33 EST 1996