From van@lbl-csam.arpa  Mon Jan 18 18:32:05 1988
Posted-Date: Mon, 18 Jan 88 18:30:42 PST
Received-Date: Mon, 18 Jan 88 18:32:05 PST
Received: from LBL-CSAM.ARPA by venera.isi.edu (5.54/5.51)
	id AA15227; Mon, 18 Jan 88 18:32:05 PST
Received: by lbl-csam.arpa (5.58/1.18)
	id AA12921; Mon, 18 Jan 88 18:30:44 PST
Message-Id: <8801190230.AA12921@lbl-csam.arpa>
To: Jon Crowcroft <jon@cs.ucl.ac.uk>
Cc: end2end-interest@venera.isi.edu
Subject: Re: Thinking about Congestion 
In-Reply-To: Your message of Mon, 18 Jan 88 16:05:06 GMT.
Date: Mon, 18 Jan 88 18:30:42 PST
From: Van Jacobson <van@lbl-csam.arpa>
Status: R

Jon,

That's quite interesting.  It looks like you see the same effect
I did when testing through an echo: there seems to be a queue
limit in the path, between either UCL and the Butterfly or the
Butterfly and SIMP.  It's hard to tell from this data but the
limit looks like either 8 or 16 packets.  The low throughput is
largely an artifact of testing through an echo server (which is
the main reason I gave up on Satnet tests -- to get realistic
throughput numbers you have to test to some other point on the
net, not to an echo). 

The queue limit sets an upper bound on total packets/sec. (except for
long transfers at very low error rates).  I think the throughput goes
down for the 16KB window/246B mss case vs. the 4KB window/512B mss
case because of a combination of three factors (I'm not sure of
their relative contribution, it would take more data and some time
to sort that out):

1) The larger mss is more "efficient" at transferring data because
   every fragment after the first carries only the IP header, not
   both the TCP & IP headers.  Remember there's a limit on the
   number of packets we can get into the pipe so we're interested
   in maximizing user data
   per packet.  Since the datagram size is 250B, a 512B mss gives a
   frag with 20B IP, 20B TCP and 210B user, a frag with 20B IP and
   230B user, and a frag with 20B IP and 72B user.  Or 512 user
   bytes / 3 packets = 170 user bytes / packet.  The 246B mss gives
   two frags, one 20/20/210 and one 20/0/36 or 246 user / 2 packets
   = 123 / packet.  Thus the 512B mss is about 30% more efficient
   delivering user data.  (You saw an ~30% effect -- 1.54 KB/s at
   512 vs. 1.13 KB/s at 246 -- so this might account for all the
   difference).

   Note that this doesn't necessarily argue for the largest possible
   mss: You lose more on an error with the large mss so for any
   given error rate there's an optimum that balances the efficiency
   gain against the higher loss.  I once worked out how to compute
   this optimum [it's a simple function of the network's max packet
   size and p, the probability of loss] but I can't find the notes
   just now.  But the derivation is trivial -- just the sort of
   thing to give a grad student on a final exam :-).  I remember
   that if the network packet size was reasonable (i.e., the
   protocol header was <10% of the max packet size), it almost never
   paid to fragment. 

   BTW, 512 & 246 are really bad choices for mss if you want to
   maximize throughput.  You should get the best throughput where
	   mss = 210 + 230 n
   for integer n >= 0.

2) The fast part of the turnon must be <2KB to fit within the limit.
   With a 4KB window, it takes one loss to "learn" this.  With a
   16KB window, it takes 4 losses.  Since this is a relatively short
   transfer (100KB or ~60 sec.), the 4 seconds per loss paid to
   learn anything means the additional 3 losses for the big window
   (3 x 4 = 12 of ~60 sec.) have an ~20% effect on throughput.

3) Since an echo test forces the gateway to simultaneously handle
   outgoing data and ack packets, there's a point (when you've
   filled approx. half the rtt with packets) where the data packets
   arrive at the bottleneck at the same time as returning acks from
   earlier data packets.  There is one ack generated per two tcp
   packets, on the average, so the 512B mss (3 fragments per tcp
   packet) gets one ack per 6 incoming packets and the 246B mss (2
   fragments per tcp packet) gets one ack per 4 incoming packets.
   Thus, at the "crossing time", the packet density at the
   gateway is about 30% higher with the smaller mss so it hits the
   packet limit about 30% sooner in terms of window (and delivered
   throughput). 

   Note that this is only a problem on echo tests since normally the
   forward and reverse paths are distinct and the data packets and
   acks don't compete for the same queue space.
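
The fragment arithmetic in (1) can be sketched in a few lines of
Python.  This is only an illustration under the numbers quoted above
(250B datagrams, 20B IP header, 20B TCP header); the function names
are made up for the example, and it ignores the IP rule that fragment
data is cut on 8-byte boundaries, as the figures in (1) do.

```python
MAX_PACKET = 250   # network max packet size in bytes (from the message)
IP_HDR = 20        # IP header, carried by every fragment
TCP_HDR = 20       # TCP header, carried only by the first fragment


def fragment_payloads(mss):
    """User bytes carried by each fragment of one mss-sized segment.

    Hypothetical helper: assumes fragments are filled greedily and
    ignores the 8-byte fragment-offset granularity of real IP.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    payloads = []
    remaining = mss
    capacity = MAX_PACKET - IP_HDR - TCP_HDR   # first fragment: 210
    while remaining > 0:
        take = min(capacity, remaining)
        payloads.append(take)
        remaining -= take
        capacity = MAX_PACKET - IP_HDR         # later fragments: 230
    return payloads


def user_bytes_per_packet(mss):
    """Average user bytes per packet on the wire for a given mss."""
    return mss / len(fragment_payloads(mss))


if __name__ == "__main__":
    # 512 and 246 are the tested values; 210 and 440 are 210 + 230 n.
    for mss in (512, 246, 210, 440):
        print(mss, fragment_payloads(mss),
              round(user_bytes_per_packet(mss), 1))
```

The 512 and 246 cases reproduce the fragment sizes listed in (1), and
any mss of the form 210 + 230 n comes out with every fragment full.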

 - Van

ps- It would be interesting if the trace contained the other half
    of the transfer (i.e., the result of "tcpdump host satnet-echo").
    Then we could compute the outbound packet density by grepping
    out all the "> satnet" lines, taking the first difference of
    their timestamps to get interarrival time, then inverting the
    smoothed interarrival time to get instantaneous arrival rate
    (which should be directly proportional to queue length).
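
A minimal sketch of that computation, with several assumptions that
are not in the message: each trace line is taken to start with an
"hh:mm:ss.frac" timestamp, the smoother is a simple exponential
average with a gain of 0.25, and the sample lines are invented.  The
function names are hypothetical too.

```python
def parse_time(stamp):
    """Convert 'hh:mm:ss.frac' to seconds since midnight.

    Assumed timestamp layout; a trace that crosses midnight is not
    handled.
    """
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def arrival_rates(lines, marker="> satnet", gain=0.25):
    """Smoothed packets/sec for the trace lines that contain marker.

    Steps: keep the matching lines, take the first difference of
    their timestamps, smooth the gaps with an exponential average
    (an assumed choice of smoother), then invert the smoothed gap.
    """
    times = [parse_time(line.split()[0])
             for line in lines if marker in line]
    rates = []
    smoothed = None
    for prev, cur in zip(times, times[1:]):
        gap = cur - prev
        if smoothed is None:
            smoothed = gap                      # seed with first gap
        else:
            smoothed += gain * (gap - smoothed)
        rates.append(1.0 / smoothed if smoothed > 0 else float("inf"))
    return rates


if __name__ == "__main__":
    # Invented sample lines, only to show the expected input shape.
    sample = [
        "18:30:42.0 ucl > satnet-echo: data",
        "18:30:42.2 satnet-echo > ucl: ack",
        "18:30:42.5 ucl > satnet-echo: data",
        "18:30:43.0 ucl > satnet-echo: data",
    ]
    print(arrival_rates(sample))
```

Only the outbound lines are counted, so the returning line in the
sample is skipped before the differences are taken.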

pps- I don't really need the send-ack & packetdat output -- I can
    generate them from the raw trace data (I only mention this
    because the message put me way over my disk quota on lbl-csam).