From van@helios.ee.lbl.gov  Wed Aug 10 07:08:10 1988
Posted-Date: Wed, 10 Aug 88 07:07:41 PDT
Received-Date: Wed, 10 Aug 88 07:08:10 PDT
Received: from vs.ee.lbl.gov by venera.isi.edu (5.54/5.51)
	id AA05555; Wed, 10 Aug 88 07:08:10 PDT
Received: by helios.ee.lbl.gov (5.59/s2.2)
	id AA02228; Wed, 10 Aug 88 07:07:43 PDT
Message-Id: <8808101407.AA02228@helios.ee.lbl.gov>
To: Craig Partridge <craig@nnsc.nsf.net>
Cc: karels@monet.berkeley.edu, end2end@venera.isi.edu
Subject: Re: BSD routing structures 
In-Reply-To: Your message of Mon, 08 Aug 88 13:18:04 EDT.
Date: Wed, 10 Aug 88 07:07:41 PDT
From: Van Jacobson <van@helios.ee.lbl.gov>
Status: R

Craig -

The kernel routing stuff is in flux right now so it will be real
difficult to coordinate.  The routing ioctls are being replaced
by msgs to/from the kernel (i.e., a SOCK_ROUTE that you send to in
order to add/delete/change routes and read from to find out what changes
have been made).  Tied up with this are changes to support the
(bizarre) ISO ideas on layering (`"Qualified data" for the
transport layer?  You indicate that in layer 2, of course.')
I don't know when things will settle down.

A while ago I added the following information fields to a route
entry and changed tcp to use them:

	u_long	rt_mtu;		/* MTU for this path (bytes) */
	u_long	rt_maxhops;	/* max hops expected along rt 
				   (used to set datagram ttl) */
	u_long	rt_recvpipe;	/* inbound delay-bandwidth product (bytes) */
	u_long	rt_sendpipe;	/* outbound delay-bandwidth product (bytes) */
	u_long	rt_pipelimit;	/* outbound gateway buffer limit (bytes) */
	u_long	rt_rtt;		/* estimated round trip time (usec) */
	u_long	rt_rttvar;	/* estimated rtt variance (usec) */
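
As a rough illustration of how a protocol might consume these fields,
here is a minimal sketch.  The struct name, the helper names, the
"zero means unset" convention and the fallback values are all
assumptions made for the example; none of them come from the actual
net/route.[ch] mentioned in this message.

```c
#include <stddef.h>

/* Hypothetical container for the fields listed above. */
struct route_metrics {
	unsigned long rt_mtu;       /* MTU for this path (bytes) */
	unsigned long rt_maxhops;   /* max hops expected along rt */
	unsigned long rt_recvpipe;  /* inbound delay-bandwidth product (bytes) */
	unsigned long rt_sendpipe;  /* outbound delay-bandwidth product (bytes) */
	unsigned long rt_pipelimit; /* outbound gateway buffer limit (bytes) */
	unsigned long rt_rtt;       /* estimated round trip time (usec) */
	unsigned long rt_rttvar;    /* estimated rtt variance (usec) */
};

#define DEFAULT_TTL     30UL    /* assumed fallback, not from the source */
#define DEFAULT_SENDBUF 4096UL  /* assumed fallback, not from the source */
#define MAX_TTL         255UL   /* the IP ttl field is 8 bits wide */

/* Choose a datagram ttl from rt_maxhops; 0 is treated as "unset". */
static unsigned char route_ttl(const struct route_metrics *m)
{
	unsigned long hops = DEFAULT_TTL;

	if (m != NULL && m->rt_maxhops != 0)
		hops = m->rt_maxhops;
	if (hops > MAX_TTL)
		hops = MAX_TTL;
	return (unsigned char)hops;
}

/* Size the send buffer to the outbound delay-bandwidth product. */
static unsigned long route_sendbuf(const struct route_metrics *m)
{
	if (m == NULL || m->rt_sendpipe == 0)
		return DEFAULT_SENDBUF;
	return m->rt_sendpipe;
}
```

With rt_maxhops set to 12 the sketch picks a ttl of 12, and with no
route information at all it falls back to the assumed defaults.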

This information will certainly exist in the new implementation
but God only knows what the interface to it will be.  But I can
probably save you some time by giving you what I've done so far
(a new net/route.[ch] and a /usr/src/etc/route.c with "set" and
"info" commands added).  I'd prefer this not be distributed though.

I think you really need to use the ctlinput upcall to notify the
higher level protocols of an mtu change, not just fiddle the
mtu in the route.  TCP, for example, has used the mtu to calculate
the MSS (and, I should note, contrary to Bob's opinion, the
calculation was taken directly from rfc879) and the result of
that calculation is stashed in the tcp pcb.  Since the value
should change seldom (BTW, the rfc1063 probe intervals have got
to be way too short), you don't want to force tcp to check for
changes in the mtu every packet.  The upcall mechanism already
exists -- just call pfctlinput in the ip layer when you decide
the mtu has changed.  You'll need a new type code, maybe
PRC_NEW_CHARACTERISTICS and a case to handle it in the ctlinput
routine of any proto that's interested.  The way PRC_QUENCH
is handled in tcp_ctlinput is probably a good model.
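
The upcall idea can be sketched in user-space C as follows.  This is
not the kernel's tcp_ctlinput: the control block, the function
signature and the numeric values of the PRC_ codes are placeholders
made up for the example, and PRC_NEW_CHARACTERISTICS is only the name
proposed above.  The MSS arithmetic (MTU minus 40 bytes of IP and TCP
header) is the rfc879 calculation.

```c
#include <stddef.h>

#define IP_TCP_HDR_BYTES 40UL   /* 20-byte IP header + 20-byte TCP header */

/* Placeholder command codes; the values are not the kernel's. */
enum { PRC_QUENCH = 1, PRC_NEW_CHARACTERISTICS = 2 };

/* Hypothetical slice of a tcp pcb: only what the sketch needs. */
struct tcpcb_sketch {
	unsigned long t_maxseg;  /* MSS cached from the path mtu */
	unsigned long snd_cwnd;  /* congestion window (bytes) */
};

/* rfc879: MSS is the mtu less the IP and TCP headers (0 if too small). */
static unsigned long mss_from_mtu(unsigned long mtu)
{
	return (mtu > IP_TCP_HDR_BYTES) ? mtu - IP_TCP_HDR_BYTES : 0;
}

/*
 * Called once when the IP layer decides something changed, so the
 * per-packet path never has to re-check the route's mtu.
 */
static void tcp_ctlinput_sketch(int cmd, struct tcpcb_sketch *tp,
				unsigned long path_mtu)
{
	unsigned long mss;

	if (tp == NULL)
		return;
	switch (cmd) {
	case PRC_QUENCH:
		/* Shrink the window to one segment. */
		tp->snd_cwnd = tp->t_maxseg;
		break;
	case PRC_NEW_CHARACTERISTICS:
		/* Recompute the cached MSS; ignore a nonsensical mtu. */
		mss = mss_from_mtu(path_mtu);
		if (mss != 0)
			tp->t_maxseg = mss;
		break;
	default:
		break;
	}
}
```

For a 1500-byte path mtu the handler caches an MSS of 1460, and the
cached value stays put until the next upcall arrives.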

 - Van

ps- I didn't see anyone answer Drew's objection to mtu discovery:
    that it fails silently when gateways don't implement the probe
    option (i.e., a gateway that doesn't implement probe and is the
    limiting mtu in the path results in fragmentation and no way to
    "discover" it.)  Since it will take a long time to deploy mtu
    option handling to all gateways, doesn't this limit the utility
    of the mtu discovery?

    What about having the receiver send a probe *reply* (subject to
    appropriate rate, "try everything else first" and "other side
    ignores me" limits) when it receives the first frag of a
    fragmented datagram?  The size of this frag will almost always
    be the limiting mtu, so you know what value to use in the
    reply.  And this should stop the fragmentation even if none of
    the intermediate gateways handle probes.  And then, since
    you'd have an event-driven trigger to handle most fragmentation,
    you could multiply all the probe intervals in rfc1063 by 100 ...
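
A minimal sketch of the receiver-side trigger described above, in
user-space C.  The state struct, the function name, and the particular
limits (a minimum interval between replies and a cap on total replies
standing in for the "other side ignores me" limit) are assumptions for
illustration; the "try everything else first" condition is left out.

```c
/* Hypothetical per-peer state for rate-limiting probe replies. */
struct probe_reply_state {
	unsigned long last_reply_sec;   /* when the last reply was sent */
	unsigned long min_interval_sec; /* minimum spacing between replies */
	unsigned int  replies_sent;     /* replies sent so far */
	unsigned int  max_replies;      /* give up after this many */
};

/*
 * Decide whether an arriving fragment should trigger a probe reply.
 * Returns the mtu to report (the size of the first fragment), or 0
 * when no reply should be sent.  now_sec is assumed to be monotonic.
 */
static unsigned long mtu_reply_for_fragment(struct probe_reply_state *st,
					    unsigned int frag_offset,
					    int more_frags,
					    unsigned long frag_len,
					    unsigned long now_sec)
{
	/* Only the first fragment of a fragmented datagram qualifies. */
	if (frag_offset != 0 || !more_frags)
		return 0;
	/* The sender appears to be ignoring us; stop trying. */
	if (st->replies_sent >= st->max_replies)
		return 0;
	/* Rate limit: at most one reply per interval. */
	if (st->replies_sent > 0 &&
	    now_sec - st->last_reply_sec < st->min_interval_sec)
		return 0;

	st->last_reply_sec = now_sec;
	st->replies_sent++;
	return frag_len;
}
```

A first fragment of 576 bytes would produce a reply advertising 576;
later fragments of the same datagram and unfragmented datagrams
produce nothing.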

From van@helios.ee.lbl.gov  Fri Aug 12 14:39:47 1988
Posted-Date: Fri, 12 Aug 88 14:40:09 PDT
Received-Date: Fri, 12 Aug 88 14:39:47 PDT
Received: from vs.ee.lbl.gov by venera.isi.edu (5.54/5.51)
	id AA24886; Fri, 12 Aug 88 14:39:47 PDT
Received: by helios.ee.lbl.gov (5.59/s2.2)
	id AA00979; Fri, 12 Aug 88 14:40:10 PDT
Message-Id: <8808122140.AA00979@helios.ee.lbl.gov>
To: braden@venera.isi.edu
Cc: end2end@venera.isi.edu
Subject: Re: BSD routing structures 
In-Reply-To: Your message of Fri, 12 Aug 88 11:53:04 PDT.
Date: Fri, 12 Aug 88 14:40:09 PDT
From: Van Jacobson <van@helios.ee.lbl.gov>
Status: R

Bob -

I was flaming back at you for a July 7th flame at Berkeley
(a msg to ietf-hosts with the subject "re: IP MTU Option"
that started with the line

	As usual, the confusion is caused by Berkeley.

and concluded with an injunction to re-read rfc879.)  As near as
I can tell, 4bsd implements exactly the algorithms in sections 7
& 11 of 879 for both sending & receiving the mss option.  If
this isn't so and something's broken, let me know & I'll try to
get it fixed.

 - Van

