root/cebix/BasiliskII/src/slirp/tcp_input.c
Revision: 1.4
Committed: 2012-03-30T01:10:28Z (12 years, 1 month ago) by asvitkine
Content type: text/plain
Branch: MAIN
CVS Tags: HEAD
Changes since 1.3: +1 -5 lines
Log Message:
Switch slirp to 3-clause BSD license. This change went in upstream to QEMU's
version of slirp (where this code comes from), with the following checkin:

commit 2f5f89963186d42a7ded253bc6cf5b32abb45cec
Author: aliguori <aliguori@c046a42c-6fe2-441c-8c8c-71466251a162>
Date:   Mon Jan 26 19:37:41 2009 +0000

    Remove the advertising clause from the slirp license

    According to the FSF, the 4-clause BSD license, which slirp is covered under,
    is not compatible with the GPL or LGPL[1].

    [1] http://www.fsf.org/licensing/licenses/index_html#GPLIncompatibleLicenses

    There are three declared copyright holders in slirp that use the 4-clause
    BSD license, the Regents of UC Berkley, Danny Gasparovski, and Kelly Price.
    Below are the appropriate permissions to remove the advertise clause from slirp
    from each party.

    Special thanks go to Richard Fontana from Red Hat for contacting all of the
    necessary authors to resolve this issue!

    Regents of UC Berkley:
    From ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change

    July 22, 1999

    To All Licensees, Distributors of Any Version of BSD:

    As you know, certain of the Berkeley Software Distribution ("BSD") source
    code files require that further distributions of products containing all or
    portions of the software, acknowledge within their advertising materials
    that such products contain software developed by UC Berkeley and its
    contributors.

    Specifically, the provision reads:

    "     * 3. All advertising materials mentioning features or use of this software
          *    must display the following acknowledgement:
          *    This product includes software developed by the University of
          *    California, Berkeley and its contributors."

    Effective immediately, licensees and distributors are no longer required to
    include the acknowledgement within advertising materials.  Accordingly, the
    foregoing paragraph of those BSD Unix files containing it is hereby deleted
    in its entirety.

    William Hoskins
    Director, Office of Technology Licensing
    University of California, Berkeley

    Danny Gasparovski:

    Subject: RE: Slirp license
    Date: Thu, 8 Jan 2009 10:51:00 +1100
    From: "Gasparovski, Daniel" <Daniel.Gasparovski@ato.gov.au>
    To: "Richard Fontana" <rfontana@redhat.com>

    Hi Richard,

    I have no objection to having Slirp code in QEMU be licensed under the
    3-clause BSD license.

    Thanks for taking the effort to consult me about this.


    Dan ...

    Kelly Price:

    Date: Thu, 8 Jan 2009 19:38:56 -0500
    From: "Kelly Price" <strredwolf@gmail.com>
    To: "Richard Fontana" <rfontana@redhat.com>
    Subject: Re: Slirp license

    Thanks for contacting me, Richard.  I'm glad you were able to find
    Dan, as I've been "keeping the light on" for Slirp.  I have no use for
    it now, and I have little time for it (now holding onto Keenspot's
    Comic Genesis and having a regular US state government position). If
    Dan would like to return to the project, I'd love to give it back to
    him.

    As for copyright, I don't own all of it.  Dan does, so I will defer to
    him.  Any of my patches I will gladly license to the 3-part BSD
    license.  My interest in re-licensing was because we didn't have ready
    info to contact Dan.  If Dan would like to port Slirp back out of
    QEMU, a lot of us 64-bit users would be grateful.

    Feel free to share this email address with Dan.  I will be glad to
    effect a transfer of the project to him and Mr. Bellard of the QEMU
    project.

    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>


    git-svn-id: svn://svn.savannah.nongnu.org/qemu/trunk@6451 c046a42c-6fe2-441c-8c8c-71466251a162

File Contents

# User Rev Content
1 gbeauche 1.1 /*
2     * Copyright (c) 1982, 1986, 1988, 1990, 1993, 1994
3     * The Regents of the University of California. All rights reserved.
4     *
5     * Redistribution and use in source and binary forms, with or without
6     * modification, are permitted provided that the following conditions
7     * are met:
8     * 1. Redistributions of source code must retain the above copyright
9     * notice, this list of conditions and the following disclaimer.
10     * 2. Redistributions in binary form must reproduce the above copyright
11     * notice, this list of conditions and the following disclaimer in the
12     * documentation and/or other materials provided with the distribution.
13 asvitkine 1.4 * 3. Neither the name of the University nor the names of its contributors
14 gbeauche 1.1 * may be used to endorse or promote products derived from this software
15     * without specific prior written permission.
16     *
17     * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
18     * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
19     * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
20     * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
21     * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
22     * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
23     * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
24     * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
25     * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
26     * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
27     * SUCH DAMAGE.
28     *
29     * @(#)tcp_input.c 8.5 (Berkeley) 4/10/94
30     * tcp_input.c,v 1.10 1994/10/13 18:36:32 wollman Exp
31     */
32    
33     /*
34     * Changes and additions relating to SLiRP
35     * Copyright (c) 1995 Danny Gasparovski.
36     *
37     * Please read the file COPYRIGHT for the
38     * terms and conditions of the copyright.
39     */
40    
41 asvitkine 1.3 #include <stdlib.h>
42 gbeauche 1.1 #include <slirp.h>
43     #include "ip_icmp.h"
44    
45     struct socket tcb;
46    
47     int tcprexmtthresh = 3;
48     struct socket *tcp_last_so = &tcb;
49    
50     tcp_seq tcp_iss; /* tcp initial send seq # */
51    
52     #define TCP_PAWS_IDLE (24 * 24 * 60 * 60 * PR_SLOWHZ)
53    
54     /* for modulo comparisons of timestamps */
55     #define TSTMP_LT(a,b) ((int)((a)-(b)) < 0)
56     #define TSTMP_GEQ(a,b) ((int)((a)-(b)) >= 0)
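
The two macros above compare timestamps modulo 2^32: the unsigned difference is reinterpreted as signed, so a timestamp taken shortly after the counter wraps still counts as later than one taken just before. A minimal sketch of that behaviour follows. The `TSTMP_LT32`/`TSTMP_GEQ32` names, the fixed-width casts, and the `tstmp_examples` helper are illustrative assumptions and are not part of the original file, which casts to plain `int`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative variants of TSTMP_LT/TSTMP_GEQ. Assumption: fixed-width
 * casts replace the listing's plain (int) cast to make the 32-bit width
 * explicit. Converting an out-of-range value to int32_t is
 * implementation-defined in C, but wraps on two's-complement targets.
 */
#define TSTMP_LT32(a, b)  ((int32_t)((uint32_t)(a) - (uint32_t)(b)) < 0)
#define TSTMP_GEQ32(a, b) ((int32_t)((uint32_t)(a) - (uint32_t)(b)) >= 0)

/* Hypothetical helper that exercises the wraparound case. */
void tstmp_examples(void)
{
    uint32_t before_wrap = 0xFFFFFFF0u; /* just before the counter wraps */
    uint32_t after_wrap  = 0x00000010u; /* 32 ticks later, after wrapping */

    /* A plain unsigned comparison gets the order backwards... */
    assert(!(before_wrap < after_wrap));
    /* ...while the modular comparison treats before_wrap as earlier. */
    assert(TSTMP_LT32(before_wrap, after_wrap));
    assert(TSTMP_GEQ32(after_wrap, before_wrap));
    /* Equal timestamps are "greater than or equal", never "less than". */
    assert(TSTMP_GEQ32(after_wrap, after_wrap));
    assert(!TSTMP_LT32(after_wrap, after_wrap));
}
```

The comparison is only meaningful while the two timestamps are less than 2^31 ticks apart; beyond that the sign of the difference flips.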
57    
58     /*
59     * Insert segment ti into reassembly queue of tcp with
60     * control block tp. Return TH_FIN if reassembly now includes
61     * a segment with FIN. The macro form does the common case inline
62     * (segment is the next to be received on an established connection,
63     * and the queue is empty), avoiding linkage into and removal
64     * from the queue and repetition of various conversions.
65     * Set DELACK for segments received in order, but ack immediately
66     * when segments are out of order (so fast retransmit can work).
67     */
68     #ifdef TCP_ACK_HACK
69     #define TCP_REASS(tp, ti, m, so, flags) {\
70     if ((ti)->ti_seq == (tp)->rcv_nxt && \
71     (tp)->seg_next == (tcpiphdrp_32)(tp) && \
72     (tp)->t_state == TCPS_ESTABLISHED) {\
73     if (ti->ti_flags & TH_PUSH) \
74     tp->t_flags |= TF_ACKNOW; \
75     else \
76     tp->t_flags |= TF_DELACK; \
77     (tp)->rcv_nxt += (ti)->ti_len; \
78     flags = (ti)->ti_flags & TH_FIN; \
79     tcpstat.tcps_rcvpack++;\
80     tcpstat.tcps_rcvbyte += (ti)->ti_len;\
81     if (so->so_emu) { \
82     if (tcp_emu((so),(m))) sbappend((so), (m)); \
83     } else \
84     sbappend((so), (m)); \
85     /* sorwakeup(so); */ \
86     } else {\
87     (flags) = tcp_reass((tp), (ti), (m)); \
88     tp->t_flags |= TF_ACKNOW; \
89     } \
90     }
91     #else
92     #define TCP_REASS(tp, ti, m, so, flags) { \
93     if ((ti)->ti_seq == (tp)->rcv_nxt && \
94     (tp)->seg_next == (tcpiphdrp_32)(tp) && \
95     (tp)->t_state == TCPS_ESTABLISHED) { \
96     tp->t_flags |= TF_DELACK; \
97     (tp)->rcv_nxt += (ti)->ti_len; \
98     flags = (ti)->ti_flags & TH_FIN; \
99     tcpstat.tcps_rcvpack++;\
100     tcpstat.tcps_rcvbyte += (ti)->ti_len;\
101     if (so->so_emu) { \
102     if (tcp_emu((so),(m))) sbappend(so, (m)); \
103     } else \
104     sbappend((so), (m)); \
105     /* sorwakeup(so); */ \
106     } else { \
107     (flags) = tcp_reass((tp), (ti), (m)); \
108     tp->t_flags |= TF_ACKNOW; \
109     } \
110     }
111     #endif
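
Both variants of TCP_REASS above take the inline path only when three conditions hold together: the segment starts exactly at rcv_nxt, the reassembly queue is empty, and the connection is ESTABLISHED. Everything else goes through tcp_reass(). The sketch below isolates that predicate. The `rx_view` struct, the `conn_state` enum, and both function names are hypothetical simplifications introduced for illustration; the real macro reads these values from `tp` and `ti` directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of connection states, for illustration only. */
enum conn_state { ST_SYN_RECEIVED, ST_ESTABLISHED };

/* Hypothetical flattened view of the fields the macro inspects. */
struct rx_view {
    uint32_t seg_seq;        /* corresponds to ti->ti_seq */
    uint32_t rcv_nxt;        /* corresponds to tp->rcv_nxt */
    bool queue_empty;        /* tp->seg_next points back at tp */
    enum conn_state state;   /* corresponds to tp->t_state */
};

/* True when the segment can be appended without touching the queue. */
bool reass_fast_path(const struct rx_view *v)
{
    return v->seg_seq == v->rcv_nxt &&
           v->queue_empty &&
           v->state == ST_ESTABLISHED;
}

/* Hypothetical helper showing one accepted and three rejected cases. */
void reass_fast_path_examples(void)
{
    struct rx_view v = { 100u, 100u, true, ST_ESTABLISHED };
    assert(reass_fast_path(&v));

    v.seg_seq = 150u;               /* out of order: a hole before it */
    assert(!reass_fast_path(&v));

    v.seg_seq = 100u;
    v.queue_empty = false;          /* earlier segments still queued */
    assert(!reass_fast_path(&v));

    v.queue_empty = true;
    v.state = ST_SYN_RECEIVED;      /* handshake not finished */
    assert(!reass_fast_path(&v));
}
```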
112    
113     int
114     tcp_reass(tp, ti, m)
115     register struct tcpcb *tp;
116     register struct tcpiphdr *ti;
117     struct mbuf *m;
118     {
119     register struct tcpiphdr *q;
120     struct socket *so = tp->t_socket;
121     int flags;
122    
123     /*
124     * Call with ti==0 after becoming established to
125     * force pre-ESTABLISHED data up to user socket.
126     */
127     if (ti == 0)
128     goto present;
129    
130     /*
131     * Find a segment which begins after this one does.
132     */
133     for (q = (struct tcpiphdr *)tp->seg_next; q != (struct tcpiphdr *)tp;
134     q = (struct tcpiphdr *)q->ti_next)
135     if (SEQ_GT(q->ti_seq, ti->ti_seq))
136     break;
137    
138     /*
139     * If there is a preceding segment, it may provide some of
140     * our data already. If so, drop the data from the incoming
141     * segment. If it provides all of our data, drop us.
142     */
143     if ((struct tcpiphdr *)q->ti_prev != (struct tcpiphdr *)tp) {
144     register int i;
145     q = (struct tcpiphdr *)q->ti_prev;
146     /* conversion to int (in i) handles seq wraparound */
147     i = q->ti_seq + q->ti_len - ti->ti_seq;
148     if (i > 0) {
149     if (i >= ti->ti_len) {
150     tcpstat.tcps_rcvduppack++;
151     tcpstat.tcps_rcvdupbyte += ti->ti_len;
152     m_freem(m);
153     /*
154     * Try to present any queued data
155     * at the left window edge to the user.
156     * This is needed after the 3-WHS
157     * completes.
158     */
159     goto present; /* ??? */
160     }
161     m_adj(m, i);
162     ti->ti_len -= i;
163     ti->ti_seq += i;
164     }
165     q = (struct tcpiphdr *)(q->ti_next);
166     }
167     tcpstat.tcps_rcvoopack++;
168     tcpstat.tcps_rcvoobyte += ti->ti_len;
169     REASS_MBUF(ti) = (mbufp_32) m; /* XXX */
170    
171     /*
172     * While we overlap succeeding segments trim them or,
173     * if they are completely covered, dequeue them.
174     */
175     while (q != (struct tcpiphdr *)tp) {
176     register int i = (ti->ti_seq + ti->ti_len) - q->ti_seq;
177     if (i <= 0)
178     break;
179     if (i < q->ti_len) {
180     q->ti_seq += i;
181     q->ti_len -= i;
182     m_adj((struct mbuf *) REASS_MBUF(q), i);
183     break;
184     }
185     q = (struct tcpiphdr *)q->ti_next;
186     m = (struct mbuf *) REASS_MBUF((struct tcpiphdr *)q->ti_prev);
187     remque_32((void *)(q->ti_prev));
188     m_freem(m);
189     }
190    
191     /*
192     * Stick new segment in its place.
193     */
194     insque_32(ti, (void *)(q->ti_prev));
195    
196     present:
197     /*
198     * Present data to user, advancing rcv_nxt through
199     * completed sequence space.
200     */
201     if (!TCPS_HAVEESTABLISHED(tp->t_state))
202     return (0);
203     ti = (struct tcpiphdr *) tp->seg_next;
204     if (ti == (struct tcpiphdr *)tp || ti->ti_seq != tp->rcv_nxt)
205     return (0);
206     if (tp->t_state == TCPS_SYN_RECEIVED && ti->ti_len)
207     return (0);
208     do {
209     tp->rcv_nxt += ti->ti_len;
210     flags = ti->ti_flags & TH_FIN;
211     remque_32(ti);
212     m = (struct mbuf *) REASS_MBUF(ti); /* XXX */
213     ti = (struct tcpiphdr *)ti->ti_next;
214     /* if (so->so_state & SS_FCANTRCVMORE) */
215     if (so->so_state & SS_FCANTSENDMORE)
216     m_freem(m);
217     else {
218     if (so->so_emu) {
219     if (tcp_emu(so,m)) sbappend(so, m);
220     } else
221     sbappend(so, m);
222     }
223     } while (ti != (struct tcpiphdr *)tp && ti->ti_seq == tp->rcv_nxt);
224     /* sorwakeup(so); */
225     return (flags);
226     }
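
The first trimming step in tcp_reass() computes how far the preceding queued segment reaches into the incoming one, then either drops the incoming segment as a full duplicate or cuts the overlap off its front. The sketch below reproduces only that arithmetic, under simplifying assumptions: the `seg` struct and the function names are hypothetical, there are no mbufs or queue links, and fixed-width types stand in for the listing's `int`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified segment: a sequence number and a length. */
struct seg {
    uint32_t seq;
    int32_t  len;
};

/*
 * Remove from the front of `in` whatever `prev` already covers.
 * Returns 0 if `in` is entirely duplicate data, otherwise 1.
 * The signed conversion of the unsigned difference is what lets the
 * comparison survive sequence-number wraparound.
 */
int trim_front_overlap(const struct seg *prev, struct seg *in)
{
    int32_t i = (int32_t)(uint32_t)(prev->seq + (uint32_t)prev->len - in->seq);

    if (i > 0) {
        if (i >= in->len)
            return 0;            /* prev already holds all of in's data */
        in->seq += (uint32_t)i;  /* skip the overlapping bytes */
        in->len -= i;
    }
    return 1;
}

/* Hypothetical helper covering partial overlap, duplicate, and wraparound. */
void trim_front_overlap_examples(void)
{
    struct seg prev = { 1000u, 100 };
    struct seg in   = { 1050u, 100 };
    assert(trim_front_overlap(&prev, &in) == 1);
    assert(in.seq == 1100u && in.len == 50);

    struct seg dup = { 1020u, 50 };
    assert(trim_front_overlap(&prev, &dup) == 0);

    struct seg wprev = { 0xFFFFFFF0u, 32 };  /* ends at 0x10 after wrapping */
    struct seg win   = { 0u, 32 };
    assert(trim_front_overlap(&wprev, &win) == 1);
    assert(win.seq == 16u && win.len == 16);
}
```

The loop over succeeding segments in the listing applies the same difference in the other direction, trimming or dequeuing queued segments that the new one covers.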
227    
228     /*
229     * TCP input routine, follows pages 65-76 of the
230     * protocol specification dated September, 1981 very closely.
231     */
232     void
233     tcp_input(m, iphlen, inso)
234     register struct mbuf *m;
235     int iphlen;
236     struct socket *inso;
237     {
238     struct ip save_ip, *ip;
239     register struct tcpiphdr *ti;
240     caddr_t optp = NULL;
241     int optlen = 0;
242     int len, tlen, off;
243     register struct tcpcb *tp = 0;
244     register int tiflags;
245     struct socket *so = 0;
246     int todrop, acked, ourfinisacked, needoutput = 0;
247     /* int dropsocket = 0; */
248     int iss = 0;
249     u_long tiwin;
250     int ret;
251     /* int ts_present = 0; */
252    
253     DEBUG_CALL("tcp_input");
254     DEBUG_ARGS((dfd," m = %8lx iphlen = %2d inso = %lx\n",
255     (long )m, iphlen, (long )inso ));
256    
257     /*
258     * If called with m == 0, then we're continuing the connect
259     */
260     if (m == NULL) {
261     so = inso;
262    
263     /* Re-set a few variables */
264     tp = sototcpcb(so);
265     m = so->so_m;
266     so->so_m = 0;
267     ti = so->so_ti;
268     tiwin = ti->ti_win;
269     tiflags = ti->ti_flags;
270    
271     goto cont_conn;
272     }
273    
274    
275     tcpstat.tcps_rcvtotal++;
276     /*
277     * Get IP and TCP header together in first mbuf.
278     * Note: IP leaves IP header in first mbuf.
279     */
280     ti = mtod(m, struct tcpiphdr *);
281     if (iphlen > sizeof(struct ip )) {
282     ip_stripoptions(m, (struct mbuf *)0);
283     iphlen=sizeof(struct ip );
284     }
285     /* XXX Check if too short */
286    
287    
288     /*
289     * Save a copy of the IP header in case we want to restore it
290     * for sending an ICMP error message in response.
291     */
292     ip=mtod(m, struct ip *);
293     save_ip = *ip;
294     save_ip.ip_len+= iphlen;
295    
296     /*
297     * Checksum extended TCP header and data.
298     */
299     tlen = ((struct ip *)ti)->ip_len;
300     ti->ti_next = ti->ti_prev = 0;
301     ti->ti_x1 = 0;
302     ti->ti_len = htons((u_int16_t)tlen);
303     len = sizeof(struct ip ) + tlen;
304     /* keep checksum for ICMP reply
305     * ti->ti_sum = cksum(m, len);
306     * if (ti->ti_sum) { */
307     if(cksum(m, len)) {
308     tcpstat.tcps_rcvbadsum++;
309     goto drop;
310     }
311    
312     /*
313     * Check that TCP offset makes sense,
314     * pull out TCP options and adjust length. XXX
315     */
316     off = ti->ti_off << 2;
317     if (off < sizeof (struct tcphdr) || off > tlen) {
318     tcpstat.tcps_rcvbadoff++;
319     goto drop;
320     }
321     tlen -= off;
322     ti->ti_len = tlen;
323     if (off > sizeof (struct tcphdr)) {
324     optlen = off - sizeof (struct tcphdr);
325     optp = mtod(m, caddr_t) + sizeof (struct tcpiphdr);
326    
327     /*
328     * Do quick retrieval of timestamp options ("options
329     * prediction?"). If timestamp is the only option and it's
330     * formatted as recommended in RFC 1323 appendix A, we
331     * quickly get the values now and not bother calling
332     * tcp_dooptions(), etc.
333     */
334     /* if ((optlen == TCPOLEN_TSTAMP_APPA ||
335     * (optlen > TCPOLEN_TSTAMP_APPA &&
336     * optp[TCPOLEN_TSTAMP_APPA] == TCPOPT_EOL)) &&
337     * *(u_int32_t *)optp == htonl(TCPOPT_TSTAMP_HDR) &&
338     * (ti->ti_flags & TH_SYN) == 0) {
339     * ts_present = 1;
340     * ts_val = ntohl(*(u_int32_t *)(optp + 4));
341     * ts_ecr = ntohl(*(u_int32_t *)(optp + 8));
342     * optp = NULL; / * we've parsed the options * /
343     * }
344     */
345     }
346     tiflags = ti->ti_flags;
347    
348     /*
349     * Convert TCP protocol specific fields to host format.
350     */
351     NTOHL(ti->ti_seq);
352     NTOHL(ti->ti_ack);
353     NTOHS(ti->ti_win);
354     NTOHS(ti->ti_urp);
355    
356     /*
357     * Drop TCP, IP headers and TCP options.
358     */
359     m->m_data += sizeof(struct tcpiphdr)+off-sizeof(struct tcphdr);
360     m->m_len -= sizeof(struct tcpiphdr)+off-sizeof(struct tcphdr);
361    
362     /*
363     * Locate pcb for segment.
364     */
365     findso:
366     so = tcp_last_so;
367     if (so->so_fport != ti->ti_dport ||
368     so->so_lport != ti->ti_sport ||
369     so->so_laddr.s_addr != ti->ti_src.s_addr ||
370     so->so_faddr.s_addr != ti->ti_dst.s_addr) {
371     so = solookup(&tcb, ti->ti_src, ti->ti_sport,
372     ti->ti_dst, ti->ti_dport);
373     if (so)
374     tcp_last_so = so;
375     ++tcpstat.tcps_socachemiss;
376     }
377    
378     /*
379     * If the state is CLOSED (i.e., TCB does not exist) then
380     * all data in the incoming segment is discarded.
381     * If the TCB exists but is in CLOSED state, it is embryonic,
382     * but should either do a listen or a connect soon.
383     *
384     * state == CLOSED means we've done socreate() but haven't
385     * attached it to a protocol yet...
386     *
387     * XXX If a TCB does not exist, and the TH_SYN flag is
388     * the only flag set, then create a session, mark it
389     * as if it was LISTENING, and continue...
390     */
391     if (so == 0) {
392     if ((tiflags & (TH_SYN|TH_FIN|TH_RST|TH_URG|TH_ACK)) != TH_SYN)
393     goto dropwithreset;
394    
395     if ((so = socreate()) == NULL)
396     goto dropwithreset;
397     if (tcp_attach(so) < 0) {
398     free(so); /* Not sofree (if it failed, it's not insqued) */
399     goto dropwithreset;
400     }
401    
402     sbreserve(&so->so_snd, tcp_sndspace);
403     sbreserve(&so->so_rcv, tcp_rcvspace);
404    
405     /* tcp_last_so = so; */ /* XXX ? */
406     /* tp = sototcpcb(so); */
407    
408     so->so_laddr = ti->ti_src;
409     so->so_lport = ti->ti_sport;
410     so->so_faddr = ti->ti_dst;
411     so->so_fport = ti->ti_dport;
412    
413     if ((so->so_iptos = tcp_tos(so)) == 0)
414     so->so_iptos = ((struct ip *)ti)->ip_tos;
415    
416     tp = sototcpcb(so);
417     tp->t_state = TCPS_LISTEN;
418     }
419    
420     /*
421     * If this is a still-connecting socket, this is probably
422     * a retransmit of the SYN. Whether it's a retransmit SYN
423     * or something else, we nuke it.
424     */
425     if (so->so_state & SS_ISFCONNECTING)
426     goto drop;
427    
428     tp = sototcpcb(so);
429    
430     /* XXX Should never fail */
431     if (tp == 0)
432     goto dropwithreset;
433     if (tp->t_state == TCPS_CLOSED)
434     goto drop;
435    
436     /* Unscale the window into a 32-bit value. */
437     /* if ((tiflags & TH_SYN) == 0)
438     * tiwin = ti->ti_win << tp->snd_scale;
439     * else
440     */
441     tiwin = ti->ti_win;
442    
443     /*
444     * Segment received on connection.
445     * Reset idle time and keep-alive timer.
446     */
447     tp->t_idle = 0;
448     if (so_options)
449     tp->t_timer[TCPT_KEEP] = tcp_keepintvl;
450     else
451     tp->t_timer[TCPT_KEEP] = tcp_keepidle;
452    
453     /*
454     * Process options if not in LISTEN state,
455     * else do it below (after getting remote address).
456     */
457     if (optp && tp->t_state != TCPS_LISTEN)
458     tcp_dooptions(tp, (u_char *)optp, optlen, ti);
459     /* , */
460     /* &ts_present, &ts_val, &ts_ecr); */
461    
462     /*
463     * Header prediction: check for the two common cases
464     * of a uni-directional data xfer. If the packet has
465     * no control flags, is in-sequence, the window didn't
466     * change and we're not retransmitting, it's a
467     * candidate. If the length is zero and the ack moved
468     * forward, we're the sender side of the xfer. Just
469     * free the data acked & wake any higher level process
470     * that was blocked waiting for space. If the length
471     * is non-zero and the ack didn't move, we're the
472     * receiver side. If we're getting packets in-order
473     * (the reassembly queue is empty), add the data to
474     * the socket buffer and note that we need a delayed ack.
475     *
476     * XXX Some of these tests are not needed
477     * eg: the tiwin == tp->snd_wnd prevents many more
478     * predictions.. with no *real* advantage..
479     */
480     if (tp->t_state == TCPS_ESTABLISHED &&
481     (tiflags & (TH_SYN|TH_FIN|TH_RST|TH_URG|TH_ACK)) == TH_ACK &&
482     /* (!ts_present || TSTMP_GEQ(ts_val, tp->ts_recent)) && */
483     ti->ti_seq == tp->rcv_nxt &&
484     tiwin && tiwin == tp->snd_wnd &&
485     tp->snd_nxt == tp->snd_max) {
486     /*
487     * If last ACK falls within this segment's sequence numbers,
488     * record the timestamp.
489     */
490     /* if (ts_present && SEQ_LEQ(ti->ti_seq, tp->last_ack_sent) &&
491     * SEQ_LT(tp->last_ack_sent, ti->ti_seq + ti->ti_len)) {
492     * tp->ts_recent_age = tcp_now;
493     * tp->ts_recent = ts_val;
494     * }
495     */
496     if (ti->ti_len == 0) {
497     if (SEQ_GT(ti->ti_ack, tp->snd_una) &&
498     SEQ_LEQ(ti->ti_ack, tp->snd_max) &&
499     tp->snd_cwnd >= tp->snd_wnd) {
500     /*
501     * this is a pure ack for outstanding data.
502     */
503     ++tcpstat.tcps_predack;
504     /* if (ts_present)
505     * tcp_xmit_timer(tp, tcp_now-ts_ecr+1);
506     * else
507     */ if (tp->t_rtt &&
508     SEQ_GT(ti->ti_ack, tp->t_rtseq))
509     tcp_xmit_timer(tp, tp->t_rtt);
510     acked = ti->ti_ack - tp->snd_una;
511     tcpstat.tcps_rcvackpack++;
512     tcpstat.tcps_rcvackbyte += acked;
513     sbdrop(&so->so_snd, acked);
514     tp->snd_una = ti->ti_ack;
515     m_freem(m);
516    
517     /*
518     * If all outstanding data are acked, stop
519     * retransmit timer, otherwise restart timer
520     * using current (possibly backed-off) value.
521     * If process is waiting for space,
522     * wakeup/selwakeup/signal. If data
523     * are ready to send, let tcp_output
524     * decide between more output or persist.
525     */
526     if (tp->snd_una == tp->snd_max)
527     tp->t_timer[TCPT_REXMT] = 0;
528     else if (tp->t_timer[TCPT_PERSIST] == 0)
529     tp->t_timer[TCPT_REXMT] = tp->t_rxtcur;
530    
531     /*
532     * There's room in so_snd, sowwakeup will read()
533     * from the socket if we can
534     */
535     /* if (so->so_snd.sb_flags & SB_NOTIFY)
536     * sowwakeup(so);
537     */
538     /*
539     * This is called because sowwakeup might have
540     * put data into so_snd. Since we don't do sowwakeup,
541     * we don't need this.. XXX???
542     */
543     if (so->so_snd.sb_cc)
544     (void) tcp_output(tp);
545    
546     return;
547     }
548     } else if (ti->ti_ack == tp->snd_una &&
549     tp->seg_next == (tcpiphdrp_32)tp &&
550     ti->ti_len <= sbspace(&so->so_rcv)) {
551     /*
552     * this is a pure, in-sequence data packet
553     * with nothing on the reassembly queue and
554     * we have enough buffer space to take it.
555     */
556     ++tcpstat.tcps_preddat;
557     tp->rcv_nxt += ti->ti_len;
558     tcpstat.tcps_rcvpack++;
559     tcpstat.tcps_rcvbyte += ti->ti_len;
560     /*
561     * Add data to socket buffer.
562     */
563     if (so->so_emu) {
564     if (tcp_emu(so,m)) sbappend(so, m);
565     } else
566     sbappend(so, m);
567    
568     /*
569     * XXX This is called when data arrives. Later, check
570     * if we can actually write() to the socket
571     * XXX Need to check? It'll be NON_BLOCKING
572     */
573     /* sorwakeup(so); */
574    
575     /*
576     * If this is a short packet, then ACK now - with Nagle
577     * congestion avoidance sender won't send more until
578     * he gets an ACK.
579     *
580 gbeauche 1.2 * It is better to not delay acks at all to maximize
581     * TCP throughput. See RFC 2581.
582 gbeauche 1.1 */
583 gbeauche 1.2 tp->t_flags |= TF_ACKNOW;
584     tcp_output(tp);
585 gbeauche 1.1 return;
586     }
587     } /* header prediction */
588     /*
589     * Calculate amount of space in receive window,
590     * and then do TCP input processing.
591     * Receive window is amount of space in rcv queue,
592     * but not less than advertised window.
593     */
594     { int win;
595     win = sbspace(&so->so_rcv);
596     if (win < 0)
597     win = 0;
598     tp->rcv_wnd = max(win, (int)(tp->rcv_adv - tp->rcv_nxt));
599     }
600    
601     switch (tp->t_state) {
602    
603     /*
604     * If the state is LISTEN then ignore segment if it contains an RST.
605     * If the segment contains an ACK then it is bad and send a RST.
606     * If it does not contain a SYN then it is not interesting; drop it.
607     * Don't bother responding if the destination was a broadcast.
608     * Otherwise initialize tp->rcv_nxt, and tp->irs, select an initial
609     * tp->iss, and send a segment:
610     * <SEQ=ISS><ACK=RCV_NXT><CTL=SYN,ACK>
611     * Also initialize tp->snd_nxt to tp->iss+1 and tp->snd_una to tp->iss.
612     * Fill in remote peer address fields if not previously specified.
613     * Enter SYN_RECEIVED state, and process any other fields of this
614     * segment in this state.
615     */
616     case TCPS_LISTEN: {
617    
618     if (tiflags & TH_RST)
619     goto drop;
620     if (tiflags & TH_ACK)
621     goto dropwithreset;
622     if ((tiflags & TH_SYN) == 0)
623     goto drop;
624    
625     /*
626     * This has way too many gotos...
627     * But a bit of spaghetti code never hurt anybody :)
628     */
629    
630     /*
631     * If this is destined for the control address, then flag to
632     * tcp_ctl once connected, otherwise connect
633     */
634     if ((so->so_faddr.s_addr&htonl(0xffffff00)) == special_addr.s_addr) {
635     int lastbyte=ntohl(so->so_faddr.s_addr) & 0xff;
636     if (lastbyte!=CTL_ALIAS && lastbyte!=CTL_DNS) {
637     #if 0
638     if(lastbyte==CTL_CMD || lastbyte==CTL_EXEC) {
639     /* Command or exec address */
640     so->so_state |= SS_CTL;
641     } else
642     #endif
643     {
644     /* May be an add exec */
645     struct ex_list *ex_ptr;
646     for(ex_ptr = exec_list; ex_ptr; ex_ptr = ex_ptr->ex_next) {
647     if(ex_ptr->ex_fport == so->so_fport &&
648     lastbyte == ex_ptr->ex_addr) {
649     so->so_state |= SS_CTL;
650     break;
651     }
652     }
653     }
654     if(so->so_state & SS_CTL) goto cont_input;
655     }
656     /* CTL_ALIAS: Do nothing, tcp_fconnect will be called on it */
657     }
658    
659     if (so->so_emu & EMU_NOCONNECT) {
660     so->so_emu &= ~EMU_NOCONNECT;
661     goto cont_input;
662     }
663    
664     if((tcp_fconnect(so) == -1) && (errno != EINPROGRESS) && (errno != EWOULDBLOCK)) {
665     u_char code=ICMP_UNREACH_NET;
666     DEBUG_MISC((dfd," tcp fconnect errno = %d-%s\n",
667     errno,strerror(errno)));
668     if(errno == ECONNREFUSED) {
669     /* ACK the SYN, send RST to refuse the connection */
670     tcp_respond(tp, ti, m, ti->ti_seq+1, (tcp_seq)0,
671     TH_RST|TH_ACK);
672     } else {
673     if(errno == EHOSTUNREACH) code=ICMP_UNREACH_HOST;
674     HTONL(ti->ti_seq); /* restore tcp header */
675     HTONL(ti->ti_ack);
676     HTONS(ti->ti_win);
677     HTONS(ti->ti_urp);
678     m->m_data -= sizeof(struct tcpiphdr)+off-sizeof(struct tcphdr);
679     m->m_len += sizeof(struct tcpiphdr)+off-sizeof(struct tcphdr);
680     *ip=save_ip;
681     icmp_error(m, ICMP_UNREACH,code, 0,strerror(errno));
682     }
683     tp = tcp_close(tp);
684     m_free(m);
685     } else {
686     /*
687     * Haven't connected yet, save the current mbuf
688     * and ti, and return
689     * XXX Some OS's don't tell us whether the connect()
690     * succeeded or not. So we must time it out.
691     */
692     so->so_m = m;
693     so->so_ti = ti;
694     tp->t_timer[TCPT_KEEP] = TCPTV_KEEP_INIT;
695     tp->t_state = TCPS_SYN_RECEIVED;
696     }
697     return;
698    
699     cont_conn:
700     /* m==NULL
701     * Check if the connect succeeded
702     */
703     if (so->so_state & SS_NOFDREF) {
704     tp = tcp_close(tp);
705     goto dropwithreset;
706     }
707     cont_input:
708     tcp_template(tp);
709    
710     if (optp)
711     tcp_dooptions(tp, (u_char *)optp, optlen, ti);
712     /* , */
713     /* &ts_present, &ts_val, &ts_ecr); */
714    
715     if (iss)
716     tp->iss = iss;
717     else
718     tp->iss = tcp_iss;
719     tcp_iss += TCP_ISSINCR/2;
720     tp->irs = ti->ti_seq;
721     tcp_sendseqinit(tp);
722     tcp_rcvseqinit(tp);
723     tp->t_flags |= TF_ACKNOW;
724     tp->t_state = TCPS_SYN_RECEIVED;
725     tp->t_timer[TCPT_KEEP] = TCPTV_KEEP_INIT;
726     tcpstat.tcps_accepts++;
727     goto trimthenstep6;
728     } /* case TCPS_LISTEN */
729    
730     /*
731     * If the state is SYN_SENT:
732     * if seg contains an ACK, but not for our SYN, drop the input.
733     * if seg contains a RST, then drop the connection.
734     * if seg does not contain SYN, then drop it.
735     * Otherwise this is an acceptable SYN segment
736     * initialize tp->rcv_nxt and tp->irs
737     * if seg contains ack then advance tp->snd_una
738     * if SYN has been acked change to ESTABLISHED else SYN_RCVD state
739     * arrange for segment to be acked (eventually)
740     * continue processing rest of data/controls, beginning with URG
741     */
742     case TCPS_SYN_SENT:
743     if ((tiflags & TH_ACK) &&
744     (SEQ_LEQ(ti->ti_ack, tp->iss) ||
745     SEQ_GT(ti->ti_ack, tp->snd_max)))
746     goto dropwithreset;
747    
748     if (tiflags & TH_RST) {
749     if (tiflags & TH_ACK)
750     tp = tcp_drop(tp,0); /* XXX Check t_softerror! */
751     goto drop;
752     }
753    
754     if ((tiflags & TH_SYN) == 0)
755     goto drop;
756     if (tiflags & TH_ACK) {
757     tp->snd_una = ti->ti_ack;
758     if (SEQ_LT(tp->snd_nxt, tp->snd_una))
759     tp->snd_nxt = tp->snd_una;
760     }
761    
762     tp->t_timer[TCPT_REXMT] = 0;
763     tp->irs = ti->ti_seq;
764     tcp_rcvseqinit(tp);
765     tp->t_flags |= TF_ACKNOW;
766     if (tiflags & TH_ACK && SEQ_GT(tp->snd_una, tp->iss)) {
767     tcpstat.tcps_connects++;
768     soisfconnected(so);
769     tp->t_state = TCPS_ESTABLISHED;
770    
771     /* Do window scaling on this connection? */
772     /* if ((tp->t_flags & (TF_RCVD_SCALE|TF_REQ_SCALE)) ==
773     * (TF_RCVD_SCALE|TF_REQ_SCALE)) {
774     * tp->snd_scale = tp->requested_s_scale;
775     * tp->rcv_scale = tp->request_r_scale;
776     * }
777     */
778     (void) tcp_reass(tp, (struct tcpiphdr *)0,
779     (struct mbuf *)0);
780     /*
781     * if we didn't have to retransmit the SYN,
782     * use its rtt as our initial srtt & rtt var.
783     */
784     if (tp->t_rtt)
785     tcp_xmit_timer(tp, tp->t_rtt);
786     } else
787     tp->t_state = TCPS_SYN_RECEIVED;
788    
789     trimthenstep6:
790     /*
791     * Advance ti->ti_seq to correspond to first data byte.
792     * If data, trim to stay within window,
793     * dropping FIN if necessary.
794     */
795     ti->ti_seq++;
796     if (ti->ti_len > tp->rcv_wnd) {
797     todrop = ti->ti_len - tp->rcv_wnd;
798     m_adj(m, -todrop);
799     ti->ti_len = tp->rcv_wnd;
800     tiflags &= ~TH_FIN;
801     tcpstat.tcps_rcvpackafterwin++;
802     tcpstat.tcps_rcvbyteafterwin += todrop;
803     }
804     tp->snd_wl1 = ti->ti_seq - 1;
805     tp->rcv_up = ti->ti_seq;
806     goto step6;
807     } /* switch tp->t_state */
808     /*
809     * States other than LISTEN or SYN_SENT.
810     * First check timestamp, if present.
811     * Then check that at least some bytes of segment are within
812     * receive window. If segment begins before rcv_nxt,
813     * drop leading data (and SYN); if nothing left, just ack.
814     *
815     * RFC 1323 PAWS: If we have a timestamp reply on this segment
816     * and it's less than ts_recent, drop it.
817     */
818     /* if (ts_present && (tiflags & TH_RST) == 0 && tp->ts_recent &&
819     * TSTMP_LT(ts_val, tp->ts_recent)) {
820     *
821     */ /* Check to see if ts_recent is over 24 days old. */
822     /* if ((int)(tcp_now - tp->ts_recent_age) > TCP_PAWS_IDLE) {
823     */ /*
824     * * Invalidate ts_recent. If this segment updates
825     * * ts_recent, the age will be reset later and ts_recent
826     * * will get a valid value. If it does not, setting
827     * * ts_recent to zero will at least satisfy the
828     * * requirement that zero be placed in the timestamp
829     * * echo reply when ts_recent isn't valid. The
830     * * age isn't reset until we get a valid ts_recent
831     * * because we don't want out-of-order segments to be
832     * * dropped when ts_recent is old.
833     * */
834     /* tp->ts_recent = 0;
835     * } else {
836     * tcpstat.tcps_rcvduppack++;
837     * tcpstat.tcps_rcvdupbyte += ti->ti_len;
838     * tcpstat.tcps_pawsdrop++;
839     * goto dropafterack;
840     * }
841     * }
842     */
843    
844     todrop = tp->rcv_nxt - ti->ti_seq;
845     if (todrop > 0) {
846     if (tiflags & TH_SYN) {
847     tiflags &= ~TH_SYN;
848     ti->ti_seq++;
849     if (ti->ti_urp > 1)
850     ti->ti_urp--;
851     else
852     tiflags &= ~TH_URG;
853     todrop--;
854     }
855     /*
856     * Following if statement from Stevens, vol. 2, p. 960.
857     */
858     if (todrop > ti->ti_len
859     || (todrop == ti->ti_len && (tiflags & TH_FIN) == 0)) {
860     /*
861     * Any valid FIN must be to the left of the window.
862     * At this point the FIN must be a duplicate or out
863     * of sequence; drop it.
864     */
865     tiflags &= ~TH_FIN;
866    
867     /*
868     * Send an ACK to resynchronize and drop any data.
869     * But keep on processing for RST or ACK.
870     */
871     tp->t_flags |= TF_ACKNOW;
872     todrop = ti->ti_len;
873     tcpstat.tcps_rcvduppack++;
874     tcpstat.tcps_rcvdupbyte += todrop;
875     } else {
876     tcpstat.tcps_rcvpartduppack++;
877     tcpstat.tcps_rcvpartdupbyte += todrop;
878     }
879     m_adj(m, todrop);
880     ti->ti_seq += todrop;
881     ti->ti_len -= todrop;
882     if (ti->ti_urp > todrop)
883     ti->ti_urp -= todrop;
884     else {
885     tiflags &= ~TH_URG;
886     ti->ti_urp = 0;
887     }
888     }
889     /*
890     * If new data are received on a connection after the
891     * user processes are gone, then RST the other end.
892     */
893     if ((so->so_state & SS_NOFDREF) &&
894     tp->t_state > TCPS_CLOSE_WAIT && ti->ti_len) {
895     tp = tcp_close(tp);
896     tcpstat.tcps_rcvafterclose++;
897     goto dropwithreset;
898     }
899    
900     /*
901     * If segment ends after window, drop trailing data
902     * (and PUSH and FIN); if nothing left, just ACK.
903     */
904     todrop = (ti->ti_seq+ti->ti_len) - (tp->rcv_nxt+tp->rcv_wnd);
905     if (todrop > 0) {
906     tcpstat.tcps_rcvpackafterwin++;
907     if (todrop >= ti->ti_len) {
908     tcpstat.tcps_rcvbyteafterwin += ti->ti_len;
909     /*
910     * If a new connection request is received
911     * while in TIME_WAIT, drop the old connection
912     * and start over if the sequence numbers
913     * are above the previous ones.
914     */
915     if (tiflags & TH_SYN &&
916     tp->t_state == TCPS_TIME_WAIT &&
917     SEQ_GT(ti->ti_seq, tp->rcv_nxt)) {
918     iss = tp->rcv_nxt + TCP_ISSINCR;
919     tp = tcp_close(tp);
920     goto findso;
921     }
922     /*
923     * If window is closed can only take segments at
924     * window edge, and have to drop data and PUSH from
925     * incoming segments. Continue processing, but
926     * remember to ack. Otherwise, drop segment
927     * and ack.
928     */
929     if (tp->rcv_wnd == 0 && ti->ti_seq == tp->rcv_nxt) {
930     tp->t_flags |= TF_ACKNOW;
931     tcpstat.tcps_rcvwinprobe++;
932     } else
933     goto dropafterack;
934     } else
935     tcpstat.tcps_rcvbyteafterwin += todrop;
936     m_adj(m, -todrop);
937     ti->ti_len -= todrop;
938     tiflags &= ~(TH_PUSH|TH_FIN);
939     }
940    
941     /*
942     * If last ACK falls within this segment's sequence numbers,
943     * record its timestamp.
944     */
945     /* if (ts_present && SEQ_LEQ(ti->ti_seq, tp->last_ack_sent) &&
946     * SEQ_LT(tp->last_ack_sent, ti->ti_seq + ti->ti_len +
947     * ((tiflags & (TH_SYN|TH_FIN)) != 0))) {
948     * tp->ts_recent_age = tcp_now;
949     * tp->ts_recent = ts_val;
950     * }
951     */
952    
953     /*
954     * If the RST bit is set examine the state:
955     * SYN_RECEIVED STATE:
956     * If passive open, return to LISTEN state.
957     * If active open, inform user that connection was refused.
958     * ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSE_WAIT STATES:
959     * Inform user that connection was reset, and close tcb.
960     * CLOSING, LAST_ACK, TIME_WAIT STATES
961     * Close the tcb.
962     */
963     if (tiflags&TH_RST) switch (tp->t_state) {
964    
965     case TCPS_SYN_RECEIVED:
966     /* so->so_error = ECONNREFUSED; */
967     goto close;
968    
969     case TCPS_ESTABLISHED:
970     case TCPS_FIN_WAIT_1:
971     case TCPS_FIN_WAIT_2:
972     case TCPS_CLOSE_WAIT:
973     /* so->so_error = ECONNRESET; */
974     close:
975     tp->t_state = TCPS_CLOSED;
976     tcpstat.tcps_drops++;
977     tp = tcp_close(tp);
978     goto drop;
979    
980     case TCPS_CLOSING:
981     case TCPS_LAST_ACK:
982     case TCPS_TIME_WAIT:
983     tp = tcp_close(tp);
984     goto drop;
985     }
986    
987     /*
988     * If a SYN is in the window, then this is an
989     * error and we send an RST and drop the connection.
990     */
991     if (tiflags & TH_SYN) {
992     tp = tcp_drop(tp,0);
993     goto dropwithreset;
994     }
995    
996     /*
997     * If the ACK bit is off we drop the segment and return.
998     */
999     if ((tiflags & TH_ACK) == 0) goto drop;
1000    
1001     /*
1002     * Ack processing.
1003     */
1004     switch (tp->t_state) {
1005     /*
1006     * In SYN_RECEIVED state if the ack ACKs our SYN then enter
1007     * ESTABLISHED state and continue processing, otherwise
1008     * send an RST. una<=ack<=max
1009     */
1010     case TCPS_SYN_RECEIVED:
1011    
1012     if (SEQ_GT(tp->snd_una, ti->ti_ack) ||
1013     SEQ_GT(ti->ti_ack, tp->snd_max))
1014     goto dropwithreset;
1015     tcpstat.tcps_connects++;
1016     tp->t_state = TCPS_ESTABLISHED;
1017     /*
1018     * The sent SYN is ack'ed with our sequence number +1
1019     * The first data byte already in the buffer will get
1020     * lost if no correction is made. This is only needed for
1021     * SS_CTL since the buffer is empty otherwise.
1022     * tp->snd_una++; or:
1023     */
1024     tp->snd_una=ti->ti_ack;
1025     if (so->so_state & SS_CTL) {
1026     /* So tcp_ctl reports the right state */
1027     ret = tcp_ctl(so);
1028     if (ret == 1) {
1029     soisfconnected(so);
1030     so->so_state &= ~SS_CTL; /* success XXX */
1031     } else if (ret == 2) {
1032     so->so_state = SS_NOFDREF; /* CTL_CMD */
1033     } else {
1034     needoutput = 1;
1035     tp->t_state = TCPS_FIN_WAIT_1;
1036     }
1037     } else {
1038     soisfconnected(so);
1039     }
1040    
1041     /* Do window scaling? */
1042     /* if ((tp->t_flags & (TF_RCVD_SCALE|TF_REQ_SCALE)) ==
1043     * (TF_RCVD_SCALE|TF_REQ_SCALE)) {
1044     * tp->snd_scale = tp->requested_s_scale;
1045     * tp->rcv_scale = tp->request_r_scale;
1046     * }
1047     */
1048     (void) tcp_reass(tp, (struct tcpiphdr *)0, (struct mbuf *)0);
1049     tp->snd_wl1 = ti->ti_seq - 1;
1050     /* Avoid ack processing; snd_una==ti_ack => dup ack */
1051     goto synrx_to_est;
1052     /* not reached: the goto above skips the duplicate-ACK test */
1053    
1054     /*
1055     * In ESTABLISHED state: drop duplicate ACKs; ACK out of range
1056     * ACKs. If the ack is in the range
1057     * tp->snd_una < ti->ti_ack <= tp->snd_max
1058     * then advance tp->snd_una to ti->ti_ack and drop
1059     * data from the retransmission queue. If this ACK reflects
1060     * more up to date window information we update our window information.
1061     */
1062     case TCPS_ESTABLISHED:
1063     case TCPS_FIN_WAIT_1:
1064     case TCPS_FIN_WAIT_2:
1065     case TCPS_CLOSE_WAIT:
1066     case TCPS_CLOSING:
1067     case TCPS_LAST_ACK:
1068     case TCPS_TIME_WAIT:
1069    
1070     if (SEQ_LEQ(ti->ti_ack, tp->snd_una)) {
1071     if (ti->ti_len == 0 && tiwin == tp->snd_wnd) {
1072     tcpstat.tcps_rcvdupack++;
1073     DEBUG_MISC((dfd," dup ack m = %lx so = %lx \n",
1074     (long )m, (long )so));
1075     /*
1076     * If we have outstanding data (other than
1077     * a window probe), this is a completely
1078     * duplicate ack (ie, window info didn't
1079     * change), the ack is the biggest we've
1080     * seen and we've seen exactly our rexmt
1081     * threshold of them, assume a packet
1082     * has been dropped and retransmit it.
1083     * Kludge snd_nxt & the congestion
1084     * window so we send only this one
1085     * packet.
1086     *
1087     * We know we're losing at the current
1088     * window size so do congestion avoidance
1089     * (set ssthresh to half the current window
1090     * and pull our congestion window back to
1091     * the new ssthresh).
1092     *
1093     * Dup acks mean that packets have left the
1094     * network (they're now cached at the receiver)
1095     * so bump cwnd by the amount in the receiver
1096     * to keep a constant cwnd packets in the
1097     * network.
1098     */
1099     if (tp->t_timer[TCPT_REXMT] == 0 ||
1100     ti->ti_ack != tp->snd_una)
1101     tp->t_dupacks = 0;
1102     else if (++tp->t_dupacks == tcprexmtthresh) {
1103     tcp_seq onxt = tp->snd_nxt;
1104     u_int win =
1105     min(tp->snd_wnd, tp->snd_cwnd) / 2 /
1106     tp->t_maxseg;
1107    
1108     if (win < 2)
1109     win = 2;
1110     tp->snd_ssthresh = win * tp->t_maxseg;
1111     tp->t_timer[TCPT_REXMT] = 0;
1112     tp->t_rtt = 0;
1113     tp->snd_nxt = ti->ti_ack;
1114     tp->snd_cwnd = tp->t_maxseg;
1115     (void) tcp_output(tp);
1116     tp->snd_cwnd = tp->snd_ssthresh +
1117     tp->t_maxseg * tp->t_dupacks;
1118     if (SEQ_GT(onxt, tp->snd_nxt))
1119     tp->snd_nxt = onxt;
1120     goto drop;
1121     } else if (tp->t_dupacks > tcprexmtthresh) {
1122     tp->snd_cwnd += tp->t_maxseg;
1123     (void) tcp_output(tp);
1124     goto drop;
1125     }
1126     } else
1127     tp->t_dupacks = 0;
1128     break;
1129     }
1130     synrx_to_est:
1131     /*
1132     * If the congestion window was inflated to account
1133     * for the other side's cached packets, retract it.
1134     */
1135     if (tp->t_dupacks > tcprexmtthresh &&
1136     tp->snd_cwnd > tp->snd_ssthresh)
1137     tp->snd_cwnd = tp->snd_ssthresh;
1138     tp->t_dupacks = 0;
1139     if (SEQ_GT(ti->ti_ack, tp->snd_max)) {
1140     tcpstat.tcps_rcvacktoomuch++;
1141     goto dropafterack;
1142     }
1143     acked = ti->ti_ack - tp->snd_una;
1144     tcpstat.tcps_rcvackpack++;
1145     tcpstat.tcps_rcvackbyte += acked;
1146    
1147     /*
1148     * If we have a timestamp reply, update smoothed
1149     * round trip time. If no timestamp is present but
1150     * transmit timer is running and timed sequence
1151     * number was acked, update smoothed round trip time.
1152     * Since we now have an rtt measurement, cancel the
1153     * timer backoff (cf., Phil Karn's retransmit alg.).
1154     * Recompute the initial retransmit timer.
1155     */
1156     /* if (ts_present)
1157     * tcp_xmit_timer(tp, tcp_now-ts_ecr+1);
1158     * else
1159     */
1160     if (tp->t_rtt && SEQ_GT(ti->ti_ack, tp->t_rtseq))
1161     tcp_xmit_timer(tp,tp->t_rtt);
1162    
1163     /*
1164     * If all outstanding data is acked, stop retransmit
1165     * timer and remember to restart (more output or persist).
1166     * If there is more data to be acked, restart retransmit
1167     * timer, using current (possibly backed-off) value.
1168     */
1169     if (ti->ti_ack == tp->snd_max) {
1170     tp->t_timer[TCPT_REXMT] = 0;
1171     needoutput = 1;
1172     } else if (tp->t_timer[TCPT_PERSIST] == 0)
1173     tp->t_timer[TCPT_REXMT] = tp->t_rxtcur;
1174     /*
1175     * When new data is acked, open the congestion window.
1176     * If the window gives us less than ssthresh packets
1177     * in flight, open exponentially (maxseg per packet).
1178     * Otherwise open linearly: maxseg per window
1179     * (maxseg^2 / cwnd per packet).
1180     */
1181     {
1182     register u_int cw = tp->snd_cwnd;
1183     register u_int incr = tp->t_maxseg;
1184    
1185     if (cw > tp->snd_ssthresh)
1186     incr = incr * incr / cw;
1187     tp->snd_cwnd = min(cw + incr, TCP_MAXWIN<<tp->snd_scale);
1188     }
1189     if (acked > so->so_snd.sb_cc) {
1190     tp->snd_wnd -= so->so_snd.sb_cc;
1191     sbdrop(&so->so_snd, (int )so->so_snd.sb_cc);
1192     ourfinisacked = 1;
1193     } else {
1194     sbdrop(&so->so_snd, acked);
1195     tp->snd_wnd -= acked;
1196     ourfinisacked = 0;
1197     }
1198     /*
1199     * XXX sowwakeup is called when data is acked and there's room
1200     * for more data... it should read() the socket
1201     */
1202     /* if (so->so_snd.sb_flags & SB_NOTIFY)
1203     * sowwakeup(so);
1204     */
1205     tp->snd_una = ti->ti_ack;
1206     if (SEQ_LT(tp->snd_nxt, tp->snd_una))
1207     tp->snd_nxt = tp->snd_una;
1208    
1209     switch (tp->t_state) {
1210    
1211     /*
1212     * In FIN_WAIT_1 STATE in addition to the processing
1213     * for the ESTABLISHED state if our FIN is now acknowledged
1214     * then enter FIN_WAIT_2.
1215     */
1216     case TCPS_FIN_WAIT_1:
1217     if (ourfinisacked) {
1218     /*
1219     * If we can't receive any more
1220     * data, then closing user can proceed.
1221     * Starting the timer is contrary to the
1222     * specification, but if we don't get a FIN
1223     * we'll hang forever.
1224     */
1225     if (so->so_state & SS_FCANTRCVMORE) {
1226     soisfdisconnected(so);
1227     tp->t_timer[TCPT_2MSL] = tcp_maxidle;
1228     }
1229     tp->t_state = TCPS_FIN_WAIT_2;
1230     }
1231     break;
1232    
1233     /*
1234     * In CLOSING STATE in addition to the processing for
1235     * the ESTABLISHED state if the ACK acknowledges our FIN
1236     * then enter the TIME-WAIT state, otherwise ignore
1237     * the segment.
1238     */
1239     case TCPS_CLOSING:
1240     if (ourfinisacked) {
1241     tp->t_state = TCPS_TIME_WAIT;
1242     tcp_canceltimers(tp);
1243     tp->t_timer[TCPT_2MSL] = 2 * TCPTV_MSL;
1244     soisfdisconnected(so);
1245     }
1246     break;
1247    
1248     /*
1249     * In LAST_ACK, we may still be waiting for data to drain
1250     * and/or to be acked, as well as for the ack of our FIN.
1251     * If our FIN is now acknowledged, delete the TCB,
1252     * enter the closed state and return.
1253     */
1254     case TCPS_LAST_ACK:
1255     if (ourfinisacked) {
1256     tp = tcp_close(tp);
1257     goto drop;
1258     }
1259     break;
1260    
1261     /*
1262     * In TIME_WAIT state the only thing that should arrive
1263     * is a retransmission of the remote FIN. Acknowledge
1264     * it and restart the finack timer.
1265     */
1266     case TCPS_TIME_WAIT:
1267     tp->t_timer[TCPT_2MSL] = 2 * TCPTV_MSL;
1268     goto dropafterack;
1269     }
1270     } /* switch(tp->t_state) */
1271    
1272     step6:
1273     /*
1274     * Update window information.
1275     * Don't look at window if no ACK: TACs send garbage on first SYN.
1276     */
1277     if ((tiflags & TH_ACK) &&
1278     (SEQ_LT(tp->snd_wl1, ti->ti_seq) ||
1279     (tp->snd_wl1 == ti->ti_seq && (SEQ_LT(tp->snd_wl2, ti->ti_ack) ||
1280     (tp->snd_wl2 == ti->ti_ack && tiwin > tp->snd_wnd))))) {
1281     /* keep track of pure window updates */
1282     if (ti->ti_len == 0 &&
1283     tp->snd_wl2 == ti->ti_ack && tiwin > tp->snd_wnd)
1284     tcpstat.tcps_rcvwinupd++;
1285     tp->snd_wnd = tiwin;
1286     tp->snd_wl1 = ti->ti_seq;
1287     tp->snd_wl2 = ti->ti_ack;
1288     if (tp->snd_wnd > tp->max_sndwnd)
1289     tp->max_sndwnd = tp->snd_wnd;
1290     needoutput = 1;
1291     }
1292    
1293     /*
1294     * Process segments with URG.
1295     */
1296     if ((tiflags & TH_URG) && ti->ti_urp &&
1297     TCPS_HAVERCVDFIN(tp->t_state) == 0) {
1298     /*
1299     * This is a kludge, but if we receive and accept
1300     * random urgent pointers, we'll crash in
1301     * soreceive. It's hard to imagine someone
1302     * actually wanting to send this much urgent data.
1303     */
1304     if (ti->ti_urp + so->so_rcv.sb_cc > so->so_rcv.sb_datalen) {
1305     ti->ti_urp = 0;
1306     tiflags &= ~TH_URG;
1307     goto dodata;
1308     }
1309     /*
1310     * If this segment advances the known urgent pointer,
1311     * then mark the data stream. This should not happen
1312     * in CLOSE_WAIT, CLOSING, LAST_ACK or TIME_WAIT STATES since
1313     * a FIN has been received from the remote side.
1314     * In these states we ignore the URG.
1315     *
1316     * According to RFC961 (Assigned Protocols),
1317     * the urgent pointer points to the last octet
1318     * of urgent data. We continue, however,
1319     * to consider it to indicate the first octet
1320     * of data past the urgent section as the original
1321     * spec states (in one of two places).
1322     */
1323     if (SEQ_GT(ti->ti_seq+ti->ti_urp, tp->rcv_up)) {
1324     tp->rcv_up = ti->ti_seq + ti->ti_urp;
1325     so->so_urgc = so->so_rcv.sb_cc +
1326     (tp->rcv_up - tp->rcv_nxt); /* -1; */
1327     tp->rcv_up = ti->ti_seq + ti->ti_urp;
1328    
1329     }
1330     } else
1331     /*
1332     * If no out of band data is expected,
1333     * pull receive urgent pointer along
1334     * with the receive window.
1335     */
1336     if (SEQ_GT(tp->rcv_nxt, tp->rcv_up))
1337     tp->rcv_up = tp->rcv_nxt;
1338     dodata:
1339    
1340     /*
1341     * Process the segment text, merging it into the TCP sequencing queue,
1342     * and arranging for acknowledgment of receipt if necessary.
1343     * This process logically involves adjusting tp->rcv_wnd as data
1344     * is presented to the user (this happens in tcp_usrreq.c,
1345     * case PRU_RCVD). If a FIN has already been received on this
1346     * connection then we just ignore the text.
1347     */
1348     if ((ti->ti_len || (tiflags&TH_FIN)) &&
1349     TCPS_HAVERCVDFIN(tp->t_state) == 0) {
1350     TCP_REASS(tp, ti, m, so, tiflags);
1351     /*
1352     * Note the amount of data that peer has sent into
1353     * our window, in order to estimate the sender's
1354     * buffer size.
1355     */
1356     len = so->so_rcv.sb_datalen - (tp->rcv_adv - tp->rcv_nxt);
1357     } else {
1358     m_free(m);
1359     tiflags &= ~TH_FIN;
1360     }
1361    
1362     /*
1363     * If FIN is received ACK the FIN and let the user know
1364     * that the connection is closing.
1365     */
1366     if (tiflags & TH_FIN) {
1367     if (TCPS_HAVERCVDFIN(tp->t_state) == 0) {
1368     /*
1369     * If we receive a FIN we can't send more data,
1370     * set it SS_FWDRAIN
1371     * Shutdown the socket if there is no rx data in the
1372     * buffer.
1373     * soread() is called on completion of shutdown() and
1374     * will go to TCPS_LAST_ACK, and use tcp_output()
1375     * to send the FIN.
1376     */
1377     /* sofcantrcvmore(so); */
1378     sofwdrain(so);
1379    
1380     tp->t_flags |= TF_ACKNOW;
1381     tp->rcv_nxt++;
1382     }
1383     switch (tp->t_state) {
1384    
1385     /*
1386     * In SYN_RECEIVED and ESTABLISHED STATES
1387     * enter the CLOSE_WAIT state.
1388     */
1389     case TCPS_SYN_RECEIVED:
1390     case TCPS_ESTABLISHED:
1391     if(so->so_emu == EMU_CTL) /* no shutdown on socket */
1392     tp->t_state = TCPS_LAST_ACK;
1393     else
1394     tp->t_state = TCPS_CLOSE_WAIT;
1395     break;
1396    
1397     /*
1398     * If still in FIN_WAIT_1 STATE FIN has not been acked so
1399     * enter the CLOSING state.
1400     */
1401     case TCPS_FIN_WAIT_1:
1402     tp->t_state = TCPS_CLOSING;
1403     break;
1404    
1405     /*
1406     * In FIN_WAIT_2 state enter the TIME_WAIT state,
1407     * starting the time-wait timer, turning off the other
1408     * standard timers.
1409     */
1410     case TCPS_FIN_WAIT_2:
1411     tp->t_state = TCPS_TIME_WAIT;
1412     tcp_canceltimers(tp);
1413     tp->t_timer[TCPT_2MSL] = 2 * TCPTV_MSL;
1414     soisfdisconnected(so);
1415     break;
1416    
1417     /*
1418     * In TIME_WAIT state restart the 2 MSL time_wait timer.
1419     */
1420     case TCPS_TIME_WAIT:
1421     tp->t_timer[TCPT_2MSL] = 2 * TCPTV_MSL;
1422     break;
1423     }
1424     }
1425    
1426     /*
1427     * If this is a small packet, then ACK now - with the Nagle
1428     * algorithm the sender won't send more until
1429     * it gets an ACK.
1430     *
1431     * See above.
1432     */
1433     /* if (ti->ti_len && (unsigned)ti->ti_len < tp->t_maxseg) {
1434     */
1435     /* if ((ti->ti_len && (unsigned)ti->ti_len < tp->t_maxseg &&
1436     * (so->so_iptos & IPTOS_LOWDELAY) == 0) ||
1437     * ((so->so_iptos & IPTOS_LOWDELAY) &&
1438     * ((struct tcpiphdr_2 *)ti)->first_char == (char)27)) {
1439     */
1440     if (ti->ti_len && (unsigned)ti->ti_len <= 5 &&
1441     ((struct tcpiphdr_2 *)ti)->first_char == (char)27) {
1442     tp->t_flags |= TF_ACKNOW;
1443     }
1444    
1445     /*
1446     * Return any desired output.
1447     */
1448     if (needoutput || (tp->t_flags & TF_ACKNOW)) {
1449     (void) tcp_output(tp);
1450     }
1451     return;
1452    
1453     dropafterack:
1454     /*
1455     * Generate an ACK dropping incoming segment if it occupies
1456     * sequence space, where the ACK reflects our state.
1457     */
1458     if (tiflags & TH_RST)
1459     goto drop;
1460     m_freem(m);
1461     tp->t_flags |= TF_ACKNOW;
1462     (void) tcp_output(tp);
1463     return;
1464    
1465     dropwithreset:
1466     /* reuses m if m!=NULL, m_free() unnecessary */
1467     if (tiflags & TH_ACK)
1468     tcp_respond(tp, ti, m, (tcp_seq)0, ti->ti_ack, TH_RST);
1469     else {
1470     if (tiflags & TH_SYN) ti->ti_len++;
1471     tcp_respond(tp, ti, m, ti->ti_seq+ti->ti_len, (tcp_seq)0,
1472     TH_RST|TH_ACK);
1473     }
1474    
1475     return;
1476    
1477     drop:
1478     /*
1479     * Drop space held by incoming segment and return.
1480     */
1481     m_free(m);
1482    
1483     return;
1484     }
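
The duplicate-ACK and window-opening arithmetic in the ACK processing above can be tried in isolation. The following is a minimal sketch, not part of slirp: `struct cc_state`, `cc_on_dupack_threshold` and `cc_on_new_ack` are hypothetical names, and `max_window` stands in for `TCP_MAXWIN<<tp->snd_scale`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the congestion-related fields of struct tcpcb. */
struct cc_state {
    uint32_t snd_wnd;      /* peer's advertised window, bytes */
    uint32_t snd_cwnd;     /* congestion window, bytes */
    uint32_t snd_ssthresh; /* slow-start threshold, bytes */
    uint32_t maxseg;       /* maximum segment size, bytes */
};

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* Duplicate-ACK threshold reached: set ssthresh to half the
 * effective window, rounded down to whole segments, minimum 2. */
static void cc_on_dupack_threshold(struct cc_state *cc)
{
    assert(cc->maxseg > 0);
    uint32_t win = min_u32(cc->snd_wnd, cc->snd_cwnd) / 2 / cc->maxseg;
    if (win < 2)
        win = 2;
    cc->snd_ssthresh = win * cc->maxseg;
}

/* New data acked: grow cwnd by maxseg per ACK while at or below
 * ssthresh, otherwise by maxseg^2 / cwnd per ACK. */
static void cc_on_new_ack(struct cc_state *cc, uint32_t max_window)
{
    uint32_t cw = cc->snd_cwnd;
    uint32_t incr = cc->maxseg;
    assert(cw > 0);
    if (cw > cc->snd_ssthresh)
        incr = incr * incr / cw;
    cc->snd_cwnd = min_u32(cw + incr, max_window);
}
```

With a 1000-byte segment, an 8000-byte effective window gives an ssthresh of 4000 bytes. A cwnd of 8000 above that threshold then grows by only 125 bytes per ACK.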
1485    
1486     /* , ts_present, ts_val, ts_ecr) */
1487     /* int *ts_present;
1488     * u_int32_t *ts_val, *ts_ecr;
1489     */
1490     void
1491     tcp_dooptions(tp, cp, cnt, ti)
1492     struct tcpcb *tp;
1493     u_char *cp;
1494     int cnt;
1495     struct tcpiphdr *ti;
1496     {
1497     u_int16_t mss;
1498     int opt, optlen;
1499    
1500     DEBUG_CALL("tcp_dooptions");
1501     DEBUG_ARGS((dfd," tp = %lx cnt=%i \n", (long )tp, cnt));
1502    
1503     for (; cnt > 0; cnt -= optlen, cp += optlen) {
1504     opt = cp[0];
1505     if (opt == TCPOPT_EOL)
1506     break;
1507     if (opt == TCPOPT_NOP)
1508     optlen = 1;
1509     else {
1510     optlen = (cnt > 1) ? cp[1] : 0;
1511     if (optlen < 2 || optlen > cnt) /* malformed or truncated option */
1512     break;
1513     }
1514     switch (opt) {
1515    
1516     default:
1517     continue;
1518    
1519     case TCPOPT_MAXSEG:
1520     if (optlen != TCPOLEN_MAXSEG)
1521     continue;
1522     if (!(ti->ti_flags & TH_SYN))
1523     continue;
1524     memcpy((char *) &mss, (char *) cp + 2, sizeof(mss));
1525     NTOHS(mss);
1526     (void) tcp_mss(tp, mss); /* sets t_maxseg */
1527     break;
1528    
1529     /* case TCPOPT_WINDOW:
1530     * if (optlen != TCPOLEN_WINDOW)
1531     * continue;
1532     * if (!(ti->ti_flags & TH_SYN))
1533     * continue;
1534     * tp->t_flags |= TF_RCVD_SCALE;
1535     * tp->requested_s_scale = min(cp[2], TCP_MAX_WINSHIFT);
1536     * break;
1537     */
1538     /* case TCPOPT_TIMESTAMP:
1539     * if (optlen != TCPOLEN_TIMESTAMP)
1540     * continue;
1541     * *ts_present = 1;
1542     * memcpy((char *) ts_val, (char *)cp + 2, sizeof(*ts_val));
1543     * NTOHL(*ts_val);
1544     * memcpy((char *) ts_ecr, (char *)cp + 6, sizeof(*ts_ecr));
1545     * NTOHL(*ts_ecr);
1546     *
1547     */ /*
1548     * * A timestamp received in a SYN makes
1549     * * it ok to send timestamp requests and replies.
1550     * */
1551     /* if (ti->ti_flags & TH_SYN) {
1552     * tp->t_flags |= TF_RCVD_TSTMP;
1553     * tp->ts_recent = *ts_val;
1554     * tp->ts_recent_age = tcp_now;
1555     * }
1556     */ break;
1557     }
1558     }
1559     }
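
The option walk in tcp_dooptions() can be exercised on its own. The sketch below is an illustration rather than slirp code: `find_mss_option` is a hypothetical helper, and the option constants are the standard TCP values (EOL 0, NOP 1, MSS kind 2 with length 4). It checks each option's length against the bytes remaining before reading its payload. Unlike tcp_dooptions(), it does not check that the segment carries SYN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OPT_EOL = 0, OPT_NOP = 1, OPT_MAXSEG = 2, OPT_MAXSEG_LEN = 4 };

/* Return the MSS carried in a TCP option block, or 0 if the
 * option is absent or the block is malformed. */
static uint16_t find_mss_option(const uint8_t *cp, size_t cnt)
{
    assert(cp != NULL || cnt == 0);
    while (cnt > 0) {
        uint8_t opt = cp[0];
        size_t optlen;

        if (opt == OPT_EOL)
            break;
        if (opt == OPT_NOP) {
            optlen = 1;
        } else {
            if (cnt < 2)
                break;              /* no room for a length byte */
            optlen = cp[1];
            if (optlen < 2 || optlen > cnt)
                break;              /* malformed or truncated option */
        }
        if (opt == OPT_MAXSEG && optlen == OPT_MAXSEG_LEN)
            return (uint16_t)((cp[2] << 8) | cp[3]); /* network byte order */
        cp += optlen;
        cnt -= optlen;
    }
    return 0;
}
```

Assembling the value byte by byte gives the same result as the memcpy() plus NTOHS() pair, without depending on host byte order.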
1560    
1561    
1562     /*
1563     * Pull out of band byte out of a segment so
1564     * it doesn't appear in the user's data queue.
1565     * It is still reflected in the segment length for
1566     * sequencing purposes.
1567     */
1568    
1569     #ifdef notdef
1570    
1571     void
1572     tcp_pulloutofband(so, ti, m)
1573     struct socket *so;
1574     struct tcpiphdr *ti;
1575     register struct mbuf *m;
1576     {
1577     int cnt = ti->ti_urp - 1;
1578    
1579     while (cnt >= 0) {
1580     if (m->m_len > cnt) {
1581     char *cp = mtod(m, caddr_t) + cnt;
1582     struct tcpcb *tp = sototcpcb(so);
1583    
1584     tp->t_iobc = *cp;
1585     tp->t_oobflags |= TCPOOB_HAVEDATA;
1586     memmove(cp, cp+1, (unsigned)(m->m_len - cnt - 1));
1587     m->m_len--;
1588     return;
1589     }
1590     cnt -= m->m_len;
1591     m = m->m_next; /* XXX WRONG! Fix it! */
1592     if (m == 0)
1593     break;
1594     }
1595     panic("tcp_pulloutofband");
1596     }
1597    
1598     #endif /* notdef */
1599    
1600     /*
1601     * Collect new round-trip time estimate
1602     * and update averages and current timeout.
1603     */
1604    
1605     void
1606     tcp_xmit_timer(tp, rtt)
1607     register struct tcpcb *tp;
1608     int rtt;
1609     {
1610     register short delta;
1611    
1612     DEBUG_CALL("tcp_xmit_timer");
1613     DEBUG_ARG("tp = %lx", (long)tp);
1614     DEBUG_ARG("rtt = %d", rtt);
1615    
1616     tcpstat.tcps_rttupdated++;
1617     if (tp->t_srtt != 0) {
1618     /*
1619     * srtt is stored as fixed point with 3 bits after the
1620     * binary point (i.e., scaled by 8). The following magic
1621     * is equivalent to the smoothing algorithm in rfc793 with
1622     * an alpha of .875 (srtt = rtt/8 + srtt*7/8 in fixed
1623     * point). Adjust rtt to origin 0.
1624     */
1625     delta = rtt - 1 - (tp->t_srtt >> TCP_RTT_SHIFT);
1626     if ((tp->t_srtt += delta) <= 0)
1627     tp->t_srtt = 1;
1628     /*
1629     * We accumulate a smoothed rtt variance (actually, a
1630     * smoothed mean difference), then set the retransmit
1631     * timer to smoothed rtt + 4 times the smoothed variance.
1632     * rttvar is stored as fixed point with 2 bits after the
1633     * binary point (scaled by 4). The following is
1634     * equivalent to rfc793 smoothing with an alpha of .75
1635     * (rttvar = rttvar*3/4 + |delta| / 4). This replaces
1636     * rfc793's wired-in beta.
1637     */
1638     if (delta < 0)
1639     delta = -delta;
1640     delta -= (tp->t_rttvar >> TCP_RTTVAR_SHIFT);
1641     if ((tp->t_rttvar += delta) <= 0)
1642     tp->t_rttvar = 1;
1643     } else {
1644     /*
1645     * No rtt measurement yet - use the unsmoothed rtt.
1646     * Set the variance to half the rtt (so our first
1647     * retransmit happens at 3*rtt).
1648     */
1649     tp->t_srtt = rtt << TCP_RTT_SHIFT;
1650     tp->t_rttvar = rtt << (TCP_RTTVAR_SHIFT - 1);
1651     }
1652     tp->t_rtt = 0;
1653     tp->t_rxtshift = 0;
1654    
1655     /*
1656     * the retransmit should happen at rtt + 4 * rttvar.
1657     * Because of the way we do the smoothing, srtt and rttvar
1658     * will each average +1/2 tick of bias. When we compute
1659     * the retransmit timer, we want 1/2 tick of rounding and
1660     * 1 extra tick because of +-1/2 tick uncertainty in the
1661     * firing of the timer. The bias will give us exactly the
1662     * 1.5 tick we need. But, because the bias is
1663     * statistical, we have to test that we don't drop below
1664     * the minimum feasible timer (which is 2 ticks).
1665     */
1666     TCPT_RANGESET(tp->t_rxtcur, TCP_REXMTVAL(tp),
1667     (short)tp->t_rttmin, TCPTV_REXMTMAX); /* XXX */
1668    
1669     /*
1670     * We received an ack for a packet that wasn't retransmitted;
1671     * it is probably safe to discard any error indications we've
1672     * received recently. This isn't quite right, but close enough
1673     * for now (a route might have failed after we sent a segment,
1674     * and the return path might not be symmetrical).
1675     */
1676     tp->t_softerror = 0;
1677     }
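
The fixed-point smoothing in tcp_xmit_timer() is easier to follow with concrete numbers. This is a minimal sketch under stated assumptions: `struct rtt_state`, `rtt_update` and `rtt_rexmt_ticks` are hypothetical names, and the shifts of 3 and 2 follow the "scaled by 8" and "scaled by 4" comments above. The timeout formula (srtt/8 + rttvar) is an assumption about what TCP_REXMTVAL computes, and the TCPT_RANGESET clamping is left out.

```c
#include <assert.h>

enum { RTT_SHIFT = 3, RTTVAR_SHIFT = 2 };

/* srtt is kept scaled by 8, rttvar scaled by 4 (both in timer ticks). */
struct rtt_state {
    int srtt;
    int rttvar;
};

/* Fold one round-trip sample (in ticks, origin 1) into the averages. */
static void rtt_update(struct rtt_state *s, int rtt)
{
    assert(rtt > 0);
    if (s->srtt != 0) {
        /* srtt += (sample - srtt)/8, done on the scaled value */
        int delta = rtt - 1 - (s->srtt >> RTT_SHIFT);
        if ((s->srtt += delta) <= 0)
            s->srtt = 1;
        /* rttvar += (|delta| - rttvar)/4, done on the scaled value */
        if (delta < 0)
            delta = -delta;
        delta -= s->rttvar >> RTTVAR_SHIFT;
        if ((s->rttvar += delta) <= 0)
            s->rttvar = 1;
    } else {
        /* First sample: srtt = rtt, variance = rtt/2 */
        s->srtt = rtt << RTT_SHIFT;
        s->rttvar = rtt << (RTTVAR_SHIFT - 1);
    }
}

/* Assumed retransmit value: srtt + 4 * rttvar, in ticks. */
static int rtt_rexmt_ticks(const struct rtt_state *s)
{
    return (s->srtt >> RTT_SHIFT) + s->rttvar;
}
```

A first sample of 4 ticks yields a timeout of 12 ticks, which matches the "first retransmit happens at 3*rtt" note in the source comment.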
1678    
1679     /*
1680     * Determine a reasonable value for maxseg size.
1681     * If the route is known, check route for mtu.
1682     * If none, use an mss that can be handled on the outgoing
1683     * interface without forcing IP to fragment; if bigger than
1684     * an mbuf cluster (MCLBYTES), round down to nearest multiple of MCLBYTES
1685     * to utilize large mbufs. If no route is found, route has no mtu,
1686     * or the destination isn't local, use a default, hopefully conservative
1687     * size (usually 512 or the default IP max size, but no more than the mtu
1688     * of the interface), as we can't discover anything about intervening
1689     * gateways or networks. We also initialize the congestion/slow start
1690     * window to be a single segment if the destination isn't local.
1691     * While looking at the routing entry, we also initialize other path-dependent
1692     * parameters from pre-set or cached values in the routing entry.
1693     */
1694    
1695     int
1696     tcp_mss(tp, offer)
1697     register struct tcpcb *tp;
1698     u_int offer;
1699     {
1700     struct socket *so = tp->t_socket;
1701     int mss;
1702    
1703     DEBUG_CALL("tcp_mss");
1704     DEBUG_ARG("tp = %lx", (long)tp);
1705     DEBUG_ARG("offer = %d", offer);
1706    
1707     mss = min(if_mtu, if_mru) - sizeof(struct tcpiphdr);
1708     if (offer)
1709     mss = min(mss, offer);
1710     mss = max(mss, 32);
1711     if (mss < tp->t_maxseg || offer != 0)
1712     tp->t_maxseg = mss;
1713    
1714     tp->snd_cwnd = mss;
1715    
1716     sbreserve(&so->so_snd, tcp_sndspace+((tcp_sndspace%mss)?(mss-(tcp_sndspace%mss)):0));
1717     sbreserve(&so->so_rcv, tcp_rcvspace+((tcp_rcvspace%mss)?(mss-(tcp_rcvspace%mss)):0));
1718    
1719     DEBUG_MISC((dfd, " returning mss = %d\n", mss));
1720    
1721     return mss;
1722     }
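
The segment-size choice and the buffer rounding passed to sbreserve() in tcp_mss() can be checked separately. The sketch below is illustrative only: `pick_mss` and `round_up_to_mss` are hypothetical helpers, and `hdr_len` stands in for `sizeof(struct tcpiphdr)`, whose actual size is not shown in this listing.

```c
#include <assert.h>

/* Round a buffer size up to the next whole multiple of mss. */
static unsigned round_up_to_mss(unsigned space, unsigned mss)
{
    assert(mss > 0);
    unsigned rem = space % mss;
    return rem ? space + (mss - rem) : space;
}

/* Choose an MSS from the link limits and the peer's offer
 * (0 means the peer sent no MSS option), with a floor of 32. */
static unsigned pick_mss(unsigned mtu, unsigned mru,
                         unsigned hdr_len, unsigned offer)
{
    unsigned link = mtu < mru ? mtu : mru;
    assert(link > hdr_len);
    unsigned mss = link - hdr_len;
    if (offer && offer < mss)
        mss = offer;
    if (mss < 32)
        mss = 32;
    return mss;
}
```

For example, assuming a 1500-byte link and a 40-byte header, the MSS is 1460, and an 8192-byte buffer request rounds up to 8760 bytes, which is six full segments.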