Introduction
The FreeBSD Developer Summit associated with EuroBSDCon 2005 took place on Thursday, November 24 and Friday, November 25, 2005, prior to the EuroBSDCon conference technical track.
Schedule ideas
We will have projectors if you want to prepare slides.
Silby wishes to spend some time talking about how students interact with Fedora / other Linux distros and how this impacts FreeBSD.
10 minutes - Overview of Google's Summer of Code program, descriptions of projects and things that might be of general interest to developers, as well as brainstorming on potential future programs. (rwatson)
45 minutes - Discussion of network polling, device polling, interrupts, and things along those lines.
15 minutes - Discussion of models for parallelism in the network stack -- where will adding more threads help, vs hurt. Related to polling discussion. (rwatson)
15 minutes - Formalizing the network stack device driver API -- what should and shouldn't be documented as part of the API?
30 minutes - NFSv4 client, server. Do we jump on the NFSv4 bandwagon? Who wants to lend a hand? What do we get out of NFSv4 anyway?
15 minutes - Timekeeping, timecounters, precision and scheduler/resource accounting. (phk)
15 minutes - Lockless PFIL. (mlaier, andre)
30 minutes - New IPv4 SMP friendly routing table, changes to rtentry/rtsocket. (andre)
15 minutes - Removing inter-linking from INPCBs to allow parallelism for incoming packets. (andre, rwatson)
15 minutes - Incoming concurrent TCP segments on multiple CPUs, lockless per-TCPCB queue? (andre, rwatson)
30 minutes - Interface groups, implications for sockets, jails and (policy) routing. (andre)
Actual Turnout
Thursday:
Friday:
Silby did his Fedora presentation, pointing out things we should think about. Slides?
Chuck Lever gave a high-level overview of NFSv4 and the state of current implementations. Slides?
Poul-Henning gave a detailed explanation of the state of our time infrastructure: TSC vs. ACPI-fast, and why and how the TSC should be used for scheduling/accounting. A proof-of-concept patch has been sent to -current: http://lists.freebsd.org/pipermail/freebsd-current/2005-November/058510.html
After lunch we split into groups.
Network group log (provided by Silby):
Network BoF: started around 2:40pm
Topics to discuss:
- tcpsecure changes
- tcp callout improvements
- stack virtualization vs interface groups
- lockless pfil
- routing table already covered
- network stack device driver API
- polling
- run to completion, other performance stuff
- MD5 checks
- tcpdump-like logging for weird packets
Notes:
Silby is working on tcp sequence numbers, timestamps
Max brings up MD5 verification; we don't think we implement it fully yet
Andre's working on:
- T/TCP version 2
- uses a session cookie, which can secure sessions
- the cookie also allows subsequent connections to skip the 3WHS
Max on PFIL
- hacked up shared lock used by the packet filters
- a few implementations already, but no decision yet on which is best
- linear array
- singly linked list
- general lockless updates would be nice
Most recent idea/implementation here: http://people.freebsd.org/~mlaier/pfil_lockless_20051204.diff
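The "linear array" option with lockless updates can be illustrated by a copy-and-swap scheme: the packet path takes a snapshot reference to an immutable array of hooks and walks it without locking, while a writer builds a new array and publishes it with a single reference assignment. This is only a sketch of the general idea in Python, not the approach taken in the patch above. `HookChain`, `add_hook`, `remove_hook` and `run` are hypothetical names, and a kernel version would additionally need an atomic pointer store and a safe way to free the old array.

```python
import threading


class HookChain:
    """Copy-and-swap hook array: no lock for readers, writers are serialized."""

    def __init__(self):
        self._hooks = ()                       # immutable snapshot seen by readers
        self._writer_lock = threading.Lock()   # only writers contend on this

    def add_hook(self, hook):
        with self._writer_lock:
            # Build a new array, then publish it with one reference assignment.
            self._hooks = self._hooks + (hook,)

    def remove_hook(self, hook):
        with self._writer_lock:
            self._hooks = tuple(h for h in self._hooks if h is not hook)

    def run(self, packet):
        """Pass the packet through every hook; a hook returns None to drop it."""
        hooks = self._hooks                    # snapshot, no lock on the packet path
        for hook in hooks:
            packet = hook(packet)
            if packet is None:
                return None
        return packet
```

A packet already in flight keeps using the snapshot it started with, so a concurrent update never changes the array under a reader.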
Andre
- Atomic ops may be better than locks for an interface queue
Ed:
- device polling
- locking wasn't correct until recently
- only applies to network devices (would be good to extend)
- no feedback from poll handlers telling how much work was done
- falls down when you hit 100% cpu usage
- done per network device instance right now
- SMP issues all resolved, wasn't actually a problem on 4.x SMP anyway
- firewire might be something else that could use it
- is interrupt moderation a good thing to add? - maybe, if possible - Linux does it
- real time data acquisition might want polling as well
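One way to picture the missing feedback from poll handlers: if each handler reported how many packets it actually processed, the polling loop could grow or shrink that device's budget for the next round. The sketch below is a toy model in Python, not the FreeBSD polling code. `FakeNic`, `adjust`, `poll_round` and the budget bounds are all hypothetical.

```python
MIN_BUDGET, MAX_BUDGET = 4, 256   # arbitrary bounds chosen for this example


class FakeNic:
    """Stand-in for a network device with a receive queue."""

    def __init__(self, queued):
        self.queued = queued

    def poll(self, budget):
        # Process at most `budget` packets and report how many were handled.
        done = min(self.queued, budget)
        self.queued -= done
        return done


def adjust(budget, done):
    """Use the handler's feedback to pick the next budget."""
    if done >= budget:                 # budget exhausted: device is busy
        return min(budget * 2, MAX_BUDGET)
    if done < budget // 2:             # mostly idle: give the time back
        return max(budget // 2, MIN_BUDGET)
    return budget


def poll_round(nics, budgets):
    """Poll every device once, updating each per-device budget in place."""
    for name, nic in nics.items():
        done = nic.poll(budgets[name])
        budgets[name] = adjust(budgets[name], done)
```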
silby:
- tcpsecure discussion
- RST solution we have is fine - tcpsecure's is bad
- SYN solution from tcpsecure seems good
- data injection protection seems to make sense
- andre/silby will try to open a constructive dialog with the tcpm working group
andre:
- we need empirical evidence of what tcp problems (out of order, etc) are happening
- Data collectors at yahoo?
silby:
- kernel tcpdump-like
- pflog can be used for this
- queue of packets over the past few seconds
silby/andre:
- no empirical data yet
- optimization: one callout per socket
- rework callout wheel entirely
- locking is one problem
- hash-based nature of it is another
- scales to accuracy of hz, we don't need that accuracy
- change the time of a callout, but leave it in the same bucket; move it to the correct bucket later
- has potential
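The "leave it in the same bucket, move it later" idea can be sketched as a toy timer wheel: rescheduling a timer to a later time only updates its expiry, and the entry is re-filed into the correct bucket when its old bucket is next visited. This is an illustration in Python under simplifying assumptions (single-threaded, delays of at least one tick), not the kernel callout code. `LazyWheel` and its methods are hypothetical names.

```python
class LazyWheel:
    """Toy timer wheel where pushing a timer later defers the bucket move."""

    def __init__(self, nbuckets):
        self.nbuckets = nbuckets
        self.buckets = [[] for _ in range(nbuckets)]
        self.due = {}     # timer name -> tick at which it should fire
        self.filed = {}   # timer name -> tick whose bucket currently holds it
        self.now = 0

    def _file(self, name):
        self.filed[name] = self.due[name]
        self.buckets[self.due[name] % self.nbuckets].append(name)

    def schedule(self, name, ticks):
        # Assumes ticks >= 1.
        self.due[name] = self.now + ticks
        self._file(name)

    def reschedule(self, name, ticks):
        self.due[name] = self.now + ticks
        if self.due[name] < self.filed[name]:
            # Moving earlier cannot be deferred: the old bucket is not
            # guaranteed to be visited in time.
            self.buckets[self.filed[name] % self.nbuckets].remove(name)
            self._file(name)
        # Otherwise only the expiry changed; tick() re-files the entry later.

    def tick(self):
        """Advance one tick and return the names of the timers that fire."""
        self.now += 1
        idx = self.now % self.nbuckets
        pending, self.buckets[idx] = self.buckets[idx], []
        fired = []
        for name in pending:
            if self.due[name] <= self.now:
                fired.append(name)
                del self.due[name], self.filed[name]
            else:
                self._file(name)   # lazy move (possibly back into this bucket)
        return fired
```

The cheap path suits timers that are repeatedly pushed out, since each push costs one assignment instead of a bucket removal and insertion.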
andre/se:
- should we delay acks at all?
silby/andre:
- dragonfly acks all packets that arrived in one batch with only one ack
- more data needed before changing this
andre:
- dynamic tcp socket buffers
silby:
- reduce delayed ack time
- to 3*rtt?
- to 3*inter-packet time?
- tcp retransmit timeouts - do they need tuning for modern networks?
andre:
- bw delay products for various networks
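For reference, the bandwidth-delay product is bandwidth times round-trip time, and it gives the amount of data that must be in flight to keep a path full. A socket buffer smaller than this caps throughput, which is why it matters for dynamic socket buffer sizing. The link figures below are illustrative values picked for the example, not measurements from the discussion, and `bdp_bytes` is a hypothetical helper.

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes (integer arithmetic, rounds down)."""
    return bandwidth_bits_per_s * rtt_ms // 8000


# Illustrative link figures only.
examples = {
    "100 Mbit/s LAN, 1 ms RTT": (100_000_000, 1),
    "10 Mbit/s WAN, 100 ms RTT": (10_000_000, 100),
    "1 Gbit/s long-haul, 100 ms RTT": (1_000_000_000, 100),
}

for name, (bandwidth, rtt) in examples.items():
    print(f"{name}: {bdp_bytes(bandwidth, rtt):,} bytes")
```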
andre:
- stack virtualization vs interface groups
- virtualization causes nearly everything to be virtualized
- interface groups help with
- better jails
- policy routing
- research uses may be better served by virtualization for simulation purposes
- does xen buy the same research options?
- one use of vimage would be to make routing changes, then switch the stack atomically
andre:
- openBGPD might be ported
long xen/vmware discussion
- maybe network performance can be optimized in vmware
andre:
- M_EOR flag will be used by SCTP, so we can't just remove it.
- SCTP has a few different implementations
- the KAME one seems to be out of date
- Linux has a possibly different version
- Solaris has it too, origin unknown
silby:
- ephemeral port quick reuse due to port randomization
- bloom filters?
- list of most recently closed ports in a static array
- array of ports, timestamps of when they were most recently used
- we need to work out connection rate * # ports * desired recycle time to see what is acceptable
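The "array of ports, timestamps" option and the sizing question can be sketched together. The code below keeps one last-closed timestamp per ephemeral port and skips recently closed ports during random selection. The helper at the end gives a rough estimate of how much of the range is held back at a given connection rate. Everything here is an assumption made for illustration: the port range, the names `PortRecycler` and `quarantined_fraction`, and the steady-state estimate, which assumes every closed port is distinct.

```python
import random

PORT_LO, PORT_HI = 49152, 65535   # assumed ephemeral range for this example


class PortRecycler:
    """Static array of last-closed timestamps, one slot per ephemeral port."""

    def __init__(self, quarantine_s):
        self.quarantine_s = quarantine_s
        self.last_closed = [None] * (PORT_HI - PORT_LO + 1)

    def note_close(self, port, now):
        self.last_closed[port - PORT_LO] = now

    def usable(self, port, now):
        closed_at = self.last_closed[port - PORT_LO]
        return closed_at is None or now - closed_at >= self.quarantine_s

    def pick(self, now, rng=random, tries=64):
        """Random port choice that skips recently closed ports."""
        for _ in range(tries):
            port = rng.randint(PORT_LO, PORT_HI)
            if self.usable(port, now):
                return port
        return None   # caller would fall back to some other policy


def quarantined_fraction(conn_rate_per_s, quarantine_s):
    """Rough steady-state share of the port range held back from reuse."""
    nports = PORT_HI - PORT_LO + 1
    return min(1.0, conn_rate_per_s * quarantine_s / nports)
```

With these assumed numbers, 100 connections per second and a 30 second quarantine would hold back about 18% of the range, while 1000 connections per second would exhaust it.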
5:34pm - BoF breaks up