This is 'mission control' for ongoing work on the FreeBSD networking code. It is by no means exhaustive or authoritative; however, its use as a tool for effective collaboration is encouraged.
If you want to work on a task please put your WikiName next to it. Participation in other forums such as the freebsd-net mailing list is strongly encouraged. There is plenty for volunteers of all aptitude levels to do; some of this work even has the potential to be funded open source work.
The tasks are arranged roughly in descending order of how quickly they can/should be finished.
Subpages
/10GbE /ReceiveSideScaling /WakeOnLan
Low hanging userland fruit
These are tasks suitable for immediate hand-off to new userland volunteers. They are particularly suitable for newer developers who wish to gain experience.
- Rewrite parts of netstat to not require KVM.
- This is to support the ability to build a FreeBSD system which does not export kernel memory to userland via KVM.
- It should consistently use sysctl(2) for obtaining statistics from a running system.
- It should consistently use KVM *if and only if* it was built with KVM support *and* the -M option was specified on the command line.
- quota(8) is not IPv6 clean.
- who(1) truncates IPv6 addresses in its output.
- rpc.lockd may not be compatible with Linux NFS servers.
- Now that nmount() is in 6.2-STABLE, it should be possible to implement this as an optional feature without patching both world and kernel.
- Implement the equivalent of 'netstat -an' for Bluetooth RFCOMM.
- Write a command which shows current connections and listening channels, or better still integrate into netstat.
The subsystem maintainer, MaksimYevmenkin, should see reviews for this.
- Regression tests for routing sockets.
- As FreeBSD moves towards supporting a multi-path forwarding trie, this is going to be extremely important.
- It would be great if someone could volunteer to write a fairly aggressive set of regression tests for the PF_ROUTE protocol family.
- Regression test for the SO_ACCEPTCONN option.
Import latest OpenBSD dhclient. -- BrooksDavis
- This needs to happen before IP_ONESBCAST and IP_SENDIF support are merged, to eliminate the dependency on BPF.
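As a starting point for the SO_ACCEPTCONN regression test listed above, here is a minimal sketch using only POSIX socket calls. It assumes the expected behaviour is that the option reads 0 on a fresh TCP socket and non-zero after listen(2) succeeds. The function names get_acceptconn() and test_acceptconn() are hypothetical and not part of any existing test suite.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Return 0 or 1 for the SO_ACCEPTCONN state of socket s, or -1 on error. */
int
get_acceptconn(int s)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(s, SOL_SOCKET, SO_ACCEPTCONN, &val, &len) == -1)
		return (-1);
	assert(len == sizeof(val));
	return (val != 0);
}

/*
 * Hypothetical test case.  Returns 0 on success, or a positive step
 * number identifying the first check that failed.
 */
int
test_acceptconn(void)
{
	struct sockaddr_in sin;
	int s, before, after;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s == -1)
		return (1);

	/* A freshly created socket must not report itself as listening. */
	before = get_acceptconn(s);

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_addr.s_addr = htonl(INADDR_ANY);
	sin.sin_port = 0;		/* let the kernel pick a free port */
	if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
	    listen(s, 1) == -1) {
		close(s);
		return (2);
	}

	/* After listen(2) the option must read as set. */
	after = get_acceptconn(s);
	close(s);

	if (before != 0)
		return (3);
	if (after != 1)
		return (4);
	return (0);
}
```

A fuller test would probably repeat the check for other socket types and for sockets that have been connected rather than put into the listen state.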
Low hanging kernel fruit
These are tasks suitable for immediate hand-off to new network stack volunteers.
Rate limit ICMP responses in the forwarding path. -- BruceSimpson
- Add support for the IP_SENDSRCADDR option to rip_output(); OSPF may use it later on.
- This is harder than it looks!
- Reject non-broadcast destinations passed to IP_ONESBCAST with a meaningful error.
- Route lookup does not skip interfaces marked IFF_DOWN.
- This cannot be patched as-is because some parts of the kernel code rely on being able to look up interfaces given a destination via the routing code based on a first match regardless.
- If we take equal-cost multipath we may need to make such a change.
- Clean up IPv4 multicast forwarding.
Use the BSD abstract data type macros. NetBSD patch
- Make IPv4 multicast forwarding scale beyond 32 VIFs.
- Implement an Ethernet feature whereby cards with more than one perfect hash filter entry may be programmed to listen for additional addresses.
This is a useful feature for CARP as it avoids having to put the parent interface into promiscuous mode.
- It does however mean that some of the logic has to change in ether_demux().
- Add support for IGMP v1/2/3 and MLD v1/2 pruning to the if_bridge code.
- Mark the ability to do IFF_ALLMULTI correctly as an interface capability.
- Document hardware which does not support IFF_ALLMULTI, or which has a broken IFF_ALLMULTI mode.
- wi(4) PRISM2 is known to have broken IFF_ALLMULTI.
- Emulate IFF_ALLMULTI using IFF_PROMISC and M_PROMISC.
- Fix the warning about adding a new protocol family after finalization. A quick way to trigger the error is to load the bluetooth stack after boot (which is now done by devd by default on HEAD if a USB bluetooth device is found).
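For the ICMP rate-limiting item at the top of this list, the basic idea can be modelled in userland as a fixed one-second window with a packet budget. This is only a sketch under stated assumptions: struct pps_limit and both functions are hypothetical names, the caller supplies the current time in seconds, and nothing here reflects how the forwarding path is actually structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limiter state; not an existing FreeBSD structure. */
struct pps_limit {
	int64_t  window_start;	/* second in which the current window began */
	uint32_t count;		/* responses allowed so far in this window */
	uint32_t max_pps;	/* responses permitted per one-second window */
};

void
pps_limit_init(struct pps_limit *pl, uint32_t max_pps)
{
	assert(pl != NULL);
	pl->window_start = 0;
	pl->count = 0;
	pl->max_pps = max_pps;
}

/*
 * Decide whether one more ICMP response may be sent at time `now`
 * (in whole seconds).  Returns true if it fits the budget and false
 * if it should be suppressed.
 */
bool
pps_limit_allow(struct pps_limit *pl, int64_t now)
{
	assert(pl != NULL);

	/* A new second starts a new window with a fresh budget. */
	if (now != pl->window_start) {
		pl->window_start = now;
		pl->count = 0;
	}
	if (pl->count >= pl->max_pps)
		return (false);
	pl->count++;
	return (true);
}
```

With a budget of 2, the first two calls in a given second return true, the third returns false, and the first call in the next second returns true again. A kernel implementation would also need to decide whether the limit is global, per interface, or per destination.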
Porting tasks
These are tasks which involve the porting of software to FreeBSD, using the ports system, which will help future network development by making use of new FreeBSD features for 'hot topics' in computer networking in the wider world.
Forward port the Click Modular Router kernel support to FreeBSD 6 and 7. -- AniruddhaBohra
- The code is being updated for the polling, netisr, ifnet, queueing and locking changes in FreeBSD 6 and 7.
The XORP router control plane can use Click as its forwarding path as an alternative to ip_forward().
- Make 'Zeroconf' local area DNS service discovery ('DNS-SD') work.
A new port for kdnssd_avahi is available for testing (http://people.freebsd.org/~bms/dump/kdnssd_avahi.tar, link broken). It overwrites the default KDE libkdnssd.so.1 library with one which uses Avahi as a back-end for mDNS. -- BruceSimpson
- Teach the net/olsrd port to use IP_ONESBCAST.
A patch has been sent to the port maintainer. -- BruceSimpson
Architectural tasks
These are tasks which involve architectural changes in the code base or other significant changes to how FreeBSD currently does things, which will require significant levels of cooperation amongst developers to succeed.
Fix the TCP duplicate ACK problem which has been exposed in 7.0-CURRENT. -- AndreOppermann
- There have been 2 independent reports of this issue on current@; there is no PR for it yet.
- Merge the IP_SENDIF option to skip source interface selection.
- This work is incomplete. IP_SENDIF needs to work with unnumbered interfaces to be useful.
- Fix bad UDP checksums with TXCSUM enabled; code needs a little rethinking.
- The code as it stands in bms_netdev works only with numbered interfaces unless a number of forwarding table kludges are used. in_pcbconnect_setup() needs to be told about IP_SENDIF being in use so as not to attempt to bind temporarily to a local address, nor to attempt to find a route to the destination which breaks source address selection.
- Teach dhclient to use IP_SENDTOIF/IP_ONESBCAST.
- Doing IP_SENDIF and IP_SENDSRCADDR for the Raw IP path requires gnarly code changes to rip_output() and rip_send().
- Allow IP_SENDTOIF without privilege if it is used to send from a numbered interface.
- Extend IP_SENDTOIF to be able to accept layer 2 addresses with sufficient privilege.
- Break ip_output() into a varargs function and perform a detailed performance study for common and high performance use cases.
- Fix IPv4 unicast and broadcast architectural problems.
- Support Equal-Cost Multipath; consider importing this support from OpenBSD.
- Teach stack about interface preference; see also source address selection.
- If the stack can deal with multiple routes, it can deal with 255.255.255.255 in the forwarding trie.
- Deprecate hacks like IP_ONESBCAST as they will then no longer be needed.
PR for this http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/57479
- This effort should be funded as it's mission critical to corporate FreeBSD users.
- Integrate support for RFC 3678, Socket Interface Extensions for Multicast Source Filters.
Phase I, the rewrite of the socket-level multicast option handling code for SSM, is in FreeBSD 7.0. -- BruceSimpson
- Phase II: IGMPv3 support; support for raw IPv4 sockets.
- Phase III: IPv6 and MLDv2 support.
- Funding for phases II and III of this work is available.
- Finish supporting Zeroconf (RFC 3927).
- Add address scope as a concrete kernel concept to both netinet and netinet6.
- Linux implements address scope as part of its struct in_ifaddr.
- Source selection is used throughout its IPv4 stack.
- Scopes are exposed to userland via the Netlink socket. FreeBSD could add it to getifaddrs().
- It may be wise to make the protocol domains link off ifnet separately rather than putting protocol addresses in ifnet as we have done for years before making these changes.
- Consider importing NetBSD source address selection policy (by dyoung).
See http://mail-index.netbsd.org/tech-net/2006/09/02/0000.html
See ftp://cuw.ojctech.com/cuw/netbsd-e3b075d7/pristine-selsrc-patch
- The inpcb connect and bind code will need to be educated about source selection.
- Extend source selection to IGMP to deal with its addressing limitations (see ip(4)).
- Much of the code in dyoung's patch is covered by infrastructure already present in the KAME netinet6 stack as it exists in FreeBSD. See ip6addrctl(8).
- Teach TCP-MD5 (RFC 2385) support to use the IPSEC SADB for its key storage.
- Implement optimized trie lookups for forwarding.
ARP code rewrite. -- AndreOppermann
- Review luigi's last ARP code snapshot with a view to incorporation if it can solve the 255.255.255.255 problem in the shorter term.
- It divorces ARP from the routing table, but some things in the kernel rely on link-layer route cloning (notably ip fast-forwarding, which checks for RTF_BROADCAST to see if it should drop directed broadcasts).
- Rejig ipfw to not require the acquisition of a lock for every packet.
- This would also remove a LOR when testing for uid of the sender/receiver process.
- Generally revisit if_flags, if_drvflags. Look especially hard at the IFF_LINK* flags, which have quite variable semantics and may need to be broken out themselves, as they are sometimes driver flags (and sometimes not).
- General review of struct ifnet locking, ifnet API for device drivers, ifnet locking.
- TCPDEBUG appears not to have been updated for TCP timewait support, and therefore may return incorrect data. It needs to be updated.
- so_upcall needs to be broken out from a single upcall with ambiguous locking into several upcalls, each with carefully documented locking and semantics.
- Review accept filter locking following so_upcall cleanup.
- Continue exploring socket locking strategy changes and vertical protocol lock integration in rwatson_resock and rwatson_resock_virtual.
- For each protosw switch entry, decide whether a thread is needed, or if a ucred would be sufficient. Where possible, pass only a cred.
- Investigate moving to optimistic locking of accept mutex to avoid frequent accept mutex acquires when not needed.
- protosw(9) man page for protocol switch APIs.
- Revisit multi-mbuf allocation, multi-mbuf input.
- ether_input() accepts a list of mbufs
- Investigate solutions to inpcb tear-down races involving timers with tcp -- right now, timers are stopped but not drained, so timers can race with tear-down, hence NULL checks in timer code.
- Revisit locking and atomicity for uipc_send(), and in particular, with respect to what happens if two threads simultaneously call sendto() on different addresses with the same datagram socket. Currently this is likely prevented by sblock() at the system call layer, but races with connect() aren't, as was shown by the recent race case along the same lines. Somehow this needs to be serialized.
- inpcb freeing support, so that PCBs can be freed after a load spike. This requires moving away from weak consistency sysctl monitoring.
- Continue pushing of control of state changes from socket layer to protocol layer so as to improve inter-layer consistency and reduce races between layers on socket state transitions (connect/disconnect/connecting/disconnecting/...)
- Analysis of listen state transitions, which appear poorly defined. Consider in particular interactions with kqueue, where the order of registering kqueue events vs. calling listen affects the semantics considerably.
Integrate solutions for RSS (for reference: ComparingMultiqueueSupportLinuxvsFreeBSD /ReceiveSideScaling)
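To make the address scope item above more concrete, here is a small userland sketch of what classifying IPv4 addresses by scope could look like. Everything named ex_* or EX_* is hypothetical. The three scope levels and the "source scope must be at least as wide as destination scope" rule are simplifying assumptions for illustration, not the Linux or NetBSD policy.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical scope values, ordered from narrowest to widest. */
enum ex_scope {
	EX_SCOPE_HOST,		/* 127.0.0.0/8 loopback */
	EX_SCOPE_LINK,		/* 169.254.0.0/16, RFC 3927 link-local */
	EX_SCOPE_GLOBAL		/* everything else */
};

/* Classify an IPv4 address (network byte order) by scope. */
enum ex_scope
ex_addr_scope(struct in_addr addr)
{
	uint32_t a = ntohl(addr.s_addr);

	if ((a >> 24) == 127)
		return (EX_SCOPE_HOST);
	if ((a >> 16) == 0xa9fe)	/* 169.254 */
		return (EX_SCOPE_LINK);
	return (EX_SCOPE_GLOBAL);
}

/*
 * Illustrative selection rule (an assumption of this sketch): a source
 * address is usable only if its scope is at least as wide as the
 * destination's, so a link-local source is rejected for a global
 * destination.
 */
int
ex_src_usable(struct in_addr src, struct in_addr dst)
{
	return (ex_addr_scope(src) >= ex_addr_scope(dst));
}
```

A real implementation would attach the scope to the interface address when it is configured rather than recomputing it per packet, and would need matching logic for netinet6.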
Long term projects
These are tasks which are more oriented towards documentation and review, or which are significant long-term software projects in and of themselves.
- General stack cleanup.
- Deal with alignment constraints for embedded platforms.
- Use C99 types.
- Clean up re-entrancy poop.
- inet_ntoa() and inet_aton() in kernel have PRs...
- Port NetBSD's MBUFTRACE. Who really owns that mbuf chain which happens to be in your core file?
- This was added to NetBSD MAIN on Feb 26 2003.
NetworkRFCCompliance FreeBSD Networking Standards Compliance Project
- Review FreeBSD TCP SACK implementation against RFC 3517's Conservative SACK Recovery Scheme.
If this is implemented by anyone in any OS, please let allman@icir.org know.
- Implement TCP Quick Start (RFC 4782).
- Quick Start has the potential to improve network performance in situations where the congestion window kicks in due to loss of TCP segments by the link layer, such as 802.11 'Wi-fi' wireless.
- Implement TCP Non Congestion Robustness (RFC 4653).
- It is likely to offer more benefit in the case where TCP performance has been affected by significant packet reordering, such as in GPRS, 3GPP and INMARSAT BGAN networks as well as IP multipathing scenarios.
- Because of the interest in this TCP extension by industry sectors associated with 3GPP, it should be considered as an area for potentially funded FreeBSD work.
- Implement source routing policy.
- This is deliberately vague as it needs further research.
Original research is currently being conducted in the formal academic world into the issues around implementing DiffServ and IntServ, both of which impact forwarding at all layers of the stack.
- This is an area for funded formal research.
- Explore increasing netisr queue depth due to gallatin's report that overflows are causing TCP problems with 10 Gbps interfaces.
- rtentry locking: Yandex (TODO: name) has done a lot of investigation into packet forwarding performance on FreeBSD-9 and FreeBSD-10.
- The locking around rtentry is expensive - do we actually need to refcount rtentry, or can we architect it out by not passing it by reference into the forwarding stack?
- Implement counter(9)-based interface for lockless refcounting on ifp's
- Convert current users of rtentry(9) api to new forwarding functions
- Radix trie efficiencies (or lack thereof)
- Split routing structures (radix/rtentry stuff) into 2 different things: feature-rich RIB for determining best routes among several ones and fast af-dependent one used to forward packets.
- Fix masks bug in radix (done)
- Fix multipath
- Implement API to program FIB for given family
- Implement IPv4 FIB
- mbuf backing store representation
- Under memory fragmentation, a large send buffer can be represented by a large set of underlying, discontiguous memory pages. In the most pessimistic case, a large read via sendfile() can be decomposed into a small number of 4KiB page entries which are individually converted into separate mbufs and then passed to the network stack to send. If some pages are cached but others aren't, then this also results in a list of mbufs that represent each set of pages. Then in the network driver, this list of mbufs has to be walked again and turned into a gather DMA list for the NIC. This all takes time and touches memory.
- Ideally the network stack should be able to deal with an mbuf which is made up of multiple underlying buffers. For the above case(s), an array of pages could be passed up, or some API that allows for 'struct buf' (and others) to be exposed as the underlying buffer without overhead.
- The trick here is that almost all of the networking code assumes that:
- you can directly fondle the m->m_data pointer for a given mbuf, without modifying the actual backing store, in order to reserve header space;
- there's only one buffer per mbuf, that you can access via mtod();
- walking the buffers that make up an mbuf is done by walking the m_next pointer list.
- It may be quite a win to be able to represent mbufs as an abstract data source that will default to a single buffer, but could be backed by an array of vm_page_t entries, or a 'struct buf', or similar. A lot of code may need modifying to handle this (i.e., all the direct mbuf pointer adjustment and manual iteration), but it will result in the mbuf consumer code being tidier.
- Syscall batching
- There's been some recent work by Yahoo (TODO: URL) looking at how effective it is to batch IO related syscalls. They found that grouping up to 32 read/write/accept/connect/close (zero copy or otherwise) syscalls into a single batch gave a significant performance boost.
Contention between TCP RX (ACK -> TCP window update, transmit) and TCP TX
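The three assumptions listed under "mbuf backing store representation" above can be illustrated with a deliberately simplified model. struct xmbuf below is a hypothetical stand-in with just three fields; the real struct mbuf has many more, and the xm_* helpers are not kernel APIs. The sketch shows the patterns that any abstract backing store would have to keep working or replace: walking m_next, touching m_data directly, and moving m_data to reserve header space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct mbuf (hypothetical, userland only). */
struct xmbuf {
	struct xmbuf *m_next;	/* next buffer in this packet's chain */
	char         *m_data;	/* start of valid data in the backing store */
	size_t        m_len;	/* bytes of valid data at m_data */
};

/* Total payload length: walk the m_next list and sum m_len. */
size_t
xm_length(const struct xmbuf *m)
{
	size_t len = 0;

	for (; m != NULL; m = m->m_next)
		len += m->m_len;
	return (len);
}

/*
 * Reserve hlen bytes of header space by moving m_data backwards.
 * Only the pointer and length change; the backing store is untouched.
 * store_start marks the beginning of the store so the room can be checked.
 */
void
xm_prepend(struct xmbuf *m, char *store_start, size_t hlen)
{
	assert((size_t)(m->m_data - store_start) >= hlen);
	m->m_data -= hlen;
	m->m_len += hlen;
}

/* Copy up to len bytes of the chain into dst; returns bytes copied. */
size_t
xm_copyout(const struct xmbuf *m, char *dst, size_t len)
{
	size_t done = 0;

	for (; m != NULL && done < len; m = m->m_next) {
		size_t n = m->m_len;

		if (n > len - done)
			n = len - done;
		memcpy(dst + done, m->m_data, n);
		done += n;
	}
	return (done);
}
```

Each helper depends on there being exactly one contiguous buffer per element, which is why a page-array or 'struct buf' backed representation would touch so much consumer code.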
802.11 Tasks
802.11 Wireless is a 'killer app' for open source operating systems. Help FreeBSD to stay ahead.
- Look at a few things:
IPv6 Cleanup Tasks
- Getting People on IPv6
- Config files need to be updated (network.subr)
- IPv6 sysctls have no description fields, these should be added.
- Hack rtsol(8) to optionally not configure a link-local address.
- This enables one to configure a "static" IPv6 address, by only soliciting a default route.
- Main IPv6 Locking Related
- Lock ip6_init and ip6_init2 against domain initialization (Max Laier)
- raw_ip6 locking could use review and maybe fixes.
- Route locking in IPv6 needs reviewing.
- There's a lack of locking of the raw pcb list in PF_KEY
- Neither the IPv4 nor IPv6 inaddr lists are locked.
- ip6_mroute needs locking. While here, review the ip_mroute code also.
ip6 in6_multiaddr refcounting needs to be fixed, see http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/100579
- mld6 looks suspicious. Watch out for static variables too.
- ip6_id is unsynchronized
- The if_gif support found in IPv4 and IPv6 has locking problems (or lack of locking problems). There are some changes in the netperf branch relating to this, but more work is definitely needed.
- Issues in ifnet handling related to router6_info. Struct ifnet MUST be created via if_alloc (Done?)
- Merge KAME MLDv2 host-mode source-specific multicast.
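For the "ip6_id is unsynchronized" item above, one possible direction is to hand out identifiers with an atomic read-modify-write instead of a plain increment. The sketch below assumes ip6_id is a shared counter read by concurrent senders, which is an assumption of this example rather than something stated above. It uses C11 atomics in userland; the variable and function names are hypothetical, and kernel code would use the atomic(9) primitives instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter standing in for ip6_id. */
static _Atomic uint32_t ex_ip6_id;

/*
 * Return a new identifier.  atomic_fetch_add() returns the value held
 * before the addition, so adding 1 yields the value this caller now
 * owns.  Two concurrent callers can never observe the same result
 * until the 32-bit counter wraps.
 */
uint32_t
ex_ip6_newid(void)
{
	return (atomic_fetch_add(&ex_ip6_id, 1) + 1);
}
```

Successive calls from one thread return consecutive values. Whether a predictable sequence is acceptable for this identifier is a separate question from synchronization.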
Future Cool Stuff
This is a list of stuff which would be cool to have in the future for putting FreeBSD into orbit or onto Antarctica.
- Integrate KAME SHISA Mobility for IPv6 (MIP6)
TsuyoshiMomose and others are working on this. See their home page.
- Implement RSVP and Differentiated Services.
- More thorough research on RSVP would be useful here.
- ALTQ only schedules transmission queues. If we want to preserve QoS information on input, we must do that separately from ALTQ.
- If we want to send 802.1p priority, this should be done as part of a full implementation.
- When layers are being crossed, something between the protocol layer and link layer will be responsible for mapping the DSCP byte to 802.1p.
Until RSVP and DiffServ are fully implemented, the IP_TOS socket option can be used to explicitly set the entire ip_tos byte to a DSCP by an application.
- Create a FreeBSD port of KOM RSVPD (it interoperates with kernel rsvp code and ALTQ).
- Implement generic packet classification capabilities, with a view to the switching of multimedia calls via SIP over lossy / bandwidth limited networks such as 802.11.
I am unable to get KOM RSVPD to build with G++ 3.x and up. My development system is an amd64 and gcc 2.95 is not available for this platform. Some hacking of the autoconf files was needed to get that far. --BruceSimpson
- Support for Operations and Management (OAM).
- Both the Ethernet Slow Protocols (802.3ad) and ATM specify OAM now for managing carrier-grade networks.
- Implement Multi-Protocol Label Switching (MPLS)
- MPLS is widely deployed on Internet network backbones to optimize forwarding and provide advanced IP services.
AlexanderChernikov has begun work on an MPLS implementation for FreeBSD.
- This MPLS implementation is heavily based on netgraph.
- A modified version of the bird routing software is used to provide LDP signalling.
Current project status/wiki/svn can be viewed at http://freebsd.mpls.in
MatthewLuckie has begun porting the Ayame MPLS stack to FreeBSD.
- Ayame contains an implementation of MPLS-TE with an RSVPD.
- This has potential to be a funded software project, though there are competing commercial interests in this area.
Port the Space Communications Protocol Suite (SCPS) to FreeBSD. -- SoerenStraarup
- This is a 'hot topic' in commercial and academic satellite communications (TCP Performance Enhancing Proxies).
Port the MITRE MiniRouter to FreeBSD.
- This potentially builds on and makes use of SCPS.
See http://www.openchannelsoftware.org/projects/MITRE_Minirouter/
- This is a specialist area of research of interest to those in satellite communications.
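The RSVP and Differentiated Services item earlier in this list mentions using the IP_TOS socket option to set the whole ip_tos byte to a DSCP, and mapping the DSCP to 802.1p when crossing layers. A minimal sketch of both steps follows. The function names are hypothetical, and the DSCP to 802.1p mapping shown (taking the top three bits of the DSCP) is one common convention assumed here for illustration, not something this page specifies.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* The DSCP occupies the upper six bits of the IPv4 TOS byte. */
uint8_t
dscp_to_tos(uint8_t dscp)
{
	assert(dscp <= 63);
	return ((uint8_t)(dscp << 2));
}

/*
 * Illustrative DSCP to 802.1p priority mapping: keep the three
 * class-selector bits.  This is an assumed convention, not a
 * requirement of any code described on this page.
 */
uint8_t
dscp_to_pcp(uint8_t dscp)
{
	assert(dscp <= 63);
	return ((uint8_t)(dscp >> 3));
}

/*
 * Ask the stack to mark packets sent on IPv4 socket s with the given
 * DSCP.  Returns 0 on success or -1 with errno set, as setsockopt(2).
 */
int
set_socket_dscp(int s, uint8_t dscp)
{
	int tos = dscp_to_tos(dscp);

	return (setsockopt(s, IPPROTO_IP, IP_TOS, &tos, sizeof(tos)));
}
```

For example, Expedited Forwarding (DSCP 46) becomes TOS byte 0xb8 and maps to 802.1p priority 5 under this convention.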