Diffstat (limited to 'open_issues/bpf.mdwn')
-rw-r--r-- open_issues/bpf.mdwn 372
1 files changed, 371 insertions, 1 deletions
diff --git a/open_issues/bpf.mdwn b/open_issues/bpf.mdwn
index 73f73093..98b50430 100644
--- a/open_issues/bpf.mdwn
+++ b/open_issues/bpf.mdwn
@@ -1,4 +1,4 @@
-[[!meta copyright="Copyright © 2009 Free Software Foundation, Inc."]]
+[[!meta copyright="Copyright © 2009, 2012 Free Software Foundation, Inc."]]
[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
@@ -70,3 +70,373 @@ This is a collection of resources concerning *Berkeley Packet Filter*s.
* [[!GNU_Savannah_patch 6622]] -- pfinet uses the BPF filter
* [[!GNU_Savannah_patch 6851]] -- fix a bug in BPF
+
+
+# IRC
+
+## IRC, freenode, #hurd, 2012-01-13
+
+ <braunr> hm, i think the bpf code needs a complete redesign :p
+ <braunr> unless it's actually a true hurdish way to do things
+ <braunr> antrik: i need your help :)
+ <braunr> antrik: I need advice on the bpf "architecture"
+ <braunr> the current implementation uses a translator installed at /dev/bpf
+ <braunr> which means packets from the kernel are copied to that translator
+ and then to client applications
+ <braunr> does that seem ok to you ?
+ <braunr> couldn't the translator be used to set a direct link between the
+ kernel and the client app ?
+ <braunr> which approach seems the more Hurdish to you ? (<= this is what I
+ need your help on)
+ <pinotree> braunr: so there would be a roundtrip like kernel → bpf
+ translator → pfinet?
+ <antrik> braunr: TBH, I don't see why we need a BPF translator at all...
+ <braunr> antrik: it handles the ioctls
+ <braunr> pinotree: pfinet isn't involved (it was merely modified to use the
+ "new" filter format to specify it used the old packet filter, and not
+ bpf)
+ <antrik> braunr: do we really need to emulate the ioctl()s? can't we assume
+ that all packages using BPF will just use libpcap?
+ <antrik> (and even if we *do* want to emulate ioctl()s, why can't we handle
+ this in libc?)
+ <braunr> antrik: that's what i'm wondering actually
+ <braunr> even if assuming all packages use libpcap, i'd like our bpf
+ interface to be close to what bsds have, and most importantly, what
+ libpcap expects from a bpf interface
+ <antrik> well, why? if we already have a library handling the abstraction,
+ I don't see much point in complicating the design and use by adding
+ another layer :-)
+ <braunr> so you would advise adapting libpcap to include a hurd specific
+ module ?
+ <antrik> there are two reasons for adding translators: more robustness or
+ more flexibility... so far I don't see how a BPF translator would add
+ either
+ <braunr> right
+ <antrik> yes
+ <braunr> so we'd end up with a bpf-like interface, the same instructions
+ and format, with different control calls
+ <antrik> right
+ <antrik> note that I had more or less the same decision to make for KGI
+ (emulate Linux/BSD ioctl()s, or implement a backend in libggi for
+ handling Hurd-specific RPC; and after much consideration, I decided on
+ the latter)
+
+
+## IRC, freenode, #hurd, 2012-01-16
+
+ <braunr> antrik: is there an existing facility to easily give a send right
+ to the device master port to a task ?
+ <braunr> another function of the bpf translator is to handle the /dev/bpf
+ node, and most importantly its permissions
+ <braunr> so that users which have read/write access to the node have access
+ to the packet filter
+ <braunr> i guess the translator could limit itself to that functionality
+ <braunr> and then provide a device port on which libpcap operates directly
+ by means of device_{g,s}et_status/device_set_filter
+ <antrik> braunr: I don't see the point in separating permissions for filter
+ from permissions for general network device access...
+ <antrik> as for device master port, all root tasks can obtain it from proc
+ IIRC
+ <braunr> antrik: yes, but how do we allow non-root users to access that
+ facility ?
+ <braunr> on a unix like system, it's a matter of changing the permissions
+ of /dev/bpf
+ <antrik> with devnode, non-root users can get access to specific device
+ nodes, including network devices
+ <braunr> i can't imagine the hurd being less flexible for that
+ <braunr> ah devnode
+ <braunr> good
+ <antrik> so we can for example make /dev/eth0 accessible by users of some
+ group
+ <braunr> what's devnode exactly ?
+ <antrik> it's a very simple translator that implements an FS node that
+ looks somewhat like a file, but the only operation it supports is
+ obtaining a pseudo device master port, giving access to a specific Mach
+ device
+ <braunr> is it already part of the hurd ?
+ <braunr> or hurdextras maybe ?
+ <antrik> it's only in zhengda's branch
+ <braunr> ah
+ <antrik> needed for both eth-multiplexer and DDE
+ <braunr> and bpf soon i guess
+ <antrik> indeed :-)
+ <braunr> "obtaining a pseudo device master port", i believe you meant a
+ pseudo device port
+ <antrik> I must admit that I don't remember exactly whether devnode proxies
+ device_open(), so clients directly get a port to the device in question, or
+ whether it implements a pseudo device master port...
+ <antrik> but definitely not a pseudo device port :-)
+ <braunr> i'm almost positive it gives the target device port, otherwise i
+ don't see the point
+ <braunr> i don't understand the user of the "pseudo" word here either
+ <braunr> s/user/use/
+ <braunr> aiui, devnode should be started as root (or in any way which gives
+ it the device master port)
+ <antrik> the point is that the client doesn't need to know the Mach device
+ name, and also is not bound to actual kernel devices
+ <braunr> and when started, implement the required permissions before giving
+ clients a device port to the specific device it was installed for
+ <braunr> right
+ <braunr> but it mustn't be a proxy
+ <antrik> yes, devnode needs access to either the real master device port
+ (for kernel devices), or one provided by eth-multiplexer or the DDE
+ network driver
+ <braunr> well, a very simple proxy for device_open
+ <braunr> ok
+ <braunr> that seems exactly what i wanted to do
+ <braunr> we now need to see if we can integrate it separately
+ <braunr> create a separate branch that works for the current gnumach code,
+ and merge dde/other specific code later on
+ <antrik> you mean independent of eth-multiplexer or DDE? yes, it was
+ generally agreed that devnode is a good idea in any case. I have no idea
+ why there are no device nodes for network devices on other UNIX
+ systems...
+ <braunr> i've been wondering that for years too :)
+ <antrik> zhengda's branch has a pfinet modified to a) use devnode, and b)
+ use BPF
+ <braunr> why bpf ?
+ <braunr> for more specific filters maybe ?
+ <antrik> hm... don't remember whether there was any technical reason for
+ going with BPF; I guess it just seemed more reasonable to invest new work
+ in BPF rather than obsolete Mach-specific NPF...
+ <braunr> cspf could be removed altogether, i agree
+ <antrik> another plus side of his modified pfinet is that it actually sets
+ an appropriate filter for TCP/IP and the IP in use, rather than just
+ setting a dummy filter catching app packets (including those irrelevant
+ to the specific pfinet instance)
+ <antrik> err... catching all packets
+ <braunr> that's what i meant by "for more specific filters maybe ?"
+ <braunr> he was probably more comfortable with the bpf interface to write
+ his filter rules
+ <antrik> well, it would probably be doable with NPF too :-) so by itself
+ it's not a reason for switching to BPF...
+ <antrik> it's rather the other way around: as it was necessary to implement
+ filters in eth-multiplexer, and implementing BPF seemed more reasonable,
+ pfinet had to be changed to use BPF...
+ <braunr> antrik: where is zhengda's branch btw ?
+ <antrik> (I guess using proper filters with eth-multiplexer is not strictly
+ necessary; but it would be a major performance hit not to)
+ <antrik> it's in incubator.git
+ <antrik> but it's very messy
+ <braunr> ok
+ <antrik> at some point I asked him to provide cleaned up branches, and I'm
+ pretty sure he said he did, but I totally fail to remember where he
+ published them :-(
+ <braunr> hm, i don't like how devnode is "architectured" :/
+ <braunr> but it makes things a little more easy to get working i guess
+ <LarstiQ> antrik: any idea what to grep the logs on for that?
+ <braunr> ok never mind, devnode is fine
+ <braunr> exactly what i need
+ <braunr> i wonder however if it shouldn't be improved to better handle
+ permissions
+ <braunr> ok, never mind either, permission handling is fine
+ <braunr> so what are we waiting for ? :)
+ <antrik> I remember that there were some issues with permission handling,
+ but I don't remember whether all were fixed :-(
+ <antrik> LarstiQ: hm... good question...
+ <braunr> ah ?
+ <braunr> hm actually, there could be issues for packet filters, yes
+ <braunr> i guess we want to allow e.g. read-only opens for capture only
+ <antrik> braunr: that would have to be handled by the actual BPF
+ implementation I'd say
+ <braunr> it should already be the case
+ <antrik> what's the problem then?
+ <braunr> but when the actual device_open() is performed, the appropriate
+ permissions must be provided
+ <braunr> and checking those is the responsibility of the proxy, devnode in
+ this case
+ <antrik> and it doesn't do that?
+ <braunr> apparently not
+ <braunr> the only check is against the device name
+ <braunr> i'll begin playing with that first
+ <antrik> I vaguely remember that there has been discussion about the
+ relation of underlying device open mode and devnode open mode... but I
+ don't remember the outcome. in fact it was probably one of the
+ discussions I never got around to follow up on... :-(
+ <antrik> before you begin playing, take a look at the relevant messages in
+ the ML archive :-)
+ <antrik> must have been around two years ago
+ <braunr> ok
+ <antrik> some thread with me and scolobb (Sergiu Ivanov +- spelling) and
+ probably zhengda
+ <antrik> there might also be some outstanding patch(es) from scolobb, not
+ sure
+
+
+## IRC, freenode, #hurd, 2012-01-17
+
+ <braunr> antrik: i think i found the thread you mentioned about devnode
+ <braunr> neither sergiu nor zhengda considered the use of a read-only
+ device for packet filtering
+ <braunr> leading to assumptions such as "only receiving packets
+ <braunr> is not terribly useful, in view of the fact that you have to at
+ least
+ <braunr> request them, which implies *sending* packets :-)
+ <braunr> "
+ <braunr> IMO, devnode should definitely check its node permissions to build
+ the device open flags
+ <braunr> good news is that it doesn't depend on anything specific to other
+ incubator projects
+ <braunr> making it almost readily mergeable in the hurd
+ <braunr> i'm not sure devnode is an appropriate name though
+ <braunr> maybe something like device, or devproxy
+ <braunr> proxy-devopen maybe
+ <antrik> braunr: well, I don't remember the details of the discussion; but
+ as I mentioned in some mail, I did actually want to write a followup,
+ just didn't get around to it... so I was definitely not in agreement with
+ some of the statements made by others. I just don't remember on which
+ point :-)
+ <antrik> which thread was it?
+ <antrik> anyways, this should in no way be specific to network
+ devices... the idea is simply that if the client has only read
+ permissions on the device node, it should only get to open the underlying
+ device for read. it's up to the kernel to handle the read-only status for
+ the device once it's opened
+ <antrik> as for the naming, the idea is that devnode simply makes Mach
+ devices accessible through FS nodes... so the name seemed appropriate
+ <antrik> you may be right though that just "device" might be more
+ straightforward... I don't agree on the other variants
+ <braunr> antrik:
+ http://lists.gnu.org/archive/html/bug-hurd/2009-12/msg00155.html
+ <braunr> antrik: i agree with the general idea behind permission handling,
+ i was just referring to their thoughts about it, which probably led to
+ the hard coded READ | WRITE flags
+ <antrik> braunr: unfortunately, I don't remember the context of the
+ discussion... would take me a while to get into this again :-(
+ <antrik> the discussion seems to be about eth-multiplexer as much as about
+ devnode (if not more), and I don't remember the exact interaction
+
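The idea discussed above, that devnode should derive the device open flags
from the permissions on its node rather than hard-coding READ | WRITE, could
be sketched roughly as follows. This is only an illustration in Python, not
devnode's code: `device_open_flags` and its parameters are hypothetical, and
`D_READ` / `D_WRITE` are defined locally on the assumption that they mirror
the Mach device open mode bits. Special handling for root is left out.

```python
import stat

# Assumption: these mirror the Mach device open mode bits; they are defined
# locally so that the sketch stays self-contained.
D_READ = 0x1
D_WRITE = 0x2


def device_open_flags(node_mode, node_uid, node_gid, client_uid, client_gids):
    """Return the device open flags a client may get, based on node permissions.

    Hypothetical helper: picks the owner, group or other permission bits of
    the node that apply to the client, then maps read/write to D_READ/D_WRITE.
    """
    if client_uid == node_uid:
        read_bit, write_bit = stat.S_IRUSR, stat.S_IWUSR
    elif node_gid in client_gids:
        read_bit, write_bit = stat.S_IRGRP, stat.S_IWGRP
    else:
        read_bit, write_bit = stat.S_IROTH, stat.S_IWOTH

    flags = 0
    if node_mode & read_bit:
        flags |= D_READ
    if node_mode & write_bit:
        flags |= D_WRITE
    return flags
```

With a node of mode 0640, a client in the node's group would get a read-only
device port, which is the capture-only case mentioned in the discussion,
while a client with no matching permission bits would get no access at all.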
+
+## IRC, freenode, #hurd, 2012-01-18
+
+ <braunr> so, does anyone have an objection to getting devnode into the hurd
+ and calling it something else like e.g. device ?
+ <youpi> braunr: it's Zhengda's work, right?
+ <braunr> yes
+ <youpi> I'm completely for it, it just perhaps needs some cleanup
+ <braunr> i have a few changes to add to what already exists
+ <braunr> ok
+ <braunr> well i'm assigning myself to the task
+ <antrik> braunr: I'm still not convinced just "device" is preferable
+ <antrik> perhaps machdevice ;-)
+ <antrik> but otherwise, I'd LOVE to see it in :-)
+ <braunr> i don't know .. what if the device is actually eth-multiplexer or
+ a dde one ?
+ <braunr> it's not really "mach", is it ?
+ <braunr> or do we only refer to the interface ?
+ <youpi> that translator is only for mach devices
+ <braunr> so you consider dde devices as being mach devices too ?
+ <braunr> it's a simple proxy for device_open really
+ <youpi> will these devices use that translator?
+ <youpi> ah
+ <youpi> I thought it was using a mach-specific RPC
+ <braunr> so we can consider whatever we want
+ <antrik> braunr: yes, the translator is for Mach device interface only. it
+ might be provided by other servers, but it's still Mach devices
+ <youpi> then drop the mach, yes
+ <braunr> i'd tend to agree with antrik
+ <youpi> antrik: I'd say the device interface is part of the hurd interfaces
+ <braunr> then machdev :p
+ <braunr> no, it's really part of the mach interface
+ <youpi> it's part of the mach interface, yes
+ <youpi> but also of the Hurd, no?
+ <antrik> DDE network servers also use the Mach device interface
+ <braunr> no
+ <youpi> can't we say it's part of it?
+ <youpi> I mean
+ <youpi> even if we change the kernel
+ <braunr> dde is the only thing that implements it besides the kernel that i
+ know of
+ <youpi> we will probably want to keep the same interface
+ <braunr> yes but that's a mach thing
+ <youpi> what we have now is not necessarily a reason
+ <antrik> as for other DDE drivers, I for my part believe they should export
+ proper Hurd (UNIX) device nodes directly... but for some reason zhengda
+ insisted on implementing it as Mach devices too :-(
+ <braunr> antrik: i agree with you on that too
+ <braunr> i was a bit surprised to see the same interface was reused
+ <braunr> youpi: we can, we just have to agree on what we'll do
+ <braunr> what do you mean by "even if we change the kernel" ?
+ <antrik> the problem with "machdev" is that it might suggest the translator
+ actually implements the device... not sure whether this would cause
+ serious confusion
+ <antrik> "devopen" might be another option
+ <antrik> or "machdevopen" to be entirely verbose ;-)
+ <braunr> an option i suggested earlier which you disagreed on :p
+ <braunr> but devopen is the one i'd choose
+ <antrik> youpi: as I already mentioned in the libburn thread, I don't
+ actually think the Mach device interface is very nice; IMHO we should get
+ rid of it as soon as we can, rather than port it to other
+ architectures...
+ <antrik> but even *if* we decided to reuse it after all, it would still be
+ the Mach device interface :-)
+ <braunr> actually, zheng da already suggested that name a long time ago
+ <braunr> http://lists.gnu.org/archive/html/bug-hurd/2008-08/msg00005.html
+ <braunr> no actually antrik did eh
+ <braunr> ok let's use devopen
+ <antrik> braunr: you suggested proxy-devopen, which I didn't like because
+ of the "proxy" part :-)
+ <braunr> not only, but i don't have the logs any more :p
+ <antrik> oh, I already suggested devopen once? didn't expect myself to be
+ that consistent... ;-)
+ <antrik> braunr: you suggested device, devproxy or proxy-devopen
+ <braunr> ah, ok
+ <braunr> devopen is better
+ <antrik> I wonder whether it's more important for clarity to have "mach" in
+ there or "open"... or whether it's really too unwieldy to have both
+
+
+## IRC, freenode, #hurd, 2012-01-21
+
+ <braunr> oh btw, i made devopen run today, it shouldn't be hard getting it
+ in properly
+ <braunr> patching libpcap will be somewhat trickier
+ <braunr> i don't even really need it, but it allows having user access to
+ mach devices, which is nice for the libpcap patch and tcpdump tests
+ <braunr> permission checking is actually its only purpose
+ <braunr> well, no, not really, it also allows opening devices implemented
+ by user space servers transparently
+
+
+## IRC, freenode, #hurd, 2012-01-27
+
+ <braunr> hmm, bpf needs more work :(
+ <braunr> or we can use the userspace bpf filter in libpcap, so that it
+ works with both gnumach and dde drivers
+ <antrik> braunr: there is a userspace BPF implementation in libpcap? I'm
+ surprised that zhengda didn't notice it, and ported the one from gnumach
+ instead...
+ <antrik> what is missing in the kernel implementation?
+ <braunr> antrik: filling the bpf header
+ <braunr> frankly, i'm not sure we want to bother with the kernel
+ implementation
+ <braunr> i'd like it to work with both gnumach and dde drivers
+ <braunr> and in the long run, we'll be using userspace drivers anyway
+ <braunr> the bpf header was one of the things the defunct translator did
+ <braunr> which involved ugly memcpy()s :p
+ <antrik> braunr: well, if you want to get rid of the kernel implementation,
+ basically you would have to take up eth-multiplexer and get it into
+ mainline
+ <antrik> (and make sure it's used by default in Debian)
+ <antrik> I frankly believe it's the better design anyways... but quite a
+ major change :-)
+ <braunr> not that major to me
+ <braunr> in the meantime i'll use the libpcap embedded implementation
+ <braunr> we'll have something useful faster, with minimum work when
+ eth-multiplexer is available
+ <antrik> eth-multiplexer is ready for use, it just needs to go upstream
+ <antrik> though it's probably desirable to switch it to the BPF
+ implementation from libpcap
+ <braunr> using the libpcap implementation in libpcap and in eth-multiplexer
+ are two different things
+ <braunr> the latter is preferable
+ <braunr> (and yes, by available, i meant upstream ofc)
+ <antrik> eth-multiplexer is already using libpcap anyways (for compiling
+ the filters); I'm sure zhengda just didn't realize it has an actual BPF
+ implementation too...
+ <braunr> we want the filter implementation as close to the packet source as
+ possible
+ <antrik> I have been using eth-multiplexer for at least two years now
+ <braunr> hm, there is a "snoop" source type, using raw sockets
+ <braunr> too far from the packet source, but i'll try it anyway
+ <braunr> hm wrong, snoop was the solaris packet filter fyi
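
To illustrate what running a BPF filter in user space involves, here is a
minimal sketch of a classic BPF evaluator in Python. It is not libpcap's
implementation: it handles only the three instructions used by the sample
program, which is modelled on the kind of program libpcap compiles for the
expression `ip` on Ethernet. The names `run_filter` and `IPV4_ONLY` are
hypothetical.

```python
# Classic BPF opcodes used by this sketch.
BPF_LD_H_ABS = 0x28  # A <- 16-bit big-endian value at packet offset k
BPF_JEQ_K = 0x15     # skip jt instructions if A == k, otherwise skip jf
BPF_RET_K = 0x06     # return k: bytes of the packet to accept, 0 means drop

# Each instruction is a (code, jt, jf, k) tuple.
IPV4_ONLY = [
    (BPF_LD_H_ABS, 0, 0, 12),     # load the Ethernet type field
    (BPF_JEQ_K, 0, 1, 0x0800),    # IPv4? fall through, else skip one
    (BPF_RET_K, 0, 0, 0x40000),   # accept (up to 262144 bytes)
    (BPF_RET_K, 0, 0, 0),         # drop
]


def run_filter(program, packet):
    """Evaluate a (very small subset of a) classic BPF program on a packet."""
    acc = 0  # the accumulator register, A
    pc = 0
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code == BPF_LD_H_ABS:
            if k + 2 > len(packet):
                return 0  # out-of-bounds load: reject the packet
            acc = int.from_bytes(packet[k:k + 2], "big")
        elif code == BPF_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")
    return 0  # ran off the end of the program: reject


ipv4_frame = bytes(12) + b"\x08\x00" + bytes(20)
arp_frame = bytes(12) + b"\x08\x06" + bytes(28)
print(run_filter(IPV4_ONLY, ipv4_frame) != 0)
print(run_filter(IPV4_ONLY, arp_frame) != 0)
```

Running the filter close to the packet source, as argued in the discussion,
means packets rejected here are never copied on to the client.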