At this point, the only functionality provided by pfinet that is still pending in the LwIP translator is the --peer option; everything else is done. From now on, most of the work will consist of polishing and fixing bugs, so I'll probably cover several different topics in each post.
The first topic for today is an error I got when I called apt-get update
over LwIP:
../sysdeps/posix/getaddrinfo.c:1722: rfc3484_sort: Assertion `a1->source_addr.sin6_family == PF_INET6' failed.
Unlike the previous errors I faced, this one was located in Glibc and aborted the user program, not the translator. But obviously, it was some unexpected behavior in the translator that led Glibc to fail, so it was time to find it. Looking at the log generated by rpctrace, I could see that the last LwIP-provided operation called by the client before the crash was socket_whatis_address(). I'll try to explain what this operation is for.
Glibc needs to call two complementary operations: socket_create_address() and socket_whatis_address(). The first one creates a libport object containing a struct sockaddr built from the given parameters: family, data and length. The second one does the opposite: given a port, it returns the family, the data and the length of the sockaddr structure contained in the libport object of that port. When Glibc receives a struct sockaddr from the user, it sends its content to create_address() to get a port, which it will later pass to other operations, for instance bind(). In the same way, when some socket operation returns an address, Glibc receives a port and has to call whatis_address() to get the data needed to create a copy of the sockaddr structure for the user. We can see it in action in getpeername().
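To make the round trip concrete, here is a minimal sketch in plain C. It is only a model: struct addr_object, model_create_address() and model_whatis_address() are hypothetical names I made up for illustration, and they don't reproduce the real RPC signatures or the libport machinery. The point is just that whatever family goes in through the first call should come back out of the second.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical stand-in for the libport object that wraps an address. */
struct addr_object {
    struct sockaddr_storage storage;
    socklen_t len;
};

/* Model of create_address(): build the object from family, data and length.
   Returns 0 on success, -1 if the length is out of range. */
int model_create_address(struct addr_object *obj, int family,
                         const void *data, socklen_t len)
{
    if (len < sizeof(struct sockaddr) || len > sizeof obj->storage)
        return -1;
    memset(obj, 0, sizeof *obj);
    memcpy(&obj->storage, data, len);
    obj->storage.ss_family = (sa_family_t) family;
    obj->len = len;
    return 0;
}

/* Model of whatis_address(): the inverse operation. On entry *len is the
   capacity of the caller's buffer; on success it is the stored length. */
int model_whatis_address(const struct addr_object *obj, int *family,
                         void *data, socklen_t *len)
{
    if (*len < obj->len)
        return -1;
    memcpy(data, &obj->storage, obj->len);
    *family = obj->storage.ss_family;
    *len = obj->len;
    return 0;
}

/* Round-trip an IPv6 address and return the family that comes back,
   or -1 if either step fails. */
int roundtrip_ipv6_family(void)
{
    struct sockaddr_in6 in, out;
    struct addr_object obj;
    socklen_t len = sizeof out;
    int family = -1;

    memset(&in, 0, sizeof in);
    in.sin6_family = AF_INET6;
    in.sin6_port = htons(80);

    if (model_create_address(&obj, AF_INET6, &in, sizeof in) != 0)
        return -1;
    if (model_whatis_address(&obj, &family, &out, &len) != 0)
        return -1;

    /* The reported family and the returned structure must agree. */
    assert(out.sin6_family == family);
    return family;
}
```

In this model roundtrip_ipv6_family() gives back AF_INET6. A server that answered with a different family for the same object would break exactly this kind of expectation on the client side.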
Well then, in the rpctrace output I could see that the client was calling create_address() with the family 26 (IPv6), but a bit later, when calling whatis_address() on the same port, the server was returning a sockaddr whose family was 2 (IPv4). The problem was a bug in LwIP that, when asked for the address of an IPv6 socket, returned an IPv4 address if dual-stack was enabled. The LwIP developers fixed this bug months ago and the fix will be ready for the next release, but for now I've applied the patch to my source.
After this problem, I wanted to query the server with inetutils-ifconfig, and found it didn't work because it makes a call to ioctl with the SIOCGIFCONF command, and that command is part of the pfinet interface that was not implemented yet. I wrote the operation, but first I noticed that all ioctl-related operations were written to work only on an Ethernet device, which caused two problems: first, trying an ioctl on the loopback interface led to a crash; and second, I was planning to create a new module to implement PPP, and then all ioctl and pfinet operations would need to be rewritten. For these reasons, I've decided to create a common interface for communicating with device modules. Besides, I've created a new loopback module that sets up the loopback interface in LwIP to adapt it to the Hurd's requirements and provides the functions needed to integrate it with the ioctls.
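For reference, this is roughly what a SIOCGIFCONF request looks like from the client side. It's a minimal sketch, not the code inetutils-ifconfig actually uses: the helper names and MAX_IFACES are hypothetical, and it assumes the Linux/Glibc layout where the kernel fills the buffer with fixed-size struct ifreq entries.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>

#define MAX_IFACES 16   /* arbitrary limit chosen for this sketch */

/* Copy the interface names out of a filled-in struct ifconf.
   Assumes fixed-size struct ifreq entries. Returns the number copied. */
int copy_interface_names(const struct ifconf *ifc,
                         char names[][IFNAMSIZ], int max_names)
{
    int count, i;

    assert(ifc != NULL && names != NULL);
    count = ifc->ifc_len / (int) sizeof(struct ifreq);
    if (count > max_names)
        count = max_names;
    for (i = 0; i < count; i++) {
        memcpy(names[i], ifc->ifc_req[i].ifr_name, IFNAMSIZ);
        names[i][IFNAMSIZ - 1] = '\0';   /* always terminate the copy */
    }
    return count;
}

/* Ask the network stack for its configured interfaces.
   Returns the number of names stored, or -1 on error. */
int query_interface_names(char names[][IFNAMSIZ], int max_names)
{
    struct ifreq reqs[MAX_IFACES];
    struct ifconf ifc;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(reqs, 0, sizeof reqs);
    ifc.ifc_len = sizeof reqs;   /* in: buffer size, out: bytes filled */
    ifc.ifc_req = reqs;

    if (ioctl(fd, SIOCGIFCONF, &ifc) < 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return copy_interface_names(&ifc, names, max_names);
}
```

A caller would declare `char names[MAX_IFACES][IFNAMSIZ]` and pass it to query_interface_names(). On the server side, the translator has to answer this request for every device it manages, which is why having it tied to a single Ethernet device was a problem.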
Another issue I had left pending for some time was improving the performance of the stack. Since the beginning of the project, LwIP never reached more than 40Kbps in download speed. That's because the default TCP tuning parameters in LwIP are rather conservative, e.g. an MSS of 536 bytes or a TCP receive window of 2144 bytes. I tried increasing the receive window to 65536 and setting the MSS to 1460, the usual value for the Internet, and with these settings the stack reaches about 600Kbps, still far from being competitive. I spent a few days trying to make the stack faster, but it has proved to be non-trivial. Increasing the receive window a bit more results in more lost packets, and for some reason LwIP has a hard time dealing with the fast-retransmit mechanism when the window is large enough to receive hundreds of packets before the sender fast-retransmits the lost packet. In practice, this leads the stack to download at full speed for a random time until the first packet is lost, which triggers the fast-retransmit mechanism, which produces more lost packets and so on, preventing the stack from receiving more than 100Kbps.
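In LwIP these parameters are normally overridden in lwipopts.h. The fragment below is a sketch of the settings mentioned above, using the standard LwIP option names TCP_MSS, TCP_WND, LWIP_WND_SCALE and TCP_RCV_SCALE. Enabling window scaling is an assumption on my part for this example: a window of 65536 no longer fits in the 16-bit window field of the TCP header, so the fragment turns scaling on rather than lowering the value to 65535.

```c
/* lwipopts.h fragment: sketch of the TCP tuning discussed above. */

/* Maximum segment size. 1460 bytes fits a 1500-byte Ethernet MTU
   (1500 minus 20 bytes of IPv4 header and 20 bytes of TCP header). */
#define TCP_MSS         1460

/* Assumption: enable window scaling so that a 65536-byte window can
   be advertised (65536 >> 1 fits in the 16-bit header field). */
#define LWIP_WND_SCALE  1
#define TCP_RCV_SCALE   1

/* Receive window in bytes, with scaling already applied. */
#define TCP_WND         65536
```

Depending on the LwIP version and memory configuration, the pbuf pool may also need to be enlarged so that it can actually hold a full window of data.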