From d149a746478ae0178e16983ac61bb255dd3d7205 Mon Sep 17 00:00:00 2001
From: Joshua Branson
Date: Thu, 10 Sep 2020 09:50:28 -0400
Subject: Better introduce httpfs and xmlfs

hurd/translator/httpfs.mdwn: I added an Intro, How to use, and TODO section.
hurd/translator/xmlfs.mdwn: I added a How to use and TODO wishlist section.

I copied most of the text from the Hurd extras repos.
Message-Id: <20200910135028.27288-1-jbranso@dismail.de>
---
 hurd/translator/httpfs.mdwn | 73 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

(limited to 'hurd/translator/httpfs.mdwn')

diff --git a/hurd/translator/httpfs.mdwn b/hurd/translator/httpfs.mdwn
index 8b02aa06..0fc6fbbd 100644
--- a/hurd/translator/httpfs.mdwn
+++ b/hurd/translator/httpfs.mdwn
@@ -12,6 +12,79 @@ License|/fdl]]."]]"""]]
 While the httpfs translator works, it is only suitable for very simple use cases: it just provides the actual file contents downloaded from the URL, but no additional status information that are necessary for interactive use. (Progress indication, error codes, HTTP redirects etc.)
 
+# Intro
+INTRODUCTION:
+
+Here we describe the structure of the /http filesystem for the Hurd.
+Under the Hurd, we provide a translator called 'httpfs' which is intended
+to provide the filesystem structure.
+
+The httpfs translator accepts an "http:// URL" as an argument. The underlying
+node of the translator can be a file or directory. This is guided by the --mode
+command line option. Default is a directory.
+
+If it's a file, only file system read requests are supported on that node. If
+it's a directory, we can cd into that directory and ls would list the files in
+the web server. A web server may provide a directory listing or it may not;
+in either case the web server always returns an HTML stream
+for a user request (GET command). So to get the files residing in the web
+server, we have to parse the incoming HTML stream to find out the anchor
+tags. These anchor tags point to different pages or files in the web
+server. These file names are extracted and filled into the node of the
+translator. An anchor tag can also be a pointer to an external URL, in such a
+case we just show that URL as a regular file so that the user can make file
+system read requests on that URL. In case the file is a URL, we change the name
+of the URL by replacing all the /'s with .'s so that it can be displayed in the
+file system.
+
+Only the root node is filled when the translator is set, subdirectories inside
+that are filled on demand, i.e. when a cd or ls occurs on that particular
+subdirectory.
+
+The file size is now displayed as 0. One way of getting individual file sizes is
+sending a GET request for each file and culling the file size from the
+Content-Length field of the HTTP response. But this may put a very heavy burden
+on the network, so as of now we have not incorporated this method into this
+http translator.
+
+The translator uses the libxml2 library for doing the parsing of the HTML
+stream. The libxml2 provides SAX interfaces for the parser which are used for
+finding the beginning of anchor tags
-- 
cgit v1.2.3
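
Illustration (not part of the patch): the directory listing described in the Intro, which collects anchor tags from the returned HTML and flattens external URLs into file names, can be sketched in Python. This is only a sketch under stated assumptions. httpfs itself uses libxml2's SAX interface, whereas this uses Python's standard `html.parser`. The names `AnchorCollector`, `entry_name`, and `list_entries` are hypothetical, and treating any href that carries a host part as "external" is an assumption about how the translator decides.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AnchorCollector(HTMLParser):
    """Collect href values of <a> tags (hypothetical helper, not httpfs code)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Called by the parser at the beginning of every tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def entry_name(href):
    """Map an href to the name shown in the directory."""
    # Assumption: an href with a host part counts as an external URL.
    if urlparse(href).netloc:
        # Replace every '/' with '.' so the URL is a valid file name.
        return href.replace("/", ".")
    return href


def list_entries(html):
    """Return one directory entry name per anchor found in the HTML."""
    collector = AnchorCollector()
    collector.feed(html)
    return [entry_name(h) for h in collector.hrefs]


page = (
    '<html><body><a href="index.html">Home</a> '
    '<a href="http://example.org/docs/a.txt">Doc</a></body></html>'
)
print(list_entries(page))
```

For the sample page this prints `['index.html', 'http:..example.org.docs.a.txt']`: the local link keeps its name, while the external URL appears as a single flat entry.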