author | Marcus Brinkmann <marcus@gnu.org> | 2001-10-04 02:35:28 +0000 |
---|---|---|
committer | Marcus Brinkmann <marcus@gnu.org> | 2001-10-04 02:35:28 +0000 |
commit | 21ea10295a2b95b5b89ee0097aabde64ce01317c (patch) | |
tree | d505d53a5e225a5fbde172af588c190d2dd51aa0 /doc | |
parent | 5df48e5543060daf29f59b910c567d74205da9d6 (diff) |
2001-10-04 Marcus Brinkmann <marcus@gnu.org>
* doc: New directory.
* doc/Makefile.in: New file.
* doc/gpl.texi: Likewise.
* doc/fdl.texi: Likewise.
* doc/mach.texi: Likewise.
* configure.in: Add doc/Makefile to AC_OUTPUT call.
* configure: Regenerated.
* Makefile.in (dist): Create directories doc and debian.
(doc-files): New variable with documentation files.
(debian-files): New variable with Debian packaging files.
* debian/rules (stamp-build): Build documentation.
(build-gnumach): Install the documentation into the gnumach
package.
* debian/postrm: New file to install info document.
* debian/prerm: New file to install info document.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/Makefile.in | 65 |
-rw-r--r-- | doc/fdl.texi | 402 |
-rw-r--r-- | doc/gpl.texi | 397 |
-rw-r--r-- | doc/mach.texi | 7100 |
4 files changed, 7964 insertions, 0 deletions
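The `%.d` pattern rule in the new doc/Makefile.in derives make dependencies from the `@include` lines of a Texinfo file. The following is a minimal sketch of that pipeline run outside of make. The file `demo.texi`, the temporary directory, and the helper name `gen_deps` are assumptions made for illustration and are not part of the commit.

```shell
#!/bin/sh
# Stand-alone sketch of the dependency generation done by the %.d rule.
# "demo" and gen_deps are hypothetical names used only for this example.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# A tiny Texinfo input with two @include lines.
cat > "$tmp/demo.texi" <<'EOF'
\input texinfo
@include version.texi
@chapter Intro
@include gpl.texi
EOF

# gen_deps STEM FILE: print a make rule that lists every file FILE includes.
gen_deps() {
    stem=$1
    # Rule head, ending in a backslash line continuation.
    printf '%s\n' "$stem.info $stem.dvi: \\"
    # One continuation line per @include; the sed keeps only the file name.
    grep '^@include ' "$2" | sed -e 's/^[^ ]*[ ]*\([^ ]*\).*$/ \1 \\/'
    # The empty line terminates the continued rule.
    echo
}

gen_deps demo "$tmp/demo.texi"
# Prints:
# demo.info demo.dvi: \
#  version.texi \
#  gpl.texi \
# (followed by an empty line)
```

Inside the Makefile the `$` that ends the sed pattern is written `$$` so that make passes a literal `$` to the shell. The sketch uses `printf` rather than `echo` for the rule head to avoid depending on how a given shell's `echo` treats backslashes.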
diff --git a/doc/Makefile.in b/doc/Makefile.in
new file mode 100644
index 0000000..562401b
--- /dev/null
+++ b/doc/Makefile.in
@@ -0,0 +1,65 @@
+#
+# Copyright (C) 2001 Free Software Foundation
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation; either version 2, or (at
+# your option) any later version.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+mach-version := 1.2
+targets := mach.info
+
+# Variables from `configure'.
+srcdir=@srcdir@
+prefix=@prefix@
+
+infodir=$(prefix)/info
+
+DVIPS = dvips
+
+INSTALL_PROGRAM = @INSTALL_PROGRAM@
+
+VPATH = $(srcdir)
+
+all: $(targets)
+
+# For each .info file we need a .d file.
+-include $(patsubst %.info,%.d,$(filter %.info,$(targets))) /dev/null
+
+# Build dependencies from included files.
+%.d: %.texi
+	set -e; (echo "$*.info $*.dvi: \\"; grep '^@include ' $< | \
+	sed -e 's/^[^ ]*[ ]*\([^ ]*\).*$$/ \1 \\/'; \
+	echo) > $@.new
+	mv -f $@.new $@
+
+%.info: %.texi
+	@rm -f $@ $@-[0-9] $@-[0-9][0-9]
+	$(MAKEINFO) -I $(@D) -I $(<D) $<
+
+.PRECIOUS: %.dvi
+%.dvi: %.texi
+	TEXINPUTS=$(srcdir):$$TEXINPUTS \
+	MAKEINFO='$(MAKEINFO) -I $(srcdir)' $(TEXI2DVI) $<
+
+%.ps: %.dvi
+	$(DVIPS) $< -o $@
+
+# move-if-change = $(SHELL) $(top_srcdir)/move-if-change
+# For now:
+move-if-change = mv
+
+version.texi: stamp-version; @:
+stamp-version:
+	echo '@set VERSION $(mach-version)' > version.texi.new
+	$(move-if-change) version.texi.new version.texi
+	touch $@
diff --git a/doc/fdl.texi b/doc/fdl.texi
new file mode 100644
index 0000000..50028ab
--- /dev/null
+++ b/doc/fdl.texi
@@ -0,0 +1,402 @@
+@node Free Documentation License
+@appendix GNU Free Documentation License
+
+@cindex FDL, GNU Free Documentation License
+@center Version 1.1, March 2000
+
+@display
+Copyright @copyright{} 2000 Free Software Foundation, Inc.
+59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+@end display
+
+@enumerate 0
+@item
+PREAMBLE
+
+The purpose of this License is to make a manual, textbook, or other
+written document @dfn{free} in the sense of freedom: to assure everyone
+the effective freedom to copy and redistribute it, with or without
+modifying it, either commercially or noncommercially. Secondarily,
+this License preserves for the author and publisher a way to get
+credit for their work, while not being considered responsible for
+modifications made by others.
+
+This License is a kind of ``copyleft'', which means that derivative
+works of the document must themselves be free in the same sense. It
+complements the GNU General Public License, which is a copyleft
+license designed for free software.
+ +We have designed this License in order to use it for manuals for free +software, because free software needs free documentation: a free +program should come with manuals providing the same freedoms that the +software does. But this License is not limited to software manuals; +it can be used for any textual work, regardless of subject matter or +whether it is published as a printed book. We recommend this License +principally for works whose purpose is instruction or reference. + +@item +APPLICABILITY AND DEFINITIONS + +This License applies to any manual or other work that contains a +notice placed by the copyright holder saying it can be distributed +under the terms of this License. The ``Document'', below, refers to any +such manual or work. Any member of the public is a licensee, and is +addressed as ``you''. + +A ``Modified Version'' of the Document means any work containing the +Document or a portion of it, either copied verbatim, or with +modifications and/or translated into another language. + +A ``Secondary Section'' is a named appendix or a front-matter section of +the Document that deals exclusively with the relationship of the +publishers or authors of the Document to the Document's overall subject +(or to related matters) and contains nothing that could fall directly +within that overall subject. (For example, if the Document is in part a +textbook of mathematics, a Secondary Section may not explain any +mathematics.) The relationship could be a matter of historical +connection with the subject or with related matters, or of legal, +commercial, philosophical, ethical or political position regarding +them. + +The ``Invariant Sections'' are certain Secondary Sections whose titles +are designated, as being those of Invariant Sections, in the notice +that says that the Document is released under this License. 
+ +The ``Cover Texts'' are certain short passages of text that are listed, +as Front-Cover Texts or Back-Cover Texts, in the notice that says that +the Document is released under this License. + +A ``Transparent'' copy of the Document means a machine-readable copy, +represented in a format whose specification is available to the +general public, whose contents can be viewed and edited directly and +straightforwardly with generic text editors or (for images composed of +pixels) generic paint programs or (for drawings) some widely available +drawing editor, and that is suitable for input to text formatters or +for automatic translation to a variety of formats suitable for input +to text formatters. A copy made in an otherwise Transparent file +format whose markup has been designed to thwart or discourage +subsequent modification by readers is not Transparent. A copy that is +not ``Transparent'' is called ``Opaque''. + +Examples of suitable formats for Transparent copies include plain +@sc{ascii} without markup, Texinfo input format, La@TeX{} input format, +@acronym{SGML} or @acronym{XML} using a publicly available +@acronym{DTD}, and standard-conforming simple @acronym{HTML} designed +for human modification. Opaque formats include PostScript, +@acronym{PDF}, proprietary formats that can be read and edited only by +proprietary word processors, @acronym{SGML} or @acronym{XML} for which +the @acronym{DTD} and/or processing tools are not generally available, +and the machine-generated @acronym{HTML} produced by some word +processors for output purposes only. + +The ``Title Page'' means, for a printed book, the title page itself, +plus such following pages as are needed to hold, legibly, the material +this License requires to appear in the title page. For works in +formats which do not have any title page as such, ``Title Page'' means +the text near the most prominent appearance of the work's title, +preceding the beginning of the body of the text. 
+ +@item +VERBATIM COPYING + +You may copy and distribute the Document in any medium, either +commercially or noncommercially, provided that this License, the +copyright notices, and the license notice saying this License applies +to the Document are reproduced in all copies, and that you add no other +conditions whatsoever to those of this License. You may not use +technical measures to obstruct or control the reading or further +copying of the copies you make or distribute. However, you may accept +compensation in exchange for copies. If you distribute a large enough +number of copies you must also follow the conditions in section 3. + +You may also lend copies, under the same conditions stated above, and +you may publicly display copies. + +@item +COPYING IN QUANTITY + +If you publish printed copies of the Document numbering more than 100, +and the Document's license notice requires Cover Texts, you must enclose +the copies in covers that carry, clearly and legibly, all these Cover +Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on +the back cover. Both covers must also clearly and legibly identify +you as the publisher of these copies. The front cover must present +the full title with all words of the title equally prominent and +visible. You may add other material on the covers in addition. +Copying with changes limited to the covers, as long as they preserve +the title of the Document and satisfy these conditions, can be treated +as verbatim copying in other respects. + +If the required texts for either cover are too voluminous to fit +legibly, you should put the first ones listed (as many as fit +reasonably) on the actual cover, and continue the rest onto adjacent +pages. 
+ +If you publish or distribute Opaque copies of the Document numbering +more than 100, you must either include a machine-readable Transparent +copy along with each Opaque copy, or state in or with each Opaque copy +a publicly-accessible computer-network location containing a complete +Transparent copy of the Document, free of added material, which the +general network-using public has access to download anonymously at no +charge using public-standard network protocols. If you use the latter +option, you must take reasonably prudent steps, when you begin +distribution of Opaque copies in quantity, to ensure that this +Transparent copy will remain thus accessible at the stated location +until at least one year after the last time you distribute an Opaque +copy (directly or through your agents or retailers) of that edition to +the public. + +It is requested, but not required, that you contact the authors of the +Document well before redistributing any large number of copies, to give +them a chance to provide you with an updated version of the Document. + +@item +MODIFICATIONS + +You may copy and distribute a Modified Version of the Document under +the conditions of sections 2 and 3 above, provided that you release +the Modified Version under precisely this License, with the Modified +Version filling the role of the Document, thus licensing distribution +and modification of the Modified Version to whoever possesses a copy +of it. In addition, you must do these things in the Modified Version: + +@enumerate A +@item +Use in the Title Page (and on the covers, if any) a title distinct +from that of the Document, and from those of previous versions +(which should, if there were any, be listed in the History section +of the Document). You may use the same title as a previous version +if the original publisher of that version gives permission. 
+ +@item +List on the Title Page, as authors, one or more persons or entities +responsible for authorship of the modifications in the Modified +Version, together with at least five of the principal authors of the +Document (all of its principal authors, if it has less than five). + +@item +State on the Title page the name of the publisher of the +Modified Version, as the publisher. + +@item +Preserve all the copyright notices of the Document. + +@item +Add an appropriate copyright notice for your modifications +adjacent to the other copyright notices. + +@item +Include, immediately after the copyright notices, a license notice +giving the public permission to use the Modified Version under the +terms of this License, in the form shown in the Addendum below. + +@item +Preserve in that license notice the full lists of Invariant Sections +and required Cover Texts given in the Document's license notice. + +@item +Include an unaltered copy of this License. + +@item +Preserve the section entitled ``History'', and its title, and add to +it an item stating at least the title, year, new authors, and +publisher of the Modified Version as given on the Title Page. If +there is no section entitled ``History'' in the Document, create one +stating the title, year, authors, and publisher of the Document as +given on its Title Page, then add an item describing the Modified +Version as stated in the previous sentence. + +@item +Preserve the network location, if any, given in the Document for +public access to a Transparent copy of the Document, and likewise +the network locations given in the Document for previous versions +it was based on. These may be placed in the ``History'' section. +You may omit a network location for a work that was published at +least four years before the Document itself, or if the original +publisher of the version it refers to gives permission. 
+ +@item +In any section entitled ``Acknowledgments'' or ``Dedications'', +preserve the section's title, and preserve in the section all the +substance and tone of each of the contributor acknowledgments +and/or dedications given therein. + +@item +Preserve all the Invariant Sections of the Document, +unaltered in their text and in their titles. Section numbers +or the equivalent are not considered part of the section titles. + +@item +Delete any section entitled ``Endorsements''. Such a section +may not be included in the Modified Version. + +@item +Do not retitle any existing section as ``Endorsements'' +or to conflict in title with any Invariant Section. +@end enumerate + +If the Modified Version includes new front-matter sections or +appendices that qualify as Secondary Sections and contain no material +copied from the Document, you may at your option designate some or all +of these sections as invariant. To do this, add their titles to the +list of Invariant Sections in the Modified Version's license notice. +These titles must be distinct from any other section titles. + +You may add a section entitled ``Endorsements'', provided it contains +nothing but endorsements of your Modified Version by various +parties---for example, statements of peer review or that the text has +been approved by an organization as the authoritative definition of a +standard. + +You may add a passage of up to five words as a Front-Cover Text, and a +passage of up to 25 words as a Back-Cover Text, to the end of the list +of Cover Texts in the Modified Version. Only one passage of +Front-Cover Text and one of Back-Cover Text may be added by (or +through arrangements made by) any one entity. If the Document already +includes a cover text for the same cover, previously added by you or +by arrangement made by the same entity you are acting on behalf of, +you may not add another; but you may replace the old one, on explicit +permission from the previous publisher that added the old one. 
+ +The author(s) and publisher(s) of the Document do not by this License +give permission to use their names for publicity for or to assert or +imply endorsement of any Modified Version. + +@item +COMBINING DOCUMENTS + +You may combine the Document with other documents released under this +License, under the terms defined in section 4 above for modified +versions, provided that you include in the combination all of the +Invariant Sections of all of the original documents, unmodified, and +list them all as Invariant Sections of your combined work in its +license notice. + +The combined work need only contain one copy of this License, and +multiple identical Invariant Sections may be replaced with a single +copy. If there are multiple Invariant Sections with the same name but +different contents, make the title of each such section unique by +adding at the end of it, in parentheses, the name of the original +author or publisher of that section if known, or else a unique number. +Make the same adjustment to the section titles in the list of +Invariant Sections in the license notice of the combined work. + +In the combination, you must combine any sections entitled ``History'' +in the various original documents, forming one section entitled +``History''; likewise combine any sections entitled ``Acknowledgments'', +and any sections entitled ``Dedications''. You must delete all sections +entitled ``Endorsements.'' + +@item +COLLECTIONS OF DOCUMENTS + +You may make a collection consisting of the Document and other documents +released under this License, and replace the individual copies of this +License in the various documents with a single copy that is included in +the collection, provided that you follow the rules of this License for +verbatim copying of each of the documents in all other respects. 
+ +You may extract a single document from such a collection, and distribute +it individually under this License, provided you insert a copy of this +License into the extracted document, and follow this License in all +other respects regarding verbatim copying of that document. + +@item +AGGREGATION WITH INDEPENDENT WORKS + +A compilation of the Document or its derivatives with other separate +and independent documents or works, in or on a volume of a storage or +distribution medium, does not as a whole count as a Modified Version +of the Document, provided no compilation copyright is claimed for the +compilation. Such a compilation is called an ``aggregate'', and this +License does not apply to the other self-contained works thus compiled +with the Document, on account of their being thus compiled, if they +are not themselves derivative works of the Document. + +If the Cover Text requirement of section 3 is applicable to these +copies of the Document, then if the Document is less than one quarter +of the entire aggregate, the Document's Cover Texts may be placed on +covers that surround only the Document within the aggregate. +Otherwise they must appear on covers around the whole aggregate. + +@item +TRANSLATION + +Translation is considered a kind of modification, so you may +distribute translations of the Document under the terms of section 4. +Replacing Invariant Sections with translations requires special +permission from their copyright holders, but you may include +translations of some or all Invariant Sections in addition to the +original versions of these Invariant Sections. You may include a +translation of this License provided that you also include the +original English version of this License. In case of a disagreement +between the translation and the original English version of this +License, the original English version will prevail. 
+ +@item +TERMINATION + +You may not copy, modify, sublicense, or distribute the Document except +as expressly provided for under this License. Any other attempt to +copy, modify, sublicense or distribute the Document is void, and will +automatically terminate your rights under this License. However, +parties who have received copies, or rights, from you under this +License will not have their licenses terminated so long as such +parties remain in full compliance. + +@item +FUTURE REVISIONS OF THIS LICENSE + +The Free Software Foundation may publish new, revised versions +of the GNU Free Documentation License from time to time. Such new +versions will be similar in spirit to the present version, but may +differ in detail to address new problems or concerns. See +@uref{http://www.gnu.org/copyleft/}. + +Each version of the License is given a distinguishing version number. +If the Document specifies that a particular numbered version of this +License ``or any later version'' applies to it, you have the option of +following the terms and conditions either of that specified version or +of any later version that has been published (not as a draft) by the +Free Software Foundation. If the Document does not specify a version +number of this License, you may choose any version ever published (not +as a draft) by the Free Software Foundation. +@end enumerate + +@page +@appendixsubsec ADDENDUM: How to use this License for your documents + +To use this License in a document you have written, include a copy of +the License in the document and put the following copyright and +license notices just after the title page: + +@smallexample +@group + Copyright (C) @var{year} @var{your name}. 
+ Permission is granted to copy, distribute and/or modify this document + under the terms of the GNU Free Documentation License, Version 1.1 + or any later version published by the Free Software Foundation; + with the Invariant Sections being @var{list their titles}, with the + Front-Cover Texts being @var{list}, and with the Back-Cover Texts being @var{list}. + A copy of the license is included in the section entitled ``GNU + Free Documentation License''. +@end group +@end smallexample + +If you have no Invariant Sections, write ``with no Invariant Sections'' +instead of saying which ones are invariant. If you have no +Front-Cover Texts, write ``no Front-Cover Texts'' instead of +``Front-Cover Texts being @var{list}''; likewise for Back-Cover Texts. + +If your document contains nontrivial examples of program code, we +recommend releasing these examples in parallel under your choice of +free software license, such as the GNU General Public License, +to permit their use in free software. + +@c Local Variables: +@c ispell-local-pdict: "ispell-dict" +@c End: + diff --git a/doc/gpl.texi b/doc/gpl.texi new file mode 100644 index 0000000..ca0508f --- /dev/null +++ b/doc/gpl.texi @@ -0,0 +1,397 @@ +@node Copying +@appendix GNU GENERAL PUBLIC LICENSE + +@cindex GPL, GNU General Public License +@center Version 2, June 1991 + +@display +Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc. +59 Temple Place -- Suite 330, Boston, MA 02111-1307, USA + +Everyone is permitted to copy and distribute verbatim copies +of this license document, but changing it is not allowed. +@end display + +@appendixsubsec Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +License is intended to guarantee your freedom to share and change free +software---to make sure the software is free for all its users. 
This +General Public License applies to most of the Free Software +Foundation's software and to any other program whose authors commit to +using it. (Some other Free Software Foundation software is covered by +the GNU Library General Public License instead.) You can apply it to +your programs, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if you +distribute copies of the software, or if you modify it. + + For example, if you distribute copies of such a program, whether +gratis or for a fee, you must give the recipients all the rights that +you have. You must make sure that they, too, receive or can get the +source code. And you must show them these terms so they know their +rights. + + We protect your rights with two steps: (1) copyright the software, and +(2) offer you this license which gives you legal permission to copy, +distribute and/or modify the software. + + Also, for each author's protection and ours, we want to make certain +that everyone understands that there is no warranty for this free +software. If the software is modified by someone else and passed on, we +want its recipients to know that what they have is not the original, so +that any problems introduced by others will not reflect on the original +authors' reputations. + + Finally, any free program is threatened constantly by software +patents. 
We wish to avoid the danger that redistributors of a free +program will individually obtain patent licenses, in effect making the +program proprietary. To prevent this, we have made it clear that any +patent must be licensed for everyone's free use or not licensed at all. + + The precise terms and conditions for copying, distribution and +modification follow. + +@iftex +@appendixsubsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end iftex +@ifinfo +@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end ifinfo + +@enumerate +@item +This License applies to any program or other work which contains +a notice placed by the copyright holder saying it may be distributed +under the terms of this General Public License. The ``Program'', below, +refers to any such program or work, and a ``work based on the Program'' +means either the Program or any derivative work under copyright law: +that is to say, a work containing the Program or a portion of it, +either verbatim or with modifications and/or translated into another +language. (Hereinafter, translation is included without limitation in +the term ``modification''.) Each licensee is addressed as ``you''. + +Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running the Program is not restricted, and the output from the Program +is covered only if its contents constitute a work based on the +Program (independent of having been made by running the Program). +Whether that is true depends on what the Program does. 
+ +@item +You may copy and distribute verbatim copies of the Program's +source code as you receive it, in any medium, provided that you +conspicuously and appropriately publish on each copy an appropriate +copyright notice and disclaimer of warranty; keep intact all the +notices that refer to this License and to the absence of any warranty; +and give any other recipients of the Program a copy of this License +along with the Program. + +You may charge a fee for the physical act of transferring a copy, and +you may at your option offer warranty protection in exchange for a fee. + +@item +You may modify your copy or copies of the Program or any portion +of it, thus forming a work based on the Program, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + +@enumerate a +@item +You must cause the modified files to carry prominent notices +stating that you changed the files and the date of any change. + +@item +You must cause any work that you distribute or publish, that in +whole or in part contains or is derived from the Program or any +part thereof, to be licensed as a whole at no charge to all third +parties under the terms of this License. + +@item +If the modified program normally reads commands interactively +when run, you must cause it, when started running for such +interactive use in the most ordinary way, to print or display an +announcement including an appropriate copyright notice and a +notice that there is no warranty (or else, saying that you provide +a warranty) and that users may redistribute the program under +these conditions, and telling the user how to view a copy of this +License. (Exception: if the Program itself is interactive but +does not normally print such an announcement, your work based on +the Program is not required to print an announcement.) +@end enumerate + +These requirements apply to the modified work as a whole. 
If +identifiable sections of that work are not derived from the Program, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Program, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Program. + +In addition, mere aggregation of another work not based on the Program +with the Program (or with a work based on the Program) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + +@item +You may copy and distribute the Program (or a work based on it, +under Section 2) in object code or executable form under the terms of +Sections 1 and 2 above provided that you also do one of the following: + +@enumerate a +@item +Accompany it with the complete corresponding machine-readable +source code, which must be distributed under the terms of Sections +1 and 2 above on a medium customarily used for software interchange; or, + +@item +Accompany it with a written offer, valid for at least three +years, to give any third party, for a charge no more than your +cost of physically performing source distribution, a complete +machine-readable copy of the corresponding source code, to be +distributed under the terms of Sections 1 and 2 above on a medium +customarily used for software interchange; or, + +@item +Accompany it with the information you received as to the offer +to distribute corresponding source code. 
(This alternative is +allowed only for noncommercial distribution and only if you +received the program in object code or executable form with such +an offer, in accord with Subsection b above.) +@end enumerate + +The source code for a work means the preferred form of the work for +making modifications to it. For an executable work, complete source +code means all the source code for all modules it contains, plus any +associated interface definition files, plus the scripts used to +control compilation and installation of the executable. However, as a +special exception, the source code distributed need not include +anything that is normally distributed (in either source or binary +form) with the major components (compiler, kernel, and so on) of the +operating system on which the executable runs, unless that component +itself accompanies the executable. + +If distribution of executable or object code is made by offering +access to copy from a designated place, then offering equivalent +access to copy the source code from the same place counts as +distribution of the source code, even though third parties are not +compelled to copy the source along with the object code. + +@item +You may not copy, modify, sublicense, or distribute the Program +except as expressly provided under this License. Any attempt +otherwise to copy, modify, sublicense or distribute the Program is +void, and will automatically terminate your rights under this License. +However, parties who have received copies, or rights, from you under +this License will not have their licenses terminated so long as such +parties remain in full compliance. + +@item +You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Program or its derivative works. These actions are +prohibited by law if you do not accept this License. 
Therefore, by +modifying or distributing the Program (or any work based on the +Program), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Program or works based on it. + +@item +Each time you redistribute the Program (or any work based on the +Program), the recipient automatically receives a license from the +original licensor to copy, distribute or modify the Program subject to +these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + +@item +If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Program at all. For example, if a patent +license would not permit royalty-free redistribution of the Program by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Program. + +If any portion of this section is held invalid or unenforceable under +any particular circumstance, the balance of the section is intended to +apply and the section as a whole is intended to apply in other +circumstances. 
+ +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system, which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + +@item +If the distribution and/or use of the Program is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Program under this License +may add an explicit geographical distribution limitation excluding +those countries, so that distribution is permitted only in or among +countries not thus excluded. In such case, this License incorporates +the limitation as if written in the body of this License. + +@item +The Free Software Foundation may publish revised and/or new versions +of the General Public License from time to time. Such new versions will +be similar in spirit to the present version, but may differ in detail to +address new problems or concerns. + +Each version is given a distinguishing version number. If the Program +specifies a version number of this License which applies to it and ``any +later version'', you have the option of following the terms and conditions +either of that version or of any later version published by the Free +Software Foundation. If the Program does not specify a version number of +this License, you may choose any version ever published by the Free Software +Foundation. 
+ +@item +If you wish to incorporate parts of the Program into other free +programs whose distribution conditions are different, write to the author +to ask for permission. For software which is copyrighted by the Free +Software Foundation, write to the Free Software Foundation; we sometimes +make exceptions for this. Our decision will be guided by the two goals +of preserving the free status of all derivatives of our free software and +of promoting the sharing and reuse of software generally. + +@iftex +@heading NO WARRANTY +@end iftex +@ifinfo +@center NO WARRANTY +@end ifinfo + +@item +BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY +FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN +OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES +PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED +OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS +TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE +PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, +REPAIR OR CORRECTION. + +@item +IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING +WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR +REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, +INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING +OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED +TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY +YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER +PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE +POSSIBILITY OF SUCH DAMAGES. 
+@end enumerate + +@iftex +@heading END OF TERMS AND CONDITIONS +@end iftex +@ifinfo +@center END OF TERMS AND CONDITIONS +@end ifinfo + +@page +@unnumberedsec How to Apply These Terms to Your New Programs + + If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it +free software which everyone can redistribute and change under these terms. + + To do so, attach the following notices to the program. It is safest +to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least +the ``copyright'' line and a pointer to where the full notice is found. + +@smallexample +@var{one line to give the program's name and an idea of what it does.} +Copyright (C) 19@var{yy} @var{name of author} + +This program is free software; you can redistribute it and/or +modify it under the terms of the GNU General Public License +as published by the Free Software Foundation; either version 2 +of the License, or (at your option) any later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License along +with this program; if not, write to the Free Software Foundation, Inc., +59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. +@end smallexample + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this +when it starts in an interactive mode: + +@smallexample +Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author} +Gnomovision comes with ABSOLUTELY NO WARRANTY; for details +type `show w'. 
This is free software, and you are welcome +to redistribute it under certain conditions; type `show c' +for details. +@end smallexample + +The hypothetical commands @samp{show w} and @samp{show c} should show +the appropriate parts of the General Public License. Of course, the +commands you use may be called something other than @samp{show w} and +@samp{show c}; they could even be mouse-clicks or menu items---whatever +suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a ``copyright disclaimer'' for the program, if +necessary. Here is a sample; alter the names: + +@smallexample +@group +Yoyodyne, Inc., hereby disclaims all copyright +interest in the program `Gnomovision' +(which makes passes at compilers) written +by James Hacker. + +@var{signature of Ty Coon}, 1 April 1989 +Ty Coon, President of Vice +@end group +@end smallexample + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General +Public License instead of this License. diff --git a/doc/mach.texi b/doc/mach.texi new file mode 100644 index 0000000..5638c02 --- /dev/null +++ b/doc/mach.texi @@ -0,0 +1,7100 @@ +\input texinfo @c -*- Texinfo -*- +@setfilename mach.info +@settitle The GNU Mach Reference Manual +@setchapternewpage odd + +@comment Tell install-info what to do. +@dircategory Kernel +@direntry +* GNU Mach: (mach). Using and programming the GNU Mach microkernel. +@end direntry + +@c Should have a glossary. +@c Unify some of our indices. +@syncodeindex pg cp +@syncodeindex vr fn +@syncodeindex tp fn + +@c Get the Mach version we are documenting. +@include version.texi +@set EDITION 0.3 +@set UPDATED 2001-09-01 +@c @set ISBN X-XXXXXX-XX-X + +@ifinfo +This file documents the GNU Mach microkernel. 
+ +This is Edition @value{EDITION}, last updated @value{UPDATED}, of +@cite{The GNU Mach Reference Manual}, for Version @value{VERSION}. + +Copyright @copyright{} 2001 Free Software Foundation, Inc. + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "Free Software Needs Free Documentation" and +"GNU Lesser General Public License", the Front-Cover texts being (a) +(see below), and with the Back-Cover Texts being (b) (see below). A +copy of the license is included in the section entitled "GNU Free +Documentation License". + +(a) The FSF's Front-Cover Text is: + + A GNU Manual + +(b) The FSF's Back-Cover Text is: + + You have freedom to copy and modify this GNU Manual, like GNU + software. Copies published by the Free Software Foundation raise + funds for GNU development. + +This work is based on manual pages under the following copyright and license: + +@noindent +Mach Operating System@* +Copyright @copyright{} 1991,1990 Carnegie Mellon University@* +All Rights Reserved. + +Permission to use, copy, modify and distribute this software and its +documentation is hereby granted, provided that both the copyright +notice and this permission notice appear in all copies of the +software, derivative works or modified versions, and any portions +thereof, and that both notices appear in supporting documentation. + +CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" +CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND FOR +ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. +@end ifinfo + +@iftex +@shorttitlepage The GNU Mach Reference Manual +@end iftex +@titlepage +@center @titlefont{The GNU Mach} +@sp 1 +@center @titlefont{Reference Manual} +@sp 2 +@center Marcus Brinkmann +@center with +@center Gordon Matzigkeit, Gibran Hasnaoui, +@center Robert V. 
Baron, Richard P. Draves, Mary R. Thompson, Joseph S. Barrera +@sp 3 +@center Edition @value{EDITION} +@sp 1 +@center last updated @value{UPDATED} +@sp 1 +@center for version @value{VERSION} +@page +@vskip 0pt plus 1filll +Copyright @copyright{} 2001 Free Software Foundation, Inc. +@c @sp 2 +@c Published by the Free Software Foundation @* +@c 59 Temple Place -- Suite 330, @* +@c Boston, MA 02111-1307 USA @* +@c ISBN @value{ISBN} @* + +Permission is granted to copy, distribute and/or modify this document +under the terms of the GNU Free Documentation License, Version 1.1 or +any later version published by the Free Software Foundation; with the +Invariant Sections being "Free Software Needs Free Documentation" and +"GNU Lesser General Public License", the Front-Cover texts being (a) +(see below), and with the Back-Cover Texts being (b) (see below). A +copy of the license is included in the section entitled "GNU Free +Documentation License". + +(a) The FSF's Front-Cover Text is: + + A GNU Manual + +(b) The FSF's Back-Cover Text is: + + You have freedom to copy and modify this GNU Manual, like GNU + software. Copies published by the Free Software Foundation raise + funds for GNU development. + +This work is based on manual pages under the following copyright and license: + +@noindent +Mach Operating System@* +Copyright @copyright{} 1991,1990 Carnegie Mellon University@* +All Rights Reserved. + +Permission to use, copy, modify and distribute this software and its +documentation is hereby granted, provided that both the copyright +notice and this permission notice appear in all copies of the +software, derivative works or modified versions, and any portions +thereof, and that both notices appear in supporting documentation. + +CARNEGIE MELLON ALLOWS FREE USE OF THIS SOFTWARE IN ITS "AS IS" +CONDITION. CARNEGIE MELLON DISCLAIMS ANY LIABILITY OF ANY KIND FOR +ANY DAMAGES WHATSOEVER RESULTING FROM THE USE OF THIS SOFTWARE. 
+@end titlepage +@c @titlepage +@c @finalout +@c @title The GNU Mach Reference Manual +@c @author Marcus Brinkmann +@c @author Gordon Matzigkeit +@c @author Gibran Hasnaoui + +@c @author Robert V. Baron @c (rvb) +@c @author Richard P. Draves @c (rpd) +@c @author Mary R. Thompson @c (mrt) +@c @author Joseph S. Barrera @c (jsb) +@c @c The following occur rarely in the rcs commit logs of the man pages: +@c @c Dan Stodolsky, (danner) +@c @c David B. Golub, (dbg) +@c @c Terri Watson, (elf) +@c @c Lori Iannamico, (lli) [distribution coordinator] +@c @c Further authors of kernel_interfaces.ps: +@c @c David Black [OSF] +@c @c William Bolosky +@c @c Jonathan Chew +@c @c Alessandro Forin +@c @c Richard F. Rashid +@c @c Avadis Tevanian Jr. +@c @c Michael W. Young +@c @c See also +@c @c http://www.cs.cmu.edu/afs/cs/project/mach/public/www/people-former.html +@page + +@ifnottex +@node Top +@top Main Menu +This is Edition @value{EDITION}, last updated @value{UPDATED}, of +@cite{The GNU Mach Reference Manual}, for Version @value{VERSION} of the +GNU Mach microkernel. +@end ifnottex + +@menu +* Introduction:: How to use this manual. +* Installing:: Setting up GNU Mach on your computer. +* Bootstrap:: Running GNU Mach on your machine. +* Inter Process Communication:: Communication between processes. +* Virtual Memory Interface:: Allocating and deallocating virtual memory. +* External Memory Management:: Handling memory pages in user space. +* Threads and Tasks:: Handling of threads and tasks. +* Host Interface:: Interface to a Mach host. +* Processors and Processor Sets:: Handling processors and sets of processors. +* Device Interface:: Accessing kernel devices. +* Kernel Debugger:: How to use the built-in kernel debugger. + +Appendices + +* Copying:: The GNU General Public License says how you + can copy and share the GNU Mach microkernel. +* Documentation License:: This manual is under the GNU Free + Documentation License. 
+ +Indices + +* Concept Index:: Index of concepts and programs. +* Function and Data Index:: Index of functions, variables and data types. + + +@detailmenu + --- The Detailed Node Listing --- + +Introduction + +* Audience:: The people for whom this manual is written. +* Features:: Reasons to install and use GNU Mach. +* Overview:: Basic architecture of the Mach microkernel. +* History:: The story about Mach. + +Installing + +* Binary Distributions:: Obtaining ready-to-run GNU distributions. +* Compilation:: Building GNU Mach from its source code. +* Configuration:: Configuration options at compilation time. +* Cross-Compilation:: Building GNU Mach from another system. + +Bootstrap + +* Bootloader:: Starting the microkernel, or other OSes. +* Modules:: Starting the first task of the OS. + +Inter Process Communication + +* Major Concepts:: The concepts behind the Mach IPC system. +* Messaging Interface:: Composing, sending and receiving messages. +* Port Manipulation Interface:: Manipulating ports, port rights, port sets. + +Messaging Interface + +* Mach Message Call:: Sending and receiving messages. +* Message Format:: The format of Mach messages. +* Exchanging Port Rights:: Sending and receiving port rights. +* Memory:: Passing memory regions in messages. +* Message Send:: Sending messages. +* Message Receive:: Receiving messages. +* Atomicity:: Atomicity of port rights. + +Port Manipulation Interface + +* Port Creation:: How to create new ports and port sets. +* Port Destruction:: How to destroy ports and port sets. +* Port Names:: How to query and manipulate port names. +* Port Rights:: How to work with port rights. +* Ports and other Tasks:: How to move rights between tasks. +* Receive Rights:: How to work with receive rights. +* Port Sets:: How to work with port sets. +* Request Notifications:: How to request notifications for events. +@c * Inherited Ports:: How to work with the inherited system ports. 
+ +Virtual Memory Interface + +* Memory Allocation:: Allocation of new virtual memory. +* Memory Deallocation:: Freeing unused virtual memory. +* Data Transfer:: Reading, writing and copying memory. +* Memory Attributes:: Tweaking memory regions. +* Mapping Memory Objects:: How to map memory objects. +* Memory Statistics:: How to get statistics about memory usage. + +External Memory Management + +* Memory Object Server:: The basics of external memory management. +* Memory Object Creation:: How new memory objects are created. +* Memory Object Termination:: How memory objects are terminated. +* Memory Objects and Data:: Data transfer to and from memory objects. +* Memory Object Locking:: How memory objects are locked. +* Memory Object Attributes:: Manipulating attributes of memory objects. +* Default Memory Manager:: Setting and using the default memory manager. + +Threads and Tasks + +* Thread Interface:: Manipulating threads. +* Task Interface:: Manipulating tasks. +* Profiling:: Profiling threads and tasks. + +Thread Interface + +* Thread Creation:: Creating threads. +* Thread Termination:: Terminating threads. +* Thread Information:: How to get information on threads. +* Thread Settings:: How to set thread-related information. +* Thread Execution:: How to control the thread's machine state. +* Scheduling:: Operations on thread scheduling. +* Thread Special Ports:: How to handle the thread's special ports. +* Exceptions:: Managing exceptions. + +Scheduling + +* Thread Priority:: Changing the priority of a thread. +* Hand-Off Scheduling:: Switch to a new thread. +* Scheduling Policy:: Setting the scheduling policy. + +Task Interface + +* Task Creation:: Creating tasks. +* Task Termination:: Terminating tasks. +* Task Information:: Information on tasks. +* Task Execution:: Thread scheduling in a task. +* Task Special Ports:: How to get and set the task's special ports. +* Syscall Emulation:: How to emulate system calls. 
+ +Host Interface + +* Host Ports:: Ports representing a host. +* Host Information:: Query information about a host. +* Host Time:: Functions to query and manipulate the host time. +* Host Reboot:: Rebooting the system. + +Processors and Processor Sets + +* Processor Set Interface:: How to work with processor sets. +* Processor Interface:: How to work with individual processors. + +Processor Set Interface + +* Processor Set Ports:: Ports representing a processor set. +* Processor Set Access:: How the processor sets are accessed. +* Processor Set Creation:: How new processor sets are created. +* Processor Set Destruction:: How processor sets are destroyed. +* Tasks and Threads on Sets:: Assigning tasks or threads to processor sets. +* Processor Set Priority:: Specifying the priority of a processor set. +* Processor Set Policy:: Changing the processor set policies. +* Processor Set Info:: Obtaining information about a processor set. + +Processor Interface + +* Hosted Processors:: Getting a list of all processors on a host. +* Processor Control:: Starting, stopping, controlling processors. +* Processors and Sets:: Combining processors into processor sets. +* Processor Info:: Obtaining information on processors. + +Device Interface + +* Device Open:: Opening hardware devices. +* Device Close:: Closing hardware devices. +* Device Read:: Reading data from the device. +* Device Write:: Writing data to the device. +* Device Map:: Mapping devices into virtual memory. +* Device Status:: Querying and manipulating a device. +* Device Filter:: Filtering packets arriving on a device. + +Kernel Debugger + +* Operation:: Basic architecture of the kernel debugger. +* Commands:: Available commands in the kernel debugger. +* Variables:: Access of variables from the kernel debugger. +* Expressions:: Usage of expressions in the kernel debugger. + +Documentation License + +* Free Documentation License:: The GNU Free Documentation License. 
+* CMU License:: The CMU license applies to the original Mach + kernel and its documentation. + +@end detailmenu +@end menu + + +@node Introduction +@chapter Introduction + +GNU Mach is the microkernel of the GNU Project. It is the base of the +operating system, and provides its functionality to the Hurd servers, +the GNU C Library and all user applications. The microkernel itself +does not provide much functionality of the system, just enough to make +it possible for the Hurd servers and the C library to implement the missing +features you would expect from a POSIX compatible operating system. + +@menu +* Audience:: The people for whom this manual is written. +* Features:: Reasons to install and use GNU Mach. +* Overview:: Basic architecture of the Mach microkernel. +* History:: The story about Mach. +@end menu + + +@node Audience +@section Audience + +This manual is designed to be useful to everybody who is interested in +using, administering, or programming the Mach microkernel. + +If you are an end-user and you are looking for help on running the Mach +kernel, the first few chapters of this manual describe the essential +parts of installing and using the kernel in the GNU operating system. + +The rest of this manual is a technical discussion of the Mach +programming interface and its implementation, and would not be helpful +until you want to learn how to extend the system or modify the kernel. + +This manual is organized according to the subsystems of Mach, and each +chapter begins with descriptions of conceptual ideas that are related to +that subsystem. If you are a programmer and want to learn more about, +say, the Mach IPC subsystem, you can skip to the IPC chapter +(@pxref{Inter Process Communication}), and read about the related +concepts and interface definitions. 
+ + +@node Features +@section Features + +GNU Mach is not the most advanced microkernel known to the planet, +nor is it the fastest or smallest, but it has a rich set of interfaces and +some features which make it useful as the base of the Hurd system. + +@table @asis +@item it's free software +Anybody can use, modify, and redistribute it under the terms of the GNU +General Public License (@pxref{Copying}). GNU Mach is part of the GNU +system, which is a complete operating system licensed under the GPL. + +@item it's built to survive +As a microkernel, GNU Mach doesn't implement a lot of the features +commonly found in an operating system, but only the bare minimum +that is required to implement a full operating system on top of it. +This means that a lot of the operating system code is maintained outside +of GNU Mach, and while this code may go through a complete redesign, the +code of the microkernel can remain comparatively stable. + +@item it's scalable +Mach is particularly well suited for SMP and network cluster techniques. +Thread support is provided at the kernel level, and the kernel itself +takes advantage of that. Network transparency at the IPC level makes +resources of the system available across machine boundaries (with NORMA +IPC, currently not available in GNU Mach). + +@item it exists +The Mach microkernel is real software that works Right Now. +It is not a research project or a proposal. You don't have to wait at all +before you can start using and developing it. Mach has been used in +many operating systems in the past, usually as the base for a single +UNIX server. In the GNU system, Mach is the base of a functional +multi-server operating system, the Hurd. +@end table + + +@node Overview +@section Overview + +@c This paragraph by Gordon Matzigkeit from the Hurd manual. +An operating system kernel provides a framework for programs to share a +computer's hardware resources securely and efficiently. 
This requires +that the programs are separated and protected from each other. To make +running multiple programs in parallel useful, there also needs to be a +facility for programs to exchange information by communication. + +The Mach microkernel provides abstractions of the underlying hardware +resources like devices and memory. It organizes the running programs +into tasks and threads (points of execution in the tasks). In addition, +Mach provides a rich interface for inter-process communication. + +What Mach does not provide is a POSIX compatible programming interface. +In fact, it has no understanding of file systems, POSIX process semantics, +network protocols and many more. All this is implemented in tasks +running on top of the microkernel. In the GNU operating system, the Hurd +servers and the C library share the responsibility to implement the POSIX +interface, and the additional interfaces which are specific to the GNU +system. + + +@node History +@section History + +XXX A few lines about the history of Mach here. + + +@node Installing +@chapter Installing + +Before you can use the Mach microkernel in your system you'll need to install +it and all components you want to use with it, e.g. the rest of the operating +system. You also need a bootloader to load the kernel from the storage +medium and run it when the computer is started. + +GNU Mach is only available for Intel i386-compatible architectures +(such as the Pentium) currently. If you have a different architecture +and want to run the GNU Mach microkernel, you will need to port the +kernel and all other software of the system to your machine's architecture. +Porting is an involved process which requires considerable programming skills, +and it is not recommended for the faint-of-heart. +If you have the talent and desire to do a port, contact +@email{bug-hurd@@gnu.org} in order to coordinate the effort. + +@menu +* Binary Distributions:: Obtaining ready-to-run GNU distributions. 
+* Compilation:: Building GNU Mach from its source code. +* Configuration:: Configuration options at compile time. +* Cross-Compilation:: Building GNU Mach from another system. +@end menu + + +@node Binary Distributions +@section Binary Distributions + +By far the easiest and best way to install GNU Mach and the operating +system is to obtain a GNU binary distribution. The GNU operating +system consists of GNU Mach, the Hurd, the C library and many applications. +Without the GNU operating system, you will only have a microkernel, which +is not very useful by itself, without the other programs. + +Building the whole operating system takes a huge effort, and you are well +advised to not do it yourself, but to get a binary distribution of the +GNU operating system. The distribution also includes a binary of the +GNU Mach microkernel. + +Information on how to obtain the GNU system can be found in the Hurd +info manual. + + +@node Compilation +@section Compilation + +If you already have a running GNU system, and only want to recompile +the kernel, for example to select a different set of included hardware +drivers, you can easily do this. You need the GNU C compiler and +MiG, the Mach interface generator, which both come in their own +packages. + +Building and installing the kernel is as easy as with any other GNU +software package. The configure script is used to configure the source +and set the compile time options. The compilation is done by running: + +@example +make +@end example + +To install the kernel and its header files, just enter the command: + +@example +make install +@end example + +This will install the kernel into $(prefix)/boot/gnumach and the header +files into $(prefix)/include. You can also only install the kernel or +the header files. For this, the two targets install-kernel and +install-headers are provided. 
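The build steps described above can be sketched as a small shell script. This is only an illustration under stated assumptions, not the project's official procedure: the `--enable-kdb` and `--enable-ide` options and the `install-kernel` target are the ones named in this manual, while the `gnumach_build_steps` and `gnumach_build` functions and the `DRY_RUN` variable are hypothetical helpers introduced here. By default the script only prints the commands, so it can be tried outside a GNU Mach source tree.

```shell
#!/bin/sh
# Sketch only: prints (and optionally runs) the configure/make sequence
# described in the Compilation section. The function names and the DRY_RUN
# variable are illustrative assumptions, not part of GNU Mach.

# Print one build command per line. Arguments are passed through to
# configure unchanged, e.g. --enable-kdb --enable-ide.
gnumach_build_steps() {
    echo "./configure $*"
    echo "make"
    # install-kernel installs only the kernel; plain "make install"
    # would also install the header files.
    echo "make install-kernel"
}

# Echo each step in order. With DRY_RUN=1 (the default here) the steps
# are only printed; otherwise each one is run and the first failure stops
# the sequence.
gnumach_build() {
    gnumach_build_steps "$@" | while IFS= read -r step; do
        echo "+ $step"
        if [ "${DRY_RUN:-1}" != 1 ]; then
            sh -c "$step" </dev/null || exit 1
        fi
    done
}

gnumach_build --enable-kdb --enable-ide
```

Running the script from the top of a configured source tree with `DRY_RUN=0` would execute the steps for real; replacing `install-kernel` with `install-headers` in the helper would install only the header files instead.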
+ + +@node Configuration +@section Configuration + +The following options can be passed to the configure script as command +line arguments and control what components are built into the kernel, or +where it is installed. + +The default for an option is to be disabled, unless otherwise noted. + +@table @code +@item --prefix @var{prefix} +Sets the prefix to PREFIX. The default prefix is the empty string, which +is the correct value for the GNU system. The prefix is prepended to all +path names at installation time. + +@item --enable-kdb +Enables the in-kernel debugger. This is only useful if you actually +anticipate debugging the kernel. It is not enabled by default because +it adds considerably to the unpageable memory footprint of the kernel. +@xref{Kernel Debugger}. + +@item --enable-kmsg +Enables the kernel message device kmsg. + +@item --enable-lpr +Enables the parallel port devices lpr%d. + +@item --enable-floppy +Enables the PC floppy disk controller devices fd%d. + +@item --enable-ide +Enables the IDE controller devices hd%d, hd%ds%d. +@end table + +The following options enable drivers for various SCSI controllers. +SCSI devices are named sd%d (disks) or cd%d (CD ROMs). + +@table @code +@item --enable-advansys +Enables the AdvanSys SCSI controller devices sd%d, cd%d. + +@item --enable-buslogic +Enables the BusLogic SCSI controller devices sd%d, cd%d. + +@item --disable-flashpoint +Only meaningful in conjunction with @option{--enable-buslogic}. Omits the +FlashPoint support. This option is enabled by default if +@option{--enable-buslogic} is specified. + +@item --enable-u1434f +Enables the UltraStor 14F/34F SCSI controller devices sd%d, cd%d. + +@item --enable-ultrastor +Enables the UltraStor SCSI controller devices sd%d, cd%d. + +@item --enable-aha152x +@itemx --enable-aha2825 +Enables the Adaptec AHA-152x/2825 SCSI controller devices sd%d, cd%d. + +@item --enable-aha1542 +Enables the Adaptec AHA-1542 SCSI controller devices sd%d, cd%d. 
+ +@item --enable-aha1740 +Enables the Adaptec AHA-1740 SCSI controller devices sd%d, cd%d. + +@item --enable-aic7xxx +Enables the Adaptec AIC7xxx SCSI controller devices sd%d, cd%d. + +@item --enable-futuredomain +Enables the Future Domain 16xx SCSI controller devices sd%d, cd%d. + +@item --enable-in2000 +Enables the Always IN 2000 SCSI controller devices sd%d, cd%d. + +@item --enable-ncr5380 +@itemx --enable-ncr53c400 +Enables the generic NCR5380/53c400 SCSI controller devices sd%d, cd%d. + +@item --enable-ncr53c406a +Enables the NCR53c406a SCSI controller devices sd%d, cd%d. + +@item --enable-pas16 +Enables the PAS16 SCSI controller devices sd%d, cd%d. + +@item --enable-seagate +Enables the Seagate ST02 and Future Domain TMC-8xx SCSI controller +devices sd%d, cd%d. + +@item --enable-t128 +@itemx --enable-t128f +@itemx --enable-t228 +Enables the Trantor T128/T128F/T228 SCSI controller devices sd%d, cd%d. + +@item --enable-ncr53c7xx +Enables the NCR53C7,8xx SCSI controller devices sd%d, cd%d. + +@item --enable-eatadma +Enables the EATA-DMA (DPT, NEC, AT&T, SNI, AST, Olivetti, Alphatronix) +SCSI controller devices sd%d, cd%d. + +@item --enable-eatapio +Enables the EATA-PIO (old DPT PM2001, PM2012A) SCSI controller devices +sd%d, cd%d. + +@item --enable-wd7000 +Enables the WD 7000 SCSI controller devices sd%d, cd%d. + +@item --enable-eata +Enables the EATA ISA/EISA/PCI (DPT and generic EATA/DMA-compliant boards) +SCSI controller devices sd%d, cd%d. + +@item --enable-am53c974 +@itemx --enable-am79c974 +Enables the AM53/79C974 SCSI controller devices sd%d, cd%d. + +@item --enable-dtc3280 +@itemx --enable-dtc3180 +Enables the DTC3180/3280 SCSI controller devices sd%d, cd%d. + +@item --enable-ncr53c8xx +@itemx --enable-dc390w +@itemx --enable-dc390u +@itemx --enable-dc390f +Enables the NCR53C8XX SCSI controller devices sd%d, cd%d. + +@item --enable-dc390t +@itemx --enable-dc390 +Enables the Tekram DC-390(T) SCSI controller devices sd%d, cd%d. 
+ +@item --enable-ppa +Enables the IOMEGA Parallel Port ZIP drive device sd%d. + +@item --enable-qlogicfas +Enables the Qlogic FAS SCSI controller devices sd%d, cd%d. + +@item --enable-qlogicisp +Enables the Qlogic ISP SCSI controller devices sd%d, cd%d. + +@item --enable-gdth +Enables the GDT SCSI Disk Array controller devices sd%d, cd%d. +@end table + +The following options enable drivers for various Ethernet cards. +NIC device names are usually eth%d, except for the pocket adaptors. + +GNU Mach autodetects only one Ethernet card. To enable any further +cards, the source code has to be edited. +@c XXX Reference to the source code. + +@table @code +@item --enable-ne2000 +@itemx --enable-ne1000 +Enables the NE2000/NE1000 ISA network card devices eth%d. + +@item --enable-3c503 +@itemx --enable-el2 +Enables the 3Com 503 (Etherlink II) network card devices eth%d. + +@item --enable-3c509 +@itemx --enable-3c579 +@itemx --enable-el3 +Enables the 3Com 509/579 (Etherlink III) network card devices eth%d. + +@item --enable-wd80x3 +Enables the WD80X3 network card devices eth%d. + +@item --enable-3c501 +@itemx --enable-el1 +Enables the 3COM 501 network card devices eth%d. + +@item --enable-ul +Enables the SMC Ultra network card devices eth%d. + +@item --enable-ul32 +Enables the SMC Ultra 32 network card devices eth%d. + +@item --enable-hplanplus +Enables the HP PCLAN+ (27247B and 27252A) network card devices eth%d. + +@item --enable-hplan +Enables the HP PCLAN (27245 and other 27xxx series) network card devices eth%d. + +@item --enable-3c59x +@itemx --enable-3c90x +@itemx --enable-vortex +Enables the 3Com 590/900 series (592/595/597/900/905) "Vortex/Boomerang" +network card devices eth%d. + +@item --enable-seeq8005 +Enables the Seeq8005 network card devices eth%d. + +@item --enable-hp100 +@itemx --enable-hpj2577 +@itemx --enable-hpj2573 +@itemx --enable-hp27248b +@itemx --enable-hp2585 +Enables the HP 10/100VG PCLAN (ISA, EISA, PCI) network card devices +eth%d. 
+ +@item --enable-ac3200 +Enables the Ansel Communications EISA 3200 network card devices eth%d. + +@item --enable-e2100 +Enables the Cabletron E21xx network card devices eth%d. + +@item --enable-at1700 +Enables the AT1700 (Fujitsu 86965) network card devices eth%d. + +@item --enable-eth16i +@itemx --enable-eth32 +Enables the ICL EtherTeam 16i/32 network card devices eth%d. + +@item --enable-znet +@itemx --enable-znote +Enables the Zenith Z-Note network card devices eth%d. + +@item --enable-eexpress +Enables the EtherExpress 16 network card devices eth%d. + +@item --enable-eexpresspro +Enables the EtherExpressPro network card devices eth%d. + +@item --enable-eexpresspro100 +Enables the Intel EtherExpressPro PCI 10+/100B/100+ network card devices +eth%d. + +@item --enable-depca +@itemx --enable-de100 +@itemx --enable-de101 +@itemx --enable-de200 +@itemx --enable-de201 +@itemx --enable-de202 +@itemx --enable-de210 +@itemx --enable-de422 +Enables the DEPCA, DE10x, DE200, DE201, DE202, DE210, DE422 network card +devices eth%d. + +@item --enable-ewrk3 +@itemx --enable-de203 +@itemx --enable-de204 +@itemx --enable-de205 +Enables the EtherWORKS 3 (DE203, DE204, DE205) network card devices +eth%d. + +@item --enable-de4x5 +@itemx --enable-de425 +@itemx --enable-de434 +@itemx --enable-435 +@itemx --enable-de450 +@itemx --enable-500 +Enables the DE425, DE434, DE435, DE450, DE500 network card devices +eth%d. + +@item --enable-apricot +Enables the Apricot XEN-II on board ethernet network card devices eth%d. + +@item --enable-wavelan +Enables the AT&T WaveLAN & DEC RoamAbout DS network card devices eth%d. + +@item --enable-3c507 +@itemx --enable-el16 +Enables the 3Com 507 network card devices eth%d. + +@item --enable-3c505 +@itemx --enable-elplus +Enables the 3Com 505 network card devices eth%d. + +@item --enable-de600 +Enables the D-Link DE-600 network card devices eth%d. + +@item --enable-de620 +Enables the D-Link DE-620 network card devices eth%d. 
+ +@item --enable-skg16 +Enables the Schneider & Koch G16 network card devices eth%d. + +@item --enable-ni52 +Enables the NI5210 network card devices eth%d. + +@item --enable-ni65 +Enables the NI6510 network card devices eth%d. + +@item --enable-atp +Enables the AT-LAN-TEC/RealTek pocket adaptor network card devices atp%d. + +@item --enable-lance +@itemx --enable-at1500 +@itemx --enable-ne2100 +Enables the AMD LANCE and PCnet (AT1500 and NE2100) network card devices eth%d. + +@item --enable-elcp +@itemx --enable-tulip +Enables the DECchip Tulip (dc21x4x) PCI network card devices eth%d. + +@item --enable-fmv18x +Enables the FMV-181/182/183/184 network card devices eth%d. + +@item --enable-3c515 +Enables the 3Com 515 ISA Fast EtherLink network card devices eth%d. + +@item --enable-pcnet32 +Enables the AMD PCI PCnet32 (PCI bus NE2100 cards) network card devices +eth%d. + +@item --enable-ne2kpci +Enables the PCI NE2000 network card devices eth%d. + +@item --enable-yellowfin +Enables the Packet Engines Yellowfin Gigabit-NIC network card devices +eth%d. + +@item --enable-rtl8139 +@itemx --enable-rtl8129 +Enables the RealTek 8129/8139 (not 8019/8029!) network card devices +eth%d. + +@item --enable-epic +@itemx --enable-epic100 +Enables the SMC 83c170/175 EPIC/100 (EtherPower II) network card devices eth%d. + +@item --enable-tlan +Enables the TI ThunderLAN network card devices eth%d. + +@item --enable-viarhine +Enables the VIA Rhine network card devices eth%d. +@end table + + +@node Cross-Compilation +@section Cross-Compilation + +Another way to install the kernel is to use an existing operating system +in order to compile the kernel binary. +This is called @dfn{cross-compiling}, because it is done between two +different platforms. If the pre-built kernels are not working for +you, and you can't ask someone to compile a custom kernel for your +machine, this is your last chance to get a kernel that boots on your +hardware. 
+ +Luckily, the kernel has few dependencies. You don't even +need a cross compiler if your build machine has a compiler and is +the same architecture as the system you want to run GNU Mach on. + +You need a cross-mig, though. + +XXX More info needed. + + +@node Bootstrap +@chapter Bootstrap + +Bootstrapping@footnote{The term @dfn{bootstrapping} refers to a Dutch +legend about a boy who was able to fly by pulling himself up by his +bootstraps. In computers, this term refers to any process where a +simple system activates a more complicated system.} is the procedure by +which your machine loads the microkernel and transfers control to the +operating system. + + +@menu +* Bootloader:: Starting the microkernel, or other OSes. +* Modules:: Starting the first task of the OS. +@end menu + +@node Bootloader +@section Bootloader + +The @dfn{bootloader} is the first software that runs on your machine. +Many hardware architectures have a very simple startup routine which +reads a very simple bootloader from the beginning of the internal hard +disk, then transfers control to it. Other architectures have startup +routines which are able to understand more of the contents of the hard +disk, and directly start a more advanced bootloader. + +@cindex GRUB +@cindex GRand Unified Bootloader +Currently, @dfn{GRUB}@footnote{The GRand Unified Bootloader, available +from @uref{http://www.uruk.org/grub/}.} is the preferred GNU bootloader. +GRUB provides advanced functionality, and is capable of loading several +different kernels (such as Mach, Linux, DOS, and the *BSD family). +@xref{Top, , Introduction, grub, GRUB Manual}. + +GNU Mach conforms to the Multiboot specification which defines an +interface between the bootloader and the components that run very early +at startup. GNU Mach can be started by any bootloader which supports +the multiboot standard. 
After the bootloader has loaded the kernel image at +a designated address in system memory, it jumps into the startup +code of the kernel. This code initializes the kernel and detects the +available hardware devices. Afterwards, the first system task is +started. @xref{Top, , Overview, multiboot, Multiboot Specification}. + + +@node Modules +@section Modules +@pindex serverboot + +Because the microkernel does not provide filesystem support and other +features necessary to load the first system task from a storage medium, +the first task is loaded by the bootloader as a module to a specified +address. In the GNU system, this first program is the @code{serverboot} +executable. GNU Mach inserts the host control port and the device +master port into this task and appends the port numbers to the command +line before executing it. + +The @code{serverboot} program is responsible for loading and executing +the rest of the Hurd servers. Rather than containing specific +instructions for starting the Hurd, it follows general steps given in a +user-supplied boot script. + +XXX More about boot scripts. + + +@node Inter Process Communication +@chapter Inter Process Communication + +This chapter describes the details of the Mach IPC system. First the +actual calls concerned with sending and receiving messages are +discussed, then the port system is described in detail. + +@menu +* Major Concepts:: The concepts behind the Mach IPC system. +* Messaging Interface:: Composing, sending and receiving messages. +* Port Manipulation Interface:: Manipulating ports, port rights, port sets. +@end menu + + +@node Major Concepts +@section Major Concepts +@cindex interprocess communication (IPC) +@cindex IPC (interprocess communication) +@cindex communication between tasks +@cindex remote procedure calls (RPC) +@cindex RPC (remote procedure calls) +@cindex messages + +The Mach kernel provides message-oriented, capability-based interprocess +communication. 
The interprocess communication (IPC) primitives +efficiently support many different styles of interaction, including +remote procedure calls (RPC), object-oriented distributed programming, +streaming of data, and sending very large amounts of data. + +The IPC primitives operate on three abstractions: messages, ports, and +port sets. User tasks access all other kernel services and abstractions +via the IPC primitives. + +The message primitives let tasks send and receive messages. Tasks send +messages to ports. Messages sent to a port are delivered reliably +(messages may not be lost) and are received in the order in which they +were sent. Messages contain a fixed-size header and a variable amount +of typed data following the header. The header describes the +destination and size of the message. + +The IPC implementation makes use of the VM system to efficiently +transfer large amounts of data. The message body can contain the +address of a region in the sender's address space which should be +transferred as part of the message. When a task receives a message +containing an out-of-line region of data, the data appears in an unused +portion of the receiver's address space. This transmission of +out-of-line data is optimized so that sender and receiver share the +physical pages of data copy-on-write, and no actual data copy occurs +unless the pages are written. Regions of memory up to the size of a +full address space may be sent in this manner. + +Ports hold a queue of messages. Tasks operate on a port to send and +receive messages by exercising capabilities for the port. Multiple +tasks can hold send capabilities, or rights, for a port. Tasks can also +hold send-once rights, which grant the ability to send a single message. +Only one task can hold the receive capability, or receive right, for a +port. Port rights can be transferred between tasks via messages. The +sender of a message can specify in the message body that the message +contains a port right. 
If a message contains a receive right for a +port, then the receive right is removed from the sender of the message +and the right is transferred to the receiver of the message. While the +receive right is in transit, tasks holding send rights can still send +messages to the port, and they are queued until a task acquires the +receive right and uses it to receive the messages. + +Tasks can receive messages from ports and port sets. The port set +abstraction allows a single thread to wait for a message from any of +several ports. Tasks manipulate port sets with a capability, or +port-set right, which is taken from the same space as the port +capabilities. The port-set right may not be transferred in a message. +A port set holds receive rights, and a receive operation on a port set +blocks waiting for a message sent to any of the constituent ports. A +port may not belong to more than one port set, and if a port is a member +of a port set, the holder of the receive right can't receive directly +from the port. + +Port rights are a secure, location-independent way of naming ports. The +port queue is a protected data structure, only accessible via the +kernel's exported message primitives. Rights are also protected by the +kernel; there is no way for a malicious user task to guess a port name +and send a message to a port to which it shouldn't have access. Port +rights do not carry any location information. When a receive right for +a port moves from task to task, and even between tasks on different +machines, the send rights for the port remain unchanged and continue to +function. + +@node Messaging Interface +@section Messaging Interface + +This section describes how messages are composed, sent and received +within the Mach IPC system. + +@menu +* Mach Message Call:: Sending and receiving messages. +* Message Format:: The format of Mach messages. +* Exchanging Port Rights:: Sending and receiving port rights. +* Memory:: Passing memory regions in messages. 
+* Message Send:: Sending messages. +* Message Receive:: Receiving messages. +* Atomicity:: Atomicity of port rights. +@end menu + + +@node Mach Message Call +@subsection Mach Message Call + +To use the @code{mach_msg} call, you can include the header files +@file{mach/port.h} and @file{mach/message.h}. + +@deftypefun mach_msg_return_t mach_msg (@w{mach_msg_header_t *@var{msg}}, @w{mach_msg_option_t @var{option}}, @w{mach_msg_size_t @var{send_size}}, @w{mach_msg_size_t @var{rcv_size}}, @w{mach_port_t @var{rcv_name}}, @w{mach_msg_timeout_t @var{timeout}}, @w{mach_port_t @var{notify}}) +The @code{mach_msg} function is used to send and receive messages. Mach +messages contain typed data, which can include port rights and +references to large regions of memory. + +@var{msg} is the address of a buffer in the caller's address space. +Message buffers should be aligned on long-word boundaries. The message +options @var{option} are bit values, combined with bitwise-or. One or +both of @code{MACH_SEND_MSG} and @code{MACH_RCV_MSG} should be used. +Other options act as modifiers. When sending a message, @var{send_size} +specifies the size of the message buffer. Otherwise zero should be +supplied. When receiving a message, @var{rcv_size} specifies the size +of the message buffer. Otherwise zero should be supplied. When +receiving a message, @var{rcv_name} specifies the port or port set. +Otherwise @code{MACH_PORT_NULL} should be supplied. When using the +@code{MACH_SEND_TIMEOUT} and @code{MACH_RCV_TIMEOUT} options, +@var{timeout} specifies the time in milliseconds to wait before giving +up. Otherwise @code{MACH_MSG_TIMEOUT_NONE} should be supplied. When +using the @code{MACH_SEND_NOTIFY}, @code{MACH_SEND_CANCEL}, and +@code{MACH_RCV_NOTIFY} options, @var{notify} specifies the port used for +the notification. Otherwise @code{MACH_PORT_NULL} should be supplied. + +If the option argument is @code{MACH_SEND_MSG}, it sends a message. 
The +@var{send_size} argument specifies the size of the message to send. The +@code{msgh_remote_port} field of the message header specifies the +destination of the message. + +If the option argument is @code{MACH_RCV_MSG}, it receives a message. +The @var{rcv_size} argument specifies the size of the message buffer +that will receive the message; messages larger than @var{rcv_size} are +not received. The @var{rcv_name} argument specifies the port or port +set from which to receive. + +If the option argument is @code{MACH_SEND_MSG|MACH_RCV_MSG}, then +@code{mach_msg} does both send and receive operations. If the send +operation encounters an error (any return code other than +@code{MACH_MSG_SUCCESS}), then the call returns immediately without +attempting the receive operation. Semantically the combined call is +equivalent to separate send and receive calls, but it saves a system +call and enables other internal optimizations. + +If the option argument specifies neither @code{MACH_SEND_MSG} nor +@code{MACH_RCV_MSG}, then @code{mach_msg} does nothing. + +Some options, like @code{MACH_SEND_TIMEOUT} and @code{MACH_RCV_TIMEOUT}, +share a supporting argument. If these options are used together, they +make independent use of the supporting argument's value. +@end deftypefun + +@deftp {Data type} mach_msg_timeout_t +This is a @code{natural_t} used by the timeout mechanism. The units are +milliseconds. The value to be used when there is no timeout is +@code{MACH_MSG_TIMEOUT_NONE}. +@end deftp + + +@node Message Format +@subsection Message Format +@cindex message format +@cindex format of a message +@cindex composing messages +@cindex message composition + +A Mach message consists of a fixed size message header, a +@code{mach_msg_header_t}, followed by zero or more data items. Data +items are typed. Each item has a type descriptor followed by the actual +data (or the address of the data, for out-of-line memory regions). 
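The layout described above (a fixed-size header followed by typed data items) can be pictured with a small C sketch of a message that carries one 32-bit integer in-line. This is a minimal sketch under stated assumptions, not the real definitions: the @code{sketch_} types and the function @code{sketch_fill_int_msg} are hypothetical stand-ins that mirror the header and type descriptor members documented in this section, so that the example compiles without the Mach headers. Real code would use the types from @file{mach/message.h}, and the numeric value chosen for the type name is a placeholder for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mach_msg_header_t (same member names). */
typedef struct {
    unsigned int msgh_bits;
    unsigned int msgh_size;
    unsigned int msgh_remote_port;
    unsigned int msgh_local_port;
    unsigned int msgh_seqno;
    int          msgh_id;
} sketch_msg_header_t;

/* Hypothetical stand-in for mach_msg_type_t (same bit-field widths). */
typedef struct {
    unsigned int msgt_name : 8;
    unsigned int msgt_size : 8;
    unsigned int msgt_number : 12;
    unsigned int msgt_inline : 1;
    unsigned int msgt_longform : 1;
    unsigned int msgt_deallocate : 1;
    unsigned int msgt_unused : 1;
} sketch_msg_type_t;

/* A whole message: the header, then one (type descriptor, data) pair. */
typedef struct {
    sketch_msg_header_t head;
    sketch_msg_type_t   value_type;
    int                 value;
} sketch_int_msg_t;

/* Placeholder type name; real code uses MACH_MSG_TYPE_INTEGER_32. */
enum { SKETCH_TYPE_INTEGER_32 = 2 };

/* Fill in the message body and return the size a sender would pass
   as the send_size argument.  Ports and msgh_bits are left zero here. */
size_t sketch_fill_int_msg(sketch_int_msg_t *msg, int id, int value)
{
    memset(msg, 0, sizeof *msg);
    msg->head.msgh_id = id;                 /* operation id, by convention */
    msg->value_type.msgt_name = SKETCH_TYPE_INTEGER_32;
    msg->value_type.msgt_size = 32;         /* bits per datum */
    msg->value_type.msgt_number = 1;        /* one datum */
    msg->value_type.msgt_inline = 1;        /* data follows the descriptor */
    msg->value = value;
    return sizeof *msg;
}
```

The sketch leaves out the port fields on purpose: a real sender would also have to set @code{msgh_remote_port} and @code{msgh_bits} before calling @code{mach_msg}.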
+ +The following data types are related to Mach ports: + +@deftp {Data type} mach_port_t +The @code{mach_port_t} data type is an unsigned integer type which +represents a port name in the task's port name space. In GNU Mach, this +is an @code{unsigned int}. +@end deftp + +@c This is defined elsewhere. +@c @deftp {Data type} mach_port_seqno_t +@c The @code{mach_port_seqno_t} data type is an unsigned integer type which +@c represents a sequence number of a message. In GNU Mach, this is an +@c @code{unsigned int}. +@c @end deftp + +The following data types are related to Mach messages: + +@deftp {Data type} mach_msg_bits_t +The @code{mach_msg_bits_t} data type is an @code{unsigned int} used to +store various flags for a message. +@end deftp + +@deftp {Data type} mach_msg_size_t +The @code{mach_msg_size_t} data type is an @code{unsigned int} used to +store the size of a message. +@end deftp + +@deftp {Data type} mach_msg_id_t +The @code{mach_msg_id_t} data type is an @code{integer_t} typically used to +convey a function or operation id for the receiver. +@end deftp + +@deftp {Data type} mach_msg_header_t +This structure is the start of every message in the Mach IPC system. It +has the following members: + +@table @code +@item mach_msg_bits_t msgh_bits +The @code{msgh_bits} field has the following bits defined, all other +bits should be zero: + +@table @code +@item MACH_MSGH_BITS_REMOTE_MASK +@itemx MACH_MSGH_BITS_LOCAL_MASK +The remote and local bits encode @code{mach_msg_type_name_t} values that +specify the port rights in the @code{msgh_remote_port} and +@code{msgh_local_port} fields. The remote value must specify a send or +send-once right for the destination of the message. If the local value +doesn't specify a send or send-once right for the message's reply port, +it must be zero and msgh_local_port must be @code{MACH_PORT_NULL}. 
+ +@item MACH_MSGH_BITS_COMPLEX +The complex bit must be specified if the message body contains port +rights or out-of-line memory regions. If it is not specified, then the +message body carries no port rights or memory, no matter what the type +descriptors may seem to indicate. +@end table + +@code{MACH_MSGH_BITS_REMOTE} and @code{MACH_MSGH_BITS_LOCAL} macros +return the appropriate @code{mach_msg_type_name_t} values, given a +@code{msgh_bits} value. The @code{MACH_MSGH_BITS} macro constructs a +value for @code{msgh_bits}, given two @code{mach_msg_type_name_t} +values. + +@item mach_msg_size_t msgh_size +The @code{msgh_size} field in the header of a received message contains +the message's size. The message size, a byte quantity, includes the +message header, type descriptors, and in-line data. For out-of-line +memory regions, the message size includes the size of the in-line +address, not the size of the actual memory region. There are no +arbitrary limits on the size of a Mach message, the number of data items +in a message, or the size of the data items. + +@item mach_port_t msgh_remote_port +The @code{msgh_remote_port} field specifies the destination port of the +message. The field must carry a legitimate send or send-once right for +a port. + +@item mach_port_t msgh_local_port +The @code{msgh_local_port} field specifies an auxiliary port right, +which is conventionally used as a reply port by the recipient of the +message. The field must carry a send right, a send-once right, +@code{MACH_PORT_NULL}, or @code{MACH_PORT_DEAD}. + +@item mach_port_seqno_t msgh_seqno +The @code{msgh_seqno} field provides a sequence number for the message. +It is only valid in received messages; its value in sent messages is +overwritten. +@c XXX The "MESSAGE RECEIVE" section discusses message sequence numbers. + +@item mach_msg_id_t msgh_id +The @code{mach_msg} call doesn't use the @code{msgh_id} field, but it +conventionally conveys an operation or function id. 
+@end table +@end deftp + +@deftypefn Macro mach_msg_bits_t MACH_MSGH_BITS (@w{mach_msg_type_name_t @var{remote}}, @w{mach_msg_type_name_t @var{local}}) +This macro composes two @code{mach_msg_type_name_t} values that specify +the port rights in the @code{msgh_remote_port} and +@code{msgh_local_port} fields of a @code{mach_msg} call into an +appropriate @code{mach_msg_bits_t} value. +@end deftypefn + +@deftypefn Macro mach_msg_type_name_t MACH_MSGH_BITS_REMOTE (@w{mach_msg_bits_t @var{bits}}) +This macro extracts the @code{mach_msg_type_name_t} value for the remote +port right in a @code{mach_msg_bits_t} value. +@end deftypefn + +@deftypefn Macro mach_msg_type_name_t MACH_MSGH_BITS_LOCAL (@w{mach_msg_bits_t @var{bits}}) +This macro extracts the @code{mach_msg_type_name_t} value for the local +port right in a @code{mach_msg_bits_t} value. +@end deftypefn + +@deftypefn Macro mach_msg_bits_t MACH_MSGH_BITS_PORTS (@w{mach_msg_bits_t @var{bits}}) +This macro extracts the @code{mach_msg_bits_t} component consisting of +the @code{mach_msg_type_name_t} values for the remote and local port +right in a @code{mach_msg_bits_t} value. +@end deftypefn + +@deftypefn Macro mach_msg_bits_t MACH_MSGH_BITS_OTHER (@w{mach_msg_bits_t @var{bits}}) +This macro extracts the @code{mach_msg_bits_t} component consisting of +everything except the @code{mach_msg_type_name_t} values for the remote +and local port right in a @code{mach_msg_bits_t} value. +@end deftypefn + +Each data item has a type descriptor, a @code{mach_msg_type_t} or a +@code{mach_msg_type_long_t}. The @code{mach_msg_type_long_t} type +descriptor allows larger values for some fields. The +@code{msgtl_header} field in the long descriptor is only used for its +inline, longform, and deallocate bits. + +@deftp {Data type} mach_msg_type_name_t +This is an @code{unsigned int} and can be used to hold the +@code{msgt_name} component of the @code{mach_msg_type_t} and +@code{mach_msg_type_long_t} structure. 
+@end deftp + +@deftp {Data type} mach_msg_type_size_t +This is an @code{unsigned int} and can be used to hold the +@code{msgt_size} component of the @code{mach_msg_type_t} and +@code{mach_msg_type_long_t} structure. +@end deftp + +@deftp {Data type} mach_msg_type_number_t +This is a @code{natural_t} and can be used to hold the +@code{msgt_number} component of the @code{mach_msg_type_t} and +@code{mach_msg_type_long_t} structure. +@c XXX This is used for the size of arrays, too. Mmh? +@end deftp + +@deftp {Data type} mach_msg_type_t +This structure has the following members: + +@table @code +@item unsigned int msgt_name : 8 +The @code{msgt_name} field specifies the data's type. The following +types are predefined: + +@table @code +@item MACH_MSG_TYPE_UNSTRUCTURED +@item MACH_MSG_TYPE_BIT +@item MACH_MSG_TYPE_BOOLEAN +@item MACH_MSG_TYPE_INTEGER_16 +@item MACH_MSG_TYPE_INTEGER_32 +@item MACH_MSG_TYPE_CHAR +@item MACH_MSG_TYPE_BYTE +@item MACH_MSG_TYPE_INTEGER_8 +@item MACH_MSG_TYPE_REAL +@item MACH_MSG_TYPE_STRING +@item MACH_MSG_TYPE_STRING_C +@item MACH_MSG_TYPE_PORT_NAME +@end table + +The following predefined types specify port rights, and receive special +treatment. The next section discusses these types in detail. The type +@c XXX cross ref +@code{MACH_MSG_TYPE_PORT_NAME} describes port right names, when no +rights are being transferred, but just names. For this purpose, it +should be used in preference to @code{MACH_MSG_TYPE_INTEGER_32}. + +@table @code +@item MACH_MSG_TYPE_MOVE_RECEIVE +@item MACH_MSG_TYPE_MOVE_SEND +@item MACH_MSG_TYPE_MOVE_SEND_ONCE +@item MACH_MSG_TYPE_COPY_SEND +@item MACH_MSG_TYPE_MAKE_SEND +@item MACH_MSG_TYPE_MAKE_SEND_ONCE +@end table + +@item msgt_size : 8 +The @code{msgt_size} field specifies the size of each datum, in bits. For +example, the @code{msgt_size} of @code{MACH_MSG_TYPE_INTEGER_32} data is 32. + +@item msgt_number : 12 +The @code{msgt_number} field specifies how many data elements comprise +the data item. 
Zero is a legitimate number. + +The total length specified by a type descriptor is @w{@code{(msgt_size * +msgt_number)}}, rounded up to an integral number of bytes. In-line data +is then padded to an integral number of long-words. This ensures that +type descriptors always start on long-word boundaries. It implies that +message sizes are always an integral multiple of a long-word's size. + +@item msgt_inline : 1 +The @code{msgt_inline} bit specifies, when @code{FALSE}, that the data +actually resides in an out-of-line region. The address of the memory +region (a @code{vm_offset_t} or @code{vm_address_t}) follows the type +descriptor in the message body. The @code{msgt_name}, @code{msgt_size}, +and @code{msgt_number} fields describe the memory region, not the +address. + +@item msgt_longform : 1 +The @code{msgt_longform} bit specifies, when @code{TRUE}, that this type +descriptor is a @code{mach_msg_type_long_t} instead of a +@code{mach_msg_type_t}. The @code{msgt_name}, @code{msgt_size}, and +@code{msgt_number} fields should be zero. Instead, @code{mach_msg} uses +the following @code{msgtl_name}, @code{msgtl_size}, and +@code{msgtl_number} fields. + +@item msgt_deallocate : 1 +The @code{msgt_deallocate} bit is used with out-of-line regions. When +@code{TRUE}, it specifies that the memory region should be deallocated +from the sender's address space (as if with @code{vm_deallocate}) when +the message is sent. + +@item msgt_unused : 1 +The @code{msgt_unused} bit should be zero. +@end table +@end deftp + +@deftypefn Macro boolean_t MACH_MSG_TYPE_PORT_ANY (mach_msg_type_name_t type) +This macro returns @code{TRUE} if the given type name specifies a port +type, otherwise it returns @code{FALSE}. +@end deftypefn + +@deftypefn Macro boolean_t MACH_MSG_TYPE_PORT_ANY_SEND (mach_msg_type_name_t type) +This macro returns @code{TRUE} if the given type name specifies a port +type with a send or send-once right, otherwise it returns @code{FALSE}. 
+@end deftypefn + +@deftypefn Macro boolean_t MACH_MSG_TYPE_PORT_ANY_RIGHT (mach_msg_type_name_t type) +This macro returns @code{TRUE} if the given type name specifies a port +right type which is moved, otherwise it returns @code{FALSE}. +@end deftypefn + +@deftp {Data type} mach_msg_type_long_t +This structure has the following members: + +@table @code +@item mach_msg_type_t msgtl_header +A @code{mach_msg_type_t} header, of which only the inline, longform, and deallocate bits are used. +@c XXX cross ref + +@item unsigned short msgtl_name +Same meaning as @code{msgt_name}. + +@item unsigned short msgtl_size +Same meaning as @code{msgt_size}. + +@item unsigned int msgtl_number +Same meaning as @code{msgt_number}. +@end table +@end deftp + + +@node Exchanging Port Rights +@subsection Exchanging Port Rights +@cindex sending port rights +@cindex receiving port rights +@cindex moving port rights + +Each task has its own space of port rights. Port rights are named with +positive integers. Except for the reserved values +@w{@code{MACH_PORT_NULL (0)}@footnote{In the Hurd system, we don't make +the assumption that @code{MACH_PORT_NULL} is zero and evaluates to +false, but rather compare port names to @code{MACH_PORT_NULL} +explicitly.}} and @w{@code{MACH_PORT_DEAD (~0)}}, this is a full 32-bit +name space. When the kernel chooses a name for a new right, it is free +to pick any unused name (one which denotes no right) in the space. + +There are five basic kinds of rights: receive rights, send rights, +send-once rights, port-set rights, and dead names. Dead names are not +capabilities. They act as place-holders to prevent a name from being +otherwise used. + +A port is destroyed, or dies, when its receive right is deallocated. +When a port dies, send and send-once rights for the port turn into dead +names. Any messages queued at the port are destroyed, which deallocates +the port rights and out-of-line memory in the messages. + +Tasks may hold multiple user-references for send rights and dead names. 
+When a task receives a send right which it already holds, the kernel +increments the right's user-reference count. When a task deallocates a +send right, the kernel decrements its user-reference count, and the task +only loses the send right when the count goes to zero. + +Send-once rights always have a user-reference count of one, although a +port can have multiple send-once rights, because each send-once right +held by a task has a different name. In contrast, when a task holds +send rights or a receive right for a port, the rights share a single +name. + +A message body can carry port rights; the @code{msgt_name} +(@code{msgtl_name}) field in a type descriptor specifies the type of +port right and how the port right is to be extracted from the caller. +The values @code{MACH_PORT_NULL} and @code{MACH_PORT_DEAD} are always +valid in place of a port right in a message body. In a sent message, +the following @code{msgt_name} values denote port rights: + +@table @code +@item MACH_MSG_TYPE_MAKE_SEND +The message will carry a send right, but the caller must supply a +receive right. The send right is created from the receive right, and +the receive right's make-send count is incremented. + +@item MACH_MSG_TYPE_COPY_SEND +The message will carry a send right, and the caller should supply a send +right. The user reference count for the supplied send right is not +changed. The caller may also supply a dead name and the receiving task +will get @code{MACH_PORT_DEAD}. + +@item MACH_MSG_TYPE_MOVE_SEND +The message will carry a send right, and the caller should supply a send +right. The user reference count for the supplied send right is +decremented, and the right is destroyed if the count becomes zero. +Unless a receive right remains, the name becomes available for +recycling. The caller may also supply a dead name, which loses a user +reference, and the receiving task will get @code{MACH_PORT_DEAD}. 
+ +@item MACH_MSG_TYPE_MAKE_SEND_ONCE +The message will carry a send-once right, but the caller must supply a +receive right. The send-once right is created from the receive right. + +@item MACH_MSG_TYPE_MOVE_SEND_ONCE +The message will carry a send-once right, and the caller should supply a +send-once right. The caller loses the supplied send-once right. The +caller may also supply a dead name, which loses a user reference, and +the receiving task will get @code{MACH_PORT_DEAD}. + +@item MACH_MSG_TYPE_MOVE_RECEIVE +The message will carry a receive right, and the caller should supply a +receive right. The caller loses the supplied receive right, but retains +any send rights with the same name. +@end table + +If a message carries a send or send-once right, and the port dies while +the message is in transit, then the receiving task will get +@code{MACH_PORT_DEAD} instead of a right. The following +@code{msgt_name} values in a received message indicate that it carries +port rights: + +@table @code +@item MACH_MSG_TYPE_PORT_SEND +This name is an alias for @code{MACH_MSG_TYPE_MOVE_SEND}. The message +carried a send right. If the receiving task already has send and/or +receive rights for the port, then that name for the port will be reused. +Otherwise, the new right will have a new name. If the task already has +send rights, it gains a user reference for the right (unless this would +cause the user-reference count to overflow). Otherwise, it acquires the +send right, with a user-reference count of one. + +@item MACH_MSG_TYPE_PORT_SEND_ONCE +This name is an alias for @code{MACH_MSG_TYPE_MOVE_SEND_ONCE}. The +message carried a send-once right. The right will have a new name. + +@item MACH_MSG_TYPE_PORT_RECEIVE +This name is an alias for @code{MACH_MSG_TYPE_MOVE_RECEIVE}. The +message carried a receive right. If the receiving task already has send +rights for the port, then that name for the port will be reused. +Otherwise, the right will have a new name. 
The make-send count of the +receive right is reset to zero, but the port retains other attributes +like queued messages, extant send and send-once rights, and requests for +port-destroyed and no-senders notifications. +@end table + +When the kernel chooses a new name for a port right, it can choose any +name, other than @code{MACH_PORT_NULL} and @code{MACH_PORT_DEAD}, which +is not currently being used for a port right or dead name. It might +choose a name which at some previous time denoted a port right, but is +currently unused. + + +@node Memory +@subsection Memory +@cindex sending memory +@cindex receiving memory + +A message body can contain the address of a region in the sender's +address space which should be transferred as part of the message. The +message carries a logical copy of the memory, but the kernel uses VM +techniques to defer any actual page copies. Unless the sender or the +receiver modifies the data, the physical pages remain shared. + +An out-of-line transfer occurs when the data's type descriptor specifies +@code{msgt_inline} as @code{FALSE}. The address of the memory region (a +@code{vm_offset_t} or @code{vm_address_t}) should follow the type +descriptor in the message body. The type descriptor and the address +contribute to the message's size (@code{send_size}, @code{msgh_size}). +The out-of-line data does not contribute to the message's size. + +The name, size, and number fields in the type descriptor describe the +type and length of the out-of-line data, not the in-line address. +Out-of-line memory frequently requires long type descriptors +(@code{mach_msg_type_long_t}), because the @code{msgt_number} field is +too small to describe a page of 4K bytes. + +Out-of-line memory arrives somewhere in the receiver's address space as +new memory. It has the same inheritance and protection attributes as +newly @code{vm_allocate}'d memory. 
The receiver has the responsibility +of deallocating (with @code{vm_deallocate}) the memory when it is no +longer needed. Security-conscious receivers should exercise caution +when using out-of-line memory from untrustworthy sources, because the +memory may be backed by an unreliable memory manager. + +Null out-of-line memory is legal. If the out-of-line region size is +zero (for example, because @code{msgtl_number} is zero), then the +region's specified address is ignored. A received null out-of-line +memory region always has a zero address. + +Unaligned addresses and region sizes that are not page multiples are +legal. A received message can also contain memory with unaligned +addresses and funny sizes. In the general case, the first and last +pages in the new memory region in the receiver do not contain only data +from the sender, but are partly zero.@footnote{Sending out-of-line +memory with a non-page-aligned address, or a size which is not a page +multiple, works but with a caveat. The extra bytes in the first and +last page of the received memory are not zeroed, so the receiver can +peek at more data than the sender intended to transfer. This might be a +security problem for the sender.} The received address points to the +start of the data in the first page. This possibility doesn't +complicate deallocation, because @code{vm_deallocate} does the right +thing, rounding the start address down and the end address up to +deallocate all arrived pages. + +Out-of-line memory has a deallocate option, controlled by the +@code{msgt_deallocate} bit. If it is @code{TRUE} and the out-of-line +memory region is not null, then the region is implicitly deallocated +from the sender, as if by @code{vm_deallocate}. In particular, the +start and end addresses are rounded so that every page overlapped by the +memory region is deallocated. The use of @code{msgt_deallocate} +effectively changes the memory copy into a memory movement. 
In a +received message, @code{msgt_deallocate} is @code{TRUE} in type +descriptors for out-of-line memory. + +Out-of-line memory can carry port rights. + + +@node Message Send +@subsection Message Send +@cindex sending messages + +The send operation queues a message to a port. The message carries a +copy of the caller's data. After the send, the caller can freely modify +the message buffer or the out-of-line memory regions and the message +contents will remain unchanged. + +Message delivery is reliable and sequenced. Messages are not lost, and +messages sent to a port, from a single thread, are received in the order +in which they were sent. + +If the destination port's queue is full, then several things can happen. +If the message is sent to a send-once right (@code{msgh_remote_port} +carries a send-once right), then the kernel ignores the queue limit and +delivers the message. Otherwise the caller blocks until there is room +in the queue, unless the @code{MACH_SEND_TIMEOUT} or +@code{MACH_SEND_NOTIFY} options are used. If a port has several blocked +senders, then any of them may queue the next message when space in the +queue becomes available, with the proviso that a blocked sender will not +be indefinitely starved. + +These options modify @code{MACH_SEND_MSG}. If @code{MACH_SEND_MSG} is +not also specified, they are ignored. + +@table @code +@item MACH_SEND_TIMEOUT +The timeout argument should specify a maximum time (in milliseconds) for +the call to block before giving up.@footnote{If MACH_SEND_TIMEOUT is +used without MACH_SEND_INTERRUPT, then the timeout duration might not be +accurate. When the call is interrupted and automatically retried, the +original timeout is used. If interrupts occur frequently enough, the +timeout interval might never expire.} If the message can't be queued +before the timeout interval elapses, then the call returns +@code{MACH_SEND_TIMED_OUT}. A zero timeout is legitimate. 
+ +@item MACH_SEND_NOTIFY +The notify argument should specify a receive right for a notify port. +If the send were to block, then instead the message is queued, +@code{MACH_SEND_WILL_NOTIFY} is returned, and a msg-accepted +notification is requested. If @code{MACH_SEND_TIMEOUT} is also +specified, then @code{MACH_SEND_NOTIFY} doesn't take effect until the +timeout interval elapses. + +With @code{MACH_SEND_NOTIFY}, a task can forcibly queue to a send right +one message at a time. A msg-accepted notification is sent to the +notify port when another message can be forcibly queued. If an attempt +is made to use @code{MACH_SEND_NOTIFY} before then, the call returns a +@code{MACH_SEND_NOTIFY_IN_PROGRESS} error. + +The msg-accepted notification carries the name of the send right. If +the send right is deallocated before the msg-accepted notification is +generated, then the msg-accepted notification carries the value +@code{MACH_PORT_NULL}. If the destination port is destroyed before the +notification is generated, then a send-once notification is generated +instead. + +@item MACH_SEND_INTERRUPT +If specified, the @code{mach_msg} call will return +@code{MACH_SEND_INTERRUPTED} if a software interrupt aborts the call. +Otherwise, the send operation will be retried. + +@item MACH_SEND_CANCEL +The notify argument should specify a receive right for a notify port. +If the send operation removes the destination port right from the +caller, and the removed right had a dead-name request registered for it, +and notify is the notify port for the dead-name request, then the +dead-name request may be silently canceled (instead of resulting in a +port-deleted notification). + +This option is typically used to cancel a dead-name request made with +the @code{MACH_RCV_NOTIFY} option. It should only be used as an optimization. +@end table + +The send operation can generate the following return codes. 
These +return codes imply that the call did nothing: + +@table @code +@item MACH_SEND_MSG_TOO_SMALL +The specified send_size was smaller than the minimum size for a message. + +@item MACH_SEND_NO_BUFFER +A resource shortage prevented the kernel from allocating a message +buffer. + +@item MACH_SEND_INVALID_DATA +The supplied message buffer was not readable. + +@item MACH_SEND_INVALID_HEADER +The @code{msgh_bits} value was invalid. + +@item MACH_SEND_INVALID_DEST +The @code{msgh_remote_port} value was invalid. + +@item MACH_SEND_INVALID_REPLY +The @code{msgh_local_port} value was invalid. + +@item MACH_SEND_INVALID_NOTIFY +When using @code{MACH_SEND_CANCEL}, the notify argument did not denote a +valid receive right. +@end table + +These return codes imply that some or all of the message was destroyed: + +@table @code +@item MACH_SEND_INVALID_MEMORY +The message body specified out-of-line data that was not readable. + +@item MACH_SEND_INVALID_RIGHT +The message body specified a port right which the caller didn't possess. + +@item MACH_SEND_INVALID_TYPE +A type descriptor was invalid. + +@item MACH_SEND_MSG_TOO_SMALL +The last data item in the message ran over the end of the message. +@end table + +These return codes imply that the message was returned to the caller +with a pseudo-receive operation: + +@table @code +@item MACH_SEND_TIMED_OUT +The timeout interval expired. + +@item MACH_SEND_INTERRUPTED +A software interrupt occurred. + +@item MACH_SEND_INVALID_NOTIFY +When using @code{MACH_SEND_NOTIFY}, the notify argument did not denote a +valid receive right. + +@item MACH_SEND_NO_NOTIFY +A resource shortage prevented the kernel from setting up a msg-accepted +notification. + +@item MACH_SEND_NOTIFY_IN_PROGRESS +A msg-accepted notification was already requested, and hasn't yet been +generated. 
+@end table + +These return codes imply that the message was queued: + +@table @code +@item MACH_SEND_WILL_NOTIFY +The message was forcibly queued, and a msg-accepted notification was +requested. + +@item MACH_MSG_SUCCESS +The message was queued. +@end table + +Some return codes, like @code{MACH_SEND_TIMED_OUT}, imply that the +message was almost sent, but could not be queued. In these situations, +the kernel tries to return the message contents to the caller with a +pseudo-receive operation. This prevents the loss of port rights or +memory which only exist in the message. For example, a receive right +which was moved into the message, or out-of-line memory sent with the +deallocate bit. + +The pseudo-receive operation is very similar to a normal receive +operation. The pseudo-receive handles the port rights in the message +header as if they were in the message body. They are not reversed. +After the pseudo-receive, the message is ready to be resent. If the +message is not resent, note that out-of-line memory regions may have +moved and some port rights may have changed names. + +The pseudo-receive operation may encounter resource shortages. This is +similar to a @code{MACH_RCV_BODY_ERROR} return code from a receive +operation. When this happens, the normal send return codes are +augmented with the @code{MACH_MSG_IPC_SPACE}, @code{MACH_MSG_VM_SPACE}, +@code{MACH_MSG_IPC_KERNEL}, and @code{MACH_MSG_VM_KERNEL} bits to +indicate the nature of the resource shortage. + +The queueing of a message carrying receive rights may create a circular +loop of receive rights and messages, which can never be received. For +example, a message carrying a receive right can be sent to that receive +right. This situation is not an error, but the kernel will +garbage-collect such loops, destroying the messages and ports involved. + + +@node Message Receive +@subsection Message Receive + +The receive operation dequeues a message from a port. 
The receiving +task acquires the port rights and out-of-line memory regions carried in +the message. + +The @code{rcv_name} argument specifies a port or port set from which to +receive. If a port is specified, the caller must possess the receive +right for the port and the port must not be a member of a port set. If +no message is present, then the call blocks, subject to the +@code{MACH_RCV_TIMEOUT} option. + +If a port set is specified, the call will receive a message sent to any +of the member ports. It is permissible for the port set to have no +member ports, and ports may be added and removed while a receive from +the port set is in progress. The received message can come from any of +the member ports which have messages, with the proviso that a member +port with messages will not be indefinitely starved. The +@code{msgh_local_port} field in the received message header specifies +from which port in the port set the message came. + +The @code{rcv_size} argument specifies the size of the caller's message +buffer. The @code{mach_msg} call will not receive a message larger than +@code{rcv_size}. Messages that are too large are destroyed, unless the +@code{MACH_RCV_LARGE} option is used. + +The destination and reply ports are reversed in a received message +header. The @code{msgh_local_port} field names the destination port, +from which the message was received, and the @code{msgh_remote_port} +field names the reply port right. The bits in @code{msgh_bits} are also +reversed. The @code{MACH_MSGH_BITS_LOCAL} bits have the value +@code{MACH_MSG_TYPE_PORT_SEND} if the message was sent to a send right, +and the value @code{MACH_MSG_TYPE_PORT_SEND_ONCE} if it was sent to a +send-once right. The @code{MACH_MSGH_BITS_REMOTE} bits describe the +reply port right. + +A received message can contain port rights and out-of-line memory. 
The +@code{msgh_local_port} field does not receive a port right; the act of +receiving the message destroys the send or send-once right for the +destination port. The @code{msgh_remote_port} field does name a received port +right, the reply port right, and the message body can carry port rights +and memory if @code{MACH_MSGH_BITS_COMPLEX} is present in @code{msgh_bits}. +Received port rights and memory should be consumed or deallocated in +some fashion. + +In almost all cases, @code{msgh_local_port} will specify the name of a +receive right, either @code{rcv_name} or if @code{rcv_name} is a port +set, a member of @code{rcv_name}. If other threads are concurrently +manipulating the receive right, the situation is more complicated. If +the receive right is renamed during the call, then +@code{msgh_local_port} specifies the right's new name. If the caller +loses the receive right after the message was dequeued from it, then +@code{mach_msg} will proceed instead of returning +@code{MACH_RCV_PORT_DIED}. If the receive right was destroyed, then +@code{msgh_local_port} specifies @code{MACH_PORT_DEAD}. If the receive +right still exists, but isn't held by the caller, then +@code{msgh_local_port} specifies @code{MACH_PORT_NULL}. + +Received messages are stamped with a sequence number, taken from the +port from which the message was received. (Messages received from a +port set are stamped with a sequence number from the appropriate member +port.) Newly created ports start with a zero sequence number, and the +sequence number is reset to zero whenever the port's receive right moves +between tasks. When a message is dequeued from the port, it is stamped +with the port's sequence number and the port's sequence number is then +incremented. The dequeue and increment operations are atomic, so that +multiple threads receiving messages from a port can use the +@code{msgh_seqno} field to reconstruct the original order of the +messages. + +These options modify @code{MACH_RCV_MSG}. 
If @code{MACH_RCV_MSG} is not +also specified, they are ignored. + +@table @code +@item MACH_RCV_TIMEOUT +The timeout argument should specify a maximum time (in milliseconds) for +the call to block before giving up.@footnote{If MACH_RCV_TIMEOUT is used +without MACH_RCV_INTERRUPT, then the timeout duration might not be +accurate. When the call is interrupted and automatically retried, the +original timeout is used. If interrupts occur frequently enough, the +timeout interval might never expire.} If no message arrives before the +timeout interval elapses, then the call returns +@code{MACH_RCV_TIMED_OUT}. A zero timeout is legitimate. + +@item MACH_RCV_NOTIFY +The notify argument should specify a receive right for a notify port. +If receiving the reply port creates a new port right in the caller, then +the notify port is used to request a dead-name notification for the new +port right. + +@item MACH_RCV_INTERRUPT +If specified, the @code{mach_msg} call will return +@code{MACH_RCV_INTERRUPTED} if a software interrupt aborts the call. +Otherwise, the receive operation will be retried. + +@item MACH_RCV_LARGE +If the message is larger than @code{rcv_size}, then the message remains +queued instead of being destroyed. The call returns +@code{MACH_RCV_TOO_LARGE} and the actual size of the message is returned +in the @code{msgh_size} field of the message header. +@end table + +The receive operation can generate the following return codes. These +return codes imply that the call did not dequeue a message: + +@table @code +@item MACH_RCV_INVALID_NAME +The specified @code{rcv_name} was invalid. + +@item MACH_RCV_IN_SET +The specified port was a member of a port set. + +@item MACH_RCV_TIMED_OUT +The timeout interval expired. + +@item MACH_RCV_INTERRUPTED +A software interrupt occurred. + +@item MACH_RCV_PORT_DIED +The caller lost the rights specified by @code{rcv_name}. 
+ +@item MACH_RCV_PORT_CHANGED +@code{rcv_name} specified a receive right which was moved into a port +set during the call. + +@item MACH_RCV_TOO_LARGE +When using @code{MACH_RCV_LARGE}, and the message was larger than +@code{rcv_size}. The message is left queued, and its actual size is +returned in the @code{msgh_size} field of the message buffer. +@end table + +These return codes imply that a message was dequeued and destroyed: + +@table @code +@item MACH_RCV_HEADER_ERROR +A resource shortage prevented the reception of the port rights in the +message header. + +@item MACH_RCV_INVALID_NOTIFY +When using @code{MACH_RCV_NOTIFY}, the notify argument did not denote a +valid receive right. + +@item MACH_RCV_TOO_LARGE +When not using @code{MACH_RCV_LARGE}, a message larger than +@code{rcv_size} was dequeued and destroyed. +@end table + +In these situations, when a message is dequeued and then destroyed, the +reply port and all port rights and memory in the message body are +destroyed. However, the caller receives the message's header, with all +fields correct, including the destination port but excepting the reply +port, which is @code{MACH_PORT_NULL}. + +These return codes imply that a message was received: + +@table @code +@item MACH_RCV_BODY_ERROR +A resource shortage prevented the reception of a port right or +out-of-line memory region in the message body. The message header, +including the reply port, is correct. The kernel attempts to transfer +all port rights and memory regions in the body, and only destroys those +that can't be transferred. + +@item MACH_RCV_INVALID_DATA +The specified message buffer was not writable. The calling task did +successfully receive the port rights and out-of-line memory regions in +the message. + +@item MACH_MSG_SUCCESS +A message was received. +@end table + +Resource shortages can occur after a message is dequeued, while +transferring port rights and out-of-line memory regions to the receiving +task. 
The @code{mach_msg} call returns @code{MACH_RCV_HEADER_ERROR} or +@code{MACH_RCV_BODY_ERROR} in this situation. These return codes always +carry extra bits (bitwise-ored) that indicate the nature of the resource +shortage: + +@table @code +@item MACH_MSG_IPC_SPACE +There was no room in the task's IPC name space for another port name. + +@item MACH_MSG_VM_SPACE +There was no room in the task's VM address space for an out-of-line +memory region. + +@item MACH_MSG_IPC_KERNEL +A kernel resource shortage prevented the reception of a port right. + +@item MACH_MSG_VM_KERNEL +A kernel resource shortage prevented the reception of an out-of-line +memory region. +@end table + +If a resource shortage prevents the reception of a port right, the port +right is destroyed and the caller sees the name @code{MACH_PORT_NULL}. +If a resource shortage prevents the reception of an out-of-line memory +region, the region is destroyed and the caller receives a zero address. +In addition, the @code{msgt_size} (@code{msgtl_size}) field in the +data's type descriptor is changed to zero. If a resource shortage +prevents the reception of out-of-line memory carrying port rights, then +the port rights are always destroyed if the memory region can not be +received. A task never receives port rights or memory regions that it +isn't told about. + + +@node Atomicity +@subsection Atomicity + +The @code{mach_msg} call handles port rights in a message header +atomically. Port rights and out-of-line memory in a message body do not +enjoy this atomicity guarantee. The message body may be processed +front-to-back, back-to-front, first out-of-line memory then port rights, +in some random order, or even atomically. + +For example, consider sending a message with the destination port +specified as @code{MACH_MSG_TYPE_MOVE_SEND} and the reply port specified +as @code{MACH_MSG_TYPE_COPY_SEND}. 
The same send right, with one +user-reference, is supplied for both the @code{msgh_remote_port} and +@code{msgh_local_port} fields. Because @code{mach_msg} processes the +message header atomically, this succeeds. If @code{msgh_remote_port} +were processed before @code{msgh_local_port}, then @code{mach_msg} would +return @code{MACH_SEND_INVALID_REPLY} in this situation. + +On the other hand, suppose the destination and reply port are both +specified as @code{MACH_MSG_TYPE_MOVE_SEND}, and again the same send +right with one user-reference is supplied for both. Now the send +operation fails, but because it processes the header atomically, +@code{mach_msg} can return either @code{MACH_SEND_INVALID_DEST} or +@code{MACH_SEND_INVALID_REPLY}. + +For example, consider receiving a message at the same time another +thread is deallocating the destination receive right. Suppose the reply +port field carries a send right for the destination port. If the +deallocation happens before the dequeuing, then the receiver gets +@code{MACH_RCV_PORT_DIED}. If the deallocation happens after the +receive, then the @code{msgh_local_port} and the @code{msgh_remote_port} +fields both specify the same right, which becomes a dead name when the +receive right is deallocated. If the deallocation happens between the +dequeue and the receive, then the @code{msgh_local_port} and +@code{msgh_remote_port} fields both specify @code{MACH_PORT_DEAD}. +Because the header is processed atomically, it is not possible for just +one of the two fields to hold @code{MACH_PORT_DEAD}. + +The @code{MACH_RCV_NOTIFY} option provides a more likely example. +Suppose a message carrying a send-once right reply port is received with +@code{MACH_RCV_NOTIFY} at the same time the reply port is destroyed. If +the reply port is destroyed first, then @code{msgh_remote_port} +specifies @code{MACH_PORT_DEAD} and the kernel does not generate a +dead-name notification. 
If the reply port is destroyed after it is +received, then @code{msgh_remote_port} specifies a dead name for which +the kernel generates a dead-name notification. It is not possible to +receive the reply port right and have it turn into a dead name before +the dead-name notification is requested; as part of the message header +the reply port is received atomically. + + +@node Port Manipulation Interface +@section Port Manipulation Interface + +This section describes the interface to create, destroy and manipulate +ports, port rights and port sets. + +@cindex IPC space port +@cindex port representing an IPC space +@deftp {Data type} ipc_space_t +This is a @code{task_t} (and as such a @code{mach_port_t}), which holds +a port name associated with a port that represents an IPC space in the +kernel. An IPC space is used by the kernel to manage the port names and +rights available to a task. The IPC space doesn't get a port name of +its own. Instead the port name of the task containing the IPC space is +used to name the IPC space of the task (as is indicated by the fact that +the type of @code{ipc_space_t} is actually @code{task_t}). + +The IPC spaces of tasks are the only ones accessible outside of +the kernel. +@end deftp + +@menu +* Port Creation:: How to create new ports and port sets. +* Port Destruction:: How to destroy ports and port sets. +* Port Names:: How to query and manipulate port names. +* Port Rights:: How to work with port rights. +* Ports and other Tasks:: How to move rights between tasks. +* Receive Rights:: How to work with receive rights. +* Port Sets:: How to work with port sets. +* Request Notifications:: How to request notifications for events. +@c * Inherited Ports:: How to work with the inherited system ports. 
+@end menu + + +@node Port Creation +@subsection Port Creation + +@deftypefun kern_return_t mach_port_allocate (@w{ipc_space_t @var{task}}, @w{mach_port_right_t @var{right}}, @w{mach_port_t *@var{name}}) +The @code{mach_port_allocate} function creates a new right in the +specified task. The new right's name is returned in @var{name}, which +may be any name that wasn't in use. + +The @var{right} argument takes the following values: + +@table @code +@item MACH_PORT_RIGHT_RECEIVE +@code{mach_port_allocate} creates a port. The new port is not a member +of any port set. It doesn't have any extant send or send-once rights. +Its make-send count is zero, its sequence number is zero, its queue +limit is @code{MACH_PORT_QLIMIT_DEFAULT}, and it has no queued messages. +@var{name} denotes the receive right for the new port. + +@var{task} does not hold send rights for the new port, only the receive +right. @code{mach_port_insert_right} and @code{mach_port_extract_right} +can be used to convert the receive right into a combined send/receive +right. + +@item MACH_PORT_RIGHT_PORT_SET +@code{mach_port_allocate} creates a port set. The new port set has no +members. + +@item MACH_PORT_RIGHT_DEAD_NAME +@code{mach_port_allocate} creates a dead name. The new dead name has +one user reference. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_VALUE} if @var{right} was invalid, @code{KERN_NO_SPACE} if +there was no room in @var{task}'s IPC name space for another right and +@code{KERN_RESOURCE_SHORTAGE} if the kernel ran out of memory. + +The @code{mach_port_allocate} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. 
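The return codes listed above for @code{mach_port_allocate} lend themselves to a small dispatch helper in the caller. The following is only a sketch under stated assumptions: the @code{sketch_} enum and function are hypothetical stand-ins, the enum values are placeholders rather than the real @code{kern_return_t} constants, and real code would include the Mach headers and switch on @code{KERN_SUCCESS}, @code{KERN_NO_SPACE} and so on directly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kern_return_t codes documented for
   mach_port_allocate.  The numeric values are placeholders chosen for
   this sketch; real code should use the constants from the Mach
   headers instead.  */
enum sketch_kern_return
{
  SKETCH_KERN_SUCCESS,
  SKETCH_KERN_INVALID_TASK,
  SKETCH_KERN_INVALID_VALUE,
  SKETCH_KERN_NO_SPACE,
  SKETCH_KERN_RESOURCE_SHORTAGE
};

/* Map a documented mach_port_allocate result to a short diagnostic.
   Because the call is an RPC, any other value is assumed here to be
   a mach_msg return code from the transport.  */
const char *
sketch_allocate_diagnostic (int kr)
{
  switch (kr)
    {
    case SKETCH_KERN_SUCCESS:
      return "right created";
    case SKETCH_KERN_INVALID_TASK:
      return "task was invalid";
    case SKETCH_KERN_INVALID_VALUE:
      return "right was invalid";
    case SKETCH_KERN_NO_SPACE:
      return "no room in the IPC name space";
    case SKETCH_KERN_RESOURCE_SHORTAGE:
      return "kernel ran out of memory";
    default:
      return "mach_msg return code from the RPC";
    }
}
```

Keeping the fall-through case separate reflects the point made above that the call may also return @code{mach_msg} return codes, which a caller may want to treat as a transport failure rather than as a diagnostic from the server.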
+@end deftypefun + +@deftypefun mach_port_t mach_reply_port () +The @code{mach_reply_port} system call creates a reply port in the +calling task. + +@code{mach_reply_port} creates a port, giving the calling task the +receive right for the port. The call returns the name of the new +receive right. + +This is very much like creating a receive right with the +@code{mach_port_allocate} call, with two differences. First, +@code{mach_reply_port} is a system call and not an RPC (which requires a +reply port). Second, the port created by @code{mach_reply_port} may be +optimized for use as a reply port. + +The function returns @code{MACH_PORT_NULL} if a resource shortage +prevented the creation of the receive right. +@end deftypefun + +@deftypefun kern_return_t mach_port_allocate_name (@w{ipc_space_t @var{task}}, @w{mach_port_right_t @var{right}}, @w{mach_port_t @var{name}}) +The function @code{mach_port_allocate_name} creates a new right in the +specified task, with a specified name for the new right. @var{name} +must not already be in use for some right, and it can't be the reserved +values @code{MACH_PORT_NULL} and @code{MACH_PORT_DEAD}. + +The @var{right} argument takes the following values: + +@table @code +@item MACH_PORT_RIGHT_RECEIVE +@code{mach_port_allocate_name} creates a port. The new port is not a +member of any port set. It doesn't have any extant send or send-once +rights. Its make-send count is zero, its sequence number is zero, its +queue limit is @code{MACH_PORT_QLIMIT_DEFAULT}, and it has no queued +messages. @var{name} denotes the receive right for the new port. + +@var{task} does not hold send rights for the new port, only the receive +right. @code{mach_port_insert_right} and @code{mach_port_extract_right} +can be used to convert the receive right into a combined send/receive +right. + +@item MACH_PORT_RIGHT_PORT_SET +@code{mach_port_allocate_name} creates a port set. The new port set has +no members. 
+ +@item MACH_PORT_RIGHT_DEAD_NAME +@code{mach_port_allocate_name} creates a new dead name. The new dead +name has one user reference. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_VALUE} if @var{right} was invalid or @var{name} was +@code{MACH_PORT_NULL} or @code{MACH_PORT_DEAD}, @code{KERN_NAME_EXISTS} +if @var{name} was already in use for a port right and +@code{KERN_RESOURCE_SHORTAGE} if the kernel ran out of memory. + +The @code{mach_port_allocate_name} call is actually an RPC to +@var{task}, normally a send right for a task port, but potentially any +send right. In addition to the normal diagnostic return codes from the +call's server (normally the kernel), the call may return @code{mach_msg} +return codes. +@end deftypefun + + +@node Port Destruction +@subsection Port Destruction + +@deftypefun kern_return_t mach_port_deallocate (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}) +The function @code{mach_port_deallocate} releases a user reference for a +right in @var{task}'s IPC name space. It allows a task to release a +user reference for a send or send-once right without failing if the port +has died and the right is now actually a dead name. + +If @var{name} denotes a dead name, send right, or send-once right, then +the right loses one user reference. If it only had one user reference, +then the right is destroyed. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right and +@code{KERN_INVALID_RIGHT} if @var{name} denoted an invalid right. + +The @code{mach_port_deallocate} call is actually an RPC to +@var{task}, normally a send right for a task port, but potentially any +send right. 
In addition to the normal diagnostic return codes from the +call's server (normally the kernel), the call may return @code{mach_msg} +return codes. +@end deftypefun + +@deftypefun kern_return_t mach_port_destroy (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}) +The function @code{mach_port_destroy} deallocates all rights denoted by +a name. The name becomes immediately available for reuse. + +For most purposes, @code{mach_port_mod_refs} and +@code{mach_port_deallocate} are preferable. + +If @var{name} denotes a port set, then all members of the port set are +implicitly removed from the port set. + +If @var{name} denotes a receive right that is a member of a port set, +the receive right is implicitly removed from the port set. If there is +a port-destroyed request registered for the port, then the receive right +is not actually destroyed, but instead is sent in a port-destroyed +notification to the backup port. If there is no registered +port-destroyed request, remaining messages queued to the port are +destroyed and extant send and send-once rights turn into dead names. If +those send and send-once rights have dead-name requests registered, then +dead-name notifications are generated for them. + +If @var{name} denotes a send-once right, then the send-once right is +used to produce a send-once notification for the port. + +If @var{name} denotes a send-once, send, and/or receive right, and it +has a dead-name request registered, then the registered send-once right +is used to produce a port-deleted notification for the name. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right. + +The @code{mach_port_destroy} call is actually an RPC to +@var{task}, normally a send right for a task port, but potentially any +send right. 
In addition to the normal diagnostic return codes from the +call's server (normally the kernel), the call may return @code{mach_msg} +return codes. +@end deftypefun + + +@node Port Names +@subsection Port Names + +@deftypefun kern_return_t mach_port_names (@w{ipc_space_t @var{task}}, @w{mach_port_array_t *@var{names}}, @w{mach_msg_type_number_t *@var{ncount}}, @w{mach_port_type_array_t *@var{types}}, @w{mach_msg_type_number_t *@var{tcount}}) +The function @code{mach_port_names} returns information about +@var{task}'s port name space. For each name, it also returns what type +of rights @var{task} holds. (The same information returned by +@code{mach_port_type}.) @var{names} and @var{types} are arrays that are +automatically allocated when the reply message is received. The user +should @code{vm_deallocate} them when the data is no longer needed. + +@code{mach_port_names} will return in @var{names} the names of the +ports, port sets, and dead names in the task's port name space, in no +particular order and in @var{ncount} the number of names returned. It +will return in @var{types} the type of each corresponding name, which +indicates what kind of rights the task holds with that name. +@var{tcount} should be the same as @var{ncount}. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_RESOURCE_SHORTAGE} if the kernel ran out of memory. + +The @code{mach_port_names} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. 
+@end deftypefun + +@deftypefun kern_return_t mach_port_type (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_type_t *@var{ptype}}) +The function @code{mach_port_type} returns information about +@var{task}'s rights for a specific name in its port name space. The +returned @var{ptype} is a bitmask indicating what rights @var{task} +holds for the port, port set or dead name. The bitmask is composed of +the following bits: + +@table @code +@item MACH_PORT_TYPE_SEND +The name denotes a send right. + +@item MACH_PORT_TYPE_RECEIVE +The name denotes a receive right. + +@item MACH_PORT_TYPE_SEND_ONCE +The name denotes a send-once right. + +@item MACH_PORT_TYPE_PORT_SET +The name denotes a port set. + +@item MACH_PORT_TYPE_DEAD_NAME +The name is a dead name. + +@item MACH_PORT_TYPE_DNREQUEST +A dead-name request has been registered for the right. + +@item MACH_PORT_TYPE_MAREQUEST +A msg-accepted request for the right is pending. + +@item MACH_PORT_TYPE_COMPAT +The port right was created in the compatibility mode. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid and +@code{KERN_INVALID_NAME} if @var{name} did not denote a right. + +The @code{mach_port_type} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + +@deftypefun kern_return_t mach_port_rename (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{old_name}}, @w{mach_port_t @var{new_name}}) +The function @code{mach_port_rename} changes the name by which a port, +port set, or dead name is known to @var{task}. @var{old_name} is the +original name and @var{new_name} the new name for the port right. 
+@var{new_name} must not already be in use, and it can't be either of the +distinguished values @code{MACH_PORT_NULL} or @code{MACH_PORT_DEAD}. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{old_name} did not denote a right, +@code{KERN_INVALID_VALUE} if @var{new_name} was @code{MACH_PORT_NULL} or +@code{MACH_PORT_DEAD}, @code{KERN_NAME_EXISTS} if @var{new_name} +already denoted a right and @code{KERN_RESOURCE_SHORTAGE} if the kernel +ran out of memory. + +The @code{mach_port_rename} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + + +@node Port Rights +@subsection Port Rights + +@deftypefun kern_return_t mach_port_get_refs (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_right_t @var{right}}, @w{mach_port_urefs_t *@var{refs}}) +The function @code{mach_port_get_refs} returns the number of user +references a task has for a right. + +The @var{right} argument takes the following values: +@itemize @bullet +@item @code{MACH_PORT_RIGHT_SEND} +@item @code{MACH_PORT_RIGHT_RECEIVE} +@item @code{MACH_PORT_RIGHT_SEND_ONCE} +@item @code{MACH_PORT_RIGHT_PORT_SET} +@item @code{MACH_PORT_RIGHT_DEAD_NAME} +@end itemize + +If @var{name} denotes a right, but not the type of right specified, then +zero is returned. Otherwise a positive number of user references is +returned. Note that a name may simultaneously denote send and receive +rights. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_VALUE} if @var{right} was invalid and +@code{KERN_INVALID_NAME} if @var{name} did not denote a right. 
+ +The @code{mach_port_get_refs} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + +@deftypefun kern_return_t mach_port_mod_refs (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_right_t @var{right}}, @w{mach_port_delta_t @var{delta}}) +The function @code{mach_port_mod_refs} requests that the number of user +references a task has for a right be changed. This results in the right +being destroyed, if the number of user references is changed to zero. +The task holding the right is @var{task}, @var{name} should denote the +specified right. @var{right} denotes the type of right being modified. +@var{delta} is the signed change to the number of user references. + +The @var{right} argument takes the following values: +@itemize @bullet +@item @code{MACH_PORT_RIGHT_SEND} +@item @code{MACH_PORT_RIGHT_RECEIVE} +@item @code{MACH_PORT_RIGHT_SEND_ONCE} +@item @code{MACH_PORT_RIGHT_PORT_SET} +@item @code{MACH_PORT_RIGHT_DEAD_NAME} +@end itemize + +The number of user references for the right is changed by the amount +@var{delta}, subject to the following restrictions: port sets, receive +rights, and send-once rights may only have one user reference. The +resulting number of user references can't be negative. If the resulting +number of user references is zero, the effect is to deallocate the +right. For dead names and send rights, there is an +implementation-defined maximum number of user references. + +If the call destroys the right, then the effect is as described for +@code{mach_port_destroy}, with the exception that +@code{mach_port_destroy} simultaneously destroys all the rights denoted +by a name, while @code{mach_port_mod_refs} can only destroy one right. +The name will be available for reuse if it only denoted the one right. 
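The reference-count restrictions above can be modelled without any kernel at all. The following is a minimal sketch, not the kernel's implementation: every `sketch_` name is hypothetical, the maximum of 65535 is an assumed stand-in for the implementation-defined limit, and reporting an attempt to push a single-reference right above one as an invalid value is an assumption that the text above does not confirm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MACH_PORT_RIGHT_* values. */
enum sketch_right {
    SKETCH_SEND, SKETCH_RECEIVE, SKETCH_SEND_ONCE,
    SKETCH_PORT_SET, SKETCH_DEAD_NAME
};

/* Hypothetical stand-ins for the kern_return_t codes named in the text. */
enum sketch_result {
    SKETCH_SUCCESS, SKETCH_INVALID_VALUE, SKETCH_UREFS_OVERFLOW
};

/* Assumed value; the real maximum is implementation-defined. */
#define SKETCH_UREFS_MAX 65535u

/* Apply a signed delta to a user-reference count under the stated rules.
   On success *urefs is updated; zero means the right would be destroyed.
   On failure *urefs is left unchanged. */
static enum sketch_result
sketch_mod_refs(enum sketch_right right, unsigned int *urefs, int delta)
{
    long long result;

    assert(urefs != NULL);
    result = (long long)*urefs + delta;

    /* The resulting count can't be negative. */
    if (result < 0)
        return SKETCH_INVALID_VALUE;

    if (right == SKETCH_SEND || right == SKETCH_DEAD_NAME) {
        /* Send rights and dead names may accumulate references. */
        if (result > (long long)SKETCH_UREFS_MAX)
            return SKETCH_UREFS_OVERFLOW;
    } else if (result > 1) {
        /* Port sets, receive and send-once rights hold one reference
           (the return code chosen here is an assumption). */
        return SKETCH_INVALID_VALUE;
    }

    *urefs = (unsigned int)result;
    return SKETCH_SUCCESS;
}
```

A caller would treat a resulting count of zero as "the right is gone"; whether the name itself becomes reusable depends on whether it denoted other rights as well.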
+ +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_VALUE} if @var{right} was invalid or the +user-reference count would become negative, @code{KERN_INVALID_NAME} if +@var{name} did not denote a right, @code{KERN_INVALID_RIGHT} if +@var{name} denoted a right, but not the specified right and +@code{KERN_UREFS_OVERFLOW} if the user-reference count would overflow. + +The @code{mach_port_mod_refs} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + + +@node Ports and other Tasks +@subsection Ports and other Tasks + +@deftypefun kern_return_t mach_port_insert_right (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_t @var{right}}, @w{mach_msg_type_name_t @var{right_type}}) +The function @code{mach_port_insert_right} inserts into @var{task} the +caller's right for a port, using a specified name for the right in the +target task. + +The specified @var{name} can't be one of the reserved values +@code{MACH_PORT_NULL} or @code{MACH_PORT_DEAD}. The @var{right} can't +be @code{MACH_PORT_NULL} or @code{MACH_PORT_DEAD}. + +The argument @var{right_type} specifies a right to be inserted and how +that right should be extracted from the caller. It should be a value +appropriate for @var{msgt_name}; see @code{mach_msg}. @c XXX cross ref + +If @var{right_type} is @code{MACH_MSG_TYPE_MAKE_SEND}, +@code{MACH_MSG_TYPE_MOVE_SEND}, or @code{MACH_MSG_TYPE_COPY_SEND}, then +a send right is inserted. If the target already holds send or receive +rights for the port, then @var{name} should denote those rights in the +target. Otherwise, @var{name} should be unused in the target. 
If the +target already has send rights, then those send rights gain an +additional user reference. Otherwise, the target gains a send right, +with a user reference count of one. + +If @var{right_type} is @code{MACH_MSG_TYPE_MAKE_SEND_ONCE} or +@code{MACH_MSG_TYPE_MOVE_SEND_ONCE}, then a send-once right is inserted. +The name should be unused in the target. The target gains a send-once +right. + +If @var{right_type} is @code{MACH_MSG_TYPE_MOVE_RECEIVE}, then a receive +right is inserted. If the target already holds send rights for the +port, then @var{name} should denote those rights in the target. Otherwise, +@var{name} should be unused in the target. The receive right is moved into +the target task. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_VALUE} if @var{right} was not a port right or +@var{name} was @code{MACH_PORT_NULL} or @code{MACH_PORT_DEAD}, +@code{KERN_NAME_EXISTS} if @var{name} already denoted a right, +@code{KERN_INVALID_CAPABILITY} if @var{right} was @code{MACH_PORT_NULL} +or @code{MACH_PORT_DEAD}, @code{KERN_RIGHT_EXISTS} if @var{task} already +had rights for the port, with a different name, +@code{KERN_UREFS_OVERFLOW} if the user-reference count would overflow +and @code{KERN_RESOURCE_SHORTAGE} if the kernel ran out of memory. + +The @code{mach_port_insert_right} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. 
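The rules relating @var{right_type} to the inserted right and to the state of @var{name} in the target can be summarised as a small decision table. This is only an illustrative sketch: the `sketch_` enumerators and functions are hypothetical stand-ins for the `MACH_MSG_TYPE_*` values, and nothing here calls into Mach.

```c
#include <assert.h>

/* Hypothetical stand-ins for the MACH_MSG_TYPE_* values. */
enum sketch_right_type {
    SKETCH_MAKE_SEND, SKETCH_MOVE_SEND, SKETCH_COPY_SEND,
    SKETCH_MAKE_SEND_ONCE, SKETCH_MOVE_SEND_ONCE,
    SKETCH_MOVE_RECEIVE
};

/* The kind of right that ends up in the target task. */
enum sketch_inserted {
    SKETCH_INSERTS_SEND, SKETCH_INSERTS_SEND_ONCE, SKETCH_INSERTS_RECEIVE
};

static enum sketch_inserted
sketch_inserted_right(enum sketch_right_type type)
{
    switch (type) {
    case SKETCH_MAKE_SEND:
    case SKETCH_MOVE_SEND:
    case SKETCH_COPY_SEND:
        return SKETCH_INSERTS_SEND;
    case SKETCH_MAKE_SEND_ONCE:
    case SKETCH_MOVE_SEND_ONCE:
        return SKETCH_INSERTS_SEND_ONCE;
    case SKETCH_MOVE_RECEIVE:
        return SKETCH_INSERTS_RECEIVE;
    }
    assert(!"unknown right type");
    return SKETCH_INSERTS_SEND;
}

/* Returns 1 if the name must denote the target's existing rights for the
   port, 0 if the name must be unused in the target. */
static int
sketch_name_must_exist(enum sketch_right_type type,
                       int target_holds_send, int target_holds_receive)
{
    switch (sketch_inserted_right(type)) {
    case SKETCH_INSERTS_SEND:
        /* Coalesces with existing send or receive rights. */
        return target_holds_send || target_holds_receive;
    case SKETCH_INSERTS_RECEIVE:
        /* Coalesces only with existing send rights. */
        return target_holds_send;
    case SKETCH_INSERTS_SEND_ONCE:
        /* Send-once rights always need a fresh name. */
        return 0;
    }
    return 0;
}
```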
+@end deftypefun + +@deftypefun kern_return_t mach_port_extract_right (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_msg_type_name_t @var{desired_type}}, @w{mach_port_t *@var{right}}, @w{mach_msg_type_name_t *@var{acquired_type}}) +The function @code{mach_port_extract_right} extracts a port right from +the target @var{task} and returns it to the caller as if the task sent +the right voluntarily, using @var{desired_type} as the value of +@var{msgt_name}. @xref{Mach Message Call}. + +The returned value of @var{acquired_type} will be +@code{MACH_MSG_TYPE_PORT_SEND} if a send right is extracted, +@code{MACH_MSG_TYPE_PORT_RECEIVE} if a receive right is extracted, and +@code{MACH_MSG_TYPE_PORT_SEND_ONCE} if a send-once right is extracted. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right, +@code{KERN_INVALID_RIGHT} if @var{name} denoted a right, but an invalid one, +and @code{KERN_INVALID_VALUE} if @var{desired_type} was invalid. + +The @code{mach_port_extract_right} call is actually an RPC to +@var{task}, normally a send right for a task port, but potentially any +send right. In addition to the normal diagnostic return codes from the +call's server (normally the kernel), the call may return @code{mach_msg} +return codes. +@end deftypefun + + +@node Receive Rights +@subsection Receive Rights + +@deftp {Data type} mach_port_seqno_t +The @code{mach_port_seqno_t} data type is an @code{unsigned int} which +contains the sequence number of a port. +@end deftp + +@deftp {Data type} mach_port_mscount_t +The @code{mach_port_mscount_t} data type is an @code{unsigned int} which +contains the make-send count for a port. +@end deftp + +@deftp {Data type} mach_port_msgcount_t +The @code{mach_port_msgcount_t} data type is an @code{unsigned int} which +contains a number of messages. 
+@end deftp + +@deftp {Data type} mach_port_rights_t +The @code{mach_port_rights_t} data type is an @code{unsigned int} which +contains a number of rights for a port. +@end deftp + +@deftp {Data type} mach_port_status_t +This structure contains some status information about a port, which can +be queried with @code{mach_port_get_receive_status}. It has the following +members: + +@table @code +@item mach_port_t mps_pset +The containing port set. + +@item mach_port_seqno_t mps_seqno +The sequence number. + +@item mach_port_mscount_t mps_mscount +The make-send count. + +@item mach_port_msgcount_t mps_qlimit +The maximum number of messages in the queue. + +@item mach_port_msgcount_t mps_msgcount +The current number of messages in the queue. + +@item mach_port_rights_t mps_sorights +The number of send-once rights that exist. + +@item boolean_t mps_srights +@code{TRUE} if send rights exist. + +@item boolean_t mps_pdrequest +@code{TRUE} if port-deleted notification is requested. + +@item boolean_t mps_nsrequest +@code{TRUE} if no-senders notification is requested. +@end table +@end deftp + +@deftypefun kern_return_t mach_port_get_receive_status (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_status_t *@var{status}}) +The function @code{mach_port_get_receive_status} returns the current +status of the specified receive right. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right and +@code{KERN_INVALID_RIGHT} if @var{name} denoted a right, but not a +receive right. + +The @code{mach_port_get_receive_status} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. 
+@end deftypefun + +@deftypefun kern_return_t mach_port_set_mscount (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_mscount_t @var{mscount}}) +The function @code{mach_port_set_mscount} changes the make-send count of +@var{task}'s receive right named @var{name} to @var{mscount}. All +values for @var{mscount} are valid. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right and +@code{KERN_INVALID_RIGHT} if @var{name} denoted a right, but not a +receive right. + +The @code{mach_port_set_mscount} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + +@deftypefun kern_return_t mach_port_set_qlimit (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_msgcount_t @var{qlimit}}) +The function @code{mach_port_set_qlimit} changes the queue limit of +@var{task}'s receive right named @var{name} to @var{qlimit}. Valid +values for @var{qlimit} are between zero and +@code{MACH_PORT_QLIMIT_MAX}, inclusive. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right, +@code{KERN_INVALID_RIGHT} if @var{name} denoted a right, but not a +receive right and @code{KERN_INVALID_VALUE} if @var{qlimit} was invalid. + +The @code{mach_port_set_qlimit} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. 
+@end deftypefun + +@deftypefun kern_return_t mach_port_set_seqno (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_seqno_t @var{seqno}}) +The function @code{mach_port_set_seqno} changes the sequence number of +@var{task}'s receive right named @var{name} to @var{seqno}. All +sequence number values are valid. The next message received from the +port will be stamped with the specified sequence number. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right and +@code{KERN_INVALID_RIGHT} if @var{name} denoted a right, but not a +receive right. + +The @code{mach_port_set_seqno} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + + +@node Port Sets +@subsection Port Sets + +@deftypefun kern_return_t mach_port_get_set_status (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_port_array_t *@var{members}}, @w{mach_msg_type_number_t *@var{count}}) +The function @code{mach_port_get_set_status} returns the members of a +port set. @var{members} is an array that is automatically allocated +when the reply message is received. The user should +@code{vm_deallocate} it when the data is no longer needed. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right, +@code{KERN_INVALID_RIGHT} if @var{name} denoted a right, but not a +port set and @code{KERN_RESOURCE_SHORTAGE} if the kernel ran out of +memory. + +The @code{mach_port_get_set_status} call is actually an RPC to +@var{task}, normally a send right for a task port, but potentially any +send right. 
In addition to the normal diagnostic return codes from the +call's server (normally the kernel), the call may return @code{mach_msg} +return codes. +@end deftypefun + +@deftypefun kern_return_t mach_port_move_member (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{member}}, @w{mach_port_t @var{after}}) +The function @code{mach_port_move_member} moves the receive right +@var{member} into the port set @var{after}. If the receive right is +already a member of another port set, it is removed from that set first +(the whole operation is atomic). If the port set is +@code{MACH_PORT_NULL}, then the receive right is not put into a port +set, but removed from its current port set. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_NAME} if @var{member} or @var{after} did not denote a +right, @code{KERN_INVALID_RIGHT} if @var{member} denoted a right, but +not a receive right or @var{after} denoted a right, but not a port set, +and @code{KERN_NOT_IN_SET} if @var{after} was @code{MACH_PORT_NULL}, but +@var{member} wasn't currently in a port set. + +The @code{mach_port_move_member} call is actually an RPC to @var{task}, +normally a send right for a task port, but potentially any send right. +In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + + +@node Request Notifications +@subsection Request Notifications + +@deftypefun kern_return_t mach_port_request_notification (@w{ipc_space_t @var{task}}, @w{mach_port_t @var{name}}, @w{mach_msg_id_t @var{variant}}, @w{mach_port_mscount_t @var{sync}}, @w{mach_port_t @var{notify}}, @w{mach_msg_type_name_t @var{notify_type}}, @w{mach_port_t *@var{previous}}) +The function @code{mach_port_request_notification} registers a request +for a notification and supplies the send-once right @var{notify} to +which the notification will be sent. 
The @var{notify_type} denotes the +IPC type for the send-once right, which can be +@code{MACH_MSG_TYPE_MAKE_SEND_ONCE} or +@code{MACH_MSG_TYPE_MOVE_SEND_ONCE}. It is an atomic swap, returning +the previously registered send-once right (or @code{MACH_PORT_NULL} for +none) in @var{previous}. A previous notification request may be +cancelled by providing @code{MACH_PORT_NULL} for @var{notify}. + +The @var{variant} argument takes the following values: + +@table @code +@item MACH_NOTIFY_PORT_DESTROYED +@var{sync} must be zero. The @var{name} must specify a receive right, +and the call requests a port-destroyed notification for the receive +right. If the receive right were to have been destroyed, say by +@code{mach_port_destroy}, then instead the receive right will be sent in +a port-destroyed notification to the registered send-once right. + +@item MACH_NOTIFY_DEAD_NAME +The call requests a dead-name notification. @var{name} specifies send, +receive, or send-once rights for a port. If the port is destroyed (and +the right remains, becoming a dead name), then a dead-name notification +which carries the name of the right will be sent to the registered +send-once right. If @var{notify} is not null and sync is non-zero, the +name may specify a dead name, and a dead-name notification is +immediately generated. + +Whenever a dead-name notification is generated, the user reference count +of the dead name is incremented. For example, a send right with two +user refs has a registered dead-name request. If the port is destroyed, +the send right turns into a dead name with three user refs (instead of +two), and a dead-name notification is generated. + +If the name is made available for reuse, perhaps because of +@code{mach_port_destroy} or @code{mach_port_mod_refs}, or the name +denotes a send-once right which has a message sent to it, then the +registered send-once right is used to generate a port-deleted +notification. 
+ +@item MACH_NOTIFY_NO_SENDERS +The call requests a no-senders notification. @var{name} must specify a +receive right. If @var{notify} is not null, and the receive right's +make-send count is greater than or equal to the @var{sync} value, and it has +no extant send rights, then an immediate no-senders notification is +generated. Otherwise the notification is generated when the receive +right next loses its last extant send right. In either case, any +previously registered send-once right is returned. + +The no-senders notification carries the value the port's make-send count +had when it was generated. The make-send count is incremented whenever +@code{MACH_MSG_TYPE_MAKE_SEND} is used to create a new send right from +the receive right. The make-send count is reset to zero when the +receive right is carried in a message. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_TASK} if @var{task} was invalid, +@code{KERN_INVALID_VALUE} if @var{variant} was invalid, +@code{KERN_INVALID_NAME} if @var{name} did not denote a right, +@code{KERN_INVALID_RIGHT} if @var{name} denoted an invalid right and +@code{KERN_INVALID_CAPABILITY} if @var{notify} was invalid. + +When using @code{MACH_NOTIFY_PORT_DESTROYED}, the function returns +@code{KERN_INVALID_VALUE} if @var{sync} wasn't zero. + +When using @code{MACH_NOTIFY_DEAD_NAME}, the function returns +@code{KERN_RESOURCE_SHORTAGE} if the kernel ran out of memory, +@code{KERN_INVALID_ARGUMENT} if @var{name} denotes a dead name, but +@var{sync} is zero or @var{notify} is @code{MACH_PORT_NULL}, and +@code{KERN_UREFS_OVERFLOW} if @var{name} denotes a dead name, but +generating an immediate dead-name notification would overflow the name's +user-reference count. + +The @code{mach_port_request_notification} call is actually an RPC to +@var{task}, normally a send right for a task port, but potentially any +send right. 
In addition to the normal diagnostic return codes from the +call's server (normally the kernel), the call may return @code{mach_msg} +return codes. +@end deftypefun + +@c The inherited ports concept is not used in the Hurd, +@c and so the _SLOT macros are not defined in GNU Mach. + +@c @node Inherited Ports +@c @subsection Inherited Ports + +@c @deftypefun kern_return_t mach_ports_register (@w{task_t @var{target_task}, @w{port_array_t @var{init_port_set}}, @w{int @var{init_port_array_count}}) +@c @deftypefunx kern_return_t mach_ports_lookup (@w{task_t @var{target_task}, @w{port_array_t *@var{init_port_set}}, @w{int *@var{init_port_array_count}}) +@c @code{mach_ports_register} manipulates the inherited ports array, +@c @code{mach_ports_lookup} is used to acquire specific parent ports. +@c @var{target_task} is the task to be affected. @var{init_port_set} is an +@c array of system ports to be registered, or returned. Although the array +@c size is given as variable, the kernel will only accept a limited number +@c of ports. @var{init_port_array_count} is the number of ports returned +@c in @var{init_port_set}. + +@c @code{mach_ports_register} registers an array of well-known system ports +@c with the kernel on behalf of a specific task. Currently the ports to be +@c registered are: the port to the Network Name Server, the port to the +@c Environment Manager, and a port to the Service server. These port +@c values must be placed in specific slots in the init_port_set. The slot +@c numbers are given by the global constants defined in @file{mach_init.h}: +@c @code{NAME_SERVER_SLOT}, @code{ENVIRONMENT_SLOT}, and +@c @code{SERVICE_SLOT}. These ports may later be retrieved with +@c @code{mach_ports_lookup}. + +@c When a new task is created (see @code{task_create}), the child task will +@c be given access to these ports. Only port send rights may be +@c registered. 
Furthermore, the number of ports which may be registered is +@c fixed and given by the global constant @code{MACH_PORT_SLOTS_USED}. +@c Attempts to register too many ports will fail. + +@c It is intended that this mechanism be used only for task initialization, +@c and then only by runtime support modules. A parent task has three +@c choices in passing these system ports to a child task. Most commonly it +@c can do nothing and its child will inherit access to the same +@c @var{init_port_set} that the parent has; or a parent task may register a +@c set of ports it wishes to have passed to all of its children by calling +@c @code{mach_ports_register} using its task port; or it may make necessary +@c modifications to the set of ports it wishes its child to see, and then +@c register those ports using the child's task port prior to starting the +@c child's thread(s). The @code{mach_ports_lookup} call which is done by +@c @code{mach_init} in the child task will acquire these initial ports for +@c the child. + +@c Tasks other than the Network Name Server and the Environment Manager +@c should not need access to the Service port. The Network Name Server port +@c is the same for all tasks on a given machine. The Environment port is +@c the only port likely to have different values for different tasks. + +@c Since the number of ports which may be registered is limited, ports +@c other than those used by the runtime system to initialize a task should +@c be passed to children either through an initial message, or through the +@c Network Name Server for public ports, or the Environment Manager for +@c private ports. + +@c The function returns @code{KERN_SUCCESS} if the memory was allocated, +@c and @code{KERN_INVALID_ARGUMENT} if an attempt was made to register more +@c ports than the current kernel implementation allows. 
+@c @end deftypefun + + +@node Virtual Memory Interface +@chapter Virtual Memory Interface + +@cindex virtual memory map port +@cindex port representing a virtual memory map +@deftp {Data type} vm_task_t +This is a @code{task_t} (and as such a @code{mach_port_t}), which holds +a port name associated with a port that represents a virtual memory map +in the kernel. A virtual memory map is used by the kernel to manage +the address space of a task. The virtual memory map doesn't get a port +name of its own. Instead the port name of the task provided with the +virtual memory is used to name the virtual memory map of the task (as is +indicated by the fact that the type of @code{vm_task_t} is actually +@code{task_t}). + +The virtual memory maps of tasks are the only ones accessible outside of +the kernel. +@end deftp + +@menu +* Memory Allocation:: Allocation of new virtual memory. +* Memory Deallocation:: Freeing unused virtual memory. +* Data Transfer:: Reading, writing and copying memory. +* Memory Attributes:: Tweaking memory regions. +* Mapping Memory Objects:: How to map memory objects. +* Memory Statistics:: How to get statistics about memory usage. +@end menu + +@node Memory Allocation +@section Memory Allocation + +@deftypefun kern_return_t vm_allocate (@w{vm_task_t @var{target_task}}, @w{vm_address_t *@var{address}}, @w{vm_size_t @var{size}}, @w{boolean_t @var{anywhere}}) +The function @code{vm_allocate} allocates a region of virtual memory, +placing it in the specified @var{target_task}'s address space. + +The starting address is @var{address}. If the @var{anywhere} option is +false, an attempt is made to allocate virtual memory starting at this +virtual address. If this address is not at the beginning of a virtual +page, it will be rounded down to one. If there is not enough space at +this address, no memory will be allocated. 
If the @var{anywhere} option +is true, the input value of this address will be ignored, and the space +will be allocated wherever it is available. In either case, the address +at which memory was actually allocated will be returned in +@var{address}. + +@var{size} is the number of bytes to allocate (rounded by the system in +a machine dependent way to an integral number of virtual pages). + +If @var{anywhere} is true, the kernel should find and allocate any +region of the specified size, and return the address of the resulting +region in @var{address}, rounded to a virtual page boundary if there +is sufficient space. + +The physical memory is not actually allocated until the new virtual +memory is referenced. By default, the kernel rounds all addresses down +to the nearest page boundary and all memory sizes up to the nearest page +size. The global variable @code{vm_page_size} contains the page size. +@code{mach_task_self} returns the value of the current task port which +should be used as the @var{target_task} argument in order to allocate +memory in the caller's address space. For languages other than C, these +values can be obtained by the calls @code{vm_statistics} and +@code{mach_task_self}. Initially, the pages of allocated memory will be +protected to allow all forms of access, and will be inherited in child +tasks as a copy. Subsequent calls to @code{vm_protect} and +@code{vm_inherit} may be used to change these properties. The allocated +region is always zero-filled. + +The function returns @code{KERN_SUCCESS} if the memory was successfully +allocated, @code{KERN_INVALID_ADDRESS} if an illegal address was +specified and @code{KERN_NO_SPACE} if there was not enough space left to +satisfy the request. 
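The rounding rule stated above (addresses down to a page boundary, sizes up to a whole number of pages) is plain arithmetic and can be tried outside the kernel. The sketch below is illustrative only: the `sketch_` helpers are hypothetical names, and it assumes the page size is a power of two, which the text does not state. Real code would take the page size from @code{vm_page_size} rather than pass it in.

```c
#include <assert.h>
#include <stddef.h>

/* Nonzero if page_size is a power of two (an assumption of this sketch). */
static int
sketch_is_power_of_two(size_t page_size)
{
    return page_size != 0 && (page_size & (page_size - 1)) == 0;
}

/* Round an address down to the start of the page that contains it. */
static size_t
sketch_trunc_page(size_t address, size_t page_size)
{
    assert(sketch_is_power_of_two(page_size));
    return address & ~(page_size - 1);
}

/* Round a size up to a whole number of pages. */
static size_t
sketch_round_page(size_t size, size_t page_size)
{
    assert(sketch_is_power_of_two(page_size));
    return (size + page_size - 1) & ~(page_size - 1);
}
```

With an assumed page size of 4096 bytes, a request for one byte at address 0x1234 would cover the single page starting at 0x1000.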
+@end deftypefun + + +@node Memory Deallocation +@section Memory Deallocation + +@deftypefun kern_return_t vm_deallocate (@w{vm_task_t @var{target_task}}, @w{vm_address_t @var{address}}, @w{vm_size_t @var{size}}) +@code{vm_deallocate} relinquishes access to a region of the address space +of @var{target_task}, causing further access to that memory to fail. This +address range will be available for reallocation. @var{address} is the +starting address, which will be rounded down to a page boundary. +@var{size} is the number of bytes to deallocate, which will be rounded +up to give a page boundary. Note that because of the rounding to +virtual page boundaries, more than @var{size} bytes may be deallocated. +Use @code{vm_page_size} or @code{vm_statistics} to find out the current +virtual page size. + +This call may be used to deallocate memory that was passed to a task in a +message (via out of line data). In that case, the rounding should cause +no trouble, since the region of memory was allocated as a set of pages. + +The @code{vm_deallocate} call affects only the task specified by the +@var{target_task}. Other tasks which may have access to this memory may +continue to reference it. + +The function returns @code{KERN_SUCCESS} if the memory was successfully +deallocated and @code{KERN_INVALID_ADDRESS} if an illegal or +non-allocated address was specified. +@end deftypefun + + +@node Data Transfer +@section Data Transfer + +@deftypefun kern_return_t vm_read (@w{vm_task_t @var{target_task}}, @w{vm_address_t @var{address}}, @w{vm_size_t @var{size}}, @w{vm_offset_t *@var{data}}, @w{mach_msg_type_number_t *@var{data_count}}) +The function @code{vm_read} allows one task's virtual memory to be read +by another task. The @var{target_task} is the task whose memory is to +be read. @var{address} is the first address to be read and must be on a +page boundary. @var{size} is the number of bytes of data to be read and +must be an integral number of pages. 
@var{data} is the array of data +copied from the given task, and @var{data_count} is the size of the data +array in bytes (will be an integral number of pages). + +Note that the data array is returned in a newly allocated region; the +task reading the data should @code{vm_deallocate} this region when it is +done with the data. + +The function returns @code{KERN_SUCCESS} if the memory was successfully +read, @code{KERN_INVALID_ADDRESS} if an illegal or non-allocated address +was specified or there were not @var{size} bytes of data following the +address, @code{KERN_INVALID_ARGUMENT} if the address does not start on a +page boundary or the size is not an integral number of pages, +@code{KERN_PROTECTION_FAILURE} if the address region in the target task +is protected against reading and @code{KERN_NO_SPACE} if there was not +enough room in the caller's virtual memory to allocate space for the data +to be returned. +@end deftypefun + +@deftypefun kern_return_t vm_write (@w{vm_task_t @var{target_task}}, @w{vm_address_t @var{address}}, @w{vm_offset_t @var{data}}, @w{mach_msg_type_number_t @var{data_count}}) +The function @code{vm_write} allows a task to write to the virtual memory +of @var{target_task}. @var{address} is the starting address in the task to +be affected. @var{data} is an array of bytes to be written, and +@var{data_count} the size of the @var{data} array. + +The current implementation requires that @var{address}, @var{data} and +@var{data_count} all be page-aligned. Otherwise, +@code{KERN_INVALID_ARGUMENT} is returned. + +The function returns @code{KERN_SUCCESS} if the memory was successfully +written, @code{KERN_INVALID_ADDRESS} if an illegal or non-allocated +address was specified or there were not @var{data_count} bytes of +allocated memory starting at @var{address} and +@code{KERN_PROTECTION_FAILURE} if the address region in the target task +is protected against writing. 
+@end deftypefun + +@deftypefun kern_return_t vm_copy (@w{vm_task_t @var{target_task}}, @w{vm_address_t @var{source_address}}, @w{vm_size_t @var{count}}, @w{vm_offset_t @var{dest_address}}) +The function @code{vm_copy} causes the source memory range to be copied +to the destination address. The source and destination memory ranges +may overlap. The destination address range must already be allocated +and writable; the source range must be readable. + +@code{vm_copy} is equivalent to @code{vm_read} followed by +@code{vm_write}. + +The current implementation requires that @var{source_address}, @var{count} and +@var{dest_address} all be page-aligned. Otherwise, +@code{KERN_INVALID_ARGUMENT} is returned. + +The function returns @code{KERN_SUCCESS} if the memory was successfully +copied, @code{KERN_INVALID_ADDRESS} if an illegal or non-allocated +address was specified or there was insufficient memory allocated at one +of the addresses and @code{KERN_PROTECTION_FAILURE} if the destination +region was not writable or the source region was not readable. +@end deftypefun + + +@node Memory Attributes +@section Memory Attributes + +@deftypefun kern_return_t vm_region (@w{vm_task_t @var{target_task}}, @w{vm_address_t *@var{address}}, @w{vm_size_t *@var{size}}, @w{vm_prot_t *@var{protection}}, @w{vm_prot_t *@var{max_protection}}, @w{vm_inherit_t *@var{inheritance}}, @w{boolean_t *@var{shared}}, @w{memory_object_name_t *@var{object_name}}, @w{vm_offset_t *@var{offset}}) +The function @code{vm_region} returns a description of the specified +region of @var{target_task}'s virtual address space. @code{vm_region} +begins at @var{address} and looks forward through memory until it comes +to an allocated region. If @var{address} is within a region, then that region +is used. Various bits of information about the region are returned. If +@var{address} was not within a region, then @var{address} is set to the +start of the first region which follows the incoming value. 
In this way +an entire address space can be scanned. + +The @var{size} returned is the size of the located region in bytes. +@var{protection} is the current protection of the region, +@var{max_protection} is the maximum allowable protection for this +region. @var{inheritance} is the inheritance attribute for this region. +@var{shared} tells if the region is shared or not. The port +@var{object_name} identifies the memory object associated with this +region, and @var{offset} is the offset into the pager object that this +region begins at. +@c XXX cross ref pager_init + +The function returns @code{KERN_SUCCESS} if the memory region was +successfully located and the information returned and @code{KERN_NO_SPACE} if +there is no region at or above @var{address} in the specified task. +@end deftypefun + +@deftypefun kern_return_t vm_protect (@w{vm_task_t @var{target_task}}, @w{vm_address_t @var{address}}, @w{vm_size_t @var{size}}, @w{boolean_t @var{set_maximum}}, @w{vm_prot_t @var{new_protection}}) +The function @code{vm_protect} sets the virtual memory access privileges +for a range of allocated addresses in @var{target_task}'s virtual +address space. The protection argument describes a combination of read, +write, and execute accesses that should be @emph{permitted}. + +@var{address} is the starting address, which will be rounded down to a +page boundary. @var{size} is the size in bytes of the region for which +protection is to change, and will be rounded up to give a page boundary. +If @var{set_maximum} is set, make the protection change apply to the +maximum protection associated with this address range; otherwise, the +current protection on this range is changed. If the maximum protection +is reduced below the current protection, both will be changed to reflect +the new maximum. @var{new_protection} is the new protection value for +this region; a set of: @code{VM_PROT_READ}, @code{VM_PROT_WRITE}, +@code{VM_PROT_EXECUTE}. 
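The interaction between the current and the maximum protection can be modelled in a few lines of C. This is a sketch of the semantics as described here, not kernel code: `struct region_prot`, `model_protect` and the `PROT_*_BIT` constants are hypothetical names introduced for illustration, and the return value of -1 stands in for the `KERN_PROTECTION_FAILURE` case listed among the return values of `vm_protect`.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the VM_PROT_* bits; the values are
   chosen for this sketch only. */
enum { PROT_READ_BIT = 1, PROT_WRITE_BIT = 2, PROT_EXEC_BIT = 4 };

/* Hypothetical model of the two protection values kept per region. */
struct region_prot {
    unsigned cur;   /* current protection */
    unsigned max;   /* maximum protection */
};

/* Returns 0 on success, -1 if new_prot asks for an access that the
   existing maximum protection does not permit. */
int model_protect(struct region_prot *r, int set_maximum, unsigned new_prot)
{
    if (r == NULL || (new_prot & ~r->max) != 0)
        return -1;
    if (set_maximum) {
        /* Lowering the maximum also clips the current protection. */
        r->max = new_prot;
        r->cur &= new_prot;
    } else {
        r->cur = new_prot;
    }
    return 0;
}
```

In this model, lowering the maximum of a read/write region to read-only also makes the current protection read-only, and a later attempt to restore write access is refused.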
+ +The enforcement of virtual memory protection is machine-dependent. +Nominally read access requires @code{VM_PROT_READ} permission, write +access requires @code{VM_PROT_WRITE} permission, and execute access +requires @code{VM_PROT_EXECUTE} permission. However, some combinations +of access rights may not be supported. In particular, the kernel +interface allows write access to require @code{VM_PROT_READ} and +@code{VM_PROT_WRITE} permission and execute access to require +@code{VM_PROT_READ} permission. + +The function returns @code{KERN_SUCCESS} if the memory was successfully +protected, @code{KERN_INVALID_ADDRESS} if an illegal or non-allocated +address was specified and @code{KERN_PROTECTION_FAILURE} if an attempt +was made to increase the current or maximum protection beyond the +existing maximum protection value. +@end deftypefun + +@deftypefun kern_return_t vm_inherit (@w{vm_task_t @var{target_task}}, @w{vm_address_t @var{address}}, @w{vm_size_t @var{size}}, @w{vm_inherit_t @var{new_inheritance}}) +The function @code{vm_inherit} specifies how a region of +@var{target_task}'s address space is to be passed to child tasks at the +time of task creation. Inheritance is an attribute of virtual pages, so +@var{address} to start from will be rounded down to a page boundary and +@var{size}, the size in bytes of the region for which inheritance is to +change, will be rounded up to give a page boundary. How this memory is +to be inherited in child tasks is specified by @var{new_inheritance}. +Inheritance is specified by using one of the following three values: + +@table @code +@item VM_INHERIT_SHARE +Child tasks will share this memory with this task. + +@item VM_INHERIT_COPY +Child tasks will receive a copy of this region. + +@item VM_INHERIT_NONE +This region will be absent from child tasks. +@end table + +Setting @code{vm_inherit} to @code{VM_INHERIT_SHARE} and forking a child +task is the only way two Mach tasks can share physical memory. 
Remember +that all the threads of a given task share all the same memory. + +The function returns @code{KERN_SUCCESS} if the memory inheritance was +successfully set and @code{KERN_INVALID_ADDRESS} if an illegal or +non-allocated address was specified. +@end deftypefun + +@deftypefun kern_return_t vm_wire (@w{host_priv_t @var{host_priv}}, @w{vm_task_t @var{target_task}}, @w{vm_address_t @var{address}}, @w{vm_size_t @var{size}}, @w{vm_prot_t @var{access}}) +The function @code{vm_wire} allows privileged applications to control +memory pageability. @var{host_priv} is the privileged host port for the +host on which @var{target_task} resides. @var{address} is the starting +address, which will be rounded down to a page boundary. @var{size} is +the size in bytes of the region for which pageability is to change, and +will be rounded up to give a page boundary. @var{access} specifies the +types of accesses that must not cause page faults. + +The semantics of a successful @code{vm_wire} operation are that memory +in the specified range will not cause page faults for any accesses +included in @var{access}. Data memory can be made non-pageable (wired) with an +@var{access} argument of @code{VM_PROT_READ | VM_PROT_WRITE}. A special case +is that @code{VM_PROT_NONE} makes the memory pageable. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_HOST} if @var{host_priv} was not the privileged host +port, @code{KERN_INVALID_TASK} if @var{target_task} was not a valid task, +@code{KERN_INVALID_VALUE} if @var{access} specified an invalid access +mode, @code{KERN_FAILURE} if some memory in the specified range is not +present or has an inappropriate protection value, and +@code{KERN_INVALID_ARGUMENT} if unwiring (@var{access} is +@code{VM_PROT_NONE}) and the memory is not already wired. + +The @code{vm_wire} call is actually an RPC to @var{host_priv}, normally +a send right for a privileged host port, but potentially any send right. 
+In addition to the normal diagnostic return codes from the call's server +(normally the kernel), the call may return @code{mach_msg} return codes. +@end deftypefun + +@deftypefun kern_return_t vm_machine_attribute (@w{vm_task_t @var{task}}, @w{vm_address_t @var{address}}, @w{vm_size_t @var{size}}, @w{vm_prot_t @var{access}}, @w{vm_machine_attribute_t @var{attribute}}, @w{vm_machine_attribute_val_t @var{value}}) +The function @code{vm_machine_attribute} specifies machine-specific +attributes for a VM mapping, such as cachability, migrability and +replicability. This is used on machines that allow the user control +over the cache (this is the case for MIPS architectures) or placement of +memory pages as in NUMA architectures (Non-Uniform Memory Access time) +such as the IBM ACE multiprocessor. + +Machine-specific attributes can be considered additions to the +machine-independent ones such as protection and inheritance, but they +are not guaranteed to be supported by any given machine. Moreover, +implementations of Mach on new architectures might find the need for new +attribute types and/or values besides the ones defined in the initial +implementation. + +The types currently defined are +@table @code +@item MATTR_CACHE +Controls caching of memory pages + +@item MATTR_MIGRATE +Controls migrability of memory pages + +@item MATTR_REPLICATE +Controls replication of memory pages +@end table + +Corresponding values, and meaning of a specific call to +@code{vm_machine_attribute} +@table @code +@item MATTR_VAL_ON +Enables the attribute. Being enabled is the default value for any +applicable attribute. + +@item MATTR_VAL_OFF +Disables the attribute, making memory non-cached, or non-migratable, or +non-replicatable. + +@item MATTR_VAL_GET +Returns the current value of the attribute for the memory segment. If +the attribute does not apply uniformly to the given range the value +returned applies to the initial portion of the segment only. 
+ +@item MATTR_VAL_CACHE_FLUSH +Flush the memory pages from the Cache. The size value in this case +might be meaningful even if not a multiple of the page size, depending +on the implementation. + +@item MATTR_VAL_ICACHE_FLUSH +Same as above, applied to the Instruction Cache alone. + +@item MATTR_VAL_DCACHE_FLUSH +Same as above, applied to the Data Cache alone. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded, and +@code{KERN_INVALID_ARGUMENT} if @var{task} is not a task, or +@var{address} and @var{size} do not define a valid address range in +@var{task}, or @var{attribute} is not a valid attribute type, or it is not +implemented, or @var{value} is not a permissible value for @var{attribute}. +@end deftypefun + + +@node Mapping Memory Objects +@section Mapping Memory Objects + +@deftypefun kern_return_t vm_map (@w{vm_task_t @var{target_task}}, @w{vm_address_t *@var{address}}, @w{vm_size_t @var{size}}, @w{vm_address_t @var{mask}}, @w{boolean_t @var{anywhere}}, @w{memory_object_t @var{memory_object}}, @w{vm_offset_t @var{offset}}, @w{boolean_t @var{copy}}, @w{vm_prot_t @var{cur_protection}}, @w{vm_prot_t @var{max_protection}}, @w{vm_inherit_t @var{inheritance}}) +The function @code{vm_map} maps a region of virtual memory at the +specified address, for which data is to be supplied by the given memory +object, starting at the given offset within that object. In addition to +the arguments used in @code{vm_allocate}, the @code{vm_map} call allows +the specification of an address alignment parameter, and of the initial +protection and inheritance values. +@c XXX See the descriptions of vm_allocate, vm_protect , and vm_inherit + +If the memory object in question is not currently in use, the kernel +will perform a @code{memory_object_init} call at this time. 
If the @var{copy} +parameter is asserted, the specified region of the memory object will be +copied to this address space; changes made to this object by other tasks +will not be visible in this mapping, and changes made in this mapping +will not be visible to others (or returned to the memory object). + +The @code{vm_map} call returns once the mapping is established. +Completion of the call does not require any action on the part of the +memory manager. + +Warning: Only memory objects that are provided by bona fide memory +managers should be used in the @code{vm_map} call. A memory manager +must implement the memory object interface described elsewhere in this +manual. If other ports are used, a thread that accesses the mapped +virtual memory may become permanently hung or may receive a memory +exception. + +@var{target_task} is the task to be affected. The starting address is +@var{address}. If the @var{anywhere} option is used, this address is +ignored. The address actually allocated will be returned in +@var{address}. @var{size} is the number of bytes to allocate (rounded by +the system in a machine dependent way). The alignment restriction is +specified by @var{mask}. Bits asserted in this mask must not be +asserted in the address returned. If @var{anywhere} is set, the kernel +should find and allocate any region of the specified size, and return +the address of the resulting region in @var{address}. + +@var{memory_object} is the port that represents the memory object: used +by user tasks in @code{vm_map}; used by the kernel to make requests for data or +other management actions. If this port is @code{MEMORY_OBJECT_NULL}, +then zero-filled memory is allocated instead. Within a memory object, +@var{offset} specifies an offset in bytes. This must be page aligned. +If @var{copy} is set, the range of the memory object should be copied to +the target task, rather than mapped read-write. 
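The alignment rule for the mask ("bits asserted in this mask must not be asserted in the address returned") amounts to a bitwise test. The following C sketch illustrates it; the function names `mask_allows` and `next_allowed` are hypothetical, and `next_allowed` additionally assumes that the mask has the common form 2^n - 1 (for example 0xFFFF for 64 KiB alignment).

```c
#include <assert.h>

/* Hypothetical helper: nonzero if address has none of the mask bits set,
   that is, if it satisfies the alignment restriction. */
int mask_allows(unsigned long address, unsigned long mask)
{
    return (address & mask) == 0;
}

/* Hypothetical helper: smallest address >= address that satisfies the
   restriction.  Only valid for masks of the form 2^n - 1. */
unsigned long next_allowed(unsigned long address, unsigned long mask)
{
    assert((mask & (mask + 1)) == 0);
    return (address + mask) & ~mask;
}
```

A mask of 0 places no restriction beyond page alignment, so every address passes the test unchanged.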
+ +The function returns @code{KERN_SUCCESS} if the object is mapped, +@code{KERN_NO_SPACE} if no unused region of the task's virtual address +space that meets the address, size, and alignment criteria could be +found, and @code{KERN_INVALID_ARGUMENT} if an illegal argument was provided. +@end deftypefun + + +@node Memory Statistics +@section Memory Statistics + +@deftp {Data type} vm_statistics_data_t +This structure is returned in @var{vm_stats} by the @code{vm_statistics} +function and provides virtual memory statistics for the system. It has +the following members: + +@table @code +@item long pagesize +The page size in bytes. + +@item long free_count +The number of free pages. + +@item long active_count +The number of active pages. + +@item long inactive_count +The number of inactive pages. + +@item long wire_count +The number of pages wired down. + +@item long zero_fill_count +The number of zero filled pages. + +@item long reactivations +The number of reactivated pages. + +@item long pageins +The number of pageins. + +@item long pageouts +The number of pageouts. + +@item long faults +The number of faults. + +@item long cow_faults +The number of copy-on-writes. + +@item long lookups +The number of object cache lookups. + +@item long hits +The number of object cache hits. +@end table +@end deftp + +@deftypefun kern_return_t vm_statistics (@w{vm_task_t @var{target_task}}, @w{vm_statistics_data_t *@var{vm_stats}}) +The function @code{vm_statistics} returns the statistics about the +kernel's use of virtual memory since the kernel was booted. +@code{pagesize} can also be found as a global variable +@code{vm_page_size} which is set at task initialization and remains +constant for the life of the task. +@end deftypefun + + +@node External Memory Management +@chapter External Memory Management + +@menu +* Memory Object Server:: The basics of external memory management. +* Memory Object Creation:: How new memory objects are created. 
+* Memory Object Termination:: How memory objects are terminated. +* Memory Objects and Data:: Data transfer to and from memory objects. +* Memory Object Locking:: How memory objects are locked. +* Memory Object Attributes:: Manipulating attributes of memory objects. +* Default Memory Manager:: Setting and using the default memory manager. +@end menu + + +@node Memory Object Server +@section Memory Object Server + +@deftypefun boolean_t memory_object_server (@w{msg_header_t *@var{in_msg}}, @w{msg_header_t *@var{out_msg}}) +@deftypefunx boolean_t memory_object_default_server (@w{msg_header_t *@var{in_msg}}, @w{msg_header_t *@var{out_msg}}) +@deftypefunx boolean_t seqnos_memory_object_server (@w{msg_header_t *@var{in_msg}}, @w{msg_header_t *@var{out_msg}}) +@deftypefunx boolean_t seqnos_memory_object_default_server (@w{msg_header_t *@var{in_msg}}, @w{msg_header_t *@var{out_msg}}) +A memory manager is a server task that responds to specific messages +from the kernel in order to handle memory management functions for the +kernel. + +In order to isolate the memory manager from the specifics of message +formatting, the remote procedure call generator produces a procedure, +@code{memory_object_server}, to handle a received message. This +function does all necessary argument handling, and actually calls one of +the following functions: @code{memory_object_init}, +@code{memory_object_data_write}, @code{memory_object_data_return}, +@code{memory_object_data_request}, @code{memory_object_data_unlock}, +@code{memory_object_lock_completed}, @code{memory_object_copy}, +@code{memory_object_terminate}. The @strong{default memory manager} may +get two additional requests from the kernel: @code{memory_object_create} +and @code{memory_object_data_initialize}. The remote procedure call +generator produces a procedure @code{memory_object_default_server} to +handle those functions specific to the default memory manager. 
+ +The @code{seqnos_memory_object_server} and +@code{seqnos_memory_object_default_server} differ from +@code{memory_object_server} and @code{memory_object_default_server} in +that they supply message sequence numbers to the server interfaces. +They call the @code{seqnos_memory_object_*} functions, which complement +the @code{memory_object_*} set of functions. + +The return value from the @code{memory_object_server} function indicates +that the message was appropriate to the memory management interface +(returning @code{TRUE}), or that it could not handle this message +(returning @code{FALSE}). + +The @var{in_msg} argument is the message that has been received from the +kernel. The @var{out_msg} is a reply message, but this is not used for +this server. + +The function returns @code{TRUE} to indicate that the message in +question was applicable to this interface, and that the appropriate +routine was called to interpret the message. It returns @code{FALSE} to +indicate that the message did not apply to this interface, and that no +other action was taken. +@end deftypefun + + +@node Memory Object Creation +@section Memory Object Creation + +@deftypefun kern_return_t memory_object_init (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{memory_object_name_t @var{memory_object_name}}, @w{vm_size_t @var{memory_object_page_size}}) +@deftypefunx kern_return_t seqnos_memory_object_init (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{memory_object_name_t @var{memory_object_name}}, @w{vm_size_t @var{memory_object_page_size}}) +The function @code{memory_object_init} serves as a notification that the +kernel has been asked to map the given memory object into a task's +virtual address space. Additionally, it provides a port on which the +memory manager may issue cache management requests, and a port which the +kernel will use to name this data region. 
In the event that different +kernels map the same memory object, each will perform a @code{memory_object_init} call with new request and +name ports. The virtual page size that is used by the calling kernel is +included for planning purposes. + +When the memory manager is prepared to accept requests for data for this +object, it must call @code{memory_object_ready} with the attribute. +Otherwise the kernel will not process requests on this object. To +reject all mappings of this object, the memory manager may use +@code{memory_object_destroy}. + +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. +@var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the request.) +@var{memory_object_name} is a port used by the kernel to refer to the +memory object data in response to @code{vm_region} calls. +@var{memory_object_page_size} is the page size to be used by this +kernel. All data sizes in calls involving this kernel must be an +integral multiple of the page size. Note that different kernels, +indicated by a different @var{memory_control}, may have different page +sizes. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +@deftypefun kern_return_t memory_object_ready (@w{memory_object_control_t @var{memory_control}}, @w{boolean_t @var{may_cache_object}}, @w{memory_object_copy_strategy_t @var{copy_strategy}}) +The function @code{memory_object_ready} informs the kernel that the +memory manager is ready to receive data or unlock requests on behalf of +the clients. 
The argument @var{memory_control} is the port, provided by +the kernel in a @code{memory_object_init} call, to which cache +management requests may be issued. 
If @var{may_cache_object} is set, +the kernel may keep data associated with this memory object, even after +virtual memory references to it are gone. + +@var{copy_strategy} tells how the kernel should copy regions of the +associated memory object. There are three possible copy strategies: +@code{MEMORY_OBJECT_COPY_NONE} which specifies that nothing special +should be done when data in the object is copied; +@code{MEMORY_OBJECT_COPY_CALL} which specifies that the memory manager +should be notified via a @code{memory_object_copy} call before any part +of the object is copied; and @code{MEMORY_OBJECT_COPY_DELAY} which +guarantees that the memory manager does not externally modify the data +so that the kernel can use its normal copy-on-write algorithms. +@code{MEMORY_OBJECT_COPY_DELAY} is the strategy most commonly used. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. +@end deftypefun + + +@node Memory Object Termination +@section Memory Object Termination + +@deftypefun kern_return_t memory_object_terminate (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{memory_object_name_t @var{memory_object_name}}) +@deftypefunx kern_return_t seqnos_memory_object_terminate (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{memory_object_name_t @var{memory_object_name}}) +The function @code{memory_object_terminate} indicates that the kernel +has completed its use of the given memory object. All rights to the +memory object control and name ports are included, so that the memory +manager can destroy them (using @code{mach_port_deallocate}) after doing +appropriate bookkeeping. The kernel will terminate a memory object only +after all address space mappings of that memory object have been +deallocated, or upon explicit request by the memory manager. 
+ +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. +@var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the request.) +@var{memory_object_name} is a port used by the kernel to refer to the +memory object data in response to @code{vm_region} calls. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +@deftypefun kern_return_t memory_object_destroy (@w{memory_object_control_t @var{memory_control}}, @w{kern_return_t @var{reason}}) +The function @code{memory_object_destroy} tells the kernel to shut down +the memory object. As a result of this call the kernel will no longer +support paging activity or any @code{memory_object} calls on this +object, and all rights to the memory object port, the memory control +port and the memory name port will be returned to the memory manager in +a @code{memory_object_terminate} call. If the memory manager is concerned that +any modified cached data be returned to it before the object is +terminated, it should call @code{memory_object_lock_request} with +@var{should_flush} set and a lock value of @code{VM_PROT_WRITE} before +making this call. + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. @var{reason} is an error code indicating why the object +must be destroyed. +@c The error code is currently ignored. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. 
+@end deftypefun + + +@node Memory Objects and Data +@section Memory Objects and Data + +@deftypefun kern_return_t memory_object_data_return (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}, @w{boolean_t @var{dirty}}, @w{boolean_t @var{kernel_copy}}) +@deftypefunx kern_return_t seqnos_memory_object_data_return (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}, @w{boolean_t @var{dirty}}, @w{boolean_t @var{kernel_copy}}) +The function @code{memory_object_data_return} provides the memory +manager with data that has been modified while cached in physical +memory. Once the memory manager no longer needs this data (e.g., it has +been written to another storage medium), it should be deallocated using +@code{vm_deallocate}. + +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. +@var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the request.) @var{offset} is the +offset within a memory object to which this call refers. This will be +page aligned. @var{data} is the data which has been modified while +cached in physical memory. @var{data_count} is the amount of data to be +written, in bytes. This will be an integral number of memory object +pages. + +The kernel will also use this call to return precious pages. If an +unmodified precious page is returned, @var{dirty} is set to @code{FALSE}, +otherwise it is @code{TRUE}. If @var{kernel_copy} is @code{TRUE}, the +kernel kept a copy of the page. Precious data remains precious if the +kernel keeps a copy. 
The indication that the kernel kept a copy is only +a hint if the data is not precious; the cleaned copy may be discarded +without further notifying the manager. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +@deftypefun kern_return_t memory_object_data_request (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{length}}, @w{vm_prot_t @var{desired_access}}) +@deftypefunx kern_return_t seqnos_memory_object_data_request (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{length}}, @w{vm_prot_t @var{desired_access}}) +The function @code{memory_object_data_request} is a request for data +from the specified memory object, for at least the access specified. +The memory manager is expected to return at least the specified data, +with as much access as it can allow, using +@code{memory_object_data_supply}. If the memory manager is unable to +provide the data (for example, because of a hardware error), it may use +the @code{memory_object_data_error} call. The +@code{memory_object_data_unavailable} call may be used to tell the +kernel to supply zero-filled memory for this region. + +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. +@var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the request.) @var{offset} is the +offset within a memory object to which this call refers. This will be +page aligned. @var{length} is the number of bytes of data, starting at +@var{offset}, to which this call refers. 
This will be an integral +number of memory object pages. @var{desired_access} is a protection +value describing the memory access modes which must be permitted on the +specified cached data. One or more of: @code{VM_PROT_READ}, +@code{VM_PROT_WRITE} or @code{VM_PROT_EXECUTE}. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +@deftypefun kern_return_t memory_object_data_supply (@w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}, @w{vm_prot_t @var{lock_value}}, @w{boolean_t @var{precious}}, @w{mach_port_t @var{reply}}) +The function @code{memory_object_data_supply} supplies the kernel with +data for the specified memory object. Ordinarily, memory managers +should only provide data in response to @code{memory_object_data_request} +calls from the kernel (but they may provide data in advance as desired). +When data already held by this kernel is provided again, the new data is +ignored. The kernel may not provide any data (or protection) +consistency among pages with different virtual page alignments within +the same object. + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. @var{offset} is an offset within a memory object in bytes. +This must be page aligned. @var{data} is the data that is being +provided to the kernel. This is a pointer to the data. +@var{data_count} is the amount of data to be provided. Only whole +virtual pages of data can be accepted; partial pages will be discarded. + +@var{lock_value} is a protection value indicating those forms of access +that should @strong{not} be permitted to the specified cached data. 
The +lock values must be one or more of the set: @code{VM_PROT_NONE}, +@code{VM_PROT_READ}, @code{VM_PROT_WRITE}, @code{VM_PROT_EXECUTE} and +@code{VM_PROT_ALL} as defined in @file{mach/vm_prot.h}. + +If @var{precious} is @code{FALSE}, the kernel treats the data as a +temporary and may throw it away if it hasn't been changed. If the +@var{precious} value is @code{TRUE}, the kernel treats its copy as a +data repository and promises to return it to the manager; the manager +may tell the kernel to throw it away instead by flushing and not +cleaning the data (see @code{memory_object_lock_request}). + +If @var{reply_to} is not @code{MACH_PORT_NULL}, the kernel will send a +completion message to the provided port (see +@code{memory_object_supply_completed}). + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. +@end deftypefun + +@deftypefun kern_return_t memory_object_supply_completed (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}, @w{kern_return_t @var{result}}, @w{vm_offset_t @var{error_offset}}) +@deftypefunx kern_return_t seqnos_memory_object_supply_completed (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}, @w{kern_return_t @var{result}}, @w{vm_offset_t @var{error_offset}}) +The function @code{memory_object_supply_completed} indicates that a +previous @code{memory_object_data_supply} has been completed. Note that +this call is made on whatever port was specified in the +@code{memory_object_data_supply} call; that port need not be the memory +object port itself. No reply is expected after this call. + +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. 
+@var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the +request.) @var{offset} is the +offset within a memory object to which this call refers. @var{length} +is the length of the data covered by the supply request. The @var{result} +parameter indicates what happened during the supply. If it is not +@code{KERN_SUCCESS}, then @var{error_offset} identifies the first offset +at which a problem occurred. The pagein operation stopped at this +point. Note that the only failures reported by this mechanism are +@code{KERN_MEMORY_PRESENT}. All other failures (invalid argument, error +on pagein of supplied data in manager's address space) cause the entire +operation to fail. + + +@end deftypefun + +@deftypefun kern_return_t memory_object_data_error (@w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{size}}, @w{kern_return_t @var{reason}}) +The function @code{memory_object_data_error} indicates that the memory +manager cannot return the data requested for the given region, +specifying a reason for the error. This is typically used when a +hardware error is encountered. + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. @var{offset} is an offset within a memory object in bytes. +This must be page aligned. @var{size} is +the amount of cached data (starting at @var{offset}) to be handled. +This must be an integral number of the memory object page size. +@var{reason} is an error code indicating what type of error occurred. +@c The error code is currently ignored. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. 
+@end deftypefun + +@deftypefun kern_return_t memory_object_data_unavailable (@w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{size}}) +The function @code{memory_object_data_unavailable} indicates that the +memory object does not have data for the given region and that the +kernel should provide the data for this range. The memory manager may +use this call in three different situations. + +@enumerate +@item +The object was created by @code{memory_object_create} and the kernel has +not yet provided data for this range (either via a +@code{memory_object_data_initialize}, @code{memory_object_data_write} or +a @code{memory_object_data_return} for the object). + +@item +The object was created by a @code{memory_object_copy} and the +kernel should copy this region from the original memory object. + +@item +The object is a normal user-created memory object and the kernel should +supply unlocked zero-filled pages for the range. +@end enumerate + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. @var{offset} is an offset within a memory object, in bytes. +This must be page aligned. @var{size} is the amount of cached data +(starting at @var{offset}) to be handled. This must be an integral +number of the memory object page size. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. 
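The alignment rules above (a page-aligned @var{offset} and a @var{size} that is a whole number of pages) can be checked before a request is sent. The following is a minimal sketch only, not part of the Mach interface: the typedefs are stand-ins so that it compiles without the Mach headers, and the helper names @code{range_is_page_aligned}, @code{trunc_down_to_page} and @code{round_up_to_page} are hypothetical.

```c
#include <assert.h>

/* Stand-in types so the sketch compiles without Mach headers
   (assumption: the real definitions come from the Mach headers). */
typedef unsigned long vm_offset_t;
typedef unsigned long vm_size_t;

/* Hypothetical helper: nonzero if OFFSET is page aligned and SIZE is an
   integral number of pages of PAGE_SIZE bytes.  */
int
range_is_page_aligned (vm_offset_t offset, vm_size_t size,
                       vm_size_t page_size)
{
  assert (page_size != 0);
  return offset % page_size == 0 && size % page_size == 0;
}

/* Hypothetical helper: round OFFSET down to the start of its page.  */
vm_offset_t
trunc_down_to_page (vm_offset_t offset, vm_size_t page_size)
{
  assert (page_size != 0);
  return offset - offset % page_size;
}

/* Hypothetical helper: round SIZE up to a whole number of pages.  */
vm_size_t
round_up_to_page (vm_size_t size, vm_size_t page_size)
{
  assert (page_size != 0);
  return (size + page_size - 1) / page_size * page_size;
}
```

A memory manager could apply such a check to the @var{offset} and @var{size} it is about to pass, using the page size that applies to the kernel it is talking to.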
+@end deftypefun + +@deftypefun kern_return_t memory_object_copy (@w{memory_object_t @var{old_memory_object}}, @w{memory_object_control_t @var{old_memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}, @w{memory_object_t @var{new_memory_object}}) +@deftypefunx kern_return_t seqnos_memory_object_copy (@w{memory_object_t @var{old_memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{old_memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}, @w{memory_object_t @var{new_memory_object}}) +The function @code{memory_object_copy} indicates that a copy has been +made of the specified range of the given original memory object. This +call includes only the new memory object itself; a +@code{memory_object_init} call will be made on the new memory object +after the currently cached pages of the original object are prepared. +After the memory manager receives the init call, it must reply with the +@code{memory_object_ready} call to assert the "ready" attribute. The +kernel will use the new memory object, control and name ports to refer +to the new copy. + +This call is made when the original memory object had the caching +parameter set to @code{MEMORY_OBJECT_COPY_CALL} and a user of the object +has asked the kernel to copy it. + +Cached pages from the original memory object at the time of the copy +operation are handled as follows: Readable pages may be silently copied +to the new memory object (with all access permissions). Pages not +copied are locked to prevent write access. + +The new memory object is @strong{temporary}, meaning that the memory +manager should not change its contents or allow the memory object to be +mapped in another client. The memory manager may use the +@code{memory_object_data_unavailable} call to indicate that the +appropriate pages of the original memory object may be used to fulfill +the data request. 
+ +The argument @var{old_memory_object} is the port that represents the old +memory object data. @var{old_memory_control} is the kernel port for the +old object. @var{offset} is the offset within a memory object to which +this call refers. This will be page aligned. @var{length} is the +number of bytes of data, starting at @var{offset}, to which this call +refers. This will be an integral number of memory object pages. +@var{new_memory_object} is a new memory object created by the kernel; +see synopsis for further description. Note that all port rights +(including receive rights) are included for the new memory object. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +The remaining interfaces in this section are obsolete. + +@deftypefun kern_return_t memory_object_data_write (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}) +@deftypefunx kern_return_t seqnos_memory_object_data_write (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}) +The function @code{memory_object_data_write} provides the memory manager +with data that has been modified while cached in physical memory. It is the old form of @code{memory_object_data_return}. Once +the memory manager no longer needs this data (e.g., it has been written +to another storage medium), it should be deallocated using +@code{vm_deallocate}. + +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. +@var{memory_control} is the request port to which a response is +requested. 
(In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the +request.) @var{offset} is the +offset within a memory object to which this call refers. This will be +page aligned. @var{data} is the data which has been modified while +cached in physical memory. @var{data_count} is the amount of data to be +written, in bytes. This will be an integral number of memory object +pages. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +@deftypefun kern_return_t memory_object_data_provided (@w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}, @w{vm_prot_t @var{lock_value}}) +The function @code{memory_object_data_provided} supplies the kernel with +data for the specified memory object. It is the old form of +@code{memory_object_data_supply}. Ordinarily, memory managers should +only provide data in response to @code{memory_object_data_request} calls +from the kernel. The @var{lock_value} specifies what type of access +will not be allowed to the data range. The lock values must be one or +more of the set: @code{VM_PROT_NONE}, @code{VM_PROT_READ}, +@code{VM_PROT_WRITE}, @code{VM_PROT_EXECUTE} and @code{VM_PROT_ALL} as +defined in @file{mach/vm_prot.h}. + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. @var{offset} is an offset within a memory object in bytes. +This must be page aligned. @var{data} is the data that is being +provided to the kernel. This is a pointer to the data. +@var{data_count} is the amount of data to be provided. This must be an +integral number of memory object pages. 
@var{lock_value} is a +protection value indicating those forms of access that should +@strong{not} be permitted to the specified cached data. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. +@end deftypefun + + +@node Memory Object Locking +@section Memory Object Locking + +@deftypefun kern_return_t memory_object_lock_request (@w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{size}}, @w{memory_object_return_t @var{should_clean}}, @w{boolean_t @var{should_flush}}, @w{vm_prot_t @var{lock_value}}, @w{mach_port_t @var{reply_to}}) +The function @code{memory_object_lock_request} allows a memory manager +to make cache management requests. As specified in arguments to the +call, the kernel will: +@itemize +@item +clean (i.e., write back using @code{memory_object_data_return} or +@code{memory_object_data_write}) any cached data which has been modified +since the last time it was written + +@item +flush (i.e., remove any uses of) that data from memory + +@item +lock (i.e., prohibit the specified uses of) the cached data +@end itemize + +Locks applied to cached data are not cumulative; new lock values +override previous ones. Thus, data may also be unlocked using this +primitive. The lock values must be one or more of the following values: +@code{VM_PROT_NONE}, @code{VM_PROT_READ}, @code{VM_PROT_WRITE}, +@code{VM_PROT_EXECUTE} and @code{VM_PROT_ALL} as defined in +@file{mach/vm_prot.h}. + +Only data which is cached at the time of this call is affected. When a +running thread requires a prohibited access to cached data, the kernel +will issue a @code{memory_object_data_unlock} call specifying the forms +of access required. + +Once all of the actions requested by this call have been completed, the +kernel issues a @code{memory_object_lock_completed} call on the +specified reply port. 
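Because a lock value names the accesses that are prohibited rather than those that are allowed, it can help to see the bit arithmetic spelled out. The following is a minimal sketch under stated assumptions: the type and constants are local stand-ins mirroring @file{mach/vm_prot.h} so that the example compiles on its own, and the helpers @code{permitted_access} and @code{needs_unlock} are hypothetical names, not Mach calls.

```c
#include <assert.h>

/* Stand-in definitions so the sketch compiles without Mach headers;
   a real memory manager would include mach/vm_prot.h instead.  */
typedef int vm_prot_t;
#define VM_PROT_NONE    ((vm_prot_t) 0x00)
#define VM_PROT_READ    ((vm_prot_t) 0x01)
#define VM_PROT_WRITE   ((vm_prot_t) 0x02)
#define VM_PROT_EXECUTE ((vm_prot_t) 0x04)
#define VM_PROT_ALL     (VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE)

/* Hypothetical helper: the accesses still permitted on cached data
   once LOCK_VALUE has been applied.  Locks are not cumulative, so only
   the most recent lock value matters.  */
vm_prot_t
permitted_access (vm_prot_t lock_value)
{
  assert ((lock_value & ~VM_PROT_ALL) == 0);
  return VM_PROT_ALL & ~lock_value;
}

/* Hypothetical helper: nonzero if DESIRED_ACCESS touches a prohibited
   access, which is the situation in which the kernel would issue a
   memory_object_data_unlock call.  */
int
needs_unlock (vm_prot_t lock_value, vm_prot_t desired_access)
{
  return (lock_value & desired_access) != 0;
}
```

For example, a lock value of @code{VM_PROT_WRITE} leaves read and execute access permitted, and a later lock value of @code{VM_PROT_NONE} unlocks the data again.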
+ +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. @var{offset} is an offset within a memory object, in bytes. +This must be page aligned. @var{size} is the amount of cached data +(starting at @var{offset}) to be handled. This must be an integral +number of the memory object page size. If @var{should_clean} is set, +modified data should be written back to the memory manager. If +@var{should_flush} is set, the specified cached data should be +invalidated, and all uses of that data should be revoked. +@var{lock_value} is a protection value indicating those forms of access +that should @strong{not} be permitted to the specified cached data. +@var{reply_to} is a port on which a @code{memory_object_lock_completed} +call should be issued, or @code{MACH_PORT_NULL} if no acknowledgement is +desired. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. +@end deftypefun + +@deftypefun kern_return_t memory_object_lock_completed (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}) +@deftypefunx kern_return_t seqnos_memory_object_lock_completed (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}) +The function @code{memory_object_lock_completed} indicates that a +previous @code{memory_object_lock_request} has been completed. Note +that this call is made on whatever port was specified in the +@code{memory_object_lock_request} call; that port need not be the memory +object port itself. No reply is expected after this call. + +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. 
+@var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the +request.) @var{offset} is the +offset within a memory object to which this call refers. @var{length} +is the length of the data covered by the lock request. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +@deftypefun kern_return_t memory_object_data_unlock (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}, @w{vm_prot_t @var{desired_access}}) +@deftypefunx kern_return_t seqnos_memory_object_data_unlock (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{length}}, @w{vm_prot_t @var{desired_access}}) +The function @code{memory_object_data_unlock} is a request that the +memory manager permit at least the desired access to the specified data +cached by the kernel. A call to @code{memory_object_lock_request} is +expected in response. + +The argument @var{memory_object} is the port that represents the memory +object data, as supplied to the kernel in a @code{vm_map} call. +@var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the +request.) @var{offset} is the +offset within a memory object to which this call refers. This will be +page aligned. @var{length} is the number of bytes of data, starting at +@var{offset}, to which this call refers. This will be an integral +number of memory object pages. 
@var{desired_access} is a protection value +describing the memory access modes which must be permitted on the +specified cached data. 
One or more of: @code{VM_PROT_READ}, +@code{VM_PROT_WRITE} or @code{VM_PROT_EXECUTE}. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + + +@node Memory Object Attributes +@section Memory Object Attributes + +@deftypefun kern_return_t memory_object_get_attributes (@w{memory_object_control_t @var{memory_control}}, @w{boolean_t *@var{object_ready}}, @w{boolean_t *@var{may_cache_object}}, @w{memory_object_copy_strategy_t *@var{copy_strategy}}) +The function @code{memory_object_get_attributes} retrieves the current +attributes associated with the memory object. + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. If @var{object_ready} is set, the kernel may issue new data +and unlock requests on the associated memory object. If +@var{may_cache_object} is set, the kernel may keep data associated with +this memory object, even after virtual memory references to it are gone. +@var{copy_strategy} tells how the kernel should copy regions of the +associated memory object. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. +@end deftypefun + +@deftypefun kern_return_t memory_object_change_attributes (@w{memory_object_control_t @var{memory_control}}, @w{boolean_t @var{may_cache_object}}, @w{memory_object_copy_strategy_t @var{copy_strategy}}, @w{mach_port_t @var{reply_to}}) +The function @code{memory_object_change_attributes} sets +performance-related attributes for the specified memory object. If the +caching attribute is asserted, the kernel is permitted (and encouraged) +to maintain cached data for this memory object even after no virtual +address space contains this data. 
+ +There are three possible copy strategies: +@code{MEMORY_OBJECT_COPY_NONE} which specifies that nothing special +should be done when data in the object is copied; +@code{MEMORY_OBJECT_COPY_CALL} which specifies that the memory manager +should be notified via a @code{memory_object_copy} call before any part +of the object is copied; and @code{MEMORY_OBJECT_COPY_DELAY} which +guarantees that the memory manager does not externally modify the data +so that the kernel can use its normal copy-on-write algorithms. +@code{MEMORY_OBJECT_COPY_DELAY} is the strategy most commonly used. + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. If @var{may_cache_object} is set, the kernel may keep data +associated with this memory object, even after virtual memory references +to it are gone. @var{copy_strategy} tells how the kernel should copy +regions of the associated memory object. @var{reply_to} is a port on +which a @code{memory_object_change_completed} call will be issued upon +completion of the attribute change, or @code{MACH_PORT_NULL} if no +acknowledgement is desired. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. +@end deftypefun + +@deftypefun kern_return_t memory_object_change_completed (@w{memory_object_t @var{memory_object}}, @w{boolean_t @var{may_cache_object}}, @w{memory_object_copy_strategy_t @var{copy_strategy}}) +@deftypefunx kern_return_t seqnos_memory_object_change_completed (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{boolean_t @var{may_cache_object}}, @w{memory_object_copy_strategy_t @var{copy_strategy}}) +The function @code{memory_object_change_completed} indicates the +completion of an attribute change call. 
+ +@c Warning: This routine does NOT contain a memory_object_control_t because +@c the memory_object_change_attributes call may cause memory object +@c termination (by uncaching the object). This would yield an invalid +@c port. +@end deftypefun + +The following interface is obsoleted by @code{memory_object_ready} and +@code{memory_object_change_attributes}. If the old form +@code{memory_object_set_attributes} is used to make a memory object +ready, the kernel will write back data using the old +@code{memory_object_data_write} interface rather than +@code{memory_object_data_return}. + +@deftypefun kern_return_t memory_object_set_attributes (@w{memory_object_control_t @var{memory_control}}, @w{boolean_t @var{object_ready}}, @w{boolean_t @var{may_cache_object}}, @w{memory_object_copy_strategy_t @var{copy_strategy}}) +The function @code{memory_object_set_attributes} controls how the kernel +treats the memory object. The kernel will only make data or unlock requests when +the ready attribute is asserted. If the caching attribute is asserted, +the kernel is permitted (and encouraged) to maintain cached data for +this memory object even after no virtual address space contains this +data. + +There are three possible copy strategies: +@code{MEMORY_OBJECT_COPY_NONE} which specifies that nothing special +should be done when data in the object is copied; +@code{MEMORY_OBJECT_COPY_CALL} which specifies that the memory manager +should be notified via a @code{memory_object_copy} call before any part +of the object is copied; and @code{MEMORY_OBJECT_COPY_DELAY} which +guarantees that the memory manager does not externally modify the data +so that the kernel can use its normal copy-on-write algorithms. +@code{MEMORY_OBJECT_COPY_DELAY} is the strategy most commonly used. + +The argument @var{memory_control} is the port, provided by the kernel in +a @code{memory_object_init} call, to which cache management requests may +be issued. 
If @var{object_ready} is set, the kernel may issue new data +and unlock requests on the associated memory object. If +@var{may_cache_object} is set, the kernel may keep data associated with +this memory object, even after virtual memory references to it are gone. +@var{copy_strategy} tells how the kernel should copy regions of the +associated memory object. + +This routine does not receive a reply message (and consequently has no +return value), so only message transmission errors apply. +@end deftypefun + + +@node Default Memory Manager +@section Default Memory Manager + +@deftypefun kern_return_t vm_set_default_memory_manager (@w{host_t @var{host}}, @w{mach_port_t *@var{default_manager}}) +The function @code{vm_set_default_memory_manager} sets the kernel's +default memory manager. It sets the port to which newly-created +temporary memory objects are delivered by @code{memory_object_create} to +the host. The old memory manager port is returned. If +@var{default_manager} is @code{MACH_PORT_NULL} then this routine just returns +the current default manager port without changing it. + +The argument @var{host} is a task port to the kernel whose default +memory manager is to be changed. @var{default_manager} is an in/out +parameter. As input, @var{default_manager} is the port that the new +memory manager is listening on for @code{memory_object_create} calls. +As output, it is the old default memory manager's port. + +The function returns @code{KERN_SUCCESS} if the new memory manager is +installed, and @code{KERN_INVALID_ARGUMENT} if this task does not have +the privileges required for this call. 
+@end deftypefun + +@deftypefun kern_return_t memory_object_create (@w{memory_object_t @var{old_memory_object}}, @w{memory_object_t @var{new_memory_object}}, @w{vm_size_t @var{new_object_size}}, @w{memory_object_control_t @var{new_control}}, @w{memory_object_name_t @var{new_name}}, @w{vm_size_t @var{new_page_size}}) +@deftypefunx kern_return_t seqnos_memory_object_create (@w{memory_object_t @var{old_memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_t @var{new_memory_object}}, @w{vm_size_t @var{new_object_size}}, @w{memory_object_control_t @var{new_control}}, @w{memory_object_name_t @var{new_name}}, @w{vm_size_t @var{new_page_size}}) +The function @code{memory_object_create} is a request that the given +memory manager accept responsibility for the given memory object created +by the kernel. This call will only be made to the system +@strong{default memory manager}. The memory object in question +initially consists of zero-filled memory; only memory pages that are +actually written will ever be provided to the memory manager. When +processing @code{memory_object_data_request} calls, the default memory +manager must use @code{memory_object_data_unavailable} for any pages that +have not previously been written. + +No reply is expected after this call. Since this call is directed to +the default memory manager, the kernel assumes that it will be ready to +handle data requests to this object and does not need the confirmation +of a @code{memory_object_set_attributes} call. + +The argument @var{old_memory_object} is a memory object provided by the +default memory manager on which the kernel can make +@code{memory_object_create} calls. @var{new_memory_object} is a new +memory object created by the kernel; see synopsis for further +description. Note that all port rights (including receive rights) are +included for the new memory object. @var{new_object_size} is the +maximum size of the new object. 
@var{new_control} is a port, created by +the kernel, on which a memory manager may issue cache management +requests for the new object. @var{new_name} is a port used by the kernel +to refer to the new memory object data in response to @code{vm_region} +calls. @var{new_page_size} is the page size to be used by this kernel. +All data sizes in calls involving this kernel must be an integral +multiple of the page size. Note that different kernels, indicated by +a different @code{memory_control}, may have different page sizes. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + +@deftypefun kern_return_t memory_object_data_initialize (@w{memory_object_t @var{memory_object}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}) +@deftypefunx kern_return_t seqnos_memory_object_data_initialize (@w{memory_object_t @var{memory_object}}, @w{mach_port_seqno_t @var{seqno}}, @w{memory_object_control_t @var{memory_control}}, @w{vm_offset_t @var{offset}}, @w{vm_offset_t @var{data}}, @w{vm_size_t @var{data_count}}) +The function @code{memory_object_data_initialize} provides the memory +manager with initial data for a kernel-created memory object. If the +memory manager already has been supplied data (by a previous +@code{memory_object_data_initialize}, @code{memory_object_data_write} or +@code{memory_object_data_return}), then this data should be ignored. +Otherwise, this call behaves exactly as does +@code{memory_object_data_return} on memory objects created by the kernel +via @code{memory_object_create} and thus will only be made to default +memory managers. This call will not be made on objects created via +@code{memory_object_copy}. 
+ +The argument @var{memory_object} is the port that represents the memory +object data, as supplied by the kernel in a @code{memory_object_create} +call. @var{memory_control} is the request port to which a response is +requested. (In the event that a memory object has been supplied to more +than one kernel, this argument identifies the kernel that has made the +request.) @var{offset} is the +offset within a memory object to which this call refers. This will be +page aligned. @var{data} is the data which has been modified while +cached in physical memory. @var{data_count} is the amount of data to be +written, in bytes. This will be an integral number of memory object +pages. + +The function should return @code{KERN_SUCCESS}, but since this routine +is called by the kernel, which does not wait for a reply message, this +value is ignored. +@end deftypefun + + +@node Threads and Tasks +@chapter Threads and Tasks + +@menu +* Thread Interface:: Manipulating threads. +* Task Interface:: Manipulating tasks. +* Profiling:: Profiling threads and tasks. +@end menu + + +@node Thread Interface +@section Thread Interface + +@cindex thread port +@cindex port representing a thread +@deftp {Data type} thread_t +This is a @code{mach_port_t} and used to hold the port name of a +thread port that represents the thread. Manipulations of the thread are +implemented as remote procedure calls to the thread port. A thread can +get a port to itself with the @code{mach_thread_self} system call. +@end deftp + +@menu +* Thread Creation:: Creating new threads. +* Thread Termination:: Terminating existing threads. +* Thread Information:: How to get information on threads. +* Thread Settings:: How to set thread-related information. +* Thread Execution:: How to control the thread's machine state. +* Scheduling:: Operations on thread scheduling. +* Thread Special Ports:: How to handle the thread's special ports. +* Exceptions:: Managing exceptions. 
+@end menu + + +@node Thread Creation +@subsection Thread Creation + +@deftypefun kern_return_t thread_create (@w{task_t @var{parent_task}}, @w{thread_t *@var{child_thread}}) +The function @code{thread_create} creates a new thread within the task +specified by @var{parent_task}. The new thread has no processor state, +and has a suspend count of 1. To get a new thread to run, first +@code{thread_create} is called to get the new thread's identifier, +(@var{child_thread}). Then @code{thread_set_state} is called to set a +processor state, and finally @code{thread_resume} is called to get the +thread scheduled to execute. + +When the thread is created send rights to its thread kernel port are +given to it and returned to the caller in @var{child_thread}. The new +thread's exception port is set to @code{MACH_PORT_NULL}. + +The function returns @code{KERN_SUCCESS} if a new thread has been +created, @code{KERN_INVALID_ARGUMENT} if @var{parent_task} is not a +valid task and @code{KERN_RESOURCE_SHORTAGE} if some critical kernel +resource is not available. +@end deftypefun + + +@node Thread Termination +@subsection Thread Termination + +@deftypefun kern_return_t thread_terminate (@w{thread_t @var{target_thread}}) +The function @code{thread_terminate} destroys the thread specified by +@var{target_thread}. + +The function returns @code{KERN_SUCCESS} if the thread has been killed +and @code{KERN_INVALID_ARGUMENT} if @var{target_thread} is not a thread. +@end deftypefun + + +@node Thread Information +@subsection Thread Information + +@deftypefun thread_t mach_thread_self () +The @code{mach_thread_self} system call returns the calling thread's +thread port. + +@code{mach_thread_self} has an effect equivalent to receiving a send +right for the thread port. @code{mach_thread_self} returns the name of +the send right. In particular, successive calls will increase the +calling task's user-reference count for the send right. 
+ +@c author{marcus} +As a special exception, the kernel will overrun the user reference count +of the thread name port, so that this function can not fail for that +reason. Because of this, the user should not deallocate the port right +if an overrun might have happened. Otherwise the reference count could +drop to zero and the send right be destroyed while the user still +expects to be able to use it. As the kernel does not make use of the +number of extant send rights anyway, this is safe to do (the thread port +itself is not destroyed, even when there are no send rights anymore). + +The function returns @code{MACH_PORT_NULL} if a resource shortage +prevented the reception of the send right or if the thread port is +currently null and @code{MACH_PORT_DEAD} if the thread port is currently +dead. +@end deftypefun + +@deftypefun kern_return_t thread_info (@w{thread_t @var{target_thread}}, @w{int @var{flavor}}, @w{thread_info_t @var{thread_info}}, @w{mach_msg_type_number_t *@var{thread_infoCnt}}) +The function @code{thread_info} returns the selected information array +for a thread, as specified by @var{flavor}. + +@var{thread_info} is an array of integers that is supplied by the caller +and returned filled with specified information. @var{thread_infoCnt} is +supplied as the maximum number of integers in @var{thread_info}. On +return, it contains the actual number of integers in @var{thread_info}. +The maximum number of integers returned by any flavor is +@code{THREAD_INFO_MAX}. + +The type of information returned is defined by @var{flavor}, which can +be one of the following: + +@table @code +@item THREAD_BASIC_INFO +The function returns basic information about the thread, as defined by +@code{thread_basic_info_t}. This includes the user and system time, the +run state, and scheduling priority. The number of integers returned is +@code{THREAD_BASIC_INFO_COUNT}. 
+ +@item THREAD_SCHED_INFO +The function returns information about the scheduling policy for the +thread as defined by @code{thread_sched_info_t}. The number of integers +returned is @code{THREAD_SCHED_INFO_COUNT}. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{target_thread} is not a thread or +@var{flavor} is not recognized. The function returns +@code{MIG_ARRAY_TOO_LARGE} if the returned info array is too large for +@var{thread_info}. In this case, @var{thread_info} is filled as much as +possible and @var{thread_infoCnt} is set to the number of elements that +would have been returned if there were enough room. +@end deftypefun + +@deftp {Data type} {struct thread_basic_info} +This structure is returned in @var{thread_info} by the +@code{thread_info} function and provides basic information about the +thread. You can cast a variable of type @code{thread_info_t} to a +pointer of this type if you provided it as the @var{thread_info} +parameter for the @code{THREAD_BASIC_INFO} flavor of @code{thread_info}. +It has the following members: + +@table @code +@item time_value_t user_time +user run time + +@item time_value_t system_time +system run time + +@item int cpu_usage +Scaled cpu usage percentage. The scale factor is @code{TH_USAGE_SCALE}. + +@item int base_priority +The base scheduling priority of the thread. + +@item int cur_priority +The current scheduling priority of the thread. + +@item integer_t run_state +The run state of the thread. The possible values of this field are: +@table @code +@item TH_STATE_RUNNING +The thread is running normally. + +@item TH_STATE_STOPPED +The thread is suspended. + +@item TH_STATE_WAITING +The thread is waiting normally. + +@item TH_STATE_UNINTERRUPTIBLE +The thread is in an uninterruptible wait. + +@item TH_STATE_HALTED +The thread is halted at a clean point. +@end table + +@item flags +Various flags. 
The possible values of this field are: +@table @code +@item TH_FLAGS_SWAPPED +The thread is swapped out. + +@item TH_FLAGS_IDLE +The thread is an idle thread. +@end table + +@item int suspend_count +The suspend count for the thread. + +@item int sleep_time +The number of seconds that the thread has been sleeping. + +@item time_value_t creation_time +The time stamp of creation. +@end table +@end deftp + +@deftp {Data type} thread_basic_info_t +This is a pointer to a @code{struct thread_basic_info}. +@end deftp + +@deftp {Data type} {struct thread_sched_info} +This structure is returned in @var{thread_info} by the +@code{thread_info} function and provides schedule information about the +thread. You can cast a variable of type @code{thread_info_t} to a +pointer of this type if you provided it as the @var{thread_info} +parameter for the @code{THREAD_SCHED_INFO} flavor of @code{thread_info}. +It has the following members: + +@table @code +@item int policy +The scheduling policy of the thread, @ref{Scheduling Policy}. + +@item integer_t data +Policy-dependent scheduling information, @ref{Scheduling Policy}. + +@item int base_priority +The base scheduling priority of the thread. + +@item int max_priority +The maximum scheduling priority of the thread. + +@item int cur_priority +The current scheduling priority of the thread. + +@item int depressed +@code{TRUE} if the thread is depressed. + +@item int depress_priority +The priority the thread was depressed from. +@end table +@end deftp + +@deftp {Data type} thread_sched_info_t +This is a pointer to a @code{struct thread_sched_info}. +@end deftp + + +@node Thread Settings +@subsection Thread Settings + +@deftypefun kern_return_t thread_wire (@w{host_priv_t @var{host_priv}}, @w{thread_t @var{thread}}, @w{boolean_t @var{wired}}) +The function @code{thread_wire} controls the VM privilege level of the +thread @var{thread}. 
A VM-privileged thread never waits inside the +kernel for memory allocation from the kernel's free list of pages or for +allocation of a kernel stack. + +Threads that are part of the default pageout path should be +VM-privileged, to prevent system deadlocks. Threads that are not part +of the default pageout path should not be VM-privileged, to prevent the +kernel's free list of pages from being exhausted. + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{host_priv} or @var{thread} was +invalid. + +The @code{thread_wire} call is actually an RPC to @var{host_priv}, +normally a send right for a privileged host port, but potentially any +send right. In addition to the normal diagnostic return codes from the +call's server (normally the kernel), the call may return @code{mach_msg} +return codes. +@c See also: vm_wire(2), vm_set_default_memory_manager(2). +@end deftypefun + + +@node Thread Execution +@subsection Thread Execution + +@deftypefun kern_return_t thread_suspend (@w{thread_t @var{target_thread}}) +Increments the thread's suspend count and prevents the thread from +executing any more user level instructions. In this context a user +level instruction is either a machine instruction executed in user mode +or a system trap instruction including page faults. Thus if a thread is +currently executing within a system trap the kernel code may continue to +execute until it reaches the system return code or it may suspend within +the kernel code. In either case, when the thread is resumed the system +trap will return. This could cause unpredictable results if the user +did a suspend and then altered the user state of the thread in order to +change its direction upon a resume. The call @code{thread_abort} is +provided to allow the user to abort any system call that is in progress +in a predictable way. 
+ +The suspend count may become greater than one with the effect that it +will take more than one resume call to restart the thread. + +The function returns @code{KERN_SUCCESS} if the thread has been +suspended and @code{KERN_INVALID_ARGUMENT} if @var{target_thread} is not +a thread. +@end deftypefun + +@deftypefun kern_return_t thread_resume (@w{thread_t @var{target_thread}}) +Decrements the thread's suspend count. If the count becomes zero the +thread is resumed. If it is still positive, the thread is left +suspended. The suspend count may not become negative. + +The function returns @code{KERN_SUCCESS} if the thread has been resumed, +@code{KERN_FAILURE} if the suspend count is already zero and +@code{KERN_INVALID_ARGUMENT} if @var{target_thread} is not a thread. +@end deftypefun + +@deftypefun kern_return_t thread_abort (@w{thread_t @var{target_thread}}) +The function @code{thread_abort} aborts the kernel primitives: +@code{mach_msg}, @code{msg_send}, @code{msg_receive} and @code{msg_rpc} +and page-faults, making the call return a code indicating that it was +interrupted. The call is interrupted whether or not the thread (or task +containing it) is currently suspended. If it is suspended, the thread +receives the interrupt when it is resumed. + +A thread will retry an aborted page-fault if its state is not modified +before it is resumed. @code{msg_send} returns @code{SEND_INTERRUPTED}; +@code{msg_receive} returns @code{RCV_INTERRUPTED}; @code{msg_rpc} +returns either @code{SEND_INTERRUPTED} or @code{RCV_INTERRUPTED}, +depending on which half of the RPC was interrupted. + +The main reason for this primitive is to allow one thread to cleanly +stop another thread in a manner that will allow the future execution of +the target thread to be controlled in a predictable way. +@code{thread_suspend} keeps the target thread from executing any further +instructions at the user level, including the return from a system call. 
+@code{thread_get_state}/@code{thread_set_state} allows the examination +or modification of the user state of a target thread. However, if a +suspended thread was executing within a system call, it also has +associated with it a kernel state. This kernel state can not be +modified by @code{thread_set_state} with the result that when the thread +is resumed the system call may return changing the user state and +possibly user memory. @code{thread_abort} aborts the kernel call from +the target thread's point of view by resetting the kernel state so that +the thread will resume execution at the system call return with the +return code value set to one of the interrupted codes. The system call +itself will either be entirely completed or entirely aborted, depending +on the precise moment at which the abort was received. Thus if the +thread's user state has been changed by @code{thread_set_state}, it will +not be modified by any unexpected system call side effects. + +For example, to simulate a Unix signal, the following sequence of calls +may be used: + +@enumerate +@item +@code{thread_suspend}: Stops the thread. + +@item +@code{thread_abort}: Interrupts any system call in progress, setting the +return value to `interrupted'. Since the thread is stopped, it will not +return to user code. + +@item +@code{thread_set_state}: Alters thread's state to simulate a procedure +call to the signal handler. + +@item +@code{thread_resume}: Resumes execution at the signal handler. If the +thread's stack has been correctly set up, the thread may return to the +interrupted system call. (Of course, the code to push an extra stack +frame and change the registers is VERY machine-dependent.) +@end enumerate + +Calling @code{thread_abort} on a non-suspended thread is pretty risky, +since it is very difficult to know exactly what system trap, if any, the +thread might be executing and whether an interrupt return would cause +the thread to do something useful. 
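The signal-simulation sequence listed above, together with the suspend-count rules of @code{thread_suspend} and @code{thread_resume}, can be sketched as a toy model in plain C. This is only an illustration under stated assumptions: it does not call the kernel, every @code{toy_} name and the layout of @code{struct toy_thread} are hypothetical, and the numeric return codes are made up for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the return codes named in the text; the numeric
   values are invented for this sketch. */
enum toy_kern_return { TOY_KERN_SUCCESS = 0, TOY_KERN_FAILURE = 1 };

/* Hypothetical model of the thread state that matters here. */
struct toy_thread {
    int suspend_count;   /* the thread runs only while this is zero */
    bool in_syscall;     /* blocked inside a kernel primitive? */
    bool interrupted;    /* the primitive will return an interrupted code */
    unsigned long pc;    /* user program counter, part of the user state */
};

enum toy_kern_return toy_thread_suspend(struct toy_thread *t)
{
    t->suspend_count++;              /* counts nest: may exceed one */
    return TOY_KERN_SUCCESS;
}

enum toy_kern_return toy_thread_resume(struct toy_thread *t)
{
    if (t->suspend_count == 0)
        return TOY_KERN_FAILURE;     /* count may not become negative */
    t->suspend_count--;
    return TOY_KERN_SUCCESS;
}

void toy_thread_abort(struct toy_thread *t)
{
    if (t->in_syscall) {             /* reset the kernel state only */
        t->in_syscall = false;
        t->interrupted = true;
    }
}

void toy_thread_set_pc(struct toy_thread *t, unsigned long handler)
{
    assert(t->suspend_count > 0);    /* only alter a stopped thread */
    t->pc = handler;
}

bool toy_thread_runnable(const struct toy_thread *t)
{
    return t->suspend_count == 0;
}

/* The four steps of the list, in order. */
void toy_deliver_signal(struct toy_thread *t, unsigned long handler)
{
    toy_thread_suspend(t);           /* 1. stop the thread */
    toy_thread_abort(t);             /* 2. interrupt any system call */
    toy_thread_set_pc(t, handler);   /* 3. redirect to the handler */
    toy_thread_resume(t);            /* 4. let it run again */
}
```

In this model the abort happens while the thread is suspended, so the redirected program counter cannot be overwritten by a system call that completes later, which is the point the text makes about combining @code{thread_abort} with @code{thread_set_state}.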
+ +The function returns @code{KERN_SUCCESS} if the thread received an +interrupt and @code{KERN_INVALID_ARGUMENT} if @var{target_thread} is not +a thread. +@end deftypefun + +@deftypefun kern_return_t thread_get_state (@w{thread_t @var{target_thread}}, @w{int @var{flavor}}, @w{thread_state_t @var{old_state}}, @w{mach_msg_type_number_t *@var{old_stateCnt}}) +The function @code{thread_get_state} returns the execution state +(e.g. the machine registers) of @var{target_thread} as specified by +@var{flavor}. The @var{old_state} is an array of integers that is +provided by the caller and returned filled with the specified +information. On input, @var{old_stateCnt} is set to the maximum number of +integers in @var{old_state}; on return, it holds the actual number of +integers in @var{old_state}. + +@var{target_thread} may not be @code{mach_thread_self()}. + +The definition of the state structures can be found in +@file{machine/thread_status.h}. + +The function returns @code{KERN_SUCCESS} if the state has been returned, +@code{KERN_INVALID_ARGUMENT} if @var{target_thread} is not a thread or +is @code{mach_thread_self} or @var{flavor} is unrecognized for this machine. +The function returns @code{MIG_ARRAY_TOO_LARGE} if the returned state is +too large for @var{old_state}. In this case, @var{old_state} is filled +as much as possible and @var{old_stateCnt} is set to the number of +elements that would have been returned if there were enough room. +@end deftypefun + +@deftypefun kern_return_t thread_set_state (@w{thread_t @var{target_thread}}, @w{int @var{flavor}}, @w{thread_state_t @var{new_state}}, @w{mach_msg_type_number_t @var{new_state_count}}) +The function @code{thread_set_state} sets the execution state (e.g. the +machine registers) of @var{target_thread} as specified by @var{flavor}. +The @var{new_state} is an array of integers. @var{new_state_count} is +the number of elements in @var{new_state}. The entire set of registers +is reset. 
This will do unpredictable things if @var{target_thread} is +not suspended. + +@var{target_thread} may not be @code{mach_thread_self}. + +The definition of the state structures can be found in +@file{machine/thread_status.h}. + +The function returns @code{KERN_SUCCESS} if the state has been set and +@code{KERN_INVALID_ARGUMENT} if @var{target_thread} is not a thread or +is @code{mach_thread_self} or @var{flavor} is unrecognized for this +machine. +@end deftypefun + + +@node Scheduling +@subsection Scheduling + +@menu +* Thread Priority:: Changing the priority of a thread. +* Hand-Off Scheduling:: Switching to a new thread. +* Scheduling Policy:: Setting the scheduling policy. +@end menu + + +@node Thread Priority +@subsubsection Thread Priority + +Threads have three priorities associated with them by the system, a +priority, a maximum priority, and a scheduled priority. The scheduled +priority is used to make scheduling decisions about the thread. It is +determined from the priority by the policy (for timesharing, this means +adding an increment derived from cpu usage). The priority can be set +under user control, but may never exceed the maximum priority. Changing +the maximum priority requires presentation of the control port for the +thread's processor set; since the control port for the default processor +set is privileged, users cannot raise their maximum priority to unfairly +compete with other users on that set. Newly created threads obtain +their priority from their task and their max priority from the thread. + +@deftypefun kern_return_t thread_priority (@w{thread_t @var{thread}}, @w{int @var{priority}}, @w{boolean_t @var{set_max}}) +The function @code{thread_priority} changes the priority and optionally +the maximum priority of @var{thread}. Priorities range from 0 to 31, +where lower numbers denote higher priorities. If the new priority is +higher than the priority of the current thread, preemption may occur as +a result of this call. 
The maximum priority of the thread is also set +if @var{set_max} is @code{TRUE}. This call will fail if @var{priority} +is greater than the current maximum priority of the thread. As a +result, this call can only lower the value of a thread's maximum +priority. + +The function returns @code{KERN_SUCCESS} if the operation completed +successfully, @code{KERN_INVALID_ARGUMENT} if @var{thread} is not a +thread or @var{priority} is out of range (not in 0..31), and +@code{KERN_FAILURE} if the requested operation would violate the +thread's maximum priority (thread_priority). +@end deftypefun + +@deftypefun kern_return_t thread_max_priority (@w{thread_t @var{thread}}, @w{processor_set_t @var{processor_set}}, @w{int @var{priority}}) +The function @code{thread_max_priority} changes the maximum priority of +the thread. Because it requires presentation of the corresponding +processor set port, this call can reset the maximum priority to any +legal value. + +The function returns @code{KERN_SUCCESS} if the operation completed +successfully, @code{KERN_INVALID_ARGUMENT} if @var{thread} is not a +thread or @var{processor_set} is not a control port for a processor set +or @var{priority} is out of range (not in 0..31), and +@code{KERN_FAILURE} if the thread is not assigned to the processor set +whose control port was presented. +@end deftypefun + + +@node Hand-Off Scheduling +@subsubsection Hand-Off Scheduling + +@deftypefun kern_return_t thread_switch (@w{thread_t @var{new_thread}}, @w{int @var{option}}, @w{int @var{time}}) +The function @code{thread_switch} provides low-level access to the +scheduler's context switching code. @var{new_thread} is a hint that +implements hand-off scheduling. The operating system will attempt to +switch directly to the new thread (bypassing the normal logic that +selects the next thread to run) if possible. 
Since this is a hint, it +may be incorrect; it is ignored if it doesn't specify a thread on the +same host as the current thread or if that thread can't be switched to +(i.e., not runnable or already running on another processor). In this +case, the normal logic to select the next thread to run is used; the +current thread may continue running if there is no other appropriate +thread to run. + +Options for @var{option} are defined in @file{mach/thread_switch.h} and +specify the interpretation of @var{time}. The possible values for +@var{option} are: + +@table @code +@item SWITCH_OPTION_NONE +No options, the time argument is ignored. + +@item SWITCH_OPTION_WAIT +The thread is blocked for the specified time. This can be aborted by +@code{thread_abort}. + +@item SWITCH_OPTION_DEPRESS +The thread's priority is depressed to the lowest possible value for the +specified time. This can be aborted by @code{thread_depress_abort}. +This depression is independent of operations that change the thread's +priority (e.g. @code{thread_priority} will not abort the depression). +The minimum time and units of time can be obtained as the +@code{min_timeout} value from @code{host_info}. The depression is also +aborted when the current thread is next run (either via handoff +scheduling or because the processor set has nothing better to do). +@end table + +@code{thread_switch} is often called when the current thread can proceed +no further for some reason; the various options and arguments allow +information about this reason to be transmitted to the kernel. The +@var{new_thread} argument (handoff scheduling) is useful when the +identity of the thread that must make progress before the current thread +runs again is known. The @code{WAIT} option is used when the amount of +time that the current thread must wait before it can do anything useful +can be estimated and is fairly long. 
The @code{DEPRESS} option is used +when the amount of time that must be waited is fairly short, especially +when the identity of the thread that is being waited for is not known. + +Users should beware of calling @code{thread_switch} with an invalid hint +(e.g. @code{MACH_PORT_NULL}) and no option. Because the time-sharing +scheduler varies the priority of threads based on usage, this may result +in a waste of cpu time if the thread that must be run is of lower +priority. The use of the @code{DEPRESS} option in this situation is +highly recommended. + +@code{thread_switch} ignores policies. Users relying on the preemption +semantics of a fixed time policy should be aware that +@code{thread_switch} ignores these semantics; it will run the specified +@var{new_thread} independent of its priority and the priority of any other +threads that could be run instead. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_ARGUMENT} if @var{thread} is not a thread or +@var{option} is not a recognized option, and @code{KERN_FAILURE} if +@code{thread_depress_abort} failed because the thread was not depressed. +@end deftypefun + +@deftypefun kern_return_t thread_depress_abort (@w{thread_t @var{thread}}) +The function @code{thread_depress_abort} cancels any priority depression +for @var{thread} caused by a @code{swtch_pri} or @code{thread_switch} +call. + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{thread} is not a valid thread. +@end deftypefun + +@deftypefun boolean_t swtch () +@c XXX Clear up wording. +The system trap @code{swtch} attempts to switch the current thread off +the processor. The return value indicates if more than the current +thread is running in the processor set. This is useful for lock +management routines. 
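How a lock routine might act on the return value of @code{swtch} can be sketched as follows. This is a hedged illustration, not kernel or library code: the real trap is replaced by a function pointer (@code{toy_swtch_fn}) so that the sketch compiles anywhere, the two stub functions and all @code{toy_} names are hypothetical, and the @code{max_spins} bound is an addition that only keeps the example finite. The return-value convention it follows is the one given in the paragraph after this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the swtch trap: returns true when other
   threads are running in the processor set. */
typedef bool (*toy_swtch_fn)(void);

/* Stubs used in place of the real trap for this sketch. */
bool toy_swtch_others_runnable(void) { return true; }
bool toy_swtch_alone(void) { return false; }

enum toy_lock_result {
    TOY_LOCK_ACQUIRED,      /* caller now holds the lock */
    TOY_LOCK_SHOULD_BLOCK,  /* stop spinning and really suspend */
    TOY_LOCK_GAVE_UP        /* spin bound of the sketch was reached */
};

enum toy_lock_result
toy_lock_acquire(atomic_flag *lock, toy_swtch_fn swtch_like, int max_spins)
{
    assert(max_spins > 0);
    for (int i = 0; i < max_spins; i++) {
        /* test-and-set returns the old value: false means we got it. */
        if (!atomic_flag_test_and_set(lock))
            return TOY_LOCK_ACQUIRED;
        if (swtch_like()) {
            /* Others can use the processor: check once more, then
               give up spinning. */
            if (!atomic_flag_test_and_set(lock))
                return TOY_LOCK_ACQUIRED;
            return TOY_LOCK_SHOULD_BLOCK;
        }
        /* Nothing else could run, so continuing to spin is harmless. */
    }
    return TOY_LOCK_GAVE_UP;
}
```

A caller that receives @code{TOY_LOCK_SHOULD_BLOCK} would then wait by some blocking mechanism of its own choosing; which mechanism that is lies outside what this section describes.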
+ +The call returns @code{FALSE} if the thread is justified in becoming a +resource hog by continuing to spin because there's nothing else useful +that the processor could do. @code{TRUE} is returned if the thread +should make one more check on the lock and then be a good citizen and +really suspend. +@end deftypefun + +@deftypefun boolean_t swtch_pri (@w{int @var{priority}}) +The system trap @code{swtch_pri} attempts to switch the current thread +off the processor as @code{swtch} does, but depressing the priority of +the thread to the minimum possible value during the time. +@var{priority} is not used currently. + +The return value is as for @code{swtch}. +@end deftypefun + + +@node Scheduling Policy +@subsubsection Scheduling Policy + +@deftypefun kern_return_t thread_policy (@w{thread_t @var{thread}}, @w{int @var{policy}}, @w{int @var{data}}) +The function @code{thread_policy} changes the scheduling policy for +@var{thread} to @var{policy}. + +@var{data} is policy-dependent scheduling information. There are +currently two supported policies: @code{POLICY_TIMESHARE} and +@code{POLICY_FIXEDPRI} defined in @file{mach/policy.h}; this file is +included by @file{mach.h}. @var{data} is meaningless for timesharing, +but is the quantum to be used (in milliseconds) for the fixed priority +policy. To be meaningful, this quantum must be a multiple of the basic +system quantum (min_quantum) which can be obtained from +@code{host_info}. The system will always round up to the next multiple +of the quantum. + +Processor sets may restrict the allowed policies, so this call will fail +if the processor set to which @var{thread} is currently assigned does +not permit @var{policy}. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_ARGUMENT} if @var{thread} is not a thread or +@var{policy} is not a recognized policy, and @code{KERN_FAILURE} if the +processor set to which @var{thread} is currently assigned does not +permit @var{policy}. 
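The rounding rule for the fixed priority quantum described above can be worked through with a few lines of arithmetic. The helper below is a sketch under assumptions: the name @code{toy_round_quantum} is hypothetical, the basic quantum is passed in by hand rather than fetched with @code{host_info}, and it assumes that a request which is already an exact multiple is left unchanged, which the text does not state explicitly.

```c
#include <assert.h>

/* Round a requested quantum (in milliseconds) up to a multiple of the
   basic system quantum. Both arguments must be positive. */
int toy_round_quantum(int requested_ms, int min_quantum_ms)
{
    assert(requested_ms > 0 && min_quantum_ms > 0);
    /* Number of basic quanta needed to cover the request. */
    int multiples = (requested_ms + min_quantum_ms - 1) / min_quantum_ms;
    return multiples * min_quantum_ms;
}
```

With a basic quantum of 10 milliseconds, a request of 25 would become 30 under this rule, while a request of 20 would stay at 20.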
+@end deftypefun + + +@node Thread Special Ports +@subsection Thread Special Ports + +@deftypefun kern_return_t thread_get_special_port (@w{thread_t @var{thread}}, @w{int @var{which_port}}, @w{mach_port_t *@var{special_port}}) +The function @code{thread_get_special_port} returns send rights to one +of a set of special ports for the thread specified by @var{thread}. + +The possible values for @var{which_port} are @code{THREAD_KERNEL_PORT} +and @code{THREAD_EXCEPTION_PORT}. A thread also has access to its +task's special ports. + +The function returns @code{KERN_SUCCESS} if the port was returned and +@code{KERN_INVALID_ARGUMENT} if @var{thread} is not a thread or +@var{which_port} is an invalid port selector. +@end deftypefun + +@deftypefun kern_return_t thread_get_kernel_port (@w{thread_t @var{thread}}, @w{mach_port_t *@var{kernel_port}}) +The function @code{thread_get_kernel_port} is equivalent to the function +@code{thread_get_special_port} with the @var{which_port} argument set to +@code{THREAD_KERNEL_PORT}. +@end deftypefun + +@deftypefun kern_return_t thread_get_exception_port (@w{thread_t @var{thread}}, @w{mach_port_t *@var{exception_port}}) +The function @code{thread_get_exception_port} is equivalent to the +function @code{thread_get_special_port} with the @var{which_port} +argument set to @code{THREAD_EXCEPTION_PORT}. +@end deftypefun + +@deftypefun kern_return_t thread_set_special_port (@w{thread_t @var{thread}}, @w{int @var{which_port}}, @w{mach_port_t @var{special_port}}) +The function @code{thread_set_special_port} sets one of a set of special +ports for the thread specified by @var{thread}. + +The possible values for @var{which_port} are @code{THREAD_KERNEL_PORT} +and @code{THREAD_EXCEPTION_PORT}. A thread also has access to its +task's special ports. + +The function returns @code{KERN_SUCCESS} if the port was set and +@code{KERN_INVALID_ARGUMENT} if @var{thread} is not a thread or +@var{which_port} is an invalid port selector. 
+@end deftypefun + +@deftypefun kern_return_t thread_set_kernel_port (@w{thread_t @var{thread}}, @w{mach_port_t @var{kernel_port}}) +The function @code{thread_set_kernel_port} is equivalent to the function +@code{thread_set_special_port} with the @var{which_port} argument set to +@code{THREAD_KERNEL_PORT}. +@end deftypefun + +@deftypefun kern_return_t thread_set_exception_port (@w{thread_t @var{thread}}, @w{mach_port_t @var{exception_port}}) +The function @code{thread_set_exception_port} is equivalent to the +function @code{thread_set_special_port} with the @var{which_port} +argument set to @code{THREAD_EXCEPTION_PORT}. +@end deftypefun + + +@node Exceptions +@subsection Exceptions + +@deftypefun kern_return_t catch_exception_raise (@w{mach_port_t @var{exception_port}}, @w{thread_t @var{thread}}, @w{task_t @var{task}}, @w{int @var{exception}}, @w{int @var{code}}, @w{int @var{subcode}}) +XXX Fixme +@end deftypefun + +@deftypefun kern_return_t exception_raise (@w{mach_port_t @var{exception_port}}, @w{mach_port_t @var{thread}}, @w{mach_port_t @var{task}}, @w{integer_t @var{exception}}, @w{integer_t @var{code}}, @w{integer_t @var{subcode}}) +XXX Fixme +@end deftypefun + +@deftypefun kern_return_t evc_wait (@w{unsigned int @var{event}}) +@c XXX This is for user space drivers, the description is incomplete. +The system trap @code{evc_wait} makes the calling thread wait for the +event specified by @var{event}. + +The call returns @code{KERN_SUCCESS} if the event has occurred, +@code{KERN_NO_SPACE} if another thread is waiting for the same event and +@code{KERN_INVALID_ARGUMENT} if the event object is invalid. +@end deftypefun + + +@node Task Interface +@section Task Interface + +@cindex task port +@cindex port representing a task +@deftp {Data type} task_t +This is a @code{mach_port_t} and is used to hold the port name of a task +port that represents the task. Manipulations of the task are +implemented as remote procedure calls to the task port. 
A task can get +a port to itself with the @code{mach_task_self} system call. + +The task port name is also used to identify the task's IPC space +(@pxref{Port Manipulation Interface}) and the task's virtual memory map +(@pxref{Virtual Memory Interface}). +@end deftp + +@menu +* Task Creation:: Creating tasks. +* Task Termination:: Terminating tasks. +* Task Information:: Information on tasks. +* Task Execution:: Thread scheduling in a task. +* Task Special Ports:: How to get and set the task's special ports. +* Syscall Emulation:: How to emulate system calls. +@end menu + + +@node Task Creation +@subsection Task Creation + +@deftypefun kern_return_t task_create (@w{task_t @var{parent_task}}, @w{boolean_t @var{inherit_memory}}, @w{task_t *@var{child_task}}) +The function @code{task_create} creates a new task from +@var{parent_task}; the resulting task (@var{child_task}) acquires shared +or copied parts of the parent's address space (see @code{vm_inherit}). +The child task initially contains no threads. + +If @var{inherit_memory} is set, the child task's address space is built +from the parent task according to its memory inheritance values; +otherwise, the child task is given an empty address space. + +The child task gets the three special ports created or copied for it at +task creation. The @code{TASK_KERNEL_PORT} is created and send rights +for it are given to the child and returned to the caller. +@c The following is only relevant if MACH_IPC_COMPAT is used. +@c The @code{TASK_NOTIFY_PORT} is created and receive, ownership and send rights +@c for it are given to the child. The caller has no access to it. +The @code{TASK_BOOTSTRAP_PORT} and the @code{TASK_EXCEPTION_PORT} are +inherited from the parent task. The new task can get send rights to +these ports with the call @code{task_get_special_port}. 
+ +The function returns @code{KERN_SUCCESS} if a new task has been created, +@code{KERN_INVALID_ARGUMENT} if @var{parent_task} is not a valid task +port and @code{KERN_RESOURCE_SHORTAGE} if some critical kernel resource +is unavailable. +@end deftypefun + + +@node Task Termination +@subsection Task Termination + +@deftypefun kern_return_t task_terminate (@w{task_t @var{target_task}}) +The function @code{task_terminate} destroys the task specified by +@var{target_task} and all its threads. All resources that are used only +by this task are freed. Any port to which this task has receive and +ownership rights is destroyed. + +The function returns @code{KERN_SUCCESS} if the task has been killed and +@code{KERN_INVALID_ARGUMENT} if @var{target_task} is not a task. +@end deftypefun + + +@node Task Information +@subsection Task Information + +@deftypefun task_t mach_task_self () +The @code{mach_task_self} system call returns the calling thread's task +port. + +@code{mach_task_self} has an effect equivalent to receiving a send right +for the task port. @code{mach_task_self} returns the name of the send +right. In particular, successive calls will increase the calling task's +user-reference count for the send right. + +As a special exception, the kernel will overrun the user reference count +of the task name port, so that this function can not fail for that +reason. Because of this, the user should not deallocate the port right +if an overrun might have happened. Otherwise the reference count could +drop to zero and the send right be destroyed while the user still +expects to be able to use it. As the kernel does not make use of the +number of extant send rights anyway, this is safe to do (the task port +itself is not destroyed, even when there are no send rights anymore). 
+ +The function returns @code{MACH_PORT_NULL} if a resource shortage +prevented the reception of the send right, @code{MACH_PORT_NULL} if the +task port is currently null, and @code{MACH_PORT_DEAD} if the task port is +currently dead. +@end deftypefun + +@deftypefun kern_return_t task_threads (@w{task_t @var{target_task}}, @w{thread_array_t *@var{thread_list}}, @w{mach_msg_type_number_t *@var{thread_count}}) +The function @code{task_threads} gets send rights to the kernel port for +each thread contained in @var{target_task}. @var{thread_list} is an +array that is created as a result of this call. The caller may wish to +@code{vm_deallocate} this array when the data is no longer needed. + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{target_task} is not a task. +@end deftypefun + +@deftypefun kern_return_t task_info (@w{task_t @var{target_task}}, @w{int @var{flavor}}, @w{task_info_t @var{task_info}}, @w{mach_msg_type_number_t *@var{task_info_count}}) +The function @code{task_info} returns the selected information array for +a task, as specified by @var{flavor}. @var{task_info} is an array of +integers that is supplied by the caller and filled with the specified +information. @var{task_info_count} is supplied as the maximum number of +integers in @var{task_info}. On return, it contains the actual number +of integers in @var{task_info}. The maximum number of integers returned +by any flavor is @code{TASK_INFO_MAX}. + +The type of information returned is defined by @var{flavor}, which can +be one of the following: + +@table @code +@item TASK_BASIC_INFO +The function returns basic information about the task, as defined by +@code{task_basic_info_t}. This includes the user and system time and +memory consumption. The number of integers returned is +@code{TASK_BASIC_INFO_COUNT}. + +@item TASK_EVENTS_INFO +The function returns information about events for the task as defined by +@code{task_events_info_t}. 
This includes statistics about virtual +memory and IPC events like pageouts, pageins and messages sent and +received. The number of integers returned is +@code{TASK_EVENTS_INFO_COUNT}. + +@item TASK_THREAD_TIMES_INFO +The function returns information about the total time for live threads +as defined by @code{task_thread_times_info_t}. The number of integers +returned is @code{TASK_THREAD_TIMES_INFO_COUNT}. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{target_task} is not a task or +@var{flavor} is not recognized. The function returns +@code{MIG_ARRAY_TOO_LARGE} if the returned info array is too large for +@var{task_info}. In this case, @var{task_info} is filled as much as +possible and @var{task_info_count} is set to the number of elements that +would have been returned if there were enough room. +@end deftypefun + +@deftp {Data type} {struct task_basic_info} +This structure is returned in @var{task_info} by the @code{task_info} +function and provides basic information about the task. You can cast a +variable of type @code{task_info_t} to a pointer of this type if you +provided it as the @var{task_info} parameter for the +@code{TASK_BASIC_INFO} flavor of @code{task_info}. It has the following +members: + +@table @code +@item integer_t suspend_count +suspend count for task + +@item integer_t base_priority +base scheduling priority + +@item vm_size_t virtual_size +number of virtual pages + +@item vm_size_t resident_size +number of resident pages + +@item time_value_t user_time +total user run time for terminated threads + +@item time_value_t system_time +total system run time for terminated threads + +@item time_value_t creation_time +creation time stamp +@end table +@end deftp + +@deftp {Data type} task_basic_info_t +This is a pointer to a @code{struct task_basic_info}. 
+@end deftp + +@deftp {Data type} {struct task_events_info} +This structure is returned in @var{task_info} by the @code{task_info} +function and provides event statistics for the task. You can cast a +variable of type @code{task_info_t} to a pointer of this type if you +provided it as the @var{task_info} parameter for the +@code{TASK_EVENTS_INFO} flavor of @code{task_info}. It has the +following members: + +@table @code +@item natural_t faults +number of page faults + +@item natural_t zero_fills +number of zero fill pages + +@item natural_t reactivations +number of reactivated pages + +@item natural_t pageins +number of actual pageins + +@item natural_t cow_faults +number of copy-on-write faults + +@item natural_t messages_sent +number of messages sent + +@item natural_t messages_received +number of messages received +@end table +@end deftp + +@deftp {Data type} task_events_info_t +This is a pointer to a @code{struct task_events_info}. +@end deftp + +@deftp {Data type} {struct task_thread_times_info} +This structure is returned in @var{task_info} by the @code{task_info} +function and provides the total run times of the task's live threads. You can cast a +variable of type @code{task_info_t} to a pointer of this type if you +provided it as the @var{task_info} parameter for the +@code{TASK_THREAD_TIMES_INFO} flavor of @code{task_info}. It has the +following members: + +@table @code +@item time_value_t user_time +total user run time for live threads + +@item time_value_t system_time +total system run time for live threads +@end table +@end deftp + +@deftp {Data type} task_thread_times_info_t +This is a pointer to a @code{struct task_thread_times_info}. +@end deftp + + +@node Task Execution +@subsection Task Execution + +@deftypefun kern_return_t task_suspend (@w{task_t @var{target_task}}) +The function @code{task_suspend} increments the task's suspend count and +stops all threads in the task. As long as the suspend count is positive, +newly created threads will not run. 
This call does not return until all +threads are suspended. + +The count may become greater than one, with the effect that it will take +more than one resume call to restart the task. + +The function returns @code{KERN_SUCCESS} if the task has been suspended +and @code{KERN_INVALID_ARGUMENT} if @var{target_task} is not a task. +@end deftypefun + +@deftypefun kern_return_t task_resume (@w{task_t @var{target_task}}) +The function @code{task_resume} decrements the task's suspend count. If +it becomes zero, all threads with zero suspend counts in the task are +resumed. The count may not become negative. + +The function returns @code{KERN_SUCCESS} if the task has been resumed, +@code{KERN_FAILURE} if the suspend count is already at zero and +@code{KERN_INVALID_ARGUMENT} if @var{target_task} is not a task. +@end deftypefun + +@c XXX Should probably be in the "Scheduling" node of the Thread Interface. +@deftypefun kern_return_t task_priority (@w{task_t @var{task}}, @w{int @var{priority}}, @w{boolean_t @var{change_threads}}) +The priority of a task is used only for creation of new threads; a new +thread's priority is set to the enclosing task's priority. +@code{task_priority} changes this task priority. It also sets the +priorities of all threads in the task to this new priority if +@var{change_threads} is @code{TRUE}. Existing threads are not affected +otherwise. If this priority change violates the maximum priority of +some threads, as many threads as possible will be changed and an error +code will be returned. + +The function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_ARGUMENT} if @var{task} is not a task, or +@var{priority} is not a valid priority and @code{KERN_FAILURE} if +@var{change_threads} was @code{TRUE} and the attempt to change the +priority of at least one existing thread failed because the new priority +would have exceeded that thread's maximum priority. 
+@end deftypefun + +@deftypefun kern_return_t task_ras_control (@w{task_t @var{target_task}}, @w{vm_address_t @var{start_pc}}, @w{vm_address_t @var{end_pc}}, @w{int @var{flavor}}) +The function @code{task_ras_control} manipulates a task's set of +restartable atomic sequences. If a sequence is installed, and any +thread in the task is preempted within the range +[@var{start_pc},@var{end_pc}], then the thread is resumed at +@var{start_pc}. This enables applications to build atomic sequences +which, when executed to completion, will have executed atomically. +Restartable atomic sequences are intended to be used on systems that do +not have hardware support for low-overhead atomic primitives. + +As a thread can be rolled back, the code in the sequence should have no +side effects other than a final store at @var{end_pc}. The kernel does +not guarantee that the sequence is restartable. It assumes the +application knows what it's doing. + +A task may have a finite number of atomic sequences that is defined at +compile time. + +The flavor specifies the particular operation that should be applied to +this restartable atomic sequence. Possible values for @var{flavor} are: + +@table @code +@item TASK_RAS_CONTROL_PURGE_ALL +Remove all registered sequences for this task. + +@item TASK_RAS_CONTROL_PURGE_ONE +Remove the named registered sequence for this task. + +@item TASK_RAS_CONTROL_PURGE_ALL_AND_INSTALL_ONE +Atomically remove all registered sequences and install the named +sequence. + +@item TASK_RAS_CONTROL_INSTALL_ONE +Install this sequence. 
+@end table + +The function returns @code{KERN_SUCCESS} if the operation has been +performed, @code{KERN_INVALID_ADDRESS} if the @var{start_pc} or +@var{end_pc} values are not a valid address for the requested operation +(for example, it is invalid to purge a sequence that has not been +registered), @code{KERN_RESOURCE_SHORTAGE} if an attempt was made to +install more restartable atomic sequences for a task than can be +supported by the kernel, @code{KERN_INVALID_VALUE} if a bad flavor was +specified, @code{KERN_INVALID_ARGUMENT} if @var{target_task} is not a +task and @code{KERN_FAILURE} if the call is not supported on this +configuration. +@end deftypefun + + +@node Task Special Ports +@subsection Task Special Ports + +@deftypefun kern_return_t task_get_special_port (@w{task_t @var{task}}, @w{int @var{which_port}}, @w{mach_port_t *@var{special_port}}) +The function @code{task_get_special_port} returns send rights to one of +a set of special ports for the task specified by @var{task}. + +The special ports associated with a task are the kernel port +(@code{TASK_KERNEL_PORT}), the bootstrap port +(@code{TASK_BOOTSTRAP_PORT}) and the exception port +(@code{TASK_EXCEPTION_PORT}). The bootstrap port is a port to which a +task may send a message requesting other system service ports. This +port is not used by the kernel. The task's exception port is the port +to which messages are sent by the kernel when an exception occurs and +the thread causing the exception has no exception port of its own. + +The following macros to call @code{task_get_special_port} for a specific +port are defined in @file{mach/task_special_ports.h}: +@code{task_get_exception_port} and @code{task_get_bootstrap_port}. + +The function returns @code{KERN_SUCCESS} if the port was returned and +@code{KERN_INVALID_ARGUMENT} if @var{task} is not a task or +@var{which_port} is an invalid port selector. 
+@end deftypefun + +@deftypefun kern_return_t task_get_kernel_port (@w{task_t @var{task}}, @w{mach_port_t *@var{kernel_port}}) +The function @code{task_get_kernel_port} is equivalent to the function +@code{task_get_special_port} with the @var{which_port} argument set to +@code{TASK_KERNEL_PORT}. +@end deftypefun + +@deftypefun kern_return_t task_get_exception_port (@w{task_t @var{task}}, @w{mach_port_t *@var{exception_port}}) +The function @code{task_get_exception_port} is equivalent to the +function @code{task_get_special_port} with the @var{which_port} argument +set to @code{TASK_EXCEPTION_PORT}. +@end deftypefun + +@deftypefun kern_return_t task_get_bootstrap_port (@w{task_t @var{task}}, @w{mach_port_t *@var{bootstrap_port}}) +The function @code{task_get_bootstrap_port} is equivalent to the +function @code{task_get_special_port} with the @var{which_port} argument +set to @code{TASK_BOOTSTRAP_PORT}. +@end deftypefun + +@deftypefun kern_return_t task_set_special_port (@w{task_t @var{task}}, @w{int @var{which_port}}, @w{mach_port_t @var{special_port}}) +The function @code{task_set_special_port} sets one of a set of special +ports for the task specified by @var{task}. + +The special ports associated with a task are the kernel port +(@code{TASK_KERNEL_PORT}), the bootstrap port +(@code{TASK_BOOTSTRAP_PORT}) and the exception port +(@code{TASK_EXCEPTION_PORT}). The bootstrap port is a port to which a +thread may send a message requesting other system service ports. This +port is not used by the kernel. The task's exception port is the port +to which messages are sent by the kernel when an exception occurs and +the thread causing the exception has no exception port of its own. + +The function returns @code{KERN_SUCCESS} if the port was set and +@code{KERN_INVALID_ARGUMENT} if @var{task} is not a task or +@var{which_port} is an invalid port selector. 
+@end deftypefun + +@deftypefun kern_return_t task_set_kernel_port (@w{task_t @var{task}}, @w{mach_port_t @var{kernel_port}}) +The function @code{task_set_kernel_port} is equivalent to the function +@code{task_set_special_port} with the @var{which_port} argument set to +@code{TASK_KERNEL_PORT}. +@end deftypefun + +@deftypefun kern_return_t task_set_exception_port (@w{task_t @var{task}}, @w{mach_port_t @var{exception_port}}) +The function @code{task_set_exception_port} is equivalent to the +function @code{task_set_special_port} with the @var{which_port} argument +set to @code{TASK_EXCEPTION_PORT}. +@end deftypefun + +@deftypefun kern_return_t task_set_bootstrap_port (@w{task_t @var{task}}, @w{mach_port_t @var{bootstrap_port}}) +The function @code{task_set_bootstrap_port} is equivalent to the +function @code{task_set_special_port} with the @var{which_port} argument +set to @code{TASK_BOOTSTRAP_PORT}. +@end deftypefun + + +@node Syscall Emulation +@subsection Syscall Emulation + +@deftypefun kern_return_t task_get_emulation_vector (@w{task_t @var{task}}, @w{int *@var{vector_start}}, @w{emulation_vector_t *@var{emulation_vector}}, @w{mach_msg_type_number_t *@var{emulation_vector_count}}) +The function @code{task_get_emulation_vector} gets the user-level +handler entry points for all emulated system calls. +@c XXX Fixme +@end deftypefun + +@deftypefun kern_return_t task_set_emulation_vector (@w{task_t @var{task}}, @w{int @var{vector_start}}, @w{emulation_vector_t @var{emulation_vector}}, @w{mach_msg_type_number_t @var{emulation_vector_count}}) +The function @code{task_set_emulation_vector} establishes user-level +handlers for the specified system calls. Non-emulated system calls are +specified with an entry of @code{EML_ROUTINE_NULL}. System call +emulation handlers are inherited by the children of @var{task}. 
+@c XXX Fixme +@end deftypefun + +@deftypefun kern_return_t task_set_emulation (@w{task_t @var{task}}, @w{vm_address_t @var{routine_entry_pt}}, @w{int @var{routine_number}}) +The function @code{task_set_emulation} establishes a user-level handler +for the specified system call. System call emulation handlers are +inherited by the children of @var{task}. +@c XXX Fixme +@end deftypefun + +@c XXX Fixme datatype emulation_vector_t + + +@node Profiling +@section Profiling + +@deftypefun kern_return_t task_enable_pc_sampling (@w{task_t @var{task}}, @w{int *@var{ticks}}, @w{sampled_pc_flavor_t @var{flavor}}) +@deftypefunx kern_return_t thread_enable_pc_sampling (@w{thread_t @var{thread}}, @w{int *@var{ticks}}, @w{sampled_pc_flavor_t @var{flavor}}) +The function @code{task_enable_pc_sampling} enables PC sampling for +@var{task}; the function @code{thread_enable_pc_sampling} enables PC +sampling for @var{thread}. The kernel's idea of clock granularity is +returned in @var{ticks} in microseconds (this value should not be trusted). The +sampling flavor is specified by @var{flavor}. + +The function returns @code{KERN_SUCCESS} if the operation is completed successfully +and @code{KERN_INVALID_ARGUMENT} if @var{thread} is not a valid thread. +@end deftypefun + +@deftypefun kern_return_t task_disable_pc_sampling (@w{task_t @var{task}}, @w{int *@var{sample_count}}) +@deftypefunx kern_return_t thread_disable_pc_sampling (@w{thread_t @var{thread}}, @w{int *@var{sample_count}}) +The function @code{task_disable_pc_sampling} disables PC sampling for +@var{task}; the function @code{thread_disable_pc_sampling} disables PC +sampling for @var{thread}. The number of sample elements in the kernel +for the thread is returned in @var{sample_count}. + +The function returns @code{KERN_SUCCESS} if the operation is completed successfully +and @code{KERN_INVALID_ARGUMENT} if @var{thread} is not a valid thread. 
+@end deftypefun + +@deftypefun kern_return_t task_get_sampled_pcs (@w{task_t @var{task}}, @w{sampled_pc_seqno_t *@var{seqno}}, @w{sampled_pc_array_t @var{sampled_pcs}}, @w{mach_msg_type_number_t *@var{sample_count}}) +@deftypefunx kern_return_t thread_get_sampled_pcs (@w{thread_t @var{thread}}, @w{sampled_pc_seqno_t *@var{seqno}}, @w{sampled_pc_array_t @var{sampled_pcs}}, @w{int *@var{sample_count}}) +The function @code{task_get_sampled_pcs} extracts the PC samples for +@var{task}; the function @code{thread_get_sampled_pcs} extracts the PC +samples for @var{thread}. @var{seqno} is the sequence number of the +sampled PCs. This is useful for determining when a collector thread has +missed a sample. The sampled PCs for the thread are returned in +@var{sampled_pcs}. @var{sample_count} contains the number of sample +elements returned. + +The function returns @code{KERN_SUCCESS} if the operation is completed successfully, +@code{KERN_INVALID_ARGUMENT} if @var{thread} is not a valid thread and +@code{KERN_FAILURE} if @var{thread} is not sampled. +@end deftypefun + + +@deftp {Data type} sampled_pc_t +This structure is returned in @var{sampled_pcs} by the +@code{thread_get_sampled_pcs} and @code{task_get_sampled_pcs} functions +and provides PC samples for threads or tasks. It has the following +members: + +@table @code +@item natural_t id +A thread-specific unique identifier. + +@item vm_offset_t pc +A PC value. + +@item sampled_pc_flavor_t sampletype +The type of the sample as per flavor. +@end table +@end deftp + + +@deftp {Data type} sampled_pc_flavor_t +This data type specifies a PC sample flavor, either as argument passed +in @var{flavor} to the @code{task_enable_pc_sampling} and +@code{thread_enable_pc_sampling} functions, or as member +@code{sampletype} in the @code{sampled_pc_t} data type. 
The flavor is a +bitwise-or of the possible flavors defined in @file{mach/pc_sample.h}: + +@table @code +@item SAMPLED_PC_PERIODIC +default +@item SAMPLED_PC_VM_ZFILL_FAULTS +zero-filled fault +@item SAMPLED_PC_VM_REACTIVATION_FAULTS +reactivation fault +@item SAMPLED_PC_VM_PAGEIN_FAULTS +pagein fault +@item SAMPLED_PC_VM_COW_FAULTS +copy-on-write fault +@item SAMPLED_PC_VM_FAULTS_ANY +any fault +@item SAMPLED_PC_VM_FAULTS +the bitwise-or of @code{SAMPLED_PC_VM_ZFILL_FAULTS}, +@code{SAMPLED_PC_VM_REACTIVATION_FAULTS}, +@code{SAMPLED_PC_VM_PAGEIN_FAULTS} and @code{SAMPLED_PC_VM_COW_FAULTS}. +@end table +@end deftp + +@c XXX sampled_pc_array_t, sampled_pc_seqno_t + + +@node Host Interface +@chapter Host Interface +@cindex host interface + +This section describes the Mach interface to a host executing a Mach +kernel. The interface allows you to query statistics about a host and +control its behaviour. + +A host is represented by two ports, a name port @var{host} used to query +information about the host accessible to everyone, and a control port +@var{host_priv} used to manipulate it. For example, you can query the +current time using the name port, but to change the time you need to +send a message to the host control port. + +Everything described in this section is declared in the header file +@file{mach.h}. + +@menu +* Host Ports:: Ports representing a host. +* Host Information:: Retrieval of information about a host. +* Host Time:: Operations on the time as seen by a host. +* Host Reboot:: Rebooting the system. +@end menu + + +@node Host Ports +@section Host Ports +@cindex host ports +@cindex ports representing a host + +@cindex host name port +@deftp {Data type} host_t +This is a @code{mach_port_t} and is used to hold the port name of a host +name port (host port, for short). Any task can get a send right to the +name port of the host running the task using the @code{mach_host_self} +system call. 
The name port can be used to query information about the +host, for example the current time. +@end deftp + +@deftypefun host_t mach_host_self () +The @code{mach_host_self} system call returns the calling thread's host +name port. It has an effect equivalent to receiving a send right for +the host port. @code{mach_host_self} returns the name of the send +right. In particular, successive calls will increase the calling task's +user-reference count for the send right. + +As a special exception, the kernel will overrun the user reference count +of the host name port, so that this function cannot fail for that +reason. Because of this, the user should not deallocate the port right +if an overrun might have happened. Otherwise the reference count could +drop to zero and the send right be destroyed while the user still +expects to be able to use it. As the kernel does not make use of the +number of extant send rights anyway, this is safe to do (the host port +itself is never destroyed). + +The function returns @code{MACH_PORT_NULL} if a resource shortage +prevented the reception of the send right. + +This function is also available in @file{mach/mach_traps.h}. +@end deftypefun + +@cindex host control port +@deftp {Data type} host_priv_t +This is a @code{mach_port_t} and is used to hold the port name of a +privileged host control port. A send right to the host control port is +inserted into the first task at bootstrap (@pxref{Modules}). This is +the only way to get access to the host control port in Mach, so the +initial task has to preserve the send right carefully, moving a copy of +it to other privileged tasks if necessary and denying access to +unprivileged tasks. +@end deftp + + +@node Host Information +@section Host Information + +@deftypefun kern_return_t host_info (@w{host_t @var{host}}, @w{int @var{flavor}}, @w{host_info_t @var{host_info}}, @w{mach_msg_type_number_t *@var{host_info_count}}) +The @code{host_info} function returns various information about +@var{host}. 
@var{host_info} is an array of integers that is supplied by +the caller. It will be filled with the requested information. +@var{host_info_count} is supplied as the maximum number of integers in +@var{host_info}. On return, it contains the actual number of integers +in @var{host_info}. The maximum number of integers returned by any +flavor is @code{HOST_INFO_MAX}. + +The type of information returned is defined by @var{flavor}, which can +be one of the following: + +@table @code +@item HOST_BASIC_INFO +The function returns basic information about the host, as defined by +@code{host_basic_info_t}. This includes the number of processors, their +type, and the amount of memory installed in the system. The number of +integers returned is @code{HOST_BASIC_INFO_COUNT}. For how to get more +information about the processor, see @ref{Processor Interface}. + +@item HOST_PROCESSOR_SLOTS +The function returns the numbers of the slots with active processors in +them. The number of integers returned can be up to @code{max_cpus}, as +returned by the @code{HOST_BASIC_INFO} flavor of @code{host_info}. + +@item HOST_SCHED_INFO +The function returns information of interest to schedulers as defined by +@code{host_sched_info_t}. The number of integers returned is +@code{HOST_SCHED_INFO_COUNT}. +@end table + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{host} is not a host or @var{flavor} +is not recognized. The function returns @code{MIG_ARRAY_TOO_LARGE} if +the returned info array is too large for @var{host_info}. In this case, +@var{host_info} is filled as much as possible and @var{host_info_count} +is set to the number of elements that would be returned if there were +enough room. +@c BUGS Availability limited. Systems without this call support a +@c host_info call with an incompatible calling sequence. 
+@end deftypefun + +@deftp {Data type} {struct host_basic_info} +A pointer to this structure is returned in @var{host_info} by the +@code{host_info} function and provides basic information about the host. +You can cast a variable of type @code{host_info_t} to a pointer of this +type if you provided it as the @var{host_info} parameter for the +@code{HOST_BASIC_INFO} flavor of @code{host_info}. It has the following +members: + +@table @code +@item int max_cpus +The maximum number of possible processors for which the kernel is +configured. + +@item int avail_cpus +The number of cpus currently available. + +@item vm_size_t memory_size +The size of physical memory in bytes. + +@item cpu_type_t cpu_type +The type of the master processor. + +@item cpu_subtype_t cpu_subtype +The subtype of the master processor. +@end table + +The type and subtype of the individual processors are also available +by @code{processor_info}, see @ref{Processor Interface}. +@end deftp + +@deftp {Data type} host_basic_info_t +This is a pointer to a @code{struct host_basic_info}. +@end deftp + +@deftp {Data type} {struct host_sched_info} +A pointer to this structure is returned in @var{host_info} by the +@code{host_info} function and provides information of interest to +schedulers. You can cast a variable of type @code{host_info_t} to a +pointer of this type if you provided it as the @var{host_info} parameter +for the @code{HOST_SCHED_INFO} flavor of @code{host_info}. It has the +following members: + +@table @code +@item int min_timeout +The minimum timeout and unit of time in milliseconds. + +@item int min_quantum +The minimum quantum and unit of quantum in milliseconds. +@end table +@end deftp + +@deftp {Data type} host_sched_info_t +This is a pointer to a @code{struct host_sched_info}. 
+@end deftp + +@deftypefun kern_return_t host_kernel_version (@w{host_t @var{host}}, @w{kernel_version_t *@var{version}}) +The @code{host_kernel_version} function returns the version string +compiled into the kernel executing on @var{host} at the time it was +built in the character string @var{version}. This string describes the +version of the kernel. The constant @code{KERNEL_VERSION_MAX} should be +used to dimension storage for the returned string if the +@code{kernel_version_t} declaration is not used. + +If the version string compiled into the kernel is longer than +@code{KERNEL_VERSION_MAX}, the result is truncated and not necessarily +null-terminated. + +If @var{host} is not a valid send right to a host port, the function +returns @code{KERN_INVALID_ARGUMENT}. If @var{version} points to +inaccessible memory, it returns @code{KERN_INVALID_ADDRESS}, and +@code{KERN_SUCCESS} otherwise. +@end deftypefun + +@deftypefun kern_return_t host_get_boot_info (@w{host_priv_t @var{host_priv}}, @w{kernel_boot_info_t @var{boot_info}}) +The @code{host_get_boot_info} function returns the boot-time information +string supplied by the operator to the kernel executing on +@var{host_priv} in the character string @var{boot_info}. The constant +@code{KERNEL_BOOT_INFO_MAX} should be used to dimension storage for the +returned string if the @code{kernel_boot_info_t} declaration is not +used. + +If the boot-time information string supplied by the operator is longer +than @code{KERNEL_BOOT_INFO_MAX}, the result is truncated and not +necessarily null-terminated. +@end deftypefun + + +@node Host Time +@section Host Time + +@deftp {Data type} time_value_t +This is the representation of a time in Mach. It is a @code{struct +time_value} and consists of the following members: + +@table @code +@item integer_t seconds +The number of seconds. +@item integer_t microseconds +The number of microseconds. 
+@end table +@end deftp + +The number of microseconds should always be smaller than +@code{TIME_MICROS_MAX} (1000000). A time with this property is +@dfn{normalized}. Normalized time values can be manipulated with the +following macros: + +@defmac time_value_add_usec (@w{time_value_t *@var{val}}, @w{integer_t *@var{micros}}) +Add @var{micros} microseconds to @var{val}. If @var{val} is normalized +and @var{micros} smaller than @code{TIME_MICROS_MAX}, @var{val} will be +normalized afterwards. +@end defmac + +@defmac time_value_add (@w{time_value_t *@var{result}}, @w{time_value_t *@var{addend}}) +Add the values in @var{addend} to @var{result}. If both are normalized, +@var{result} will be normalized afterwards. +@end defmac + +A variable of type @code{time_value_t} can either represent a duration +or a fixed point in time. In the latter case, it shall be interpreted as +the number of seconds and microseconds after the epoch, 1 January 1970. + +@deftypefun kern_return_t host_get_time (@w{host_t @var{host}}, @w{time_value_t *@var{current_time}}) +Get the current time as seen by @var{host}. On success, the time passed +since the epoch is returned in @var{current_time}. +@end deftypefun + +@deftypefun kern_return_t host_set_time (@w{host_priv_t @var{host_priv}}, @w{time_value_t @var{new_time}}) +Set the time of @var{host_priv} to @var{new_time}. +@end deftypefun + +@deftypefun kern_return_t host_adjust_time (@w{host_priv_t @var{host_priv}}, @w{time_value_t @var{new_adjustment}}, @w{time_value_t *@var{old_adjustment}}) +Arrange for the current time as seen by @var{host_priv} to be gradually +changed by the adjustment value @var{new_adjustment}, and return the old +adjustment value in @var{old_adjustment}. +@end deftypefun + +For efficiency, the current time is available through a mapped-time +interface. + +@deftp {Data type} mapped_time_value_t +This structure defines the mapped-time interface. 
It has the following +members: + +@table @code +@item integer_t seconds +The number of seconds. + +@item integer_t microseconds +The number of microseconds. + +@item integer_t check_seconds +This is a copy of the seconds value, which must be checked to protect +against a race condition when reading out the two time values. +@end table +@end deftp + +Here is an example of how to read out the current time using the +mapped-time interface: + +@c XXX Complete the example. +@example +do + @{ + secs = mtime->seconds; + usecs = mtime->microseconds; + @} +while (secs != mtime->check_seconds); +@end example + + +@node Host Reboot +@section Host Reboot + +@deftypefun kern_return_t host_reboot (@w{host_priv_t @var{host_priv}}, @w{int @var{options}}) +Reboot the host specified by @var{host_priv}. The argument +@var{options} specifies the flags. The available flags are defined in +@file{sys/reboot.h}: + +@table @code +@item RB_HALT +Do not reboot, but halt the machine. + +@item RB_DEBUGGER +Do not reboot, but enter kernel debugger from user space. +@end table + +If successful, the function might not return. +@end deftypefun + + +@node Processors and Processor Sets +@chapter Processors and Processor Sets + +This section describes the Mach interface to processor sets and +individual processors. The interface allows you to group processors into +sets and control the processors and processor sets. + +A processor is not a central part of the interface. It is mostly of +relevance as a part of a processor set. Threads are always assigned to +processor sets, and all processors in a set are equally involved in +executing all threads assigned to that set. + +The processor set is represented by two ports, a name port +@var{processor_set_name} used to query information about the processor set +accessible to everyone, and a control port @var{processor_set} used to +manipulate it. + +@menu +* Processor Set Interface:: How to work with processor sets. 
+* Processor Interface:: How to work with individual processors. +@end menu + + +@node Processor Set Interface +@section Processor Set Interface + +@menu +* Processor Set Ports:: Ports representing a processor set. +* Processor Set Access:: How the processor sets are accessed. +* Processor Set Creation:: How new processor sets are created. +* Processor Set Destruction:: How processor sets are destroyed. +* Tasks and Threads on Sets:: Assigning tasks, threads to processor sets. +* Processor Set Priority:: Specifying the priority of a processor set. +* Processor Set Policy:: Changing the processor set policies. +* Processor Set Info:: Obtaining information about a processor set. +@end menu + + +@node Processor Set Ports +@subsection Processor Set Ports +@cindex processor set ports +@cindex ports representing a processor set + +@cindex processor set name port +@cindex port representing a processor set name +@deftp {Data type} processor_set_name_t +This is a @code{mach_port_t} and is used to hold the port name of a +processor set name port that names the processor set. Any task can get +a send right to the name port of a processor set. The processor set name +port can be used to obtain information about the processor set. +@end deftp + +@cindex processor set port +@deftp {Data type} processor_set_t +This is a @code{mach_port_t} and is used to hold the port name of a +privileged processor set control port that represents the processor set. +Operations on the processor set are implemented as remote procedure +calls to the processor set port. The processor set port can be used to +manipulate the processor set.
+@end deftp + + +@node Processor Set Access +@subsection Processor Set Access + +@deftypefun kern_return_t host_processor_sets (@w{host_t @var{host}}, @w{processor_set_name_array_t *@var{processor_sets}}, @w{mach_msg_type_number_t *@var{processor_sets_count}}) +The function @code{host_processor_sets} gets send rights to the name +port for each processor set currently assigned to @var{host}. + +@code{host_processor_set_priv} can be used to obtain the control ports +from these if desired. @var{processor_sets} is an array that is +created as a result of this call. The caller may wish to +@code{vm_deallocate} this array when the data is no longer needed. +@var{processor_sets_count} is set to the number of processor sets in +@var{processor_sets}. + +This function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{host} is not a host. +@end deftypefun + +@deftypefun kern_return_t host_processor_set_priv (@w{host_priv_t @var{host_priv}}, @w{processor_set_name_t @var{set_name}}, @w{processor_set_t *@var{set}}) +The function @code{host_processor_set_priv} allows a privileged +application to obtain the control port @var{set} for an existing +processor set from its name port @var{set_name}. The privileged host +port @var{host_priv} is required. + +This function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{host_priv} is not a valid host +control port. +@end deftypefun + +@deftypefun kern_return_t processor_set_default (@w{host_t @var{host}}, @w{processor_set_name_t *@var{default_set}}) +The function @code{processor_set_default} returns the default processor +set of @var{host} in @var{default_set}. The default processor set is +used by all threads, tasks, and processors that are not explicitly +assigned to other sets. @code{processor_set_default} returns a port that can +be used to obtain information about this set (e.g. how many threads are +assigned to it).
This port cannot be used to perform operations on that +set. + +This function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_ARGUMENT} if @var{host} is not a host and +@code{KERN_INVALID_ADDRESS} if @var{default_set} points to +inaccessible memory. +@end deftypefun + + +@node Processor Set Creation +@subsection Processor Set Creation + +@deftypefun kern_return_t processor_set_create (@w{host_t @var{host}}, @w{processor_set_t *@var{new_set}}, @w{processor_set_name_t *@var{new_name}}) +The function @code{processor_set_create} creates a new processor set on +@var{host} and returns the two ports associated with it. The port +returned in @var{new_set} is the actual port representing the set. It +is used to perform operations such as assigning processors, tasks, or +threads. The port returned in @var{new_name} identifies the set, and is +used to obtain information about the set. + +This function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_ARGUMENT} if @var{host} is not a host, +@code{KERN_INVALID_ADDRESS} if @var{new_set} or @var{new_name} points to +inaccessible memory and @code{KERN_FAILURE} if the operating system does +not support processor allocation. +@end deftypefun + + +@node Processor Set Destruction +@subsection Processor Set Destruction + +@deftypefun kern_return_t processor_set_destroy (@w{processor_set_t @var{processor_set}}) +The function @code{processor_set_destroy} destroys the specified +processor set. Any assigned processors, tasks, or threads are +reassigned to the default set. The object port for the processor set is +required (not the name port). The default processor set cannot be +destroyed.
+ +This function returns @code{KERN_SUCCESS} if the set was destroyed, +@code{KERN_FAILURE} if an attempt was made to destroy the default +processor set, or the operating system does not support processor +allocation, and @code{KERN_INVALID_ARGUMENT} if @var{processor_set} is +not a valid processor set control port. +@end deftypefun + + +@node Tasks and Threads on Sets +@subsection Tasks and Threads on Sets + +@deftypefun kern_return_t processor_set_tasks (@w{processor_set_t @var{processor_set}}, @w{task_array_t *@var{task_list}}, @w{mach_msg_type_number_t *@var{task_count}}) +The function @code{processor_set_tasks} gets send rights to the kernel +port for each task currently assigned to @var{processor_set}. + +@var{task_list} is an array that is created as a result of this call. +The caller may wish to @code{vm_deallocate} this array when the data is +no longer needed. @var{task_count} is set to the number of tasks in +@var{task_list}. + +This function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{processor_set} is not a processor +set. +@end deftypefun + +@deftypefun kern_return_t processor_set_threads (@w{processor_set_t @var{processor_set}}, @w{thread_array_t *@var{thread_list}}, @w{mach_msg_type_number_t *@var{thread_count}}) +The function @code{processor_set_threads} gets send rights to the kernel +port for each thread currently assigned to @var{processor_set}. + +@var{thread_list} is an array that is created as a result of this call. +The caller may wish to @code{vm_deallocate} this array when the data is +no longer needed. @var{thread_count} is set to the number of threads in +@var{thread_list}. + +This function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{processor_set} is not a processor +set.
+@end deftypefun + +@deftypefun kern_return_t task_assign (@w{task_t @var{task}}, @w{processor_set_t @var{processor_set}}, @w{boolean_t @var{assign_threads}}) +The function @code{task_assign} assigns @var{task} to the set +@var{processor_set}. This assignment is for the purposes of determining +the initial assignment of newly created threads in @var{task}. Any previous +assignment of the task is nullified. Existing threads within the task +are also reassigned if @var{assign_threads} is @code{TRUE}. They are +not affected if it is @code{FALSE}. + +This function returns @code{KERN_SUCCESS} if the assignment has been +performed and @code{KERN_INVALID_ARGUMENT} if @var{task} is not a task, +or @var{processor_set} is not a processor set on the same host as +@var{task}. +@end deftypefun + +@deftypefun kern_return_t task_assign_default (@w{task_t @var{task}}, @w{boolean_t @var{assign_threads}}) +The function @code{task_assign_default} is a variant of +@code{task_assign} that assigns the task to the default processor set on +that task's host. This variant exists because the control port for the +default processor set is privileged and not usually available to users. + +This function returns @code{KERN_SUCCESS} if the assignment has been +performed and @code{KERN_INVALID_ARGUMENT} if @var{task} is not a task. +@end deftypefun + +@deftypefun kern_return_t task_get_assignment (@w{task_t @var{task}}, @w{processor_set_name_t *@var{assigned_set}}) +The function @code{task_get_assignment} returns the name of the +processor set to which the task is currently assigned in +@var{assigned_set}. This port can only be used to obtain information +about the processor set. + +This function returns @code{KERN_SUCCESS} if the call +succeeded, @code{KERN_INVALID_ADDRESS} if @var{assigned_set} points to +inaccessible memory, and @code{KERN_INVALID_ARGUMENT} if @var{task} is +not a task.
+@end deftypefun + +@deftypefun kern_return_t thread_assign (@w{thread_t @var{thread}}, @w{processor_set_t @var{processor_set}}) +The function @code{thread_assign} assigns @var{thread} to the set +@var{processor_set}. After the assignment is completed, the thread only +executes on processors assigned to the designated processor set. If +there are no such processors, then the thread is unable to execute. Any +previous assignment of the thread is nullified. Unix system call +compatibility code may temporarily force threads to execute on the +master processor. + +This function returns @code{KERN_SUCCESS} if the assignment has been +performed and @code{KERN_INVALID_ARGUMENT} if @var{thread} is not a +thread, or @var{processor_set} is not a processor set on the same host +as @var{thread}. +@end deftypefun + +@deftypefun kern_return_t thread_assign_default (@w{thread_t @var{thread}}) +The function @code{thread_assign_default} is a variant of +@code{thread_assign} that assigns the thread to the default processor +set on that thread's host. This variant exists because the control port +for the default processor set is privileged and not usually available +to users. + +This function returns @code{KERN_SUCCESS} if the assignment has been +performed and @code{KERN_INVALID_ARGUMENT} if @var{thread} is not a +thread. +@end deftypefun + +@deftypefun kern_return_t thread_get_assignment (@w{thread_t @var{thread}}, @w{processor_set_name_t *@var{assigned_set}}) +The function @code{thread_get_assignment} returns the name of the +processor set to which the thread is currently assigned in +@var{assigned_set}. This port can only be used to obtain information +about the processor set. + +This function returns @code{KERN_SUCCESS} if the call +succeeded, @code{KERN_INVALID_ADDRESS} if @var{assigned_set} points to +inaccessible memory, and @code{KERN_INVALID_ARGUMENT} if @var{thread} is +not a thread.
+@end deftypefun + + +@node Processor Set Priority +@subsection Processor Set Priority + +@deftypefun kern_return_t processor_set_max_priority (@w{processor_set_t @var{processor_set}}, @w{int @var{max_priority}}, @w{boolean_t @var{change_threads}}) +The function @code{processor_set_max_priority} is used to set the +maximum priority for a processor set. The priority of a processor set +is used only for newly created threads (the thread's maximum priority is set +to the processor set's) and the assignment of threads to the set (the thread's +maximum priority is reduced if it exceeds the set's maximum priority, and the +thread's priority is similarly reduced). +@code{processor_set_max_priority} changes this priority. It also sets +the maximum priority of all threads assigned to the processor set to +this new priority if @var{change_threads} is @code{TRUE}. If this +maximum priority is less than the priorities of any of these threads, +their priorities will also be set to this new value. + +This function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{processor_set} is not a processor +set or @var{max_priority} is not a valid priority. +@end deftypefun + + +@node Processor Set Policy +@subsection Processor Set Policy + +@deftypefun kern_return_t processor_set_policy_enable (@w{processor_set_t @var{processor_set}}, @w{int @var{policy}}) +@deftypefunx kern_return_t processor_set_policy_disable (@w{processor_set_t @var{processor_set}}, @w{int @var{policy}}, @w{boolean_t @var{change_threads}}) +Processor sets may restrict the scheduling policies to be used for +threads assigned to them. These two calls provide the mechanism for +designating permitted and forbidden policies. The current set of +permitted policies can be obtained from @code{processor_set_info}. +Timesharing may not be forbidden by any processor set.
This is a +compromise to reduce the complexity of the assign operation; any thread +whose policy is forbidden by the target processor set has its policy +reset to timesharing. If the @var{change_threads} argument to +@code{processor_set_policy_disable} is true, threads currently assigned +to this processor set and using the newly disabled policy will have +their policy reset to timesharing. + +@file{mach/policy.h} contains the allowed policies; it is included by +@file{mach.h}. Not all policies (e.g. fixed priority) are supported by +all systems. + +This function returns @code{KERN_SUCCESS} if the operation was completed +successfully and @code{KERN_INVALID_ARGUMENT} if @var{processor_set} is +not a processor set or @var{policy} is not a valid policy, or an attempt +was made to disable timesharing. +@end deftypefun + + +@node Processor Set Info +@subsection Processor Set Info + +@deftypefun kern_return_t processor_set_info (@w{processor_set_name_t @var{set_name}}, @w{int @var{flavor}}, @w{host_t *@var{host}}, @w{processor_set_info_t @var{processor_set_info}}, @w{mach_msg_type_number_t *@var{processor_set_info_count}}) +The function @code{processor_set_info} returns the selected information array +for a processor set, as specified by @var{flavor}. + +@var{host} is set to the host on which the processor set resides. This +is the non-privileged host port. + +@var{processor_set_info} is an array of integers that is supplied by the +caller and returned filled with specified information. +@var{processor_set_info_count} is supplied as the maximum number of +integers in @var{processor_set_info}. On return, it contains the actual +number of integers in @var{processor_set_info}. The maximum number of +integers returned by any flavor is @code{PROCESSOR_SET_INFO_MAX}. 
+ +The type of information returned is defined by @var{flavor}, which can +be one of the following: + +@table @code +@item PROCESSOR_SET_BASIC_INFO +The function returns basic information about the processor set, as +defined by @code{processor_set_basic_info_t}. This includes the number +of tasks and threads assigned to the processor set. The number of +integers returned is @code{PROCESSOR_SET_BASIC_INFO_COUNT}. + +@item PROCESSOR_SET_SCHED_INFO +The function returns information about the scheduling policy for the +processor set as defined by @code{processor_set_sched_info_t}. The +number of integers returned is @code{PROCESSOR_SET_SCHED_INFO_COUNT}. +@end table + +Some machines may define additional (machine-dependent) flavors. + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{set_name} is not a processor +set or @var{flavor} is not recognized. The function returns +@code{MIG_ARRAY_TOO_LARGE} if the returned info array is too large for +@var{processor_set_info}. In this case, @var{processor_set_info} is +filled as much as possible and @var{processor_set_info_count} is set to the +number of elements that would have been returned if there were enough +room. +@end deftypefun + +@deftp {Data type} {struct processor_set_basic_info} +This structure is returned in @var{processor_set_info} by the +@code{processor_set_info} function and provides basic information about +the processor set. You can cast a variable of type +@code{processor_set_info_t} to a pointer of this type if you provided it +as the @var{processor_set_info} parameter for the +@code{PROCESSOR_SET_BASIC_INFO} flavor of @code{processor_set_info}.
It +has the following members: + +@table @code +@item int processor_count +number of processors + +@item int task_count +number of tasks + +@item int thread_count +number of threads + +@item int load_average +scaled load average + +@item int mach_factor +scaled mach factor +@end table +@end deftp + +@deftp {Data type} processor_set_basic_info_t +This is a pointer to a @code{struct processor_set_basic_info}. +@end deftp + +@deftp {Data type} {struct processor_set_sched_info} +This structure is returned in @var{processor_set_info} by the +@code{processor_set_info} function and provides scheduling information +about the processor set. You can cast a variable of type +@code{processor_set_info_t} to a pointer of this type if you provided it +as the @var{processor_set_info} parameter for the +@code{PROCESSOR_SET_SCHED_INFO} flavor of @code{processor_set_info}. It +has the following members: + +@table @code +@item int policies +allowed policies + +@item int max_priority +max priority for new threads +@end table +@end deftp + +@deftp {Data type} processor_set_sched_info_t +This is a pointer to a @code{struct processor_set_sched_info}. +@end deftp + + +@node Processor Interface +@section Processor Interface + +@cindex processor port +@cindex port representing a processor +@deftp {Data type} processor_t +This is a @code{mach_port_t} and is used to hold the port name of a +processor port that represents the processor. Operations on the +processor are implemented as remote procedure calls to the processor +port. +@end deftp + +@menu +* Hosted Processors:: Getting a list of all processors on a host. +* Processor Control:: Starting, stopping, controlling processors. +* Processors and Sets:: Combining processors into processor sets. +* Processor Info:: Obtaining information on processors.
+@end menu + + +@node Hosted Processors +@subsection Hosted Processors + +@deftypefun kern_return_t host_processors (@w{host_priv_t @var{host_priv}}, @w{processor_array_t *@var{processor_list}}, @w{mach_msg_type_number_t *@var{processor_count}}) +The function @code{host_processors} gets send rights to the processor +port for each processor existing on @var{host_priv}. This is the +privileged port that allows its holder to control a processor. + +@var{processor_list} is an array that is created as a result of this +call. The caller may wish to @code{vm_deallocate} this array when the +data is no longer needed. @var{processor_count} is set to the number of +processors in the @var{processor_list}. + +This function returns @code{KERN_SUCCESS} if the call succeeded, +@code{KERN_INVALID_ARGUMENT} if @var{host_priv} is not a privileged host +port, and @code{KERN_INVALID_ADDRESS} if @var{processor_count} points to +inaccessible memory. +@end deftypefun + + +@node Processor Control +@subsection Processor Control + +@deftypefun kern_return_t processor_start (@w{processor_t @var{processor}}) +@deftypefunx kern_return_t processor_exit (@w{processor_t @var{processor}}) +@deftypefunx kern_return_t processor_control (@w{processor_t @var{processor}}, @w{processor_info_t *@var{cmd}}, @w{mach_msg_type_number_t @var{count}}) +Some multiprocessors may allow privileged software to control +processors. The @code{processor_start}, @code{processor_exit}, and +@code{processor_control} operations implement this. The interpretation +of the command in @var{cmd} is machine dependent. A newly started +processor is assigned to the default processor set. An exited processor +is removed from the processor set to which it was assigned and ceases to +be active. + +@var{count} contains the length of the command @var{cmd} as a number of +ints. + +Availability limited. All of these operations are machine-dependent. +They may do nothing. 
The ability to restart an exited processor is also +machine-dependent. + +This function returns @code{KERN_SUCCESS} if the operation was +performed, @code{KERN_FAILURE} if the operation was not performed (a +likely reason is that it is not supported on this processor), +@code{KERN_INVALID_ARGUMENT} if @var{processor} is not a processor, and +@code{KERN_INVALID_ADDRESS} if @var{cmd} points to inaccessible memory. +@end deftypefun + +@node Processors and Sets +@subsection Processors and Sets + +@deftypefun kern_return_t processor_assign (@w{processor_t @var{processor}}, @w{processor_set_t @var{processor_set}}, @w{boolean_t @var{wait}}) +The function @code{processor_assign} assigns @var{processor} to the +set @var{processor_set}. After the assignment is completed, the +processor only executes threads that are assigned to that processor set. +Any previous assignment of the processor is nullified. The master +processor cannot be reassigned. All processors take clock interrupts at +all times. The @var{wait} argument indicates whether the caller should +wait for the assignment to be completed or should return immediately. +Dedicated kernel threads are used to perform processor assignment, so +setting @var{wait} to @code{FALSE} allows assignment requests to be queued and +performed faster, especially if the kernel has more than one dedicated +internal thread for processor assignment. Redirection of other device +interrupts away from processors assigned to other than the default +processor set is machine-dependent. Intermediaries that interpose on +ports must be sure to interpose on both ports involved in this call if +they interpose on either. + +This function returns @code{KERN_SUCCESS} if the assignment has been +performed, and @code{KERN_INVALID_ARGUMENT} if @var{processor} is not a +processor, or @var{processor_set} is not a processor set on the same +host as @var{processor}.
+@end deftypefun + +@deftypefun kern_return_t processor_get_assignment (@w{processor_t @var{processor}}, @w{processor_set_name_t *@var{assigned_set}}) +The function @code{processor_get_assignment} obtains the current +assignment of a processor. The name port of the processor set is +returned in @var{assigned_set}. +@end deftypefun + +@node Processor Info +@subsection Processor Info + +@deftypefun kern_return_t processor_info (@w{processor_t @var{processor}}, @w{int @var{flavor}}, @w{host_t *@var{host}}, @w{processor_info_t @var{processor_info}}, @w{mach_msg_type_number_t *@var{processor_info_count}}) +The function @code{processor_info} returns the selected information array +for a processor, as specified by @var{flavor}. + +@var{host} is set to the host on which the processor resides. This +is the non-privileged host port. + +@var{processor_info} is an array of integers that is supplied by the +caller and returned filled with the specified information. +@var{processor_info_count} is supplied as the maximum number of integers in +@var{processor_info}. On return, it contains the actual number of +integers in @var{processor_info}. The maximum number of integers +returned by any flavor is @code{PROCESSOR_INFO_MAX}. + +The type of information returned is defined by @var{flavor}, which can +be one of the following: + +@table @code +@item PROCESSOR_BASIC_INFO +The function returns basic information about the processor, as defined +by @code{processor_basic_info_t}. This includes the slot number of the +processor. The number of integers returned is +@code{PROCESSOR_BASIC_INFO_COUNT}. +@end table + +Machines which require more configuration information beyond the slot +number are expected to define additional (machine-dependent) flavors. + +The function returns @code{KERN_SUCCESS} if the call succeeded and +@code{KERN_INVALID_ARGUMENT} if @var{processor} is not a processor or +@var{flavor} is not recognized.
The function returns +@code{MIG_ARRAY_TOO_LARGE} if the returned info array is too large for +@var{processor_info}. In this case, @var{processor_info} is filled as +much as possible and @var{processor_info_count} is set to the number of +elements that would have been returned if there were enough room. +@end deftypefun + +@deftp {Data type} {struct processor_basic_info} +This structure is returned in @var{processor_info} by the +@code{processor_info} function and provides basic information about the +processor. You can cast a variable of type @code{processor_info_t} to a +pointer of this type if you provided it as the @var{processor_info} +parameter for the @code{PROCESSOR_BASIC_INFO} flavor of +@code{processor_info}. It has the following members: + +@table @code +@item cpu_type_t cpu_type +cpu type + +@item cpu_subtype_t cpu_subtype +cpu subtype + +@item boolean_t running +is processor running? + +@item int slot_num +slot number + +@item boolean_t is_master +is this the master processor? +@end table +@end deftp + +@deftp {Data type} processor_basic_info_t +This is a pointer to a @code{struct processor_basic_info}. +@end deftp + + +@node Device Interface +@chapter Device Interface + +The GNU Mach microkernel provides a simple device interface that allows +user space programs to access the underlying hardware devices. Each +device has a unique name, which is a string up to 127 characters long. +To open a device, the device master port has to be supplied. The device +master port is only available through the bootstrap port. Anyone who +has control over the device master port can use all hardware devices. +@c XXX FIXME bootstrap port, bootstrap + +@cindex device port +@cindex port representing a device +@deftp {Data type} device_t +This is a @code{mach_port_t} and is used to hold the port name of a +device port that represents the device. Operations on the device are +implemented as remote procedure calls to the device port.
Each device +provides a sequence of records. The length of a record is specific to +the device. Data can be transferred ``out-of-line'' or ``in-line'' +(@pxref{Memory}). +@end deftp + +All constants and functions in this chapter are defined in +@file{device/device.h}. + +@menu +* Device Reply Server:: Handling device reply messages. +* Device Open:: Opening hardware devices. +* Device Close:: Closing hardware devices. +* Device Read:: Reading data from the device. +* Device Write:: Writing data to the device. +* Device Map:: Mapping devices into virtual memory. +* Device Status:: Querying and manipulating a device. +* Device Filter:: Filtering packets arriving on a device. +@end menu + + +@node Device Reply Server +@section Device Reply Server + +Besides the usual synchronous interface, an asynchronous interface is +provided. For this, the caller has to receive and handle the reply +messages separately from the function call. + +@deftypefun boolean_t device_reply_server (@w{msg_header_t *@var{in_msg}}, @w{msg_header_t *@var{out_msg}}) +The function @code{device_reply_server} is produced by the +remote procedure call generator to handle a received message. This +function does all necessary argument handling, and actually calls one of +the following functions: @code{ds_device_open_reply}, +@code{ds_device_read_reply}, @code{ds_device_read_reply_inband}, +@code{ds_device_write_reply} and @code{ds_device_write_reply_inband}. + +The @var{in_msg} argument is the message that has been received from the +kernel. The @var{out_msg} is a reply message, but this is not used for +this server. + +The function returns @code{TRUE} to indicate that the message in +question was applicable to this interface, and that the appropriate +routine was called to interpret the message. It returns @code{FALSE} to +indicate that the message did not apply to this interface, and that no +other action was taken.
+@end deftypefun + + +@node Device Open +@section Device Open + +@deftypefun kern_return_t device_open (@w{mach_port_t @var{master_port}}, @w{dev_mode_t @var{mode}}, @w{dev_name_t @var{name}}, @w{device_t *@var{device}}) +The function @code{device_open} opens the device @var{name} and returns +a port to it in @var{device}. The open count for the device is +incremented by one. If the open count was 0, the open handler for the +device is invoked. + +@var{master_port} must hold the master device port. @var{name} +specifies the device to open, and is a string up to 128 characters long. +@var{mode} is the open mode. It is a bitwise-or of the following +constants: + +@table @code +@item D_READ +Request read access for the device. + +@item D_WRITE +Request write access for the device. + +@item D_NODELAY +Do not delay an open. +@c XXX Is this really used at all? Maybe for tape drives? What does it mean? +@end table + +The function returns @code{D_SUCCESS} if the device was successfully +opened, @code{D_INVALID_OPERATION} if @var{master_port} is not the +master device port, @code{D_WOULD_BLOCK} if the device is busy and +@code{D_NOWAIT} was specified in mode, @code{D_ALREADY_OPEN} if the +device is already open in an incompatible mode and +@code{D_NO_SUCH_DEVICE} if @var{name} does not denote a known device. +@end deftypefun + +@deftypefun kern_return_t device_open_request (@w{mach_port_t @var{master_port}}, @w{mach_port_t @var{reply_port}}, @w{dev_mode_t @var{mode}}, @w{dev_name_t @var{name}}) +@deftypefunx kern_return_t ds_device_open_reply (@w{mach_port_t @var{reply_port}}, @w{kern_return_t @var{return_code}}, @w{device_t *@var{device}}) +This is the asynchronous form of the @code{device_open} function. +@code{device_open_request} performs the open request. The meaning of +the parameters is as in @code{device_open}. Additionally, the caller +has to supply a reply port to which the @code{ds_device_open_reply} +message is sent by the kernel when the open has been performed.
The +return value of the open operation is stored in @var{return_code}. + +As neither function receives a reply message, only message transmission +errors apply. If no error occurs, @code{KERN_SUCCESS} is returned. +@end deftypefun + + +@node Device Close +@section Device Close + +@deftypefun kern_return_t device_close (@w{device_t @var{device}}) +The function @code{device_close} decrements the open count of the device +by one. If the open count drops to zero, the close handler for the +device is called. The device to close is specified by its port +@var{device}. + +The function returns @code{D_SUCCESS} if the device was successfully +closed and @code{D_NO_SUCH_DEVICE} if @var{device} does not denote a +device port. +@end deftypefun + + +@node Device Read +@section Device Read + +@deftypefun kern_return_t device_read (@w{device_t @var{device}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{int @var{bytes_wanted}}, @w{io_buf_ptr_t *@var{data}}, @w{mach_msg_type_number_t *@var{data_count}}) +The function @code{device_read} reads @var{bytes_wanted} bytes from +@var{device}, and stores them in a buffer allocated with +@code{vm_allocate}, whose address is returned in @var{data}. The caller +must deallocate it when it is no longer needed. The number of bytes +actually returned is stored in @var{data_count}. + +If @var{mode} is @code{D_NOWAIT}, the operation does not block. +Otherwise @var{mode} should be 0. @var{recnum} is the record number to +be read; its meaning is device specific. + +The function returns @code{D_SUCCESS} if some data was successfully +read, @code{D_WOULD_BLOCK} if no data is currently available and +@code{D_NOWAIT} is specified, and @code{D_NO_SUCH_DEVICE} if +@var{device} does not denote a device port.
+@end deftypefun + +@deftypefun kern_return_t device_read_inband (@w{device_t @var{device}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{int @var{bytes_wanted}}, @w{io_buf_ptr_inband_t *@var{data}}, @w{mach_msg_type_number_t *@var{data_count}}) +The @code{device_read_inband} function works as the @code{device_read} +function, except that the data is returned ``in-line'' in the reply IPC +message (@pxref{Memory}). +@end deftypefun + +@deftypefun kern_return_t device_read_request (@w{device_t @var{device}}, @w{mach_port_t @var{reply_port}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{int @var{bytes_wanted}}) +@deftypefunx kern_return_t ds_device_read_reply (@w{mach_port_t @var{reply_port}}, @w{kern_return_t @var{return_code}}, @w{io_buf_ptr_t @var{data}}, @w{mach_msg_type_number_t @var{data_count}}) +This is the asynchronous form of the @code{device_read} function. +@code{device_read_request} performs the read request. The meaning for +the parameters is as in @code{device_read}. Additionally, the caller +has to supply a reply port to which the @code{ds_device_read_reply} +message is sent by the kernel when the read has been performed. The +return value of the read operation is stored in @var{return_code}. + +As neither function receives a reply message, only message transmission +errors apply. If no error occurs, @code{KERN_SUCCESS} is returned. 
+@end deftypefun + +@deftypefun kern_return_t device_read_request_inband (@w{device_t @var{device}}, @w{mach_port_t @var{reply_port}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{int @var{bytes_wanted}}) +@deftypefunx kern_return_t ds_device_read_reply_inband (@w{mach_port_t @var{reply_port}}, @w{kern_return_t @var{return_code}}, @w{io_buf_ptr_t @var{data}}, @w{mach_msg_type_number_t @var{data_count}}) +The @code{device_read_request_inband} and +@code{ds_device_read_reply_inband} functions work as the +@code{device_read_request} and @code{ds_device_read_reply} functions, +except that the data is returned ``in-line'' in the reply IPC message +(@pxref{Memory}). +@end deftypefun + + +@node Device Write +@section Device Write + +@deftypefun kern_return_t device_write (@w{device_t @var{device}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{io_buf_ptr_t @var{data}}, @w{mach_msg_type_number_t @var{data_count}}, @w{int *@var{bytes_written}}) +The function @code{device_write} writes @var{data_count} bytes from the +buffer @var{data} to @var{device}. The number of bytes actually written +is returned in @var{bytes_written}. + +If @var{mode} is @code{D_NOWAIT}, the function returns without waiting +for I/O completion. Otherwise @var{mode} should be 0. @var{recnum} is +the record number to be written, its meaning is device specific. + +The function returns @code{D_SUCCESS} if some data was successfully +written and @code{D_NO_SUCH_DEVICE} if @var{device} does not denote a +device port or the device is dead or not completely open. +@end deftypefun + +@deftypefun kern_return_t device_write_inband (@w{device_t @var{device}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{int @var{bytes_wanted}}, @w{io_buf_ptr_inband_t *@var{data}}, @w{mach_msg_type_number_t *@var{data_count}}) +The @code{device_write_inband} function works as the @code{device_write} +function, except that the data is sent ``in-line'' in the request IPC +message (@pxref{Memory}). 
+@end deftypefun + +@deftypefun kern_return_t device_write_request (@w{device_t @var{device}}, @w{mach_port_t @var{reply_port}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{io_buf_ptr_t @var{data}}, @w{mach_msg_type_number_t @var{data_count}}) +@deftypefunx kern_return_t ds_device_write_reply (@w{mach_port_t @var{reply_port}}, @w{kern_return_t @var{return_code}}, @w{int @var{bytes_written}}) +This is the asynchronous form of the @code{device_write} function. +@code{device_write_request} performs the write request. The meaning for +the parameters is as in @code{device_write}. Additionally, the caller +has to supply a reply port to which the @code{ds_device_write_reply} +message is sent by the kernel when the write has been performed. The +return value of the write operation is stored in @var{return_code}. + +As neither function receives a reply message, only message transmission +errors apply. If no error occurs, @code{KERN_SUCCESS} is returned. +@end deftypefun + +@deftypefun kern_return_t device_write_request_inband (@w{device_t @var{device}}, @w{mach_port_t @var{reply_port}}, @w{dev_mode_t @var{mode}}, @w{recnum_t @var{recnum}}, @w{io_buf_ptr_t @var{data}}, @w{mach_msg_type_number_t @var{data_count}}) +@deftypefunx kern_return_t ds_device_write_reply_inband (@w{mach_port_t @var{reply_port}}, @w{kern_return_t @var{return_code}}, @w{int @var{bytes_written}}) +The @code{device_write_request_inband} and +@code{ds_device_write_reply_inband} functions work as the +@code{device_write_request} and @code{ds_device_write_reply} functions, +except that the data is sent ``in-line'' in the request IPC message +(@pxref{Memory}). 
+@end deftypefun + + +@node Device Map +@section Device Map + +@deftypefun kern_return_t device_map (@w{device_t @var{device}}, @w{vm_prot_t @var{prot}}, @w{vm_offset_t @var{offset}}, @w{vm_size_t @var{size}}, @w{mach_port_t *@var{pager}}, @w{int @var{unmap}}) +The function @code{device_map} creates a new memory manager for +@var{device} and returns a port to it in @var{pager}. The memory +manager is usable as a memory object in a @code{vm_map} call. The call +is device dependent. + +The protection for the memory object is specified by @var{prot}. The +memory object starts at @var{offset} within the device and extends +@var{size} bytes. @var{unmap} is currently unused. +@c XXX I suppose the caller should set it to 0. + +The function returns @code{D_SUCCESS} if the call succeeds +and @code{D_NO_SUCH_DEVICE} if @var{device} does not denote a +device port or the device is dead or not completely open. +@end deftypefun + + +@node Device Status +@section Device Status + +@deftypefun kern_return_t device_set_status (@w{device_t @var{device}}, @w{dev_flavor_t @var{flavor}}, @w{dev_status_t @var{status}}, @w{mach_msg_type_number_t @var{status_count}}) +The function @code{device_set_status} sets the status of a device. The +possible values for @var{flavor} and their interpretation are device +specific. + +The function returns @code{D_SUCCESS} if the status was successfully +set and @code{D_NO_SUCH_DEVICE} if @var{device} does not denote a +device port or the device is dead or not completely open. +@end deftypefun + +@deftypefun kern_return_t device_get_status (@w{device_t @var{device}}, @w{dev_flavor_t @var{flavor}}, @w{dev_status_t @var{status}}, @w{mach_msg_type_number_t *@var{status_count}}) +The function @code{device_get_status} gets the status of a device. The +possible values for @var{flavor} and their interpretation are device +specific. 
+ +The function returns @code{D_SUCCESS} if the status was successfully +retrieved and @code{D_NO_SUCH_DEVICE} if @var{device} does not denote a +device port or the device is dead or not completely open. +@end deftypefun + + +@node Device Filter +@section Device Filter + +@deftypefun kern_return_t device_set_filter (@w{device_t @var{device}}, @w{mach_port_t @var{receive_port}}, @w{mach_msg_type_name_t @var{receive_port_type}}, @w{int @var{priority}}, @w{filter_array_t @var{filter}}, @w{mach_msg_type_number_t @var{filter_count}}) +The function @code{device_set_filter} makes it possible to filter out +selected data arriving at the device and forward it to a port. +@var{filter} is a list of filter commands, which are applied to incoming +data to determine if the data should be sent to @var{receive_port}. The +IPC type of the send right is specified by @var{receive_port_type}; it +is either @code{MACH_MSG_TYPE_MAKE_SEND} or +@code{MACH_MSG_TYPE_MOVE_SEND}. The @var{priority} value is used to +order multiple filters. + +There can be up to @code{NET_MAX_FILTER} commands in @var{filter}. The +actual number of commands is passed in @var{filter_count}. For the +purpose of the filter test, an internal stack is provided. After all +commands have been processed, the value on the top of the stack +determines if the data is forwarded or the next filter is tried. + +@c XXX The following description was taken verbatim from the +@c kernel_interface.pdf document. +Each word of the command list specifies a data (push) operation (high +order NETF_NBPO bits) as well as a binary operator (low order NETF_NBPA +bits). The value to be pushed onto the stack is chosen as follows. + +@table @code +@item NETF_PUSHLIT +Use the next short word of the filter as the value. + +@item NETF_PUSHZERO +Use 0 as the value. + +@item NETF_PUSHWORD+N +Use short word N of the ``data'' portion of the message as the value. 
+ +@item NETF_PUSHHDR+N +Use short word N of the ``header'' portion of the message as the value. + +@item NETF_PUSHIND+N +Pops the top long word from the stack and then uses short word N of the +``data'' portion of the message as the value. + +@item NETF_PUSHHDRIND+N +Pops the top long word from the stack and then uses short word N of the +``header'' portion of the message as the value. + +@item NETF_PUSHSTK+N +Use long word N of the stack (where the top of stack is long word 0) as +the value. + +@item NETF_NOPUSH +Don't push a value. +@end table + +The unsigned value so chosen is promoted to a long word before being +pushed. Once a value is pushed (except for the case of +@code{NETF_NOPUSH}), the top two long words of the stack are popped and +a binary operator applied to them (with the old top of stack as the +second operand). The result of the operator is pushed on the stack. +These operators are: + +@table @code +@item NETF_NOP +Don't pop off any values and do no operation. + +@item NETF_EQ +Perform an equal comparison. + +@item NETF_LT +Perform a less than comparison. + +@item NETF_LE +Perform a less than or equal comparison. + +@item NETF_GT +Perform a greater than comparison. + +@item NETF_GE +Perform a greater than or equal comparison. + +@item NETF_AND +Perform a bitwise boolean AND operation. + +@item NETF_OR +Perform a bitwise boolean inclusive OR operation. + +@item NETF_XOR +Perform a bitwise boolean exclusive OR operation. + +@item NETF_NEQ +Perform a not equal comparison. + +@item NETF_LSH +Perform a left shift operation. + +@item NETF_RSH +Perform a right shift operation. + +@item NETF_ADD +Perform an addition. + +@item NETF_SUB +Perform a subtraction. + +@item NETF_COR +Perform an equal comparison. If the comparison is @code{TRUE}, terminate +the filter list. Otherwise, pop the result of the comparison off the +stack. + +@item NETF_CAND +Perform an equal comparison. If the comparison is @code{FALSE}, +terminate the filter list. 
Otherwise, pop the result of the comparison +off the stack. + +@item NETF_CNOR +Perform a not equal comparison. If the comparison is @code{FALSE}, +terminate the filter list. Otherwise, pop the result of the comparison +off the stack. + +@item NETF_CNAND +Perform a not equal comparison. If the comparison is @code{TRUE}, +terminate the filter list. Otherwise, pop the result of the comparison +off the stack. +@end table + +The scan of the filter list terminates when the filter +list is emptied, or a @code{NETF_C...} operation terminates the list. At +this time, if the final value of the top of the stack is @code{TRUE}, +then the message is accepted for the filter. + +The function returns @code{D_SUCCESS} if the filter was successfully +set, @code{D_INVALID_OPERATION} if @var{receive_port} is not a valid +send right, and @code{D_NO_SUCH_DEVICE} if @var{device} does not denote +a device port or the device is dead or not completely open. +@end deftypefun + + +@node Kernel Debugger +@chapter Kernel Debugger + +The GNU Mach kernel debugger @code{ddb} is a powerful built-in debugger +with a gdb-like syntax. It is enabled at compile time using the +@option{--enable-kdb} option. Whenever you want to enter the debugger +while running the kernel, you can press the key combination +@key{Ctrl-Alt-D}. + +@menu +* Operation:: Basic architecture of the kernel debugger. +* Commands:: Available commands in the kernel debugger. +* Variables:: Access of variables from the kernel debugger. +* Expressions:: Usage of expressions in the kernel debugger. +@end menu + + +@node Operation +@section Operation + +The current location is called @dfn{dot}. The dot is displayed in +hexadecimal format at the prompt. Examine and write commands update dot +to the address of the last line examined or the last location modified, +and set @dfn{next} to the address of the next location to be examined or +changed. Other commands don't change dot, and set next to be the same +as dot. 
+ +The general command syntax is: + +@example +@var{command}[/@var{modifier}] @var{address} [,@var{count}] +@end example + +@kbd{!!} repeats the previous command, and a blank line repeats from the +address next with count 1 and no modifiers. Specifying @var{address} sets +dot to the address. Omitting @var{address} uses dot. A missing @var{count} +is taken to be 1 for printing commands or infinity for stack traces. + +The current @code{ddb} has been enhanced to support multi-thread debugging. A +break point can be set for a specific thread only, and the address space +or registers of a non-current thread can be examined or modified if +supported by machine-dependent routines. For example, + +@example +break/t mach_msg_trap $task11.0 +@end example + +sets a break point at @code{mach_msg_trap} for the first thread of task +11 listed by a @code{show all threads} command. + +In the above example, @code{$task11.0} is translated to the +corresponding thread structure's address by the variable translation +mechanism described later. If a default target thread is set in the +variable @code{$thread}, @code{$task11.0} can be omitted. In +general, if @code{t} is specified in a modifier of a command line, a +specified thread or a default target thread is used as the target thread +instead of the current one. The @code{t} modifier of a command line does +not apply when evaluating expressions in that command line. If you want to +get a value indirectly from a specific thread's address space or access +its registers within an expression, you have to specify a default +target thread in advance, and use the @code{:t} modifier immediately +after the indirect access or the register reference, as follows: + +@example +set $thread $task11.0 +print $eax:t *(0x100):tuh +@end example + +For the indirect access, @code{u} suppresses sign extension, and +@code{l}, @code{h} and @code{b} select the indirection size (long word, +half word, byte) respectively. 
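The effect of the size and sign-extension modifiers on an indirect access can be modeled outside the debugger. The following is a minimal Python sketch, not code taken from @code{ddb}: the function name `indirect`, the sample `MEMORY` contents, the little-endian byte order (as on i386), and the choice of a long word as the default size are all assumptions made for illustration.

```python
def indirect(memory: bytes, addr: int, modifiers: str = "") -> int:
    """Hypothetical model of *(addr):modifiers; not taken from ddb."""
    # Size letters as described above: l = 4 bytes, h = 2 bytes, b = 1 byte.
    size = 4  # assumed default: long word
    for letter, width in (("l", 4), ("h", 2), ("b", 1)):
        if letter in modifiers:
            size = width
    raw = memory[addr:addr + size]
    if len(raw) != size:
        raise ValueError("address range outside the modeled memory")
    # 'u' suppresses sign extension; little-endian order is an assumption.
    # Other letters (such as 't') are ignored by this model.
    return int.from_bytes(raw, "little", signed="u" not in modifiers)


# Sample memory image used only for this illustration.
MEMORY = bytes([0xFE, 0xFF, 0x00, 0x80])
```

In this model, `indirect(MEMORY, 0, "h")` sign-extends the half word `0xFFFE`, while `indirect(MEMORY, 0, "uh")` leaves it as an unsigned value, which is the distinction the `u` modifier draws.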
+ +Note: Support of non current space/register access and user space break +point depend on the machines. If not supported, attempts of such +operation may provide incorrect information or may cause strange +behavior. Even if supported, the user space access is limited to the +pages resident in the main memory at that time. If a target page is not +in the main memory, an error will be reported. + +@code{ddb} has a feature like a command @code{more} for the output. If +an output line exceeds the number set in the @code{$lines} variable, it +displays @samp{--db_more--} and waits for a response. The valid +responses for it are: + +@table @kbd +@item @key{SPC} +one more page + +@item @key{RET} +one more line + +@item q +abort the current command, and return to the command input mode +@end table + + +@node Commands +@section Commands + +@table @code +@item examine(x) [/@var{modifier}] @var{addr}[,@var{count}] [ @var{thread} ] +Display the addressed locations according to the formats in the +modifier. Multiple modifier formats display multiple locations. If no +format is specified, the last formats specified for this command is +used. Address space other than that of the current thread can be +specified with @code{t} option in the modifier and @var{thread} +parameter. The format characters are + +@table @code +@item b +look at by bytes(8 bits) + +@item h +look at by half words(16 bits) + +@item l +look at by long words(32 bits) + +@item a +print the location being displayed + +@item , +skip one unit producing no output + +@item A +print the location with a line number if possible + +@item x +display in unsigned hex + +@item z +display in signed hex + +@item o +display in unsigned octal + +@item d +display in signed decimal + +@item u +display in unsigned decimal + +@item r +display in current radix, signed + +@item c +display low 8 bits as a character. Non-printing characters are +displayed as an octal escape code (e.g. '\000'). 
+ +@item s +display the null-terminated string at the location. Non-printing +characters are displayed as octal escapes. + +@item m +display in unsigned hex with character dump at the end of each line. +The location is also displayed in hex at the beginning of each line. + +@item i +display as an instruction + +@item I +display as an instruction with possible alternate formats depending on +the machine: + +@table @code +@item vax +don't assume that each external label is a procedure entry mask + +@item i386 +don't round to the next long word boundary + +@item mips +print register contents +@end table +@end table + +@item xf +Examine forward. It executes an examine command with the last specified +parameters to it except that the next address displayed by it is used as +the start address. + +@item xb +Examine backward. It executes an examine command with the last +specified parameters to it except that the last start address subtracted +by the size displayed by it is used as the start address. + +@item print[/axzodurc] @var{addr1} [ @var{addr2} @dots{} ] +Print @var{addr}'s according to the modifier character. Valid formats +are: @code{a} @code{x} @code{z} @code{o} @code{d} @code{u} @code{r} +@code{c}. If no modifier is specified, the last one specified to it is +used. @var{addr} can be a string, and it is printed as it is. For +example, + +@example +print/x "eax = " $eax "\necx = " $ecx "\n" +@end example + +will print like + +@example +eax = xxxxxx +ecx = yyyyyy +@end example + +@item write[/bhlt] @var{addr} [ @var{thread} ] @var{expr1} [ @var{expr2} @dots{} ] +Write the expressions at succeeding locations. The write unit size can +be specified in the modifier with a letter b (byte), h (half word) or +l(long word) respectively. If omitted, long word is assumed. Target +address space can also be specified with @code{t} option in the modifier +and @var{thread} parameter. Warning: since there is no delimiter +between expressions, strange things may happen. 
It's best to enclose +each expression in parentheses. + +@item set $@var{variable} [=] @var{expr} +Set the named variable or register with the value of @var{expr}. Valid +variable names are described below. + +@item break[/tuTU] @var{addr}[,@var{count}] [ @var{thread1} @dots{} ] +Set a break point at @var{addr}. If count is supplied, continues +(@var{count}-1) times before stopping at the break point. If the break +point is set, a break point number is printed with @samp{#}. This +number can be used in deleting the break point or adding conditions to +it. + +@table @code +@item t +Set a break point only for a specific thread. The thread is specified +by @var{thread} parameter, or default one is used if the parameter is +omitted. + +@item u +Set a break point in user space address. It may be combined with +@code{t} or @code{T} option to specify the non-current target user +space. Without @code{u} option, the address is considered in the kernel +space, and wrong space address is rejected with an error message. This +option can be used only if it is supported by machine dependent +routines. + +@item T +Set a break point only for threads in a specific task. It is like +@code{t} option except that the break point is valid for all threads +which belong to the same task as the specified target thread. + +@item U +Set a break point in shared user space address. It is like @code{u} +option, except that the break point is valid for all threads which share +the same address space even if @code{t} option is specified. @code{t} +option is used only to specify the target shared space. Without +@code{t} option, @code{u} and @code{U} have the same meanings. @code{U} +is useful for setting a user space break point in non-current address +space with @code{t} option such as in an emulation library space. This +option can be used only if it is supported by machine dependent +routines. 
+@end table + +Warning: if a user text is shadowed by a normal user space debugger, +user space break points may not work correctly. Setting a break point +at the low-level code paths may also cause strange behavior. + +@item delete[/tuTU] @var{addr}|#@var{number} [ @var{thread1} @dots{} ] +Delete the break point. The target break point can be specified by a +break point number with @code{#}, or by @var{addr} like specified in +@code{break} command. + +@item cond #@var{number} [ @var{condition} @var{commands} ] +Set or delete a condition for the break point specified by the +@var{number}. If the @var{condition} and @var{commands} are null, the +condition is deleted. Otherwise the condition is set for it. When the +break point is hit, the @var{condition} is evaluated. The +@var{commands} will be executed if the condition is true and the break +point count set by a break point command becomes zero. @var{commands} +is a list of commands separated by semicolons. Each command in the list +is executed in that order, but if a @code{continue} command is executed, +the command execution stops there, and the stopped thread resumes +execution. If the command execution reaches the end of the list, and it +enters into a command input mode. For example, + +@example +set $work0 0 +break/Tu xxx_start $task7.0 +cond #1 (1) set $work0 1; set $work1 0; cont +break/T vm_fault $task7.0 +cond #2 ($work0) set $work1 ($work1+1); cont +break/Tu xxx_end $task7.0 +cond #3 ($work0) print $work1 " faults\n"; set $work0 0 +cont +@end example + +will print page fault counts from @code{xxx_start} to @code{xxx_end} in +@code{task7}. + +@item step[/p] [,@var{count}] +Single step @var{count} times. If @code{p} option is specified, print +each instruction at each step. Otherwise, only print the last +instruction. + +Warning: depending on machine type, it may not be possible to +single-step through some low-level code paths or user space code. 
On +machines with software-emulated single-stepping (e.g., pmax), stepping +through code executed by interrupt handlers will probably do the wrong +thing. + +@item continue[/c] +Continue execution until a breakpoint or watchpoint. If @code{/c}, +count instructions while executing. Some machines (e.g., pmax) also +count loads and stores. + +Warning: when counting, the debugger is really silently single-stepping. +This means that single-stepping on low-level code may cause strange +behavior. + +@item until +Stop at the next call or return instruction. + +@item next[/p] +Stop at the matching return instruction. If @code{p} option is +specified, print the call nesting depth and the cumulative instruction +count at each call or return. Otherwise, only print when the matching +return is hit. + +@item match[/p] +A synonym for @code{next}. + +@item trace[/tu] [ @var{frame_addr}|@var{thread} ][,@var{count}] +Stack trace. @code{u} option traces user space; if omitted, only traces +kernel space. If @code{t} option is specified, it shows the stack trace +of the specified thread or a default target thread. Otherwise, it shows +the stack trace of the current thread from the frame address specified +by a parameter or from the current frame. @var{count} is the number of +frames to be traced. If the @var{count} is omitted, all frames are +printed. + +Warning: If the target thread's stack is not in the main memory at that +time, the stack trace will fail. User space stack trace is valid only +if the machine dependent code supports it. + +@item search[/bhl] @var{addr} @var{value} [@var{mask}] [,@var{count}] +Search memory for a value. This command might fail in interesting ways +if it doesn't find the searched-for value. This is because @code{ddb} +doesn't always recover from touching bad memory. The optional count +argument limits the search. + +@item macro @var{name} @var{commands} +Define a debugger macro as @var{name}. 
@var{commands} is a list of +commands to be associated with the macro. In the expressions of the +command list, a variable @code{$argxx} can be used to get a parameter +passed to the macro. When a macro is called, each argument is evaluated +as an expression, and the value is assigned to each parameter, +@code{$arg1}, @code{$arg2}, @dots{} respectively. Ten @code{$arg} +variables are reserved for each level of macros, and they can be used as +local variables. Macros can be nested up to 5 levels deep. +For example, + +@example +macro xinit set $work0 $arg1 +macro xlist examine/m $work0,4; set $work0 *($work0) +xinit *(xxx_list) +xlist +@enddots{} +@end example + +will print the contents of a list starting from @code{xxx_list} by each +@code{xlist} command. + +@item dmacro @var{name} +Delete the macro named @var{name}. + +@item show all threads[/ul] +Display all tasks and threads information. This version of @code{ddb} +prints more information than the previous one. It shows UNIX process +information like @command{ps} for each task. The UNIX process +information may not be shown if it is not supported in the machine, or +the bottom of the stack of the target task is not in the main memory at +that time. It also shows task and thread identification numbers. These +numbers can be used to specify a task or a thread symbolically in +various commands. The numbers are valid only in the same debugger +session. If the execution is resumed again, the numbers may change. +The current thread can be distinguished from others by a @code{#} after +the thread id instead of @code{:}. Without @code{l} option, it only +shows thread id, thread structure address and the status for each +thread. The status consists of 5 letters, R(run), W(wait), +S(suspended), O(swapped out) and N(interruptible), and if the corresponding +status bit is off, @code{.} is printed instead. If @code{l} option is +specified, more detailed information is printed for each thread. 
+ +@item show task [ @var{addr} ] +Display the information of a task specified by @var{addr}. If +@var{addr} is omitted, current task information is displayed. + +@item show thread [ @var{addr} ] +Display the information of a thread specified by @var{addr}. If +@var{addr} is omitted, current thread information is displayed. + +@item show registers[/tu [ @var{thread} ]] +Display the register set. Target thread can be specified with @code{t} +option and @var{thread} parameter. If @code{u} option is specified, it +displays user registers instead of kernel or currently saved one. + +Warning: The support of @code{t} and @code{u} option depends on the +machine. If not supported, incorrect information will be displayed. + +@item show map @var{addr} +Prints the @code{vm_map} at @var{addr}. + +@item show object @var{addr} +Prints the @code{vm_object} at @var{addr}. + +@item show page @var{addr} +Prints the @code{vm_page} structure at @var{addr}. + +@item show port @var{addr} +Prints the @code{ipc_port} structure at @var{addr}. + +@item show ipc_port[/t [ @var{thread} ]] +Prints all @code{ipc_port} structure's addresses the target thread has. +The target thread is a current thread or that specified by a parameter. + +@item show macro [ @var{name} ] +Show the definitions of macros. If @var{name} is specified, only the +definition of it is displayed. Otherwise, definitions of all macros are +displayed. + +@item show watches +Displays all watchpoints. + +@item watch[/T] @var{addr},@var{size} [ @var{task} ] +Set a watchpoint for a region. Execution stops when an attempt to +modify the region occurs. The @var{size} argument defaults to 4. +Without @code{T} option, @var{addr} is assumed to be a kernel address. +If you want to set a watch point in user space, specify @code{T} and +@var{task} parameter where the address belongs to. If the @var{task} +parameter is omitted, a task of the default target thread or a current +task is assumed. 
If you specify a wrong space address, the request is +rejected with an error message. + +Warning: Attempts to watch wired kernel memory may cause unrecoverable +error in some systems such as i386. Watchpoints on user addresses work +best. +@end table + + +@node Variables +@section Variables + +The debugger accesses registers and variables as $@var{name}. Register +names are as in the @code{show registers} command. Some variables are +suffixed with numbers, and may have some modifier following a colon +immediately after the variable name. For example, register variables +can have @code{u} and @code{t} modifier to indicate user register and +that of a default target thread instead of that of the current thread +(e.g. @code{$eax:tu}). + +Built-in variables currently supported are: + +@table @code +@item task@var{xx}[.@var{yy}] +Task or thread structure address. @var{xx} and @var{yy} are task and +thread identification numbers printed by a @code{show all threads} +command respectively. This variable is read only. + +@item thread +The default target thread. The value is used when @code{t} option is +specified without explicit thread structure address parameter in command +lines or expression evaluation. + +@item radix +Input and output radix + +@item maxoff +Addresses are printed as @var{symbol}+@var{offset} unless offset is greater than +maxoff. + +@item maxwidth +The width of the displayed line. + +@item lines +The number of lines. It is used by @code{more} feature. + +@item tabstops +Tab stop width. + +@item arg@var{xx} +Parameters passed to a macro. @var{xx} can be 1 to 10. + +@item work@var{xx} +Work variable. @var{xx} can be 0 to 31. +@end table + + +@node Expressions +@section Expressions + +Almost all expression operators in C are supported except @code{~}, +@code{^}, and unary @code{&}. Special rules in @code{ddb} are: + +@table @code +@item @var{identifier} +name of a symbol. It is translated to the address(or value) of it. 
+@code{.} and @code{:} can be used in the identifier. If supported by +an object format dependent routine, +[@var{file_name}:]@var{func}[:@var{line_number}] +[@var{file_name}:]@var{variable}, and +@var{file_name}[:@var{line_number}] can be accepted as a symbol. The +symbol may be prefixed with @code{@var{symbol_table_name}::} like +@code{emulator::mach_msg_trap} to specify other than kernel symbols. + +@item @var{number} +radix is determined by the first two letters: +@table @code +@item 0x +hex +@item 0o +octal +@item 0t +decimal +@end table + +otherwise, follow current radix. + +@item . +dot + +@item + +next + +@item .. +address of the start of the last line examined. Unlike dot or next, +this is only changed by @code{examine} or @code{write} command. + +@item ´ +last address explicitly specified. + +@item $@var{variable} +register name or variable. It is translated to the value of it. It may +be followed by a @code{:} and modifiers as described above. + +@item a +multiple of right hand side. + +@item *@var{expr} +indirection. It may be followed by a @code{:} and modifiers as +described above. +@end table + + +@include gpl.texi + + +@node Documentation License +@appendix Documentation License + +This manual is copyrighted and licensed under the GNU Free Documentation +license. + +Parts of this manual are derived from the Mach manual packages +originally provided by Carnegie Mellon University. + +@menu +* Free Documentation License:: The GNU Free Documentation License. +* CMU License:: The CMU license applies to the original Mach + kernel and its documentation. +@end menu + +@lowersections +@include fdl.texi +@raisesections + +@node CMU License +@appendixsec CMU License + +@quotation +@display +Mach Operating System +Copyright @copyright{} 1991,1990,1989 Carnegie Mellon University +All Rights Reserved. 
+@end display + +Permission to use, copy, modify and distribute this software and its +documentation is hereby granted, provided that both the copyright +notice and this permission notice appear in all copies of the +software, derivative works or modified versions, and any portions +thereof, and that both notices appear in supporting documentation. + +@sc{carnegie mellon allows free use of this software in its ``as is'' +condition. carnegie mellon disclaims any liability of any kind for +any damages whatsoever resulting from the use of this software.} + +Carnegie Mellon requests users of this software to return to + +@display + Software Distribution Coordinator + School of Computer Science + Carnegie Mellon University + Pittsburgh PA 15213-3890 +@end display + +@noindent +or @email{Software.Distribution@@CS.CMU.EDU} any improvements or +extensions that they make and grant Carnegie Mellon the rights to +redistribute these changes. +@end quotation + +@node Concept Index +@unnumbered Concept Index + +@printindex cp + + +@node Function and Data Index +@unnumbered Function and Data Index + +@printindex fn + + +@summarycontents +@contents +@bye