2 <!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
6 <appendix id="appendix.contrib" xreflabel="Contributing">
7 <?dbhtml filename="appendix_contributing.html"?>
23 <primary>Appendix</primary>
24 <secondary>Contributing</secondary>
29 The GNU C++ Library follows an open development model. Active
contributors are assigned maintainership responsibility and given
write access to the source repository. First-time contributors
32 should follow this procedure:
35 <sect1 id="contrib.list" xreflabel="Contributor Checklist">
36 <title>Contributor Checklist</title>
38 <sect2 id="list.reading">
39 <title>Reading</title>
44 Get and read the relevant sections of the C++ language
45 specification. Copies of the full ISO 14882 standard are
available online via the ISO mirror site for committee
47 members. Non-members, or those who have not paid for the
48 privilege of sitting on the committee and sustained their
49 two meeting commitment for voting rights, may get a copy of
50 the standard from their respective national standards
51 organization. In the USA, this national standards
organization is ANSI and their website is right
<ulink url="http://www.ansi.org">here</ulink>.
(And if you've already registered with them, clicking this link will take you directly to the place where you can
<ulink url="http://webstore.ansi.org/RecordDetail.aspx?sku=ISO%2FIEC+14882:2003">buy the standard online</ulink>.)
61 The library working group bugs, and known defects, can
<ulink url="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21</ulink>
69 The newsgroup dedicated to standardization issues is
comp.std.c++: the FAQ for this group is quite useful and
72 found <ulink url="http://www.comeaucomputing.com/csc/faq.html">
80 the <ulink url="http://www.gnu.org/prep/standards">GNU
81 Coding Standards</ulink>, and chuckle when you hit the part
82 about <quote>Using Languages Other Than C</quote>.
88 Be familiar with the extensions that preceded these
89 general GNU rules. These style issues for libstdc++ can be
90 found <link linkend="contrib.coding_style">here</link>.
96 And last but certainly not least, read the
97 library-specific information
found <link linkend="appendix.porting">here</link>.
104 <sect2 id="list.copyright">
105 <title>Assignment</title>
107 Small changes can be accepted without a copyright assignment form on
file. New code and additions to the library need a completed copyright
assignment form on file at the FSF. Note: your employer may be required
110 to fill out appropriate disclaimer forms as well.
114 Historically, the libstdc++ assignment form added the following
120 Which Belgian comic book character is better, Tintin or Asterix, and
126 While not strictly necessary, humoring the maintainers and answering
127 this question would be appreciated.
131 For more information about getting a copyright assignment, please see
132 <ulink url="http://www.gnu.org/prep/maintain/html_node/Legal-Matters.html">Legal
137 Please contact Benjamin Kosnik at
138 <email>bkoz+assign@redhat.com</email> if you are confused
139 about the assignment or have general licensing questions. When
140 requesting an assignment form from
<email>assign@gnu.org</email>, please cc the libstdc++
142 maintainer above so that progress can be monitored.
146 <sect2 id="list.getting">
147 <title>Getting Sources</title>
149 <ulink url="http://gcc.gnu.org/svnwrite.html">Getting write access
150 (look for "Write after approval")</ulink>
154 <sect2 id="list.patches">
155 <title>Submitting Patches</title>
158 Every patch must have several pieces of information before it can be
159 properly evaluated. Ideally (and to ensure the fastest possible
160 response from the maintainers) it would have all of these pieces:
166 A description of the bug and how your patch fixes this
167 bug. For new features a description of the feature and your
174 A ChangeLog entry as plain text; see the various
175 ChangeLog files for format and content. If you are
176 using emacs as your editor, simply position the insertion
point at the beginning of your change and hit C-x 4 a to bring
178 up the appropriate ChangeLog entry. See--magic! Similar
179 functionality also exists for vi.
185 A testsuite submission or sample program that will
186 easily and simply show the existing error or test new
193 The patch itself. If you are accessing the SVN
194 repository use <command>svn update; svn diff NEW</command>;
195 else, use <command>diff -cp OLD NEW</command> ... If your
196 version of diff does not support these options, then get the
197 latest version of GNU
198 diff. The <ulink url="http://gcc.gnu.org/wiki/SvnTricks">SVN
199 Tricks</ulink> wiki page has information on customising the
200 output of <code>svn diff</code>.
206 When you have all these pieces, bundle them up in a
207 mail message and send it to libstdc++@gcc.gnu.org. All
208 patches and related discussion should be sent to the
209 libstdc++ mailing list.
218 <sect1 id="contrib.organization" xreflabel="Source Organization">
219 <?dbhtml filename="source_organization.html"?>
220 <title>Directory Layout and Source Conventions</title>
223 The unpacked source directory of libstdc++ contains the files
224 needed to create the GNU C++ Library.
228 It has subdirectories:
231 Files in HTML and text format that document usage, quirks of the
232 implementation, and contributor checklists.
235 All header files for the C++ library are within this directory,
236 modulo specific runtime-related files that are in the libsupc++
240 Files meant to be found by #include <name> directives in
241 standard-conforming user programs.
244 Headers intended to directly include standard C headers.
245 [NB: this can be enabled via --enable-cheaders=c]
248 Headers intended to include standard C headers in
249 the global namespace, and put select names into the std::
250 namespace. [NB: this is the default, and is the same as
251 --enable-cheaders=c_global]
254 Headers intended to include standard C headers
255 already in namespace std, and put select names into the std::
256 namespace. [NB: this is the same as --enable-cheaders=c_std]
259 Files included by standard headers and by other files in
263 Headers provided for backward compatibility, such as <iostream.h>.
264 They are not used in this library.
267 Headers that define extensions to the standard library. No
268 standard header refers to any of them.
271 Scripts that are used during the configure, build, make, or test
275 Files that are used in constructing the library, but are not
278 testsuites/[backward, demangle, ext, performance, thread, 17_* to 27_*]
279 Test programs are here, and may be used to begin to exercise the
280 library. Support for "make check" and "make check-install" is
281 complete, and runs through all the subdirectories here when this
282 command is issued from the build directory. Please note that
283 "make check" requires DejaGNU 1.4 or later to be installed. Please
284 note that "make check-script" calls the script mkcheck, which
285 requires bash, and which may need the paths to bash adjusted to
286 work properly, as /bin/bash is assumed.
288 Other subdirectories contain variant versions of certain files
289 that are meant to be copied or linked by the configure script.
298 In addition, a subdirectory holds the convenience library libsupc++.
301 Contains the runtime library for C++, including exception
302 handling and memory allocation and deallocation, RTTI, terminate
Note that glibc also has a bits/ subdirectory. We will need either
to be careful not to collide with names in its bits/ directory, or
to rename bits to (e.g.) cppbits/.
309 In files throughout the system, lines marked with an "XXX" indicate
310 a bug or incompletely-implemented feature. Lines marked "XXX MT"
311 indicate a place that may require attention for multi-thread safety.
316 <sect1 id="contrib.coding_style" xreflabel="Coding Style">
317 <?dbhtml filename="source_code_style.html"?>
318 <title>Coding Style</title>
321 <sect2 id="coding_style.bad_identifiers">
322 <title>Bad Identifiers</title>
324 Identifiers that conflict and should be avoided.
328 This is the list of names <quote>reserved to the
329 implementation</quote> that have been claimed by certain
330 compilers and system headers of interest, and should not be used
331 in the library. It will grow, of course. We generally are
332 interested in names that are not all-caps, except for those like
375 [Note that this list is out of date. It applies to the old
376 name-mangling; in G++ 3.0 and higher a different name-mangling is
377 used. In addition, many of the bugs relating to G++ interpreting
378 these names as operators have been fixed.]
380 The full set of __* identifiers (combined from gcc/cp/lex.c and
381 gcc/cplus-dem.c) that are either old or new, but are definitely
382 recognized by the demangler, is:
510 // long double conversion members mangled as __opr
511 // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
516 <sect2 id="coding_style.example">
517 <title>By Example</title>
This library is written to appropriate C++ coding standards. As such,
it is intended to take precedence over the recommendations of the GNU
Coding Standard, which can be referenced in full here:
523 http://www.gnu.org/prep/standards/standards.html#Formatting
525 The rest of this is also interesting reading, but skip the "Design
528 The GCC coding conventions are here, and are also useful:
529 http://gcc.gnu.org/codingconventions.html
531 In addition, because it doesn't seem to be stated explicitly anywhere
532 else, there is an 80 column source limit.
534 ChangeLog entries for member functions should use the
535 classname::member function name syntax as follows:
537 1999-04-15 Dennis Ritchie <dr@att.com>
539 * src/basic_file.cc (__basic_file::open): Fix thinko in
540 _G_HAVE_IO_FILE_OPEN bits.
542 Notable areas of divergence from what may be previous local practice
543 (particularly for GNU C) include:
545 01. Pointers and references
549 char *p = "flop"; // wrong
550 char &c = *p; // wrong
552 Reason: In C++, definitions are mixed with executable code. Here,
553 p is being initialized, not *p. This is near-universal
554 practice among C++ programmers; it is normal for C hackers
555 to switch spontaneously as they gain experience.
557 02. Operator names and parentheses
560 operator == (type) // wrong
562 Reason: The == is part of the function name. Separating
563 it makes the declaration look like an expression.
565 03. Function names and parentheses
568 void mangle () // wrong
570 Reason: no space before parentheses (except after a control-flow
571 keyword) is near-universal practice for C++. It identifies the
572 parentheses as the function-call operator or declarator, as
573 opposed to an expression or other overloaded use of parentheses.
575 04. Template function indentation
576 template<typename T>
578 template_function(args)
581 template<class T>
582 void template_function(args) {};
      Reason: In class definitions, without indentation whitespace is
      needed both above and below the declaration to distinguish
      it visually from other members. (Also, re: "typename"
      rather than "class": T often could be int, which is
      not a class. "class", here, is an anachronism.)
590 05. Template class indentation
591 template<typename _CharT, typename _Traits>
592 class basic_ios : public ios_base
598 template<class _CharT, class _Traits>
599 class basic_ios : public ios_base
605 template<class _CharT, class _Traits>
606 class basic_ios : public ios_base
620 enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
622 07. Member initialization lists
623 All one line, separate from class name.
      : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
      gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
648 09. Member functions declarations and definitions
649 Keywords such as extern, static, export, explicit, inline, etc
650 go on the line above the function name. Thus
      Reason: GNU coding conventions dictate that return types for functions
      go on a separate line from the function name and parameter list
659 for definitions. For C++, where we have member functions that can
660 be either inline definitions or declarations, keeping to this
661 standard allows all member function names for a given class to be
662 aligned to the same margin, increasing readability.
665 10. Invocation of member functions with "this->"
666 For non-uglified names, use this->name to call the function.
672 Reason: Koenig lookup.
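
      The terse reason above can be unpacked with a small sketch. It is an
      illustration only, written with ordinary (non-uglified) names, and
      none of these types exist in the library: user::tag, buffer_base and
      buffer are hypothetical. In a template whose base class depends on a
      template parameter, an unqualified call does not search the base, so
      argument-dependent (Koenig) lookup may pick up an unrelated free
      function instead. Writing this->name keeps the call on the member.

```cpp
#include <cassert>

namespace user
{
  struct tag { };

  // A free function that argument-dependent (Koenig) lookup can find.
  inline int
  flush(tag)
  { return -1; }
}

template<typename T>
  struct buffer_base
  {
    int
    flush(T)
    { return 1; }
  };

template<typename T>
  struct buffer : public buffer_base<T>
  {
    // Qualified with this->: always calls the inherited member.
    int
    sync(T x)
    { return this->flush(x); }

    // Unqualified: the dependent base is not searched, so only
    // argument-dependent lookup at instantiation time finds a candidate.
    int
    sync_unqualified(T x)
    { return flush(x); }
  };

// Usage: the two spellings resolve to different functions.
inline void
check_lookup()
{
  buffer<user::tag> b;
  assert(b.sync(user::tag()) == 1);
  assert(b.sync_unqualified(user::tag()) == -1);
}
```

      The behavior shown assumes a compiler that implements two-phase name
      lookup, as G++ does.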
686 12. Spacing under protected and private in class declarations:
687 space above, none below
698 13. Spacing WRT return statements.
      no extra spacing before returns, no parentheses
716 14. Location of global variables.
717 All global variables of class type, whether in the "user visible"
718 space (e.g., cin) or the implementation namespace, must be defined
719 as a character array with the appropriate alignment and then later
720 re-initialized to the correct value.
722 This is due to startup issues on certain platforms, such as AIX.
723 For more explanation and examples, see src/globals.cc. All such
724 variables should be contained in that file, for simplicity.
726 15. Exception abstractions
727 Use the exception abstractions found in functexcept.h, which allow
728 C++ programmers to use this library with -fno-exceptions. (Even if
729 that is rarely advisable, it's a necessary evil for backwards
732 16. Exception error messages
733 All start with the name of the function where the exception is
734 thrown, and then (optional) descriptive text is added. Example:
736 __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
738 Reason: The verbose terminate handler prints out exception::what(),
739 as well as the typeinfo for the thrown exception. As this is the
740 default terminate handler, by putting location info into the
741 exception string, a very useful error message is printed out for
742 uncaught exceptions. So useful, in fact, that non-programmers can
743 give useful error messages, and programmers can intelligently
744 speculate what went wrong without even using a debugger.
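
      The same convention can be sketched in portable C++. This is an
      illustration only: throw_logic_error and make_string are hypothetical
      names standing in for the library's internal helpers, and the sketch
      throws directly rather than going through functexcept.h.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the library's internal throw helpers.
inline void
throw_logic_error(const char* what)
{ throw std::logic_error(what); }

// The message leads with the name of the throwing function, followed
// by optional descriptive text.
inline std::string
make_string(const char* s)
{
  if (!s)
    throw_logic_error("make_string: NULL not valid");
  return std::string(s);
}

// Usage: what() identifies where the exception came from.
inline void
check_message()
{
  bool caught = false;
  try
    { make_string(0); }
  catch (const std::logic_error& e)
    { caught = std::strncmp(e.what(), "make_string", 11) == 0; }
  assert(caught);
}
```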
746 17. The doxygen style guide to comments is a separate document,
749 The library currently has a mixture of GNU-C and modern C++ coding
750 styles. The GNU C usages will be combed out gradually.
754 For nonstandard names appearing in Standard headers, we are constrained
755 to use names that begin with underscores. This is called "uglification".
758 Local and argument names: __[a-z].*
760 Examples: __count __ix __s1
762 Type names and template formal-argument names: _[A-Z][^_].*
764 Examples: _Helper _CharT _N
766 Member data and function names: _M_.*
768 Examples: _M_num_elements _M_initialize ()
770 Static data members, constants, and enumerations: _S_.*
772 Examples: _S_max_elements _S_default_value
774 Don't use names in the same scope that differ only in the prefix,
775 e.g. _S_top and _M_top. See BADNAMES for a list of forbidden names.
    (The most tempting of these seem to be "_T" and "__sz".)
778 Names must never have "__" internally; it would confuse name
779 unmanglers on some targets. Also, never use "__[0-9]", same reason.
781 --------------------------
795 gribble(const gribble&);
798 gribble(int __howmany);
801 operator=(const gribble&);
806 // Start with a capital letter, end with a period.
808 public_member(const char* __arg) const;
810 // In-class function definitions should be restricted to one-liners.
      one_line() { return 0; }
815 two_lines(const char* arg)
816 { return strchr(arg, 'a'); }
819 three_lines(); // inline, but defined below.
822 template<typename _Formal_argument>
824 public_template() const throw();
826 template<typename _Iterator>
836 int _M_private_function();
845 _S_initialize_library();
848 // More-or-less-standard language features described by lack, not presence.
849 # ifndef _G_NO_LONGLONG
850 extern long long _G_global_with_a_good_long_name; // avoid globals!
853 // Avoid in-class inline definitions, define separately;
854 // likewise for member class definitions:
856 gribble::public_member() const
857 { int __local = 0; return __local; }
859 class gribble::_Helper
863 friend class gribble;
867 // Names beginning with "__": only for arguments and
868 // local variables; never use "__" in a type name, or
869 // within any name; never use "__[0-9]".
871 #endif /* _HEADER_ */
876 template<typename T> // notice: "typename", not "class", no space
877 long_return_value_type<with_many, args>
878 function_name(char* pointer, // "char *pointer" is wrong.
880 const Reference& ref)
882 // int a_local; /* wrong; see below. */
888 int a_local = 0; // declare variable at first use.
890 // char a, b, *p; /* wrong */
893 char* c = "abc"; // each variable goes on its own line, always.
895 // except maybe here...
896 for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1) {
    : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
906 gribble::three_lines()
908 // doesn't fit in one line.
915 <sect1 id="contrib.doc_style" xreflabel="Documentation Style">
916 <?dbhtml filename="documentation_style.html"?>
917 <title>Documentation Style</title>
918 <sect2 id="doc_style.doxygen">
919 <title>Doxygen</title>
920 <sect3 id="doxygen.prereq">
921 <title>Prerequisites</title>
923 Prerequisite tools are Bash 2.x,
924 <ulink url="http://www.doxygen.org/">Doxygen</ulink>, and
925 the <ulink url="http://www.gnu.org/software/coreutils/">GNU
926 coreutils</ulink>. (GNU versions of find, xargs, and possibly
927 sed and grep are used, just because the GNU versions make
932 To generate the pretty pictures and hierarchy
934 <ulink url="http://www.graphviz.org">Graphviz</ulink>
935 package will need to be installed.
939 <sect3 id="doxygen.rules">
940 <title>Generating the Doxygen Files</title>
942 The following Makefile rules run Doxygen to generate HTML
943 docs, XML docs, and the man pages.
947 <screen><userinput>make doc-html-doxygen</userinput></screen>
951 <screen><userinput>make doc-xml-doxygen</userinput></screen>
955 <screen><userinput>make doc-man-doxygen</userinput></screen>
959 Careful observers will see that the Makefile rules simply call
960 a script from the source tree, <filename>run_doxygen</filename>, which
961 does the actual work of running Doxygen and then (most
962 importantly) massaging the output files. If for some reason
963 you prefer to not go through the Makefile, you can call this
964 script directly. (Start by passing <literal>--help</literal>.)
968 If you wish to tweak the Doxygen settings, do so by editing
969 <filename>doc/doxygen/user.cfg.in</filename>. Notes to fellow
970 library hackers are written in triple-# comments.
975 <sect3 id="doxygen.markup">
976 <title>Markup</title>
979 In general, libstdc++ files should be formatted according to
980 the rules found in the
981 <link linkend="contrib.coding_style">Coding Standard</link>. Before
982 any doxygen-specific formatting tweaks are made, please try to
983 make sure that the initial formatting is sound.
987 Adding Doxygen markup to a file (informally called
988 <quote>doxygenating</quote>) is very simple. The Doxygen manual can be
990 <ulink url="http://www.stack.nl/~dimitri/doxygen/download.html#latestman">here</ulink>.
We try to use a very recent version of Doxygen.
996 <classname>deque</classname>/<classname>vector</classname>/<classname>list</classname>
997 and <classname>std::pair</classname> as examples. For
998 functions, see their member functions, and the free functions
999 in <filename>stl_algobase.h</filename>. Member functions of
1000 other container-like types should read similarly to these
1005 Some commentary to accompany
1006 the first list in the <ulink url="http://www.stack.nl/~dimitri/doxygen/docblocks.html">Special
1007 Documentation Blocks</ulink> section of
1013 <para>For longer comments, use the Javadoc style...</para>
1018 ...not the Qt style. The intermediate *'s are preferred.
1024 Use the triple-slash style only for one-line comments (the
1025 <quote>brief</quote> mode).
1031 This is disgusting. Don't do this.
1037 Some specific guidelines:
1041 Use the @-style of commands, not the !-style. Please be
1042 careful about whitespace in your markup comments. Most of the
1043 time it doesn't matter; doxygen absorbs most whitespace, and
1044 both HTML and *roff are agnostic about whitespace. However,
1045 in <pre> blocks and @code/@endcode sections, spacing can
1046 have <quote>interesting</quote> effects.
1050 Use either kind of grouping, as
1051 appropriate. <filename>doxygroups.cc</filename> exists for this
1052 purpose. See <filename>stl_iterator.h</filename> for a good example
1053 of the <quote>other</quote> kind of grouping.
1057 Please use markup tags like @p and @a when referring to things
1058 such as the names of function parameters. Use @e for emphasis
1059 when necessary. Use @c to refer to other standard names.
1060 (Examples of all these abound in the present code.)
1064 Complicated math functions should use the multi-line
1065 format. An example from <filename>random.h</filename>:
1071 * @brief A model of a linear congruential random number generator.
1074 * x_{i+1}\leftarrow(ax_{i} + c) \bmod m
1081 Be careful about using certain, special characters when
1082 writing Doxygen comments. Single and double quotes, and
1083 separators in filenames are two common trouble spots. When in
1084 doubt, consult the following table.
1088 <title>HTML to Doxygen Markup Comparison</title>
1089 <tgroup cols='2' align='left' colsep='1' rowsep='1'>
1090 <colspec colname='c1'></colspec>
1091 <colspec colname='c2'></colspec>
1096 <entry>Doxygen</entry>
1107 <entry>"</entry>
1112 <entry>'</entry>
1117 <entry><i></entry>
1118 <entry>@a word</entry>
1122 <entry><b></entry>
1123 <entry>@b word</entry>
1127 <entry><code></entry>
1128 <entry>@c word</entry>
1132 <entry><em></entry>
<entry>@e word</entry>
1137 <entry><em></entry>
1138 <entry><em>two words or more</em></entry>
1150 <sect2 id="doc_style.docbook">
1151 <title>Docbook</title>
1153 <sect3 id="docbook.prereq">
1154 <title>Prerequisites</title>
1156 Editing the DocBook sources requires an XML editor. Many
1157 exist: some notable options
1158 include <command>emacs</command>, <application>Kate</application>,
1159 or <application>Conglomerate</application>.
1163 Some editors support special <quote>XML Validation</quote>
1164 modes that can validate the file as it is
1165 produced. Recommended is the <command>nXML Mode</command>
1166 for <command>emacs</command>.
1170 Besides an editor, additional DocBook files and XML tools are
1175 Access to the DocBook stylesheets and DTD is required. The
1176 stylesheets are usually packaged by vendor, in something
1177 like <filename>docbook-style-xsl</filename>. To exactly match
1178 generated output, please use a version of the stylesheets
1180 to <filename>docbook-style-xsl-1.74.0-5</filename>. The
1181 installation directory for this package corresponds to
1182 the <literal>XSL_STYLE_DIR</literal>
1183 in <filename>doc/Makefile.am</filename> and defaults
1184 to <filename class="directory">/usr/share/sgml/docbook/xsl-stylesheets</filename>.
1188 For processing XML, an XML processor and some style
sheets are necessary. The default is <command>xsltproc</command>,
provided by <filename>libxslt</filename>.
1194 For validating the XML document, you'll need
1195 something like <command>xmllint</command> and access to the
1196 DocBook DTD. These are provided
1197 by a vendor package like <filename>libxml2</filename>.
1201 For PDF output, something that transforms valid XML to PDF is
1202 required. Possible solutions include <command>xmlto</command>,
1203 <ulink url="http://xmlgraphics.apache.org/fop/">Apache
1204 FOP</ulink>, or <command>prince</command>. Other options are
1205 listed on the DocBook web <ulink
1206 url="http://wiki.docbook.org/topic/DocBookPublishingTools">pages</ulink>. Please
1207 consult the <email>libstdc++@gcc.gnu.org</email> list when
1208 preparing printed manuals for current best practice and suggestions.
1212 Make sure that the XML documentation and markup is valid for
1213 any change. This can be done easily, with the validation rules
1214 in the <filename>Makefile</filename>, which is equivalent to doing:
1219 xmllint --noout --valid <filename>xml/index.xml</filename>
1224 <sect3 id="docbook.rules">
1225 <title>Generating the DocBook Files</title>
1228 The following Makefile rules generate (in order): an HTML
1229 version of all the documentation, a PDF version of the same, a
1230 single XML document, and the result of validating the entire XML
1235 <screen><userinput>make doc-html</userinput></screen>
1239 <screen><userinput>make doc-pdf</userinput></screen>
1243 <screen><userinput>make doc-xml-single</userinput></screen>
1247 <screen><userinput>make doc-xml-validate</userinput></screen>
1252 <sect3 id="docbook.examples">
1253 <title>File Organization and Basics</title>
1256 <emphasis>Which files are important</emphasis>
1258 All Docbook files are in the directory
1259 libstdc++-v3/doc/xml
1261 Inside this directory, the files of importance:
1262 spine.xml - index to documentation set
1263 manual/spine.xml - index to manual
1264 manual/*.xml - individual chapters and sections of the manual
1265 faq.xml - index to FAQ
1266 api.xml - index to source level / API
1268 All *.txml files are template xml files, i.e., otherwise empty files with
1269 the correct structure, suitable for filling in with new information.
1271 <emphasis>Canonical Writing Style</emphasis>
1275 member function template
1276 (via C++ Templates, Vandevoorde)
1278 class in namespace std: allocator, not std::allocator
1280 header file: iostream, not <iostream>
1283 <emphasis>General structure</emphasis>
1318 <sect3 id="docbook.markup">
1319 <title>Markup By Example</title>
1322 Complete details on Docbook markup can be found in the DocBook
1324 <ulink url="http://www.docbook.org/tdg/en/html/part2.html">online</ulink>.
1325 An incomplete reference for HTML to Docbook conversion is
1326 detailed in the table below.
1330 <title>HTML to Docbook XML Markup Comparison</title>
1331 <tgroup cols='2' align='left' colsep='1' rowsep='1'>
1332 <colspec colname='c1'></colspec>
1333 <colspec colname='c2'></colspec>
1338 <entry>Docbook</entry>
1344 <entry><p></entry>
1345 <entry><para></entry>
1348 <entry><pre></entry>
1349 <entry><computeroutput>, <programlisting>,
1350 <literallayout></entry>
1353 <entry><ul></entry>
1354 <entry><itemizedlist></entry>
1357 <entry><ol></entry>
1358 <entry><orderedlist></entry>
<entry><li></entry>
1362 <entry><listitem></entry>
1365 <entry><dl></entry>
1366 <entry><variablelist></entry>
1369 <entry><dt></entry>
1370 <entry><term></entry>
1373 <entry><dd></entry>
1374 <entry><listitem></entry>
1378 <entry><a href=""></entry>
1379 <entry><ulink url=""></entry>
1382 <entry><code></entry>
1383 <entry><literal>, <programlisting></entry>
1386 <entry><strong></entry>
1387 <entry><emphasis></entry>
1390 <entry><em></entry>
1391 <entry><emphasis></entry>
1394 <entry>"</entry>
1395 <entry><quote></entry>
1402 And examples of detailed markup for which there are no real HTML
1403 equivalents are listed in the table below.
1407 <title>Docbook XML Element Use</title>
1408 <tgroup cols='2' align='left' colsep='1' rowsep='1'>
1409 <colspec colname='c1'></colspec>
1410 <colspec colname='c2'></colspec>
1414 <entry>Element</entry>
1421 <entry><structname></entry>
1422 <entry><structname>char_traits</structname></entry>
1425 <entry><classname></entry>
1426 <entry><classname>string</classname></entry>
1429 <entry><function></entry>
1431 <para><function>clear()</function></para>
1432 <para><function>fs.clear()</function></para>
1436 <entry><type></entry>
1437 <entry><type>long long</type></entry>
1440 <entry><varname></entry>
1441 <entry><varname>fs</varname></entry>
1444 <entry><literal></entry>
1446 <para><literal>-Weffc++</literal></para>
1447 <para><literal>rel_ops</literal></para>
1451 <entry><constant></entry>
1453 <para><constant>_GNU_SOURCE</constant></para>
1454 <para><constant>3.0</constant></para>
1458 <entry><command></entry>
1459 <entry><command>g++</command></entry>
1462 <entry><errortext></entry>
1463 <entry><errortext>In instantiation of</errortext></entry>
1466 <entry><filename></entry>
1468 <para><filename class="headerfile">ctype.h</filename></para>
1469 <para><filename class="directory">/home/gcc/build</filename></para>
1470 <para><filename class="libraryfile">libstdc++.so</filename></para>
1482 <sect1 id="contrib.design_notes" xreflabel="Design Notes">
1483 <?dbhtml filename="source_design_notes.html"?>
1484 <title>Design Notes</title>
This paper covers two major areas:
1495 - Features and policies not mentioned in the standard that
1496 the quality of the library implementation depends on, including
1497 extensions and "implementation-defined" features;
1499 - Plans for required but unimplemented library features and
1500 optimizations to them.
1505 The standard defines a large library, much larger than the standard
1506 C library. A naive implementation would suffer substantial overhead
1507 in compile time, executable size, and speed, rendering it unusable
1508 in many (particularly embedded) applications. The alternative demands
1509 care in construction, and some compiler support, but there is no
1510 need for library subsets.
1512 What are the sources of this overhead? There are four main causes:
1514 - The library is specified almost entirely as templates, which
1515 with current compilers must be included in-line, resulting in
1516 very slow builds as tens or hundreds of thousands of lines
1517 of function definitions are read for each user source file.
1518 Indeed, the entire SGI STL, as well as the dos Reis valarray,
1519 are provided purely as header files, largely for simplicity in
1520 porting. Iostream/locale is (or will be) as large again.
1522 - The library is very flexible, specifying a multitude of hooks
1523 where users can insert their own code in place of defaults.
1524 When these hooks are not used, any time and code expended to
1525 support that flexibility is wasted.
- Templates are often described as causing "code bloat". In
practice, this refers (when it refers to anything real) to several
independent processes. First, when a class template is manually
instantiated in its entirety, current compilers place the definitions
1531 for all members in a single object file, so that a program linking
1532 to one member gets definitions of all. Second, template functions
1533 which do not actually depend on the template argument are, under
1534 current compilers, generated anew for each instantiation, rather
1535 than being shared with other instantiations. Third, some of the
1536 flexibility mentioned above comes from virtual functions (both in
1537 regular classes and template classes) which current linkers add
1538 to the executable file even when they manifestly cannot be called.
1540 - The library is specified to use a language feature, exceptions,
1541 which in the current gcc compiler ABI imposes a run time and
1542 code space cost to handle the possibility of exceptions even when
1543 they are not used. Under the new ABI (accessed with -fnew-abi),
there is a space overhead and a small reduction in code efficiency,
because the non-local branches that exceptions introduce cost some
optimization opportunities.
1548 What can be done to eliminate this overhead? A variety of coding
1549 techniques, and compiler, linker and library improvements and
1550 extensions may be used, as covered below. Most are not difficult,
1551 and some are already implemented in varying degrees.
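
One such coding technique, sketched here as an illustration rather than
as a description of what the library does, addresses code that does not
depend on the template argument: move it into an ordinary base class so
it is compiled once and shared by every instantiation. The names
counter_base and last_value are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Type-independent bookkeeping lives in an ordinary class, so only one
// copy of this code exists no matter how many instantiations are used.
class counter_base
{
public:
  counter_base() : size_(0) { }

  std::size_t
  size() const
  { return size_; }

protected:
  void
  grow()
  { ++size_; }

private:
  std::size_t size_;
};

// Only the code that really depends on T is generated per instantiation.
template<typename T>
  class last_value : public counter_base
  {
  public:
    last_value() : value_() { }

    void
    push(const T& v)
    {
      value_ = v;
      this->grow();
    }

    const T&
    last() const
    { return value_; }

  private:
    T value_;
  };

// Usage: two instantiations share counter_base::grow and size.
inline void
check_sharing()
{
  last_value<int> a;
  last_value<double> b;
  a.push(3);
  a.push(4);
  b.push(1.5);
  assert(a.size() == 2 && a.last() == 4);
  assert(b.size() == 1 && b.last() == 1.5);
}
```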
1553 Overhead: Compilation Time
1554 --------------------------
1556 Providing "ready-instantiated" template code in object code archives
1557 allows us to avoid generating and optimizing template instantiations
1558 in each compilation unit which uses them. However, the number of such
1559 instantiations that are useful to provide is limited, and anyway this
1560 is not enough, by itself, to minimize compilation time. In particular,
1561 it does not reduce time spent parsing conforming headers.
1563 Quicker header parsing will depend on library extensions and compiler
1564 improvements. One approach is some variation on the techniques
1565 previously marketed as "pre-compiled headers", now standardized as
1566 support for the "export" keyword. "Exported" template definitions
1567 can be placed (once) in a "repository" -- really just a library, but
1568 of template definitions rather than object code -- to be drawn upon
1569 at link time when an instantiation is needed, rather than placed in
1570 header files to be parsed along with every compilation unit.
1572 Until "export" is implemented we can put some of the lengthy template
1573 definitions in #if guards or alternative headers so that users can skip
1574 over the full definitions when they need only the ready-instantiated
1575 specializations.
1577 To be precise, this means that certain headers which define
1578 templates which users normally use only for certain arguments
1579 can be instrumented to avoid exposing the template definitions
1580 to the compiler unless a macro is defined. For example, in
1581 <string>, we might have:
1583 template <class _CharT, ... > class basic_string {
1584 ... // member declarations
1585 };
1586 ... // operator declarations
1589 # if _G_NO_TEMPLATE_EXPORT
1590 # include <bits/std_locale.h> // headers needed by definitions
1592 # include <bits/string.tcc> // member and global template definitions.
1593 # endif
1596 Users who compile without specifying a strict-ISO-conforming flag
1597 would not see many of the template definitions they now see, and rely
1598 instead on ready-instantiated specializations in the library. This
1599 technique would be useful for the following substantial components:
1600 string, locale/iostreams, valarray. It would *not* be useful or
1601 usable with the following: containers, algorithms, iterators,
1602 allocator. Since these constitute a large (though decreasing)
1603 fraction of the library, the benefit the technique offers is
1604 limited.
1606 The language specifies the semantics of the "export" keyword, but
1607 the gcc compiler does not yet support it. When it does, problems
1608 with large template inclusions can largely disappear, given some
1609 minor library reorganization, along with the need for the apparatus
1610 described above.
1612 Overhead: Flexibility Cost
1613 --------------------------
1615 The library offers many places where users can specify operations
1616 to be performed by the library in place of defaults. Sometimes
1617 this seems to require that the library use a more-roundabout, and
1618 possibly slower, way to accomplish the default requirements than
1619 would be used otherwise.
1621 The primary protection against this overhead is thorough compiler
1622 optimization, to crush out layers of inline function interfaces.
1623 Kuck & Associates has demonstrated the practicality of this kind
1624 of optimization.
1626 The second line of defense against this overhead is explicit
1627 specialization. By defining helper function templates, and writing
1628 specialized code for the default case, overhead can be eliminated
1629 for that case without sacrificing flexibility. This takes full
1630 advantage of any ability of the optimizer to crush out degenerate
1631 code.
1633 The library specifies many virtual functions which current linkers
1634 load even when they cannot be called. Some minor improvements to the
1635 compiler and to ld would eliminate any such overhead by simply
1636 omitting virtual functions that the complete program does not call.
1637 A prototype of this work has already been done. For targets where
1638 GNU ld is not used, a "pre-linker" could do the same job.
1640 The main areas in the standard interface where user flexibility
1641 can result in overhead are:
1643 - Allocators: Containers are specified to use user-definable
1644 allocator types and objects, making tuning for the container
1645 characteristics tricky.
1647 - Locales: the standard specifies locale objects used to implement
1648 iostream operations, involving many virtual functions which use
1649 streambuf iterators.
1651 - Algorithms and containers: these may be instantiated on any type,
1652 frequently duplicating code for identical operations.
1654 - Iostreams and strings: users are permitted to use these on their
1655 own types, and specify the operations the stream must use on these
1656 types.
1658 Note that these sources of overhead are _avoidable_. The techniques
1659 to avoid them are covered below.
1664 In the SGI STL, and in some other headers, many of the templates
1665 are defined "inline" -- either explicitly or by their placement
1666 in class definitions -- which should not be inline. This is a
1667 source of code bloat. Matt had remarked that he was relying on
1668 the compiler to recognize what was too big to benefit from inlining,
1669 and generate it out-of-line automatically. However, this also can
1670 result in code bloat except where the linker can eliminate the extra
1673 Fixing these cases will require an audit of all inline functions
1674 defined in the library to determine which merit inlining, and moving
1675 the rest out of line. This is an issue mainly in chapters 23, 25, and
1676 27. Of course it can be done incrementally, and we should generally
1677 accept patches that move large functions out of line and into ".tcc"
1678 files, which can later be pulled into a repository. Compiler/linker
1679 improvements to recognize very large inline functions and move them
1680 out-of-line, but shared among compilation units, could make this
1683 Pre-instantiating template specializations currently produces large
1684 amounts of dead code which bloats statically linked programs. The
1685 current state of the static library, libstdc++.a, is intolerable on
1686 this account, and will fuel further confused speculation about a need
1687 for a library "subset". A compiler improvement that treats each
1688 instantiated function as a separate object file, for linking purposes,
1689 would be one solution to this problem. An alternative would be to
1690 split up the manual instantiation files into dozens upon dozens of
1691 little files, each compiled separately, but an abortive attempt at
1692 this was done for <string> and, though it is far from complete, it
1693 is already a nuisance. A better interim solution (just until we have
1694 "export") is badly needed.
1696 When building a shared library, the current compiler/linker cannot
1697 automatically generate the instantiations needed. This creates a
1698 miserable situation; it means any time something is changed in the
1699 library, before a shared library can be built someone must manually
1700 copy the declarations of all templates that are needed by other parts
1701 of the library to an "instantiation" file, and add it to the build
1702 system to be compiled and linked to the library. This process is
1703 readily automated, and should be automated as soon as possible.
1704 Users building their own shared libraries experience identical
1707 Sharing common aspects of template definitions among instantiations
1708 can radically reduce code bloat. The compiler could help a great
1709 deal here by recognizing when a function depends on nothing about
1710 a template parameter, or only on its size, and giving the resulting
1711 function a link-name "equate" that allows it to be shared with other
1712 instantiations. Implementation code could take advantage of the
1713 capability by factoring out code that does not depend on the template
1714 argument into separate functions to be merged by the compiler.
1716 Until such a compiler optimization is implemented, much can be done
1717 manually (if tediously) in this direction. One such optimization is
1718 to derive class templates from non-template classes, and move as much
1719 implementation as possible into the base class. Another is to partial-
1720 specialize certain common instantiations, such as vector<T*>, to share
1721 code for instantiations on all types T. While these techniques work,
1722 they are far from the complete solution that a compiler improvement
1723 would provide.
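
The first manual technique, deriving a class template from a
non-template class, might look roughly like the sketch below. The
names stack_base and fixed_stack are hypothetical and exist only for
illustration; nothing here is taken from the library sources.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Nothing in this base depends on the element type, so every
// instantiation of the derived template shares these definitions.
class stack_base {
public:
    std::size_t size() const { return size_; }
    bool empty() const { return size_ == 0; }
    bool full() const { return size_ == capacity_; }

protected:
    explicit stack_base(std::size_t capacity)
        : size_(0), capacity_(capacity) {}

    // Index of the slot the next push should fill.
    std::size_t claim_slot() {
        assert(size_ < capacity_);
        return size_++;
    }
    // Index of the slot the next pop should read.
    std::size_t release_slot() {
        assert(size_ > 0);
        return --size_;
    }

private:
    std::size_t size_;
    std::size_t capacity_;
};

// Only the thin, type-dependent layer is generated per element type.
template <class T, std::size_t N>
class fixed_stack : public stack_base {
public:
    fixed_stack() : stack_base(N) {}
    void push(const T& v) { data_[claim_slot()] = v; }
    T pop() { return data_[release_slot()]; }

private:
    T data_[N];
};

}  // namespace sketch
```

All the bookkeeping lives in stack_base, so fixed_stack<int, 4> and
fixed_stack<double, 2> differ only in the two one-line members that
touch data_.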
1725 Overhead: Expensive Language Features
1726 -------------------------------------
1728 The main "expensive" language feature used in the standard library
1729 is exception support, which requires compiling in cleanup code with
1730 static table data to locate it, and linking in library code to use
1731 the table. For small embedded programs the amount of such library
1732 code and table data is assumed by some to be excessive. Under the
1733 "new" ABI this perception is generally exaggerated, although in some
1734 cases it may actually be excessive.
1736 To implement a library which does not use exceptions directly is
1737 not difficult given minor compiler support (to "turn off" exceptions
1738 and ignore exception constructs), and results in no great library
1739 maintenance difficulties. To be precise, given "-fno-exceptions",
1740 the compiler should treat "try" blocks as ordinary blocks, and
1741 "catch" blocks as dead code to ignore or eliminate. Compiler
1742 support is not strictly necessary, except in the case of "function
1743 try blocks"; otherwise the following macros almost suffice:
1746 #define try if (true)
1747 #define catch(X) else if (false)
1749 However, there may be a need to use function try blocks in the
1750 library implementation, and use of macros in this way can make
1751 correct diagnostics impossible. Furthermore, use of this scheme
1752 would require the library to call a function to re-throw exceptions
1753 from a try block. Implementing the above semantics in the compiler
1756 Given the support above (however implemented) it only remains to
1757 replace code that "throws" with a call to a well-documented "handler"
1758 function in a separate compilation unit which may be replaced by
1759 the user. The main source of exceptions that would be difficult
1760 for users to avoid is memory allocation failures, but users can
1761 define their own memory allocation primitives that never throw.
1762 Otherwise, the complete list of such handlers, and which library
1763 functions may call them, would be needed for users to be able to
1764 implement the necessary substitutes. (Fortunately, they have the
1765 source code.)
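
As a rough sketch of the "handler" idea, code that would otherwise
throw can call a replaceable function instead. The text proposes
replacing the handler at link time from a separate compilation unit;
the sketch swaps in a settable function pointer purely so that it
fits in one file. All names (overflow_handler, small_buffer, and so
on) are hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>

namespace sketch {

// Hypothetical handler type; a real library would document one such
// hook for each kind of failure it can report.
typedef void (*overflow_handler)(const char* what);

// Default behavior when nobody installs a handler: give up.
inline void default_overflow_handler(const char*) { std::abort(); }

// Assumption: run-time replacement through a pointer, standing in for
// the link-time replacement described in the text.
inline overflow_handler& current_overflow_handler() {
    static overflow_handler handler = default_overflow_handler;
    return handler;
}

class small_buffer {
public:
    small_buffer() : size_(0) {}

    // A conventional library would throw std::length_error here.
    // This version calls the handler and, if the handler returns,
    // reports failure to the caller.
    bool push(char c) {
        if (size_ == capacity) {
            current_overflow_handler()("small_buffer::push");
            return false;
        }
        data_[size_++] = c;
        return true;
    }
    std::size_t size() const { return size_; }

private:
    static const std::size_t capacity = 4;
    char data_[capacity];
    std::size_t size_;
};

}  // namespace sketch
```

A user who cannot afford abort() installs a handler that logs, resets,
or otherwise recovers, and then checks the return value of push().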
1770 The template capabilities of C++ offer enormous opportunities for
1771 optimizing common library operations, well beyond what would be
1772 considered "eliminating overhead". In particular, many operations
1773 done in Glibc with macros that depend on proprietary language
1774 extensions can be implemented in pristine Standard C++. For example,
1775 the chapter 25 algorithms, and even C library functions such as strchr,
1776 can be specialized for the case of static arrays of known (small) size.
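
For illustration only, here is one way a search could be overloaded
for static arrays of known size. The name find_char is hypothetical;
the point is just that the array overload sees the length as a
compile-time constant, which leaves the optimizer free to unroll the
loop for small sizes.

```cpp
#include <cstddef>

namespace sketch {

// General form: the length is known only at run time.
inline const char* find_char(const char* first, const char* last, char c) {
    for (; first != last; ++first)
        if (*first == c)
            return first;
    return last;
}

// Overload selected for a static array: N is a compile-time constant.
// Returns a + N when c is not present.
template <std::size_t N>
inline const char* find_char(const char (&a)[N], char c) {
    for (std::size_t i = 0; i != N; ++i)
        if (a[i] == c)
            return a + i;
    return a + N;
}

}  // namespace sketch
```

No macro or language extension is involved; ordinary overload
resolution picks the array form whenever the argument is an array.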
1778 Detailed optimization opportunities are identified below where
1779 the component where they would appear is discussed. Of course new
1780 opportunities will be identified during implementation.
1782 Unimplemented Required Library Features
1783 ---------------------------------------
1785 The standard specifies hundreds of components, grouped broadly by
1786 chapter. These are listed in excruciating detail in the CHECKLIST
1800 Annex D backward compatibility
1802 Anyone participating in implementation of the library should obtain
1803 a copy of the standard, ISO 14882. People in the U.S. can obtain an
1804 electronic copy for US$18 from ANSI's web site. Those from other
1805 countries should visit http://www.iso.org/ to find out the location
1806 of their country's representation in ISO, in order to know who can
1809 The emphasis in the following sections is on unimplemented features
1810 and optimization opportunities.
1815 Chapter 17 concerns overall library requirements.
1817 The standard doesn't mention threads. A multi-thread (MT) extension
1818 primarily affects operators new and delete (18), allocator (20),
1819 string (21), locale (22), and iostreams (27). The common underlying
1820 support needed for this is discussed under chapter 20.
1822 The standard requirements on names from the C headers create a
1823 lot of work, mostly done. Names in the C headers must be visible
1824 in the std:: and sometimes the global namespace; the names in the
1825 two scopes must refer to the same object. More stringent is that
1826 Koenig lookup implies that any types specified as defined in std::
1827 really are defined in std::. Names optionally implemented as
1828 macros in C cannot be macros in C++. (An overview may be read at
1829 <http://www.cantrip.org/cheaders.html>). The scripts "inclosure"
1830 and "mkcshadow", and the directories shadow/ and cshadow/, are the
1831 beginning of an effort to conform in this area.
1833 A correct conforming definition of C header names based on underlying
1834 C library headers, and practical linking of conforming namespaced
1835 customer code with third-party C libraries depends ultimately on
1836 an ABI change, allowing namespaced C type names to be mangled into
1837 type names as if they were global, somewhat as C function names in a
1838 namespace, or C++ global variable names, are left unmangled. Perhaps
1839 another "extern" mode, such as 'extern "C-global"' would be an
1840 appropriate place for such type definitions. Such a type would
1841 affect mangling as follows:
1843 namespace A {
1844 struct X {};
1845 extern "C-global" { struct Y {}; } // or maybe just 'extern "C"'
1848 }
1849 void f(A::X*); // mangles to f__FPQ21A1X
1850 void f(A::Y*); // mangles to f__FP1Y
1852 (It may be that this is really the appropriate semantics for regular
1853 'extern "C"', and 'extern "C-global"', as an extension, would not be
1854 necessary.) This would allow functions declared in non-standard C headers
1855 (and thus fixable by neither us nor users) to link properly with functions
1856 declared using C types defined in properly-namespaced headers. The
1857 problem this solves is that C headers (which C++ programmers do persist
1858 in using) frequently forward-declare C struct tags without including
1859 the header where the type is defined, as in
1861 struct tm;
1862 void munge(tm*);
1864 Without some compiler accommodation, munge cannot be called by correct
1865 C++ code using a pointer to a correctly-scoped tm* value.
1867 The current C headers use the preprocessor extension "#include_next",
1868 which the compiler complains about when run "-pedantic".
1869 (Incidentally, it appears that "-fpedantic" is currently ignored,
1870 probably a bug.) The solution in the C compiler is to use
1871 "-isystem" rather than "-I", but unfortunately in g++ this seems
1872 also to wrap the whole header in an 'extern "C"' block, so it's
1873 unusable for C++ headers. The correct solution appears to be to
1874 allow the various special include-directory options, if not given
1875 an argument, to affect subsequent include-directory options additively,
1878 -pedantic -iprefix $(prefix) \
1879 -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
1880 -iwithprefix -I g++-v3/ext
1882 the compiler would search $(prefix)/g++-v3 and not report
1883 pedantic warnings for files found there, but treat files in
1884 $(prefix)/g++-v3/ext pedantically. (The undocumented semantics
1885 of "-isystem" in g++ stink. Can they be rescinded? If not it
1886 must be replaced with something more rationally behaved.)
1888 All the C headers need the treatment above; in the standard these
1889 headers are mentioned in various chapters. Below, I have only
1890 mentioned those that present interesting implementation issues.
1892 The components identified as "mostly complete", below, have not been
1893 audited for conformance. In many cases where the library passes
1894 conformance tests we have non-conforming extensions that must be
1895 wrapped in #if guards for "pedantic" use, and in some cases renamed
1896 in a conforming way for continued use in the implementation regardless
1897 of conformance flags.
1899 The STL portion of the library still depends on a header
1900 stl/bits/stl_config.h full of #ifdef clauses. This apparatus
1901 should be replaced with autoconf/automake machinery.
1903 The SGI STL defines a type_traits<> template, specialized for
1904 many types in their code including the built-in numeric and
1905 pointer types and some library types, to direct optimizations of
1906 standard functions. The SGI compiler has been extended to generate
1907 specializations of this template automatically for user types,
1908 so that use of STL templates on user types can take advantage of
1909 these optimizations. Specializations for other, non-STL, types
1910 would make more optimizations possible, but extending the gcc
1911 compiler in the same way would be much better. The next round of
1912 standardization will probably ratify this, though likely with
1913 changes, so it should be renamed to place it in the
1914 implementation namespace.
1916 The SGI STL also defines a large number of extensions visible in
1917 standard headers. (Other extensions that appear in separate headers
1918 have been sequestered in subdirectories ext/ and backward/.) All
1919 these extensions should be moved to other headers where possible,
1920 and in any case wrapped in a namespace (not std!), and (where kept
1921 in a standard header) girded about with macro guards. Some cannot be
1922 moved out of standard headers because they are used to implement
1923 standard features. The canonical method for accommodating these
1924 is to use a protected name, aliased in macro guards to a user-space
1925 name. Unfortunately C++ offers no satisfactory template typedef
1926 mechanism, so very ad-hoc and unsatisfactory aliasing must be used
1929 Implementation of a template typedef mechanism should have the highest
1930 priority among possible extensions, on the same level as implementation
1931 of the template "export" feature.
1933 Chapter 18 Language support
1934 ----------------------------
1936 Headers: <limits> <new> <typeinfo> <exception>
1937 C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
1938 <ctime> <csignal> <cstdlib> (also 21, 25, 26)
1940 This defines the built-in exceptions, rtti, numeric_limits<>,
1941 operator new and delete. Much of this is provided by the
1942 compiler in its static runtime library.
1944 Work to do includes defining numeric_limits<> specializations in
1945 separate files for all target architectures. Values for integer types
1946 except for bool and wchar_t are readily obtained from the C header
1947 <limits.h>, but values for the remaining numeric types (bool, wchar_t,
1948 float, double, long double) must be entered manually. This is
1949 largely dog work except for those members whose values are not
1950 easily deduced from available documentation. Also, this involves
1951 some work in target configuration to identify the correct choice of
1952 file to build against and to install.
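
To make the split concrete, here is a small sketch using a
hypothetical limits<> template as a stand-in for numeric_limits<>,
with only two members shown. The integer specializations simply
forward the <climits> macros; bool has no such macros and is entered
by hand. The agrees_with_std helper is likewise hypothetical and
merely cross-checks the sketch against the host library.

```cpp
#include <climits>
#include <limits>

namespace sketch {

// Primary template: types with no specialization.
template <class T>
struct limits {
    static const bool is_specialized = false;
};

// Integer types: values come straight from <limits.h>.
template <>
struct limits<int> {
    static const bool is_specialized = true;
    static int min() { return INT_MIN; }
    static int max() { return INT_MAX; }
};

template <>
struct limits<unsigned long> {
    static const bool is_specialized = true;
    static unsigned long min() { return 0; }
    static unsigned long max() { return ULONG_MAX; }
};

// bool has no <limits.h> macros, so the values are written by hand.
template <>
struct limits<bool> {
    static const bool is_specialized = true;
    static bool min() { return false; }
    static bool max() { return true; }
};

// Cross-check a specialization against the host's numeric_limits<>.
template <class T>
inline bool agrees_with_std() {
    return limits<T>::min() == std::numeric_limits<T>::min()
        && limits<T>::max() == std::numeric_limits<T>::max();
}

}  // namespace sketch
```

The floating-point specializations are the ones that need per-target
files, since their values cannot be copied from <limits.h>.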
1954 The definitions of the various operators new and delete must be
1955 made thread-safe, which depends on a portable exclusion mechanism,
1956 discussed under chapter 20. Of course there is always plenty of
1957 room for improvements to the speed of operators new and delete.
1959 <cstdarg>, in Glibc, defines some macros that gcc does not allow to
1960 be wrapped into an inline function. Probably this header will demand
1961 attention whenever a new target is chosen. The functions atexit(),
1962 exit(), and abort() in cstdlib have different semantics in C++, so
1963 must be re-implemented for C++.
1965 Chapter 19 Diagnostics
1966 -----------------------
1968 Headers: <stdexcept>
1969 C headers: <cassert> <cerrno>
1971 This defines the standard exception objects, which are "mostly complete".
1972 Cygnus has a version, and now SGI provides a slightly different one.
1973 It makes little difference which we use.
1975 The C global name "errno", which C allows to be a variable or a macro,
1976 is required in C++ to be a macro. For MT it must typically result in
1979 Chapter 20 Utilities
1980 ---------------------
1981 Headers: <utility> <functional> <memory>
1982 C header: <ctime> (also in 18)
1984 SGI STL provides "mostly complete" versions of all the components
1985 defined in this chapter. However, the auto_ptr<> implementation
1986 is known to be wrong. Furthermore, the standard definition of it
1987 is known to be unimplementable as written. A minor change to the
1988 standard would fix it, and auto_ptr<> should be adjusted to match.
1990 Multi-threading affects the allocator implementation, and there must
1991 be configuration/installation choices for different users' MT
1992 requirements. Anyway, users will want to tune allocator options
1993 to support different target conditions, MT or no.
1995 The primitives used for MT implementation should be exposed, as an
1996 extension, for users' own work. We need cross-CPU "mutex" support,
1997 multi-processor shared-memory atomic integer operations, and single-
1998 processor uninterruptible integer operations, and all three configurable
1999 to be stubbed out for non-MT use, or to use an appropriately-loaded
2000 dynamic library for the actual runtime environment, or statically
2001 compiled in for cases where the target architecture is known.
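
One possible shape for primitives that can be "stubbed out" is a
policy class chosen at compile time. The sketch below assumes a C++11
toolchain and leans on std::atomic and std::mutex, which postdate
these notes, as stand-ins for the platform primitives. The names
single_threaded, multi_threaded, ref_count and id_source are
hypothetical.

```cpp
#include <atomic>
#include <mutex>

namespace sketch {

// Non-MT configuration: every primitive collapses to plain code.
struct single_threaded {
    typedef long counter_type;
    struct mutex_type {
        void lock() {}
        void unlock() {}
    };
    static long increment(counter_type& c) { return ++c; }
};

// MT configuration: real atomic operations and a real mutex.
struct multi_threaded {
    typedef std::atomic<long> counter_type;
    typedef std::mutex mutex_type;
    static long increment(counter_type& c) { return c.fetch_add(1) + 1; }
};

// Client code is written once against the policy.
template <class Threading>
class ref_count {
public:
    ref_count() : n_(0) {}
    long acquire() { return Threading::increment(n_); }

private:
    typename Threading::counter_type n_;
};

template <class Threading>
class id_source {
public:
    id_source() : next_(0) {}
    long next() {
        // With single_threaded this guard compiles to nothing useful.
        std::lock_guard<typename Threading::mutex_type> hold(m_);
        return next_++;
    }

private:
    typename Threading::mutex_type m_;
    long next_;
};

}  // namespace sketch
```

The third configuration mentioned above, binding to a dynamically
loaded library at run time, would be another policy with the same
interface and is not shown.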
2005 Headers: <string>
2006 C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
2007 <cstdlib> (also in 18, 25, 26)
2009 We have "mostly-complete" char_traits<> implementations. Many of the
2010 char_traits<char> operations might be optimized further using existing
2011 proprietary language extensions.
2013 We have a "mostly-complete" basic_string<> implementation. The work
2014 to manually instantiate char and wchar_t specializations in object
2015 files to improve link-time behavior is extremely unsatisfactory,
2016 literally tripling library-build time with no commensurate improvement
2017 in static program link sizes. It must be redone. (Similar work is
2018 needed for some components in chapters 22 and 27.)
2020 Other work needed for strings is MT-safety, as discussed under the
2023 The standard C type mbstate_t from <cwchar> and used in char_traits<>
2024 must be different in C++ than in C, because in C++ the default constructor
2025 value mbstate_t() must be the "base" or "ground" sequence state.
2026 (According to the likely resolution of a recently raised Core issue,
2027 this may become unnecessary. However, there are other reasons to
2028 use a state type not as limited as whatever the C library provides.)
2029 If we might want to provide conversions from (e.g.) internally-
2030 represented EUC-wide to externally-represented Unicode, or vice-
2031 versa, the mbstate_t we choose will need to be more accommodating
2032 than what might be provided by an underlying C library.
2034 There remain some basic_string template-member functions which do
2035 not overload properly with their non-template brethren. The infamous
2036 hack akin to what was done in vector<> is needed, to conform to
2037 23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
2038 or incomplete, are so marked for this reason.
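
The overloading problem, and the usual dispatch trick that works
around it, can be sketched as follows. This is not the library's
code: the class text is hypothetical, and std::is_integral (C++11)
stands in for whatever integer-detection trait the implementation
uses.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

namespace sketch {

class text {
public:
    // Non-template form: n copies of c.
    void assign(std::size_t n, char c) { s_.assign(n, c); }

    // Template form. For assign(3, 65) both arguments are int, so this
    // template is an exact match and wins overload resolution. It must
    // therefore redirect integer arguments back to the form above.
    template <class InputIt>
    void assign(InputIt first, InputIt last) {
        assign_dispatch(first, last, std::is_integral<InputIt>());
    }

    const std::string& str() const { return s_; }

private:
    // Integer arguments: really a (count, value) call.
    template <class Int>
    void assign_dispatch(Int n, Int c, std::true_type) {
        assign(static_cast<std::size_t>(n), static_cast<char>(c));
    }

    // Anything else: a genuine iterator range.
    template <class InputIt>
    void assign_dispatch(InputIt first, InputIt last, std::false_type) {
        s_.clear();
        for (; first != last; ++first)
            s_.push_back(*first);
    }

    std::string s_;
};

}  // namespace sketch
```

With this in place, t.assign(3, 65) and t.assign(p, p + 3) both do
what the caller expects, even though both enter through the template.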
2040 Replacing the string iterators, which currently are simple character
2041 pointers, with class objects would greatly increase the safety of the
2042 client interface, and also permit a "debug" mode in which range,
2043 ownership, and validity are rigorously checked. The current use of
2044 raw pointers as string iterators is evil. vector<> iterators need the
2045 same treatment. Note that the current implementation freely mixes
2046 pointers and iterators, and that must be fixed before safer iterators
2049 Some of the functions in <cstring> differ from the C versions,
2050 generally by being overloaded on const and non-const argument pointers.
2051 For example, in <cstring> strchr is overloaded. The functions isupper
2052 etc. in <cctype>, typically implemented as macros in C, are functions
2053 in C++, because they are overloaded with others of the same name
2054 defined in <locale>.
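
The const/non-const pairing looks roughly like this. The wrapper name
find_ch is hypothetical and is used only to keep the sketch from
colliding with the real strchr.

```cpp
#include <cstring>

namespace sketch {

// C declares one function,  char* strchr(const char*, int),  which
// quietly drops const. The C++ form is a const-correct pair.
inline const char* find_ch(const char* s, int c) {
    return std::strchr(s, c);
}

inline char* find_ch(char* s, int c) {
    return std::strchr(s, c);
}

}  // namespace sketch
```

A caller holding a const char* gets a const char* back and cannot
write through it by accident; a caller holding a char* keeps the
ability to modify the string.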
2056 Many of the functions required in <cwctype> and <cwchar> cannot be
2057 implemented using underlying C facilities on intended targets because
2058 such facilities only partly exist.
2062 Headers: <locale>
2063 C headers: <clocale>
2065 We have a "mostly complete" class locale, with the exception of
2066 code for constructing, and handling the names of, named locales.
2067 The ways that locales are named (particularly when categories
2068 (e.g. LC_TIME, LC_COLLATE) are different) vary among all target
2069 environments. This code must be written in various versions and
2070 chosen by configuration parameters.
2072 Members of many of the facets defined in <locale> are stubs. Generally,
2073 there are two sets of facets: the base class facets (which are supposed
2074 to implement the "C" locale) and the "byname" facets, which are supposed
2075 to read files to determine their behavior. The base ctype<>, collate<>,
2076 and numpunct<> facets are "mostly complete", except that the table of
2077 bitmask values used for "is" operations, and corresponding mask values,
2078 are still defined in libio and just included/linked. (We will need to
2079 implement these tables independently, soon, but should take advantage
2080 of libio where possible.) The num_put<>::put members for integer types
2081 are "mostly complete".
2083 A complete list of what has and has not been implemented may be
2084 found in CHECKLIST. However, note that the current definition of
2085 codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
2086 out the raw bytes representing the wide characters, rather than
2087 trying to convert each to a corresponding single "char" value.
2089 Some of the facets are more important than others. Specifically,
2090 the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
2091 are used by other library facilities defined in <string>, <istream>,
2092 and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
2093 in <fstream>, so a conforming iostream implementation depends on
2096 The "long long" type eventually must be supported, but code mentioning
2097 it should be wrapped in #if guards to allow pedantic-mode compiling.
2099 Performance of num_put<> and num_get<> depends critically on
2100 caching computed values in ios_base objects, and on extensions
2101 to the interface with streambufs.
2103 Specifically: retrieving a copy of the locale object, extracting
2104 the needed facets, and gathering data from them, for each call to
2105 (e.g.) operator<< would be prohibitively slow. To cache format
2106 data for use by num_put<> and num_get<> we have a _Format_cache<>
2107 object stored in the ios_base::pword() array. This is constructed
2108 and initialized lazily, and is organized purely for utility. It
2109 is discarded when a new locale with different facets is imbued.
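
The caching scheme can be approximated with the standard
xalloc/pword/register_callback interface, as in the sketch below. It
is not the library's _Format_cache<>: the struct format_cache and the
functions around it are hypothetical, and only two numpunct values
are cached.

```cpp
#include <cassert>
#include <ios>
#include <locale>
#include <sstream>

namespace sketch {

// Hypothetical stand-in for _Format_cache<>: data that is slow to
// fetch from the locale on every formatted operation.
struct format_cache {
    char decimal_point;
    char thousands_sep;
};

// One slot index, reserved once for the whole program.
inline int cache_slot() {
    static const int index = std::ios_base::xalloc();
    return index;
}

inline void cache_event(std::ios_base::event ev, std::ios_base& ios,
                        int slot) {
    void*& p = ios.pword(slot);
    if (ev != std::ios_base::copyfmt_event)  // imbue_event, erase_event
        delete static_cast<format_cache*>(p);
    // After copyfmt the pointer is a copy still owned by the source
    // stream, so it is only forgotten here, never deleted.
    p = 0;
}

inline const format_cache& get_cache(std::ios_base& ios) {
    const int slot = cache_slot();
    if (ios.iword(slot) == 0) {              // first use on this stream
        ios.register_callback(cache_event, slot);
        ios.iword(slot) = 1;
    }
    void*& p = ios.pword(slot);
    if (p == 0) {                            // built lazily
        const std::numpunct<char>& np =
            std::use_facet<std::numpunct<char> >(ios.getloc());
        format_cache* fresh = new format_cache;
        fresh->decimal_point = np.decimal_point();
        fresh->thousands_sep = np.thousands_sep();
        p = fresh;
    }
    assert(p != 0);
    return *static_cast<format_cache*>(p);
}

}  // namespace sketch
```

Given a std::ostringstream os, get_cache(os) builds the cache on the
first call and reuses it afterwards; os.imbue(...) triggers the
callback, which discards the cache so that the next call rebuilds it
from the new locale.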
2111 Using only the public interfaces of the iterator arguments to the
2112 facet functions would limit performance by forbidding "vector-style"
2113 character operations. The streambuf iterator optimizations are
2114 described under chapter 24, but facets can also bypass the streambuf
2115 iterators via explicit specializations and operate directly on the
2116 streambufs, and use extended interfaces to get direct access to the
2117 streambuf internal buffer arrays. These extensions are mentioned
2118 under chapter 27. These optimizations are particularly important
2121 Unused virtual members of locale facets can be omitted, as mentioned
2122 above, by a smart linker.
2124 Chapter 23 Containers
2125 ----------------------
2126 Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>
2128 All the components in chapter 23 are implemented in the SGI STL.
2129 They are "mostly complete"; they include a large number of
2130 nonconforming extensions which must be wrapped. Some of these
2131 are used internally and must be renamed or duplicated.
2133 The SGI components are optimized for large-memory environments. For
2134 embedded targets, different criteria might be more appropriate. Users
2135 will want to be able to tune this behavior. We should provide
2136 ways for users to compile the library with different memory usage
2139 A lot more work is needed on factoring out common code from different
2140 specializations to reduce code size here and in chapter 25. The
2141 easiest fix for this would be a compiler/ABI improvement that allows
2142 the compiler to recognize when a specialization depends only on the
2143 size (or other gross quality) of a template argument, and allow the
2144 linker to share the code with similar specializations. In its
2145 absence, many of the algorithms and containers can be partial-
2146 specialized, at least for the case of pointers, but this only solves
2147 a small part of the problem. Use of a type_traits-style template
2148 allows a few more optimization opportunities, more if the compiler
2149 can generate the specializations automatically.
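
The pointer case can be sketched with a hypothetical container named
bag. Every bag<T*> forwards to the single bag<void*> implementation,
so only trivial casting wrappers are generated per pointer type.
Pointers to const objects would need a const void* variant, which is
omitted here.

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Primary template: one copy of the code per element type.
template <class T>
class bag {
public:
    void add(const T& v) { items_.push_back(v); }
    const T& at(std::size_t i) const { return items_[i]; }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};

// Full specialization: the one real implementation for pointers. It
// must be declared before the partial specialization below, which
// would otherwise match bag<void*> and contain itself.
template <>
class bag<void*> {
public:
    void add(void* p) { items_.push_back(p); }
    void* at(std::size_t i) const { return items_[i]; }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<void*> items_;
};

// Partial specialization: all other pointer types share bag<void*>.
template <class T>
class bag<T*> {
public:
    void add(T* p) { impl_.add(p); }
    T* at(std::size_t i) const { return static_cast<T*>(impl_.at(i)); }
    std::size_t size() const { return impl_.size(); }

private:
    bag<void*> impl_;
};

}  // namespace sketch
```

As the text says, this covers only the pointer case; types that merely
have the same size as a pointer still get their own copy of the code.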
2151 As an optimization, containers can specialize on the default allocator
2152 and bypass it, or take advantage of details of its implementation
2153 after it has been improved upon.
2155 Replacing the vector iterators, which currently are simple element
2156 pointers, with class objects would greatly increase the safety of the
2157 client interface, and also permit a "debug" mode in which range,
2158 ownership, and validity are rigorously checked. The current use of
2159 pointers for iterators is evil.
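
A class-object iterator with a "debug" mode might start out like the
sketch below. checked_iterator is a hypothetical name, only the
forward operations are shown, and the checks are plain assert() calls
that vanish when NDEBUG is defined.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

namespace sketch {

template <class T>
class checked_iterator {
public:
    typedef std::forward_iterator_tag iterator_category;
    typedef T value_type;
    typedef std::ptrdiff_t difference_type;
    typedef T* pointer;
    typedef T& reference;

    // The iterator remembers the range it belongs to.
    checked_iterator(T* pos, T* first, T* last)
        : pos_(pos), first_(first), last_(last) {
        assert(first <= pos && pos <= last);       // range check
    }

    T& operator*() const {
        assert(pos_ != last_);                     // validity check
        return *pos_;
    }
    checked_iterator& operator++() {
        assert(pos_ != last_);                     // no stepping past end
        ++pos_;
        return *this;
    }
    checked_iterator operator++(int) {
        checked_iterator old(*this);
        ++*this;
        return old;
    }

    friend bool operator==(const checked_iterator& a,
                           const checked_iterator& b) {
        // Ownership check: both must refer to the same sequence.
        assert(a.first_ == b.first_ && a.last_ == b.last_);
        return a.pos_ == b.pos_;
    }
    friend bool operator!=(const checked_iterator& a,
                           const checked_iterator& b) {
        return !(a == b);
    }

private:
    T* pos_;
    T* first_;
    T* last_;
};

}  // namespace sketch
```

Because the iterator is a distinct class type, code that silently
mixes it with raw pointers stops compiling, which is exactly the
cleanup the text asks for.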
2161 As mentioned for chapter 24, the deque iterator is a good example of
2162 an opportunity to implement a "staged" iterator that would benefit
2163 from specializations of some algorithms.
2165 Chapter 24 Iterators
2166 ---------------------
2167 Headers: <iterator>
2169 Standard iterators are "mostly complete", with the exception of
2170 the stream iterators, which are not yet templatized on the
2171 stream type. Also, the base class template iterator<> appears
2172 to be wrong, so everything derived from it must also be wrong,
2175 The streambuf iterators (currently located in stl/bits/std_iterator.h,
2176 but should be under bits/) can be rewritten to take advantage of
2177 friendship with the streambuf implementation.
2179 Matt Austern has identified opportunities where certain iterator
2180 types, particularly including streambuf iterators and deque
2181 iterators, have a "two-stage" quality, such that an intermediate
2182 limit can be checked much more quickly than the true limit on
2183 range operations. If identified with a member of iterator_traits,
2184 algorithms may be specialized for this case. Of course the
2185 iterators that have this quality can be identified by specializing
2188 Many of the algorithms must be specialized for the streambuf
2189 iterators, to take advantage of block-mode operations, so that
2190 the performance of iostream/locale operations does not suffer.
2191 It may be that they could be treated as staged iterators and
2192 take advantage of those optimizations.
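
To give the "two-stage" idea some shape, the sketch below walks a
sequence stored as separate contiguous blocks, much as a deque or a
chain of stream buffers might be. The algorithm checks the cheap
intermediate limit (the end of the current block) in bulk and crosses
the true limit (the end of the whole sequence) only between blocks.
block_cursor and find_offset are hypothetical names, and std::memchr
stands in for any block-mode operation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

namespace sketch {

// Hypothetical staged cursor over a list of contiguous char blocks.
class block_cursor {
public:
    typedef std::vector<std::vector<char> > blocks_type;

    explicit block_cursor(const blocks_type& blocks)
        : blocks_(&blocks), block_(0), pos_(0) { skip_exhausted(); }

    // True limit: no characters remain anywhere.
    bool at_end() const { return block_ == blocks_->size(); }

    // Intermediate limit: the contiguous run left in this block.
    const char* run_begin() const { return current().data() + pos_; }
    const char* run_end() const {
        return current().data() + current().size();
    }

    // Advance by n characters, all within the current run.
    void consume(std::size_t n) {
        assert(n <= static_cast<std::size_t>(run_end() - run_begin()));
        pos_ += n;
        skip_exhausted();
    }

private:
    const std::vector<char>& current() const {
        assert(!at_end());
        return (*blocks_)[block_];
    }
    // Move past blocks that are empty or fully consumed.
    void skip_exhausted() {
        while (block_ != blocks_->size()
               && pos_ == (*blocks_)[block_].size()) {
            ++block_;
            pos_ = 0;
        }
    }

    const blocks_type* blocks_;
    std::size_t block_;
    std::size_t pos_;
};

// Offset of the first c, or the total length when c is absent.
inline std::size_t find_offset(block_cursor cur, char c) {
    std::size_t offset = 0;
    while (!cur.at_end()) {
        const char* b = cur.run_begin();
        const std::size_t n = static_cast<std::size_t>(cur.run_end() - b);
        const void* hit = std::memchr(b, c, n);    // block-mode stage
        if (hit)
            return offset + (static_cast<const char*>(hit) - b);
        offset += n;
        cur.consume(n);
    }
    return offset;
}

}  // namespace sketch
```

A generic algorithm would select this path through a traits member, as
suggested above, and fall back to the element-at-a-time loop for
iterators without the staged interface.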
2194 Chapter 25 Algorithms
2195 ----------------------
2196 Headers: <algorithm>
2197 C headers: <cstdlib> (also in 18, 21, 26)
2199 The algorithms are "mostly complete". As mentioned above, they
2200 are optimized for speed at the expense of code and data size.
2202 Specializations of many of the algorithms for non-STL types would
2203 give performance improvements, but we must use great care not to
2204 interfere with fragile template overloading semantics for the
2205 standard interfaces. Conventionally the standard function template
2206 interface is an inline which delegates to a non-standard function
2207 which is then overloaded (this is already done in many places in
2208 the library). Particularly appealing opportunities for the sake of
2209 iostream performance are for copy and find applied to streambuf
2210 iterators or (as noted elsewhere) for staged iterators, of which
2211 the streambuf iterators are a good example.
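The delegation convention described above might look roughly like
the following sketch. The namespace sketch and the helper copy_aux
are hypothetical names chosen for illustration; they are not the
names used inside the library.

```cpp
#include <cstring>
#include <iterator>

namespace sketch {

// Generic fallback: element-by-element copy for any iterator pair.
template <typename In, typename Out>
Out copy_aux(In first, In last, Out out) {
    for (; first != last; ++first, ++out)
        *out = *first;
    return out;
}

// Overload selected for plain char pointers: a single block move.
inline char* copy_aux(const char* first, const char* last, char* out) {
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n != 0)
        std::memmove(out, first, n);
    return out + n;
}

// Standard-style entry point: one inline template that only
// delegates, so the public name itself is never overloaded.
template <typename In, typename Out>
inline Out copy(In first, In last, Out out) {
    return copy_aux(first, last, out);
}

} // namespace sketch
```

Because only the helper is overloaded, adding further special cases
does not change how calls to the public name are resolved.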
2213 The bsearch and qsort functions cannot be overloaded properly as
2214 required by the standard because gcc does not yet allow overloading
2215 on the extern-"C"-ness of a function pointer.
2217 Chapter 26 Numerics
2218 --------------------
2219 Headers: <complex> <valarray> <numeric>
2220 C headers: <cmath>, <cstdlib> (also 18, 21, 25)
2222 Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
2223 and the few algorithms from the STL are "mostly done". Of course
2224 optimization opportunities abound for the numerically literate. It
2225 is not clear whether the valarray implementation really conforms
2226 fully, in the assumptions it makes about aliasing (and lack thereof)
2229 The C div() and ldiv() functions are interesting, because they are the
2230 only case where a C library function returns a class object by value.
2231 Since the C++ type div_t must be different from the underlying C type
2232 (which is in the wrong namespace) the underlying functions div() and
2233 ldiv() cannot be re-used efficiently. Fortunately they are trivial to
2236 Chapter 27 Iostreams
2237 ---------------------
2238 Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
2239 <iomanip> <sstream> <fstream>
2240 C headers: <cstdio> <cwchar> (also in 21)
2242 Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
2243 ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
2244 basic_ostream<> are well along, but basic_istream<> has had little work
2245 done. The standard stream objects, <sstream>, and <fstream> have been
2246 started; basic_filebuf<> "write" functions have been implemented just
2247 enough to do "hello, world".
2249 Most of the istream and ostream operators << and >> (with the exception
2250 of the op<<(integer) ones) have not been changed to use locale primitives,
2251 sentry objects, or char_traits members.
2253 All these templates should be manually instantiated for char and
2254 wchar_t in a way that links only used members into user programs.
2256 Streambuf is fertile ground for optimization extensions. An extended
2257 interface giving iterator access to its internal buffer would be very
2258 useful for other library components.
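One way to picture such an extended interface is a derived class
that exposes the unread part of the get area as a pointer range.
This is only a sketch: peek_stringbuf, buffer_begin and buffer_end
are hypothetical names, and the example builds on the standard
protected members gptr() and egptr() rather than on any existing
extension.

```cpp
#include <sstream>
#include <string>

// Hypothetical extension: read-only access to the characters that
// are buffered but not yet consumed.
class peek_stringbuf : public std::stringbuf {
public:
    explicit peek_stringbuf(const std::string& s)
        : std::stringbuf(s, std::ios_base::in) {}

    // Current read position in the internal buffer.
    const char* buffer_begin() const { return gptr(); }
    // One past the last buffered character.
    const char* buffer_end() const { return egptr(); }
};
```

Another component could then scan [buffer_begin(), buffer_end())
directly instead of calling sgetc()/sbumpc() once per character.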
2260 Iostream operations (primarily operators << and >>) can take advantage
2261 of the case where user code has not specified a locale, and bypass locale
2262 operations entirely. The current implementation of op<</num_put<>::put,
2263 for the integer types, demonstrates how they can cache encoding details
2264 from the locale on each operation. There is lots more room for
2265 optimization in this area.
2267 The definition of the relationship between the standard streams
2268 cout et al. and stdout et al. requires something like a "stdiobuf".
2269 The SGI solution of using double-indirection to actually use a
2270 stdio FILE object for buffering is unsatisfactory, because it
2271 interferes with peephole loop optimizations.
2273 The <sstream> header work has begun. stringbuf can benefit from
2274 friendship with basic_string<> and basic_string<>::_Rep to use
2275 those objects directly as buffers, and avoid allocating and making
2278 The basic_filebuf<> template is a complex beast. It is specified to
2279 use the locale facet codecvt<> to translate characters between native
2280 files and the locale character encoding. In general this involves
2281 two buffers, one of "char" representing the file and another of
2282 "char_type", for the stream, with codecvt<> translating. The process
2283 is complicated by the variable-length nature of the translation, and
2284 the need to seek to corresponding places in the two representations.
2285 For the case of basic_filebuf<char>, when no translation is needed,
2286 a single buffer suffices. A specialized filebuf can be used to reduce
2287 code space overhead when no locale has been imbued. Matt Austern's
2288 work at SGI will be useful, perhaps directly as a source of code, or
2289 at least as an example to draw on.
2291 Filebuf, almost uniquely (cf. operator new), depends heavily on
2292 underlying environmental facilities. In current releases iostream
2293 depends fairly heavily on libio constant definitions, but it should
2294 be made independent. It also depends on operating system primitives
2295 for file operations. There is immense room for optimizations using
2296 (e.g.) mmap for reading. The shadow/ directory wraps, besides the
2297 standard C headers, the libio.h and unistd.h headers, for use mainly
2298 by filebuf. These wrappings have not been completed, though there
2299 is scaffolding in place.
2301 The encapsulation of certain C header <cstdio> names presents an
2302 interesting problem. It is possible to define an inline std::fprintf()
2303 implemented in terms of the 'extern "C"' vfprintf(), but there is no
2304 standard vfscanf() to use to implement std::fscanf(). It appears that
2305 vfscanf must be re-implemented in C++ for targets where no vfscanf
2306 extension has been defined. This is interesting in that it seems
2307 to be the only significant case in the C library where this kind of
2308 rewriting is necessary. (Of course Glibc provides the vfscanf()
2309 extension.) (The functions related to exit() must be rewritten
2315 Headers: <strstream>
2317 Annex D defines many non-library features, and many minor
2318 modifications to various headers, and a complete header.
2319 It is "mostly done", except that the libstdc++-2 <strstream>
2320 header has not been adopted into the library, or checked to
2321 verify that it matches the draft in those details that were
2322 clarified by the committee. Certainly it must at least be
2323 moved into the std namespace.
2325 We still need to wrap all the deprecated features in #if guards
2326 so that pedantic compile modes can detect their use.
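A guard of this kind could be sketched as follows. The macro
SKETCH_NO_DEPRECATED and the functions shown are hypothetical and
stand in for whatever names the library eventually adopts.

```cpp
// SKETCH_NO_DEPRECATED is a hypothetical macro, not an existing gcc
// or library switch; a pedantic build would define it so that any
// use of the guarded declarations fails to compile.
namespace sketch_compat {

#ifndef SKETCH_NO_DEPRECATED
// Stand-in for a deprecated facility kept for old user code.
inline int legacy_length(const char* s) {
    int n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}
#endif

// Non-deprecated code must not depend on the guarded declaration,
// so it stays usable when the macro is defined.
inline int length(const char* s) {
    int n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

} // namespace sketch_compat
```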
2328 Nonstandard Extensions
2329 ----------------------
2330 Headers: <iostream.h> <strstream.h> <hash> <rbtree>
2331 <pthread_alloc> <stdiobuf> (etc.)
2333 User code has come to depend on a variety of nonstandard components
2334 that we must not omit. Much of this code can be adopted from
2335 libstdc++-v2 or from the SGI STL. This particularly includes
2336 <iostream.h>, <strstream.h>, and various SGI extensions such
2337 as <hash_map.h>. Many of these are already placed in the
2338 subdirectories ext/ and backward/. (Note that it is better to
2339 include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
2340 to search the subdirectory itself via a "-I" directive.)