2 <!DOCTYPE appendix PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
6 <appendix id="appendix.contrib" xreflabel="Contributing">
7 <?dbhtml filename="appendix_contributing.html"?>
20 <title>Contributing</title>
The GNU C++ Library follows an open development model. Active
contributors are assigned maintainership responsibility and given
write access to the source repository. First-time contributors
should follow this procedure:
29 <sect1 id="contrib.list" xreflabel="Contributor Checklist">
30 <title>Contributor Checklist</title>
32 <sect2 id="list.reading" xreflabel="list.reading">
33 <title>Reading</title>
38 Get and read the relevant sections of the C++ language
39 specification. Copies of the full ISO 14882 standard are
40 available on line via the ISO mirror site for committee
41 members. Non-members, or those who have not paid for the
42 privilege of sitting on the committee and sustained their
43 two meeting commitment for voting rights, may get a copy of
44 the standard from their respective national standards
45 organization. In the USA, this national standards
46 organization is ANSI and their web-site is right
47 <ulink url="http://www.ansi.org">here.</ulink>
(And if you've already registered with them, clicking this link will take you directly to the place where you can
49 <ulink url="http://webstore.ansi.org/ansidocstore/product.asp?sku=ISO%2FIEC+14882%3A2003">buy the standard on-line.)</ulink>
55 The library working group bugs, and known defects, can
57 <ulink url="http://www.open-std.org/jtc1/sc22/wg21/">http://www.open-std.org/jtc1/sc22/wg21 </ulink>
63 The newsgroup dedicated to standardization issues is
comp.std.c++: the FAQ for this group is quite useful and
66 found <ulink url="http://www.jamesd.demon.co.uk/csc/faq.html">
74 the <ulink url="http://www.gnu.org/prep/standards_toc.html">GNU
75 Coding Standards</ulink>, and chuckle when you hit the part
76 about <quote>Using Languages Other Than C</quote>.
82 Be familiar with the extensions that preceded these
83 general GNU rules. These style issues for libstdc++ can be
84 found <link linkend="contrib.coding_style">here</link>.
90 And last but certainly not least, read the
91 library-specific information
92 found <link linkend="appendix.porting"> here</link>.
98 <sect2 id="list.copyright" xreflabel="list.copyright">
99 <title>Assignment</title>
101 Small changes can be accepted without a copyright assignment form on
file. New code and additions to the library need a completed copyright
103 assignment form on file at the FSF. Note: your employer may be required
104 to fill out appropriate disclaimer forms as well.
108 Historically, the libstdc++ assignment form added the following
114 Which Belgian comic book character is better, Tintin or Asterix, and
120 While not strictly necessary, humoring the maintainers and answering
121 this question would be appreciated.
125 For more information about getting a copyright assignment, please see
126 <ulink url="http://www.gnu.org/prep/maintain/html_node/Legal-Matters.html">Legal
131 Please contact Benjamin Kosnik at
132 <email>bkoz+assign@redhat.com</email> if you are confused
133 about the assignment or have general licensing questions. When
134 requesting an assignment form from
<email>assign@gnu.org</email>, please cc the libstdc++
136 maintainer above so that progress can be monitored.
140 <sect2 id="list.getting" xreflabel="list.getting">
141 <title>Getting Sources</title>
143 <ulink url="http://gcc.gnu.org/svnwrite.html">Getting write access
144 (look for "Write after approval")</ulink>
148 <sect2 id="list.patches" xreflabel="list.patches">
149 <title>Submitting Patches</title>
152 Every patch must have several pieces of information before it can be
153 properly evaluated. Ideally (and to ensure the fastest possible
154 response from the maintainers) it would have all of these pieces:
160 A description of the bug and how your patch fixes this
161 bug. For new features a description of the feature and your
168 A ChangeLog entry as plain text; see the various
ChangeLog files for format and content. If you are
using emacs as your editor, simply position the insertion
point at the beginning of your change and hit C-x 4 a to bring
172 up the appropriate ChangeLog entry. See--magic! Similar
173 functionality also exists for vi.
179 A testsuite submission or sample program that will
180 easily and simply show the existing error or test new
187 The patch itself. If you are accessing the SVN
188 repository use <command>svn update; svn diff NEW</command>;
189 else, use <command>diff -cp OLD NEW</command> ... If your
190 version of diff does not support these options, then get the
191 latest version of GNU
192 diff. The <ulink url="http://gcc.gnu.org/wiki/SvnTricks">SVN
193 Tricks</ulink> wiki page has information on customising the
194 output of <code>svn diff</code>.
200 When you have all these pieces, bundle them up in a
201 mail message and send it to libstdc++@gcc.gnu.org. All
202 patches and related discussion should be sent to the
203 libstdc++ mailing list.
212 <sect1 id="contrib.organization" xreflabel="Source Organization">
213 <title>Directory Layout and Source Conventions</title>
216 The unpacked source directory of libstdc++ contains the files
217 needed to create the GNU C++ Library.
221 It has subdirectories:
224 Files in HTML and text format that document usage, quirks of the
225 implementation, and contributor checklists.
228 All header files for the C++ library are within this directory,
229 modulo specific runtime-related files that are in the libsupc++
233 Files meant to be found by #include <name> directives in
234 standard-conforming user programs.
237 Headers intended to directly include standard C headers.
238 [NB: this can be enabled via --enable-cheaders=c]
241 Headers intended to include standard C headers in
242 the global namespace, and put select names into the std::
243 namespace. [NB: this is the default, and is the same as
244 --enable-cheaders=c_global]
247 Headers intended to include standard C headers
248 already in namespace std, and put select names into the std::
249 namespace. [NB: this is the same as --enable-cheaders=c_std]
252 Files included by standard headers and by other files in
256 Headers provided for backward compatibility, such as <iostream.h>.
257 They are not used in this library.
260 Headers that define extensions to the standard library. No
261 standard header refers to any of them.
264 Scripts that are used during the configure, build, make, or test
268 Files that are used in constructing the library, but are not
271 testsuites/[backward, demangle, ext, performance, thread, 17_* to 27_*]
272 Test programs are here, and may be used to begin to exercise the
273 library. Support for "make check" and "make check-install" is
274 complete, and runs through all the subdirectories here when this
275 command is issued from the build directory. Please note that
276 "make check" requires DejaGNU 1.4 or later to be installed. Please
277 note that "make check-script" calls the script mkcheck, which
278 requires bash, and which may need the paths to bash adjusted to
279 work properly, as /bin/bash is assumed.
281 Other subdirectories contain variant versions of certain files
282 that are meant to be copied or linked by the configure script.
291 In addition, two subdirectories are convenience libraries:
Support routines needed for C++ math. Only needed if the
underlying "C" implementation does not provide them, in particular
required or optional long double, long long, and C99 functionality.
299 Contains the runtime library for C++, including exception
300 handling and memory allocation and deallocation, RTTI, terminate
303 Note that glibc also has a bits/ subdirectory. We will either
304 need to be careful not to collide with names in its bits/
305 directory; or rename bits to (e.g.) cppbits/.
307 In files throughout the system, lines marked with an "XXX" indicate
308 a bug or incompletely-implemented feature. Lines marked "XXX MT"
309 indicate a place that may require attention for multi-thread safety.
314 <sect1 id="contrib.coding_style" xreflabel="Coding Style">
315 <title>Coding Style</title>
318 <sect2 id="coding_style.bad_identifiers" xreflabel="coding_style.bad">
<title>Bad Identifiers</title>
321 Identifiers that conflict and should be avoided.
325 This is the list of names <quote>reserved to the
326 implementation</quote> that have been claimed by certain
327 compilers and system headers of interest, and should not be used
328 in the library. It will grow, of course. We generally are
329 interested in names that are not all-caps, except for those like
369 [Note that this list is out of date. It applies to the old
370 name-mangling; in G++ 3.0 and higher a different name-mangling is
371 used. In addition, many of the bugs relating to G++ interpreting
372 these names as operators have been fixed.]
374 The full set of __* identifiers (combined from gcc/cp/lex.c and
375 gcc/cplus-dem.c) that are either old or new, but are definitely
376 recognized by the demangler, is:
504 // long double conversion members mangled as __opr
505 // http://gcc.gnu.org/ml/libstdc++/1999-q4/msg00060.html
510 <sect2 id="coding_style.example" xreflabel="coding_style.example">
511 <title>By Example</title>
513 This library is written to appropriate C++ coding standards. As such,
it is intended to take precedence over the recommendations of the GNU Coding
515 Standard, which can be referenced in full here:
517 http://www.gnu.org/prep/standards/standards.html#Formatting
519 The rest of this is also interesting reading, but skip the "Design
522 The GCC coding conventions are here, and are also useful:
523 http://gcc.gnu.org/codingconventions.html
525 In addition, because it doesn't seem to be stated explicitly anywhere
526 else, there is an 80 column source limit.
528 ChangeLog entries for member functions should use the
529 classname::member function name syntax as follows:
531 1999-04-15 Dennis Ritchie <dr@att.com>
533 * src/basic_file.cc (__basic_file::open): Fix thinko in
534 _G_HAVE_IO_FILE_OPEN bits.
536 Notable areas of divergence from what may be previous local practice
537 (particularly for GNU C) include:
539 01. Pointers and references
543 char *p = "flop"; // wrong
544 char &c = *p; // wrong
546 Reason: In C++, definitions are mixed with executable code. Here,
547 p is being initialized, not *p. This is near-universal
548 practice among C++ programmers; it is normal for C hackers
549 to switch spontaneously as they gain experience.
551 02. Operator names and parentheses
554 operator == (type) // wrong
556 Reason: The == is part of the function name. Separating
557 it makes the declaration look like an expression.
559 03. Function names and parentheses
562 void mangle () // wrong
564 Reason: no space before parentheses (except after a control-flow
565 keyword) is near-universal practice for C++. It identifies the
566 parentheses as the function-call operator or declarator, as
567 opposed to an expression or other overloaded use of parentheses.
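A short sketch of the preferred forms for rules 01 through 03, for
comparison with the lines marked wrong. The type token and the function
first_char are hypothetical and exist only to carry the spacing.

```cpp
// Hypothetical type used only to show spacing.
struct token
{
  int id;
};

// Rule 02: the operator symbol is attached to the keyword.
// Rule 03: no space between the function name and the parameter list.
bool
operator==(const token& lhs, const token& rhs)
{ return lhs.id == rhs.id; }

// Rule 01: "*" and "&" are written next to the type, not the name.
char
first_char(const char* p)
{
  const char& c = *p;
  return c;
}
```

Calling first_char("flop") yields 'f', and two token objects with the
same id compare equal.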
569 04. Template function indentation
570 template<typename T>
572 template_function(args)
575 template<class T>
576 void template_function(args) {};
578 Reason: In class definitions, without indentation whitespace is
579 needed both above and below the declaration to distinguish
580 it visually from other members. (Also, re: "typename"
581 rather than "class".) T often could be int, which is
582 not a class. ("class", here, is an anachronism.)
584 05. Template class indentation
585 template<typename _CharT, typename _Traits>
586 class basic_ios : public ios_base
592 template<class _CharT, class _Traits>
593 class basic_ios : public ios_base
599 template<class _CharT, class _Traits>
600 class basic_ios : public ios_base
614 enum { space = _ISspace, print = _ISprint, cntrl = _IScntrl };
616 07. Member initialization lists
617 All one line, separate from class name.
: _M_private_data(0), _M_more_stuff(0), _M_helper(0)
gribble::gribble() : _M_private_data(0), _M_more_stuff(0), _M_helper(0)
642 09. Member functions declarations and definitions
Keywords such as extern, static, export, explicit, inline, etc.
644 go on the line above the function name. Thus
Reason: GNU coding conventions dictate that return types for functions
go on a separate line from the function name and parameter list
for definitions. For C++, where we have member functions that can
be either inline definitions or declarations, keeping to this
standard allows all member function names for a given class to be
aligned to the same margin, increasing readability.
659 10. Invocation of member functions with "this->"
660 For non-uglified names, use this->name to call the function.
666 Reason: Koenig lookup.
680 12. Spacing under protected and private in class declarations:
681 space above, none below
692 13. Spacing WRT return statements.
No extra spacing before returns, no parentheses
710 14. Location of global variables.
All global variables of class type, whether in the "user visible"
712 space (e.g., cin) or the implementation namespace, must be defined
713 as a character array with the appropriate alignment and then later
714 re-initialized to the correct value.
716 This is due to startup issues on certain platforms, such as AIX.
717 For more explanation and examples, see src/globals.cc. All such
718 variables should be contained in that file, for simplicity.
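As a rough illustration of rule 14 (a sketch, not the code in
src/globals.cc): storage for a global of class type is reserved as a
suitably aligned character array, and the object is constructed into it
later. Placement new is assumed here as the way to do the later
initialization, and alignas assumes a C++11 compiler. The names counter,
counter_storage, counter_ptr, and init_counter are hypothetical.

```cpp
#include <new>  // placement new

// Hypothetical class type standing in for a library global such as cin.
struct counter
{
  explicit
  counter(int start) : value(start) { }

  int value;
};

namespace
{
  // Raw storage with the size and alignment of a counter. No constructor
  // runs for this array during program startup.
  alignas(counter) unsigned char counter_storage[sizeof(counter)];

  // Null until the object has actually been constructed.
  counter* counter_ptr = nullptr;
}

// Construct the object in the reserved storage on the first call;
// later calls return the same object unchanged.
counter&
init_counter(int start)
{
  if (!counter_ptr)
    counter_ptr = new (counter_storage) counter(start);
  return *counter_ptr;
}
```

The point of the arrangement is that the library, not the order of
static constructors on a given platform, decides when the object comes
into existence.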
720 15. Exception abstractions
721 Use the exception abstractions found in functexcept.h, which allow
722 C++ programmers to use this library with -fno-exceptions. (Even if
723 that is rarely advisable, it's a necessary evil for backwards
726 16. Exception error messages
727 All start with the name of the function where the exception is
728 thrown, and then (optional) descriptive text is added. Example:
730 __throw_logic_error(__N("basic_string::_S_construct NULL not valid"));
732 Reason: The verbose terminate handler prints out exception::what(),
733 as well as the typeinfo for the thrown exception. As this is the
734 default terminate handler, by putting location info into the
735 exception string, a very useful error message is printed out for
736 uncaught exceptions. So useful, in fact, that non-programmers can
737 give useful error messages, and programmers can intelligently
738 speculate what went wrong without even using a debugger.
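The same convention can be tried out with standard facilities only. This
is a sketch: the library itself goes through the __throw_* helpers and
the __N macro shown above, while the class checked_buffer and the
function message_for are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical container used only to demonstrate the message convention.
class checked_buffer
{
public:
  explicit
  checked_buffer(std::size_t size) : m_size(size) { }

  // Returns pos if it is a valid index, otherwise throws.
  std::size_t
  at(std::size_t pos) const
  {
    if (pos >= m_size)
      // Function name first, descriptive text second.
      throw std::out_of_range("checked_buffer::at index out of range");
    return pos;
  }

private:
  std::size_t m_size;
};

// Returns the text that what() reports for a failed at() call,
// or an empty string if the call succeeds.
std::string
message_for(std::size_t size, std::size_t pos)
{
  try
    {
      checked_buffer buf(size);
      buf.at(pos);
    }
  catch (const std::out_of_range& e)
    {
      return e.what();
    }
  return std::string();
}
```

Because the text starts with checked_buffer::at, the message alone says
where the exception came from.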
740 17. The doxygen style guide to comments is a separate document,
743 The library currently has a mixture of GNU-C and modern C++ coding
744 styles. The GNU C usages will be combed out gradually.
748 For nonstandard names appearing in Standard headers, we are constrained
749 to use names that begin with underscores. This is called "uglification".
752 Local and argument names: __[a-z].*
754 Examples: __count __ix __s1
756 Type names and template formal-argument names: _[A-Z][^_].*
758 Examples: _Helper _CharT _N
760 Member data and function names: _M_.*
762 Examples: _M_num_elements _M_initialize ()
764 Static data members, constants, and enumerations: _S_.*
766 Examples: _S_max_elements _S_default_value
768 Don't use names in the same scope that differ only in the prefix,
769 e.g. _S_top and _M_top. See BADNAMES for a list of forbidden names.
(The most tempting of these seem to be "_T" and "__sz".)
772 Names must never have "__" internally; it would confuse name
773 unmanglers on some targets. Also, never use "__[0-9]", same reason.
775 --------------------------
789 gribble(const gribble&);
792 gribble(int __howmany);
795 operator=(const gribble&);
800 // Start with a capital letter, end with a period.
802 public_member(const char* __arg) const;
804 // In-class function definitions should be restricted to one-liners.
one_line() { return 0; }
809 two_lines(const char* arg)
810 { return strchr(arg, 'a'); }
813 three_lines(); // inline, but defined below.
816 template<typename _Formal_argument>
818 public_template() const throw();
820 template<typename _Iterator>
830 int _M_private_function();
839 _S_initialize_library();
842 // More-or-less-standard language features described by lack, not presence.
843 # ifndef _G_NO_LONGLONG
844 extern long long _G_global_with_a_good_long_name; // avoid globals!
847 // Avoid in-class inline definitions, define separately;
848 // likewise for member class definitions:
850 gribble::public_member() const
851 { int __local = 0; return __local; }
853 class gribble::_Helper
857 friend class gribble;
861 // Names beginning with "__": only for arguments and
862 // local variables; never use "__" in a type name, or
863 // within any name; never use "__[0-9]".
865 #endif /* _HEADER_ */
870 template<typename T> // notice: "typename", not "class", no space
871 long_return_value_type<with_many, args>
872 function_name(char* pointer, // "char *pointer" is wrong.
874 const Reference& ref)
876 // int a_local; /* wrong; see below. */
882 int a_local = 0; // declare variable at first use.
884 // char a, b, *p; /* wrong */
887 char* c = "abc"; // each variable goes on its own line, always.
889 // except maybe here...
890 for (unsigned i = 0, mask = 1; mask; ++i, mask <<= 1) {
: _M_private_data(0), _M_more_stuff(0), _M_helper(0)
900 gribble::three_lines()
902 // doesn't fit in one line.
909 <sect1 id="contrib.doc_style" xreflabel="Documentation Style">
910 <title>Documentation Style</title>
911 <sect2 id="doc_style.doxygen" xreflabel="doc_style.doxygen">
912 <title>Doxygen</title>
913 <sect3 id="doxygen.prereq" xreflabel="doxygen.prereq">
914 <title>Prerequisites</title>
916 Prerequisite tools are Bash 2.x,
917 <ulink url="http://www.doxygen.org/">Doxygen</ulink>, and
918 the <ulink url="http://www.gnu.org/software/coreutils/">GNU
919 coreutils</ulink>. (GNU versions of find, xargs, and possibly
920 sed and grep are used, just because the GNU versions make
925 To generate the pretty pictures and hierarchy
927 <ulink url="http://www.research.att.com/sw/tools/graphviz/download.html">Graphviz</ulink>
928 package will need to be installed.
932 <sect3 id="doxygen.rules" xreflabel="doxygen.rules">
933 <title>Generating the Doxygen Files</title>
937 <screen><userinput>make doc-html-doxygen</userinput></screen>
941 <screen><userinput>make doc-xml-doxygen</userinput></screen>
945 <screen><userinput>make doc-man-doxygen</userinput></screen>
947 in the libstdc++ build directory generate the HTML docs, the
948 XML docs, and the man pages.
952 Careful observers will see that the Makefile rules simply call
953 a script from the source tree, <filename>run_doxygen</filename>, which
954 does the actual work of running Doxygen and then (most
955 importantly) massaging the output files. If for some reason
956 you prefer to not go through the Makefile, you can call this
957 script directly. (Start by passing <literal>--help</literal>.)
961 If you wish to tweak the Doxygen settings, do so by editing
962 <filename>doc/doxygen/user.cfg.in</filename>. Notes to fellow
963 library hackers are written in triple-# comments.
968 <sect3 id="doxygen.markup" xreflabel="doxygen.markup">
969 <title>Markup</title>
972 In general, libstdc++ files should be formatted according to
973 the rules found in the
974 <link linkend="contrib.coding_style">Coding Standard</link>. Before
975 any doxygen-specific formatting tweaks are made, please try to
976 make sure that the initial formatting is sound.
980 Adding Doxygen markup to a file (informally called
981 <quote>doxygenating</quote>) is very simple. The Doxygen manual can be
983 <ulink url="http://www.stack.nl/~dimitri/doxygen/download.html#latestman">here</ulink>.
984 We try to use a very-recent version of Doxygen.
989 <classname>deque</classname>/<classname>vector</classname>/<classname>list</classname>
990 and <classname>std::pair</classname> as examples. For
991 functions, see their member functions, and the free functions
992 in <filename>stl_algobase.h</filename>. Member functions of
993 other container-like types should read similarly to these
998 These points accompany the first list in section 3.1 of the
1003 <listitem><para>Use the Javadoc style...</para></listitem>
1006 ...not the Qt style. The intermediate *'s are preferred.
1011 Use the triple-slash style only for one-line comments (the
1012 <quote>brief</quote> mode). Very recent versions of Doxygen permit
1013 full-mode comments in triple-slash blocks, but the
1014 formatting still comes out wonky.
1019 This is disgusting. Don't do this.
1025 Use the @-style of commands, not the !-style. Please be
1026 careful about whitespace in your markup comments. Most of the
1027 time it doesn't matter; doxygen absorbs most whitespace, and
1028 both HTML and *roff are agnostic about whitespace. However,
1029 in <pre> blocks and @code/@endcode sections, spacing can
1030 have <quote>interesting</quote> effects.
1034 Use either kind of grouping, as
1035 appropriate. <filename>doxygroups.cc</filename> exists for this
1036 purpose. See <filename>stl_iterator.h</filename> for a good example
1037 of the <quote>other</quote> kind of grouping.
1041 Please use markup tags like @p and @a when referring to things
1042 such as the names of function parameters. Use @e for emphasis
1043 when necessary. Use @c to refer to other standard names.
1044 (Examples of all these abound in the present code.)
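A brief sketch of how these points combine: a Javadoc-style block with
intermediate asterisks, @-style commands, @p and @c markup, and a
triple-slash comment used only for a one-line brief. The functions
clamp_index and is_empty are hypothetical and exist only to carry the
comments.

```cpp
#include <cstddef>

/**
 *  @brief  Clamp an index to a valid range.
 *  @param  pos   Index requested by the caller.
 *  @param  size  Number of elements available.
 *  @return  @p pos if it is less than @p size, otherwise the last valid
 *           index, or 0 when @p size is 0.
 *
 *  The result has type @c std::size_t.
 */
std::size_t
clamp_index(std::size_t pos, std::size_t size)
{
  if (size == 0)
    return 0;
  return pos < size ? pos : size - 1;
}

/// Report whether a container of @p size elements is empty.
inline bool
is_empty(std::size_t size)
{ return size == 0; }
```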
1051 <sect2 id="doc_style.docbook" xreflabel="doc_style.docbook">
<title>DocBook</title>
1054 <sect3 id="docbook.prereq" xreflabel="docbook.prereq">
1055 <title>Prerequisites</title>
1057 Editing the DocBook sources requires an XML editor. Many
exist: some notable options
1059 include <command>emacs</command>, <application>Kate</application>,
1060 or <application>Conglomerate</application>.
1064 Some editors support special <quote>XML Validation</quote>
1065 modes that can validate the file as it is
1066 produced. Recommended is the <command>nXML Mode</command>
1067 for <command>emacs</command>.
1071 Besides an editor, additional DocBook files and XML tools are
1076 Access to the DocBook stylesheets and DTD is required. The
1077 stylesheets are usually packaged by vendor, in something
1078 like <filename>docbook-style-xsl</filename>. The installation
1079 directory for this package corresponds to
1080 the <literal>XSL_STYLE_DIR</literal>
1081 in <filename>doc/Makefile.am</filename> and defaults
1082 to <filename class="directory">/usr/share/sgml/docbook/xsl-stylesheets</filename>.
For processing XML, an XML processor and some style
sheets are necessary. The default processor is <command>xsltproc</command>,
provided by <filename>libxslt</filename>.
1092 For validating the XML document, you'll need
1093 something like <command>xmllint</command> and access to the
1094 DocBook DTD. These are provided
by a vendor package like <filename>libxml2</filename>.
1099 For PDF output, something that transforms valid XML to PDF is
1100 required. Possible solutions include <command>xmlto</command>,
1101 <ulink url="http://xmlgraphics.apache.org/fop/">Apache
1102 FOP</ulink>, or <command>prince</command>. Other options are
1103 listed on the DocBook web <ulink
1104 url="http://wiki.docbook.org/topic/DocBookPublishingTools">pages</ulink>. Please
1105 consult the <email>libstdc++@gcc.gnu.org</email> list when
1106 preparing printed manuals for current best practice and suggestions.
1110 Make sure that the XML documentation and markup is valid for
1111 any change. This can be done easily, with the validation rules
1112 in the <filename>Makefile</filename>, which is equivalent to doing:
1117 xmllint --noout --valid <filename>xml/index.xml</filename>
1122 <sect3 id="docbook.rules" xreflabel="docbook.rules">
1123 <title>Generating the DocBook Files</title>
1128 <screen><userinput>make doc-html</userinput></screen>
1132 <screen><userinput>make doc-pdf</userinput></screen>
1136 <screen><userinput>make doc-xml-single</userinput></screen>
1140 <screen><userinput>make doc-xml-validate</userinput></screen>
1143 in the libstdc++ build directory result respectively in the
1144 following: the generation of an HTML version of all the
1145 documentation, a PDF version of the same, a single XML
1146 document, and the results of validating the XML document.
1150 <sect3 id="docbook.examples" xreflabel="docbook.examples">
1151 <title>File Organization and Basics</title>
1154 <emphasis>Which files are important</emphasis>
All DocBook files are in the directory
1157 libstdc++-v3/doc/xml
1159 Inside this directory, the files of importance:
1160 spine.xml - index to documentation set
1161 manual/spine.xml - index to manual
1162 manual/*.xml - individual chapters and sections of the manual
1163 faq.xml - index to FAQ
1164 api.xml - index to source level / API
All *.txml files are template xml files, i.e., otherwise empty files with
1167 the correct structure, suitable for filling in with new information.
<emphasis>Canonical Writing Style</emphasis>
1173 member function template
1174 (via C++ Templates, Vandevoorde)
1176 class in namespace std: allocator, not std::allocator
1178 header file: iostream, not <iostream>
1181 <emphasis>General structure</emphasis>
1216 <sect3 id="docbook.markup" xreflabel="docbook.markup">
1217 <title>Markup By Example</title>
1220 HTML to XML rough equivalents
1222 <p> <para>
1224 <pre> <computeroutput>
1225 <pre> <programlisting>
1226 <pre> <literallayout>
1228 <ul> <itemizedlist>
1229 <ol> <orderedlist>
<li> <listitem>
1232 <dl> <variablelist>
1234 <varlistentry>
1235 <dt> <term>
1236 </dt> </term>
1237 <dd> <listitem>
</dd> </listitem>
1239 </varlistentry>
1241 <a href <ulink url
1242 <code> <literal>
1243 <code> <programlisting>
1245 <strong> <emphasis>
1246 <em> <emphasis>
1247 " <quote>
1249 ctype.h <filename class="headerfile"></filename>
1252 build_dir <filename class="directory">path_to_build_dir</filename>
1254 Finer gradations of <code>
1256 <classname> <classname>string</classname>
1257 <classname>vector<></classname>
1258 <function>fs.clear()</function>
1262 <function> <function>clear()</function>
1264 <type> <type>long long</type>
1266 <varname> <varname>fs</varname>
1268 <literal> <literal>-Weffc++</literal>
1269 <literal>rel_ops</literal>
1271 <constant> <constant>_GNU_SOURCE</constant>
1272 <constant>3.0</constant>
1276 <command> <command>g++</command>
1278 <errortext> <errortext>foo Concept </errortext>
1285 <sect1 id="contrib.design_notes" xreflabel="Design Notes">
1286 <title>Design Notes</title>
This paper covers two major areas:
1297 - Features and policies not mentioned in the standard that
1298 the quality of the library implementation depends on, including
1299 extensions and "implementation-defined" features;
1301 - Plans for required but unimplemented library features and
1302 optimizations to them.
1307 The standard defines a large library, much larger than the standard
1308 C library. A naive implementation would suffer substantial overhead
1309 in compile time, executable size, and speed, rendering it unusable
1310 in many (particularly embedded) applications. The alternative demands
1311 care in construction, and some compiler support, but there is no
1312 need for library subsets.
1314 What are the sources of this overhead? There are four main causes:
1316 - The library is specified almost entirely as templates, which
1317 with current compilers must be included in-line, resulting in
1318 very slow builds as tens or hundreds of thousands of lines
1319 of function definitions are read for each user source file.
1320 Indeed, the entire SGI STL, as well as the dos Reis valarray,
1321 are provided purely as header files, largely for simplicity in
1322 porting. Iostream/locale is (or will be) as large again.
1324 - The library is very flexible, specifying a multitude of hooks
1325 where users can insert their own code in place of defaults.
1326 When these hooks are not used, any time and code expended to
1327 support that flexibility is wasted.
- Templates are often described as causing "code bloat". In
practice, this refers (when it refers to anything real) to several
independent processes. First, when a class template is manually
instantiated in its entirety, current compilers place the definitions
1333 for all members in a single object file, so that a program linking
1334 to one member gets definitions of all. Second, template functions
1335 which do not actually depend on the template argument are, under
1336 current compilers, generated anew for each instantiation, rather
1337 than being shared with other instantiations. Third, some of the
1338 flexibility mentioned above comes from virtual functions (both in
1339 regular classes and template classes) which current linkers add
1340 to the executable file even when they manifestly cannot be called.
1342 - The library is specified to use a language feature, exceptions,
1343 which in the current gcc compiler ABI imposes a run time and
1344 code space cost to handle the possibility of exceptions even when
1345 they are not used. Under the new ABI (accessed with -fnew-abi),
1346 there is a space overhead and a small reduction in code efficiency
1347 resulting from lost optimization opportunities associated with
1348 non-local branches associated with exceptions.
1350 What can be done to eliminate this overhead? A variety of coding
1351 techniques, and compiler, linker and library improvements and
1352 extensions may be used, as covered below. Most are not difficult,
1353 and some are already implemented in varying degrees.
1355 Overhead: Compilation Time
1356 --------------------------
1358 Providing "ready-instantiated" template code in object code archives
1359 allows us to avoid generating and optimizing template instantiations
1360 in each compilation unit which uses them. However, the number of such
1361 instantiations that are useful to provide is limited, and anyway this
1362 is not enough, by itself, to minimize compilation time. In particular,
1363 it does not reduce time spent parsing conforming headers.
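At the language level, "ready-instantiated" code corresponds to explicit
instantiation. The sketch below uses a hypothetical template,
small_stack, and assumes a compiler that accepts extern template
(standard since C++11, a GNU extension before that). In a real library
the declaration would sit in a header and the definition in a single
source file; both appear in one file here only to keep the example
self-contained.

```cpp
#include <cstddef>

// Hypothetical template standing in for something like basic_string.
template<typename T>
  class small_stack
  {
  public:
    small_stack() : m_size(0) { }

    // Store value on top of the stack; silently ignored when full.
    void
    push(const T& value)
    {
      if (m_size < capacity)
        m_data[m_size++] = value;
    }

    std::size_t
    size() const
    { return m_size; }

    // Precondition: size() > 0.
    const T&
    top() const
    { return m_data[m_size - 1]; }

  private:
    static const std::size_t capacity = 8;
    T m_data[capacity];
    std::size_t m_size;
  };

// Header side: small_stack<int> is instantiated elsewhere, so a
// translation unit that includes this need not generate its members.
extern template class small_stack<int>;

// Library side: the one place where all members are generated.
template class small_stack<int>;
```

Note that every translation unit still has to parse the template
definition, which is exactly the limitation described above.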
1365 Quicker header parsing will depend on library extensions and compiler
1366 improvements. One approach is some variation on the techniques
1367 previously marketed as "pre-compiled headers", now standardized as
1368 support for the "export" keyword. "Exported" template definitions
1369 can be placed (once) in a "repository" -- really just a library, but
1370 of template definitions rather than object code -- to be drawn upon
1371 at link time when an instantiation is needed, rather than placed in
1372 header files to be parsed along with every compilation unit.
1374 Until "export" is implemented we can put some of the lengthy template
1375 definitions in #if guards or alternative headers so that users can skip
1376 over the full definitions when they need only the ready-instantiated
1379 To be precise, this means that certain headers which define
1380 templates which users normally use only for certain arguments
1381 can be instrumented to avoid exposing the template definitions
1382 to the compiler unless a macro is defined. For example, in
1383 <string>, we might have:
1385 template <class _CharT, ... > class basic_string {
1386 ... // member declarations
1388 ... // operator declarations
1391 # if _G_NO_TEMPLATE_EXPORT
1392 # include <bits/std_locale.h> // headers needed by definitions
1394 # include <bits/string.tcc> // member and global template definitions.
1398 Users who compile without specifying a strict-ISO-conforming flag
1399 would not see many of the template definitions they now see, and rely
1400 instead on ready-instantiated specializations in the library. This
1401 technique would be useful for the following substantial components:
1402 string, locale/iostreams, valarray. It would *not* be useful or
1403 usable with the following: containers, algorithms, iterators,
1404 allocator. Since these constitute a large (though decreasing)
1405 fraction of the library, the benefit the technique offers is
1408 The language specifies the semantics of the "export" keyword, but
1409 the gcc compiler does not yet support it. When it does, problems
1410 with large template inclusions can largely disappear, given some
1411 minor library reorganization, along with the need for the apparatus
1414 Overhead: Flexibility Cost
1415 --------------------------
1417 The library offers many places where users can specify operations
1418 to be performed by the library in place of defaults. Sometimes
1419 this seems to require that the library use a more-roundabout, and
1420 possibly slower, way to accomplish the default requirements than
1421 would be used otherwise.
1423 The primary protection against this overhead is thorough compiler
1424 optimization, to crush out layers of inline function interfaces.
Kuck & Associates has demonstrated the practicality of this kind
of optimization.
1428 The second line of defense against this overhead is explicit
1429 specialization. By defining helper function templates, and writing
1430 specialized code for the default case, overhead can be eliminated
1431 for that case without sacrificing flexibility. This takes full
1432 advantage of any ability of the optimizer to crush out degenerate
1435 The library specifies many virtual functions which current linkers
1436 load even when they cannot be called. Some minor improvements to the
1437 compiler and to ld would eliminate any such overhead by simply
1438 omitting virtual functions that the complete program does not call.
1439 A prototype of this work has already been done. For targets where
1440 GNU ld is not used, a "pre-linker" could do the same job.
1442 The main areas in the standard interface where user flexibility
1443 can result in overhead are:
1445 - Allocators: Containers are specified to use user-definable
1446 allocator types and objects, making tuning for the container
1447 characteristics tricky.
1449 - Locales: the standard specifies locale objects used to implement
1450 iostream operations, involving many virtual functions which use
1451 streambuf iterators.
1453 - Algorithms and containers: these may be instantiated on any type,
1454 frequently duplicating code for identical operations.
1456 - Iostreams and strings: users are permitted to use these on their
own types, and specify the operations the stream must use on these
types.
1460 Note that these sources of overhead are _avoidable_. The techniques
1461 to avoid them are covered below.
1466 In the SGI STL, and in some other headers, many of the templates
1467 are defined "inline" -- either explicitly or by their placement
1468 in class definitions -- which should not be inline. This is a
1469 source of code bloat. Matt had remarked that he was relying on
1470 the compiler to recognize what was too big to benefit from inlining,
1471 and generate it out-of-line automatically. However, this also can
result in code bloat except where the linker can eliminate the extra
copies.
1475 Fixing these cases will require an audit of all inline functions
1476 defined in the library to determine which merit inlining, and moving
1477 the rest out of line. This is an issue mainly in chapters 23, 25, and
1478 27. Of course it can be done incrementally, and we should generally
1479 accept patches that move large functions out of line and into ".tcc"
1480 files, which can later be pulled into a repository. Compiler/linker
1481 improvements to recognize very large inline functions and move them
1482 out-of-line, but shared among compilation units, could make this
1485 Pre-instantiating template specializations currently produces large
1486 amounts of dead code which bloats statically linked programs. The
1487 current state of the static library, libstdc++.a, is intolerable on
1488 this account, and will fuel further confused speculation about a need
1489 for a library "subset". A compiler improvement that treats each
1490 instantiated function as a separate object file, for linking purposes,
1491 would be one solution to this problem. An alternative would be to
1492 split up the manual instantiation files into dozens upon dozens of
1493 little files, each compiled separately, but an abortive attempt at
1494 this was done for <string> and, though it is far from complete, it
1495 is already a nuisance. A better interim solution (just until we have
1496 "export") is badly needed.
1498 When building a shared library, the current compiler/linker cannot
automatically generate the instantiations needed. This creates a
1500 miserable situation; it means any time something is changed in the
1501 library, before a shared library can be built someone must manually
1502 copy the declarations of all templates that are needed by other parts
1503 of the library to an "instantiation" file, and add it to the build
1504 system to be compiled and linked to the library. This process is
1505 readily automated, and should be automated as soon as possible.
1506 Users building their own shared libraries experience identical
1509 Sharing common aspects of template definitions among instantiations
1510 can radically reduce code bloat. The compiler could help a great
1511 deal here by recognizing when a function depends on nothing about
1512 a template parameter, or only on its size, and giving the resulting
1513 function a link-name "equate" that allows it to be shared with other
1514 instantiations. Implementation code could take advantage of the
1515 capability by factoring out code that does not depend on the template
1516 argument into separate functions to be merged by the compiler.
1518 Until such a compiler optimization is implemented, much can be done
1519 manually (if tediously) in this direction. One such optimization is
1520 to derive class templates from non-template classes, and move as much
1521 implementation as possible into the base class. Another is to partial-
1522 specialize certain common instantiations, such as vector<T*>, to share
1523 code for instantiations on all types T. While these techniques work,
they are far from the complete solution that a compiler improvement
would provide.
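The two manual techniques can be sketched together. This is a minimal
illustration, not libstdc++ code: PtrStackBase and Stack are
hypothetical names, and the sketch assumes pointers to non-const
objects. The storage logic is compiled once in a non-template base,
and the partial specialization for T* adds only inlinable casts.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical non-template base: its code is generated once and is
// shared by every Stack<T*> instantiation.
class PtrStackBase {
public:
    std::size_t size() const { return items_.size(); }
    bool empty() const { return items_.empty(); }
protected:
    void push_raw(void* p) { items_.push_back(p); }
    void* top_raw() const { return items_.back(); }
    void pop_raw() { items_.pop_back(); }
private:
    std::vector<void*> items_;
};

// Primary template: the general case, fully instantiated per type T.
template <class T>
class Stack {
public:
    void push(const T& v) { items_.push_back(v); }
    const T& top() const { return items_.back(); }
    void pop() { items_.pop_back(); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<T> items_;
};

// Partial specialization for pointers: thin typed wrappers over the
// shared base, so Stack<int*> and Stack<double*> reuse the same code.
template <class T>
class Stack<T*> : public PtrStackBase {
public:
    void push(T* p) { push_raw(p); }
    T* top() const { return static_cast<T*>(top_raw()); }
    void pop() { pop_raw(); }
};
```

Each additional pointer instantiation then costs only the wrapper
functions, which an optimizing compiler would be expected to inline
away.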
1527 Overhead: Expensive Language Features
1528 -------------------------------------
1530 The main "expensive" language feature used in the standard library
1531 is exception support, which requires compiling in cleanup code with
1532 static table data to locate it, and linking in library code to use
1533 the table. For small embedded programs the amount of such library
1534 code and table data is assumed by some to be excessive. Under the
1535 "new" ABI this perception is generally exaggerated, although in some
1536 cases it may actually be excessive.
1538 To implement a library which does not use exceptions directly is
1539 not difficult given minor compiler support (to "turn off" exceptions
1540 and ignore exception constructs), and results in no great library
1541 maintenance difficulties. To be precise, given "-fno-exceptions",
1542 the compiler should treat "try" blocks as ordinary blocks, and
1543 "catch" blocks as dead code to ignore or eliminate. Compiler
1544 support is not strictly necessary, except in the case of "function
1545 try blocks"; otherwise the following macros almost suffice:
1548 #define try if (true)
1549 #define catch(X) else if (false)
1551 However, there may be a need to use function try blocks in the
1552 library implementation, and use of macros in this way can make
1553 correct diagnostics impossible. Furthermore, use of this scheme
1554 would require the library to call a function to re-throw exceptions
1555 from a try block. Implementing the above semantics in the compiler
1558 Given the support above (however implemented) it only remains to
1559 replace code that "throws" with a call to a well-documented "handler"
1560 function in a separate compilation unit which may be replaced by
1561 the user. The main source of exceptions that would be difficult
1562 for users to avoid is memory allocation failures, but users can
1563 define their own memory allocation primitives that never throw.
1564 Otherwise, the complete list of such handlers, and which library
1565 functions may call them, would be needed for users to be able to
1566 implement the necessary substitutes. (Fortunately, they have the
1572 The template capabilities of C++ offer enormous opportunities for
1573 optimizing common library operations, well beyond what would be
1574 considered "eliminating overhead". In particular, many operations
1575 done in Glibc with macros that depend on proprietary language
1576 extensions can be implemented in pristine Standard C++. For example,
1577 the chapter 25 algorithms, and even C library functions such as strchr,
1578 can be specialized for the case of static arrays of known (small) size.
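As a rough sketch of the static-array case, a function template can
take the array by reference so that its extent is a compile-time
constant. The name find_char is hypothetical, and unlike strchr this
version scans all N elements (including the terminating null of a
string literal) rather than stopping at the first null.

```cpp
#include <cstddef>

// Hypothetical helper, not a library function. N is deduced from the
// argument, so the loop bound is known at compile time and small
// cases may be unrolled completely by the optimizer.
template <std::size_t N>
const char* find_char(const char (&arr)[N], char c) {
    for (std::size_t i = 0; i < N; ++i)
        if (arr[i] == c)
            return arr + i;
    return 0;   // not found
}
```

No macros or language extensions are involved; the size information
comes entirely from template argument deduction.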
1580 Detailed optimization opportunities are identified below where
1581 the component where they would appear is discussed. Of course new
1582 opportunities will be identified during implementation.
1584 Unimplemented Required Library Features
1585 ---------------------------------------
1587 The standard specifies hundreds of components, grouped broadly by
1588 chapter. These are listed in excruciating detail in the CHECKLIST
1602 Annex D backward compatibility
1604 Anyone participating in implementation of the library should obtain
1605 a copy of the standard, ISO 14882. People in the U.S. can obtain an
1606 electronic copy for US$18 from ANSI's web site. Those from other
1607 countries should visit http://www.iso.ch/ to find out the location
1608 of their country's representation in ISO, in order to know who can
1611 The emphasis in the following sections is on unimplemented features
1612 and optimization opportunities.
1617 Chapter 17 concerns overall library requirements.
1619 The standard doesn't mention threads. A multi-thread (MT) extension
1620 primarily affects operators new and delete (18), allocator (20),
1621 string (21), locale (22), and iostreams (27). The common underlying
1622 support needed for this is discussed under chapter 20.
1624 The standard requirements on names from the C headers create a
1625 lot of work, mostly done. Names in the C headers must be visible
1626 in the std:: and sometimes the global namespace; the names in the
1627 two scopes must refer to the same object. More stringent is that
1628 Koenig lookup implies that any types specified as defined in std::
1629 really are defined in std::. Names optionally implemented as
1630 macros in C cannot be macros in C++. (An overview may be read at
1631 <http://www.cantrip.org/cheaders.html>). The scripts "inclosure"
1632 and "mkcshadow", and the directories shadow/ and cshadow/, are the
1633 beginning of an effort to conform in this area.
1635 A correct conforming definition of C header names based on underlying
1636 C library headers, and practical linking of conforming namespaced
customer code with third-party C libraries depend ultimately on
1638 an ABI change, allowing namespaced C type names to be mangled into
1639 type names as if they were global, somewhat as C function names in a
1640 namespace, or C++ global variable names, are left unmangled. Perhaps
1641 another "extern" mode, such as 'extern "C-global"' would be an
1642 appropriate place for such type definitions. Such a type would
1643 affect mangling as follows:
1647 extern "C-global" { // or maybe just 'extern "C"'
1651 void f(A::X*); // mangles to f__FPQ21A1X
1652 void f(A::Y*); // mangles to f__FP1Y
1654 (It may be that this is really the appropriate semantics for regular
1655 'extern "C"', and 'extern "C-global"', as an extension, would not be
1656 necessary.) This would allow functions declared in non-standard C headers
1657 (and thus fixable by neither us nor users) to link properly with functions
1658 declared using C types defined in properly-namespaced headers. The
1659 problem this solves is that C headers (which C++ programmers do persist
1660 in using) frequently forward-declare C struct tags without including
1661 the header where the type is defined, as in
1666 Without some compiler accommodation, munge cannot be called by correct
1667 C++ code using a pointer to a correctly-scoped tm* value.
1669 The current C headers use the preprocessor extension "#include_next",
1670 which the compiler complains about when run "-pedantic".
1671 (Incidentally, it appears that "-fpedantic" is currently ignored,
1672 probably a bug.) The solution in the C compiler is to use
1673 "-isystem" rather than "-I", but unfortunately in g++ this seems
1674 also to wrap the whole header in an 'extern "C"' block, so it's
1675 unusable for C++ headers. The correct solution appears to be to
1676 allow the various special include-directory options, if not given
1677 an argument, to affect subsequent include-directory options additively,
1680 -pedantic -iprefix $(prefix) \
1681 -idirafter -ino-pedantic -ino-extern-c -iwithprefix -I g++-v3 \
1682 -iwithprefix -I g++-v3/ext
1684 the compiler would search $(prefix)/g++-v3 and not report
1685 pedantic warnings for files found there, but treat files in
1686 $(prefix)/g++-v3/ext pedantically. (The undocumented semantics
1687 of "-isystem" in g++ stink. Can they be rescinded? If not it
1688 must be replaced with something more rationally behaved.)
1690 All the C headers need the treatment above; in the standard these
1691 headers are mentioned in various chapters. Below, I have only
1692 mentioned those that present interesting implementation issues.
1694 The components identified as "mostly complete", below, have not been
1695 audited for conformance. In many cases where the library passes
1696 conformance tests we have non-conforming extensions that must be
1697 wrapped in #if guards for "pedantic" use, and in some cases renamed
1698 in a conforming way for continued use in the implementation regardless
1699 of conformance flags.
1701 The STL portion of the library still depends on a header
1702 stl/bits/stl_config.h full of #ifdef clauses. This apparatus
1703 should be replaced with autoconf/automake machinery.
1705 The SGI STL defines a type_traits<> template, specialized for
1706 many types in their code including the built-in numeric and
1707 pointer types and some library types, to direct optimizations of
1708 standard functions. The SGI compiler has been extended to generate
1709 specializations of this template automatically for user types,
1710 so that use of STL templates on user types can take advantage of
1711 these optimizations. Specializations for other, non-STL, types
1712 would make more optimizations possible, but extending the gcc
compiler in the same way would be much better. The next round of
standardization will probably ratify this, though likely with
changes, so it should be renamed to place it in the implementation
namespace.
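The way a traits template can direct an optimization may be sketched
as follows. The names bitwise_copyable, copy_impl and copy_elements
are hypothetical and are not the SGI type_traits<> interface; the
point is only that a specialization selects, at compile time, between
an element-wise loop and a block copy.

```cpp
#include <cstddef>
#include <cstring>

struct true_tag {};
struct false_tag {};

// Hypothetical trait with a conservative default: unknown types are
// copied element by element.
template <class T> struct bitwise_copyable { typedef false_tag type; };
template <> struct bitwise_copyable<char> { typedef true_tag type; };
template <> struct bitwise_copyable<int> { typedef true_tag type; };
template <class T> struct bitwise_copyable<T*> { typedef true_tag type; };

// General path: uses the element type's own assignment operator.
template <class T>
T* copy_impl(const T* src, std::size_t n, T* dst, false_tag) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i];
    return dst + n;
}

// Fast path: a single block move for types marked bitwise copyable.
template <class T>
T* copy_impl(const T* src, std::size_t n, T* dst, true_tag) {
    if (n != 0)
        std::memmove(dst, src, n * sizeof(T));
    return dst + n;
}

// Entry point: the trait picks the overload; no run-time test remains.
template <class T>
inline T* copy_elements(const T* src, std::size_t n, T* dst) {
    return copy_impl(src, n, dst, typename bitwise_copyable<T>::type());
}
```

A user type takes the slow path unless someone specializes the trait
for it, which is why compiler-generated specializations are so
attractive.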
1718 The SGI STL also defines a large number of extensions visible in
1719 standard headers. (Other extensions that appear in separate headers
1720 have been sequestered in subdirectories ext/ and backward/.) All
1721 these extensions should be moved to other headers where possible,
1722 and in any case wrapped in a namespace (not std!), and (where kept
1723 in a standard header) girded about with macro guards. Some cannot be
1724 moved out of standard headers because they are used to implement
1725 standard features. The canonical method for accommodating these
1726 is to use a protected name, aliased in macro guards to a user-space
1727 name. Unfortunately C++ offers no satisfactory template typedef
1728 mechanism, so very ad-hoc and unsatisfactory aliasing must be used
1731 Implementation of a template typedef mechanism should have the highest
1732 priority among possible extensions, on the same level as implementation
1733 of the template "export" feature.
1735 Chapter 18 Language support
1736 ----------------------------
1738 Headers: <limits> <new> <typeinfo> <exception>
1739 C headers: <cstddef> <climits> <cfloat> <cstdarg> <csetjmp>
1740 <ctime> <csignal> <cstdlib> (also 21, 25, 26)
1742 This defines the built-in exceptions, rtti, numeric_limits<>,
1743 operator new and delete. Much of this is provided by the
1744 compiler in its static runtime library.
1746 Work to do includes defining numeric_limits<> specializations in
1747 separate files for all target architectures. Values for integer types
1748 except for bool and wchar_t are readily obtained from the C header
1749 <limits.h>, but values for the remaining numeric types (bool, wchar_t,
1750 float, double, long double) must be entered manually. This is
1751 largely dog work except for those members whose values are not
1752 easily deduced from available documentation. Also, this involves
1753 some work in target configuration to identify the correct choice of
1754 file to build against and to install.
1756 The definitions of the various operators new and delete must be
1757 made thread-safe, which depends on a portable exclusion mechanism,
1758 discussed under chapter 20. Of course there is always plenty of
1759 room for improvements to the speed of operators new and delete.
1761 <cstdarg>, in Glibc, defines some macros that gcc does not allow to
1762 be wrapped into an inline function. Probably this header will demand
1763 attention whenever a new target is chosen. The functions atexit(),
1764 exit(), and abort() in cstdlib have different semantics in C++, so
1765 must be re-implemented for C++.
1767 Chapter 19 Diagnostics
1768 -----------------------
1770 Headers: <stdexcept>
1771 C headers: <cassert> <cerrno>
1773 This defines the standard exception objects, which are "mostly complete".
1774 Cygnus has a version, and now SGI provides a slightly different one.
1775 It makes little difference which we use.
1777 The C global name "errno", which C allows to be a variable or a macro,
1778 is required in C++ to be a macro. For MT it must typically result in
1781 Chapter 20 Utilities
1782 ---------------------
1783 Headers: <utility> <functional> <memory>
1784 C header: <ctime> (also in 18)
1786 SGI STL provides "mostly complete" versions of all the components
1787 defined in this chapter. However, the auto_ptr<> implementation
1788 is known to be wrong. Furthermore, the standard definition of it
1789 is known to be unimplementable as written. A minor change to the
1790 standard would fix it, and auto_ptr<> should be adjusted to match.
1792 Multi-threading affects the allocator implementation, and there must
1793 be configuration/installation choices for different users' MT
1794 requirements. Anyway, users will want to tune allocator options
1795 to support different target conditions, MT or no.
1797 The primitives used for MT implementation should be exposed, as an
1798 extension, for users' own work. We need cross-CPU "mutex" support,
1799 multi-processor shared-memory atomic integer operations, and single-
1800 processor uninterruptible integer operations, and all three configurable
1801 to be stubbed out for non-MT use, or to use an appropriately-loaded
1802 dynamic library for the actual runtime environment, or statically
1803 compiled in for cases where the target architecture is known.
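One way the "stubbed out for non-MT use" option might look is sketched
below. It covers only the compile-time choice, not the dynamically
loaded variant, and it uses std::mutex and std::lock_guard from a
later standard than the one these notes describe. The names null_lock,
default_lock, guarded_counter and the macro SKETCH_NO_THREADS are all
hypothetical.

```cpp
#include <mutex>

// Stub lock for single-threaded builds: both operations compile away.
struct null_lock {
    void lock() {}
    void unlock() {}
};

// Hypothetical configuration switch selecting the default primitive.
#ifdef SKETCH_NO_THREADS
typedef null_lock default_lock;
#else
typedef std::mutex default_lock;
#endif

// A component written against the lock interface only; it neither
// knows nor cares whether the lock is real.
template <class Lock = default_lock>
class guarded_counter {
public:
    guarded_counter() : value_(0) {}
    long increment() {
        std::lock_guard<Lock> hold(lock_);   // released on scope exit
        return ++value_;
    }
private:
    Lock lock_;
    long value_;
};
```

Exposing the lock type as a template parameter is what lets users
reuse the same primitive for their own work.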
Chapter 21 Strings
-------------------
Headers: <string>
1808 C headers: <cctype> <cwctype> <cstring> <cwchar> (also in 27)
1809 <cstdlib> (also in 18, 25, 26)
1811 We have "mostly-complete" char_traits<> implementations. Many of the
1812 char_traits<char> operations might be optimized further using existing
1813 proprietary language extensions.
1815 We have a "mostly-complete" basic_string<> implementation. The work
1816 to manually instantiate char and wchar_t specializations in object
1817 files to improve link-time behavior is extremely unsatisfactory,
1818 literally tripling library-build time with no commensurate improvement
1819 in static program link sizes. It must be redone. (Similar work is
1820 needed for some components in chapters 22 and 27.)
1822 Other work needed for strings is MT-safety, as discussed under the
1825 The standard C type mbstate_t from <cwchar> and used in char_traits<>
1826 must be different in C++ than in C, because in C++ the default constructor
1827 value mbstate_t() must be the "base" or "ground" sequence state.
1828 (According to the likely resolution of a recently raised Core issue,
1829 this may become unnecessary. However, there are other reasons to
1830 use a state type not as limited as whatever the C library provides.)
If we want to provide conversions from (e.g.) internally-represented
EUC-wide to externally-represented Unicode, or vice versa, the
mbstate_t we choose will need to be more accommodating than what
might be provided by an underlying C library.
1836 There remain some basic_string template-member functions which do
1837 not overload properly with their non-template brethren. The infamous
1838 hack akin to what was done in vector<> is needed, to conform to
1839 23.1.1 para 10. The CHECKLIST items for basic_string marked 'X',
1840 or incomplete, are so marked for this reason.
1842 Replacing the string iterators, which currently are simple character
1843 pointers, with class objects would greatly increase the safety of the
1844 client interface, and also permit a "debug" mode in which range,
1845 ownership, and validity are rigorously checked. The current use of
1846 raw pointers as string iterators is evil. vector<> iterators need the
1847 same treatment. Note that the current implementation freely mixes
1848 pointers and iterators, and that must be fixed before safer iterators
Some of the functions in <cstring> differ from the C versions: they
are generally overloaded on const and non-const argument pointers. For
1853 example, in <cstring> strchr is overloaded. The functions isupper
1854 etc. in <cctype> typically implemented as macros in C are functions
1855 in C++, because they are overloaded with others of the same name
1856 defined in <locale>.
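The const overloading can be illustrated with a pair of wrappers. The
namespace sketch and the name strchr_cxx are hypothetical, chosen to
avoid clashing with the real declarations; both simply forward to
std::strchr.

```cpp
#include <cstring>

namespace sketch {

// const in, const out: a read-only string stays read-only.
inline const char* strchr_cxx(const char* s, int c) {
    return std::strchr(s, c);
}

// non-const in, non-const out: the caller may write through the result.
inline char* strchr_cxx(char* s, int c) {
    return std::strchr(s, c);
}

}  // namespace sketch
```

The single C signature, by contrast, accepts a const pointer and hands
back a non-const one, silently discarding the qualifier.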
1858 Many of the functions required in <cwctype> and <cwchar> cannot be
1859 implemented using underlying C facilities on intended targets because
1860 such facilities only partly exist.
Chapter 22 Localization
------------------------
Headers: <locale>
1865 C headers: <clocale>
1867 We have a "mostly complete" class locale, with the exception of
1868 code for constructing, and handling the names of, named locales.
The ways that locales are named (particularly when categories such
as LC_TIME and LC_COLLATE differ) vary among target environments.
This code must be written in various versions and
1872 chosen by configuration parameters.
1874 Members of many of the facets defined in <locale> are stubs. Generally,
1875 there are two sets of facets: the base class facets (which are supposed
1876 to implement the "C" locale) and the "byname" facets, which are supposed
1877 to read files to determine their behavior. The base ctype<>, collate<>,
1878 and numpunct<> facets are "mostly complete", except that the table of
1879 bitmask values used for "is" operations, and corresponding mask values,
1880 are still defined in libio and just included/linked. (We will need to
1881 implement these tables independently, soon, but should take advantage
1882 of libio where possible.) The num_put<>::put members for integer types
1883 are "mostly complete".
1885 A complete list of what has and has not been implemented may be
1886 found in CHECKLIST. However, note that the current definition of
1887 codecvt<wchar_t,char,mbstate_t> is wrong. It should simply write
1888 out the raw bytes representing the wide characters, rather than
1889 trying to convert each to a corresponding single "char" value.
1891 Some of the facets are more important than others. Specifically,
1892 the members of ctype<>, numpunct<>, num_put<>, and num_get<> facets
1893 are used by other library facilities defined in <string>, <istream>,
1894 and <ostream>, and the codecvt<> facet is used by basic_filebuf<>
1895 in <fstream>, so a conforming iostream implementation depends on
1898 The "long long" type eventually must be supported, but code mentioning
1899 it should be wrapped in #if guards to allow pedantic-mode compiling.
Performance of num_put<> and num_get<> depends critically on
1902 caching computed values in ios_base objects, and on extensions
1903 to the interface with streambufs.
1905 Specifically: retrieving a copy of the locale object, extracting
1906 the needed facets, and gathering data from them, for each call to
1907 (e.g.) operator<< would be prohibitively slow. To cache format
1908 data for use by num_put<> and num_get<> we have a _Format_cache<>
1909 object stored in the ios_base::pword() array. This is constructed
1910 and initialized lazily, and is organized purely for utility. It
1911 is discarded when a new locale with different facets is imbued.
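The caching scheme can be approximated with the public ios_base
interface alone. In this sketch FormatCache, cache_index,
cache_callback and format_cache are hypothetical names, not the
library's _Format_cache<>; only xalloc(), pword(), iword(),
register_callback() and the event values are standard.

```cpp
#include <ios>
#include <locale>

// Hypothetical cache record holding data gathered from the locale.
struct FormatCache {
    char decimal_point;
    char thousands_sep;
};

// One pword/iword slot index, reserved once for the whole program.
inline int cache_index() {
    static const int idx = std::ios_base::xalloc();
    return idx;
}

inline void cache_callback(std::ios_base::event ev,
                           std::ios_base& ios, int idx) {
    if (ev == std::ios_base::copyfmt_event) {
        // copyfmt() copied the source stream's pointer; not ours to free.
        ios.pword(idx) = 0;
        return;
    }
    // imbue_event or erase_event: the data is stale or the stream is dying.
    delete static_cast<FormatCache*>(ios.pword(idx));
    ios.pword(idx) = 0;
}

// Build the cache lazily on first use; later calls reuse it.
inline const FormatCache& format_cache(std::ios_base& ios) {
    const int idx = cache_index();
    if (ios.pword(idx) == 0) {
        if (ios.iword(idx) == 0) {       // register the callback only once
            ios.register_callback(cache_callback, idx);
            ios.iword(idx) = 1;
        }
        const std::numpunct<char>& np =
            std::use_facet<std::numpunct<char> >(ios.getloc());
        FormatCache* fc = new FormatCache;
        fc->decimal_point = np.decimal_point();
        fc->thousands_sep = np.thousands_sep();
        ios.pword(idx) = fc;
    }
    return *static_cast<FormatCache*>(ios.pword(idx));
}
```

The facet lookup is then paid once per imbued locale instead of once
per formatted value.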
1913 Using only the public interfaces of the iterator arguments to the
1914 facet functions would limit performance by forbidding "vector-style"
1915 character operations. The streambuf iterator optimizations are
1916 described under chapter 24, but facets can also bypass the streambuf
1917 iterators via explicit specializations and operate directly on the
1918 streambufs, and use extended interfaces to get direct access to the
1919 streambuf internal buffer arrays. These extensions are mentioned
1920 under chapter 27. These optimizations are particularly important
1923 Unused virtual members of locale facets can be omitted, as mentioned
1924 above, by a smart linker.
1926 Chapter 23 Containers
1927 ----------------------
1928 Headers: <deque> <list> <queue> <stack> <vector> <map> <set> <bitset>
1930 All the components in chapter 23 are implemented in the SGI STL.
1931 They are "mostly complete"; they include a large number of
1932 nonconforming extensions which must be wrapped. Some of these
1933 are used internally and must be renamed or duplicated.
1935 The SGI components are optimized for large-memory environments. For
1936 embedded targets, different criteria might be more appropriate. Users
1937 will want to be able to tune this behavior. We should provide
1938 ways for users to compile the library with different memory usage
1941 A lot more work is needed on factoring out common code from different
1942 specializations to reduce code size here and in chapter 25. The
1943 easiest fix for this would be a compiler/ABI improvement that allows
1944 the compiler to recognize when a specialization depends only on the
1945 size (or other gross quality) of a template argument, and allow the
1946 linker to share the code with similar specializations. In its
1947 absence, many of the algorithms and containers can be partial-
1948 specialized, at least for the case of pointers, but this only solves
1949 a small part of the problem. Use of a type_traits-style template
1950 allows a few more optimization opportunities, more if the compiler
1951 can generate the specializations automatically.
1953 As an optimization, containers can specialize on the default allocator
1954 and bypass it, or take advantage of details of its implementation
1955 after it has been improved upon.
1957 Replacing the vector iterators, which currently are simple element
1958 pointers, with class objects would greatly increase the safety of the
1959 client interface, and also permit a "debug" mode in which range,
1960 ownership, and validity are rigorously checked. The current use of
1961 pointers for iterators is evil.
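A class-type iterator with a debug mode might be sketched like this.
The name checked_iterator is hypothetical, only a few operations are
shown, and a production version would compile the checks out when
debugging is disabled.

```cpp
#include <stdexcept>

// Hypothetical debug iterator: a raw pointer plus the bounds of the
// sequence it belongs to, so range and ownership can be verified.
template <class T>
class checked_iterator {
public:
    checked_iterator(T* pos, T* first, T* last)
        : pos_(pos), first_(first), last_(last) {}

    T& operator*() const {
        if (pos_ < first_ || pos_ >= last_)
            throw std::out_of_range("dereference outside [first, last)");
        return *pos_;
    }

    checked_iterator& operator++() {
        if (pos_ == last_)
            throw std::out_of_range("increment past the end");
        ++pos_;
        return *this;
    }

    friend bool operator==(const checked_iterator& a,
                           const checked_iterator& b) {
        if (a.first_ != b.first_ || a.last_ != b.last_)
            throw std::logic_error("iterators from different sequences");
        return a.pos_ == b.pos_;
    }

    friend bool operator!=(const checked_iterator& a,
                           const checked_iterator& b) {
        return !(a == b);
    }

private:
    T* pos_;
    T* first_;
    T* last_;
};
```

Because the iterator is a distinct class type, code that silently
mixes it with raw pointers no longer compiles, which also flushes out
the pointer/iterator confusion in the implementation.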
1963 As mentioned for chapter 24, the deque iterator is a good example of
1964 an opportunity to implement a "staged" iterator that would benefit
1965 from specializations of some algorithms.
1967 Chapter 24 Iterators
1968 ---------------------
1969 Headers: <iterator>
1971 Standard iterators are "mostly complete", with the exception of
1972 the stream iterators, which are not yet templatized on the
1973 stream type. Also, the base class template iterator<> appears
1974 to be wrong, so everything derived from it must also be wrong,
1977 The streambuf iterators (currently located in stl/bits/std_iterator.h,
1978 but should be under bits/) can be rewritten to take advantage of
1979 friendship with the streambuf implementation.
1981 Matt Austern has identified opportunities where certain iterator
1982 types, particularly including streambuf iterators and deque
1983 iterators, have a "two-stage" quality, such that an intermediate
1984 limit can be checked much more quickly than the true limit on
1985 range operations. If identified with a member of iterator_traits,
1986 algorithms may be specialized for this case. Of course the
1987 iterators that have this quality can be identified by specializing
Many of the algorithms must be specialized for the streambuf
iterators, to take advantage of block-mode operations, so that
the performance of iostream/locale operations does not suffer.
1993 It may be that they could be treated as staged iterators and
1994 take advantage of those optimizations.
1996 Chapter 25 Algorithms
1997 ----------------------
1998 Headers: <algorithm>
C headers: <cstdlib> (also in 18, 21, 26)
2001 The algorithms are "mostly complete". As mentioned above, they
2002 are optimized for speed at the expense of code and data size.
2004 Specializations of many of the algorithms for non-STL types would
2005 give performance improvements, but we must use great care not to
2006 interfere with fragile template overloading semantics for the
2007 standard interfaces. Conventionally the standard function template
2008 interface is an inline which delegates to a non-standard function
2009 which is then overloaded (this is already done in many places in
2010 the library). Particularly appealing opportunities for the sake of
2011 iostream performance are for copy and find applied to streambuf
2012 iterators or (as noted elsewhere) for staged iterators, of which
2013 the streambuf iterators are a good example.
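The delegation convention can be shown in miniature. Here the
namespace sketch and the helper name find_aux are hypothetical, and a
plain char buffer stands in for the streambuf-iterator case: the
standard-shaped template is a single inline forwarder, and all
overloading happens on the helper.

```cpp
#include <cstddef>
#include <cstring>

namespace sketch {

// Non-standard helper: the generic element-by-element search.
template <class Iter, class T>
Iter find_aux(Iter first, Iter last, const T& value) {
    while (first != last && !(*first == value))
        ++first;
    return first;
}

// Overload of the helper for char buffers: one block-mode memchr call.
inline const char* find_aux(const char* first, const char* last,
                            const char& value) {
    const void* p = std::memchr(first, value,
                                static_cast<std::size_t>(last - first));
    return p ? static_cast<const char*>(p) : last;
}

// The only function with the standard signature. It is never
// overloaded or specialized itself; it just forwards.
template <class Iter, class T>
inline Iter find(Iter first, Iter last, const T& value) {
    return find_aux(first, last, value);
}

}  // namespace sketch
```

Keeping the overload set on the helper leaves overload resolution for
the public name exactly as the standard describes it.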
2015 The bsearch and qsort functions cannot be overloaded properly as
2016 required by the standard because gcc does not yet allow overloading
2017 on the extern-"C"-ness of a function pointer.
Chapter 26 Numerics
--------------------
2021 Headers: <complex> <valarray> <numeric>
2022 C headers: <cmath>, <cstdlib> (also 18, 21, 25)
2024 Numeric components: Gabriel dos Reis's valarray, Drepper's complex,
2025 and the few algorithms from the STL are "mostly done". Of course
2026 optimization opportunities abound for the numerically literate. It
2027 is not clear whether the valarray implementation really conforms
2028 fully, in the assumptions it makes about aliasing (and lack thereof)
2031 The C div() and ldiv() functions are interesting, because they are the
2032 only case where a C library function returns a class object by value.
2033 Since the C++ type div_t must be different from the underlying C type
2034 (which is in the wrong namespace) the underlying functions div() and
ldiv() cannot be re-used efficiently. Fortunately they are trivial to
re-implement.
2038 Chapter 27 Iostreams
2039 ---------------------
2040 Headers: <iosfwd> <streambuf> <ios> <ostream> <istream> <iostream>
2041 <iomanip> <sstream> <fstream>
2042 C headers: <cstdio> <cwchar> (also in 21)
2044 Iostream is currently in a very incomplete state. <iosfwd>, <iomanip>,
2045 ios_base, and basic_ios<> are "mostly complete". basic_streambuf<> and
2046 basic_ostream<> are well along, but basic_istream<> has had little work
done. The standard stream objects, <sstream>, and <fstream> have been
2048 started; basic_filebuf<> "write" functions have been implemented just
2049 enough to do "hello, world".
2051 Most of the istream and ostream operators << and >> (with the exception
2052 of the op<<(integer) ones) have not been changed to use locale primitives,
2053 sentry objects, or char_traits members.
2055 All these templates should be manually instantiated for char and
2056 wchar_t in a way that links only used members into user programs.
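Member-by-member explicit instantiation is one way to approach this.
In the sketch below basic_counter is a hypothetical stand-in for a
stream template. In a real library each explicit instantiation would
sit in its own source file (or function section), so that a static
link pulls in only the members a program uses.

```cpp
#include <cstddef>

// Hypothetical stand-in for a library class template.
template <class CharT>
struct basic_counter {
    std::size_t length(const CharT* s) const;
};

// Definition, as it might appear in a ".tcc" file.
template <class CharT>
std::size_t basic_counter<CharT>::length(const CharT* s) const {
    std::size_t n = 0;
    while (s[n] != CharT())
        ++n;
    return n;
}

// Explicit instantiation of individual members for the two character
// types, rather than of the whole class at once.
template std::size_t basic_counter<char>::length(const char*) const;
template std::size_t basic_counter<wchar_t>::length(const wchar_t*) const;
```

Instantiating the whole class in one object file is simpler to write
but drags every member into any program that uses one of them.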
2058 Streambuf is fertile ground for optimization extensions. An extended
2059 interface giving iterator access to its internal buffer would be very
2060 useful for other library components.
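
As an illustration only, such an extended interface might look like the
sketch below. The class iterable_stringbuf and its members get_begin()
and get_end() are hypothetical; they simply republish the protected
get-area pointers that basic_streambuf<> already maintains.

```cpp
#include <sstream>
#include <string>

namespace sketch {

// Hypothetical extension: a stringbuf that exposes its get area as an
// iterator range, so other components can scan the buffer directly.
class iterable_stringbuf : public std::stringbuf {
public:
    explicit iterable_stringbuf(const std::string& s)
        : std::stringbuf(s, std::ios_base::in) {}

    // gptr() and egptr() are protected in basic_streambuf<>; these
    // accessors make the unread part of the buffer visible to callers.
    const char* get_begin() const { return gptr(); }
    const char* get_end() const { return egptr(); }
};

}  // namespace sketch
```

A caller could then run an algorithm such as std::find over
[get_begin(), get_end()) without a virtual call per character.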
2062 Iostream operations (primarily operators << and >>) can take advantage
2063 of the case where user code has not specified a locale, and bypass locale
2064 operations entirely. The current implementation of op<</num_put<>::put,
2065 for the integer types, demonstrates how they can cache encoding details
2066 from the locale on each operation. There is lots more room for
2067 optimization in this area.
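
The bypass idea can be sketched as follows. The helper put_int() is
hypothetical and is not how the library's op<< is written; it takes a
fast path only when the stream still carries the classic "C" locale and
plain decimal formatting, and otherwise defers to the ordinary
operator. format_int() is a small usage helper, also hypothetical.

```cpp
#include <ios>
#include <limits>
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {

inline std::ostream& put_int(std::ostream& os, int value) {
    const std::ios_base::fmtflags none = std::ios_base::fmtflags();
    const std::ios_base::fmtflags flags = os.flags();
    const std::ios_base::fmtflags base = flags & std::ios_base::basefield;
    const bool plain = os.getloc() == std::locale::classic()
        && os.width() == 0
        && base != std::ios_base::hex && base != std::ios_base::oct
        && (flags & std::ios_base::showpos) == none;
    if (!plain)
        return os << value;  // general path through num_put<>

    std::ostream::sentry guard(os);
    if (!guard)
        return os;

    // Format the digits by hand, right to left; no facet is consulted.
    char buf[std::numeric_limits<int>::digits10 + 3];
    char* const end = buf + sizeof buf;
    char* p = end;
    unsigned int mag = static_cast<unsigned int>(value);
    if (value < 0)
        mag = 0u - mag;  // magnitude, safe even for the minimum int
    do {
        *--p = static_cast<char>('0' + mag % 10u);
        mag /= 10u;
    } while (mag != 0u);
    if (value < 0)
        *--p = '-';

    const std::streamsize n = end - p;
    if (os.rdbuf()->sputn(p, n) != n)
        os.setstate(std::ios_base::badbit);
    return os;
}

// Usage helper: formats one integer through put_int().
inline std::string format_int(int value) {
    std::ostringstream os;
    put_int(os, value);
    return os.str();
}

}  // namespace sketch
```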
2069 The definition of the relationship between the standard streams
2070 cout et al. and stdout et al. requires something like a "stdiobuf".
2071 The SGI solution of using double-indirection to actually use a
2072 stdio FILE object for buffering is unsatisfactory, because it
2073 interferes with peephole loop optimizations.
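
For orientation, a deliberately naive "stdiobuf" is sketched below. It
is unbuffered and forwards every write to the FILE object, so it shows
the required relationship between a stream and stdio but none of the
performance the notes ask for. The class and the demo_interleave()
helper are hypothetical and not taken from libstdc++ or the SGI code.

```cpp
#include <cstddef>
#include <cstdio>
#include <ostream>
#include <streambuf>
#include <string>

namespace sketch {

// Hypothetical unbuffered stdiobuf: each character goes straight to the
// FILE, so stream output interleaves correctly with direct stdio calls.
class stdiobuf : public std::streambuf {
public:
    explicit stdiobuf(std::FILE* file) : file_(file) {}

protected:
    virtual int_type overflow(int_type c) {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        return std::fputc(traits_type::to_char_type(c), file_) == EOF
            ? traits_type::eof() : c;
    }
    virtual std::streamsize xsputn(const char* s, std::streamsize n) {
        return static_cast<std::streamsize>(
            std::fwrite(s, 1, static_cast<std::size_t>(n), file_));
    }
    virtual int sync() { return std::fflush(file_) == 0 ? 0 : -1; }

private:
    std::FILE* file_;
};

// Usage helper: mixes ostream and stdio writes on one temporary file
// and returns what was written (empty if no temporary file is available).
inline std::string demo_interleave() {
    std::FILE* f = std::tmpfile();
    if (!f)
        return std::string();
    stdiobuf buf(f);
    std::ostream out(&buf);
    out << "hello, ";
    std::fputs("world", f);
    out << '!';
    std::rewind(f);
    char text[32] = {0};
    const std::size_t n = std::fread(text, 1, sizeof text - 1, f);
    std::fclose(f);
    return std::string(text, n);
}

}  // namespace sketch
```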
2075 The <sstream> header work has begun. stringbuf can benefit from
2076 friendship with basic_string<> and basic_string<>::_Rep to use
2077 those objects directly as buffers, and avoid allocating and making copies.
2080 The basic_filebuf<> template is a complex beast. It is specified to
2081 use the locale facet codecvt<> to translate characters between native
2082 files and the locale character encoding. In general this involves
2083 two buffers, one of "char" representing the file and another of
2084 "char_type", for the stream, with codecvt<> translating. The process
2085 is complicated by the variable-length nature of the translation, and
2086 the need to seek to corresponding places in the two representations.
2087 For the case of basic_filebuf<char>, when no translation is needed,
2088 a single buffer suffices. A specialized filebuf can be used to reduce
2089 code space overhead when no locale has been imbued. Matt Austern's
2090 work at SGI will be useful, perhaps directly as a source of code, or
2091 at least as an example to draw on.
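
The two-buffer translation step can be sketched with the standard
codecvt<>::in() call. The helper to_internal() is hypothetical; it
converts one external "char" buffer into a "wchar_t" buffer in a single
pass, assumes at most one wide character per external byte, and simply
stops at an incomplete trailing sequence rather than carrying state
over to the next refill as a real filebuf would have to.

```cpp
#include <cwchar>
#include <locale>
#include <string>

namespace sketch {

// Translate external bytes to wchar_t through the codecvt facet of loc.
inline std::wstring to_internal(
        const std::string& bytes,
        const std::locale& loc = std::locale::classic()) {
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(loc);

    if (bytes.empty())
        return std::wstring();

    std::mbstate_t state = std::mbstate_t();
    std::wstring out(bytes.size(), L'\0');  // internal buffer
    const char* from = bytes.data();        // external buffer
    const char* from_next = from;
    wchar_t* to = &out[0];
    wchar_t* to_next = to;

    const facet_type::result r = cvt.in(
        state, from, from + bytes.size(), from_next,
        to, to + out.size(), to_next);

    if (r == facet_type::error)
        return std::wstring();
    if (r == facet_type::noconv)  // facet declined: widen by plain copy
        return std::wstring(bytes.begin(), bytes.end());

    out.resize(static_cast<std::wstring::size_type>(to_next - to));
    return out;
}

}  // namespace sketch
```

Seeking is where this gets hard: from_next and to_next advance at
different rates, so a position in one buffer does not directly give the
corresponding position in the other.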
2093 Filebuf, almost uniquely (cf. operator new), depends heavily on
2094 underlying environmental facilities. In current releases iostream
2095 depends fairly heavily on libio constant definitions, but it should
2096 be made independent. It also depends on operating system primitives
2097 for file operations. There is immense room for optimizations using
2098 (e.g.) mmap for reading. The shadow/ directory wraps, besides the
2099 standard C headers, the libio.h and unistd.h headers, for use mainly
2100 by filebuf. These wrappings have not been completed, though there
2101 is scaffolding in place.
2103 The encapsulation of certain C header <cstdio> names presents an
2104 interesting problem. It is possible to define an inline std::fprintf()
2105 implemented in terms of the 'extern "C"' vfprintf(), but there is no
2106 standard vfscanf() to use to implement std::fscanf(). It appears that
2107 vfscanf must be re-implemented in C++ for targets where no vfscanf
2108 extension has been defined. This is interesting in that it seems
2109 to be the only significant case in the C library where this kind of
2110 rewriting is necessary. (Of course Glibc provides the vfscanf()
2111 extension.) (The functions related to exit() must be rewritten
2117 Headers: <strstream>
2119 Annex D defines many non-library features, and many minor
2120 modifications to various headers, and a complete header.
2121 It is "mostly done", except that the libstdc++-2 <strstream>
2122 header has not been adopted into the library, or checked to
2123 verify that it matches the draft in those details that were
2124 clarified by the committee. Certainly it must at least be
2125 moved into the std namespace.
2127 We still need to wrap all the deprecated features in #if guards
2128 so that pedantic compile modes can detect their use.
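
A guard of this kind might look like the sketch below. The macro names
SKETCH_NO_DEPRECATED and SKETCH_HAS_DEPRECATED are hypothetical, since
the notes do not say which macro a pedantic mode would define, and
deprecated_feature() merely stands in for an Annex D facility.

```cpp
// Hypothetical guard: a pedantic build would define SKETCH_NO_DEPRECATED
// before including the header, making any use below a compile error.
#ifndef SKETCH_NO_DEPRECATED
namespace sketch {
    // Stand-in for a deprecated facility such as those in <strstream>.
    inline int deprecated_feature() { return 42; }
}
#define SKETCH_HAS_DEPRECATED 1
#else
#define SKETCH_HAS_DEPRECATED 0
#endif
```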
2130 Nonstandard Extensions
2131 ----------------------
2132 Headers: <iostream.h> <strstream.h> <hash> <rbtree>
2133 <pthread_alloc> <stdiobuf> (etc.)
2135 User code has come to depend on a variety of nonstandard components
2136 that we must not omit. Much of this code can be adopted from
2137 libstdc++-v2 or from the SGI STL. This particularly includes
2138 <iostream.h>, <strstream.h>, and various SGI extensions such
2139 as <hash_map.h>. Many of these are already placed in the
2140 subdirectories ext/ and backward/. (Note that it is better to
2141 include them via "<backward/hash_map.h>" or "<ext/hash_map>" than
2142 to search the subdirectory itself via a "-I" directive.)