2 <!DOCTYPE part PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
3 "http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd"
6 <part id="manual.iterators" xreflabel="Iterators">
7 <?dbhtml filename="iterators.html"?>
22 <indexterm><primary>Iterators</primary></indexterm>
25 <!-- Chapter 01 : Predefined -->
26 <chapter id="manual.iterators.predefined" xreflabel="Predefined">
27 <title>Predefined</title>
29 <sect1 id="iterators.predefined.vs_pointers" xreflabel="Versus Pointers">
30 <title>Iterators vs. Pointers</title>
33 FAQ <link linkend="faq.iterator_as_pod">entry</link> points out that
34 iterators are not implemented as pointers. They are a generalization
35 of pointers, but they are implemented in libstdc++ as separate
38 <para>Keeping that simple fact in mind as you design your code will
39 prevent a whole lot of difficult-to-understand bugs.
41 <para>You can think of it the other way 'round, even. Since iterators
42 are a generalization, that means that <emphasis>pointers</emphasis> are
43 <emphasis>iterators</emphasis>, and that pointers can be used whenever an
44 iterator would be. All those functions in the Algorithms chapter
45 of the Standard will work just as well on plain arrays and their
48 <para>That doesn't mean that when you pass in a pointer, it gets wrapped
49 into some special delegating iterator-to-pointer class with a layer
50 of overhead. (If you think that's the case anywhere, you don't
51 understand templates to begin with...) Oh, no; if you pass
52 in a pointer, then the compiler will instantiate that template
53 using T* as a type, and good old high-speed pointer arithmetic as
54 its operations, so the resulting code will be doing exactly the same
55 things as it would be doing if you had hand-coded it yourself (for
58 <para>How much overhead <emphasis>is</emphasis> there when using an iterator class?
59 Very little. Most of the layering classes contain nothing but
60 typedefs, and typedefs are "meta-information" that simply
61 tell the compiler some nicknames; they don't create code. That
62 information gets passed down through inheritance, so while the
63 compiler has to do work looking up all the names, your runtime code
64 does not. (This has been a prime concern from the beginning.)
70 <sect1 id="iterators.predefined.end" xreflabel="end() Is One Past the End">
71 <title>One Past the End</title>
73 <para>This starts off sounding complicated, but is actually very easy,
74 especially towards the end. Trust me.
76 <para>Beginners usually have a little trouble understand the whole
77 'past-the-end' thing, until they remember their early algebra classes
78 (see, they <emphasis>told</emphasis> you that stuff would come in handy!) and
79 the concept of half-open ranges.
81 <para>First, some history, and a reminder of some of the funkier rules in
82 C and C++ for builtin arrays. The following rules have always been
83 true for both languages:
87 <para>You can point anywhere in the array, <emphasis>or to the first element
88 past the end of the array</emphasis>. A pointer that points to one
89 past the end of the array is guaranteed to be as unique as a
90 pointer to somewhere inside the array, so that you can compare
95 <para>You can only dereference a pointer that points into an array.
96 If your array pointer points outside the array -- even to just
97 one past the end -- and you dereference it, Bad Things happen.
101 <para>Strictly speaking, simply pointing anywhere else invokes
102 undefined behavior. Most programs won't puke until such a
103 pointer is actually dereferenced, but the standards leave that
108 <para>The reason this past-the-end addressing was allowed is to make it
109 easy to write a loop to go over an entire array, e.g.,
110 while (*d++ = *s++);.
112 <para>So, when you think of two pointers delimiting an array, don't think
113 of them as indexing 0 through n-1. Think of them as <emphasis>boundary
120 | | This is bad. Always having to
121 | | remember to add or subtract one.
122 | | Off-by-one bugs very common here.
125 |---|---|--...--|---|---|
126 | 0 | 1 | ... |N-2|N-1|
127 |---|---|--...--|---|---|
131 | | This is good. This is safe. This
132 | | is guaranteed to work. Just don't
133 | | dereference 'end'.
137 <para>See? Everything between the boundary markers is part of the array.
140 <para>Now think back to your junior-high school algebra course, when you
141 were learning how to draw graphs. Remember that a graph terminating
142 with a solid dot meant, "Everything up through this point,"
143 and a graph terminating with an open dot meant, "Everything up
144 to, but not including, this point," respectively called closed
145 and open ranges? Remember how closed ranges were written with
146 brackets, <emphasis>[a,b]</emphasis>, and open ranges were written with parentheses,
147 <emphasis>(a,b)</emphasis>?
149 <para>The boundary markers for arrays describe a <emphasis>half-open range</emphasis>,
150 starting with (and including) the first element, and ending with (but
151 not including) the last element: <emphasis>[beginning,end)</emphasis>. See, I
152 told you it would be simple in the end.
154 <para>Iterators, and everything working with iterators, follows this same
155 time-honored tradition. A container's <code>begin()</code> method returns
156 an iterator referring to the first element, and its <code>end()</code>
157 method returns a past-the-end iterator, which is guaranteed to be
158 unique and comparable against any other iterator pointing into the
159 middle of the container.
161 <para>Container constructors, container methods, and algorithms, all take
162 pairs of iterators describing a range of values on which to operate.
163 All of these ranges are half-open ranges, so you pass the beginning
164 iterator as the starting parameter, and the one-past-the-end iterator
165 as the finishing parameter.
167 <para>This generalizes very well. You can operate on sub-ranges quite
168 easily this way; functions accepting a <emphasis>[first,last)</emphasis> range
169 don't know or care whether they are the boundaries of an entire {array,
170 sequence, container, whatever}, or whether they only enclose a few
171 elements from the center. This approach also makes zero-length
172 sequences very simple to recognize: if the two endpoints compare
173 equal, then the {array, sequence, container, whatever} is empty.
175 <para>Just don't dereference <code>end()</code>.
181 <!-- Chapter 02 : Stream -->