Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Problems removing an element from a std::set using a reverse_iterator

921 views
Skip to first unread message

irotas

unread,
Jul 13, 2007, 2:23:03 PM7/13/07
to
The following following article talks about removing elements from a
standard container using a reverse_iterator:
http://www.ddj.com/dept/cpp/184401406

Note that the article is (as best as I can tell) identical to Item 28
in 'Effective STL' by Scott Meyers.

In the article, Meyers states that there are two ways to remove an
element from a standard container using a reverse iterator:
1) v.erase(--ri.base());
2) v.erase((++ri).base());

The article states that option 1 is fine for all standard containers
*except* for std::string and std::vector, which often implement
iterators as built-in pointers (C++ imposes the constraint that
pointers returned from functions may not be modified). However, the
article claims that option 2 works for all standard containers, and
therefore is the preferred technique.

In my case, I have a std::set that I want to remove an element from
using a reverse_iterator. In this case, option 1 does in fact compile
and execute properly. However, I wanted to use option 2 (as Meyers
claims its the preferred technique), but in fact this method does
*not* execute properly.

On my Linux computer, valgrind complains about many "Invalid reads"
when I use option 2. I posted this code on ##c++ on Freenode IRC, and
one user said that it even triggered an assertion in Visual Studio.

I emailed the code to Scott Meyers for review, but he replied that
nothing seems wrong with the code from his quick examination, and that
I should post the code here.

Please be aware that this code is simply for pedagogical purposes; it
does not represent my real algorithm (which in fact demands using a
reverse_iterator to remove an element).


***********************************************************************
#include <set>
#include <iostream>

using namespace std;

int
main(
int argc,
char** argv)
{
set<unsigned int> uis;

// add 0 through 9 to the set
for(unsigned int i = 0U ; i < 10U ; ++i)
{
(void)uis.insert(i);
}

cout << endl << "Before:" << endl;

// display the set before modification
for(set<unsigned int>::const_iterator i = uis.begin() ;
i != uis.end() ;
++i)
{
cout << " " << (*i) << endl;
}

// use a reverse iterator to remove 5 through 9 from the set
for(set<unsigned int>::reverse_iterator ri = uis.rbegin() ;
ri != uis.rend() ;)
{
cout << endl << "Testing " << (*ri) << endl;

// if we hit an element smaller than 5, terminate the loop
if((*ri) < 5U)
{
cout << endl << "Terminating loop at " << (*ri) << endl;

break;
}

cout << endl << "Removing " << (*ri) << endl;

#if 1
uis.erase((++ri).base());
#else
uis.erase(--(ri.base()));
#endif
}

cout << endl << "After:" << endl;

// display the set after modification
for(set<unsigned int>::const_iterator i = uis.begin() ;
i != uis.end() ;
++i)
{
cout << " " << (*i) << endl;
}

return(0);
}
***********************************************************************

Here is the relevant output of valgrind on my computer:

***********************************************************************
...
==11229== Using valgrind-2.2.0, a program supervision framework for
x86-linux.
==11229== Copyright (C) 2000-2004, and GNU GPL'd, by Julian Seward et
al.
==11229== Valgrind library directory: /usr/lib/valgrind
==11229== Command line
==11229== ./ritertest
==11229== Startup, with flags:
==11229== --tool=memcheck
==11229== --leak-check=yes
==11229== -v
==11229== Contents of /proc/version:
==11229== Linux version 2.6.9-1.667
(bhco...@tweety.build.redhat.com) (gcc version 3.4.2 20041017 (Red
Hat 3.4.2-6.fc3)) #1 Tue Nov 2 14:41:25 EST 2004
...
Before:
0
1
2
3
4
5
6
7
8
9

Testing 9

Removing 9
==11229== TRANSLATE: 0x87E424 redirected to 0x1B9050CE
==11229== Invalid read of size 4
==11229== at 0x82B6FA:
std::_Rb_tree_decrement(std::_Rb_tree_node_base*) (in /usr/lib/libstdc
+
+.so.6.0.3)
==11229== by 0x82B764:
std::_Rb_tree_decrement(std::_Rb_tree_node_base const*) (in /usr/lib/
libstdc++.so.6.0.3)
==11229== by 0x8049660:
std::_Rb_tree_const_iterator<unsigned>::operator--() (in /home/
mclaurin/concerto_dev/ritertest)
==11229== by 0x804920F:
std::reverse_iterator<std::_Rb_tree_const_iterator<unsigned>
>::operator*() const (in /home/mclaurin/concerto_dev/ritertest)
==11229== Address 0x1B91D2B0 is 0 bytes inside a block of size 20
free'd
==11229== at 0x1B90514F: operator delete(void*)
(vg_replace_malloc.c:156)
==11229== by 0x8049A3C:
__gnu_cxx::new_allocator<std::_Rb_tree_node<unsigned>
>::deallocate(std::_Rb_tree_node<unsigned>*, unsigned) (in /home/
mclaurin/concerto_dev/ritertest)
==11229== by 0x804998F: std::_Rb_tree<unsigned, unsigned,
std::_Identity<unsigned>, std::less<unsigned>,
std::allocator<unsigned> >::_M_put_node(std::_Rb_tree_node<unsigned>*)
(in /home/mclaurin/concerto_dev/ritertest)
==11229== by 0x804975B: std::_Rb_tree<unsigned, unsigned,
std::_Identity<unsigned>, std::less<unsigned>,
std::allocator<unsigned>
>::destroy_node(std::_Rb_tree_node<unsigned>*) (in /home/mclaurin/
concerto_dev/ritertest)
...
--11229-- supp: 15 Ugly strchr error in /lib/ld-2.3.3.so
==11229==
==11229== IN SUMMARY: 54 errors from 8 contexts (suppressed: 15 from
1)
...
***********************************************************************


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Carl Barron

unread,
Jul 14, 2007, 4:32:35 PM7/14/07
to
irotas <goo...@irotas.net> wrote:

>
> The article states that option 1 is fine for all standard containers
> *except* for std::string and std::vector, which often implement
> iterators as built-in pointers (C++ imposes the constraint that
> pointers returned from functions may not be modified).

chapter and verse please. this implies that
char *p = "abzxycd";
if(*(strchr(p,'z')+3) !='c') {/* */}
is illegal , [not safe but it is legal]

> However, the
> article claims that option 2 works for all standard containers, and
> therefore is the preferred technique.

I do see a huge problem with your code. You are invalidating the
reverse_iterator by invalidating its 'base iterator'.

first you find the last entry . thats ok then you erase that entry
noe you increment the reverse iterator , decrementing the base
iterator so it no points to the erased element. that is you are doing
this under the hood:
std::set<unsigned int>::iterator it = my_set.end();
while (it != my_set.begin()
{
std::set<unsigned int>::iterator tmp = it; --tmp;
if(pred(*tmp))
{
erase(it);
}
--it; // Ouch it now points to erased element if it was erased.
}

You need to do something like.

template <class Cont, class Pred>
void erase_backward_if(Cont &c, Pred pred)
{
typename Cont::iterator it;
typename Cont::reverse_iterator rit;

// find the first to remove
rit = std::find_if(c.rbegin(),c.rend(), pred);
// while any to remove.
while(rit != c.rend())
{
// copy the base iterator to erase to it for safe keeping
it = --(rit.begin());
// 'advance' the reverse iterator
rit = std::find_if(++rit,c.rend(),pred());
// now erase the elment pointed to by it.
c.erase(it);
}
}

now the data is not removed from the container until the
reverse_iterator has moved to the 'next position' and beyond.

Of course this is O(N) but apparently that is not a problem.

Greg Herlihy

unread,
Jul 15, 2007, 1:40:16 AM7/15/07
to
On Jul 14, 1:32 pm, cbarr...@ix.netcom.com (Carl Barron) wrote:
> irotas <goo...@irotas.net> wrote:
>
> > The article states that option 1 is fine for all standard containers
> > *except* for std::string and std::vector, which often implement
> > iterators as built-in pointers (C++ imposes the constraint that
> > pointers returned from functions may not be modified).
>
> chapter and verse please. this implies that
> char *p = "abzxycd";
> if(*(strchr(p,'z')+3) !='c') {/* */}
> is illegal , [not safe but it is legal]

The C++ rule is that r-values may not be modified (because there is no
object associated wth an r-value which could store the adjusted
value). Therefore - since a pointer returned by a call to base() would
be an rvalue - an expression like:

uis.erase( --ri.base() );

would not be legal C++.

The strchr() example above is legal because the code does not attempt
to modify the pointer returned by strchar(). Instead the program adds
another r-value (3) to the returned pointer in order to calculate a
new r-value. A better strchr() example would be:

char *p = "abzxycd";
if(*(++strchr(p,'z')) !='c') {/* */} // Error: '++' needs l-value

> > However, the
> > article claims that option 2 works for all standard containers, and
> > therefore is the preferred technique.
>

> You need to do something like.
>
> template <class Cont, class Pred>
> void erase_backward_if(Cont &c, Pred pred)
> {
> typename Cont::iterator it;
> typename Cont::reverse_iterator rit;
>
> // find the first to remove
> rit = std::find_if(c.rbegin(),c.rend(), pred);
> // while any to remove.
> while(rit != c.rend())
> {
> // copy the base iterator to erase to it for safe keeping
> it = --(rit.begin());
> // 'advance' the reverse iterator
> rit = std::find_if(++rit,c.rend(),pred());
> // now erase the elment pointed to by it.
> c.erase(it);
> }
> }

I think a more compact solution would be to replace:

uis.erase( (++ri).base() );

in the original program, with:

uis.erase(( ++ri, ri++.base() ));

in order to erase ri's item from the set - without (at the same time)
invalidating ri.

Greg

Carl Barron

unread,
Jul 15, 2007, 11:25:22 AM7/15/07
to
Greg Herlihy <gre...@pacbell.net> wrote:

> I think a more compact solution would be to replace:
>
> uis.erase( (++ri).base() );
>
> in the original program, with:
>
> uis.erase(( ++ri, ri++.base() ));
>
> in order to erase ri's item from the set - without (at the same time)
> invalidating ri.
>

But after the expression is evaluated, ri has been incremented twice
result every other one is erassed. It looks brittle

nonw of the single increment ri options work.
you need to use a named tempory to hold ri.base() and decrement it.

so I don't see anything more concise than:
for the erase function call in original code.
{
set<unsigned int>::iterator it = ri++.base();
uis.erase(--it);
}
but i'd probably
{
set<unsigned int>::iterator it = ri.base();
++ri;
erase(--it);
}
to avoid copying ri more than needed.

Louis Lavery

unread,
Jul 15, 2007, 11:25:44 AM7/15/07
to
Greg Herlihy wrote:
> On Jul 14, 1:32 pm, cbarr...@ix.netcom.com (Carl Barron) wrote:

[snip]

>
> I think a more compact solution would be to replace:
>
> uis.erase( (++ri).base() );
>
> in the original program, with:
>
> uis.erase(( ++ri, ri++.base() ));
>

That can fail (consider a container containing exactly one item).

Isn't it simpy...

ri = uis.erase( (++ri).base() );

Louis.

irotas

unread,
Jul 15, 2007, 1:57:46 PM7/15/07
to
Louis Lavery wrote:
> Greg Herlihy wrote:
> > On Jul 14, 1:32 pm, cbarr...@ix.netcom.com (Carl Barron) wrote:
> Isn't it simpy...
>
> ri = uis.erase( (++ri).base() );

Unfortunately, std::set.erase(iterator) does not have a return value.

Louis Lavery

unread,
Jul 16, 2007, 5:36:37 PM7/16/07
to
irotas wrote:
> Louis Lavery wrote:
>> Greg Herlihy wrote:
>>> On Jul 14, 1:32 pm, cbarr...@ix.netcom.com (Carl Barron) wrote:
>> Isn't it simpy...
>>
>> ri = uis.erase( (++ri).base() );
>
> Unfortunately, std::set.erase(iterator) does not have a return value.

Ah! A set is it.

Well, I tend not to hang on to iterators after an erase,
it's just so easy to get things wrong. So, going back to
your original problem, how about...

reverse_iterator rfirst = uis.rbegin();

reverse_iterator rpast = std::find(uis.rbegin(),uis.rend(),5);

uis.erase((++rpast).base(),(++rfirst).base());

...not elegant I know, but trying to do it in a loop while
juggling with iterators is fraught with danger.

Louis.

Louis Lavery

unread,
Jul 16, 2007, 5:36:36 PM7/16/07
to
Carl Barron wrote:
> Greg Herlihy <gre...@pacbell.net> wrote:
>
>> I think a more compact solution would be to replace:
>>
>> uis.erase( (++ri).base() );
>>
>> in the original program, with:
>>
>> uis.erase(( ++ri, ri++.base() ));
>>
>> in order to erase ri's item from the set - without (at the same time)
>> invalidating ri.
>>
> But after the expression is evaluated, ri has been incremented twice
> result every other one is erassed. It looks brittle
>
> nonw of the single increment ri options work.
> you need to use a named tempory to hold ri.base() and decrement it.
>
> so I don't see anything more concise than:
> for the erase function call in original code.
> {
> set<unsigned int>::iterator it = ri++.base();

Isn't ri.base() now one before it so...

> uis.erase(--it);

... --it == ri.base().

> }
> but i'd probably
> {
> set<unsigned int>::iterator it = ri.base();
> ++ri;
> erase(--it);

Ditto.

Louis.

Carl Barron

unread,
Jul 17, 2007, 11:17:22 AM7/17/07
to
In article <f7e40e$ccn$1$8302...@news.demon.co.uk>, Louis Lavery
<Lo...@LaverREMOVE.demon.co.uk> wrote:

>
> > so I don't see anything more concise than:
> > for the erase function call in original code.
> > {
> > set<unsigned int>::iterator it = ri++.base();
>
> Isn't ri.base() now one before it so...

ri++ returns a copy of ri before the increment and increments ri.
therefore ri++.base() returns the base iterator that ri had before
it was incremented.

The second version eliminates the need of a copy of ri produced with
operator ++(int). [most likely, op ++(int) creates a copy, especially
for a set<...>::reverse_iterator.]. But the final result is the same
with either version I provided in the post you refer to.

Louis Lavery

unread,
Jul 17, 2007, 6:25:50 PM7/17/07
to
Carl Barron wrote:
> In article <f7e40e$ccn$1$8302...@news.demon.co.uk>, Louis Lavery
> <Lo...@LaverREMOVE.demon.co.uk> wrote:
>
>>> so I don't see anything more concise than:
>>> for the erase function call in original code.
>>> {
>>> set<unsigned int>::iterator it = ri++.base();
>> Isn't ri.base() now one before it so...
> ri++ returns a copy of ri before the increment and increments ri.
> therefore ri++.base() returns the base iterator that ri had before
> it was incremented.
>

Yes. So ri.base() is now one before it. I.e. ri.base() == --it.

> The second version eliminates the need of a copy of ri produced with
> operator ++(int). [most likely, op ++(int) creates a copy, especially
> for a set<...>::reverse_iterator.]. But the final result is the same
> with either version I provided in the post you refer to.

Yes and they both result in invalidating ri by erasing the item its
base iterator points to. But you don't have to believe me, it's easy
to ask your compilers.

Your example code...

but i'd probably
{
set<unsigned int>::iterator it = ri.base();
++ri;
erase(--it);
}
to avoid copying ri more than needed.

...is equivalent to...

{
set<unsigned int>::iterator it = ri.base();
++ri;
--it;
erase(it);
}

...can be tested via...

#include <cassert>
#include <set>

int main()
{
std::set<int unsigned> uis;
uis.insert(1);
std::set<int unsigned>::reverse_iterator ri = uis.rbegin();

std::set<int unsigned>::iterator it = ri.base();
++ri;
--it;
assert(ri.base() != it); // Ask the compiler what it thinks.
uis.erase(it);

return 0;
}

...try it.

Louis.

Carl Barron

unread,
Jul 18, 2007, 9:58:39 AM7/18/07
to
In article <f7j79p$g2b$1$8300...@news.demon.co.uk>, Louis Lavery
<Lo...@LaverREMOVE.demon.co.uk> wrote:

>
> Yes and they both result in invalidating ri by erasing the item its
> base iterator points to. But you don't have to believe me, it's easy
> to ask your compilers.
>
> Your example code...
>
> but i'd probably
> {
> set<unsigned int>::iterator it = ri.base();
> ++ri;
> erase(--it);
> }
> to avoid copying ri more than needed.
>
> ...is equivalent to...
>
> {
> set<unsigned int>::iterator it = ri.base();
> ++ri;
> --it;
> erase(it);
> }
>

OK then there is no concise version. I'd then drop using
reverse iterators, the trick is not to decrement before the
begining of the container. Something like this is clear.

void erase_backward(Cont &c,Pred pred)
{
typename Cont::iterator i = c.end());
if(i == c.begin()) return;
// this loop intentionally ends one too soon.
for(--i;i!=c.begin() ; )
{
if(pred(*i))
{
// this erases *i and properly decrements i.
erase(i--);
}
else
{
--i; // just decrement
}
}
// i == c.begin();
if(pred(*i))
{
c.erase(i);

Louis Lavery

unread,
Jul 18, 2007, 6:29:08 PM7/18/07
to
Carl Barron wrote:
> In article <f7j79p$g2b$1$8300...@news.demon.co.uk>, Louis Lavery
> <Lo...@LaverREMOVE.demon.co.uk> wrote:
>
>> Yes and they both result in invalidating ri by erasing the item its
>> base iterator points to. But you don't have to believe me, it's easy
>> to ask your compilers.
>>
>> Your example code...
>>
>> but i'd probably
>> {
>> set<unsigned int>::iterator it = ri.base();
>> ++ri;
>> erase(--it);
>> }
>> to avoid copying ri more than needed.
>>
>> ...is equivalent to...
>>
>> {
>> set<unsigned int>::iterator it = ri.base();
>> ++ri;
>> --it;
>> erase(it);
>> }
>>
> OK then there is no concise version.

Well there's...

void erase_backward(Cont &c,Pred pred)
{
Cont::reverse_iterator ri = c.rbegin();

while(ri != c.rend())
{
if (pred(*ri)}
{
Cont::reverse_iterator x = ri;

c.erase((++x).base());
}
else
{
++ri;
}
}
}

...but what's the point of doing it backwards?

Louis.

Carl Barron

unread,
Jul 19, 2007, 11:51:59 AM7/19/07
to
In article <f7lmdg$524$1$8302...@news.demon.co.uk>, Louis Lavery
<Lo...@LaverREMOVE.demon.co.uk> wrote:

> Well there's...
>
> void erase_backward(Cont &c,Pred pred)
> {
> Cont::reverse_iterator ri = c.rbegin();
>
> while(ri != c.rend())
> {
> if (pred(*ri)}
> {
> Cont::reverse_iterator x = ri;
>
> c.erase((++x).base());
> }
> else
> {
> ++ri;
> }
> }
> }
>
> ...but what's the point of doing it backwards?
>
> Louis.

The OP wants it to work with sequence containers as well
and there might be a reason to remove duplicates in a backward
order, I really don't know. But it did produce a 'gotcha' with
reverse_iterator<BI>, worth remembering.

Vyacheslav Kononenko

unread,
Jul 19, 2007, 6:08:04 PM7/19/07
to
On Jul 15, 1:57 pm, irotas <goo...@irotas.net> wrote:
> Louis Lavery wrote:
> > Greg Herlihy wrote:
> > > On Jul 14, 1:32 pm, cbarr...@ix.netcom.com (Carl Barron) wrote:
> > Isn't it simpy...
>
> > ri = uis.erase( (++ri).base() );
>
> Unfortunately, std::set.erase(iterator) does not have a return value.

And it seems to be UB as well (ri modified twice)

Greg Herlihy

unread,
Jul 21, 2007, 3:36:00 AM7/21/07
to

On 7/15/07 8:25 AM, in article f7csse$44e$1$8300...@news.demon.co.uk,
"Louis Lavery" <Lo...@LaverREMOVE.demon.co.uk> wrote:

> Greg Herlihy wrote:
>> On Jul 14, 1:32 pm, cbarr...@ix.netcom.com (Carl Barron) wrote:
>
> [snip]
>
>>
>> I think a more compact solution would be to replace:
>>
>> uis.erase( (++ri).base() );
>>
>> in the original program, with:
>>
>> uis.erase(( ++ri, ri++.base() ));
>>
>
> That can fail (consider a container containing exactly one item).

That problem is easy enough to fix. Instead of moving the reverse iterator
forward - move it back to its initial position:

uis.erase(( ++ri, ri--.base() ));

When erase() returns, ri will refer either to an existing item in the set or
to the set's rend() iterator. Interestingly, even though ri ends up where it
started - ri nonetheless refers to a different item in the set than it did
before the call to erase(). So unlike ordinary set iterators (which always
refer to the same item - as long as the item resides in the set),
reverse_iterators in a set are not stable.

The most sensible approach it seems to me would be to sort the set in the
opposite direction and then remove the items while iterating in a forward
direction.

Greg

irotas

unread,
Jul 23, 2007, 2:37:13 PM7/23/07
to
On Jul 21, 3:36 am, Greg Herlihy <gre...@pacbell.net> wrote:
> The most sensible approach it seems to me would be to sort the set in the
> opposite direction and then remove the items while iterating in a forward
> direction.

This doesn't really work for my particular application. Some
operations on my container are indeed best suited for iterating
through the container from "smallest" to "largest". These operations
are well understood and work fine.

However, one particularly important operation is far more efficient if
I iterate from "largest" to "smallest". The reason is quite simple - I
abort the iteration once I find an element that satisfies a particular
condition, and that element will be much closer to the end of the
container than the beginning.

Also, I had originally hoped for a generic answer that would cover all
standard containers, in my particular application I am using a
std::set. If I could get a clear answer for that container, I suppose
that would suffice.

Again, in my original post I presented two alternatives suggested by
Scott Meyers:
1) v.erase(--ri.base());
2) v.erase((++ri).base());

Option #1 compiles and (seemingly) executes properly. That is, the
output is as expected, and valgrind doesn't complain. However, Option
#2 (the option Meyers' asserts is preferred) does not produce correct
output, and valgrind complains profusely. Therefore, one of the goals
of this thread was to determine if this advice in Meyers' "Effective
STL" is in fact flawed.

Thanks again,
Adam

P.S. I had no idea that this would be so complicated ...

Greg Herlihy

unread,
Jul 23, 2007, 7:03:18 PM7/23/07
to
On 7/23/07 11:37 AM, in article
1185208334.9...@r34g2000hsd.googlegroups.com, "irotas"
<goo...@irotas.net> wrote:

> Again, in my original post I presented two alternatives suggested by
> Scott Meyers:
> 1) v.erase(--ri.base());
> 2) v.erase((++ri).base());
>
> Option #1 compiles and (seemingly) executes properly. That is, the
> output is as expected, and valgrind doesn't complain. However, Option
> #2 (the option Meyers' asserts is preferred) does not produce correct
> output, and valgrind complains profusely. Therefore, one of the goals
> of this thread was to determine if this advice in Meyers' "Effective
> STL" is in fact flawed.

And the determination is that Meyer's advice is not flawed - both calls to
erase() work fine. But as has already been pointed out, one of these calls
also invalidates ri. Valgrind starts to complain only after the program has
incremented this invalid iterator and passed it a -second- time in a call to
erase().

Furthermore, it makes little sense to erase items in a way that would work
for any Standard container - instead of in the way optimally suited for the
container the items are actually in (which in this case is a std::set). And
the preferred way for erasing a range of items in a std::set is to combine
lower_bound() and upper_bound() in a call to erase() like this:

uis.erase( uis.lower_bound(5), uis.upper_bound(9) );

Certainly no other technique can match this one's efficiency, conciseness
and correctness.

Greg

irotas

unread,
Jul 24, 2007, 3:25:23 PM7/24/07
to
On Jul 23, 7:03 pm, Greg Herlihy <gre...@pacbell.net> wrote:
> uis.erase( uis.lower_bound(5), uis.upper_bound(9) );
>
> Certainly no other technique can match this one's efficiency, conciseness
> and correctness.

I won't argue with your statement of conciseness and correctness, but
I don't agree regarding efficiency. The reason is simple; to find the
lower bound requires logarithmic time. If removing from right to left,
finding the upper bound requires constant time. It seems to me that
everything beyond that point would be essentially equivalent.

However, this is a nice theory, but it's needs to be demonstrated. I
wrote up a test driver to measure the speed efficiency of both
approaches. I ran my driver using a variety of configurations, and
removing using a reverse_iterator is more efficient every time (see
below for the source code for the test driver).

Now, if this is true, then by using your argument (regarding optimized
use of a particular container), I should be using a reverse_iterator
in my algorithm implementation. Unfortunately, Meyers' preferred
technique for doing so doesn't work (again, his other technique *does*
work), at least not on my compiler (g++ 3.4.2). You say that Meyers'
advice is not flawed, so then it seems to indicate that there is a
problem with g++. Do you agree?

Here's the test driver:

**********************************************************************
#include <set>
#include <iostream>

#include <sys/time.h>

using namespace std;

double
diffMs(
const struct timeval& t1,
const struct timeval& t2)
{
return(1000.0 * (t2.tv_sec - t1.tv_sec) + 0.001 * (t2.tv_usec -
t1.tv_usec));
}

int
main(
int argc,
char** argv)
{

typedef std::set<unsigned int> UIntSet;

UIntSet uis;

static const unsigned int UPPER_BOUND = 10000000U;

cout << "Inserting " << UPPER_BOUND << " elements ..." << endl;

for(unsigned int i = 0U ; i < UPPER_BOUND ; ++i)
{
(void)uis.insert(uis.end(), i);
}

static const unsigned int THRESHOLD = UPPER_BOUND - 5U;

struct timeval t_start;

struct timeval t_stop;

cout << endl << "Erasing elements: " << flush;

(void)gettimeofday(&t_start, static_cast<struct timezone*>(0));

#if 1

for(UIntSet::reverse_iterator ri = uis.rbegin() ;
ri != uis.rend() ;)
{
if((*ri) < THRESHOLD)
{
break;
}

uis.erase(--(ri.base()));
}

#else

uis.erase(uis.lower_bound(THRESHOLD), uis.end());

#endif

(void)gettimeofday(&t_stop, static_cast<struct timezone*>(0));

cout << diffMs(t_start, t_stop) << " (ms)" << endl;

cout << endl << "Program exiting ..." << endl;

return(0);
}

**********************************************************************

Carl Barron

unread,
Jul 25, 2007, 2:41:26 AM7/25/07
to
In article <1185287381....@m3g2000hsh.googlegroups.com>,
irotas <goo...@irotas.net> wrote:

>
> for(UIntSet::reverse_iterator ri = uis.rbegin() ;
> ri != uis.rend() ;)
> {
> if((*ri) < THRESHOLD)
> {
> break;
> }
>
> uis.erase(--(ri.base()));
> }

Safer and equivalent to your code is

UIntSet::reverse_iterator ri = std::find_if
(
uis.rbegin(),
uis,rend(),
std::bind2nd
(
std::less<unsigned int>(),
THRESHOLD
)
);

// you now erase [uis.rbegin(),ri)
// ri.base points to next in forward sequence and uis.rbegin()
// points to r.end() that is you want to erase [ri.base(),uis.end())
// or
uis.erase(ri.base(),uis.end())

This should work with sequence containers of sorted data and
sorted assoc. containers.

I would expect that a vector<unsigned int> that is not sorted to
compare favorably if not surpass greatly the efficiency of set to
test just the erasure process.
std::vector<unsigned int> vec;
// fill vec
vec.erase
(
remove_if(vec.begin(),vec.end(),erase_me()),
vec.end()
);

is even more concise, this involve N compares and # to remove copies
and # to remove dtor calls. The copies and dtors are dirt cheap. so it
is N compares and they are cheap too most likely.

irotas

unread,
Jul 25, 2007, 2:41:05 PM7/25/07
to
On Jul 25, 2:41 am, Carl Barron <cbarron...@adelphia.net> wrote:
> Safer and equivalent to your code is
>
> UIntSet::reverse_iteratorri = std::find_if

> (
> uis.rbegin(),
> uis,rend(),
> std::bind2nd
> (
> std::less<unsigned int>(),
> THRESHOLD
> )
> );
>
> // you now erase [uis.rbegin(),ri)
> // ri.base points to next in forward sequence and uis.rbegin()
> // points to r.end() that is you want to erase [ri.base(),uis.end())
> // or
> uis.erase(ri.base(),uis.end())
>
> This should work with sequence containers of sorted data and
> sorted assoc. containers.


Hi Carl,

Thanks for the tip! I integrated your suggestion into my test driver,
and (of course) it produces the same output as the other two
approaches. However, it is in fact slower, as I'd expect it to be. The
reason is because in my approach I am removing as I'm iterating
backwards through the container, so that I'm finished when I exceed
the threshold. The approach you suggested first iterates backwards
through the container, then goes back over the same elements again in
order to remove them.

Of course, if my approach truly is unsafe, then speed makes no
difference. However, I'm still not convinced that it is in fact
unsafe. I'm using one of the methods proposed by Scott Meyers, it
compiles fine and produces the expected output, and valgrind doesn't
complain. However, if I'm violating some language standard, or if my
code would break on other compilers, then again speed doesn't matter,
and I'd have to adopt a different approach. However, if it's really
true that there's no good way to remove elements from a container
while iterating over it in reverse, that seems to be a severe language
limitation.

Anyway, here's my updated test driver:

**********************************************************************
#include <set>
#include <iostream>

#include <algorithm>
#include <functional>

#include <sys/time.h>

using namespace std;

double
diffMs(
const struct timeval& t1,
const struct timeval& t2)
{
return(1000.0 * (t2.tv_sec - t1.tv_sec) + 0.001 * (t2.tv_usec -
t1.tv_usec));
}

int
main(
int argc,
char** argv)
{

typedef set<unsigned int> UIntSet;

UIntSet uis;

static const unsigned int UPPER_BOUND = 10000000U;

cout << "Inserting " << UPPER_BOUND << " elements ..." << endl;

for(unsigned int i = 0U ; i < UPPER_BOUND ; ++i)
{
(void)uis.insert(uis.end(), i);
}

static const unsigned int THRESHOLD = UPPER_BOUND - 5U;

struct timeval t_start;

struct timeval t_stop;

cout << endl << "Erasing elements: " << flush;

(void)gettimeofday(&t_start, static_cast<struct timezone*>(0));

#if defined(ERASE1)

for(UIntSet::reverse_iterator ri = uis.rbegin() ;
ri != uis.rend() ;)
{
if((*ri) < THRESHOLD)
{
break;
}

uis.erase(--(ri.base()));
}

#elif defined(ERASE2)

uis.erase(uis.lower_bound(THRESHOLD), uis.end());

#elif defined(ERASE3)

UIntSet::reverse_iterator ri = find_if(uis.rbegin(),
uis.rend(),
bind2nd(less<unsigned int>(),
THRESHOLD));

uis.erase(ri.base(), uis.end());

#else

cerr << "Error: No erase techniques defined." << endl;

return(1);

#endif

(void)gettimeofday(&t_stop, static_cast<struct timezone*>(0));

cout << diffMs(t_start, t_stop) << " (ms)" << endl;

#ifdef DEBUG

cout << endl;

for(UIntSet::const_iterator i = uis.begin() ;


i != uis.end() ;
++i)
{
cout << (*i) << endl;
}

cout << endl;

#endif

cout << endl << "Program exiting ..." << endl;

return(0);
}
**********************************************************************


Carl Barron

unread,
Jul 25, 2007, 8:51:28 PM7/25/07
to
In article <1185371762....@g4g2000hsf.googlegroups.com>,
irotas <goo...@irotas.net> wrote:

> However, if it's really
> true that there's no good way to remove elements from a container
> while iterating over it in reverse, that seems to be a severe language
> limitation.

Just use iterators and decrement operations as appropriate:
UIntSet::iterator it(uis.end());
while(it !=uis.begin() && pred(*--it)) uis.erase(it++);

note there is no dereference of iterators pointing to erased items.
and no gyrations needed using reverse_iterator.
// easiest way to get the SFINAE class that tests for
// typedef ... key_type;
#include <boost/mpl/has_xxx.hpp>
BOOST_MPL_HAS_XXX_TRAIT_DEF(key_type)

template <class C,class B=has_key_type<C>::type>
struct erase_struct
{
typedef typename C::iterator iterator;
static iterator exec(C &c,iterator it)
{
// emulate recent drafts
c.erase(it++);
return it;
}
};

template <class C>
struct erase_struct<C,boost::mpl::false_>
{
typedef typename C::iterator iterator;
static iterator exec(C &c,iterator it)
{
return c.erase(it);
}
};

template <class C,class Pred>
void erase_backward_if(C &c,Pred pred)
{
typename C::iterator it(c.end());
while( it != c.begin() && pred(*--it))
erase_struct<C>::exec(c,it);
}

looks generic. but not tested.
note no dereferencing of invalid iterators in all cases.

irotas

unread,
Jul 26, 2007, 12:32:48 PM7/26/07
to
On Jul 25, 8:51 pm, Carl Barron <cbarron...@adelphia.net> wrote:
> Just use iterators and decrement operations as appropriate:
> UIntSet::iterator it(uis.end());
> while(it !=uis.begin() && pred(*--it)) uis.erase(it++);

Thanks again for the suggestion. I integrated this into my test
driver, and again it produces the correct output and valgrind doesn't
complain. However, for some reason that I don't yet know, this
approach is slower than the 3 previous approaches. I'll post the
source code for my updated test driver below.

There's one thing that I'm still not sure of. Is the method using a
reverse_iterator in the ERASE1 approach (see source code below)
actually incorrect? Does it violate the standard? Or is it just clumsy/
awkward?


> // easiest way to get the SFINAE class that tests for
> // typedef ... key_type;

I haven't yet deciphered what all this means and how it applies to my
problem. I'll look at it again more this evening.


Here's the source code for the updated test driver:

#include <sys/time.h>

using namespace std;

UIntSet uis;

struct timeval t_start;

struct timeval t_stop;

#if defined(ERASE1)

uis.erase(--(ri.base()));
}

#elif defined(ERASE2)

uis.erase(uis.lower_bound(THRESHOLD), uis.end());

#elif defined(ERASE3)

uis.erase(ri.base(), uis.end());

#elif defined(ERASE4)

UIntSet::iterator it(uis.end());

while((it !=uis.begin()) && ((*--it) >= THRESHOLD))
{
uis.erase(it++);
}

#else

cerr << "Error: No erase techniques defined." << endl;

return(1);

#endif

(void)gettimeofday(&t_stop, static_cast<struct timezone*>(0));

cout << diffMs(t_start, t_stop) << " (ms)" << endl;

#ifdef DEBUG

cout << endl;

for(UIntSet::const_iterator i = uis.begin() ;
i != uis.end() ;
++i)
{
cout << (*i) << endl;
}

cout << endl;

#endif

cout << endl << "Program exiting ..." << endl;

return(0);
}
**********************************************************************


Carl Barron

unread,
Jul 28, 2007, 5:28:04 AM7/28/07
to
irotas <goo...@irotas.net> wrote:

>
> There's one thing that I'm still not sure of. Is the method using a
> reverse_iterator in the ERASE1 approach (see source code below)
> actually incorrect? Does it violate the standard? Or is it just clumsy/
> awkward?

> ... Quote


for(UIntSet::reverse_iterator ri = uis.rbegin() ;
ri != uis.rend() ;)
{
if((*ri) < THRESHOLD)
{
break;
}

uis.erase(--(ri.base()));
}
> ... end quote

looks ok since the enclosed interator is is not decremented
until after the erase has adjusteed the inards of the std::set, [during
*ri operation, ri. Not 100% certain.

As to why my use of iterator appears to be slower, beats me. As I
would guess that the iterator solution is at least as fast as a reverse
iterator solution, since it does not need the indirection used by
reverse iterator.

ERASE3 loops twice once to find the starting point to erase then in
erase(iterator,iterator) so it is likely to be slower than a single
loop.

<quote>

> // easiest way to get the SFINAE class that tests for
> // typedef ... key_type;

I haven't yet deciphered what all this means and how it applies to my
problem. I'll look at it again more this evening.

</quote>

Most existing compilers do not support iterator set::erase(iterator);
so one must determine how to erase from a general container with a
bidirectional iterator. Sorted associative containers have a nested
typedef for key_type, and sequenntial container do not.
erase in a loop with a sorted assoc. container
[set/mep/multiset/multimap] need to be done like this
erase(it++);
and most sequential containers require
it = erase(it);

so depending on the presence or not of a type C::key_type determines
which to use with generic code. Does this make any sense??
[latest draft does provide iterator erase(iterator) for all standard
provided sequence or sorted assoc. containers, but many compilers might
not yet so provide]

0 new messages