TITLE: performance of ostrstreams

(Newsgroups: comp.lang.c++.moderated, 24 Jun 97)

LIEBSTER: T. Daniel Liebster

> I have a program that does a lot of calculations and then the results
> are either output to a text file, or to a string in memory for display
> in a text window.  ...
> 
> Why does this function take orders of magnitude more time to complete
> when it is passed an ostrstream than when it is passed an ofstream?


[ Steve Clamage is chairman of the X3J16 C++ standardization
  committe. -adc ]


CLAMAGE: Steve Clamage <stephen.clamage@Eng.Sun.COM>

When you write to an expandable ostrstream, the buffer gets
reallocated and copied whenever it needs to be expanded.
Exactly how much space get allocated is up to the implementation,
and you might find some documentation on that point. If the
implementation allocates a minimum amount of space, which is
probably happening in your case, you get a buffer reallocation
and copy on every output. If you write 10 characters 10
times, and the increment is 10 characters each time, the
sequence at successive expansions is

	allocate buffer of  10
	allocate buffer of  20, copy 10, delete original
	allocate buffer of  30, copy 20, delete original
	allocate buffer of  40, copy 30, delete original
	allocate buffer of  50, copy 40, delete original
	allocate buffer of  60, copy 50, delete original
	allocate buffer of  70, copy 60, delete original
	allocate buffer of  80, copy 70, delete original
	allocate buffer of  90, copy 80, delete original
	allocate buffer of 100, copy 90, delete original

BUCK: jbuck@synopsys.com (Joe Buck)

[snip]

>Any implementation that behaves this way (extending the buffer by a
>constant increment) is deeply flawed.  This gives quadradic performance
>for N insertions (thus if this algorithm were used to implement the vector
>template, it would be a violation of the standard, since linear time
>is required on average).
>
>The implementation should (must, in the case of the STL)
>increase the allocation proportionately to the current size
>(one way is to double it each time).

[snip]

CLAMAGE:

That is where your time is going. (In principle, the
strstreambuf could use realloc and hope to extend the buffer
in place, but there isn't any way to predict whether the
buffer will be extended in place. If it can't be extended,
realloc performs the above operations anyway.)

Deciding how much to use for the default allocation is no-win
situation for the library implementor. A big allocation
is best for a program having only one ostrstream which might
grow large. A small allocation is best for a program that
creates many ostrstreams which are all modest in size. The
library provides a mechanism for you to specify larger
increments, so a small default allocation makes the most
sense, given that the library implementor doesn't know what
the usage patterns will be.

You have two simple options. 

1. If you have a bounded amount of output, supply your own
buffer to a non-expandable ostrstream:
	char mybuf[BIG_ENOUGH]; // or put it on the heap
	ostrstream os(mybuf, BIG_ENOUGH);
The ostrstream will never overwrite the buffer, but will
go into a bad state and truncate output if reaches the end.
This isn't ideal, but isn't catastrophic. If you can't
realistically predict the amount of output, use option 2.

2. The setbuf member function of the strstreambuf allows you
to set the minimum re-allocation size. Example:
	ostrstream os(); // expandable ostrstream
	os.rdbuf()->setbuf(0, SIZE);
The next time the the buffer needs to be expanded, at least
SIZE bytes will be added to the buffer size. Implementations may
vary in whether this increment is "sticky". Usually it applies
only to the next allocation. There isn't a way to specify the
initial allocation size.

The draft standard does not define the action of setbuf for
strstreambufs. With an implementation that supports new-sytle
iostreams, you would probably want to use stringstreams instead
of strstreams.

