Another Son of Martha: OutOfMemoryException on MemoryStream in Pipeline Component

On one of my projects, I was using a custom pipeline component to decompress a file in a send pipeline (*see side note on design decision at end of post). In the production environment, we began to see pipeline failures caused by mscorlib OutOfMemoryExceptions in the pipeline component. It was happening sporadically, so at first was not clear what was happening, though we knew the issue was occurring with growing frequency, roughly in correlation to the increase in the size of the file (the compressed file was a statement of account balances which grew as the number of accounts grew).

As many BizTalkers do, I use the SharpZipLib library for Zip compression (see Pro BizTalk 2006 by Dunphy and Metwally for a great example of this) and was taking the approach of loading the zipstream into a MemoryStream object. The exception was being thrown in my loop which copied 4K segments into the stream.

After consulting my trusted advisor I saw a few discussions related to the need to provide a contiguous segment of memory for a MemoryStream to work.

I quickly inferred the likely culprit: I was trying to load data into a MemoryStream without the OS knowing how much memory to allocate it, so in many cases it was allocating a contiguous block that would not be enough for the uncompressed file.

Short term fix:

For the time being, I have simply updated my pipeline component to declare the size of the MemoryStream up front based on the zipstream’s Size property which is the number of bytes of the fiie uncompressed.

DANGER: the Size property is a Long, whereas you can only instantiate a MemoryStream with an Int32 (of course from the 2GB memory limit for 32-bit processes). Knowing the file size will not grow beyond 1GB uncompressed, I have squished the long into an int, which of course is terrible. Hence a long-term fix:

Long term fix:

Though I have yet to implement this, the sensible approach is instead to use something like the VirtualStream which will offload data to the filesystem if a stream exceeds a configured size, saving your poor BTSNTSvc.exe

Indeed hardware should never be used to mask up bad code, but it’s interesting to consider that this issue would likely not have arisen if it had been deployed on a 64-bit OS, which we can hopefully encourage all clients to do in the future.

* as a side note, the unzip was done in a send pipeline, in opposition to the usual approach of decompressing a file in a receive pipeline because the contents of the file were not needed. All we wanted to do was route based on the file name, so by delaying the decompression we only needed to load a 16MB compressed file to the messagebox instead of a 700MB uncompressed file.

Another Son of Martha

Thursday, September 10, 2009

OutOfMemoryException on MemoryStream in Pipeline Component

No comments: