Problem with the GZipStream class and small streams

I have a problem using the System.IO.Compression.GZipStream class. I wrote the following methods to compress and decompress arrays of bytes.

    private static byte[] Compress(byte[] array)
    {
        MemoryStream stream = new MemoryStream();
        GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress);
        gZipStream.Write(array, 0, array.Length);
        return stream.ToArray();
    }

    private static byte[] Decompress(byte[] array)
    {
        MemoryStream stream = new MemoryStream();
        GZipStream gZipStream = new GZipStream(new MemoryStream(array), CompressionMode.Decompress);
        byte[] b = new byte[4096];
        while (true)
        {
            int n = gZipStream.Read(b, 0, b.Length);
            if (n > 0)
                stream.Write(b, 0, n);
            else
                break;
        }
        return stream.ToArray();
    }

In the Decompress method, if the array of bytes is small (apparently smaller than 4096 bytes), then the Read method returns 0 regardless of the size of the b buffer. Also, if I replace the while block with the following one and the array of bytes is small, then the ReadByte method returns -1.

    while (true)
    {
        int n = gZipStream.ReadByte();
        if (n != -1)
            stream.WriteByte((byte)n);
        else
            break;
    }

Is this happening because the GZipStream class internally uses a 4 KB buffer? Anyway, how could I solve the problem?

Thank you,
Fabio

Hi,
In the Compress code:

    private static byte[] Compress(byte[] array)
    {
        MemoryStream stream = new MemoryStream();
        GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress);
        gZipStream.Write(array, 0, array.Length);
        return stream.ToArray();
    }

You need to close the GZipStream before reading from the underlying MemoryStream, because the GZip footer is written in GZipStream.Dispose.

Sincerely,
Walter Wang (waw***@online.microsoft.com, remove 'online.')
Microsoft Online Community Support

==================================================
Get notification to my posts through email? Please refer to http://msdn.microsoft.com/subscriptions/managednewsgroups/default.aspx#notifications. If you are using Outlook Express, please make sure you clear the check box "Tools/Options/Read: Get 300 headers at a time" to see your reply promptly.

Note: The MSDN Managed Newsgroup support offering is for non-urgent issues where an initial response from the community or a Microsoft Support Engineer within 1 business day is acceptable. Please note that each follow up response may take approximately 2 business days as the support professional working with you may need further investigation to reach the most efficient resolution. The offering is not appropriate for situations that require urgent, real-time or phone-based interactions or complex project analysis and dump analysis issues. Issues of this nature are best handled working with a dedicated Microsoft Support Engineer by contacting Microsoft Customer Support Services (CSS) at http://msdn.microsoft.com/subscriptions/support/default.aspx.
==================================================
This posting is provided "AS IS" with no warranties, and confers no rights.

Thank you, Walter. That solves the problem indeed. Still, I cannot explain the different behavior with larger streams.

Hi Fabio,
Thanks for the update. I've run some tests using large buffers; however, they all show incorrect results if the GZipStream is not closed before reading the underlying MemoryStream. Can you help me confirm the behavior on your side? Thanks.

    static void Main(string[] args)
    {
        TestBySize(10);
        TestBySize(4095);
        TestBySize(4096);
        TestBySize(409700);
    }

    private static void TestBySize(int C)
    {
        byte[] buf1 = new byte[C];
        for (int i = 0; i < C; i++)
        {
            buf1[i] = (byte) i;
        }
        byte[] buf2 = Compress(buf1);
        byte[] buf3 = Decompress(buf2);
        Console.WriteLine(CompareBuffer(buf1, buf3));
    }

    private static bool CompareBuffer(byte[] buf1, byte[] buf2)
    {
        if (buf1 == null || buf2 == null) return false;
        if (buf1.Length != buf2.Length) return false;
        for (int i = 0; i < buf1.Length; i++)
        {
            if (buf1[i] != buf2[i]) return false;
        }
        return true;
    }

    private static byte[] Compress(byte[] array)
    {
        MemoryStream stream = new MemoryStream();
        GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress);
        gZipStream.Write(array, 0, array.Length);
        return stream.ToArray();
    }

    private static byte[] Decompress(byte[] array)
    {
        MemoryStream stream = new MemoryStream();
        GZipStream gZipStream = new GZipStream(new MemoryStream(array), CompressionMode.Decompress);
        byte[] b = new byte[4096];
        while (true)
        {
            int n = gZipStream.Read(b, 0, b.Length);
            if (n > 0)
                stream.Write(b, 0, n);
            else
                break;
        }
        return stream.ToArray();
    }

Regards,
Walter Wang (waw***@online.microsoft.com, remove 'online.')
Microsoft Online Community Support

==================================================
When responding to posts, please "Reply to Group" via your newsreader so that others may learn and benefit from your issue.
==================================================
This posting is provided "AS IS" with no warranties, and confers no rights.

Hi Fabio,
I am interested in this issue. Would you mind letting me know the result of the suggestions? If you need further assistance, feel free to let me know. I will be more than happy to be of assistance.

Have a great day!

Regards,
Walter Wang

Hi Walter,
Sorry for my delayed response. I have been training over the last few days, and only today did I find some time to investigate this issue further.

I first noticed this behavior while developing the following sample application.

http://fabioscagliola.spaces.live.com/blog/cns!919F8FCDE3DC9AC4!160.entry

In the beginning I was not closing the GZipStream in my Compress method as you suggested. Nonetheless, the application was able to correctly handle files larger than 32576 bytes (I thought the threshold was 4096, guessing that the GZipStream class internally uses a 4 KB buffer, but I was wrong).

Running your test code I get the same results as you: if I close the GZipStream in my Compress method as you suggested, then your CompareBuffer method always returns true; if I do not close it, then your CompareBuffer method always returns false.

However, the reasons why your CompareBuffer method returns false when I do not close the GZipStream differ depending on the size of the array of bytes. Here is what I found out.

(1) If the size of the array is 32575 bytes or less (852 compressed), then the Read method of the GZipStream fails.

(2) If the size of the array is 32576 bytes or more (854 compressed), then the array that has been compressed and then decompressed is one byte larger than the original one, BUT, except for the last byte, the contents of the two arrays are identical.

Please give the following code a try. I still cannot explain the different behavior.

    using System;
    using System.IO;
    using System.IO.Compression;

    public class ConsoleApplication
    {
        private static byte[] Compress(byte[] array)
        {
            MemoryStream stream = new MemoryStream();
            GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress);
            gZipStream.Write(array, 0, array.Length);
            //gZipStream.Close();
            return stream.ToArray();
        }

        private static byte[] Decompress(byte[] array)
        {
            MemoryStream stream = new MemoryStream();
            GZipStream gZipStream = new GZipStream(new MemoryStream(array), CompressionMode.Decompress);
            byte[] b = new byte[4096];
            while (true)
            {
                int n = gZipStream.Read(b, 0, b.Length);
                if (n > 0)
                    stream.Write(b, 0, n);
                else
                {
                    if (stream.Length == 0)
                        throw new Exception("Cannot read from GZipStream.");
                    else
                        break;
                }
            }
            gZipStream.Close();
            return stream.ToArray();
        }

        public static void Test(int size)
        {
            try
            {
                Console.WriteLine(string.Format("Using a {0} bytes array...", size));
                byte[] b1 = new byte[size];
                for (int i = 0; i < size; i++)
                    b1[i] = (byte)i;
                Console.WriteLine("Compressing...");
                byte[] b2 = Compress(b1);
                Console.WriteLine("Decompressing...");
                byte[] b3 = Decompress(b2);
                Console.WriteLine("Done.");
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }

        public static void Main()
        {
            Test(32574);
            Test(32575);
            Test(32576);
            Test(32577);
        }
    }

I thank you very much and wish you a great day too!

Regards,
Fabio

Hi Fabio,
Thank you very much for following up.

Based on my understanding, your current question is this: when the GZipStream is not closed in Compress(), the decompressed MemoryStream in Decompress() sometimes has data and sometimes doesn't, and you want to know why.

Well, I think it's related to the GZip compression algorithm and the internal implementation details of the .NET class library. In my opinion, when we don't close the GZipStream before getting the underlying MemoryStream's content, the content passed to Decompress() is already invalid GZip content. This means the data returned from Decompress() is unpredictable: it could be zero-length, or it could be otherwise incomplete. In either case, the resulting data is wrong if you use my CompareBuffer() method to compare it with the original buffer before compression.

I hope I didn't misunderstand your question and that my answer makes sense. Please let me know whether or not you need further information. Thanks.

Regards,
Walter Wang

Hi Walter,
You understood my question perfectly (by the way, sorry that I did not formulate it at all :) and your answer definitely makes sense. After all, it was just my curiosity, because, as you pointed out from the beginning, the correct way to handle compression and decompression is to close the stream, which I had forgotten in the first implementation of my methods.

Thank you,
Fabio
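
For readers landing on this thread, here is a minimal sketch of the fix discussed above: the GZipStream is disposed before the underlying MemoryStream is read, so the remaining compressed data and the GZip footer are written first. The class name GZipRoundTrip and the helper RoundTrips are hypothetical names chosen for illustration, not code from the original posts, and the test sizes are simply the ones mentioned in the thread.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical illustration class; not part of the original posts.
public static class GZipRoundTrip
{
    public static byte[] Compress(byte[] array)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            // Disposing the GZipStream flushes pending data and writes the footer.
            using (GZipStream gZipStream = new GZipStream(stream, CompressionMode.Compress))
            {
                gZipStream.Write(array, 0, array.Length);
            }
            // MemoryStream.ToArray still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    public static byte[] Decompress(byte[] array)
    {
        using (MemoryStream input = new MemoryStream(array))
        using (GZipStream gZipStream = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int n;
            // Read returns 0 once the end of the decompressed data is reached.
            while ((n = gZipStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, n);
            }
            return output.ToArray();
        }
    }

    // Compresses and decompresses a test pattern, then compares byte by byte.
    public static bool RoundTrips(int size)
    {
        byte[] original = new byte[size];
        for (int i = 0; i < size; i++)
            original[i] = (byte)i;

        byte[] restored = Decompress(Compress(original));
        if (restored.Length != original.Length)
            return false;
        for (int i = 0; i < size; i++)
        {
            if (restored[i] != original[i])
                return false;
        }
        return true;
    }

    public static void Main()
    {
        // Sizes taken from the thread, on both sides of the reported thresholds.
        int[] sizes = { 10, 4095, 4096, 32575, 32576, 409700 };
        foreach (int size in sizes)
            Console.WriteLine("{0} bytes: {1}", size, RoundTrips(size));
    }
}
```

With the GZipStream disposed before ToArray is called, every size should round-trip successfully. The using blocks also ensure the streams are closed if Write or Read throws, which an explicit Close() call at the end of the method would not.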