Reading and writing a large binary file fails

I am reading a 240MB+ binary file, performing some changes, and writing it back out. For now I have removed the code that performs the changes, so in its simplest form it just reads a large binary file and writes it back out.

After about 4MB I receive an exception:

    The output char buffer is too small to contain the decoded characters, encoding 'Unicode' fallback 'System.Text.DecoderReplacementFallback'.
    Parameter name: chars

I really can't figure this one out. The code is below (I put the flushes in hoping that was the problem):

    try
    {
        //Open the source file
        sourcestream = SourceFile.Open(FileMode.Open, FileAccess.Read, FileShare.None);
        //open the output file
        targetstream = TargetFile.Open(FileMode.CreateNew, FileAccess.Write, FileShare.None);
        breader = new BinaryReader(sourcestream);
        bwriter = new BinaryWriter(targetstream);
        for (long i = 0; i < SourceFile.Length; i++)
        {
            bwriter.Write(breader.Read());
            if (i % 2000 == 0)
            {
                bwriter.Flush();
                targetstream.Flush();
            }
        }
        breader.Close();
        sourcestream.Close();
        bwriter.Close();
        targetstream.Close();
        //Delete the original source file
        SourceFile.Delete();

Any help would be greatly appreciated.

Regards,
Pete.

Hello, TrinityPete!
T> I am reading a 240MB+ binary file performing some changes and writing it
T> back out. For now I have removed the code that performs changes so in
T> its simplistic form reading a large binary and then writing it back out.

T> After about 4MB I receive an exception:

T> The output char buffer is too small to contain the decoded characters,
T> encoding 'Unicode' fallback 'System.Text.DecoderReplacementFallback'.
T> Parameter name: chars

On what operation did you get an exception?
breader.Read() or bwriter.Write()? Maybe on the other?

Looks like it was on the read.
Just been messing some more with the code, and if I change

    bwriter.Write(breader.Read());

to

    bwriter.Write(breader.ReadByte());

it works fine. It still doesn't help in understanding what's happening with the original statement.

Full stack trace:

    at System.Text.Encoding.ThrowCharsOverflow()
    at System.Text.Encoding.ThrowCharsOverflow(DecoderNLS decoder, Boolean nothingDecoded)
    at System.Text.UTF8Encoding.GetChars(Byte* bytes, Int32 byteCount, Char* chars, Int32 charCount, DecoderNLS baseDecoder)
    at System.Text.DecoderNLS.GetChars(Byte* bytes, Int32 byteCount, Char* chars, Int32 charCount, Boolean flush)
    at System.Text.DecoderNLS.GetChars(Byte[] bytes, Int32 byteIndex, Int32 byteCount, Char[] chars, Int32 charIndex, Boolean flush)
    at System.Text.DecoderNLS.GetChars(Byte[] bytes, Int32 byteIndex, Int32 byteCount, Char[] chars, Int32 charIndex)
    at System.IO.BinaryReader.InternalReadOneChar()
    at System.IO.BinaryReader.Read()
    at TCS.Utilities.TCSDirMonClasses.tcsDirMonitor.MoveFile(FileInfo SourceFile, FileInfo TargetFile) in D:\DOTNETDEV VS2005\TCS.Utilities.TCSDirMon\TCS.Utilities.TCSDirMon\TCS.Utilities.TCSDirMonClasses\TCSDirMonClasses.cs:line 797

Pete.

"Vadym Stetsyak" wrote:
> Hello, TrinityPete!
>
> T> I am reading a 240MB+ binary file performing some changes and writing it
> T> back out. For now I have removed the code that performs changes so in
> T> its simplistic form reading a large binary and then writing it back out.
>
> T> After about 4MB I receive an exception:
>
> T> The output char buffer is too small to contain the decoded characters,
> T> encoding 'Unicode' fallback 'System.Text.DecoderReplacementFallback'.
> T> Parameter name: chars
>
> On what operation did you get an exception?
> breader.Read() or bwriter.Write()? Maybe on the other?
>
> --
> Regards, Vadym Stetsyak
> www: http://vadmyst.blogspot

TrinityPete <TrinityP***@discussions.microsoft.com> wrote:
> I am reading a 240MB+ binary file performing some changes and writing it
> back out.

So it's a binary file - and doesn't contain meaningful text.

> bwriter.Write(breader.Read());

The documentation for BinaryReader.Read() states:

---8<---
Reads characters from the underlying stream and advances the current
position of the stream in accordance with the Encoding used and the
specific character being read from the stream.
--->8---

This reads *characters* according to the encoding associated with the
BinaryReader (which defaults to UTF8Encoding).

Having read a character (which may be more than one byte due to UTF8
being a multibyte encoding), you write to the BinaryWriter.Write(Int32)
overload, which writes out exactly 4 bytes corresponding to an int.

As near as I can make out, you should be using something more like:

---8<---
bwriter.Write(breader.ReadInt32());
--->8---

Don't forget that you can open an existing file and seek within it, and
make changes in place - you won't be able to insert easily, though.

BTW: To reduce fragmentation, you may want to extend your target file by
calling SetLength on the output stream before doing your stream-based
editing. If you don't, Windows will end up being too optimistic and try
to squeeze the increasingly long file in all the fragmented bits of free
space on your drive. I've seen files with >1000 fragments easily created
because of this, taking a significant first-time-read hit next time
they're accessed.

-- Barry

Thanks Barry, that seems to make sense now - if I change the Read() to
ReadByte() it works OK. Thank you.

Pete.

"Barry Kelly" wrote:
> TrinityPete <TrinityP***@discussions.microsoft.com> wrote:
>
> > I am reading a 240MB+ binary file performing some changes and writing it
> > back out.
>
> So it's a binary file - and doesn't contain meaningful text.
>
> > bwriter.Write(breader.Read());
>
> The documentation for BinaryReader.Read() states:
>
> ---8<---
> Reads characters from the underlying stream and advances the current
> position of the stream in accordance with the Encoding used and the
> specific character being read from the stream.
> --->8---
>
> This reads *characters* according to the encoding associated with the
> BinaryReader (which defaults to UTF8Encoding).
>
> Having read a character (which may be more than one byte due to UTF8
> being a multibyte encoding), you write to the BinaryWriter.Write(Int32)
> overload, which writes out exactly 4 bytes corresponding to an int.
>
> As near as I can make out, you should be using something more like:
>
> ---8<---
> bwriter.Write(breader.ReadInt32());
> --->8---
>
> Don't forget that you can open an existing file and seek within it, and
> make changes in place - you won't be able to insert easily, though.
>
> BTW: To reduce fragmentation, you may want to extend your target file by
> calling SetLength on the output stream before doing your stream-based
> editing. If you don't, Windows will end up being too optimistic and try
> to squeeze the increasingly long file in all the fragmented bits of free
> space on your drive. I've seen files with >1000 fragments easily created
> because of this, taking a significant first-time-read hit next time
> they're accessed.
>
> -- Barry
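
For anyone landing on this thread later: the two suggestions above (read raw bytes rather than characters, and pre-size the target with SetLength) can be combined into a chunked copy. What follows is only a minimal sketch, not code from any of the posters. The class name BinaryCopy, the method name CopyFile, and the buffer size parameter are hypothetical, and it assumes the output has the same length as the input, as in the stripped-down copy in the original post.

```csharp
using System;
using System.IO;

// Hypothetical helper (not from the thread): copies a file as raw bytes.
static class BinaryCopy
{
    // Returns the number of bytes copied. bufferSize is an arbitrary choice.
    public static long CopyFile(string sourcePath, string targetPath, int bufferSize)
    {
        using (FileStream source = new FileStream(sourcePath, FileMode.Open, FileAccess.Read, FileShare.None))
        using (FileStream target = new FileStream(targetPath, FileMode.CreateNew, FileAccess.Write, FileShare.None))
        {
            // Pre-size the target, as suggested above, so the file system
            // can allocate the space up front. Assumes output length == input length.
            target.SetLength(source.Length);

            byte[] buffer = new byte[bufferSize];
            long total = 0;
            int read;

            // Stream.Read works on bytes, so no text decoding is involved,
            // unlike BinaryReader.Read(), which decodes characters.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Any in-place changes to buffer[0 .. read-1] would go here.
                target.Write(buffer, 0, read);
                total += read;
            }
            return total;
        }
    }
}
```

A call such as BinaryCopy.CopyFile(SourceFile.FullName, TargetFile.FullName, 65536) would replace the byte-by-byte loop. Reading in blocks also avoids one method call per byte, which matters for a 240MB file. If the edits change the file length, the SetLength call would need the final size instead.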