writing an advanced filestream class

I am trying to create a stream that writes text to a file and:

- automatically creates a new file once the current file exceeds a certain size
- can be used by multiple threads and/or processes, so that multiple threads/processes can write to the same file (all threads use the same instance of the stream; processes use different instances that may still point to the same file)

Could you point me in the right direction on how this would ideally be implemented? What class should I derive from? How would I solve the problem of multiple threads accessing the same stream, or multiple processes accessing the same file?

Additional information that might be helpful: the stream will be used with a TraceListener (passed in the ctor of the TraceListener).

First; note that my example only covers in-process (shared instance)
usage. For multi-process, you would presumably have to open/close the file each time (possibly open it in shared write mode?) and use a Mutex to sync access between all processes instead of a lock.

I also have "issues" with multiple processes writing in binary (interleaved) to a single file; this could lead to partial characters getting written to a file / files; e.g. a multi-byte character (which means "most of them" unless you are limiting yourself to ASCII or a single-byte codepage) gets its first byte written; another thread/process then adds some data; then the rest of the character is written, possibly (disjointed) to the same file, possibly to a different file. Either way you just knackered the encoding good'n'proper. You would certainly be splitting up words / sentences.

Perhaps you should be performing this function at the StreamWriter / TextWriter level, so that characters (or better: strings) are written in their entirety. You should also probably not split a string between files (hard to read), so I'd allow it to overflow the capped limit as needed, and then start a new file. Unfortunately, when writing bytes at the stream level, you can't guarantee that the current buffer represents complete characters (they could legitimately arrive a byte at a time), and without a *lot* of inspection (and knowledge of encodings) it would be hard to keep integrity at the stream level.

Anyway, "as presented", I would derive from Stream, and encapsulate (contain) a FileStream; something like below; note the main code is the Write method; this syncs all access to one thread at a time, and works in a loop, writing as much as we can from the input buffer to each successive file until we run out of data. I haven't tested it at all - but something along those lines may be close.

Again: If you need strings to remain intact in files, then you may want to write a StreamWriter instead.
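
The writer-level alternative in that last remark could look roughly like the sketch below. It is only an illustration under assumptions, not code from the thread: the class name RollingTextWriter, the numbered .log file names and the UTF-8 encoding are all made up for the example, and it covers the shared-instance, single-process case only. Each string is written whole; the current file is allowed to overflow the cap and the next write starts a new file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical class for illustration; not from the original post.
public sealed class RollingTextWriter : TextWriter
{
    private readonly object syncLock = new object();
    private readonly string directory;   // folder that receives 0.log, 1.log, ... (assumed naming)
    private readonly long maxFileBytes;  // soft cap: one string may overflow it
    private readonly Encoding encoding = new UTF8Encoding(false); // assumed: UTF-8 without BOM
    private StreamWriter current;
    private long currentBytes;
    private int fileCounter;

    public RollingTextWriter(string directory, long maxFileBytes)
    {
        if (maxFileBytes <= 0) throw new ArgumentOutOfRangeException("maxFileBytes");
        this.directory = directory;
        this.maxFileBytes = maxFileBytes;
    }

    public override Encoding Encoding { get { return encoding; } }

    // The overrides below funnel into this method, so each string passed in
    // is written whole, under the lock, to exactly one file.
    public override void Write(string value)
    {
        if (string.IsNullOrEmpty(value)) return;
        lock (syncLock)
        {
            if (current == null || currentBytes >= maxFileBytes) OpenNextFile();
            current.Write(value);
            currentBytes += encoding.GetByteCount(value);
        }
    }

    public override void Write(char value) { Write(value.ToString()); }

    public override void Write(char[] buffer, int index, int count)
    {
        Write(new string(buffer, index, count));
    }

    // Append the newline first so the text and its terminator go out together.
    public override void WriteLine(string value) { Write(value + NewLine); }

    public override void Flush()
    {
        lock (syncLock) { if (current != null) current.Flush(); }
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            lock (syncLock)
            {
                if (current != null) { current.Dispose(); current = null; }
            }
        }
        base.Dispose(disposing);
    }

    // Closes the current file and creates the next unused numbered file.
    private void OpenNextFile()
    {
        if (current != null) { current.Dispose(); current = null; }
        string file;
        do { file = Path.Combine(directory, fileCounter++ + ".log"); }
        while (File.Exists(file));
        current = new StreamWriter(new FileStream(file, FileMode.CreateNew), encoding);
        currentBytes = 0;
    }
}
```

Since TraceListener-style consumers usually hand over whole messages, locking per string rather than per byte buffer is what keeps messages from different threads from being interleaved mid-character.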
Marc

using System;
using System.IO;

public class MultiFileStream : Stream
{
    private Stream current;
    private long totalLength;
    private readonly string path;          // directory that receives the files 0, 1, 2, ...
    private readonly int maxFileLength;
    private int currentSpace;              // bytes still free in the current file
    private readonly object SyncLock = new object();
    private int fileCounter;

    public override bool CanRead { get { return false; } }   // write-only stream
    public override bool CanSeek { get { return false; } }   // write to end only
    public override bool CanWrite { get { return true; } }

    public override void Flush()
    {
        if (current != null) current.Flush();
    }

    public override long Length { get { return totalLength; } }

    public override long Position
    {
        get { return totalLength; }
        set { if (value != Position) throw new NotSupportedException(); }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing && current != null)
        {
            current.Dispose();
            current = null;
        }
        base.Dispose(disposing);
    }

    public override void Close()
    {
        if (current != null)
        {
            current.Close();
            current = null;
        }
        base.Close();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        lock (SyncLock)
        {
            // Fill the current file, then move on to the next one until the buffer is used up.
            while (count > 0)
            {
                if (current == null || currentSpace == 0) GetNextFile();
                int writeThisPass = currentSpace < count ? currentSpace : count;
                current.Write(buffer, offset, writeThisPass);
                offset += writeThisPass;
                count -= writeThisPass;
                totalLength += writeThisPass;
                currentSpace -= writeThisPass;   // track remaining room so a full file triggers the next one
            }
        }
    }

    private void GetNextFile()
    {
        if (current != null)
        {
            current.Close();
            current = null;
        }
        // Skip over file names that already exist.
        while (File.Exists(Path.Combine(path, fileCounter.ToString())))
        {
            fileCounter++;
        }
        current = File.Create(Path.Combine(path, fileCounter.ToString()));
        fileCounter++;
        currentSpace = maxFileLength;
    }

    public MultiFileStream(string path, int maxFileLength)
    {
        // A non-positive limit would make Write create empty files forever.
        if (maxFileLength <= 0) throw new ArgumentOutOfRangeException("maxFileLength");
        this.path = path;
        this.maxFileLength = maxFileLength;
        currentSpace = 0;
    }
}

Forgot to add; for multi-process you must expect the size to change
without notice... so you'd probably drop both the current and currentSpace fields and check them on the fly within the Write method. Lots of opening and closing :-(

Alternatively - you could perhaps use remoting so all processes talk to a single instance? Of course, then it has to remote the buffer...

Marc

On 25 Oct 2006 00:32:15 -0700, Marc Gravell wrote:
> protected override void Dispose(bool disposing)
> {
>     if (disposing && current!=null)
>     {
>         current.Dispose();
>         current = null;
>     }
>     base.Dispose(disposing);
> }

Is a lock not needed in your Dispose method? What if this method is called while another thread is creating a new stream? What if you dispose a stream which is currently being written to?

Possibly so; however, I would be reasonably happy for it to go bang in
this scenario, as they *really* shouldn't be individually disposing this stream, since they don't own it!

A better option would be for me to have included an "isDisposed" field, and barf if *anything* happens after Dispose() [except Dispose()].

If the caller is using e.g. a Writer that rudely insists on Dispose()ing the base stream when itself disposed, then this can be managed; I believe Jon Skeet has a non-closing stream example in his bag of tricks on his site.

Marc

Marc Gravell <marc.grav***@gmail.com> wrote:
> Possibly so; however, I would be reasonably happy for it to go bang in
> this scenario, as they *really* shouldn't be individually disposing
> this stream, since they don't own it!
>
> A better option would be for me to have included an "isDisposed" field,
> and barf if *anything* happens after Dispose() [except Dispose()].
>
> If the caller is using e.g. a Writer that rudely insists on
> Dispose()ing the base stream when itself disposed, then this can be
> managed; I believe Jon Skeet has a non-closing stream example in his
> bag of tricks on his site.

Yup - it's in MiscUtil:
http://www.pobox.com/~skeet/csharp/miscutil

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Hi bonk,
As I understand your requirement, there are two key problems you want to solve:

1. Write to a file from multiple threads in the same process
2. Write to a file from multiple threads in multiple processes

The first problem can be solved quite easily. The second may need some re-thinking on your end about whether the approach itself is correct. I say so because writing to the same file from multiple processes without any mediating process is difficult to manage. The main issue is synchronizing access to the file across processes so that the final content doesn't end up as garbage because data from different processes got mixed up. And synchronizing actions across multiple processes will undoubtedly and severely degrade the performance of the application. If you want to use the same file name to write content, you could instead create one file per process, named in the following format:

<filename>_<process-id>.<ext>

Anyway, coming to the first problem (write to a file from multiple threads in the same process), here too you need to synchronize the writing of data to your FileStream class from multiple threads so that the data won't get mixed up. For this you can lock on the FileStream object representing your file inside your "Write" method.

This technique uses locks and may still pose performance issues. I have created some free C# libraries that can assist in performing such operations without causing performance degradation. Take a look at my recent library that I posted on CodeProject.com:

http://www.codeproject.com/cs/library/asynchronouscodeblocks.asp

Using this library you can implement the Write method of your FileStream as shown below (pseudocode only).

class AdvancedFileStream
{
    int AdvancedFileStream::Write(byte[] data,....)
    {
        new async(delegate { this.InternalWrite(data,...) }, _myThreadPool);
    }

    Sonic.Net.ThreadPool _myThreadPool = new Sonic.Net.ThreadPool(1,1);
}

You can use one ThreadPool (defined in my library) with a maximum of one thread and one concurrent thread to do all the writing of data to the underlying file stream represented by the AdvancedFileStream object. When multiple threads call into Write, the method posts a delegate to _myThreadPool and returns immediately to the calling code. Later the delegate is executed on the ThreadPool thread dedicated to this instance of AdvancedFileStream. Since there is only one thread in this ThreadPool, only one write request is handled at any given point in time, which preserves the sequence of the writes to the file from multiple threads.

There is one issue that you need to be aware of in this approach. The byte array supplied to the AdvancedFileStream::Write method should not be re-used by the caller, as we do not know when the ThreadPool thread gets a chance to perform the actual write. There are ways to overcome this. I leave it to you, as you can figure out several options after reading my article on ACB.

Hope this helps.

--
Regards,
Aditya.P

"bonk" wrote:
> I am trying to create a stream that writes text to a file and:
>
> - automatically creates a new file once the current file exceeds a
> certain size
> - makes it possible to be used by multiple threads and/or processes so
> that multiple threads/processes can write to the same file (all threads
> use the same instance of the stream, processes use a different instance
> but still may point to the same file)
>
> Could you point me in the correct direction how this would ideally be
> implemented? What class should I drive from? How would I solve the
> problem of multiple threads acessing the same stream or multiple
> processes acess the same file?
>
> Additional information that might be helpful: The stream will be used
> with TraceListener (passed in the ctor of the TraceListener).
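
The single-writer idea in the reply above can also be tried with only framework types. The following is a rough sketch under assumptions, not the library's API: it swaps in BlockingCollection and one dedicated thread for the library's ThreadPool, and QueuedWriter is a hypothetical name. It copies the caller's bytes before queuing them, which is one possible way around the buffer re-use issue mentioned above.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Hypothetical class for illustration; it does not use or reproduce the ACB library.
public sealed class QueuedWriter : IDisposable
{
    private readonly BlockingCollection<byte[]> queue = new BlockingCollection<byte[]>();
    private readonly Stream target;   // owned by the caller; not disposed here
    private readonly Thread worker;

    public QueuedWriter(Stream target)
    {
        this.target = target;
        worker = new Thread(Drain) { IsBackground = true };
        worker.Start();
    }

    // Returns as soon as the data is queued. The bytes are copied first,
    // so the caller is free to re-use its buffer immediately.
    public void Write(byte[] buffer, int offset, int count)
    {
        var copy = new byte[count];
        Buffer.BlockCopy(buffer, offset, copy, 0, count);
        queue.Add(copy);
    }

    // Runs on the single worker thread, so chunks reach the target one at
    // a time, in the order they were queued. An exception thrown by the
    // target is not handled in this sketch.
    private void Drain()
    {
        foreach (byte[] chunk in queue.GetConsumingEnumerable())
        {
            target.Write(chunk, 0, chunk.Length);
        }
        target.Flush();
    }

    // Stops accepting data and waits until everything queued has been written.
    public void Dispose()
    {
        queue.CompleteAdding();
        worker.Join();
        queue.Dispose();
    }
}
```

The copy costs an allocation per call; whether that is cheaper than a lock around the write depends on the workload, so it is worth measuring before choosing.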
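
For the multi-process case discussed in the thread (open and close the file on every write, check its size on the fly, and use a Mutex instead of a lock), a minimal sketch might look like this. SharedLog, the numbered .log file names and the UTF-8 encoding are assumptions for illustration; every process must pass the same mutexName, and on Windows a "Global\" prefix may be needed to share the mutex across sessions.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical helper for illustration; not from the thread.
public static class SharedLog
{
    public static void Append(string directory, long maxFileBytes, string mutexName, string text)
    {
        // A named mutex is visible to other processes that use the same name.
        using (var mutex = new Mutex(false, mutexName))
        {
            bool owned = false;
            try
            {
                try { owned = mutex.WaitOne(); }
                // A previous owner exited without releasing; this thread now owns the mutex.
                catch (AbandonedMutexException) { owned = true; }

                // No cached state: the target file and its size are re-checked on every call.
                string file = PickFile(directory, maxFileBytes);
                using (var stream = new FileStream(file, FileMode.Append, FileAccess.Write, FileShare.Read))
                using (var writer = new StreamWriter(stream, new UTF8Encoding(false)))
                {
                    writer.Write(text);   // the whole string goes into one file
                }
            }
            finally
            {
                if (owned) mutex.ReleaseMutex();
            }
        }
    }

    // Returns the first numbered file that is missing or still below the cap.
    private static string PickFile(string directory, long maxFileBytes)
    {
        for (int i = 0; ; i++)
        {
            string file = Path.Combine(directory, i + ".log");
            var info = new FileInfo(file);
            if (!info.Exists || info.Length < maxFileBytes) return file;
        }
    }
}
```

This is the "lots of opening and closing" trade-off in concrete form: every call pays for a mutex wait, a directory scan and a file open, in exchange for never holding state that another process could invalidate.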