Global data concurrent access?

I have a global structure in my application. The data within it is read by some reader threads. There are about 10 new threads created per second. Threads do some processing and die. All threads process based on data read from the global structure.

This global structure is populated from a set of files at application startup, and those files can be modified at run-time. There is also the option to apply the modified files at run-time. At the time the modified files are read into the application, I need to block access to this global structure from the 'reader' threads. Threads do not modify this structure.

What is the best/most efficient way to achieve this task - lock / mutex / wait - and how can this be done? Any ideas are welcome.

Thanks n Regards,
Kunal

"Kunal" <koolku***@gmail.com> wrote
> There are about 10 new threads created per second. Threads do some
> processing and die. All threads process based on data read from the
> global structure.

You want to abandon this architecture as quickly as you can. The cost of creating and destroying threads is very high, and at 10 threads per second, your application is spending 95% of its time creating threads, and 5% doing the work you want done.

You should:

1 - Use the System ThreadPool. If your work isn't I/O related (hitting SQL, making a web service call, reading/writing to a file) this is what you want to do. If your work is I/O centric, don't do this.

2 - Create your threads, but keep them around. When a thread is done doing its work, have it go look in a "Work queue" for more work to do. This is a good way to go in general.

> This global structure is populated from a set of files at application
> startup, which can be modified at run-time. There is also the option to
> apply the modified files at run-time. At the time the modified files
> are read into the application, I need to block access to this global
> structure from the 'reader' threads. Threads do not modify this
> structure.

The pattern you're looking for is a ReaderWriterLock. In my other post, I used a Monitor (which in C# looks like 'lock()'). A ReaderWriterLock is very similar in terms of methodology though.

Chris Mullins [MVP] <cmull***@yahoo.com> wrote:
> "Kunal" <koolku***@gmail.com> wrote
> > There are about 10 new threads created per second. Threads do some
> > processing and die. All threads process based on data read from the
> > global structure.
>
> You want to abandon this architecture as quickly as you can. The cost of
> creating and destroying threads is very high, and at 10 threads per second,
> your application is spending 95% of its time creating threads, and 5%
> doing the work you want done.

While I agree that creating a lot of threads is reasonably expensive, I don't think it's quite as bad as all that.

On my laptop, creating and starting 500 threads takes about 120-170ms. (I haven't got the energy to work up a good benchmark - this is about as crude as they come.)

Assuming linear scaling (and it should actually be better than that, because with only 10 threads at a time there'll be less context switching) that would suggest that 10 threads would take less than 4ms to start - i.e. under 1% of the time. Yes, it's a very crude benchmark - but I do think 95% is higher than reality.

Just to reiterate though, I totally agree that the OP should move away from that architecture ASAP :)

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

I try hard not to let reality intrude on my generalizations...

I guess the big question for the OP is "how much work are you doing in each thread?". It still may be trivial relative to the time it takes to start the thread.

... but yeah, starting and stopping threads are like exceptions. Everyone says (even me, obviously) that it's horrible, but we all forget just how good horrible can be. :)

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message news:MPG.200a203b60216f1698d740@msnews.microsoft.com...
> Chris Mullins [MVP] <cmull***@yahoo.com> wrote:
>> "Kunal" <koolku***@gmail.com> wrote
>> > There are about 10 new threads created per second. Threads do some
>> > processing and die. All threads process based on data read from the
>> > global structure.
>>
>> You want to abandon this architecture as quickly as you can. The cost of
>> creating and destroying threads is very high, and at 10 threads per
>> second, your application is spending 95% of its time creating threads,
>> and 5% doing the work you want done.
>
> While I agree that creating a lot of threads is reasonably expensive, I
> don't think it's quite as bad as all that.
>
> On my laptop, creating and starting 500 threads takes about 120-170ms.
> (I haven't got the energy to work up a good benchmark - this is about
> as crude as they come.)
>
> Assuming linear scaling (and it should actually be better than that,
> because with only 10 threads at a time there'll be less context
> switching) that would suggest that 10 threads would take less than 4ms
> to start - i.e. under 1% of the time. Yes, it's a very crude benchmark
> - but I do think 95% is higher than reality.
>
> Just to reiterate though, I totally agree that the OP should move away
> from that architecture ASAP :)
>
> --
> Jon Skeet - <sk***@pobox.com>
> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
> If replying to the group, please do not mail me too

Chris Mullins [MVP] <cmull***@yahoo.com> wrote:
> I try hard not to let reality intrude on my generalizations...

True. :)

> I guess the big question for the OP is "how much work are you doing in each
> thread?". It still may be trivial relative to the time it takes to start the
> thread.
> ... but yeah, starting and stopping threads are like exceptions. Everyone
> says (even me, obviously) that it's horrible, but we all forget just how
> good horrible can be. :)

And of course the reality of that changes every year as hardware gets cheaper. Just as with exceptions, of course, excessive starting and stopping of threads is a symptom of an architecture which should be looked at, but probably won't actually kill performance in itself.

Next in line: how expensive is making a database connection these days? :)

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Jon wrote:
> Next in line: how expensive is making a database connection these days?

That may have been a rhetorical question, but I happen to have just written a little test to figure out how pooled versus unpooled connections performed. The code below executes the same simple stored procedure 1000 times, from a 3.4GHz Xeon box to a 3.6GHz Xeon box about four network hops away.

No pooling 1000 iterations. Elapsed time: 00:00:12.9217096
   Pooling 1000 iterations. Elapsed time: 00:00:01.0624864

That's about 12ms per round-trip for a non-pooled connection, and 1ms per round-trip for a pooled connection (I was trying to prove that pooling didn't produce an order of magnitude performance improvement, and I was obviously wrong).

    using System;
    using System.Data;
    using System.Data.SqlClient;

    namespace PoolSpeedTest
    {
        class Program
        {
            // Server, database, procedure and column names are redacted ("XXX") in this post.
            static string server = "XXX";
            static string database = "XXX";
            static string connStringPooling =
                "Data Source={0};Initial Catalog={1};Integrated Security=True";
            static string connStringNoPooling =
                "Data Source={0};Initial Catalog={1};Integrated Security=True;Pooling=false";

            static void Main(string[] args)
            {
                TwoByOne(1000);
                Console.ReadLine();
            }

            static private void TwoByOne(int iterations)
            {
                // Preload: warm up both code paths before timing anything.
                for (int i = 0; i < 10; i++)
                {
                    using (SqlConnection conn = new SqlConnection(
                        String.Format(connStringPooling, server, database)))
                    {
                        DoSomething(conn);
                    }
                }
                for (int i = 0; i < 10; i++)
                {
                    using (SqlConnection conn = new SqlConnection(
                        String.Format(connStringNoPooling, server, database)))
                    {
                        DoSomething(conn);
                    }
                }

                DateTime start;
                DateTime end;

                // Timed run without pooling: every iteration opens a new physical connection.
                start = DateTime.Now;
                for (int i = 0; i < iterations; i++)
                {
                    using (SqlConnection conn = new SqlConnection(
                        String.Format(connStringNoPooling, server, database)))
                    {
                        DoSomething(conn);
                    }
                }
                end = DateTime.Now;
                Console.WriteLine(String.Format(
                    "No pooling {0} iterations. Elapsed time: {1}", iterations, end - start));

                // Timed run with pooling: connections are taken from and returned to the pool.
                start = DateTime.Now;
                for (int i = 0; i < iterations; i++)
                {
                    using (SqlConnection conn = new SqlConnection(
                        String.Format(connStringPooling, server, database)))
                    {
                        DoSomething(conn);
                    }
                }
                end = DateTime.Now;
                Console.WriteLine(String.Format(
                    " Pooling {0} iterations. Elapsed time: {1}", iterations, end - start));
            }

            // Opens the connection, runs the stored procedure and reads every row.
            private static void DoSomething(SqlConnection conn)
            {
                SqlCommand command = new SqlCommand("XXX", conn);
                command.CommandType = CommandType.StoredProcedure;
                conn.Open();
                using (IDataReader dr = command.ExecuteReader(CommandBehavior.CloseConnection))
                {
                    while (dr.Read())
                    {
                        string x = dr["XXX"].ToString();
                    }
                }
            }
        }
    }

Michael Petrotta <mpetro***@gmail.com> wrote:
> Jon wrote:
> > Next in line: how expensive is making a database connection these days?
>
> That may have been a rhetorical question, but I happen to have just
> written a little test to figure out how pooled versus unpooled
> connections performed. The code below is executing the same simple
> stored procedure 1000 times, from a 3.4GHz Xeon box to a 3.6GHz Xeon
> box about four network hops away.
>
> No pooling 1000 iterations. Elapsed time: 00:00:12.9217096
> Pooling 1000 iterations. Elapsed time: 00:00:01.0624864
>
> That's about 12ms per round-trip for a non-pooled connection, and 1ms
> per for a pooled connection (I was trying to prove that pooling didn't
> produce an order of magnitude performance improvement, and I was
> obviously wrong).

Out of interest, what happens to the figures if:

1) You're on the same box?
2) You're only one network hop away?

My guess is that 2) won't be terribly different, but 1) may well decrease the difference significantly.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too

Jon wrote:
> Michael Petrotta <mpetro***@gmail.com> wrote:
> > Jon wrote:
> > > Next in line: how expensive is making a database connection these days?
> >
> > That may have been a rhetorical question, but I happen to have just
> > written a little test to figure out how pooled versus unpooled
> > connections performed. The code below is executing the same simple
> > stored procedure 1000 times, from a 3.4GHz Xeon box to a 3.6GHz Xeon
> > box about four network hops away.
> >
> > No pooling 1000 iterations. Elapsed time: 00:00:12.9217096
> > Pooling 1000 iterations. Elapsed time: 00:00:01.0624864
> >
> > That's about 12ms per round-trip for a non-pooled connection, and 1ms
> > per for a pooled connection (I was trying to prove that pooling didn't
> > produce an order of magnitude performance improvement, and I was
> > obviously wrong).
>
> Out of interest, what happens to the figures if:
>
> 1) You're on the same box?

No pooling 1000 iterations. Elapsed time: 00:00:06.0936720
   Pooling 1000 iterations. Elapsed time: 00:00:00.2499968

> 2) You're only one network hop away?

No pooling 1000 iterations. Elapsed time: 00:00:12.7683600
   Pooling 1000 iterations. Elapsed time: 00:00:00.2904176

Interesting. The results are repeatable, but it's not a perfect test (in particular, the client for #2 is a much slower laptop).

If I take the results at face value, it seems to say that network speed affects pooled connections (makes sense; there's not much connection setup and teardown with pooled connections. It's just the time taken to get the query and results over the wires).

I think the speed of my laptop is affecting the non-pooled test; it's taking time to have ADO.NET set up and tear down the connection. Unfortunately, our network architecture is such that it's hard to have two powerful desktops closely connected.

(The speed of pooled connections in general surprised me, when I first ran this test. I'd assumed network latency would give me round-trip times around 10-20ms. Pings to the server are also returning sub-millisecond times. Remember modems and their 200ms pings?)

Michael

Michael Petrotta <mpetro***@gmail.com> wrote:
> > Out of interest, what happens to the figures if:
> >
> > 1) You're on the same box?
>
> No pooling 1000 iterations. Elapsed time: 00:00:06.0936720
> Pooling 1000 iterations. Elapsed time: 00:00:00.2499968
>
> > 2) You're only one network hop away?
>
> No pooling 1000 iterations. Elapsed time: 00:00:12.7683600
> Pooling 1000 iterations. Elapsed time: 00:00:00.2904176
>
> Interesting. The results are repeatable, but it's not a perfect test
> (in particular, the client for #2 is a much slower laptop).

Wow. Just shows how wrong intuition can be!

> If I take the results at face value, it seems to say that network speed
> affects pooled connections (makes sense; there's not much connection
> setup and teardown with pooled connections. It's just the time taken
> to get the query and results over the wires).

Yup.

> I think the speed of my laptop is affecting the non-pooled test; it's
> taking time to have ADO.NET set up and tear down the connection.
> Unfortunately, our network architecture is such that it's hard to have
> two powerful desktops closely connected.

Fair enough - thanks for taking the time to run the tests at all!

> (The speed of pooled connections in general surprised me, when I first
> ran this test. I'd assumed network latency would give me round-trip
> times around 10-20ms. Pings to the server are also returning
> sub-millisecond times. Remember modems and their 200ms pings?)

It's quite incredible how fast some things can change while others stay the same. Where are the 1TB+ cheap, fast static memory chips that we've been waiting for for so long? That's what I think will *really* transform computing...

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
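
For reference, the ReaderWriterLock approach suggested earlier in the thread was never shown in code, so here is a minimal sketch of how it might look. It is only an illustration under assumptions: the `GlobalData` class, its `Read` and `Reload` methods, and the dictionary standing in for the "global structure" are hypothetical names, not anything from the original poster's application. The readers are queued on the ThreadPool rather than started as new threads, in line with the advice above. The sketch also assumes the modified files are parsed into a fresh structure before the writer lock is taken, so readers are only blocked for the swap itself.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical holder for the "global structure"; all names are illustrative.
static class GlobalData
{
    private static readonly ReaderWriterLock rwLock = new ReaderWriterLock();
    private static Dictionary<string, string> settings = new Dictionary<string, string>();

    // Called by reader threads. Any number of readers may hold the lock at once.
    public static string Read(string key)
    {
        rwLock.AcquireReaderLock(Timeout.Infinite);
        try
        {
            string value;
            return settings.TryGetValue(key, out value) ? value : null;
        }
        finally
        {
            rwLock.ReleaseReaderLock();
        }
    }

    // Called when the modified files are applied. Waits for current readers
    // to leave and keeps new readers out until the swap has finished.
    public static void Reload(Dictionary<string, string> freshlyParsed)
    {
        rwLock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            settings = freshlyParsed;
        }
        finally
        {
            rwLock.ReleaseWriterLock();
        }
    }
}

class Demo
{
    static void Main()
    {
        // Stand-in for loading the files at application startup.
        var initial = new Dictionary<string, string>();
        initial["mode"] = "startup";
        GlobalData.Reload(initial);

        using (var done = new CountdownEvent(10))
        {
            // Ten short-lived readers, run on pool threads instead of new threads.
            for (int i = 0; i < 10; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Console.WriteLine(GlobalData.Read("mode"));
                    done.Signal();
                });
            }

            // Stand-in for applying the modified files at run-time.
            var updated = new Dictionary<string, string>();
            updated["mode"] = "reloaded";
            GlobalData.Reload(updated);

            done.Wait();
        }
    }
}
```

Each reader prints whichever value was current when it ran, so the mix of "startup" and "reloaded" lines varies between runs. If the file parsing has to happen while readers are blocked, it could be moved inside the writer lock, at the cost of stalling readers for the duration of the file I/O.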