
Global data concurrent access ?

Author
5 Jan 2007 5:37 PM
Kunal
Hi friends,

I have some global structure in my application. The data within this is
read by some reader threads. There are about 10 new threads created per
second. Threads do some processing and die. All threads process based
on data read from the global structure.

This global structure is populated from a set of files at application
startup, which can be modified at run-time. There is also the option to
apply the modified files at run-time. At the time the modified files
are read into the application, I need to block access to this global
structure from the 'reader' threads. Threads do not modify this
structure.

What is the best (most efficient) way to achieve this - lock / mutex /
wait - and how can it be done? Any ideas are welcome.

Thanks n Regards,
Kunal

Author
6 Jan 2007 1:33 AM
Chris Mullins [MVP]
"Kunal" <koolku***@gmail.com> wrote
> There are about 10 new threads created per second. Threads do some
> processing and die. All threads process based on data read from the
> global structure.

You want to abandon this architecture as quickly as you can. The cost of
creating and destroying threads is very high, and at 10 threads per second,
your application is spending 95% of its time creating threads, and 5%
doing the work you want done.

You should:
1 - Use the System ThreadPool. If your work isn't I/O related (hitting SQL,
making a web service call, reading/writing to a file) this is what you want
to do. If your work is I/O centric, don't do this.

2 - Create your threads, but keep them around. When a thread is done doing
its work, have it go look in a "Work queue" for more work to do. This is a
good way to go in general.
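
A minimal sketch of option 2, assuming a fixed set of long-lived worker
threads that block on a shared queue. The names WorkQueue, Enqueue, Shutdown
and WorkQueueDemo are hypothetical and not from the original post; the queue
is guarded with Monitor (the C# 'lock' statement plus Wait/Pulse).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical names throughout: long-lived workers draining a shared queue.
public class WorkQueue
{
    private readonly Queue<Action> _items = new Queue<Action>();
    private readonly List<Thread> _workers = new List<Thread>();
    private bool _closed;

    public WorkQueue(int workerCount)
    {
        // Threads are created once, up front, and reused for every work item.
        for (int i = 0; i < workerCount; i++)
        {
            Thread t = new Thread(WorkerLoop);
            t.IsBackground = true;
            t.Start();
            _workers.Add(t);
        }
    }

    // Add a work item and wake one waiting worker.
    public void Enqueue(Action work)
    {
        lock (_items)
        {
            if (_closed) throw new InvalidOperationException("Queue is closed.");
            _items.Enqueue(work);
            Monitor.Pulse(_items);
        }
    }

    // Stop accepting work, let the workers drain the queue, then wait for them.
    public void Shutdown()
    {
        lock (_items)
        {
            _closed = true;
            Monitor.PulseAll(_items);
        }
        foreach (Thread t in _workers) t.Join();
    }

    private void WorkerLoop()
    {
        while (true)
        {
            Action work;
            lock (_items)
            {
                // Sleep until there is work or the queue has been closed.
                while (_items.Count == 0 && !_closed) Monitor.Wait(_items);
                if (_items.Count == 0) return; // closed and fully drained
                work = _items.Dequeue();
            }
            work(); // run outside the lock so other workers can proceed
        }
    }
}

public static class WorkQueueDemo
{
    // Runs 'jobs' trivial work items on 'workers' threads; returns how many ran.
    public static int RunDemo(int workers, int jobs)
    {
        int completed = 0;
        WorkQueue queue = new WorkQueue(workers);
        for (int i = 0; i < jobs; i++)
            queue.Enqueue(() => Interlocked.Increment(ref completed));
        queue.Shutdown();
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(RunDemo(4, 100));
    }
}
```

The work items run outside the lock, so the lock only protects the queue
itself and is held very briefly.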

> This global structure is populated from a set of files at application
> startup, which can be modified at run-time. There is also the option to
> apply the modified files at run-time. At the time the modified files
> are read into the application, I need to block access to this global
> structure from the 'reader' threads. Threads do not modify this
> structure.

The pattern you're looking for is a ReaderWriterLock. In my other post, I
used a Monitor (which in C# looks like 'lock()'). A ReaderWriterLock is very
similar in terms of methodology though.
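
As a rough illustration of the ReaderWriterLock pattern, here is a sketch
that assumes the global structure can be modelled as a dictionary of strings.
GlobalConfig, Get and Reload are hypothetical names, and parsing the files is
left out; only the locking shape is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the global structure; names are illustrative only.
public static class GlobalConfig
{
    private static readonly ReaderWriterLock RwLock = new ReaderWriterLock();
    private static Dictionary<string, string> _data = new Dictionary<string, string>();

    // Any number of reader threads may hold the reader lock at the same time.
    public static string Get(string key)
    {
        RwLock.AcquireReaderLock(Timeout.Infinite);
        try
        {
            string value;
            return _data.TryGetValue(key, out value) ? value : null;
        }
        finally
        {
            RwLock.ReleaseReaderLock();
        }
    }

    // The reload takes the writer lock, which waits for active readers to
    // finish and keeps new readers out until the new data is in place.
    public static void Reload(IDictionary<string, string> freshlyParsed)
    {
        // Copy outside the lock so the exclusive section stays short.
        Dictionary<string, string> replacement =
            new Dictionary<string, string>(freshlyParsed);

        RwLock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            _data = replacement;
        }
        finally
        {
            RwLock.ReleaseWriterLock();
        }
    }
}

public static class GlobalConfigDemo
{
    public static void Main()
    {
        Dictionary<string, string> parsed = new Dictionary<string, string>();
        parsed["mode"] = "a";
        GlobalConfig.Reload(parsed);
        Console.WriteLine(GlobalConfig.Get("mode"));
    }
}
```

Readers only block while a reload holds the writer lock; the rest of the time
they proceed concurrently.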

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, MVP C#
http://www.coversant.net/blogs/cmullins
Author
6 Jan 2007 9:08 PM
Jon Skeet [C# MVP]
Chris Mullins [MVP] <cmull***@yahoo.com> wrote:
> "Kunal" <koolku***@gmail.com> wrote
> > There are about 10 new threads created per second. Threads do some
> > processing and die. All threads process based on data read from the
> > global structure.
>
> You want to abandon this architecture as quickly as you can. The cost of
> creating and destroying threads is very high, and at 10 threads per second,
> your application is spending 95% of its time creating threads, and 5%
> doing the work you want done.

While I agree that creating a lot of threads is reasonably expensive, I
don't think it's quite as bad as all that.

On my laptop, creating and starting 500 threads takes about 120-170ms.
(I haven't got the energy to work up a good benchmark - this is about
as crude as they come.)

Assuming linear scaling (and it should actually be better than that,
because with only 10 threads at a time there'll be less context
switching) that would suggest that 10 threads would take less than 4ms
to start - i.e. under 1% of the time. Yes, it's a very crude benchmark
- but I do think 95% is higher than reality.
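
Working the numbers through: 500 threads in 120-170ms is roughly 0.24-0.34ms
per thread, so 10 threads cost about 2.4-3.4ms, or around 0.3% of one second.
A crude benchmark along these lines might look like the sketch below. It is
not the code that produced the figures above; ThreadStartBenchmark and
TimeThreadStarts are hypothetical names, and results will vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class ThreadStartBenchmark
{
    // Times how long it takes to create and start 'count' threads that each
    // do almost nothing. 'ran' reports how many thread bodies actually ran.
    public static TimeSpan TimeThreadStarts(int count, out int ran)
    {
        int counter = 0;
        Thread[] threads = new Thread[count];

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            threads[i] = new Thread(() => Interlocked.Increment(ref counter));
            threads[i].Start();
        }
        sw.Stop();

        // Joining is not part of the timed section; it just ensures every
        // thread has finished before the counter is read.
        foreach (Thread t in threads) t.Join();
        ran = counter;
        return sw.Elapsed;
    }

    public static void Main()
    {
        int ran;
        TimeSpan elapsed = TimeThreadStarts(500, out ran);
        double perThreadMs = elapsed.TotalMilliseconds / 500;
        Console.WriteLine("Started {0} threads in {1:F1}ms ({2:F3}ms each)",
            ran, elapsed.TotalMilliseconds, perThreadMs);
        // Share of one second spent starting 10 threads, as a percentage.
        Console.WriteLine("10 threads per second: {0:F2}% of each second",
            perThreadMs * 10 / 1000 * 100);
    }
}
```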

Just to reiterate though, I totally agree that the OP should move away
from that architecture ASAP :)

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Author
6 Jan 2007 9:52 PM
Chris Mullins [MVP]
I try hard not to let reality intrude on my generalizations...

I guess the big question for the OP is "how much work are you doing in each
thread?". It still may be trivial relative to the time it takes to start the
thread.

... but yeah, starting and stopping threads are like exceptions. Everyone
says (even me, obviously) that it's horrible, but we all forget just how
good horrible can be. :)

--
Chris Mullins, MCSD.NET, MCPD:Enterprise, MVP C#
http://www.coversant.net/blogs/cmullins

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MPG.200a203b60216f1698d740@msnews.microsoft.com...
> Chris Mullins [MVP] <cmull***@yahoo.com> wrote:
>> "Kunal" <koolku***@gmail.com> wrote
>> > There are about 10 new threads created per second. Threads do some
>> > processing and die. All threads process based on data read from the
>> > global structure.
>>
>> You want to abandon this architecture as quickly as you can. The cost of
>> creating and destroying threads is very high, and at 10 threads per
>> second, your application is spending 95% of its time creating threads,
>> and 5% doing the work you want done.
>
> While I agree that creating a lot of threads is reasonably expensive, I
> don't think it's quite as bad as all that.
>
> On my laptop, creating and starting 500 threads takes about 120-170ms.
> (I haven't got the energy to work up a good benchmark - this is about
> as crude as they come.)
>
> Assuming linear scaling (and it should actually be better than that,
> because with only 10 threads at a time there'll be less context
> switching) that would suggest that 10 threads would take less than 4ms
> to start - i.e. under 1% of the time. Yes, it's a very crude benchmark
> - but I do think 95% is higher than reality.
>
> Just to reiterate though, I totally agree that the OP should move away
> from that architecture ASAP :)
>
> --
> Jon Skeet - <sk***@pobox.com>
> http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
> If replying to the group, please do not mail me too
Author
6 Jan 2007 10:24 PM
Jon Skeet [C# MVP]
Chris Mullins [MVP] <cmull***@yahoo.com> wrote:
> I try hard not to let reality intrude on my generalizations...

:)

> I guess the big question for the OP is "how much work are you doing in each
> thread?". It still may be trivial relative to the time it takes to start the
> thread.

True.

> ... but yea, starting and stopping threads are like exceptions. Everyone
> says (even me, obviously) that it's horrible, but we all forget just how
> good horrible can be. :)

And of course the reality of that changes every year as hardware gets
cheaper. Just as with exceptions, of course, excessive starting and
stopping of threads is a symptom of an architecture which should be
looked at, but probably won't actually kill performance in itself.

Next in line: how expensive is making a database connection these days?
:)

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Author
8 Jan 2007 12:06 AM
Michael Petrotta
Jon wrote:
> Next in line: how expensive is making a database connection these days?

That may have been a rhetorical question, but I happen to have just
written a little test to figure out how pooled versus unpooled
connections performed.  The code below is executing the same simple
stored procedure 1000 times, from a 3.4GHz Xeon box to a 3.6GHz Xeon
box about four network hops away.

No pooling 1000 iterations.  Elapsed time: 00:00:12.9217096
Pooling 1000 iterations.  Elapsed time: 00:00:01.0624864

That's about 12ms per round-trip for a non-pooled connection, and 1ms
per round-trip for a pooled connection (I was trying to prove that
pooling didn't produce an order of magnitude performance improvement,
and I was obviously wrong).

using System;
using System.Data;
using System.Data.SqlClient;

namespace PoolSpeedTest
{
    class Program
    {
        // "XXX" values are placeholders; the real names were elided in the post.
        static string server = "XXX";
        static string database = "XXX";
        static string connStringPooling =
            "Data Source={0};Initial Catalog={1};Integrated Security=True";
        static string connStringNoPooling =
            "Data Source={0};Initial Catalog={1};Integrated Security=True;Pooling=false";

        static void Main(string[] args)
        {
            TwoByOne(1000);
            Console.ReadLine();
        }

        private static void TwoByOne(int iterations)
        {
            // Preload: warm up both paths so first-use costs are not timed.
            for (int i = 0; i < 10; i++)
            {
                using (SqlConnection conn = new SqlConnection(
                    String.Format(connStringPooling, server, database)))
                {
                    DoSomething(conn);
                }
            }

            for (int i = 0; i < 10; i++)
            {
                using (SqlConnection conn = new SqlConnection(
                    String.Format(connStringNoPooling, server, database)))
                {
                    DoSomething(conn);
                }
            }

            DateTime start;
            DateTime end;

            // Timed run without pooling: every iteration opens a new connection.
            start = DateTime.Now;
            for (int i = 0; i < iterations; i++)
            {
                using (SqlConnection conn = new SqlConnection(
                    String.Format(connStringNoPooling, server, database)))
                {
                    DoSomething(conn);
                }
            }
            end = DateTime.Now;
            Console.WriteLine(String.Format(
                "No pooling {0} iterations.  Elapsed time: {1}", iterations, end - start));

            // Timed run with pooling: connections are reused from the pool.
            start = DateTime.Now;
            for (int i = 0; i < iterations; i++)
            {
                using (SqlConnection conn = new SqlConnection(
                    String.Format(connStringPooling, server, database)))
                {
                    DoSomething(conn);
                }
            }
            end = DateTime.Now;
            Console.WriteLine(String.Format(
                "   Pooling {0} iterations.  Elapsed time: {1}", iterations, end - start));
        }

        // Opens the connection, runs the stored procedure, and reads every row.
        private static void DoSomething(SqlConnection conn)
        {
            SqlCommand command = new SqlCommand("XXX", conn);
            command.CommandType = CommandType.StoredProcedure;
            conn.Open();
            // CloseConnection closes the connection when the reader is disposed.
            using (IDataReader dr =
                command.ExecuteReader(CommandBehavior.CloseConnection))
            {
                while (dr.Read())
                {
                    string x = dr["XXX"].ToString();
                }
            }
        }
    }
}
Author
8 Jan 2007 7:31 PM
Jon Skeet [C# MVP]
Michael Petrotta <mpetro***@gmail.com> wrote:
> Jon wrote:
> > Next in line: how expensive is making a database connection these days?
>
> That may have been a rhetorical question, but I happen to have just
> written a little test to figure out how pooled versus unpooled
> connections performed.  The code below is executing the same simple
> stored procedure 1000 times, from a 3.4GHz Xeon box to a 3.6GHz Xeon
> box about four network hops away.
>
> No pooling 1000 iterations.  Elapsed time: 00:00:12.9217096
> Pooling 1000 iterations.  Elapsed time: 00:00:01.0624864
>
> That's about 12ms per round-trip for a non-pooled connection, and 1ms
> per round-trip for a pooled connection (I was trying to prove that
> pooling didn't produce an order of magnitude performance improvement,
> and I was obviously wrong).

Out of interest, what happens to the figures if:

1) You're on the same box?
2) You're only one network hop away?

My guess is that 2) won't be terribly different, but 1) may well
decrease the difference significantly.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Author
8 Jan 2007 10:27 PM
Michael Petrotta
Jon wrote:
> Michael Petrotta <mpetro***@gmail.com> wrote:
> > Jon wrote:
> > > Next in line: how expensive is making a database connection these days?
> >
> > That may have been a rhetorical question, but I happen to have just
> > written a little test to figure out how pooled versus unpooled
> > connections performed.  The code below is executing the same simple
> > stored procedure 1000 times, from a 3.4GHz Xeon box to a 3.6GHz Xeon
> > box about four network hops away.
> >
> > No pooling 1000 iterations.  Elapsed time: 00:00:12.9217096
> > Pooling 1000 iterations.  Elapsed time: 00:00:01.0624864
> >
> > That's about 12ms per round-trip for a non-pooled connection, and 1ms
> > per round-trip for a pooled connection (I was trying to prove that
> > pooling didn't produce an order of magnitude performance improvement,
> > and I was obviously wrong).
>
> Out of interest, what happens to the figures if:
>
> 1) You're on the same box?

No pooling 1000 iterations.  Elapsed time: 00:00:06.0936720
   Pooling 1000 iterations.  Elapsed time: 00:00:00.2499968

> 2) You're only one network hop away?

No pooling 1000 iterations.  Elapsed time: 00:00:12.7683600
   Pooling 1000 iterations.  Elapsed time: 00:00:00.2904176

Interesting.  The results are repeatable, but it's not a perfect test
(in particular, the client for #2 is a much slower laptop).

If I take the results at face value, it seems to say that network speed
affects pooled connections (makes sense; there's not much connection
setup and teardown with pooled connections.  It's just the time taken
to get the query and results over the wires).

I think the speed of my laptop is affecting the non-pooled test; it's
taking time to have ADO.NET set up and tear down the connection.
Unfortunately, our network architecture is such that it's hard to have
two powerful desktops closely connected.

(The speed of pooled connections in general surprised me, when I first
ran this test.  I'd assumed network latency would give me round-trip
times around 10-20ms.  Pings to the server are also returning
sub-millisecond times.  Remember modems and their 200ms pings?)

Michael
Author
8 Jan 2007 11:25 PM
Jon Skeet [C# MVP]
Michael Petrotta <mpetro***@gmail.com> wrote:
> > Out of interest, what happens to the figures if:
> >
> > 1) You're on the same box?
>
> No pooling 1000 iterations.  Elapsed time: 00:00:06.0936720
>    Pooling 1000 iterations.  Elapsed time: 00:00:00.2499968
>
> > 2) You're only one network hop away?
>
> No pooling 1000 iterations.  Elapsed time: 00:00:12.7683600
>    Pooling 1000 iterations.  Elapsed time: 00:00:00.2904176
>
> Interesting.  The results are repeatable, but it's not a perfect test
> (in particular, the client for #2 is a much slower laptop).

Wow. Just shows how wrong intuition can be!

> If I take the results at face value, it seems to say that network speed
> affects pooled connections (makes sense; there's not much connection
> setup and teardown with pooled connections.  It's just the time taken
> to get the query and results over the wires).

Yup.

> I think the speed of my laptop is affecting the non-pooled test; it's
> taking time to have ADO.NET set up and tear down the connection.
> Unfortunately, our network architecture is such that it's hard to have
> two powerful desktops closely connected.

Fair enough - thanks for taking the time to run the tests at all!

> (The speed of pooled connections in general surprised me, when I first
> ran this test.  I'd assumed network latency would give me round-trip
> times around 10-20ms.  Pings to the server are also returning
> sub-millisecond times.  Remember modems and their 200ms pings?)

It's quite incredible how fast some things can change while others stay
the same. Where are the 1TB+ cheap, fast static memory chips that we've
been waiting for for so long? That's what I think will *really*
transform computing...

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too