Directory.Exists occasionally locks up when checking network share

Author
1 Sep 2006 3:42 PM
Keith Langer
Hi,

I'm running the .Net 1.1 framework on XP Pro.  I am finding some
occasions where a call to Directory.Exists never returns when passing
in a network share.  Most of the time the call returns properly, but if
the machine is really busy or if there is a network problem (or if the
server's Secondary Logon or Server services have failed), then this
problem can surface.  This causes my application to hang indefinitely.
Is there a framework patch that fixes this, or some other way to
prevent it?

thanks,
Keith Langer

Author
1 Sep 2006 7:54 PM
Ben Voigt
"Keith Langer" <tanal***@aol.com> wrote in message
news:1157125366.490006.37070@i42g2000cwa.googlegroups.com...
> Hi,
>
> I'm running the .Net 1.1 framework on XP Pro.  I am finding some
> occasions where a call to Directory.Exists never returns when passing
> in a network share.  Most of the time the call returns properly, but if
> the machine is really busy or if there is a network problem (or if the
> server's Secondary Logon or Server services have failed), then this
> problem can surface.  This causes my application to hang indefinitely.
> Is there a framework patch that fixes this, or some other way to
> prevent it?

Start the I/O on a new thread and then wait on the thread with a timeout...
In .Net 2.0 you could use BackgroundWorker, in 1.1 you have to write a
little more code yourself.
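
A rough sketch of what that could look like on .NET 1.1, using only `Thread`, `ThreadStart`, and `Thread.Join(int)`. The class name `DirectoryProbe` and its members are hypothetical, made up for illustration; nothing here is a framework type or a tested fix for the hang described above.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper (not part of the framework): runs Directory.Exists on a
// worker thread so the caller can stop waiting after a timeout.
public sealed class DirectoryProbe
{
    private readonly string path;
    private volatile bool exists;   // result written by the worker thread

    private DirectoryProbe(string path)
    {
        this.path = path;
    }

    private void Run()
    {
        // This is the call that may never return on a bad share.
        exists = Directory.Exists(path);
    }

    // Returns true only if the check finished within timeoutMs and the
    // directory exists. timedOut reports whether the wait was abandoned.
    public static bool Exists(string path, int timeoutMs, out bool timedOut)
    {
        DirectoryProbe probe = new DirectoryProbe(path);
        Thread worker = new Thread(new ThreadStart(probe.Run));
        worker.IsBackground = true;   // a stuck worker should not keep the process alive
        worker.Start();

        // Join returns false if the worker is still running after timeoutMs.
        timedOut = !worker.Join(timeoutMs);
        return !timedOut && probe.exists;
    }
}
```

A call might look like `bool found = DirectoryProbe.Exists(path, 5000, out timedOut);`, where the 5-second timeout is an arbitrary choice. Note that a worker that times out is simply left behind: `Thread.Abort` only takes effect once a thread is back in managed code, so it cannot interrupt a thread that is blocked inside the OS call.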

>
> thanks,
> Keith Langer
>
Author
1 Sep 2006 9:14 PM
Keith Langer
Ben,

One thing that concerns me is that when this problem occurred, the
process became so locked up that it couldn't be killed and we couldn't
get the machine to reboot.  We finally had to kill the winlogon service
which forced a reboot.  So I don't know if the background thread is
going to allow itself to be aborted so easily.

But your suggestion is something that I actually implemented a short
while ago.  Since I can't reproduce the problem locally, I'll have to
wait until next week to test it.

Keith


Author
2 Sep 2006 1:19 PM
Keith Langer
Does anyone from Microsoft read this newsgroup?  It would be nice if
someone could tell me why this is happening.

Keith



Author
2 Sep 2006 1:34 PM
Carl Daniel [VC++ MVP]
Keith Langer wrote:
> Does anyone from Microsoft read this newsgroup?  It would be nice if
> someone could tell me why this is happening.

Various MSFT people do read this group.  Telling you why it's happening is
not necessarily something they'll readily be able to do.

From what you describe, your application is waiting for a synchronous I/O
request to complete, and that request is blocked in the kernel.  That's what
results in a hung, un-killable application.

The first things I'd check:

1. Do the machines where it fails have different network hardware than the
machines where it works?
2. Do the machines where it fails have different network driver versions
than the machines where it works?
3. Are the machines where it fails up to date on patches for the OS and all
installed hardware?

Bottom line - it's most likely a defective network driver or hardware that's
at the root of it.  Network issues do legitimately occur, but worst case,
your application should hang for a couple minutes before a timeout occurs
and everything gets unstuck.

-cd
Author
3 Sep 2006 5:22 AM
Keith Langer
Carl,

From what I've been told, the machines that have had the failure may
have a different type of NIC than the machines that don't have the
failure.  The call to this function can return successfully hundreds or
thousands of times before the call doesn't return at all.   The OS is
identical on all machines.  As to whether the NIC drivers are up to
date, I don't know.  Shouldn't Windows be able to deal with even a bad
NIC driver by causing a timeout?

Some more background on how this function is used:  This application
will check for the server share every 30 seconds until it finds it.  It
also attempts to connect to the share with a secondary login since the
primary login has a password which conflicts with the server.  Due to a
virus, the Server and Secondary logon services had failed on the
server, so the application would check for the share every 30 seconds
and never find it.
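
For a 30-second polling loop like this, one way to keep stuck checks from piling up is to allow only one `Directory.Exists` call in flight at a time. The sketch below is an assumption about how that could be structured, not code from the application described; `SharePoller` and `ShareStatus` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Threading;

public enum ShareStatus { Found, NotFound, NoAnswer }

// Hypothetical helper: polls a share, but never starts a second
// Directory.Exists call while an earlier one is still outstanding.
public sealed class SharePoller
{
    private readonly string path;
    private Thread worker;          // null when no check is outstanding
    private volatile bool exists;   // result written by the worker thread

    public SharePoller(string path)
    {
        this.path = path;
    }

    private void Run()
    {
        exists = Directory.Exists(path);
    }

    // Intended to be called from a single thread, e.g. once per polling interval.
    public ShareStatus Check(int timeoutMs)
    {
        if (worker == null)
        {
            worker = new Thread(new ThreadStart(Run));
            worker.IsBackground = true;
            worker.Start();
        }

        // If an earlier check is still stuck, wait on that same thread again
        // instead of starting another one.
        if (!worker.Join(timeoutMs))
        {
            return ShareStatus.NoAnswer;
        }

        worker = null;   // finished; the next call starts a fresh check
        return exists ? ShareStatus.Found : ShareStatus.NotFound;
    }
}
```

With this shape, at most one thread is ever stuck per poller, however long the share stays unreachable. The trade-off is that a check which finally completes between polls reports a result that may be slightly stale.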

A few questions:
1) Any idea how I can force this situation to be reproduced?
2) Do you think that if I call this method from a different thread that
I'm going to still have problems?  I'm guessing that the thread will
never be successfully aborted and the system performance will degrade
as a result.
3) Is there another way to check for the directory's existence while
avoiding the potential for a lockup?

thanks,
Keith



Author
3 Sep 2006 5:49 AM
Carl Daniel [VC++ MVP]
Keith Langer wrote:
> Carl,
>
> From what I've been told, the machines that have had the failure may
> have a different type of NIC than the machines that don't have the
> failure.  The call to this function can return successfully hundreds
> or thousands of times before the call doesn't return at all.   The OS
> is identical on all machines.  As to whether the NIC drivers are up to
> date, I don't know.  Shouldn't Windows be able to deal with even a bad
> NIC driver by causing a timeout?

Unfortunately, no.  Unless the driver correctly implements IO cancellation
and timeouts, there's nothing the IO manager in the OS can do to forcibly
stop it (only the driver can know the actions required to reliably cancel a
request).

>
> Some more background on how this function is used:  This application
> will check for the server share every 30 seconds until it finds it.  It
> also attempts to connect to the share with a secondary login since the
> primary login has a password which conflicts with the server.  Due to
> a virus, the Server and Secondary logon services had failed on the
> server, so the application would check for the share every 30 seconds
> and never find it.
>
> A few questions:
> 1) Any idea how I can force this situation to be reproduced?

No.  From what you describe, I'd guess that there's a good chance that it's
a driver bug.

> 2) Do you think that if I call this method from a different thread that
> I'm going to still have problems?  I'm guessing that the thread will
> never be successfully aborted and the system performance will degrade
> as a result.

I wouldn't expect it to make any difference at all.

> 3) Is there another way to check for the directory's existence while
> avoiding the potential for a lockup?

Nothing comes to mind, sorry.

-cd
Author
3 Sep 2006 2:33 PM
Keith Langer
Carl,

Do you think I would still get the lock up if I tried to retrieve the
directory info or file info instead of calling Exists?  These calls
would normally throw an error if the directory doesn't exist.

Keith



Author
3 Sep 2006 2:49 PM
Carl Daniel [VC++ MVP]
Keith Langer wrote:
> Carl,
>
> Do you think I would still get the lock up if I tried to retrieve the
> directory info or file info instead of calling Exists?  These calls
> would normally throw an error if the directory doesn't exist.

Most likely, it wouldn't make any difference, but it wouldn't hurt to try.

-cd
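
For anyone who wants to try the alternative Keith asks about, here is a minimal sketch. It assumes that listing the directory through `DirectoryInfo.GetFileSystemInfos()` is an acceptable stand-in for `Exists`; `DirectoryCheck` and `ExistsByListing` are hypothetical names. Constructing a `DirectoryInfo` does not touch the disk or network by itself, so the listing call is what actually fails when the directory is missing.

```csharp
using System;
using System.IO;

// Hypothetical helper: checks for a directory by trying to list it.
public sealed class DirectoryCheck
{
    private DirectoryCheck() { }

    public static bool ExistsByListing(string path)
    {
        try
        {
            // Throws DirectoryNotFoundException if the directory is missing.
            new DirectoryInfo(path).GetFileSystemInfos();
            return true;
        }
        catch (IOException)
        {
            // DirectoryNotFoundException derives from IOException, so this
            // also covers other I/O failures such as an unreachable path.
            return false;
        }
    }
}
```

This reads the whole directory listing, so it is more expensive than `Directory.Exists`, and other exceptions (for example `UnauthorizedAccessException`) are left to propagate. As Carl says, the request most likely goes through the same network path underneath, so there is no particular reason to expect it to avoid the hang.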