Discussion:
bug#40023: 26.3; Emacs crashes when creating process if many file handles are in use (e.g., when using ccls)
(too old to reply)
Robert Pluim
2020-03-11 15:32:14 UTC
Permalink
Holger> I have spent some time digging around the code and couldn't quite figure
Holger> out the logic behind the `fd < FD_SETSIZE`-assert. I suspect the CCLS
Holger> opens files without going through emacs' infrastructure which leads to
Holger> high file descriptors which emacs cannot handle. I would say that CCLS
Holger> is, in part, to blame but emacs shouldn't simply crash.

Emacs is limited by the limits of select, which supports a maximum of
FD_SETSIZE file descriptors, which on macOS is 1024

At least on macOS, itʼs possible to increase that value to eg 8192 by
setting FD_SETSIZE, which might alleviate this, but then you'd
probably run into the 'ulimit -n' maximum, so you'd have to remember
to increase that.

Using 'poll' would help, but as far as I know nobody is working on
converting emacs to use it.

It would be interesting to see how CCLS opens files. How does it pass
the descriptor back to emacs?

Robert
Eli Zaretskii
2020-03-11 16:17:02 UTC
Permalink
Date: Wed, 11 Mar 2020 16:32:14 +0100
Holger> I have spent some time digging around the code and couldn't quite figure
Holger> out the logic behind the `fd < FD_SETSIZE`-assert. I suspect the CCLS
Holger> opens files without going through emacs' infrastructure which leads to
Holger> high file descriptors which emacs cannot handle. I would say that CCLS
Holger> is, in part, to blame but emacs shouldn't simply crash.
Emacs is limited by the limits of select, which supports a maximum of
FD_SETSIZE file descriptors, which on macOS is 1024
At least on macOS, itʼs possible to increase that value to eg 8192 by
setting FD_SETSIZE, which might alleviate this, but then you'd
probably run into the 'ulimit -n' maximum, so you'd have to remember
to increase that.
Wasn't this fixed lately by Yamamoto-san?
Robert Pluim
2020-03-12 07:26:03 UTC
Permalink
Date: Wed, 11 Mar 2020 16:32:14 +0100
Holger> I have spent some time digging around the code and couldn't quite figure
Holger> out the logic behind the `fd < FD_SETSIZE`-assert. I suspect the CCLS
Holger> opens files without going through emacs' infrastructure which leads to
Holger> high file descriptors which emacs cannot handle. I would say that CCLS
Holger> is, in part, to blame but emacs shouldn't simply crash.
Emacs is limited by the limits of select, which supports a maximum of
FD_SETSIZE file descriptors, which on macOS is 1024
At least on macOS, itʼs possible to increase that value to eg 8192 by
setting FD_SETSIZE, which might alleviate this, but then you'd
probably run into the 'ulimit -n' maximum, so you'd have to remember
to increase that.
Eli> Wasn't this fixed lately by Yamamoto-san?

You mean Bug#39164? I donʼt think there was a commit for that.

Robert
Robert Pluim
2020-03-12 07:27:30 UTC
Permalink
Holger> I think it is fine if this is an open problem but I would
Holger> like more graceful failure. Isn’t it possible to catch
Holger> this behavior and turn it into a lisp-level error?

Probably. Iʼm still curious as to what ccls is doing that results in
these file descriptors (or maybe itʼs not ccls' fault, but the
underlying OS).

Robert
Eli Zaretskii
2020-03-12 15:11:39 UTC
Permalink
Date: Thu, 12 Mar 2020 08:26:03 +0100
Eli> Wasn't this fixed lately by Yamamoto-san?
You mean Bug#39164? I donʼt think there was a commit for that.
That's too bad. I think we should revive that bug and fix it in Emacs
27, since I understand the recent versions of macOS will have that
effect more and more frequently.
Robert Pluim
2020-03-12 15:24:44 UTC
Permalink
Date: Thu, 12 Mar 2020 08:26:03 +0100
Eli> Wasn't this fixed lately by Yamamoto-san?
You mean Bug#39164? I donʼt think there was a commit for that.
Eli> That's too bad. I think we should revive that bug and fix it in Emacs
Eli> 27, since I understand the recent versions of macOS will have that
Eli> effect more and more frequently.

If someone has an 'emacs -Q' testcase I can look at it. Soon we will
all be spending lots of time at home here :-(

Robert
Eli Zaretskii
2020-03-12 15:50:14 UTC
Permalink
Date: Thu, 12 Mar 2020 16:24:44 +0100
Post by Robert Pluim
You mean Bug#39164? I donʼt think there was a commit for that.
Eli> That's too bad. I think we should revive that bug and fix it in Emacs
Eli> 27, since I understand the recent versions of macOS will have that
Eli> effect more and more frequently.
If someone has an 'emacs -Q' testcase I can look at it. Soon we will
all be spending lots of time at home here :-(
We could simply install the proposed workaround on the emacs-27 branch
for now, I think.
Robert Pluim
2020-03-12 16:46:58 UTC
Permalink
Date: Thu, 12 Mar 2020 16:24:44 +0100
Post by Robert Pluim
You mean Bug#39164? I donʼt think there was a commit for that.
Eli> That's too bad. I think we should revive that bug and fix it in Emacs
Eli> 27, since I understand the recent versions of macOS will have that
Eli> effect more and more frequently.
If someone has an 'emacs -Q' testcase I can look at it. Soon we will
all be spending lots of time at home here :-(
Eli> We could simply install the proposed workaround on the emacs-27 branch
Eli> for now, I think.

Holger, is it possible for you to rebuild emacs with the following
patch and see if Emacs still crashes? If it fixes things Iʼll install
to emacs-27

diff --git a/src/nsterm.m b/src/nsterm.m
index aefbb2721e..90f63bc182 100644
--- a/src/nsterm.m
+++ b/src/nsterm.m
@@ -5805,6 +5805,22 @@ - (void)applicationDidFinishLaunching: (NSNotification *)notification
#endif

#ifdef NS_IMPL_COCOA
+ /* Some functions/methods in CoreFoundation/Foundation increase the
+ maximum number of open files for the process in their first call.
+ We make dummy calls to them and then reduce the resource limit
+ here, since pselect cannot handle file descriptors that are
+ greater than or equal to FD_SETSIZE. */
+ CFSocketGetTypeID ();
+ CFFileDescriptorGetTypeID ();
+ [[NSFileHandle alloc] init];
+ struct rlimit rlim;
+ if (getrlimit (RLIMIT_NOFILE, &rlim) == 0
+ && rlim.rlim_cur > FD_SETSIZE)
+ {
+ rlim.rlim_cur = FD_SETSIZE;
+ setrlimit (RLIMIT_NOFILE, &rlim);
+ }
+
if ([NSApp activationPolicy] == NSApplicationActivationPolicyProhibited) {
/* Set the app's activation policy to regular when we run outside
of a bundle. This is already done for us by Info.plist when we
Robert Pluim
2020-03-19 10:41:37 UTC
Permalink
Post by Robert Pluim
If someone has an 'emacs -Q' testcase I can look at it. Soon we will
all be spending lots of time at home here :-(
Eli> We could simply install the proposed workaround on the emacs-27 branch
Eli> for now, I think.

So I installed lsp-mode and ccls, added lsp to c-mode-hook, visited
emacs/src/font.c and then said yes when it asked me if I wanted to
watch all the files (there are ~6500). I then recompiled emacs using
M-x compile, and saw no crash.

This is with emacs-master. I checked with emacs-26 and saw no crash. I
did look in *lsp-log*, and saw a bunch of messages saying:

Failed to create a watch for File watching not possible, no file descriptor left: 975: message

(this is with 'ulimit -n 4096')

so I think this is very much dependent on which language server youʼre
using. Note also that itʼs watching directories, not files, I had to
create a bunch of extra directories to get that message.

I have:

ccls --version
ccls version 8.0.0 (tags/RELEASE_800/final)

Robert
Eli Zaretskii
2020-03-19 14:36:17 UTC
Permalink
Date: Thu, 19 Mar 2020 11:41:37 +0100
So I installed lsp-mode and ccls, added lsp to c-mode-hook, visited
emacs/src/font.c and then said yes when it asked me if I wanted to
watch all the files (there are ~6500). I then recompiled emacs using
M-x compile, and saw no crash.
I thought this only happens on some (recent enough) versions of the
OS?

Anyway, thanks for trying.
Mattias Engdegård
2020-04-11 18:15:37 UTC
Permalink
Actually reducing RLIMIT_NOFILE doesn't sound like a terrific idea -- if anything, we need as many descriptors as we can get, in particular on macOS where kqueue uses up one per monitored file. If we are limited by FD_SETSIZE, then we shouldn't use select. Or am I missing something?
Eli Zaretskii
2020-04-11 19:02:20 UTC
Permalink
Date: Sat, 11 Apr 2020 20:15:37 +0200
Actually reducing RLIMIT_NOFILE doesn't sound like a terrific idea -- if anything, we need as many descriptors as we can get, in particular on macOS where kqueue uses up one per monitored file. If we are limited by FD_SETSIZE, then we shouldn't use select. Or am I missing something?
We are talking about the emacs-27 branch, where such deep changes are
out of the question.
Michael Albinus
2020-04-12 10:19:44 UTC
Permalink
Post by Mattias Engdegård
Actually reducing RLIMIT_NOFILE doesn't sound like a terrific idea --
if anything, we need as many descriptors as we can get, in particular
on macOS where kqueue uses up one per monitored file. If we are
limited by FD_SETSIZE, then we shouldn't use select. Or am I missing
something?
kqueue limits itself to (RLIMIT_NOFILE - 50) file descriptors. If 50
file descriptors aren't sufficient, we could increase that number, or
make it a defvar changeable via Lisp.

See line 397 of kqueue.c

Best regards, Michael.
Mattias Engdegård
2020-04-14 16:02:47 UTC
Permalink
With lsp-mode I think itʼs one fd per directory containing a monitored
file, but either way itʼs a limitation, and one that people are
running into.
Are you sure? In contrast to inotify, kqueue can't detect changes to the contents of files (or to their inodes) by monitoring their containing directory.
- the patch from
<https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40023#32>, which
reduces RLIMIT_NOFILE to FD_SETSIZE.
It's perhaps the least bad option for emacs-27. In addition to the file monitoring requirement, there's the concern that the macOS libs raise RLIMIT_NOFILE for reasons of their own, and that more things will break when they no longer get what they want.
- select() -> poll(). GNU/Linux, macOS and *BSD all have poll, plus
thereʼs a gnulib module for it that we could perhaps use on
MS-Windows. As Eli says, not a small change.
Right. macOS doesn't have ppoll but Emacs doesn't seem to make any use of the sigmask argument.
(increasing FD_SETSIZE is a no-go: it works on macOS, maybe on *BSD,
and is on very shaky standards-compliance ground).
Agreed.
Robert Pluim
2020-04-14 16:14:19 UTC
Permalink
With lsp-mode I think itʼs one fd per directory containing a monitored
file, but either way itʼs a limitation, and one that people are
running into.
Mattias> Are you sure? In contrast to inotify, kqueue can't detect changes to
Mattias> the contents of files (or to their inodes) by monitoring their
Mattias> containing directory.

C-x C-f my-dodgy-memory-disclaimer.txt :-)
- the patch from
<https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40023#32>, which
reduces RLIMIT_NOFILE to FD_SETSIZE.
Mattias> It's perhaps the least bad option for emacs-27. In addition to the
Mattias> file monitoring requirement, there's the concern that the macOS libs
Mattias> raise RLIMIT_NOFILE for reasons of their own, and that more things
Mattias> will break when they no longer get what they want.

Libraries have no business messing with RLIMIT_NOFILE, but this is
macOS weʼre talking about.
- select() -> poll(). GNU/Linux, macOS and *BSD all have poll, plus
thereʼs a gnulib module for it that we could perhaps use on
MS-Windows. As Eli says, not a small change.
Mattias> Right. macOS doesn't have ppoll but Emacs doesn't seem to make any use of the sigmask argument.

Right, we always pass NULL.

Itʼs unfortunately very much a flag-day change that would need testing
on loads of different platforms. Or we #ifdef it to death.

Robert
Eli Zaretskii
2020-04-14 16:23:21 UTC
Permalink
Date: Tue, 14 Apr 2020 18:14:19 +0200
Itʼs unfortunately very much a flag-day change that would need testing
on loads of different platforms. Or we #ifdef it to death.
The user who reported bug#40555 said the patch we have here fixed the
problem, so I think we should install that patch on the emacs-27
branch (I hope we will have another pretest soon).

Or is there something else that's needed before we can install that
patch?

Thanks.
Robert Pluim
2020-04-14 13:58:51 UTC
Permalink
Post by Mattias Engdegård
Actually reducing RLIMIT_NOFILE doesn't sound like a terrific idea --
if anything, we need as many descriptors as we can get, in particular
on macOS where kqueue uses up one per monitored file. If we are
limited by FD_SETSIZE, then we shouldn't use select. Or am I missing
something?
With lsp-mode I think itʼs one fd per directory containing a monitored
file, but either way itʼs a limitation, and one that people are
running into.

Michael> kqueue limits itself to (RLIMIT_NOFILE - 50) file descriptors. If 50
Michael> file descriptors aren't sufficient, we could increase that number, or
Michael> make it a defvar changeable via Lisp.

50 is enough. The issue is that ccls uses up the other FD_SETSIZE - 50
descriptors, and then other libraries end up passing descriptors >
FD_SETSIZE back to emacs because RLIMIT_NOFILE has been changed behind
our back.

I only see two solutions:

- the patch from
<https://debbugs.gnu.org/cgi/bugreport.cgi?bug=40023#32>, which
reduces RLIMIT_NOFILE to FD_SETSIZE. It works for me, but I have to
artificially generate the crashing conditions, so some feedback from
others would be nice (especially if we want to put it in
emacs-27). Plus it limits the number of open file descriptors even
on systems where crazy users want to monitor 10k files (maybe thatʼs
a good thing :-) ).

- select() -> poll(). GNU/Linux, macOS and *BSD all have poll, plus
thereʼs a gnulib module for it that we could perhaps use on
MS-Windows. As Eli says, not a small change.

(increasing FD_SETSIZE is a no-go: it works on macOS, maybe on *BSD,
and is on very shaky standards-compliance ground).

Robert

Loading...