Discussion:
bug#10580: 24.0.92; gdb initialization takes more than one minute at 100% CPU
(too old to reply)
Glenn Morris
2012-01-23 00:53:46 UTC
Permalink
When trying to running gdb on my process, emacs gets stuck for more than
one minutes while the CPU load is at 100%.
Does this happen with `emacs -Q'? Because it looks like you may not be
/home/dov/.config/emacs/gdb-mi hides
/usr/local/public-dev/share/emacs/24.0.92/lisp/progmodes/gdb-mi
Glenn Morris
2012-01-23 09:21:09 UTC
Permalink
Thanks for your reply Glenn. Is there a public bugzilla like site for
emacs bugs or are they managed in email only. I have additional
problems with gdb in emacs 24 and I would like to know if the problems
have been reported.
You should have received this email acknowledgement of your report:

http://debbugs.gnu.org/cgi/bugreport.cgi?msg=4;bug=10580

which contains a URL pointing to your report. From there you can
easily find the front page: http://debbugs.gnu.org
I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU. Here is the output
from `report-emacs bug` when running with 'emacs -Q'.
In GNU Emacs 24.0.92.1 (i686-pc-linux-gnu, GTK+ Version 2.22.0)
of 2012-01-22 on dovg32
Windowing system distributor `Fedora Project', version 11.0.10905000
configured using `configure '--prefix=/usr/local/public-dev''
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: nil
locale-coding-system: nil
default enable-multibyte-characters: t
Major mode: Debugger
tooltip-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
<help-echo> C-x C-f C-a C-y C-k <S-insert> C-a C-k
<help-echo> <S-insert> C-a C-k <help-echo> <down-mouse-2>
<mouse-2> <help-echo> <help-echo> <help-echo> <return>
M-x g d b <return> <backspace> <backspace> <backspace>
<backspace> <backspace> <backspace> <backspace> <backspace>
<backspace> <backspace> <backspace> <backspace> B i
n L <tab> 3 2 <tab> S o <tab> <return> <help-echo>
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
C-x k <return> y e s <return> M-x M-p <return> <return>
M-x g d b <tab> <backspace> <backspace> <backspace>
b u g <tab> <C-backspace> <C-backspace> r e p <tab>
o r <tab> <return>
For information about GNU Emacs and the GNU system, type C-h C-a.
current-kill: Kill ring is empty
Making completion list...
None found.
(shadow sort gnus-util mail-extr message format-spec rfc822 mml mml-sec
mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
ietf-drums mm-util mail-prsvr mailabbrev mail-utils gmm-utils mailheader
emacsbug help-mode easymenu view gdb-mi bindat json gud easy-mmode
comint ring dired regexp-opt time-date tooltip ediff-hook vc-hooks
lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image fringe
lisp-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer loaddefs button faces cus-face files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process dynamic-setting
system-font-setting font-render-setting move-toolbar gtk x-toolkit x
multi-tty emacs)
Glenn Morris
2012-01-25 00:37:14 UTC
Permalink
I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU.
Is this with everything you try to debug, or just certain things?

Can you try M-x toggle-debug-on-quit, then interrupt Emacs with ctrl-g
during those 40 seconds and see if you get a backtrace?

Or try M-x edebug-defun on the `gdb' function, step through it, and see
what is taking the time.

Guesses: do you have a huge .gdb_history or ~/.gdbinit file?
Dov Grobgeld
2012-01-25 08:49:29 UTC
Permalink
Here are some more tests:

1. It doesn't get stuck when debugging a "hello-world.c" program. Thus it
depends on the executable.
2. Regarding toggle-debug-on-quit and C-g, it doesn't work. No backtrace is
produced and the CPU continues to be at 100%.
3. I tried running edebug on gdb, which wasn't easy. I had to manually
first do eval-buffer on gdb-mi.el and gud.el. But in the end I managed and
found that the CPU is spend during (run-hooks 'gdb-mode-hook) . I'll try to
investigate it further in the next few days.

Regards,
Dov
Post by Glenn Morris
I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU.
Is this with everything you try to debug, or just certain things?
Can you try M-x toggle-debug-on-quit, then interrupt Emacs with ctrl-g
during those 40 seconds and see if you get a backtrace?
Or try M-x edebug-defun on the `gdb' function, step through it, and see
what is taking the time.
Guesses: do you have a huge .gdb_history or ~/.gdbinit file?
Dov Grobgeld
2012-01-25 09:39:25 UTC
Permalink
More tests.

1. I was mistaken in my previous email, the CPU spending was triggered
before gdb-mode-hook.
2. The following command in gdb-input starts the 40s 100% CPU:

(process-send-string (get-buffer-process gud-comint-buffer)
(concat command "\n")))

where command="1-inferior-tty-set /dev/pts/9". Note that the command
returns immediately, but some other thread apparently gets very busy.

Is this enough info, or do you want me to probe deeper?
Post by Dov Grobgeld
1. It doesn't get stuck when debugging a "hello-world.c" program. Thus it
depends on the executable.
2. Regarding toggle-debug-on-quit and C-g, it doesn't work. No backtrace
is produced and the CPU continues to be at 100%.
3. I tried running edebug on gdb, which wasn't easy. I had to manually
first do eval-buffer on gdb-mi.el and gud.el. But in the end I managed and
found that the CPU is spend during (run-hooks 'gdb-mode-hook) . I'll try to
investigate it further in the next few days.
Regards,
Dov
Post by Glenn Morris
I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU.
Is this with everything you try to debug, or just certain things?
Can you try M-x toggle-debug-on-quit, then interrupt Emacs with ctrl-g
during those 40 seconds and see if you get a backtrace?
Or try M-x edebug-defun on the `gdb' function, step through it, and see
what is taking the time.
Guesses: do you have a huge .gdb_history or ~/.gdbinit file?
Glenn Morris
2012-01-25 19:05:35 UTC
Permalink
Post by Dov Grobgeld
(process-send-string (get-buffer-process gud-comint-buffer)
(concat command "\n")))
where command="1-inferior-tty-set /dev/pts/9". Note that the command
returns immediately, but some other thread apparently gets very busy.
Thanks for the detective work. I'm afraid I don't know what to do about
this. It looks like process handling is messed up somehow.

I hope someone else on this list can help you further.
Dov Grobgeld
2012-04-30 05:33:31 UTC
Permalink
I finally ran emacs-24 under debugger to figure out where it got
stuck. What I found was that it is stuck in keyboard.c where in
read_key_sequence() the control keeps jumping back to the
replay_sequence label. Does this ring a bell to someone?

Regards,
Dov
Post by Glenn Morris
  (process-send-string (get-buffer-process gud-comint-buffer)
               (concat command "\n")))
where command="1-inferior-tty-set /dev/pts/9". Note that the command
returns immediately, but some other thread apparently gets very busy.
Thanks for the detective work. I'm afraid I don't know what to do about
this. It looks like process handling is messed up somehow.
I hope someone else on this list can help you further.
Dov Grobgeld
2012-04-30 06:36:40 UTC
Permalink
Here is some more clarification. In keyboard.c:command_loop_1() the
line i=read_key_sequence() blocks during interactive work until a key
is pressed. But when the process is stuck in "gdb" read_key_sequence()
does not block but keeps returning. I still have no clue of what goes
on though.
Post by Dov Grobgeld
I finally ran emacs-24 under debugger to figure out where it got
stuck. What I found was that it is stuck in keyboard.c where in
read_key_sequence() the control keeps jumping back to the
replay_sequence label. Does this ring a bell to someone?
Regards,
Dov
Post by Glenn Morris
  (process-send-string (get-buffer-process gud-comint-buffer)
               (concat command "\n")))
where command="1-inferior-tty-set /dev/pts/9". Note that the command
returns immediately, but some other thread apparently gets very busy.
Thanks for the detective work. I'm afraid I don't know what to do about
this. It looks like process handling is messed up somehow.
I hope someone else on this list can help you further.
Chong Yidong
2012-05-06 04:13:46 UTC
Permalink
Post by Dov Grobgeld
Here is some more clarification. In keyboard.c:command_loop_1() the
line i=read_key_sequence() blocks during interactive work until a key
is pressed. But when the process is stuck in "gdb" read_key_sequence()
does not block but keeps returning. I still have no clue of what goes
on though.
Is this still with 24.0.92, or the latest pretest 24.0.96, or the
emacs-24 branch? I made some changes to the pty handling a few weeks
ago (which should be included in 24.0.96), which may impact this issue.
Dov Grobgeld
2012-05-06 04:55:29 UTC
Permalink
Thank you very much for looking into this. For me, this is currently, the
main show stopper for starting to emacs-24.

I have just tested the same scenario with the latest git HEAD version from
http://git.savannah.gnu.org/r/emacs.git/ :

823d681 Optionally include holidays in cal-html output

and unfortunately the exact same behavior as before still remains.

Please let me know if there are any tests or debug info that I can provide
you with that can help in resolving this issue. I tried debugging it
myself, but I found it to complex since I lack the knowledge of how
sub-processes are handled in emacs.

Regards,
Dov
Post by Chong Yidong
Post by Dov Grobgeld
Here is some more clarification. In keyboard.c:command_loop_1() the
line i=read_key_sequence() blocks during interactive work until a key
is pressed. But when the process is stuck in "gdb" read_key_sequence()
does not block but keeps returning. I still have no clue of what goes
on though.
Is this still with 24.0.92, or the latest pretest 24.0.96, or the
emacs-24 branch? I made some changes to the pty handling a few weeks
ago (which should be included in 24.0.96), which may impact this issue.
Chong Yidong
2012-05-06 05:39:56 UTC
Permalink
Post by Dov Grobgeld
Please let me know if there are any tests or debug info that I can
provide you with that can help in resolving this issue. I tried
debugging it myself, but I found it to complex since I lack the
knowledge of how sub-processes are handled in emacs.
If you look at the *input/output of a.out* buffer, does it contain any
text?

Also, when you say that read_key_sequence keeps returning, what is its
return value? Also, when read_key_sequence returns early, what is the
value returned by read_char at keyboard.c:9341?

Also, is it possible for you to provide a copy of the problematic
program, or is it not allowed?
Dov Grobgeld
2012-05-06 07:06:13 UTC
Permalink
The *input/output of a.out* is empty.

It seems that I was wrong about read_key_sequence(). It doesn't "return
early". Here is a stack trace during the timeout:

#0 0x00110424 in __kernel_vsyscall ()
#1 0x005f288d in ___newselect_nocancel () at
../sysdeps/unix/syscall-template.S:82
#2 0x08149691 in xg_select (max_fds=17, rfds=0xbfffe860, wfds=0xbfffe7e0,
efds=0x0, timeout=0xbfffe7d4) at xgselect.c:100
#3 0x0821ffb0 in wait_reading_process_output (time_limit=30, microsecs=0,
read_kbd=-1, do_display=1, wait_for_cell=138903722, wait_proc=0x0,
just_wait_proc=0) at process.c:4620
#4 0x080601c7 in sit_for (timeout=120, reading=1, do_display=1) at
dispnew.c:6068
#5 0x08159606 in read_char (commandflag=1, nmaps=3, maps=0xbfffec50,
prev_event=138903722, used_mouse_menu=0xbfffed18, end_time=0x0) at
keyboard.c:2698
#6 0x08163f12 in read_key_sequence (keybuf=0xbfffee94, bufsize=30,
prompt=138903722, dont_downcase_last=0, can_return_switch_frame=1,
fix_current_buffer=1) at keyboard.c:9341
#7 0x08157472 in command_loop_1 () at keyboard.c:1455
#8 0x081d629d in internal_condition_case (bfun=0x815711b <command_loop_1>,
handlers=138934706, hfun=0x8156ad5 <cmd_error>) at eval.c:1448
#9 0x08156e49 in command_loop_2 (ignore=138903722) at keyboard.c:1160
#10 0x081d5d6f in internal_catch (tag=138932706, func=0x8156e25
<command_loop_2>, arg=138903722) at eval.c:1205
#11 0x08156e05 in command_loop () at keyboard.c:1139
#12 0x0815670e in recursive_edit_1 () at keyboard.c:759
#13 0x0815685f in Frecursive_edit () at keyboard.c:823
#14 0x08154d65 in main (argc=1, argv=0xbffff5e4) at emacs.c:1711

read_char() returns 158534621.

Sorry, I can't redistribute the executable. If the problem can't be solved
by otherwise, I'll try to generate another executable that has this
problem. In any case, here is the output when running gdb from the command
line on the executable.
gdb -i=mi SolarJet
=thread-group-added,id="i1"
~"GNU gdb (GDB) Fedora (7.2-52.fc14)\n"
~"Copyright (C) 2010 Free Software Foundation, Inc.\n"
~"License GPLv3+: GNU GPL version 3 or later <
http://gnu.org/licenses/gpl.html>\nThis is free software: you are free to
change and redistribute it.\nThere is NO WARRANTY, to the extent permitted
by law. Type \"show copying\"\nand \"show warranty\" for details.\n"
~"This GDB was configured as \"i686-redhat-linux-gnu\".\nFor bug reporting
instructions, please see:\n"
~"<http://www.gnu.org/software/gdb/bugs/>...\n"
~"Reading symbols from
/mnt/fdrive/git/SolarJet/Apps/SolarJet/Project/qt/BinLinux32/SolarJet..."
~"done.\n"
(gdb)

Thanks,
Dov
Post by Dov Grobgeld
Please let me know if there are any tests or debug info that I can
provide you with that can help in resolving this issue. I tried
debugging it myself, but I found it to complex since I lack the
knowledge of how sub-processes are handled in emacs.
If you look at the *input/output of a.out* buffer, does it contain any
text?
Also, when you say that read_key_sequence keeps returning, what is its
return value? Also, when read_key_sequence returns early, what is the
value returned by read_char at keyboard.c:9341?
Also, is it possible for you to provide a copy of the problematic
program, or is it not allowed?
Chong Yidong
2012-05-07 02:53:52 UTC
Permalink
Post by Dov Grobgeld
The *input/output of a.out* is empty.
It seems that I was wrong about read_key_sequence(). It doesn't
"return early".
During this time, is Emacs responsive to user commands, i.e. does it
work normally apart from taking 100% CPU? Or is it just unresponsive?

Also, what is your gdb version?

Also, please set a breakpoint at process.c:4896, which should be the
line

struct Lisp_Process *p = XPROCESS (proc);

as well as the function exec_sentinel(). See if Emacs hits each
breakpoint, and step through it for the next several steps. In
exec_sentinel, please show the Lisp values of the `proc' and `reason'
variables (i.e. `pp proc' and `pp reason'.)

Basically, gdb-mi.el allocates a pty and passes it to the gdb process,
which hooks the pty up to the debugged process's input/output. That's
what the "1-inferior-tty-set /dev/pts/9" gdb command does. Emacs then
listens for input/output on the pty. Recently I fixed a bug in which
Emacs would use 100% CPU due to Emacs getting an EIO error on that pty
and then spinning; this fix involved setting up a sentinel that closes
the pty when Emacs gets EIO; it's possible the fix is not working for
you, though I don't know why. The other possibility is that the program
you are debugging does something strange with its input/output stream.

Thanks.
Dov Grobgeld
2012-05-07 05:07:44 UTC
Permalink
Hi Chong,

In response to your questions.

During the "100% CPU" time period, emacs still responds normally and files
can be opened, etc.

My gdb version is "GNU gdb (GDB) Fedora (7.2-52.fc14)". I have tried it at
home as well with a later version from Fedora 16 and the result is the same.

I put breakpoints at the lines that you indicated, but as you suspected,
the breakpoints are only reached when I exit gdb with the "quit" command.

What's next? Thanks again for looking into this.

Dov
Post by Chong Yidong
Post by Dov Grobgeld
The *input/output of a.out* is empty.
It seems that I was wrong about read_key_sequence(). It doesn't
"return early".
During this time, is Emacs responsive to user commands, i.e. does it
work normally apart from taking 100% CPU? Or is it just unresponsive?
Also, what is your gdb version?
Also, please set a breakpoint at process.c:4896, which should be the
line
struct Lisp_Process *p = XPROCESS (proc);
as well as the function exec_sentinel(). See if Emacs hits each
breakpoint, and step through it for the next several steps. In
exec_sentinel, please show the Lisp values of the `proc' and `reason'
variables (i.e. `pp proc' and `pp reason'.)
Basically, gdb-mi.el allocates a pty and passes it to the gdb process,
which hooks the pty up to the debugged process's input/output. That's
what the "1-inferior-tty-set /dev/pts/9" gdb command does. Emacs then
listens for input/output on the pty. Recently I fixed a bug in which
Emacs would use 100% CPU due to Emacs getting an EIO error on that pty
and then spinning; this fix involved setting up a sentinel that closes
the pty when Emacs gets EIO; it's possible the fix is not working for
you, though I don't know why. The other possibility is that the program
you are debugging does something strange with its input/output stream.
Thanks.
Chong Yidong
2012-05-07 06:11:42 UTC
Permalink
Post by Dov Grobgeld
I put breakpoints at the lines that you indicated, but as you
suspected, the breakpoints are only reached when I exit gdb with the
"quit" command.
What's next? Thanks again for looking into this.
Please apply the following patch, then, when Emacs is taking 100% CPU,
set a breakpoint at process.c:4854, i.e. at

if (p->pid == -2)
;

This captures the state just after Emacs calls read_process_output on
the pty passed to your program. If this breakpoint is triggered, please
report the value of nread and errno, and step through to the end of the
subsequent if/else block and report the gdb session. Thanks.


=== modified file 'src/process.c'
*** src/process.c 2012-04-20 06:39:29 +0000
--- src/process.c 2012-05-07 06:09:25 +0000
***************
*** 4847,4852 ****
--- 4847,4859 ----
buffered-ahead character if we have one. */

nread = read_process_output (proc, channel);
+
+ {
+ struct Lisp_Process *p = XPROCESS (proc);
+ if (p->pid == -2)
+ ;
+ }
+
if (nread > 0)
{
/* Since read_process_output can run a filter,
Chong Yidong
2012-05-07 06:26:14 UTC
Permalink
Actually, try the following patch instead (apparently gdb has some
issues with printing errno). Apply the patch, then when Emacs is taking
100% CPU do an interrupt and set the breakpoint at process.c:4855, then
when the breakpoint triggers do

n
p nread
p errno

and step through the subsequent if/else blocks. Thanks.

Basically, the 100% CPU appears to be because Emacs' select() call keeps
getting worken up by the pty attached to your program. But, for some
reason, no actual output being read from that pty. These debugging
steps are trying to figure out if some uncaught errno is being reported
by the pty read.


=== modified file 'src/process.c'
*** src/process.c 2012-04-20 06:39:29 +0000
--- src/process.c 2012-05-07 06:21:39 +0000
***************
*** 4822,4827 ****
--- 4822,4829 ----
&& !FD_ISSET (channel, &non_process_wait_mask))
{
int nread;
+ int saved_errno = 0;
+ struct Lisp_Process *pp;

/* If waiting for this channel, arrange to return as
soon as no more input to be processed. No more
***************
*** 4847,4852 ****
--- 4849,4859 ----
buffered-ahead character if we have one. */

nread = read_process_output (proc, channel);
+
+ pp = XPROCESS (proc);
+ if (pp->pid == -2)
+ saved_errno = errno;
+
if (nread > 0)
{
/* Since read_process_output can run a filter,
Dov Grobgeld
2012-05-08 05:33:15 UTC
Permalink
I added the above patch and the result is as follows:

After the following two lines:

nread = read_process_output (proc, channel);

pp = XPROCESS(proc);

nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).

This seems strange, as obviously the sub-process does not produce 4095
characters repeatedly.

Thanks,
Dov
Post by Chong Yidong
Actually, try the following patch instead (apparently gdb has some
issues with printing errno). Apply the patch, then when Emacs is taking
100% CPU do an interrupt and set the breakpoint at process.c:4855, then
when the breakpoint triggers do
n
p nread
p errno
and step through the subsequent if/else blocks. Thanks.
Basically, the 100% CPU appears to be because Emacs' select() call keeps
getting worken up by the pty attached to your program. But, for some
reason, no actual output being read from that pty. These debugging
steps are trying to figure out if some uncaught errno is being reported
by the pty read.
=== modified file 'src/process.c'
*** src/process.c 2012-04-20 06:39:29 +0000
--- src/process.c 2012-05-07 06:21:39 +0000
***************
*** 4822,4827 ****
--- 4822,4829 ----
&& !FD_ISSET (channel, &non_process_wait_mask))
{
int nread;
+ int saved_errno = 0;
+ struct Lisp_Process *pp;
/* If waiting for this channel, arrange to return as
soon as no more input to be processed. No more
***************
*** 4847,4852 ****
--- 4849,4859 ----
buffered-ahead character if we have one. */
nread = read_process_output (proc, channel);
+
+ pp = XPROCESS (proc);
+ if (pp->pid == -2)
+ saved_errno = errno;
+
if (nread > 0)
{
/* Since read_process_output can run a filter,
Dov Grobgeld
2012-05-08 07:56:21 UTC
Permalink
Some more info that I found through strace that might help. Alltogether
read_process_output() is called 214 times and thus a total of 870k of text
is read through /dev/ptmx to read_process_output() . Could the amount of
data possibly explain the slowness?

Regards,
Dov
Post by Chong Yidong
nread = read_process_output (proc, channel);
pp = XPROCESS(proc);
nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).
This seems strange, as obviously the sub-process does not produce 4095
characters repeatedly.
Thanks,
Dov
Post by Chong Yidong
Actually, try the following patch instead (apparently gdb has some
issues with printing errno). Apply the patch, then when Emacs is taking
100% CPU do an interrupt and set the breakpoint at process.c:4855, then
when the breakpoint triggers do
n
p nread
p errno
and step through the subsequent if/else blocks. Thanks.
Basically, the 100% CPU appears to be because Emacs' select() call keeps
getting worken up by the pty attached to your program. But, for some
reason, no actual output being read from that pty. These debugging
steps are trying to figure out if some uncaught errno is being reported
by the pty read.
=== modified file 'src/process.c'
*** src/process.c 2012-04-20 06:39:29 +0000
--- src/process.c 2012-05-07 06:21:39 +0000
***************
*** 4822,4827 ****
--- 4822,4829 ----
&& !FD_ISSET (channel, &non_process_wait_mask))
{
int nread;
+ int saved_errno = 0;
+ struct Lisp_Process *pp;
/* If waiting for this channel, arrange to return as
soon as no more input to be processed. No more
***************
*** 4847,4852 ****
--- 4849,4859 ----
buffered-ahead character if we have one. */
nread = read_process_output (proc, channel);
+
+ pp = XPROCESS (proc);
+ if (pp->pid == -2)
+ saved_errno = errno;
+
if (nread > 0)
{
/* Since read_process_output can run a filter,
Chong Yidong
2012-05-08 08:28:27 UTC
Permalink
    nread = read_process_output (proc, channel);
    pp = XPROCESS(proc);
nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).
Some more info that I found through strace that might help.
Alltogether read_process_output() is called 214 times and thus a total
of 870k of text is read through /dev/ptmx to read_process_output() .
Could the amount of data possibly explain the slowness?
Maybe, if this process IO is emitted non-stop. But this indicates that
the traffic is due to the main connection with the main gdb process
(which has a positive pid), not with the pty which gdb-mi uses for IO
(which has pid -2) like I guessed.

Could you do

M-: (setq gdb-enable-debug t) RET

and show the value of the variable `gdb-debug-log'?

For example, when I run M-x gdb on the Emacs binary itself,
`gdb-debug-log' gets 22 entries by the time I get to the (gdb) prompt;
this is the usual GDB-MI chatter. From the time I "run" the program
till the debugged program exits, it gains another 24 entries. Do you
see a lot more traffic with your program?
Dov Grobgeld
2012-05-08 11:59:36 UTC
Permalink
Indeed, my result is much fatter.

(safe-length gdb-debug-log) => 223

and the total length of the messages is about 800k. So it seems that the
time spent is simply the parsing of this large chunk of data? What gdb
command is run that outputs all the file="..." commands?

Thanks,
Dov
Post by Chong Yidong
Post by Chong Yidong
nread = read_process_output (proc, channel);
pp = XPROCESS(proc);
nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).
Some more info that I found through strace that might help.
Alltogether read_process_output() is called 214 times and thus a total
of 870k of text is read through /dev/ptmx to read_process_output() .
Could the amount of data possibly explain the slowness?
Maybe, if this process IO is emitted non-stop. But this indicates that
the traffic is due to the main connection with the main gdb process
(which has a positive pid), not with the pty which gdb-mi uses for IO
(which has pid -2) like I guessed.
Could you do
M-: (setq gdb-enable-debug t) RET
and show the value of the variable `gdb-debug-log'?
For example, when I run M-x gdb on the Emacs binary itself,
`gdb-debug-log' gets 22 entries by the time I get to the (gdb) prompt;
this is the usual GDB-MI chatter. From the time I "run" the program
till the debugged program exits, it gains another 24 entries. Do you
see a lot more traffic with your program?
Chong Yidong
2012-05-08 16:25:00 UTC
Permalink
Post by Dov Grobgeld
Indeed, my result is much fatter.
(safe-length gdb-debug-log) => 223
and the total length of the messages is about 800k. So it seems that
the time spent is simply the parsing of this large chunk of data? What
gdb command is run that outputs all the file="..." commands?
Those are status messages from turning on GDB's MI (machine interface)
system, I think, though I don't see why it makes so much difference in
your case.

If you type, from the shell,

gdb -i=mi YOUR-BINARY
r

do you similarly see a huge output?
Eli Zaretskii
2012-05-08 17:38:55 UTC
Permalink
Date: Tue, 8 May 2012 14:59:36 +0300
Indeed, my result is much fatter.
(safe-length gdb-debug-log) => 223
and the total length of the messages is about 800k.
Ouch! could you send it as a compressed attachment? I'd be interested
to see what gdb-mi.el commands generate such large output.
Eli Zaretskii
2012-05-08 17:47:57 UTC
Permalink
Date: Wed, 09 May 2012 00:25:00 +0800
Those are status messages from turning on GDB's MI (machine interface)
system, I think, though I don't see why it makes so much difference in
your case.
If you type, from the shell,
gdb -i=mi YOUR-BINARY
r
do you similarly see a huge output?
As you know, in addition to "run", gdb-mi.el sends lots of other
commands behind the scenes, so the above is not a faithful simulation
of what happens when GDB is run by Emacs. But I agree that if the
above produces similarly voluminous output, we cannot really blame
gdb-mi.el.
Dov Grobgeld
2012-05-08 21:07:05 UTC
Permalink
So my latest investigations suggest that the problem is the
"-file-list-exec-source-files" command which generates the huge output due
to the large number of files that is part of the project.

The following shows the problem from the command line.

echo -file-list-exec-source-files > /tmp/gdb.in
gdb MyExec < /tmp/gdb.in > /tmp/gdb.out
wc /tmp/gdb.out
11 69 3725105 /tmp/gdb.out

(Note that this is on a my home box, where the output is much larger than
what I reported this morning possibly due to different paths).

So there are still several questions:

* Why does it take several minutes to parse a 3.7M file? Could it be
related to the fact that gdb-mi/emacs concatinates the entire string before
trying to parse it. Still 3.7M is far too much.

* Noted that I can't run gdb MyExec < /tmp/gdb.in in a shell buffer. It
gets slower and slower while the CPU stays at 100%.

* There is a huge redundancy in gdb.out. The command
-file-list-exec-source-files should output all source files included, but
the same source files are listed multiple times. Consider the huge file
size reduction after sorting and uniq'ing:

perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }' /tmp/gdb.out |
sort | uniq | wc

3931 3931 220654

Why doesn't -file-list-exec-source file do uniq internally. This seems like
a bug in gdb.

* Why does gdb-mi.el do -file-list-exec-source-files at all? Can't it
search for source files on demand?

Regards,
Dov
Post by Eli Zaretskii
Date: Wed, 09 May 2012 00:25:00 +0800
Those are status messages from turning on GDB's MI (machine interface)
system, I think, though I don't see why it makes so much difference in
your case.
If you type, from the shell,
gdb -i=mi YOUR-BINARY
r
do you similarly see a huge output?
As you know, in addition to "run", gdb-mi.el sends lots of other
commands behind the scenes, so the above is not a faithful simulation
of what happens when GDB is run by Emacs. But I agree that if the
above produces similarly voluminous output, we cannot really blame
gdb-mi.el.
Andreas Schwab
2012-05-08 21:24:38 UTC
Permalink
Post by Dov Grobgeld
* Why does gdb-mi.el do -file-list-exec-source-files at all? Can't it
search for source files on demand?
Customize gdb-create-source-file-list to nil.

Andreas.
--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Dov Grobgeld
2012-05-08 21:30:08 UTC
Permalink
I first that at first too, but then I realized that it doesn't do it. The
gdb-mi.el source has the following logic.

(gdb-input
; Needs GDB 6.2 onwards.
(list "-file-list-exec-source-files" 'gdb-get-source-file-list))
(if gdb-create-source-file-list
(gdb-input
; Needs GDB 6.0 onwards.
(list "-file-list-exec-source-file" 'gdb-get-source-file)))

i.e. the gdb command "-file-list-exec-source-files" (note the s in files)
is called independantly from gdb-create-source-file-list which only
influences the execution of "-file-list-exec-source-file".
Post by Andreas Schwab
Post by Dov Grobgeld
* Why does gdb-mi.el do -file-list-exec-source-files at all? Can't it
search for source files on demand?
Customize gdb-create-source-file-list to nil.
Andreas.
--
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Andreas Schwab
2012-05-09 07:47:19 UTC
Permalink
Post by Dov Grobgeld
I first that at first too, but then I realized that it doesn't do it. The
gdb-mi.el source has the following logic.
(gdb-input
; Needs GDB 6.2 onwards.
(list "-file-list-exec-source-files" 'gdb-get-source-file-list))
(if gdb-create-source-file-list
(gdb-input
; Needs GDB 6.0 onwards.
(list "-file-list-exec-source-file" 'gdb-get-source-file)))
You are looking at a very old version of gdb-mi.el.

Andreas.
--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Dov Grobgeld
2012-05-09 08:44:32 UTC
Permalink
Yes, sorry. I discovered that I looked at an old version. The latest git
version indeed allows disabling the -file-list-exec-source-files. I will
use that option which indeed is a work around for the problem.

In addition, I filed a bug for gdb that it should uniq the filenames output
by -file-list-exec-source-files.

I realized that all filelist-exec-source-files is used for is to turn on
gdb minor mode for all files that are currently open in emacs. Perhaps we
should turn the problem around by asking for a gdb function that answers
the question whether a file is referenced by an executable. It would then
be possible to loop over the emacs buffers and turn on gdb minor mode if
the file is referenced by the new gdb session.

Regards,
Dov
Post by Andreas Schwab
Post by Dov Grobgeld
I first that at first too, but then I realized that it doesn't do it. The
gdb-mi.el source has the following logic.
(gdb-input
; Needs GDB 6.2 onwards.
(list "-file-list-exec-source-files" 'gdb-get-source-file-list))
(if gdb-create-source-file-list
(gdb-input
; Needs GDB 6.0 onwards.
(list "-file-list-exec-source-file" 'gdb-get-source-file)))
You are looking at a very old version of gdb-mi.el.
Andreas.
--
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Eli Zaretskii
2012-05-09 17:36:14 UTC
Permalink
Date: Wed, 9 May 2012 11:44:32 +0300
In addition, I filed a bug for gdb that it should uniq the filenames output
by -file-list-exec-source-files.
Can you provide a link to that bug report?

FWIW, when I use -file-list-exec-source-files while debugging Emacs, I
don't see duplicate file names in the GDB output. Maybe I'm blind.
Dov Grobgeld
2012-05-10 06:00:49 UTC
Permalink
Here is a link to the gdb bug:

http://sourceware.org/bugzilla/show_bug.cgi?id=14081

I tried running file-list-exec-source-files and I get duplicates as well.
Try the following:

prompt> echo -file-list-exec-source-files > /tmp/gdb.in
prompt> gdb -i=mi emacs < /tmp/gdb.in > /tmp/gdb.out
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb.out | sort | head -15
file=alloc.c
file=alloc.c
file=allocator.c
file=atimer.c
file=atimer.c
file=bidi.c
file=bidi.c
file=buffer.c
file=buffer.c
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h

My version of gdb is:

GNU gdb (GDB) Fedora (7.2-52.fc14)

For my executable gdb outputs full paths as well as the fullname field,
which expands the output considerably.

Still, it bothering me the fact that the above perl expression parses the
gdb output in a fraction of a second, (0.01s user time) whereas gdb-mi.el
takes more than 40s.

Regards,
Dov
Date: Wed, 9 May 2012 11:44:32 +0300
In addition, I filed a bug for gdb that it should uniq the filenames
output
by -file-list-exec-source-files.
Can you provide a link to that bug report?
FWIW, when I use -file-list-exec-source-files while debugging Emacs, I
don't see duplicate file names in the GDB output. Maybe I'm blind.
Chong Yidong
2012-05-10 14:13:36 UTC
Permalink
Post by Dov Grobgeld
Still, it bothering me the fact that the above perl expression parses
the gdb output in a fraction of a second, (0.01s user time) whereas
gdb-mi.el takes more than 40s.
I would also like to learn more about where the bottleneck is.

Could you do the following:

M-: (require 'gdb-mi) RET
M-: (defun gdb-get-source-file-list () nil) RET

then run M-x gdb as usual, and see if that makes any difference in
performance? Leave gdb-create-source-file-list set at t.

(The above steps cause gdb-mi to issue the -file-list-exec-source-files
command and read the output, as usual, but skip parsing the output into
the source file list.)
Eli Zaretskii
2012-05-10 16:32:04 UTC
Permalink
Date: Thu, 10 May 2012 09:00:49 +0300
I tried running file-list-exec-source-files and I get duplicates as well.
prompt> echo -file-list-exec-source-files > /tmp/gdb.in
prompt> gdb -i=mi emacs < /tmp/gdb.in > /tmp/gdb.out
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb.out | sort | head -15
file=alloc.c
file=alloc.c
file=allocator.c
file=atimer.c
file=atimer.c
file=bidi.c
file=bidi.c
file=buffer.c
file=buffer.c
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
I don't see anything like that. Here's my output:

addr=0x011b4ea5
addr=0x011b4ea5
addr=0x012329a8
disp=del
disp=del
disp=keep
enabled=y
enabled=y
enabled=y
file=../lib/allocator.h
file=../lib/careadlinkat.h
file=../lib/ignore-value.h
file=../lib/intprops.h
file=../lib/intprops.h
file=../lib/intprops.h

IOW, all the duplicates I see are header files. Not a single .c file
shows up, not even if I change "head -15" into "head -100".
GNU gdb (GDB) Fedora (7.2-52.fc14)
Maybe you should upgrade, I dunno. I use 7.4.1, FWIW.

Or maybe GCC versions later than what I have do that.
For my executable gdb outputs full paths as well as the fullname field,
which expands the output considerably.
Here too, but that's expected (and necessary).
Dov Grobgeld
2012-05-10 18:43:39 UTC
Permalink
I just downloaded gdb from cvs and tried with the latest version and there
are still duplicates. Perhaps it is because of gcc? But on the positive
side there is indeed a huge difference in the gdb output.

prompt> gdb --version
GNU gdb (GDB) Fedora (7.3.50.20110722-13.fc16)
prompt> /usr/local/public-dev/bin/gdb --version
GNU gdb (GDB) 7.4.50.20120509-cvs
prompt> gdb -i=mi MyExec < /tmp/gdb.in > /tmp/gdb-old.out
prompt> /usr/local/public-dev/bin/gdb -i=mi MyExec < /tmp/gdb.in >
/tmp/gdb-new.out
prompt> ls -1s --block-size=1 /tmp/gdb*.out
884736 /tmp/gdb-new.out
3727360 /tmp/gdb-old.out
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-old.out | sort | wc
67311 67311 3522494
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-new.out | sort | wc
14221 14221 837082
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-old.out | sort |uniq| wc
3931 3931 220654
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-new.out | sort |uniq| wc
2245 2245 137404

But even the factor 837k vs 137k is substantial, so it is still valid to do
an internal uniq within gdb. I'll try to put together a patch.

Regards,
Dov
Post by Eli Zaretskii
Date: Thu, 10 May 2012 09:00:49 +0300
I tried running file-list-exec-source-files and I get duplicates as well.
prompt> echo -file-list-exec-source-files > /tmp/gdb.in
prompt> gdb -i=mi emacs < /tmp/gdb.in > /tmp/gdb.out
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb.out | sort | head -15
file=alloc.c
file=alloc.c
file=allocator.c
file=atimer.c
file=atimer.c
file=bidi.c
file=bidi.c
file=buffer.c
file=buffer.c
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
addr=0x011b4ea5
addr=0x011b4ea5
addr=0x012329a8
disp=del
disp=del
disp=keep
enabled=y
enabled=y
enabled=y
file=../lib/allocator.h
file=../lib/careadlinkat.h
file=../lib/ignore-value.h
file=../lib/intprops.h
file=../lib/intprops.h
file=../lib/intprops.h
IOW, all the duplicates I see are header files. Not a single .c file
shows up, not even if I change "head -15" into "head -100".
GNU gdb (GDB) Fedora (7.2-52.fc14)
Maybe you should upgrade, I dunno. I use 7.4.1, FWIW.
Or maybe GCC versions later than what I have do that.
For my executable gdb outputs full paths as well as the fullname field,
which expands the output considerably.
Here too, but that's expected (and necessary).
Dov Grobgeld
2012-05-10 19:07:38 UTC
Permalink
Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.

Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s

Thus it is clear that most of the time is taken just reading the string
into emacs. But doing find-file on the same file is almost instantaneous.

Regards,
Dov
Post by Chong Yidong
Post by Dov Grobgeld
Still, it bothering me the fact that the above perl expression parses
the gdb output in a fraction of a second, (0.01s user time) whereas
gdb-mi.el takes more than 40s.
I would also like to learn more about where the bottleneck is.
M-: (require 'gdb-mi) RET
M-: (defun gdb-get-source-file-list () nil) RET
then run M-x gdb as usual, and see if that makes any difference in
performance? Leave gdb-create-source-file-list set at t.
(The above steps cause gdb-mi to issue the -file-list-exec-source-files
command and read the output, as usual, but skip parsing the output into
the source file list.)
Stefan Monnier
2012-05-10 20:25:59 UTC
Permalink
Post by Dov Grobgeld
Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.
Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s
Thus it is clear that most of the time is taken just reading the string
into emacs. But doing find-file on the same file is almost instantaneous.
Sounds like a perfect test case for the native elisp profiler (a
prototype of which is at
http://cx4a.org/hack/emacs-native-profiler.html, and hopefully it will
mature over the summer since it's funded as a GSoC).


Stefan
Chong Yidong
2012-05-11 06:33:52 UTC
Permalink
Post by Dov Grobgeld
Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.
Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s
Thus it is clear that most of the time is taken just reading the
string into emacs. But doing find-file on the same file is almost
instantaneous.
Here's another little experiment. Could you apply the two attached
patches, individually, and see what difference each patch makes?
(Again, with gdb-create-source-file-list at its default of t, and
without any other patches to gdb-mi.el.)
Dov Grobgeld
2012-05-11 08:29:57 UTC
Permalink
Here are the results.

With patch1: ~155s
With patch2: ~112s

On the other hand, I have patched gdb, seehttp://
sourceware.org/bugzilla/show_bug.cgi?id=14081, to do uniq internally in
gdb, and then it takes only about 10s.
Post by Chong Yidong
Post by Dov Grobgeld
Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.
Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s
Thus it is clear that most of the time is taken just reading the
string into emacs. But doing find-file on the same file is almost
instantaneous.
Here's another little experiment. Could you apply the two attached
patches, individually, and see what difference each patch makes?
(Again, with gdb-create-source-file-list at its default of t, and
without any other patches to gdb-mi.el.)
Eli Zaretskii
2012-05-11 09:47:53 UTC
Permalink
Date: Fri, 11 May 2012 11:29:57 +0300
With patch1: ~155s
With patch2: ~112s
On the other hand, I have patched gdb, seehttp://
sourceware.org/bugzilla/show_bug.cgi?id=14081, to do uniq internally in
gdb, and then it takes only about 10s.
Which is much faster, but still way too slow.

I think it's quite clear at this point that what takes time is reading
the stuff from GDB, not parsing it. Chong, am I right?
Chong Yidong
2012-05-11 13:27:36 UTC
Permalink
Post by Eli Zaretskii
I think it's quite clear at this point that what takes time is reading
the stuff from GDB, not parsing it. Chong, am I right?
That is what the evidence suggests, which indicates that we won't be
able to fix it for 24.1. Maybe there is some problem in the adaptive
read buffering code?
Dov Grobgeld
2012-11-05 20:36:16 UTC
Permalink
Is there any chance of having this bug solved in the foreseeable future?
This bug is the reason I am still sticking with emacs-23. As an
alternative, is it possible to use the old gdb-mode (without -mi) in
emacs-24?
Post by Chong Yidong
Post by Eli Zaretskii
I think it's quite clear at this point that what takes time is reading
the stuff from GDB, not parsing it. Chong, am I right?
That is what the evidence suggests, which indicates that we won't be
able to fix it for 24.1. Maybe there is some problem in the adaptive
read buffering code?
Eli Zaretskii
2012-11-05 20:46:47 UTC
Permalink
Date: Mon, 5 Nov 2012 22:36:16 +0200
As an alternative, is it possible to use the old gdb-mode (without
-mi) in emacs-24?
It was always possible: type "M-x gud-gdb RET".
Stefan Monnier
2012-11-05 23:51:41 UTC
Permalink
Post by Eli Zaretskii
As an alternative, is it possible to use the old gdb-mode (without
-mi) in emacs-24?
It was always possible: type "M-x gud-gdb RET".
IIUC this gives you the Emacs-22 version of M-x gdb, but not the
Emacs-23 one (which used the gdb-ui.el code).


Stefan
Jean-Philippe Gravel
2012-12-14 04:14:59 UTC
Permalink
Hi guys,



I actually have a fix for this problem.



I use Emacs to debug a software of colossal size. With Emacs 23, it used
to be possible to start gdb within a minute. With Emacs 24, it takes more
than a couple of hour. I never actually had the patience to wait for it to
finish.



As stated in previous posts, the command -file-list-exec-source-files is
the one triggering the bottleneck. In my case, the reply is 5Mb long.



It turns out that the problem IS the parsing of the data stream coming from
GDB. The function gud-gdbmi-marker-filter is home to a colossal
bottleneck. You’ll find in there not only one, but 4 major algorithmic
problems.



The first thing that strikes the eye is the way it handles the fact that
there are several types of GDB replies possible. Instead of parsing the
data stream linearly, accepting the GDB messages in the order of arrival,
it looks over the whole stream for the first type of message, then the
second, then the next. Complexity: O(nm), where n is the size of the steam
and m is the number of message types. Then, when finding a message
matching the type we look for, it is removed from the stream by cloning the
whole string, but padding the original location with spaces. Complexity:
O(ni), where i is the number of messages.



To go on with the analysis, consider the following: my tests indicate that
gdbmi-marker-filter receives data by chunk of about 225 bytes. Since I
receive 5Mb of data, gdbmi ingests about 22000 packets. For every chunk of
data received, the incoming data is concatenated to a new string: O(nj),
where j is the number of packets. Each time, the whole parsing described
above is restarted: O(nmj) (!!!). Feed in a 5Mb data stream and you’ll get
a hung emacs!



I fixed the above by writing a proper parser reading the data stream in
O(n). With my test-case, the parse time goes down from ‘too long to even
tell’, to about 3 seconds. Not only does it fix the startup problem, but
it makes pretty much all other gud commands impressively faster. With
emacs 23, it used to take almost a full second to do a single “gud-next” in
my software. When enabling gud-tooltip-mode, it was becoming totally
unusable. By removing the bottleneck in gud-gdbmi-marker-filter, the above
become instantaneous.



I just need to finalize a few things and I can send you a patch file.



Cheers!

Loading...