bug#10580: 24.0.92; gdb initialization takes more than one minute at 100% CPU

When trying to running gdb on my process, emacs gets stuck for more than
one minutes while the CPU load is at 100%.

Does this happen with `emacs -Q'? Because it looks like you may not be

/home/dov/.config/emacs/gdb-mi hides
/usr/local/public-dev/share/emacs/24.0.92/lisp/progmodes/gdb-mi

Glenn Morris

2012-01-23 09:21:09 UTC

Thanks for your reply Glenn. Is there a public bugzilla like site for
emacs bugs or are they managed in email only. I have additional
problems with gdb in emacs 24 and I would like to know if the problems
have been reported.

You should have received this email acknowledgement of your report:

http://debbugs.gnu.org/cgi/bugreport.cgi?msg=4;bug=10580

which contains a URL pointing to your report. From there you can
easily find the front page: http://debbugs.gnu.org

I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU. Here is the output
from `report-emacs bug` when running with 'emacs -Q'.
In GNU Emacs 24.0.92.1 (i686-pc-linux-gnu, GTK+ Version 2.22.0)
of 2012-01-22 on dovg32
Windowing system distributor `Fedora Project', version 11.0.10905000
configured using `configure '--prefix=/usr/local/public-dev''
value of $LC_ALL: nil
value of $LC_COLLATE: nil
value of $LC_CTYPE: nil
value of $LC_MESSAGES: nil
value of $LC_MONETARY: nil
value of $LC_NUMERIC: nil
value of $LC_TIME: nil
value of $LANG: nil
locale-coding-system: nil
default enable-multibyte-characters: t
Major mode: Debugger
tooltip-mode: t
mouse-wheel-mode: t
tool-bar-mode: t
menu-bar-mode: t
file-name-shadow-mode: t
global-font-lock-mode: t
font-lock-mode: t
blink-cursor-mode: t
auto-composition-mode: t
auto-encryption-mode: t
auto-compression-mode: t
line-number-mode: t
transient-mark-mode: t
<help-echo> C-x C-f C-a C-y C-k <S-insert> C-a C-k
<help-echo> <S-insert> C-a C-k <help-echo> <down-mouse-2>
<mouse-2> <help-echo> <help-echo> <help-echo> <return>
M-x g d b <return> <backspace> <backspace> <backspace>
<backspace> <backspace> <backspace> <backspace> <backspace>
<backspace> <backspace> <backspace> <backspace> B i
n L <tab> 3 2 <tab> S o <tab> <return> <help-echo>
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo>
C-x k <return> y e s <return> M-x M-p <return> <return>
M-x g d b <tab> <backspace> <backspace> <backspace>
b u g <tab> <C-backspace> <C-backspace> r e p <tab>
o r <tab> <return>
For information about GNU Emacs and the GNU system, type C-h C-a.
current-kill: Kill ring is empty
Making completion list...
None found.
(shadow sort gnus-util mail-extr message format-spec rfc822 mml mml-sec
mm-decode mm-bodies mm-encode mail-parse rfc2231 rfc2047 rfc2045
ietf-drums mm-util mail-prsvr mailabbrev mail-utils gmm-utils mailheader
emacsbug help-mode easymenu view gdb-mi bindat json gud easy-mmode
comint ring dired regexp-opt time-date tooltip ediff-hook vc-hooks
lisp-float-type mwheel x-win x-dnd tool-bar dnd fontset image fringe
lisp-mode register page menu-bar rfn-eshadow timer select scroll-bar
mouse jit-lock font-lock syntax facemenu font-core frame cham georgian
utf-8-lang misc-lang vietnamese tibetan thai tai-viet lao korean
japanese hebrew greek romanian slovak czech european ethiopic indian
cyrillic chinese case-table epa-hook jka-cmpr-hook help simple abbrev
minibuffer loaddefs button faces cus-face files text-properties overlay
sha1 md5 base64 format env code-pages mule custom widget
hashtable-print-readable backquote make-network-process dynamic-setting
system-font-setting font-render-setting move-toolbar gtk x-toolkit x
multi-tty emacs)

Glenn Morris

2012-01-25 00:37:14 UTC

I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU.

Is this with everything you try to debug, or just certain things?

Can you try M-x toggle-debug-on-quit, then interrupt Emacs with ctrl-g
during those 40 seconds and see if you get a backtrace?

Or try M-x edebug-defun on the `gdb' function, step through it, and see
what is taking the time.

Guesses: do you have a huge .gdb_history or ~/.gdbinit file?

Dov Grobgeld

2012-01-25 08:49:29 UTC

Here are some more tests:

1. It doesn't get stuck when debugging a "hello-world.c" program. Thus it
depends on the executable.
2. Regarding toggle-debug-on-quit and C-g, it doesn't work. No backtrace is
produced and the CPU continues to be at 100%.
3. I tried running edebug on gdb, which wasn't easy. I had to manually
first do eval-buffer on gdb-mi.el and gud.el. But in the end I managed and
found that the CPU is spend during (run-hooks 'gdb-mode-hook) . I'll try to
investigate it further in the next few days.

Regards,
Dov

I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU.

Is this with everything you try to debug, or just certain things?
Can you try M-x toggle-debug-on-quit, then interrupt Emacs with ctrl-g
during those 40 seconds and see if you get a backtrace?
Or try M-x edebug-defun on the `gdb' function, step through it, and see
what is taking the time.
Guesses: do you have a huge .gdb_history or ~/.gdbinit file?

Dov Grobgeld

2012-01-25 09:39:25 UTC

More tests.

1. I was mistaken in my previous email, the CPU spending was triggered
before gdb-mode-hook.
2. The following command in gdb-input starts the 40s 100% CPU:

(process-send-string (get-buffer-process gud-comint-buffer)
(concat command "\n")))

where command="1-inferior-tty-set /dev/pts/9". Note that the command
returns immediately, but some other thread apparently gets very busy.

Is this enough info, or do you want me to probe deeper?

Post by Dov Grobgeld
1. It doesn't get stuck when debugging a "hello-world.c" program. Thus it
depends on the executable.
2. Regarding toggle-debug-on-quit and C-g, it doesn't work. No backtrace
is produced and the CPU continues to be at 100%.
3. I tried running edebug on gdb, which wasn't easy. I had to manually
first do eval-buffer on gdb-mi.el and gud.el. But in the end I managed and
found that the CPU is spend during (run-hooks 'gdb-mode-hook) . I'll try to
investigate it further in the next few days.
Regards,
Dov

I tried again with emacs -Q and the same thing happens. To be more
precise the startup time is about 40s at 100% CPU.

Is this with everything you try to debug, or just certain things?
Can you try M-x toggle-debug-on-quit, then interrupt Emacs with ctrl-g
during those 40 seconds and see if you get a backtrace?
Or try M-x edebug-defun on the `gdb' function, step through it, and see
what is taking the time.
Guesses: do you have a huge .gdb_history or ~/.gdbinit file?

Glenn Morris

2012-01-25 19:05:35 UTC

Post by Dov Grobgeld
(process-send-string (get-buffer-process gud-comint-buffer)
(concat command "\n")))
where command="1-inferior-tty-set /dev/pts/9". Note that the command
returns immediately, but some other thread apparently gets very busy.

Thanks for the detective work. I'm afraid I don't know what to do about
this. It looks like process handling is messed up somehow.

I hope someone else on this list can help you further.

Dov Grobgeld

2012-04-30 05:33:31 UTC

I finally ran emacs-24 under debugger to figure out where it got
stuck. What I found was that it is stuck in keyboard.c where in
read_key_sequence() the control keeps jumping back to the
replay_sequence label. Does this ring a bell to someone?

Regards,
Dov

(process-send-string (get-buffer-process gud-comint-buffer)
(concat command "\n")))
where command="1-inferior-tty-set /dev/pts/9". Note that the command
returns immediately, but some other thread apparently gets very busy.

Thanks for the detective work. I'm afraid I don't know what to do about
this. It looks like process handling is messed up somehow.
I hope someone else on this list can help you further.

Dov Grobgeld

2012-04-30 06:36:40 UTC

Here is some more clarification. In keyboard.c:command_loop_1() the
line i=read_key_sequence() blocks during interactive work until a key
is pressed. But when the process is stuck in "gdb" read_key_sequence()
does not block but keeps returning. I still have no clue of what goes
on though.

Post by Dov Grobgeld
I finally ran emacs-24 under debugger to figure out where it got
stuck. What I found was that it is stuck in keyboard.c where in
read_key_sequence() the control keeps jumping back to the
replay_sequence label. Does this ring a bell to someone?
Regards,
Dov

Thanks for the detective work. I'm afraid I don't know what to do about
this. It looks like process handling is messed up somehow.
I hope someone else on this list can help you further.

Chong Yidong

2012-05-06 04:13:46 UTC

Post by Dov Grobgeld
Here is some more clarification. In keyboard.c:command_loop_1() the
line i=read_key_sequence() blocks during interactive work until a key
is pressed. But when the process is stuck in "gdb" read_key_sequence()
does not block but keeps returning. I still have no clue of what goes
on though.

Is this still with 24.0.92, or the latest pretest 24.0.96, or the
emacs-24 branch? I made some changes to the pty handling a few weeks
ago (which should be included in 24.0.96), which may impact this issue.

Dov Grobgeld

2012-05-06 04:55:29 UTC

Thank you very much for looking into this. For me, this is currently, the
main show stopper for starting to emacs-24.

I have just tested the same scenario with the latest git HEAD version from
http://git.savannah.gnu.org/r/emacs.git/ :

823d681 Optionally include holidays in cal-html output

and unfortunately the exact same behavior as before still remains.

Please let me know if there are any tests or debug info that I can provide
you with that can help in resolving this issue. I tried debugging it
myself, but I found it to complex since I lack the knowledge of how
sub-processes are handled in emacs.

Regards,
Dov

Chong Yidong

2012-05-06 05:39:56 UTC

Post by Dov Grobgeld
Please let me know if there are any tests or debug info that I can
provide you with that can help in resolving this issue. I tried
debugging it myself, but I found it to complex since I lack the
knowledge of how sub-processes are handled in emacs.

If you look at the *input/output of a.out* buffer, does it contain any
text?

Also, when you say that read_key_sequence keeps returning, what is its
return value? Also, when read_key_sequence returns early, what is the
value returned by read_char at keyboard.c:9341?

Also, is it possible for you to provide a copy of the problematic
program, or is it not allowed?

Dov Grobgeld

2012-05-06 07:06:13 UTC

The *input/output of a.out* is empty.

It seems that I was wrong about read_key_sequence(). It doesn't "return
early". Here is a stack trace during the timeout:

#0 0x00110424 in __kernel_vsyscall ()
#1 0x005f288d in ___newselect_nocancel () at
../sysdeps/unix/syscall-template.S:82
#2 0x08149691 in xg_select (max_fds=17, rfds=0xbfffe860, wfds=0xbfffe7e0,
efds=0x0, timeout=0xbfffe7d4) at xgselect.c:100
#3 0x0821ffb0 in wait_reading_process_output (time_limit=30, microsecs=0,
read_kbd=-1, do_display=1, wait_for_cell=138903722, wait_proc=0x0,
just_wait_proc=0) at process.c:4620
#4 0x080601c7 in sit_for (timeout=120, reading=1, do_display=1) at
dispnew.c:6068
#5 0x08159606 in read_char (commandflag=1, nmaps=3, maps=0xbfffec50,
prev_event=138903722, used_mouse_menu=0xbfffed18, end_time=0x0) at
keyboard.c:2698
#6 0x08163f12 in read_key_sequence (keybuf=0xbfffee94, bufsize=30,
prompt=138903722, dont_downcase_last=0, can_return_switch_frame=1,
fix_current_buffer=1) at keyboard.c:9341
#7 0x08157472 in command_loop_1 () at keyboard.c:1455
#8 0x081d629d in internal_condition_case (bfun=0x815711b <command_loop_1>,
handlers=138934706, hfun=0x8156ad5 <cmd_error>) at eval.c:1448
#9 0x08156e49 in command_loop_2 (ignore=138903722) at keyboard.c:1160
#10 0x081d5d6f in internal_catch (tag=138932706, func=0x8156e25
<command_loop_2>, arg=138903722) at eval.c:1205
#11 0x08156e05 in command_loop () at keyboard.c:1139
#12 0x0815670e in recursive_edit_1 () at keyboard.c:759
#13 0x0815685f in Frecursive_edit () at keyboard.c:823
#14 0x08154d65 in main (argc=1, argv=0xbffff5e4) at emacs.c:1711

read_char() returns 158534621.

Sorry, I can't redistribute the executable. If the problem can't be solved
by otherwise, I'll try to generate another executable that has this
problem. In any case, here is the output when running gdb from the command
line on the executable.

gdb -i=mi SolarJet

=thread-group-added,id="i1"
~"GNU gdb (GDB) Fedora (7.2-52.fc14)\n"
~"Copyright (C) 2010 Free Software Foundation, Inc.\n"
~"License GPLv3+: GNU GPL version 3 or later <
http://gnu.org/licenses/gpl.html>\nThis is free software: you are free to
change and redistribute it.\nThere is NO WARRANTY, to the extent permitted
by law. Type \"show copying\"\nand \"show warranty\" for details.\n"
~"This GDB was configured as \"i686-redhat-linux-gnu\".\nFor bug reporting
instructions, please see:\n"
~"<http://www.gnu.org/software/gdb/bugs/>...\n"
~"Reading symbols from
/mnt/fdrive/git/SolarJet/Apps/SolarJet/Project/qt/BinLinux32/SolarJet..."
~"done.\n"
(gdb)

Thanks,
Dov

If you look at the *input/output of a.out* buffer, does it contain any
text?
Also, when you say that read_key_sequence keeps returning, what is its
return value? Also, when read_key_sequence returns early, what is the
value returned by read_char at keyboard.c:9341?
Also, is it possible for you to provide a copy of the problematic
program, or is it not allowed?

Chong Yidong

2012-05-07 02:53:52 UTC

Post by Dov Grobgeld
The *input/output of a.out* is empty.
It seems that I was wrong about read_key_sequence(). It doesn't
"return early".

During this time, is Emacs responsive to user commands, i.e. does it
work normally apart from taking 100% CPU? Or is it just unresponsive?

Also, what is your gdb version?

Also, please set a breakpoint at process.c:4896, which should be the
line

struct Lisp_Process *p = XPROCESS (proc);

as well as the function exec_sentinel(). See if Emacs hits each
breakpoint, and step through it for the next several steps. In
exec_sentinel, please show the Lisp values of the `proc' and `reason'
variables (i.e. `pp proc' and `pp reason'.)

Basically, gdb-mi.el allocates a pty and passes it to the gdb process,
which hooks the pty up to the debugged process's input/output. That's
what the "1-inferior-tty-set /dev/pts/9" gdb command does. Emacs then
listens for input/output on the pty. Recently I fixed a bug in which
Emacs would use 100% CPU due to Emacs getting an EIO error on that pty
and then spinning; this fix involved setting up a sentinel that closes
the pty when Emacs gets EIO; it's possible the fix is not working for
you, though I don't know why. The other possibility is that the program
you are debugging does something strange with its input/output stream.

Thanks.

Dov Grobgeld

2012-05-07 05:07:44 UTC

Hi Chong,

In response to your questions.

During the "100% CPU" time period, emacs still responds normally and files
can be opened, etc.

My gdb version is "GNU gdb (GDB) Fedora (7.2-52.fc14)". I have tried it at
home as well with a later version from Fedora 16 and the result is the same.

I put breakpoints at the lines that you indicated, but as you suspected,
the breakpoints are only reached when I exit gdb with the "quit" command.

What's next? Thanks again for looking into this.

Dov

Post by Dov Grobgeld
The *input/output of a.out* is empty.
It seems that I was wrong about read_key_sequence(). It doesn't
"return early".

During this time, is Emacs responsive to user commands, i.e. does it
work normally apart from taking 100% CPU? Or is it just unresponsive?
Also, what is your gdb version?
Also, please set a breakpoint at process.c:4896, which should be the
line
struct Lisp_Process *p = XPROCESS (proc);
as well as the function exec_sentinel(). See if Emacs hits each
breakpoint, and step through it for the next several steps. In
exec_sentinel, please show the Lisp values of the `proc' and `reason'
variables (i.e. `pp proc' and `pp reason'.)
Basically, gdb-mi.el allocates a pty and passes it to the gdb process,
which hooks the pty up to the debugged process's input/output. That's
what the "1-inferior-tty-set /dev/pts/9" gdb command does. Emacs then
listens for input/output on the pty. Recently I fixed a bug in which
Emacs would use 100% CPU due to Emacs getting an EIO error on that pty
and then spinning; this fix involved setting up a sentinel that closes
the pty when Emacs gets EIO; it's possible the fix is not working for
you, though I don't know why. The other possibility is that the program
you are debugging does something strange with its input/output stream.
Thanks.

Chong Yidong

2012-05-07 06:11:42 UTC

Post by Dov Grobgeld
I put breakpoints at the lines that you indicated, but as you
suspected, the breakpoints are only reached when I exit gdb with the
"quit" command.
What's next? Thanks again for looking into this.

Please apply the following patch, then, when Emacs is taking 100% CPU,
set a breakpoint at process.c:4854, i.e. at

if (p->pid == -2)
;

This captures the state just after Emacs calls read_process_output on
the pty passed to your program. If this breakpoint is triggered, please
report the value of nread and errno, and step through to the end of the
subsequent if/else block and report the gdb session. Thanks.

=== modified file 'src/process.c'
*** src/process.c 2012-04-20 06:39:29 +0000
--- src/process.c 2012-05-07 06:09:25 +0000
***************
*** 4847,4852 ****
--- 4847,4859 ----
buffered-ahead character if we have one. */

nread = read_process_output (proc, channel);
+
+ {
+ struct Lisp_Process *p = XPROCESS (proc);
+ if (p->pid == -2)
+ ;
+ }
+
if (nread > 0)
{
/* Since read_process_output can run a filter,

Chong Yidong

2012-05-07 06:26:14 UTC

Actually, try the following patch instead (apparently gdb has some
issues with printing errno). Apply the patch, then when Emacs is taking
100% CPU do an interrupt and set the breakpoint at process.c:4855, then
when the breakpoint triggers do

n
p nread
p errno

and step through the subsequent if/else blocks. Thanks.

Basically, the 100% CPU appears to be because Emacs' select() call keeps
getting worken up by the pty attached to your program. But, for some
reason, no actual output being read from that pty. These debugging
steps are trying to figure out if some uncaught errno is being reported
by the pty read.

=== modified file 'src/process.c'
*** src/process.c 2012-04-20 06:39:29 +0000
--- src/process.c 2012-05-07 06:21:39 +0000
***************
*** 4822,4827 ****
--- 4822,4829 ----
&& !FD_ISSET (channel, &non_process_wait_mask))
{
int nread;
+ int saved_errno = 0;
+ struct Lisp_Process *pp;

/* If waiting for this channel, arrange to return as
soon as no more input to be processed. No more
***************
*** 4847,4852 ****
--- 4849,4859 ----
buffered-ahead character if we have one. */

nread = read_process_output (proc, channel);
+
+ pp = XPROCESS (proc);
+ if (pp->pid == -2)
+ saved_errno = errno;
+
if (nread > 0)
{
/* Since read_process_output can run a filter,

Dov Grobgeld

2012-05-08 05:33:15 UTC

I added the above patch and the result is as follows:

After the following two lines:

nread = read_process_output (proc, channel);

pp = XPROCESS(proc);

nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).

This seems strange, as obviously the sub-process does not produce 4095
characters repeatedly.

Thanks,
Dov

Post by Chong Yidong
Actually, try the following patch instead (apparently gdb has some
issues with printing errno). Apply the patch, then when Emacs is taking
100% CPU do an interrupt and set the breakpoint at process.c:4855, then
when the breakpoint triggers do
n
p nread
p errno
and step through the subsequent if/else blocks. Thanks.
Basically, the 100% CPU appears to be because Emacs' select() call keeps
getting worken up by the pty attached to your program. But, for some
reason, no actual output being read from that pty. These debugging
steps are trying to figure out if some uncaught errno is being reported
by the pty read.
=== modified file 'src/process.c'
*** src/process.c 2012-04-20 06:39:29 +0000
--- src/process.c 2012-05-07 06:21:39 +0000
***************
*** 4822,4827 ****
--- 4822,4829 ----
&& !FD_ISSET (channel, &non_process_wait_mask))
{
int nread;
+ int saved_errno = 0;
+ struct Lisp_Process *pp;
/* If waiting for this channel, arrange to return as
soon as no more input to be processed. No more
***************
*** 4847,4852 ****
--- 4849,4859 ----
buffered-ahead character if we have one. */
nread = read_process_output (proc, channel);
+
+ pp = XPROCESS (proc);
+ if (pp->pid == -2)
+ saved_errno = errno;
+
if (nread > 0)
{
/* Since read_process_output can run a filter,

Dov Grobgeld

2012-05-08 07:56:21 UTC

Some more info that I found through strace that might help. Alltogether
read_process_output() is called 214 times and thus a total of 870k of text
is read through /dev/ptmx to read_process_output() . Could the amount of
data possibly explain the slowness?

Regards,
Dov

Post by Chong Yidong
nread = read_process_output (proc, channel);
pp = XPROCESS(proc);
nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).
This seems strange, as obviously the sub-process does not produce 4095
characters repeatedly.
Thanks,
Dov

Chong Yidong

2012-05-08 08:28:27 UTC

nread = read_process_output (proc, channel);
pp = XPROCESS(proc);
nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).
Some more info that I found through strace that might help.
Alltogether read_process_output() is called 214 times and thus a total
of 870k of text is read through /dev/ptmx to read_process_output() .
Could the amount of data possibly explain the slowness?

Maybe, if this process IO is emitted non-stop. But this indicates that
the traffic is due to the main connection with the main gdb process
(which has a positive pid), not with the pty which gdb-mi uses for IO
(which has pid -2) like I guessed.

Could you do

M-: (setq gdb-enable-debug t) RET

and show the value of the variable `gdb-debug-log'?

For example, when I run M-x gdb on the Emacs binary itself,
`gdb-debug-log' gets 22 entries by the time I get to the (gdb) prompt;
this is the usual GDB-MI chatter. From the time I "run" the program
till the debugged program exits, it gains another 24 entries. Do you
see a lot more traffic with your program?

Dov Grobgeld

2012-05-08 11:59:36 UTC

Indeed, my result is much fatter.

(safe-length gdb-debug-log) => 223

and the total length of the messages is about 800k. So it seems that the
time spent is simply the parsing of this large chunk of data? What gdb
command is run that outputs all the file="..." commands?

Thanks,
Dov

Post by Chong Yidong
nread = read_process_output (proc, channel);
pp = XPROCESS(proc);
nread==4095, pp->pid=1234 repeatedly. (Actually 1234 seems to be an
arbitrary, but constant number between 1000 and 2000).
Some more info that I found through strace that might help.
Alltogether read_process_output() is called 214 times and thus a total
of 870k of text is read through /dev/ptmx to read_process_output() .
Could the amount of data possibly explain the slowness?

Maybe, if this process IO is emitted non-stop. But this indicates that
the traffic is due to the main connection with the main gdb process
(which has a positive pid), not with the pty which gdb-mi uses for IO
(which has pid -2) like I guessed.
Could you do
M-: (setq gdb-enable-debug t) RET
and show the value of the variable `gdb-debug-log'?
For example, when I run M-x gdb on the Emacs binary itself,
`gdb-debug-log' gets 22 entries by the time I get to the (gdb) prompt;
this is the usual GDB-MI chatter. From the time I "run" the program
till the debugged program exits, it gains another 24 entries. Do you
see a lot more traffic with your program?

Chong Yidong

2012-05-08 16:25:00 UTC

Post by Dov Grobgeld
Indeed, my result is much fatter.
(safe-length gdb-debug-log) => 223
and the total length of the messages is about 800k. So it seems that
the time spent is simply the parsing of this large chunk of data? What
gdb command is run that outputs all the file="..." commands?

Those are status messages from turning on GDB's MI (machine interface)
system, I think, though I don't see why it makes so much difference in
your case.

If you type, from the shell,

gdb -i=mi YOUR-BINARY
r

do you similarly see a huge output?

Eli Zaretskii

2012-05-08 17:38:55 UTC

Date: Tue, 8 May 2012 14:59:36 +0300
Indeed, my result is much fatter.
(safe-length gdb-debug-log) => 223
and the total length of the messages is about 800k.

Ouch! could you send it as a compressed attachment? I'd be interested
to see what gdb-mi.el commands generate such large output.

Eli Zaretskii

2012-05-08 17:47:57 UTC

Date: Wed, 09 May 2012 00:25:00 +0800
Those are status messages from turning on GDB's MI (machine interface)
system, I think, though I don't see why it makes so much difference in
your case.
If you type, from the shell,
gdb -i=mi YOUR-BINARY
r
do you similarly see a huge output?

As you know, in addition to "run", gdb-mi.el sends lots of other
commands behind the scenes, so the above is not a faithful simulation
of what happens when GDB is run by Emacs. But I agree that if the
above produces similarly voluminous output, we cannot really blame
gdb-mi.el.

Dov Grobgeld

2012-05-08 21:07:05 UTC

So my latest investigations suggest that the problem is the
"-file-list-exec-source-files" command which generates the huge output due
to the large number of files that is part of the project.

The following shows the problem from the command line.

echo -file-list-exec-source-files > /tmp/gdb.in
gdb MyExec < /tmp/gdb.in > /tmp/gdb.out
wc /tmp/gdb.out
11 69 3725105 /tmp/gdb.out

(Note that this is on a my home box, where the output is much larger than
what I reported this morning possibly due to different paths).

So there are still several questions:

* Why does it take several minutes to parse a 3.7M file? Could it be
related to the fact that gdb-mi/emacs concatinates the entire string before
trying to parse it. Still 3.7M is far too much.

* Noted that I can't run gdb MyExec < /tmp/gdb.in in a shell buffer. It
gets slower and slower while the CPU stays at 100%.

* There is a huge redundancy in gdb.out. The command
-file-list-exec-source-files should output all source files included, but
the same source files are listed multiple times. Consider the huge file
size reduction after sorting and uniq'ing:

perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }' /tmp/gdb.out |
sort | uniq | wc

3931 3931 220654

Why doesn't -file-list-exec-source file do uniq internally. This seems like
a bug in gdb.

* Why does gdb-mi.el do -file-list-exec-source-files at all? Can't it
search for source files on demand?

Regards,
Dov

Post by Eli Zaretskii

Andreas Schwab

2012-05-08 21:24:38 UTC

Post by Dov Grobgeld
* Why does gdb-mi.el do -file-list-exec-source-files at all? Can't it
search for source files on demand?

Customize gdb-create-source-file-list to nil.

Andreas.

--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Dov Grobgeld

2012-05-08 21:30:08 UTC

I first that at first too, but then I realized that it doesn't do it. The
gdb-mi.el source has the following logic.

(gdb-input
; Needs GDB 6.2 onwards.
(list "-file-list-exec-source-files" 'gdb-get-source-file-list))
(if gdb-create-source-file-list
(gdb-input
; Needs GDB 6.0 onwards.
(list "-file-list-exec-source-file" 'gdb-get-source-file)))

i.e. the gdb command "-file-list-exec-source-files" (note the s in files)
is called independantly from gdb-create-source-file-list which only
influences the execution of "-file-list-exec-source-file".

Post by Andreas Schwab

Post by Dov Grobgeld
* Why does gdb-mi.el do -file-list-exec-source-files at all? Can't it
search for source files on demand?

Customize gdb-create-source-file-list to nil.
Andreas.
--
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Andreas Schwab

2012-05-09 07:47:19 UTC

Post by Dov Grobgeld
I first that at first too, but then I realized that it doesn't do it. The
gdb-mi.el source has the following logic.
(gdb-input
; Needs GDB 6.2 onwards.
(list "-file-list-exec-source-files" 'gdb-get-source-file-list))
(if gdb-create-source-file-list
(gdb-input
; Needs GDB 6.0 onwards.
(list "-file-list-exec-source-file" 'gdb-get-source-file)))

You are looking at a very old version of gdb-mi.el.

Andreas.

--
Andreas Schwab, ***@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Dov Grobgeld

2012-05-09 08:44:32 UTC

Yes, sorry. I discovered that I looked at an old version. The latest git
version indeed allows disabling the -file-list-exec-source-files. I will
use that option which indeed is a work around for the problem.

In addition, I filed a bug for gdb that it should uniq the filenames output
by -file-list-exec-source-files.

I realized that all filelist-exec-source-files is used for is to turn on
gdb minor mode for all files that are currently open in emacs. Perhaps we
should turn the problem around by asking for a gdb function that answers
the question whether a file is referenced by an executable. It would then
be possible to loop over the emacs buffers and turn on gdb minor mode if
the file is referenced by the new gdb session.

Regards,
Dov

Post by Andreas Schwab

You are looking at a very old version of gdb-mi.el.
Andreas.
--
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Eli Zaretskii

2012-05-09 17:36:14 UTC

Date: Wed, 9 May 2012 11:44:32 +0300
In addition, I filed a bug for gdb that it should uniq the filenames output
by -file-list-exec-source-files.

Can you provide a link to that bug report?

FWIW, when I use -file-list-exec-source-files while debugging Emacs, I
don't see duplicate file names in the GDB output. Maybe I'm blind.

Dov Grobgeld

2012-05-10 06:00:49 UTC

Here is a link to the gdb bug:

http://sourceware.org/bugzilla/show_bug.cgi?id=14081

I tried running file-list-exec-source-files and I get duplicates as well.
Try the following:

prompt> echo -file-list-exec-source-files > /tmp/gdb.in
prompt> gdb -i=mi emacs < /tmp/gdb.in > /tmp/gdb.out
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb.out | sort | head -15
file=alloc.c
file=alloc.c
file=allocator.c
file=atimer.c
file=atimer.c
file=bidi.c
file=bidi.c
file=buffer.c
file=buffer.c
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h

My version of gdb is:

GNU gdb (GDB) Fedora (7.2-52.fc14)

For my executable gdb outputs full paths as well as the fullname field,
which expands the output considerably.

Still, it bothering me the fact that the above perl expression parses the
gdb output in a fraction of a second, (0.01s user time) whereas gdb-mi.el
takes more than 40s.

Regards,
Dov

Date: Wed, 9 May 2012 11:44:32 +0300
In addition, I filed a bug for gdb that it should uniq the filenames

output

by -file-list-exec-source-files.

Can you provide a link to that bug report?
FWIW, when I use -file-list-exec-source-files while debugging Emacs, I
don't see duplicate file names in the GDB output. Maybe I'm blind.

Chong Yidong

2012-05-10 14:13:36 UTC

Post by Dov Grobgeld
Still, it bothering me the fact that the above perl expression parses
the gdb output in a fraction of a second, (0.01s user time) whereas
gdb-mi.el takes more than 40s.

I would also like to learn more about where the bottleneck is.

Could you do the following:

M-: (require 'gdb-mi) RET
M-: (defun gdb-get-source-file-list () nil) RET

then run M-x gdb as usual, and see if that makes any difference in
performance? Leave gdb-create-source-file-list set at t.

(The above steps cause gdb-mi to issue the -file-list-exec-source-files
command and read the output, as usual, but skip parsing the output into
the source file list.)

Eli Zaretskii

2012-05-10 16:32:04 UTC

Date: Thu, 10 May 2012 09:00:49 +0300
I tried running file-list-exec-source-files and I get duplicates as well.
prompt> echo -file-list-exec-source-files > /tmp/gdb.in
prompt> gdb -i=mi emacs < /tmp/gdb.in > /tmp/gdb.out
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb.out | sort | head -15
file=alloc.c
file=alloc.c
file=allocator.c
file=atimer.c
file=atimer.c
file=bidi.c
file=bidi.c
file=buffer.c
file=buffer.c
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h
file=buffer.h

I don't see anything like that. Here's my output:

addr=0x011b4ea5
addr=0x011b4ea5
addr=0x012329a8
disp=del
disp=del
disp=keep
enabled=y
enabled=y
enabled=y
file=../lib/allocator.h
file=../lib/careadlinkat.h
file=../lib/ignore-value.h
file=../lib/intprops.h
file=../lib/intprops.h
file=../lib/intprops.h

IOW, all the duplicates I see are header files. Not a single .c file
shows up, not even if I change "head -15" into "head -100".

GNU gdb (GDB) Fedora (7.2-52.fc14)

Maybe you should upgrade, I dunno. I use 7.4.1, FWIW.

Or maybe GCC versions later than what I have do that.

For my executable gdb outputs full paths as well as the fullname field,
which expands the output considerably.

Here too, but that's expected (and necessary).

Dov Grobgeld

2012-05-10 18:43:39 UTC

I just downloaded gdb from cvs and tried with the latest version and there
are still duplicates. Perhaps it is because of gcc? But on the positive
side there is indeed a huge difference in the gdb output.

prompt> gdb --version
GNU gdb (GDB) Fedora (7.3.50.20110722-13.fc16)
prompt> /usr/local/public-dev/bin/gdb --version
GNU gdb (GDB) 7.4.50.20120509-cvs
prompt> gdb -i=mi MyExec < /tmp/gdb.in > /tmp/gdb-old.out
prompt> /usr/local/public-dev/bin/gdb -i=mi MyExec < /tmp/gdb.in >
/tmp/gdb-new.out
prompt> ls -1s --block-size=1 /tmp/gdb*.out
884736 /tmp/gdb-new.out
3727360 /tmp/gdb-old.out
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-old.out | sort | wc
67311 67311 3522494
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-new.out | sort | wc
14221 14221 837082
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-old.out | sort |uniq| wc
3931 3931 220654
prompt> perl -ne 'while(/(\w+)=\"(.*?)\"/g) { print "$1=$2\n"; }'
/tmp/gdb-new.out | sort |uniq| wc
2245 2245 137404

But even the factor 837k vs 137k is substantial, so it is still valid to do
an internal uniq within gdb. I'll try to put together a patch.

Regards,
Dov

Post by Eli Zaretskii

addr=0x011b4ea5
addr=0x011b4ea5
addr=0x012329a8
disp=del
disp=del
disp=keep
enabled=y
enabled=y
enabled=y
file=../lib/allocator.h
file=../lib/careadlinkat.h
file=../lib/ignore-value.h
file=../lib/intprops.h
file=../lib/intprops.h
file=../lib/intprops.h
IOW, all the duplicates I see are header files. Not a single .c file
shows up, not even if I change "head -15" into "head -100".

GNU gdb (GDB) Fedora (7.2-52.fc14)

Maybe you should upgrade, I dunno. I use 7.4.1, FWIW.
Or maybe GCC versions later than what I have do that.

For my executable gdb outputs full paths as well as the fullname field,
which expands the output considerably.

Here too, but that's expected (and necessary).

Dov Grobgeld

2012-05-10 19:07:38 UTC

Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.

Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s

Thus it is clear that most of the time is taken just reading the string
into emacs. But doing find-file on the same file is almost instantaneous.

Regards,
Dov

Post by Dov Grobgeld
Still, it bothering me the fact that the above perl expression parses
the gdb output in a fraction of a second, (0.01s user time) whereas
gdb-mi.el takes more than 40s.

I would also like to learn more about where the bottleneck is.
M-: (require 'gdb-mi) RET
M-: (defun gdb-get-source-file-list () nil) RET
then run M-x gdb as usual, and see if that makes any difference in
performance? Leave gdb-create-source-file-list set at t.
(The above steps cause gdb-mi to issue the -file-list-exec-source-files
command and read the output, as usual, but skip parsing the output into
the source file list.)

Stefan Monnier

2012-05-10 20:25:59 UTC

Post by Dov Grobgeld
Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.
Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s
Thus it is clear that most of the time is taken just reading the string
into emacs. But doing find-file on the same file is almost instantaneous.

Sounds like a perfect test case for the native elisp profiler (a
prototype of which is at
http://cx4a.org/hack/emacs-native-profiler.html, and hopefully it will
mature over the summer since it's funded as a GSoC).

Stefan

Chong Yidong

2012-05-11 06:33:52 UTC

Post by Dov Grobgeld
Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.
Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s
Thus it is clear that most of the time is taken just reading the
string into emacs. But doing find-file on the same file is almost
instantaneous.

Here's another little experiment. Could you apply the two attached
patches, individually, and see what difference each patch makes?
(Again, with gdb-create-source-file-list at its default of t, and
without any other patches to gdb-mi.el.)

Dov Grobgeld

2012-05-11 08:29:57 UTC

Here are the results.

With patch1: ~155s
With patch2: ~112s

On the other hand, I have patched gdb, seehttp://
sourceware.org/bugzilla/show_bug.cgi?id=14081, to do uniq internally in
gdb, and then it takes only about 10s.

Post by Dov Grobgeld
Here are the tests when using the the latest cvs gdb that yields a gdb
output file of about 800k.
Without gdb-get-source-file-list override: ~139s
With gdb-get-source-file-list override: ~125s
Thus it is clear that most of the time is taken just reading the
string into emacs. But doing find-file on the same file is almost
instantaneous.

Eli Zaretskii

2012-05-11 09:47:53 UTC

Date: Fri, 11 May 2012 11:29:57 +0300
With patch1: ~155s
With patch2: ~112s
On the other hand, I have patched gdb, seehttp://
sourceware.org/bugzilla/show_bug.cgi?id=14081, to do uniq internally in
gdb, and then it takes only about 10s.

Which is much faster, but still way too slow.

I think it's quite clear at this point that what takes time is reading
the stuff from GDB, not parsing it. Chong, am I right?

Chong Yidong

2012-05-11 13:27:36 UTC

Post by Eli Zaretskii
I think it's quite clear at this point that what takes time is reading
the stuff from GDB, not parsing it. Chong, am I right?

That is what the evidence suggests, which indicates that we won't be
able to fix it for 24.1. Maybe there is some problem in the adaptive
read buffering code?

Dov Grobgeld

2012-11-05 20:36:16 UTC

Is there any chance of having this bug solved in the foreseeable future?
This bug is the reason I am still sticking with emacs-23. As an
alternative, is it possible to use the old gdb-mode (without -mi) in
emacs-24?

Post by Eli Zaretskii
I think it's quite clear at this point that what takes time is reading
the stuff from GDB, not parsing it. Chong, am I right?

That is what the evidence suggests, which indicates that we won't be
able to fix it for 24.1. Maybe there is some problem in the adaptive
read buffering code?

Eli Zaretskii

2012-11-05 20:46:47 UTC

Date: Mon, 5 Nov 2012 22:36:16 +0200
As an alternative, is it possible to use the old gdb-mode (without
-mi) in emacs-24?

It was always possible: type "M-x gud-gdb RET".

Stefan Monnier

2012-11-05 23:51:41 UTC

Post by Eli Zaretskii

As an alternative, is it possible to use the old gdb-mode (without
-mi) in emacs-24?

It was always possible: type "M-x gud-gdb RET".

IIUC this gives you the Emacs-22 version of M-x gdb, but not the
Emacs-23 one (which used the gdb-ui.el code).

Stefan

Jean-Philippe Gravel

2012-12-14 04:14:59 UTC