summaryrefslogtreecommitdiff
path: root/open_issues/term_blocking.mdwn
blob: eddc285e0de9c7431c212ba031b15913728ef3ad (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
[[!meta copyright="Copyright © 2009, 2011, 2012 Free Software Foundation,
Inc."]]

[[!meta license="""[[!toggle id="license" text="GFDL 1.2+"]][[!toggleable
id="license" text="Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no Invariant
Sections, no Front-Cover Texts, and no Back-Cover Texts.  A copy of the license
is included in the section entitled [[GNU Free Documentation
License|/fdl]]."]]"""]]

[[!tag open_issue_hurd]]

There must be some blocking / dead-locking (?) problem in `term`.

[[!toc]]


# Original Findings

    # w | grep [t]sch
    tschwing  p1 192.168.10.60: Tue 8PM  0:03  2172 /bin/bash
    tschwing  p2 192.168.10.60: Tue 4PM 40hrs   689 emacs
    tschwing  p3 192.168.10.60:  8:52PM 11:37 15307 /bin/bash
    tschwing  p0 192.168.10.60:  6:42PM 11:47  8104 /bin/bash
    tschwing  p8 192.168.10.60:  8:27AM  0:02 16510 /bin/bash

Now open a new screen window, or login shell, or...

    # ps -Af | tail
    [...]
    tschwinge 16538  676  p6  0:00.08 /bin/bash
        root 16554   128  co  0:00.09 ps -Af
        root 16555   128  co  0:00.01 tail

`bash` is started (on `p6`), but newer makes it to the shell promt; doesn't
even start to execute `.bash_profile` / `.bashrc`.  The next shell started, on
the next available pseudoterminal, will work without problems.

The `term` on `p6` has already been running before:

    # ps -Af | grep [t]typ6
        root  6871     3   -  5:45.86 /hurd/term /dev/ptyp6 pty-master /dev/ttyp6

In this situation, `w` will sometimes report erroneous values for *IDLE*
for the process using that terminal.

Killed that `term` instance, and things were fine again.


All this reproducible happens while running the [[GDB testsuite|gdb]].

---

Have a freshly started shell blocking on such a `term` instance.

    $ ps -F hurd-long -p 1766 -T -Q
      PID TH#  UID  PPID  PGrp  Sess TH  Vmem   RSS %CPU     User   System Args
     1766        0     3     1     1  6  131M 1.14M  0.0  0:28.85  5:40.91 /hurd/term /dev/ptyp3 pty-master /dev/ttyp3
            0                                        0.0  0:05.76  1:08.48
            1                                        0.0  0:00.00  0:00.01
            2                                        0.0  0:06.40  1:11.52
            3                                        0.0  0:05.76  1:09.89
            4                                        0.0  0:05.42  1:06.74
            5                                        0.0  0:05.50  1:04.25

... and after 5:45 h:

    $ ps -F hurd-long -p 21987 -T -Q
      PID TH#  UID  PPID  PGrp  Sess TH  Vmem   RSS %CPU     User   System Args
    21987     1001   676 21987 21987  2  148M 2.03M  0.0  0:00.02  0:00.07 /bin/bash
            0                                        0.0  0:00.02  0:00.07 
            1                                        0.0  0:00.00  0:00.00 

    $ ps -F hurd-long -p 1766 -T -Q
      PID TH#  UID  PPID  PGrp  Sess TH  Vmem   RSS %CPU     User   System Args
     1766        0     3     1     1  6  131M 1.14M  0.0  0:29.04  5:42.38 /hurd/term /dev/ptyp3 pty-master /dev/ttyp3
            0                                        0.0  0:05.76  1:08.48
            1                                        0.0  0:00.00  0:00.01
            2                                        0.0  0:06.41  1:11.90
            3                                        0.0  0:05.82  1:10.28
            4                                        0.0  0:05.52  1:07.06
            5                                        0.0  0:05.52  1:04.63

    $ sudo gdb /hurd/term 1766
    [sudo] password for tschwinge: 
    GNU gdb (GDB) 7.0-debian
    Copyright (C) 2009 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "i486-gnu".
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>...
    Reading symbols from /hurd/term...Reading symbols from /usr/lib/debug/hurd/term...done.
    (no debugging symbols found)...done.
    Attaching to program `/hurd/term', pid 1766
    [New Thread 1766.1]
    [New Thread 1766.2]
    [New Thread 1766.3]
    [New Thread 1766.4]
    [New Thread 1766.5]
    [New Thread 1766.6]
    Reading symbols from /lib/libhurdbugaddr.so.0.3...Reading symbols from /usr/lib/debug/lib/libhurdbugaddr.so.0.3...
    [System doesn't respond anymore, but no kernel crash.]

---

The very same behavior is still observable as of 2011-03-24.

Next: rebooted; on console started root shell, screen, a few spare windows; as
user started GDB test suite, noticed the PTY it's using; in a root shell
started GDB (the system one, for `.debug` stuff) on `/hurd/term`, `set
noninvasive on`, attach to the *term* that GDB is using.

---

[[2011-07-04]].

---

2012-11-05

Log file from a 2011-09-07 run:

    [...]
    Running ../../../master/gdb/testsuite/gdb.base/readline.exp ...
    spawn [...]/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory [...]/gdb/testsuite/../data-directory
    GNU gdb (GDB) 7.3.50.20110906-cvs
    Copyright (C) 2011 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "i686-unknown-gnu0.3".
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    (gdb) set height 0
    (gdb) set width 0
    (gdb) dir
    Reinitialize source path to empty? (y or n) y
    Source directories searched: $cdir:$cwd
    (gdb) dir ../../../master/gdb/testsuite/gdb.base
    Source directories searched: [...]/gdb/testsuite/../../../master/gdb/testsuite/gdb.base:$cdir:$cwd
    (gdb) p 1
    $1 = 1
    PASS: gdb.base/readline.exp: Simple operate-and-get-next - send p 1
    (gdb) p 2
    $2 = 2
    PASS: gdb.base/readline.exp: Simple operate-and-get-next - send p 2
    (gdb) p 3
    $3 = 3
    PASS: gdb.base/readline.exp: Simple operate-and-get-next - send p 3
    (gdb) p 3(gdb) p 3PASS: gdb.base/readline.exp: Simple operate-and-get-next - C-p to p 3
    ^H2(gdb) p 2PASS: gdb.base/readline.exp: Simple operate-and-get-next - C-p to p 2
    ^H1(gdb) p 1PASS: gdb.base/readline.exp: Simple operate-and-get-next - C-p to p 1
    ^OFAIL: gdb.base/readline.exp: Simple operate-and-get-next - C-o for p 1
    FAIL: gdb.base/readline.exp: operate-and-get-next with secondary prompt - send if 1 > 0
    FAIL: gdb.base/readline.exp: print 42 (timeout)
    FAIL: gdb.base/readline.exp: arrow keys with secondary prompt (timeout)
    spawn [...]/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory [...]/gdb/testsuite/../data-directory
    ERROR: (timeout) GDB never initialized after 10 seconds.
    ERROR: no fileid for coulomb
    ERROR: no fileid for coulomb
    UNRESOLVED: gdb.base/readline.exp: Simple operate-and-get-next - send p 7
    testcase ../../../master/gdb/testsuite/gdb.base/readline.exp completed in 646 seconds
    Running ../../../master/gdb/testsuite/gdb.base/wchar.exp ...
    Executing on host: gcc  -c -g  -o [...]/gdb/testsuite/gdb.base/wchar0.o ../../../master/gdb/testsuite/gdb.base/wchar.c    (timeout = 300)
    spawn gcc -c -g -o [...]/gdb/testsuite/gdb.base/wchar0.o ../../../master/gdb/testsuite/gdb.base/wchar.c
    Executing on host: gcc [...]/gdb/testsuite/gdb.base/wchar0.o  -g  -lm   -o [...]/gdb/testsuite/gdb.base/wchar    (timeout = 300)
    spawn gcc [...]/gdb/testsuite/gdb.base/wchar0.o -g -lm -o [...]/gdb/testsuite/gdb.base/wchar
    get_compiler_info: gcc-4-6-1
    spawn [...]/gdb/testsuite/../../gdb/gdb -nw -nx -data-directory [...]/gdb/testsuite/../data-directory
    ERROR: (timeout) GDB never initialized after 10 seconds.
    ERROR: no fileid for coulomb
    ERROR: no fileid for coulomb
    ERROR: no fileid for coulomb
    ERROR: couldn't load [...]/gdb/testsuite/gdb.base/wchar into [...]/gdb/testsuite/../../gdb/gdb (timed out).
    ERROR: no fileid for coulomb
    ERROR: Delete all breakpoints in delete_breakpoints (timeout)
    ERROR: no fileid for coulomb
    UNRESOLVED: gdb.base/wchar.exp: setting breakpoint at wchar.c:34 (timeout)
    testcase ../../../master/gdb/testsuite/gdb.base/wchar.exp completed in 797 seconds
    [...]


# Formal Verification

This issue may be a simple programming error, or it may be more complicated.

Methods of [[formal_verification]] should be applied to confirm that there is
no error in `/hurd/term`'s logic itself.  There are tools for formal
verification/[[code_analysis]] that can likely help here.

There is a [[!FF_project 277]][[!tag bounty]] on this task.