/tmp gets filled up on ksh93 93u+ #1472

Closed
rphaniram opened this issue Feb 20, 2020 · 15 comments

Comments

@rphaniram

On versions of Solaris 10/Solaris 11 that ship with ksh93, the following behavior is observed:

Version of ksh93 on system:
$ ksh --version
version sh (AT&T Research) 93u+ 2012-08-01

The issue:
ksh93 seems to fill up the /tmp filesystem in certain scenarios when redirection or here-document operators are involved.

Example:
ksh93 does not seem to handle here-documents correctly in scenarios such as the one below. Here is the expected behavior:

$ cat << EOS

foo
EOS
foo

But if I try to cancel the here-document with Ctrl-C, the process is not interrupted:

$ cat << EOS

^C^C^C^C^C^C^C^C

Now ksh93 cannot be killed with SIGINT, although it can be killed with SIGSEGV, SIGBUS, etc. (and of course SIGKILL). Until it is killed, /tmp keeps growing.

Since /tmp on Solaris is a memory-based tmpfs filesystem, this starts to consume a lot of memory.
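One rough way to watch for this growth from a second terminal is to sample the space used under /tmp while the reproducer is running. The sketch below is only an illustration: the helper names `dir_usage_kb` and `sample_usage` are made up for this example, and it assumes nothing beyond POSIX `du -sk`, `awk`, `date` and `sleep`.

```shell
# Print the space used under a directory, in kilobytes.
# du -s (summary) and -k (1024-byte units) are both specified by POSIX.
dir_usage_kb() {
    du -sk "${1:-/tmp}" 2>/dev/null | awk '{ print $1 }'
}

# Sample a directory several times and print "HH:MM:SS usage_kb" lines.
# Arguments: directory, number of samples, seconds between samples.
# (Hypothetical helper; the defaults are arbitrary.)
sample_usage() {
    dir=${1:-/tmp}
    count=${2:-5}
    interval=${3:-1}
    i=0
    while [ "$i" -lt "$count" ]; do
        printf '%s %s\n' "$(date '+%H:%M:%S')" "$(dir_usage_kb "$dir")"
        i=$((i + 1))
        # Only sleep between samples, not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, `sample_usage /tmp 10 2` prints ten samples two seconds apart; a steadily increasing second column while the stuck ksh process is alive would match the behavior described above.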

The same problem has also been hit in other scenarios while migrating ksh88 scripts to ksh93.

Also, the call stack looks something like the one below every time we hit the issue:

          libc.so.1`__write+0x8
          ksh`_sfflsbuf+0x268
          ksh`sfsync+0x42c
          ksh`sh_subtmpfile+0x1f0
          ksh`sh_exec+0x4f24
          ksh`sh_exec+0x343c
          ksh`sh_exec+0x1970
          ksh`sh_exec+0x3404
          ksh`sh_exec+0x3404
          ksh`sh_subshell+0x898
          ksh`comsubst+0x9e4
          ksh`varsub+0x618
          ksh`copyto+0xb30
          ksh`sh_macexpand+0x290
          ksh`sh_argbuild+0x254
          ksh`sh_exec+0x350c
          ksh`exfile+0xad8
          ksh`sh_main+0xc44
          ksh`main+0x3c
          ksh`_start+0x108

And the process tree looks like this most of the time:
AT(live/11V)> proc tree 16522
704 /usr/lib/ssh/sshd
16508 /usr/lib/ssh/sshd -R
16512 /usr/lib/ssh/sshd -R
16513 -bash
16517 su -
16518 -bash
16522 ksh -o vi
16686 less /var/adm/messages.0
16690 ksh

OR
CAT(vmcore.14/11V)> proc tree 10198
1 /usr/sbin/init
10198 ksh

PS: I presume the above is seen with the latest version of ksh93 as well.

@jghub

jghub commented Feb 20, 2020

@rphaniram:
no solution to your problem, just a heads up that this repo is in a phase of readjustment. some context might be found in #1464 (scroll a bit down ...) and #1466. currently, there is an effort under way to ensure the longterm availability of ksh93u+ (plus bug fixes) as opposed to the ksh2020 offering provided these last 2-3 years in this repo (i.e. take note that the current ksh2020 branch is not considered to be ksh93 proper)

check out "ksh-community" in GitHub search soon (it was created today, so there are no repos yet, but ksh93u+ will be there soon) and possibly reopen your issue there in due time.

@rphaniram
Author

rphaniram commented Feb 21, 2020

@jghub Thanks for the response and info.

I'll check out the new ksh-community group and look to reopen the issue there.

@saper
Contributor

saper commented Feb 22, 2020

I could not reproduce this with my FreeBSD ksh93v-

but I have checked on Solaris 11.3 and I get the following behaviour:

saper@sol11:~$ echo ${.sh.version}                                                                                         
Version JM 93u 2011-02-08
saper@sol11:~$ cat <<EOS                                                                                                   
> now pressing ctrl-c
> 
ksh93: syntax error: `<<' unmatched
$(set +o xtrace +o errexit
                printf "%*s\r%s" COLUMNS ""
                printf "%s@%s:" "${LOGNAME}" "$(/usr/bin/hostname)"
                ellip="${
                        [[ "${LC_ALL}/${LANG}" == ~(Elr)(.*UTF-8/.*|/.*UTF-8) ]] &&
                                printf "\u[2026]\n" || print "..." ; }"
                p="${PWD/~(El)${HOME}/\~}"
                (( ${#p} > 30 )) &&
                        print -r -n -- "${ellip}${p:${#p}-30:30}" ||
                        print -r -n -- "${p}"
                [[ "${LOGNAME}" == "root" ]] && print -n "# " || print -n "\$ "
                )       
saper@sol11:~$ ls -l /tmp                                                                                                  
total 16
drwxr-xr-x   2 root     root         117 Feb 22 19:57 hsperfdata_root
drwx------   2 saper    staff        184 Feb 22 20:01 ssh-XXXrai9b

While it is pretty strange to get that shell code dumped out after pressing Ctrl-C, once I press Enter after this, the shell command exits normally.

@saper
Contributor

saper commented Feb 22, 2020

Ah, that funny code is my default Solaris PS1:

PS1=$'$(set +o xtrace +o errexit\n                printf "%*s\\r%s" COLUMNS ""\n                printf "%s@%s:" "${LOGNAME}" "$(/usr/bin/hostname)"\n\t\tellip="${\n\t\t\t[[ "${LC_ALL}/${LANG}" == ~(Elr)(.*UTF-8/.*|/.*UTF-8) ]] &&\n\t\t\t\tprintf "\\u[2026]\\n" || print "..." ; }"\n\t\tp="${PWD/~(El)${HOME}/\\~}"\n\t\t(( ${#p} > 30 )) &&\n\t\t\tprint -r -n -- "${ellip}${p:${#p}-30:30}" ||\n\t\t\tprint -r -n -- "${p}"\n\t\t[[ "${LOGNAME}" == "root" ]] && print -n "# " || print -n "\\$ "\n\t\t)'

After I reset my PS1 to a simple value, things seem to work fine as well:

saper@sol11:~$ export PS1="normal\$ "                                                                                      
normal$    
normal$ cat <<EOS
> pressing ctrl-c now
> 
normal$ ls -l /tmp
total 16
drwxr-xr-x   2 root     root         117 Feb 22 19:57 hsperfdata_root
drwx------   2 saper    staff        184 Feb 22 20:01 ssh-XXXrai9b
  • Can you check your PS1 (preferably with echo "${PS1}") ?
  • Can you reproduce it if you reset PS1 to something simple?
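The two checks above could be done roughly like this. It is a minimal sketch that works in any POSIX-style shell; the variable name `saved_ps1` is just an illustration, and `printf` is used instead of `echo` so that backslashes in the prompt are shown literally.

```shell
# Keep a copy of the current prompt so it can be restored later.
saved_ps1=${PS1-}

# Print the prompt definition literally, without letting the shell expand it.
printf '%s\n' "$saved_ps1"

# Switch to a minimal prompt with no command substitutions, then retry
# the here-document reproducer at the new prompt.
PS1='$ '

# To go back afterwards: PS1=$saved_ps1
```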

@jghub

jghub commented Feb 22, 2020

@saper

a) you might need to notify @rphaniram explicitly since he closed the issue due to my initial reply.

b) you used Version JM 93u 2011-02-08, so that is pre-ksh93u+, no?

c) I just verified that on OSX with ksh93u+ 2012-08-01 nothing "special" or strange happens. indeed I just get PS2 (not PS1!) prompts (not their source code ;)) after Ctrl-C until I hit Ctrl-D at which point the shell exits normally.

@jghub

jghub commented Feb 22, 2020

@rphaniram
on rereading your original post regarding not being able to get out of the here document with ^C:
actually you need just to type ^D (Ctrl-D) i.e. the canonical "end of input" control sequence. that gracefully terminates the here document. question: does that already solve your problem?

@saper
Contributor

saper commented Feb 22, 2020

I think the original issue is about /tmp filling up with unfinished here-documents. I could not reproduce this on either BSD or Solaris, despite playing with the prompts. (I think @rphaniram gets notified anyway.)

@jghub

jghub commented Feb 22, 2020

yes, but he experienced this when not being able to quit the here-document (presumably because he did not hit ^D and tried to get out with ^C), and then something went wild, no? my question was whether he can get out of that situation with ^D (possibly avoiding ^C altogether).

regarding notification: no regular git(hub) user here, so not sure. I thought it depends on your settings etc. but whatever :)

@saper
Contributor

saper commented Feb 22, 2020

Maybe something catches SIGINT or there is something funny in the terminal settings (stty -a)... we definitely need more info and we do not support Solaris :)
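As a starting point for collecting that info, something like the sketch below could be run in the affected session. The function names are made up for illustration. It only uses `stty -a` and the `trap` builtin with no arguments, and it cannot detect a SIGINT disposition that was inherited as "ignored" from a parent process.

```shell
# Print the terminal's interrupt character setting, if stdin is a terminal.
show_intr_char() {
    if [ -t 0 ]; then
        # stty -a separates settings with ';', so put each one on its own line.
        stty -a | tr ';' '\n' | grep 'intr'
    else
        echo "stdin is not a terminal"
    fi
}

# List the traps set in the current shell. A line mentioning INT means
# that the shell itself catches or ignores SIGINT.
show_traps() {
    trap
}
```

Running `show_intr_char` should normally show `intr = ^C` (or `^c`); anything else, or an INT entry in the output of `show_traps`, would be worth adding to the report.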

@jelmd

jelmd commented Feb 22, 2020

Why not?

You may use the following dtrace script to dive deeper:

#!/usr/sbin/dtrace -Cs
/* Usage: $0 -p $kshPID */
#pragma D option quiet

pid$target:ksh::entry {
	@pc[probefunc] = count();
}
dtrace:::END {
	printa(@pc);
}

Anyway, step by step we will get the ksh-community running, and then we'll be happy to work on it in the ksh repository.

@saper
Contributor

saper commented Feb 22, 2020

What I mean by "we do not support Solaris" is "we do not really know what code is running as ksh93 on closed source Solaris releases done by Oracle".

Of course mainline ksh93 should run and be tested on Solaris as well.

@jelmd

jelmd commented Feb 22, 2020

Basically we know, because Oracle dumps it on its github repos ;-)

@jghub

jghub commented Feb 22, 2020

@jelmd: @saper already had the link to ksh and secured the prize for opening the first issue there. not that we are really open for business yet... :)

@cschuber

A couple of points:

I'm not able to reproduce this on ksh93u nor on ksh2020 on FreeBSD-current.

You might want to look at the illumos source, which I think might still be close to Oracle's base. Though I'm not familiar with that part of their tree, I doubt illumos have touched that since Sun open sourced it and it's unlikely Oracle has either.

@jelmd

jelmd commented Feb 22, 2020

The illumos base is far far behind. Oracle's is based on the last stable version, i.e. 2012-08-01 + patches.

JohnoKing added a commit to JohnoKing/ksh that referenced this issue Feb 16, 2024
This was first reported in att#1472.
Attempting to cancel a heredoc with ^C or ^D can cause ksh to crash with
a segfault, or hang up and fill /tmp with files. Copy of the reproducer:
   $ cat << EOS
   > <Press Ctrl+C or Ctrl+D>

src/cmd/ksh93/sh/main.c:
- Reset the lexer state in an interactive shell if here-document
  creation was cancelled. This patch has been adapted from Solaris:
  https://github.com/oracle/solaris-userland/blob/e478b48/components/ksh93/patches/400-29444429.patch

Due to the nature of this bug, I've skipped adding a regression test as
it risks causing pty to hang up if run against an older release of
ksh93 (cf. ksh93#356).
McDutchie pushed a commit to ksh93/ksh that referenced this issue Feb 23, 2024
…ls (#721)

This was first reported in att#1472.
Attempting to cancel a heredoc with ^C or ^D can cause ksh to crash
with a segfault, or hang up and fill /tmp with files. Copy of the
reproducer:
   $ cat << EOS
   > <Press Ctrl+C or Ctrl+D>

src/cmd/ksh93/sh/main.c:
- Reset the lexer state in an interactive shell if here-document
  creation was cancelled. This patch has been adapted from Solaris:
  https://github.com/oracle/solaris-userland/blob/e478b48/components/ksh93/patches/400-29444429.patch