Strange freezing issue
Page 1 of 1

Author:  dnagle [ Thu Nov 29, 2012 7:00 am ]
Post subject:  Strange freezing issue

We're experiencing a strange, intermittent issue where Yorick appears to be freezing on us. (Or if not Yorick, then something else, but I'm at a loss for what else it could be.)

We are running Yorick wrapped by Tcl. That is, we start a Tcl/Tk session and it spawns off Yorick via Expect. That whole thing is then wrapped by rlwrap. We end up with an interactive Yorick command line in the console window.

When running a certain rather involved function, it seems to be occasionally freezing right at a specific "write" statement. This write statement is in a loop, and it usually works just fine.

A related issue is that sometimes it seems like Yorick's "write" output isn't getting flushed right away. In other contexts, I've noticed that we'll get no output for a while, then all of it floods to the screen at once when the Yorick prompt is returned. In this context, it appears that we're in the middle of such an event. Looking at the log file I generated alongside this issue, there should be write statements on the screen that aren't there.

When this happens, Tcl itself is still responsive. We can still interactive with the Tcl GUIs, and we can still shut the whole session down by telling Tcl to exit (which as a consequence kills off the spawned Yorick session).

The only thing I can think of is that somehow Yorick's write output is getting blocked, and when some buffer fills up it just hangs. Is that possibly what's happening?

I recognize that I'm not giving much to go on here. And it may not even be Yorick, perhaps it's rlwrap or Tcl that is causing the issue. Unfortunately the issue is happening in a spot of our code that's multiple function calls deep, so I can't give a short code example. And I've been unable to find a set of input to run it on that causes the issue to happen predictably. So at this point... I'm grasping at straws and hoping someone else has some insights that might help. :)

Author:  munro [ Thu Nov 29, 2012 1:34 pm ]
Post subject:  Re: Strange freezing issue

Sounds bad. Keep me posted if it turns out to be a problem with yorick.

The only guidance I can give you is that yorick's write function is implemented in yorick/ascio.c. It calls p_stdout roughly for each line of output (although you might be able to fool it by embedding multiple newline characters in your format or arguments). the p_stdout function, in turn, is implemented in play/unix/stdinit.c. Here is is in its entirety:
p_stdout(char *output_line)
  if (!p_signalling) {
    fputs(output_line, stdout);

Thus, yorick calls fflush after virtually every line of output, and at least once per array element in its arguments. Unless you have really enormous strings you are trying to output in one block, it's hard to see how this hangs things. Given the big stack of programs you've got redirecting each other's stdin/stdout, that sounds a more likely place to look for problems. I don't have any immediate thoughts on how you might proceed. Perhaps you can capture the output from yorick, and substitute some simpler unix utility or shell script in its place that just parrots what yorick would have responded, to see it you can get that to hang in the same way?

If it is a problem with yorick, the more likely failure point is on input. Yorick has to detect when input becomes available on stdin, which is pretty complicated, involving the poll (or select) system call. It sounds like you've set up writes to a log file in your yorick program that show it entering the write command and never exiting, which rules out that possibility.

Good luck.

Author:  dnagle [ Fri Nov 30, 2012 6:46 am ]
Post subject:  Re: Strange freezing issue

Thanks for the response! The lines we are writing are not enormous so I guess that rules out that theory. Your suggestion to substitute something else to capture and parrot is a good idea. I'll try to investigate down that avenue. Perhaps it is rlwrap or expect that is causing the issue.

Author:  dnagle [ Wed Jan 16, 2013 12:46 pm ]
Post subject:  Re: Strange freezing issue

If it is a problem with yorick, the more likely failure point is on input. Yorick has to detect when input becomes available on stdin, which is pretty complicated, involving the poll (or select) system call. It sounds like you've set up writes to a log file in your yorick program that show it entering the write command and never exiting, which rules out that possibility.

I should clarify that I don't know that it's entering the write command and not exiting. That would require a logging statement inside the write statement itself. Rather, I had log statements on either side of a write statement. The log shows that it froze between those log statements. So it froze either before, during, or immediately after the write statement, but before the log call. (It's also possible it froze on the log call, but the freezing was occurring without the log calls as well so that seems less likely.)

I tried adding another layer in the mix to do some additional logging and that ended up leading to even more freezing, so that seems to confirm that it's one of the interim layers causing a problem.

I did notice that Yorick is definitely getting hung, even if it's not Yorick at fault. Comparing screen output to log output, I notice that the sequence seems to be:

Yorick processes with console output and log output
Yorick processes with log output, but console output stops
Both console and log output stop, and Yorick is apparently hung

In other words, there's a period where I can tell that Yorick is continuing to do stuff based on the log file, even if it's not having its console output show up. And then it appears that Yorick stops processing entirely, as the log file entries stop as well. (I suppose it's possible it continues and finishes processing in the background, but the console never gives me control back; it remains hung.) Meanwhile, Tcl is still fully responsive.

Is it possible for Yorick to get hung based on the (mis)behavior of one of the other processes wrapped around it (Tcl/expect or rlwrap)? Based on what you said, it sounds like if one of them is doing something funky with Yorick's stdin channel, it might get hung up? Does Yorick poll stdin while it's actively crunching through functions?

Possibly related: We are using "spawn" in the background to monitor a fifo pipe from Tcl, with an on_stdout function. The documentation for spawn says it gets called asynchronously whenever a new line is available. Does this occur even when Yorick is actively crunching through functions? If so, is it plausible that could be causing the freeze somehow?

Author:  munro [ Sat Jan 19, 2013 11:20 am ]
Post subject:  Re: Strange freezing issue

The yorick main event loop (play/unix/umain.c) will loop on all event sources (stdin, graphics window connections, spawned process stdout and stderr) making callbacks which read all available input, until no input is available on any source. Reads from disk files are specifically excluded, and write operations never participate. For stdin and spawned processes, the callback function creates an intepreted task and queues it for execution. When all input from all sources is exhausted, the main event loop calls the idler callback.

The idler callback executes a single yorick interpreted task, and returns a flag to indicate whether or not more tasks are available. Every time it returns the main even loop again exhausts all sources of input before calling the idler again. When neither any interpreted tasks remain nor any input is available, the main event loop blacks waiting for more input. If you have defined an "after" function (help,after), of if some other timer is set, it may block only for a finitie time before calling the timeout callback, but normally it blocks until new input becomes available.

The read (from stdin only), window,wait=1, pause, and possibly a few other interpreted functions, cause the interpreted program to suspend and the idler function returns even thought the interpreted task is not complete. However, there is no way for an intepreted function to execute while another is suspended, because there is no way to jump the task queue (in fact, other tasks may be pending behind the one that is suspended).

So no, "asynchronously" does not mean that on_stdout or on_stderr of a spawn can run in the middle of any other interpreted task, at least for the time being.

There are ways for snippets of yorick interpreted code to execute outside of the intepreted task queue (for example, the include or require functions), but spawn does not use these.

You might be interested in the new dbwhere function (commit 5b6b407a61), which might help you to log what is going on.

Author:  dnagle [ Wed Apr 03, 2013 11:41 am ]
Post subject:  Re: Strange freezing issue

I finally pinned down the source of this issue: it was a race condition.

Some of our Tcl GUIs let you interact with variables that are also defined in Yorick. Some of these variables pairs were tied together so that Tcl would occasionally update its variables with the values in Yorick's variables. Unfortunately, it did this in a blocking way: it sent a request (via pipe) to Yorick and then waiting until Yorick replied (via another pipe) with the value.

Since Yorick only interacts with those pipes when idle, Tcl would end up hanging when Yorick was busy. If the hang was short enough, Yorick would eventually finish and send Tcl the value it wanted. But if the hang was too long, it just wouldn't recover at all.

So as predicted, it wasn't a Yorick issue at all.

Page 1 of 1 All times are UTC - 8 hours
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group