installation problem 

Joined: Mon Oct 17, 2005 3:25 pm
Posts: 11
Location: La Serena, Chile
Subject: installation problem
I am having problems installing Yorick on a PC running Linux. I have an AMD Sempron processor. I am running Linux 2.6.16-1.2096 FC4 (Fedora Core 4). Though I do not use the system heavily, I had not detected any problem with it until I tried to install Yorick on it.

I am using the current source download from sourceforge: yorick-2.1.06.tar. There were no surprises until I tried to install. When I execute the command "make config" the entire system reliably hangs during the play/unix part of the make process. Once hung, I have no control of the machine: not even Ctrl-Alt-Del works.

The last sign of life from the make process is the following line:

using poll(), poll.h header

It appears that the make process is executing the config.sh shell script (in the play/unix directory). It appears to get up to the point where it is about to make some tests of the processor (for the FPU?) when it fails. I am enough of a beginner that I don't know how to tell even if I have a floating point unit...

I tried editing the shell script to get it to tell me more before it fails, but I don't know enough about shell scripts to make decent progress, so I am calling for help.

Any suggestions?

Brooke


Sat Jul 09, 2011 1:08 pm
Yorick Master

Joined: Mon Nov 22, 2004 9:43 am
Posts: 354
Location: Livermore, CA, USA
Post Re: installation problem
That sounds bad; I can't imagine what is happening -- I've never heard of anything like that in almost two decades of building yorick on intel/linux platforms...

You are using an outdated yorick (the current version is 2.2.00x) and, much more importantly, a staggeringly ancient Linux distro. Fedora Core hasn't existed for many years; the current Fedora version is 15 (and yes, that is 11 versions beyond FC4 -- they dropped the "core" with Fedora 7 in 2007). I strongly recommend moving to a more modern Linux distribution as soon as you can; support for reasonably modern hardware will be poor in FC4...

Nevertheless, yorick certainly did build under FC4. The yorick source code has moved to github; the current source is no longer available at sourceforge. http://yorick.github.com is the new home for yorick source (where the links in the middle of the yorick.sf.net page redirect you). If you want to cut through all the clicks and just download the latest source tarball, the link is http://github.com/dhmunro/yorick/tarball/master (the "as a tarball" link on the yorick.sf.net page). Download and unpack the tarball, then in a terminal window cd into the top level directory and type:

make install 2>&1 | tee make.log

You should wind up with a yorick executable in relocate/bin/yorick. If your machine crashes (or even if it doesn't), the fancy redirection part of the command should have saved everything you saw go by on the terminal in the make.log file. If you do get a crash, please email me the make.log file, and I'll see if it means anything to me.
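
For anyone unfamiliar with the redirection in that command, here is a minimal sketch of what `2>&1 | tee make.log` does. The function `noisy_build` and the temporary log file are hypothetical stand-ins for `make install` and make.log, used only so the snippet runs anywhere without the yorick source.

```shell
#!/bin/sh
# Hypothetical stand-in for `make install`: writes one line to stdout
# and one line to stderr, the way a real build mixes both streams.
noisy_build() {
    echo "compiling fputest.c"
    echo "warning: something looks odd" >&2
}

# Temporary file standing in for make.log.
log=$(mktemp)

# 2>&1 sends stderr into the same pipe as stdout; tee then copies
# everything it reads both to the terminal and to the log file.
noisy_build 2>&1 | tee "$log"

# Count the non-empty lines that reached the log.
grep -c . "$log"
```

Without the `2>&1`, only the stdout line would pass through the pipe into the log; error messages would appear on the terminal but not in the file.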

Linux advertises that it is impossible to crash the OS by doing anything in user mode. However, if you are running inside of the X11 window system (e.g. Gnome or KDE), things aren't nearly as robust. Are you sure you really crashed the OS and not just your X11 session? To find out, you can try the build in a console (Ctrl-Alt-F2 to switch to a console, then Alt-F7 to go back to X11, in case you don't know).

Yorick does do something extremely unusual in setting up your FPU (yes, you have two, in fact). I would be nauseated if it's crashing your machine -- yorick is the only program in the Debian distribution (one of the largest Linux distributions) that sets up the FPU to deliver a signal and stop the program when you divide by zero (or overflow or attempt some other meaningless floating point operation). Everybody else elects to keep on running, which is stupid for a numerical program, since if you do that it is impossible to figure out where you did the zero divide and fix it. Anyway, I'd be interested to hear if that's your problem. If it is, you can build a crippled version of yorick like this (in the top level source directory):

make siteclean
FPU_IGNORE=yes make install

This will skip trying to do the fancy FPU configuration.
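
The `FPU_IGNORE=yes make install` form relies on a general shell feature: an assignment written in front of a command is placed in the environment of that one command only. The sketch below illustrates that behavior with a child `sh -c` standing in for `make`; it says nothing about how the yorick configure script reads the variable, which is not shown here.

```shell
#!/bin/sh
# Make sure the variable is not already set in this shell.
unset FPU_IGNORE

# Prefix form: FPU_IGNORE is added to the environment of this one
# command only. The child `sh -c` is a stand-in for `make`.
with_prefix=$(FPU_IGNORE=yes sh -c 'echo "${FPU_IGNORE:-unset}"')

# A later command started without the prefix does not see it.
without_prefix=$(sh -c 'echo "${FPU_IGNORE:-unset}"')

echo "with prefix:    $with_prefix"
echo "without prefix: $without_prefix"
```

So the setting has to be repeated on every build command where it should apply, unless it is exported in the shell first.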


Sun Jul 10, 2011 7:09 pm

Joined: Mon Oct 17, 2005 3:25 pm
Posts: 11
Location: La Serena, Chile
Post Re: installation problem
I followed your recipe using your new tarball. The behavior is exactly the same (as far as one can follow the process). The machine hangs after the reference to using poll() in the play/unix portion of the configuration.

By the way, I followed the same recipe in my office (linux version: 2.6.32.26-175.fc12.x86_64 ) where it worked perfectly, so I think I am following the recipe correctly.

When I first saw the problem, by the way, I (naively?) thought it might be a hardware (memory) problem. So I did in fact run memtest86+ which didn't reveal any errors.

Next, I followed your suggestion about avoiding a possible weakness in X-windows. In the console, the make install failed at the same point but before freezing it exhibited some lower level messages to the effect: "Unable to handle kernel paging request..."
This sounds like a useful clue to where the failure is.

Finally, I did the install with FPU_IGNORE=yes and everything in the install worked, including the resulting yorick.

So, I am on the air. Let me know if I can follow the error messages with any other tests.


Sun Jul 17, 2011 3:29 pm
Yorick Master

Joined: Mon Nov 22, 2004 9:43 am
Posts: 354
Location: Livermore, CA, USA
Post Re: installation problem
Boy, I still don't understand this. Years ago, yorick built fine under FC4 -- in fact, there hasn't ever been any Linux at all where yorick has failed to build, back into the 90s. Linux kernels, right up through the present, do have trouble passing SIGFPE (floating point exceptions) to user programs, as you can see by building play/unix/fputest.c (read the READMEs in that directory). For reasons I do not understand, the SIGFPE delivery works in yorick, even though it fails in fputest.c. If anyone has the time to figure out why, I'd be deeply appreciative. Bonus points if you can make fputest run correctly on a recent Linux.

I believe the only reason you see the message about poll() is that it is the last thing before the SIGFPE section in play/unix/config.sh.

If you are willing to try a couple more tests, let's just forget about the yorick configure script and go straight to the problem. From the top level yorick directory (where you typed "make install"), try this from a console (that is, log in after Ctrl-Alt-F2):

cd play/unix
cc -DFPU_GNU_FENV -o fputest fputest.c -lm
./fputest

If fputest crashes your machine, then there isn't much I can do -- you can send off a bug report to the Linux kernel people, or to the GCC people, but you probably won't get any interest. It runs incorrectly on most Linux machines (failing to trap the first SIGFPE in the test), but it certainly does not crash the OS (try it on your office machine). If it doesn't crash your machine, try these compile lines:

cc -DFPU_GCC_X86 -o fputest fputest.c -lm

cc -DFPU_GCC_X86_64 -o fputest fputest.c -lm

One should work and the other should fail to compile, depending on whether you are running a 32-bit Linux or a 64-bit Linux. For the one that works, do:

./fputest
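
If it is not obvious which of the two compile lines should succeed, the machine name reported by `uname -m` gives a reasonable hint. The helper `guess_fpu_macro` below is hypothetical, and its mapping is inferred only from the macro names FPU_GCC_X86 and FPU_GCC_X86_64; what really decides the outcome is the target your compiler builds for, so treat the result as a guess.

```shell
#!/bin/sh
# Guess which fputest macro matches a machine name as printed by
# `uname -m`. The mapping is an assumption based on the macro names.
guess_fpu_macro() {
    case "$1" in
        x86_64|amd64)        echo "FPU_GCC_X86_64" ;;
        i386|i486|i586|i686) echo "FPU_GCC_X86" ;;
        *)                   echo "unknown" ;;
    esac
}

machine=$(uname -m)
echo "machine: $machine, likely macro: $(guess_fpu_macro "$machine")"
```

On anything other than an x86 machine the helper prints "unknown", in which case neither of the two GCC-specific compile lines is expected to apply.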

If that also does not crash your machine, then the configure script managed to find a bogus FPU setting which did. That should be impossible.

If running fputest by hand reproduces your machine crash, then by extracting the relevant section of fpuset.c and inserting it directly into fputest.c, you have a sample program of a couple hundred lines which crashes the Linux kernel from user space, which isn't supposed to be possible. Since it doesn't happen on any other Linux/AMD machine I've ever heard about, I'm not sure you'll be able to find anyone interested in looking at it.

The "paging request" message sounds like it has triggered a stack overflow, presumably by putting the kernel into an infinite recursion. I'm surprised the kernel code isn't protected against that. But it isn't much of a clue for anybody to understand the problem.


Sun Jul 24, 2011 9:58 am

Joined: Mon Oct 17, 2005 3:25 pm
Posts: 11
Location: La Serena, Chile
Post Re: installation problem
I tried your first test, compiling and running fputest on a console session. It crashed the machine. A selection from the screen:
EIP is at do_page_fault
Unable to handle kernel paging request at virtual address .....
Recursive die() failure
BUG: spinlockup on CPU#0.

Thanks for your help! Next, I think I'll upgrade my linux installation.


Tue Jul 26, 2011 3:56 pm