mpy mpool: problem with message size 
Yorick Guru

While doing some simple tests using mpy, I observe the following:

(1)
The latest oxy.c bug fix (the bug in "save, f" with no arguments) appears to break testmp (in testmp.i). The problem shows up in the mpool test with vsave.
Code:
$ mpirun -n 2 ../relocate/bin/mpy -j testmp.i
Copyright (c) 2005.  The Regents of the University of California.
All rights reserved.  Yorick 2.2.00x ready.  For help type 'help'
> testmp
testmp2 passed on all 2 ranks
testmp3 passed on all 2 ranks
begin testing mpool
mpool finished (vpack) nerrors=0
WARNING 1 ranks report fault, parallel task halted
ERROR (_mpool_create) attempt to call non-function or index non-array
  LINE: 226  FILE: ~/tmp/yorick/mpy/mpool.i
Type <RETURN> now to debug on rank 0


(2)
In a test involving the scaling of an array in chunks of equal size, I observe the expected behavior consistently(!) only when the message size is below a certain threshold (which itself seems to vary between MPI implementations). If the test below is repeated several times with longer messages, it fails some of the time. As far as I can tell, when it fails the data is still good inside the functions fsow, fwork, and freap, but one or more chunks of the array "x" are incorrect on exit... I also noticed that when there is a failure some of the jobs/chunks are not submitted, and some are processed (in fwork) twice. Typically, if a job is not submitted, the following one is submitted twice.

The test is a simplified template for a radar subaperture focusing algorithm which dispatches chunks of data for parallel processing. mpool could be very useful for processing large "raster" data sets in parallel... The plan was to have yorick handle the parallelization and a C plugin handle the bulk of the low-level processing.

Two systems and two MPI implementations tested:
Open MPI 1.4.3 & MPICH2 1.3.1
Linux 2.6.16 ia64 with icc v11, & Linux 2.6.18 x86_64 with icc v11

> for(i=1;i<10;i++)d=testmp5(300)
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0 min: 0

OPENMPI
> for(i=1;i<10;i++)d=testmp5(600)
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 1.99833 min: -5.36312e+154
diff max: 0 min: 0
diff max: 0 min: 0
diff max: 0.998333 min: -5.36312e+154
diff max: 0.998333 min: -5.36312e+154
diff max: 0.998333 min: -5.36312e+154
diff max: 1.99833 min: -5.36312e+154

MPICH2
> for(i=1;i<10;i++)d=testmp5(9000)
diff max: 0.999889 min: 0
diff max: 0.999889 min: 0
diff max: 8.99989 min: 0
diff max: 0.999889 min: 0
diff max: 1.99989 min: 0
diff max: 8.99989 min: 0
diff max: 0.999889 min: 0
diff max: 1.99989 min: 0
diff max: 0.999889 min: 0

Code:
func testmp (m, n, use_vsave=, self=)
/* DOCUMENT testmp, m, n, use_vsave=, self=
   m: job size
   n: number of jobs
*/
{
  if (is_void(mp_size)) require, "mpool.i";
  else mp_require, "mpool.i";

  if (!mp_rank) {
    if (is_void(m)) m= 10;
    if (is_void(n)) n= 10;
    x= span(0,1,m*n);                  // in
    xo= array(structof(x),dimsof(x));  // out
  }

  if (is_void(mp_size)) {
    //write, "...mpool_test\n";
    pool = mpool_test(fsow, fwork, freap, use_vsave=use_vsave);
    //mpool_stats, pool;
  } else {
    //write, "...pool\n";
    pool = mpool(fsow, fwork, freap, use_vsave=use_vsave,self=self);
    mp_exec,"mp_handin";
    //mpool_stats, pool;
  }

  x-= xo;
 
  if ((xx=abs(x(ptp)))>0)
    write,xx,x(avg),x(rms),format="noop x: %15.8lg %15.8lg %15.8lg\n";

  return x;
}

func fsow(i, d)
{
  if (i > n) return 0;

  extern x,m;
  y= x((i-1)*m+1:i*m)/2;

  if (use_vsave) vsave, d, y;
  else vpack, d, y;

                      write,"sow ",i;
  return 1;
}

func fwork(i, d, r)
{
  local y;
  v = is_stream(d);
  if (v) restore, d, y;
  else vunpack, d, y;
                       write,"work ",i;

  xx= 2*y;

  if (v) vsave, r, xx;
  else vpack, r, xx;
}

func freap(i, r)
{
  local xx;
  v = is_stream(r);
  if (v) restore, r, xx;
  else vunpack, r, xx;

                       write,"reap ",i;
  extern xo,m;
  xo((i-1)*m+1:i*m)= xx;
}


Last edited by Thierry Michel on Mon May 09, 2011 1:41 pm, edited 1 time in total.



Sat Jan 08, 2011 7:33 am
Yorick Guru

No luck trying to debug the MPI message length problem in mpool...


Sat Jan 22, 2011 1:24 pm
Yorick Master

(1)

You misdiagnosed this. It is another unintended consequence of the "bug" I agreed to "fix" for Eric: if the first reference to a symbol in a function is as a keyword, the original yorick implicitly declared that symbol extern; it is now implicitly declared local. The offending line in this case is mpool.i:414. Adding "extern vsave;" as the first line of the mpool_test function fixes the problem. As I said, I'll regret this change... This type of reference really needs to leave the symbol undecided.


Sat Jan 29, 2011 11:01 am
Yorick Guru

I have now replicated the second problem -- mpool errors when the message length increases -- on several Linux systems and MPI implementations (MPICH and Open MPI). In short: when the MPI message size grows to several hundred or a few thousand doubles, some jobs are not submitted, the following chunk is then submitted twice, and this loss of synchronization cascades through all the jobs (see the code in the original post). I would appreciate help in debugging the problem, or even just some advice on how to go about it. I have spent some time on it, but I do not see a clear way forward.
The application I need to parallelize is essentially a tiled 2-D correlation with a space-variant kernel. The MPI message size is dictated by the tile size. The plan was to move the computationally intensive tile processing to a C plugin and to let mpy handle the MPI jobs, as I would rather keep that out of the lower-level code.

Thank you for your help.


Sat Apr 30, 2011 4:11 pm
Yorick Master

Unless you have access to TotalView, the only way I found to debug this was to start mpy, then attach a copy of gdb to each of the processes while they are blocked waiting for input (from you at the keyboard in the case of rank 0, from rank 0 for the others). Then set appropriate breakpoints, continue all the jobs, and give rank 0 the keyboard input that starts your parallel task. Before you do all of this, you should strive mightily to produce an example that exhibits your bug with only two processes, so you only have to deal with two instances of gdb. Setting good breakpoints is even more difficult than usual. Let me know if you make any progress; I haven't had time to look at this yet. I'm most interested in MPICH, if you have nothing else to decide which MPI to test. Good luck.
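Concretely, the attach procedure looks roughly like this on Linux; the PIDs are whatever ps reports, and the breakpoint is a placeholder -- substitute whichever mpy.c routines you decide to watch:

Code:
$ mpirun -np 2 mpy -j testmp.i      # terminal 1: start the job, rank 0 sits at its prompt
$ ps aux | grep mpy                 # terminal 2: find the two rank PIDs
$ gdb -p <pid-of-rank-0>            # one gdb per rank, each in its own terminal
(gdb) break <some mpy.c routine>    # placeholder: pick the actual functions to watch
(gdb) continue

Then go back to the rank-0 prompt and type the command that launches the parallel task.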


Sun May 08, 2011 1:42 pm
Yorick Guru

Thank you for the hints. Now I do remember an earlier post in which you had suggested using TotalView (no access to that yet). I'll post an update if I make progress.

I can produce a similar problem with 2 CPUs, but only when using "self=1" (with testmp.i as above):

Code:
mpirun -np 2 mpy -j ~/testmp/testmp.i
> testmp,5000,2,self=1;
sow          1
sow          2
work         1
work         2
reap         2
reap         1
// ok
> testmp,10000,2,self=1;
sow          1
sow          2
work         2
reap         2
...... hanging (note: work#1 missing)


With a greater number of CPUs, the "skipped" work job (usually #1) produces erroneous behavior:

Code:
mpirun -np 10 mpy -j ~/testmp/testmp.i
> testmp,1000,2;
sow          1
sow          2
work         1
work         2
reap         1
reap         2
// ok
> testmp,100000,2;
sow          1
sow          2
work         2
work         2
reap         2
reap         2
noop x:       0.4999975      0.12499937      0.16137414  (note: work#1 missing)


Mon May 09, 2011 10:06 am
Yorick Master

I believe I have found and fixed this bug. Please retry with the latest github source (commit 7c3b8b84d359b36ccb64).

There was a serious problem that would have broken pretty much any mpy program eventually: The mp_send function did not block until all messages were sent, because I failed to pass the correct count of sent messages to the wait routine. Thus, the rank which called mp_send could continue processing and reuse the buffer memory before MPI had copied the message to the receiving process (or into its own buffers). This is most likely to cause problems with large messages because MPI tends to immediately copy small messages into its own buffers.
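For the record, here is a minimal illustration in plain C of that class of mistake -- this is not mpy's actual code, and the message count and length are arbitrary; it just shows why the wait must cover every posted send before the buffers are touched again:

Code:
/* isend_waitall.c -- illustration only, not the mpy source.
 * Build:  mpicc isend_waitall.c -o isend_waitall
 * Run:    mpirun -np 2 ./isend_waitall
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NMSG 4
#define LEN  100000   /* large enough that most MPIs will not eagerly buffer it */

int main(int argc, char **argv)
{
  int rank, i, j;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 0) {
    double *buf = malloc((size_t)NMSG * LEN * sizeof(double));
    MPI_Request req[NMSG];
    for (i = 0; i < NMSG; i++) {
      for (j = 0; j < LEN; j++) buf[i*LEN + j] = (double)i;  /* payload = message index */
      MPI_Isend(buf + i*LEN, LEN, MPI_DOUBLE, 1, i, MPI_COMM_WORLD, &req[i]);
    }
    /* The count passed here must cover every posted send.  Waiting on
     * fewer requests than were posted (the slip described above)
     * returns control while some messages are still in flight, so the
     * memory can be reused or freed under MPI's feet -- usually
     * harmless for small messages that were eagerly copied, corrupting
     * for large ones. */
    MPI_Waitall(NMSG, req, MPI_STATUSES_IGNORE);
    free(buf);
  } else if (rank == 1) {
    double *buf = malloc(LEN * sizeof(double));
    for (i = 0; i < NMSG; i++) {
      MPI_Recv(buf, LEN, MPI_DOUBLE, 0, i, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      printf("message %d: first element %g (expected %d)\n", i, buf[0], i);
    }
    free(buf);
  }

  MPI_Finalize();
  return 0;
}

With the wait covering fewer than NMSG requests, the same program may still print the expected values for small LEN (eagerly buffered) and garbage for large LEN, which matches the threshold behavior you observed.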

With any luck, this will make many of the growing pains mpy has had disappear. Please keep posting mpy problems here -- the bugs are quite probably real; it is very immature code. Thierry did a model job of reducing the bug to a manageable test code that I could use to replicate the problem. Good work.

Nevertheless, let me suggest the following form for a file containing a function that uses mpool.i, which you can use as a template for your own pool-of-tasks functions. I've stripped out the vpack/vsave choice, because you should only choose vsave if you need to pass structs or pointers (and you should avoid needing to do that), and the self=1 choice, which will typically slow you down by more than the extra processor can speed you up (by causing everyone else to block while rank 0 is working on a task). The mpbug function is intended to be called in serial mode from rank 0 (from the keyboard, a -i command line option, or a serial include).

Code:
/* All ranks must read this, so use mp_include or the -j option on the
 * command line.  Therefore, this will always be executed in parallel
 * mode, and the ordinary require statement is the correct way to
 * handle dependencies.
 */
require, "mpool.i";

func mpbug(m, n)
/* DOCUMENT mpbug, m, n
   m: job size
   n: number of jobs
*/
{
  if (is_void(m)) m= 10;
  if (is_void(n)) n= 10;
  x = xo = array(0., m, n);
  x(*) = indgen(m*n);
  pool = mpool(fsow, fwork, freap);
  x -= xo;
  if ((xx=abs(x(ptp)))>0)
    write,xx,x(avg),x(rms),format="noop x: %15.8lg %15.8lg %15.8lg\n";
  return x;
}

func fsow(i, d)
{
  if (i > n) return 0;
  y = x(,i);
  vpack, d, y;
  return 1;
}

func fwork(i, d, r)
{
  local y;
  vunpack, d, y;
  xx = y;
  vpack, r, xx;
}

func freap(i, r)
{
  local xx;
  vunpack, r, xx;
  xo(,i) = xx;
}
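For instance, assuming you save the above as mpbug.i (the file name and the sizes below are arbitrary), a run would look like:

Code:
$ mpirun -np 4 mpy -j mpbug.i
> mpbug, 1000, 50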


Sun May 22, 2011 10:07 pm
Yorick Guru

Dave, thank you very much. I had a vague idea that it had to do with blocking, but I am not confident that I would have cracked it, even with plenty of time. This opens up a lot of possibilities. Thank you.


Mon May 23, 2011 8:03 am