MPI Fortran compiler optimization error
This question already has an answer here:
- MPI_Recv changes value of count (1 answer)
Despite having written long, heavily parallelized codes with complicated sends/receives on three-dimensional arrays, a simple piece of code working on a two-dimensional array of integers has got me at my wits' end. I combed Stack Overflow for possible solutions and found one that resembled the issue I am having:
boost.mpi: what's received isn't sent!
However, the solutions there point to a looping segment of code as the culprit overwriting sections of memory. This one seems to act stranger; maybe it is a careless oversight of some simple detail on my part. The problem is with the code below:
    program main
        implicit none

        include 'mpif.h'

        integer :: i, j
        integer :: counter, offset
        integer :: rank, ierr, stval
        integer, dimension(10, 10) :: passmat, prntmat   !! passmat contains values that are passed to prntmat

        call mpi_init(ierr)
        call mpi_comm_rank(mpi_comm_world, rank, ierr)

        counter = 0
        offset = (rank + 1)*300
        do j = 1, 10
            do i = 1, 10
                prntmat(i, j) = 10                 !! prntmat of both ranks contains 10
                passmat(i, j) = offset + counter   !! passmat of rank=0 contains 300..399, rank=1 contains 600..699
                counter = counter + 1
            end do
        end do

        if (rank == 1) then
            call mpi_send(passmat(1:10, 1:10), 100, mpi_integer, 0, 1, mpi_comm_world, ierr)  !! send passmat of rank=1 to rank=0
        else
            call mpi_recv(prntmat(1:10, 1:10), 100, mpi_integer, 1, 1, mpi_comm_world, stval, ierr)
            do i = 1, 10
                print *, prntmat(:, i)
            end do
        end if

        call mpi_finalize(ierr)
    end program main

When I compile the code with mpif90 with no flags and run it on my machine with mpirun -np 2, I get the following output, with wrong values in the first four indices of the array:
0 0 400 0 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699
However, when I compile with the same compiler with the -O3 flag on, I get the correct output:
600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699
This error is machine dependent. The issue turns up only on my system running Ubuntu 14.04.2, using OpenMPI 1.6.5.
I tried this on other systems running RedHat and CentOS, and the code ran correctly both with and without the -O3 flag. Curiously, those machines use an older version of OpenMPI - 1.4.
I am guessing that the -O3 flag is performing some odd optimization that modifies the manner in which arrays are being passed between the processes.
I also tried other versions of array allocation. The above code uses explicit-shape arrays. With assumed-shape and allocatable arrays as the receiving buffer I got equally bizarre, if not more bizarre, results, with some of them seg-faulting; a rough sketch of what I tried is below. I tried using Valgrind to trace the origin of these seg-faults, but I still haven't gotten the hang of getting Valgrind not to give false positives when running MPI programs.
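The allocatable variant of the receive buffer looked roughly like the following (a sketch from memory, not the exact code; the allocation size and receive call are approximations):

    ! Approximate sketch of the allocatable-array receive buffer I tried
    integer, dimension(:, :), allocatable :: prntmat

    allocate(prntmat(10, 10))     ! buffer allocated before posting the receive
    prntmat = 10
    call mpi_recv(prntmat, 100, mpi_integer, 1, 1, mpi_comm_world, stval, ierr)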
I believe that resolving the difference in the behavior of the above code will help me understand the tantrums of my other codes as well.
Any help is appreciated! This code has really gotten me questioning whether the other MPI codes I wrote are sound at all.
Using the Fortran 90 interface to MPI reveals the mismatch in your call to mpi_recv:
    call mpi_recv(prntmat(1:10, 1:10), 100, mpi_integer, 1, 1, mpi_comm_world, stval, ierr)
                                                                               1
    Error: There is no specific subroutine for the generic 'mpi_recv' at (1)

This is because the status variable stval is an integer scalar, rather than an array of size mpi_status_size. The F77 interface (include 'mpif.h') for mpi_recv is:
    include 'mpif.h'
    mpi_recv(buf, count, datatype, source, tag, comm, status, ierror)
        <type>     buf(*)
        integer    count, datatype, source, tag, comm
        integer    status(mpi_status_size), ierror
Changing

    integer :: rank, ierr, stval

to

    integer :: rank, ierr, stval(mpi_status_size)

produces a program that works as expected, tested with gfortran 5.1 and OpenMPI 1.8.5.
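As an aside (not something the fix above relies on): if the status fields are never inspected, standard MPI also lets you pass MPI_STATUS_IGNORE in place of a status array, which avoids the sizing issue altogether:

    ! Alternative when the status is not needed: pass MPI_STATUS_IGNORE
    ! instead of a correctly sized status array (standard MPI, shown as an aside)
    call mpi_recv(prntmat(1:10, 1:10), 100, mpi_integer, 1, 1, &
                  mpi_comm_world, mpi_status_ignore, ierr)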
Using the F90 interface (use mpi vs. include "mpif.h") lets the compiler detect mismatched arguments at compile time rather than producing confusing runtime problems.
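To illustrate, here is a minimal self-contained sketch using the module interface together with a correctly sized status array (the program name, buffer contents, and print statement are mine, not from the original question):

    program recv_status_demo
        use mpi                            ! F90 module interface: argument mismatches are caught at compile time
        implicit none
        integer :: rank, ierr
        integer :: stval(mpi_status_size)  ! status must be an array of size mpi_status_size
        integer, dimension(10, 10) :: buf

        call mpi_init(ierr)
        call mpi_comm_rank(mpi_comm_world, rank, ierr)

        buf = rank
        if (rank == 1) then
            call mpi_send(buf, 100, mpi_integer, 0, 1, mpi_comm_world, ierr)
        else if (rank == 0) then
            call mpi_recv(buf, 100, mpi_integer, 1, 1, mpi_comm_world, stval, ierr)
            print *, 'first element received from rank 1: ', buf(1, 1)
        end if

        call mpi_finalize(ierr)
    end program recv_status_demo

Run it with mpirun -np 2. With use mpi, passing a scalar stval here is rejected at compile time instead of silently corrupting memory at run time.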