Uninitialized ProcOrient, segmentation fault

Postby arango » Fri Jan 11, 2019 5:55 pm

I am using WRF Version 4.0.3, and I am getting a segmentation fault when running in parallel with 4 MPI nodes using 2x2, 1x4, or 4x1 partitions (patch distributions). The error is in output_wrf.f90 around line 1032:
Code:
       
       p => grid%head_statevars%next
       DO WHILE ( ASSOCIATED( p ) )
         IF ( p%ProcOrient .NE. 'X' .AND. p%ProcOrient .NE. 'Y' ) THEN 

because the character*1 variable p%ProcOrient is never assigned; it only holds its initial value, a blank space ' '. It seems to be an issue in gen_allocs.c. The weird thing is that it runs with 1x2 or 2x1 partitions. It seems that the above conditional is not robust in parallel (I am using OpenMPI). As far as I understand it, ProcOrient is either 'X', 'Y', or ' '. If so, a more robust conditional would be:
Code:
        IF ( p%ProcOrient .EQ. CHAR(32) ) THEN 

where CHAR(32) is the ASCII character for a blank space (SP). Does this make sense? I haven't been able to find a namelist parameter that takes care of this problem. I Googled this error and found information from several years ago (2010) about replacing gen_allocs.c.
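
To see how the two conditionals differ, they can be compared outside WRF. The following is a minimal standalone sketch, not WRF code: the program name and the sample values are made up for illustration, with a NUL byte (ACHAR(0)) standing in for a character that was never assigned. ACHAR is used instead of CHAR so that code 32 is the ASCII blank regardless of the compiler's default character set.

```fortran
program compare_procorient_tests
  implicit none
  ! Hypothetical sample values: the three expected ones plus a NUL byte
  ! standing in for an unassigned character (an assumption, not WRF data).
  character(len=1) :: samples(4)
  logical :: original_test, proposed_test
  integer :: i

  samples = [ 'X', 'Y', ' ', achar(0) ]

  do i = 1, size(samples)
    ! Test as written in output_wrf.f90: true for anything except 'X' or 'Y'.
    original_test = samples(i) .ne. 'X' .and. samples(i) .ne. 'Y'
    ! Proposed test: true only for the ASCII blank.
    proposed_test = samples(i) .eq. achar(32)
    print '(a,i3,2(a,l1))', 'code=', iachar(samples(i)), &
          '  original=', original_test, '  proposed=', proposed_test
  end do
end program compare_procorient_tests
```

Both tests give the same answer for 'X', 'Y', and ' '. They differ only for other byte values, where the original test is true and the CHAR(32) test is false.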

By the way, I am using ifort 19.0.1.144 20181018 and OpenMPI for Darwin.

Any suggestion is appreciated.
Last edited by arango on Sat Jan 12, 2019 11:23 am, edited 1 time in total.
arango
 
Posts: 4
Joined: Wed Dec 05, 2018 5:13 pm

Re: Uninitialized ProcOrient, segmentation fault

Postby kwthomas » Fri Jan 11, 2019 7:02 pm

You can try running on more nodes. Every once in a while, a node configuration doesn't play well with WRF.

The Intel 17.x and Intel 18.x compilers have a history of generating badly optimized code at times. Maybe Intel 19.x has the same problems. None of the systems that I have access to have Intel 19.x.

Try rebuilding WRF from scratch at a lower optimization level if adding more nodes doesn't help.
Kevin W. Thomas
Center for Analysis and Prediction of Storms
University of Oklahoma
kwthomas
 
Posts: 260
Joined: Thu Aug 07, 2008 6:53 pm

Re: Uninitialized ProcOrient, segmentation fault

Postby arango » Fri Jan 11, 2019 10:46 pm

Thank you for the suggestion. It is not related to the optimization level, since I get the same behavior with debugging flags (-g -O0) and with optimization (-O3). I ran it inside the parallel TotalView debugger, which is how I was able to say in detail where it happens. I also made the CHAR(32) modification suggested above, and it still fails at the same line with a SIGSEGV segmentation fault. Maybe the error is not related to that IF conditional but is due to stack memory somewhere else. WRF seems to require lots of memory; however, my application grid is small (200x150x31).
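
One way to think about this: the IF statement only reads a single character through p, so a SIGSEGV there may point to p itself referencing invalid memory rather than to the value of ProcOrient. The sketch below walks a small linked list in the same style as the loop above. It is only an illustration: node_t and its layout are hypothetical stand-ins, not WRF's actual state-variable type, and the default initialization to null() is an assumption about how such a list is kept safe for ASSOCIATED.

```fortran
program walk_list_sketch
  implicit none
  ! Hypothetical stand-in for a state-variable list node; not WRF's type.
  type :: node_t
    character(len=1) :: ProcOrient = ' '
    ! Default-nullified so that ASSOCIATED is well defined on every node.
    type(node_t), pointer :: next => null()
  end type node_t

  type(node_t), pointer :: head, p
  integer :: n_blank

  ! Build head -> node 1 ('X') -> node 2 (left at the default ' ').
  allocate(head)
  allocate(head%next)
  head%next%ProcOrient = 'X'
  allocate(head%next%next)

  ! Same traversal pattern as the loop quoted from output_wrf.f90.
  n_blank = 0
  p => head%next
  do while ( associated( p ) )
    if ( p%ProcOrient .ne. 'X' .and. p%ProcOrient .ne. 'Y' ) n_blank = n_blank + 1
    p => p%next
  end do
  print *, 'nodes that are neither X nor Y:', n_blank

  ! Free the list from the tail back to the head.
  deallocate(head%next%next)
  deallocate(head%next)
  deallocate(head)
end program walk_list_sketch
```

Here ASSOCIATED is reliable because every next pointer starts out nullified. On a pointer whose association status is undefined, ASSOCIATED gives no reliable answer, and dereferencing such a pointer can fault on the first access, which in this loop is the IF line.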

I tried different partitions with more nodes (2x4, 4x2, 3x3) and still get the segmentation fault. It only works with 1x2 or 2x1. I may try with gfortran to see if I get the same behavior, but I need to check first if I have consistent libraries with the same version of gfortran.
arango
 
Posts: 4
Joined: Wed Dec 05, 2018 5:13 pm

