forrtl: severe (174): SIGSEGV, segmentation fault occurred


Postby alberto » Thu Apr 11, 2013 11:28 am

Hello people,

I am using WRF 3.3.1 and I have this little issue when running wrf.exe. I am running a nested simulation (1 parent and 3 child domains) with resolutions of 9, 3, and 1 km, and the last one at 500 m. When I run wrf.exe, I get the error described in the title of this thread. I am using an external cluster, so wrf.exe is not running on my own computer; it runs on 40 cores. I tried reducing the number of cores, since that was advised in an earlier thread about a similar problem, but even at 22 cores nothing was solved. I cannot go below that, otherwise I think my simulation would take all day: the last one took the whole morning and until mid-afternoon for 1 parent and 2 nests (9, 3, 1 km), so this one would be hectic. Additionally, feedback is off, and the domains do not interfere with each other.

So here I post a sample rsl.error and my namelist.input.

NAMELIST.INPUT::
>>---------------------------------------------------------------------------------------------------
&time_control
run_days = 0,
run_hours = 45,
run_minutes = 0,
run_seconds = 0,
start_year = 2010, 2010, 2010, 2010,
start_month = 03, 03, 03, 03,
start_day = 18, 18, 18, 18,
start_hour = 00, 00, 00, 00,
start_minute = 00, 00, 00, 00,
start_second = 00, 00, 00, 00,
end_year = 2010, 2010, 2010, 2010,
end_month = 03, 03, 03, 03,
end_day = 19, 19, 19, 19,
end_hour = 21, 21, 21, 21,
end_minute = 00, 00, 00, 00,
end_second = 00, 00, 00, 00,
interval_seconds = 10800
input_from_file = .true.,.true.,.true.,.true.,
history_interval = 180, 180, 180, 180,
frames_per_outfile = 1, 1, 1, 1,
restart = .false.,
restart_interval = 120000,
io_form_history = 2
io_form_restart = 2
io_form_input = 2
io_form_boundary = 2
debug_level = 0
/

&domains
time_step = 54,
time_step_fract_num = 0,
time_step_fract_den = 1,
max_dom = 4,
s_we = 1, 1, 1, 1,
e_we = 54, 100, 241, 377,
s_sn = 1, 1, 1, 1,
e_sn = 54, 100, 241, 377,
s_vert = 1, 1, 1, 1,
e_vert = 39, 39, 39, 39,
num_metgrid_soil_levels = 4
num_metgrid_levels = 25
dx = 9000, 3000, 1000, 500,
dy = 9000, 3000, 1000, 500,
grid_id = 1, 2, 3, 4,
parent_id = 0, 1, 2, 3,
i_parent_start = 1, 10, 10, 20,
j_parent_start = 1, 10, 10, 20,
parent_grid_ratio = 1, 3, 3, 2,
parent_time_step_ratio = 1, 3, 3, 2,
feedback = 0,
p_top_requested = 1000
smooth_option = 0
use_adaptive_time_step = .true.
step_to_output_time = .true.
target_cfl = 1.2,1.2,1.2,1.2
max_step_increase_pct = 5,51,51,51,
starting_time_step = -1,15,4,-1,
max_time_step = -1,24,8,-1,
min_time_step = -1,12,4,-1,
eta_levels = 1.00,0.9976,0.9942,0.9896,0.9838,0.9767,0.9686,0.9591,0.9469,0.9306,0.9092,0.8792,0.8369,0.7888,0.7357,0.6798,0.6207,0.5624,0.5053,0.4505,0.3999,0.3545,0.3138,0.2773,0.2448,0.2157,0.1897,0.1667,0.1467,0.1286,0.1112,0.0945,0.0792,0.0643,0.0502,0.0365,0.0233,0.0102,0.00,
/

&physics
mp_physics = 6, 6, 6, 6,
ra_lw_physics = 1, 1, 1, 1,
ra_sw_physics = 2, 2, 2, 2,
radt = 3, 1, 1, 1,
sf_sfclay_physics = 5, 5, 5, 5,
sf_surface_physics = 2, 2, 2, 2,
bl_pbl_physics = 5, 5, 5, 5,
bldt = 0, 0, 0, 0,
cu_physics = 3, 3, 3, 3,
cudt = 5, 5, 5, 5,
isfflx = 1,
ifsnow = 0,
icloud = 1,
surface_input_source = 1,
num_soil_layers = 4,
num_land_cat = 115,
sf_urban_physics = 0,
mp_zero_out = 0,
maxiens = 1,
maxens = 3,
maxens2 = 3,
maxens3 = 16,
ensdim = 144,
slope_rad = 0,
topo_shading = 0,
windturbines_spec = "windspec_alb.in"
/

&fdda
/

&dynamics
w_damping = 1,
diff_opt = 1,
km_opt = 4,
diff_6th_opt = 0,
diff_6th_factor = 0.12,
base_temp = 290.
damp_opt = 3,
w_damping = 1,
zdamp = 5000., 5000., 5000., 5000.,
dampcoef = 0.05, 0.01, 0.01, 0.01,
khdif = 0, 0, 0, 0,
kvdif = 0, 0, 0, 0,
non_hydrostatic = .true., .true., .true., .true.,
/

&bdy_control
spec_bdy_width = 5,
spec_zone = 1,
relax_zone = 4,
specified = .true., .false.,.false.,.false.,
nested = .false., .true., .true., .true.,
/

&grib2
/

&namelist_quilt
nio_tasks_per_group = 0,
nio_groups = 1,
/
>>>--------------------------------------------------------------------------------------

RSL.ERROR::

taskid: 21 hostname: blade002
Ntasks in X 2, ntasks in Y 11
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: sst_update is 0, setting io_form_auxinput4 = 0 and auxinput4_interval = 0 for all domains
--- NOTE: grid_fdda is 0 for domain 1, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 1, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 1, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: grid_fdda is 0 for domain 2, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 2, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 2, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: grid_fdda is 0 for domain 3, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 3, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 3, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: grid_fdda is 0 for domain 4, setting gfdda interval and ending time to 0 for that domain.
--- NOTE: both grid_sfdda and pxlsm_soil_nudge are 0 for domain 4, setting sgfdda interval and ending time to 0 for that domain.
--- NOTE: obs_nudge_opt is 0 for domain 4, setting obs nudging interval and ending time to 0 for that domain.
--- NOTE: num_soil_layers has been set to 4
WRF V3.3.1 MODEL
*************************************
Parent domain
ids,ide,jds,jde 1 54 1 54
ims,ime,jms,jme 21 59 43 59
ips,ipe,jps,jpe 28 54 50 54
*************************************
DYNAMICS OPTION: Eulerian Mass Coordinate
alloc_space_field: domain 1, 19497816 bytes allocated
med_initialdata_input: calling input_input
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
*************************************
Nesting domain
ids,ide,jds,jde 1 100 1 100
ims,ime,jms,jme 41 105 81 105
ips,ipe,jps,jpe 51 100 91 100
INTERMEDIATE domain
ids,ide,jds,jde 8 46 8 46
ims,ime,jms,jme 17 51 30 51
ips,ipe,jps,jpe 27 48 40 48
*************************************
d01 2010-03-18_00:00:00 alloc_space_field: domain 2, 44629728 bytes allocated
d01 2010-03-18_00:00:00 alloc_space_field: domain 2, 3757600 bytes allocated
d01 2010-03-18_00:00:00 *** Initializing nest domain # 2 from an input file. ***
d01 2010-03-18_00:00:00 med_initialdata_input: calling input_input
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
WRF TILE 1 IS 28 IE 54 JS 50 JE 54
WRF NUMBER OF TILES = 1
*************************************
Nesting domain
ids,ide,jds,jde 1 241 1 241
ims,ime,jms,jme 111 246 210 246
ips,ipe,jps,jpe 121 241 220 241
INTERMEDIATE domain
ids,ide,jds,jde 8 93 8 93
ims,ime,jms,jme 40 98 73 98
ips,ipe,jps,jpe 50 95 83 95
*************************************
d02 2010-03-18_00:00:00 alloc_space_field: domain 3, 132964384 bytes allocated
d02 2010-03-18_00:00:00 alloc_space_field: domain 3, 7485920 bytes allocated
d02 2010-03-18_00:00:00 *** Initializing nest domain # 3 from an input file. ***
d02 2010-03-18_00:00:00 med_initialdata_input: calling input_input
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
WRF TILE 1 IS 51 IE 100 JS 91 JE 100
WRF NUMBER OF TILES = 1
*************************************
Nesting domain
ids,ide,jds,jde 1 377 1 377
ims,ime,jms,jme 179 382 333 382
ips,ipe,jps,jpe 189 377 343 377
INTERMEDIATE domain
ids,ide,jds,jde 18 211 18 211
ims,ime,jms,jme 104 216 181 216
ips,ipe,jps,jpe 114 213 191 213
*************************************
d03 2010-03-18_00:00:00 alloc_space_field: domain 4, 265477308 bytes allocated
d03 2010-03-18_00:00:00 alloc_space_field: domain 4, 19851840 bytes allocated
d03 2010-03-18_00:00:00 *** Initializing nest domain # 4 from an input file. ***
d03 2010-03-18_00:00:00 med_initialdata_input: calling input_input
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
INPUT LandUse = "USGS"
INITIALIZE THREE Noah LSM RELATED TABLES
WRF TILE 1 IS 121 IE 241 JS 220 JE 241
WRF NUMBER OF TILES = 1
WRF TILE 1 IS 189 IE 377 JS 343 JE 377
WRF NUMBER OF TILES = 1
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
wrf.exe 000000000131E334 Unknown Unknown Unknown
wrf.exe 000000000131920E Unknown Unknown Unknown
wrf.exe 0000000001314A98 Unknown Unknown Unknown
wrf.exe 000000000131324B Unknown Unknown Unknown
wrf.exe 0000000000ED603D Unknown Unknown Unknown
wrf.exe 0000000000FE3D15 Unknown Unknown Unknown
wrf.exe 0000000000BE3365 Unknown Unknown Unknown
wrf.exe 0000000000B22EF0 Unknown Unknown Unknown
wrf.exe 00000000004F75EB Unknown Unknown Unknown
wrf.exe 00000000004F7937 Unknown Unknown Unknown
wrf.exe 00000000004F7937 Unknown Unknown Unknown
wrf.exe 00000000004F7937 Unknown Unknown Unknown
wrf.exe 000000000048F783 Unknown Unknown Unknown
wrf.exe 000000000048F737 Unknown Unknown Unknown
wrf.exe 000000000048F6CC Unknown Unknown Unknown
libc.so.6 00002AAAAC3B39C4 Unknown Unknown Unknown
wrf.exe 000000000048F5C9 Unknown Unknown Unknown
>>-----------------------------------------------------------------------------------------------------




Please give me your feedback on this.

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby jimmyc » Sat Apr 13, 2013 4:50 pm

This is a memory allocation error. You need more memory. Normally I would say add more processors, but that presumes each processor has its own memory, which isn't really the case anymore. So I would say add more nodes, if your cluster has them.
The views expressed in this message do not necessarily reflect those of NOAA or the National Weather Service or the University of Oklahoma.
James Correia, Jr
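
To put a rough number on the memory question, here is a minimal sketch that adds up the alloc_space_field byte counts as read from the rsl.error posted above (task 21 only). It is an assumption that these lines are a useful lower bound: they do not include stack usage, MPI buffers, or anything else wrf.exe allocates, and other tasks may report different sizes.

```shell
#!/bin/sh
# alloc_space_field lines as read from the posted rsl.error (task 21).
log=$(cat <<'EOF'
alloc_space_field: domain 1, 19497816 bytes allocated
d01 2010-03-18_00:00:00 alloc_space_field: domain 2, 44629728 bytes allocated
d01 2010-03-18_00:00:00 alloc_space_field: domain 2, 3757600 bytes allocated
d02 2010-03-18_00:00:00 alloc_space_field: domain 3, 132964384 bytes allocated
d02 2010-03-18_00:00:00 alloc_space_field: domain 3, 7485920 bytes allocated
d03 2010-03-18_00:00:00 alloc_space_field: domain 4, 265477308 bytes allocated
d03 2010-03-18_00:00:00 alloc_space_field: domain 4, 19851840 bytes allocated
EOF
)

# Sum the number that precedes the word "bytes" on each matching line.
total_bytes=$(printf '%s\n' "$log" | awk '
  /alloc_space_field/ {
    for (i = 2; i <= NF; i++) if ($i == "bytes") sum += $(i - 1)
  }
  END { printf "%d", sum }')

# Integer MiB (rounded down) is enough for a rough comparison with node memory.
total_mib=$((total_bytes / 1048576))
echo "task 21 field allocations: $total_bytes bytes (about $total_mib MiB)"
```

Multiplying this by the number of MPI tasks placed on one node gives a first guess at what that node must hold, which can then be compared with the memory actually available there.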

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby alberto » Mon Apr 15, 2013 10:37 am

Well, I am running the problem on 40 processors, and I also tried 54, but that just didn't work. Maybe reducing them would help; I have heard that recommendation at my workplace, but it will still take me a long time to find out...

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby alberto » Mon Apr 15, 2013 10:41 am

I now get this in my rsl.error from running real.exe:


------
Using sfcpr3 to compute psfc
d01 2010-03-18_00:00:00 Warning: vapor pressure exeeds total pressure, settins Qv to 0.
...
(the same warning repeats all the way through d03)
But at the end it does say SUCCESS COMPLETE REAL_EM INIT.

Then when running wrf.exe I get the same error I posted earlier.

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby Phillip » Tue Apr 16, 2013 7:16 am

From the WRF manual:
"parent-to-nest domain grid size ratio: for real-data cases the ratio has to be odd; for idealized cases, the ratio can be even if feedback is set to 0."

See whether choosing an odd grid ratio solves your problem.
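
As a quick way to spot the issue, here is a minimal sketch that checks each value on a parent_grid_ratio line for parity. The sample line is copied from the namelist posted above; the function name check_ratios is just an illustration, not part of WRF.

```shell
#!/bin/sh
# parent_grid_ratio line as posted in the namelist.input above.
line='parent_grid_ratio = 1, 3, 3, 2,'

# Report, per domain, whether the ratio is odd or even.
# The body runs in a subshell so the IFS change does not leak out.
check_ratios() (
  values=${1#*=}          # keep only what follows the '='
  dom=0
  IFS=','
  for ratio in $values; do
    ratio=$(printf '%s' "$ratio" | tr -d '[:space:]')
    [ -z "$ratio" ] && continue   # ignore empty fields
    dom=$((dom + 1))
    if [ $((ratio % 2)) -eq 0 ]; then
      echo "domain $dom: ratio $ratio is even"
    else
      echo "domain $dom: ratio $ratio is odd"
    fi
  done
)

report=$(check_ratios "$line")
echo "$report"
```

For the posted namelist this flags only the innermost nest (ratio 2), which is the value the manual quote is about.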

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby alberto » Tue Apr 16, 2013 9:49 am

Hey, yep, I also tried that, and I also shrank the domains, but the segmentation fault kept happening...

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby bassline » Mon Apr 22, 2013 7:53 am

Try "export MALLOC_CHECK_=0"

or try setting "ulimit -s unlimited" in your script.

:)

Anyway, you can also set debug_level = 1000 in your namelist to see where the problem is, and you can compile with the flags "-g -traceback" to get more details.
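
For the debug_level part, a minimal sketch of changing the value without hand-editing. It works on a small stand-in file in a temporary directory (an assumption for illustration); on a real run you would point it at your own namelist.input and keep the backup.

```shell
#!/bin/sh
# Stand-in for the real namelist.input, using lines from the posted namelist.
workdir=$(mktemp -d)
cat > "$workdir/namelist.input" <<'EOF'
&time_control
 io_form_boundary = 2
 debug_level = 0
/
EOF

# Keep a backup, then rewrite only the debug_level line.
cp "$workdir/namelist.input" "$workdir/namelist.input.bak"
sed 's/^\([[:space:]]*debug_level[[:space:]]*=\).*/\1 1000,/' \
    "$workdir/namelist.input.bak" > "$workdir/namelist.input"

# Show the result.
grep debug_level "$workdir/namelist.input"
```

With a high debug_level the rsl.error files become much longer, so the last lines before the crash are the ones worth reading.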

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby alberto » Mon Apr 22, 2013 8:43 am

Hey, thanks man, I will check them. But by the way, what is MALLOC_CHECK_=0?
And about the ulimit: when you say "the script", where exactly do you mean, in configure or compile?

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby AchternDiek » Tue Apr 23, 2013 6:01 am

"ulimit -s unlimited" is a common fix to avoid early segmentation faults with WRF.
ulimit is a shell built-in that sets resource limits for the current shell and the programs started from it; the -s option sets the maximum stack size.
Put it in the script you run your wrf.exe with. This command is independent of configure and compile.
You can also just type this command into your terminal / command line before starting WRF.

"man ulimit" will give you the manual entry.
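
A minimal sketch of what such a run script could look like. The mpirun line and the core count of 40 are assumptions based on the first post; your cluster may use a different launcher or a batch system, and the administrators may cap the stack limit so that "unlimited" is refused.

```shell
#!/bin/sh
# Try to lift the stack limit; a lower hard limit will make this fail.
if ulimit -s unlimited 2>/dev/null; then
  echo "stack limit raised"
else
  echo "could not raise the stack limit"
fi
stack_limit=$(ulimit -s)
echo "stack limit is now: $stack_limit"

# glibc variable suggested earlier in the thread (note the trailing underscore).
export MALLOC_CHECK_=0

# Hypothetical launch line: replace with whatever your cluster uses.
if [ -x ./wrf.exe ]; then
  mpirun -np 40 ./wrf.exe
else
  echo "wrf.exe not found in this directory; skipping launch"
fi
```

The limit only applies to this shell and the processes it starts. On some clusters the compute nodes take their limits from elsewhere, so it is worth printing "ulimit -s" from inside the job to confirm.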

Good luck!

--AchternDiek

Re: forrtl: severe (174): SIGSEGV, segmentation fault occurr

Postby alberto » Wed Apr 24, 2013 4:50 am

Thanks man, very much. Here I use a cluster and a script for running the programs. I am not the administrator, but I will make sure to set that in the script.

XD.