Memory leak in metgrid.exe v 3.8?

Posted: Thu Sep 26, 2019 5:07 pm
by BartBrashers
I'm testing ungrib.exe and metgrid.exe on a new CentOS 7.2 system (3.10.0-693.el7.x86_64). I have previously run these for years on an older system (CentOS 5.6, 2.6.18-238.el5) so I think I know what I'm doing.

Domain is CONUS 12 km, 472 x 312 points. I'm initializing from ETA NAM12, and MUR SST from JPL (processed into 3-hourly netCDF files named SST:YYYY-MM-DD_HH using a custom Fortran program). Nothing I haven't done before.

I'm watching the memory usage with top, and it has gone up to the point of exhausting the 128 GB of RAM plus ~465 GB of swap:

Code:
# rtop c08
---- c08 ----
top - 13:59:03 up  1:12,  0 users,  load average: 1.64, 1.96, 1.64
Tasks: 323 total,   2 running, 321 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  1.5 sy,  0.0 ni, 95.6 id,  2.0 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem : 13201648+total,   325584 free, 13106730+used,   623596 buff/cache
KiB Swap: 48785612+total, 44698992+free, 40866192 used.   273636 avail Mem

 3294 bbrasher   20   0  164.8g 123.3g   1288 D  37.5 98.0  31:50.34 metgrid.exe
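
For scale, here is a rough back-of-the-envelope check of how that figure compares with the size of the domain. It is only a sketch: it assumes fields are held as 4-byte (single-precision) reals, which is an assumption and not something confirmed for this build, and the helper names are made up for illustration.

```python
# Rough size check: how many 2-D slabs of the domain fit in the reported VIRT?
# Assumption (not confirmed for this build): each value is a 4-byte real.
BYTES_PER_VALUE = 4
NX, NY = 472, 312      # grid dimensions quoted in the post
VIRT_GIB = 164.8       # VIRT column for metgrid.exe in the top output


def slab_bytes(nx, ny, bytes_per_value=BYTES_PER_VALUE):
    """Bytes needed to hold one 2-D field on an nx-by-ny grid."""
    return nx * ny * bytes_per_value


def slabs_in(gib, nx, ny):
    """Number of 2-D slabs that would fit in `gib` GiB of memory."""
    return gib * 1024**3 / slab_bytes(nx, ny)


print(f"one 2-D slab: {slab_bytes(NX, NY) / 1024**2:.2f} MiB")
print(f"slabs in {VIRT_GIB} GiB: {slabs_in(VIRT_GIB, NX, NY):,.0f}")
```

Under that assumption one 2-D slab is about 0.56 MiB, so 164.8 GiB corresponds to roughly 300,000 slabs, far more than a handful of fields on a few dozen levels would account for.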

If anyone has any ideas on where to look for the cause of this problem, I'd be very appreciative.


Re: Memory leak in metgrid.exe v 3.8?

Posted: Mon Sep 30, 2019 12:19 pm
by BartBrashers
Sorry, meant to write "memory LEAK" as the title.

If anyone else who is running a similar e_we, e_sn sized domain could post the output from 'top' showing memory usage, I would be most appreciative.
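
To make such comparisons easier, memory growth can also be logged over time instead of read off a single top snapshot. The following is a minimal sketch, not a tested tool: it assumes a Linux system where /proc/<pid>/status exposes VmRSS and VmSwap (older kernels may lack VmSwap, in which case it is reported as 0), and the function names are hypothetical.

```python
import sys
import time


def parse_status(text):
    """Extract VmRSS and VmSwap (in KiB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmRSS", "VmSwap"):
            # Lines look like "VmRSS:   123456 kB"; keep the number only.
            fields[key] = int(rest.split()[0])
    return fields


def monitor(pid, interval=60.0, samples=10):
    """Print resident and swapped memory of `pid` every `interval` seconds."""
    for _ in range(samples):
        with open(f"/proc/{pid}/status") as fh:
            usage = parse_status(fh.read())
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        rss_gib = usage.get("VmRSS", 0) / 1024**2    # KiB -> GiB
        swap_gib = usage.get("VmSwap", 0) / 1024**2  # KiB -> GiB
        print(f"{stamp}  RSS {rss_gib:8.2f} GiB  swap {swap_gib:8.2f} GiB",
              flush=True)
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python memlog.py <pid of metgrid.exe>
    monitor(int(sys.argv[1]))
```

A steadily rising RSS across time periods would point toward a leak, whereas a value that levels off after the first period would suggest a large but bounded working set.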