Topic: Segmentation fault: Script works on PC but does not work on HPC Marconi

Xtof

Dear Alex and OVITO users,

I wrote a script to identify defects in SiC after a cascade. It uses the WS modifier with the per_type_occupancies option.
I developed the script on my PC (i7 processor, Ubuntu 14.04) with version dev105, and it works well. I then wanted to run it on the HPC Marconi, where all my data is stored, but surprisingly it does not work there: it crashes with a segmentation fault. I tried different versions (dev52, dev105, dev115) with the same result. This is the first time such a problem has occurred.
Each node on the HPC Marconi has 2 x 24-core Intel Xeon 8160 CPUs (Skylake) at 2.10 GHz.

Best regards,
Christophe
« Last Edit: February 15, 2018, 10:55:34 AM by Xtof »

Xtof

« Reply #1 on: February 15, 2018, 10:56:37 AM »
Update:
It is not due to this script in particular. Even old scripts that were working on the HPC Marconi no longer work. They must have changed something...

Christophe

Alexander Stukowski

« Reply #2 on: February 15, 2018, 11:32:33 AM »
Christophe,

What happens if you run the following command?

  ovitos -c "pass"

This command will execute just the Python "pass" statement and nothing else. If possible, run this in a debugger such as GDB:

  gdb --args ovitos -c "pass"

Perhaps this allows you to capture a stack trace, which could tell us more about the reason for the segfault.

Xtof

« Reply #3 on: February 15, 2018, 11:45:52 AM »
I was able to run it under gdb, but it shows nothing.

> gdb --args ovitos_dev115 -c "pass"

GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /ovito-3.0.0-dev115-x86_64/bin/ovitos...(no debugging symbols found)...done.
(gdb)

It says "no debugging symbols found". Doesn't that mean the code should have been compiled with the -g option? Since I use the ovitos binary, maybe that is why it does not show anything.

Christophe

Alexander Stukowski

« Reply #4 on: February 15, 2018, 12:35:45 PM »
I forgot to mention: within GDB, type "run" and press Enter to start the execution. If the segfault occurs and execution stops, type "bt" and press Enter to get the backtrace.
Even without debugging symbols in the executable, you should get some initial, approximate information about the location where the crash happened.
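
The same steps can also be scripted so that the full output ends up in a file. The following is a minimal sketch, assuming gdb and ovitos are on the PATH; the helper names and the log file name are hypothetical. It relies on gdb's --batch, -ex and --args options to start the program, print the backtrace once it stops, and then exit.

```python
import subprocess


def gdb_backtrace_command(program, *args):
    """Build a gdb command line that runs `program` and prints a backtrace."""
    return [
        "gdb",
        "--batch",      # exit once the commands below have finished
        "-ex", "run",   # start the program, as typing "run" would
        "-ex", "bt",    # print the backtrace after the program stops
        "--args", program, *args,
    ]


def capture_backtrace(program, *args, log_path="gdb_backtrace.txt"):
    """Run the program under gdb and save the combined output to log_path."""
    result = subprocess.run(
        gdb_backtrace_command(program, *args),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    with open(log_path, "w") as log:
        log.write(result.stdout)
    return result.stdout


# Example (not executed here): capture_backtrace("ovitos", "-c", "pass")
```

If the program exits normally, gdb has no stack to print, so the log will contain no backtrace.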

Xtof

« Reply #5 on: February 15, 2018, 01:33:03 PM »
Sorry, I have only used gdb a few times.

The result of the run inside gdb is:

Starting program: /marconi/home/userexternal/cortiz00/bin/ovitos_dev115 -c pass
warning: File "/marconi/prod/opt/compilers/gnu/6.1.0/none/lib64/libstdc++.so.6.0.22-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
To enable execution of this file add
   add-auto-load-safe-path /marconi/prod/opt/compilers/gnu/6.1.0/none/lib64/libstdc++.so.6.0.22-gdb.py
line to your configuration file "/marconi/home/userexternal/cortiz00/.gdbinit".
To completely disable this security protection add
   set auto-load safe-path /
line to your configuration file "/marconi/home/userexternal/cortiz00/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
   info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffe83b8700 (LWP 54119)]
Missing separate debuginfo for /marconi/home/userexternal/cortiz00/ovito-3.0.0-dev115-x86_64/lib/ovito/plugins/../libfftw3.so.3
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/71/1de7374ad3a5db27f41b97b3ffa32025c08a84.debug
Missing separate debuginfo for /marconi/home/userexternal/cortiz00/ovito-3.0.0-dev115-x86_64/lib/ovito/plugins/../libssl.so.0.9.8
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/6c/0495a5040d7276bdad1f07fbc30285a140446d.debug
Missing separate debuginfo for /marconi/home/userexternal/cortiz00/ovito-3.0.0-dev115-x86_64/lib/ovito/plugins/../libcrypto.so.0.9.8
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/69/4b2605b6fd822a880709e0b17561be88810f41.debug

Program received signal SIGSEGV, Segmentation fault.
PyType_IsSubtype (a=0x7365636e615f7061, b=0x7fffd013d680 <sipVoidPtr_Type>) at Objects/typeobject.c:1343
1343   Objects/typeobject.c: No such file or directory.
Missing separate debuginfos, use: debuginfo-install expat-2.1.0-8.el7.x86_64 fontconfig-2.10.95-7.el7.x86_64 freetype-2.4.11-11.el7.x86_64 glibc-2.17-106.el7_2.8.x86_64 libICE-1.0.9-2.el7.x86_64 libSM-1.2.2-2.el7.x86_64 libX11-1.6.3-2.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libXcursor-1.1.14-2.1.el7.x86_64 libXdamage-1.1.4-4.1.el7.x86_64 libXext-1.3.3-3.el7.x86_64 libXfixes-5.0.1-2.1.el7.x86_64 libXrender-0.9.8-2.1.el7.x86_64 libXxf86vm-1.1.3-2.1.el7.x86_64 libdrm-2.4.60-3.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 libuuid-2.23.2-26.el7_2.3.x86_64 libxcb-1.11-4.el7.x86_64 libxshmfence-1.2-1.el7.x86_64 mesa-libGL-10.6.5-3.20150824.el7.x86_64 mesa-libglapi-10.6.5-3.20150824.el7.x86_64 pcre-8.32-15.el7_2.1.x86_64 xz-libs-5.1.2-12alpha.el7.x86_64 zlib-1.2.7-15.el7.x86_64
(gdb)

Clearly, a segmentation fault occurs. My understanding is that it is due to typeobject.c not being found. I tried dev52 and dev115, with the same result.
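
A note on reading this output: the "No such file or directory" line only means that gdb cannot find the Python source file to display line 1343; it is not the cause of the crash. The more telling detail may be the first argument of PyType_IsSubtype, a=0x7365636e615f7061, which does not look like a valid address. A small sketch, assuming a little-endian x86_64 machine and using a hypothetical helper name, that reads the eight bytes of that value as ASCII:

```python
def pointer_as_ascii(value, size=8):
    """Interpret a pointer-sized integer as little-endian ASCII text."""
    raw = value.to_bytes(size, byteorder="little")
    # Replace non-printable bytes with "." so the result is always readable.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


print(pointer_as_ascii(0x7365636E615F7061))  # prints: ap_ances
```

All eight bytes are printable characters, which suggests that string data was read where a type pointer was expected. That would be consistent with incompatible binary components being mixed at run time, although it does not prove it.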
Do you think it could be due to a problem with modules that are loaded in the environment of the HPC Marconi? They often change things, which ends up messing up everything for the users...
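
One way to check whether the loaded modules change the environment is to print the relevant variables once with and once without the module loaded, and compare. A minimal sketch follows; the list of variables is an assumption about what typically matters for a bundled Python interpreter, not something confirmed in this thread, and the function name is hypothetical.

```python
import os

# Variables that commonly influence which Python installation and shared
# libraries a process picks up. This list is an assumption, not exhaustive.
SUSPECT_VARIABLES = ("PYTHONHOME", "PYTHONPATH", "LD_LIBRARY_PATH", "LD_PRELOAD")


def report_suspects(environ=None):
    """Return the suspect variables that are set, with their values."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in SUSPECT_VARIABLES if name in environ}


# Print whatever is set in the current shell environment.
for name, value in report_suspects().items():
    print(f"{name}={value}")
```

Running this with the system Python in both situations and comparing the two outputs shows which of these variables the module sets or changes.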

I also typed bt but the result is a long sequence of lines of code + hexadecimal info. If you think it could give you a hint, I'll send it to you.

Xtof

« Reply #6 on: February 15, 2018, 02:53:49 PM »
Update.

With the support of the HPC Marconi we found why it fails.
It is due to the module python/3.5.2. OVITO had been working with this module for months, but it suddenly stopped working. They likely changed something or recompiled some modules; who knows...

However, it works with the module python/3.6.4. That is good news.

So it is solved.

The support team told me they would investigate why it fails with the module python/3.5.2. I will keep you updated.

Christophe

Alexander Stukowski

« Reply #7 on: February 15, 2018, 06:58:05 PM »
Okay, good to hear that you have already solved the problem.

I still don't understand why the loaded environment module affects the operation of Ovito on your HPC cluster, but I guess that is not important. What counts is that you got it working. In general, the binary Ovito installation packages come with their own copy of the Python interpreter, so there should be no need to load any Python interpreter on the system in order to use Ovito. However, the Python interpreter shipped with Ovito may not be perfectly isolated from the system environment. Environment variables set on the local system may therefore still interfere with its operation and possibly cause it to crash.
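
To test whether environment variables are interfering, ovitos can be started with the Python-specific variables removed. This is a minimal sketch under stated assumptions: the variable list is a guess at likely candidates (which one, if any, caused the crash in this thread is unknown), and the function names are hypothetical.

```python
import os
import subprocess

# Variables removed before launching. Which one (if any) caused the crash
# in this thread is unknown, so this list is an assumption.
PYTHON_VARIABLES = ("PYTHONHOME", "PYTHONPATH", "PYTHONSTARTUP", "PYTHONUSERBASE")


def without_python_variables(environ):
    """Return a copy of environ with the Python-specific variables removed."""
    return {k: v for k, v in environ.items() if k not in PYTHON_VARIABLES}


def run_isolated(command):
    """Run command with the cleaned environment and return its exit code."""
    return subprocess.call(command, env=without_python_variables(os.environ))


# Example (not executed here): run_isolated(["ovitos", "-c", "pass"])
```

On POSIX systems, a negative return value means the process was terminated by a signal, so a segmentation fault would show up as -11.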