[Hpc-forum] MPI problems

etele molnar etele.molnar at gmail.com
Sat, 28 Sep 2013, 12:12:04 CEST


Dear Gabor and Ferenc,

We have now reached the point where the code runs several times in
succession (1, 2, 3, ...), but after a while errors still appear and the
job stops. This happened this morning (28.09.2013).

I will not go into the details here; instead I will try to send the
files as an attachment.

Thanks in advance,
Regards,
e



On 27-Sep-13 1:27 PM, Rőczei Gábor wrote:
> On 2013.09.27., at 8:36, Ferenc Bartha wrote:
>
>> Again, only a partial assessment:
>>
>>> Error: NCE_PACKAGES not set
>> module use /opt/nce/modulefiles alone is not enough; module load nce/global is also needed after it.
>>
>> There is also something suspicious about how two MPI programs started in the background from a single job might behave. I would rather not go into it now, but I advise you not to use that approach in the meantime.
> I also think this is the problem.
>
> I have now added this to /home/emolnar/.bashrc as well. It currently contains:
>
> module use /opt/nce/modulefiles
> module load nce/global
> module load openmpi/1.6.3-gcc-4.7.2  gcc/4.7.2
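A small check along these lines could be placed at the top of a job script to catch the "NCE_PACKAGES not set" situation early. This is only a sketch: check_nce_env is a hypothetical helper name, and it assumes that loading nce/global is what sets NCE_PACKAGES, as Ferenc's remark suggests.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): fail early if the NCE module
# environment has not been set up in this shell.
check_nce_env() {
    if [ -z "${NCE_PACKAGES:-}" ]; then
        echo "NCE_PACKAGES is not set; run 'module load nce/global' first" >&2
        return 1
    fi
    return 0
}
```

In a job script it would be called as "check_nce_env || exit 1" before the first mpirun.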
>
> The HPC_run_cooperfrye_LHC2760_visc_all_serial.sh and HPC_run_cooperfrye_RHIC200_visc_all_serial.sh scripts started fine.
>
> Etele,
>
> Please test it yourself as well.
>
> Gábor
>
>> ----- Original Message ----- From: "etele molnar" <molnar at fias.uni-frankfurt.de>
>> To: "Rőczei Gábor" <roczei at niif.hu>; "etele molnar" <etele.molnar at gmail.com>
>> Cc: <hpc-forum at listserv.niif.hu>
>> Sent: Friday, September 27, 2013 7:50 AM
>> Subject: Re: [Hpc-forum] MPI problems
>>
>>
>> Dear Gabor and everyone,
>>
>> First of all, thank you very much for the answers and the help;
>> we are almost there.
>>
>> module use /opt/nce/modulefiles
>> did help, and the program now runs on 12, 24, and 48 slots. But there
>> is a new error that appears before the program runs. The program then
>> runs (successfully), but on the next invocation it does not start and
>> the job is aborted.
>>
>> I called the current test programs from bash like this:
>>
>> mpirun -np 48 ./ program ... ;
>> mpirun -np 48 ./ program ... ;
>> ...
>> wait
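The sequence above can be sketched so that each step runs strictly after the previous one and the script stops at the first failure instead of carrying on. This is an illustration only: run_steps is a hypothetical helper, and the program names in the usage note are placeholders, not the real commands.

```shell
#!/bin/bash
# run_steps: run each argument as one command line, strictly in sequence,
# and stop at the first one that exits with a non-zero status.
run_steps() {
    local step
    for step in "$@"; do
        echo "starting: $step"
        if ! bash -c "$step"; then
            echo "failed: $step" >&2
            return 1
        fi
    done
}
```

A call might look like: run_steps "mpirun -np 48 ./program_a" "mpirun -np 48 ./program_b" (placeholder names). Nothing is put in the background here, so no wait is needed.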
>>
>>
>> Error: NCE_PACKAGES not set
>> ...
>> Error: NCE_PACKAGES not set
>>
>> real    11m7.497s
>> user    39m5.807s
>> sys    0m1.416s
>> [r1i0n6:09241] opal_os_dirpath_create: Error: Unable to create the
>> sub-directory
>> (/scratch/tmp/425790.1.parallel.q/openmpi-sessions-emolnar at r1i0n6_0/39007)
>> of
>> (/scratch/tmp/425790.1.parallel.q/openmpi-sessions-emolnar at r1i0n6_0/39007/0/0),
>> mkdir failed [1]
>> [r1i0n6:09241] [[39007,0],0] ORTE_ERROR_LOG: Error in file
>> util/session_dir.c at line 106
>> [r1i0n6:09241] [[39007,0],0] ORTE_ERROR_LOG: Error in file
>> util/session_dir.c at line 399
>> [r1i0n6:09241] [[39007,0],0] ORTE_ERROR_LOG: Error in file
>> ess_hnp_module.c at line 320
>> --------------------------------------------------------------------------
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>   orte_session_dir failed
>>   --> Returned value Error (-1) instead of ORTE_SUCCESS
>> --------------------------------------------------------------------------
>> [r1i0n6:09241] [[39007,0],0] ORTE_ERROR_LOG: Error in file
>> runtime/orte_init.c at line 128
>> --------------------------------------------------------------------------
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>>   orte_ess_set_name failed
>>   --> Returned value Error (-1) instead of ORTE_SUCCESS
>> --------------------------------------------------------------------------
>> [r1i0n6:09241] [[39007,0],0] ORTE_ERROR_LOG: Error in file orterun.c at
>> line 694
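The mkdir failures above concern the per-job session directory under /scratch/tmp. A minimal sketch of a check that could run before each mpirun, assuming the scheduler exports that per-job directory as TMPDIR (an assumption; the thread does not confirm it). scratch_ok is a hypothetical helper name.

```shell
#!/bin/bash
# scratch_ok [DIR]: succeed only if DIR (default: $TMPDIR, else /tmp)
# exists and a sub-directory can be created and removed inside it.
scratch_ok() {
    local dir="${1:-${TMPDIR:-/tmp}}"
    local probe
    [ -d "$dir" ] || { echo "missing: $dir" >&2; return 1; }
    probe=$(mktemp -d "$dir/probe.XXXXXX" 2>/dev/null) \
        || { echo "cannot create sub-directory in: $dir" >&2; return 1; }
    rmdir "$probe"
}
```

Calling "scratch_ok || exit 1" between two mpirun invocations would show whether the directory disappeared or became unwritable after the first run.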
>>
>>
>> Thanks in advance,
>> Regards,
>> e
>>
>> On 25-Sep-13 11:17 AM, Rőczei Gábor wrote:
>>> Dear Etele,
>>>
>>>> I recompiled the program with mpicxx or mpic++ (gcc)
>>>> and tried to run it, but unfortunately I still get an error message.
>>>>
>>>> I used mpirun, 1 job, 8G memory, np=12:
>>>>
>>>> Warning: Permanently added '[r1i1n3.ice.debrecen.hpc.niif.hu]:58158,[10.148.0.21]:58158' (RSA) to the list of known hosts.
>>>> Warning: Permanently added '[r1i0n15.ice.debrecen.hpc.niif.hu]:45141,[10.148.0.17]:45141' (RSA) to the list of known hosts.
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'openmpi/1.6.3-gcc-4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'openmpi/1.6.3-gcc-4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'gcc/4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'gcc/4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'openmpi/1.6.3-gcc-4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'gcc/4.7.2'
>>>> --------------------------------------------------------------------------
>>>> WARNING: It appears that your OpenFabrics subsystem is configured to only
>>>> allow registering part of your physical memory.  This can cause MPI jobs to
>>>> run with erratic performance, hang, and/or crash.
>>>>
>>>> This may be caused by your OpenFabrics vendor limiting the amount of
>>>> physical memory that can be registered.  You should investigate the
>>>> relevant Linux kernel module parameters that control how much physical
>>>> memory can be registered, and increase them to allow registering all
>>>> physical memory on your machine.
>>>>
>>>> See this Open MPI FAQ item for more information on these Linux kernel module
>>>> parameters:
>>>>
>>>>      http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>>>>
>>>>    Local host:              r1i0n14
>>>>    Registerable memory:     32768 MiB
>>>>    Total memory:            49143 MiB
>>>>
>>>> Your MPI job will continue, but may be behave poorly and/or hang.
>>>> --------------------------------------------------------------------------
>>>> [r1i0n14:21422] 47 more processes have sent help message help-mpi-btl-openib.txt / reg mem limit low
>>>> [r1i0n14:21422] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>> I managed to figure out what the problem was. This parameter has to be set in the mlx4_core kernel module configuration of the Debrecen CN machines:
>>>
>>> options mlx4_core log_mtts_per_seg=5
>>>
>>> Since not many parallel jobs were running in Debrecen this morning, I was able to reboot most of the CN machines so that this setting takes effect. The ones I could not reboot yet I have put into the disabled state until the jobs running on them finish.
>>>
>>>> Furthermore, if I try to run the same program with
>>>>
>>>> mpirun.sge (it was not recompiled; the gcc build was kept), I get
>>>> the same error message, only much longer, and the job stops immediately...
>>> Please do not use mpirun.sge for OpenMPI jobs; it only works with SGI MPT.
>>>
>>> With OpenMPI you have to use mpirun. Example:
>>>
>>> #!/bin/bash
>>> #$ -N CONNECTIVITY
>>> #$ -pe mpi 120
>>>
>>> mpirun -np $NSLOTS ./connectivity -v
>>>
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'openmpi/1.6.3-gcc-4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'gcc/4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'openmpi/1.6.3-gcc-4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'openmpi/1.6.3-gcc-4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'gcc/4.7.2'
>>>> ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'gcc/4.7.2'
>>> You should put this in your .bashrc file:
>>>
>>> module use /opt/nce/modulefiles
>>>
>>>> Two last "theoretical" questions.
>>>> For example: one program needs 4 GB of memory. If I want to run 2 programs in parallel (./ program & ./ program &),
>>>> should I request 2*4 GB of memory or only 4?
>>>>> #$ -l h_vmem=8G or 4 GB
>>> With h_vmem you specify how much memory is needed per slot. So: #$ -l h_vmem=4G
>>>
>>>> The same program in its MPI version, say on 6 slots:
>>>> mpirun -np 6 program
>>>>
>>>> Then:
>>>>> #$ -l h_vmem=4G or 6*4=24 GB
>>> Solution:
>>>
>>>   #$ -l h_vmem=4G
>>>
>>> Gábor
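Gábor's answer means the request is made per slot, so the job as a whole gets the per-slot value times the number of slots. A small sketch of that arithmetic, assuming h_vmem is accounted per slot as described above; total_vmem_gb is a hypothetical helper name.

```shell
#!/bin/bash
# total_vmem_gb SLOTS PER_SLOT_GB: print the memory the whole job is
# granted when h_vmem is requested per slot (SLOTS * PER_SLOT_GB).
total_vmem_gb() {
    echo $(( $1 * $2 ))
}

total_vmem_gb 6 4   # prints 24
```

So for the 6-slot MPI case, requesting h_vmem=4G already corresponds to 24 GB in total; requesting 24G would ask for 24 GB per slot.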
>>
>> _______________________________________________
>> Hpc-forum mailing list
>> Hpc-forum at listserv.niif.hu
>> https://listserv.niif.hu/mailman/listinfo/hpc-forum

--------- next part ---------
Warning: Permanently added '[r1i1n9.ice.debrecen.hpc.niif.hu]:55682,[10.148.0.27]:55682' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n8.ice.debrecen.hpc.niif.hu]:36528,[10.148.0.26]:36528' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n10.ice.debrecen.hpc.niif.hu]:52600,[10.148.0.28]:52600' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n11.ice.debrecen.hpc.niif.hu]:57427,[10.148.0.29]:57427' (RSA) to the list of known hosts.

real	23m25.017s
user	185m42.968s
sys	0m1.480s
Warning: Permanently added '[r1i1n8.ice.debrecen.hpc.niif.hu]:32926,[10.148.0.26]:32926' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n9.ice.debrecen.hpc.niif.hu]:49862,[10.148.0.27]:49862' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n11.ice.debrecen.hpc.niif.hu]:50566,[10.148.0.29]:50566' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n10.ice.debrecen.hpc.niif.hu]:42882,[10.148.0.28]:42882' (RSA) to the list of known hosts.

real	20m45.714s
user	162m35.026s
sys	0m1.120s
[r1i1n7:07616] CANNOT CREATE FIFO /scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/0/debugger_attach_fifo: errno 2
Warning: Permanently added '[r1i1n9.ice.debrecen.hpc.niif.hu]:39650,[10.148.0.27]:39650' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n8.ice.debrecen.hpc.niif.hu]:41118,[10.148.0.26]:41118' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n10.ice.debrecen.hpc.niif.hu]:52936,[10.148.0.28]:52936' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n11.ice.debrecen.hpc.niif.hu]:44684,[10.148.0.29]:44684' (RSA) to the list of known hosts.
[r1i1n7:07628] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/3), mkdir failed [1]
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07625] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/0), mkdir failed [1]
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07629] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/4), mkdir failed [1]
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07630] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/5), mkdir failed [1]
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07632] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/7), mkdir failed [1]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07626] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/1), mkdir failed [1]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07628] [[42592,1],3] could not get route to [[INVALID],INVALID]
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07630] [[42592,1],5] could not get route to [[INVALID],INVALID]
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07625] [[42592,1],0] could not get route to [[INVALID],INVALID]
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07629] [[42592,1],4] could not get route to [[INVALID],INVALID]
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07632] [[42592,1],7] could not get route to [[INVALID],INVALID]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07626] [[42592,1],1] could not get route to [[INVALID],INVALID]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07628] [[42592,1],3] could not get route to [[INVALID],INVALID]
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07630] [[42592,1],5] could not get route to [[INVALID],INVALID]
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07625] [[42592,1],0] could not get route to [[INVALID],INVALID]
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07629] [[42592,1],4] could not get route to [[INVALID],INVALID]
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07632] [[42592,1],7] could not get route to [[INVALID],INVALID]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07626] [[42592,1],1] could not get route to [[INVALID],INVALID]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07631] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/6), mkdir failed [1]
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07628] [[42592,1],3] could not get route to [[INVALID],INVALID]
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07630] [[42592,1],5] could not get route to [[INVALID],INVALID]
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07629] [[42592,1],4] could not get route to [[INVALID],INVALID]
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07625] [[42592,1],0] could not get route to [[INVALID],INVALID]
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07632] [[42592,1],7] could not get route to [[INVALID],INVALID]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07626] [[42592,1],1] could not get route to [[INVALID],INVALID]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07631] [[42592,1],6] could not get route to [[INVALID],INVALID]
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07627] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/2), mkdir failed [1]
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07631] [[42592,1],6] could not get route to [[INVALID],INVALID]
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07627] [[42592,1],2] could not get route to [[INVALID],INVALID]
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07633] opal_os_dirpath_create: Error: Unable to create the sub-directory (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0) of (/scratch/tmp/425892.1.parallel.q/openmpi-sessions-emolnar at r1i1n7_0/42592/1/8), mkdir failed [1]
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 106
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: Error in file util/session_dir.c at line 399
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: Error in file base/ess_base_std_app.c at line 130
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07631] [[42592,1],6] could not get route to [[INVALID],INVALID]
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07627] [[42592,1],2] could not get route to [[INVALID],INVALID]
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07633] [[42592,1],8] could not get route to [[INVALID],INVALID]
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: Error in file ess_env_module.c at line 167
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07627] [[42592,1],2] could not get route to [[INVALID],INVALID]
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07633] [[42592,1],8] could not get route to [[INVALID],INVALID]
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 128
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07631] [[42592,1],6] could not get route to [[INVALID],INVALID]
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07632] [[42592,1],7] could not get route to [[INVALID],INVALID]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07625] [[42592,1],0] could not get route to [[INVALID],INVALID]
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07627] [[42592,1],2] could not get route to [[INVALID],INVALID]
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07626] [[42592,1],1] could not get route to [[INVALID],INVALID]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07628] [[42592,1],3] could not get route to [[INVALID],INVALID]
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07629] [[42592,1],4] could not get route to [[INVALID],INVALID]
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07633] [[42592,1],8] could not get route to [[INVALID],INVALID]
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07630] [[42592,1],5] could not get route to [[INVALID],INVALID]
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07633] [[42592,1],8] could not get route to [[INVALID],INVALID]
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07630] [[42592,1],5] could not get route to [[INVALID],INVALID]
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07627] [[42592,1],2] could not get route to [[INVALID],INVALID]
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07628] [[42592,1],3] could not get route to [[INVALID],INVALID]
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07633] [[42592,1],8] could not get route to [[INVALID],INVALID]
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07625] [[42592,1],0] could not get route to [[INVALID],INVALID]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07626] [[42592,1],1] could not get route to [[INVALID],INVALID]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07629] [[42592,1],4] could not get route to [[INVALID],INVALID]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07632] [[42592,1],7] could not get route to [[INVALID],INVALID]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07631] [[42592,1],6] could not get route to [[INVALID],INVALID]
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07628] [[42592,1],3] could not get route to [[INVALID],INVALID]
[r1i1n7:07628] [[42592,1],3] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07631] [[42592,1],6] could not get route to [[INVALID],INVALID]
[r1i1n7:07631] [[42592,1],6] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07632] [[42592,1],7] could not get route to [[INVALID],INVALID]
[r1i1n7:07632] [[42592,1],7] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07633] [[42592,1],8] could not get route to [[INVALID],INVALID]
[r1i1n7:07633] [[42592,1],8] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07630] [[42592,1],5] could not get route to [[INVALID],INVALID]
[r1i1n7:07630] [[42592,1],5] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07626] [[42592,1],1] could not get route to [[INVALID],INVALID]
[r1i1n7:07626] [[42592,1],1] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07629] [[42592,1],4] could not get route to [[INVALID],INVALID]
[r1i1n7:07629] [[42592,1],4] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07627] [[42592,1],2] could not get route to [[INVALID],INVALID]
[r1i1n7:07627] [[42592,1],2] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file rml_oob_send.c at line 104
[r1i1n7:07625] [[42592,1],0] could not get route to [[INVALID],INVALID]
[r1i1n7:07625] [[42592,1],0] ORTE_ERROR_LOG: A message is attempting to be sent to a process whose contact information is unknown in file util/show_help.c at line 627
--------- next part ---------
Warning: Permanently added '[r1i1n12.ice.debrecen.hpc.niif.hu]:60018,[10.148.0.30]:60018' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i2n0.ice.debrecen.hpc.niif.hu]:59770,[10.148.0.34]:59770' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n13.ice.debrecen.hpc.niif.hu]:55590,[10.148.0.31]:55590' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i2n1.ice.debrecen.hpc.niif.hu]:49362,[10.148.0.35]:49362' (RSA) to the list of known hosts.

real	44m54.191s
user	174m35.119s
sys	0m3.536s
[r1i1n11:08014] CANNOT CREATE FIFO /scratch/tmp/425893.1.parallel.q/openmpi-sessions-emolnar@r1i1n11_0/45409/0/debugger_attach_fifo: errno 2
Warning: Permanently added '[r1i1n12.ice.debrecen.hpc.niif.hu]:42282,[10.148.0.30]:42282' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n13.ice.debrecen.hpc.niif.hu]:49037,[10.148.0.31]:49037' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i2n0.ice.debrecen.hpc.niif.hu]:36956,[10.148.0.34]:36956' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i2n1.ice.debrecen.hpc.niif.hu]:44471,[10.148.0.35]:44471' (RSA) to the list of known hosts.
qrsh_starter: cannot write pid file /scratch/tmp/425893.1.parallel.q/pid.2.r1i2n0: No such file or directory
qrsh_starter: cannot open file /scratch/tmp/425893.1.parallel.q/qrsh_error: No such file or directory
qrsh_starter: cannot open file /scratch/tmp/425893.1.parallel.q/qrsh_exit_code.2.r1i2n0: No such file or directory
qrsh_starter: cannot open file /scratch/tmp/425893.1.parallel.q/qrsh_error: No such file or directory
--------------------------------------------------------------------------
A daemon (pid 8018) died unexpectedly with status 129 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

real	0m6.170s
user	0m0.012s
sys	0m0.044s
Warning: Permanently added '[r1i1n12.ice.debrecen.hpc.niif.hu]:60572,[10.148.0.30]:60572' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i2n1.ice.debrecen.hpc.niif.hu]:54876,[10.148.0.35]:54876' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i2n0.ice.debrecen.hpc.niif.hu]:41830,[10.148.0.34]:41830' (RSA) to the list of known hosts.
Warning: Permanently added '[r1i1n13.ice.debrecen.hpc.niif.hu]:47348,[10.148.0.31]:47348' (RSA) to the list of known hosts.
Connection to r1i2n0.ice.debrecen.hpc.niif.hu closed by remote host.
Connection to r1i2n1.ice.debrecen.hpc.niif.hu closed by remote host.
Connection to r1i2n1.ice.debrecen.hpc.niif.hu closed by remote host.
Connection to r1i1n12.ice.debrecen.hpc.niif.hu closed by remote host.
--------------------------------------------------------------------------
A daemon (pid 8026) died unexpectedly with status 129 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

real	0m2.199s
user	0m0.024s
sys	0m0.020s
error: executing task of job 425893 failed: execution daemon on host "r1i1n12.ice.debrecen.hpc.niif.hu" didn't accept task
error: executing task of job 425893 failed: execution daemon on host "r1i2n0.ice.debrecen.hpc.niif.hu" didn't accept task
error: executing task of job 425893 failed: execution daemon on host "r1i2n1.ice.debrecen.hpc.niif.hu" didn't accept task
--------------------------------------------------------------------------
A daemon (pid 8036) died unexpectedly with status 1 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

real	0m0.623s
user	0m0.016s
sys	0m0.024s
Warning: Permanently added '[r1i1n13.ice.debrecen.hpc.niif.hu]:37161,[10.148.0.31]:37161' (RSA) to the list of known hosts.
error: executing task of job 425893 failed: execution daemon on host "r1i1n12.ice.debrecen.hpc.niif.hu" didn't accept task
--------------------------------------------------------------------------
A daemon (pid 8050) died unexpectedly with status 1 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
error: executing task of job 425893 failed: execution daemon on host "r1i2n1.ice.debrecen.hpc.niif.hu" didn't accept task
error: executing task of job 425893 failed: execution daemon on host "r1i2n0.ice.debrecen.hpc.niif.hu" didn't accept task

real	0m0.453s
user	0m0.028s
sys	0m0.016s
Warning: Permanently added '[r1i1n13.ice.debrecen.hpc.niif.hu]:44221,[10.148.0.31]:44221' (RSA) to the list of known hosts.
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory
rm: cannot remove `elements.dat': No such file or directory


More information about the Hpc-forum mailing list