Thread: GPU failure during rendering, any way to set up notifications or other solutions?

  1. #1
    Join Date
    Sep 2017
    Posts
    13

    I've been doing sustained rendering with Iray Server for about a month now and keep running into a semi-regular issue: the GPU fails with memory errors and rendering drops to CPU only. Restarting Iray Server and starting the render again works fine, although I have to start the job over from scratch. Since I usually have half a dozen jobs queued up and each takes 3-6 hours to complete, I've taken to repeatedly checking GPU utilization with GPU-Z over TeamViewer on my phone to see whether the GPU is still working, so I know when to restart the server remotely. This obviously isn't an ideal process.

    So, a couple of questions:

    1. Can anyone provide any clarity on the GPU error posted below?
    2. Is there any way to set up email notifications so that I don't have to constantly monitor via TeamViewer to see if there has been a GPU failure?
    3. I noticed talk about potentially being able to pause and resume downloads in a future release (which would substantially reduce the pain of a failure). Can you share any news about this potential feature?
    Last edited by predatorgsr2; September 6th, 2017 at 16:35.
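
On question 2, one possible workaround is a small watcher that polls the Iray Server log and sends an email when a failure line appears. This is only a sketch under assumptions: the three failure phrases are copied from the log excerpts in this thread, while the log path, the SMTP host, the addresses, and the helper names (`find_failures`, `read_new_lines`, `build_alert`, `send_alert`, `watch`) are hypothetical and not part of Iray Server.

```python
import re
import smtplib
import time
from email.message import EmailMessage

# Failure phrases copied from the log excerpts in this thread.
FAILURE_PATTERNS = [re.compile(p) for p in (
    r"Device failed while rendering",
    r"All available GPUs failed",
    r"Falling back to CPU rendering",
)]

def find_failures(lines):
    """Return the log lines that contain any known failure phrase."""
    return [ln.strip() for ln in lines
            if any(p.search(ln) for p in FAILURE_PATTERNS)]

def read_new_lines(path, offset):
    """Read complete lines appended after byte `offset`; return (lines, new_offset)."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        data = fh.read()
    end = data.rfind(b"\n") + 1  # leave a partially written last line for the next poll
    text = data[:end].decode("utf-8", errors="replace")
    return text.splitlines(), offset + end

def build_alert(failures, sender, recipient):
    """Compose the notification email (addresses are placeholders)."""
    msg = EmailMessage()
    msg["Subject"] = f"Iray Server: {len(failures)} GPU failure line(s) detected"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content("\n".join(failures))
    return msg

def send_alert(msg, host, port=587, user=None, password=None):
    """Send over SMTP with STARTTLS; host and credentials are assumptions."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()
        if user:
            smtp.login(user, password)
        smtp.send_message(msg)

def watch(log_path, sender, recipient, smtp_host, interval_s=60):
    """Poll the log forever and email whenever new failure lines show up."""
    offset = 0
    while True:
        lines, offset = read_new_lines(log_path, offset)
        failures = find_failures(lines)
        if failures:
            send_alert(build_alert(failures, sender, recipient), smtp_host)
        time.sleep(interval_s)

# Demo on lines taken from this thread (no email is sent here).
sample = [
    "[Wed, 27 Sep 2017 00:02:54] 3 278300.4305 | 1.31 IRAY rend info : CUDA device 0 (GeForce GTX 1080): Scene processed in 42.535s",
    "[Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): Device failed while rendering",
    "[Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend warn : All available GPUs failed.",
    "[Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend info : Falling back to CPU rendering.",
]
alert = build_alert(find_failures(sample), "render-box@example.com", "me@example.com")
print(alert["Subject"])
```

To use it, `watch()` would be called with the real log location and mail settings. The first poll also reports failure lines that are already in the file, so starting from the current end of the log may be preferable.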

  2. #2
    Join Date
    Sep 2017
    Posts
    13

    These are the errors:

    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.58 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while launching CUDA renderer in core_renderer_wf.cpp:632)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.58 IRAY rend error: CUDA device 0 (GeForce GTX 970): Failed to launch renderer
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): Device failed while rendering
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while initializing memory buffer)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
    [Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)
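
A wall of repeated lines like this is easier to read once it is collapsed into counts per device and message. The sketch below does that with a regular expression written against the line format seen in this excerpt; the pattern and the function name `summarize_errors` are assumptions, not anything Iray provides. For context, "the launch timed out and was terminated" is the text CUDA reports for its launch timeout error (`cudaErrorLaunchTimeout`), which, as far as I know, is raised when a kernel runs longer than a driver or OS watchdog allows, typically on a GPU that also drives a display. Whether that is the root cause here is not confirmed anywhere in this thread.

```python
import re
from collections import Counter

# Matches "IRAY rend error: CUDA device N (name): message (optional context)".
ERROR_RE = re.compile(
    r"IRAY\s+rend\s+error:\s+CUDA device (?P<dev>\d+) \((?P<name>[^)]+)\): "
    r"(?P<msg>.+?)(?: \((?P<ctx>[^)]*)\))?$"
)

def summarize_errors(lines):
    """Count error lines per (device index, device name, message), ignoring the context."""
    counts = Counter()
    for line in lines:
        m = ERROR_RE.search(line.strip())
        if m:
            counts[(int(m["dev"]), m["name"], m["msg"])] += 1
    return counts

# Five of the lines posted above.
sample = [
    "[Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.58 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while launching CUDA renderer in core_renderer_wf.cpp:632)",
    "[Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.58 IRAY rend error: CUDA device 0 (GeForce GTX 970): Failed to launch renderer",
    "[Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while de-allocating memory)",
    "[Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): Device failed while rendering",
    "[Wed, 06 Sep 2017 05:46:59] 3 278300.4305 | 1.6 IRAY rend error: CUDA device 0 (GeForce GTX 970): the launch timed out and was terminated (while initializing memory buffer)",
]
summary = summarize_errors(sample)
for (dev, name, msg), n in summary.most_common():
    print(f"device {dev} ({name}): {n} x {msg}")
```

Grouping by message rather than by full line keeps the first real error (the failed launch) visible instead of burying it under the follow-up de-allocation failures.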

  3. #3
    Join Date
    Jul 2010
    Posts
    343

    Could you post the full log file, please?

  4. #4
    Join Date
    Sep 2017
    Posts
    13

    Log file added.
    Last edited by predatorgsr2; September 10th, 2017 at 03:12.

  5. #5
    Join Date
    Sep 2017
    Posts
    13

    Anyone know what this means?

  6. #6
    Join Date
    Jul 2010
    Posts
    343

    Thanks, we will have a look.

  7. #7
    Join Date
    Sep 2017
    Posts
    13

    Some more information: I just added a brand new GTX 1080 to my system alongside the GTX 970, and today both graphics cards failed at the same time with that same message. I initially thought it might be a hardware failure, but since both cards failed, that points to something else.

    [Wed, 27 Sep 2017 00:02:54] 3 278300.4305 | 1.31 IRAY rend info : CUDA device 0 (GeForce GTX 1080): Scene processed in 42.535s
    [Wed, 27 Sep 2017 00:02:54] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Scene processed in 42.541s
    [Wed, 27 Sep 2017 00:02:54] 3 278300.4305 | 1.31 IRAY rend info : CUDA device 0 (GeForce GTX 1080): Allocated 143.052 MiB for frame buffer
    [Wed, 27 Sep 2017 00:02:54] 3 278300.4305 | 1.31 IRAY rend info : CUDA device 0 (GeForce GTX 1080): Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): out of memory (while allocating memory)
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): Failed to allocate 23.8419 MiB
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend warn : CUDA device 1 (GeForce GTX 970): Failed to allocate 143.052 MiB for (device) frame buffer, will try allocating smaller (partial) frame buffer
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Allocated 71.5262 MiB for device frame buffer
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Allocated 143.052 MiB for host-side frame buffer
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend warn : CUDA device 1 (GeForce GTX 970): Succeeded in allocating partial device frame buffer. Device efficiency will be affected.
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Allocated 848 MiB of work space (1024k active samples in 0.145s)
    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Used for display, optimizing for interactive usage (performance could be sacrificed)
    [Wed, 27 Sep 2017 00:02:56] 3 278300.4305 | 1.31 IRAY rend info : Allocating 1 layer frame buffer
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.36 IRAY rend error: CUDA device 0 (GeForce GTX 1080): Kernel [18] failed after 0.038s
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.36 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while launching CUDA renderer in core_renderer_wf.cpp:791)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.36 IRAY rend error: CUDA device 0 (GeForce GTX 1080): Failed to launch renderer
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): Device failed while rendering
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.21 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while launching CUDA renderer in core_renderer_wf.cpp:506)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.21 IRAY rend error: CUDA device 1 (GeForce GTX 970): Failed to launch renderer
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while initializing memory buffer)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.31 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): Device failed while rendering
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend warn : All available GPUs failed.
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend info : Falling back to CPU rendering.
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: All workers failed: aborting render
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend info : Received update to 00001 iterations after 46.187s.
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | [Renderer] Updating progress image after 1 iterations...
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend error: CUDA device 1 (GeForce GTX 970): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend error: CUDA device 0 (GeForce GTX 1080): unspecified launch failure (while de-allocating memory)
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend error: Scheduler was aborted for restart and needs to be restarted
    [Wed, 27 Sep 2017 00:02:58] 3 278300.4305 | 1.27 IRAY rend info : CPU: using 4 cores for rendering
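
The "Allocated ..." lines in this excerpt can be added up per card to see how differently the two devices were set up. The sketch below is a rough tally under assumptions: the regular expression is written against the lines shown here, the function name `device_allocations` is made up, and host-side buffers are skipped because they do not live in GPU memory. It only counts the allocations that are logged in this excerpt, so it says nothing about scene data or total memory use.

```python
import re
from collections import defaultdict

# Matches "CUDA device N (name): Allocated <size> <unit> for|of <purpose>".
ALLOC_RE = re.compile(
    r"CUDA device (?P<dev>\d+) \((?P<name>[^)]+)\): "
    r"Allocated (?P<size>[\d.]+) (?P<unit>MiB|GiB) (?:for|of) (?P<what>.+?)(?: \(|$)"
)
TO_MIB = {"MiB": 1.0, "GiB": 1024.0}

def device_allocations(lines):
    """Sum the logged device-side allocations per card, in MiB."""
    totals = defaultdict(float)
    for line in lines:
        m = ALLOC_RE.search(line.strip())
        if not m or m["what"].startswith("host-side"):
            continue  # not an allocation line, or memory that sits on the host
        totals[m["name"]] += float(m["size"]) * TO_MIB[m["unit"]]
    return dict(totals)

# Allocation lines copied from the log above.
sample = [
    "[Wed, 27 Sep 2017 00:02:54] 3 278300.4305 | 1.31 IRAY rend info : CUDA device 0 (GeForce GTX 1080): Allocated 143.052 MiB for frame buffer",
    "[Wed, 27 Sep 2017 00:02:54] 3 278300.4305 | 1.31 IRAY rend info : CUDA device 0 (GeForce GTX 1080): Allocated 1.65625 GiB of work space (2048k active samples in 0.000s)",
    "[Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Allocated 71.5262 MiB for device frame buffer",
    "[Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Allocated 143.052 MiB for host-side frame buffer",
    "[Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend info : CUDA device 1 (GeForce GTX 970): Allocated 848 MiB of work space (1024k active samples in 0.145s)",
]
totals = device_allocations(sample)
for name, mib in totals.items():
    print(f"{name}: {mib:.3f} MiB logged")
```

Since 1.65625 GiB is 1696 MiB, the GTX 970 ended up with exactly half the work space of the GTX 1080 (848 MiB) and about half the device frame buffer, which matches the "partial frame buffer" warning in the log.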

  8. #8
    Join Date
    Sep 2017
    Posts
    13

    I just built a new system with a new CPU, motherboard, memory, and PSU, using the same GTX 1080 and GTX 970, and I'm still having the same issue with both cards failing at the same time. My trial expires in a few weeks and I'm planning to buy the full license. Is there any additional support that comes with purchasing that can help fix this issue?

  9. #9
    Join Date
    Jul 2010
    Posts
    343

    Are you still running on the trial?
    Your new system is as well, right?
    Please try another, smaller scene and watch what happens.
    It seems your GTX 970 runs out of memory.
    There is nothing we can do here; with 4 GB, you hit the limit for that card:

    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): out of memory (while allocating memory)
    Last edited by fuchs; October 5th, 2017 at 13:54.

  10. #10
    Join Date
    Sep 2017
    Posts
    13

    Originally posted by fuchs:
    Are you still running on the trial?
    Your new system is as well, right?
    Please try another, smaller scene and watch what happens.
    It seems your GTX 970 runs out of memory.
    There is nothing we can do here; with 4 GB, you hit the limit for that card:

    [Wed, 27 Sep 2017 00:02:55] 3 278300.4305 | 1.18 IRAY rend error: CUDA device 1 (GeForce GTX 970): out of memory (while allocating memory)

    Is the GTX 970 running out of memory causing the GTX 1080 to fail too? Should I disable the GTX 970 to let the GTX 1080 run solo?
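
Before deciding which card to leave enabled, it may help to check how much memory each GPU has free just before a job starts. The sketch below parses the CSV output of `nvidia-smi --query-gpu=index,name,memory.used,memory.total --format=csv,noheader,nounits`. The helper names, the headroom threshold, and the numbers in the sample text are illustrative assumptions and are not measurements from this system.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

def parse_gpu_memory(csv_text):
    """Parse 'index, name, used, total' rows (values in MiB) into dicts."""
    gpus = []
    for row in csv_text.strip().splitlines():
        index, rest = row.split(",", 1)
        name, used, total = rest.rsplit(",", 2)
        gpus.append({
            "index": int(index),
            "name": name.strip(),
            "used_mib": int(used),
            "total_mib": int(total),
        })
    return gpus

def query_gpu_memory():
    """Run nvidia-smi and return the parsed rows (requires the NVIDIA driver tools)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_memory(out)

def devices_with_headroom(gpus, needed_mib):
    """Return the indices of GPUs whose free memory is at least `needed_mib`."""
    return [g["index"] for g in gpus
            if g["total_mib"] - g["used_mib"] >= needed_mib]

# Made-up sample output, for illustration only.
sample_output = "0, GeForce GTX 1080, 500, 8192\n1, GeForce GTX 970, 3900, 4096\n"
gpus = parse_gpu_memory(sample_output)
print(devices_with_headroom(gpus, needed_mib=1840))
```

This only answers whether a card has room at that moment. It does not show whether one card's out-of-memory error is what brings the other one down.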
