Benchmarking AWS

  • Benchmarking AWS

    I've been running the benchmarks on AWS instances and discovered a big problem: vraybenchmark on Linux can't see the NVIDIA GPUs on the g- and p-series machines. Does anyone have any ideas?

    https://aws.amazon.com/ec2/instance-types/ (click Accelerated Computing)

    Thanks!
    Den


  • #2
    OK, figured out the GPU problem - the instance I used didn't have CUDA on it (duh). However, I am still having problems with the GPUs.

    On the p3.8xlarge and p3.16xlarge, I am getting the following error:

    ./src/ocl_tracedevice.cpp(1907) : CUDA error 700
    CallStack (from HasCallStack):
    error, called at src/lib/VRay.hs:155:91 in vraybench-1.0.8-IH0QNi1KrDm4hWP4n3RvTh:VRay

    The p2.8xlarge and p2.16xlarge do render, but something is capping the speedup (maybe a limit on the number of GPUs?):
    Tesla K80 11439MB 3:00
    Tesla K80 11439MB x8 0:30
    Tesla K80 11439MB x16 0:30
    I'll post the full benchmarks for all the useful AWS instances in a separate post once we have these working.
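To put a number on the suspected limit, here is a minimal sketch that turns the three K80 times above into speedup and per-GPU efficiency. The helper names (`to_seconds`, `scaling`) are made up for illustration, and the single-GPU time is used as the baseline.

```python
def to_seconds(mmss: str) -> int:
    """Convert an "m:ss" string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# GPU render times reported above, keyed by the number of Tesla K80 devices.
k80_times = {1: "3:00", 8: "0:30", 16: "0:30"}

def scaling(times):
    """Return (gpu_count, speedup, efficiency) relative to the 1-GPU run."""
    base = to_seconds(times[1])
    rows = []
    for gpus, t in sorted(times.items()):
        speedup = base / to_seconds(t)
        rows.append((gpus, speedup, speedup / gpus))
    return rows

for gpus, speedup, efficiency in scaling(k80_times):
    print(f"{gpus:>2} GPU(s): speedup {speedup:.1f}x, efficiency {efficiency:.1%}")
```

Going from 8 to 16 devices gives the same 6x speedup, so per-GPU efficiency halves. That is consistent with a cap somewhere, but these numbers alone don't say where.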



    • #3
      Do you know what driver version these GPUs are using?

      Best,
      Blago.
      V-Ray fan.
      Looking busy around GPUs ...
      RTX ON



      • #4
        Here's a dump of my Google Sheet GPU section. Probably won't format nicely as I am on mobile.

        All rows: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, NVIDIA driver version: 384.111, Linux 4.9.76-38.79.amzn2.x86_64, V-Ray 3.57.01. Times are m:ss.

        Instance     Logical cores  GPU                              CPU time  GPU time
        g3.4xlarge   16             Tesla M60 7613MB                 2:02      2:42
        g3.8xlarge   32             Tesla M60 7613MB x2              1:04      1:24
        g3.16xlarge  64             Tesla M60 7613MB x4              0:35      0:45
        p2.xlarge    4              Tesla K80 11439MB                8:04      3:00
        p2.8xlarge   32             Tesla K80 11439MB x8             1:04      0:30
        p2.16xlarge  64             Tesla K80 11439MB x16            0:34      0:30
        p3.2xlarge   8              Tesla V100-SXM2-16GB 16152MB     4:06      0:28
        p3.8xlarge   32             Tesla V100-SXM2-16GB 16152MB x4  1:06      error
        p3.16xlarge  64             Tesla V100-SXM2-16GB 16152MB x8  0:34      error
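For a quick read of the dump above, the sketch below compares the two times recorded for each instance. It assumes the first time is the CPU render and the second is the GPU render, which matches the K80 GPU times quoted earlier in the thread but is not stated in the dump itself. The function names are illustrative only.

```python
from typing import Optional

# (instance, first time, second time) copied from the rows above.
# Assumption: the first time is the CPU render, the second the GPU render.
RESULTS = [
    ("g3.4xlarge", "2:02", "2:42"),
    ("g3.8xlarge", "1:04", "1:24"),
    ("g3.16xlarge", "0:35", "0:45"),
    ("p2.xlarge", "8:04", "3:00"),
    ("p2.8xlarge", "1:04", "0:30"),
    ("p2.16xlarge", "0:34", "0:30"),
    ("p3.2xlarge", "4:06", "0:28"),
    ("p3.8xlarge", "1:06", "error"),
    ("p3.16xlarge", "0:34", "error"),
]

def to_seconds(value: str) -> Optional[int]:
    """Convert "m:ss" to seconds; return None for failed runs such as "error"."""
    if ":" not in value:
        return None
    minutes, seconds = value.split(":")
    return int(minutes) * 60 + int(seconds)

def gpu_advantage(results):
    """Map instance name to CPU time / GPU time (None if either run failed)."""
    ratios = {}
    for instance, cpu, gpu in results:
        cpu_s, gpu_s = to_seconds(cpu), to_seconds(gpu)
        ratios[instance] = cpu_s / gpu_s if cpu_s and gpu_s else None
    return ratios

for instance, ratio in gpu_advantage(RESULTS).items():
    label = "n/a (GPU run failed)" if ratio is None else f"{ratio:.2f}x"
    print(f"{instance:<12} CPU/GPU time ratio: {label}")
```

A ratio above 1 means the GPU run finished faster than the CPU run on that instance; a ratio below 1 means the CPU was faster.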



        • #5
          Alright. The driver seems reasonably new. This error comes from the CUDA API and means "something went wrong". I would guess that the cause might not be V-Ray GPU itself, since the benchmark runs on many other machines without such problems.
          Is updating/changing the drivers an option?

          Best,
          Blago.



          • #6
            Here's the result of updating the driver:
            [ec2-user@ip-172-31-2-45 tmp]$ ./vraybench_1.0.8_lin_x64 -q

            Starting V-Ray Benchmark...


            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, # of logical cores: 32

            Tesla V100-SXM2-16GB 16160MB x4

            NVIDIA driver version: 390.30

            Linux 4.9.85-38.58.amzn1.x86_64

            V-Ray 3.57.01


            Rendering took 01:03 minutes.


            ./src/ocl_tracedevice.cpp(1907) : CUDA error 700

            CallStack (from HasCallStack):

            error, called at src/lib/VRay.hs:155:91 in vraybench-1.0.8-IH0QNi1KrDm4hWP4n3RvTh:VRay

            ./src/ocl_tracedevice.cpp(1907) : CUDA error 700

            CallStack (from HasCallStack):

            error, called at src/lib/VRay.hs:155:91 in vraybench-1.0.8-IH0QNi1KrDm4hWP4n3RvTh:VRay./src/ocl_tracedevice.cpp(1907) : CUDA error 700


            CallStack (from HasCallStack):

            Scene integrity error. error, called at src/lib/VRay.hs:155:91 in vraybench-1.0.8-IH0QNi1KrDm4hWP4n3RvTh:VRay



            Closing V-Ray Benchmark...

            Scene integrity error.

            ./src/ocl_tracedevice.cpp(1907) : CUDA error 700

            CallStack (from HasCallStack):

            error, called at src/lib/VRay.hs:155:91 in vraybench-1.0.8-IH0QNi1KrDm4hWP4n3RvTh:VRay
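The repeated `CUDA error 700` lines in the log above can be pulled out and given a symbolic name with a short script. This is a sketch under one assumption: that the number is a CUDA driver API `CUresult` value, where 700 is `CUDA_ERROR_ILLEGAL_ADDRESS`. The log does not say which API reported it, and the table below is only a small subset of the codes in `cuda.h`.

```python
import re

# Small subset of CUresult values from the CUDA driver API header (cuda.h).
# Assumption: the benchmark prints driver API codes; the log does not confirm this.
CUDA_DRIVER_ERRORS = {
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    100: "CUDA_ERROR_NO_DEVICE",
    700: "CUDA_ERROR_ILLEGAL_ADDRESS",
    719: "CUDA_ERROR_LAUNCH_FAILED",
}

# Matches lines shaped like "./src/ocl_tracedevice.cpp(1907) : CUDA error 700".
ERROR_LINE = re.compile(r"(?P<file>\S+)\((?P<line>\d+)\) : CUDA error (?P<code>\d+)")

def decode_cuda_errors(log: str):
    """Return (source file, source line, code, symbolic name) for each error in the log."""
    found = []
    for match in ERROR_LINE.finditer(log):
        code = int(match.group("code"))
        name = CUDA_DRIVER_ERRORS.get(code, "unknown")
        found.append((match.group("file"), int(match.group("line")), code, name))
    return found

sample = "./src/ocl_tracedevice.cpp(1907) : CUDA error 700"
for entry in decode_cuda_errors(sample):
    print(entry)
```

If the assumption holds, the error points to a kernel touching an invalid memory address. That narrows the failure type, but not whether the fault lies in the application, the driver, or the hardware setup.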



            • #7
              That's really strange. Do you know, by any chance, whether other CUDA apps run fine there?

              Best,
              Blago.



              • #8
                Can you suggest something I can run through command line which would be a good test? I have been running Redshift but that is on a slightly different configuration (same base image, however).
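One possible command-line check is a small Python script that loads the NVIDIA driver library through `ctypes` and lists the devices via the driver API calls `cuInit`, `cuDeviceGetCount`, `cuDeviceGet` and `cuDeviceGetName`. This is only a smoke test: it assumes the library is installed as `libcuda.so.1`, and it does not launch any kernels, so it would not reproduce a failure that occurs during rendering. The function name is made up for illustration.

```python
import ctypes

def cuda_device_names(lib_name: str = "libcuda.so.1"):
    """Return a list of GPU names, or None if the driver cannot be loaded or initialized."""
    try:
        cuda = ctypes.CDLL(lib_name)  # assumed library name on Linux
    except OSError:
        return None
    if cuda.cuInit(0) != 0:  # 0 is CUDA_SUCCESS
        return None
    count = ctypes.c_int(0)
    if cuda.cuDeviceGetCount(ctypes.byref(count)) != 0:
        return None
    names = []
    for ordinal in range(count.value):
        device = ctypes.c_int(0)  # CUdevice is a plain int handle
        if cuda.cuDeviceGet(ctypes.byref(device), ordinal) != 0:
            continue
        buffer = ctypes.create_string_buffer(256)
        if cuda.cuDeviceGetName(buffer, len(buffer), device) == 0:
            names.append(buffer.value.decode())
    return names

# Prints None when the driver is unavailable, otherwise the list of device names.
print(cuda_device_names())
```

A result of `None` or an empty list points at a driver or installation problem, while a full list of devices only shows that the driver initializes and enumerates the GPUs.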



                • #9
                  Originally posted by Eightvfx
                  Can you suggest something I can run through command line which would be a good test? I have been running Redshift but that is on a slightly different configuration (same base image, however).
                  If it runs fine, can you try deleting everything in %temp% and in %appdata%/NVIDIA*? If this doesn't work, we might have to arrange a remote session to see if we can help with anything. Is that an option?

                  Unfortunately, driver bugs can sometimes break only specific workflows, so I can't promise anything yet.

                  Best,
                  Blago.



                  • #10
                    Eightvfx, you can try running V-Ray Benchmark Next, which was released last month.
