jittor.test.test_array causes a blue screen (detected memory is only 15 B)

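Hardware information:

  • NVIDIA GeForce MX230 (2 GB VRAM)

Software information:

  • Windows 10
  • Graphics driver version 512.77
  • CUDA driver version 11.6.134 (the same problem also occurs after switching to the older graphics driver 441 with CUDA 10.2)
  • Python 3.10.4
  • Jittor 1.3.4.15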

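Steps to reproduce

  1. Create an empty conda environment and run pip install jittor to install the latest version of Jittor.

  2. Run python -m jittor.test.test_array for the first time. The log is as follows: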

    [i 0624 21:46:28.068000 40 compiler.py:953] Jittor(1.3.4.15) src: c:\programdata\miniconda3\envs\jittor\lib\site-packages\jittor
    [i 0624 21:46:28.095000 40 compiler.py:954] cl at C:\Users\wawcac\.cache\jittor\msvc\VC\_\_\_\_\_\bin\cl.exe(19.29.30133)
    [i 0624 21:46:28.097000 40 compiler.py:955] cache_path: C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default
    [i 0624 21:46:28.101000 40 install_cuda.py:53] cuda_driver_version: [11, 6, 0]
    [i 0624 21:46:28.126000 40 __init__.py:411] Found C:\Users\wawcac\.cache\jittor\jtcuda\cuda11.2_cudnn8_win\bin\nvcc.exe(11.2.67) at C:\Users\wawcac\.cache\jittor\jtcuda\cuda11.2_cudnn8_win\bin\nvcc.exe.
    [i 0624 21:46:28.282000 40 __init__.py:411] Found gdb(10.2) at C:\Program Files\TDM-GCC-64\bin\gdb.EXE.
    [i 0624 21:46:28.328000 40 __init__.py:411] Found addr2line(2.36.1) at C:\Program Files\TDM-GCC-64\bin\addr2line.EXE.
    [i 0624 21:46:28.367000 40 compiler.py:1008] cuda key:cu11.2.67
    [i 0624 21:46:28.369000 40 __init__.py:227] Total mem: 7.82GB, using 2 procs for compiling.
    [i 0624 21:46:29.859000 40 jit_compiler.cc:28] Load cc_path: C:\Users\wawcac\.cache\jittor\msvc\VC\_\_\_\_\_\bin\cl.exe
    [i 0624 21:46:29.860000 40 init.cc:62] Found cuda archs: [61,]
    [i 0624 21:46:30.064000 40 compile_extern.py:516] mpicc not found, distribution disabled.
    ...[i 0624 21:46:31.270000 40 cuda_flags.cc:32] CUDA enabled.
    .[i 0624 21:46:31.278000 40 cuda_flags.cc:32] CUDA enabled.
    ..[i 0624 21:46:31.289000 40 cuda_flags.cc:32] CUDA enabled.
    [w 0624 21:46:33.938000 40 cuda_device_allocator.cc:29] Unable to alloc cuda device memory, use unify memory instead. This may cause low performance.
    [i 0624 21:46:33.939000 40 cuda_device_allocator.cc:31]
    === display_memory_info ===
     total_cpu_ram:    15 B total_device_ram:     2GB
     hold_vars: 184 lived_vars: 733 lived_ops: 1318
     name: sfrl is_device: 1 used: 159.7MB(18.1%) unused: 722.3MB(81.9%) total:   882MB
     name: sfrl is_device: 1 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:   161KB(15.7%) unused:   863KB(84.3%) total:     1MB
     name: temp is_device: 0 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
     name: temp is_device: 1 used:     0 B(0%) unused: 442.8MB(100%) total: 442.8MB
     cpu&gpu:  1.41GB gpu: 1.351GB cpu:    60MB
     free: cpu(    0 B) gpu(68.87MB)
    ===========================
    
    [w 0624 21:46:34.296000 40 cuda_device_allocator.cc:29] Unable to alloc cuda device memory, use unify memory instead. This may cause low performance.
    [i 0624 21:46:34.296000 40 cuda_device_allocator.cc:31]
    === display_memory_info ===
     total_cpu_ram:    15 B total_device_ram:     2GB
     hold_vars: 184 lived_vars: 665 lived_ops: 692
     name: sfrl is_device: 1 used: 159.7MB(18.1%) unused: 722.3MB(81.9%) total:   882MB
     name: sfrl is_device: 1 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:  98.5KB(9.62%) unused: 925.5KB(90.4%) total:     1MB
     name: temp is_device: 0 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
     name: temp is_device: 1 used:     0 B(0%) unused: 442.8MB(100%) total: 442.8MB
     cpu&gpu:  1.41GB gpu: 1.351GB cpu:    60MB
     free: cpu(    0 B) gpu(68.87MB)
    ===========================
    
    E....
    Compiling Operators(2/2) used: 8.76s eta:    0s
    
    Compiling Operators(1/1) used: 4.37s eta:    0s
    [i 0624 21:46:47.603000 40 profiler.cc:468]
    Profile result, sorted by TotalTime
    ('it/s' represent number of iterations per sec)
          Name  FileName     Count TotalTime    %,cum%   AvgTime   MinTime   MaxTime     Input    Output     InOut   Compute
    Total time:    88.2us
    Total Memory Access:      80 B
    ?opkey0:array?T:int32?o:1?opkey1:broadcast_to?Tx:int32?DIM=1?BCAST=1?opkey2:binary?Tx:float32?Ty:int32?Tz:float32?OP:subtract?JIT:1?JIT_cuda:1?graph:000010,052020,010021,?var_info::11110121?index_t:int32
              C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_array__T_int32__o_1__opkey1_broadcast_to__Tx_int32__DIM_1__BCAST_1__opkey2_binary___hash_42bbdae5e63a5b90_op.cc
                                 1    88.2us(100%,100%)    88.2us    88.2us    88.2us   443KB/s   443KB/s   886KB/s  113Kit/s
    
    
    [i 0624 21:46:47.603000 40 profiler.cc:549]
    Memory profile result, sorted by CheckTimes
               Name       FileName     CheckTimes    TLBMissRate
    
    
    .[i 0624 21:46:47.603000 40 cuda_flags.cc:32] CUDA enabled.
    [i 0624 21:46:47.604000 40 profiler.cc:468]
    Profile result, sorted by TotalTime
    ('it/s' represent number of iterations per sec)
          Name  FileName     Count TotalTime    %,cum%   AvgTime   MinTime   MaxTime     Input    Output     InOut   Compute
    Total time:    78.9us
    Total Memory Access:      80 B
    ?opkey0:array?T:int32?o:1?opkey1:broadcast_to?Tx:int32?DIM=1?BCAST=1?opkey2:binary?Tx:float32?Ty:int32?Tz:float32?OP:subtract?JIT:1?JIT_cuda:1?graph:000010,052020,010021,?var_info::11110121?index_t:int32
              C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_array__T_int32__o_1__opkey1_broadcast_to__Tx_int32__DIM_1__BCAST_1__opkey2_binary___hash_42bbdae5e63a5b90_op.cc
                                 1    78.9us(100%,100%)    78.9us    78.9us    78.9us   495KB/s   495KB/s   990KB/s  127Kit/s
    
    
    [i 0624 21:46:47.604000 40 profiler.cc:549]
    Memory profile result, sorted by CheckTimes
               Name       FileName     CheckTimes    TLBMissRate
    
    
    [i 0624 21:46:47.604000 40 cuda_flags.cc:32] CUDA enabled.
    .
    Compiling Operators(4/4) used: 16.4s eta:    0s
    

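    After this, CPU usage drops to 0, and about one minute later a blue screen occurs with the message driver power state failure. Pressing Ctrl-C during that minute prints Caught SIGNAL 2, quick exit, but the blue screen still occurs once the time is up.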

  3. After rebooting, run python -m jittor.test.test_array a second time in the same environment. No blue screen occurs this time, and the log is as follows:

    [i 0624 22:00:03.576000 24 compiler.py:953] Jittor(1.3.4.15) src: c:\programdata\miniconda3\envs\jittor\lib\site-packages\jittor
    [i 0624 22:00:03.624000 24 compiler.py:954] cl at C:\Users\wawcac\.cache\jittor\msvc\VC\_\_\_\_\_\bin\cl.exe(19.29.30133)
    [i 0624 22:00:03.625000 24 compiler.py:955] cache_path: C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default
    [i 0624 22:00:03.677000 24 install_cuda.py:53] cuda_driver_version: [11, 6, 0]
    [i 0624 22:00:03.716000 24 __init__.py:411] Found C:\Users\wawcac\.cache\jittor\jtcuda\cuda11.2_cudnn8_win\bin\nvcc.exe(11.2.67) at C:\Users\wawcac\.cache\jittor\jtcuda\cuda11.2_cudnn8_win\bin\nvcc.exe.
    [i 0624 22:00:03.946000 24 __init__.py:411] Found gdb(10.2) at C:\Program Files\TDM-GCC-64\bin\gdb.EXE.
    [i 0624 22:00:03.999000 24 __init__.py:411] Found addr2line(2.36.1) at C:\Program Files\TDM-GCC-64\bin\addr2line.EXE.
    [i 0624 22:00:04.040000 24 compiler.py:1008] cuda key:cu11.2.67
    [i 0624 22:00:04.043000 24 __init__.py:227] Total mem: 7.82GB, using 2 procs for compiling.
    [i 0624 22:00:06.539000 24 jit_compiler.cc:28] Load cc_path: C:\Users\wawcac\.cache\jittor\msvc\VC\_\_\_\_\_\bin\cl.exe
    [i 0624 22:00:06.562000 24 init.cc:62] Found cuda archs: [61,]
    [i 0624 22:00:07.039000 24 compile_extern.py:516] mpicc not found, distribution disabled.
    ...[i 0624 22:00:10.926000 24 cuda_flags.cc:32] CUDA enabled.
    .[i 0624 22:00:10.978000 24 cuda_flags.cc:32] CUDA enabled.
    ..[i 0624 22:00:10.993000 24 cuda_flags.cc:32] CUDA enabled.
    [w 0624 22:00:14.356000 24 cuda_device_allocator.cc:29] Unable to alloc cuda device memory, use unify memory instead. This may cause low performance.
    [i 0624 22:00:14.357000 24 cuda_device_allocator.cc:31]
    === display_memory_info ===
     total_cpu_ram:    15 B total_device_ram:     2GB
     hold_vars: 184 lived_vars: 733 lived_ops: 1318
     name: sfrl is_device: 1 used: 159.7MB(18.1%) unused: 722.3MB(81.9%) total:   882MB
     name: sfrl is_device: 1 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:   161KB(15.7%) unused:   863KB(84.3%) total:     1MB
     name: temp is_device: 0 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
     name: temp is_device: 1 used:     0 B(0%) unused: 442.8MB(100%) total: 442.8MB
     cpu&gpu:  1.41GB gpu: 1.351GB cpu:    60MB
     free: cpu(    0 B) gpu(68.87MB)
    ===========================
    
    [w 0624 22:00:14.910000 24 cuda_device_allocator.cc:29] Unable to alloc cuda device memory, use unify memory instead. This may cause low performance.
    [i 0624 22:00:14.911000 24 cuda_device_allocator.cc:31]
    === display_memory_info ===
     total_cpu_ram:    15 B total_device_ram:     2GB
     hold_vars: 184 lived_vars: 665 lived_ops: 692
     name: sfrl is_device: 1 used: 159.7MB(18.1%) unused: 722.3MB(81.9%) total:   882MB
     name: sfrl is_device: 1 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:    58MB(98.3%) unused:     1MB(1.69%) total:    59MB
     name: sfrl is_device: 0 used:  98.5KB(9.62%) unused: 925.5KB(90.4%) total:     1MB
     name: temp is_device: 0 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
     name: temp is_device: 1 used:     0 B(0%) unused: 442.8MB(100%) total: 442.8MB
     cpu&gpu:  1.41GB gpu: 1.351GB cpu:    60MB
     free: cpu(    0 B) gpu(68.87MB)
    ===========================
    
    E....[i 0624 22:00:15.152000 24 profiler.cc:468]
    Profile result, sorted by TotalTime
    ('it/s' represent number of iterations per sec)
          Name  FileName     Count TotalTime    %,cum%   AvgTime   MinTime   MaxTime     Input    Output     InOut   Compute
    Total time:     146us
    Total Memory Access:      80 B
    ?opkey0:array?T:int32?o:1?opkey1:broadcast_to?Tx:int32?DIM=1?BCAST=1?opkey2:binary?Tx:float32?Ty:int32?Tz:float32?OP:subtract?JIT:1?JIT_cuda:1?graph:000010,052020,010021,?var_info::11110121?index_t:int32
              C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_array__T_int32__o_1__opkey1_broadcast_to__Tx_int32__DIM_1__BCAST_1__opkey2_binary___hash_42bbdae5e63a5b90_op.cc
                                 1     146us(100%,100%)     146us     146us     146us   267KB/s   267KB/s   534KB/s 68.3Kit/s
    
    
    [i 0624 22:00:15.153000 24 profiler.cc:549]
    Memory profile result, sorted by CheckTimes
               Name       FileName     CheckTimes    TLBMissRate
    
    
    .[i 0624 22:00:15.153000 24 cuda_flags.cc:32] CUDA enabled.
    [i 0624 22:00:15.154000 24 profiler.cc:468]
    Profile result, sorted by TotalTime
    ('it/s' represent number of iterations per sec)
          Name  FileName     Count TotalTime    %,cum%   AvgTime   MinTime   MaxTime     Input    Output     InOut   Compute
    Total time:    36.3us
    Total Memory Access:      80 B
    ?opkey0:array?T:int32?o:1?opkey1:broadcast_to?Tx:int32?DIM=1?BCAST=1?opkey2:binary?Tx:float32?Ty:int32?Tz:float32?OP:subtract?JIT:1?JIT_cuda:1?graph:000010,052020,010021,?var_info::11110121?index_t:int32
              C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_array__T_int32__o_1__opkey1_broadcast_to__Tx_int32__DIM_1__BCAST_1__opkey2_binary___hash_42bbdae5e63a5b90_op.cc
                                 1    36.3us(100%,100%)    36.3us    36.3us    36.3us  1.05MB/s  1.05MB/s   2.1MB/s  276Kit/s
    
    
    [i 0624 22:00:15.154000 24 profiler.cc:549]
    Memory profile result, sorted by CheckTimes
               Name       FileName     CheckTimes    TLBMissRate
    
    
    [i 0624 22:00:15.154000 24 cuda_flags.cc:32] CUDA enabled.
    ..[i 0624 22:00:15.231000 24 cuda_flags.cc:32] CUDA enabled.
    [i 0624 22:00:15.231000 24 cuda_flags.cc:32] CUDA enabled.
    ....
    ======================================================================
    ERROR: test_memcopy_overlap (__main__.TestArray)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py", line 125, in inner
        ret = func(*args, **kw)
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\test\test_array.py", line 49, in test_memcopy_overlap
        a.sync()
    RuntimeError: [f 0624 22:00:14.826000 24 executor.cc:665]
    Execute fused operator(128/186) failed.
    [JIT Source]: C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_broadcast_to__Tx_float32__DIM_7__BCAST_19__opkey1_reindex__Tx_float32__XDIM_4__YD___hash_b0347dac92a33895_op.cc
    [OP TYPE]: fused_op:( broadcast_to, reindex, binary.multiply, reduce.add,)
    [Input]: float32[256,256,3,3,]layer3.0.conv2.weight, float32[100,256,14,14,],
    [Output]: float32[100,256,14,14,],
    [Async Backtrace]: not found, please set env JT_SYNC=1, trace_py_var=3
    [Reason]: [f 0624 22:00:14.825000 24 helper_cuda.h:128] CUDA error at c:\programdata\miniconda3\envs\jittor\lib\site-packages\jittor\src\mem\allocator\cuda_device_allocator.cc:32  code=2( cudaErrorMemoryAllocation ) cudaMallocManaged(&ptr, size)
    **********
    Async error was detected. To locate the async backtrace and get better error report, please rerun your code with two enviroment variables set:
    cmd:
    >>> set JT_SYNC=1
    >>> set trace_py_var=3
    powershell:
    >>> $env:JT_SYNC=1
    >>> $env:trace_py_var=3
    
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py", line 124, in inner
        with self:
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py", line 145, in __exit__
        setattr(flags, k, v)
    RuntimeError: [f 0624 22:00:15.051000 24 executor.cc:665]
    Execute fused operator(68/126) failed.
    [JIT Source]: C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_broadcast_to__Tx_float32__DIM_7__BCAST_19__opkey1_reindex__Tx_float32__XDIM_4__YD___hash_b0347dac92a33895_op.cc
    [OP TYPE]: fused_op:( broadcast_to, reindex, binary.multiply, reduce.add,)
    [Input]: float32[256,256,3,3,]layer3.0.conv2.weight, float32[100,256,14,14,],
    [Output]: float32[100,256,14,14,],
    [Async Backtrace]: not found, please set env JT_SYNC=1, trace_py_var=3
    [Reason]: [f 0624 22:00:15.051000 24 helper_cuda.h:128] CUDA error at c:\programdata\miniconda3\envs\jittor\lib\site-packages\jittor\src\mem\allocator\cuda_device_allocator.cc:32  code=2( cudaErrorMemoryAllocation ) cudaMallocManaged(&ptr, size)
    **********
    Async error was detected. To locate the async backtrace and get better error report, please rerun your code with two enviroment variables set:
    cmd:
    >>> set JT_SYNC=1
    >>> set trace_py_var=3
    powershell:
    >>> $env:JT_SYNC=1
    >>> $env:trace_py_var=3
    
    
    ----------------------------------------------------------------------
    Ran 18 tests in 4.337s
    
    FAILED (errors=1)
    
  4. After setting JT_SYNC and trace_py_var as the message suggests, the error output is as follows:

    ======================================================================
    ERROR: test_memcopy_overlap (__main__.TestArray)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py", line 125, in inner
        ret = func(*args, **kw)
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\test\test_array.py", line 49, in test_memcopy_overlap
        a.sync()
    RuntimeError: [f 0624 22:06:50.515000 40 executor.cc:665]
    Execute fused operator(128/186) failed.
    [JIT Source]: C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_broadcast_to__Tx_float32__DIM_7__BCAST_19__opkey1_reindex__Tx_float32__XDIM_4__YD___hash_b0347dac92a33895_op.cc
    [OP TYPE]: fused_op:( broadcast_to, reindex, binary.multiply, reduce.add,)
    [Input]: float32[256,256,3,3,]layer3.0.conv2.weight, float32[100,256,14,14,],
    [Output]: float32[100,256,14,14,],
    [Async Backtrace]: ---
         C:\ProgramData\Miniconda3\envs\jittor\lib\runpy.py:196 <_run_module_as_main>
         C:\ProgramData\Miniconda3\envs\jittor\lib\runpy.py:86 <_run_code>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\test\test_array.py:212 <<module>>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\main.py:101 <__init__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\main.py:271 <runTests>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\runner.py:184 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:84 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:122 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:84 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:122 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\case.py:650 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\case.py:591 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\case.py:549 <_callTestMethod>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:125 <inner>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\test\test_array.py:48 <test_memcopy_overlap>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\models\resnet.py:152 <execute>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\models\resnet.py:144 <_forward_impl>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\nn.py:2054 <execute>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\models\resnet.py:52 <execute>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\nn.py:847 <execute>
    [Reason]: [f 0624 22:06:50.514000 40 helper_cuda.h:128] CUDA error at c:\programdata\miniconda3\envs\jittor\lib\site-packages\jittor\src\mem\allocator\cuda_device_allocator.cc:32  code=2( cudaErrorMemoryAllocation ) cudaMallocManaged(&ptr, size)
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py", line 124, in inner
        with self:
      File "C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py", line 145, in __exit__
        setattr(flags, k, v)
    RuntimeError: [f 0624 22:06:50.709000 40 executor.cc:665]
    Execute fused operator(68/126) failed.
    [JIT Source]: C:\Users\wawcac\.cache\jittor\jt1.3.4\cl\py3.10.4\Windows-10-10.xd5\IntelRCoreTMi7x8f\default\cu11.2.67\jit\__opkey0_broadcast_to__Tx_float32__DIM_7__BCAST_19__opkey1_reindex__Tx_float32__XDIM_4__YD___hash_b0347dac92a33895_op.cc
    [OP TYPE]: fused_op:( broadcast_to, reindex, binary.multiply, reduce.add,)
    [Input]: float32[256,256,3,3,]layer3.0.conv2.weight, float32[100,256,14,14,],
    [Output]: float32[100,256,14,14,],
    [Async Backtrace]: ---
         C:\ProgramData\Miniconda3\envs\jittor\lib\runpy.py:196 <_run_module_as_main>
         C:\ProgramData\Miniconda3\envs\jittor\lib\runpy.py:86 <_run_code>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\test\test_array.py:212 <<module>>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\main.py:101 <__init__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\main.py:271 <runTests>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\runner.py:184 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:84 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:122 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:84 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\suite.py:122 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\case.py:650 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\case.py:591 <run>
         C:\ProgramData\Miniconda3\envs\jittor\lib\unittest\case.py:549 <_callTestMethod>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:125 <inner>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\test\test_array.py:48 <test_memcopy_overlap>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\models\resnet.py:152 <execute>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\models\resnet.py:144 <_forward_impl>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\nn.py:2054 <execute>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\models\resnet.py:52 <execute>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\__init__.py:954 <__call__>
         C:\ProgramData\Miniconda3\envs\jittor\lib\site-packages\jittor\nn.py:847 <execute>
    [Reason]: [f 0624 22:06:50.709000 40 helper_cuda.h:128] CUDA error at c:\programdata\miniconda3\envs\jittor\lib\site-packages\jittor\src\mem\allocator\cuda_device_allocator.cc:32  code=2( cudaErrorMemoryAllocation ) cudaMallocManaged(&ptr, size)
    
    ----------------------------------------------------------------------
    Ran 18 tests in 3.431s
    
    FAILED (errors=1)
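
For anyone trying to reproduce step 4 without typing the variables into each shell, here is a minimal sketch that sets them for a child process only. The variable names and values (JT_SYNC=1, trace_py_var=3) come from the hint in the log above; the helper names debug_env and run_test_array are hypothetical and are not part of Jittor.

```python
import os
import subprocess
import sys


def debug_env(base=None):
    """Return a copy of the environment with the two debug variables set."""
    env = dict(os.environ if base is None else base)
    env["JT_SYNC"] = "1"        # value taken from the hint printed in the log
    env["trace_py_var"] = "3"   # value taken from the hint printed in the log
    return env


def run_test_array():
    """Hypothetical helper: rerun the failing test module in a child process."""
    cmd = [sys.executable, "-m", "jittor.test.test_array"]
    # The variables only affect the child process, not the calling shell.
    return subprocess.run(cmd, env=debug_env()).returncode
```

Calling run_test_array() from the same conda environment should be equivalent to setting the two variables in cmd or PowerShell and rerunning the command by hand. Note that on this machine the first run after a cache rebuild led to a blue screen, so save open files before trying it.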
    

After the blue screen, the machine has to be force-restarted, and any unsaved files are lost.
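
The discrepancy in the title can be seen by comparing two lines of the same log: the startup line reports Total mem: 7.82GB, while display_memory_info reports total_cpu_ram: 15 B. Below is a rough sketch for pulling both figures out of a saved log so they can be compared across runs. It assumes the sizes use binary units (1 GB = 1024^3 bytes), which the log does not confirm, and the function names parse_size and find_size are hypothetical. It says nothing about why the two values differ.

```python
import re

# Assumption: the log uses binary units; this is not confirmed by the source.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text):
    """Convert a size such as '15 B' or '7.82GB' to a number of bytes."""
    m = re.fullmatch(r"\s*([\d.]+)\s*(KB|MB|GB|B)\s*", text)
    if m is None:
        raise ValueError(f"unrecognized size: {text!r}")
    return float(m.group(1)) * UNITS[m.group(2)]


def find_size(log, label):
    """Return the first size that follows 'label:' in the log, or None."""
    m = re.search(re.escape(label) + r":\s*([\d.]+\s*(?:KB|MB|GB|B))", log)
    return None if m is None else parse_size(m.group(1))


# Two lines copied from the first run's log above.
sample = (
    "[i 0624 21:46:28.369000 40 __init__.py:227] Total mem: 7.82GB, "
    "using 2 procs for compiling.\n"
    " total_cpu_ram:    15 B total_device_ram:     2GB\n"
)
startup_bytes = find_size(sample, "Total mem")
reported_bytes = find_size(sample, "total_cpu_ram")
print(f"startup: {startup_bytes:.0f} bytes, display_memory_info: {reported_bytes:.0f} bytes")
```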