Following the jittorllms documentation, I get RuntimeError: [f 0512 17:49:00.342000 48 executor.cc:682]

If you run into installation problems, runtime errors, or incorrect output while using Jittor, you can ask in the Q&A section.
To help us locate your problem, it is recommended that you include:

  1. the error message,
  2. the complete log of the Jittor run (please copy and paste the output; avoid screenshots if possible),
  3. code that reproduces the problem, or a description of it,
  4. what you believe the correct result should be,
  5. any other necessary information

Here is a reference template

Detailed error output

User input: 你好 ("hello")

Compiling Operators(5/5) used: 4.64s eta:    0s
[w 0512 17:49:00.220000 48 cuda_device_allocator.cc:30] Unable to alloc cuda device memory, use unify memory instead. This may cause low performance.
[i 0512 17:49:00.220000 48 cuda_device_allocator.cc:32]
=== display_memory_info ===
 total_cpu_ram:    15 B total_device_ram:     6GB
 hold_vars: 440 lived_vars: 460 lived_ops: 135
 jtorch_grad_vars: 0
 name: sfrl is_device: 1 used: 5.275GB(100%) unused: 963.5KB(0.0174%) total: 5.276GB
 name: sfrl is_device: 1 used:     0 B(0%) unused:     1MB(100%) total:     1MB
 name: sfrl is_device: 0 used:     0 B(0%) unused:     1MB(100%) total:     1MB
 name: sfrl is_device: 0 used: 7.525GB(58.8%) unused: 5.276GB(41.2%) total:  12.8GB
 name: temp is_device: 0 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
 name: temp is_device: 1 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
 cpu&gpu: 18.08GB gpu: 5.277GB cpu:  12.8GB
 free: cpu(    0 B) gpu(    0 B)
 swap: total(    0 B) last(    0 B)
===========================

[w 0512 17:49:00.342000 48 cuda_device_allocator.cc:30] Unable to alloc cuda device memory, use unify memory instead. This may cause low performance.
[i 0512 17:49:00.342000 48 cuda_device_allocator.cc:32]
=== display_memory_info ===
 total_cpu_ram:    15 B total_device_ram:     6GB
 hold_vars: 440 lived_vars: 460 lived_ops: 135
 jtorch_grad_vars: 0
 name: sfrl is_device: 1 used: 5.275GB(100%) unused: 963.5KB(0.0174%) total: 5.276GB
 name: sfrl is_device: 1 used:     0 B(0%) unused:     1MB(100%) total:     1MB
 name: sfrl is_device: 0 used:     0 B(0%) unused:     1MB(100%) total:     1MB
 name: sfrl is_device: 0 used: 7.525GB(58.8%) unused: 5.276GB(41.2%) total:  12.8GB
 name: temp is_device: 0 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
 name: temp is_device: 1 used:     0 B(-nan(ind)%) unused:     0 B(-nan(ind)%) total:     0 B
 cpu&gpu: 18.08GB gpu: 5.277GB cpu:  12.8GB
 free: cpu(    0 B) gpu(    0 B)
 swap: total(    0 B) last(    0 B)
===========================

Traceback (most recent call last):
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\cli_demo.py", line 9, in <module>
    model.chat()
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\models\chatglm\__init__.py", line 36, in chat
    for response, history in self.model.stream_chat(self.tokenizer, text, history=history):
  File "C:\Users\anwu/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1259, in stream_chat
    for outputs in self.stream_generate(**input_ids, **gen_kwargs):
  File "C:\Users\anwu/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1336, in stream_generate
    outputs = self(
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\envi\lib\site-packages\jtorch\nn\__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "C:\Users\anwu/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 1138, in forward
    transformer_outputs = self.transformer(
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\envi\lib\site-packages\jtorch\nn\__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "C:\Users\anwu/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 973, in forward
    layer_ret = layer(
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\envi\lib\site-packages\jtorch\nn\__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "C:\Users\anwu/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 614, in forward
    attention_outputs = self.attention(
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\envi\lib\site-packages\jtorch\nn\__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "C:\Users\anwu/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 454, in forward
    cos, sin = self.rotary_emb(q1, seq_len=position_ids.max() + 1)
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\envi\lib\site-packages\jtorch\nn\__init__.py", line 16, in __call__
    return self.forward(*args, **kw)
  File "C:\Users\anwu/.cache\huggingface\modules\transformers_modules\local\modeling_chatglm.py", line 202, in forward
    t = torch.arange(seq_len, device=x.device, dtype=self.inv_freq.dtype)
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\envi\lib\site-packages\jtorch\__init__.py", line 31, in inner
    ret = func(*args, **kw)
  File "E:\SystemApp\Desktop\testbot\JittorLLMs\envi\lib\site-packages\jittor\misc.py", line 809, in arange
    if isinstance(start, jt.Var): start = start.item()
RuntimeError: [f 0512 17:49:00.342000 48 executor.cc:682]
Execute fused operator(29/43) failed.
[JIT Source]: C:\Users\anwu\.cache\jittor\jt1.3.7\cl\py3.10.5\Windows-10-10.x08\AMDRyzen75800Hxb1\main\cu11.2.67\jit\__opkey0_broadcast_to__Tx_float16__DIM_3__BCAST_1__opkey1_broadcast_to__Tx_float16__DIM_3____hash_9730a00665a5a466_op.cc
[OP TYPE]: fused_op:( broadcast_to, broadcast_to, binary.multiply, reduce.add,)
[Input]: float16[12288,4096,]transformer.layers.11.attention.query_key_value.weight, float16[4,4096,],
[Output]: float16[4,12288,],
[Async Backtrace]: not found, please set env JT_SYNC=1, trace_py_var=3
[Reason]: [f 0512 17:49:00.342000 48 helper_cuda.h:128] CUDA error at e:\systemapp\desktop\testbot\jittorllms\envi\lib\site-packages\jittor\src\mem\allocator\cuda_device_allocator.cc:33  code=2( cudaErrorMemoryAllocation ) cudaMallocManaged(&ptr, size)
**********
Async error was detected. To locate the async backtrace and get better error report, please rerun your code with two enviroment variables set:
cmd:
>>> set JT_SYNC=1
>>> set trace_py_var=3
powershell:
>>> $env:JT_SYNC=1
>>> $env:trace_py_var=3

I would like to know what I can do to get this running normally. Any help is appreciated.

First, set a GPU memory limit via environment variables, and then make sure to run python cli_demo.py chatglm from a cmd window opened in administrator mode.
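For reference, a minimal sketch of that setup, following the memory-limit environment variables described in the JittorLLMs README (JT_SAVE_MEM, cpu_mem_limit, device_mem_limit). The byte values below are assumptions chosen for a 6GB card like the one in this log; adjust them to your hardware:

cmd (run as administrator):
>>> rem enable Jittor's memory-saving mode
>>> set JT_SAVE_MEM=1
>>> rem cap CPU RAM at ~16GB and GPU memory at ~4GB (assumed values for a 6GB card)
>>> set cpu_mem_limit=16000000000
>>> set device_mem_limit=4000000000
>>> python cli_demo.py chatglm

The failure in the log is cudaErrorMemoryAllocation from cudaMallocManaged, i.e. the 6GB card ran out of memory even after falling back to unified memory. Capping device_mem_limit below the physical total should let Jittor keep the remainder of the model in CPU RAM instead of failing the allocation.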