Error when loading the BLIP model from the transformers_jittor library

Problematic code:

from transformers import BlipConfig, BlipModel, BlipProcessor, BlipForConditionalGeneration, AutoProcessor

model = BlipModel.from_pretrained("Salesforce/blip-image-captioning-base")

With the transformers_jittor library, loading either BlipModel or BlipForConditionalGeneration raises the same error, shown below:

[i 0525 14:12:22.322198 68 compiler.py:956] Jittor(1.3.9.5) src: /home/user/anaconda3/envs/jittor/lib/python3.9/site-packages/jittor
[i 0525 14:12:22.333077 68 compiler.py:957] g++ at /usr/bin/g++(9.4.0)
[i 0525 14:12:22.333300 68 compiler.py:958] cache_path: /home/user/.cache/jittor/jt1.3.9/g++9.4.0/py3.9.19/Linux-5.15.0-1x3f/IntelRXeonRGolxa3/54ff/default
[i 0525 14:12:22.352645 68 install_cuda.py:93] cuda_driver_version: [11, 4]
[i 0525 14:12:22.353170 68 install_cuda.py:81] restart /home/user/anaconda3/envs/jittor/bin/python3.9 ['Codes/try_caption.py']
[i 0525 14:12:22.990449 16 compiler.py:956] Jittor(1.3.9.5) src: /home/user/anaconda3/envs/jittor/lib/python3.9/site-packages/jittor
[i 0525 14:12:23.001963 16 compiler.py:957] g++ at /usr/bin/g++(9.4.0)
[i 0525 14:12:23.002148 16 compiler.py:958] cache_path: /home/user/.cache/jittor/jt1.3.9/g++9.4.0/py3.9.19/Linux-5.15.0-1x3f/IntelRXeonRGolxa3/54ff/default
[i 0525 14:12:23.022333 16 install_cuda.py:93] cuda_driver_version: [11, 4]
[i 0525 14:12:23.037444 16 __init__.py:412] Found /home/user/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc(11.2.152) at /home/user/.cache/jittor/jtcuda/cuda11.2_cudnn8_linux/bin/nvcc.
[i 0525 14:12:23.138148 16 __init__.py:412] Found gdb(20.04.1) at /usr/bin/gdb.
[i 0525 14:12:23.150208 16 __init__.py:412] Found addr2line(2.34) at /usr/bin/addr2line.
[i 0525 14:12:47.557001 16 compiler.py:1011] cuda key:cu11.2.152_sm_86
[i 0525 14:12:47.924051 16 __init__.py:227] Total mem: 125.55GB, using 16 procs for compiling.
[i 0525 14:12:48.072731 16 jit_compiler.cc:28] Load cc_path: /usr/bin/g++
[i 0525 14:13:12.679537 16 init.cc:63] Found cuda archs: [86,]
Traceback (most recent call last):
  File "Codes/try_caption.py", line 67, in <module>
    model = BlipModel.from_pretrained("Salesforce/blip-image-captioning-base")
  File "/home/user/anaconda3/envs/jittor/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2370, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/home/user/anaconda3/envs/jittor/lib/python3.9/site-packages/transformers/models/blip/modeling_blip.py", line 749, in __init__
    self.text_model = BlipTextModel(text_config)
  File "/home/user/anaconda3/envs/jittor/lib/python3.9/site-packages/transformers/models/blip/modeling_blip_text.py", line 578, in __init__
    self.embeddings = BlipTextEmbeddings(config)
  File "/home/user/anaconda3/envs/jittor/lib/python3.9/site-packages/transformers/models/blip/modeling_blip_text.py", line 60, in __init__
    self.register_buffer("position_ids", torch.arange(config.max_position_embeddings).expand((1, -1)))
  File "/home/user/anaconda3/envs/jittor/lib/python3.9/site-packages/jittor/misc.py", line 299, in expand
    shape[i] = x.shape[i]
RuntimeError: Wrong inputs arguments, Please refer to examples(help(jt.NanoVector.__map_getitem__)).

Types of your inputs are:
 self   = NanoVector,
 arg0   = int,

The function declarations are:
 inline int64 at(int i)
 inline NanoVector slice(Slice slice)

Failed reason:[f 0525 14:13:16.562800 16 nano_vector.h:116] Check failed: i>=0 && i<size()  Something wrong... Could you please report this issue?

How can this be resolved?

The expand function does not currently support -1 as an input. Please replace it with an equivalent form; support for this usage will be added later.
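
For reference, here is a minimal sketch of such an equivalent rewrite (assuming torch here is the jittor-backed alias that transformers_jittor provides, so jittor's tensor methods apply). Calling expand((1, -1)) on a 1-D tensor only prepends an axis of size 1, so unsqueeze(0) produces the same result without the unsupported -1. The failing line 60 of modeling_blip_text.py could be patched as:

# In transformers/models/blip/modeling_blip_text.py, BlipTextEmbeddings.__init__
# Before (fails under jittor, whose expand cannot resolve -1):
#   self.register_buffer("position_ids",
#       torch.arange(config.max_position_embeddings).expand((1, -1)))
# After: on a 1-D tensor, expand((1, -1)) merely adds a leading axis of size 1,
# so unsqueeze(0) is an exact equivalent:
self.register_buffer("position_ids",
    torch.arange(config.max_position_embeddings).unsqueeze(0))

Alternatively, passing the explicit size, i.e. .expand((1, config.max_position_embeddings)), also avoids the -1, though unsqueeze(0) is the most direct equivalent here.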