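Reproducing SAM3 on AutoDL: Complete Walkthrough

0) Prerequisites
NVIDIA discrete GPU
Latest driver recommended (one that supports CUDA 12.6+)
Conda installed
Note: SAM3 may conflict with newer opencv/numpy releases
The image I used: PyTorch 2.8.0, Python 3.12 (ubuntu22.04), CUDA 12.8, GPU: RTX 4090 (24GB) x 1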
1) Create and activate the environment
Python 3.12 is recommended.

    conda create -n sam3 python=3.12 -y
    conda activate sam3
2) Install PyTorch (cu126)
Skip this step if the AutoDL image you selected already ships with PyTorch.

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
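Before going further, it can save time to confirm that the PyTorch you ended up with (installed or preinstalled) actually sees the GPU. The snippet below is a minimal sketch and not part of the original tutorial; the helper name `describe_torch_install` is made up for illustration, and it relies only on standard `torch` attributes.

```python
import importlib.util


def describe_torch_install():
    """Return a one-line summary of the local PyTorch install, if there is one."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the check also works without torch

    cuda_ok = torch.cuda.is_available()
    device = torch.cuda.get_device_name(0) if cuda_ok else "no CUDA device visible"
    # torch.version.cuda is None for CPU-only builds
    return f"torch {torch.__version__}, CUDA build {torch.version.cuda}, device: {device}"


if __name__ == "__main__":
    print(describe_torch_install())
```

If the summary reports no CUDA device, the driver or the selected wheel index is the first thing to recheck.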
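3) Download and install the source
Download the source:

    git clone https://github.com/facebookresearch/sam3.git

Enter the extracted directory (it is named sam3 instead of sam3-main if you used git clone):

    cd path/to/sam3-main
    pip install -e .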
4) Install the missing dependencies (some packages were missing for unknown reasons)

    pip install matplotlib pandas tqdm pillow
    pip install scikit-image scikit-learn
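To see which of these packages are actually missing before (or after) installing them, a small check like the following may help. It is only a sketch: the function name `missing_packages` and the `REQUIRED` mapping are illustrative, not something the sam3 repo provides. Note that the pip name and the import name differ for a few of them.

```python
import importlib.util

# pip distribution name -> module name used in "import ..."
REQUIRED = {
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "tqdm": "tqdm",
    "pillow": "PIL",
    "scikit-image": "skimage",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose modules cannot be found in this environment."""
    return [
        dist
        for dist, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing, install with: pip install " + " ".join(missing))
    else:
        print("All extra dependencies are importable")
```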
Final pip list output:
    Package                   Version     Editable project location
    ------------------------- ----------- -------------------------
    anyio                     4.12.1
    argon2-cffi               25.1.0
    argon2-cffi-bindings      25.1.0
    asttokens                 3.0.0
    async-lru                 2.0.5
    attrs                     25.4.0
    babel                     2.17.0
    beautifulsoup4            4.14.2
    bleach                    6.3.0
    brotlicffi                1.2.0.0
    certifi                   2026.1.4
    cffi                      2.0.0
    charset-normalizer        3.4.4
    click                     8.3.1
    comm                      0.2.3
    contourpy                 1.3.3
    cuda-bindings             12.9.4
    cuda-pathfinder           1.3.3
    cycler                    0.12.1
    debugpy                   1.8.16
    decorator                 5.2.1
    decord                    0.6.0
    defusedxml                0.7.1
    einops                    0.8.1
    executing                 2.2.1
    fastjsonschema            2.21.2
    filelock                  3.20.3
    fonttools                 4.61.1
    fsspec                    2026.1.0
    ftfy                      6.1.1
    h11                       0.16.0
    hf-xet                    1.2.0
    html5lib                  1.1
    httpcore                  1.0.9
    httpx                     0.28.1
    huggingface_hub           1.3.3
    idna                      3.11
    ImageIO                   2.37.2
    iopath                    0.1.10
    ipykernel                 6.31.0
    ipython                   9.7.0
    ipython_pygments_lexers   1.1.1
    ipywidgets                8.1.7
    jedi                      0.19.2
    Jinja2                    3.1.6
    joblib                    1.5.3
    json5                     0.12.1
    jsonschema                4.25.1
    jsonschema-specifications 2025.9.1
    jupyter                   1.1.1
    jupyter_client            8.8.0
    jupyter-console           6.6.3
    jupyter_core              5.9.1
    jupyter-events            0.12.0
    jupyter-lsp               2.2.5
    jupyter_server            2.17.0
    jupyter_server_terminals  0.5.3
    jupyterlab                4.5.0
    jupyterlab_pygments       0.3.0
    jupyterlab_server         2.28.0
    jupyterlab_widgets        3.0.16
    kiwisolver                1.4.9
    lazy_loader               0.4
    MarkupSafe                3.0.3
    matplotlib                3.10.8
    matplotlib-inline         0.2.1
    mistune                   3.1.2
    mpmath                    1.3.0
    nbclient                  0.10.2
    nbconvert                 7.16.6
    nbformat                  5.10.4
    nest-asyncio              1.6.0
    networkx                  3.6.1
    notebook                  7.5.0
    notebook_shim             0.2.4
    numpy                     1.26.4
    nvidia-cublas-cu12        12.8.4.1
    nvidia-cuda-cupti-cu12    12.8.90
    nvidia-cuda-nvrtc-cu12    12.8.93
    nvidia-cuda-runtime-cu12  12.8.90
    nvidia-cudnn-cu12         9.10.2.21
    nvidia-cufft-cu12         11.3.3.83
    nvidia-cufile-cu12        1.13.1.3
    nvidia-curand-cu12        10.3.9.90
    nvidia-cusolver-cu12      11.7.3.90
    nvidia-cusparse-cu12      12.5.8.93
    nvidia-cusparselt-cu12    0.7.1
    nvidia-nccl-cu12          2.27.5
    nvidia-nvjitlink-cu12     12.8.93
    nvidia-nvshmem-cu12       3.4.5
    nvidia-nvtx-cu12          12.8.90
    opencv-python             4.11.0.86
    packaging                 26.0
    pandas                    3.0.0
    pandocfilters             1.5.1
    parso                     0.8.5
    pexpect                   4.9.0
    pillow                    12.1.0
    pip                       25.3
    platformdirs              4.5.0
    portalocker               3.2.0
    prometheus_client         0.21.1
    prompt_toolkit            3.0.52
    psutil                    7.0.0
    ptyprocess                0.7.0
    pure_eval                 0.2.3
    pycocotools               2.0.11
    pycparser                 2.23
    Pygments                  2.19.2
    pyparsing                 3.3.2
    PySocks                   1.7.1
    python-dateutil           2.9.0.post0
    python-json-logger        4.0.0
    PyYAML                    6.0.3
    pyzmq                     27.1.0
    qtconsole                 5.7.0
    QtPy                      2.4.3
    referencing               0.37.0
    regex                     2026.1.15
    requests                  2.32.5
    rfc3339-validator         0.1.4
    rfc3986-validator         0.1.1
    rpds-py                   0.28.0
    safetensors               0.7.0
    sam3                      0.1.0       /root/autodl-tmp/sam3
    scikit-image              0.26.0
    scikit-learn              1.8.0
    scipy                     1.17.0
    Send2Trash                1.8.3
    setuptools                80.9.0
    shellingham               1.5.4
    six                       1.17.0
    sniffio                   1.3.1
    soupsieve                 2.5
    stack_data                0.6.3
    sympy                     1.14.0
    terminado                 0.18.1
    threadpoolctl             3.6.0
    tifffile                  2026.1.14
    timm                      1.0.24
    tinycss2                  1.4.0
    torch                     2.10.0
    torchvision               0.25.0
    tornado                   6.5.4
    tqdm                      4.67.1
    traitlets                 5.14.3
    triton                    3.6.0
    typer-slim                0.21.1
    typing_extensions         4.15.0
    urllib3                   2.6.3
    wcwidth                   0.3.5
    webencodings              0.5.1
    websocket-client          1.8.0
    wheel                     0.45.1
    widgetsnbextension        4.0.14
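5) Download the weights
Access requests for the official Hugging Face repo are often denied. At least the rejection arrives within a few minutes, so it does not cost much time.

**sam3.pt link:** https://www.modelscope.cn/models/facebook/sam3/files

5.0 Download a single file into a chosen local folder (the example downloads README.md into a "dir" directory under the current path; substitute sam3.pt for README.md to fetch the weights)

    pip install modelscope
    modelscope download --model facebook/sam3 README.md --local_dir ./dir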
5.1 Put sam3.pt in the project root
For example:

    sam3-main/
      sam3/
      sam3.pt    ← put it here; ls will show another sam3 directory nested inside, do not cd into it
      ...
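5.2 Modify the source: force local loading
Open: sam3/model_builder.py
Search for the following lines and change them:
load_from_hf = True: change True to False
checkpoint_path = None: change None to "sam3.pt"
Save the file.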
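Since "sam3.pt" is a relative path, it presumably resolves against the directory Python is launched from, so the test script has to be run from the project root. A quick pre-flight check along these lines can catch a misplaced or incomplete download before the model builder fails with a less obvious error. This is a sketch only; `resolve_checkpoint` is a hypothetical helper, not part of sam3.

```python
from pathlib import Path


def resolve_checkpoint(path="sam3.pt"):
    """Return the absolute checkpoint path, or raise if it is missing or empty."""
    ckpt = Path(path).expanduser().resolve()
    if not ckpt.is_file():
        raise FileNotFoundError(
            f"{ckpt} not found (cwd: {Path.cwd()}); run from the project root "
            "or use an absolute path"
        )
    if ckpt.stat().st_size == 0:
        raise ValueError(f"{ckpt} is empty; the download may have failed")
    return ckpt


if __name__ == "__main__":
    try:
        print("Using checkpoint:", resolve_checkpoint())
    except (FileNotFoundError, ValueError) as err:
        print("Checkpoint problem:", err)
```

Writing an absolute path into model_builder.py instead of "sam3.pt" would remove the dependency on the working directory, at the cost of a less portable edit.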
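6) opencv / numpy version conflict
The original tutorial does not mention this, but it already came up while setting up the environment:
sam3 requires numpy < 2
Newer opencv-python releases (the 4.12/4.13 installed earlier) often require numpy >= 2
Fix: downgrade opencv to 4.11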
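The constraint described above can be checked against whatever is installed. The sketch below encodes just the two rules from this section (numpy below 2, opencv-python no newer than 4.11); the function names are illustrative, and the 4.12 threshold is taken from this write-up, not from an official compatibility table.

```python
from importlib import metadata


def version_tuple(version, width=2):
    """Turn '4.11.0.86' into (4, 11); assumes the leading parts are numeric."""
    return tuple(int(part) for part in version.split(".")[:width])


def find_conflicts(numpy_version, opencv_version):
    """Return human-readable problems based on the two rules in this section."""
    problems = []
    if version_tuple(numpy_version, 1) >= (2,):
        problems.append(f"numpy {numpy_version} is >= 2, but sam3 wants numpy < 2")
    if version_tuple(opencv_version) >= (4, 12):
        problems.append(
            f"opencv-python {opencv_version} is newer than 4.11 and may require numpy >= 2"
        )
    return problems


def installed(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    np_v, cv_v = installed("numpy"), installed("opencv-python")
    if np_v and cv_v:
        print(find_conflicts(np_v, cv_v) or "numpy/opencv versions look compatible")
    else:
        print("numpy or opencv-python is not installed")
```

If it reports a conflict, pinning to the versions that ended up in the pip list here (opencv-python 4.11.0.86 with numpy 1.26.4) is one combination that worked.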
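7) Final test code (a hardened version of main.py)
The original tutorial shows the result in a pop-up window via plt.show(). AutoDL cannot display images interactively, so save the figure to a file and open it in Jupyter instead.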
    import os

    import matplotlib.pyplot as plt
    from PIL import Image

    from sam3.model_builder import build_sam3_image_model
    from sam3.model.sam3_image_processor import Sam3Processor
    from sam3.visualization_utils import plot_results

    # Build the model; with the edits from 5.2 it loads the local sam3.pt
    model = build_sam3_image_model()
    processor = Sam3Processor(model)

    # Load the test image and run a text-prompted segmentation
    image = Image.open("assets/images/test_image.jpg").convert("RGB")
    state = processor.set_image(image)
    state = processor.set_text_prompt(state=state, prompt="person")

    # Draw the results, then save to disk instead of calling plt.show()
    plot_results(image, state)
    out_path = "out.png"
    plt.savefig(out_path, dpi=200, bbox_inches="tight", pad_inches=0)
    plt.close()
    print("Saved to:", os.path.abspath(out_path))
Run (from the project root, assuming the script is saved as main.py):

    python main.py
Closing thoughts: SAM3's instance segmentation is really strong (nod), and it runs fast on a 4090 (nod).