SAM3 on AutoDL: Complete Reproduction Walkthrough

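0) Prerequisites

  • NVIDIA discrete GPU
  • Latest driver recommended (one that supports CUDA 12.6+)
  • Conda installed
  • Note: SAM3 may conflict with newer opencv/numpy versions

I used the AutoDL image PyTorch 2.8.0 / Python 3.12 (ubuntu22.04) / CUDA 12.8
GPU: RTX 4090 (24GB) × 1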


1) Create and activate the environment

Python 3.12 is recommended.

conda create -n sam3 python=3.12 -y
conda activate sam3

2) Install PyTorch (cu126); skip this step if the AutoDL image you selected already ships PyTorch

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu126
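
A quick way to confirm that the install can actually see the GPU is sketched below. `torch_cuda_report` is a hypothetical helper name introduced here for illustration; it relies only on `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()`, and it reports a missing torch instead of crashing.

```python
import importlib


def torch_cuda_report() -> dict:
    """Summarize the installed torch build and whether CUDA is usable."""
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # torch is not installed in the active environment
        return {"torch": None, "cuda_build": None, "cuda_available": False}
    return {
        "torch": torch.__version__,          # torch release string
        "cuda_build": torch.version.cuda,    # CUDA version the wheel was built for (None on CPU builds)
        "cuda_available": bool(torch.cuda.is_available()),
    }


print(torch_cuda_report())
```

If `cuda_available` is False on a GPU instance, the usual suspects are a CPU-only wheel or a driver older than the CUDA build of the wheel.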

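3) Download the source and install

Download the source:

git clone https://github.com/facebookresearch/sam3.git

Enter the project directory (git clone creates sam3; a ZIP download extracts to sam3-main):

cd path/to/sam3
pip install -e .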

4) Install missing dependencies (some packages were missing for reasons I could not determine)

pip install matplotlib pandas tqdm pillow
pip install scikit-image scikit-learn
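
To see which of these packages are still missing before running anything, a small check along the following lines may help. It is only a sketch: `REQUIRED` and `missing_packages` are names made up for this example, and the mapping reflects the fact that pillow, scikit-image and scikit-learn are imported as `PIL`, `skimage` and `sklearn`.

```python
import importlib.util

# pip package name -> import name (they differ for pillow and the scikit-* packages)
REQUIRED = {
    "matplotlib": "matplotlib",
    "pandas": "pandas",
    "tqdm": "tqdm",
    "pillow": "PIL",
    "scikit-image": "skimage",
    "scikit-learn": "sklearn",
}


def missing_packages(required: dict) -> list:
    """Return the pip names whose import name cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


print("Missing:", missing_packages(REQUIRED) or "none")
```

Anything it lists can be passed straight to pip install.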

Final pip list output:

Package Version Editable project location
------------------------- ----------- -------------------------
anyio 4.12.1
argon2-cffi 25.1.0
argon2-cffi-bindings 25.1.0
asttokens 3.0.0
async-lru 2.0.5
attrs 25.4.0
babel 2.17.0
beautifulsoup4 4.14.2
bleach 6.3.0
brotlicffi 1.2.0.0
certifi 2026.1.4
cffi 2.0.0
charset-normalizer 3.4.4
click 8.3.1
comm 0.2.3
contourpy 1.3.3
cuda-bindings 12.9.4
cuda-pathfinder 1.3.3
cycler 0.12.1
debugpy 1.8.16
decorator 5.2.1
decord 0.6.0
defusedxml 0.7.1
einops 0.8.1
executing 2.2.1
fastjsonschema 2.21.2
filelock 3.20.3
fonttools 4.61.1
fsspec 2026.1.0
ftfy 6.1.1
h11 0.16.0
hf-xet 1.2.0
html5lib 1.1
httpcore 1.0.9
httpx 0.28.1
huggingface_hub 1.3.3
idna 3.11
ImageIO 2.37.2
iopath 0.1.10
ipykernel 6.31.0
ipython 9.7.0
ipython_pygments_lexers 1.1.1
ipywidgets 8.1.7
jedi 0.19.2
Jinja2 3.1.6
joblib 1.5.3
json5 0.12.1
jsonschema 4.25.1
jsonschema-specifications 2025.9.1
jupyter 1.1.1
jupyter_client 8.8.0
jupyter-console 6.6.3
jupyter_core 5.9.1
jupyter-events 0.12.0
jupyter-lsp 2.2.5
jupyter_server 2.17.0
jupyter_server_terminals 0.5.3
jupyterlab 4.5.0
jupyterlab_pygments 0.3.0
jupyterlab_server 2.28.0
jupyterlab_widgets 3.0.16
kiwisolver 1.4.9
lazy_loader 0.4
MarkupSafe 3.0.3
matplotlib 3.10.8
matplotlib-inline 0.2.1
mistune 3.1.2
mpmath 1.3.0
nbclient 0.10.2
nbconvert 7.16.6
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 3.6.1
notebook 7.5.0
notebook_shim 0.2.4
numpy 1.26.4
nvidia-cublas-cu12 12.8.4.1
nvidia-cuda-cupti-cu12 12.8.90
nvidia-cuda-nvrtc-cu12 12.8.93
nvidia-cuda-runtime-cu12 12.8.90
nvidia-cudnn-cu12 9.10.2.21
nvidia-cufft-cu12 11.3.3.83
nvidia-cufile-cu12 1.13.1.3
nvidia-curand-cu12 10.3.9.90
nvidia-cusolver-cu12 11.7.3.90
nvidia-cusparse-cu12 12.5.8.93
nvidia-cusparselt-cu12 0.7.1
nvidia-nccl-cu12 2.27.5
nvidia-nvjitlink-cu12 12.8.93
nvidia-nvshmem-cu12 3.4.5
nvidia-nvtx-cu12 12.8.90
opencv-python 4.11.0.86
packaging 26.0
pandas 3.0.0
pandocfilters 1.5.1
parso 0.8.5
pexpect 4.9.0
pillow 12.1.0
pip 25.3
platformdirs 4.5.0
portalocker 3.2.0
prometheus_client 0.21.1
prompt_toolkit 3.0.52
psutil 7.0.0
ptyprocess 0.7.0
pure_eval 0.2.3
pycocotools 2.0.11
pycparser 2.23
Pygments 2.19.2
pyparsing 3.3.2
PySocks 1.7.1
python-dateutil 2.9.0.post0
python-json-logger 4.0.0
PyYAML 6.0.3
pyzmq 27.1.0
qtconsole 5.7.0
QtPy 2.4.3
referencing 0.37.0
regex 2026.1.15
requests 2.32.5
rfc3339-validator 0.1.4
rfc3986-validator 0.1.1
rpds-py 0.28.0
safetensors 0.7.0
sam3 0.1.0 /root/autodl-tmp/sam3
scikit-image 0.26.0
scikit-learn 1.8.0
scipy 1.17.0
Send2Trash 1.8.3
setuptools 80.9.0
shellingham 1.5.4
six 1.17.0
sniffio 1.3.1
soupsieve 2.5
stack_data 0.6.3
sympy 1.14.0
terminado 0.18.1
threadpoolctl 3.6.0
tifffile 2026.1.14
timm 1.0.24
tinycss2 1.4.0
torch 2.10.0
torchvision 0.25.0
tornado 6.5.4
tqdm 4.67.1
traitlets 5.14.3
triton 3.6.0
typer-slim 0.21.1
typing_extensions 4.15.0
urllib3 2.6.3
wcwidth 0.3.5
webencodings 0.5.1
websocket-client 1.8.0
wheel 0.45.1
widgetsnbextension 4.0.14

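5) Download the weights

Access requests on the official Hugging Face repo are often denied. At least the rejection arrives within a few minutes, so it does not waste much time.

**sam3.pt link:** https://www.modelscope.cn/models/facebook/sam3/files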

5.0 Download a single file into a given local folder (example: downloading README.md into the "dir" directory under the current path)

pip install modelscope
modelscope download --model facebook/sam3 README.md --local_dir ./dir
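
The same command pattern can be pointed at the weights file. The sketch below only assembles the CLI call shown above with a different file name; it assumes the file in the ModelScope repo is named sam3.pt, and `modelscope_download_cmd` and `download` are hypothetical helper names.

```python
import subprocess


def modelscope_download_cmd(model: str, filename: str, local_dir: str) -> list:
    """Build the argument list for `modelscope download` of one file."""
    return ["modelscope", "download", "--model", model, filename, "--local_dir", local_dir]


def download(model: str, filename: str, local_dir: str) -> None:
    """Run the download; raises CalledProcessError if the CLI exits with an error."""
    subprocess.run(modelscope_download_cmd(model, filename, local_dir), check=True)


# Command for fetching the weights into the current directory
cmd = modelscope_download_cmd("facebook/sam3", "sam3.pt", ".")
print(" ".join(cmd))
```

Calling `download("facebook/sam3", "sam3.pt", ".")` from the project root would place the file where step 5.1 expects it.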

5.1 Put sam3.pt in the project root

For example:

sam3-main/
    sam3/
    sam3.pt    ← put it here; ls shows another sam3 directory at this level, do not cd into it
    ...
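
Since the easy mistake here is dropping the file one level too deep, a check like the one below can catch it early. This is a minimal sketch; `locate_checkpoint` is a hypothetical helper, not part of the sam3 package.

```python
from pathlib import Path


def locate_checkpoint(project_root, name: str = "sam3.pt") -> Path:
    """Return the checkpoint path if it sits directly in the project root."""
    root = Path(project_root)
    expected = root / name
    if expected.is_file():
        return expected
    # Common mistake: the file was placed inside the inner sam3/ package directory
    misplaced = root / "sam3" / name
    if misplaced.is_file():
        raise FileNotFoundError(
            f"{name} is inside the inner sam3/ directory; move it up to {root}"
        )
    raise FileNotFoundError(f"{name} not found in {root}")
```

Run `locate_checkpoint(".")` from the project root; it returns the path on success and explains what is wrong otherwise.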

5.2 Modify the source: force local loading

Open sam3/model_builder.py (path relative to the project root):

vim sam3/model_builder.py

Search for the following and change them:

  • load_from_hf = True → False
  • checkpoint_path = None → "sam3.pt"

Save.
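
If you would rather script the edit than make it by hand, the two replacements can be sketched with regular expressions. This assumes both settings appear in model_builder.py as simple `name = value` defaults (optionally with a type annotation); `force_local_loading` is a hypothetical helper, and it rewrites every matching occurrence in the text, so diff the result before keeping it.

```python
import re

# Optional ": <annotation>" between the parameter name and "="
_ANNOTATION = r"(?:\s*:\s*[^=,\n]+?)?"


def force_local_loading(source: str, checkpoint: str = "sam3.pt") -> str:
    """Return `source` with load_from_hf set to False and checkpoint_path set to a local file."""
    source = re.sub(
        rf"(\bload_from_hf{_ANNOTATION}\s*=\s*)True\b",
        lambda m: m.group(1) + "False",
        source,
    )
    source = re.sub(
        rf"(\bcheckpoint_path{_ANNOTATION}\s*=\s*)None\b",
        lambda m: f'{m.group(1)}"{checkpoint}"',
        source,
    )
    return source


# Demonstration on a made-up signature, not the real file contents
sample = "def build(checkpoint_path=None, load_from_hf=True): ..."
print(force_local_loading(sample))
```

To apply it, read sam3/model_builder.py with `Path.read_text()`, keep a backup copy, and write the returned string back with `Path.write_text()`.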


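6) opencv / numpy version conflict

The original tutorial does not mention this, but I ran into it in my environment:

  • sam3 requires numpy < 2
  • newer opencv-python releases (the 4.12/4.13 I had installed earlier) usually require numpy >= 2

Fix: downgrade opencv to 4.11, e.g. pip install opencv-python==4.11.0.86 (the version in the pip list above).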
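
The constraint described above can be checked against the installed versions with a few lines of standard-library code. This is a sketch under the assumptions stated in this section (numpy below 2, opencv-python at 4.11 or older); the helper names are made up here, and an opencv installed under another distribution name such as opencv-python-headless would not be detected.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def major_minor(ver: str) -> tuple:
    """Extract (major, minor) from a version string such as '4.11.0.86'."""
    match = re.match(r"(\d+)\.(\d+)", ver)
    if match is None:
        raise ValueError(f"unrecognized version string: {ver!r}")
    return int(match.group(1)), int(match.group(2))


def versions_compatible(numpy_ver: str, opencv_ver: str) -> bool:
    """True if numpy is 1.x and opencv-python is 4.11 or older."""
    return major_minor(numpy_ver)[0] < 2 and major_minor(opencv_ver) <= (4, 11)


def installed(name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


np_ver, cv_ver = installed("numpy"), installed("opencv-python")
if np_ver and cv_ver:
    print(f"numpy {np_ver}, opencv-python {cv_ver}: compatible =", versions_compatible(np_ver, cv_ver))
else:
    print("numpy or opencv-python is not installed in this environment")
```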


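7) Final test code (a hardened version of the tutorial's main.py)

The original tutorial shows the result in a popup window via plt.show(). AutoDL cannot display images interactively, so save the figure and open it in Jupyter instead.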

import os
import matplotlib.pyplot as plt
from PIL import Image

from sam3.model_builder import build_sam3_image_model
from sam3.model.sam3_image_processor import Sam3Processor
from sam3.visualization_utils import plot_results

# Load the model (from the local sam3.pt)
model = build_sam3_image_model()
processor = Sam3Processor(model)

# Load the test image
image = Image.open("assets/images/test_image.jpg").convert("RGB")

# Encode the full image
state = processor.set_image(image)

# Segment with a text prompt (example)
state = processor.set_text_prompt(state=state, prompt="person")

# Visualize
plot_results(image, state)

# 1) Show in a popup window (needs a display)
# plt.show()

# 2) Or save to a file (optional; use this on AutoDL)
out_path = "out.png"
plt.savefig(out_path, dpi=200, bbox_inches="tight", pad_inches=0)
plt.close()
print("Saved to:", os.path.abspath(out_path))

Run:

python main.py
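
If the same script has to run both on a desktop and on a headless AutoDL instance, the choice between plt.show() and plt.savefig() can be made at runtime. The sketch below uses a Linux-only heuristic (the DISPLAY and WAYLAND_DISPLAY environment variables); `has_display` and `output_mode` are hypothetical helpers, not part of matplotlib or sam3.

```python
import os


def has_display(env=None) -> bool:
    """Heuristic: a Linux session with X11 or Wayland sets one of these variables."""
    env = os.environ if env is None else env
    return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))


def output_mode(env=None) -> str:
    """Return 'show' when a window can be opened, otherwise 'save'."""
    return "show" if has_display(env) else "save"


print("Output mode:", output_mode())
```

In main.py this would mean calling plt.show() when `output_mode()` returns "show" and the plt.savefig() branch otherwise.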

Closing thoughts:

SAM3's instance segmentation is really strong (nod), and it runs fast on a 4090 (nod).