Pull requests: NVIDIA/TensorRT-LLM
Pull requests list
[https://nvbugs/6330273][fix] In StorageManager.__init__, when typical_batch is supplied, append a synthetic…
#15465
opened Jun 17, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6329052][fix] Add attn_backend: FLASHINFER and model_kwargs: {num_hidden_layers: 4} to…
#15464
opened Jun 17, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
Reserve KV cache slots for concurrent decode in V2
#15462
opened Jun 17, 2026 by
Kevin-Li-2025
[None][fix] Fix encoder-decoder beam search corruption via per-slot fragmentPointerDevice
#15461
opened Jun 17, 2026 by
achartier
Collaborator
1 task done
[None][fix] Fix stale sparse attention kwargs
#15460
opened Jun 17, 2026 by
bobboli
Collaborator
[None][fix] AutoDeploy: handle torch dist all_gather in multi_stream MLA transform
#15456
opened Jun 17, 2026 by
MrGeva
Collaborator
1 task
[None][infra] Tail Slurm job logs when job is no longer active.
#15455
opened Jun 17, 2026 by
mzweilz
Collaborator
1 task
[None][fix] Fix cross KV sparse attention config kwarg
#15454
opened Jun 17, 2026 by
yuxianq
Collaborator
1 task done
[https://nvbugs/6271740][test] Update llm_perf_core.yml to include new performance test for DeepSeek R1 0528 FP4 model
#15453
opened Jun 17, 2026 by
yufeiwu-nv
Collaborator
1 task done
[None][feat] Add conversation params API
deepseek-v4
#15452
opened Jun 17, 2026 by
jiaganc
Collaborator
1 task done
[TRTLLM-13334][fix] Add EP assertion to DenseGEMMFusedMoE
#15451
opened Jun 17, 2026 by
guqiqi
[None][perf] disagg: optionally drop message history from generation request
#15448
opened Jun 17, 2026 by
lancelly
Collaborator
[None][fix] Fix encoder-decoder beam search corruption (cross-KV sharing, mixed-beam exception, gather_generation_logits)
#15444
opened Jun 17, 2026 by
achartier
Collaborator
1 task done
[None][test] Un-waive K2.5 Thinking FP4 disagg-NIXL e2e/gen_only tests
#15443
opened Jun 17, 2026 by
chenfeiz0326
Collaborator
2 tasks done
[https://nvbugs/6335137][fix] Fix VSWA/Hybrid model crash with MAX_UTILIZATION
#15441
opened Jun 17, 2026 by
VALLIS-NERIA
Collaborator
1 task done
[None][fix] Stabilize perf-sanity tests
#15440
opened Jun 17, 2026 by
chenfeiz0326
Collaborator
3 tasks done
[None][test] add GLM nvfp4 stress test
#15437
opened Jun 17, 2026 by
xinhe-nv
Collaborator
1 task done
DO NOT REVIEW [None][chore] Test MXFP4 MegaMoE integration
#15436
opened Jun 17, 2026 by
longlee0622
Collaborator
Draft
1 task
[None][fix] bound weight-load CPU parallelism by node topology
#15435
opened Jun 17, 2026 by
KyleShao1016
[#15344][fix] Add SM121 to MLA allowlists
#15434
opened Jun 17, 2026 by
peter941221
1 task done
[None][feat] VisualGen: per-layer mixed-precision quantization (per_layer YAML key)
#15433
opened Jun 17, 2026 by
wu6u3tw
Contributor
3 tasks done
[TRTLLM-13250][feat] Enable MX post-transform Llama receiver
#15432
opened Jun 16, 2026 by
chienchunhung
Collaborator
Draft
[None][feat] Parallelize host KV cache pool prefault and add THP control
#15431
opened Jun 16, 2026 by
nafis271