Limitations & Roadmap
The Tinker integration is under active development. This page documents current limitations.
Current Limitations
Multi-tenant LoRA: Megatron only
Multi-tenant LoRA training and sampling are supported only on the Megatron backend, with vLLM serving per-tenant adapters by name. See Multi-tenancy for the operator contract and SL/RL quickstarts. FSDP support is pending. Full-parameter fine-tuning remains single-tenant on both backends: calling create_model with lora_rank=0 while another model exists returns an error.
All adapters registered against one server must share the same (rank, alpha, target_modules) signature; mismatched signatures are hard-rejected at create_model.
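The shared-signature rule can be pictured with a minimal sketch. The `LoraSignature` and `AdapterRegistry` names below are hypothetical and exist only for illustration; they are not part of the Tinker or SkyRL API, and the real check inside create_model may be implemented differently.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class LoraSignature:
    """Hypothetical value object for the (rank, alpha, target_modules) triple."""
    rank: int
    alpha: float
    target_modules: FrozenSet[str]


class AdapterRegistry:
    """Hypothetical stand-in for the server's adapter bookkeeping."""

    def __init__(self) -> None:
        self._signature: Optional[LoraSignature] = None
        self._adapters: Dict[str, LoraSignature] = {}

    def register(self, name: str, sig: LoraSignature) -> None:
        # The first adapter fixes the signature for the whole server.
        if self._signature is None:
            self._signature = sig
        elif sig != self._signature:
            # Later adapters must match it exactly, otherwise they are rejected.
            raise ValueError(
                f"adapter {name!r} has signature {sig}, "
                f"but this server requires {self._signature}"
            )
        self._adapters[name] = sig
```

In this sketch, the first registered adapter sets the signature, and any later adapter with a different rank, alpha, or set of target modules raises instead of being accepted.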
No Prompt Logprobs
The sample() API does not yet return prompt logprobs, even when requested. A warning is logged but no error is raised. Because the request does not fail, scripts that rely on prompt logprobs for KL penalty computation can be affected without any visible error.
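Since the missing values surface only as a logged warning, a script that depends on them may want to fail fast on its own side. The helper below is a sketch under assumptions: `require_prompt_logprobs` is a hypothetical function, and it takes the prompt logprobs as a plain optional sequence because the exact field that carries them in the sample() result is not specified here.

```python
from typing import Optional, Sequence


def require_prompt_logprobs(
    prompt_logprobs: Optional[Sequence[Optional[float]]],
    prompt_len: int,
) -> Sequence[Optional[float]]:
    """Raise instead of silently continuing when prompt logprobs are absent.

    Individual entries may be None (for example, a first token with no
    preceding context), so only the sequence itself and its length are checked.
    """
    if prompt_logprobs is None:
        raise RuntimeError(
            "prompt logprobs were requested but not returned; "
            "downstream KL computation would be invalid"
        )
    if len(prompt_logprobs) != prompt_len:
        raise RuntimeError(
            f"expected {prompt_len} prompt logprobs, got {len(prompt_logprobs)}"
        )
    return prompt_logprobs
```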
KL Penalty
KL penalty (kl_penalty_coef > 0) is not yet supported. Supporting it requires two pieces: prompt logprobs from vLLM, which are not wired yet, and a way to serve frozen base model logprobs after weight sync. Cookbook recipes disable the KL penalty by default, so this is not a blocker for most workflows.
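To see why both sets of logprobs are needed, here is a minimal sketch of one common way to apply a per-token KL penalty: subtract the scaled log-ratio between the current policy and the frozen base model from each token's reward. The function name and the choice of estimator are assumptions for illustration; the cookbook recipes may compute the penalty differently.

```python
from typing import List, Sequence


def kl_penalized_rewards(
    rewards: Sequence[float],
    policy_logprobs: Sequence[float],
    base_logprobs: Sequence[float],
    kl_penalty_coef: float,
) -> List[float]:
    """Subtract kl_penalty_coef * (log pi(token) - log pi_base(token)) per token.

    policy_logprobs come from the model being trained; base_logprobs come from
    the frozen base model scoring the same tokens. Without the second set,
    the log-ratio cannot be formed.
    """
    if not (len(rewards) == len(policy_logprobs) == len(base_logprobs)):
        raise ValueError("rewards and both logprob sequences must have equal length")
    return [
        r - kl_penalty_coef * (lp - lb)
        for r, lp, lb in zip(rewards, policy_logprobs, base_logprobs)
    ]
```

With kl_penalty_coef set to 0, the rewards pass through unchanged, which matches the default configuration described above.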
RL Loss Functions
Only cross_entropy and importance_sampling are currently wired through the Tinker data conversion path. SkyRL's PolicyLossRegistry contains implementations for PPO (regular), cispo, and others, but these are not yet validated through the Tinker API.
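Until the other losses are validated, a training script can guard against selecting one by accident. The sketch below is client-side and hypothetical: `SUPPORTED_LOSS_FNS` and `check_loss_fn` are illustrative names, not part of the Tinker API, and the set simply mirrors the two loss names listed above.

```python
# Loss names currently wired through the Tinker data conversion path,
# as listed in this section. Update this set as more losses are validated.
SUPPORTED_LOSS_FNS = frozenset({"cross_entropy", "importance_sampling"})


def check_loss_fn(name: str) -> str:
    """Return the loss name unchanged if it is supported, else raise."""
    if name not in SUPPORTED_LOSS_FNS:
        raise ValueError(
            f"loss function {name!r} is not validated through the Tinker API; "
            f"choose one of {sorted(SUPPORTED_LOSS_FNS)}"
        )
    return name
```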