What is Top-p (Nucleus) Sampling?
Top-p sampling restricts the model to the smallest set of most likely tokens whose cumulative probability reaches or exceeds p, then samples the next token from that set. osFoundry’s pipeline configs let you tune top-p per agent or per chat path.
Detail
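Top-p (also called nucleus sampling) controls output randomness and can be used instead of, or alongside, temperature. At top-p = 0.9, the model ranks candidate tokens by probability and only considers the most likely ones that together make up 90% of the probability mass, ignoring the long tail of unlikely tokens. The size of that set changes from step to step: a few tokens when the model is confident, many when it is not.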
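The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not osFoundry's or any provider's implementation: the function names `top_p_filter` and `sample_top_p` and the toy probabilities are made up for the example, and the sketch assumes the cutoff is "cumulative probability reaches p" (some implementations use a strict comparison instead).

```python
import random

def top_p_filter(probs, p):
    """Keep the smallest set of highest-probability tokens whose
    cumulative probability reaches p, then renormalize to sum to 1."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus = []
    cumulative = 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= p:  # assumption: ">=" cutoff; some libraries differ
            break
    total = sum(prob for _, prob in nucleus)
    return {token: prob / total for token, prob in nucleus}

def sample_top_p(probs, p, rng=random):
    """Draw one token from the renormalized nucleus."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = list(nucleus.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Toy next-token distribution (hypothetical numbers for illustration).
probs = {"the": 0.5, "a": 0.3, "an": 0.12, "this": 0.05, "zebra": 0.03}

nucleus = top_p_filter(probs, 0.9)
print(sorted(nucleus))           # tokens that survive the cutoff
print(sample_top_p(probs, 0.9))  # one sampled token from the nucleus
```

With these toy numbers, the three most likely tokens already cover 92% of the mass, so "this" and "zebra" can never be sampled at top-p = 0.9.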
In practice, a top-p of around 0.9 to 0.95 gives balanced output. A lower value (0.5) is more focused; a higher value (0.99) is more diverse. Many providers let you set top-p together with temperature.
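To see how these settings interact, the sketch below counts how many tokens survive the top-p cutoff at several values of p, after temperature has reshaped the distribution. It is only an illustration under stated assumptions: the logits are invented, `softmax` and `nucleus_size` are hypothetical helper names, and temperature is assumed to be applied before top-p, which is a common ordering but not one every provider documents.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities, dividing by temperature first."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nucleus_size(probs, p):
    """Count the tokens needed for cumulative probability to reach p."""
    cumulative = 0.0
    for count, prob in enumerate(sorted(probs, reverse=True), start=1):
        cumulative += prob
        if cumulative >= p:
            return count
    return len(probs)  # rounding kept the sum just under p: keep everything

# Hypothetical logits for six candidate tokens.
logits = [4.0, 3.0, 2.0, 1.0, 0.0, -1.0]

for temperature in (0.7, 1.0):
    probs = softmax(logits, temperature)
    sizes = {p: nucleus_size(probs, p) for p in (0.5, 0.9, 0.99)}
    print(f"temperature={temperature}: {sizes}")
```

Raising p widens the candidate set, while lowering temperature concentrates probability on the top tokens so that the same p admits fewer of them.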