Embedding Freshness Decay Calculator

Model how document embedding relevance decays over time for time-sensitive RAG.

Vector similarity is blind to time. In domains where freshness matters (news, prices, policies, support docs), a stale document that happens to embed close to the query can outrank a newer, slightly less similar one. A freshness decay layer fixes that by multiplying each similarity score by a factor that shrinks with document age. This calculator lets you model that factor and see the re-ranked result before you wire it into your pipeline.

How it works

For each document you supply an age in days and a base similarity from your vector search. The tool computes a freshness factor with the model you choose and multiplies it into the score:

  • Exponential: factor = 0.5 ^ (age / half-life) — smooth, never zero.
  • Linear: factor = max(0, 1 − age / cutoff) — straight-line drop to zero at the cutoff.
  • Step: factor = 1 while age ≤ cutoff, otherwise a fixed penalty factor below 1.

It then re-sorts the documents by the adjusted score so you can compare the original and recency-aware rankings.
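
The three models and the re-sort can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function names, the Doc tuple, the sample documents, and the step penalty of 0.5 are all assumptions made for the example.

```python
from typing import NamedTuple


class Doc(NamedTuple):
    doc_id: str        # hypothetical identifier
    age_days: float    # document age in days
    similarity: float  # base similarity from the vector search


def exponential_factor(age: float, half_life: float) -> float:
    """0.5 ^ (age / half-life): halves every half-life, never reaches zero."""
    return 0.5 ** (age / half_life)


def linear_factor(age: float, cutoff: float) -> float:
    """max(0, 1 - age / cutoff): straight-line drop to zero at the cutoff."""
    return max(0.0, 1.0 - age / cutoff)


def step_factor(age: float, cutoff: float, penalty: float = 0.5) -> float:
    """1 while age <= cutoff, otherwise a fixed penalty (0.5 is an assumed value)."""
    return 1.0 if age <= cutoff else penalty


def rerank(docs, factor_fn):
    """Return (doc, adjusted_score) pairs sorted by adjusted score, highest first."""
    scored = [(d, d.similarity * factor_fn(d.age_days)) for d in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up sample data: an old close match and a recent near match.
docs = [
    Doc("old-close-match", age_days=90, similarity=0.90),
    Doc("new-near-match", age_days=3, similarity=0.82),
]

# Exponential decay with an assumed 30-day half-life.
ranked = rerank(docs, lambda age: exponential_factor(age, half_life=30))
for doc, score in ranked:
    print(f"{doc.doc_id}: {score:.3f}")
```

With these made-up inputs the 3-day-old document overtakes the 90-day-old one, whose score is cut to one eighth (0.90 × 0.125 = 0.1125) after three half-lives.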

Choosing a model and half-life

  • Match the half-life to how fast your domain goes stale: hours for breaking news, weeks for product docs, months for reference material.
  • Use step decay when freshness is effectively binary (inside vs outside a window).
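  • Keep base similarities on a consistent, non-negative scale (cosine 0–1 is typical) so the multiplier behaves predictably. A negative score multiplied by a shrinking factor moves toward zero, which would push older documents up the ranking.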
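
One way to turn "how fast does my domain go stale" into a number is to decide how much weight a document should keep at a given age, then solve the exponential formula for the half-life. Setting 0.5 ^ (age / half-life) = target gives half-life = age × ln(0.5) / ln(target). The helper below is a hypothetical sketch of that arithmetic and is not part of the calculator.

```python
import math


def half_life_for(age_days: float, target_factor: float) -> float:
    """Half-life (in days) at which a document of `age_days` keeps `target_factor`.

    Solves 0.5 ** (age_days / half_life) == target_factor for half_life.
    `target_factor` must be strictly between 0 and 1.
    """
    if not 0.0 < target_factor < 1.0:
        raise ValueError("target_factor must be between 0 and 1 (exclusive)")
    return age_days * math.log(0.5) / math.log(target_factor)


# Example: a 60-day-old document should keep a quarter of its score.
print(half_life_for(60, 0.25))
```

Keeping a quarter of the score means two halvings, so the example works out to a half-life of about 30 days.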

Tips

  • Apply decay as a post-retrieval re-rank, not inside the ANN search itself.
  • If fresh but obviously irrelevant documents start dominating the results, lengthen the half-life so similarity carries more weight.
  • Store each document’s timestamp as metadata so age is cheap to compute at query time.
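
As an illustration of the timestamp tip, the sketch below derives age in days from a stored timestamp at query time. The published_at field name and the ISO 8601 string format are assumptions; adapt them to whatever your vector store's metadata actually holds.

```python
from datetime import datetime, timezone
from typing import Optional


def age_in_days(published_at: str, now: Optional[datetime] = None) -> float:
    """Age in fractional days of a document, from its ISO 8601 timestamp.

    Timestamps without a UTC offset are assumed to be UTC.
    """
    published = datetime.fromisoformat(published_at)
    if published.tzinfo is None:
        published = published.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Clamp at zero so a future-dated document never gets a factor above 1.
    return max(0.0, (now - published).total_seconds() / 86400)


# Hypothetical metadata record and a fixed query time for a repeatable example.
meta = {"published_at": "2024-01-01T00:00:00+00:00"}
query_time = datetime(2024, 1, 31, tzinfo=timezone.utc)
print(age_in_days(meta["published_at"], now=query_time))  # 30.0
```

Passing the query time in explicitly, rather than reading the clock inside the function, keeps the re-rank reproducible in tests.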