get_rnd_text() selection distribution

While testing 'the("Capitalized Hallucinatory Monster")' I noticed that
some hallucinatory monsters showed up more often than others.  When
the random engravings, epitaphs, and bogus monsters were converted from
hard-coded arrays to data files accessed by random seek (3.6.0), they
became subject to the same distribution irregularities that rumors suffer
from.  The chance that an entry will be chosen depends upon the chance
that a random seek will hit somewhere in the line which precedes it, so
entries that follow long lines are more likely to be chosen and entries
that follow short lines are less likely.  We improved that for rumors
by having makedefs pad the shortest lines.  Distribution still isn't
uniform but is much better than it was (and could be further improved
with a longer padding length at the cost of making data files bigger so
possibly slower to access; both overall size and access speed mattered
back when floppy discs were supported but are probably irrelevant now).
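
To make the skew concrete, here is a minimal sketch of the selection
model described above.  It assumes a simplified picture of the lookup:
seek to a uniformly random byte, discard the rest of that line, return
the next line, and wrap to the first entry after the last one.  The
name pick_chance() is hypothetical and is not part of the nethack
sources; the real code may treat headers and end-of-file differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the nethack sources): chance that entry
 * 'idx' is returned when a seek lands on a uniformly random byte and the
 * line *following* the one that was hit is used.  Assumes every line ends
 * with a single newline and that a hit on the last line wraps around to
 * the first entry.
 */
static double
pick_chance(const char *const *lines, int n, int idx)
{
    size_t total = 0;
    int i, prev;

    assert(n > 0 && idx >= 0 && idx < n);
    for (i = 0; i < n; i++)
        total += strlen(lines[i]) + 1; /* +1 for the newline */

    /* entry 'idx' is chosen when the seek hits the line before it */
    prev = (idx + n - 1) % n;
    return (double) (strlen(lines[prev]) + 1) / (double) total;
}
```

Under this model an entry's chance depends only on the length of its
predecessor, never on its own length, which is why lengthening the
shortest lines evens things out.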

Start doing the same thing for the newer files:  pad the shortest lines
to increase the chance that seek will find them.  The tradeoff is that
the data files become bigger.  Rumor, engraving, and epitaph lines
are all at least 60 characters now; bogus monster lines are at least 20.
These are the data file sizes I see (in bytes:  old, new; padding for
rumors was already in use so its size hasn't changed):
  bogusmon    4449    7211
  engrave     1326    2894
  epitaph    14159   24075
  rumors     49173   49173
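
The padding step itself can be sketched as below.  This is only an
illustration, not the makedefs code: pad_line() and unpad_line() are
hypothetical names, and the pad character is passed in by the caller
because the character actually used is not stated here.  The minimum
lengths of 60 and 20 are the ones given above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: extend 'line' (without its newline) to at least
 * 'minlen' characters by appending 'padch'.  Returns the resulting length.
 * 'bufsz' is the size of the buffer holding 'line'.
 */
static size_t
pad_line(char *line, size_t bufsz, size_t minlen, char padch)
{
    size_t len = strlen(line);

    assert(minlen < bufsz); /* room for padding plus terminator */
    while (len < minlen)
        line[len++] = padch;
    line[len] = '\0';
    return len;
}

/*
 * Hypothetical inverse, for the reading side: drop trailing pad
 * characters so that the padding never shows up in-game.  Note that
 * this would also strip a genuine trailing 'padch' from an entry.
 */
static size_t
unpad_line(char *line, char padch)
{
    size_t len = strlen(line);

    while (len > 0 && line[len - 1] == padch)
        len--;
    line[len] = '\0';
    return len;
}
```

For example, padding a 4-character bogus monster name to 20 characters
adds 16 bytes to its line; summed over every short entry, that kind of
growth is what the size table reflects.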

The only place that padding is noticeable in-game is #wizrumorcheck.
PatR
2021-11-25 17:57:37 -08:00
parent 3f3d1ad85c
commit 59beebcbcc
3 changed files with 107 additions and 55 deletions


@@ -695,6 +695,12 @@ assigning a fruit name that matches the name of an artifact which doesn't use
any "the" prefix could yield messages showing "the Artifact" when
dealing with the artifact rather than fruit: "You are blasted by _the_
Excalibur's power!"; didn't impact basic inventory formatting
selection of random engravings, epitaphs, and hallucinatory monster names had
the same problem that rumor selection used to have: entries which
follow longer than average lines are most likely to be chosen and
ones which follow shorter than average lines are least likely; use
same workaround as for rumors: pad the shortest lines; result isn't
uniform distribution but is better (tradeoff vs size; see makedefs)
Fixes to 3.7.0-x Problems that Were Exposed Via git Repository