Re: How do I run the Diehard tests? - Discussion groups at eGospodarka.pl

Date: 2020-09-08 01:52:14
Subject: Re: How do I run the Diehard tests?
From: osobliwy nick <o...@g...com>

In the end, dieharder -h printed the options:

#=============================================================================#
# dieharder version 3.31.1 Copyright 2003 Robert G. Brown #
#=============================================================================#

Usage:

dieharder [-a] [-d dieharder test number] [-f filename] [-B]
[-D output flag [-D output flag] ... ] [-F] [-c separator]
[-g generator number or -1] [-h] [-k ks_flag] [-l]
[-L overlap] [-m multiply_p] [-n ntuple]
[-p number of p samples] [-P Xoff]
[-o filename] [-s seed strategy] [-S random number seed]
[-n ntuple] [-p number of p samples] [-o filename]
[-s seed strategy] [-S random number seed]
[-t number of test samples] [-v verbose flag]
[-W weak] [-X fail] [-Y Xtrategy]
[-x xvalue] [-y yvalue] [-z zvalue]

-a - runs all the tests with standard/default options to create a report
-d test number - selects specific diehard test.
-f filename - generators 201 or 202 permit either raw binary or
formatted ASCII numbers to be read in from a file for testing.
generator 200 reads in raw binary numbers from stdin.
Note well: many tests with default parameters require a lot of rands!
To see a sample of the (required) header for ASCII formatted input, run

dieharder -o -f example.input -t 10

and then examine the contents of example.input.
Raw binary input reads 32 bit increments of the specified data stream.
stdin_input_raw accepts a pipe from a raw binary stream.
-B binary output (used with -o)
-D output flag - permits fields to be selected for inclusion in dieharder
output. Each flag can be entered as a binary number that turns
on a specific output field or header or by flag name; flags are
aggregated. To see all currently known flags use the -F command.
-F - lists all known flags by name and number.
-c table separator - where separator is e.g. ',' (CSV) or ' ' (whitespace).
-g generator number - selects a specific generator for testing. Using
-1 causes all known generators to be printed out to the display.
-h prints context-sensitive help -- usually Usage (this message) or a
test synopsis if entered as e.g. dieharder -D 3 -h.
-k ks_flag - ks_flag

0 is fast but slightly sloppy for psamples > 4999 (default).

1 is MUCH slower but more accurate for larger numbers of psamples.

2 is very slow and accurate to machine precision.

3 is kuiper ks, fast, quite inaccurate for small samples, deprecated.

-l list all known tests.
-L overlap

1 (use overlap, default)

0 (don't use overlap)

in operm5 or other tests that support overlapping and non-overlapping
sample modes.
-m multiply_p - multiply default # of psamples in -a(ll) runs to crank
up the resolution of failure.
-n ntuple - set ntuple length for tests on short bit strings that permit
the length to be varied (e.g. rgb bitdist).
-o filename - output -t count random numbers from current generator to file.
-p count - sets the number of p-value samples per test (default 100).
-P Xoff - sets the number of psamples that will cumulate before deciding
that a generator is 'good' and really, truly passes even a -Y 2 T2D run.
Currently the default is 100000; eventually it will be set from
AES-derived T2D test failure thresholds for fully automated reliable
operation, but for now it is more a 'boredom' threshold set by how long
one might reasonably want to wait on any given test run.
-S seed - where seed is a uint. Overrides the default random seed
selection. Ignored for file or stdin input.
-s strategy - if strategy is the (default) 0, dieharder reseeds (or
rewinds) once at the beginning when the random number generator is
selected and then never again. If strategy is nonzero, the generator
is reseeded or rewound at the beginning of EACH TEST. If -S seed was
specified, or a file is used, this means every test is applied to the
same sequence (which is useful for validation and testing of dieharder,
but not a good way to test rngs). Otherwise a new random seed is
selected for each test.
-t count - sets the number of random entities used in each test, where
possible. Be warned -- some tests will take a long time with the
default value of 10000. Read the test synopses for suggested settings
for -t or use -a first. Many tests will ignore -t as they require
a very specific number of samples to be used in generating their
statistic.
-W weak - sets the 'weak' threshold to make the test(s) more or less
forgiving during e.g. a test-to-destruction run. Default is currently
0.005.
-X fail - sets the 'fail' threshold to make the test(s) more or less
forgiving during e.g. a test-to-destruction run. Default is currently
0.000001, which is basically 'certain failure of the null hypothesis',
the desired mode of reproducible generator failure.
-Y Xtrategy - the Xtrategy flag controls the new 'test to failure' (T2F)
modes. These flags and their modes act as follows:

0 - just run dieharder with the specified number of tsamples and
psamples, do not dynamically modify a run based on results. This is
the way it has always run, and is still the default.

1 - 'resolve ambiguity' (RA) mode. If a test returns 'weak', this is
an undesired result. What does that mean, after all? If you run a long
test series, you will see occasional weak returns for a perfect
generator because p is uniformly distributed and will appear in any
finite interval from time to time. Even if a test run returns more than
one weak result, you cannot be certain that the generator is failing.
RA mode adds psamples (usually in blocks of 100) until the
test result ends up solidly not weak or proceeds to unambiguous failure.
This is morally equivalent to running the test several times to see if a
weak result is reproducible, but eliminates the bias of personal
judgement in the process since the default failure threshold is very
small and very unlikely to be reached by random chance even in many
runs.

This option should only be used with -k 2.

2 - 'test to destruction' (T2D) mode. Sometimes you just want to know
where or if a generator will ever fail a test (or test series). -Y 2
causes psamples to be added 100 at a time until a test returns an
overall pvalue lower than the failure threshold or a specified maximum
number of psamples (see -P) is reached.

Note well! In this mode one may well fail due to the alternate
null hypothesis -- the test itself is a bad test and fails! Many
dieharder tests, despite our best efforts, are numerically unstable or
have only approximately known target statistics or are straight up
asymptotic results, and will eventually return a failing result even for
a gold-standard generator (such as AES), or for the hypercautious the
XOR generator with AES, threefish, kiss, all loaded at once and xor'd
together. It is therefore safest to use this mode comparatively,
executing a T2D run on AES to get an idea of the test failure
threshold(s) (something I will eventually do and publish on the web so
everybody doesn't have to do it independently) and then running it on
your target generator. Failure with numbers of psamples within an order
of magnitude of the AES thresholds should probably be considered
possible test failures, not generator failures. Failures at levels
significantly less than the known gold standard generator failure
thresholds are, of course, probably failures of the generator.

This option should only be used with -k 2.

-v verbose flag -- controls the verbosity of the output for debugging
only. Probably of little use to non-developers, and developers can
read the enum(s) in dieharder.h and the test sources to see which
flag values turn on output on which routines. 1 is 'all' and will
result in a highly detailed trace of program activity.

-x,-y,-z number - Some tests have parameters that can safely be varied
from their default value. For example, in the diehard birthdays test,
one can vary the number of 'dates' drawn from the 'year' of some
length, which can also be varied. -x 2048 -y 30 alters these two values
but should still run fine. These parameters should be documented
internally (where they exist) in the e.g. -d 0 -h visible notes.

NOTE WELL: The assessment(s) for the rngs may, in fact, be completely
incorrect or misleading. In particular, 'Weak' pvalues should occur
one test in a hundred, and 'Failed' pvalues should occur one test in
a thousand -- that's what p MEANS. Use them at your Own Risk! Be Warned!
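
The default thresholds quoted in the help (-W 0.005 and -X 0.000001) can be related to the NOTE WELL remark with a small sketch. It assumes a p-value is flagged when it lands in either tail, that is, below the threshold or above one minus the threshold. That two-sided rule is an assumption about how dieharder labels results, not something the help text states, and the names assess and tail_rate are made up for illustration.

```python
def assess(p, weak=0.005, fail=0.000001):
    """Label one p-value using the default -W / -X values from the help text.

    ASSUMPTION: thresholds are applied to both tails of [0, 1].
    """
    if p < fail or p > 1.0 - fail:
        return "FAILED"
    if p < weak or p > 1.0 - weak:
        return "WEAK"
    return "PASSED"


def tail_rate(threshold):
    """Chance that a uniformly distributed p-value falls in either tail."""
    return 2.0 * threshold


if __name__ == "__main__":
    for p in (0.5, 0.003, 0.9999999, 0.0000001):
        print(p, assess(p))
    print("weak rate for a perfect generator:", tail_rate(0.005))
    print("fail rate for a perfect generator:", tail_rate(0.000001))
```

Under that assumption a perfect generator is labelled weak in about one test in a hundred, which agrees with the note. The default fail threshold of 0.000001 is far stricter than the "one test in a thousand" the note mentions.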

I don't know whether everything ran the way it should.
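
One way to check the setup is to test a generator whose output you control. The sketch below writes raw 32-bit words to a file, which according to the help text can be read with -f through one of the file-input generators. The toy LCG, the function names and the file name lcg.bin are illustrative assumptions. Generator number 201 for raw binary files should be confirmed with dieharder -g -1 before relying on it.

```python
import os
import struct


def lcg(seed, count):
    """Hypothetical toy generator (a plain 32-bit LCG), used only as test input."""
    state = seed & 0xFFFFFFFF
    for _ in range(count):
        state = (1664525 * state + 1013904223) % 2**32
        yield state


def write_raw_words(path, words):
    """Write unsigned 32-bit integers as raw binary; return the file size in bytes."""
    with open(path, "wb") as f:
        for w in words:
            # "=I" packs one unsigned 32-bit value in native byte order
            f.write(struct.pack("=I", w & 0xFFFFFFFF))
    return os.path.getsize(path)


if __name__ == "__main__":
    size = write_raw_words("lcg.bin", lcg(seed=1, count=250_000))
    print("wrote", size, "bytes to lcg.bin")
    # Then, assuming 201 is the raw-file generator on your build:
    #   dieharder -a -g 201 -f lcg.bin
```

The help warns that many tests with default parameters need a lot of random numbers, so a real run would likely need a much larger count than the one used here.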