-- On the standard library side:
The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.
`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.
`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
of values.
-- On the libfuzzer and abi side:
--- Uids
These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:
1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.
2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.
Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.
--- Mutations
At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.
Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.
For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.
--- Passive Minimization
A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.
The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.
Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.
-- Comparison to byte-based smith
A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).
-- Test updates
All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
This option never worked properly (it emitted wrongly-formatted code),
and it doesn't seem particularly *useful* -- someone who's proficient
enough with `std.Build` to not need explanations probably just wants to
write their own thing. Meanwhile, the use case of writing your own
`build.zig` was extremely poorly served, because `build.zig.zon` *needs*
to be generated programmatically for a correct `fingerprint`, but the
only ways to do that were to a) do it wrong and get an error, or b) get
the full init template and delete the vast majority of it. Both of these
were pretty clunky, and `-s` didn't really help.
So, replace this flag with a new one, `--minimal`/`-m`, which uses a
different template. This template is trivial enough that I opted to just
hardcode it into the compiler for simplicity. The main job of
`zig init -m` is to generate a correct `build.zig.zon` (if it is unable
to do this, it exits with a fatal error). In addition, it will *attempt*
to generate a tiny stub `build.zig`, with only an `std` import and an
empty `pub fn build`. However, if `build.zig` already exists, it will
avoid overwriting it, and doesn't even complain. This serves the use
case of writing `build.zig` manually and *then* running `zig init -m`
to generate an appropriate `build.zig.zon`.
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API
make std.testing.expectFmt work at compile-time
std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.
Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
- anytype -> *std.io.Writer
- inferred error set -> error{WriteFailed}
- options -> (deleted)
* std.fmt.Formatted
- now takes context type explicitly
- no fmt string
preparing to rearrange std.io namespace into an interface
how to upgrade:
std.io.getStdIn() -> std.fs.File.stdin()
std.io.getStdOut() -> std.fs.File.stdout()
std.io.getStdErr() -> std.fs.File.stderr()
and also rename `advancedPrint` to `bufferedPrint` in the zig init templates
These are left overs from my previous changes to zig init.
The new templating system removes LITNAME because the new restrictions on package names make it redundant with NAME, and the use of underscores for marking templated identifiers lets us template variable names while still keeping zig fmt happy.
This commit introduces a new flag to generate a new Zig project using
`zig init` without comments for users who are already familiar with the
Zig build system.
Additionally, the generated files are now different. Previously we would
generate a set of files that defined a static library and an executable,
which real-life experience has shown to cause confusion to newcomers.
The new template generates one Zig module and one executable both in
order to accommodate the two most common use cases, but also to suggest
that a library could use a CLI tool (e.g. a parser library could use a
CLI tool that provides syntax checking) and vice-versa a CLI tool might
want to expose its core functionality as a Zig module.
All references to C interoperability are removed from the template under
the assumption that if you're tall enough to do C interop, you're also
tall enough to find your way around the build system. Experienced users
will still be able to use the current template and adapt it with minimal
changes in order to perform more advanced operations. As an example, one
only needs to change `b.addExecutable` to `b.addLibrary` to switch from
generating an executable to a dynamic (or static) library.
mainly this addresses the following use case:
1. Someone creates a template with build.zig.zon, id field included
(note that zig init does not create this problem since it generates
fresh id every time it runs).
2. User A uses the template, changing package name to "example" but not
id field.
3. User B uses the same template, changing package name also to
"example", also not changing the id field.
Here, both packages have unintentional conflicting logical ids.
By making the field a combination of name checksum + random id, this
accident is avoided. "nonce" is an OK name for this.
Also relaxes errors on remote packages when using `zig fetch`.
Introduces the `id` field to `build.zig.zon`.
Together with name, this represents a globally unique package
identifier. This field should be initialized with a 16-bit random number
when the package is first created, and then *never change*. This allows
Zig to unambiguously detect when one package is an updated version of
another.
When forking a Zig project, this id should be regenerated with a new
random number if the upstream project is still maintained. Otherwise,
the fork is *hostile*, attempting to take control over the original
project's identity.
`0x0000` is invalid because it obviously means a random number wasn't
used.
`0xffff` is reserved to represent "naked" packages.
Tracking issue #14288
Additionally:
* Fix bad path in error messages regarding build.zig.zon file.
* Manifest validates that `name` and `version` field of build.zig.zon
are maximum 32 bytes.
* Introduce error for root package to not switch to enum literal for
name.
* Introduce error for root package to omit `id`.
* Update init template to generate `id`
* Update init template to populate `minimum_zig_version`.
* New package hash format changes:
- name and version limited to 32 bytes via error rather than truncation
- truncate sha256 to 192 bits rather than 40 bits
- include the package id
This means that, given only the package hashes for a complete dependency
tree, it is possible to perform version selection and know the final
size on disk, without doing any fetching whatsoever. This prevents
wasted bandwidth since package versions not selected do not need to be
fetched.
breaking change to the fuzz testing API; it now passes a type-safe
context parameter to the fuzz function.
libfuzzer is reworked to select inputs from the entire corpus.
I tested that it's roughly as good as it was before in that it can find
the panics in the simple examples, as well as achieve decent coverage on
the tokenizer fuzz test.
however I think the next step here will be figuring out why so many
points of interest are missing from the tokenizer in both Debug and
ReleaseSafe modes.
does not quite close#20803 yet since there are some more important
things to be done, such as opening the previous corpus, continuing
fuzzing after finding bugs, storing the length of the inputs, etc.
Acts as a replacement for `addSharedLibrary` and `addStaticLibrary`, but
linking mode can be changed more easily in build.zig, for example:
In library:
```zig
const linkage = b.option(std.builtin.LinkMode, "linkage", "Link mode for a foo_bar library") orelse .static; // or other default
const lib = b.addLibrary(.{
.linkage = linkage,
.name = "foo_bar",
.root_module = mod,
});
```
In consumer:
```zig
const dep_foo_bar = b.dependency("foo_bar", .{
.target = target,
.optimize = optimize,
.linkage = .static // or dynamic
});
mod.linkLibrary(dep_foor_bar.artifact("foo_bar"));
```
It also matches nicely with `linkLibrary` name.
Signed-off-by: Eric Joldasov <bratishkaerik@landless-city.net>
There's been some proliferation of dependency URLs that reference
mutable data such as links to git branches that can change. This has
resulted in broken projects, i.e.
* 9eef9de94c/build.zig.zon
* 4b64353e9c
There's also disagreement about whether it's fine for URL's to point to
git branches, i.e.
https://github.com/Not-Nik/raylib-zig/pull/130
This updates the docs to mention that zig won't be able to use URLs if
their content changes.
Clarify the usage of .paths in build.zig.zon. Follow the recommendation
of the comments to explicitly list paths by explicitly listing the paths
in the init project.
Instead of `zig init-lib` and `zig init-exe`, now there is only
`zig init`, which initializes any of the template files that do not
already exist, and makes a package that contains both an executable and
a static library. The idea is that the user can delete whatever they
don't want. In fact, I think even more things should be added to the
build.zig template.