Commit graph

199 commits

Author SHA1 Message Date
Kendall Condon
5d58306162 rework fuzz testing to be smith based
-- On the standard library side:

The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.

`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
  guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.

`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
  possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
  of values.

-- On the libfuzzer and abi side:

--- Uids

These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:

1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.

2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.

Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.

--- Mutations

At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.

Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.

For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.

--- Passive Minimization

A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.

The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.

Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.

-- Comparison to byte-based smith

A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).

-- Test updates

All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
2026-02-13 22:12:19 -05:00
murtaza
b5770541bd testing: ability to read environment variables from unit tests 2026-01-17 00:40:22 +01:00
Andrew Kelley
790d28d6cd std.testing: delete refAllDeclsRecursive
suggested alternatives:
- actual tests
- explicitly list the decls
- compile an example application that uses the API
- stop worrying about dead code
- refAllDecls (non recursive) in each file

Don't fight the laziness, embrace it.

closes #23608
closes #30813
2026-01-16 12:34:42 -08:00
Justus Klausecker
6e35138901
all: prefer else => |e| return e, over else => return err,
When switching on an error, using the captured value instead of the original
one is always preferable since its error set has been narrowed to only
contain errors which haven't already been handled by other switch prongs.

The subsequent commits will disallow this form as an unreachable `else` prong.
2026-01-11 11:37:17 +00:00
David Rubin
aa2b178029
disallow switch case capture discards
Previously Zig allowed you to write something like,
```zig
switch (x) {
    .y => |_| {
```

This seems a bit strange because in other cases, such as when
capturing the tag in a switch case,
```zig
switch (x) {
    .y => |_, _| {
```
this produces an error.

The only usecase I can think of for the previous behaviour is
if you wanted to assert that all union payloads are able
to coerce,
```zig
const X = union(enum) { y: u8, z: f32 };

switch (x) {
    .y, .z => |_| {
```

This will compile-error with the `|_|` and pass without it.

I don't believe this usecase is strong enough to keep the current
behaviour; it was never used in the Zig codebase and I cannot
find a single usage of this behaviour in the real world, searching
through Sourcegraph.
2026-01-11 11:37:16 +00:00
Andrew Kelley
7248b4a4e4 std.fs: deprecate base64 APIs
100% of std.fs is now deprecated.
2026-01-07 17:33:06 -08:00
Andrew Kelley
1f1381a866 update API usage of std.crypto.random to io.random 2026-01-07 11:03:36 -08:00
Andrew Kelley
eb0d5b1377 std.testing: use debug Io instance in expectEqualSlices 2025-12-26 19:58:56 -08:00
Andrew Kelley
28639bd6d7 std: prevent testing.io from use outside tests 2025-12-26 19:58:56 -08:00
Andrew Kelley
fa79d34674 std: add changing cur dir back
There's a good argument to not have this in the std lib but it's more
work to remove it than to leave it in, and this branch is already
20,000+ lines changed.
2025-12-23 22:15:12 -08:00
Andrew Kelley
a8088306f6 std: rename other Dir "make" functions to "create" 2025-12-23 22:15:11 -08:00
Andrew Kelley
54865e0483 compiler: fix compilation when linking libc 2025-12-23 22:15:10 -08:00
Andrew Kelley
608145c2f0 fix more fallout from locking stderr 2025-12-23 22:15:10 -08:00
Andrew Kelley
4458e423bf link.MappedFile: update statx usage 2025-12-23 22:15:09 -08:00
Andrew Kelley
1925e0319f update lockStderrWriter sites
use the application's Io implementation where possible. This correctly
makes writing to stderr cancelable, fallible, and participate in the
application's event loop. It also removes one more hard-coded
dependency on a secondary Io implementation.
2025-12-23 22:15:09 -08:00
Andrew Kelley
b042e93522 std: update tty config references in the build system 2025-12-23 22:15:09 -08:00
Andrew Kelley
03526c59d4 std.debug: fix printLineFromFile 2025-12-23 22:15:09 -08:00
Andrew Kelley
ebdbbd20ac update makeDir() sites to specify permissions 2025-12-23 22:15:08 -08:00
Andrew Kelley
f53248a409 update all std.fs.cwd() to std.Io.Dir.cwd() 2025-12-23 22:15:08 -08:00
Andrew Kelley
dd1d15b72a update all occurrences of std.fs.Dir to std.Io.Dir 2025-12-23 22:15:08 -08:00
Andrew Kelley
aafddc2ea1 update all occurrences of close() to close(io) 2025-12-23 22:15:07 -08:00
Alex Rønne Petersen
aa0249d74e Merge pull request 'std.ascii: rename indexOf functions to find' (#30101) from adria/zig:indexof-find into master
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30101
Reviewed-by: Andrew Kelley <andrewrk@noreply.codeberg.org>
Reviewed-by: mlugg <mlugg@noreply.codeberg.org>
2025-12-22 12:50:46 +01:00
Adrià Arrufat
02c5f05e2f std: replace usages of std.mem.indexOf with std.mem.find 2025-12-05 14:31:27 +01:00
Linus Groh
39fa831947 std: Remove a handful of things deprecated during the 0.15 release cycle
- std.Build.Step.Compile.root_module mutators -> std.Build.Module
- std.Build.Step.Compile.want_lto -> std.Build.Step.Compile.lto
- std.Build.Step.ConfigHeader.getOutput -> std.Build.Step.ConfigHeader.getOutputFile
- std.Build.Step.Run.max_stdio_size -> std.Build.Step.Run.stdio_limit
- std.enums.nameCast -> @field(E, tag_name) / @field(E, @tagName(tag))
- std.Io.tty.detectConfig -> std.Io.tty.Config.detect
- std.mem.trimLeft -> std.mem.trimStart
- std.mem.trimRight -> std.mem.trimEnd
- std.meta.intToEnum -> std.enums.fromInt
- std.meta.TagPayload -> @FieldType(U, @tagName(tag))
- std.meta.TagPayloadByName -> @FieldType(U, tag_name)
2025-11-27 20:17:04 +00:00
Nir Lahad
14ba3bd9a1
std.testing: Fix expectEqualDeep formatted enum (#25960) 2025-11-25 05:39:07 -08:00
Matthew Lugg
74931fe25c
std.debug.lockStderrWriter: also return ttyconf
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving
syscalls), and doing it every time we need to print isn't really
necessary; under normal usage, we can compute the value once and cache
it for the whole program's execution. Since anyone outputting to stderr
may reasonably want this information (in fact they are very likely to),
it makes sense to cache it and return it from `lockStderrWriter`. Call
sites who do not need it will experience no significant overhead, and
can just ignore the TTY config with a `const w, _` destructure.
2025-10-30 09:31:28 +00:00
Andrew Kelley
10b1eef2d3 std: fix compilation errors on Windows 2025-10-29 06:20:50 -07:00
Andrew Kelley
885b3f8342 Io.net: finish implementing IPv6 parsing 2025-10-29 06:20:48 -07:00
Andrew Kelley
d801a71d29 add std.testing.io 2025-10-29 06:20:48 -07:00
Ryan Liptak
328ae41468 Reader.peekDelimiterInclusive: Fix handling of stream implementations that return 0
Previously, the logic in peekDelimiterInclusive (when the delimiter was not found in the existing buffer) used the `n` returned from `r.vtable.stream` as the length of the slice to check, but it's valid for `vtable.stream` implementations to return 0 if they wrote to the buffer instead of `w`. In that scenario, the `indexOfScalarPos` would be given a 0-length slice so it would never be able to find the delimiter.

This commit changes the logic to assume that `r.vtable.stream` can both:
- return 0, and
- modify seek/end (i.e. it's also valid for a `vtable.stream` implementation to rebase)

Also introduces `std.testing.ReaderIndirect` which helps in being able to test against Reader implementations that return 0 from `stream`/`readVec`

Fixes #25428
2025-10-08 16:42:55 -07:00
Andrew Kelley
5ec0a7d8a5 coerce vectors to arrays rather than inline for 2025-09-20 18:33:00 -07:00
Andrew Kelley
426af68b7d compiler: require comptime vector indexes 2025-09-20 18:33:00 -07:00
Andrew Kelley
79f267f6b9 std.Io: delete GenericReader
and delete deprecated alias std.io
2025-08-29 17:14:26 -07:00
Jacob Young
5060ab99c9 aarch64: add new from scratch self-hosted backend 2025-07-22 19:43:47 -07:00
Andrew Kelley
86699acbb9 std.Io.Reader: update OneByteReader usage to std.testing.Reader 2025-07-17 09:26:31 -07:00
Andrew Kelley
c072cf2bb8 std.io.Reader.peekDelimiterInclusive: simplify and fix 2025-07-09 17:18:23 -07:00
Andrew Kelley
f4720e1407 std.testing: update to new std.io API 2025-07-07 22:43:53 -07:00
Andrew Kelley
0e37ff0d59 std.fmt: breaking API changes
added adapter to AnyWriter and GenericWriter to help bridge the gap
between old and new API

make std.testing.expectFmt work at compile-time

std.fmt no longer has a dependency on std.unicode. Formatted printing
was never properly unicode-aware. Now it no longer pretends to be.

Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
  - anytype -> *std.io.Writer
  - inferred error set -> error{WriteFailed}
  - options -> (deleted)
* std.fmt.Formatted
  - now takes context type explicitly
  - no fmt string
2025-07-07 22:43:51 -07:00
Andrew Kelley
0b3f0124dc std.io: move getStdIn, getStdOut, getStdErr functions to fs.File
preparing to rearrange std.io namespace into an interface

how to upgrade:

std.io.getStdIn() -> std.fs.File.stdin()
std.io.getStdOut() -> std.fs.File.stdout()
std.io.getStdErr() -> std.fs.File.stderr()
2025-07-07 22:43:51 -07:00
Ali Cheraghi
24bfefa75e std.mem.byteSwapAllFields: support untagged unions 2025-06-23 05:57:56 +02:00
Ali Cheraghi
872f68c9cb
rename spirv backend name
`stage2_spirv64` -> `stage2_spirv`
2025-06-16 13:22:19 +03:30
Alex Rønne Petersen
999777e73a compiler: Scaffold stage2_powerpc backend.
Nothing interesting here; literally just the bare minimum so I can work on this
on and off in a branch without worrying about merge conflicts in the non-backend
code.
2025-05-20 10:23:16 +02:00
Mark Rushakoff
86064e66d6 std.testing: improve compile error on untagged union equality 2025-02-16 15:51:40 +01:00
Benjamin Thompson
5ab5113077
added expectEqualDeep test coverage for issue 16625 (#22781) 2025-02-15 03:41:58 +01:00
Andrew Kelley
d789f1e5cf fuzzer: write inputs to shared memory before running
breaking change to the fuzz testing API; it now passes a type-safe
context parameter to the fuzz function.

libfuzzer is reworked to select inputs from the entire corpus.

I tested that it's roughly as good as it was before in that it can find
the panics in the simple examples, as well as achieve decent coverage on
the tokenizer fuzz test.

however I think the next step here will be figuring out why so many
points of interest are missing from the tokenizer in both Debug and
ReleaseSafe modes.

does not quite close #20803 yet since there are some more important
things to be done, such as opening the previous corpus, continuing
fuzzing after finding bugs, storing the length of the inputs, etc.
2025-02-11 13:39:20 -08:00
Andrew Kelley
ff8e759705 std.testing: don't ask wasm to stack trace 2025-02-06 14:46:16 -08:00
Andrew Kelley
f82ec3f02a std.testing.allocator: different canary + enable resize traces
Accept a slight performance degradation when unit testing for better
debuggability when a leak or double-free is detected.
2025-02-06 14:23:23 -08:00
Andrew Kelley
7320e8b3cd std.testing: make some things not pub
this looks like it was an accident to expose these
2025-02-06 14:23:23 -08:00
Andrew Kelley
a0b2a18648 std.testing.FailingAllocator: flatten namespace 2025-02-06 14:23:23 -08:00
ThisPC
e528ab4709 std.testing.expectEqual: {any} in print and move tests 2025-01-29 09:19:07 +01:00