Commit graph

592 commits

Author SHA1 Message Date
Andrew Kelley
e0173c2ce0 Merge pull request 'rework fuzz testing to be smith based' (#31205) from gooncreeper/zig:integrated-smith into master
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31205
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
2026-02-25 20:23:36 +01:00
Justus Klausecker
a3a9dc111d std.heap.ArenaAllocator: make it threadsafe
Modifies the `Allocator` implementation provided by `ArenaAllocator` to be
threadsafe using only atomics and no synchronization primitives locked
behind an `Io` implementation.

At its core this is a lock-free singly linked list which uses CAS loops to
exchange the head node. A nice property of `ArenaAllocator` is that the
only functions that can ever remove nodes from its linked list are `reset`
and `deinit`, both of which are not part of the `Allocator` interface and
thus aren't threadsafe, so node-related ABA problems are impossible.

There *are* some trade-offs: end index tracking is now per node instead of
per allocator instance. It's not possible to publish a head node and its
end index at the same time if the latter isn't part of the former.

Another compromise had to be made in regards to resizing existing nodes.
Annoyingly, `rawResize` of an arbitrary thread-safe child allocator can
of course never be guaranteed to be an atomic operation, so only one
`alloc` call can ever resize at the same time, other threads have to
consider any resizes they attempt during that time failed. This causes
slightly less optimal behavior than what could be achieved with a mutex.
The LSB of `Node.size` is used to signal that a node is being resized.
This means that all nodes have to have an even size.

Calls to `alloc` have to allocate new nodes optimistically as they can
only know whether any CAS on a head node will succeed after attempting it,
and to attempt the CAS they of course already need to know the address of
the freshly allocated node they are trying to make the new head.
The simplest solution to this would be to just free the new node again if
a CAS fails, however this can be expensive and would mean that in practice
arenas could only really be used with a GPA as their child allocator. To
work around this, this implementation keeps its own free list of nodes
which didn't make their CAS to be reused by a later `alloc` invocation.
To keep things simple and avoid ABA problems the free list is only ever
be accessed beyond its head by 'stealing' the head node (and thus the
entire list) with an atomic swap. This makes iteration and removal trivial
since there's only ever one thread doing it at a time which also owns all
nodes it's holding. When the thread is done it can just push its list onto
the free list again.

This implementation offers comparable performance to the previous one when
only being accessed by a single thread and a slight speedup compared to
the previous implementation wrapped into a `ThreadSafeAllocator` up to ~7
threads performing operations on it concurrently.
(measured on a base model MacBook Pro M1)
2026-02-25 19:12:35 +01:00
Kendall Condon
5d58306162 rework fuzz testing to be smith based
-- On the standard library side:

The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.

`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
  guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.

`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
  possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
  of values.

-- On the libfuzzer and abi side:

--- Uids

These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:

1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.

2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.

Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.

--- Mutations

At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.

Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.

For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.

--- Passive Minimization

A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.

The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.

Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.

-- Comparison to byte-based smith

A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).

-- Test updates

All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
2026-02-13 22:12:19 -05:00
Jacob Young
2fa1a78491 Io.Dispatch: introduce grand central dispatch io impl 2026-02-13 12:29:40 -05:00
Jacob Young
b5bd494606 std.Threaded: replace more kernel32 functions with ntdll 2026-02-07 00:02:50 -05:00
Andrew Kelley
550da1b676 std: migrate remaining sync primitives to Io
- delete std.Thread.Futex
- delete std.Thread.Mutex
- delete std.Thread.Semaphore
- delete std.Thread.Condition
- delete std.Thread.RwLock
- delete std.once

std.Thread.Mutex.Recursive remains... for now. it will be replaced with
a special purpose mechanism used only by panic logic.

std.Io.Threaded exposes mutexLock and mutexUnlock for the advanced case
when you need to call them directly.
2026-02-02 18:57:17 -08:00
Alex Rønne Petersen
2c7d3c8007
std.debug: use debug_io for the futex in waitForOtherThreadToFinishPanicking 2026-01-27 05:37:01 +01:00
Matthew Lugg
85cac9e5b6 std: use sigaltstack for default segfault handler
This allows stack overflows to print stack traces. The size of the
sigaltstack (and whether it is actually set) can be configured by
setting `std.Options.signal_stack_size`.

The default value for the signal stack size was chosen experimentally by
doubling the value required to get stack traces on stack overflow with
the self-hosted x86_64 backend. While some targets may typically use
more stack space than x86_64-linux, the self-hosted x86_64 backend is
quite wasteful with stack at the moment, making it a fair benchmark.
Executables produced by the LLVM backend should have lower stack usage.
2026-01-13 07:24:49 +01:00
Matthew Lugg
1111655131
std: block cancelation in default panic and segfault handlers
It doesn't make any sense for a task to be canceled while it's
panicking.

As a happy accident, this also solves some cases where safety panics in
`Io.Threaded` would cause stack traces not to print due to invalid
thread-local state: when cancelation is blocked, `Io.Threaded` doesn't
consult said thread-local state at all. For instance, try inserting a
panic just after a call to `Syscall.start()` in `Io.Threaded`, and then
call the `Io` function in question from a `concurrent` task. Before this
PR, the stack trace fails to print, because the panic handler sees the
thread-local cancelation state in an unexpected state, leading to a
recursive panic. After this PR, the stack trace prints fine.
2026-01-06 10:50:45 +00:00
Andrew Kelley
d6a1e73142 std: start wrangling environment variables and process args
this commit is unfinished. It marks a spot where I wanted to start
moving child process stuff below the std.Io.VTable
2026-01-04 00:27:07 -08:00
Andrew Kelley
7c1236e267 std: different way of doing some options
to avoid dependency loops
2025-12-23 22:15:12 -08:00
Andrew Kelley
0992f1204e std.debug: delete nosuspend blocks
now that the application can choose an Io implementation these might
actually suspend.
2025-12-23 22:15:12 -08:00
Andrew Kelley
78c4fcfcd8 std.debug.lockStderr: cancel protection rather than recancel
because we need to return the value
2025-12-23 22:15:12 -08:00
Andrew Kelley
3c2f5adf41 std: integrate Io.Threaded with environment variables
* std.option allows overriding the debug Io instance
* if the default is used, start code initializes environ and argv0

also fix some places that needed recancel(), thanks mlugg!

See #30562
2025-12-23 22:15:12 -08:00
Andrew Kelley
fd0c324cb0 std.debug: fix simple_panic 2025-12-23 22:15:11 -08:00
Andrew Kelley
77d2ad8c92 std: consolidate all instances of std.Io.Threaded into a singleton
It's better to avoid references to this global variable, but, in the
cases where it's needed, such as in std.debug.print and collecting stack
traces, better to share the same instance.
2025-12-23 22:15:11 -08:00
Andrew Kelley
1381f9f612 std.debug: fix printLineFromFile
by using streamDelimiter and discardDelimiter functions that don't
depend on the buffer size being large enough
2025-12-23 22:15:11 -08:00
Andrew Kelley
7ce5ee2e92 std: update remaining unit tests for std.Io API changes 2025-12-23 22:15:10 -08:00
Andrew Kelley
21d0264c61 std.dynamic_library: use a global static single threaded Io
See #30150
2025-12-23 22:15:10 -08:00
Andrew Kelley
608145c2f0 fix more fallout from locking stderr 2025-12-23 22:15:10 -08:00
Andrew Kelley
aa57793b68 std: rework locking stderr 2025-12-23 22:15:09 -08:00
Andrew Kelley
b042e93522 std: update tty config references in the build system 2025-12-23 22:15:09 -08:00
Andrew Kelley
95b0399d1b std: finish implementing futexWait with timer 2025-12-23 22:15:09 -08:00
Andrew Kelley
ffcbd48a12 std: rework TTY detection and printing
This commit sketches an idea for how to deal with detection of file
streams as being terminals.

When a File stream is a terminal, writes through the stream should have
their escapes stripped unless the programmer explicitly enables terminal
escapes. Furthermore, the programmer needs a convenient API for
intentionally outputting escapes into the stream. In particular it
should be possible to set colors that are silently discarded when the
stream is not a terminal.

This commit makes `Io.File.Writer` track the terminal mode in the
already-existing `mode` field, making it the appropriate place to
implement escape stripping.

`Io.lockStderrWriter` returns a `*Io.File.Writer` with terminal
detection already done by default. This is a higher-level application
layer stream for writing to stderr.

Meanwhile, `std.debug.lockStderrWriter` also returns a `*Io.File.Writer`
but a lower-level one that is hard-coded to use a static single-threaded
`std.Io.Threaded` instance. This is the same instance that is used for
collecting debug information and iterating the unwind info.
2025-12-23 22:15:09 -08:00
Andrew Kelley
78d262d96e std: WIP: debug-level stderr writing 2025-12-23 22:15:09 -08:00
Andrew Kelley
03526c59d4 std.debug: fix printLineFromFile 2025-12-23 22:15:09 -08:00
Andrew Kelley
90f7259ef1 std.Progress: use a global static Io instance
This decision should be audited and discussed.

Some factors:
* Passing an Io instance into start.
* Avoiding reference to global static instance if it won't be used, so
  that it doesn't bloat the executable.
* Being able to use std.debug.print, and related functionality when
  debugging std.Io instances and std.Progress.
2025-12-23 22:15:08 -08:00
Andrew Kelley
bee8005fe6 std.heap.DebugAllocator: never detect TTY config
instead, allow the user to set it as a field.

this fixes a bug where leak printing and error printing would run tty
config detection for stderr, and then emit a log, which is not necessary
going to print to stderr.

however, the nice defaults are gone; the user must explicitly assign the
tty_config field during initialization or else the logging will not have
color.

related: https://github.com/ziglang/zig/issues/24510
2025-12-23 22:15:08 -08:00
Andrew Kelley
4218344dd3 std.Build.Cache: remove readSmallFile and writeSmallFile
These were to support optimizations involving detecting when to avoid
calling into LLD, which are no longer implemented.
2025-12-23 22:15:08 -08:00
Andrew Kelley
950d18ef69 update all access() to access(io) 2025-12-23 22:15:08 -08:00
Andrew Kelley
314c906dba std.debug: simplify printLineFromFile 2025-12-23 22:15:08 -08:00
Andrew Kelley
9ccd68de0b std: move abort and exit from posix into process
and delete the unit tests that called fork()

no forking allowed in the std lib, including unit tests, except to implement child process spawning.
2025-12-23 22:15:08 -08:00
Andrew Kelley
f53248a409 update all std.fs.cwd() to std.Io.Dir.cwd() 2025-12-23 22:15:08 -08:00
Andrew Kelley
3204fb7569 update all occurrences of std.fs.File to std.Io.File 2025-12-23 22:15:07 -08:00
Andrew Kelley
aafddc2ea1 update all occurrences of close() to close(io) 2025-12-23 22:15:07 -08:00
Andrew Kelley
d1d2c37af2 std: all Dir functions moved to std.Io 2025-12-23 22:15:07 -08:00
Alex Rønne Petersen
aa0249d74e Merge pull request 'std.ascii: rename indexOf functions to find' (#30101) from adria/zig:indexof-find into master
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30101
Reviewed-by: Andrew Kelley <andrewrk@noreply.codeberg.org>
Reviewed-by: mlugg <mlugg@noreply.codeberg.org>
2025-12-22 12:50:46 +01:00
mlugg
dbb4c8d151 Merge pull request 'Remove things deprecated during the 0.15 release cycle' (#30018) from linus/zig:remove-deprecated-stuff into master
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30018
2025-12-06 08:51:15 +01:00
Matthew Lugg
ea94ac52c5
std.debug: skip manage resources correctly with cbe 2025-12-05 15:10:58 +01:00
Adrià Arrufat
02c5f05e2f std: replace usages of std.mem.indexOf with std.mem.find 2025-12-05 14:31:27 +01:00
Linus Groh
39fa831947 std: Remove a handful of things deprecated during the 0.15 release cycle
- std.Build.Step.Compile.root_module mutators -> std.Build.Module
- std.Build.Step.Compile.want_lto -> std.Build.Step.Compile.lto
- std.Build.Step.ConfigHeader.getOutput -> std.Build.Step.ConfigHeader.getOutputFile
- std.Build.Step.Run.max_stdio_size -> std.Build.Step.Run.stdio_limit
- std.enums.nameCast -> @field(E, tag_name) / @field(E, @tagName(tag))
- std.Io.tty.detectConfig -> std.Io.tty.Config.detect
- std.mem.trimLeft -> std.mem.trimStart
- std.mem.trimRight -> std.mem.trimEnd
- std.meta.intToEnum -> std.enums.fromInt
- std.meta.TagPayload -> @FieldType(U, @tagName(tag))
- std.meta.TagPayloadByName -> @FieldType(U, tag_name)
2025-11-27 20:17:04 +00:00
Matthew Lugg
010dcd6a9b
fuzzer: account for runtime address slide
This is relevant to PIEs, which are notably enabled by default on macOS.
The build system needs to only see virtual addresses, that is, those
which do not have the slide applied; but the fuzzer itself naturally
sees relocated addresses (i.e. with the slide applied). We just need to
subtract the slide when we communicate addresses to the build system.
2025-11-20 10:42:20 +00:00
Matthew Lugg
0caca625eb
std.debug: split up Mach-O debug info handling
Like ELF, we now have `std.debug.MachOFile` for the host-independent
parts, and `std.debug.SelfInfo.MachO` for logic requiring the file to
correspond to the running program.
2025-11-20 10:42:20 +00:00
Alex Rønne Petersen
9ab7eec23e represent Mac Catalyst as aarch64-maccatalyst-none rather than aarch64-ios-macabi
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was represented
a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.

While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
2025-11-14 11:33:35 +01:00
Matthew Lugg
92bc619c49 std.debug: allow fp unwind from context
It's easy to do FP unwinding from a CPU context: you just report the
captured ip/pc value first, and then unwind from the captured fp value.
All this really needed was a couple of new functions on the
`std.debug.cpu_context` implementations so that we don't need to rely on
`std.debug.Dwarf` to access the captured registers.

Resolves: #25576
2025-11-12 21:02:38 +00:00
qilme
8347791ce3
std.os.windows: eliminate forwarder function in kernel32 (#25766)
#1840

kernel32.AddVectoredExceptionHandler -> ntdll.RtlAddVectoredExceptionHandler
kernel32.RemoveVectoredExceptionHandler -> ntdll.RtlRemoveVectoredExceptionHandler
kernel32.ExitProcess -> ntdll.RtlExitUserProcess
kernel32.InitializeCriticalSection -> ntdll.RtlInitializeCriticalSection
kernel32.EnterCriticalSection -> ntdll.RtlEnterCriticalSection
kernel32.LeaveCriticalSection -> ntdll.RtlLeaveCriticalSection
kernel32.DeleteCriticalSection -> ntdll.RtlDeleteCriticalSection
kernel32.TryAcquireSRWLockExclusive -> ntdll.RtlTryAcquireSRWLockExclusive
kernel32.AcquireSRWLockExclusive -> ntdll.RtlAcquireSRWLockExclusive
kernel32.ReleaseSRWLockExclusive -> ntdll.RtlReleaseSRWLockExclusive
kernel32.WakeConditionVariable -> ntdll.RtlWakeConditionVariable
kernel32.WakeAllConditionVariable -> ntdll.RtlWakeAllConditionVariable
kernel32.HeapReAlloc -> ntdll.RtlReAllocateHeap
kernel32.HeapAlloc -> ntdll.RtlAllocateHeap
2025-10-31 13:54:50 +00:00
Matthew Lugg
74931fe25c
std.debug.lockStderrWriter: also return ttyconf
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving
syscalls), and doing it every time we need to print isn't really
necessary; under normal usage, we can compute the value once and cache
it for the whole program's execution. Since anyone outputting to stderr
may reasonably want this information (in fact they are very likely to),
it makes sense to cache it and return it from `lockStderrWriter`. Call
sites who do not need it will experience no significant overhead, and
can just ignore the TTY config with a `const w, _` destructure.
2025-10-30 09:31:28 +00:00
Andrew Kelley
a072d821be
Merge pull request #25592 from ziglang/init-std.Io
std: Introduce `Io` Interface
2025-10-29 13:51:37 -07:00
Alex Rønne Petersen
a7119d4269 remove all IBM AIX and z/OS support
As with Solaris (dba1bf9353), we have no way to
actually audit contributions for these OSs. IBM also makes it even harder than
Oracle to actually obtain these OSs.

closes #23695
closes #23694
closes #3655
closes #23693
2025-10-29 14:25:51 +01:00
Andrew Kelley
8b269f7e18 std: make signal numbers into an enum
fixes start logic for checking whether IO/POLL exist
2025-10-29 06:20:51 -07:00