`std.process.spawn`: remove the TODO for nonblocking file stdio and document the behavior.
Fix a bug in Io.Uring.dup2 where the function does not return on success
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31379
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: breakmit <breakmit@noreply.codeberg.org>
Co-committed-by: breakmit <breakmit@noreply.codeberg.org>
Replace the O(n) shift-subtract loop with a constant-time trial
quotient approach (Knuth Algorithm D, TAOCP Vol 2 Section 4.3.1).
The old code iterates clz(b_hi)-clz(a_hi)+1 times (up to 64
iterations of 128-bit arithmetic). The new code uses a single
divwide call to get a trial quotient, then verifies with two
native-width widening multiplies.
Benchmark (Apple M1, ReleaseFast):
- Large divisor, large shift: 87ns -> 7.5ns (11.5x faster)
- Small divisor / uniform: unchanged
It's annoying to have jobs fail on new machines or when switching tarball. Also
quirky is easily accessible over SSH now, so we can just upload tarballs as we
do for other machines.
The dependency on advapi32.dll actually silently brings along 3 other dlls at runtime (msvcrt.dll, sechost.dll, bcrypt.dll), even if no advapi32 APIs are called. So, this commit actually reduces the number of dlls loaded at runtime by 4 (but only when LLVM is not linked, since LLVM has its own dependency on advapi32.dll).
The data is not super conclusive, but the ntdll version of WindowsSdk appears to run slightly faster than the previous advapi32 version:
Benchmark 1: libc-ntdll.exe ..
Time (mean ± σ): 6.0 ms ± 0.6 ms [User: 3.9 ms, System: 7.1 ms]
Range (min … max): 4.8 ms … 7.9 ms 112 runs
Benchmark 2: libc-advapi32.exe ..
Time (mean ± σ): 7.2 ms ± 0.5 ms [User: 5.4 ms, System: 9.2 ms]
Range (min … max): 6.1 ms … 8.9 ms 103 runs
Summary
'libc-ntdll.exe ..' ran
1.21 ± 0.15 times faster than 'libc-advapi32.exe ..'
and this mostly seems to be due to changes in the implementation (the advapi32 APIs do a lot of NtQueryKey calls that the new implementation doesn't do) rather than due to the decrease in dll loading. LLVM-less zig binaries don't show the same reduction (the only difference here is the DLLs being loaded):
Benchmark 1: stage4-ntdll\bin\zig.exe version
Time (mean ± σ): 3.0 ms ± 0.6 ms [User: 5.3 ms, System: 4.8 ms]
Range (min … max): 1.3 ms … 4.2 ms 112 runs
Benchmark 2: stage4-advapi32\bin\zig.exe version
Time (mean ± σ): 3.5 ms ± 0.6 ms [User: 6.9 ms, System: 5.5 ms]
Range (min … max): 2.5 ms … 5.9 ms 111 runs
Summary
'stage4-ntdll\bin\zig.exe version' ran
1.16 ± 0.28 times faster than 'stage4-advapi32\bin\zig.exe version'
---
With the removal of the advapi32 dependency, the non-ntdll dependencies that remain in an LLVM-less Zig binary are ws2_32.dll (which brings along rpcrt4.dll at runtime), kernel32.dll (which brings along kernelbase.dll at runtime), and crypt32.dll (which brings along ucrtbase.dll at runtime).
The following assertions fail on non-Linux platforms after c0c2010535
which inserted padding based on musl definitions. This padding only
exists on musl to workaround a discrepancy betweeen the POSIX API and
Linux ABI, and is incorrect on other POSIX operating systems.
This change makes the padding musl-only, and documents the reason it
exists. With this change, the assertions pass on Linux and FreeBSD
targets. The corresponding definitions on other targets line up with the
POSIX and FreeBSD ones, so they should work there too.
```zig
const std = @import("std");
const assert = std.debug.assert;
const msghdr = std.c.msghdr;
const cmsghdr = std.c.cmsghdr;
const c = @cImport({
@cInclude("sys/socket.h");
});
comptime {
assert(@offsetOf(msghdr, "iovlen") == @offsetOf(c.msghdr, "msg_iovlen"));
assert(@offsetOf(msghdr, "controllen") == @offsetOf(c.msghdr, "msg_controllen"));
assert(@offsetOf(msghdr, "control") == @offsetOf(c.msghdr, "msg_control"));
assert(@offsetOf(msghdr, "flags") == @offsetOf(c.msghdr, "msg_flags"));
assert(@sizeOf(msghdr) == @sizeOf(c.msghdr));
assert(@offsetOf(cmsghdr, "len") == @offsetOf(c.cmsghdr, "cmsg_len"));
assert(@offsetOf(cmsghdr, "level") == @offsetOf(c.cmsghdr, "cmsg_level"));
assert(@sizeOf(cmsghdr) == @sizeOf(c.cmsghdr));
}
```
There were good reviews made after #31365 was merged, so this commit
addresses them separately.
1. Assert that the number is greater than zero
2. Use `constants` instead of calculating constants manually
3. Use `Const.bitCountAbs` for log2
While the general guidance remains useful, it is not the case that
error.Canceled will always pass across the Group task function boundary.
Remove the too-aggressive assertions and add unit test coverage.
Closes#30096Closes#31340Closes#31358
Remove the `{D}` format specifier. It is moved into `std.Io.Duration` as
a format method.
Migration plan:
```diff
-writer.print("{D}", .{ns});
+writer.print("{f}", .{std.Io.Duration{ .nanoseconds = ns }});
```
All instances where `{D}` was used have been changed to use
`std.Io.Duration` and `{f}`.
Fixes#31281
and make the return value of `cancel` return queue items.
I don't think it's possible to make `cancel` not deadlock with an empty
queue buffer without introducing a new Group primitive.
This is the best I could come up with based on existing primitives.
Let's see if applications find these APIs palatable.
## Summary of changes
+ Make adjustments to the `allocator` field and ensure the below tests pass:
```sh
zig test lib/std/std.zig --zig-lib-dir lib
zig build test-std -Dno-matrix --summary all
```
+ Rename `add` to `push` and `remove` to `pop` in methods and tests
+ Incorporate the functionality of `pop` in `popOrNull`, then rename the `popOrNull` to `pop` and update tests
+ Use `.empty` to set default field values and rename the `init` method to `initContext`
+ Improve variable types in tests: min heap uses the less than context function and max heap uses greater than context function
+ Remove the `dump` method as its not being used anywhere
+ Document methods `clearRetainingCapacity`, `clearAndFree`, `update`, and `ensureTotalCapacityPrecise`
Closes https://codeberg.org/ziglang/zig/issues/31298
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31299
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Saurabh Mishra <saurabh.m@proton.me>
Co-committed-by: Saurabh Mishra <saurabh.m@proton.me>
Previously resetting with `retain_capacity < @sizeOf(Node)` would create
an invalid node. This is now fixed, plus `Node.size` now has its own `Size`
type that provides additional safety via assertions to prevent bugs like
this in the future.
This is achieved by bumping `end_index` by a large enough amount so that
a suitably aligned region of memory can always be provided. The potential
wasted space this creates is then recovered by a single cmpxchg. This is
always successful for single-threaded arenas which means that this version
still behaves exactly the same as the old single-threaded implementation
when only being accessed by one thread at a time. It can however fail when
another thread bumps `end_index` in the meantime. The observerd failure
rates under extreme load are:
2 Threads: 4-5%
3 Threads: 13-15%
4 Threads: 15-17%
5 Threads: 17-18%
6 Threads: 19-20%
7 Threads: 18-21%
This version offers ~25% faster performance under extreme load from 7 threads,
with diminishing speedups for less threads. The performance for 1 and 2
threads is nearly identical.
Quoting Vol. 2B 4-590 of [Intel SSA manual][1]:
> Bit 3 of the immediate byte controls processor behavior for a
precision exception <...>
The Direction struct is incorrectly sized to 4 bits, which pushes the
precision exception bit to the reserved bits, which helpfully crashes
under valgrind (but works on real hardware, since CPU ignores it):
``
const std = @import("std");
noinline fn ceil(x: f32) f32 {
return @ceil(x);
}
test "ceil" {
try std.testing.expectEqual(@as(f32, 2.0), ceil(1.5));
}
```
With Zig 0.15.1:
```
$ zig test -fno-llvm -mcpu x86_64_v3 -fvalgrind ceil_test.zig --test-cmd valgrind --test-cmd --quiet --test-cmd-bin
Illegal instruction at address 0x102d14d
/home/motiejus/code/ceil_test.zig:4:5: 0x102d14d in ceil (ceil_test.zig)
return @ceil(x);
^
/home/motiejus/code/ceil_test.zig:8:52: 0x102d097 in test.ceil (ceil_test.zig)
try std.testing.expectEqual(@as(f32, 2.0), ceil(1.5));
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/compiler/test_runner.zig:218:25: 0x1167e80 in mainTerminal (test_runner.zig)
if (test_fn.func()) |_| {
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/compiler/test_runner.zig:66:28: 0x11610a1 in main (test_runner.zig)
return mainTerminal();
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/std/start.zig:618:22: 0x115ae3d in posixCallMainAndExit (std.zig)
root.main();
^
/nix/store/n81qdpd96d86ngfv5bqymy26b8kqm654-zig-0.15.1/lib/std/start.zig:232:5: 0x115a6d1 in _start (std.zig)
asm volatile (switch (native_arch) {
^
???:?:?: 0x0 in ??? (???)
error: the following test command crashed:
valgrind --quiet /home/motiejus/.cache/zig/o/7cc564b5410fc3facdb6e8768643074e/test
```
Zig 0.15.1 with this patch:
```
zig/zig3 test -fno-llvm -mcpu x86_64_v3 -fvalgrind ceil_test.zig --test-cmd valgrind --test-cmd --quiet --test-cmd-bin
All 1 tests passed.
```
Looking through `CodeGen.zig` it _seems_ like the same RoundMode struct
is used for vroundss/vroundps/vcvtps2ph, so update the comment while
we're at it. But I am only 90% sure it's true.
[1]: https://cdrdv2.intel.com/v1/dl/getContent/671200
Matches now use memcpy and memset when possible.
Block loops have been rewritten to be more optimizer friendly.
Reworks Symbol and HuffmanDecoder
* Symbol now only includes the value and number of code bits.
decodeSymbol returns only the value.
* HuffmanDecoder now takes the regular bits instead of the reversed.
* Code table construction now uses buckets instead of sorting.
* For linked codes, the value field of Symbol is now used as the next
index. The actual value is the element index.
* InvalidCode is now detected only once with a special linked index.
Performance is 39.7% faster than before and 1.1% faster than gzip using
a sample created from compressing a tar of the src directory.
Modifies the `Allocator` implementation provided by `ArenaAllocator` to be
threadsafe using only atomics and no synchronization primitives locked
behind an `Io` implementation.
At its core this is a lock-free singly linked list which uses CAS loops to
exchange the head node. A nice property of `ArenaAllocator` is that the
only functions that can ever remove nodes from its linked list are `reset`
and `deinit`, both of which are not part of the `Allocator` interface and
thus aren't threadsafe, so node-related ABA problems are impossible.
There *are* some trade-offs: end index tracking is now per node instead of
per allocator instance. It's not possible to publish a head node and its
end index at the same time if the latter isn't part of the former.
Another compromise had to be made in regards to resizing existing nodes.
Annoyingly, `rawResize` of an arbitrary thread-safe child allocator can
of course never be guaranteed to be an atomic operation, so only one
`alloc` call can ever resize at the same time, other threads have to
consider any resizes they attempt during that time failed. This causes
slightly less optimal behavior than what could be achieved with a mutex.
The LSB of `Node.size` is used to signal that a node is being resized.
This means that all nodes have to have an even size.
Calls to `alloc` have to allocate new nodes optimistically as they can
only know whether any CAS on a head node will succeed after attempting it,
and to attempt the CAS they of course already need to know the address of
the freshly allocated node they are trying to make the new head.
The simplest solution to this would be to just free the new node again if
a CAS fails, however this can be expensive and would mean that in practice
arenas could only really be used with a GPA as their child allocator. To
work around this, this implementation keeps its own free list of nodes
which didn't make their CAS to be reused by a later `alloc` invocation.
To keep things simple and avoid ABA problems the free list is only ever
be accessed beyond its head by 'stealing' the head node (and thus the
entire list) with an atomic swap. This makes iteration and removal trivial
since there's only ever one thread doing it at a time which also owns all
nodes it's holding. When the thread is done it can just push its list onto
the free list again.
This implementation offers comparable performance to the previous one when
only being accessed by a single thread and a slight speedup compared to
the previous implementation wrapped into a `ThreadSafeAllocator` up to ~7
threads performing operations on it concurrently.
(measured on a base model MacBook Pro M1)
This is the same as for Edwards25519 without the y coordinate,
since it returns Montgomery coordinates, but it can be confusing
to call the Edwards25519 function while working on the
Curve25519 representation.
New protocols such as CPACE requires the map over Curve25519.
Linux's approach to mapping the main thread's stack is quite odd: it essentially
tries to select an mmap address (assuming unhinted mmap calls) which do not
cover the region of virtual address space into which the stack *would* grow
(based on the stack rlimit), but it doesn't actually *prevent* those pages from
being mapped. It also doesn't try particularly hard: it's been observed that the
first (unhinted) mmap call in a simple application is usually put at an address
which is within a gigabyte or two of the stack, which is close enough to make
issues somewhat likely. In particular, if we get an address which is close-ish
to the stack, and then `mremap` it without the MAY_MOVE flag, we are *very*
likely to map pages in this "theoretical stack region". This is particularly a
problem on loongarch64, where the initial mmap address is empirically only
around 200 megabytes from the stack (whereas on most other 64-bit targets it's
closer to a gigabyte).
To work around this, we just need to avoid mremap in some cases. Unfortunately,
this system call isn't used too heavily by musl or glibc, so design issues like
this can and do exist without being caught. So, when `PageAllocator.resize` is
called, let's not try to `mremap` to grow the pages. We can still call `mremap`
in the `PageAllocator.remap` path, because in that case we can set the
`MAY_MOVE` flag, which empirically appears to make the Linux kernel avoid the
problematic "theoretical stack region".
The old logic was fine for targets where the stack grows up (so, literally just
hppa), but problematic on targets where it grows down, because we could hint
that we wanted an allocation to happen in an area of the address space that the
kernel expects to be able to expand the stack into. The kernel is happy to
satisfy such a hint despite the obvious problems this leads to later down the
road.
Co-authored-by: rpkak <rpkak@noreply.codeberg.org>
This reverts commit e314dadb01.
This idea requires more consideration before committing to it.
At the very least let's not regress autodocs.
Closes#31230
This function works with a slice of futures and returns the index of a
completed one. This doesn't work very well in practice because it's
either too high level or too low level.
At the lower level we have Io.Batch for doing this kind of thing at the
Operation API layer.
At the higher level we have Io.Select which is a convenience wrapper
around an Io.Group and an Io.Queue.
Instead of padding, use static entropy to detect corrupted header.
* 64-bit, safe modes: 10 canary bits
* 64-bit, unsafe modes: 0 canary bits
* 32-bit: 27 canary bits
A further enhancement, not done here, would be to upgrade from canary to
parity.
-- On the standard library side:
The `input: []const u8` parameter of functions passed to `testing.fuzz`
has changed to `smith: *testing.Smith`. `Smith` is used to generate
values from libfuzzer or input bytes generated by libfuzzer.
`Smith` contains the following base methods:
* `value` as a generic method for generating any type
* `eos` for generating end-of-stream markers. Provides the additional
guarantee `true` will eventually by provided.
* `bytes` for filling a byte array.
* `slice` for filling part of a buffer and providing the length.
`Smith.Weight` is used for giving value ranges a higher probability of
being selected. By default, every value has a weight of zero (i.e. they
will not be selected). Weights can only apply to values that fit within
a u64. The above functions have corresponding ones that accept weights.
Additionally, the following functions are provided:
* `baselineWeights` which provides a set of weights containing every
possible value of a type.
* `eosSimpleWeighted` for unique weights for `true` and `false`
* `valueRangeAtMost` and `valueRangeLessThan` for weighing only a range
of values.
-- On the libfuzzer and abi side:
--- Uids
These are u32s which are used to classify requested values. This solves
the problem of a mutation causing a new value to be requested and
shifting all future values; for example:
1. An initial input contains the values 1, 2, 3 which are interpreted
as a, b, and c respectively by the test.
2. The 1 is mutated to a 4 which causes the test to request an extra
value interpreted as d. The input is now 4, 2, 3, 5 (new value) which
the test corresponds to a, d, b, c; however, b and c no longer
correspond to their original values.
Uids contain a hash component and type component. The hash component
is currently determined in `Smith` by taking a hash of the calling
`@returnAddress()` or via an argument in the corresponding `WithHash`
functions. The type component is used extensively in libfuzzer with its
hashmaps.
--- Mutations
At the start of a cycle (a run), a random number of values to mutate is
selected with less being exponentially more likely. The indexes of the
values are selected from a selected uid with a logarithmic bias to uids
with more values.
Mutations may change a single values, several consecutive values in a
uid, or several consecutive values in the uid-independent order they
were requested. They may generate random values, mutate from previous
ones, or copy from other values in the same uid from the same input or
spliced from another.
For integers, mutations from previous ones currently only generates
random values. For bytes, mutations from previous mix new random data
and previous bytes with a set number of mutations.
--- Passive Minimization
A different approach has been taken for minimizing inputs: instead of
trying a fixed set of mutations when a fresh input is found, the input
is instead simply added to the corpus and removed when it is no longer
valuable.
The quality of an input is measured based off how many unique pcs it
hit and how many values it needed from the fuzzer. It is tracked which
inputs hold the best qualities for each pc for hitting the minimum and
maximum unique pcs while needing the least values.
Once all an input's qualities have been superseded for the pcs it hit,
it is removed from the corpus.
-- Comparison to byte-based smith
A byte-based smith would be much more inefficient and complex than this
solution. It would be unable to solve the shifting problem that Uids
do. It is unable to provide values from the fuzzer past end-of-stream.
Even with feedback, it would be unable to act on dynamic weights which
have proven essential with the updated tests (e.g. to constrain values
to a range).
-- Test updates
All the standard library tests have been updated to use the new smith
interface. For `Deque`, an ad hoc allocator was written to improve
performance and remove reliance on heap allocation. `TokenSmith` has
been added to aid in testing Ast and help inform decisions on the smith
interface.
The end of the archive needs to also be aligned to a two-byte boundary,
not just the start of records. This was causing lld to reject archives.
Notably, this was happening with compiler_rt when rebuilding in fuzz
mode, which is why this commit is included in this patchset.
Replaces the "flush denormals to zero" placeholder in divtf3.zig with IEEE 754 denormal support including rounding.
fixes#30179
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30198
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Chris Boesch <chrboesch@noreply.codeberg.org>
Co-committed-by: Chris Boesch <chrboesch@noreply.codeberg.org>
On FreeBSD, the maximum waiters should be Cs INT_MAX instead of the maximum of a u32. Waiting for the Io.Event in the broadcast test triggers this bug.
Resolves#30715
Co-authored-by: Simon Galli <hubi@hubidubi.net>
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30094
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: hubidubi <hubidubi@noreply.codeberg.org>
Co-committed-by: hubidubi <hubidubi@noreply.codeberg.org>
some fp exceptions are prohibited by zig test-libc (libc-test).
Promoting to f64 prevents vectorization of some floating point
divisions. The vectorization has unused lanes which contain zero. On
division the lanes containing zero are divided and trigger the INVALID
fp flags.
This makes it practical to store large items or items that are meant to
be mutable directly inside of the deque.
It is the responsibility of the user to stop using the returned pointers
after calling a function that could invalidate them.
These functions can only be exported when external libc components are
available due to the errno location dependency. Note that even when zig
libc is complete, on Windows, errno location will always be external (in
ucrtbase.dll).
Normally, libc goes into a static archive, making all symbols
overrideable. However, Zig supports including the libc functions as part
of the Zig Compilation Unit, so to support this use case we make all
symbols weak.
This should only be used when the fundamental types break down, e.g. on freestanding or when adding support for a non-posix OS, where std.Io.File.Handle is set to void.
use the "symbol" helper function in all exports
move all declarations from common.zig to compiler_rt.zig
flatten the tree structure somewhat (move contents of tiny files into
parent files)
No functional changes.
Add the missing F_SEAL_SEAL, F_SEAL_SHRINK, F_SEAL_GROW, F_SEAL_WRITE,
F_SEAL_FUTURE_WRITE, and F_SEAL_EXEC constants used with
F.ADD_SEALS/F.GET_SEALS for memfd file sealing. These are defined in the
Linux kernel at include/uapi/linux/fcntl.h.
The FreeBSD equivalents already exist in std.c (freebsd.F),
but the Linux side was missing them.
After fetching a package and applying the filter by deleting files that
are not part of the hash, creates a recompressed $GLOBAL_CACHE/p/$PKG_HASH.tar.gz
Checking this cache before fetching network URLs is not yet implemented.
Changes an assert back into a conditional to match the behavior of `getPosix`, see https://codeberg.org/ziglang/zig/pulls/31113#issuecomment-10371698 and https://github.com/ziglang/zig/issues/23331.
Note: the conditional has been updated to also return null early on 0-length key lookups, since there's no need to iterate the block in that case.
For `Environ.Map`, validation of keys has been split into two categories: 'put' and 'fetch', each of which are tailored to the constraints that the implementation actually relies upon. Specifically:
- Hashing (fetching) requires the keys to be valid WTF-8 on Windows, but does not rely on any other properties of the keys (attempting to fetch `F\x00=` is not a problem, it just won't be found)
- `create{Posix,Windows}Block` relies on the Map to always have fully valid keys (no NUL, no `=` in an invalid location, no zero-length keys), which means that the 'put' APIs need to validate that incoming keys adhere to those properties.
The relevant assertions are now documented on each of the Map functions.
Also reinstates some test cases in the `env_vars` standalone test. Some of the reinstated tests are effectively just testing the Environ.Map implementation due to how `Environ.contains`, `Environ.getAlloc`, etc are implemented, but that is not inherent to those functions so the tests are still potentially relevant if e.g. `contains` is implemented in terms of `getPosix`/`getWindows` in the future (which is totally possible and maybe a good idea since constructing the whole map is not necessary for looking up one key).
This PR replaces the bundled musl swab() implementation with zig's one.
Contributes towards #30978.
It looks like there are not test cases for swab() in test-libc.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31130
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Ivel <ivel.santos@proton.me>
Co-committed-by: Ivel <ivel.santos@proton.me>
Replaced by the lockStderr functions of std.Io. Trying to make
`std.process.stderr_thread_mutex` be a bridge across different Io
implementations didn't work in practice.
We can't use Io.Mutex in parking_futex; instead, we need a simple
parking-based mutex implementation. That's fairly simple to do.
Also deal with spurious unparks on NetBSD, where they *can* happen (as
opposed to Windows, where they cannot).
It turns out that at least on Windows, spurious unparks are *not*
possible, and in fact triggering them breaks some RTL synchronization
primitives. For instance, if you have a pending unpark going into a
contended `RtlEnterCriticalSection` call, it will never unblock. In
other words, the Windows API worked exactly how I thought it did, and
it's only the NetBSD/Illumos one which is dumb. This is actually exactly
why Windows 8 introduced the parking API despite alertable sleeps being
a thing!
This commit doesn't yet deal with making NetBSD work, nor does it even
compile I imagine. The next commit will fix everything back up.
This reverts commit c518593e97.
Importantly, adds ability to get Clock resolution, which may be zero.
This allows error.Unexpected and error.ClockUnsupported to be removed
from timeout and clock reading error sets.
- delete std.Thread.Futex
- delete std.Thread.Mutex
- delete std.Thread.Semaphore
- delete std.Thread.Condition
- delete std.Thread.RwLock
- delete std.once
std.Thread.Mutex.Recursive remains... for now. it will be replaced with
a special purpose mechanism used only by panic logic.
std.Io.Threaded exposes mutexLock and mutexUnlock for the advanced case
when you need to call them directly.
This commit should be reverted - it's testing a hypothesis that Windows
is deadlocking due to bug in the implementation of
std.Io.Threaded.parking_futex
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31093
Co-authored-by: Krzysztof Antonowski <krzysztofantonowski@proton.me>
Co-committed-by: Krzysztof Antonowski <krzysztofantonowski@proton.me>
This implementation is a bit of a hacky workaround, as we use a ntdll
API to grab the full path of the directory handle. As far as I can tell
this might be the only solution to the problem, as
kernel32.CreateProcessW takes a directory path as a string only.
I might be wrong though as haven't researched the problem thoroughly.
- batchAwaitAsync does blocking reads with NtReadFile (no APC, no event)
when the nonblocking flag is unset, but still takes advantage of
APCs when nonblocking flag is set.
- batchAwaitConcurrent returns error.ConcurrencyUnavailable when it
encounters a file_read_streaming operation on a file in blocking mode.
- fileReadStreaming avoids pointlessly checking sync cancelation status
when nonblocking flag is set, uses an APC with a done flag, and waits
on that value to change in NtDelayExecution before returning.
- fix incorrect use of NtCancelIoFile (ntdll function prototype was
wrong, leading to misuse)
On Windows, we need to know ahead of time whether a file was opened in
synchronous mode or asynchronous mode. There may be advantages to
tracking this state for POSIX operating systems as well.
It is legal to call batchWait with already completed operations in the
ring. In such case, we need to avoid waiting in the syscall. The
any_done flag was a poor way of tracking state we already have: whether
the completion queue is empty.
This problem affects the posix poll implementation as well.
Thanks again to jacobly for finding the problem.
and make reading file streaming allowed to return 0 byte reads.
According to Microsoft documentation, on Windows it is possible to get
0-byte reads from pipes when 0-byte writes are made.
The defer would cause two problems:
1. keeping the state active during call to NtCancelIoFile
2. invalid state transition. after canceled is returned from
checkCancel, new status is already canceled. calling finish after
that is illegal.
reasoning is that polling with large amount of operations will be rarely
done with std.Io.Threaded. However this still provides the opportunity
to provide concurrency for any real world use cases that need it.
This commit shows a proof-of-concept direction for std.Io.VTable to go,
which is to have general support for batching, timeouts, and
non-blocking.
I'm not sure if this is a good idea or not so I'm putting it up for
scrutiny.
This commit introduces `std.Io.operate`, `std.Io.Operation`, and
implements it experimentally for `FileReadStreaming`.
In `std.Io.Threaded`, the implementation is based on poll().
This commit shows how it can be used in `std.process.run` to collect
both stdout and stderr in a single-threaded program using
`std.Threaded.Io`.
It also demonstrates how to upgrade code that was previously using
`std.Io.poll` (*not* integrated with the interface!) using concurrency.
This may not be ideal since it makes the build runner no longer support
single-threaded mode. There is still a needed abstraction for
conveniently reading multiple File streams concurrently without
io.concurrent, but this commit demonstrates that such an API can be
built on top of the new `std.Io.operate` functionality.
When `NtReadFile` returns `SUCCESS`, the APC routine still runs when
next alertable, which was previously clobbering an out of scope `done`.
Instead of adding an extra syscall to the success path, avoid all APC
side effects, allowing instant completions to return immediately.
As mlugg pointed out those race when a thread finishes an operation just
after it is canceled and then that thread to picks up another task,
resulting in these fields being potentially overwritten.
This updates fileReadStreaming on Windows to handle being alerted, and
then manage its own cancelation of the file I/O.
For now, let us refrain from putting the sync mode into the Io.File
struct, and document that to do concurrent batch operations, any Windows
file handles must be in asynchronous mode. The consequences for
violating this requirement is neither illegal behavior, nor an error,
but that concurrency is lost. In other words, deadlock might occur. This
prevents the addition of flags field.
partial revert of 2faf14200f58ee72ec3a13e894d765f59e6483a9
This tracks whether it is a file opened in synchronous mode, or
something that supports APC.
This will be needed in order to know whether concurrent batch operations
on the file should return error.ConcurrencyUnavailable, or use APC to
complete the batch.
This patch also switches to using NtCreateFile directly in
std.Io.Threaded for dirCreateFile, as well as NtReadFile for
fileReadStreaming, making it handle files opened in synchronous mode as
well as files opened in asynchronous mode.
Most places where `undefined` was previously (intentionally) passed across
function calls now use `Air.Inst.Ref.none` instead to ensure that these
`undefined` references don't accidentally outlive the `switch` logic they
belong to.
The call to `rebase` in `discardIndirect` and `discardDirect` was inappropriate. As `rebase` expects the `capacity` parameter to exclude the sliding window, this call was asking for ANOTHER `d.window_len` bytes. This was impossible to fulfill with a buffer smaller than 2*`d.window_len`, and caused [#25764](https://github.com/ziglang/zig/issues/25764).
This PR adds a basic test to do a discard (which does trigger [#25764](https://github.com/ziglang/zig/issues/25764)), and rebases only as much as is required to make the discard succeed ([or no rebase at all](https://github.com/ziglang/zig/issues/25764#issuecomment-3484716253)). That means: ideally rebase to fit `limit`, or if the buffer is too small, as much as possible.
I must say, `discardDirect` does not make much sense to me, but I replaced it anyway. `rebaseForDiscard` works fine with `d.reader.buffer.len == 0`. Let me know if anything should be changed.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30891
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: mercenary <mercenary@noreply.codeberg.org>
Co-committed-by: mercenary <mercenary@noreply.codeberg.org>
Confusingly, the POSIX spec for clock_nanosleep() says it returns
*positive* error values directly and does not touch `errno`. Not
detecting EINTR properly here was breaking the cancellation of
threads blocked in this call when linking libc.
In Zig standard library, Dir means an open directory handle. path
represents a file system identifier string. This function is better
named after "current path" than "current dir". "get" and "working" are
superfluous.
- remove error.SharingViolation from all error sets since it has the
same meaning as FileBusy
- add error.FileBusy to CreateFileAtomicError and ReadLinkError
- update dirReadLinkWindows to use NtCreateFile and NtFsControlFile and
integrate with cancelation properly.
- move windows CTL_CODE constants to the proper namespace
- delete os.windows.ReadLink
The software impl of `acos` and `asin` depends on the `sqrt` op. Since support for `sqrt` in `f16`, `f80`, and `f128` has been added, the impl of `acos` and `asin` for `f16`, `f80`, and `f128` is now being supplemented.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30997
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: lzm-build <3575188313@qq.com>
Co-committed-by: lzm-build <3575188313@qq.com>
This error is actually only ever directly returned from `std.posix.getcwd` (and only on POSIX systems, so never on Windows). Its inclusion in almost all of the error sets its currently found in is a leftover from when `std.fs.path.resolve` called `std.process.getCwdAlloc` (https://github.com/ziglang/zig/issues/13613).
Code used `ReturnType`, comment used `T` (which is what is used in similar functions).
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/31007
Co-authored-by: Robert Ancell <robert.ancell@gmail.com>
Co-committed-by: Robert Ancell <robert.ancell@gmail.com>
Add vertical margin to the `.table-wrapper` class so that there's space
between the table and the test figures. It does not affect any of the
existing tables because the margin collapses with the adjacent `<p>`.
The stated reason for this commit was cancelation didn't work. On
further review, we know why cancelation didn't work, and recent
enhancements on master branch make it easy to make it work.
Meanwhile, not using overlapped means that multiple threads cannot use
the same open socket handle. I also added line comments to explain the
choice in this revert commit.
This reverts commit fd3657bf8c.
closes#31011reopens#30865
Finite field elements can be in regular or Montgomery form, and
chaining different operations use to require manual and error-prone
conversions.
Now:
- `add`, `sub` and `mul` convert the second operand to match the
first operand's form
- `sq` and `pow` preserve the input's Montgomery form
- `toPrimitive` and `toBytes` return `UnexpectedRepresentation` if
the element is in Montgomery form, preventing incorrect serialization
This is fully backwards compatible and allows seamless chaining of
operations regardless of their representation.
The vast majority of libc functions return `c_int` for the return value,
when setting errno. This utility function is for those cases.
Other cases can hand-roll the logic, or additional helpers can be added.
IPPROTO_RAW (255) was missing from the Darwin/macOS IPPROTO struct,
even though it is defined in system headers and supported by the platform.
This is a commonly used protocol for raw IP sockets.
We use long calls for these just like thumb*-linux-* to prevent range issues as
the binaries grow larger over time.
We also need function and data sections due to the many __stack_chk_guard
references within the std test binary; without these options, the linker is not
able to insert range thunks in between functions because the std binary just has
one giant .text section that's opaque to the linker.
closes https://codeberg.org/ziglang/zig/issues/30923
If the library isn't actually installed, `generated_bin` will be null,
causing this to panic about a missing dependency for itself; checking
for this state avoids this.
It can fail arbitrarily with ENOENT if the kernel happens to not have the FD in
its name cache. That makes it useless for our purposes.
closes https://codeberg.org/ziglang/zig/issues/30843
Since this is my first time contributing I wanted to keep it simple and I added just one function
I tested with
```bash
stage3/bin/zig build -p stage4 -Dno-lib -Dno-langref -Denable-llvm=true --search-prefix ~/repos/zig-boostrap/out/native-linux-musl-baseline/
stage4/bin/zig build test-libc -Dlibc-test-path=../libc-test -Dtest-target-filter=x86_64-linux-musl
```
and the tests passed.
I'm planning on doing more once I get the hang of it.
Co-authored-by: david <davidc.fried@gmail.com>
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30888
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Daggerfall-is-the-best-TES-game <daggerfall-is-the-best-tes-game@noreply.codeberg.org>
Co-committed-by: Daggerfall-is-the-best-TES-game <daggerfall-is-the-best-tes-game@noreply.codeberg.org>
I'm not sure where the old logic came from but it certainly didn't match NetBSD
10.1 system headers, and was causing the build system to see incorrect exit
status information for processes that were expected to crash (e.g. SIGABRT).
suggested alternatives:
- actual tests
- explicitly list the decls
- compile an example application that uses the API
- stop worrying about dead code
- refAllDecls (non recursive) in each file
Don't fight the laziness, embrace it.
closes#23608closes#30813
- remove file_size parameter from MemoryMap.write
- remove requirement for mapping length to be aligned
- align allocated fallback memory
- add unit test for std.Io.Threaded.disable_memory_mapping = true
- add unit test for MemoryMap.setLength
- change offset to u64
- make len non-optional
- make write take a file_size parameter
- std.Io.Threaded: introduce disable_memory_mapping flag to force it to
take the fallback path.
Additionally:
- introduce BlockSize to File.Stat. On Windows, based on cached call to
NtQuerySystemInformation. On unsupported OS's, set to 1.
- support File.NLink on Windows. this was available the whole time, we
just didn't see the field at first.
- remove EBADF / INVALID_HANDLE from reading/writing file error sets
by defining the pointer contents to only be synchronized after explicit
sync points, makes it legal to have a fallback implementation based on
file operations while still supporting a handful of use cases for memory
mapping.
furthermore, it makes it legal for evented I/O implementations to use
evented file I/O for the sync points rather than memory mapping.
not yet done:
- implement checking the length when options.len is null
- some windows impl work
- some wasi impl work
- unit tests
- integration with compiler
On NetBSD and Illumos, we were using the `std.Thread.Futex`-based
implementation of `std.Thread.Mutex`. But since futex is not a primitive
on these targets, the implementation of `std.Thread.Futex` was based on
pthread primitives, including `pthread_mutex_t`. This had the amusing
consequence that locking a contended mutex on NetBSD would actually
perform 2 mutex locks, 2 mutex unlocks, and 1 condition wait; likewise,
unlocking a contended mutex would perform 2 mutex locks, 2 mutex
unlocks, and 1 condition signal. Having read some cutting-edge studies,
I have concluded that this is a slightly suboptimal approach. Instead,
let's just use pthread mutexes directly in this case; that's an
obviously better idea.
In the future, I think we can probably entirely remove our usages of
pthread sync primitives---no platform actually treats them as the base
primitives. Of the platforms which std has any meaningful support for
today, most support futexes, and the exceptions (NetBSD and Illumos)
support a thread parking API. We can implement futex and/or mutex on top
of thread parking and drop the pthread dependency entirely.
Apparently the thread parking APIs on Windows and NetBSD aren't as good
as I thought---or, at least, the way they're *used* makes them not as
good. It's perfectly possible to use these APIs in a way where you don't
trigger spurious wakeups, but standard primitives (SRWLOCK on Windows,
pthread bits on NetBSD) are perfectly happy to leave pending unparks
sitting around, meaning in practice you have to assume spurious unparks
are possible. This brings me great sadness... but we soldier on!
Once in a while, these VMs get slightly slower due to heavy load from other VMs
on the host machine, causing the jobs to time out. This is a very rare
occurrence, but we do need to account for it.
When targeting x86-windows, this parameter referring to read-only memory can result in an ACCESS_VIOLATION error, and this has been seen when using FILE_DISPOSITION_INFORMATION_EX. It's unclear how exactly this ACCESS_VIOLATION is occurring, though, as the memory does not actually change before/after the call.
Closes https://codeberg.org/ziglang/zig/issues/30802
This allows stack overflows to print stack traces. The size of the
sigaltstack (and whether it is actually set) can be configured by
setting `std.Options.signal_stack_size`.
The default value for the signal stack size was chosen experimentally by
doubling the value required to get stack traces on stack overflow with
the self-hosted x86_64 backend. While some targets may typically use
more stack space than x86_64-linux, the self-hosted x86_64 backend is
quite wasteful with stack at the moment, making it a fair benchmark.
Executables produced by the LLVM backend should have lower stack usage.
Commit dec1163fbb removed sentinels from the returned slice for C
pointers. Since C pointers have no bounds, we know that it'll keep
scanning until it finds `end` (or crash trying). The same is also true
of many-item pointers without a sentinel (e.g. `[*]T`), so I added
support for those too.
e2338edb47 didn't *quite* do it, the call sites of all switch prong related
functions now have to do their part too and be a little more precise about
what kind of prong they're currently analyzing.
Also removes some unused/unnecessary stuff.
- Corrects WASI `UTIME_*` definitions now that the libc build has been
fixed (see previous commit), and adds the corresponding definitions
for Emscripten which were missing.
- Fixes `dirReadUnimplemented()`, which didn't compile.
- Prevents a dependency on `pthread_kill` from being pulled in in
single-threaded Emscripten builds, where it isn't defined.
With these changes, Emscripten can now participate in juicy main.
05d8b565ad deduplicated wasi-libc headers
that are identical to upstream musl, but because some public headers
match the same `#include <>` patterns as private headers, this resulted
in unmodified upstream musl headers sometimes getting prioritized when
compiling wasi-libc, resulting in incorrect definitions. For example,
`UTIME_OMIT` is supposed to be -2 on WASI, but became 0x3ffffffe.
The restored headers were sourced from
<c89896107d>
which is the same revision as the most recent wasi-libc sync
07cc32b004.
Fixes several issues with eh_frame generation and unwind info on macOS:
* Fix FDE address calculation by including atom_offset
* Store and use pc_range (function size) from FDE for coverage checks
* Avoid creating null unwind records for addresses already covered by
DWARF FDEs, allowing unwinder to properly fall back to DWARF info
* Remove incorrect addend adjustment in personality pointer calculation
This fixes a regression from a couple of commits ago; breaking from the
`else` block of a loop to the loop's tag should be allowed when explicitly
targeting the label by name.
This commit aims to simplify and de-duplicate the logic required for
semantically analyzing `switch` expressions.
The core logic has been moved to `analyzeSwitchBlock`, `resolveSwitchBlock`
and `finishSwitchBlock` and has been rewritten around the new iterator-based
API exposed by `Zir.UnwrappedSwitchBlock`.
All validation logic and switch prong item resolution have been moved to
`validateSwitchBlock`, which produces a `ValidatedSwitchBlock` containing
all the necessary information for further analysis.
`Zir.UnwrappedSwitchBlock`, `ValidatedSwitchBlock` and `SwitchOperand`
replace `SwitchProngAnalysis` while adding more flexibility, mainly for
better integration with `switch_block_err_union`.
`analyzeSwitchBlock` has an explicit code path for OPV types which lowers
them to either a `block`-`br` or a `loop`-`repeat` construct instead of a
switch. Backends expect `switch` to actually have an operand that exists
at runtime, so this is a bug fix and avoids further special cases in the
rest of the switch logic.
`resolveSwitchBlock` and `finishSwitchBr` exclusively deal with operands
which can have more than one value, at comptime and at runtime respectively.
This commit also reworks `RangeSet` to be an unmanaged container and adds
`Air.SwitchBr.BranchHints` to offload some complexity from Sema to there
and save a few bytes of memory in the process.
Additionally, some new features have been implemented:
- decl literals and everything else requiring a result type (`@enumFromInt`!)
may now be used as switch prong items
- union tag captures are now allowed for all prongs, not just `inline` ones
- switch prongs may contain errors which are not in the error set being
switched on, if these prongs contain `=> comptime unreachable`
and some bugs have been fixed:
- lots of issues with switching on OPV types are now fixed
- the rules around unreachable `else` prongs when switching on errors now
apply to *any* switch on an error, not just to `switch_block_err_union`,
and are applied properly based on the AST
- switching on `void` no longer requires an `else` prong unconditionally
- lazy values are properly resolved before any comparisons with prong items
- evaluation order between all kinds of switch statements is now the same,
with or without label
When switching on an error, using the captured value instead of the original
one is always preferable since its error set has been narrowed to only
contain errors which haven't already been handled by other switch prongs.
The subsequent commits will disallow this form as an unreachable `else` prong.
This makes `is_non_err` and `unwrap_errunion_err[_ptr]` friendlier towards
being emitted into comptime blocks. Also, `analyzeIsNonErr` now actually
attempts to do something at comptime.
Moved to a more linear layout which lends itself well to exposing an iterator.
Consumers of this iterator now just have to keep track of an index into a
homogenous sequence of bodies.
The new ZIR layout also enables giving switch prong items result locations
by storing the bodies of all items inside of the switch encoding itself.
There are some deliberate exceptions to this: enum literals and error values
are directly encoded as strings and number literals are resolved to comptime
values outside of the switch block. These special encodings exist to save
space and can easily be resolved during semantic analysis.
This commit also re-implements `AstGen` and `print_zir` for switch based on
the new layout and adds some additional information to the ZIR text repr.
Notably `switchExprErrUnion` has been merged into `switchExpr` to reduce
code duplication.
The rules around allowing an unreachable `else` prong in error switches are
also refined by this commit, and enforced properly based on the actual AST.
The special cases are listed exhaustively below:
`else => unreachable,`
`else => return,`
`else => |e| return e,` (where `e` is any identifier)
Additionally `{...} => comptime unreachable,` prongs are marked to support
future features (refer to next couple of commits).
Also fixes 'value with comptime-only type depends on runtime control flow'
error for labeled error switch statements by surrounding the entire expr
with a common block to break to (see previous commits for details).
Adds `Scope.Unwrapped`, a simple union of pointers to already-casted scopes
with `Scope.Tag` as its tag enum. This pairs very nicely with labeled switch
and gets rid of almost every `scope.cast(...).?`, improving developer QOL :)
Enhances `GenZir` to allow labels to provide separate `break` and `continue`
target blocks and adds some more information on continue targets to
communicate whether the target is a switch block or cannot be targeted by
`continue` at all.
The main motivation is enabling this:
```
const result: u32 = operand catch |err| label: switch (err) {
else => continue :label error.MyError,
error.MyError => break :label 1,
};
```
to be lowered to something like this:
```
%1 = block({
%2 = is_non_err(%operand)
%3 = condbr(%2, {
%4 = err_union_payload_unsafe(%operand)
%5 = break(%1, result) // targets enclosing `block`
}, {
%6 = err_union_code(%operand)
%7 = switch_block(%6,
else => {
%8 = switch_continue(%7, "error.MyError") // targets `switch_block`
},
"error.MyError" => {
%9 = break(%1, @one) // targets enclosing `block`
},
)
%10 = break(%1, @void_value)
})
})
```
which makes the non-error case and all breaks from switch prongs, but not
continues from switch prongs, peers.
This is required to avoid the problems described in gh#11957 for labeled
switches without having to introduce a fairly complex special case to the
`switch_block_err_union` logic. Since this construct is very rare in practice,
introducing this additional complexity just to save a few ZIR bytes is
likely not worth it, so the simplified lowering described above will be
used instead.
As a nice bonus, AstGen can now also detect a `continue` trying to target
a labeled block and emit an appropriate error message.
Previously Zig allowed you to write something like,
```zig
switch (x) {
.y => |_| {
```
This seems a bit strange because in other cases, such as when
capturing the tag in a switch case,
```zig
switch (x) {
.y => |_, _| {
```
this produces an error.
The only usecase I can think of for the previous behaviour is
if you wanted to assert that all union payloads are able
to coerce,
```zig
const X = union(enum) { y: u8, z: f32 };
switch (x) {
.y, .z => |_| {
```
This will compile-error with the `|_|` and pass without it.
I don't believe this usecase is strong enough to keep the current
behaviour; it was never used in the Zig codebase and I cannot
find a single usage of this behaviour in the real world, searching
through Sourcegraph.
This reverts commit 867501d9d2.
It seems that the increased run time was a result of the regressed
test-incremental step and that we are now back to normal.
If the float can store all possible values of the integer without
rounding, coercion is allowed. The integer's precision must be less than
or equal to the float's significand precision.
Closes#18614
Since we ignore this flag in `clang_options_data.zig`, it makes
sense to ignore it for Nix as well. One thing I've been thinking about
is if it would make sense to somehow use `clang_options_data.zig` as
source of truth for handling Nix cflags too rather than slap more
hard-coded escape hatches.
The implementation of HostName.validate was too generous. It considered
strings like ".example.com", "exa..mple.com", and "-example.com" to be
valid hostnames, which is incorrect according to RFC 1123 (the currently
accepted standard).
Reviewed-on: https://github.com/ziglang/zig/pull/25710
The previous logic was made really messy by the fact that upon entry to
the step eval worker, the step may not be ready to run, we may be racing
with other workers doing the same check, and we had already acquired our
RSS requirement even though we might not run. It also required iterating
all dependencies each time we were called to check whether we were even
ready to run yet.
A much better strategy is for each step to have an atomic counter
representing how many of its dependencies are yet to complete. When a
step completes (successfully or otherwise), it decrements this value for
all of its dependants, and if it drops any to 0, it schedules that step
to run. This means each step is scheduled exactly once, and only when
all of its dependencies have finished, reducing redundant checks and
hence contention. If the step being scheduled needs to claim RSS which
isn't available, then it is instead added to `memory_blocked_steps`,
which is iterated by the step worker after a step with an RSS claim
finishes.
This logic is more concise than before, simpler to understand, generally
more efficient, and fixes a bug in the RSS tracking. Also, as a nice
side effect, it should also play a little bit nicer with `Io.Threaded`'s
scheduling strategy, because we no longer spawn extremely short-lived
tasks all the time as we previously did.
Resolves: https://codeberg.org/ziglang/zig/issues/30742
ABI detection previously did not take into account the non-standard
directory structure of Android. This has been fixed.
The API level is detected by running `getprop ro.build.version.sdk`,
since we don't want to depend on bionic, and reading system properties
ourselves is not trivially possible.
clock_nanosleep is specified by POSIX but not implemented on these
hereby shamed operating systems:
* macOS
* OpenBSD (which defines TIMER_ABSTIME for some reason...?)
RISC-V and LoongArch ELF psABIs define a kind of
RELAX relocations which are expected to have a normal
relocation at the same address.
Change-Id: I5737bfcfd3e5041fb6ab7d193c9fc57eeca1eec8
Our implementation did the classic add-sub rounding trick `(y = x +/- C =+ C - x)`
with `C = 1 / eps(T) = 2^(mantissa - 1)`. This approach only works for values whose
magnitude is below the rounding capacity of the constant. For a 64-bit mantissa
(like f80 has), `C = 2^63` only rounds for `|x| < 2^63`. Before we allowed this to
be ran on `e < bias + 64` aka `|x| < 2^64`. And because it isn't large enough,
we lose a bit to rounding.
For reference, the musl implementation does the same thing, using `mantissa - 1`:
https://git.musl-libc.org/cgit/musl/tree/src/math/ceill.c#n18
where `LDBL_MANT_DIG` is 64 for `long double` on x86.
This commit also combines the floor and ceil implementations into one generic one.
Following up on #30571
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30724
Reviewed-by: Andrew Kelley <andrew@ziglang.org>
Co-authored-by: Steven Casper <sebastiancasper3@gmail.com>
Co-committed-by: Steven Casper <sebastiancasper3@gmail.com>
The option should probably still be implemented properly at some point, but LLD
has ignored this for years and nobody seems to mind, so just do the same for
now.
This unblocks using zig cc with CMake on OpenBSD, among other use cases.
ref https://github.com/ziglang/zig/issues/18713
Previously these functions made the assumption that
when performing a on the input digits,
there could be no collisions between the less
significant digits being larger than '9', and the
upper digits being small enough to get past the
checks.
Now we perform a correct check across all of the
digits to ensure they're in between '0'-'9', at
a minimal cost, since all digits are checked in
parallel.
Necessary to work around this:
error: ld.lld: /home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/c67e3042d116da0f73d3129d377044bf/libc++abi.a(/home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/61dd7765ab891b4db54a5e49b8ab8c86/cxa_thread_atexit.o):(function __cxa_thread_atexit: .text.__cxa_thread_atexit+0x40): relocation R_PPC64_REL24_NOTOC out of range: -251925824 is not in [-33554432, 33554431]; references '__cxa_thread_atexit_impl'
note: referenced by cxa_thread_atexit.cpp:116 (/home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/lib/libcxxabi/src/cxa_thread_atexit.cpp:116)
note: defined in /home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/c67e3042d116da0f73d3129d377044bf/libc++abi.a(/home/ci/.cache/act/f23fdcb74e58cd75/hostexecutor/zig-global-cache/o/61dd7765ab891b4db54a5e49b8ab8c86/cxa_thread_atexit.o)
This is only a problem for now because we link libLLVM; once we start invoking
llc instead, we won't need this workaround anymore.
It has been observed in practice on powerpc64(le)-linux that zig.o (in various
stages and build modes) is large enough that the +/- 32MB branch range of
PowerPC is insufficient to reach from one end of the code section to the other.
With LLVM 21, this leads to silent miscompiles that then crash at runtime. With
LLVM 22, it will at least lead to branch range errors that fail the compilation,
but that gets us no closer to a working compiler.
By using these options, we give the linker much greater flexibility to move code
and data around to satisfy these range constraints; without them, the linker is
not allowed to split up the huge code and data sections of zig.o to do so.
Similar issues have also been observed on powerpc-linux (32-bit), hexagon-linux,
and some variations of mips(64)(el)-linux. But let's be conservative for now;
those other targets can be added to the condition later.
As a data point to support this change, it's worth noting that LLD started
enabling these options for LTO precisely because the resulting large compilation
units ran into these range issues. In some abstract sense, Zig can be seen as
doing a limited form of "LTO" in the frontend, so it's not surprising that we
would hit the same issues.
We already did this for some of them; this just makes us consistent. Doing this
gives the linker more flexibility to rearrange code/data, but more importantly,
allows --gc-sections to get rid of all the unused code, which is a real concern
for these libraries in particular.
I believe these tests may have been flaky as a result of the bug fixed
in the previous commit. A big hint is that they were all crashing with
SIGSEGV with no stack trace. I suspect that some lingering SIGIOs from
cancelations were being delivered to a thread after its `munmap` call,
which was happening because the test runner called `Io.Threaded.deinit`
to cause all of the (detached) worker threads to exit.
If this passes, I'll re-run the x86_64-linux CI jobs on this commit a
few times before merge to try and be sure there are no lingering
failures.
Resolves: https://codeberg.org/ziglang/zig/issues/30096
Resolves: https://codeberg.org/ziglang/zig/issues/30592
Resolves: https://codeberg.org/ziglang/zig/issues/30682
As the comment explains, if a signal were to arrive between a detached
thread's `munmap` and `exit` calls, the signal handler would immediately
trigger SIGSEGV due to the stack being unmapped. To solve this, we need
to block all signals before entering this logic. The musl implementation
which this logic was ported from does this exact thing; that logic was
just lost when porting.
Notably, this would lead to a crash with no stack trace, because the
SIGSEGV handler would itself crash due to the missing stack.
It doesn't make any sense for a task to be canceled while it's
panicking.
As a happy accident, this also solves some cases where safety panics in
`Io.Threaded` would cause stack traces not to print due to invalid
thread-local state: when cancelation is blocked, `Io.Threaded` doesn't
consult said thread-local state at all. For instance, try inserting a
panic just after a call to `Syscall.start()` in `Io.Threaded`, and then
call the `Io` function in question from a `concurrent` task. Before this
PR, the stack trace fails to print, because the panic handler sees the
thread-local cancelation state in an unexpected state, leading to a
recursive panic. After this PR, the stack trace prints fine.
There are various reasons why one might want to still create libc-less
compilations on these targets. Case in point: Compiling our bundled crt0 for
OpenBSD.
We will still default to linking libc on these targets, though.
This was removed in 125a9aa82b. But now that our
x86_64-linux machines have had their resources properly allocated, resulting in
runs taking 1-3 hours rather than 4-8, we can add this back.
Remove the RemoveDir step with no replacement. This step had no valid
purpose. Mutating source files? That should be done with
UpdateSourceFiles step. Deleting temporary directories? That required
creating the tmp directories in the configure phase which is broken.
Deleting cached artifacts? That's going to cause problems.
Similarly, remove the `Build.makeTempPath` function. This was used to
create a temporary path in the configure place which, again, is the
wrong place to do it.
Instead, the WriteFile step has been updated with more functionality:
tmp mode: In this mode, the directory will be placed inside "tmp" rather
than "o", and caching will be skipped. During the `make` phase, the step
will always do all the file system operations, and on successful build
completion, the dir will be deleted along with all other tmp
directories. The directory is therefore eligible to be used for
mutations by other steps. `Build.addTempFiles` is introduced to
initialize a WriteFile step with this mode.
mutate mode: The operations will not be performed against a freshly
created directory, but instead act against a temporary directory.
`Build.addMutateFiles` is introduced to initialize a WriteFile step with
this mode.
`Build.tmpPath` is introduced, which is a shortcut for
`Build.addTempFiles` followed by `WriteFile.getDirectory`.
* give Cache a gpa rather than arena because that's what it asks for
These are handled by Io.Dir now. This is part of an effort to eliminate
error.OperationCanceled from the std lib. Also an effort to delete
all std.posix functions.
* cache nul handle for child process execution
* make opening the nul file integrate properly with cancelation
* replace all calls to SleepEx to parking_sleep.sleep instead, making
them properly cancelable. These sleeps are workarounds for Windows
kernel bugs. Now you can even cancel while waiting for kernel bug
workarounds!
error: memory usage peaked at 6.05GB (6047436800 bytes), exceeding the
declared upper bound of 6.04GB (6044158771 bytes)
This value isn't meant to be OS-specific anyway.
this simplifies the implementation, allows specializing it, and allows
deleting the corresponding free function. In practice this is how it is
always used anyway.
this gets the build runner compiling again on linux
this work is incomplete; it only moves code around so that environment
variables can be wrangled properly. a future commit will need to audit
the cancelation and error handling of this moved logic.
Because I accidentially squished two commit and can't figure out how to separate them
again, this also
- standardizes some doc-comments
- makes a slight change to `std.ArrayList`'s `initCapacity`'s doc-comment
TL;DR: "r" considered harmful.
If LLVM chose registers badly, the inline asm which cleans up a thread
on Linux could, on all architectures other than x86_64, clobber either
`munmap` argument with the other argument *or* with the syscall number.
This would cause munmap to return EINVAL, and we would literally *never*
free the thread memory, which isn't ideal.
As it turns out, this was happening on MIPS, and was the cause of the
failures we've recently been seeing for that target: QEMU genuinely was
running out of memory (or at least, the virtualized address space was
getting too fragmented to map many contiguous pages). I've therefore
re-enabled a test which was disabled due to that flakiness.
This bug was accidentally fixed for x86_64 back in 2022 (see 59e33b447),
which probably helped it to go unnoticed for as long as it did!
Resolves: https://codeberg.org/ziglang/zig/issues/30216
Change the log implementation to prepend the current target and update
to all logs which happen during an update.
Makes progress on https://github.com/ziglang/zig/issues/22510, but does
not fully resolve it.
Previously, 64-bit '<<|' operations were emitting 64-bit shifts with one
64-bit operand and one 32-bit operand, which is illegal. Instead, as in
the lowering for regular shifts, we need to cast the RHS in this case.
This was missed when updating to the new group cancelation API, and
caused illegal behavior in many cases (the condition was simply that a
DNS query returned a second result before a connection was successfully
established).
This commit includes some API changes which I agreed with Andrew as a
follow-up to the recent `Io.Group` changes:
* `Io.Group.await` *does* propagate cancelation to group tasks; it then
waits for them to complete, and *also* returns `error.Canceled`. The
assertion that group tasks handle `error.Canceled` "correctly" means
this behavior is loosely analagous to how awaiting a future works. The
important thing is that the semantics of `Group.await` and
`Future.await` are similar, and `error.Canceled` will always be
visible to the caller (assuming correct API usage).
* `Io.Group.awaitUncancelable` is removed.
* `Future.await` calls `recancel` only if the "child" task (the future
being awaited) did not acknowledge cancelation. If it did, then it is
assumed that the future will propagate `error.Canceled` through
`await` as needed.
As of this branch, the performance impact of robust cancelation is now
negligible (and in fact entirely unmeasurable in almost all cases), so
there is no good reason to not enable it in all cases. The performance
issues before were primarily down to a typo in the robust cancelation
logic which resulted in every canceled syscall potentially being sent
hundreds of signals in quick succession, because the delay between
signals started out at 1ns instead of 1us!
The most interesting thing here is the replacement of the pthread futex
implementation with an implementation based on thread park/unpark APIs.
Thread parking tends to be the primitive provided by systems which do
not have a futex primitive, such as NetBSD, so this implementation is
far more efficient than the pthread one. It is also useful on Windows,
where `RtlWaitOnAddress` is itself a userland implementation based on
thread park/unpark; we can implement it ourselves including support for
features which Windows' implementation lacks, such as cancelation and
waking a number of waiters with 1<n<infinity.
Compared to the pthread implementation, this thread-parking-based one
also supports full robust cancelation. Thread parking also turns out to
be useful for implementing `sleep`, so is now used for that on Windows
and NetBSD.
This commit also introduces proper cancelation support for most Windows
operations. The most notable omission right now is DNS lookups through
`GetAddrInfoEx`, just because they're a little more work due to having
a unique cancelation mechanism---but the machinery is all there, so I'll
finish gluing it together soon.
As of this commit, there are very few parts of `Io.Threaded` which do
not support full robust cancelation. The only ones which actually really
matter (because they could block for a prolonged period of time) are DNS
lookups on Windows (as discussed above) and futex waits on WASM.
The goal of this internal refactor is to fix some bugs in cancelation
and allow group tasks to clean up their own resources eagerly. The
latter will become a guarantee of the `std.Io` interface, which is
important so that groups can be used to "detach" tasks.
This commit changes the API which POSIX system calls use internally (the
functions formerly called `beginSyscall` etc), but does not update the
usage sites yet.
This reverts commit c9fa8e46df.
This commit was failing CI checks. This failure was unfortunately not
noticed before merge, due in part to the build runner bug fixed in the
last commit.
When linking libc, it should be the libc that manages the heap. The main
Wasm memory might have been configured as non-growable, which makes
`WasmAllocator` a poor default and causes the common `DebugAllocator`
use case fail with OOM errors unless the user uses `std_options` to
override the default page allocator. Additionally, on Emscripten,
growing Wasm memory without notifying the JS glue code will cause array
buffers to get detached and lead to spurious crashes.
Fixes a issue in tryFindProgram where it would fail if the PATH environment variable contained relative paths, due to its incorrect assumption that the full_path argument is always absolute path.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30619
Co-authored-by: hixuyuming <hixuyuming@gmail.com>
Co-committed-by: hixuyuming <hixuyuming@gmail.com>
Reject low-order points by checking projective coordinates directly
instead of using affine coordinates.
Equivalent, but saves CPU cycles (~254 field multiplications total
before, 3 field multiplications after).
Now, the return type of functions spawned with `Group.async` and
`Group.concurrent` may be anything that coerces to `Io.Cancelable!void`.
Before this commit, group tasks were the only exception to the rule
"error.Canceled should never be swallowed". Now, there is no exception,
and it is enforced with an assertion upon closure completion.
Finally, fixes a case of swallowing error.Canceled in the compiler,
solving a TODO.
There are three ways to handle `error.Canceled`. In order of most
common:
1. Propagate it
2. After receiving it, io.recancel() and then don't propagate it
3. Make it unreachable with io.swapCancelProtection()
Rename `wait` to `await` to be consistent with Future API. The
convention here is that this set of functionality goes together:
* async/concurrent
* await/cancel
Also rename Select `wait` to `await` for the same reason.
`Group.await` now can return `error.Canceled`. Furthermore,
`Group.await` does not auto-propagate cancelation. Instead, users should
follow the pattern of `defer group.cancel(io);` after initialization,
and doing `try group.await(io);` at the end of the success path.
Advanced logic can choose to do something other than this pattern in the
event of cancelation.
Additionally, fixes a bug in `std.Io.Threaded` future await, in which it
swallowed an `error.Canceled`. Now if a task is canceled while awaiting
a future, after propagating the cancel request, it also recancels,
meaning that the awaiting task will properly detect its own cancelation
at the next cancelation point.
Furthermore, fixes a bug in the compiler where `error.Canceled` was
being swallowed in `dispatchPrelinkWork`.
Finally, fixes std.crypto code that inappropriately used
`catch unreachable` in response to cancelation without even so much as a
comment explaining why it was believed to be unreachable. Now, those
functions have `error.Canceled` in the error set and propagate
cancelation properly.
With this way of doing things, `Group.await` has a nice property: even if
all tasks in the group are CPU bound and without cancelation points, the
`Group.await` can still be canceled. In such case, the task that was
waiting for `await` wakes up with a chance to do some more resource
cleanup tasks, such as canceling more things, before entering the
deferred `Group.cancel` call at which point it has to suspend until the
canceled but uninterruptible CPU bound tasks complete.
closes#30601
* According to OpenBSD's getdents docs indicate the buffer must be
greater or or equal to the block size associated with the file and to
refer to stat(2).
* Use S_BLKSIZE, which is 512, instead of @sizeOf(std.c.dirent), which is 280.
* Oddly the other BSDs are not this picky.
move some of the clearing logic from std.Io.Threaded to the Io interface
layer, thus allowing it to be skipped by advanced usage code via calling
the vtable functions directly.
then take advantage of this in std.Progress to avoid clearing the
terminal twice.
closes#30611
I know the real solution would be to fix this.
But this is still way better than unexplained bugs.
I hope this is the correct way to do this.
I've tested it and it seems to work as advertised.
This may not fix it in the build system... but yk, better than nothing.
Reviewed-on: https://codeberg.org/ziglang/zig/pulls/30229
Reviewed-by: Andrew Kelley <andrewrk@noreply.codeberg.org>
Co-authored-by: Sivecano <sivecano@gmail.com>
Co-committed-by: Sivecano <sivecano@gmail.com>
On Windows, changing the value at the target address is not sufficient, there needs to be a wake call.
From [MSDN](https://learn.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-waitonaddress):
> If the value at Address differs from the value at CompareAddress, the function returns immediately. If the values are the same, the function does not return until another thread in the same process signals that the value at Address has changed by calling WakeByAddressSingle or WakeByAddressAll or the timeout elapses, whichever comes first.
We could note that this behavior is only observed on Windows. However, if we want the function to have consistent behavior across all platforms, we should require users to call wake. On platforms where changing the value is sufficient to wake the waiting thread, an unblocking of the thread without a matching wake would just be considered a spurious wake.
Some filesystems, such as ZFS, do not report atime. It's pretty useless
in general, so make it an optional field in File.Stat.
Also take the opportunity to make setting timestamps API more flexible
and match the APIs widely available, which have UTIME_OMIT and UTIME_NOW
constants that can be independently set for both fields.
This is needed to handle smoothly the case when atime is null.
Because `std.Io` forces analysis of unused functions in the vtable,
there are now references to 4 more WASI functions, even though the
compiler does not actually call them at runtime.
Therefore, these new stubs fix the bootstrap process after a zig1.wasm
update (though this branch does *not* update zig1.wasm).
I resisted adding this because it's generally better to create a
File.Reader instead, but we do need a simple wrapper around the vtable
function, and there are use cases for it such as std.Progress.
Description of problem:
- wasm linker does GC in flush()
- it has the mechanism where it tracks the end index of a bunch of
ArrayHashMap before flush() and after flush, shrinkRetainingCapacity()
them to restore them to pre-flush() state
- this includes `functions`, which contains `__divti3`
- flush() notices the call to `__divti3` and calls markFunctionImport(),
but that function does nothing on a second update because `alive` is
already set to `true` so it incorrectly skips adding the intrinsic
back to `functions`
I tried to remember why I thought it was OK to use this `alive` flag
which is state that's not being restored after flush(). If I remember
correctly, I was just leaving the code how it was before, with the plan
to change the data layout after encountering this exact problem.
However, I found a solution that doesn't require changing data layout,
and still takes advantage of the 1-bit-per-symbol data layout.
There's a good argument to not have this in the std lib but it's more
work to remove it than to leave it in, and this branch is already
20,000+ lines changed.
* std.option allows overriding the debug Io instance
* if the default is used, start code initializes environ and argv0
also fix some places that needed recancel(), thanks mlugg!
See #30562
This is needed unfortunately for OpenBSD and Haiku for process
executable path.
I made it so that you can omit the options usually, but you get a
compile error if you omit the options on those targets.
* policy is to always handle EINTR from all syscalls
* assert the result of realpath
* don't ignore errors from realpath
* hard-code path separators for code simplicity
Since we know the read will fail for directories, we can take advantage
of Windows being able to fail with IsDir during open to avoid needing to
wait until the read to find out about the directory-ness of the file.
It's better to avoid references to this global variable, but, in the
cases where it's needed, such as in std.debug.print and collecting stack
traces, better to share the same instance.
Stephen Gregoratto (original author of the fallback) says:
I read through both libcs again to compare:
Musl:
- `stat` path. Check error.
- If path is a symlink, return `OPNOTSUPP`.
- `openat` path as `O_PATH|O_NOFOLLOW|O_CLOEXEC`. Return `OPNOTSUPP` if we got `ELOOP`.
- Build procfs filename.
- `stat` procfs file.
- If procfs file is a symlink, return `OPNOTSUPP`.
- close path fd.
Glibc:
- open path as `O_PATH|O_NOFOLLOW|O_CLOEXEC`.
- fstatat path fd.
- If path is a symlink, return `OPNOTSUPP`.
- Build procfs filename.
- chmod procfs filename. Return `OPNOTSUPP` if we got `ENOENT`.
- close path fd.
I prefer glibc since you open the path first, which avoids a possible
TOCTOU race.
The only known situation in which this occurs is when using musl with
the undocumented extension PTHREAD_CANCEL_MASKED, causing the next
syscall to return ECANCELED.
However zig std lib does not use this mechanism, even when targeting
musl libc because it would require each async task to be on a fresh
pthread.
For the same reason, if third party code were to cause ECANCELED to be
returned from any of these syscalls, it would cause subsequent tasks to
be incorrectly canceled since they cannot be rearmed.
Thus, this Io implementation cannot handle this error code correctly,
expecting never to receive it.
It's important not to swallow error.Canceled. We don't have recancel()
yet but that will be a way to "rearm" cancelation after handling it so
that it is not ignored.
At first I thought about keeping this as an assertion but I can see this
being useful if you already know how many bytes to read and you are
filling the end of the buffer.
This also more closely mirrors POSIX APIs.
This is one way of addressing/closing https://github.com/ziglang/zig/issues/16738
Previously, there was a mismatch between the default behaviors on Windows vs other platforms, where Windows was implicitly using .NON_DIRECTORY_FILE for its `openFile` implementation which caused `error.IsDir` when opening a directory, while on other platforms there is no equivalent flag for the `open` syscall. This meant that `openFile` on a path of a directory would fail on Windows but succeed on other platforms.
Adding `allow_directory` to `File.OpenFlags` serves two purposes:
1. It provides a cross-platform way to get the `.NON_DIRECTORY_FILE` behavior in the most efficient available way for the platform (on Windows, no extra syscalls are required, on other systems, an extra `fstat` is required)
2. It allows `statFile` to be implemented on top of `openFile` on Windows while still allowing `statFile` to work on directory paths. Before this commit, `statFile` on a directory path on Windows failed with `error.IsDir`
Note: The second purpose could have been addressed in different ways (bespoke call to NtCreateFile in the `statFile` implementation to avoid passing `NON_DIRECTORY_FILE`, or just never pass `NON_DIRECTORY_FILE` in the `openFile` implementation), so the first purpose is the more relevant/motivating force behind this change.
The default being `true` is intended to cut down on the number of syscalls as much as possible when using the default flags.
use the application's Io implementation where possible. This correctly
makes writing to stderr cancelable, fallible, and participate in the
application's event loop. It also removes one more hard-coded
dependency on a secondary Io implementation.
In case exit(0) will be called, this provides the opportunity for the
application's Io instance to be the one to clear the terminal in case
std.Progress or similar was used.
This commit sketches an idea for how to deal with detection of file
streams as being terminals.
When a File stream is a terminal, writes through the stream should have
their escapes stripped unless the programmer explicitly enables terminal
escapes. Furthermore, the programmer needs a convenient API for
intentionally outputting escapes into the stream. In particular it
should be possible to set colors that are silently discarded when the
stream is not a terminal.
This commit makes `Io.File.Writer` track the terminal mode in the
already-existing `mode` field, making it the appropriate place to
implement escape stripping.
`Io.lockStderrWriter` returns a `*Io.File.Writer` with terminal
detection already done by default. This is a higher-level application
layer stream for writing to stderr.
Meanwhile, `std.debug.lockStderrWriter` also returns a `*Io.File.Writer`
but a lower-level one that is hard-coded to use a static single-threaded
`std.Io.Threaded` instance. This is the same instance that is used for
collecting debug information and iterating the unwind info.
This decision should be audited and discussed.
Some factors:
* Passing an Io instance into start.
* Avoiding reference to global static instance if it won't be used, so
that it doesn't bloat the executable.
* Being able to use std.debug.print, and related functionality when
debugging std.Io instances and std.Progress.
instead, allow the user to set it as a field.
this fixes a bug where leak printing and error printing would run tty
config detection for stderr, and then emit a log, which is not necessary
going to print to stderr.
however, the nice defaults are gone; the user must explicitly assign the
tty_config field during initialization or else the logging will not have
color.
related: https://github.com/ziglang/zig/issues/24510
The 6.17 kernel added[1] syscalls for getting/setting certain file flags
and attributes. It's meant to be a more extensible replacement for these
ioctl's:
- `FS_IOC_GETFLAGS`/`FS_IOC_SETFLAGS`.
- `FS_IOC_FSGETXATTR`/`FS_IOC_FSSETXATTR`.
The definitions of these calls are as follows:
```zig
const file_attr = extern struct {
/// Extended flags that apply to this file. (get/set).
xflags: u64,
/// Preferred extent allocation size, in bytes. (get/set).
extsize: u32,
/// One of:
/// - The number of data extents in this file.
/// - If `FS_IOC_FSGETXATTRA` is set, the number of extended attribute events in the file.
/// (get)
nextents: u32,
/// Project Identifier (get/set).
projid: u32,
/// Preferred extent allocation size for CoW operations, in bytes (get/set).
cowextsize: u32,
};
// size=@sizeOf(file_attr)
fn file_getattr(dirfd: fd_t, path: [*:0], fattr: *file_attr, size: usize, at_flags: u32) {}
fn file_setattr(dirfd: fd_t, path: [*:0], fattr: *file_attr, size: usize, at_flags: u32) {}
```
Users need to set/check `xflags` with the `FS_XFLAG` flags defined in
`linux/fs.h`. `ioctl_xfs_fsgetxattr(2)` has more information about the
type of information one can retrieve.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=57fcb7d930d8f00f383e995aeebdcd2b416a187a
Eliminate the `std.Thread.Pool` used in the compiler for concurrency and
asynchrony, in favour of the new `std.Io.async` and `std.Io.concurrent`
primitives.
This removes the last usage of `std.Thread.Pool` in the Zig repository.
More tasks could be added to the group at any time before it completes,
so it's not valid to look at the `token` passed in here.
There's also a related bug in `Threaded`, which is that tasks spawned in
a group after it is canceled will not observe that cancelation, but that
is a more complex bug which needs some deeper design changes.
Move ensureUnusedCapacity inside the mutex call to prevent
a race condition where other worker threads could append to
memory_blocked_steps between checking the length and iterating.
repro branch: https://codeberg.org/erg/zig/src/branch/fix-30014-repro
check out that branch, and depending on if you have the fix-30014 patch or not,
it will either race condition or succeed
Fixes#30014
Queues can now be "closed". A closed queue cannot have more elements
appended with `put`, and blocked calls to `put` will immediately unblock
having failed to append some elements. Calls to `get` will continue to
succeed as long as the queue buffer is non-empty, but will then never
block; already-blocked calls to `get` will unblock.
All queue get/put operations can now return `error.Closed` to indicate
that the queue has been closed. For bulk get/put operations, they may
add/receive fewer elements than the minimum requested *if* the queue was
closed or the calling task was canceled. In that case, if any elements
were already added/received, they are returned first, and successive
calls will return `error.Closed` or `error.Canceled`.
Also, fix a bug where `Queue.get` could deadlock because it incorrectly
blocked until the given buffer was *filled*.
Resolves: #30141
This work was partially cherry-picked from Andrew's WIP std.fs branch.
However, I also analyzed and simplified the Mutex and Condition
implementations, and brought them in line with modern Zig style.
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
Previously, `E1!void` failed to coerce to `E2!E1!void` because it tried
to coerce the `E1` error to an `E2` if comptime-known and refused to
attempt any other coercion. This is a strange language rule with no
clear justification, and breaks real use cases (such as calling `getOne`
on an `Io.Queue(E!T)`). Instead, only error *sets* should try to coerce
to an "error" value of an error union type.
On the x86_64 self hosted backend, thread locals are accessed through
__tls_get_addr on PIC. Usually this goes through a fast path which does
not lose any registers, however in some cases (notably any dlopened
library on my machine) this can take a slow path which calls out to C
ABI functions
Catch this case and backup registers as necessary
Fix a few other ones while we're here. Credit to mlugg
Fixes#30183
Targets that don't support tail calls will see:
/home/ci/zig/.zig-cache/o/35dbe82c8e4d49ae5b7d630329568133/tmp.zig:5:5: error: unable to perform tail call: compiler backend 'stage2_llvm' does not support tail calls on target architecture 'powerpc64le' with the selected CPU feature flags
So just run this test on a known-good target.
The number of possible errors in the theoretical error set of this function is very large, while each individual call to the function is only concerned with a (potentially small) subset of those errors that are specific to the control code being used. This commit makes the callers determine which statuses they are interested in to avoid an ever-ballooning error set and ever-growing switch cases at each call site that throw away most of those errors.
Maintaining the POSIX `stat` bits for Zig is a pain. The order and
bit-length of members differ between all architectures, and int types
can be signed or unsigned. The libcs deal with this by introducing the
own version of `struct stat` and copying the kernel structure members to
it. In the case of glibc, they did it twice thanks to the largefile
transition!
In practice, the project needs to maintain three versions of `struct
stat`:
- What the kernel defines.
- What musl wants for `struct stat`.
- What glibc wants for `struct stat64`. Make sure to use `fstatat64`!
This isn't as simple as running `zig translate-c`. In #21440 I had to:
- Compile toolchains for each arch+glibc/musl combo.
- Create a test `fstat` program with/without `FILE_OFFSET_BITS=64`.
- Dump the value for `struct stat`.
- Stare at `std.os.linux`/`std.c` and cry.
- Add some missing padding.
The fact that so many target checks in the `linux` and `posix` tests
exist is most likely due to writing to padding bits and failing later.
The solution to this madness is `statx(2)`:
- It takes a single structure that is the same for all arches AND libcs.
- It uses a custom timestamp format, but it is 64-bit ready.
- It gives the same info as `fstatat(2)` and more!
- Unlike `fstatat(2)`, you can request a subset of the info required
based on passing a mask.
It's so good that modern Linux arches (e.g. riscv) don't even implement
`stat`, with the libcs using a generic `struct stat` and copying from
`struct statx`.
Therefore, this commit rips out all the `stat` bits from `std.os.linux`
and `std.c`. `std.posix.Stat` is now `void`, and calling
`std.posix.*stat` is an compile-time error. A wrapper around `statx` has
been added to `std.os.linux`, and callers have been upgraded to use it.
Tests have also been updated to use `statx` where possible.
While I was here, I converted the mask and file attributes to be packed
struct bitfields. A nice side effect is checking that you actually
recieved the members you asked for via `Statx.mask`, which I have used
by adding `assert`s at specific callsites.
The ML-KEM decapsulation was returning z directly when implicit
rejection was triggered, but FIPS 203 specifies it should return
J(z || c) = SHAKE256(z || c).
FreeBSD 15 dropped this target. It also dropped x86-freebsd-none, but since that
target remains buildable and usable with the 32-bit compat layer, we keep it for
now.
After https://codeberg.org/ziglang/zig/pulls/30103, `raw_c_allocator` is
redundant. It existed to avoid overhead when you could assert that all
of your `Allocator` usage was going to be compatible with the C `malloc`
API, but the standard `c_allocator` is now able to avoid that overhead
*in the case* that your usage is compatible (and use the less efficient
path in the rare case where it's not), so there's no need for the raw
version anymore. Leaving it in `std.heap` at this point seems like it
would just be a footgun.
This was presumably unintentional, as the old behavior looked quite odd
when you noticed it. All of the line-drawing characters ought to be the
same color; only the step name/status is dimmed when a step is "reused"
(i.e. appears multiple times in the tree).
https://github.com/ziglang/zig/issues/26027#issuecomment-3571227050
tracked some bad performance in `DebugAllocator` on macOS down to a
function in dyld which `std.debug.SelfInfo` was calling into. It turns
out `dladdr`'s symbol lookup logic is horrendously slow (looking at its
source code, it appears to be doing a *linear scan* over all symbols in
the image?!). However, we don't actually need the symbol, so we want to
try and avoid this logic.
Luckily, dyld has more precise APIs for what we need! Unluckily, Apple,
in their infinite wisdom, decided they should be deprecated in favour of
`dladdr`, despite the latter being several times slower (and by "several
times", I have measured a 50x slowdown on repeated calls to `dladdr`
compared to the other API). But luckily again, the deprecated APIs are
still exposed.
So, after a careful analysis of the situation (reading dyld code and
cursing Apple engineers), I think it makes sense to just use these
deprecated APIs for now. If they ever go away, we can write our own
cache for this data to bypass Apple's awfully slow code, but I suspect
these functions will stick around for the foreseeable future.
Uh, and if `_dyld_get_image_header_containing_address` goes away,
there's also `dyld_image_header_containing_address`, which is a
seemingly identical function, exported by dyld just the same, but with a
separate (functionally identical) implementation, and not documented in
the public header file. Apple work in mysterious ways, I guess.
The main goal here was to avoid allocating padding and header space if
`malloc` already guarantees the alignment we need via `max_align_t`.
Previously, the compiler was using `std.heap.raw_c_allocator` as its GPA
in some cases depending on `std.c.max_align_t`, but that's pretty
fragile (it meant we had to encode our alignment requirements into
`src/main.zig`!). Perhaps more importantly, that solution is
unnecessarily restrictive: since Zig's `Allocator` API passes the
`Alignment` not only to `alloc`, but also to `free` etc, we are able to
use a different strategy depending on its value. So `c_allocator` can
simply compare the requested align to `Alignment.of(std.c.max_align_t)`,
and use a raw `malloc` call (no header needed!) if it will guarantee a
suitable alignment (which, in practice, will be true the vast majority
of the time).
So in short, this makes `std.heap.c_allocator` more memory efficient,
and probably removes any incentive to use `std.heap.raw_c_allocator`.
I also refactored the `c_allocator` implementation while doing this,
just to neaten things up a little.
This avoids pessimizing concurrency on all machines due to e.g. the macOS
machine having high memory usage across the board due to 16K page size.
This also adds max_rss to test-unit and test-c-abi since those tend to eat a
decent chunk of memory too.
If ECANCELED occurs, it's from pthread_cancel which will *permanently*
set that thread to be in a "canceling" state, which means the cancel
cannot be ignored. That means it cannot be retried, like EINTR. It must
be acknowledged.
Unfortunately, trying again until the cancellation request is
acknowledged has been observed to incur a large amount of overhead,
and usually strong cancellation guarantees are not needed, so the
race condition is not handled here. Users who want to avoid this
have this menu of options instead:
* Use no libc, in which case Zig std lib can avoid the race (tracking
issue: https://codeberg.org/ziglang/zig/issues/30049)
* Use musl libc
* Use `std.Io.Evented`. But this is not implemented yet. Tracked by
- https://codeberg.org/ziglang/zig/issues/30050
- https://codeberg.org/ziglang/zig/issues/30051
glibc + threaded is the only problematic combination.
On a heavily loaded Linux 6.17.5, I observed a maximum of 20 attempts
not acknowledged before the timeout (including exponential backoff) was
sufficient, despite the heavy load.
The time wasted here sleeping is mitigated by the fact that, later on,
the system will likely wait for the canceled task, causing it to
indefinitely yield until the canceled task finishes, and the task must
acknowledge the cancel before it proceeds to that point.
Now, before a syscall is entered, beginSyscall is called, which may
return error.Canceled. After syscall returns, whether error or success,
endSyscall is called. If the syscall returns EINTR then checkCancel is
called.
`cancelRequested` is removed from the std.Io VTable for now, with plans
to replace it with a more powerful API that allows protection against
cancellation requests.
closes#25751
The inverse MixColumns operation is already used internally for
AES decryption, but it wasn’t exposed in the public API because
it didn’t seem necessary at the time.
Since then, several new AES-based block ciphers and permutations
(such as Vistrutah and Areion) have been developed, and they require
this operation to be implementable in Zig.
Since then, new interesting AES-based block ciphers and permutations
(Vistrutah, Areion, etc). have been invented, and require that
operation to be implementable in Zig.
This was caused a `[0]std.builtin.Type.StructField.Attributes` to be
considered `undefined`, even though that type is OPV so should prefer
its OPV `.{}` over `undefined`.
Resolves: #30039
Basically detect `-idirafter` flag in `NIX_CFLAGS_COMPILE` and treat it like `-isystem`, also detect `NIX_CFLAGS_LINK` environment variable and treat it like the `NIX_LDFLAGS` .
Reference:
74eefb4210/pkgs/build-support/build-fhsenv-chroot/env.nix (L83)
Rewrite `Reader.takeLeb128` to not use `takeMultipleOf7Leb128` and
instead:
* Use byte aligned integers
* Turn the main reading loop into an inlined loop of static length
* Special case small integers (<= 7 bits)
Notably signed and unsigned 32 bit integers have 5x to 12x(!)
performance improvement.
Outside of that:
For u8, u16 and u64 performance increases ~1.5x to ~6x
For i8, i16 and i64 performance increases ~1.5x to ~3.5x
For integers with bit multiples of 7 performance is roughly equal within the
margin or error.
Also expand on test coverage
Microbenchmark: https://zigbin.io/242cb1
Rewrite `writeLeb128` to no longer use `writeMultipleOf7Leb128` and instead:
* Make use of byte aligned ints
* Special case small numbers (fitting inside 7 bits)
Amongst u8, u16, u32 and u64 performance gains are between ~1.5x and ~2x
Amongst i8, i16, i32 ane i64 perfromance gains are between ~2x and >4x
Additinally add test coverage for written encodings
Microbenchmark: https://zigbin.io/7ed5fe
Hybrid KEMs combine a post-quantum secure KEM with a traditional
elliptic curve Diffie-Hellman key exchange.
The hybrid construction provides security against both classical and quantum
adversaries: even if one component is broken, the combined scheme remains
secure as long as the other component holds.
The implementation follows the IETF CFRG draft specification for concrete
hybrid KEMs:
https://datatracker.ietf.org/doc/draft-irtf-cfrg-concrete-hybrid-kems/
KangarooTwelve is a family of two fast and secure extendable-output
functions (XOFs): KT128 and KT256. These functions generalize
traditional hash functions by allowing arbitrary output lengths.
KangarooTwelve was designed by SHA-3 authors. It aims to deliver
higher performance than the SHA-3 and SHAKE functions defined in
FIPS 202, while preserving their flexibility and core security
principles.
On high-end platforms, it can take advantage of parallelism,
whether through multiple CPU cores or SIMD instructions.
As modern SHA-3 constructions, KT128 and KT256 can serve as
general-purpose hash functions and can be used, for example, in
key-derivation, and with arbitrarily large inputs.
RFC9861: https://datatracker.ietf.org/doc/rfc9861/
It seems to me this was simply forgotten.
Or there is some reason I don't know why this code doesn't work for `comptime_float`.
For a more comprehensive fix, https://github.com/ziglang/zig/pull/24057 is the place to look.
This method is called on an identifier token, so let's rename the parameter to make this clear.
This is also how it's named on most of the caller's sides.
This also unifies the rename implementations, since previously `posix.renameW` used `MoveFileEx` while `posix.renameatW` used `NtOpenFile`/`NtSetInformationFile`. This, in turn, allows the `MoveFileEx` bindings to be deleted as `posix.renameW` was the only usage.
This functionality -- if it's actually needed -- can be reintroduced through
some other mechanism. An ABI is clearly not the right way to represent it.
closes#25918
https://github.com/ziglang/zig/issues/25471
This is not the only test that aborts like this, nor does it happen only on
FreeBSD, but it happens to be disproportionally disruptive on FreeBSD in
particular.
The new builtins are:
* `@EnumLiteral`
* `@Int`
* `@Fn`
* `@Pointer`
* `@Tuple`
* `@Enum`
* `@Union`
* `@Struct`
Their usage is documented in the language reference.
There is no `@Array` because arrays can be created like this:
if (sentinel) |s| [n:s]T else [n]T
There is also no `@Float`. Instead, `std.meta.Float` can serve this use
case if necessary.
There is no `@ErrorSet` and intentionally no way to achieve this.
Likewise, there is intentionally no way to reify tuples with comptime
fields, or function types with comptime parameters. These decisions
simplify the Zig language specification, and moreover make Zig code more
readable by discouraging overly complex metaprogramming.
Co-authored-by: Ali Cheraghi <alichraghi@proton.me>
Resolves: #10710
This reverts commit d54fbc0123.
Since all incremental tests are flaky on Windows, this is reinstated and
all test-incremental tests will be skipped on Windows until the
flakiness is resolved.
Closes#26003
If a Reader implementation implements `stream` by ignoring the Writer, writing directly to its internal buffer, and returning 0, then `defaultDiscard` would not update `seek` and also return 0, which is incorrect and can cause `discardShort` to violate the contract of `VTable.discard` by calling into `vtable.discard` with a non-empty buffer.
This commit fixes the problem by advancing seek up to the limit after the stream call. This logic could likely be somewhat simplified in the future depending on how #25170 is resolved.
This commit flips usage of PathType.isSep from requiring the caller to convert to native to assuming the input is LE encoded, which is a breaking change. This makes usage a bit nicer, though, and moves the endian conversion work from runtime to comptime.
while still preserving the guarantee about async() being assigned a unit
of concurrency (or immediately running the task), this change:
* retains the error from calling getCpuCount()
* spawns all threads in detached mode, using WaitGroup to join them
* treats all workers the same regardless of whether they are processing
concurrent or async tasks. one thread pool does all the work, while
respecting async and concurrent limits.
This is a reimplementation of Io.Threaded that fixes the issues
highlighted in the recent Zulip discussion. It's poorly tested but it
does successfully run to completion the litmust test example that I
offered in the discussion.
This implementation has the following key design decisions:
- `t.cpu_count` is used as the threadpool size.
- `t.concurrency_limit` is used as the maximum number of
"burst, one-shot" threads that can be spawned by `io.concurrent` past
`t.cpu_count`.
- `t.available_thread_count` is the number of threads in the pool that
is not currently busy with work (the bookkeeping happens in the worker
function).
- `t.one_shot_thread_count` is the number of active threads that were
spawned by `io.concurrent` past `t.cpu_count`.
In this implementation:
- `io.async` first tries to decrement `t.available_thread_count`. If
there are no threads available, it tries to spawn a new one if possible,
otherwise it runs the task immediately.
- `io.concurrent` first tries to use a thread in the pool same as
`io.async`, but on failure (no available threads and pool size limit
reached) it tries to spawn a new one-shot thread. One shot threads
run a different main function that just executes one task, decrements
the number of active one shot threads, and then exits.
A relevant future improvement is to have one-shot threads stay on for a
few seconds (and potentially pick up a new task) to amortize spawning
costs.
I would like a chance to review this before it lands, please. Feel free
to submit the work again without changes and I will make review
comments.
In the meantime, these reverts avoid intermittent CI failures, and
remove bad patterns from occurring in the standard library that other
users might copy.
Revert "std.crypto: improve KT documentation, use key_length for B3 key length (#25807)"
This reverts commit 4b593a6c24.
Revert "crypto - threaded K12: separate context computation from thread spawning (#25793)"
This reverts commit ee4df4ad3e.
Revert "crypto.kt128: when using incremental hashing, use SIMD when possible (#25783)"
This reverts commit bf9082518c.
Revert "Add std.crypto.hash.sha3.{KT128,KT256} - RFC 9861. (#25593)"
This reverts commit 95c76b1b4a.
When calling QueryObjectName, NT namespaced paths can be returned. This
change appropriately strips the prefix to turn it into an absolute path.
(The above behaviour was observed at least in Wine so far)
Co-authored-by: Ryan Liptak <squeek502@hotmail.com>
Previously, fs.path handled a few of the Windows path types, but not all of them, and only a few of them correctly/consistently. This commit aims to make `std.fs.path` correct and consistent in handling all possible Win32 path types.
This commit also slightly nudges the codebase towards a separation of Win32 paths and NT paths, as NT paths are not actually distinguishable from Win32 paths from looking at their contents alone (i.e. `\Device\Foo` could be an NT path or a Win32 rooted path, no way to tell without external context). This commit formalizes `std.fs.path` being fully concerned with Win32 paths, and having no special detection/handling of NT paths.
Resources on Windows path types, and Win32 vs NT paths:
- https://googleprojectzero.blogspot.com/2016/02/the-definitive-guide-on-win32-to-nt.html
- https://chrisdenton.github.io/omnipath/Overview.html
- https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
API additions/changes/deprecations
- `std.os.windows.getWin32PathType` was added (it is analogous to `RtlDetermineDosPathNameType_U`), while `std.os.windows.getNamespacePrefix` and `std.os.windows.getUnprefixedPathType` were deleted. `getWin32PathType` forms the basis on which the updated `std.fs.path` functions operate.
- `std.fs.path.parsePath`, `std.fs.path.parsePathPosix`, and `std.fs.path.parsePathWindows` were added, while `std.fs.path.windowsParsePath` was deprecated. The new `parsePath` functions provide the "root" and the "kind" of a path, which is platform-specific. The now-deprecated `windowsParsePath` did not handle all possible path types, while the new `parsePathWindows` does.
- `std.fs.path.diskDesignator` has been deprecated in favor of `std.fs.path.parsePath`, and same deal with `diskDesignatorWindows` -> `parsePathWindows`
- `relativeWindows` is now a compile error when *not* targeting Windows, while `relativePosix` is now a compile error when targeting Windows. This is because those functions read/use the CWD path which will behave improperly when used from a system with different path semantics (e.g. calling `relativePosix` from a Windows system with a CWD like `C:\foo\bar` will give you a bogus result since that'd be treated as a single relative component when using POSIX semantics). This also allows `relativeWindows` to use Windows-specific APIs for getting the CWD and environment variables to cut down on allocations.
- `componentIterator`/`ComponentIterator.init` have been made infallible. These functions used to be able to error on UNC paths with an empty server component, and on paths that were assumed to be NT paths, but now:
+ We follow the lead of `RtlDetermineDosPathNameType_U`/`RtlGetFullPathName_U` in how it treats a UNC path with an empty server name (e.g. `\\\share`) and allow it, even if it'll be invalid at the time of usage
+ Now that `std.fs.path` assumes paths are Win32 paths and not NT paths, we don't have to worry about NT paths
Behavior changes
- `std.fs.path` generally: any combinations of mixed path separators for UNC paths are universally supported, e.g. `\/server/share`, `/\server\share`, `/\server/\\//share` are all seen as equivalent UNC paths
- `resolveWindows` handles all path types more appropriately/consistently.
+ `//` and `//foo` used to be treated as a relative path, but are now seen as UNC paths
+ If a rooted/drive-relative path cannot be resolved against anything more definite, the result will remain a rooted/drive-relative path.
+ I've created [a script to generate the results of a huge number of permutations of different path types](https://gist.github.com/squeek502/9eba7f19cad0d0d970ccafbc30f463bf) (the result of running the script is also included for anyone that'd like to vet the behavior).
- `dirnameWindows` now treats the drive-relative root as the dirname of a drive-relative path with a component, e.g. `dirname("C:foo")` is now `C:`, whereas before it would return null. `dirnameWindows` also handles local device paths appropriately now.
- `basenameWindows` now handles all path types more appropriately. The most notable change here is `//a` being treated as a partial UNC path now and therefore `basename` will return `""` for it, whereas before it would return `"a"`
- `relativeWindows` will now do its best to resolve against the most appropriate CWD for each path, e.g. relative for `D:foo` will look at the CWD to check if the drive letter matches, and if not, look at the special environment variable `=D:` to get the shell-defined CWD for that drive, and if that doesn't exist, then it'll resolve against `D:\`.
Implementation details
- `resolveWindows` previously looped through the paths twice to build up the relevant info before doing the actual resolution. Now, `resolveWindows` iterates backwards once and keeps track of which paths are actually relevant using a bit set, which also allows it to break from the loop when it's no longer possible for earlier paths to matter.
- A standalone test was added to test parts of `relativeWindows` since the CWD resolution logic depends on CWD information from the PEB and environment variables
Edge cases worth noting
- A strange piece of trivia that I found out while working on this is that it's technically possible to have a drive letter that it outside the intended A-Z range, or even outside the ASCII range entirely. Since we deal with both WTF-8 and WTF-16 paths, `path[0]`/`path[1]`/`path[2]` will not always refer to the same bits of information, so to get consistent behavior, some decision about how to deal with this edge case had to be made. I've made the choice to conform with how `RtlDetermineDosPathNameType_U` works, i.e. treat the first WTF-16 code unit as the drive letter. This means that when working with WTF-8, checking for drive-relative/drive-absolute paths is a bit more complicated. For more details, see the lengthy comment in `std.os.windows.getWin32PathType`
- `relativeWindows` will now almost always be able to return either a fully-qualified absolute path or a relative path, but there's one scenario where it may return a rooted path: when the CWD gotten from the PEB is not a drive-absolute or UNC path (if that's actually feasible/possible?). An alternative approach to this scenario might be to resolve against the `HOMEDRIVE` env var if available, and/or default to `C:\` as a last resort in order to guarantee the result of `relative` is never a rooted path.
- Partial UNC paths (e.g. `\\server` instead of `\\server\share`) are a bit awkward to handle, generally. Not entirely sure how best to handle them, so there may need to be another pass in the future to iron out any issues that arise. As of now the behavior is:
+ For `relative`, any part of a UNC disk designator is treated as the "root" and therefore isn't applicable for relative paths, e.g. calling `relative` with `\\server` and `\\server\share` will result in `\\server\share` rather than just `share` and if `relative` is called with `\\server\foo` and `\\server\bar` the result will be `\\server\bar` rather than `..\bar`
+ For `resolve`, any part of a UNC disk designator is also treated as the "root", but relative and rooted paths are still elligable for filling in missing portions of the disk designator, e.g. `resolve` with `\\server` and `foo` or `\foo` will result in `\\server\foo`
Fixes#25703Closes#25702
This is relevant to PIEs, which are notably enabled by default on macOS.
The build system needs to only see virtual addresses, that is, those
which do not have the slide applied; but the fuzzer itself naturally
sees relocated addresses (i.e. with the slide applied). We just need to
subtract the slide when we communicate addresses to the build system.
Like ELF, we now have `std.debug.MachOFile` for the host-independent
parts, and `std.debug.SelfInfo.MachO` for logic requiring the file to
correspond to the running program.
When reporting a compile error, we would load the new file, but assume
we could apply old AST/token indices (etc) to it, potentially causing
crashes. Instead, if the file stat has changed since it was loaded, just
emit an error that the file was modified mid-update.
This has no business being here. Tests for our compiler-rt routines should be in
compiler-rt, and tests for our C ABI compliance should be in `test-c-abi`.
Little-endian is what `std.zig.Server` expects, but the old logic just
send the raw bytes of the struct, so sent in native endian (causing a
crash on big-endian targets).
The big-endian logic here was simply incorrect. Luckily, it was also
overcomplicated; after calling `Value.writeToPackedMemory`, there is a
method on `std.math.big.int.Mutable` which just does the correct
endianness load for us.
Integers with padding bits on big-endian targets cannot quite be bitcast
with a trivial memcpy, because the padding bits (which are zext or sext)
are the most-significant, so are at the *lowest* addresses. So to
bitcast to something which doesn't have padding bits, we need to offset
past the padding.
The logic I've added here definitely doesn't handle all possibilities
correctly; I think that would actually be quite complicated. However, it
handles a common case, and so prevents the Zig compiler itself from
being miscompiled on big-endian targets (hence fixing a bootstrapping
problem on big-endian).
- Affects the following functions:
+ `std.fs.Dir.readLinkW`
+ `std.os.windows.ReadLink`
+ `std.os.windows.ntToWin32Namespace`
+ `std.posix.readlinkW`
+ `std.posix.readlinkatW`
Each of these functions (except `ntToWin32Namespace`) took WTF-16 as input and would output WTF-8, which makes optimal buffer re-use difficult at callsites and could force unnecessary WTF-16 <-> WTF-8 conversion during an intermediate step.
The functions have been updated to output WTF-16, and also allow for the path and the output to re-use the same buffer (i.e. in-place modification), which can reduce the stack usage at callsites. For example, all of `std.fs.Dir.readLink`/`readLinkZ`/`std.posix.readlink`/`readlinkZ`/`readlinkat`/`readlinkatZ` have had their stack usage reduced by one PathSpace struct (64 KiB) when targeting Windows.
The new `ntToWin32Namespace` takes an output buffer and returns a slice from that instead of returning a PathSpace, which is necessary to make the above possible.
The reasoning in the comment deleted by this commit no longer applies, since that same benefit can be obtained by using OpenFile with `.filter = .any`.
Also removes a stray debug.print
We were already using this for `stage1/zig.h`, but `stage1/zig1.wasm`
was being modified directly by the `wasm-opt` command. That's a bad idea
because it forces the build system to assume that `wasm-opt` has side
effects, so it is re-run every time you run `zig build update-zig1`,
i.e. it does not interact with the cache system correctly. It is much
better to create non-side-effecting `Run` steps (using `addOutput*Arg`)
where possible so that the build system has a more correct understanding
of the step graph.
A new `Legalize.Feature` tag is introduced for each float bit width
(16/32/64/80/128). When e.g. `soft_f16` is enabled, all arithmetic and
comparison operations on `f16` are converted to calls to the appropriate
compiler_rt function using the new AIR tag `.legalize_compiler_rt_call`.
This includes casts where the source *or* target type is `f16`, or
integer<=>float conversions to or from `f16`. Occasionally, operations
are legalized to blocks because there is extra code required; for
instance, legalizing `@floatFromInt` where the integer type is larger
than 64 bits requires calling an arbitrary-width integer conversion
function which accepts a pointer to the integer, so we need to use
`alloc` to create such a pointer, and store the integer there (after
possibly zero-extending or sign-extending it).
No backend currently uses these new legalizations (and as such, no
backend currently needs to implement `.legalize_compiler_rt_call`).
However, for testing purposes, I tried modifying the self-hosted x86_64
backend to enable all of the soft-float features (and implement the AIR
instruction). This modified backend was able to pass all of the behavior
tests (except for one `@mod` test where the LLVM backend has a bug
resulting in incorrect compiler-rt behavior!), including the tests
specific to the self-hosted x86_64 backend.
`f16` and `f80` legalizations are likely of particular interest to
backend developers, because most architectures do not have instructions
to operate on these types. However, enabling *all* of these legalization
passes can be useful when developing a new backend to hit the ground
running and pass a good amount of tests more easily.
Simplifies the logic, clarifies the comment, and fixes a minor bug,
which is that we exported the Windows ABI name *instead* of the standard
compiler-rt name, but it's meant to be exported *in addition* to the
standard name (this is LLVM's behavior and it is more useful).
The build runner was previously forcing child processes to have their
stderr colorization match the build runner by setting `CLICOLOR_FORCE`
or `NO_COLOR`. This is a nice idea in some cases---for instance a simple
`Run` step which we just expect to exit with code 0 and whose stderr is
not being programmatically inspected---but is a bad idea in others, for
instance if there is a check on stderr or if stderr is captured, in
which case forcing color on the child could cause checks to fail.
Instead, this commit adds a field to `std.Build.Step.Run` which
specifies a behavior for the build runner to employ in terms of
assigning the `CLICOLOR_FORCE` and `NO_COLOR` environment variables. The
default behavior is to set `CLICOLOR_FORCE` if the build runner's output
is colorized and the step's stderr is not captured, and to set
`NO_COLOR` otherwise. Alternatively, colors can be always enabled,
always disabled, always match the build runner, or the environment
variables can be left untouched so they can be manually controlled
through `env_map`.
Notably, this fixes a failure when running `zig build test-cli` in a
TTY (or with colors explicitly enabled). GitHub CI hadn't caught this
because it does not request color, but Codeberg CI now does, and we were
seeing a failure in the `zig init` test because the actual output had
color escape codes in it due to 6d280dc.
Apple's own headers and tbd files prefer to think of Mac Catalyst as a distinct
OS target. Earlier, when DriverKit support was added to LLVM, it was represented
a distinct OS. So why Apple decided to only represent Mac Catalyst as an ABI in
the target triple is beyond me. But this isn't the first time they've ignored
established target triple norms (see: armv7k and aarch64_32) and it probably
won't be the last.
While doing this, I also audited all Darwin OS prongs throughout the codebase
and made sure they cover all the tags.
There is approximately zero chance of the Zig team ever spending any effort on
supporting Cygwin; the MSVC and MinGW-w64 ABIs are superior in every way that
matters, and not least because they lead to binaries that just run natively on
Windows without needing a POSIX emulation environment installed.
It's easy to do FP unwinding from a CPU context: you just report the
captured ip/pc value first, and then unwind from the captured fp value.
All this really needed was a couple of new functions on the
`std.debug.cpu_context` implementations so that we don't need to rely on
`std.debug.Dwarf` to access the captured registers.
Resolves: #25576
The changes to `codegen.c` are blatant hacks, but the problem they work
around isn't a regression: it's an existing miscompilation. This branch
happened to *expose* that miscompilation in more cases by changing how
an incorrect result is *used*.
It turns out we did use these in the C backend. However, it's really
just as easy, if not easier, to replicate the logic directly in C.
Synchronizes stage1/zig.h to make sure the bootstrap doesn't depend on
these functions either. The actual zig1 tarball is unmodified because
regenerating it is unnecessary in this instance.
I had tried unrolling the loops to avoid requiring the
`vector_store_elem` instruction, but it's arguably a problem to generate
O(N) code for an operation on `@Vector(N, T)`. In addition, that
lowering emitted a lot of `.aggregate_init` instructions, which is
itself a quite difficult operation to codegen.
This requires reintroducing runtime vector indexing internally. However,
I've put it in a couple of instructions which are intended only for use
by `Air.Legalize`, named `legalize_vec_elem_val` (like `array_elem_val`,
but for indexing a vector with a runtime-known index) and
`legalize_vec_store_elem` (like the old `vector_store_elem`
instruction). These are explicitly documented as *not* being emitted by
Sema, so need only be implemented by backends if they actually use an
`Air.Legalize.Feature` which emits them (otherwise they can be marked as
`unreachable`).
`__addosi4`, `__addodi4`, `__addoti4`, `__subosi4`, `__subodi4`, and
`__suboti4` were all functions which we invented for no apparent reason.
Neither LLVM, nor GCC, nor the Zig compiler use these functions. It
appears the functions were created in a kind of misunderstanding of an
old language proposal; see https://github.com/ziglang/zig/pull/10824.
There is no benefit to these functions existing; if a Zig compiler
backend needs this operation, it is trivial to implement, and *far*
simpler than calling a compiler-rt routine. Therefore, this commit
deletes them. A small amount of that code was used by other parts of
compiler-rt; the logic is trivial so has just been inlined where needed.
I also chose to quickly implement `__addvdi3` (a standard function)
because it is trivial and we already implement the `sub` parallel.
I started this diff trying to remove a little dead code from the C
backend, but ended up finding a bunch of dead code sprinkled all over
the place:
* `packed` handling in the C backend which was made dead by `Legalize`
* Representation of pointers to runtime-known vector indices
* Handling for the `vector_store_elem` AIR instruction (now removed)
* Old tuple handling from when they used the InternPool repr of structs
* Straightforward unused functions
* TODOs in the LLVM backend for features which Zig just does not support
The main goal of this change was to avoid emitting the
`vector_store_elem` AIR tag, because this represents an operation which
Zig no longer supports (and hence Sema no longer emits) as of 010d9a6
(because runtime vector indices are now forbidden). Backends should not
need to lower this operation, so I rewrote the legalizations which
emitted it (scalarizations of vector operations) to instead unroll the
loop and hence emit comptime-known vector indices.
In doing this, I actually reworked those legalizations to use a
different strategy; instead of using an `alloc` and storing to
individual vector elements, the vector is constructed by-val, for
instance by performing the scalar operation on all elements and passing
them to an `aggregate_init`. This is vastly simpler to implement in
Legalize, conceptually simpler, and doesn't severely pessimise memory
usage, because a non-optimizing backend will store the full vector on
the stack either way.
Given the above rationale, I also ended up reworking several other
legalizations to use simpler lowerings. The legalizations in question
were bitcast scalarization, `struct_field_val` of `packed struct`s
(where we just bitcast to an integer and perform the appropriate
shift/trunc sequence), and `aggregate_init` of a `packed struct` (also
implemented in terms of integer bitwise operations with bitcasts to and
from the actual types). This hugely simplified some parts of `Legalize`.
So, `Legalize` is now much simpler, and the `vector_store_elem`
instruction is no longer emitted by any part of the compiler so can be
removed in a future commit.
This fixes package fetching on Windows.
Previously, `Async/GroupClosure` allocations were only aligned for the
closure struct type, which resulted in panics when `context_alignment`
(or `result_alignment` for that matter) had a greater alignment.
`Clock.real` being defined to return timestamps relative to an
implementation-specific epoch means that there's currently no way for
the user to translate returned timestamps to actual calendar dates
without digging into implementation details of any particular `Io`
implementation. Redefining it to return timestamps relative to
1970-01-01T00:00:00Z fixes this problem.
There are other ways to solve this, such as adding a new vtable function
for returning the implementation-specific epoch, but in terms of
complexity this redefinition is by far the simplest solution and only
amounts to a simple 96-bit integer addition's worth of overhead on OSes
like Windows that use non-POSIX/Unix epochs.
ML-DSA is a post-quantum signature scheme that was recently
standardized by NIST.
Keys and signatures are pretty large, not making it a drop-in
replacement for classical signature schemes.
But if you are shipping keys that may still be used in 10 years
or whenever large quantum computers able to break ECC arrive,
it that ever happens, and you don't have the ability to replace
these keys, ML-DSA is for you.
Performance is great, verification is faster than Ed25519 / ECDSA.
I tried manual vectorization, but it wasn't worth it, the compiler
does at good job at auto-vectorization already.
This configuration hasn't had much work put into it yet, so is all but
guaranteed to miscompile or crash. Since users are starting to try out
`-fincremental`, and LLVM is still the default backend in many cases,
it's worth having this warning to avoid bug reports like
https://github.com/ziglang/zig/issues/25873.
To match the new default implementation. In fact, I implemented this by
simply dispatching *to* the default implementation after the debug log
guard; no need to complicate things!
This change removes the ref_start_index from the possible enum values of
Index and OptionalIndex. It is not really a index, but a constant that
tells the offset of static Refs, so lets move it where such constant
belongs i.e. to the Ref.
Since the child process is spawned with the tmp directory as its CWD, the child process opens it without DELETE access. On error, the child process would still be alive while the tmp directory is attempting to be deleted, so it would fail with `.SHARING_VIOLATION => return error.FileBusy`.
Fixes arguably the least important part of #22510, since it's only the directory itself that would fail to get deleted, all the files inside would get deleted just fine.
It was not obvious that the KT128/KT256 customization string can be
used to set a key, or what it was designed to be used for at all.
Also properly use key_length and not digest_length for the BLAKE3
key length (no practical changes as they are both 32, but that was
confusing).
Remove unneeded simd_degree copies by the way, and that doesn't need
to be in the public interface.
This seems to work around a very puzzling miscompilation first
present in LLVM 21.x. We already unconditionally add these
clobbers to inline assembly that came from the source, the
valgrind requests should also contain them.
If we use `undefined`, then `netReceive` can `@intCast` the
control slice len to msghdr controllen, which is sometimes `u32`,
even on 64-bit platforms.
`init` just avoids this entirely by setting `control` to an empty
slice rather than undefined.
68d2f68ed introduced special handling for StructInit fields
containing multiline strings to prevent inserting whitespace after =.
However, this logic didn't handle cases without a trailing comma,
which resulted in unwanted trailing whitespace.
The subsystem detection was flaky and often incorrect and was not
actually needed by the compiler or standard library. The actual
subsystem won't be known until at link time, so it doesn't make
sense to try to determine it at compile time.
* threaded K12: separate context computation from thread spawning
Compute all contexts and store them in a pre-allocated array,
then spawn threads using the pre-computed contexts.
This ensures each context is fully materialized in memory with the
correct values before any thread tries to access it.
* kt128: unroll the permutation rounds only twice
This appears to deliver the best performance thanks to improved cache
utilization, and it’s consistent with what we already do for SHA3.
KT128 and KT256 are fast, secure cryptographic hash functions based on Keccak (SHA-3).
They can be seen as the modern version of SHA-3, and evolution of SHAKE, with better performance.
After the SHA-3 competition, the Keccak team proposed these variants in 2016, and the constructions underwent 8 years of public scrutiny before being standardized in October 2025 as RFC 9861.
They uses a tree-hashing mode on top of TurboSHAKE, providing both high security and excellent performance, especially on large inputs.
They support arbitrary-length output and optional customization strings.
Hashing of very large inputs can be done using multiple threads, for high throughput.
KT128 provides 128-bit security strength, equivalent to AES-128 and SHAKE128, which is sufficient for virtually all applications.
KT256 provides 256-bit security strength, equivalent to SHA-512. For virtually all applications, KT128 is enough (equivalent to SHA-256 or BLAKE3).
For small inputs, TurboSHAKE128 and TurboSHAKE256 (which KT128 and KT256 are based on) can be used instead as they have less overhead.
Also remove the example implementation from the file doc comment; it's
better to just link to `defaultLog` as an example, since this avoids
writing the example implementation twice and prevents the example from
bitrotting.
`std.Io.tty.Config.detect` may be an expensive check (e.g. involving
syscalls), and doing it every time we need to print isn't really
necessary; under normal usage, we can compute the value once and cache
it for the whole program's execution. Since anyone outputting to stderr
may reasonably want this information (in fact they are very likely to),
it makes sense to cache it and return it from `lockStderrWriter`. Call
sites who do not need it will experience no significant overhead, and
can just ignore the TTY config with a `const w, _` destructure.
As with Solaris (dba1bf9353), we have no way to
actually audit contributions for these OSs. IBM also makes it even harder than
Oracle to actually obtain these OSs.
closes#23695closes#23694closes#3655closes#23693
restores code closer to master branch in hopes of avoiding a regression
that was introduced when this was based on openSelfExe rather than
GetModuleFileNameExW.
The compiler crashed when we tried to call a function pointer for which
the type signature does not match any function body or function import
in the entire wasm executable, because there is no way to create a
reference to a function without it being in the function table or import
table. Solution is to make this instruction lower to unreachable.
let's handle this in a follow-up change. implementation needs to use
ConvertInterfaceNameToLuidW and the additional dependency on
Iphlpapi.dll poses some challenges.
Now std.Io.Threaded can return error.ConcurrencyUnavailable rather than
asserting. This is handy for logic that wants to try a concurrent
implementation but then fall back to a synchronous one.
Microsoft documentation says "The if_nametoindex function is implemented
for portability of applications with Unix environments, but the
ConvertInterface functions are preferred."
This was also the only dependency on iphlpapi.
The previous implementation would eagerly attempt TCP connection upon
receiving a DNS reply, but it would still wait for all the DNS results
before returning from the function.
This implementation returns immediately upon first successful TCP
connection, canceling not only in-flight TCP connection attempts but
also unfinished DNS queries.
Unfortunately this can't be implemented "above the vtable" because
various operating systems don't provide low level DNS resolution
primitives such as just putting the list of nameservers in a file.
Without libc on Linux it works great though!
Anyway this also changes the API to be based on Io.Queue. By using a
large enough buffer, reusable code can be written that does not require
concurrent, yet takes advantage of responding to DNS queries as they
come in. I sketched out a new implementation of `HostName.connect` to
demonstrate this, but it will require an additional API (`Io.Select`) to
be implemented in a future commit.
This commit also introduces "uncancelable" variants for mutex locking,
waiting on a condition, and putting items into a queue.
This only shows in release mode, the compiler tries to preserve some
value in rdi, but that gets replaced inside the fiber. This would not
happen in the C calling convention, but in these normal Zig functions,
it can happen.
In the future, it might be nice to introduce a type for file system path
names. This would be a way to avoid having InvalidFileName in the error
set, since construction of such type could validate it above the
interface.
Calling recvmsg first means no poll syscall needed when messages are
already in the operating system queue. Empirically, this happens when
repeating a DNS query that has been already been made recently. In such
case, poll() is never called!
glibc and linux kernel use size_t for some field lengths while POSIX and
musl use int. This bug would have caused breakage the first time someone
tried to call sendmsg on a 64-bit big endian system when linking musl
libc.
my opinion:
* msghdr.iovlen: kernel and glibc have it right. This field should
definitely be size_t. With int, the padding bytes are wasted for no
reason.
* msghdr.controllen: POSIX and musl have it right. 4 bytes is plenty for
the length, and it saves 4 bytes next to flags.
* cmsghdr.len: POSIX and musl have it right. 4 bytes is plenty for the
length, and it saves 4 bytes since the other fields are also 32-bits
each.
The one about INT_MAX is self-evident from the type system.
The one about kernel having bad types doesn't seem accurate as I checked
the source code and it uses size_t for all the appropriate types,
matching the libc struct definition for msghdr and msghdr_const.
extract pure functional logic into pure functions and then layer the
scope crap on top properly
the formatting code incorrectly didn't do the reverse operation
(if_indextoname). fix that with some TODO panics
1. a fiber can't put itself on a queue that allows it to be rescheduled
2. allow the idle fiber to unlock a mutex held by another fiber by
ignoring reschedule requests originating from the idle fiber
When the previous fiber did not request to be registered as an awaiter,
it may not have actually been a full blown `Fiber`, so only create the
`Fiber` pointer when needed.
The logic for computing reference traces was unintentionally finding the
*longest* possible trace (approximately). I think I already tried to fix
this before, but misunderstood how my own code works. Here, we fix it
properly: by slightly reworking the logic to use one ArrayHashMap for
both the result and the traversal queue, we trivially get a proper
breadth-first traversal so that we can find the shortest possible
reference trace for every referenced unit.
There is no straightforward way for the Zig team to access the Solaris system
headers; to do this, one has to create an Oracle account, accept their EULA to
download the installer ISO, and finally install it on a machine or VM. We do not
have to jump through hoops like this for any other OS that we support, and no
one on the team has expressed willingness to do it.
As a result, we cannot audit any Solaris contributions to std.c or other
similarly sensitive parts of the standard library. The best we would be able to
do is assume that Solaris and illumos are 100% compatible with no way to verify
that assumption. But at that point, the solaris and illumos OS tags would be
functionally identical anyway.
For Solaris especially, any contributions that involve APIs introduced after the
OS was made closed-source would also be inherently more risky than equivalent
contributions for other proprietary OSs due to the case of Google LLC v. Oracle
America, Inc., wherein Oracle clearly demonstrated its willingness to pursue
legal action against entities that merely copy API declarations.
Finally, Oracle laid off most of the Solaris team in 2017; the OS has been in
maintenance mode since, presumably to be retired completely sometime in the 2030s.
For these reasons, this commit removes all Oracle Solaris support.
Anyone who still wishes to use Zig on Solaris can try their luck by simply using
illumos instead of solaris in target triples - chances are it'll work. But there
will be no effort from the Zig team to support this use case; we recommend that
people move to illumos instead.
Renames arePointersLogical to shouldBlockPointerOps for clarity
adds capability check to allow pointer ops on .storage_buffer when
variable_pointers capability is enabled.
Fixes#25638
Zig uses "rN" for MIPS register clobbers which are more
ergonomic and easier to write (.rN vs. .@"$N").
However, GCC and Clang uses "$N".
Bug: #25613
Signed-off-by: Bingwu Zhang <xtex@xtexx.eu.org>
MIPS I has load hazards so we need to insert nops in a few places. This is not a
problem for MIPS II and later.
While doing this, I also touched up all the inline asm to use ABI register
aliases and a consistent formatting convention. Also fixed a few places that
didn't properly check if the syscall return value should be negated.
This allows us to rule out support for certain address spaces based on the OS.
This commit is just a refactor, however, and doesn't actually make use of that
opportunity yet.
build system: unit test enhancements
Contributes towards https://github.com/ziglang/zig/issues/19821, but does not close it, since the timeout currently cannot be modified per unit test.
The last commit passed CI, so this final bump is just to allow for
deviation caused by different loads on the runner machines. With this
change, I don't expect any current unit test to ever time out, even when
CI is under extreme load.
Unfortunately, Windows' scheduler means that test timeouts get hit very
easily, because it seems the system can refuse to schedule a waiting
process for *upwards of 10 minutes*. We should look for a better
solution for this problem going forwards, but for now, just give Windows
a very high test timeout.
The 30 minute timeout set here is around the duration of a *full CI run*
on Windows, so it should be impossible to hit normally, but it means
that if a test gets stuck we'll at least get told (eventually).
The Wycheproof test suite is extensive, but takes a long time to
complete on CI.
Keep only the most relevant ones and take it as an opportunity to describe
what they are.
The remaining ones are still available for manual testing when required.
This test called `yield` 80,000 times, which is nothing on a system with
little load, but murder on a CI system. macOS' scheduler in particular
doesn't seem to deal with this very well. The `yield` calls also weren't
even necessarily doing what they were meant to: if the optimizer could
figure out that it doesn't clobber some memory, then it could happily
reorder around the `yield`s anyway!
The test has been simplified and made to work better, and the number of
yields have been reduced. The number of overall iterations has also been
reduced, because with the `yield` calls making races very likely, we
don't really need to run too many iterations to be confident that the
implementation is race-free.
For instance, when running a Zig test using the self-hosted aarch64
backend, this logic was previously expecting `std.zig.Server` to be
used, but the default test runner intentionally does not do this because
the backend is too immature to handle it. On 'master', this is causing
sporadic failures; on this branch, they became consistent failures.
The new `--error-style` option decides how build failures are printed.
The default mode "verbose" prints all context including the step graph
fragment and the failed command (if any). The alternative mode "minimal"
prints only the failed step itself, and does not print the failed
command. There are also "verbose_clear" and "minimal_clear" modes, which
have the distinction that the output is cleared (through ANSI escape
codes) between updates, preventing different updates from being confused
in the output. If `--error-style` is not specified, the environment
variable `ZIG_BUILD_ERROR_STYLE` is checked before falling back to the
default of "verbose"; this means the value can effectively be chosen
system-wide since it is generally a personal preference.
Also introduced is a `--multiline-errors` option which decides how to
print errors which span multiple lines. By default, non-initial lines
are indented to align with the first. Alternatively, a leading newline
can be printed to align everyting on the first column, or no special
treatment can be applied, resulting in misaligned output. Again, there
is an environment variable (`ZIG_BUILD_MULTILINE_ERRORS`) to specify a
preferred default if the option is not explicitly provided.
Resolves: #23472
This is a major refactor to `Step.Run` which adds new functionality,
primarily to the execution of Zig tests.
* All tests are run, even if a test crashes. This happens through the
same mechanism as timeouts where the test processes is repeatedly
respawned as needed.
* The build status output is more precise. For each unit test, it
differentiates pass, skip, fail, crash, and timeout. Memory leaks are
reported separately, as they do not indicate a test's "status", but
are rather an additional property (a test with leaks may still pass!).
* The number of memory leaks is tracked and reported, both per-test and
for a whole `Run` step.
* Reporting is made clearer when a step is failed solely due to error
logs (`std.log.err`) where every unit test passed.
For now, there is a flag to `zig build` called `--test-timeout-ms` which
accepts a value in milliseconds. If the execution time of any individual
unit test exceeds that number of milliseconds, the test is terminated
and marked as timed out.
In the future, we may want to increase the granularity of this feature
by allowing timeouts to be specified per-step or even per-test. However,
a global option is actually very useful. In particular, it can be used
in CI scripts to ensure that no individual unit test exceeds some
reasonable limit (e.g. 60 seconds) without having to assign limits to
every individual test step in the build script.
Also, individual unit test durations are now shown in the time report
web interface -- this was fairly trivial to add since we're timing tests
(to check for timeouts) anyway.
This commit makes progress on #19821, but does not close it, because
that proposal includes a more sophisticated mechanism for setting
timeouts.
Co-Authored-By: David Rubin <david@vortan.dev>
The ABIs do not define a frame pointer register, nor do they define a guaranteed
and fixed area on the stack where one might find saved registers such as a frame
pointer or return address.
The compile-time check against the minimum version here wasn't appropriate, since it still makes sense to try using FILE_RENAME_INFORMATION_EX even if the minimum version is something like `xp`, since that doesn't rule out the possibility of the compiled code running on Windows 10/11. This compile-time check was doubly bad since the default minimum windows version (`.win10`) was below the `.win10_rs5` that was checked for, so when providing a target like `x86_64-windows-gnu` it'd always rule out using this syscall.
After this commit, we always try using FILE_RENAME_INFORMATION_EX and then let the operating system tell us when some aspect of it is not supported. This allows us to get the benefits of these new syscalls/flags whenever it's actually possible.
The possible error returns were validated experimentally:
- INVALID_PARAMETER is returned when the underlying filesystem is FAT32
- INVALID_INFO_CLASS is returned on Windows 7 when trying to use FileRenameInformationEx/FileDispositionInformationEx
- NOT_SUPPORTED is returned on Windows 10 >= .win10_rs5 when setting a bogus flag value (I used `0x1000`)
This is very likely full of wrong stuff. It's effectively just a copy of the
x86_64 file - needed because the former stopped using usize/isize. To be clear,
this is no more broken than the old situation was; this just makes the
brokenness explicit.
This is very likely full of wrong stuff. It's effectively just a copy of the
mips64 file - needed because the former stopped using usize/isize. To be clear,
this is no more broken than the old situation was; this just makes the
brokenness explicit.
This is a rewrite of the BLAKE3 implementation, with vectorization.
On Apple Silicon, the new implementation is about twice as fast as the previous one.
With AVX2, it is more than 4 times faster.
With AVX512, it is more than 7.5x faster than the previous implementation (from 678 MB/s to 5086 MB/s).
The return address points to the call instruction on SPARC, so the actual return
address is 8 bytes after. This means that we shouldn't do the return address
adjustment that we normally do.
The FP would point to the register save area for the previous frame, while the
SP points to the register save area for the current frame. So use the latter.
I have no idea if this is a QEMU bug or real kernel behavior. Either way, the
register save area specifically exists for asynchronous spilling of incoming and
local registers, so there should be no harm in doing this.
* std.crypto: add AES-CCM and CBC-MAC
Add AES-CCM (Counter with CBC-MAC) authenticated encryption and
CBC-MAC message authentication code implementations to the standard
library.
AES-CCM combines CTR mode encryption with CBC-MAC authentication as
specified in NIST SP 800-38C and RFC 3610. It provides authenticated
encryption with support for additional authenticated data (AAD).
CBC-MAC is a simple MAC construction used internally by CCM, specified
in FIPS 113 and ISO/IEC 9797-1.
Includes comprehensive test vectors from RFC 3610 and NIST SP 800-38C.
* std.crypto: add CCM* (encryption-only) support to AES-CCM
Implements CCM* mode per IEEE 802.15.4 specification, extending
AES-CCM to support encryption-only mode when tag_len=0. This is
required by protocols like ZigBee, Thread, and WirelessHART.
Changes:
- Allow tag_len=0 for encryption-only mode (no authentication)
- Skip CBC-MAC computation when tag_len=0 in encrypt/decrypt
- Correctly encode M'=0 in B0 block for CCM* mode
- Add Aes128Ccm0 and Aes256Ccm0 convenience instances
- Add IEEE 802.15.4 test vectors and CCM* tests
* std.crypto: add doc comments for AES-CCM variants
If these ever get allocated, it's most likely going to be for things that don't
matter to us anyway, so completely abandoning DWARF unwinding just because we
see these doesn't seem justified. We will still do so if we're actually asked to
read from such a register, which is the only actually problematic case; see
c23a5ccd19 for more details.
This is to help diagnose #25498. We can't use `unexpectedErrno` here,
because `std.posix.munmap` is infallible. So, when the flag is set to
report unexpected errnos, we just call `std.debug.panic` to provide
details instead of doing `unreachable`.
Pushing straight to master after running checks locally; there's no
point waiting for CI on the PR just for this.
This path being relative is unconventional and causes issues for us
if the output artifact is ever used from a different cwd than the one it
was built from. The behavior implemented by this commit of always
emitting these paths as absolute was actually the behavior in 0.14.x,
but it regressed in 0.15.1 due to internal reworks to path handling
which led to relative paths being more common in the compiler internals.
Resolves: #25433
This type is useful for two things:
* Doing non-local control flow with ucontext.h functions.
* Inspecting machine state in a signal handler.
The first use case is not one we support; we no longer expose bindings to those
functions in the standard library. They're also deprecated in POSIX and, as a
result, not available in musl.
The second use case is valid, but is very poorly served by the standard library.
As evidenced by my changes to std.debug.cpu_context.signal_context_t, users will
be better served rolling their own ucontext_t and especially mcontext_t types
which fit their specific situation. Further, these types tend to evolve
frequently as architectures evolve, and the standard library has not done a good
job keeping up, or even providing them for all supported targets.
I made a couple of decisions for this based on the fact that we don't expose the
signal_ucontext_t type outside of the file:
* Adding all the floating point and vector state to every ucontext_t and
mcontext_t variant was way, way too much work, especially when we don't even
use the stuff. So I deleted all that and kept only the bare minimum needed to
reach into general-purpose registers.
* There is no particularly compelling reason to stick to the naming and struct
nesting used in the system headers. So we can actually unify the access
patterns for almost all of these variants by taking some liberties here; as a
result, fromPosixSignalContext() is now much nicer to read and extend.
I broke this when porting this logic for the `std.debug` rework in
https://github.com/ziglang/zig/pull/25227. The offset that I copied was
actually being treated as relative to the address of the *saved* base
pointer. I think it makes more sense to do what I did and just treat all
offsets as relative to this frame's base.
- Revive some of the removed cache integration logic in `cmdTranslateC` now that `translate-c` can return error bundles
- Fixup inconsistent path separators (on Windows) when building the aro include path
- Move some error bundle logic from resinator into aro.Diagnostics
- Add `ErrorBundle.addRootErrorMessageWithNotes` (extracted from resinator)
Now it's based on calling fillMore rather than an illegal aliased stream
into the Reader buffer.
This commit also includes a disambiguation block inspired by #25162. If
`StreamTooLong` was added to `RebaseError` then this logic could be
replaced by removing the exit condition from the while loop. That error
code would represent when `buffer` capacity is too small for an
operation, replacing the current use of asserts.
Fix `takeDelimiter` and `takeDelimiterExclusive` tossing too many bytes
(#25132)
Also add/improve test coverage for all delimiter and sentinel methods,
update usages of `takeDelimiterExclusive` to not rely on the fixed bug,
tweak a handful of doc comments, and slightly simplify some logic.
I have not fixed#24950 in this commit because I am a little less
certain about the appropriate solution there.
Resolves: #25132
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
* File.Writer.seekBy passed wrong offset to setPosAdjustingBuffer.
* File.Writer.sendFile incorrectly used non-logical position.
Related to 1d764c1fdf
Test case provided by:
Co-authored-by: Kendall Condon <goon.pri.low@gmail.com>
Previously, the logic in peekDelimiterInclusive (when the delimiter was not found in the existing buffer) used the `n` returned from `r.vtable.stream` as the length of the slice to check, but it's valid for `vtable.stream` implementations to return 0 if they wrote to the buffer instead of `w`. In that scenario, the `indexOfScalarPos` would be given a 0-length slice so it would never be able to find the delimiter.
This commit changes the logic to assume that `r.vtable.stream` can both:
- return 0, and
- modify seek/end (i.e. it's also valid for a `vtable.stream` implementation to rebase)
Also introduces `std.testing.ReaderIndirect` which helps in being able to test against Reader implementations that return 0 from `stream`/`readVec`
Fixes#25428
This reverts commit 27aba2d776.
I'd like to review this contribution more carefully, particularly with
the alternate implementation that is also open as a pull request
(#25109).
Reopens#25093
`findScalarPos` might do repetitive work, even if using simd. For
example, when searching the string `/abcde/fghijk/lm` for the character
`/`, a 16-byte wide search would yield `1000001000000100` but would only
count the first `1` and re-search the remaining of the string.
When testing locally, the difference was quite significative:
```
count scalar
5737 iterations 522.83us per iterations
0 bytes per iteration
worst: 2370us median: 512us stddev: 107.64us
count v2
38333 iterations 78.03us per iterations
0 bytes per iteration
worst: 713us median: 76us stddev: 10.62us
count scalar v2
99565 iterations 29.80us per iterations
0 bytes per iteration
worst: 41us median: 29us stddev: 1.04us
```
Note that `count v2` is a simpler string search, similar to the
remaining version of the simd approach:
```
pub fn countV2(comptime T: type, haystack: []const T, needle: T) usize {
const n = haystack.len;
if (n < 1) return 0;
var count: usize = 0;
for (haystack[0..n]) |item| {
count += @intFromBool(item == needle);
}
return count;
}
```
Which implies the compiler yields some optimized code for a simpler loop
that is more performant than the `findScalarPos`-based approach, hence
the usage of iterative approach for the remaining of the haystack.
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
For unwinding purposes, we don't care about unsupported registers. Yet because
we added these rules to the cache entry, we'd later try to evaluate them and
thus fail the unwind attempt for no good reason. They'd also take up cache rule
slots that would be better spent on actually relevant registers.
Note that any attempt to read unsupported registers during unwinding will still
fail the unwind attempt as expected.
The previous version (ported from musl) used bit-by-bit calculations and was slow, but the current version (also ported from musl) uses lookup tables combined with Goldschmidt iterations to significantly improve the speed.
* ELF v1 on powerpc64 is only barely kept on life support in a couple of Linux
distros. I don't anticipate that this will last much longer.
* Most of the Linux world has moved to powerpc64le which requires ELF v2.
* Some Linux distros have even started supporting powerpc64 with ELF v2.
* The BSD world has long since moved to ELF v2.
* We have no actual linking support for ELF v1.
* ELF v1 had confused DWARF register mappings which is becoming a problem in
our DWARF code in std.debug.
It's clear that ELF v1 is on its way out, and we never fully supported it
anyway. So let's not waste any time or energy on it going forward.
closes#5927
FreeBSD doesn't support the same number of platforms as Linux, and even then,
only has usermode emulation for a subset of its supported platforms.
NetBSD's usermode emulation support is apparently just broken at the moment.
The amount of cross compilation required for these tests was too time-consuming
for how much value they added. test-stack-traces now cover these well enough,
especially as we add more exotic machines to the CI fleet to run native tests.
For the supported COFF machine types of X64 (x86_64), I386 (x86), ARMNT (thumb), and ARM64 (aarch64), this new Zig implementation results in byte-for-byte identical .lib files when compared to the previous LLVM-backed implementation.
Previously, `setAlignment` would set the value to 1 fewer than it should, so if you were intending to set alignment to 8 bytes, it would actually set it to 4 bytes, etc.
This enables depth-related use cases without any dependency on the Walker's internal stack which doesn't always pertain to the actual depth of the current entry (i.e. recursing into a directory immediately affects the stack).
Some decision-making might depend on the level of the traversal, so
it makes sense to expose depth here since it's stable, and not in the
automatic walker where it's not.
This matches all other platforms. Even if this field is defined as 'int'
in the C definition, the expectation is that the full 32-bit unsigned
integer range can be used. In particular this Sigaction initializer in
the new std.debug code was causing a build failure:
```zig
.flags = (posix.SA.SIGINFO | posix.SA.RESTART | posix.SA.RESETHAND)
```
This is a little different from how C/C++ compilers do this, but I think it's
justified because it's what users actually *mean* when the use frame pointer
options.
This is another one of those LLVM "CPU" features that have nothing to do with
CPU at all and should really be a TargetMachine option or something. One day
we'll figure out a better way of dealing with these...
contributing is in the readme already, and code of conduct should go on
the website. this is a code repository; it doesn't dictate social norms.
the reason for these documents being in .github/ was to satisfy GitHub
demands so that the UI would look more favorably upon ziglang/zig but
that is no longer a concern.
Before, this had a subtle ordering bug where duplicate
deps that are specified as both lazy and eager in different
parts of the dependency tree end up not getting fetched
depending on the ordering. I modified it to resubmit lazy
deps that were promoted to eager for fetching so that it will
be around for the builds that expect it to be eager downstream
of this.
Implements deflate compression from scratch. A history window is kept in
the writer's buffer for matching and a chained hash table is used to
find matches. Tokens are accumulated until a threshold is reached and
then outputted as a block. Flush is used to indicate end of stream.
Additionally, two other deflate writers are provided:
* `Raw` writes only in store blocks (the uncompressed bytes). It
utilizes data vectors to efficiently send block headers and data.
* `Huffman` only performs Huffman compression on data and no matching.
The above are also able to take advantage of writer semantics since they
do not need to keep a history.
Literal and distance code parameters in `token` have also been reworked.
Their parameters are now derived mathematically, however the more
expensive ones are still obtained through a lookup table (expect on
ReleaseSmall).
Decompression bit reading has been greatly simplified, taking advantage
of the ability to peek on the underlying reader. Additionally, a few
bugs with limit handling have been fixed.
There were only a few dozen lines of common logic, and they frankly
introduced more complexity than they eliminated. Instead, let's accept
that the implementations of `SelfInfo` are all pretty different and want
to track different state. This probably fixes some synchronization and
memory bugs by simplifying a bunch of stuff. It also improves the DWARF
unwind cache, making it around twice as fast in a debug build with the
self-hosted x86_64 backend, because we no longer have to redundantly go
through the hashmap lookup logic to find the module. Unwinding on
Windows will also see a slight performance boost from this change,
because `RtlVirtualUnwind` does not need to know the module whatsoever,
so the old `SelfInfo` implementation was doing redundant work. Lastly,
this makes it even easier to implement `SelfInfo` on freestanding
targets; there is no longer a need to emulate a real module system,
since the user controls the whole implementation!
There are various other small refactors here in the `SelfInfo`
implementations as well as in the DWARF unwinding logic. This change
turned out to make a lot of stuff simpler!
Apparently the `__eh_frame` in Mach-O binaries doesn't include the
terminator entry, but in all other respects it acts like `.eh_frame`
rather than `.debug_frame`. I have no idea.
By my estimation, these changes speed up DWARF unwinding when using the
self-hosted x86_64 backend by around 7x. There are two very significant
enhancements: we no longer iterate frames which don't fit in the stack
trace buffer, and we cache register rules (in a fixed buffer) to avoid
re-parsing and evaluating CFI instructions in most cases. Alongside this
are a bunch of smaller enhancements, such as pre-caching the result of
evaluating the CIE's initial instructions, avoiding re-parsing of CIEs,
and big simplifications to the `Dwarf.Unwind.VirtualMachine` logic.
This was causing a zig2 miscomp, which emitted slightly broken debug
information, which caused extremely slow stack unwinding. We're working
on fixing or reporting this upstream, but we can use this workaround for
now, because GCC guarantees arithmetic signed shift.
This logic was causing some occasional infinite looping on ARM, where
the `.debug_frame` section is often incomplete since the `.exidx`
section is used for unwind information. But the information we're
getting from the compiler is totally *valid*: it's leaving the rule as
the default, which is (as with most architectures) equivalent to
`.undefined`!
This has been a TODO for ages, but in the past it didn't really matter
because stack traces are typically printed to stderr for which a mutex
is held so in practice there was a mutex guarding usage of `SelfInfo`.
However, now that `SelfInfo` is also used for simply capturing traces,
thread safety is needed. Instead of just a single mutex, though, there
are a couple of different mutexes involved; this helps make critical
sections smaller, particularly when unwinding the stack as `unwindFrame`
doesn't typically need to hold any lock at all.
Calling `current` here causes compilation failures as the C backend
currently does not emit valid MSVC inline assembly. This change means
that when building for MSVC with the self-hosted C backend, only FP
unwinding can be used.
Processes should reasonably be able to expect their children to abort
with typical exit codes, rather than a debugger breakpoint signal. This
flag in the PEB is what would be checked by `IsDebuggerPresent` in
kernel32, which is the function you would typically use for this
purpose.
This fixes `test-stack-trace` failures on Windows, as these tests were
expecting exit code 3 to indicate abort.
...and just deal with signal handlers by adding 1 to create a fake
"return address". The system I tried out where the addresses returned by
`StackIterator` were pre-subtracted didn't play nicely with error
traces, which in hindsight, makes perfect sense. This definition also
removes some ugly off-by-one issues in matching `first_address`, so I do
think this is a better approach.
This crash exists on master, and seems to have existed since 2019; I
think it's just very rare and depends on the exact binary generated. In
theory, a stream block should always be a "data" block rather than a FPM
block; the FPMs use blocks `1, 4097, 8193, ...` and `2, 4097, 8194, ...`
respectively. However, I have observed LLVM emitting an otherwise valid
PDB which maps FPM blocks into streams. This is not a bug in
`std.debug.Pdb`, because `llvm-pdbutil` agrees with our stream indices.
I think this is arguably an LLVM bug; however, we don't really lose
anything from just weakening this check. To be fair, MSF doesn't have an
explicit specification, and LLVM's documentation (which is the closest
thing we have) does not explicitly state that FPM blocks cannot be
mapped into streams, so perhaps this is actually valid.
In the rare case that LLVM emits this, previously, stack traces would
have been completely useless; now, stack traces will work okay.
Mostly on macOS, since Loris showed me a not-great stack trace, and I
spent 8 hours trying to make it better. The dyld shared cache is
designed in a way which makes this really hard to do right, and
documentation is non-existent, but this *seems* to work pretty well.
I'll leave the ruling on whether I did a good job to CI and our users.
Our usage of `ucontext_t` in the standard library was kind of
problematic. We unnecessarily mimiced libc-specific structures, and our
`getcontext` implementation was overkill for our use case of stack
tracing.
This commit introduces a new namespace, `std.debug.cpu_context`, which
contains "context" types for various architectures (currently x86,
x86_64, ARM, and AARCH64) containing the general-purpose CPU registers;
the ones needed in practice for stack unwinding. Each implementation has
a function `current` which populates the structure using inline
assembly. The structure is user-overrideable, though that should only be
necessary if the standard library does not have an implementation for
the *architecture*: that is to say, none of this is OS-dependent.
Of course, in POSIX signal handlers, we get a `ucontext_t` from the
kernel. The function `std.debug.cpu_context.fromPosixSignalContext`
converts this to a `std.debug.cpu_context.Native` with a big ol' target
switch.
This functionality is not exposed from `std.c` or `std.posix`, and
neither are `ucontext_t`, `mcontext_t`, or `getcontext`. The rationale
is that these types and functions do not conform to a specific ABI, and
in fact tend to get updated over time based on CPU features and
extensions; in addition, different libcs use different structures which
are "partially compatible" with the kernel structure. Overall, it's a
mess, but all we need is the kernel context, so we can just define a
kernel-compatible structure as long as we don't claim C compatibility by
putting it in `std.c` or `std.posix`.
This change resulted in a few nice `std.debug` simplifications, but
nothing too noteworthy. However, the main benefit of this change is that
DWARF unwinding---sometimes necessary for collecting stack traces
reliably---now requires far less target-specific integration.
Also fix a bug I noticed in `PageAllocator` (I found this due to a bug
in my distro's QEMU distribution; thanks, broken QEMU patch!) and I
think a couple of minor bugs in `std.debug`.
Resolves: #23801Resolves: #23802
This only matters if `callMain` is called by a user, since `std.start`
will never itself call `callMain` when `target.os.tag == .other`.
However, it *is* a valid use case for a user to call
`std.start.callMain` in their own startup logic, so this makes sense.
The input path could be cwd-relative, in which case it must be modified
before it is written into the batch script.
Also, remove usage of deprecated `GeneralPurposeAllocator` alias, rename
`allocator` to `gpa`, use unmanaged `ArrayList`.
Turns out that RtlCaptureStackBackTrace is actually just doing FP (ebp)
unwinding under the hood, making this logic completely redundant with
our own FP-walking implementation; see added comment for details.
Previously, the `test-stack-traces` step was essentially just testing
error traces, and even there we didn't have much coverage. This commit
solves that by splitting the "stack trace" tests into two separate
harnesses: the "stack trace" tests are for actual stack traces (i.e.
involving stack unwinding), while the "error trace" tests are
specifically for error return traces.
The "stack trace" tests will test different configurations of:
* `-lc`
* `-fPIE`
* `-fomit-frame-pointer`
* `-fllvm`
* unwind tables (currently disabled)
* strip debug info (currently disabled)
The main goal there is to test *stack unwinding* under different
conditions. Meanwhile, the "error trace" tests will test different
configurations of `-O` and `-fllvm`; the main goal here, aside from
checking that error traces themselves do not miscompile, is to check
whether debug info is still working even in optimized builds. Of course,
aggressive optimizations *can* thwart debug info no matter what, so as
before, there is a way to disable cases for specific targets / optimize
modes.
The program which converts stack traces into a more validatable format
by removing things like addresses (previously `check-stack-trace.zig`,
now `convert-stack-trace.zig`) has been rewritten and simplified. Also,
thanks to various fixes in this branch, several workarounds have become
unnecessary: for instance, we don't need to ignore the function name
printed in stack traces in release modes, because `std.debug.Dwarf` now
uses the correct DIE for inlined functions!
Neither `test-stack-traces` nor `test-error-traces` does general foreign
architecture testing, because it seems that (at least for now) external
executors often aren't particularly good at handling stack tracing
correctly (looking at you, Wine). Generally, they just test the native
target (this matches the old behavior of `test-stack-traces`). However,
there is one exception: when on an x86_64 or aarch64 host, we will also
test the 32-bit version (x86 or arm) if the OS supports it, because such
executables can be trivially tested without an external executor.
Oh, also, I wrote a bunch of stack trace tests. Previously there was,
erm, *one* test in `test-stack-traces` which wasn't for error traces.
Now there are a good few!
It was possible for `arg6` to be passed as an operand relative to esp.
In that case, the `push` at the top clobbered esp and hence made the
reference to arg6 invalid. This was manifesting in this branch as broken
stack traces on x86-linux due to an `mmap2` syscall accidentally passing
the page offset as non-zero!
This commit fixes a bug introduced in cb0e6d8aa.
We mustn't emit the DT_PLTGOT entry in `.dynamic` in a statically-linked
PIE, because there's no dl to relocate it (and `std.pie.relocate`, or
the PIE relocator in libc, won't touch it). In that case, there cannot
be any PLT entries, so there's no point emitting the `.got.plt` section
at all. If we just don't create that section, `link.Elf` already knows
not to add the DT_PLTGOT entry to `.dynamic`.
Co-authored-by: Jacob Young <jacobly0@users.noreply.github.com>
This abstraction isn't really tied to DWARF at all! Really, we're just
loading some information from an ELF file which is useful for debugging.
That *includes* DWARF, but it also includes other information. For
instance, the other change here:
Now, if DWARF information is missing, `debug.SelfInfo.ElfModule` will
name symbols by finding a matching symtab entry. We actually already do
this on Mach-O, so it makes obvious sense to do the same on ELF! This
change is what motivated the restructuring to begin with.
The symtab work is derived from #22077.
Co-authored-by: geemili <opensource@geemili.xyz>
If it's not given, we should set `first_address` to the return address
of `dumpCurrentStackTrace` to avoid the call to `writeCurrentStackTrace`
appearing in the trace. However, we must only do that if no `context` is
given; if there's a context then we're starting the stack unwind
elsewhere.
At least, when there's not a ZigObject. The old behavior was incorrect
in the presence of a ZigObject, and this doesn't really mix nicely with
incremental compilation anyway; but when the objects are all external,
we may as well build the search table.
Far simpler, because everything which `crash_report.zig` did is now
handled pretty well by `std.debug` anyway. All we want is to print some
context around panics and segfaults. Using the new ability to override
the default segfault handler while still having std handle the
target-specific bits for us, that's really simple.
The downside of this commit is that more precise errors are no longer
propagated up. However, these errors were pretty useless in isolation
due to them having no context; and regardless, we intentionally swallow
most of them in `std.debug` anyway. Therefore, this is better in
practice, because it allows `std.debug` to give slightly more useful
warnings when handling errors. This commit does that for unwind errors,
for instance, which differentiate between the unwind info being corrupt
vs missing vs inaccessible vs unsupported.
A better solution would be to also include more detailed information via
the diagnostics pattern, but this commit is an incremental improvement.
turns out this isn't technically specific to that target at all; other
targets just don't emit mid-function 'ret' instructions as much so
certain CFI instruction patterns were only seen on aarch64.
thanks to jacob for finding the bug <3
Because -fno-llvm is now the default on x86_64-linux, this target was
exactly equivalent to one specified earlier in the matrix. This was
probably just missed when doing the work to enable the self-hosted
backend by default for x86_64.
The memory operand might use one of the extended GPRs R8 through R15 and
hence require a REX prefix, but having a REX prefix makes the high-byte
register AH unencodeable as the src operand. This latent bug was exposed
by this branch, presumably because `select` now happens to be putting
something in an extended GPR instead of a legacy GPR.
In theory this could be fixed with minimal cost by introducing a way to
communicate to `select` that neither the destination memory nor the
other temporary can be in an extended GPR. However, I just went for the
simple solution which comes at a cost of one trivial instruction: copy
the remainder from AH to AL, and *then* copy AL to the destination.
`rep movsb` isn't usually a great idea here. This commit makes the logic
which tentatively existed in `genInlineMemcpy` apply in more cases, and
in particular applies it to the "new" backend logic. Put simply, all
copies of 128 bytes or fewer will now attempt this path first,
where---provided there is an SSE register and/or a general-purpose
register available---we will lower the operation using a sequence of 32,
16, 8, 4, 2, and 1 byte copy operations.
The feedback I got on this diff was "Push it to master and if it
miscomps I'll revert it" so don't blame me when it explodes
* Add missing functions like ISDIR() or ISREG(). This is required to
build the zig compiler
* Use octal notation for the S_ constants. This is how it is done for
".freebsd" and it is also the notation used by DragonFly in
"sys/stat.h"
* Reorder S_ constants in the same order as ".freebsd" does. Again, this
follows the ordering within "sys/stat.h"
Before https://github.com/ziglang/zig/pull/18160, error tracing defaulted to true in ReleaseSafe, but that is no longer the case. These option descriptions were never updating accordingly.
--debug-rt previously would make rt libs match the root module. Now they
are always debug when --debug-rt is passed. This includes compiler-rt,
fuzzer lib, and others.
Moving towards our function naming convention of having one word per
concept and constructing function names out of concatenated concepts.
In `std.mem` the concepts are:
* "find" - return index of substring
* "pos" - starting index parameter
* "last" - search from the end
* "linear" - simple for loop rather than fancy algo
* "scalar" - substring is a single element
This is f5fb720a5399ee98e45f36337b2f68a4d23a783c plus ehaas's nonnull
attribute pull request currently at 4b26cb3ac610a0a070fc43e43da8b4cdf0e9101b
with zig patches intact.
Adds the limit option to `--fuzz=[limit]`. the limit expresses a number
of iterations that *each fuzz test* will perform at maximum before
exiting. The limit argument supports also 'K', 'M', and 'G' suffixeds
(e.g. '10K').
Does not imply `--web-ui` (like unlimited fuzzing does) and prints a
fuzzing report at the end.
Closes#22900 but does not implement the time based limit, as after
internal discussions we concluded to be problematic to both implement
and use correctly.
Clang fails to compile the CBE translation of this code ("non-ASM
statement in naked function"). Similar to the implementations of
`restore_rt` on x86 and ARM, when the CBE is in use, this commit employs
alternative inline assembly that avoids using non-immediate input
operands.
In a library, the two `builtin.link_libc` and `builtin.output_mode ==
.Exe` checks could both be false. Thus, you would get a compile error
even if you specified an `env_map` at runtime. This change turns the
compile error into a runtime panic and updates the documentation to
reflect the runtime requirement.
Fixes#25209.
On PowerPC, some registers are both inputs to syscalls and clobbered by
them. An example is r0, which initially contains the syscall number, but
may be overwritten during execution of the syscall.
musl and glibc use a `+` (read-write) constraint to indicate this, which
isn't supported in Zig. The current implementation of PowerPC syscalls
in the Zig standard library instead lists these registers as both inputs
and clobbers, but this results in the C backend generating code that is
invalid for at least some C compilers, like GCC, which doesn't support
the specifying the same register as both an input and a clobber.
This PR changes the PowerPC syscall functions to list such registers as
inputs and outputs rather than inputs and clobbers. Thanks to jacobly0
who pointed out that it's possible to have multiple outputs; I had
gotten the wrong idea from the documentation.
If the compiler happens to pick `ret = r0`, then this will assemble to
`ag r0, 0` which is obviously not what we want. Using `a` instead of `r` will
ensure that we get an appropriate address register, i.e. `r1` through `r15`.
Re-enable pie_linux for s390x-linux which was disabled in
ed7ff0b693.
There are two reasons for this:
1. Apple is about to drop support for this target. Zig will keep support
but move it to a lower tier - one that does not require continuous CI
testing. Support for this target will be maintained by the enthusiasm
of contributors but will not block other bug fixes and enhancements.
2. This is our only non-self-hosted action runner. We are migrating away
from GitHub soon at which point this runner will no longer be
available.
Disabled due to no active maintainer (feel free to fix the failures and
then re-enable at any time). The failures occur due to backend
miscompilation of different AIR from the frontend.
Disabled due to no active maintainer (feel free to fix the failures and
then re-enable at any time). The failures occur due to changing AIR from
the frontend, and backend being incomplete.
C translation is in the process of switching to be aro-based
(see #24497)
That codebase will need to gain some kind of helper for translating C
code that uses runtime vector indexing.
with field_ptr_load and field_ptr_named_load.
These avoid doing by-val load operations for structs that are
runtime-known while keeping the previous semantics for comptime-known
values.
If `r.end` is updated in the `stream` implementation, then it's possible that `r.end += ...` will behave unexpectedly. What seems to happen is that it reverts back to its value before the function call and then the increment happens. Here's a reproduction:
```zig
test "fill when stream modifies `end` and returns 0" {
var buf: [3]u8 = undefined;
var zero_reader = infiniteZeroes(&buf);
_ = try zero_reader.fill(1);
try std.testing.expectEqual(buf.len, zero_reader.end);
}
pub fn infiniteZeroes(buf: []u8) std.Io.Reader {
return .{
.vtable = &.{
.stream = stream,
},
.buffer = buf,
.end = 0,
.seek = 0,
};
}
fn stream(r: *std.Io.Reader, _: *std.Io.Writer, _: std.Io.Limit) std.Io.Reader.StreamError!usize {
@memset(r.buffer[r.seek..], 0);
r.end = r.buffer.len;
return 0;
}
```
When `fill` is called, it will call into `vtable.readVec` which in this case is `defaultReadVec`. In `defaultReadVec`:
- Before the `r.end += r.vtable.stream` line, `r.end` will be 0
- In `r.vtable.stream`, `r.end` is modified to 3 and it returns 0
- After the `r.end += r.vtable.stream` line, `r.end` will be 0 instead of the expected 3
Separating the `r.end += stream();` into two lines fixes the problem (and this separation is done elsewhere in `Reader` so it seems possible that this class of bug has been encountered before).
Potentially related issues:
- https://github.com/ziglang/zig/issues/4021
- https://github.com/ziglang/zig/issues/12064
This clarifies that it is legal to return an invalid pointer from a
function, provided that such pointer is not dereferenced.
This matches current status quo of the language. Any change to this
should be a proposal that argues for different semantics.
It is also legal in C to return a pointer to a local. The C backend
lowers such thing directly, so the corresponding warning in C must be
disabled (`-Wno-return-stack-address`).
This test works by assuming that std.ArrayList will grow with a specific
capacity increasing pattern, which is an invalid assumption. Delete the
offending test.
I measured this against master branch and found no statistical
difference. Since this code is simpler and logically superior due to
always leaving sufficient unused capacity when growing, it is preferred
over status quo.
Adds `addFileContentArg` and `addPrefixedFileContentArg` to pass the content
of a file with a lazy path as an argument to a `std.Build.Step.Run`.
This enables replicating shell `$()` / cmake `execute_process` with `OUTPUT_VARIABLE`
as an input to another `execute_process` in conjuction with `captureStdOut`/`captureStdErr`.
To also be able to replicate `$()` automatically trimming trailing newlines and cmake
`OUTPUT_STRIP_TRAILING_WHITESPACE`, this patch adds an `options` arg to those functions
which allows specifying the desired handling of surrounding whitespace.
The `options` arg also allows to specify a custom `basename` for the output. e.g.
to add a file extension (concrete use case: Zig `@import()` requires files to have a
`.zig`/`.zon` extension to recognize them as valid source files).
The data structure was originally added in
41e1cd185b and then removed in
50a336fff8, but brought back in
711bf55eaa for Decl in the compiler
frontend, and then the last reference to it was eliminated in
548a087faf which removed Decl in favor of
Nav and Cau.
When building on macOS Tahoe, binaries were getting duplicate LC_RPATH
load commands which caused dyld to refuse to run them with a
"duplicate LC_RPATH" error that has become a hard error.
The duplicates occurred when library directories were being added
to rpath_list twice:
- from lib_directories
- from native system paths detection which includes the same dirs
This can be re-evaluated at a later time, but at the moment the
performance and stability concerns hold it back. Additionally, it
promotes a non-smithing approach to fuzz tests.
This PR significantly improves the capabilities of the fuzzer.
The changes made to the fuzzer to accomplish this feat mostly include
tracking memory reads from .rodata to determine fresh inputs, new
mutations (especially the ones that insert const values from .rodata
reads and __sanitizer_conv_const_cmp), and minimizing found inputs.
Additionally, the runs per second has greatly been increased due to
generating smaller inputs and avoiding clearing the 8-bit pc counters.
An additional feature added is that the length of the input file is now
stored and the old input file is rerun upon start.
Other changes made to the fuzzer include more logical initialization,
using one shared file `in` for inputs, creating corpus files with
proper sizes, and using hexadecimal-numbered corpus files for
simplicity.
Furthermore, I added several new fuzz tests to gauge the fuzzer's
efficiency. I also tried to add a test for zstandard decompression,
which it crashed within 60,000 runs (less than a second.)
Bug fixes include:
* Fixed a race conditions when multiple fuzzer processes needed to use
the same coverage file.
* Web interface stats now update even when unique runs is not changing.
* Fixed tokenizer.testPropertiesUpheld to allow stray carriage returns
since they are valid whitespace.
I’ve been typing `zig fmt **/.zig` for a long time, until I discovered
that the argument can actually be a directory.
Mention this feature explicitly in the help message.
* Remove the generic model; we already have generic_la32 and generic_la64 and
pick appropriately based on bitness.
* Remove the loongarch64 model. We used this as our baseline for 64-bit, but it's
actually pretty misleading and useless; it doesn't represent any real CPU and
has less features than generic_la64.
* Add la64v1_0 and la64v1_1 models.
* Change our baseline CPU model for 64-bit to be la64v1_0, thus adding LSX to
the baseline feature set.
Ascon is the family of cryptographic constructions standardized by NIST
for lightweight cryptography.
The Zig standard library already included the Ascon permutation itself,
but higher-level constructions built on top of it were intentionally
postponed until NIST released the final specification.
That specification has now been published as NIST SP 800-232:
https://csrc.nist.gov/pubs/sp/800/232/final
With this publication, we can now confidently include these constructions
in the standard library.
* std.sort.pdq: fix out-of-bounds access in partialInsertionSort
When sorting a sub-range that doesn't start at index 0, the
partialInsertionSort function could access indices below the range
start. The loop condition `while (j >= 1)` didn't respect the
arbitrary range boundaries [a, b).
This changes the condition to `while (j > a)` to ensure indices
never go below the range start, fixing the issue where pdqContext
would access out-of-bounds indices.
Fixes#25250
In ed25519.zig, we checked if a test succeeds, in which case we
returned an error. This was confusing, and Andrew pointed out that
Zig weights branches against errors by default.
* test: remove test-compare-output and test-asm-link tests
These were low value and unfocused tests. We already have coverage of the
important aspects of these tests elsewhere. Additionally, there was really no
need for these to have their own test harness.
* test: rename issue_8550 standalone test to compile_asm
* test: rename backend=stage2 to backend=selfhosted, and add backend=auto
backend=auto (now the default if backend is omitted) means to let the compiler
pick whatever backend it wants as the default. This is important for platforms
where we don't yet have a self-hosted backend, such as loongarch64.
Also purge a bunch of redundant target=native.
* test: delete old stage1 compile_errors tests
generic_function_returning_opaque_type.zig was salvaged as it's still worth
having.
* test: pull tests in test/cases/llvm/ up to test/cases/
There is nothing inherently LLVM-specific about any of these.
* test: remove @cImport usage in interdependent_static_c_libs
* test: move glibc_compat from link to standalone tests
This is not really testing the linker.
* build: -Dskip-translate-c now implies -Dskip-run-translated-c
* build: skip test-cimport when -Dskip-translate-c is given
backend=auto (now the default if backend is omitted) means to let the compiler
pick whatever backend it wants as the default. This is important for platforms
where we don't yet have a self-hosted backend, such as loongarch64.
Also purge a bunch of redundant target=native.
These were low value and unfocused tests. We already have coverage of the
important aspects of these tests elsewhere. Additionally, there was really no
need for these to have their own test harness.
The Zig standard library lacked schemes that resist nonce reuse.
AES-SIV and AES-GCM-SIV are the standard options for this.
AES-GCM-SIV can be very useful when Zig is used to target embedded
systems, and AES-SIV is especially useful for key wrapping.
Also take it as an opportunity to add a bunch of test vectors to
modes.ctr and make sure it works with block ciphers whose size is
not 16.
This bug was manifesting for user as a nasty link error because they
were calling their application's main entry point as a coerced function,
which essentially broke reference tracking for the entire ZCU, causing
exported symbols to silently not get exported.
I've been a little unsure about how coerced functions should interact
with the unit graph before, but the solution is actually really obvious
now: they shouldn't! `Sema` is now responsible for unwrapping
possibly-coerced functions *before* queuing analysis or marking unit
references. This makes the reference graph optimal (there are no
redundant edges representing coerced versions of the same function) and
simplifies logic elsewhere at the expense of just a few lines in Sema.
The switch from @bitCast() to @intCast() here safety-checks
Linux's assertion that these 3 calls never return errors (negative
values as pid_t). getppid() can legally return 0 if the parent is
in a different pid namespace, but this is not an error.
When not linking libc on 64-bit Linux and calling posix.setsid(),
we get a type error at compile time inside of posix.errno(). This
is because posix.errno()'s non-libc branch expects a usize-sized
value, which is what all the error-returning os.linux syscalls
return, and linux.setsid() instead returned a pid_t, which is only
32 bits wide.
This and the other 3 pid-related calls just below it (getpid(),
getppid(), and gettid()) are the only Linux syscall examples here
that are casting their return values to pid_t. For the other 3
this makes sense: those calls are documented to have no possible
errors and always return a valid pid_t value.
However, setsid() actually can return the error EPERM, and
therefore needs to return the raw value from syscall0 for
posix.errno() to process like normal.
Additionally, posix.setsid() needs an @intCast(rc) for the success
case as a result, like most other such cases.
We need std.os.linux and std.c to agree on the types here, or else
we'd have to pointlessly cast across the difference up in the
std.posix wrapper. I ran into this as a type error the first time
I tried to compile my code that calls posix.socketpair() on Linux
without libc.
All of our existing socket calls with these kinds of arguments in
std (including the existing c.socketpair as well as
os.linux.socket in this same file) use unsigned for all of these
parameters, and so this brings linux.socketpair() into alignment
with everything else.
FreeBSD normally provides this symbol in libc, but it's in the
FBSDprivate_1.0 namespace, so it doesn't get included in our abilists file.
Fortunately, the implementation is identical for Linux and FreeBSD, so we can
just provide it in compiler-rt.
It's interesting to note that the same is not true for NetBSD where the
implementation is more complex to support older Arm versions. But we do include
the symbol in our abilists file for NetBSD libc, so that's fine.
closes#25215
* Make cat in test/standalone/simple working again
- Fixes:
zig/0.15.1/lib/zig/std/Io/Writer.zig:939:11: 0x1049aef63 in sendFileAll (nclip)
assert(w.buffer.len > 0);
- because we are no using non zero buffers for stdout - "do not forget to flush"
* replace std.fs with fs because we are already importing it
This instruction actually has fairly useless semantics, and even the
cases that were semantically correct could save 1 cycle of latency by
using a different sequnce involving the avx version instead.
Closes#25174
Re-enable the test. Will trigger #24380 as-is, but follow-on change moes
this code over to test/standalone.
Make the test a bit easier to debug by stashing the "seen" signal number
in the shared `seen_sig` (instead of just incrementing a counter for each
hit). And only doing so if the `seen_sig` is zero.
These tests aren't (directly) using Posix APIs, so they don't need to be
in posix/test.zig. Put them over with the code and tests in Thread.zig.
Since the spawn/join test in the posix code was redundant, just dropped
that one.
Because these lists are very long in several cases and quite
varied, I opted to place them in the existing c/foo.zig files.
There are many other sets of network-related constants like this
to add over time across all the OSes. For now I picked these
because I needed a few constants from each of these namespaces for
my own project, so I tried to flesh out these namespaces
completely as best I could, at least for basic sockopt purposes.
Note windows has some of these already defined in ws2_32 as
individual constants rather than contained in a namespacing
struct. I'm not sure what to do with that in the long run (break
it and namespace them?), but this doesn't change the status quo
for windows in any case.
in_pktinfo is only used on a few targets for the IP_PKTINFO
sockopt, as many BSDs use an alternate mechanism (IP_RECVDSTADDR)
that doesn't require a special struct. in6_pktinfo is more
universal.
This is the struct type used as set/getsockopt() option data with
SO.ACCEPTFILTER, which is also only declared on this same limited
set of BSD-ish targets.
In theory this could be aliased over to std.posix as well, but I
think for a corner case like this, it's not unreasonable for a
user that is avoiding uneccessary std.c references to access it as
"posix.system.accept_filter_arg" (which would still work fine if,
in the future, FreeBSD escapes its libc dep and defines it in
std.os.freebsd).
missing `extern` on a struct.
but also all these instances that call pwriteAll with a `@ptrCast` are
endianness bugs.
this should be changed to use File.Writer and call writeSliceEndian
instead.
this commit fixes one immediate problem but does not fix everything.
Add verifyStrict() functions for cofactorless verification.
Also:
- Support messages < 64 characters in the test vectors
- Allow mulDoubleBasePublic to return the identity as a regular
value. There are valid use cases for this.
I noticed this by stress testing my tls server implementation. From time to time curl (and other tools: ab, vegeta) will report invalid signature. I trace the problem to the way how std lib is encoding raw signature into der format. Using raw signature I got in some cases different encoding using std and openssl. Std is not producing minimal der when signature `r` or `s` integers has leading zero(es).
Here is an example to illustrate difference. Notice leading 00 in `s`
integer which is removed in openssl encoding but not in std encoding.
```Zig
const std = @import("std");
test "ecdsa signature to der" {
// raw signature r and s bytes
const raw = hexToBytes(
\\ 49 63 0c 94 95 2e ff 4b 02 bf 35 c4 97 9e a7 24
\\ 20 dc 94 de aa 1b 17 ff e1 49 25 3e 34 ef e8 d0
\\ c4 43 aa 7b a9 f3 9c b9 f8 72 7d d7 0c 9a 13 1e
\\
\\ 00 56 85 43 d3 d4 05 62 a1 1d d8 a1 45 44 b5 dd
\\ 62 9f d1 e0 ab f1 cd 4a 85 d0 1f 5d 11 d9 f8 89
\\ 89 d4 59 0c b0 6e ea 3c 19 6a f7 0b 1a 4a ce f1
);
// encoded by openssl
const expected = hexToBytes(
\\ 30 63 02 30
\\ 49 63 0c 94 95 2e ff 4b 02 bf 35 c4 97 9e a7 24
\\ 20 dc 94 de aa 1b 17 ff e1 49 25 3e 34 ef e8 d0
\\ c4 43 aa 7b a9 f3 9c b9 f8 72 7d d7 0c 9a 13 1e
\\
\\ 02 2f
\\ 56 85 43 d3 d4 05 62 a1 1d d8 a1 45 44 b5 dd
\\ 62 9f d1 e0 ab f1 cd 4a 85 d0 1f 5d 11 d9 f8 89
\\ 89 d4 59 0c b0 6e ea 3c 19 6a f7 0b 1a 4a ce f1
);
// encoded by std
const actual = hexToBytes(
\\ 30 64 02 30
\\ 49 63 0c 94 95 2e ff 4b 02 bf 35 c4 97 9e a7 24
\\ 20 dc 94 de aa 1b 17 ff e1 49 25 3e 34 ef e8 d0
\\ c4 43 aa 7b a9 f3 9c b9 f8 72 7d d7 0c 9a 13 1e
\\
\\ 02 30
\\ 00 56 85 43 d3 d4 05 62 a1 1d d8 a1 45 44 b5 dd
\\ 62 9f d1 e0 ab f1 cd 4a 85 d0 1f 5d 11 d9 f8 89
\\ 89 d4 59 0c b0 6e ea 3c 19 6a f7 0b 1a 4a ce f1
);
_ = actual;
const Ecdsa = std.crypto.sign.ecdsa.EcdsaP384Sha384;
const sig = Ecdsa.Signature.fromBytes(raw);
var buf: [Ecdsa.Signature.der_encoded_length_max]u8 = undefined;
const encoded = sig.toDer(&buf);
try std.testing.expectEqualSlices(u8, &expected, encoded);
}
pub fn hexToBytes(comptime hex: []const u8) [removeNonHex(hex).len / 2]u8 {
@setEvalBranchQuota(1000 * 100);
const hex2 = comptime removeNonHex(hex);
comptime var res: [hex2.len / 2]u8 = undefined;
_ = comptime std.fmt.hexToBytes(&res, hex2) catch unreachable;
return res;
}
fn removeNonHex(comptime hex: []const u8) []const u8 {
@setEvalBranchQuota(1000 * 100);
var res: [hex.len]u8 = undefined;
var i: usize = 0;
for (hex) |c| {
if (std.ascii.isHex(c)) {
res[i] = c;
i += 1;
}
}
return res[0..i];
}
```
Trimming leading zeroes from signature integers fixes encoding.
Also, added EPIPE to recvfrom() error set (it's a documented error
for unix and tcp sockets, at least), which recvmsg() largely
shares. Windows has an odd, callback-based form of recvmsg() that
doesn't fit the normal interface here.
socketpair is something like a pipe2() for sockets, and generally
only works for AF_UNIX sockets for most platforms. Winsock2
explicitly does not support this call, even though it does have
AF_UNIX sockets.
* Document std.mem.* functions
Functions in std.mem are essential for virtually all applications,
yet many of them lacked documentation.
Co-authored-by: Andrew Kelley <andrew@ziglang.org>
- introduce seekToUnbuffered which asserts no buffered data and does not
have WriteFailed in the error set
- remove WriteFailed from SeekError
- make seekTo based on calling flush and then seekToUnbuffered
- revert the change to reset seek_err since the error sets are
compatible again
Call start/endBlock before/after `parseBlockInfoBlock` in order to not
use the current block context, which is wrong and leads to e.g. incorrect
abbrevlen being used.
Previously we had a single definition of std.c.cmsghdr for all
libc-linking platforms which aliased from the Solaris definition,
a superfluous matching one in std.os.dragonfly, and no others.
The existing definition from std.c didn't actually work for Linux,
as Linux's "len" field is usize in the kernel's definition.
Emscripten follows the Linux model of course (but uses the
binary-compatible musl definition, which has an endian-sensitive
padding scheme to make the len type "socklen_t" even though the
kernel uses a usize, which is fair).
This unifies and documents all the known *nix-ish cases (I'm not
sure if wasi or windows really has cmsghdr support? Could be added
later, void for now), such that c.cmsghdr and posix.system.cmsghdr
should work correctly for all the known cases here, libc or
otherwise.
lzma2 Decoder already checks if decoding is finished or not inside the
process function, `range_decoder`finish does not mean the decoder has
finished, also need to check `ld.rep[0] == 0xFFFF_FFFF`, which was
already done inside the proccess function. This fix delete the redundant
`isFinish()` check for `range_decoder`.
Before this commit, -Mfoo=bar=baz would be incorrectly split into mod_name: `foo` and root_src_orig: `bar`
After this commit, -Mfoo=bar=baz will be correctly split into mod_name: `foo` and root_src_orig: `bar=baz`
Closes#25059
* update the MSG struct with the correct values for openbsd
* add comment with link to sys/sys/socket.h
---------
Co-authored-by: Brandon Mercer <bmercer@eutonian.com>
LLVM 21 has started recognizing strlen-like idioms and optimizing them to strlen
calls, so we need this function provided in compiler-rt for libc-less
compilations.
Before this commit, calling appendRemaining with an ArrayList where list.items.len != list.capacity could result in illegal behavior if the Writer.Allocating resized the list during the appendRemaining call.
Fixes#25057
We can't call `@frameAddress()` and then immediately `return`! That
invalidates the frame. This *usually* isn't a problem, because the stack
walk `next` call will *probably* have a stack frame and it will
*probably* be at the exact same address, but neither of those is a
guarantee. On powerpc, presumably some unfortunate inlining was going
on, so this frame was indeed invalidated when we started walking frames.
We need to explicitly pass `@frameAddress` into any function which will
return before we actually walk the stack. Pretty simple patch.
Resolves: #24970
The TLS 1.2 implementation was incorrectly hardcoded to always send the
secp256r1 public key in the client key exchange message, regardless of
which elliptic curve the server actually negotiated.
This caused TLS handshake failures with servers that preferred other curves
like X25519.
This fix:
- Tracks the negotiated named group from the server key exchange message
- Dynamically selects the correct public key (X25519, secp256r1, or
secp384r1) based on what the server negotiated
- Properly constructs the client key exchange message with the
appropriate key size for each curve type
Fixes TLS 1.2 connections to servers like ziglang.freetls.fastly.net
that prefer X25519 over secp256r1.
Note the previous "28" here for openbsd was some kind of copy
error long ago. That's the value of KERN.SOMAXCONN, which is an
entirely different thing.
missing these things:
- implementation of finish()
- detect packed bytes read for check and block padding
- implementation of discard()
- implementation of block stream checksum
Fixes#23993
Previously, if multiple build processes tried to create the same args file, there was a race condition with the use of the non-atomic `writeFile` function which could cause a spawned compiler to read an empty or incomplete args file. This commit avoids the race condition by first writing to a temporary file with a random path and renaming it to the desired path.
* add macos handling for totalSystemMemory
* fix return type cast for .freebsd in totalSystemMemory
* add handling for the whole Darwin family in totalSystemMemory
This make `fs.Dir.access` has compatibility like the zig version before.
With this change the `zig build --search-prefix` command would work again like
the zig 0.14 version when used on Ubuntu22.04, kernel version 5.4.
This is my penance for baiting andrew into deleting the existing generic
queue data structures with my talk of "too many ring buffers".
The new Reader and Writer interfaces are excellent ring buffers for many
use cases, but a generic queue container type is now missing.
This new double-ended queue, known more succinctly as a deque, is
implemented from scratch based on the API design lessons learned from
ArrayList over the years.
The API is not yet as featureful as ArrayList, but the core
functionality is in place and I will be using this in my personal
projects shortly. I think it makes sense to add further functions as
needed based on real-world use-cases.
* Adds "flat" alternatives to zon.parse.from* that don't support pointers
* Fixes documentation
* Removes flat postfix from non allocating functions, adds alloc to others
* Stops using alloc variant in tests where not needed
It is important we copy the left-overs in the message *before* we XOR
it into the ciphertext, because if we're encrypting in-place (i.e., m ==
c), we will manipulate the message that will be used for tag generation.
This will generate faulty tags when message length doesn't conform with
16 byte blocks.
The big endian RISC-V effort is mostly driven by MIPS (the company) which is
pivoting to RISC-V, and presumably needs a big endian variant to fill the niche
that big endian MIPS (the ISA) did.
GCC already supports these targets, but LLVM support will only appear in 22;
this commit just adds the necessary target knowledge and checks on our end.
Without this change, the docs are formatted s.t. the text "Edge case rules ordered by precedence:" is appended onto the prior line of text "Underflow: Absolute value of result smaller than 1", instead of getting its own line.
This API is based around the unsound idea that a process can perform
checked virtual memory loads to prevent crashing. This depends on
OS-specific APIs that may be unavailable, disabled, or impossible due to
virtualization.
It also makes collecting stack traces ridiculously slow, which is a
problem for users of DebugAllocator - in other words, everybody, all the
time. It also makes strace go from being superbly clean to being awful.
Linux already gained the relevant syscalls and consts in #24473
The basic mlock() and munlock() are fairly universal across the
*nix world with a consistent interface, but are missing on wasi
and windows.
The mlockall() and munlockall() calls are not as widely supported
as the basic ones. Notable non-implementers include darwin,
haiku, and serenity (and of course wasi and windows again).
mlock2() is Linux-only, as are its MLOCK flags.
This mainly just moves stuff around.
Justifications for other changes:
* `KEVENT.FLAGS` is backed by `c_uint` because that's what the `kevent64` flags param takes (according to the 'latest' manpage from 2008)
* `MACH_RCV_NOTIFY` is a legacy name and `MACH_RCV_OVERWRITE` is deprecated (xnu/osfmk/mach/message.h), so I removed them. They were 0 anyway and thus couldn't be represented
as a packed struct field.
* `MACH.RCV` and `MACH.SEND` are technically the same 'type' because they can both be supplied at the same time to `mach_msg`. I decided to still keep them separate because
naming works out better that way and all flags except for `MACH_MSG_STRICT_REPLY` aren't shared anyway. Both are part of a packed union `mach_msg_option_t` which supplies a
helper function to combine the two types.
* `PT` is backed by `c_int` because that's what `ptrace` takes as a request arg (according to the latest manpage from 2015)
* extend std.Io.Reader.peekDelimiterExclusive test to repeat successful end-of-stream path (fails)
* fix std.Io.Reader.peekDelimiterExclusive to not advance seek position in successful end-of-stream path
Be sure to select "Desktop development with C++" when prompted.
* You must additionally check the optional component labeled **C++ ATL for
v142 build tools**.
Install [CMake](http://cmake.org).
Use [git](https://git-scm.com/) to clone the zig repository to a path with no spaces, e.g. `C:\Users\Andy\zig`.
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value.
```bat
mkdir C:\Users\Andy\zig\build-release
cd C:\Users\Andy\zig\build-release
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_PREFIX_PATH=C:\Users\Andy\llvm+clang+lld-20.0.0-x86_64-windows-msvc-release-mt -DCMAKE_BUILD_TYPE=Release
msbuild -p:Configuration=Release INSTALL.vcxproj
```
You now have the `zig.exe` binary at `bin\zig.exe` and you can run the tests:
```bat
bin\zig.exe build test
```
This can take a long time.
Note: In case you get the error "llvm-config not found" (or similar), make sure
that you have **no** trailing slash (`/` or `\`) at the end of the
`-DCMAKE_PREFIX_PATH` value.
## Building LLVM, LLD, and Clang from Source
### Windows
Install [CMake](https://cmake.org/), version 3.20.0 or newer.
[Download LLVM, Clang, and LLD sources](https://releases.llvm.org/download.html#21.0.0)
The downloads from llvm lead to the github release pages, where the source's
will be listed as : `llvm-21.X.X.src.tar.xz`, `clang-21.X.X.src.tar.xz`,
`lld-21.X.X.src.tar.xz`. Unzip each to their own directory. Ensure no
Be sure to select "C++ build tools" when prompted.
* You **must** additionally check the optional component labeled **C++ ATL for
v142 build tools**. As this won't be supplied by a default installation of
Visual Studio.
* Full list of supported MSVC versions:
- 2017 (version 15.8) (unverified)
- 2019 (version 16.7)
Install [Python 3.9.4](https://www.python.org). Tick the box to add python to
your PATH environment variable.
#### LLVM
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value. Here is listed a brief explanation of each of the CMake parameters we pass when configuring the build
- `-Thost=x64` : Sets the windows toolset to use 64 bit mode.
- `-A x64` : Make the build target 64 bit .
- `-G "Visual Studio 16 2019"` : Specifies to generate a 2019 Visual Studio project, the best supported version.
- `-DCMAKE_INSTALL_PREFIX=""` : Path that llvm components will being installed into by the install project.
- `-DCMAKE_PREFIX_PATH=""` : Path that CMake will look into first when trying to locate dependencies, should be the same place as the install prefix. This will ensure that clang and lld will use your newly built llvm libraries.
- `-DLLVM_ENABLE_ZLIB=OFF` : Don't build llvm with ZLib support as it's not required and will disrupt the target dependencies for components linking against llvm. This only has to be passed when building llvm, as this option will be saved into the config headers.
- `-DCMAKE_BUILD_TYPE=Release` : Build llvm and components in release mode.
- `-DCMAKE_BUILD_TYPE=Debug` : Build llvm and components in debug mode.
- `-DLLVM_USE_CRT_RELEASE=MT` : Which C runtime should llvm use during release builds.
- `-DLLVM_USE_CRT_DEBUG=MTd` : Make llvm use the debug version of the runtime in debug builds.
##### Release Mode
```bat
mkdir C:\Users\Andy\llvm-21.0.0.src\build-release
cd C:\Users\Andy\llvm-21.0.0.src\build-release
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=C:\Users\Andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-release-mt -DCMAKE_PREFIX_PATH=C:\Users\Andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-release-mt -
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value.
##### Release Mode
```bat
mkdir C:\Users\Andy\lld-21.0.0.src\build-release
cd C:\Users\Andy\lld-21.0.0.src\build-release
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=C:\Users\Andy\llvm+clang+lld-14.0.6-x86_64-windows-msvc-release-mt -DCMAKE_PREFIX_PATH=C:\Users\Andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-release-mt -DCMAKE_BUILD_TYPE=Release -DLLVM_USE_CRT_RELEASE=MT
"c:\Program Files\CMake\bin\cmake.exe" .. -Thost=x64 -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=C:\Users\andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-debug -DCMAKE_PREFIX_PATH=C:\Users\andy\llvm+clang+lld-21.0.0-x86_64-windows-msvc-debug -DCMAKE_BUILD_TYPE=Debug -DLLVM_USE_CRT_DEBUG=MTd
msbuild /m INSTALL.vcxproj
```
#### Clang
Using the start menu, run **x64 Native Tools Command Prompt for VS 2019** and execute these commands, replacing `C:\Users\Andy` with the correct value.
Testing foreign architectures with dynamically linked glibc is one step trickier.
This requires enabling `--glibc-runtimes /path/to/glibc/multi/install/glibcs`.
This path is obtained by building glibc for multiple architectures. This
process for me took an entire day to complete and takes up 65 GiB on my hard
drive. The CI server does not provide this test coverage.
[Instructions for producing this path](https://codeberg.org/ziglang/infra/src/branch/master/building-libcs.md#linux-glibc) (just the part with `build-many-glibcs.py`).
It is understood that most contributors will not have these tests enabled.
#### Testing Windows from a Linux Machine with Wine
When developing on Linux, another option is available to you: `-fwine`.
This will enable running behavior tests and std lib tests with Wine. It's
recommended for Linux users to install Wine and enable this testing option
when editing the standard library or anything Windows-related.
#### Testing WebAssembly using wasmtime
If you have [wasmtime](https://wasmtime.dev/) installed, take advantage of the
`-fwasmtime` flag which will enable running WASI behavior tests and std
lib tests. It's recommended for all users to install wasmtime and enable this
testing option when editing the standard library and especially anything
WebAssembly-related.
### Improving Translate-C
`translate-c` is a feature provided by Zig that converts C source code into
Zig source code. It powers the `zig translate-c` command as well as
constuse_zig_libcxx=b.option(bool,"use-zig-libcxx","If libc++ is needed, use zig's bundled version, don't try to integrate with the system")orelsefalse;
consttest_step=b.step("test","Run all the tests");
constskip_install_lib_files=b.option(bool,"no-lib","skip copying of lib/ files and langref to installation prefix. Useful for development")orelsefalse;
constskip_install_lib_files=b.option(bool,"no-lib","skip copying of lib/ files and langref to installation prefix. Useful for development")orelseonly_c;
constskip_install_langref=b.option(bool,"no-langref","skip copying of langref to the installation prefix")orelseskip_install_lib_files;
conststd_docs=b.option(bool,"std-docs","include standard library autodocs")orelsefalse;
constpie=b.option(bool,"pie","Produce a Position Independent Executable");
constio_mode=b.option(IoMode,"io-mode","How the compiler performs IO")orelse.threaded;
constvalue_interpret_mode=b.option(ValueInterpretMode,"value-interpret-mode","How the compiler translates between 'std.builtin' types and its internal datastructures")orelse.direct;
constvalue_tracing=b.option(bool,"value-tracing","Enable extra state tracking to help troubleshoot bugs in the compiler (using the std.debug.Trace API)")orelsefalse;
exe_options.addOption(DevEnv,"dev",b.option(DevEnv,"dev","Build a compiler with a reduced feature set for development of specific features")orelseif(only_c).bootstrapelse.full);
Doc comments can be interleaved with normal comments. Currently, when producing
the package documentation, normal comments are merged with doc comments.
Doc comments can be interleaved with normal comments, which are ignored.
</p>
{#header_close#}
{#header_open|Top-Level Doc Comments#}
@ -438,6 +438,25 @@
{#header_close#}
{#header_close#}
{#header_open|Identifiers#}
<p>
Identifiers must start with an alphabetic character or underscore and may be followed
by any number of alphanumeric characters or underscores.
They must not overlap with any keywords. See {#link|Keyword Reference#}.
</p>
{#header_open|String Identifier Syntax#}
<p>
If a name that does not fit these requirements is needed, such as for
linking with external libraries, the {#syntax#}@""{#endsyntax#} syntax
may be used.
</p>
{#code|identifiers.zig#}
{#header_close#}
{#header_close#}
{#header_open|Values#}
{#code|values.zig#}
@ -639,7 +658,7 @@
{#syntax#}i7{#endsyntax#} refers to a signed 7-bit integer. The maximum allowed bit-width of an
integer type is {#syntax#}65535{#endsyntax#}.
</p>
{#see_also|Integers|Floats|void|Errors|@Type#}
{#see_also|Integers|Floats|void|Errors|@Int#}
{#header_close#}
{#header_open|Primitive Values#}
<divclass="table-wrapper">
@ -963,6 +982,9 @@
humans and computers to do when reading code, and creates more optimization opportunities.
</p>
<p>
Variables are never allowed to shadow {#link|Identifiers#} from an outer scope.
</p>
<p>
The {#syntax#}extern{#endsyntax#} keyword or {#link|@extern#} builtin function can be used to link against a variable that is exported
from another object. The {#syntax#}export{#endsyntax#} keyword or {#link|@export#} builtin function
can be used to make a variable available to other objects at link time. In both cases,
@ -970,22 +992,6 @@
</p>
{#see_also|Exporting a C Library#}
{#header_open|Identifiers#}
<p>
Variable identifiers are never allowed to shadow identifiers from an outer scope.
</p>
<p>
Identifiers must start with an alphabetic character or underscore and may be followed
by any number of alphanumeric characters or underscores.
They must not overlap with any keywords. See {#link|Keyword Reference#}.
</p>
<p>
If a name that does not fit these requirements is needed, such as for linking with external libraries, the {#syntax#}@""{#endsyntax#} syntax may be used.
</p>
{#code|identifiers.zig#}
{#header_close#}
{#header_open|Container Level Variables#}
<p>
{#link|Container|Containers#} level variables have static lifetime and are order-independent and lazily analyzed.
@ -1969,6 +1975,14 @@ or
</p>
{#see_also|@splat|@shuffle|@select|@reduce#}
{#header_open|Relationship with Arrays#}
<p>Vectors and {#link|Arrays#} each have a well-defined <strong>bit layout</strong>
and therefore support {#link|@bitCast#} between each other. {#link|Type Coercion#} implicitly peforms
{#syntax#}@bitCast{#endsyntax#}.</p>
<p>Arrays have well-defined byte layout, but vectors do not, making {#link|@ptrCast#} between
them {#link|Illegal Behavior#}.</p>
{#header_close#}
{#header_open|Destructuring Vectors#}
<p>
Vectors can be destructured:
@ -2454,7 +2468,8 @@ or
{#header_open|Tagged union#}
<p>Unions can be declared with an enum tag type.
This turns the union into a <em>tagged</em> union, which makes it eligible
to use with {#link|switch#} expressions.
to use with {#link|switch#} expressions. When switching on tagged unions,
the tag value can be obtained using an additional capture.
Tagged unions coerce to their tag type: {#link|Type Coercion: Unions and Enums#}.
</p>
{#code|test_tagged_union.zig#}
@ -2496,6 +2511,7 @@ or
{#header_open|packed union#}
<p>A {#syntax#}packed union{#endsyntax#} has well-defined in-memory layout and is eligible
to be in a {#link|packed struct#}.</p>
<p>All fields in a packed union must have the same {#link|@bitSizeOf#}.</p>
{#header_close#}
{#header_open|Anonymous Union Literals#}
@ -2586,6 +2602,13 @@ or
{#header_close#}
{#header_open|Switching on Errors#}
<p>
When switching on errors, some special cases are allowed to simplify generic programming patterns:
</p>
{#code|test_switch_on_errors.zig#}
{#header_close#}
{#header_open|Labeled switch#}
<p>
When a switch statement is labeled, it can be referenced from a
@ -2651,12 +2674,13 @@ or
{#code|test_inline_else.zig#}
<p>
When using an inline prong switching on an union an additional
capture can be used to obtain the union's enum tag value.
When using an inline prong switching on an union an additional capture
can be used to obtain the union's enum tag value at comptime, even though
its payload might only be known at runtime.
</p>
{#code|test_inline_switch_union_tag.zig#}
{#see_also|inline while|inline for#}
{#see_also|inline while|inline for|Tagged union#}
{#header_close#}
{#header_close#}
@ -3016,7 +3040,7 @@ or
{#syntax#}catch{#endsyntax#} after performing some logic, you
can combine {#syntax#}catch{#endsyntax#} with named {#link|Blocks#}:
<p>Returns the comptime-only "enum literal" type. This is the type of uncoerced {#link|Enum Literals#}. Values of this type can coerce to any {#link|enum#} with a matching field.</p>